Hi Jason,

By "bulk indexing only" do you mean you are loading data with a high rate
of inserts?

It seems that there is a lot of contention on the memory trackers.
https://issues.apache.org/jira/browse/KUDU-1502 is one JIRA where I noted
this was the case. If that's the culprit, I would look into the following:

- try to change your insert pattern so that it is more sequential in
  nature (random inserts will cause a lot of block cache lookups to check
  for duplicate keys)
- if you have RAM available, increase both the block cache capacity and
  the server's memory limit accordingly, so that the bloom lookups will
  hit Kudu's cache instead of having to go to the operating system cache.

Aside from that, we'll be spending some time on improving performance of
write-heavy workloads in upcoming releases, and I think fixing this
MemTracker contention will be one of the issues tackled.

In case the above isn't the issue, do you think you could use
'perf record -g -a' and generate a flame graph?
http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html

-Todd

On Tue, Mar 14, 2017 at 6:14 AM, Jason Heo <[email protected]> wrote:
> Hi. I'm experiencing high load and high CPU usage. Kudu is running on 5
> Kudu-dedicated nodes. Two nodes' load is 40, while the other three
> nodes' load is 15.
>
> Here is the output of `perf record -a & perf report` during a
> bulk-indexing-only operation.
>
> http://imgur.com/8lz1CRk
>
> I'm wondering if this is a reasonable situation.
>
> I'm using Kudu on CDH 5.10.
>
> Thanks.

--
Todd Lipcon
Software Engineer, Cloudera
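[Editor's note] The "increase the block cache and memory limit" suggestion corresponds to two tablet-server gflags. A minimal sketch, with purely illustrative sizes (tune them to your actual RAM; flag names are from Kudu's configuration reference):

```shell
# Illustrative tablet-server flags -- sizes are examples, not recommendations.
# Raise the server's hard memory limit (in bytes; here 16 GiB):
--memory_limit_hard_bytes=17179869184
# Raise the block cache so bloom filter and index lookups during inserts
# hit Kudu's own cache rather than falling back to the OS page cache
# (in MiB; here 4 GiB, up from the 512 MiB default):
--block_cache_capacity_mb=4096
```

The block cache is carved out of the server's overall memory budget, so the two limits should be raised together rather than independently.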
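[Editor's note] The perf/flame-graph workflow referenced above can be sketched roughly as follows, assuming Brendan Gregg's FlameGraph scripts (github.com/brendangregg/FlameGraph) are available locally; the 60-second window and output file names are arbitrary:

```shell
# Sample call stacks on all CPUs for 60 seconds (typically needs root):
perf record -g -a -- sleep 60

# Convert the samples to text, fold the stacks, and render an SVG
# flame graph using the FlameGraph helper scripts:
perf script > out.perf
./stackcollapse-perf.pl out.perf > out.folded
./flamegraph.pl out.folded > kudu-cpu.svg
```

Opening kudu-cpu.svg in a browser then shows which code paths (e.g. MemTracker calls) dominate the CPU time.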
