[
https://issues.apache.org/jira/browse/KUDU-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15923600#comment-15923600
]
Todd Lipcon commented on KUDU-1930:
-----------------------------------
Another factor here: it seems that most of the calls into AddCodeWords() are
passing only one value at a time -- perhaps the MRS compaction input is only
yielding "blocks" of a single row, in which case we get really poor batching
(and thus poor cache locality in the dict lookups, etc). Based on perf
counters, this is a major source of cache misses.
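
To make the batching point concrete, here is a minimal sketch, not the actual
Kudu code: SketchDictBuilder and FeedInBlocks are hypothetical names, and the
std::unordered_map is only a stand-in for whatever map the real builder uses.
It shows that feeding the same rows in single-row "blocks" produces the same
codes but one builder call per row, so any per-call setup is never amortized.
It only counts calls; it does not measure cache misses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a dictionary block builder (an assumption for
// illustration, not Kudu's class).
class SketchDictBuilder {
 public:
  // Encodes `count` values in one call and returns how many were consumed.
  size_t AddCodeWords(const std::string* vals, size_t count) {
    assert(vals != nullptr || count == 0);
    ++num_calls_;
    // Per-call setup: paid once per call, however many values are passed.
    codes_.reserve(codes_.size() + count);
    for (size_t i = 0; i < count; ++i) {
      // emplace() leaves the map unchanged if the value is already present
      // and returns an iterator to the existing entry.
      auto it = dict_.emplace(vals[i],
                              static_cast<uint32_t>(dict_.size())).first;
      codes_.push_back(it->second);
    }
    return count;
  }

  const std::vector<uint32_t>& codes() const { return codes_; }
  size_t dict_size() const { return dict_.size(); }
  size_t num_calls() const { return num_calls_; }

 private:
  std::unordered_map<std::string, uint32_t> dict_;  // value -> code word
  std::vector<uint32_t> codes_;                     // encoded output
  size_t num_calls_ = 0;
};

// Feeds `rows` to the builder in blocks of `block_size` rows and returns the
// number of AddCodeWords() calls that took.
size_t FeedInBlocks(SketchDictBuilder* builder,
                    const std::vector<std::string>& rows,
                    size_t block_size) {
  assert(block_size > 0);
  const size_t calls_before = builder->num_calls();
  for (size_t off = 0; off < rows.size(); off += block_size) {
    const size_t n = std::min(block_size, rows.size() - off);
    builder->AddCodeWords(rows.data() + off, n);
  }
  return builder->num_calls() - calls_before;
}
```

Calling FeedInBlocks() with a block size of 1 mimics the single-row input
suspected above; a larger block size gives the same codes in far fewer calls.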
> Improve performance of dictionary builder
> -----------------------------------------
>
> Key: KUDU-1930
> URL: https://issues.apache.org/jira/browse/KUDU-1930
> Project: Kudu
> Issue Type: Bug
> Components: cfile, perf
> Affects Versions: 1.3.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
>
> I locally tweaked tpch_real_world to use hash partitioning instead of range
> partitioning, so that the different threads overlapped on the same tablets,
> simulating a more realistic parallel load scenario. I noticed that the MM
> threads were CPU bound, with a high percentage of CPU in AddCodeWords().
> Initial prototypes indicate that optimizing the hashmap used here would be an
> easy win.