[ 
https://issues.apache.org/jira/browse/KUDU-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17178068#comment-17178068
 ] 

ASF subversion and git services commented on KUDU-2844:
-------------------------------------------------------

Commit fb0f4bc3bf63614a831fedc6cc29cf860dddaf49 in kudu's branch 
refs/heads/master from Todd Lipcon
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=fb0f4bc ]

KUDU-2844 (2/3): move RowBlock memory into a new RowBlockMemory struct

This takes the Arena* member of RowBlock and moves it into a new
RowBlockMemory structure. The RowBlockMemory structure will later
be extended to include a list of reference-counted block handles.

Change-Id: I17a21f33f44988795ffe064b3ba41055e1a19e90
Reviewed-on: http://gerrit.cloudera.org:8080/15801
Reviewed-by: Andrew Wong
Tested-by: Kudu Jenkins
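
As a rough illustration of the refactoring this commit message describes,
the sketch below moves arena ownership out of a row block and into a
separate memory struct. It is not Kudu's actual code: SimpleArena,
RowBlockMemorySketch, RowBlockSketch and DemoRowBlockMemory are hypothetical
names, and the retained_blocks member only guesses at how the "list of
reference-counted block handles" mentioned above might look, using
std::shared_ptr as a stand-in for whatever handle type is used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an arena allocator; not Kudu's Arena class.
class SimpleArena {
 public:
  // Copies `len` bytes into arena-owned storage and returns the copy.
  const char* AddBytes(const char* data, std::size_t len) {
    std::unique_ptr<char[]> buf(new char[len]);
    std::memcpy(buf.get(), data, len);
    chunks_.push_back(std::move(buf));
    return chunks_.back().get();
  }
  // Frees everything allocated so far.
  void Reset() { chunks_.clear(); }
  std::size_t chunk_count() const { return chunks_.size(); }

 private:
  std::vector<std::unique_ptr<char[]>> chunks_;
};

// Hypothetical sketch of the new struct: it groups the memory that a row
// block's indirect (string) data points into.
struct RowBlockMemorySketch {
  SimpleArena arena;
  // Assumed shape of the later extension: ref-counted handles that keep
  // externally owned blocks alive while the row block refers to them.
  std::vector<std::shared_ptr<const std::string>> retained_blocks;

  void Reset() {
    arena.Reset();
    retained_blocks.clear();
  }
};

// In this sketch the row block no longer holds an arena pointer itself;
// it holds a pointer to the memory struct (an assumption for illustration).
class RowBlockSketch {
 public:
  explicit RowBlockSketch(RowBlockMemorySketch* memory) : memory_(memory) {}
  RowBlockMemorySketch* memory() const { return memory_; }

 private:
  RowBlockMemorySketch* memory_;
};

// Small usage example: copy a string into the arena, retain a handle,
// then reset and confirm that both kinds of memory are released.
bool DemoRowBlockMemory() {
  RowBlockMemorySketch memory;
  RowBlockSketch block(&memory);
  const std::string value = "hello";
  const char* copy = block.memory()->arena.AddBytes(value.data(), value.size());
  assert(std::string(copy, value.size()) == value);
  assert(copy != value.data());  // The arena holds its own copy.
  block.memory()->retained_blocks.push_back(
      std::make_shared<const std::string>("dictionary block bytes"));
  memory.Reset();
  return memory.arena.chunk_count() == 0 && memory.retained_blocks.empty();
}
```

Keeping the arena and the handle list in one struct means a single Reset()
call can release both copied bytes and attached blocks, which is presumably
why the commit introduces the struct before adding the handles.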


> Avoid copying strings from dictionary or plain-encoded blocks
> -------------------------------------------------------------
>
>                 Key: KUDU-2844
>                 URL: https://issues.apache.org/jira/browse/KUDU-2844
>             Project: Kudu
>          Issue Type: Improvement
>          Components: cfile, perf
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Major
>         Attachments: fg.svg
>
>
> When scanning a plain or dictionary-encoded binary column, we currently loop 
> over each entry and copy the string into the destination RowBlock's arena. In 
> TPC-H Q1, the scanner threads spend a significant percentage of CPU on this 
> copying. It also increases the CPU cache footprint, which likely degrades 
> performance in downstream operations such as predicate evaluation, merging, 
> and result serialization.
> Instead, we could "attach" the dictionary block (with 
> ref-counting) to the RowBlock and refer directly to the dictionary entry from 
> the RowBlock. When the RowBlock is eventually reset, we can drop the 
> reference. This should be safe because we never mutate indirect data in-place.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
