[ 
https://issues.apache.org/jira/browse/KUDU-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867570#comment-16867570
 ] 

ZhangYao commented on KUDU-2854:
--------------------------------

After reading the related code: as the Description says, if no entries in the 
dictionary match the predicate, we can short circuit at a few layers.

But currently we don't have a quick way to tell whether there are any deltas 
for the whole column (cfile) or for a whole data block (part of a cfile). Even 
if the base data's dictionary does not match the predicate, the result may 
change after deltas are applied. The cfile reader processes data batch by 
batch, so we can only check whether there are deltas for the current batch, and 
short circuit the subsequent data copy if no entries match the predicate and 
the batch has no deltas. This has been implemented in 
BinaryDictBlockDecoder::CopyNextAndEval.
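
To make the per-batch decision concrete, here is a minimal standalone sketch 
of the logic described above. It is not the Kudu code: Batch, MatchingCodes, 
PlanBatch and EvalBatch are hypothetical names made up for illustration, the 
predicate is assumed to be a simple equality, and the has_deltas flag is 
assumed to be supplied by the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for one batch of a dictionary-coded column.
struct Batch {
  std::vector<uint32_t> codes;  // one dictionary code per row
  bool has_deltas;              // assumed: true if any delta touches this batch
};

// Evaluate the predicate once against the dictionary words and return the
// codes of the words that match (here: equality with `wanted`).
std::unordered_set<uint32_t> MatchingCodes(const std::vector<std::string>& dict,
                                           const std::string& wanted) {
  std::unordered_set<uint32_t> matching;
  for (uint32_t i = 0; i < dict.size(); ++i) {
    if (dict[i] == wanted) matching.insert(i);
  }
  return matching;
}

enum class BatchAction {
  kSkipCopy,     // no word matches and no deltas: nothing to copy
  kEvalCodes,    // no deltas: evaluate the predicate on the codes only
  kMaterialize,  // deltas present: base data alone cannot decide
};

// Decide how to handle a batch. Deltas are checked first because an update
// may introduce a value that the base dictionary does not contain.
BatchAction PlanBatch(const std::unordered_set<uint32_t>& matching,
                      const Batch& batch) {
  if (batch.has_deltas) return BatchAction::kMaterialize;
  if (matching.empty()) return BatchAction::kSkipCopy;
  return BatchAction::kEvalCodes;
}

// Build a selection vector for a batch without deltas by looking only at
// the dictionary codes; the string values are never decoded.
std::vector<bool> EvalBatch(const std::unordered_set<uint32_t>& matching,
                            const Batch& batch) {
  assert(!batch.has_deltas);
  std::vector<bool> selected(batch.codes.size(), false);
  if (matching.empty()) return selected;  // short circuit: no row can match
  for (size_t row = 0; row < batch.codes.size(); ++row) {
    selected[row] = matching.count(batch.codes[row]) > 0;
  }
  return selected;
}
```

The limitation mentioned above shows up in PlanBatch: the skip is only safe 
when has_deltas is known to be false, and today that is known per batch 
rather than per column or per data block.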


Is there any way we can easily tell whether a column contains deltas, or 
whether a data block contains deltas?

> Short circuit predicates on dictionary-coded columns
> ----------------------------------------------------
>
>                 Key: KUDU-2854
>                 URL: https://issues.apache.org/jira/browse/KUDU-2854
>             Project: Kudu
>          Issue Type: Improvement
>          Components: cfile, perf, tserver
>            Reporter: Todd Lipcon
>            Priority: Major
>
> In the common case that a column has no updates in a given DRS, if we see 
> that no entries in the dictionary match the predicate, we can short circuit 
> at a few layers:
> - we can store a flag in the cfile footer that indicates that all blocks are 
> dict-coded (ie there are no fallbacks). In that case, we can skip the whole 
> rowset
> - if a cfile is partially dict-encoded, we can skip any dict-coded blocks 
> without decoding the dictionary words



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
