[
https://issues.apache.org/jira/browse/KUDU-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230392#comment-17230392
]
ASF subversion and git services commented on KUDU-1644:
-------------------------------------------------------
Commit 6a7cadc7eddeaaa374971d5ba16fec8422e33db9 in kudu's branch
refs/heads/master from ningw
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=6a7cadc ]
KUDU-1644 hash-partition based in-list predicate optimization
Hash pruning for IN-list queries on a single hash key: reduce the
predicate values by matching them against the hash partition.
This patch reduces the IN-list predicate values pushed down to each tablet
without changing the content returned.
The table has P partitions and N records. The IN-list predicate has V values.
Before:
For each tablet, the time complexity of a hash-key based IN-list query is:
LOG(V) * N
After:
Complexity becomes:
LOG(V/P) * N
E.g.
Hash partition of table 'profile':
hash partitioned on id into 3 partitions; for simplicity, assume mod is the hash function.
select * from profile where id in (1,2,3,4,5,6,7,8,9,10)
Before:
Tablet 1: id in (1,2,3,4,5,6,7,8,9,10)
Tablet 2: id in (1,2,3,4,5,6,7,8,9,10)
Tablet 3: id in (1,2,3,4,5,6,7,8,9,10)
After:
Tablet 1: id in (3,6,9)
Tablet 2: id in (1,4,7,10)
Tablet 3: id in (2,5,8)
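The per-tablet split above can be reproduced with a short sketch. This is
only an illustration, not the code from the patch: the function name
prune_in_list_by_hash is hypothetical, and the hash function is the same
"mod" simplification used in the example, not Kudu's real hash function.

```python
def prune_in_list_by_hash(values, num_partitions, hash_fn=None):
    """Group IN-list values by the hash bucket that could contain them.

    hash_fn is an assumption for illustration: it defaults to a plain
    modulo, mirroring the simplified example, not Kudu's actual hashing.
    """
    if hash_fn is None:
        def hash_fn(value):
            return value % num_partitions

    # One (possibly empty) value list per hash bucket.
    per_bucket = {bucket: [] for bucket in range(num_partitions)}
    for value in values:
        per_bucket[hash_fn(value)].append(value)
    return per_bucket


# IN-list from the example query: id in (1,2,...,10), 3 hash partitions.
pruned = prune_in_list_by_hash(range(1, 11), 3)
for bucket, bucket_values in pruned.items():
    # Bucket 0 is shown as "Tablet 1", bucket 1 as "Tablet 2", and so on.
    in_list = ",".join(str(v) for v in bucket_values)
    print(f"Tablet {bucket + 1}: id in ({in_list})")
```

Each tablet then only has to test rows against its own, shorter list, which
is where the LOG(V/P) factor comes from when the values spread evenly across
the partitions.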
Change-Id: I202001535669a72de7fbb9e766dbc27db48e0aa2
Reviewed-on: http://gerrit.cloudera.org:8080/16674
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong
> Simplify IN-list predicate values based on tablet partition key or rowset PK
> bounds
> -----------------------------------------------------------------------------------
>
> Key: KUDU-1644
> URL: https://issues.apache.org/jira/browse/KUDU-1644
> Project: Kudu
> Issue Type: Sub-task
> Components: perf, tablet
> Reporter: Dan Burkert
> Priority: Major
> Attachments: image-2019-12-05-14-52-05-846.png,
> image-2019-12-05-14-52-18-487.png, image-2019-12-05-14-53-51-175.png,
> image-2019-12-05-14-53-57-741.png, image-2019-12-05-14-54-03-485.png
>
>
> When new scans are optimized by the tablet, the tablet's partition key bounds
> aren't taken into account in order to remove predicates from the scan. One
> of the most important such optimizations is that IN-list predicates could
> remove values based on the tablet's constraints.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)