Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16674 )

Change subject: KUDU-1644 hash-partition based in-list predicate optimization
......................................................................


Patch Set 19: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16674/18//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16674/18//COMMIT_MSG@16
PS18, Line 16: Before:
             : To each tablet, time complexity to complete hash-key based in-list query is:
             : LOG(V) * N
             :
             : After:
             : Complexity becomes:
             : LOG(V/P) * N
> https://docs.google.com/document/d/1WO4TT2ZqGsvlgogyKOsChpinEeupZCkxn9OI5xu
Thanks for the experiments!

Interpreting the results a bit: for larger lists the improvement seems less
pronounced, since more of the time is spent copying the larger result set
than evaluating the predicate.

I would expect the improvement to be even greater for lists whose size is on
the same order of magnitude as the number of hash buckets (in this case,
5-10), since tablets left with no candidate in-list values could be
short-circuited completely.
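
As a rough illustration of that short-circuit case, here is a minimal sketch
in Python. It is not Kudu's actual code: the modulo "hash" and the function
names bucket_of, prune_in_list and tablets_to_scan are made up for the
example, and it assumes one tablet per hash bucket.

```python
def bucket_of(value: int, num_buckets: int) -> int:
    # Hypothetical stand-in for the real hash function; plain modulo keeps
    # the example deterministic and easy to follow.
    return value % num_buckets


def prune_in_list(values, num_buckets):
    """Group in-list values by the hash bucket that could contain them."""
    per_tablet = {b: [] for b in range(num_buckets)}
    for v in values:
        per_tablet[bucket_of(v, num_buckets)].append(v)
    return per_tablet


def tablets_to_scan(per_tablet):
    """Tablets whose pruned in-list is empty can be skipped entirely."""
    return [b for b, vals in per_tablet.items() if vals]


# Four in-list values spread over five hash buckets.
pruned = prune_in_list([1, 6, 7, 12], num_buckets=5)
print(pruned)
print(tablets_to_scan(pruned))
```

With four values and five buckets, only tablets 1 and 2 keep a non-empty
list in this toy setup, so the other three would not need to evaluate the
predicate at all, and the two remaining tablets each search a list of two
values instead of four.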



--
To view, visit http://gerrit.cloudera.org:8080/16674
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I202001535669a72de7fbb9e766dbc27db48e0aa2
Gerrit-Change-Number: 16674
Gerrit-PatchSet: 19
Gerrit-Owner: wangning <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: wangning <[email protected]>
Gerrit-Comment-Date: Thu, 12 Nov 2020 06:06:03 +0000
Gerrit-HasComments: Yes
