Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9570 )

Change subject: IMPALA-6621: Improve set lookup performance for in-predicate 
evaluation
......................................................................


Patch Set 1:

(1 comment)

Out of curiosity I tried out this alternative hash map: 
https://github.com/greg7mdp/sparsepp

It was actually slower for decimal (700ms vs 500ms). I also concluded that 
Google's dense_hash_set was somewhat tricky to make work, since it requires 
providing a sentinel value to represent empty entries.
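
To illustrate the sentinel issue, here is a minimal sketch of an
open-addressing set that marks unused slots with a reserved key, which is the
role set_empty_key() plays for dense_hash_set. SentinelSet and its methods are
hypothetical names made up for this example, not the sparsehash API, and the
table is fixed-size to keep the sketch short. The point is that the reserved
key can never be stored, which is presumably awkward when every value of the
column type is a legal IN-list member.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical fixed-capacity open-addressing set (linear probing).
// Illustration only; this is not the dense_hash_set interface.
class SentinelSet {
 public:
  // 'empty_key' marks unused slots, so it can never be stored as a value.
  SentinelSet(std::size_t capacity_pow2, int64_t empty_key)
      : empty_key_(empty_key), slots_(capacity_pow2, empty_key) {
    // Capacity must be a non-zero power of two so that masking works.
    assert(capacity_pow2 != 0 && (capacity_pow2 & (capacity_pow2 - 1)) == 0);
  }

  // Returns true if 'key' is in the set after the call. Returns false if
  // 'key' equals the sentinel or the table is full.
  bool Insert(int64_t key) {
    if (key == empty_key_) return false;
    const std::size_t mask = slots_.size() - 1;
    std::size_t idx = std::hash<int64_t>()(key) & mask;
    for (std::size_t probes = 0; probes < slots_.size(); ++probes) {
      if (slots_[idx] == key) return true;
      if (slots_[idx] == empty_key_) {
        slots_[idx] = key;
        return true;
      }
      idx = (idx + 1) & mask;
    }
    return false;
  }

  bool Contains(int64_t key) const {
    if (key == empty_key_) return false;
    const std::size_t mask = slots_.size() - 1;
    std::size_t idx = std::hash<int64_t>()(key) & mask;
    for (std::size_t probes = 0; probes < slots_.size(); ++probes) {
      if (slots_[idx] == key) return true;
      if (slots_[idx] == empty_key_) return false;
      idx = (idx + 1) & mask;
    }
    return false;
  }

 private:
  int64_t empty_key_;
  std::vector<int64_t> slots_;
};
```

With INT64_MIN chosen as the sentinel, Insert(INT64_MIN) is rejected, so a
caller would need a side channel (for example a separate flag) to represent
that one value.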

http://gerrit.cloudera.org:8080/#/c/9570/1/be/src/exprs/in-predicate.h
File be/src/exprs/in-predicate.h:

http://gerrit.cloudera.org:8080/#/c/9570/1/be/src/exprs/in-predicate.h@359
PS1, Line 359:       state->val_set.insert(GetVal<T, SetType>(state->type, 
*arg));
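We should change this function to use the bulk insert API to avoid O(N^2) 
behaviour with flat_set, since each single-element insert can shift the 
elements that follow it in the underlying sorted array: 
http://www.boost.org/doc/libs/1_56_0/doc/html/boost/container/flat_set.html#idp30015536-bb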
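
A rough sketch of the difference, using a sorted std::vector<int> as a
stand-in for flat_set's contiguous storage so the example needs no Boost
headers. The function names are hypothetical and this is not the patch's
code. With Boost itself, the equivalent would presumably be to gather the
values first and pass them to flat_set's range insert(first, last) overload.

```cpp
#include <algorithm>
#include <vector>

// Per-element insertion into a sorted vector: every call may shift all
// elements after the insertion point, so N inserts cost O(N^2) in the
// worst case (for example, when values arrive in descending order).
std::vector<int> BuildOneByOne(const std::vector<int>& vals) {
  std::vector<int> sorted;
  for (int v : vals) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), v);
    if (it == sorted.end() || *it != v) sorted.insert(it, v);
  }
  return sorted;
}

// Bulk construction: sort once and drop duplicates, O(N log N) overall.
std::vector<int> BuildBulk(std::vector<int> vals) {
  std::sort(vals.begin(), vals.end());
  vals.erase(std::unique(vals.begin(), vals.end()), vals.end());
  return vals;
}

// Lookup is the same either way: binary search over the sorted array.
bool Contains(const std::vector<int>& sorted, int v) {
  return std::binary_search(sorted.begin(), sorted.end(), v);
}
```

Both builders produce the same sorted, de-duplicated array; only the
construction cost differs, which matters for long IN lists.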



--
To view, visit http://gerrit.cloudera.org:8080/9570
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifd1627d779d10a16468cc3c2d0bc26a497e048df
Gerrit-Change-Number: 9570
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Mon, 12 Mar 2018 20:40:02 +0000
Gerrit-HasComments: Yes
