Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10421 )

Change subject: IMPALA-5168: Codegen HASH_PARTITIONED 
KrpcDataStreamSender::Send()
......................................................................


Patch Set 2:

Tim, thanks for pointing that out. I tried the idea of splitting up each 
partition expression evaluation and hashing into a separate function. The idea 
is that if there are few expressions, the compiler should be smart enough to 
inline them so we will get the same benefit. If there are too many expressions, 
the compiler will do appropriate level of inlining. However, for some reasons, 
this doesn't quite pan out as expected. In particular, splitting each 
expression into a separate function resulted in slightly longer codegen time. 
The patch is here 
(https://github.com/michaelhkw/incubator-impala/commit/76d071f1bf2d82d4b0905ed2e64150f43198538f).
 I compared the level of regression with different wide tables using the query 
you suggested:

widetable_250_cols: 5% regression
widetable_1000_cols: 3% regression
widetable_250_cols_big: 12% regression

widetable_250_cols_big is created by concatenating multiple widetable_250_cols 
to a table with 800000 rows. It's large enough to trigger 3 instances of scan 
nodes (instead of 1 in the case of widetable_250_col). The larger table 
triggers slightly different optimized code as there are 3 channels. In the case 
of widetable_250_col (smaller table), there is only 1 channel so the optimizer 
was smart enough to optimize away all the evaluate and hash functions  as all 
rows are being added to the same channel.

Anyhow, I am open to other suggestions (e.g. optimizing the patch posted 
above). Given the level regression observed above, do you think there are extra 
steps which we should take ?


--
To view, visit http://gerrit.cloudera.org:8080/10421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c44cc9312c062cc7a5a4ac9156ceaa31fb887ff
Gerrit-Change-Number: 10421
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Wed, 23 May 2018 01:21:16 +0000
Gerrit-HasComments: No

Reply via email to