Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2581: LIMIT can be propagated down into some aggregations
......................................................................


Patch Set 4: -Code-Review

I'm seeing some odd behaviour while playing around with this. It looks like the
streaming aggregation is still processing its full input, so e.g. "select
distinct * from tpch_20_parquet.lineitem limit 10" takes a while: in the summary
below, 01:AGGREGATE spends about 33s and the scan still returns all 119.99M
rows. I think what's happening is that it doesn't return eos to the
DataStreamSender, which then keeps calling GetNext() repeatedly.
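
To illustrate the suspected mechanism, here is a toy model, not Impala code.
ToyStreamingAgg, DrainLikeSender and the signal_eos_at_limit flag are
hypothetical names made up for this sketch, and it assumes every input row is
distinct. It only shows how a caller that loops until eos ends up draining the
whole input when the operator stops emitting rows at the limit but never sets
eos.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a streaming aggregation with a pushed-down LIMIT.
// These are illustrative names only, not Impala's actual classes.
struct ToyStreamingAgg {
  int64_t input_rows;        // rows the child scan can produce in total
  int64_t limit;             // LIMIT propagated into the aggregation
  bool signal_eos_at_limit;  // false models the suspected bug
  int64_t rows_consumed;     // input rows pulled from the child so far
  int64_t rows_returned;     // output rows handed to the caller so far

  ToyStreamingAgg(int64_t input, int64_t lim, bool signal)
      : input_rows(input), limit(lim), signal_eos_at_limit(signal),
        rows_consumed(0), rows_returned(0) {
    assert(input >= 0 && lim >= 0);
  }

  // Pulls up to batch_size input rows, returns the number of output rows
  // produced by this call, and sets *eos when the caller should stop.
  int64_t GetNext(int64_t batch_size, bool* eos) {
    int64_t n = std::min(batch_size, input_rows - rows_consumed);
    rows_consumed += n;
    // Never emit more than the limit, even if more input was consumed.
    int64_t out = std::min(n, limit - rows_returned);
    rows_returned += out;
    bool input_done = rows_consumed == input_rows;
    bool limit_hit = rows_returned >= limit;
    *eos = input_done || (signal_eos_at_limit && limit_hit);
    return out;
  }
};

// Hypothetical sender loop: keeps calling GetNext() until it sees eos.
// Returns how many GetNext() calls were needed.
int64_t DrainLikeSender(ToyStreamingAgg* agg, int64_t batch_size) {
  bool eos = false;
  int64_t calls = 0;
  while (!eos) {
    agg->GetNext(batch_size, &eos);
    ++calls;
  }
  return calls;
}
```

With 1000 input rows, a limit of 10 and batches of 100, the variant that sets
eos at the limit stops after one call and 100 consumed rows. The variant that
does not set eos returns the same 10 rows but needs 10 calls and consumes all
1000 input rows, which matches the shape of the summary below.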



Operator       #Hosts  Avg Time  Max Time    #Rows  Est. #Rows  Peak Mem  Est. Peak Mem  Detail
------------------------------------------------------------------------------------------------------------------------
04:EXCHANGE         1  38.626us  38.626us       10          10         0        -1.00 B  UNPARTITIONED
03:AGGREGATE        3  54.343ms  57.428ms       28          10   2.48 MB       10.00 MB  FINALIZE
02:EXCHANGE         3  44.995us  54.989us       28          10         0              0  HASH(tpch_20_parquet.lineit...
01:AGGREGATE        3  33s149ms  33s383ms       30          10   5.00 MB       10.00 MB  STREAMING
00:SCAN HDFS        3  16s484ms  21s155ms  119.99M     119.99M   1.35 GB        1.38 GB  tpch_20_parquet.lineitem

-- 
To view, visit http://gerrit.cloudera.org:8080/3822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I59c5b7af7a73ccdbc5496b28eacb9b6859d202bc
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: No
