Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11698 )

Change subject: IMPALA-5004: Switch to sorting node for large TopN queries
......................................................................


Patch Set 3:

(1 comment)

This will be a great change. See my inline comment, which questions making the 
decision on a fixed byte limit rather than, say, on a row count or on the 
memory actually available to the query.

http://gerrit.cloudera.org:8080/#/c/11698/3/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/11698/3/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@313
PS3, Line 313:           estimatedTopNMaterializedSize < ctx_.getQueryOptions().topn_bytes_limit;
This code makes the sort/top-n decision based on memory, with the byte 
threshold supplied as a query option. This seems awkward: the memory a TopN 
uses must come out of the overall query budget, yet a query could, in odd 
cases, have multiple TopN operators, and the code here applies the limit to 
each one independently.

Further, a TopN will always use fewer bytes than a Sort: a TopN needs to keep 
only n rows in general, but sort must buffer all rows. (Though, of course, sort 
can spill to relieve memory pressure.)

I wonder, does it make sense to impose the limit as a row limit rather than a 
memory limit?
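
To make the contrast concrete, here is a minimal sketch of the two checks. The 
class TopNChoice, its method names, and the row-limit parameter are 
hypothetical, and the bytes estimate (rows kept times average row size) is an 
assumption about how estimatedTopNMaterializedSize is derived, not something 
taken from the patch.

```java
// Hypothetical helper, not part of the patch: contrasts a bytes-based
// and a row-based rule for choosing TopN over a full sort.
final class TopNChoice {
    private TopNChoice() {}

    // Assumption: a TopN keeps (limit + offset) rows, so its materialized
    // size is roughly that row count times the average row size.
    static long estimateTopNBytes(long limit, long offset, double avgRowSizeBytes) {
        return (long) Math.ceil((limit + offset) * avgRowSizeBytes);
    }

    // Bytes-based rule, mirroring the comparison on line 313.
    static boolean useTopNByBytes(long limit, long offset,
                                  double avgRowSizeBytes, long topnBytesLimit) {
        return estimateTopNBytes(limit, offset, avgRowSizeBytes) < topnBytesLimit;
    }

    // Row-based rule: needs no row-size estimate at all.
    // topnRowsLimit is a hypothetical option, named here for illustration.
    static boolean useTopNByRows(long limit, long offset, long topnRowsLimit) {
        return limit + offset < topnRowsLimit;
    }
}
```

The row-based rule does not depend on the accuracy of the row-size estimate, 
but it also ignores row width, so very wide rows could still use a lot of 
memory under a modest row limit.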

Or, does it make sense to set the limit as part of a query memory planning 
exercise rather than as another free parameter that the user must juggle when 
thinking about memory budgets?
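
As one illustration of what a planning-time decision could look like, the 
sketch below shares a single budget across all TopN operators in a query 
instead of testing each against its own fixed limit. Everything here is 
hypothetical: the class TopNBudgetPlanner, the greedy smallest-first policy, 
and the idea that the planner has a per-query TopN budget available are 
assumptions for the sake of the example, not existing Impala behavior.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch: decide TopN vs. sort for several operators at once,
// charging each chosen TopN against one shared budget.
final class TopNBudgetPlanner {
    private TopNBudgetPlanner() {}

    // estimatedBytes[i] is the estimated materialized size of operator i.
    // Returns a parallel array: true = keep as TopN, false = use a sort node.
    static boolean[] planTopN(long[] estimatedBytes, long sharedBudgetBytes) {
        Integer[] order = new Integer[estimatedBytes.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Greedy policy (an assumption): admit the smallest operators first.
        Arrays.sort(order, Comparator.comparingLong((Integer i) -> estimatedBytes[i]));

        boolean[] useTopN = new boolean[estimatedBytes.length];
        long remaining = sharedBudgetBytes;
        for (int idx : order) {
            if (estimatedBytes[idx] <= remaining) {
                useTopN[idx] = true;
                remaining -= estimatedBytes[idx];
            }
        }
        return useTopN;
    }
}
```

With estimates of 400, 100, and 700 bytes and a shared budget of 600, the 
first two operators stay as TopN and the third falls back to a sort, which can 
spill if it needs to.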



--
To view, visit http://gerrit.cloudera.org:8080/11698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34c9db33c9302b55e9978f53f9c7061f2806c8a9
Gerrit-Change-Number: 11698
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Paul Rogers <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Comment-Date: Thu, 18 Oct 2018 21:15:27 +0000
Gerrit-HasComments: Yes
