Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/11698 )
Change subject: IMPALA-5004: Switch to sorting node for large TopN queries
......................................................................

Patch Set 3:

(1 comment)

This will be a great change. See comment about using a memory-based decision rather than, say, a row-based decision or a decision informed by the memory available to the query.

http://gerrit.cloudera.org:8080/#/c/11698/3/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/11698/3/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@313
PS3, Line 313: estimatedTopNMaterializedSize < ctx_.getQueryOptions().topn_bytes_limit;
This code makes the sort/top-n decision based on memory. The number of bytes is a parameter. This seems awkward: the memory used must be part of the overall query budget. A query could, in odd cases, have multiple TopN operators, but the code here treats them one by one.

Further, a TopN will always use fewer bytes than a Sort: a TopN needs to keep only n rows in general, but sort must buffer all rows. (Though, of course, sort can spill to relieve memory pressure.)

I wonder, does it make sense to impose the limit as a row limit rather than a memory limit? Or, does it make sense to set the limit as part of a query memory planning exercise rather than as another free parameter that the user must juggle when thinking about memory budgets?
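
To make the two alternatives in the comment above concrete, here is a minimal sketch of a byte-based check next to a row-based one. It is not the patch's code: the class name, both method names, the `topnRowLimit` parameter, and the assumption that the materialized size is estimated as (limit + offset) * average row size are all hypothetical. Only `estimatedTopNMaterializedSize` and `topn_bytes_limit` come from the line under review.

```java
/** Illustrative only; none of these helpers exist in the Impala code base. */
final class TopNDecisionSketch {
    private TopNDecisionSketch() {}

    /**
     * Byte-based rule, shaped like the check at PS3 line 313.
     * ASSUMPTION: size is estimated as (limit + offset) * avgRowBytes.
     */
    public static boolean useTopNByBytes(
            long limit, long offset, double avgRowBytes, long topnBytesLimit) {
        long estimatedTopNMaterializedSize =
                (long) Math.ceil((limit + offset) * avgRowBytes);
        return estimatedTopNMaterializedSize < topnBytesLimit;
    }

    /**
     * Row-based alternative: the decision ignores row width entirely.
     * ASSUMPTION: topnRowLimit is a hypothetical option, not a real one.
     */
    public static boolean useTopNByRows(long limit, long offset, long topnRowLimit) {
        return limit + offset < topnRowLimit;
    }

    public static void main(String[] args) {
        long limit = 100_000L;
        long offset = 0L;
        long bytesLimit = 64L * 1024 * 1024; // arbitrary value chosen for the demo
        long rowLimit = 1_000_000L;          // arbitrary value chosen for the demo

        // Same LIMIT, two different row widths.
        for (double avgRowBytes : new double[] {32.0, 4096.0}) {
            System.out.printf(
                    "avgRowBytes=%.0f byteRule=%b rowRule=%b%n",
                    avgRowBytes,
                    useTopNByBytes(limit, offset, avgRowBytes, bytesLimit),
                    useTopNByRows(limit, offset, rowLimit));
        }
    }
}
```

With the byte rule, the same LIMIT can flip between TopN and Sort depending on the estimated row width; with the row rule it cannot. Neither variant accounts for several TopN operators sharing one query budget, which is the other concern raised above.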
--
To view, visit http://gerrit.cloudera.org:8080/11698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34c9db33c9302b55e9978f53f9c7061f2806c8a9
Gerrit-Change-Number: 11698
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Paul Rogers <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Comment-Date: Thu, 18 Oct 2018 21:15:27 +0000
Gerrit-HasComments: Yes
