Csaba Ringhofer has posted comments on this change.
( http://gerrit.cloudera.org:8080/15963 )

Change subject: IMPALA-6692: Trigger sort node run before hitting memory limit.
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG@23
PS9, Line 23: This patch speedup the decision to start the sort without waiting it
            : to hit memory limit first by capping the intermediary quicksort run to
            : lower memory limit,
> From a quick look at the planner estimate code it seems we're estimating on
Estimates could work well on very simple queries, but even a query that scans a
huge table and applies a single predicate can be tricky: should we pessimize by
always spilling, even if the predicate might turn out to be very selective? The
sort node is generally at the "end" of the query, so in complex queries the
estimates can easily be orders of magnitude off at that point. Stats can also
be missing or out of date.

For these reasons I think that a simple adaptive behavior, or the more
sophisticated solution mentioned by Tim (doing quicksort in smaller runs from
the start but postponing spilling if possible), would be much more reliable.
Using estimates in specific cases (e.g. a full table scan without predicates)
could be a further optimization.
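
To make the "smaller runs from the start, spill only if needed" idea concrete,
here is a minimal sketch in Python. It is not Impala's sorter code: the
function name, the row-count based memory budget, and the list that stands in
for spilled runs are all assumptions made for illustration.

```python
import heapq


def sort_in_small_runs(rows, run_capacity, memory_budget):
    """Sort rows in small runs and spill only once the budget is exhausted.

    Hypothetical illustration only. Memory is counted in rows, and "spilling"
    just moves a sorted run into a separate list instead of writing to disk.
    Returns (sorted_rows, number_of_spilled_runs).
    """
    if run_capacity < 1 or memory_budget < 1:
        raise ValueError("run_capacity and memory_budget must be >= 1")

    in_memory_runs = []  # sorted runs still held in memory
    spilled_runs = []    # sorted runs that were "spilled"
    current_run = []     # unsorted rows of the run being filled
    rows_in_memory = 0

    for row in rows:
        if rows_in_memory >= memory_budget:
            # Budget exhausted: this is the first point where we spill.
            if in_memory_runs:
                victim = in_memory_runs.pop(0)  # oldest finished run
            else:
                current_run.sort()
                victim, current_run = current_run, []
            spilled_runs.append(victim)
            rows_in_memory -= len(victim)

        current_run.append(row)
        rows_in_memory += 1

        if len(current_run) >= run_capacity:
            # Close the run early, regardless of how much memory is left.
            current_run.sort()
            in_memory_runs.append(current_run)
            current_run = []

    if current_run:
        current_run.sort()
        in_memory_runs.append(current_run)

    # Final merge over all sorted runs, whether spilled or not.
    merged = list(heapq.merge(*in_memory_runs, *spilled_runs))
    return merged, len(spilled_runs)
```

If the input turns out to be small (e.g. because a predicate was very
selective), nothing is spilled and only the final merge is paid. If it is
large, the runs are already sorted by the time the budget is hit, so spilling
can start immediately. No planner estimate is needed in either case.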



--
To view, visit http://gerrit.cloudera.org:8080/15963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2a0ba7c4bae4f1d300d4d9d7f594f63ced06a240
Gerrit-Change-Number: 15963
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: David Rorke <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Fri, 05 Jun 2020 09:21:07 +0000
Gerrit-HasComments: Yes
