David Rorke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15963 )

Change subject: IMPALA-6692: Trigger sort node run before hitting memory limit.
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG@23
PS9, Line 23: This patch speedup the decision to start the sort without waiting 
it
            : to hit memory limit first by capping the intermediary quicksort 
run to
            : lower memory limit,
> Estimates could work well on very simple queries, but even a query like a s
I agree that we shouldn't use estimates in cases where they're likely to be 
unreliable.
Maybe we should separate the discussion into what we think is needed if we want 
to make this the default behavior vs what's good enough if it's enabled only 
with a non-default query option.   The potential for regression is obviously 
more of a concern if we're going to enable by default.
A few other thoughts:
* If we're looking for heuristics at planning or runtime to indicate whether 
this is likely to help we should also consider whether the sort has partitioned 
inputs.   It's not clear whether this approach helps if the input to the sort 
isn't partitioned.
* Regarding Tim's suggestion for making the sort more incremental but without 
spilling more aggressively - I agree this would be interesting to explore.  
It's not obvious how much of the benefit we're seeing in benchmarks of the 
current patch is caused by the earlier spilling vs just reducing burstiness 
through more incremental runs.
* Since the patch can have immediate benefit in some well known cases like 
partitioned inserts, we might consider putting it in with the query option but 
without additional safeguards against regressions (but documenting the cases 
where it's safe to use) and then explore some of these other approaches in a 
follow up phase.



--
To view, visit http://gerrit.cloudera.org:8080/15963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2a0ba7c4bae4f1d300d4d9d7f594f63ced06a240
Gerrit-Change-Number: 15963
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: David Rorke <dro...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Fri, 05 Jun 2020 16:05:48 +0000
Gerrit-HasComments: Yes

Reply via email to