Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15963 )

Change subject: IMPALA-6692: Trigger sort node run before hitting memory limit.
......................................................................


Patch Set 10:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG@23
PS9, Line 23: This patch speedup the decision to start the sort without waiting 
it
            : to hit memory limit first by capping the intermediary quicksort 
run to
            : lower memory limit,
> The following estimates vs actual from a partitioned parquet insert of cata
Patch set 10 add flag to either enforce sort_run_bytes_limit or not. It will 
not enforce sort_run_bytes_limit for the first run unless planner estimates 
hint that spill is inevitable.


http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG@47
PS9, Line 47:
> This is the maximum time of a single AddBatchTime call.
I decide to keep the counter name as it is, because there is already another 
counter for quicksort total time.


http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/runtime/sorter.cc@879
PS9, Line 879:   int num_processed = 0;
> Ahh, Sorter::enable_spilling_ is a bit misleading - that actually is whethe
Done, it is now summarized by enforce_sort_run_bytes_limit_ flag.


http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/runtime/sorter.cc@880
PS9, Line 880:   int cur_batch_index = 0;
             :   while (cur_batch_index < batch->num_rows()) {
             :     RETURN_IF_ERROR(AddBatchNoSpill(batch, cur_batch_index, &nu
> Ack
Done


http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/runtime/sorter.cc@894
PS9, Line 894:     }
> Ack
Move the timer and counter to sort-node.cc


http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/service/query-options.cc@904
PS9, Line 904:         RETURN_IF_ERROR(
> Ack
I decide to silently cap it to minimal 32 MB as your later suggestion.
It feels more natural so that we can keep -1 and 0 to represent 'no limit' and 
not invalidate range between 1 byte to 32 MB.


http://gerrit.cloudera.org:8080/#/c/15963/9/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/15963/9/common/thrift/ImpalaService.thrift@531
PS9, Line 531: intermediate
> Ack
Done


http://gerrit.cloudera.org:8080/#/c/15963/9/common/thrift/ImpalaService.thrift@533
PS9, Line 533:   SORT_RUN_BYTES_LIMIT = 103
> Ack
Done


http://gerrit.cloudera.org:8080/#/c/15963/9/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

http://gerrit.cloudera.org:8080/#/c/15963/9/tests/query_test/test_sort.py@79
PS9, Line 79:     options = [('-1', '2g'), ('100m', '2g'), ('400m', '400m'), 
('120m', '400m')]
> Ack
Done



--
To view, visit http://gerrit.cloudera.org:8080/15963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2a0ba7c4bae4f1d300d4d9d7f594f63ced06a240
Gerrit-Change-Number: 15963
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: David Rorke <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Fri, 05 Jun 2020 21:00:07 +0000
Gerrit-HasComments: Yes

Reply via email to