Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15963 )
Change subject: IMPALA-6692: Trigger sort node run before hitting memory limit. ...................................................................... Patch Set 10: (9 comments) http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG@23 PS9, Line 23: This patch speedup the decision to start the sort without waiting it : to hit memory limit first by capping the intermediary quicksort run to : lower memory limit, > The following estimates vs actual from a partitioned parquet insert of cata Patch set 10 add flag to either enforce sort_run_bytes_limit or not. It will not enforce sort_run_bytes_limit for the first run unless planner estimates hint that spill is inevitable. http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG@47 PS9, Line 47: > This is the maximum time of a single AddBatchTime call. I decide to keep the counter name as it is, because there is already another counter for quicksort total time. http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/runtime/sorter.cc File be/src/runtime/sorter.cc: http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/runtime/sorter.cc@879 PS9, Line 879: int num_processed = 0; > Ahh, Sorter::enable_spilling_ is a bit misleading - that actually is whethe Done, it is now summarized by enforce_sort_run_bytes_limit_ flag. http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/runtime/sorter.cc@880 PS9, Line 880: int cur_batch_index = 0; : while (cur_batch_index < batch->num_rows()) { : RETURN_IF_ERROR(AddBatchNoSpill(batch, cur_batch_index, &nu > Ack Done http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/runtime/sorter.cc@894 PS9, Line 894: } > Ack Move the timer and counter to sort-node.cc http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/service/query-options.cc@904 PS9, Line 904: RETURN_IF_ERROR( > Ack I decide to silently cap it to minimal 32 MB as your later suggestion. It feels more natural so that we can keep -1 and 0 to represent 'no limit' and not invalidate range between 1 byte to 32 MB. http://gerrit.cloudera.org:8080/#/c/15963/9/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/15963/9/common/thrift/ImpalaService.thrift@531 PS9, Line 531: intermediate > Ack Done http://gerrit.cloudera.org:8080/#/c/15963/9/common/thrift/ImpalaService.thrift@533 PS9, Line 533: SORT_RUN_BYTES_LIMIT = 103 > Ack Done http://gerrit.cloudera.org:8080/#/c/15963/9/tests/query_test/test_sort.py File tests/query_test/test_sort.py: http://gerrit.cloudera.org:8080/#/c/15963/9/tests/query_test/test_sort.py@79 PS9, Line 79: options = [('-1', '2g'), ('100m', '2g'), ('400m', '400m'), ('120m', '400m')] > Ack Done -- To view, visit http://gerrit.cloudera.org:8080/15963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2a0ba7c4bae4f1d300d4d9d7f594f63ced06a240 Gerrit-Change-Number: 15963 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: David Rorke <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Fri, 05 Jun 2020 21:00:07 +0000 Gerrit-HasComments: Yes
