Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15963 )
Change subject: IMPALA-6692: Trigger sort node run before hitting memory limit. ...................................................................... Patch Set 14: (13 comments) http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/exec/sort-node.h File be/src/exec/sort-node.h: http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/exec/sort-node.h@77 PS12, Line 77: will > nit: "will go"? Done http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/exec/sort-node.cc File be/src/exec/sort-node.cc: http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/exec/sort-node.cc@89 PS12, Line 89: > Ok, in that case, we should just abandon estimate in case of cardinality is This is now changed to obtain full input estimate from planner. http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/exec/sort-node.cc@90 PS12, Line 90: > I will look at possibility to access that average size data in the backend. I figure out how to get more precise estimate from planner. It is close enough to the actual input size that I observe in my test. http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/exec/sort-node.cc@92 PS12, Line 92: & > I think that VLOG(3) is enough. Done http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/runtime/sorter.h File be/src/runtime/sorter.h: http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/runtime/sorter.h@101 PS12, Line 101: t > "the" would be better Done http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/runtime/sorter.h@101 PS12, Line 101: /// 'estimated_input_size' is the total rows in bytes that are estimated to get added > nit: missing "are" Done http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/runtime/sorter.h@102 PS12, Line 102: /// into this sorter. This is used to decide if sorter needs to proactively spill for > nit: needs Done http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/runtime/sorter.h@102 PS12, Line 102: ively sp > nit: spill Done http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/runtime/sorter.h@223 PS12, Line 223: do > nit: "do an"? Done http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/runtime/sorter.cc File be/src/runtime/sorter.cc: http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/runtime/sorter.cc@816 PS12, Line 816: PrettyPrinter::PrintBytes(estimated_input_size), > I think that VLOG(3) is enough here - this should happen if the cardinality Done http://gerrit.cloudera.org:8080/#/c/15963/12/be/src/runtime/sorter.cc@907 PS12, Line 907: PrettyPrinter::PrintBytes(GetSortRunBytesLimit())); > Same as line 816. Done http://gerrit.cloudera.org:8080/#/c/15963/12/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: http://gerrit.cloudera.org:8080/#/c/15963/12/common/thrift/ImpalaInternalService.thrift@645 PS12, Line 645: a join > typo: backends Done http://gerrit.cloudera.org:8080/#/c/15963/12/tests/query_test/test_sort.py File tests/query_test/test_sort.py: http://gerrit.cloudera.org:8080/#/c/15963/12/tests/query_test/test_sort.py@74 PS12, Line 74: """The first sort run is given a privilege to ignore sort_run_bytes_limit, except : when estimate hints that spill is inevitable. The lower sort_run_bytes_limit of : a query is, the more sort runs are likely to be produced and spilled. : Case 1 : 0 SpilledRuns, because all rows fit within the maximum reservation. : sort_run_bytes_limit is not enforced. : Case 2 : 3 SpilledRuns, because the first run hit reservation limit, and the : next 2 runs are capped to 150m. : Case 3 : 4 SpilledRuns, because sort node estimate that spill is inevitable. : So all runs are capped to 130m, including the first one.""" > I will look at that 'query_result.runtime_profile'. Done -- To view, visit http://gerrit.cloudera.org:8080/15963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2a0ba7c4bae4f1d300d4d9d7f594f63ced06a240 Gerrit-Change-Number: 15963 Gerrit-PatchSet: 14 Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: David Rorke <dro...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Sat, 27 Jun 2020 01:50:13 +0000 Gerrit-HasComments: Yes