Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21777 )
Change subject: IMPALA-13334: Fix test_sort.py DCHECK hit when max_sort_run_size>0 ...................................................................... Patch Set 6: (10 comments) http://gerrit.cloudera.org:8080/#/c/21777/3/be/src/service/query-options-test.cc File be/src/service/query-options-test.cc: http://gerrit.cloudera.org:8080/#/c/21777/3/be/src/service/query-options-test.cc@274 PS3, Line 274: : for (const auto& test_case : case_set) { > Ok, I can accept Daniels argument. I think this test can stay. I notice this is gone in ps6 in favor of set.test. I also suggest moving this to under SetSpecialOptions test to add max_sort_run_size=1 testcase. http://gerrit.cloudera.org:8080/#/c/21777/6/testdata/workloads/tpch/queries/sort-reservation-usage-single-node.test File testdata/workloads/tpch/queries/sort-reservation-usage-single-node.test: http://gerrit.cloudera.org:8080/#/c/21777/6/testdata/workloads/tpch/queries/sort-reservation-usage-single-node.test@9 PS6, Line 9: 2m "2MB" to differentiate against 2 millions. http://gerrit.cloudera.org:8080/#/c/21777/5/tests/query_test/test_sort.py File tests/query_test/test_sort.py: http://gerrit.cloudera.org:8080/#/c/21777/5/tests/query_test/test_sort.py@168 PS5, Line 168: if exec_option['max_sort_run_size'] == 2: : exec_option['mem_limit'] = "144m" > AFAIK if memory estimation cannot be calculated, then every query passes ad Got it, thank you for the explanation. http://gerrit.cloudera.org:8080/#/c/21777/6/tests/query_test/test_sort.py File tests/query_test/test_sort.py: http://gerrit.cloudera.org:8080/#/c/21777/6/tests/query_test/test_sort.py@115 PS6, Line 115: options = [('2g', '100m', '0'), ('400m', '130m', '5')] I'd suggest to rewrite into this for clarity: options = [('2g', '100m', '0'), ('400m', '130m', '4')] if (exec_option['max_sort_run_size'] == 0): # max_sort_run_size > 0 will spill more in Case 2. options = [('2g', '100m', '0'), ('400m', '130m', '5')] or # max_sort_run_size > 0 will spill more in Case 2. options = [('2g', '100m', '0'), ('400m', '130m', ('5' if exec_option['max_sort_run_size'] > 0 else '4'))] http://gerrit.cloudera.org:8080/#/c/21777/6/tests/query_test/test_sort.py@167 PS6, Line 167: # With 2 pages per run the varlen data is more fragmented and requires a higher limit. # With max_sort_run_size=2 (2 pages per run) the varlen data is more fragmented and requires a higher limit to maintain "TotalMergesPerformed: 1" assertion. http://gerrit.cloudera.org:8080/#/c/21777/6/tests/query_test/test_sort.py@243 PS6, Line 243: The larger the run size, the sooner the sorter can give up memory to the next node. Move this comment just after L245. And maybe reword to this: "If max_sort_run_size > 0, the larger the run size, the sooner the sorter can give up memory to the next node." http://gerrit.cloudera.org:8080/#/c/21777/6/tests/query_test/test_sort.py@246 PS6, Line 246: buffer_limit = '27m' Add comment: # Increase buffer_limit to maintain such that query never spill. http://gerrit.cloudera.org:8080/#/c/21777/6/tests/query_test/test_sort.py@250 PS6, Line 250: buffer_limit "buffer_pool_limit", for consistency. http://gerrit.cloudera.org:8080/#/c/21777/6/tests/query_test/test_sort.py@250 PS6, Line 250: $LIMIT "$BUFFER_POOL_LIMIT", for consistency. http://gerrit.cloudera.org:8080/#/c/21777/6/tests/query_test/test_sort.py@402 PS6, Line 402: # With 2 pages per run the var-len data is more fragmented, therefore it spills more. # With max_sort_run_size=2 (2 pages per run) the var-len data is more fragmented, therefore it spills more. -- To view, visit http://gerrit.cloudera.org:8080/21777 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I943d8edcc87df168448a174d6c9c6b46fe960eae Gerrit-Change-Number: 21777 Gerrit-PatchSet: 6 Gerrit-Owner: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 29 Oct 2024 15:10:53 +0000 Gerrit-HasComments: Yes
