Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/9943 )
Change subject: IMPALA-5706: Spilling sort optimisations ...................................................................... Patch Set 10: (6 comments) http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.h File be/src/runtime/sorter.h: http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.h@193 PS8, Line 193: /// function calculates the maximum number of runs that can be taken care of during the > this sentence seems garbled. "can be taken care of"? Done http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.h@199 PS8, Line 199: /// round of merging. Returns at most MaxRunsInNextMerge(), so the Sorter will have > Can you mention the invariant that we have enough reservation for this numb Done http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.cc File be/src/runtime/sorter.cc: http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.cc@1676 PS8, Line 1676: num_required_buffers += var_len_pages_found ? 2 : 1; > I think we need two buffers regardless if there are var-len slots, since Ru As far as I see in Run::Init() it already knows if the run has var_len_slots (from TupleDescriptor given as constructor param) and add a page for that only in case it does. http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.cc@1679 PS8, Line 1679: i - > nit: i - 1 Done http://gerrit.cloudera.org:8080/#/c/9943/8/tests/query_test/test_sort.py File tests/query_test/test_sort.py: http://gerrit.cloudera.org:8080/#/c/9943/8/tests/query_test/test_sort.py@104 PS8, Line 104: orders o2 on (o1.o_orderkey = o2.o_orderkey) order by o1.o_orderdate limit 100000""" > Should we also set num_nodes=1 here? Otherwise it might not be robust when Good point. Done http://gerrit.cloudera.org:8080/#/c/9943/10/tests/query_test/test_sort.py File tests/query_test/test_sort.py: http://gerrit.cloudera.org:8080/#/c/9943/10/tests/query_test/test_sort.py@142 PS10, Line 142: exec_option['buffer_pool_limit'] = "22m" Had to change this as the minimum mem requirement increased for this query. I guess that is because I pulled in another change with the rebase that affected this. As a result, the total merges done by the first sort decreased. -- To view, visit http://gerrit.cloudera.org:8080/9943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9 Gerrit-Change-Number: 9943 Gerrit-PatchSet: 10 Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Wed, 16 May 2018 13:29:18 +0000 Gerrit-HasComments: Yes