Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9943 )

Change subject: IMPALA-5706: Spilling sort optimisations
......................................................................


Patch Set 10:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.h
File be/src/runtime/sorter.h:

http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.h@193
PS8, Line 193:   /// function calculates the maximum number of runs that can be 
taken care of during the
> this sentence seems garbled. "can be taken care of"?
Done


http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.h@199
PS8, Line 199:   /// round of merging. Returns at most MaxRunsInNextMerge(), so 
the Sorter will have
> Can you mention the invariant that we have enough reservation for this numb
Done


http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.cc@1676
PS8, Line 1676:       num_required_buffers += var_len_pages_found ? 2 : 1;
> I think we need two buffers regardless if there are var-len slots, since Ru
As far as I see in Run::Init() it already knows if the run has var_len_slots 
(from TupleDescriptor given as constructor param) and add a page for that only 
in case it does.


http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.cc@1679
PS8, Line 1679:  i -
> nit: i - 1
Done


http://gerrit.cloudera.org:8080/#/c/9943/8/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

http://gerrit.cloudera.org:8080/#/c/9943/8/tests/query_test/test_sort.py@104
PS8, Line 104:     orders o2 on (o1.o_orderkey = o2.o_orderkey) order by 
o1.o_orderdate limit 100000"""
> Should we also set num_nodes=1 here? Otherwise it might not be robust when
Good point. Done


http://gerrit.cloudera.org:8080/#/c/9943/10/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

http://gerrit.cloudera.org:8080/#/c/9943/10/tests/query_test/test_sort.py@142
PS10, Line 142:     exec_option['buffer_pool_limit'] = "22m"
Had to change this as the minimum mem requirement increased for this query. I 
guess that is because I pulled in another change with the rebase that affected 
this. As a result, the total merges done by the first sort decreased.



--
To view, visit http://gerrit.cloudera.org:8080/9943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
Gerrit-Change-Number: 9943
Gerrit-PatchSet: 10
Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Wed, 16 May 2018 13:29:18 +0000
Gerrit-HasComments: Yes

Reply via email to