Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9943 )

Change subject: IMPALA-5706: Parallelise read I/O in sorter
......................................................................


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/9943/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9943/3//COMMIT_MSG@14
PS3, Line 14:   - Remove the hard-coded maximum limit of buffers that can be 
used
            :     for merging the sorted runs. Instead this number is calculated
            :     based on the available memeory through buffer pool.
Does this also apply to the final merge? If yes, does this mean that the parent 
node (which calls Open/GetNext on the sort node) will have less memory to do 
its job than before this change? My concern is that if the sort node reserves 
all the memory it can during Open, and the parent node wants to reserve memory 
after it calls its child node's Open, then it will not have enough memory left.


http://gerrit.cloudera.org:8080/#/c/9943/3//COMMIT_MSG@26
PS3, Line 26:   - In case doing a sort on top of a join when working with 
restricted
            :     amount of memory then additional memory is successfully 
allocated
            :     for merging than was available during the initial sort runs.
I found this sentence hard to understand - can you rephrase it in a more 
detailed way? As I understand it, when the initial sort runs are finished, then 
the child node (join in this case) can be closed, so its released memory can be 
used by the merge runs.

It would be nice to write a small time line of the reservations during the life 
of a sort node, e.g.
Open:
1. child is opened, it can reserve memory
2. sort node reserves memory for initial sort runs
3. initial runs
4. child is closed, its reservation is released
5. sort node can try to increase its reservation
6. intermediate merges
GetNext:
7. final merge
Close:
8. reservation released


http://gerrit.cloudera.org:8080/#/c/9943/3//COMMIT_MSG@32
PS3, Line 32:     this change slighlty decrease the execution time for sorting.
type: missing s


http://gerrit.cloudera.org:8080/#/c/9943/3//COMMIT_MSG@32
PS3, Line 32: slighlty
typo: slightly


http://gerrit.cloudera.org:8080/#/c/9943/3/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

http://gerrit.cloudera.org:8080/#/c/9943/3/be/src/runtime/sorter.cc@1037
PS3, Line 1037:   DCHECK(next_page->is_pinned());
I think that the original DCHECK is still useful, but not for the next page, 
but for the page after that if there is one.


http://gerrit.cloudera.org:8080/#/c/9943/3/be/src/runtime/sorter.cc@1039
PS3, Line 1039: 'next_page' is already pinn
I would add to this comment that all pages must be pinned.



--
To view, visit http://gerrit.cloudera.org:8080/9943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
Gerrit-Change-Number: 9943
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Fri, 20 Apr 2018 16:00:23 +0000
Gerrit-HasComments: Yes

Reply via email to