Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5498: Support for partial sorts ......................................................................
Patch Set 6: (14 comments) http://gerrit.cloudera.org:8080/#/c/7267/6//COMMIT_MSG Commit Message: Line 7: IMPALA-5498: Support for partial sorts lets specify in Kudu for now PS6, Line 28: A nit: This also adds... Line 32: used previously for inserts into Kudu. "for inserts into Kudu"->"for DML statements into Kudu tables only. Future work can extend this to other table sinks IMPALA-????" Line 42: Now: no spills, sort takes 6m19s, for ~18% speedup comparison vs no partition/sort at all? http://gerrit.cloudera.org:8080/#/c/7267/6/be/src/exec/partial-sort-node.h File be/src/exec/partial-sort-node.h: PS6, Line 62: sort_exec_exprs_ update PS6, Line 62: derived based awkward/confusing http://gerrit.cloudera.org:8080/#/c/7267/6/be/src/runtime/sorter.cc File be/src/runtime/sorter.cc: PS6, Line 1383: RunSize can you indicate the units, e.g. #rows -> NumRowsPerRun? PS6, Line 1387: if (no_spill_) { : // 'min_buffers_required' is 1 since the sorter will only have 1 run at a time, so : // there won't be any merges. : min_buffers_required = 1; : } else { : min_buffers_required = MIN_BUFFERS_PER_MERGE; : } nit: ternary operator http://gerrit.cloudera.org:8080/#/c/7267/6/be/src/runtime/sorter.h File be/src/runtime/sorter.h: PS6, Line 45: Specifying 'no_spill' as true is For this use case, 'no_spill' should be set to true so the sorter can reduce... PS6, Line 103: bool no_spill = false comment on this even though it's mentioned above Line 117: /// up, it is sorted and a new unsorted run is created. comment that this may not be called if no_spill is true PS6, Line 127: InputDone comment that this may not be called if no_spill is true (are there any other cases I missed?) http://gerrit.cloudera.org:8080/#/c/7267/6/fe/src/main/java/org/apache/impala/planner/SortNode.java File fe/src/main/java/org/apache/impala/planner/SortNode.java: PS6, Line 56: TODO: run experiments to : // determine the value for this, consider making it configurable, enforce it in the BE. JIRA? PS6, Line 124: { : return type_ != TSortType.PARTIAL; : } nit: oneline? -- To view, visit http://gerrit.cloudera.org:8080/7267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Mostafa Mokhtar <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zach Amsden <[email protected]> Gerrit-HasComments: Yes
