Zach Amsden has posted comments on this change.

Change subject: IMPALA-5498: Support for partial sorts
......................................................................


Patch Set 3:

(6 comments)

Looks promising!

http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/exec/partial-sort-node.cc
File be/src/exec/partial-sort-node.cc:

Line 32:     sorter_(NULL),
This doesn't need to handle offset?


Line 127:     input_batch_index_ += num_processed;
Since this only requires a partial sort, you can exit early when the cumulative 
number of rows processed exceeds the limit.  This might end up sorting 
significantly fewer rows (at the cost of less well sorted data when a limit is 
present).


Line 140:   if (ReachedLimit()) {
I guess it is not clear to me why allowing a limit without an offset is useful. 
 If there is a limit, the total ordering of rows is undefined and there is no 
way to continue fetching them without also allowing an offset.  Maybe document 
this limitation?


http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/exec/partial-sort-node.h
File be/src/exec/partial-sort-node.h:

Line 58: 
Just for general niceness, can you also update be/src/exec/sort-node.h's 
private SortInput method to have WARN_UNUSED_RESULT?


http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

Line 1434:   DCHECK(unsorted_run_ != NULL);
Unnecessary - this is unconditionally dereferenced


http://gerrit.cloudera.org:8080/#/c/7267/3/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

Line 360:   // Not used with TSortType::PARTIAL.
Please ignore my earlier comments about offset handling (although a comment 
that offsets are not supported in exec/partion-sode-node.cc is still welcome.


-- 
To view, visit http://gerrit.cloudera.org:8080/7267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <zams...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to