Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/8226 )
Change subject: IMPALA-2758: Change BufferedTupleStream::GetRows to returning multi batches ...................................................................... Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/8226/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8226/1//COMMIT_MSG@14 PS1, Line 14: Any testing? http://gerrit.cloudera.org:8080/#/c/8226/1/be/src/exec/partitioned-hash-join-node.cc File be/src/exec/partitioned-hash-join-node.cc: http://gerrit.cloudera.org:8080/#/c/8226/1/be/src/exec/partitioned-hash-join-node.cc@950 PS1, Line 950: break; I think this changes the behavior here - this will only break out of the FOREACH_ROW loop, but what we want is to stop iterating over the entire set of build rows (that was previously one batch, but is now the same set of rows split into multiple batches). http://gerrit.cloudera.org:8080/#/c/8226/1/be/src/runtime/buffered-tuple-stream.h File be/src/runtime/buffered-tuple-stream.h: http://gerrit.cloudera.org:8080/#/c/8226/1/be/src/runtime/buffered-tuple-stream.h@339 PS1, Line 339: vector<std::unique_ptr<RowBatch>>* batch Can you move this to be after 'batch_size'? We generally keep out parameters as the trailing parameters for clarity. -- To view, visit http://gerrit.cloudera.org:8080/8226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 Gerrit-Change-Number: 8226 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Comment-Date: Fri, 06 Oct 2017 18:24:58 +0000 Gerrit-HasComments: Yes
