Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks ......................................................................
Patch Set 7: (13 comments) http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/blocking-join-node.h File be/src/exec/blocking-join-node.h: Line 115: template <bool ASYNC_BUILD> why does this need to be a template? http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/data-sink.h File be/src/exec/data-sink.h: Line 40: /// Superclass of all data sinks. i think it's time to provide this with a real class comment (what is the abstraction behind it, where is it used, etc.) Line 47: virtual std::string Name() = 0; GetName(), otherwise it sounds like a getter Line 51: virtual Status Prepare(RuntimeState* state, MemTracker* mem_tracker); explain param; also, why does it get created outside this class? possibly explain in class comment Line 59: virtual Status NextBatchToSend(RowBatch** batch); i don't understand the desired control flow, please explain in class comment http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/hdfs-table-sink.h File be/src/exec/hdfs-table-sink.h: Line 131: /// Prepares output_exprs and partition_key_exprs, and connects to HDFS. leave blank line between function declaration and the comment for the next function Line 142: /// TODO: Move calls to functions that can fail in Close() to FlushFinal() is this todo still appropriate? http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/nested-loop-join-builder.h File be/src/exec/nested-loop-join-builder.h: Line 25: /// Builder for the NestedLoopJoinNode operator. explain builder (or use different term; you're basically just repeating the class name). i should be able to get a rough understanding of what this class does and where it's being used from the class comment. Line 32: class NestedLoopJoinBuilder : public DataSink { fine to abbreviate to Nlj Line 49: /// Returns the next batch from 'build_batch_cache_' don't reference internal state Line 50: inline RowBatch* GetNextBatchForBuild() { return build_batch_cache_.GetNextBatch(); } this is going beyond DataSink functionality, and should be described in the class comment. http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/nested-loop-join-node.h File be/src/exec/nested-loop-join-node.h: Line 35: /// null-aware left anti-join. missing reference to builder class http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/runtime/plan-fragment-executor.h File be/src/runtime/plan-fragment-executor.h: Line 196: boost::scoped_ptr<RowBatch> row_batch_; separate sink-related items visually w/ blank line -- To view, visit http://gerrit.cloudera.org:8080/3282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475 Gerrit-PatchSet: 7 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
