Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides 
DataSinks
......................................................................


Patch Set 7:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 115:   template <bool ASYNC_BUILD>
why does this need to be a template?


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/data-sink.h
File be/src/exec/data-sink.h:

Line 40: /// Superclass of all data sinks.
i think it's time to provide this with a real class comment (what is the 
abstraction behind it, where is it used, etc.)


Line 47:   virtual std::string Name() = 0;
GetName(), otherwise it sounds like a getter


Line 51:   virtual Status Prepare(RuntimeState* state, MemTracker* mem_tracker);
explain param; also, why does it get created outside this class? possibly 
explain in class comment


Line 59:   virtual Status NextBatchToSend(RowBatch** batch);
i don't understand the desired control flow, please explain in class comment


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/hdfs-table-sink.h
File be/src/exec/hdfs-table-sink.h:

Line 131:   /// Prepares output_exprs and partition_key_exprs, and connects to 
HDFS.
leave blank line between function declaration and the comment for the next 
function


Line 142:   /// TODO: Move calls to functions that can fail in Close() to 
FlushFinal()
is this todo still appropriate?


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/nested-loop-join-builder.h
File be/src/exec/nested-loop-join-builder.h:

Line 25: /// Builder for the NestedLoopJoinNode operator.
explain builder (or use different term; you're basically just repeating the 
class name).

i should be able to get a rough understanding of what this class does and where 
it's being used from the class comment.


Line 32: class NestedLoopJoinBuilder : public DataSink {
fine to abbreviate to Nlj


Line 49:   /// Returns the next batch from 'build_batch_cache_'
don't reference internal state


Line 50:   inline RowBatch* GetNextBatchForBuild() { return 
build_batch_cache_.GetNextBatch(); }
this is going beyond DataSink functionality, and should be described in the 
class comment.


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/nested-loop-join-node.h
File be/src/exec/nested-loop-join-node.h:

Line 35: /// null-aware left anti-join.
missing reference to builder class


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

Line 196:   boost::scoped_ptr<RowBatch> row_batch_;
separate sink-related items visually w/ blank line


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to