Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides 
DataSinks
......................................................................


Patch Set 11:

(37 comments)

Addressed most of the comments. There is still an unresolved question around 
whether the DataSink interface should allocate row batches. 

I spoke to Marcel about it and we didn't reach a conclusion about which 
solution would be better, so I've left that unchanged in this patch but cleaned 
up the other issues.

http://gerrit.cloudera.org:8080/#/c/3282/8/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 23: #include "exec/data-sink.h"
> forward declare DataSink
This became necessary in a later iteration of the patch, since we declare a 
subclass of DataSink in the header.


Line 98:   /// Prepare() work, e.g. codegen.
> Nit, but I don't think we construct the "build side". The build side is an 
Done. Renamed ConstructBuildSide to ConstructJoinBuild


Line 104:   /// Subclasses should close any other structures and then call
> ... the Open() and build ...
Not quite sure what you were suggesting, tried to make the wording a bit better.


Line 105:   /// BlockingJoinNode::Close().
> permit (plural)
Done


Line 106:   virtual void Close(RuntimeState* state);
> Suggest: If the build_sink is non-NULL, uses it to construct the join build
Done


PS8, Line 109: 
Changed to JoinBuildSink to be more accurate.


Line 112:   TJoinOp::type join_op_;
> ... to construct the join build via a data sink.
Done


Line 118: 
> left child (probe side).
Done


Line 123:   TupleRow* current_probe_row_;  // The row currently being probed
> typo: extra '.'
Done


Line 196: 
> nit: I don't think we use the DataSink interface to construct the join buil
Done


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 42: /// BlockingJoinNode. Reset() is added to support subplans and more 
efficient allocation
> remove "is added"
Done


Line 43: /// of RowBatch objects is enabled with the GetNextBatchForBuild() 
interface.
> remove "is enabled"
Reworded to avoid it.


Line 51:   /// Frees resources associated with cache row batches.
> cached row batches
Done


Line 52:   /// Subclasses must call JoinBuildSink::Close().
> is the order of Close() calls important? If so, mention which Close() shoul
Not really, but documented it for consistency.


Line 55:   /// Returns the next build batch that should be filled and passed to 
AddBuildBatch()
> Returns an empty build batch that should be filled and passed to Send().
Done


Line 56:   /// or Send(). For join types that accumulates batches, this helps 
avoid additional
> accumulate
Done


Line 60:   inline RowBatch* GetNextBatchForBuild() { return 
build_batch_cache_.GetNextBatch(); }
> GetEmptyBatch() or GetNextEmptyBatch()? The important part seems to be that
Renamed to GetNextEmptyBatch().

Still looking at the alternative with the DataSink interface and going 
back-and-forth. I think adding it to the DataSink interface results in less 
code, but prevents some usage patterns of the DataSink interface that may make 
sense in the future. E.g. if the caller of the DataSink methods wants to have 
multiple batches in flight - maybe it's pulling data from multiple queues?


Line 70:   const bool using_builder_interface_;
> Isn't the accumulates_batches_ flag sufficient? The comment on accumulates_
No, they're different. The DataSink interface doesn't have a 
GetNextBatchForBuild() method, so we can't assume that the DataSink client 
allocated a batch in that way - it can pass in an arbitrary row batch. This 
flag implies that we can assume that the caller allocated the batch from the 
cache.

E.g. for PartitionedHashJoinNode, using_builder_interface_ will be true, but 
accumulates_batches_ will be false, since the contents of the batches are 
always copied.


PS11, Line 70: using_builder_interface_
I renamed this to using_join_build_interface for consistency.


Line 73:   /// If true, the caller should call NextBuildBatch() to get an empty 
batch before every
Updated this comment to reflect that only BlockingJoinNode can call 
GetNextEmptyBatch().


Line 76:   const bool accumulates_batches_;
> Couldn't this be done by making GetNextBatchForBuild() virtual? My thinking
I did that in an earlier version of the patch. The concern with that approach 
was that a subplan-specific optimisation was complicating the interface for all 
data sinks, most of which can never appear in a subplan.

The virtue of the current approach is that the complication for this 
optimisation is isolated to blocking-join-node.{cc,h}


Line 150:   virtual Status ConstructBuildSide(RuntimeState* state) = 0;
> Terminology nit that is not your fault: To me "Build Side" means "the exec 
Did this.


Line 154:   /// If build_sink is non-NULL, the sink interface will be used to 
construct the build
> If build_sink is non-NULL, uses it to construct the build side.
Done


http://gerrit.cloudera.org:8080/#/c/3282/8/be/src/exec/data-sink.cc
File be/src/exec/data-sink.cc:

Line 166
> single line
This was removed in a later version of the patch.


http://gerrit.cloudera.org:8080/#/c/3282/8/be/src/exec/data-sink.h
File be/src/exec/data-sink.h:

Line 71:   virtual Status FlushFinal(RuntimeState* state) = 0;
> The important piece seems to be that the returned batch is empty and is exp
The method was removed in the later patchset. Will do this if we decide to 
revert to this approach.


Line 104:   /// Owned by the RuntimeState.
> const methods
Done


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/hash-join-node.cc
File be/src/exec/hash-join-node.cc:

Line 207:   RETURN_IF_ERROR(BlockingJoinNode::GetFirstProbeRow(state));
> Seems a little weird that we prepare for probe after getting the first prob
ResetForProbe() looks up the first probe row in the hash table for GetNext(), 
so it needs to be called after GetFirstProbeRow().

This keeps the flow more analogous to PHJ (although ResetForProbe() is called 
there once per batch).


http://gerrit.cloudera.org:8080/#/c/3282/8/be/src/exec/hash-join-node.h
File be/src/exec/hash-join-node.h:

Line 70:   /// holds everything referenced from build side
> Can you spell out 'everything' a little more?
I don't actually know. This was a mechanical refactor to move this from 
BlockingJoinNode since it's not used by any other subclass. Hoping to delete 
this module soon.


Line 120:   void ResetForProbe();
> PrepareForProbe()?
Renamed to InitGetNext().


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/hash-join-node.h
File be/src/exec/hash-join-node.h:

Line 70:   /// holds everything referenced from build side
> can you spell out 'everything' a little more? is it memory referenced from 
Replied on the other patchset, but this was just a mechanical refactor to move 
it from BlockingJoinNode. Doesn't seem worth it to clean this up.


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/hdfs-table-writer.cc
File be/src/exec/hdfs-table-writer.cc:

Line 20: #include "runtime/row-batch.h"
> remove?
Done. Maybe was an editing error.


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/nested-loop-join-builder.h
File be/src/exec/nested-loop-join-builder.h:

Line 24: /// Builder for the NestedLoopJoinNode operator that accumulates the 
build-side rows
> remove "operator"
Done


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/nested-loop-join-node.cc
File be/src/exec/nested-loop-join-node.cc:

Line 59:   SCOPED_TIMER(runtime_profile_->total_time_counter());
> won't this double count the time spent in BlockingJoinNode::Open()? I think
The general pattern (E.g. ExecNode::Open(), BlockingJoinNode::Open()) is that 
the abstract base classes don't use timers, so I think this is right as-is.


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/nested-loop-join-node.h
File be/src/exec/nested-loop-join-node.h:

Line 59:   boost::scoped_ptr<NljBuilder> build_side_;
> build_sink_?
Renamed to builder_ to match the class name.


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 242:   SCOPED_TIMER(runtime_profile_->total_time_counter());
> I think the counter should start after BlockingJoinNode::Open()
Why is that? It seems strange not to include the time spent in the parent class.


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/row-batch-cache.h
File be/src/exec/row-batch-cache.h:

Line 61:     row_batches_.clear(); // unique_ptr automatically calls all 
destructors.
> Add a DCHECK in the RowBatchCache d'tor to make sure row_batches_ is empty?
Done


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/util/stopwatch.h
File be/src/util/stopwatch.h:

Line 241:   ScopedStopWatch(T* sw, bool enabled = true) :
> The enabled flag seems awkward because it's not clear why anyone would use 
If you used ScopedStopWatch directly it would still be useful in any situation 
where you want to have a scoped timer but also make a decision about whether to 
enable a timer at runtime (or at compile time with a template argument). 
Otherwise you have to mess around ifdefs or very carefully turn the timer on 
and off at the right places.

You could mess about with template types and have a DummyScopedStopWatch type, 
but that seems clunkier and less general (since you can't make a runtime 
decision to enable the timer then).


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 11
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to