Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder

Patch Set 14:


Before I responded to individual comments I wanted to point out that the 
coupling is now one way - the builder never refers to PhjNode aside from 
metadata provided via the constructor (you can verify there are no references 
to partitioned-hash-join-node.h or PartitionedHashJoin node in the builder 

At this point it should only require minor changes to this hash join code to 
create the builder in a separate fragment *before* the join node, provided you 
have a 1:1 mapping from builder to node. 1:n broadcast joins with no spilling 
would require some more changes, but is a lot closer.

I chose this stopping point because there was a clear physical separation 
between the classes, even if the spilling algorithm has some logical coupling 
and assumes a 1:1 mapping from node to builder. 

Reducing the logical coupling would require significant changes to the spilling 
algorithm that is driven from PartitionedHashJoin::GetNext(), which I think 
should be deferred until there is some consensus about what spilling for 1:n 
broadcast joins with multithreading might look like.

File be/src/exec/partitioned-hash-join-builder.h:

Line 123:   bool HashTableStoresNulls() const;
> why is this here rather than the join node? especially given that the join 
Both classes already have their own hash table context.

Line 128:   inline const std::vector<bool>& is_not_distinct_from() const {
> and then should this be here? is it to be closer to being able to break the
There isn't a parent link - all the necessary metadata about the join is passed 
in when the builder is initialised.

To view, visit http://gerrit.cloudera.org:8080/3873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1e02ea9c7a7d1a0f373b11aa06c3237e1c7bd4cb
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to