Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder
......................................................................


Patch Set 18:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/3873/18/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

PS18, Line 326: we'd like all partitions to
> ... all partitions ..
Done


PS18, Line 327: be
> are 
Done


Line 353:   // building hash tables because allocating probe buffers may cause 
some more partitions
> duplicated sentences
Done


PS18, Line 797: filters_.size() > 0)
> this doesn't seem to exactly match the logic at line 166 (that doesn't care
I simplified the code around line 166 by calculating 
bool build_filters; outside of the branches. Should be equivalent and easier to 
reason about.


http://gerrit.cloudera.org:8080/#/c/3873/18/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

Line 276:   /// Returns non-ok status if we couldn't spill a partition.
> nit: weird line breaking.
Done


PS18, Line 280: should have all of their build rows
> what is this trying to say? that we're done partitioning the build?
Yeah, clarified in terms of FlushFinal().


PS18, Line 286: no buffers in either the build or probe stream.
> clarify whether this means neither has buffers or at least one does not hav
Done


PS18, Line 287: in_mem
> are these states actually represented in code? (if not, why the underscore?
Done


PS18, Line 291: probe_buffers_for_spilled_partitions_
> what's that?
Ah, stale comment from earlier version of the code. Updated it.


PS18, Line 301: probe_buffers_for_spilled_partitions_
> same
Ah, this method doesn't even exist and more. Removed up updated the 
CloseAndDeletePartitions() comment (since it does that work now).;


PS18, Line 349: PartitionedHashJoinNode
> who owns it?
Clarified. Also added a TODO to indicate this will need to be updated for the 
spilling broadcast join case when we come to design a solution for that.


PS18, Line 427: so that the
              :   /// initial probe buffer is allocated
> this doesn't really explain why the builder has to do it.  Would be good to
Done


Line 432:   /// phase of the algorithm without spilling more partitions.
> this hints at it, so maybe move this up and reword to be more explicit.
Done


http://gerrit.cloudera.org:8080/#/c/3873/18/be/src/exec/partitioned-hash-join-node-ir.cc
File be/src/exec/partitioned-hash-join-node-ir.cc:

PS18, Line 285: *status
> if we can now do this, can we also just make NextProbeRow() and others retu
IIRC I started to do that refactoring at one point but quickly realised that it 
didn't work.

NextProbeRow() returns false both when status is an error and when it does not 
have a next row to move to.


Line 293:       if (LIKELY(hash_tbl != NULL)) {
> is probe_partition NULL in this case?
Yes it is. I added additional DCHECKs to PrepareForProbe() to validate the 
consistency of the partition state. I think adding additional DCHECKs here 
would obscure the logic somewhat. I thought it was more helpful below since the 
logic relied more directly on the correlation between the different partitions.


http://gerrit.cloudera.org:8080/#/c/3873/18/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 47:     "the memory limit may help this query to complete successfully.";
> weird line breakages
Done


http://gerrit.cloudera.org:8080/#/c/3873/18/be/src/exec/partitioned-hash-join-node.h
File be/src/exec/partitioned-hash-join-node.h:

Line 404:   /// builder_->hash_partitions. This is non-empty only in the 
PARTITIONING_PROBE or
> underscore, right?
I guess the accessor is hash_partitions() so changed it to that.


PS18, Line 406: a
> and?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1e02ea9c7a7d1a0f373b11aa06c3237e1c7bd4cb
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to