Dan Hecht has posted comments on this change.

Change subject: IMPALA-4674: Part 3: fix null-aware anti join
......................................................................

Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7367/1//COMMIT_MSG
Commit Message:

PS1, Line 13: note
not

Line 25: Instead we just iterate over the rows of the stream.
are there join class comments that should be updated to explain this strategy?

http://gerrit.cloudera.org:8080/#/c/7367/1/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

Line 251: RETURN_IF_ERROR(null_aware_partition_->Spill(BufferedTupleStreamV2::UNPIN_ALL));
it's a bit confusing that we call Partition::Spill() when Partition::is_spilled() is already true. The comment here helps, but maybe something we can say in the Spill() function comment about this use of Spill()? do we do this elsewhere?

http://gerrit.cloudera.org:8080/#/c/7367/1/be/src/exec/partitioned-hash-join-node.h
File be/src/exec/partitioned-hash-join-node.h:

PS1, Line 95: /// Null aware anti-join (NAAJ) extends the above algorithm by accumulating rows with
            : /// NULLs into several different streams, which are processed in a separate step to
            : /// produce additional output rows. The NAAJ algorithm is documented in more detail in
            : /// header comments for the null aware functions and data structures.
any of this need updates (or the comments this references)?

--
To view, visit http://gerrit.cloudera.org:8080/7367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2e60eb4dd32bd287a31479a6232400df65964c1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-HasComments: Yes
