Alex Behm has posted comments on this change. Change subject: IMPALA-3586: Implement union passthrough ......................................................................
Patch Set 16: (9 comments) http://gerrit.cloudera.org:8080/#/c/5816/14/be/src/exec/union-node.cc File be/src/exec/union-node.cc: Line 214: COUNTER_SET(rows_returned_counter_, num_rows_returned_); > Done. So we don't need query maintenance at all? We do need it, but only occasionally. It's already called at the beginning of GetNext(), so we don't need it here in addition. Line 286: } > Not sure if that makes sense. The DCHECK would only be true on the first ca Ahh right. I was trying to add some DCHECKs to assert child_eos_ is in the expected state because that was a little tricky to understand. I suspect my comment is no longer relevant after your new changes. http://gerrit.cloudera.org:8080/#/c/5816/16/be/src/exec/union-node.cc File be/src/exec/union-node.cc: Line 104: // Figure out which the sources from which we will need to fetch rows. This is being Figure out which -> Determine Line 261: Status UnionNode::GetNext(RuntimeState* state, RowBatch* row_batch, bool* eos) { This function looks so much better! http://gerrit.cloudera.org:8080/#/c/5816/16/be/src/exec/union-node.h File be/src/exec/union-node.h: Line 91: bool const_todo_; Minor naming thing, I don't feel too strongly but 'todo' seems unusual. Alternative: has_more_const_ has_more_passthrough_ has_more_materialied_ Line 107: /// call on the child. Sets "passthrough_todo_" to false when all passthrough children we typically use single quotes Line 109: Status GetNextPassThrough(RuntimeState* state, RowBatch* row_batch); I liked your original idea of passing a 'has_more' output parameter better. Yes, we'll always pass &passthrough_todo_ in here, but function params are easier to read and reason about than side-effects. Better to avoid side effects if we can. http://gerrit.cloudera.org:8080/#/c/5816/16/fe/src/main/java/org/apache/impala/planner/UnionNode.java File fe/src/main/java/org/apache/impala/planner/UnionNode.java: Line 215: isChildPassthrough_.clear(); PlanNode.init() generally needs to be idempotent because it could be called multiple times on the same PlanNode, e.g., once during single node planning and again later during distributed planning. Pretty sure that's the case for UnionNode. That means you could potentially end up in a weird state because the reordering does not consider the unionResultExprs_. Line 221: // Order the children, such that all passthrough children come before the move this reordering into a private helper function -- To view, visit http://gerrit.cloudera.org:8080/5816 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia8f6d5062724ba5b78174c3227a7a796d10d8416 Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Taras Bobrovytsky <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
