Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-3586: Implement union passthrough
......................................................................


Patch Set 19:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/5816/19/be/src/exec/union-node.cc
File be/src/exec/union-node.cc:

Line 111:   DCHECK_LT(child_idx_, children_.size());
> this DCHECK should come before any use of child_idx_ since if it's violated
Done


Line 114:     
DCHECK(child(child_idx_)->row_desc().LayoutEquals(row_batch->row_desc()));
> what do these dchecks have to do with child_eos_? i.e. shouldn't they true 
Yes, They should be true. I think the idea was to call this once per child. I 
actually think it's safer to call this every time because currently it does not 
get called for the first child (which gets opened in Open()).

Changed it so that it gets called every time.


PS19, Line 132: child_idx_++
> ++child_idx_ per our style.
Done


PS19, Line 147: child_idx_ < children_.size()
> Would make more sense as HasMoreMaterialized()
Done. Even though this does an extra unnecessary check 
(first_materialized_child_idx_ <= child_idx_), this is cleaner.


PS19, Line 150: children
> children that need materialization
Done


PS19, Line 153: Row
> Child row batch
Done


Line 168:       // There are only 3 ways of getting out of this loop:
> it would be helpful to concisely say what this loop is responsible for give
Done


Line 193:         COUNTER_SET(rows_returned_counter_, num_rows_returned_);
> should we do: child_batch_.reset() here? we shouldn't ever reference it aga
I see. So the invariant should be if child_batch references something (it's not 
reset), then it corresponds to an open materialized child.

Added reset() here. By the way, reset() will be called twice, once here and 
once in Close().


PS19, Line 200:     // We end up here iff one of the following is true (or 
both).
              :     // 1. We are done consuming all batches from the current 
child and we need to move on
              :     //    to the next child.
              :     // 2. The output row batch is at capacity.
> this could be summarized for a quicker read:
Changed the structure along the lines of what you suggested, and cleaned this 
up.


PS19, Line 204:  In other words, the only way to not end up here if we entered 
the outer loop is if
              :     // the limit is reached.
> rather than say that in a comment, how about:
Done


Line 263:       (!HasMorePassthrough() && !HasMoreMaterialized() && 
!HasMoreConst(state));
> shouldn't we put lines 254-263 inside a do-while loop, so that we don't ret
Unfortunately that wouldn't work.

For passthrough, we simply forward as they are. Even if they are partially 
filled, we just forward them. That's kind of the point of this patch.

Maybe we could put materialized and const in a while loop, but that would make 
the code messier and the benefit would be minimal (up to 1 fewer partially 
filled row batch per union node per entire cluster).


http://gerrit.cloudera.org:8080/#/c/5816/19/be/src/exec/union-node.h
File be/src/exec/union-node.h:

PS19, Line 115: child_idx
> 'child_idx'
Done


Line 129:     return child_idx_ >= first_materialized_child_idx_ && child_idx_ 
< children_.size();
> this suggestion is okay to ignore, but i find these kind of conditions easi
Reordered it as you suggested. (By the way, in Python you can actually write "w 
< x <= y < z")

first_materialized_child_idx_ points to the first materialized child. So 
everything after it is materialized. that means if child_idx_ is greater or 
equal to materialized_child_idx_ then it's materialized.

It's the exact opposite of passthrough (where we check if child_idx_ is less 
than).


http://gerrit.cloudera.org:8080/#/c/5816/19/be/src/exprs/slot-ref.cc
File be/src/exprs/slot-ref.cc:

Line 94:     DCHECK(false);
> this is confusing. either it's an invariant or it's not. it's even more con
Removed.


-- 
To view, visit http://gerrit.cloudera.org:8080/5816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia8f6d5062724ba5b78174c3227a7a796d10d8416
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to