Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8196 )

Change subject: IMPALA-4236: Codegen CopyRows() for select nodes
......................................................................


Patch Set 6:

(2 comments)

Your code looks good but I'd really like to simplify this code further so that 
it's easier to understand in future.

http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node.cc
File be/src/exec/select-node.cc:

http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node.cc@91
PS6, Line 91:   if (ReachedLimit() || (child_row_idx_ == 
child_row_batch_->num_rows() && child_eos_)) {
This is still a bit wonky (not your change but we should try to leave this in a 
good state) - it's not clear when this will actually be taken.


The invariant around GetNext() is that callers shouldn't call GetNext() after 
'eos' is true. Given that invariant, I don't think it's possible that this is 
taken.

* ReachedLimit() should never be true here. limit_ == 0 should be caught by the 
frontend and converted into an EmptySetNode. If limit_ > 0, then we should have 
set *eos after copying the rows below.
* If we hit the end of the input below on line 120, we should already have set 
*eos.


http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node.cc@107
PS6, Line 107:       child_row_batch_->TransferResourceOwnership(row_batch);
Sorry for the additional churn but I think the resource transfer is overly 
complicated. This actually will often attach the resources to the batch after 
the last batch that references them. Instead it should attach resources to the 
batch as soon as the last row from the input batch is copied to the output.

I'm thinking removing the transfer and AtCapacity() check here and changing the 
logic below to something like:

  if (*eos || child_row_idx_ == child_row_batch_->num_rows()) {
    child_row_batch_->TransferResourceOwnership(row_batch);
    child_row_batch_->Reset();
  }
  if (*eos || row_batch->AtCapacity()) return Status::OK();

Or maybe it would then be clearer as a do/while() loop.

This would require some further changes, since it would reset 
child_row_batch_->num_rows() to 0 - but maybe child_row_batch_->num_rows() == 0 
should actually just signal that the batch has been consumed.



--
To view, visit http://gerrit.cloudera.org:8080/8196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie0d496d004418468e16b6f564f90f45ebbf87c1e
Gerrit-Change-Number: 8196
Gerrit-PatchSet: 6
Gerrit-Owner: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Mon, 09 Oct 2017 18:56:37 +0000
Gerrit-HasComments: Yes

Reply via email to