Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8196 )

Change subject: IMPALA-4236: Codegen CopyRows() for select nodes
......................................................................


Patch Set 1:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@8
PS1, Line 8:
Can you add a testing section? It would be good to mention what test coverage 
we have for this code path (it's certainly possible that we have some gaps).


http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@9
PS1, Line 9: Preformance
Performance


http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@10
PS1, Line 10: perdicates
predicates


http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@11
PS1, Line 11: [column_name > value AND]
Can you include a concrete example of the query to make it easier for others to 
reproduce?


http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@18
PS1, Line 18: | Select Node  | 12s385ms   | 1m1s        | 234ms      | 797ms    
   |
Nice!


http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node-ir.cc
File be/src/exec/select-node-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node-ir.cc@29
PS1, Line 29: child_row_batch_->num_rows()
This probably doesn't matter, but you could save a few instructions by hoisting 
this out of the loop. E.g.

  int child_batch_num_rows = ...;

Pretty sure the compiler can't infer that num_rows() is constant across loop 
iterations (since 'this' and 'this->child_row_batch_' could alias memory that's 
written in the loop so it will end up chasing the pointers on every iteration.


http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node-ir.cc@41
PS1, Line 41:       COUNTER_SET(rows_returned_counter_, num_rows_returned_);
COUNTER_SET can be pretty expensive because of the atomic operation. We can 
actually move it out of this function entirely into the caller.


http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc
File be/src/exec/select-node.cc:

http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@55
PS1, Line 55: NULL
Use nullptr in new code (here and below).


http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@65
PS1, Line 65:   if (codegen_status.ok()) {
Instead of this branch the control flow may be easier to follow if you factor 
out the CopyRows codegen logic into a separate Status-returning function that 
returns early on an error.

This will be more true if you start checking FinalizeFunction() below.


http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@71
PS1, Line 71:     DCHECK(copy_rows_fn != NULL);
We should probably check if this is NULL and fail codegen if so. I'm not sure 
if we do this everywhere in the code but we probably should be doing so 
(although it isn't super-important, since we shouldn't be failing verification 
in the first place).

(The invariants around this function are a bit wonky at the moment since we 
don't really have good testing around this, but FinalizeFunction() is 
documented as returning NULL in some cases).


http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@99
PS1, Line 99:   bool copy_rows_success = false;
Just declare it inside the loop where it's needed? When I see it declared up 
here I tend to think that the value lives across loop iterations.


http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@114
PS1, Line 114:     copy_rows_success = false;
It's assigned on both branches so this isn't necessary.

I think some people consider this good defensive coding practice but it seems 
unnecessary to me.



--
To view, visit http://gerrit.cloudera.org:8080/8196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie0d496d004418468e16b6f564f90f45ebbf87c1e
Gerrit-Change-Number: 8196
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Mon, 02 Oct 2017 23:40:05 +0000
Gerrit-HasComments: Yes

Reply via email to