Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8196 )
Change subject: IMPALA-4236: Codegen CopyRows() for select nodes ...................................................................... Patch Set 1: (12 comments) http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@8 PS1, Line 8: Can you add a testing section? It would be good to mention what test coverage we have for this code path (it's certainly possible that we have some gaps). http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@9 PS1, Line 9: Preformance Performance http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@10 PS1, Line 10: perdicates predicates http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@11 PS1, Line 11: [column_name > value AND] Can you include a concrete example of the query to make it easier for others to reproduce? http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@18 PS1, Line 18: | Select Node | 12s385ms | 1m1s | 234ms | 797ms | Nice! http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node-ir.cc File be/src/exec/select-node-ir.cc: http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node-ir.cc@29 PS1, Line 29: child_row_batch_->num_rows() This probably doesn't matter, but you could save a few instructions by hoisting this out of the loop. E.g. int child_batch_num_rows = ...; Pretty sure the compiler can't infer that num_rows() is constant across loop iterations (since 'this' and 'this->child_row_batch_' could alias memory that's written in the loop so it will end up chasing the pointers on every iteration. http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node-ir.cc@41 PS1, Line 41: COUNTER_SET(rows_returned_counter_, num_rows_returned_); COUNTER_SET can be pretty expensive because of the atomic operation. We can actually move it out of this function entirely into the caller. http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc File be/src/exec/select-node.cc: http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@55 PS1, Line 55: NULL Use nullptr in new code (here and below). http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@65 PS1, Line 65: if (codegen_status.ok()) { Instead of this branch the control flow may be easier to follow if you factor out the CopyRows codegen logic into a separate Status-returning function that returns early on an error. This will be more true if you start checking FinalizeFunction() below. http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@71 PS1, Line 71: DCHECK(copy_rows_fn != NULL); We should probably check if this is NULL and fail codegen if so. I'm not sure if we do this everywhere in the code but we probably should be doing so (although it isn't super-important, since we shouldn't be failing verification in the first place). (The invariants around this function are a bit wonky at the moment since we don't really have good testing around this, but FinalizeFunction() is documented as returning NULL in some cases). http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@99 PS1, Line 99: bool copy_rows_success = false; Just declare it inside the loop where it's needed? When I see it declared up here I tend to think that the value lives across loop iterations. http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@114 PS1, Line 114: copy_rows_success = false; It's assigned on both branches so this isn't necessary. I think some people consider this good defensive coding practice but it seems unnecessary to me. -- To view, visit http://gerrit.cloudera.org:8080/8196 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie0d496d004418468e16b6f564f90f45ebbf87c1e Gerrit-Change-Number: 8196 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Mon, 02 Oct 2017 23:40:05 +0000 Gerrit-HasComments: Yes
