Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4883: Union Codegen
......................................................................

Patch Set 4:

(2 comments)

Hmm, I guess the per-row overhead is probably not as significant when we're returning a bunch of columns. There might be a more pronounced effect if we're just returning a handful of columns. I think we can still squeeze some more cycles out of this with minimal effort, but if we stop seeing a measurable improvement for a query that returns a smaller number of columns, we could consider stopping then.

http://gerrit.cloudera.org:8080/#/c/6459/4/be/src/exec/union-node-ir.cc
File be/src/exec/union-node-ir.cc:

Line 28: while (!dst_batch->AtCapacity() && child_row_idx_ < child_batch_->num_rows()) {
The remaining optimisation is to pull all references to member variables out of the loop, i.e. child_row_idx_, child_batch_, child_expr_lists_ and tuple_desc_->byte_size(). This will reduce the number of loads and stores quite a bit. E.g.

  int child_row_idx = child_row_idx_;
  int tuple_byte_size = tuple_desc_->byte_size();
  while (...) {
  }
  child_row_idx_ = child_row_idx;

Currently it will do a load and a store to variables like 'child_row_idx_' via the 'this' pointer on every loop iteration. The compiler could hoist those automatically if it could deduce that the values aren't modified via a different pointer, but I don't think that's deducible in this case, because the compiler has to generate code that's "correct" even in a weird case like 'this' and *tuple_buf pointing to the same memory.

Line 34: if (ReachedLimit()) break;
We can avoid checking the limit for each row if we check it once at the end and truncate the batch using RowBatch::set_num_rows. E.g. see SortNode::GetNext().
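
To make both suggestions concrete, here is a rough, self-contained sketch. It is not the Impala code: ToyBatch and ToyUnionNode are hypothetical stand-ins, rows are plain integers rather than tuples, and a negative limit is assumed to mean "no limit". It only illustrates the shape of the loop: member state is copied into locals before the loop and written back once afterwards, and the limit is applied once per batch by truncating with set_num_rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a row batch: a growable list of rows with a capacity.
struct ToyBatch {
  explicit ToyBatch(int cap) : capacity(cap) { assert(cap >= 0); }
  bool AtCapacity() const { return num_rows() >= capacity; }
  int num_rows() const { return static_cast<int>(rows.size()); }
  // Drops rows from the end so that exactly n remain.
  void set_num_rows(int n) { rows.resize(static_cast<std::size_t>(n)); }
  std::vector<int64_t> rows;
  int capacity;
};

// Hypothetical stand-in for the union node; a negative limit means "no limit".
class ToyUnionNode {
 public:
  ToyUnionNode(const std::vector<int64_t>* child_batch, int64_t limit)
      : child_batch_(child_batch), limit_(limit) {}

  void MaterializeBatch(ToyBatch* dst) {
    // Copy member state into locals once, so the loop body does not have to
    // reload and store it through 'this' on every iteration.
    const std::vector<int64_t>& child_rows = *child_batch_;
    const int child_num_rows = static_cast<int>(child_rows.size());
    int child_row_idx = child_row_idx_;
    const int rows_before = dst->num_rows();

    // Note: no per-row limit check inside the loop.
    while (!dst->AtCapacity() && child_row_idx < child_num_rows) {
      dst->rows.push_back(child_rows[child_row_idx]);
      ++child_row_idx;
    }
    // Write the cursor back to the member exactly once.
    child_row_idx_ = child_row_idx;

    // Apply the limit once per batch: truncate any rows that overshoot it.
    int64_t added = dst->num_rows() - rows_before;
    if (limit_ >= 0 && num_rows_returned_ + added > limit_) {
      const int64_t excess = num_rows_returned_ + added - limit_;
      dst->set_num_rows(dst->num_rows() - static_cast<int>(excess));
      added -= excess;
    }
    num_rows_returned_ += added;
  }

  bool ReachedLimit() const {
    return limit_ >= 0 && num_rows_returned_ >= limit_;
  }
  int64_t num_rows_returned() const { return num_rows_returned_; }

 private:
  const std::vector<int64_t>* child_batch_;
  int64_t limit_;
  int child_row_idx_ = 0;
  int64_t num_rows_returned_ = 0;
};
```

In this sketch the child cursor may end up past the rows that were truncated. That is harmless here because no further rows are returned once the limit is reached, but real code would need to confirm the same holds for its caller.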

--
To view, visit http://gerrit.cloudera.org:8080/6459
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4107d27582ff5416172810364a6e76d3d93c439
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
