Tim Armstrong has posted comments on this change. Change subject: IMPALA-4883: Implement Codegen for Union ......................................................................
Patch Set 2: (4 comments) The code looks generally good. I had two high-level comments: we should make sure that the output in the profile makes sense, and we should try to rearrange the inner perf-critical loop to squeeze out more perf (I think we can do much better than 50%). http://gerrit.cloudera.org:8080/#/c/6459/2/be/src/exec/union-node.cc File be/src/exec/union-node.cc: Line 26: unnecessary new line Line 110: if (!codegen_status.ok()) continue; We shouldn't drop this status - at the least we should display it in the profile. It's also reasonable to abort codegen for the node entirely if it simplifies things - the cases when codegen fails are mainly corner cases like CHAR. Line 129: runtime_profile()->AddCodegenMsg(codegen_status.ok(), codegen_status); How does this show up in the profile? If we're going to show the results of codegening each union branch we should make sure that it is displayed in a reasonable way - e.g. mention the child number. Line 184: if (UNLIKELY(codegend_union_materialize_exprs_fns_.size() == 0 || It would be great if we could avoid all of this per-row overhead - I'm counting a function call, three condition checks and an indirect function call via the function pointer before it does any actual work for the row, and that's just including this function, not the caller. We usually try to push loops like this down into the codegen'd code. I think in this case the loop can be tightened up a lot and pushed down into cross-compiled code. If you have a for loop over (0 ..rows left in child batch), I think the only thing that really needs to be checked per row is the capacity of the destination batch - the other loop conditions can be converted into an iteration count before entering the loop. HdfsParquetScanner::ProcessScratchBatch() is an example where all of this is pretty optimised. I don't think that fully optimising it should block this getting in since a good improvement already, but there's so much low-hanging fruit that we're missing out on a lot of the potential benefit of your work. Also if we leave code sitting around with inefficient patterns like this people tend to copy the bad patterns (e.g. the Kudu scan node inherited some bad patterns from the HBase scan node). -- To view, visit http://gerrit.cloudera.org:8080/6459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib4107d27582ff5416172810364a6e76d3d93c439 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Taras Bobrovytsky <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
