Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner ......................................................................
Patch Set 3: (7 comments) I have done some benchmarking - I can construct a very contrived query (eg. a single scan over a parquet table with a huge where clause) that demonstrates that the changes do in fact result in a speedup. However, running a real workload like tpch (either locally or on the 16 node) doesn't really show much improvement. I'm not sure if this is because tpch has too much else going on that drowns out the effects of this change, or if I'm doing something wrong. Is there specific performance numbers your looking for, and do you have a sense from previous, similar changes how much of a speedup this should result in? http://gerrit.cloudera.org:8080/#/c/3774/2//COMMIT_MSG Commit Message: Line 7: IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner > We're really codegening TransferScratchTuples() if we make the additional c Done PS2, Line 12: EvalConjunct > nit: EvalConjuncts Done http://gerrit.cloudera.org:8080/#/c/3774/2/be/src/exec/hdfs-parquet-scanner-ir.cc File be/src/exec/hdfs-parquet-scanner-ir.cc: PS2, Line 47: filters/c > We know the value of tuple_size at prepare time so can optimise out this br Done PS2, Line 72: > This has_conjuncts check was for perf to avoid calling into EvalConjuncts() Done Line 86 > It would be better to avoid cross-compiling this resource transfer code. I Done http://gerrit.cloudera.org:8080/#/c/3774/2/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 386: > Swap this with the SCOPED_TIMER above - this order doesn't make sense. Done Line 389: return Status("Failed to get codegen"); > Indent 4 spaces. Done -- To view, visit http://gerrit.cloudera.org:8080/3774 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8 Gerrit-PatchSet: 3 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
