Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3629: Codegen TransferScratchTuples() in 
hdfs-parquet-scanner
......................................................................


Patch Set 3:

(7 comments)

I have done some benchmarking - I can construct a very contrived query (eg. a 
single scan over a parquet table with a huge where clause) that demonstrates 
that the changes do in fact result in a speedup.

However, running a real workload like tpch (either locally or on the 16 node) 
doesn't really show much improvement. I'm not sure if this is because tpch has 
too much else going on that drowns out the effects of this change, or if I'm 
doing something wrong.

Is there specific performance numbers your looking for, and do you have a sense 
from previous, similar changes how much of a speedup this should result in?

http://gerrit.cloudera.org:8080/#/c/3774/2//COMMIT_MSG
Commit Message:

Line 7: IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner
> We're really codegening TransferScratchTuples() if we make the additional c
Done


PS2, Line 12: EvalConjunct
> nit: EvalConjuncts
Done


http://gerrit.cloudera.org:8080/#/c/3774/2/be/src/exec/hdfs-parquet-scanner-ir.cc
File be/src/exec/hdfs-parquet-scanner-ir.cc:

PS2, Line 47:  filters/c
> We know the value of tuple_size at prepare time so can optimise out this br
Done


PS2, Line 72: 
> This has_conjuncts check was for perf to avoid calling into EvalConjuncts()
Done


Line 86
> It would be better to avoid cross-compiling this resource transfer code. I 
Done


http://gerrit.cloudera.org:8080/#/c/3774/2/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 386: 
> Swap this with the SCOPED_TIMER above - this order doesn't make sense.
Done


Line 389:     return Status("Failed to get codegen");
> Indent 4 spaces.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to