Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2736: Basic column-wise slot materialization in Parquet 
scanner.
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/2779/3/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 1729: int HdfsParquetScanner::TransferScratchTuples(ScratchTupleBatch* 
scratch_batch) {
> Thanks for the suggestions, Tim. I rewrote this function to be more perform
I think the new version is just as readable, and it's way easier to reason 
about performance, so thumbs up from me. 

I'm not in favour of optimising to death, but I think it's good to write hot 
loops so that it's feasible to understand the performance characteristics of 
what is actually going to execute on the CPU.


http://gerrit.cloudera.org:8080/#/c/2779/3/be/src/util/rle-encoding.h
File be/src/util/rle-encoding.h:

Line 249:   // significantly better than UNLIKELY(literal_count_ == 0 && 
repeat_count_ == 0)
> Correct. I'm already working on batch-reading and caching the def/rep level
Nice.


Line 250:   if (repeat_count_ == 0) {
> Actually Mostafa tried (repeat_count_ & literal_count_) == 0 and it was sti
Do you mean (repeat_count_ | literal_count_) == 0? I'm pretty sure & is 
incorrect there: (repeat_count_ & literal_count_) is 0 whenever either count 
is 0, so the check passes even if only one of them is 0. Anyway, I think you 
have bigger fish to fry than this :)
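
To make the | versus & point concrete, here is a minimal sketch. The helper 
names are hypothetical and are not part of rle-encoding.h, and the counters 
are assumed to be unsigned 32-bit integers. It only illustrates the bitwise 
behaviour and says nothing about which form is faster.

```cpp
#include <cstdint>

// Hypothetical helper: true only when BOTH counters are zero.
// OR-ing leaves a set bit if either operand is non-zero.
constexpr bool BothZeroWithOr(uint32_t repeat_count, uint32_t literal_count) {
  return (repeat_count | literal_count) == 0;
}

// Hypothetical helper: the '&' variant. AND-ing yields zero whenever the
// operands share no set bits, which includes the case where only one is zero.
constexpr bool BothZeroWithAnd(uint32_t repeat_count, uint32_t literal_count) {
  return (repeat_count & literal_count) == 0;
}

// Both counters zero: the two forms agree.
static_assert(BothZeroWithOr(0, 0), "| form: both zero");
static_assert(BothZeroWithAnd(0, 0), "& form: both zero");

// Only one counter zero: the '|' form is false, the '&' form is wrongly true.
static_assert(!BothZeroWithOr(5, 0), "| form rejects a non-zero counter");
static_assert(BothZeroWithAnd(5, 0), "& form accepts it");

// Neither counter zero but no shared bits (0b0100 and 0b0011):
// the '&' form is still true.
static_assert(!BothZeroWithOr(4, 3), "| form rejects two non-zero counters");
static_assert(BothZeroWithAnd(4, 3), "& form accepts them");
```

The static_asserts are checked at compile time, so the snippet demonstrates 
the mismatch without needing to be run.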


-- 
To view, visit http://gerrit.cloudera.org:8080/2779
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I72a613fa805c542e39df20588fb25c57b5f139aa
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Alex Behm <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Skye Wanderman-Milne <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
