Tim Armstrong has posted comments on this change.

Change subject: Optimized ReadValueBatch() for Parquet scalar column readers.
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/2843/2/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 301:   uint8_t* cached_levels_;
> Good question. Had thought about it and opted for the safer solution. For l
In general, it's best not to add more untracked memory.

It's only 1024 bytes per column, which is small compared to other per-column
overhead like dictionaries. So if there's a measurable perf benefit, it's
probably ok.

I'm ok with the MemPool approach too.
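
To make the tracked-vs-untracked distinction concrete, here is a minimal 
sketch. SimpleMemTracker, SimpleMemPool and LevelCache are hypothetical 
stand-ins written for this example, not Impala's actual MemTracker/MemPool 
API. The only point is that a buffer taken from a pool shows up in the 
tracker's accounting, while a bare new/malloc would not.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for a memory tracker: it only counts bytes.
class SimpleMemTracker {
 public:
  void Consume(int64_t bytes) { consumed_ += bytes; }
  void Release(int64_t bytes) { consumed_ -= bytes; }
  int64_t consumption() const { return consumed_; }

 private:
  int64_t consumed_ = 0;
};

// Hypothetical stand-in for a pool that reports every allocation to a tracker.
class SimpleMemPool {
 public:
  explicit SimpleMemPool(SimpleMemTracker* tracker) : tracker_(tracker) {}
  SimpleMemPool(const SimpleMemPool&) = delete;
  SimpleMemPool& operator=(const SimpleMemPool&) = delete;
  ~SimpleMemPool() { FreeAll(); }

  // Allocates 'size' bytes owned by the pool and accounts for them.
  uint8_t* Allocate(int64_t size) {
    chunks_.push_back(std::unique_ptr<uint8_t[]>(new uint8_t[size]));
    total_bytes_ += size;
    tracker_->Consume(size);
    return chunks_.back().get();
  }

  // Frees all chunks and returns their bytes to the tracker.
  void FreeAll() {
    tracker_->Release(total_bytes_);
    total_bytes_ = 0;
    chunks_.clear();
  }

 private:
  SimpleMemTracker* tracker_;
  std::vector<std::unique_ptr<uint8_t[]>> chunks_;
  int64_t total_bytes_ = 0;
};

// 1024 bytes per column, as mentioned in the comment above.
constexpr int64_t kCachedLevelsBytes = 1024;

// Hypothetical holder for the cached levels of one column.
struct LevelCache {
  uint8_t* cached_levels_ = nullptr;

  // The buffer comes from the pool, so the tracker sees it.
  void Init(SimpleMemPool* pool) {
    cached_levels_ = pool->Allocate(kCachedLevelsBytes);
  }
};
```

With this shape, the per-column cost is visible in the tracker's consumption 
and is released together with the pool, at the price of one pool allocation 
per column reader.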


Line 1872:       if (col_reader->IsCollectionReader() || col_reader->IsBoolColumnReader()) {
> I added one, but not sure if it's clearer/better.
I think it's better that this code not know about the specifics of all the 
column reader types, even if it's still ugly.
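
One possible shape for that, as a sketch only: the caller asks each reader a 
single capability question instead of testing concrete types. The method 
name SupportsBatchedRead(), the helper CanUseBatchedPath() and the reader 
classes below are hypothetical, and the assumption that the condition at 
line 1872 selects a non-batched path is mine; the excerpt does not show 
what the branch does.

```cpp
#include <memory>
#include <vector>

// Hypothetical base class; not the real Impala column reader hierarchy.
class ColumnReader {
 public:
  virtual ~ColumnReader() = default;

  // Hypothetical capability query. Each subclass answers for itself, so the
  // caller needs no per-type checks.
  virtual bool SupportsBatchedRead() const { return true; }
};

// A plain scalar reader keeps the default answer.
class Int32ColumnReader : public ColumnReader {};

// Assumed for illustration: bool readers opt out of the batched path.
class BoolColumnReader : public ColumnReader {
 public:
  bool SupportsBatchedRead() const override { return false; }
};

// Assumed for illustration: collection readers opt out as well.
class CollectionColumnReader : public ColumnReader {
 public:
  bool SupportsBatchedRead() const override { return false; }
};

// Returns true only if every reader reports that it supports batched reads.
bool CanUseBatchedPath(
    const std::vector<std::unique_ptr<ColumnReader>>& readers) {
  for (const auto& reader : readers) {
    if (!reader->SupportsBatchedRead()) return false;
  }
  return true;
}
```

Adding a new reader type then means overriding one method in that type, 
rather than extending the condition in the scanner.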


-- 
To view, visit http://gerrit.cloudera.org:8080/2843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I21fa9b050a45f2dd45cc0091ea5b008d3c0a3f30
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Alex Behm <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
