Tim Armstrong has posted comments on this change.

Change subject: Optimized ReadValueBatch() for Parquet scalar column readers.
......................................................................


Patch Set 2:

(4 comments)

The changes make a lot of sense.

http://gerrit.cloudera.org:8080/#/c/2843/2/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 276:   /// Decodes and caches the next batch of levels. Resets members associated with the cache.
Is it valid to call this when the previous batch hasn't been fully consumed?


Line 301: cached_levels_
Would it make sense to just have this be a constant-sized array with, e.g., 1024
entries? That could save some of the MemPool plumbing, reduce indirection, and
make it more tunable.
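
To illustrate the suggestion, here is a minimal sketch of a level cache backed by a fixed-size array. It is not the code under review: LevelCache, kCapacity, Fill(), Next() and Exhausted() are hypothetical names, and the assert in Fill() assumes that refilling before the previous batch is consumed is invalid, which is exactly the open question on line 276.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the reader's level cache; not the Impala class.
class LevelCache {
 public:
  // Capacity is a compile-time constant, so no MemPool allocation is needed.
  static constexpr std::size_t kCapacity = 1024;

  // Copies up to kCapacity levels from 'src' and resets the read position.
  // Returns the number of levels cached.
  std::size_t Fill(const std::uint8_t* src, std::size_t num_available) {
    // Assumption: the previous batch must be fully consumed before a refill.
    assert(Exhausted());
    size_ = num_available < kCapacity ? num_available : kCapacity;
    std::copy_n(src, size_, levels_.begin());
    pos_ = 0;
    return size_;
  }

  // True when every cached level has been handed out.
  bool Exhausted() const { return pos_ == size_; }

  // Returns the next cached level. Must not be called on an exhausted cache.
  std::uint8_t Next() {
    assert(!Exhausted());
    return levels_[pos_++];
  }

 private:
  std::array<std::uint8_t, kCapacity> levels_{};  // inline storage, no pointer hop
  std::size_t size_ = 0;
  std::size_t pos_ = 0;
};
```

With the storage inline, tuning the batch size becomes a one-line change to the constant.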


Line 854:   /// It assumes a data page with remaining values is available, and that the def/rep
Can we assert any of these preconditions with DCHECKs?
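
As a rough sketch of what such a check could look like: PageState, ReadBatchSize() and SKETCH_DCHECK are hypothetical, with SKETCH_DCHECK defined over assert as a stand-in for the real DCHECK macro. Only the "remaining values" precondition is modeled, because the def/rep condition is cut off in the quoted comment.

```cpp
#include <algorithm>
#include <cassert>

// Stand-in for a debug-only check; assumed here, not the real DCHECK macro.
#define SKETCH_DCHECK(cond) assert(cond)

// Hypothetical view of the current data page state.
struct PageState {
  int num_buffered_values = 0;  // values left in the current data page
};

// Returns how many values can be read in one batch from the current page.
int ReadBatchSize(const PageState& page, int max_values) {
  // Precondition from the comment: a data page with remaining values exists.
  SKETCH_DCHECK(page.num_buffered_values > 0);
  // Extra assumption for this sketch: the caller asks for at least one value.
  SKETCH_DCHECK(max_values > 0);
  return std::min(page.num_buffered_values, max_values);
}
```

Since assert compiles away under NDEBUG, checks of this kind document the contract without adding cost to release builds.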


Line 1872:       if (col_reader->IsCollectionReader() || col_reader->IsBoolColumnReader()) {
Maybe this should have a NeedsSeeding() method?
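
One possible shape for that method, as a sketch only: the class hierarchy below is a simplified, hypothetical stand-in for the real readers, and NeedsSeeding() just wraps the two checks from the quoted condition.

```cpp
#include <cassert>

// Simplified, hypothetical reader hierarchy; not the actual Impala classes.
class ColumnReader {
 public:
  virtual ~ColumnReader() = default;
  virtual bool IsCollectionReader() const { return false; }
  virtual bool IsBoolColumnReader() const { return false; }

  // Names the condition once so call sites do not repeat the two checks.
  bool NeedsSeeding() const {
    return IsCollectionReader() || IsBoolColumnReader();
  }
};

class CollectionReader final : public ColumnReader {
 public:
  bool IsCollectionReader() const override { return true; }
};

class BoolReader final : public ColumnReader {
 public:
  bool IsBoolColumnReader() const override { return true; }
};

// A scalar reader that matches neither check in the condition.
class ScalarReader final : public ColumnReader {};
```

The call site would then read `if (col_reader->NeedsSeeding()) {`, and any future reader type that needs the same treatment only touches one place.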


-- 
To view, visit http://gerrit.cloudera.org:8080/2843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I21fa9b050a45f2dd45cc0091ea5b008d3c0a3f30
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Alex Behm <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
