Alex Behm has posted comments on this change.

Change subject: IMPALA-2736: Basic column-wise slot materialization in Parquet 
scanner.
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/2779/2/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 706:     while (LIKELY(val_count < max_values && !RowGroupAtEnd() && 
continue_execution)) {
> Hm, ok, fair enough, seems like it's trickier than I initially thought.
Actually, we do have the number of top-level values in a row group. But using 
those would mean we'd have to specialize the implementation of this function 
based on IN_COLLECTION, i.e., if we are in a collection we'd still have to 
revert to this version of the code.

Also, we might not be able to check for the same error conditions as before 
(e.g. did we read more values than were promised in the row group metadata?)

What do you think?


Line 1752:     if (UNLIKELY(
> I think the check is meant to occur less than once per batch in the Assembl
Makes sense.

I'm a little reluctant to address the issue in this patch because it's at least 
not totally trivial. I added a TODO. Let me know if you are not ok with 
deferring.


Line 1912:           ReadCollectionItem(column_readers, materialize_tuple, 
pool, tuple);
> Ah ok, I hadn't thought about the additional scratch batches, I think becau
I added a blurb in the header how slots are materialized.


-- 
To view, visit http://gerrit.cloudera.org:8080/2779
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I72a613fa805c542e39df20588fb25c57b5f139aa
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Alex Behm <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Skye Wanderman-Milne <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to