Alex Behm has posted comments on this change.

Change subject: Optimized ReadValueBatch() for Parquet scalar column readers.
......................................................................

Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2843/5/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 827: bool MaterializeValueBatch(MemPool* pool, int max_values, int tuple_size,
> That logic is already specialized and in such cases we will stamp out row b

Unfortunately, what I said was only almost accurate. There is one special case that will still hit this code path with !MATERIALIZED && !IN_COLLECTION. It can happen if we have a query that only references columns that fail to resolve in the data files. As Marcel suggested, I hoisted out that part into this function directly. The change is simple but please have another look. I'm doing another exhaustive run.

--
To view, visit http://gerrit.cloudera.org:8080/2843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I21fa9b050a45f2dd45cc0091ea5b008d3c0a3f30
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Alex Behm <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
