Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15103 )


Change subject: IMPALA-9324: Correctly handle ORC UNION type in scanner
......................................................................

IMPALA-9324: Correctly handle ORC UNION type in scanner

We don't support reading UNION columns. Queries on tables containing
UNION types fail in planning with a metadata loading error. However, the
scanner may still need to read an ORC file that contains UNION types
when the table schema doesn't map to the UNION columns. Although the
UNION values won't be read, the scanner needs to resolve the file
schema, including the UNION types, correctly.

In OrcSchemaResolver::BuildSchemaPath, we create a map from ORC type ids
to the Impala SchemaPath representation for all types in the file. This
map must cover UNION types as well.
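
As a rough illustration of the idea (not the code in this patch), the
sketch below walks a type tree in pre-order and records a path for every
type id, descending into UNION children like any other complex type.
TypeNode, Kind, Path, BuildPaths, MakeNode and ExamplePaths are
hypothetical stand-ins; they are not orc::Type, Impala's SchemaPath, or
the actual OrcSchemaResolver code, and the path encoding (child indexes)
is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only.
enum class Kind { PRIMITIVE, STRUCT, LIST, MAP, UNION };

struct TypeNode {
  Kind kind;
  uint64_t id;  // pre-order type id within the file schema
  std::vector<std::unique_ptr<TypeNode>> children;
};

// Assumed encoding: the child index taken at each level below the root.
using Path = std::vector<int>;

// Records a path for every node. UNION nodes and their children are
// walked the same way as STRUCT/LIST/MAP, so no type id is left unmapped.
void BuildPaths(const TypeNode& node, Path* cur,
                std::map<uint64_t, Path>* out) {
  assert(out->count(node.id) == 0);  // type ids are expected to be unique
  (*out)[node.id] = *cur;
  for (size_t i = 0; i < node.children.size(); ++i) {
    cur->push_back(static_cast<int>(i));
    BuildPaths(*node.children[i], cur, out);
    cur->pop_back();
  }
}

std::unique_ptr<TypeNode> MakeNode(Kind kind, uint64_t id) {
  auto node = std::make_unique<TypeNode>();
  node->kind = kind;
  node->id = id;
  return node;
}

// Builds paths for struct<a:int, u:uniontype<int,string>, b:string>.
std::map<uint64_t, Path> ExamplePaths() {
  auto root = MakeNode(Kind::STRUCT, 0);
  root->children.push_back(MakeNode(Kind::PRIMITIVE, 1));
  auto u = MakeNode(Kind::UNION, 2);
  u->children.push_back(MakeNode(Kind::PRIMITIVE, 3));
  u->children.push_back(MakeNode(Kind::PRIMITIVE, 4));
  root->children.push_back(std::move(u));
  root->children.push_back(MakeNode(Kind::PRIMITIVE, 5));

  std::map<uint64_t, Path> paths;
  Path cur;
  BuildPaths(*root, &cur, &paths);
  return paths;
}
```

In this sketch, skipping the UNION subtree would leave ids 3 and 4
without an entry, which is the kind of gap a lookup by type id would then
trip over.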

This patch also includes some refactoring to improve code readability.

Tests:
 - Add tests for mismatches between the table schema and the file schema
   on all complex types.

Change-Id: I452d27b4e281eada00b62ac58af773a3479163ec
---
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/orc-metadata-utils.cc
M be/src/exec/orc-metadata-utils.h
M common/thrift/generate_error_codes.py
M testdata/workloads/functional-query/queries/DataErrorsTest/orc-type-checks.test
M tests/query_test/test_scanners.py
6 files changed, 127 insertions(+), 25 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/15103/1
--
To view, visit http://gerrit.cloudera.org:8080/15103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I452d27b4e281eada00b62ac58af773a3479163ec
Gerrit-Change-Number: 15103
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <[email protected]>
