Mustafa Iman created TEZ-4177:
---------------------------------
Summary: Improve error message for external orc table
Key: TEZ-4177
URL: https://issues.apache.org/jira/browse/TEZ-4177
Project: Apache Tez
Issue Type: Improvement
Reporter: Mustafa Iman
Assignee: Mustafa Iman
Since there is no schema validation for external tables, users may face various
errors if their ORC data and external table schema do not match. If the ORC
schema has fewer columns than the projection, OrcEncodedDataConsumer may receive
an incomplete TypeDescription array, which later manifests itself as a
NullPointerException.
We can at least verify that OrcEncodedDataConsumer gets enough
TypeDescriptions. If the assertion fails, the user sees that something is wrong
with the schema and can hopefully resolve the problem quickly. If there are
enough columns in the file but the schema of the query does not match, the user
generally sees a ClassCastException. If there are enough columns and the types
accidentally match, there is nothing we can do, as this is an external table.
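As a rough illustration of the kind of check meant here, a minimal sketch follows. It is not the actual patch: SchemaCheck and verifyEnoughColumns are hypothetical names, and a plain list of type-name strings stands in for the TypeDescription array so the example compiles without the ORC or Hive jars.

```java
import java.util.List;

// Hypothetical stand-in for the proposed check; not the actual Hive/LLAP code.
final class SchemaCheck {
    private SchemaCheck() {}

    /**
     * Fails fast with a descriptive message when the ORC file supplies fewer
     * column types than the table schema projects, instead of letting a
     * NullPointerException surface later.
     *
     * @param fileColumnTypes      type names read from the file
     *                             (assumed stand-in for TypeDescription)
     * @param projectedColumnCount number of columns the query expects
     */
    static void verifyEnoughColumns(List<String> fileColumnTypes,
                                    int projectedColumnCount) {
        if (fileColumnTypes.size() < projectedColumnCount) {
            throw new IllegalStateException(
                "ORC file has " + fileColumnTypes.size()
                + " column(s) but the table schema expects "
                + projectedColumnCount
                + "; check that the external table schema matches the data");
        }
    }
}
```

Calling SchemaCheck.verifyEnoughColumns with one file column and a projection of three columns throws an IllegalStateException that names both counts, which points the user at the schema rather than at an unrelated stack trace.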
We have seen this when trying to use a managed table as the location of an
external table. Although the user-facing schemas are the same, the managed table
has ACID-related metadata. I am adding a q file demonstrating the
NullPointerException with TestMiniLlapLocalCliDriver, along with the output
after the fix. I haven't added this to the precommit tests, as it is hard to
assert the exception message from the mini driver framework and the change
effectively just alters the error.