Todd Lipcon created IMPALA-7309:
-----------------------------------
Summary: Prevent the addition of Avro schemas to non-Avro tables
with incompatible schema
Key: IMPALA-7309
URL: https://issues.apache.org/jira/browse/IMPALA-7309
Project: IMPALA
Issue Type: Bug
Components: Catalog, Frontend
Reporter: Todd Lipcon
Per a recent [mailing list
thread|https://lists.apache.org/thread.html/fb68c54bd66a40982ee17f9f16f87a4112220a5df035a311bda310f1@<dev.impala.apache.org>]
the behavior of Avro partitions within non-Avro tables is inconsistent with
Hive, and somewhat surprising. For example, the addition of a partition can
cause the results of "describe" on the table to change, but only after a
refresh or invalidate. In the mailing list thread, we decided to change the
behavior to:
1. Schema handling:
- if a table's properties indicate it's an Avro table, parse and adopt the
external Avro schema as the table schema, or infer an Avro-compatible schema
from the existing columns
- if a table's properties indicate it's _not_ an Avro table, but there is
an external Avro schema defined in the table properties, then parse the
Avro schema and include it in the TableDescriptor (for use by Avro
partitions) but *do not* adopt it as the table schema.
2. Handling incompatible schemas:
- If the table-level format is non-Avro,
- AND the table contains column types incompatible with Avro (e.g. tinyint),
- AND the table has an existing Avro partition,
- THEN the query will yield an error about incompatible types
3. Try to prevent users from shooting themselves in the foot:
- If the table-level format is non-Avro,
- AND the table contains column types incompatible with Avro (e.g. tinyint),
- THEN disallow changing the file format of an existing partition to Avro
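
The three rules above can be read as a small decision procedure. The sketch
below is illustrative only and is not Impala code: the Table class, the
function names, and the error messages are hypothetical, and the set of
Avro-incompatible types contains just tinyint because that is the only
example named here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumption: tinyint is the only incompatible type named in the issue;
# the real set is not specified here.
AVRO_INCOMPATIBLE_TYPES = {"tinyint"}


@dataclass
class Table:
    """Hypothetical stand-in for the catalog's view of a table."""
    file_format: str                               # table-level format
    columns: Dict[str, str]                        # column name -> type
    avro_schema: Optional[Dict[str, str]] = None   # parsed external schema
    partition_formats: List[str] = field(default_factory=list)


def is_avro(file_format: str) -> bool:
    return file_format.lower() == "avro"


def table_schema(table: Table) -> Dict[str, str]:
    """Rule 1: only an Avro table adopts the external Avro schema."""
    if is_avro(table.file_format) and table.avro_schema is not None:
        return table.avro_schema
    # Non-Avro table, or Avro table without an external schema. Real code
    # would infer an Avro-compatible schema in the latter case; this
    # sketch simply returns the existing columns.
    return table.columns


def descriptor_avro_schema(table: Table) -> Optional[Dict[str, str]]:
    """Rule 1: the parsed schema is still carried for Avro partitions."""
    return table.avro_schema


def incompatible_columns(table: Table) -> List[str]:
    return sorted(name for name, col_type in table.columns.items()
                  if col_type.lower() in AVRO_INCOMPATIBLE_TYPES)


def check_query(table: Table) -> None:
    """Rule 2: error out if an Avro partition meets incompatible columns."""
    if is_avro(table.file_format):
        return
    bad = incompatible_columns(table)
    if bad and any(is_avro(f) for f in table.partition_formats):
        raise ValueError(f"columns incompatible with Avro: {bad}")


def check_set_partition_format(table: Table, new_format: str) -> None:
    """Rule 3: refuse to switch a partition to Avro in the same situation."""
    if is_avro(table.file_format) or not is_avro(new_format):
        return
    bad = incompatible_columns(table)
    if bad:
        raise ValueError(f"cannot set partition format to Avro: {bad}")
```

With a Parquet table that has a tinyint column and an external Avro schema,
table_schema() keeps returning the table's own columns, while
check_set_partition_format(table, "avro") raises before an Avro partition
can be introduced.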