[jira] [Created] (HIVE-16559) Parquet schema evolution for partitioned tables may break if table and partition serdes differ

Barna Zsombor Klara (JIRA) Fri, 28 Apr 2017 03:07:23 -0700

Barna Zsombor Klara created HIVE-16559:
------------------------------------------


             Summary: Parquet schema evolution for partitioned tables may break if table and partition serdes differ
                 Key: HIVE-16559
                 URL: https://issues.apache.org/jira/browse/HIVE-16559
             Project: Hive
          Issue Type: Bug
            Reporter: Barna Zsombor Klara
            Assignee: Barna Zsombor Klara


Parquet schema evolution should make it possible to have partitions/tables 
backed by files with different schemas. Hive should match the table columns 
with the file columns by column name where possible.
However, if the table's serde is missing columns that the partition's serde 
still has, Hive fails to match the columns.
Steps to reproduce:
{code}
CREATE TABLE myparquettable_parted
(
  name string,
  favnumber int,
  favcolor string,
  age int,
  favpet string
)
PARTITIONED BY (day string)
STORED AS PARQUET;

INSERT OVERWRITE TABLE myparquettable_parted
PARTITION(day='2017-04-04')
SELECT
   'mary' as name,
   5 AS favnumber,
   'blue' AS favcolor,
   35 AS age,
   'dog' AS favpet;

ALTER TABLE myparquettable_parted
REPLACE COLUMNS
(
favnumber int,
age int
);   -- No CASCADE option, so the partition will not be altered.
{code}
{{SELECT * FROM myparquettable_parted where day='2017-04-04';}}
will fail with:
{{java.lang.UnsupportedOperationException: Cannot inspect org.apache.hadoop.io.IntWritable}}
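
A minimal sketch for confirming the mismatch, assuming the table from the steps 
above: because the REPLACE COLUMNS was not cascaded, the partition metadata 
should still list the five original columns while the table lists only two. 
Comparing the two DESCRIBE outputs makes the difference visible.
{code}
-- Table-level columns (should show only favnumber and age)
DESCRIBE FORMATTED myparquettable_parted;

-- Partition-level columns (should still show the five original columns)
DESCRIBE FORMATTED myparquettable_parted PARTITION (day='2017-04-04');
{code}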

Hive should either match the columns together or prevent the user from dropping 
columns from the table.
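
For comparison, a sketch of the same statement with CASCADE, which also 
propagates the column change to the metadata of existing partitions. This 
report does not confirm whether it avoids the exception; it only illustrates 
the option that the reproduction deliberately leaves out.
{code}
ALTER TABLE myparquettable_parted
REPLACE COLUMNS
(
favnumber int,
age int
) CASCADE;   -- Also updates the column metadata of existing partitions.
{code}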



