Attila Magyar created HIVE-23253:
------------------------------------

             Summary: Synchronization between external SerDe schemas and Metastore
                 Key: HIVE-23253
                 URL: https://issues.apache.org/jira/browse/HIVE-23253
             Project: Hive
          Issue Type: Bug
          Components: Hive, Metastore
    Affects Versions: 3.1.2
            Reporter: Attila Magyar
             Fix For: 3.0.0
In HIVE-15995 an ALTER TABLE <table> UPDATE COLUMNS statement was introduced to sync external SerDe schema changes with the metastore. This command can only be invoked manually. See the documentation:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionUpdatecolumns

It might make sense to run UPDATE COLUMNS automatically in certain cases, to prevent the problems that arise when the user forgets to run it manually.

One way to reproduce the issue is to change the schema URL via an ALTER TABLE statement.

{code:java}
[root@c7401 vagrant]# cat test_schema1.avsc
{
  "type":"record",
  "name":"test_schema",
  "namespace":"gdc_datascience_qa",
  "fields":[
    {
      "name":"name",
      "type":[ "null", "string" ],
      "default":null
    }
  ]
}
[root@c7401 vagrant]# cat test_schema2.avsc
{
  "type":"record",
  "name":"test_schema",
  "namespace":"gdc_datascience_qa",
  "fields":[
    {
      "name":"name",
      "type":[ "null", "string" ],
      "default":null
    },
    {
      "name":"last_name",
      "type":[ "null", "string" ],
      "default":null
    }
  ]
}
{code}

{code:java}
$ hadoop fs -copyFromLocal *.avsc /tmp/
[beeline] create external table t1 stored as avro tblproperties ('avro.schema.url'='/tmp/test_schema1.avsc');
[beeline] alter table t1 set tblproperties('avro.schema.url'='/tmp/test_schema2.avsc');
[beeline] insert into t1 values ('n1', 'l1');
[beeline] create external table t2 stored as avro tblproperties ('avro.schema.url'='/tmp/test_schema2.avsc');
[beeline] insert into t2 values ('n2', 'l2');
[beeline] insert overwrite table t1 select * from t2;
{code}

Error:
{code:java}
MetaException(message:Column last_name doesn't exist in table t1 in database default)
	at org.apache.hadoop.hive.metastore.ObjectStore.validateTableCols(ObjectStore.java:8652)
	at org.apache.hadoop.hive.metastore.ObjectStore.getMTableColumnStatistics(ObjectStore.java:8602)
	at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionColStats(ObjectStore.java:8416)
	at org.apache.hadoop.hive.metastore.ObjectStore.updateTableColumnStatistics(ObjectStore.java:8446)
{code}

Running ALTER TABLE t1 UPDATE COLUMNS fixes the problem.

cc: [~szita]
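
The manual workaround mentioned above, placed into the reproduction sequence, would look roughly like the sketch below. It reuses the table names and schema path from the steps in this report and the statement syntax from the linked DDL documentation (ALTER TABLE table_name UPDATE COLUMNS). It has not been re-run for this write-up, so treat it as an illustration rather than verified output.

{code:sql}
-- Point t1 at the new Avro schema (same statement as in the reproduction).
alter table t1 set tblproperties('avro.schema.url'='/tmp/test_schema2.avsc');

-- Manual step that is easy to forget: sync the metastore columns with the SerDe schema.
alter table t1 update columns;

-- Per this report, the insert overwrite should no longer fail once the columns are synced.
insert overwrite table t1 select * from t2;
{code}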