[ https://issues.apache.org/jira/browse/SPARK-21617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113602#comment-16113602 ]
Marcelo Vanzin commented on SPARK-21617:
----------------------------------------

Here's the full test error from our internal build against 2.1:

{noformat}
15:11:29.602 WARN org.apache.spark.sql.hive.test.TestHiveExternalCatalog: Could not alter schema of table `default`.`t1` in a Hive compatible way. Updating Hive metastore in Spark SQL specific format.
[snip]
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions : c1
	at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:624)
	at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:602)

- alter datasource table add columns - partitioned - csv *** FAILED ***
  org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: at least one column must be specified for the table;
	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:107)
	at org.apache.spark.sql.hive.HiveExternalCatalog.alterTableSchema(HiveExternalCatalog.scala:656)
	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.alterTableSchema(SessionCatalog.scala:372)
{noformat}

So the exception above is just a warning; the real problem seems to be in how Spark recovers from that situation (the exception handler in {{HiveExternalCatalog.alterTableSchema}}).

> ALTER TABLE...ADD COLUMNS creates invalid metadata in Hive metastore for DS tables
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-21617
>                 URL: https://issues.apache.org/jira/browse/SPARK-21617
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Marcelo Vanzin
>
> When you have a data source table and you run an "ALTER TABLE...ADD COLUMNS"
> query, Spark will save invalid metadata to the Hive metastore.
> Namely, it will overwrite the table's schema with the data frame's schema;
> that is not desired for data source tables (where the schema is stored in a
> table property instead).
> Moreover, if you use a newer metastore client where
> METASTORE_DISALLOW_INCOMPATIBLE_COL_TYPE_CHANGES is on by default, you
> actually get an exception:
> {noformat}
> InvalidOperationException(message:The following columns have types
> incompatible with the existing columns in their respective positions : c1)
> 	at org.apache.hadoop.hive.metastore.MetaStoreUtils.throwExceptionIfIncompatibleColTypeChange(MetaStoreUtils.java:615)
> 	at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:133)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3704)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3675)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
> 	at com.sun.proxy.$Proxy26.alter_table_with_environment_context(Unknown Source)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:402)
> 	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table_with_environmentContext(SessionHiveMetaStoreClient.java:309)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
> 	at com.sun.proxy.$Proxy27.alter_table_with_environmentContext(Unknown Source)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:601)
> {noformat}
> That exception is handled by Spark in an odd way (see the code in
> {{HiveExternalCatalog.scala}}), which still stores invalid metadata.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
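The recovery path the comment points at follows a try/fallback pattern: Spark first attempts a Hive-compatible {{alterTable}}, and when the metastore rejects the column change it logs the warning seen above and rewrites the schema into table properties (the Spark SQL specific format) instead. The sketch below is a minimal illustration of that pattern, not Spark's actual code; the {{FakeMetastore}} class is hypothetical, though the {{spark.sql.sources.schema}} property key mirrors the real one.

```python
import json


class HiveIncompatibleSchemaError(Exception):
    """Stands in for org.apache.hadoop.hive.ql.metadata.HiveException."""


class FakeMetastore:
    """Toy metastore that, like Hive with
    METASTORE_DISALLOW_INCOMPATIBLE_COL_TYPE_CHANGES enabled, rejects
    in-place column type changes."""

    def __init__(self):
        self.tables = {}  # name -> {"cols": [(name, type)], "properties": {}}

    def alter_table_hive_compatible(self, name, cols):
        old = self.tables[name]["cols"]
        for (n1, t1), (n2, t2) in zip(old, cols):
            if n1 == n2 and t1 != t2:
                raise HiveIncompatibleSchemaError(
                    f"column {n1} has an incompatible type")
        self.tables[name]["cols"] = cols

    def alter_table_properties(self, name, props):
        self.tables[name]["properties"].update(props)


def alter_table_schema(ms, name, new_cols):
    """Mimics the fallback in HiveExternalCatalog.alterTableSchema:
    try the Hive-compatible path first; on failure, warn and store the
    schema serialized into a table property instead of failing."""
    try:
        ms.alter_table_hive_compatible(name, new_cols)
    except HiveIncompatibleSchemaError:
        # WARN: Could not alter schema in a Hive compatible way.
        # Updating Hive metastore in Spark SQL specific format.
        ms.alter_table_properties(
            name, {"spark.sql.sources.schema": json.dumps(new_cols)})


ms = FakeMetastore()
ms.tables["t1"] = {"cols": [("c1", "string")], "properties": {}}
# Changing c1's type trips the incompatibility check, so the schema
# lands in table properties while the Hive-visible columns stay put.
alter_table_schema(ms, "t1", [("c1", "int"), ("c2", "string")])
```

The bug report is about this fallback going wrong: the handler ends up persisting invalid metadata (e.g. a table with no columns), which is what the "at least one column must be specified" failure in the test log shows.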