[ https://issues.apache.org/jira/browse/SPARK-21617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113602#comment-16113602 ]

Marcelo Vanzin commented on SPARK-21617:
----------------------------------------

Here's the full test error from our internal build against 2.1:

{noformat}
15:11:29.602 WARN org.apache.spark.sql.hive.test.TestHiveExternalCatalog: Could not alter schema of table `default`.`t1` in a Hive compatible way. Updating Hive metastore in Spark SQL specific format.
[snip]
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions :
c1
        at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:624)
        at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:602)
- alter datasource table add columns - partitioned - csv *** FAILED ***
  org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: at least one column must be specified for the table;
  at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:107)
  at org.apache.spark.sql.hive.HiveExternalCatalog.alterTableSchema(HiveExternalCatalog.scala:656)
  at org.apache.spark.sql.catalyst.catalog.SessionCatalog.alterTableSchema(SessionCatalog.scala:372)
{noformat}

So the exception above is just a warning; the actual problem seems to be in how Spark recovers from that situation (the exception handler in {{HiveExternalCatalog.alterTableSchema}}).
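For context, that handler follows a try/fallback shape roughly like the sketch below. This is a paraphrase rather than verbatim source: {{getRawTable}}, {{client}} and the exact copy logic stand in for the real internals of {{HiveExternalCatalog}}.

{code:scala}
// Rough paraphrase of the recovery path in HiveExternalCatalog.alterTableSchema;
// not verbatim source. Assumes scala.util.control.NonFatal is imported, `client`
// is the HiveClient, and `getRawTable` returns the CatalogTable as currently
// stored in the metastore.
override def alterTableSchema(db: String, table: String, schema: StructType): Unit = withClient {
  val rawTable = getRawTable(db, table)
  val updatedTable = rawTable.copy(schema = schema)
  try {
    // First attempt: push the full new schema to Hive directly.
    client.alterTable(updatedTable)
  } catch {
    case NonFatal(e) =>
      // Fallback: log the warning quoted above and switch to the Spark SQL
      // specific format, keeping only the partition columns as the Hive-visible
      // schema. For the partitioned CSV test this seemingly leaves Hive with an
      // empty data column list, hence "at least one column must be specified
      // for the table".
      logWarning(s"Could not alter schema of table `$db`.`$table` in a Hive " +
        "compatible way. Updating Hive metastore in Spark SQL specific format.", e)
      client.alterTable(updatedTable.copy(schema = updatedTable.partitionSchema))
  }
}
{code}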


> ALTER TABLE...ADD COLUMNS creates invalid metadata in Hive metastore for DS tables
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-21617
>                 URL: https://issues.apache.org/jira/browse/SPARK-21617
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Marcelo Vanzin
>
> When you have a data source table and you run an "ALTER TABLE...ADD COLUMNS" query, Spark will save invalid metadata to the Hive metastore.
> Namely, it will overwrite the table's schema with the data frame's schema; that is not desired for data source tables (where the schema is stored in a table property instead).
> Moreover, if you use a newer metastore client where METASTORE_DISALLOW_INCOMPATIBLE_COL_TYPE_CHANGES is on by default, you actually get an exception:
> {noformat}
> InvalidOperationException(message:The following columns have types incompatible with the existing columns in their respective positions :
> c1)
>       at org.apache.hadoop.hive.metastore.MetaStoreUtils.throwExceptionIfIncompatibleColTypeChange(MetaStoreUtils.java:615)
>       at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:133)
>       at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3704)
>       at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3675)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
>       at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
>       at com.sun.proxy.$Proxy26.alter_table_with_environment_context(Unknown Source)
>       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:402)
>       at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table_with_environmentContext(SessionHiveMetaStoreClient.java:309)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
>       at com.sun.proxy.$Proxy27.alter_table_with_environmentContext(Unknown Source)
>       at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:601)
> {noformat}
> That exception is handled by Spark in an odd way (see the code in {{HiveExternalCatalog.scala}}), which still stores invalid metadata.
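
For reference, a minimal way to hit this path (a sketch, assuming a Hive-enabled Spark build; the table and column names are arbitrary, chosen to mirror the failed test):

{code:scala}
import org.apache.spark.sql.SparkSession

// Minimal reproduction sketch for SPARK-21617. Assumes Hive support is
// available locally; table/column names are arbitrary.
val spark = SparkSession.builder()
  .appName("SPARK-21617-repro")
  .master("local[*]")
  .enableHiveSupport()
  .getOrCreate()

// A partitioned data source table: its real schema lives in table properties
// (spark.sql.sources.schema.*), not in the Hive column metadata.
spark.sql("CREATE TABLE t1 (c1 INT, p1 INT) USING csv PARTITIONED BY (p1)")

// With METASTORE_DISALLOW_INCOMPATIBLE_COL_TYPE_CHANGES on, the alter below
// first fails in Hive and then falls into the recovery path described above.
spark.sql("ALTER TABLE t1 ADD COLUMNS (c2 STRING)")
{code}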


