[ https://issues.apache.org/jira/browse/SPARK-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086650#comment-15086650 ]

Yin Huai commented on SPARK-12682:
----------------------------------

I think we should also add a flag to disable saving metadata in a 
Hive-compatible way, because even when we can save the metadata in a 
Hive-compatible format, reading the table back can still fail. It would be 
good to have a data source option for this (if that is really hard, we can 
use a SQL conf).
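
A minimal sketch of what the opt-out could look like at the call site (both 
the data source option name and the SQL conf key below are hypothetical, 
just to illustrate the proposal, not existing Spark options):

{code:scala}
// Hypothetical per-table data source option (name is made up):
df.write
  .format("parquet")
  .option("hiveCompatibleMetadata", "false")
  .saveAsTable("wide_table")

// Hypothetical SQL conf fallback (key is made up):
sqlContext.setConf("spark.sql.hive.saveAsHiveCompatible", "false")
{code}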

> Hive will fail if a parquet table has a very wide schema
> ---------------------------------------------------------
>
>                 Key: SPARK-12682
>                 URL: https://issues.apache.org/jira/browse/SPARK-12682
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Yin Huai
>
> To reproduce it, create a table with a very large number of columns. You 
> need to make sure that all of the data type strings combined exceed 4000 
> characters (the strings are generated by HiveMetastoreTypes.toMetastoreType). 
> Then, save the table as parquet. Because we try to store the metadata in a 
> Hive-compatible way, we set the serde to the Parquet serde. Then, when you 
> load the table, you will see a {{java.lang.IllegalArgumentException}} thrown 
> from Hive's {{TypeInfoUtils}}. I believe the cause is the same as 
> SPARK-6024: Hive's Parquet support does not handle wide schemas well, and 
> the data type string gets truncated. 
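>
> A minimal reproduction sketch, assuming the Spark 1.6-era API (the column 
> count is just picked so the concatenated type string exceeds 4000 chars):
> {code:scala}
> // Build a DataFrame with thousands of columns; the combined data type
> // string then exceeds the metastore's 4000-character limit.
> val wide = sqlContext.range(1).selectExpr((1 to 5000).map(i => s"id AS col$i"): _*)
> wide.write.format("parquet").saveAsTable("wide_table")
> // Reading the table back throws java.lang.IllegalArgumentException
> // from Hive's TypeInfoUtils:
> sqlContext.table("wide_table").show()
> {code}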
> Once you hit this problem, you will not be able to drop the table, because 
> Hive fails to evaluate the drop table command. To at least provide a better 
> workaround, we should see whether we should have a native drop table call 
> to the metastore, and whether we should add a flag to disable saving a data 
> source table's metadata in a Hive-compatible way.
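>
> A sketch of what a native drop table call could look like, going through 
> Hive's metastore client directly (assuming Hive's standard client API; the 
> database and table names are placeholders):
> {code:scala}
> import org.apache.hadoop.hive.conf.HiveConf
> import org.apache.hadoop.hive.metastore.HiveMetaStoreClient
>
> // Talk to the metastore directly instead of evaluating a DROP TABLE
> // command, so Hive never re-parses the truncated type string.
> val client = new HiveMetaStoreClient(new HiveConf())
> client.dropTable("default", "wide_table", true /* deleteData */, true /* ignoreUnknownTab */)
> client.close()
> {code}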


