[ https://issues.apache.org/jira/browse/SPARK-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106242#comment-15106242 ]
Apache Spark commented on SPARK-12682:
--------------------------------------
User 'sameeragarwal' has created a pull request for this issue:
https://github.com/apache/spark/pull/10826
> Hive will fail if a Parquet table has a very wide schema
> ---------------------------------------------------------
>
> Key: SPARK-12682
> URL: https://issues.apache.org/jira/browse/SPARK-12682
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Yin Huai
>
> To reproduce it, create a table with many columns, making sure that the
> combined length of all data type strings exceeds 4000 characters (the
> strings are generated by HiveMetastoreTypes.toMetastoreType). Then, save
> the table as Parquet. Because we try to store the metadata in a
> Hive-compatible way, we set the serde to the Parquet serde. When you then
> load the table, you will see a {{java.lang.IllegalArgumentException}}
> thrown from Hive's {{TypeInfoUtils}}. I believe the cause is the same as
> in SPARK-6024: Hive's Parquet support does not handle wide schemas well,
> and the data type string is truncated.
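> A minimal reproduction sketch (assuming a {{HiveContext}} over an existing
> SparkContext {{sc}}; the column count and table name are illustrative):
> {code}
> import org.apache.spark.sql.hive.HiveContext
>
> val hc = new HiveContext(sc)
> // 1000 bigint columns: the concatenated metastore type string easily
> // exceeds the 4000-character limit described above.
> val cols = (1 to 1000).map(i => s"id AS col_$i")
> val wide = hc.range(10).selectExpr(cols: _*)
> wide.write.format("parquet").saveAsTable("wide_table")
> // Reading the table back triggers the IllegalArgumentException from
> // Hive's TypeInfoUtils:
> hc.table("wide_table").show()
> {code}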
> Once you hit this problem, you will not be able to drop the table, because
> Hive fails to evaluate the drop table command. To at least provide a
> better workaround, we should see whether we should add a native drop table
> call to the metastore, and whether we should add a flag to disable saving
> a data source table's metadata in a Hive-compatible way.
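> As an illustration of the native-drop idea, something like the following
> (a sketch that goes through Hive's metastore client directly, not an
> existing Spark API) would bypass Hive's DROP TABLE evaluation:
> {code}
> import org.apache.hadoop.hive.conf.HiveConf
> import org.apache.hadoop.hive.metastore.HiveMetaStoreClient
>
> // Drop the table through the metastore client directly, so the broken
> // type string is never parsed by Hive's DDL layer.
> val client = new HiveMetaStoreClient(new HiveConf())
> client.dropTable("default", "wide_table")
> client.close()
> {code}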