[ https://issues.apache.org/jira/browse/SPARK-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yin Huai resolved SPARK-12682.
------------------------------
    Resolution: Fixed
 Fix Version/s: 1.6.1
                2.0.0

Issue resolved by pull request 10826
[https://github.com/apache/spark/pull/10826]

> Hive will fail if the schema of a parquet table has a very wide schema
> ----------------------------------------------------------------------
>
>                 Key: SPARK-12682
>                 URL: https://issues.apache.org/jira/browse/SPARK-12682
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Yin Huai
>             Fix For: 2.0.0, 1.6.1
>
>
> To reproduce it, create a table with very many columns and make sure that
> the combined data type strings exceed 4000 characters (the strings are
> generated by HiveMetastoreTypes.toMetastoreType). Then save the table as
> parquet. Because we try to store the metadata in a Hive-compatible way, we
> set the serde to the parquet serde. When you then load the table, you will
> see a {{java.lang.IllegalArgumentException}} thrown from Hive's
> {{TypeInfoUtils}}. I believe the cause is the same as SPARK-6024: Hive's
> parquet support does not handle wide schemas well, and the data type string
> gets truncated.
> Once you hit this problem, you will not be able to drop the table because
> Hive fails to evaluate the drop table command. To at least provide a better
> workaround, we should see whether we should add a native drop-table call to
> the metastore and whether we should add a flag to disable saving a data
> source table's metadata in a Hive-compatible way.
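For readers who want to try the reproduction described above, here is a minimal spark-shell (Scala) sketch. It is an illustrative assumption, not code from the issue or the pull request: the table name, column count, and column names are made up, and {{sc}} is the SparkContext that spark-shell provides.

{code:scala}
// Hypothetical reproduction sketch for SPARK-12682 (names and column count
// are illustrative). The idea is to build a table whose combined column
// type strings exceed 4000 characters, save it as Parquet so the metastore
// entry uses the Parquet serde, and then read it back.
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
import hiveContext.implicits._

// A few hundred columns should be enough to push the concatenated type
// string past the 4000-character limit mentioned in the issue.
val wideDF = hiveContext.range(10).toDF("id")
  .select((0 until 500).map(i => $"id".as(s"c$i")): _*)

// Saving as Parquet stores the table metadata in the Hive-compatible way,
// with the Parquet serde set on the metastore table.
wideDF.write.format("parquet").saveAsTable("wide_parquet_table")

// Loading the table back is where the truncated type string is reported as
// a java.lang.IllegalArgumentException from Hive's TypeInfoUtils.
hiveContext.table("wide_parquet_table").show()
{code}

As the issue notes, once the table is in this state, {{DROP TABLE wide_parquet_table}} also fails because Hive cannot evaluate the drop table command, which is the motivation for the proposed native drop-table call to the metastore.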