[ 
https://issues.apache.org/jira/browse/SPARK-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rex Xiong updated SPARK-8624:
-----------------------------
    Description: 
In 1.4.0, Parquet files are read via DataFrameReader.parquet. When the 
ParquetRelation2 object is created, "parameters" is hard-coded as 
"Map.empty[String, String]", so ParquetRelation2.shouldMergeSchemas is always 
true (the default value).
In previous versions, the spark.sql.hive.convertMetastoreParquet.mergeSchema 
config was respected.
This bug degrades performance significantly for a folder with hundreds of 
Parquet files when no schema merge is wanted.
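The effect of the hard-coded empty map can be sketched in a few lines of plain Scala. This is an illustrative model, not the actual Spark source: the key name "mergeSchema" and the method shape are assumptions chosen to mirror the reported behavior, where shouldMergeSchemas defaults to true whenever no parameters are passed through.

```scala
// Minimal sketch of the reported logic (hypothetical, not actual Spark code).
object MergeSchemaSketch {
  // Hypothetical parameter key, for illustration only.
  val MergeSchemaKey = "mergeSchema"

  // shouldMergeSchemas falls back to `true` when the key is absent,
  // so a hard-coded Map.empty always enables schema merging.
  def shouldMergeSchemas(parameters: Map[String, String]): Boolean =
    parameters.get(MergeSchemaKey).map(_.toBoolean).getOrElse(true)

  def main(args: Array[String]): Unit = {
    // Buggy path: parameters hard-coded to Map.empty -> merging always on.
    println(shouldMergeSchemas(Map.empty[String, String])) // prints "true"
    // Fixed path: the user's setting is threaded through to the relation.
    println(shouldMergeSchemas(Map(MergeSchemaKey -> "false"))) // prints "false"
  }
}
```

With hundreds of Parquet files, always-on merging forces a footer read per file, which is why the regression is costly when the schemas are known to be identical.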

  was:
In 1.4.0, Parquet files are read via DataFrameReader.parquet. When the 
ParquetRelation2 object is created, "Map.empty[String, String]" is hard-coded 
as "parameters", so ParquetRelation2.shouldMergeSchemas is always true (the 
default value).
In previous versions, the spark.sql.hive.convertMetastoreParquet.mergeSchema 
config was respected.
This bug degrades performance significantly for a folder with hundreds of 
Parquet files when no schema merge is wanted.


> DataFrameReader doesn't respect MERGE_SCHEMA setting for Parquet
> ----------------------------------------------------------------
>
>                 Key: SPARK-8624
>                 URL: https://issues.apache.org/jira/browse/SPARK-8624
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.0
>            Reporter: Rex Xiong
>              Labels: parquet
>
> In 1.4.0, Parquet files are read via DataFrameReader.parquet. When the 
> ParquetRelation2 object is created, "parameters" is hard-coded as 
> "Map.empty[String, String]", so ParquetRelation2.shouldMergeSchemas is always 
> true (the default value).
> In previous versions, the spark.sql.hive.convertMetastoreParquet.mergeSchema 
> config was respected.
> This bug degrades performance significantly for a folder with hundreds of 
> Parquet files when no schema merge is wanted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
