Cheng Lian created SPARK-8990:
---------------------------------

             Summary: DataFrameReader.parquet() ignores user specified data source options
                 Key: SPARK-8990
                 URL: https://issues.apache.org/jira/browse/SPARK-8990
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.4.0
            Reporter: Cheng Lian
            Assignee: Cheng Lian


A bad consequence of this is that {{sqlContext.read.parquet(path)}} always does 
schema merging, even when the user explicitly disables it. For example:
{code}
import sqlContext._
import sqlContext.implicits._

val path = "s3n://my-bucket/parquet/tiny"
range(0, 10).coalesce(1).write.mode("overwrite").parquet(path)

// Explicitly disables schema merging
read.option("mergeSchema", "false").parquet(path)
{code}
However, the log shows that all files are still opened for schema discovery:
{noformat}
15/07/10 14:46:52 INFO s3native.NativeS3FileSystem: Opening 's3n://databricks-lian/parquet/tiny/_metadata' for reading
15/07/10 14:46:52 INFO s3native.NativeS3FileSystem: Opening key 'parquet/tiny/_metadata' for reading at position '314'
15/07/10 14:46:52 INFO s3native.NativeS3FileSystem: Opening 's3n://databricks-lian/parquet/tiny/part-r-00000-da490c43-15e2-46b5-95ff-4863e6ab1cc4.gz.parquet' for reading
15/07/10 14:46:52 INFO s3native.NativeS3FileSystem: Opening 's3n://databricks-lian/parquet/tiny/_common_metadata' for reading
15/07/10 14:46:52 INFO s3native.NativeS3FileSystem: Opening key 'parquet/tiny/part-r-00000-da490c43-15e2-46b5-95ff-4863e6ab1cc4.gz.parquet' for reading at position '345'
15/07/10 14:46:52 INFO s3native.NativeS3FileSystem: Opening key 'parquet/tiny/_common_metadata' for reading at position '191'
15/07/10 14:46:52 INFO s3native.NativeS3FileSystem: Opening key 'parquet/tiny/_metadata' for reading at position '4'
15/07/10 14:46:52 INFO s3native.NativeS3FileSystem: Opening key 'parquet/tiny/part-r-00000-da490c43-15e2-46b5-95ff-4863e6ab1cc4.gz.parquet' for reading at position '97'
15/07/10 14:46:52 INFO s3native.NativeS3FileSystem: Opening key 'parquet/tiny/_common_metadata' for reading at position '4'
{noformat}
To work around this issue, use the following instead:
{noformat}
sqlContext.read.option("mergeSchema", "false").format("parquet").load(path)
{noformat}
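As a minimal sketch of the same workaround, the helper below wraps the {{format("parquet").load(path)}} call so that every Parquet read goes through the code path that honors user options. The name {{readParquetWithOptions}} is hypothetical and not part of Spark; the sketch assumes the Spark 1.4 {{DataFrameReader}} API:
{code}
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical helper (not part of Spark): reads Parquet through the generic
// format(...).load(...) path rather than DataFrameReader.parquet(), so that
// user-specified options such as "mergeSchema" are passed to the data source.
def readParquetWithOptions(
    sqlContext: SQLContext,
    path: String,
    options: Map[String, String] = Map.empty): DataFrame = {
  sqlContext.read
    .format("parquet")
    .options(options)
    .load(path)
}

// Usage: explicitly disable schema merging for this read
val df = readParquetWithOptions(sqlContext, path, Map("mergeSchema" -> "false"))
{code}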



