[ https://issues.apache.org/jira/browse/SPARK-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-7737:
-----------------------------------

    Assignee: Yin Huai  (was: Apache Spark)

> parquet schema discovery should not fail because of empty _temporary dir
> -------------------------------------------------------------------------
>
>                 Key: SPARK-7737
>                 URL: https://issues.apache.org/jira/browse/SPARK-7737
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.0
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>            Priority: Blocker
>
> Parquet schema discovery fails when the partition directory contains an empty _temporary dir and a _SUCCESS marker alongside the data files, e.g.:
> {code}
> /partitions5k/i=2/_SUCCESS
> /partitions5k/i=2/_temporary/
> /partitions5k/i=2/part-r-00001.gz.parquet
> /partitions5k/i=2/part-r-00002.gz.parquet
> /partitions5k/i=2/part-r-00003.gz.parquet
> /partitions5k/i=2/part-r-00004.gz.parquet
> {code}
> {code}
> java.lang.AssertionError: assertion failed: Conflicting partition column names detected:
>
> 	at scala.Predef$.assert(Predef.scala:179)
> 	at org.apache.spark.sql.sources.PartitioningUtils$.resolvePartitions(PartitioningUtils.scala:159)
> 	at org.apache.spark.sql.sources.PartitioningUtils$.parsePartitions(PartitioningUtils.scala:71)
> 	at org.apache.spark.sql.sources.HadoopFsRelation.org$apache$spark$sql$sources$HadoopFsRelation$$discoverPartitions(interfaces.scala:468)
> 	at org.apache.spark.sql.sources.HadoopFsRelation$$anonfun$partitionSpec$3.apply(interfaces.scala:424)
> 	at org.apache.spark.sql.sources.HadoopFsRelation$$anonfun$partitionSpec$3.apply(interfaces.scala:423)
> 	at scala.Option.getOrElse(Option.scala:120)
> 	at org.apache.spark.sql.sources.HadoopFsRelation.partitionSpec(interfaces.scala:422)
> 	at org.apache.spark.sql.sources.HadoopFsRelation.schema$lzycompute(interfaces.scala:482)
> 	at org.apache.spark.sql.sources.HadoopFsRelation.schema(interfaces.scala:480)
> 	at org.apache.spark.sql.sources.LogicalRelation.<init>(LogicalRelation.scala:30)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:134)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:118)
> 	at org.apache.spark.sql.SQLContext.load(SQLContext.scala:1135)
> {code}
> Spark 1.3 works fine.
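The assertion fires because the _SUCCESS marker and the empty _temporary directory are handed to partition inference along with the real data files, so resolvePartitions sees some paths carrying the partition column i and others carrying no columns at all. Below is a minimal sketch of the filtering idea only, not the actual Spark patch: skip Hadoop-style "hidden" entries (names starting with "_" or ".") before parsing partition values. The paths are taken from the listing above; the helper isDataPath is hypothetical.

{code}
import org.apache.hadoop.fs.Path

// Hypothetical helper: _SUCCESS, _temporary and .crc-style entries should not
// participate in partition discovery.
def isDataPath(p: Path): Boolean = {
  val name = p.getName
  !name.startsWith("_") && !name.startsWith(".")
}

// Leaf entries from the directory listing in this report.
val leaves = Seq(
  new Path("/partitions5k/i=2/_SUCCESS"),
  new Path("/partitions5k/i=2/_temporary"),
  new Path("/partitions5k/i=2/part-r-00001.gz.parquet"),
  new Path("/partitions5k/i=2/part-r-00002.gz.parquet")
)

// After filtering, every remaining path carries the same partition column (i),
// so partition resolution no longer sees a conflicting, empty column set.
val candidates = leaves.filter(isDataPath)
{code}

With such filtering in place, a read of the table root (for example sqlContext.read.parquet("/partitions5k") on 1.4.0, the call path shown in the stack trace) should infer the single partition column i instead of throwing the assertion.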