[ https://issues.apache.org/jira/browse/SPARK-18108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Richard Moorhead updated SPARK-18108:
-------------------------------------
    Attachment: stacktrace.out

> Partition discovery fails with explicitly written long partitions
> -----------------------------------------------------------------
>
>                 Key: SPARK-18108
>                 URL: https://issues.apache.org/jira/browse/SPARK-18108
>             Project: Spark
>          Issue Type: Question
>          Components: Spark Core
>    Affects Versions: 2.0.1
>            Reporter: Richard Moorhead
>            Priority: Minor
>         Attachments: stacktrace.out
>
> We have Parquet data written from Spark 1.6 that produces errors when read from 2.0.1:
> {code}
> case class A(a: Long, b: Int)
> val as = Seq(A(1, 2))
> // partition explicitly written
> spark.createDataFrame(as).write.parquet("/data/a=1/")
> spark.read.parquet("/data/").collect
> {code}
> The above code fails; stack trace attached.
> If an Int is used instead, partition discovery over the explicitly written partition succeeds:
> {code}
> case class A(a: Int, b: Int)
> val as = Seq(A(1, 2))
> // partition explicitly written
> spark.createDataFrame(as).write.parquet("/data/a=1/")
> spark.read.parquet("/data/").collect
> {code}
> The action succeeds. Additionally, if 'partitionBy' is used instead of an explicit write, partition discovery succeeds.
> Question: Is the first example a reasonable use case?
> [PartitioningUtils|https://github.com/apache/spark/blob/branch-2.0/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala#L319] seems to default to Integer types unless the partition value exceeds the Integer type's range.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
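The mismatch described above comes from how partition directory values are inferred from the path string rather than from the written schema. As a minimal, hypothetical sketch (this is not Spark's actual code, and the function name `inferPartitionType` is invented for illustration), an inference order that tries Int before Long would reproduce the reported behavior: the directory "a=1" is typed as an Int even though the written column is a Long.

```scala
import scala.util.Try

// Hypothetical mirror of the inference order: try Int first, fall back to
// Long only when the literal overflows Int, else treat it as a String.
def inferPartitionType(raw: String): String =
  if (Try(raw.toInt).isSuccess) "IntegerType"
  else if (Try(raw.toLong).isSuccess) "LongType"
  else "StringType"

// "1" fits in an Int, so it is inferred as IntegerType, not LongType;
// only a value past Int range (e.g. "3000000000") would yield LongType.
inferPartitionType("1")
inferPartitionType("3000000000")
```

Under such an order, a directory written as "/data/a=1/" can never be discovered as a Long partition, which would explain why the Long case fails while the Int case and the 'partitionBy' path (where Spark writes and reads its own directories) both succeed.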