Takeshi Yamamuro created SPARK-24676:
----------------------------------------
Summary: Project required data from parsed data when
csvColumnPruning disabled
Key: SPARK-24676
URL: https://issues.apache.org/jira/browse/SPARK-24676
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.3.1
Reporter: Takeshi Yamamuro
I hit a bug below when parsing csv data;
{code}
./bin/spark-shell --conf spark.sql.csv.parser.columnPruning.enabled=false
scala> val dir = "/tmp/spark-csv/csv"
scala> spark.range(10).selectExpr("id % 2 AS p",
"id").write.mode("overwrite").partitionBy("p").csv(dir)
scala> spark.read.csv(dir).selectExpr("sum(p)").collect()
18/06/25 13:48:46 ERROR Executor: Exception in task 2.0 in stage 2.0 (TID 7)
java.lang.ClassCastException: org.apache.spark.unsafe.types.UTF8String cannot
be cast to java.lang.Integer
at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:101)
at
org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getInt(rows.scala:41)
...
{code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]