I'm trying to save an RDD as a Parquet file through the saveAsParquetFile()
API, with code that looks something like:
import org.apache.spark.rdd.RDD
val sc = ...
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
// brings the implicit RDD-to-SchemaRDD conversion into scope
import sqlContext._
val someRDD: RDD[SomeCaseClass] = ...
someRDD.saveAsParquetFile("someRDD.parquet")
However, I get the following error:
java.lang.IncompatibleClassChangeError: Found class org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected
I'm trying to figure out what the issue is. Any help is appreciated, thanks!
My sbt configuration has the following:
val sparkV = "1.0.0"
// ...
"org.apache.spark" %% "spark-core" % sparkV,
"org.apache.spark" %% "spark-mllib" % sparkV,
"org.apache.spark" %% "spark-sql" % sparkV,
Here's the stack trace:
java.lang.IncompatibleClassChangeError: Found class org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected
    at org.apache.spark.sql.parquet.AppendingParquetOutputFormat.getDefaultWorkFile(ParquetTableOperations.scala:256)
    at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:251)
    at org.apache.spark.sql.parquet.InsertIntoParquetTable.org$apache$spark$sql$parquet$InsertIntoParquetTable$$writeShard$1(ParquetTableOperations.scala:224)
    at org.apache.spark.sql.parquet.InsertIntoParquetTable$$anonfun$saveAsHadoopFile$1.apply(ParquetTableOperations.scala:242)
    at org.apache.spark.sql.parquet.InsertIntoParquetTable$$anonfun$saveAsHadoopFile$1.apply(ParquetTableOperations.scala:242)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
    at org.apache.spark.scheduler.Task.run(Task.scala:51)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
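For anyone trying to reproduce this: the JVM raises "Found class ..., but interface was expected" when the calling code was compiled against a version in which the type was an interface, while the version loaded at runtime declares it as a class. TaskAttemptContext is a class in Hadoop 1.x and an interface in Hadoop 2.x, so it may help to check which flavour is on the runtime classpath and which jar supplies it. Below is a minimal sketch of such a check. `ClasspathCheck`, `describe`, and `origin` are hypothetical helper names, not part of Spark or Hadoop, and the sketch assumes it runs with the same classpath as the failing job.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical diagnostic helper; not part of Spark or Hadoop.
object ClasspathCheck {

  // Loads the named type without running its static initializers.
  private def load(className: String): Try[Class[_]] =
    Try(Class.forName(className, false, getClass.getClassLoader))

  // Reports whether the type is visible as an interface or as a class.
  def describe(className: String): String =
    load(className) match {
      case Success(c) if c.isInterface => "interface"
      case Success(_)                  => "class"
      case Failure(_)                  => "not on classpath"
    }

  // Reports the jar or directory the type was loaded from, when known.
  def origin(className: String): String =
    load(className).toOption
      .flatMap(c => Option(c.getProtectionDomain.getCodeSource))
      .flatMap(cs => Option(cs.getLocation))
      .map(_.toString)
      .getOrElse("unknown")

  def main(args: Array[String]): Unit = {
    val name = "org.apache.hadoop.mapreduce.TaskAttemptContext"
    println(s"$name is: ${describe(name)}")
    println(s"loaded from: ${origin(name)}")
  }
}
```

If this reports "class", the runtime classpath is carrying Hadoop 1.x jars, and the frame at the top of the trace was presumably compiled against Hadoop 2.x.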
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/SchemaRDD-s-saveAsParquetFile-throws-java-lang-IncompatibleClassChangeError-tp6837.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.