[ https://issues.apache.org/jira/browse/TOREE-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084653#comment-17084653 ]
Marcello Leida commented on TOREE-428:
--------------------------------------

Confirming here as well that the workaround is not working.

> Can't use case class in the Scala notebook
> ------------------------------------------
>
>                 Key: TOREE-428
>                 URL: https://issues.apache.org/jira/browse/TOREE-428
>             Project: TOREE
>          Issue Type: Bug
>          Components: Build
>            Reporter: Haifeng Li
>            Priority: Major
>             Fix For: 0.2.0
>
>
> the version of docker:
> jupyter/all-spark-notebook:latest
> the way to start docker:
> docker run -it --rm -p 8888:8888 jupyter/all-spark-notebook:latest
> or
> docker ps -a
> docker start -i containerID
> the steps:
> Visit http://localhost:8888
> Start a Toree notebook
> input the code below
> {code:java}
> import spark.implicits._
> val p = spark.sparkContext.textFile("../Data/person.txt")
> val pmap = p.map(_.split(","))
> pmap.collect()
> {code}
> the output:
> res0: Array[Array[String]] = Array(Array(Barack, Obama, 53), Array(George, Bush, 68), Array(Bill, Clinton, 68))
> {code:java}
> case class Persons(first_name: String, last_name: String, age: Int)
> val personRDD = pmap.map(p => Persons(p(0), p(1), p(2).toInt))
> personRDD.take(1)
> {code}
> the error message:
> {code:java}
> org.apache.spark.SparkDriverExecutionException: Execution error
>   at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1186)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1711)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
>   at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>   at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2043)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2062)
>   at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1354)
>   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>   at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
>   at org.apache.spark.rdd.RDD.take(RDD.scala:1327)
>   ... 39 elided
> Caused by: java.lang.ArrayStoreException: [LPersons;
>   at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:90)
>   at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:2043)
>   at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:2043)
>   at org.apache.spark.scheduler.JobWaiter.taskSucceeded(JobWaiter.scala:59)
>   at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1182)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1711)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
>   at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> {code}
> The above code works in the spark-shell. From the error message, I
> suspect that the driver program did not correctly handle the case class
> Persons when processing the RDD partition.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)