Looks like the exception was thrown from this line (ResultTask.runTask):

    ByteBuffer.wrap(taskBinary.value), Thread.currentThread.getContextClassLoader)
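A quick aside on the error message itself: "[B" is the JVM's runtime name for byte[] (as returned by Class.getName()), so the executor is failing to cast the deserialized broadcast value to a byte array. A minimal, Spark-free sketch of the same failure mode in plain Java — the HashMap here is only a hypothetical stand-in for SerializableConfiguration:

```java
public class CastToByteArrayDemo {
    public static void main(String[] args) {
        // "[B" is the JVM's internal name for the byte[] class
        System.out.println(new byte[0].getClass().getName()); // prints "[B"

        // Stand-in for the unexpected SerializableConfiguration object
        Object notBytes = new java.util.HashMap<String, String>();
        try {
            // Same shape of cast the task-deserialization path performs on taskBinary.value
            byte[] bytes = (byte[]) notBytes;
            System.out.println(bytes.length);
        } catch (ClassCastException e) {
            // Message ends with "... cannot be cast to [B" (exact wording varies by JVM version)
            System.out.println(e.getMessage());
        }
    }
}
```

So the log line "SerializableConfiguration cannot be cast to [B" means the broadcast slot that should have held the serialized task bytes held a SerializableConfiguration instead.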
Comment for taskBinary says:

 * @param taskBinary broadcasted version of the serialized RDD and the function to apply on each
 *                   partition of the given RDD. Once deserialized, the type should be
 *                   (RDD[T], (TaskContext, Iterator[T]) => U).

Can you write a simple test which reproduces this problem?

Thanks

On Wed, May 11, 2016 at 3:40 AM, Daniel Haviv <daniel.ha...@veracity-group.com> wrote:

> Hi,
> I'm running a very simple job (textFile->map->groupby->count) with Spark
> 1.6.0 on EMR 4.3 (Hadoop 2.7.1) and hitting this exception when running on
> yarn-client but not in local mode:
>
> 16/05/11 10:29:26 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID
> 1, ip-172-31-33-97.ec2.internal, partition 0,NODE_LOCAL, 15116 bytes)
> 16/05/11 10:29:26 WARN TaskSetManager: Lost task 0.1 in stage 0.0 (TID 1,
> ip-172-31-33-97.ec2.internal): java.lang.ClassCastException:
> org.apache.spark.util.SerializableConfiguration cannot be cast to [B
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
>         at org.apache.spark.scheduler.Task.run(Task.scala:89)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
>
> I found a jira that relates to streaming and accumulators, but I'm using
> neither.
>
> Any ideas? Should I file a jira?
>
> Thank you,
> Daniel