[jira] [Comment Edited] (SPARK-11193) Spark 1.5+ Kinesis Streaming - ClassCastException when starting KinesisReceiver
[ https://issues.apache.org/jira/browse/SPARK-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046079#comment-15046079 ]

Jean-Baptiste Onofré edited comment on SPARK-11193 at 12/8/15 4:06 PM:
---

To reproduce the issue, I added the following test to KryoSerializerSuite:

{code}
test("Bug: SPARK-11193") {
  val ser = new KryoSerializer(conf.clone.set("spark.kryo.registrationRequired", "false"))
    .newInstance()
  // A HashMap with the SynchronizedMap trait mixed in, as used in KinesisReceiver.
  val myMap: mutable.HashMap[String, String] =
    new mutable.HashMap[String, String] with mutable.SynchronizedMap[String, String]
  myMap.put("foo", "bar")
  // Round-trip the map through Kryo.
  val myMapBytes = ser.serialize(myMap)
  val deserialized: mutable.HashMap[String, String] with mutable.SynchronizedMap[String, String] =
    ser.deserialize(myMapBytes)
  deserialized.clear()
}
{code}

When running this test, I got:

{code}
scala.collection.mutable.HashMap cannot be cast to scala.collection.mutable.SynchronizedMap
java.lang.ClassCastException: scala.collection.mutable.HashMap cannot be cast to scala.collection.mutable.SynchronizedMap
{code}

This is similar to what happens in KinesisReceiver. I'm working out the fix needed in KryoSerializer.

was (Author: jbonofre):
I'm adding a test in KryoSerializerSuite about the support of SynchronizedMap.
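The exception message in the test above indicates that the object coming back from deserialization is a plain mutable.HashMap, without the SynchronizedMap trait that was mixed in before serialization. The following is only an illustrative sketch of how that can happen in general; it is not Kryo's or Spark's code. The `Marker` trait (standing in for mutable.SynchronizedMap) and the `rebuild` helper are hypothetical names, and the assumption that a serializer rebuilds the map from its entries is not confirmed anywhere in this thread.

```scala
import scala.collection.mutable

// Hypothetical marker trait standing in for mutable.SynchronizedMap.
trait Marker

object MixinLossSketch {
  // Hypothetical helper: copies the entries into a fresh, plain mutable.HashMap.
  // Assumption: a collection-aware serializer could rebuild a map this way.
  def rebuild(m: mutable.Map[String, String]): mutable.HashMap[String, String] = {
    val copy = new mutable.HashMap[String, String]
    m.foreach { case (k, v) => copy.put(k, v) }
    copy
  }

  def main(args: Array[String]): Unit = {
    // The runtime class here is an anonymous subclass carrying the mixin.
    val original = new mutable.HashMap[String, String] with Marker
    original.put("foo", "bar")

    // The copy has the same entries, but its runtime class is plain HashMap.
    val rebuilt = rebuild(original)

    println(original.isInstanceOf[Marker]) // true
    println(rebuilt.isInstanceOf[Marker])  // false
    println(rebuilt == original)           // true: map equality compares entries only
  }
}
```

Any code that later treats the rebuilt map as carrying the mixin would fail with a ClassCastException much like the one shown in the test output.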
> Spark 1.5+ Kinesis Streaming - ClassCastException when starting KinesisReceiver
> ---
>
>                 Key: SPARK-11193
>                 URL: https://issues.apache.org/jira/browse/SPARK-11193
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 1.5.0, 1.5.1
>            Reporter: Phil Kallos
>         Attachments: screen.png
>
>
> After upgrading from Spark 1.4.x -> 1.5.x, I am now unable to start a Kinesis
> Spark Streaming application, and am being consistently greeted with this
> exception:
> java.lang.ClassCastException: scala.collection.mutable.HashMap cannot be cast to scala.collection.mutable.SynchronizedMap
>     at org.apache.spark.streaming.kinesis.KinesisReceiver.onStart(KinesisReceiver.scala:175)
>     at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
>     at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
>     at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:542)
>     at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:532)
>     at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:1982)
>     at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:1982)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>     at org.apache.spark.scheduler.Task.run(Task.scala:88)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Worth noting that I am able to reproduce this issue locally, and also on
> Amazon EMR (using the latest emr-release 4.1.0 which packages Spark 1.5.0).
> Also, I am not able to run the included kinesis-asl example.
> Built locally using:
> git checkout v1.5.1
> mvn -Pyarn -Pkinesis-asl -Phadoop-2.6 -DskipTests clean package
> Example run command:
> bin/run-example streaming.KinesisWordCountASL phibit-test kinesis-connector https://kinesis.us-east-1.amazonaws.com

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-11193) Spark 1.5+ Kinesis Streaming - ClassCastException when starting KinesisReceiver
[ https://issues.apache.org/jira/browse/SPARK-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974653#comment-14974653 ]

Jean-Baptiste Onofré edited comment on SPARK-11193 at 10/30/15 6:13 AM:

Thanks for the update. I'm on vacation until Wednesday. I will resume my investigation and work on the fix next Thursday.

was (Author: jbonofre):
Hanks for thé update. I'm in vacation up to Wednesday. I will resume my investigation/fix on this next Thursday.
[jira] [Comment Edited] (SPARK-11193) Spark 1.5+ Kinesis Streaming - ClassCastException when starting KinesisReceiver
[ https://issues.apache.org/jira/browse/SPARK-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974481#comment-14974481 ]

Fabien Comte edited comment on SPARK-11193 at 10/26/15 4:29 PM:

Same problem with EMR 4.1.0 and Java 8. I am using spark-streaming-kinesis-asl_2.10 version 1.5.0.
Serialization/deserialization of the receiver fails with the Kryo serializer; everything works fine with the default Java serializer.

was (Author: comtef):
Same problem with EMR 4.1.0 and Java 8. I am using spark-streaming-kinesis-asl_2.10 in version 1.5.0.
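The comment above reports that the receiver fails to round-trip with the Kryo serializer but works with the default Java serializer. As a hedged sketch only, an application could pin the Java serializer explicitly through the `spark.serializer` setting. This assumes Spark is on the classpath and that falling back to Java serialization is acceptable for the job; the object name and application name below are hypothetical, and nobody in this thread proposes this as the fix.

```scala
import org.apache.spark.SparkConf

object JavaSerializerConfSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kinesis-streaming-app") // hypothetical application name
      // Explicitly select Java serialization instead of Kryo.
      .set("spark.serializer", "org.apache.spark.serializer.JavaSerializer")

    // Show the serializer class that will be used.
    println(conf.get("spark.serializer"))
  }
}
```

The same setting can also be passed on the command line with `--conf spark.serializer=org.apache.spark.serializer.JavaSerializer`, which avoids changing application code.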
[jira] [Comment Edited] (SPARK-11193) Spark 1.5+ Kinesis Streaming - ClassCastException when starting KinesisReceiver
[ https://issues.apache.org/jira/browse/SPARK-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14965133#comment-14965133 ]

Jean-Baptiste Onofré edited comment on SPARK-11193 at 10/20/15 1:58 PM:

I guess you compiled with Scala 2.10.4 (the default). Can you try building with -Pscala-2.11 to see if it helps?

was (Author: jbonofre):
I guess you used Scala 2.10.4 (default) for the compilation. Can you try to add scala-2.11 in the pom.xml properties to build with scala 2.11 and see if it helps ?
[jira] [Comment Edited] (SPARK-11193) Spark 1.5+ Kinesis Streaming - ClassCastException when starting KinesisReceiver
[ https://issues.apache.org/jira/browse/SPARK-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14965210#comment-14965210 ]

Jean-Baptiste Onofré edited comment on SPARK-11193 at 10/20/15 2:50 PM:

It seems to work fine for me:

{code}
jbonofre@latitude:~/Workspace/spark/bin$ ./run-example streaming.KinesisWordCountASL jbonofre-test kinesis-connector https://kinesis.us-east-1.amazonaws.com
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/10/20 16:48:09 INFO StreamingExamples: Setting log level to [WARN] for streaming example. To override add a custom log4j.properties to the classpath.
15/10/20 16:48:10 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/20 16:48:10 WARN Utils: Your hostname, latitude resolves to a loopback address: 127.0.1.1; using 192.168.134.10 instead (on interface eth0)
15/10/20 16:48:10 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/10/20 16:48:11 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
---
Time: 1445352496000 ms
---
---
Time: 1445352498000 ms
---
---
Time: 144535250 ms
---
{code}

was (Author: jbonofre):
It looks to work fine to me:

{code}
jbonofre@latitude:~/Workspace/spark/bin$ ./run-example streaming.KinesisWordCountASL jbonofre-test kinesis-connector https://kinesis.us-east-1.amazonaws.com
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/10/20 16:48:09 INFO StreamingExamples: Setting log level to [WARN] for streaming example. To override add a custom log4j.properties to the classpath.
15/10/20 16:48:10 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/20 16:48:10 WARN Utils: Your hostname, latitude resolves to a loopback address: 127.0.1.1; using 192.168.134.10 instead (on interface eth0)
15/10/20 16:48:10 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/10/20 16:48:11 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
---
Time: 1445352496000 ms
---
---
Time: 1445352498000 ms
---
---
Time: 144535250 ms
---
{/code}
[jira] [Comment Edited] (SPARK-11193) Spark 1.5+ Kinesis Streaming - ClassCastException when starting KinesisReceiver
[ https://issues.apache.org/jira/browse/SPARK-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14965210#comment-14965210 ]

Jean-Baptiste Onofré edited comment on SPARK-11193 at 10/20/15 2:53 PM:

It seems to work fine for me:

{code}
jbonofre@latitude:~/Workspace/spark/bin$ ./run-example streaming.KinesisWordCountASL jbonofre-test kinesis-connector https://kinesis.us-east-1.amazonaws.com
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/10/20 16:48:09 INFO StreamingExamples: Setting log level to [WARN] for streaming example. To override add a custom log4j.properties to the classpath.
15/10/20 16:48:10 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/20 16:48:10 WARN Utils: Your hostname, latitude resolves to a loopback address: 127.0.1.1; using 192.168.134.10 instead (on interface eth0)
15/10/20 16:48:10 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/10/20 16:48:11 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
---
Time: 1445352496000 ms
---
---
Time: 1445352498000 ms
---
---
Time: 144535250 ms
---
{code}

Let me produce some data on the shards.

was (Author: jbonofre):
It looks to work fine to me:

{code}
jbonofre@latitude:~/Workspace/spark/bin$ ./run-example streaming.KinesisWordCountASL jbonofre-test kinesis-connector https://kinesis.us-east-1.amazonaws.com
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/10/20 16:48:09 INFO StreamingExamples: Setting log level to [WARN] for streaming example. To override add a custom log4j.properties to the classpath.
15/10/20 16:48:10 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/20 16:48:10 WARN Utils: Your hostname, latitude resolves to a loopback address: 127.0.1.1; using 192.168.134.10 instead (on interface eth0)
15/10/20 16:48:10 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/10/20 16:48:11 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
---
Time: 1445352496000 ms
---
---
Time: 1445352498000 ms
---
---
Time: 144535250 ms
---
{code}