Re: [Spark 1.6] Spark Streaming - java.lang.AbstractMethodError
Right, if you are using the GitHub version, just modify the ReceiverLauncher and add that. I will fix it for Spark 1.6 and release a new version on spark-packages for Spark 1.6.

Dibyendu

On Thu, Jan 7, 2016 at 4:14 PM, Ted Yu wrote:
> I cloned g...@github.com:dibbhatt/kafka-spark-consumer.git a moment ago.
>
> In ./src/main/java/consumer/kafka/ReceiverLauncher.java , I see:
>
>   jsc.addStreamingListener(new StreamingListener() {
>
> There is no onOutputOperationStarted method implementation.
>
> Looks like it should be added for Spark 1.6.0.
>
> Cheers
>
> On Thu, Jan 7, 2016 at 2:39 AM, Dibyendu Bhattacharya <dibyendu.bhattach...@gmail.com> wrote:
>> You are using the low level spark kafka consumer. I am the author of the same.
>>
>> Are you using the spark-packages version? If yes, which one?
>>
>> Regards,
>> Dibyendu
>>
>> On Thu, Jan 7, 2016 at 4:07 PM, Jacek Laskowski wrote:
>>> Hi,
>>>
>>> Do you perhaps use a custom StreamingListener?
>>> `StreamingListenerBus.scala:47` calls
>>> `StreamingListener.onOutputOperationStarted`, which was added in
>>> [SPARK-10900] [STREAMING] Add output operation events to StreamingListener [1].
>>>
>>> The other guess could be that at runtime you still use Spark < 1.6.
>>>
>>> [1] https://issues.apache.org/jira/browse/SPARK-10900
>>>
>>> Pozdrawiam,
>>> Jacek
>>>
>>> Jacek Laskowski | https://medium.com/@jaceklaskowski/
>>> Mastering Apache Spark
>>> ==> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/
>>> Follow me at https://twitter.com/jaceklaskowski
>>>
>>> On Thu, Jan 7, 2016 at 10:59 AM, Walid LEZZAR wrote:
>>> > Hi,
>>> >
>>> > We have been using Spark Streaming for a little while now.
>>> >
>>> > Until now, we were running our Spark Streaming jobs on Spark 1.5.1 and it
>>> > was working well. Yesterday, we upgraded to Spark 1.6.0 without any changes
>>> > in the code. But our streaming jobs are not working any more. We are getting
>>> > an "AbstractMethodError". Please find the stack trace at the end of the
>>> > mail. Can we have some hints on what this error means? (We are using Spark
>>> > to connect to Kafka.)
>>> >
>>> > The stack trace:
>>> > [snip -- quoted in full in the original message below]
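The diagnosis above comes down to binary compatibility: SPARK-10900 added new callbacks to the StreamingListener contract, and a listener compiled against the Spark 1.5 contract carries no bytecode for them, so the 1.6 StreamingListenerBus hits AbstractMethodError when it invokes one. A minimal, self-contained Java sketch of the mechanism -- `ListenerV2` and its method names are hypothetical stand-ins, not the real Spark API:

```java
// Hypothetical stand-in for the StreamingListener contract. ListenerV2 and
// its method names are illustrative only; they are NOT the real Spark API.
interface ListenerV2 {
    // Callback present since "V1" of the contract.
    String onBatchCompleted(String batch);

    // Callback added in a "later release". Because it ships with a default
    // body here, a listener written against V1 still satisfies the contract.
    // The real StreamingListener is a Scala trait whose added methods compile
    // to abstract methods, so a listener class file built against Spark 1.5
    // has no implementation for them -- and invoking one raises
    // java.lang.AbstractMethodError at runtime, as in the trace above.
    default String onOutputOperationStarted(String op) {
        return "default:" + op;
    }
}

public class ListenerDemo {
    public static void main(String[] args) {
        // An "old" listener that only implements the V1 callback.
        ListenerV2 oldListener = batch -> "completed " + batch;

        System.out.println(oldListener.onBatchCompleted("batch-1"));
        // Resolves to the default body instead of blowing up.
        System.out.println(oldListener.onOutputOperationStarted("op-1"));
    }
}
```

This is also why recompiling the consumer with the missing method overridden (Ted's observation about ReceiverLauncher) fixes the error.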
Re: [Spark 1.6] Spark Streaming - java.lang.AbstractMethodError
Some discussion is in https://github.com/dibbhatt/kafka-spark-consumer and some is mentioned in https://issues.apache.org/jira/browse/SPARK-11045. Let me know if those answer your question.

In short, the Direct Stream is a good choice if you need exactly-once semantics and message ordering, but many use cases do not require exactly-once or ordering. With the Direct Stream, the RDD processing parallelism is limited to the number of Kafka partitions, and you need to store offset details in an external store, because the checkpoint location is not reliable if you modify the driver code. In receiver-based mode, on the other hand, you need to enable the WAL for no data loss. But the Spark receiver-based consumer from KafkaUtils, which uses the Kafka high-level API, has serious issues, so if you do need to switch to receiver-based mode, this low-level consumer is a better choice.

Performance-wise, I have not published any numbers yet, but from the internal testing and benchmarking I did (validated by folks who use this consumer), it performs much better than any existing consumer in Spark.

Regards,
Dibyendu

On Thu, Jan 7, 2016 at 4:28 PM, Jacek Laskowski wrote:
> On Thu, Jan 7, 2016 at 11:39 AM, Dibyendu Bhattacharya wrote:
> > You are using the low level spark kafka consumer. I am the author of the same.
>
> If I may ask, what are the differences between this and the direct
> version shipped with Spark? I've just started toying with it, and
> would appreciate some guidance. Thanks.
>
> Jacek
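The offset bookkeeping described above for the direct approach -- the application, not Spark's checkpoint, owns the offsets -- can be sketched as follows. This is an illustrative simulation, not Spark or Kafka API: the `HashMap` is a hypothetical stand-in for a real external store such as ZooKeeper or a database, and `processBatch` stands in for the work done inside a foreachRDD:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of application-managed offsets. None of these names
// belong to the Spark or Kafka APIs.
public class OffsetTrackingDemo {
    // partition -> next offset to read; stands in for an external store.
    static final Map<Integer, Long> externalStore = new HashMap<>();

    static void processBatch(int partition, long startOffset, List<String> records) {
        for (String record : records) {
            // process the record here
        }
        // Commit only after the whole batch is processed, so a restart from
        // the stored offset re-reads the batch (at-least-once) instead of
        // silently skipping data.
        externalStore.put(partition, startOffset + records.size());
    }

    public static void main(String[] args) {
        processBatch(0, 0L, Arrays.asList("a", "b", "c"));
        System.out.println("next offset for partition 0: " + externalStore.get(0));
    }
}
```

Because offsets live outside Spark, they survive driver-code changes that would invalidate a Spark checkpoint.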
[Spark 1.6] Spark Streaming - java.lang.AbstractMethodError
Hi,

We have been using Spark Streaming for a little while now.

Until now, we were running our Spark Streaming jobs on Spark 1.5.1 and it was working well. Yesterday, we upgraded to Spark 1.6.0 without any changes in the code. But our streaming jobs are not working any more. We are getting an "AbstractMethodError". Please find the stack trace at the end of the mail. Can we have some hints on what this error means? (We are using Spark to connect to Kafka.)

The stack trace:

16/01/07 10:44:39 INFO ZkState: Starting curator service
16/01/07 10:44:39 INFO CuratorFrameworkImpl: Starting
16/01/07 10:44:39 INFO ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=12 watcher=org.apache.curator.ConnectionState@2e9fa23a
16/01/07 10:44:39 INFO ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
16/01/07 10:44:39 INFO ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
16/01/07 10:44:39 INFO ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x1521b6d262e0035, negotiated timeout = 6
16/01/07 10:44:39 INFO ConnectionStateManager: State change: CONNECTED
16/01/07 10:44:40 INFO PartitionManager: Read partition information from: /spark-kafka-consumer/StreamingArchiver/lbc.job.multiposting.input/partition_0 --> null
16/01/07 10:44:40 INFO JobScheduler: Added jobs for time 145215988 ms
16/01/07 10:44:40 INFO JobScheduler: Starting job streaming job 145215988 ms.0 from job set of time 145215988 ms
16/01/07 10:44:40 ERROR Utils: uncaught error in thread StreamingListenerBus, stopping SparkContext

ERROR Utils: uncaught error in thread StreamingListenerBus, stopping SparkContext
java.lang.AbstractMethodError
        at org.apache.spark.streaming.scheduler.StreamingListenerBus.onPostEvent(StreamingListenerBus.scala:47)
        at org.apache.spark.streaming.scheduler.StreamingListenerBus.onPostEvent(StreamingListenerBus.scala:26)
        at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
        at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
        at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
        at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
        at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
        at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
        at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1180)
        at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
16/01/07 10:44:40 INFO JobScheduler: Finished job streaming job 145215988 ms.0 from job set of time 145215988 ms
16/01/07 10:44:40 INFO JobScheduler: Total delay: 0.074 s for time 145215988 ms (execution: 0.032 s)
16/01/07 10:44:40 ERROR JobScheduler: Error running job streaming job 145215988 ms.0
java.lang.IllegalStateException: SparkContext has been shutdown
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:920)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:918)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
        at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:918)
        at org.apache.spark.api.java.JavaRDDLike$class.foreachPartition(JavaRDDLike.scala:225)
        at org.apache.spark.api.java.AbstractJavaRDDLike.foreachPartition(JavaRDDLike.scala:46)
        at fr.leboncoin.morpheus.jobs.streaming.StreamingArchiver.lambda$run$ade930b4$1(StreamingArchiver.java:103)
        at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$3.apply(JavaDStreamLike.scala:335)
        at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$3.apply(JavaDStreamLike.scala:335)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:661)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:661)
        at