Hi,
I'm trying to use the Spark Sink with Flume, but it seems I'm missing some
of the dependencies.
I'm running the following code:
./bin/spark-shell --master yarn --jars \
/home/impact/flumeStreaming/spark-streaming-flume_2.10-1.6.1.jar,/home/impact/flumeStreaming/flume-ng-core-1.6.0.jar,/home/impact/flumeStreaming/flume-ng-sdk-1.6.0.jar
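import org.apache.spark.streaming.flume._
import org.apache.spark.streaming._
// 60-second batches on the shell's existing SparkContext (sc)
val ssc = new StreamingContext(sc, Seconds(60))
// Poll the Spark Sink on host impact1, port 9999
val flumeStream = FlumeUtils.createPollingStream(ssc, "impact1", 9999)
flumeStream.print()
ssc.start()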
This fails with the following exception:
16/03/20 18:17:17 INFO scheduler.ReceiverTracker: Registered receiver for stream 0 from impact3.indigo.co.il:51581
16/03/20 18:17:17 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 76, impact3.indigo.co.il): java.lang.NoClassDefFoundError: org/apache/spark/streaming/flume/sink/SparkFlumeProtocol$Callback
    at org.apache.spark.streaming.flume.FlumePollingReceiver$$anonfun$onStart$1.apply(FlumePollingInputDStream.scala:84)
Which dependencies am I missing?
Thank you.
Daniel