[ https://issues.apache.org/jira/browse/SPARK-19776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Russell Abedin updated SPARK-19776:
-----------------------------------
    Description: 
My question is: is the example at https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/streaming/JavaKafkaWordCount.java correct for Spark 2.1?

I'm pretty new to both Spark and Java. I wanted to find an example of Spark Streaming using Java, streaming from Kafka. The JavaKafkaWordCount example at https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/streaming/JavaKafkaWordCount.java looked to be perfect.

However, when I tried running it, I found a couple of issues that I needed to 
overcome.

1. This line was unnecessary:
{code}
StreamingExamples.setStreamingLogLevels();
{code}

Having this line in there (and the associated import) caused me to go looking for a dependency, spark-examples_2.10, which is of no real use to me.
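For reference, a minimal sketch of quieting the logging without that dependency (assuming the same JavaStreamingContext setup as the example) could be:

{code}
SparkConf sparkConf = new SparkConf().setAppName("JavaKafkaWordCount");
JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, new Duration(2000));

// Instead of StreamingExamples.setStreamingLogLevels(), set the log level
// directly on the underlying context; no spark-examples artifact is needed.
jssc.sparkContext().setLogLevel("WARN");
{code}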

2. After running it, this line: 
{code}
JavaPairReceiverInputDStream<String, String> messages = 
KafkaUtils.createStream(jssc, args[0], args[1], topicMap);
{code}

appeared to throw an error around logging:
{code}
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/Logging
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.apache.spark.streaming.kafka.KafkaUtils$.createStream(KafkaUtils.scala:91
        at org.apache.spark.streaming.kafka.KafkaUtils$.createStream(KafkaUtils.scala:66
        at org.apache.spark.streaming.kafka.KafkaUtils$.createStream(KafkaUtils.scala:11
        at org.apache.spark.streaming.kafka.KafkaUtils.createStream(KafkaUtils.scala)
        at main.java.com.cm.JavaKafkaWordCount.main(JavaKafkaWordCount.java:72)
{code}

To get around this, I found that the code sample in https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html helped me to come up with the right lines to see streaming from Kafka in action. Specifically, it calls createDirectStream instead of createStream.
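For context, here is a sketch of that direct approach based on the integration guide. The broker address, group id and topic name are placeholder values, and it assumes the usual imports from the guide and the example (the org.apache.spark.streaming.kafka010 classes, ConsumerRecord, StringDeserializer, the Java DStream classes, scala.Tuple2, java.util.*) plus the JavaStreamingContext jssc from above:

{code}
// Sketch of the receiver-less "direct" stream from the 0-10 integration guide.
// bootstrap.servers, group.id and the topic name are placeholder values.
Map<String, Object> kafkaParams = new HashMap<>();
kafkaParams.put("bootstrap.servers", "localhost:9092");
kafkaParams.put("key.deserializer", StringDeserializer.class);
kafkaParams.put("value.deserializer", StringDeserializer.class);
kafkaParams.put("group.id", "java-kafka-wordcount");
kafkaParams.put("auto.offset.reset", "latest");
kafkaParams.put("enable.auto.commit", false);

Collection<String> topics = Arrays.asList("my-topic");

JavaInputDStream<ConsumerRecord<String, String>> messages =
    KafkaUtils.createDirectStream(
        jssc,
        LocationStrategies.PreferConsistent(),
        ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));

// Word count over the record values, as in the original example.
JavaDStream<String> lines = messages.map(ConsumerRecord::value);
JavaDStream<String> words = lines.flatMap(x -> Arrays.asList(x.split(" ")).iterator());
JavaPairDStream<String, Integer> wordCounts =
    words.mapToPair(s -> new Tuple2<>(s, 1)).reduceByKey((a, b) -> a + b);
wordCounts.print();
{code}

This also assumes the spark-streaming-kafka-0-10_2.11 artifact (matching the Spark version) is on the classpath in place of the older spark-streaming-kafka connector.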

So is the example in https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/streaming/JavaKafkaWordCount.java correct, or is there something I could have done differently to get that example working?


> Is the JavaKafkaWordCount example correct for Spark version 2.1?
> ----------------------------------------------------------------
>
>                 Key: SPARK-19776
>                 URL: https://issues.apache.org/jira/browse/SPARK-19776
>             Project: Spark
>          Issue Type: Question
>          Components: Examples, ML
>    Affects Versions: 2.1.0
>            Reporter: Russell Abedin
>


