Thanks guys.
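
For anyone who finds this thread later, here is roughly the invocation that
ended up working for me (paths and versions are from my local Windows setup,
so adjust as needed):

  pyspark --jars D:\sw\spark-streaming-kafka-assembly_2.10-1.5.0.jar

and then in the shell:

  >>> from pyspark.streaming import StreamingContext
  >>> from pyspark.streaming.kafka import KafkaUtils
  >>> # sc is the SparkContext the pyspark shell already provides
  >>> ssc = StreamingContext(sc, 2)
  >>> kvs = KafkaUtils.createDirectStream(ssc, ['spark'],
  ...                                     {"metadata.broker.list": "localhost:9092"})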

On Wed, Sep 23, 2015 at 3:54 PM, Tathagata Das <t...@databricks.com> wrote:

> SPARK_CLASSPATH is, I believe, deprecated now, so I am not surprised that
> there is some difference in the code paths.
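>
> (If I remember the docs correctly, the supported ways to add a jar these
> days are the spark-submit flags or their conf equivalents, something along
> the lines of:
>
>   $ bin/spark-submit --driver-class-path /path/to/extra.jar ...
>   $ bin/spark-submit --conf spark.executor.extraClassPath=/path/to/extra.jar ...
>
> Note that the extraClassPath settings only prepend entries to the JVM
> classpath and do not ship the jar anywhere, whereas --jars also distributes
> it to the executors, so the two can behave quite differently.)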
>
> On Tue, Sep 22, 2015 at 9:45 PM, Saisai Shao <sai.sai.s...@gmail.com>
> wrote:
>
>> I think it is related to the class loader; the behavior is different for
>> the classpath and for --jars. If you want to know the details, you'd better
>> dig into the source code.
>>
>> Thanks
>> Jerry
>>
>> On Tue, Sep 22, 2015 at 9:10 PM, ayan guha <guha.a...@gmail.com> wrote:
>>
>>> I must have gone mad :) Thanks for pointing it out. I downloaded the
>>> 1.5.0 assembly jar and added it to SPARK_CLASSPATH.
>>>
>>> However, I am getting a new error now:
>>>
>>> >>> kvs = KafkaUtils.createDirectStream(ssc,['spark'],{"metadata.broker.list":'localhost:9092'})
>>>
>>>
>>> ________________________________________________________________________________________________
>>>
>>>   Spark Streaming's Kafka libraries not found in class path. Try one of the following.
>>>
>>>   1. Include the Kafka library and its dependencies with in the
>>>      spark-submit command as
>>>
>>>      $ bin/spark-submit --packages org.apache.spark:spark-streaming-kafka:1.5.0 ...
>>>
>>>   2. Download the JAR of the artifact from Maven Central http://search.maven.org/,
>>>      Group Id = org.apache.spark, Artifact Id = spark-streaming-kafka-assembly, Version = 1.5.0.
>>>      Then, include the jar in the spark-submit command as
>>>
>>>      $ bin/spark-submit --jars <spark-streaming-kafka-assembly.jar> ...
>>>
>>>
>>> ________________________________________________________________________________________________
>>>
>>>
>>>
>>> Traceback (most recent call last):
>>>   File "<stdin>", line 1, in <module>
>>>   File "D:\sw\spark-1.5.0-bin-hadoop2.6\spark-1.5.0-bin-hadoop2.6\python\pyspark\streaming\kafka.py", line 130, in createDirectStream
>>>     raise e
>>> py4j.protocol.Py4JJavaError: An error occurred while calling o30.loadClass.
>>> : java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
>>>         at java.net.URLClassLoader.findClass(Unknown Source)
>>>         at java.lang.ClassLoader.loadClass(Unknown Source)
>>>         at java.lang.ClassLoader.loadClass(Unknown Source)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>>>         at java.lang.reflect.Method.invoke(Unknown Source)
>>>         at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
>>>         at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>>>         at py4j.Gateway.invoke(Gateway.java:259)
>>>         at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>>>         at py4j.commands.CallCommand.execute(CallCommand.java:79)
>>>         at py4j.GatewayConnection.run(GatewayConnection.java:207)
>>>         at java.lang.Thread.run(Unknown Source)
>>>
>>> >>> os.environ['SPARK_CLASSPATH']
>>> 'D:\\sw\\spark-streaming-kafka-assembly_2.10-1.5.0'
>>> >>>
>>>
>>>
>>> So I launched pyspark with --jars pointing at the assembly jar. Now it is
>>> working.
>>>
>>> THANK YOU for the help.
>>>
>>> Out of curiosity: why did adding it to SPARK_CLASSPATH not work?
>>>
>>> Best
>>> Ayan
>>>
>>> On Wed, Sep 23, 2015 at 2:25 AM, Saisai Shao <sai.sai.s...@gmail.com>
>>> wrote:
>>>
>>>> I think you're using the wrong version of the Kafka assembly jar. I think
>>>> the Python API for the direct Kafka stream is not supported in Spark 1.3.0,
>>>> so you'd better change to version 1.5.0. It looks like you're using Spark
>>>> 1.5.0, so why did you choose the 1.3.0 Kafka assembly?
>>>>
>>>>
>>>> D:\sw\spark-streaming-kafka-assembly_2.10-1.3.0.jar
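>>>>
>>>> (As a quick sanity check, you can confirm the version actually running
>>>> from the pyspark shell, e.g.:
>>>>
>>>>   >>> sc.version   # the shell echoes the Spark version string here
>>>>
>>>> and compare it with the assembly jar you put on the classpath.)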
>>>>
>>>>
>>>>
>>>> On Tue, Sep 22, 2015 at 6:41 AM, ayan guha <guha.a...@gmail.com> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> I have added the spark-streaming-kafka assembly jar to SPARK_CLASSPATH:
>>>>>
>>>>> >>> print os.environ['SPARK_CLASSPATH']
>>>>> D:\sw\spark-streaming-kafka-assembly_2.10-1.3.0.jar
>>>>>
>>>>>
>>>>> Now I am facing the issue below with a test topic:
>>>>>
>>>>> >>> ssc = StreamingContext(sc, 2)
>>>>> >>> kvs = KafkaUtils.createDirectStream(ssc,['spark'],{"metadata.broker.list":'localhost:9092'})
>>>>> Traceback (most recent call last):
>>>>>   File "<stdin>", line 1, in <module>
>>>>>   File "D:\sw\spark-1.5.0-bin-hadoop2.6\spark-1.5.0-bin-hadoop2.6\python\pyspark\streaming\kafka.py", line 126, in createDirectStream
>>>>>     jstream = helper.createDirectStream(ssc._jssc, kafkaParams, set(topics), jfromOffsets)
>>>>>   File "D:\sw\spark-1.5.0-bin-hadoop2.6\spark-1.5.0-bin-hadoop2.6\python\lib\py4j-0.8.2.1-src.zip\py4j\java_gateway.py", line 538, in __call__
>>>>>   File "D:\sw\spark-1.5.0-bin-hadoop2.6\spark-1.5.0-bin-hadoop2.6\python\pyspark\sql\utils.py", line 36, in deco
>>>>>     return f(*a, **kw)
>>>>>   File "D:\sw\spark-1.5.0-bin-hadoop2.6\spark-1.5.0-bin-hadoop2.6\python\lib\py4j-0.8.2.1-src.zip\py4j\protocol.py", line 304, in get_return_value
>>>>> py4j.protocol.Py4JError: An error occurred while calling o22.createDirectStream.
>>>>>  Trace:
>>>>> py4j.Py4JException: Method createDirectStream([class org.apache.spark.streaming.api.java.JavaStreamingContext, class java.util.HashMap, class java.util.HashSet, class java.util.HashMap]) does not exist
>>>>>         at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:333)
>>>>>         at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
>>>>>         at py4j.Gateway.invoke(Gateway.java:252)
>>>>>         at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>>>>>         at py4j.commands.CallCommand.execute(CallCommand.java:79)
>>>>>         at py4j.GatewayConnection.run(GatewayConnection.java:207)
>>>>>         at java.lang.Thread.run(Unknown Source)
>>>>>
>>>>>
>>>>> >>>
>>>>>
>>>>> Am I doing something wrong?
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>> Ayan Guha
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Best Regards,
>>> Ayan Guha
>>>
>>
>>
>


-- 
Best Regards,
Ayan Guha
