Re: Spark Streaming failing on YARN Cluster

Ramkumar V Wed, 19 Aug 2015 01:07:22 -0700

We are using Cloudera 5.3.1. Since it is one of the earlier versions of
CDH, it doesn't support the latest version of Spark, so I installed
spark-1.4.1 separately on my machine. I am not able to run spark-submit in
cluster mode. How do I put core-site.xml on the classpath? It would be very
helpful if you could explain in detail how to solve this issue.
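
A minimal sketch of one way to approach this, not a confirmed fix for this
cluster: spark-submit adds the directory named by HADOOP_CONF_DIR (or
YARN_CONF_DIR) to its classpath, so pointing that variable at the directory
that holds the client-side core-site.xml is the usual route. The path
/etc/hadoop/conf and the helper name check_hadoop_conf are assumptions made
for illustration only.

```shell
#!/usr/bin/env bash
# ASSUMPTION: /etc/hadoop/conf holds the cluster's client configuration on
# this machine; substitute the real directory if it differs.
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"

# check_hadoop_conf DIR (hypothetical helper): succeeds only if
# DIR/core-site.xml exists and names a default file system, either
# fs.defaultFS or the older fs.default.name.
check_hadoop_conf() {
    local conf="$1/core-site.xml"
    if [ ! -f "$conf" ]; then
        echo "missing: $conf" >&2
        return 1
    fi
    if ! grep -q -e 'fs\.defaultFS' -e 'fs\.default\.name' "$conf"; then
        echo "no default file system set in $conf" >&2
        return 1
    fi
    echo "ok: $conf"
}

# Check before submitting; only a warning is printed if the check fails.
check_hadoop_conf "$HADOOP_CONF_DIR" \
    || echo "fix HADOOP_CONF_DIR before running spark-submit" >&2
```

If the check passes, rerunning spark-submit with --master yarn-cluster from
the same shell would be expected to log "Uploading resource ... -> hdfs://..."
lines instead of "Not copying". The export line can also be placed in
conf/spark-env.sh so that it applies to every submission.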


*Thanks*,
<https://in.linkedin.com/in/ramkumarcs31>


On Fri, Aug 14, 2015 at 8:25 AM, Jeff Zhang <zjf...@gmail.com> wrote:

>
> 15/08/12 13:24:49 INFO Client: Source and destination file systems are the same. Not copying file:/home/hdfs/spark-1.4.1/assembly/target/scala-2.10/spark-assembly-1.4.1-hadoop2.5.0-cdh5.3.5.jar
> 15/08/12 13:24:49 INFO Client: Source and destination file systems are the same. Not copying file:/home/hdfs/spark-1.4.1/external/kafka-assembly/target/spark-streaming-kafka-assembly_2.10-1.4.1.jar
> 15/08/12 13:24:49 INFO Client: Source and destination file systems are the same. Not copying file:/home/hdfs/spark-1.4.1/python/lib/pyspark.zip
> 15/08/12 13:24:49 INFO Client: Source and destination file systems are the same. Not copying file:/home/hdfs/spark-1.4.1/python/lib/py4j-0.8.2.1-src.zip
> 15/08/12 13:24:49 INFO Client: Source and destination file systems are the same. Not copying file:/home/hdfs/spark-1.4.1/examples/src/main/python/streaming/kyt.py
>
>
> diagnostics: Application application_1437639737006_3808 failed 2 times due to AM Container for appattempt_1437639737006_3808_000002 exited with exitCode: -1000 due to: File file:/home/hdfs/spark-1.4.1/python/lib/pyspark.zip does not exist
> .Failing this attempt.. Failing the application.
>
>
>
> The machine where you run Spark is the client machine, while the YARN AM
> runs on another machine. The YARN AM complains that the files are not
> found, as your logs show.
> From the logs, it seems that these files were not copied to HDFS as local
> resources. I suspect that you didn't put core-site.xml on your classpath,
> so Spark cannot detect your remote file system and won't copy the files to
> HDFS as local resources. Usually in yarn-cluster mode, you should see logs
> like the following.
>
> > 15/08/14 10:48:49 INFO yarn.Client: Preparing resources for our AM container
> > 15/08/14 10:48:49 INFO yarn.Client: Uploading resource file:/Users/abc/github/spark/assembly/target/scala-2.10/spark-assembly-1.5.0-SNAPSHOT-hadoop2.6.0.jar -> hdfs://0.0.0.0:9000/user/abc/.sparkStaging/application_1439432662178_0019/spark-assembly-1.5.0-SNAPSHOT-hadoop2.6.0.jar
> > 15/08/14 10:48:50 INFO yarn.Client: Uploading resource file:/Users/abc/github/spark/spark.py -> hdfs://0.0.0.0:9000/user/abc/.sparkStaging/application_1439432662178_0019/spark.py
> > 15/08/14 10:48:50 INFO yarn.Client: Uploading resource file:/Users/abc/github/spark/python/lib/pyspark.zip -> hdfs://0.0.0.0:9000/user/abc/.sparkStaging/application_1439432662178_0019/pyspark.zip
>
> On Thu, Aug 13, 2015 at 2:50 PM, Ramkumar V <ramkumar.c...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I have a cluster of 1 master and 2 slaves. I'm running a Spark Streaming
>> job on the master and I want to utilize all nodes in my cluster. I have
>> specified some parameters like driver memory and executor memory in my
>> code. When I pass --deploy-mode cluster --master yarn-cluster to
>> spark-submit, it gives the following error.
>>
>> Log link: http://pastebin.com/kfyVWDGR
>>
>> How can I fix this issue? Please let me know if I'm doing something wrong.
>>
>>
>> *Thanks*,
>> Ramkumar V
>>
>>
>
>
> --
> Best Regards
>
> Jeff Zhang
>
