Re: How to control Spark Executors from getting Lost when using YARN client mode?

2015-08-04 Thread Jeff Zhang
Please check the NodeManager logs to see why the container was killed.
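
If YARN log aggregation is enabled, something like the following will
pull the container logs once the application finishes (a sketch; the
application ID is a placeholder for the one shown in the ResourceManager
UI). Without aggregation, look in the NodeManager's local log directory
on the host that ran the container.

  # find the application ID of the failed job
  yarn application -list -appStates ALL
  # fetch the aggregated container logs, which include the reason
  # the container was killed
  yarn logs -applicationId application_1438600000000_0001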



-- 
Best Regards

Jeff Zhang


Re: How to control Spark Executors from getting Lost when using YARN client mode?

2015-08-03 Thread Umesh Kacha
Hi all, any help will be much appreciated. My Spark job runs fine, but
partway through it starts losing executors because of a
MetadataFetchFailedException saying the shuffle was not found at its
location, since the executor was lost.


Re: How to control Spark Executors from getting Lost when using YARN client mode?

2015-07-31 Thread Umesh Kacha
Hi, thanks for the response. It looks like the YARN container is getting
killed, but I don't know why; I see a shuffle MetadataFetchFailedException,
as described in the SO link below. I have enough memory: 8 nodes with 8
cores and 30 GB of memory each. Because of this MetadataFetchFailedException,
YARN is killing the container running the executor. How can it overrun
memory? I tried giving each executor 25 GB, and still it is not sufficient
and the job fails. Please guide me; I don't understand what is going on.
I am using Spark 1.4.0, with spark.shuffle.memoryFraction set to 0.0 and
spark.storage.memoryFraction set to 0.5. I have almost all the recommended
properties in place, such as the Kryo serializer, an Akka frame size of 500,
and 20 Akka threads. I don't know what else to try; I have been trying to
recover from this issue for two days.

http://stackoverflow.com/questions/29850784/what-are-the-likely-causes-of-org-apache-spark-shuffle-metadatafetchfailedexcept
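
For reference, a sketch of how I am passing those properties to
spark-submit (the class and jar names are the ones from my original
submit command later in this thread):

  ./spark-submit --class com.xyz.MySpark --master yarn-client \
    --conf spark.shuffle.memoryFraction=0.0 \
    --conf spark.storage.memoryFraction=0.5 \
    --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
    --conf spark.akka.frameSize=500 \
    --conf spark.akka.threads=20 \
    /home/myuser/myspark-1.0.jar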





Re: How to control Spark Executors from getting Lost when using YARN client mode?

2015-07-30 Thread Ashwin Giridharan
What is your cluster configuration (size and resources)?

If you do not have enough resources, then your executors will not run.
Moreover, allocating 8 cores to an executor is too much.

If you have a cluster with four nodes running NodeManagers, each equipped
with 4 cores and 8 GB of memory, then an optimal configuration would be:

--num-executors 8 --executor-cores 2 --executor-memory 2G
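
Roughly, the arithmetic behind that sizing (a sketch, assuming YARN's
default per-executor overhead of max(384 MB, 10% of executor memory)):

  # 4 nodes x 4 cores = 16 cores; 2 cores per executor => 8 executors
  # (i.e. 2 executors per node)
  # per node: 2 x (2 GB heap + ~384 MB overhead) ~= 4.8 GB, leaving
  # ~3 GB of each 8 GB node for the OS and the Hadoop daemons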

Thanks,
Ashwin



-- 
Thanks & Regards,
Ashwin Giridharan


Re: How to control Spark Executors from getting Lost when using YARN client mode?

2015-07-30 Thread Ted Yu
See past thread on this topic:

http://search-hadoop.com/m/q3RTt0NZXV1cC6q02



How to control Spark Executors from getting Lost when using YARN client mode?

2015-07-30 Thread unk1102
Hi, I have a Spark job which runs fine locally with less data, but when I
schedule it to execute on YARN, I keep getting the following ERROR, and
slowly all executors get removed from the UI and my job fails:

15/07/30 10:18:13 ERROR cluster.YarnScheduler: Lost executor 8 on
myhost1.com: remote Rpc client disassociated
15/07/30 10:18:13 ERROR cluster.YarnScheduler: Lost executor 6 on
myhost2.com: remote Rpc client disassociated
I use the following command to submit the Spark job in yarn-client mode:

 ./spark-submit --class com.xyz.MySpark \
   --conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=512M" \
   --driver-java-options -XX:MaxPermSize=512m \
   --driver-memory 3g --master yarn-client \
   --executor-memory 2G --executor-cores 8 --num-executors 12 \
   /home/myuser/myspark-1.0.jar

I don't know what the problem is; please guide me. I am new to Spark.
Thanks in advance.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-control-Spark-Executors-from-getting-Lost-when-using-YARN-client-mode-tp24084.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
