Hi all! Thanks for answering!
@Sean, I tried to run with 30 executor cores, and 1 machine is still not processing. @Vanzin, I checked the RM's web UI, and all nodes were detected and "RUNNING". The interesting fact is that the available memory and available cores of 1 node were different from the other 2: just 1 available core and 1 available GB of RAM. @All, I created a new cluster with 10 slaves and 1 master, and now 9 of my slaves are working and 1 is still not processing. It's fine by me! I'm just wondering why YARN is doing it... Does anyone know the answer?

2014-11-18 16:18 GMT-02:00 Sean Owen <so...@cloudera.com>:

> My guess is you're asking for all cores of all machines, but the driver
> needs at least one core, so one executor is unable to find a machine to fit
> on.
>
> On Nov 18, 2014 7:04 PM, "Alan Prando" <a...@scanboo.com.br> wrote:
>
>> Hi Folks!
>>
>> I'm running Spark on a YARN cluster installed with Cloudera Manager Express.
>> The cluster has 1 master and 3 slaves, each machine with 32 cores and 64 GB
>> of RAM.
>>
>> My Spark job is working fine; however, it seems that only 2 of the 3 slaves
>> are working (htop shows 2 slaves running at 100% on all 32 cores, and 1 slave
>> without any processing).
>>
>> I'm using this command:
>> ./spark-submit --master yarn --num-executors 3 --executor-cores 32
>> --executor-memory 32g feature_extractor.py -r 390
>>
>> Additionally, Spark's log shows communication with only 2 slaves:
>> 14/11/18 17:19:38 INFO YarnClientSchedulerBackend: Registered executor:
>> Actor[akka.tcp://sparkExecutor@ip-172-31-13-180.ec2.internal:33177/user/Executor#-113177469]
>> with ID 1
>> 14/11/18 17:19:38 INFO RackResolver: Resolved
>> ip-172-31-13-180.ec2.internal to /default
>> 14/11/18 17:19:38 INFO YarnClientSchedulerBackend: Registered executor:
>> Actor[akka.tcp://sparkExecutor@ip-172-31-13-179.ec2.internal:51859/user/Executor#-323896724]
>> with ID 2
>> 14/11/18 17:19:38 INFO RackResolver: Resolved
>> ip-172-31-13-179.ec2.internal to /default
>> 14/11/18 17:19:38 INFO BlockManagerMasterActor: Registering block manager
>> ip-172-31-13-180.ec2.internal:50959 with 16.6 GB RAM
>> 14/11/18 17:19:39 INFO BlockManagerMasterActor: Registering block manager
>> ip-172-31-13-179.ec2.internal:53557 with 16.6 GB RAM
>> 14/11/18 17:19:51 INFO YarnClientSchedulerBackend: SchedulerBackend is
>> ready for scheduling beginning after waiting
>> maxRegisteredResourcesWaitingTime: 30000(ms)
>>
>> Is there a configuration to run a Spark job on a YARN cluster with all
>> slaves?
>>
>> Thanks in advance! =]
>>
>> ---
>> Regards,
>> Alan Vidotti Prando.
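
P.S. For reference, this is the kind of headroom Sean's answer suggests: request a bit less than the full machine per executor so the YARN ApplicationMaster and container overhead can still be placed. The exact numbers below are just an illustration for 32-core / 64 GB nodes, not settings I have verified on this cluster:

    # Leave a couple of cores and a few GB per node for the YARN AM,
    # the driver and per-container overhead (illustrative values only;
    # adjust to whatever your NodeManagers actually advertise):
    ./spark-submit --master yarn \
      --num-executors 3 \
      --executor-cores 30 \
      --executor-memory 28g \
      feature_extractor.py -r 390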