Re: How to avoid using some nodes while running a spark program on yarn

2015-03-14 Thread Ted Yu
Out of curiosity, I searched for 'capacity scheduler deadlock', which yielded the following: [YARN-3265] CapacityScheduler deadlock when computing absolute max avail capacity (fix for trunk/branch-2); [YARN-3251] Fix CapacityScheduler deadlock when computing absolute max avail capacity (short term fix fo…

Re: How to avoid using some nodes while running a spark program on yarn

2015-03-14 Thread Simon Elliston Ball
You won’t be able to use YARN labels on 2.2.0. However, you only need the labels if you want to map containers onto specific hardware. In your scenario, the capacity scheduler in YARN might be the best bet. You can set up separate queues for the streaming and other jobs to protect a percentage of c…
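A minimal sketch of that capacity-scheduler setup; the queue names and percentages below are illustrative, not from this thread, and sibling queue capacities must sum to 100:

    <!-- capacity-scheduler.xml: reserve capacity for the streaming job, leave the rest for other work -->
    <property>
      <name>yarn.scheduler.capacity.root.queues</name>
      <value>streaming,batch</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.streaming.capacity</name>
      <value>40</value>  <!-- guaranteed share for the streaming queue -->
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.batch.capacity</name>
      <value>60</value>  <!-- share left for other Spark applications -->
    </property>

New jobs would then be submitted to their own queue, e.g. spark-submit --master yarn --queue batch ..., so the streaming queue's share stays protected.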

Re: How to avoid using some nodes while running a spark program on yarn

2015-03-14 Thread James
My Hadoop version is 2.2.0, and my Spark version is 1.2.0. 2015-03-14 17:22 GMT+08:00 Ted Yu : > Which release of Hadoop are you using? > > Can you utilize the node labels feature? > See YARN-2492 and YARN-796 > > Cheers > > On Sat, Mar 14, 2015 at 1:49 AM, James wrote: > >> Hello, >> >> I've got a…

Re: How to avoid using some nodes while running a spark program on yarn

2015-03-14 Thread Ted Yu
Which release of Hadoop are you using? Can you utilize the node labels feature? See YARN-2492 and YARN-796. Cheers On Sat, Mar 14, 2015 at 1:49 AM, James wrote: > Hello, > > I've got a cluster with Spark on YARN. Currently some of its nodes are > running a Spark Streaming program, thus their loca…
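For clusters that do carry node labels (a Hadoop release with YARN-2492/YARN-796), a rough sketch of the idea; the label and node name are hypothetical, and the Spark node-label properties only appeared in later Spark releases:

    # yarn-site.xml (assumed settings):
    #   yarn.node-labels.enabled=true
    #   yarn.node-labels.fs-store.root-dir=hdfs:///yarn/node-labels

    # Define a label and attach it to the machines the streaming job occupies
    yarn rmadmin -addToClusterNodeLabels "streaming"
    yarn rmadmin -replaceLabelsOnNode "streamnode1=streaming"

    # A job submitted with no label expression is placed only on unlabeled nodes,
    # which keeps it off the streaming machines. Later Spark versions also expose
    # spark.yarn.am.nodeLabelExpression / spark.yarn.executor.nodeLabelExpression
    # for the opposite case (pinning executors onto labeled nodes).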

How to avoid using some nodes while running a spark program on yarn

2015-03-14 Thread James
Hello, I've got a cluster with Spark on YARN. Currently some of its nodes are running a Spark Streaming program, so their local space is not enough to support other applications. I therefore wonder whether it is possible to use a blacklist to avoid these nodes when running a new Spark program? Alcaid
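Nothing in Spark 1.2.0 / Hadoop 2.2.0 offers such a blacklist directly, which is why the replies above suggest queues or node labels. Much newer Spark releases did add a YARN exclude list; a sketch, assuming the spark.yarn.exclude.nodes property from those later releases and hypothetical host and application names:

    # Only in recent Spark versions, not 1.2.0; excludes the listed NodeManagers
    # from resource allocation for this application.
    spark-submit --master yarn \
      --conf spark.yarn.exclude.nodes="streamnode1.example.com,streamnode2.example.com" \
      --class com.example.MyApp myapp.jar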