Thanks, that is good to know.  Is there any way to say "please fail if I don't 
get the node I want"?  Do I just release the container and try again?
I'd like to understand the implications of this policy.  Suppose I have 1000 
data splits and cluster capacity of 100 containers.  If I try to schedule 200 
tasks, requesting a local data node for each one, how do I ensure the highest 
chance that the tasks run against local data?  Do I just ask for all 200 at 
once?  Should I ask for 100 at a time and then re-target the remainder as 
containers come open?
Or am I thinking about this all wrong... perhaps I should ask for containers, 
see what nodes they are on, and then assign the data splits to them once I see 
the set of available containers?
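
To make that last idea concrete, here is a rough sketch of the late-binding 
approach: wait for containers, then match each one to a split that has a 
replica on its node.  It is plain Python with no YARN calls; assign_splits, 
the split ids, and the node names are all made up for illustration, and the 
non-local fallback (take the next unassigned split) is just one possible 
choice.

```python
from collections import defaultdict, deque


def assign_splits(splits, container_nodes):
    """Late binding: match splits to containers after the containers arrive.

    splits: dict mapping split id -> list of node names holding a replica.
    container_nodes: list of node names, one per allocated container.
    Returns a list of (node, split_id, is_local) tuples.
    """
    # Index the splits by every node that holds a replica of them.
    by_node = defaultdict(deque)
    for split_id, hosts in splits.items():
        for host in hosts:
            by_node[host].append(split_id)

    # dict.fromkeys gives an insertion-ordered set of unassigned split ids.
    pending = dict.fromkeys(splits)
    assignments = []
    for node in container_nodes:
        if not pending:
            break  # more containers than splits; the extras go unused
        queue = by_node[node]
        # Drop splits already taken by another container via another replica.
        while queue and queue[0] not in pending:
            queue.popleft()
        if queue:
            split_id, is_local = queue.popleft(), True
        else:
            # No local split left for this node: fall back to any pending one.
            split_id, is_local = next(iter(pending)), False
        del pending[split_id]
        assignments.append((node, split_id, is_local))
    return assignments


if __name__ == "__main__":
    demo_splits = {"s1": ["n1", "n2"], "s2": ["n1"], "s3": ["n3"]}
    for row in assign_splits(demo_splits, ["n1", "n1", "n4"]):
        print(row)
```

Splits left unassigned after one round would be matched the same way as 
further containers come open.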
john
From: Arun C Murthy [mailto:[email protected]]
Sent: Thursday, June 13, 2013 12:27 AM
To: [email protected]
Subject: Re: container allocation

By default, the ResourceManager will try to give you a container on that node, 
then on its rack, then anywhere (in that order).
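
As a rough illustration of that preference order only (this is not the 
ResourceManager's code, and pick_node, rack_of, and the node names are 
hypothetical), the fallback can be sketched as:

```python
def pick_node(requested, rack_of, free_nodes):
    """Pick a node using a node -> rack -> anywhere preference order.

    requested: name of the node the application asked for.
    rack_of: dict mapping node name -> rack name.
    free_nodes: set of node names that currently have capacity.
    Returns (node, locality), or (None, None) if nothing is free.
    """
    # 1. The exact node that was asked for.
    if requested in free_nodes:
        return requested, "node-local"
    # 2. Any free node on the same rack as the requested node.
    rack = rack_of.get(requested)
    if rack is not None:
        for node in sorted(free_nodes):
            if rack_of.get(node) == rack:
                return node, "rack-local"
    # 3. Any free node at all.
    for node in sorted(free_nodes):
        return node, "off-rack"
    return None, None


if __name__ == "__main__":
    racks = {"n1": "r1", "n2": "r1", "n3": "r2"}
    print(pick_node("n1", racks, {"n2", "n3"}))
```

In this sketch the node name acts as a hint: a busy node produces a container 
elsewhere rather than a failure.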

We recently added the ability to whitelist or blacklist nodes to allow for 
more control.

Arun

On Jun 12, 2013, at 8:03 AM, John Lilley wrote:


If I request a container on a node, and that node is busy, will the request 
fail, or will it give me a container on a different node?  In other words, is 
the node name a requirement or a hint?
Thanks
John


--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/
