On Jan 7, 2012, at 6:47 PM, Praveen Sripati wrote:

> Thanks for the response.
> 
> I was just thinking why some of the design decisions were made with MRv2.
> 
> > No, the OR condition is implied by the hierarchy of requests (node, rack, 
> > *).
> 
> If InputSplit1 is on Node11 and Node12, and InputSplit2 is on Node21 and Node22, 
> then the AM can ask for 1 container on each of those nodes and * as 2 for map 
> tasks. The RM could then return 2 containers on Node11 and make * 0. Data 
> locality is then lost for InputSplit2, or else the AM has to make another call 
> to the RM, releasing one of the containers and asking for another container.

Remember, you also have rack information to guide the RM...
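
To make that concrete, here's a rough sketch (using the AMRMClient classes from 
the YARN client library; the node names are from your example and the rack names 
are hypothetical) of how an AM expresses that ask at node, rack and * levels 
rather than per split:

  import org.apache.hadoop.yarn.api.records.Priority;
  import org.apache.hadoop.yarn.api.records.Resource;
  import org.apache.hadoop.yarn.client.api.AMRMClient;
  import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

  public class LocalityRequestSketch {
    public static void main(String[] args) {
      AMRMClient<ContainerRequest> amrm = AMRMClient.createAMRMClient();
      // amrm.init(conf); amrm.start(); registration elided for brevity.

      Resource oneGb = Resource.newInstance(1024, 1);
      Priority maps = Priority.newInstance(20);

      // InputSplit1: prefer Node11/Node12, fall back to their rack, then '*'.
      amrm.addContainerRequest(new ContainerRequest(
          oneGb, new String[] {"Node11", "Node12"}, new String[] {"/rack1"}, maps));

      // InputSplit2: prefer Node21/Node22, fall back to /rack2, then '*'.
      amrm.addContainerRequest(new ContainerRequest(
          oneGb, new String[] {"Node21", "Node22"}, new String[] {"/rack2"}, maps));

      // The client aggregates these into node/rack/* counts; the RM never
      // sees which split wanted which node.
    }
  }

If the RM does hand back two containers on Node11, the matching logic stays in 
the AM: it can run the second map rack-local, or release and re-ask as you said.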

> A bit more complex request specifying the dependencies might be more 
> effective.

At a very high cost - it's very expensive for the RM to track splits for each 
task across nodes & racks. To the extent possible, our goal has been to push 
work to the AM and keep the RM (and NM) really simple so that they scale & 
perform well.

> 
> > NM doesn't make any 'out' calls to anyone but the RM, else it would be a 
> > severe scalability bottleneck.
> 
> There is already one-way communication between the AM and the NM for launching 
> the containers. The response from the NM could hold the list of completed 
> containers from the previous call.
> 

Again, we want to keep the framework (RM/NM) really simple. So, the task can 
communicate its status to the AM itself.

> > All interactions (RPCs) are authenticated. Also, there is a container token 
> > provided by the RM (during allocation) which is verified by the NM during 
> > container launch.
> 
> So, a shared key has to be deployed manually on all the nodes for the NM?

No, it's automatically shared on startup between the daemons.
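
From the AM's side it looks roughly like the sketch below (NMClient names are 
from the YARN client library, shown only for illustration): the container token 
rides along inside the Container record the RM allocated, and the NM validates 
it against its RM-shared secret when the AM asks for a launch.

  import java.util.Collections;
  import org.apache.hadoop.yarn.api.records.Container;
  import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
  import org.apache.hadoop.yarn.client.api.NMClient;

  public class LaunchSketch {
    // 'container' is assumed to come from AllocateResponse#getAllocatedContainers().
    static void launch(NMClient nmClient, Container container) throws Exception {
      ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
          null /* localResources */, null /* environment */,
          Collections.singletonList("sleep 10" /* hypothetical command */),
          null /* serviceData */, null /* tokens */, null /* ACLs */);
      // The container token is already embedded in 'container'; the NM checks
      // it before launching, so the AM never handles the shared key itself.
      nmClient.startContainer(container, ctx);
    }
  }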

Arun
