Re: Queries on MRv2

Arun C Murthy Tue, 14 Jun 2011 12:30:52 -0700


On Jun 14, 2011, at 6:31 PM, Praveen Sripati wrote:

Hi,
I have gone through the MapReduce NextGen blog entries and JIRA and have the
following queries.

There is a single API between the Scheduler and the ApplicationMaster:

(List<Container> newContainers, List<ContainerStatus> containerStatuses)
    allocate(List<ResourceRequest> ask, List<Container> release)

The AM asks for specific resources via a list of ResourceRequests (ask)
and releases unnecessary Containers which were allocated by the Scheduler.
The response contains a list of newly allocated Containers and the
statuses of application-specific Containers that completed since the
previous interaction between the AM and the RM.

Q) If split-0 is available on host1, host2 and host3, can the
ApplicationMaster ask the scheduler for a container on host1, host2 or host3? This way the scheduler can allocate the resources more effectively.


Yes, absolutely.

Q) In a cluster there might be nodes of different capacities. How will the scheduler know that a particular node has 4 GB and another has 16 GB RAM
before allocating the resources to the ApplicationMaster?

The NodeManager informs the RM about its capabilities on registration. The RM allocates appropriate resources to the AM(s).
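
As a rough illustration of the registration step described above, here is a minimal Python sketch of a resource manager that records each node's capacity when the node registers and only places containers where enough memory is free. The class `ToyResourceManager`, its methods, and the first-fit placement are all hypothetical, chosen only for this example; they are not the actual RM code or its scheduling policy.

```python
class ToyResourceManager:
    """Hypothetical stand-in for the RM; tracks free memory per node."""

    def __init__(self):
        # node name -> free memory in MB, filled in at registration time
        self.free_mb = {}

    def register_node(self, node, memory_mb):
        # The node reports its capability once, when it registers.
        self.free_mb[node] = memory_mb

    def allocate(self, memory_mb):
        # First-fit placement (an assumption): pick the first registered
        # node that still has enough free memory for the request.
        for node, free in self.free_mb.items():
            if free >= memory_mb:
                self.free_mb[node] = free - memory_mb
                return node
        return None  # no node can currently satisfy the request


rm = ToyResourceManager()
rm.register_node("node-a", 4 * 1024)    # a 4 GB node
rm.register_node("node-b", 16 * 1024)   # a 16 GB node
print(rm.allocate(8 * 1024))            # only node-b is large enough
```

Because capacities are known up front, a request that is too large for the 4 GB node is never placed there.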

Q) Are the unnecessary containers (List<Container> release) in the request released by the ApplicationMaster the ones rejected by the ApplicationMaster
or those on which the map/reduce tasks have been completed?


Only unused ones.

Q) What does the following in the response contain - "List<ContainerStatus>
containerStatuses"?


Status for completed containers.
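
To make the shape of the allocate exchange concrete, here is a small Python sketch. It assumes a toy scheduler that grants every ask immediately. `ToyScheduler`, `container_exited`, and the tuple formats are hypothetical names for illustration, not the real interfaces. The call takes an ask list and a release list of unused containers, and returns newly allocated containers plus the statuses of containers that completed since the previous call.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScheduler:
    """Hypothetical scheduler that grants every ask immediately."""

    next_id: int = 1
    running: dict = field(default_factory=dict)    # container id -> host
    finished: list = field(default_factory=list)   # unreported (id, exit status)

    def allocate(self, ask, release):
        # 'release' lists containers the AM was given but does not need.
        for cid in release:
            self.running.pop(cid, None)
        # 'ask' is a list of (host, count) pairs in this sketch.
        new_containers = []
        for host, count in ask:
            for _ in range(count):
                cid = self.next_id
                self.next_id += 1
                self.running[cid] = host
                new_containers.append((cid, host))
        # Report each completion exactly once, then clear the backlog.
        statuses, self.finished = self.finished, []
        return new_containers, statuses

    def container_exited(self, cid, exit_status):
        # Called when a container's process exits on its node.
        if self.running.pop(cid, None) is not None:
            self.finished.append((cid, exit_status))


sched = ToyScheduler()
new, statuses = sched.allocate(ask=[("host1", 2)], release=[])
sched.container_exited(1, 0)                       # container 1 finishes
new, statuses = sched.allocate(ask=[], release=[2])  # container 2 was never used
print(statuses)
```

In this sketch the second call returns the status of container 1 only; container 2 was released unused, so it never shows up as completed.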

Q) Once the ApplicationMaster gets the list of the new containers from the Scheduler, what is the interaction between the ApplicationMaster and the
Node Manager? Will the ApplicationMaster ask the Node Manager on the
different nodes to launch/monitor the map/reduce tasks in those containers?

No, the AM directly monitors the containers via an application-specific protocol.


For MR applications we use TaskUmbilicalProtocol.

The NM just monitors the Unix process and informs the RM when the Unix process exits.

Q) Does the Scheduler ask the Node Manager to create the containers on the
different nodes?

No, the Scheduler allocates them to the respective AMs, which then launch the containers by talking to the NM.

The NM can securely verify the authenticity of the 'container launch' request, including the resources allocated to the container.
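
The message does not say how this verification works. As one possible illustration only, the Python sketch below assumes the RM and NM share a secret key and that the RM signs the container's id, host, and memory with an HMAC, which the NM recomputes when the AM presents the launch request. The functions `sign_container` and `verify_launch` and the message layout are hypothetical; the real mechanism may differ.

```python
import hashlib
import hmac


def sign_container(secret: bytes, container_id: str, host: str, memory_mb: int) -> str:
    # Assumed RM side: bind the allocation details to a keyed digest.
    message = f"{container_id}|{host}|{memory_mb}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_launch(secret: bytes, container_id: str, host: str,
                  memory_mb: int, token: str) -> bool:
    # Assumed NM side: recompute the digest from the launch request and
    # compare in constant time. A tampered field changes the digest.
    expected = sign_container(secret, container_id, host, memory_mb)
    return hmac.compare_digest(expected, token)


secret = b"shared-between-rm-and-nm"   # hypothetical shared key
token = sign_container(secret, "container_1", "host1", 1024)

print(verify_launch(secret, "container_1", "host1", 1024, token))  # genuine request
print(verify_launch(secret, "container_1", "host1", 8192, token))  # inflated memory
```

Under these assumptions, an AM that tries to launch with more memory than it was allocated fails the check, because the resources are part of the signed message.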

The resource requests are also aggregated by racks and then by the
special any (*) for all containers. All resource requests are subject to
change via the delta protocol.

Q) Does (*) mean that the ApplicationMaster is OK with a container in any
rack/host? This might be applicable for Reduce tasks.

Yes.
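
The aggregation by host, rack, and (*) can be sketched as follows in Python. This is one plausible reading of the scheme, not the actual AM code: each task counts once against every host holding its data, once against each distinct rack containing those hosts, and once against (*). The function `aggregate_requests`, the `rack_of` mapping, and the rack names are hypothetical.

```python
from collections import Counter


def aggregate_requests(task_hosts, rack_of):
    """Build a {location: container count} table from per-task host lists.

    task_hosts: one list of preferred hosts per task (empty = no preference).
    rack_of:    hypothetical host -> rack mapping.
    """
    table = Counter()
    for hosts in task_hosts:
        for host in hosts:
            table[host] += 1                      # host-level request
        for rack in {rack_of[h] for h in hosts}:
            table[rack] += 1                      # rack-level request, once per rack
        table["*"] += 1                           # every task is satisfiable anywhere
    return dict(table)


rack_of = {"host1": "/rack1", "host2": "/rack1", "host3": "/rack2"}
tasks = [
    ["host1", "host2", "host3"],  # a map task whose split has three replicas
    [],                           # a reduce task with no locality preference
]
print(aggregate_requests(tasks, rack_of))
```

Here the reduce task contributes only to the (*) entry, while the map task shows up at all three levels, so the (*) count is the total number of containers wanted.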

Hope this helps.

Arun
