Sandy Ryza created MAPREDUCE-4922:
-------------------------------------

             Summary: Request with multiple data local nodes can cause NPE in 
AppSchedulingInfo
                 Key: MAPREDUCE-4922
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4922
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 2.0.2-alpha
            Reporter: Sandy Ryza
            Assignee: Sandy Ryza


With the way the schedulers work, each request for a container on a specific 
node must consist of three ResourceRequests: one for the node, one for the 
rack, and one for * (any node).
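
As a rough illustration of that expansion, here is a minimal sketch. The 
class RequestTriple and the helper forNode are hypothetical names, and plain 
map entries stand in for the real ResourceRequest objects; this is not the 
scheduler's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RequestTriple {
    // Wildcard resource name meaning "any node".
    static final String ANY = "*";

    // Hypothetical helper: builds the three resource-name -> container-count
    // entries that one request for a container on a given node expands into.
    static Map<String, Integer> forNode(String node, String rack) {
        Map<String, Integer> requests = new LinkedHashMap<>();
        requests.put(node, 1); // node-level request
        requests.put(rack, 1); // rack-level request
        requests.put(ANY, 1);  // "*" request
        return requests;
    }

    public static void main(String[] args) {
        // prints {node1=1, /rack1=1, *=1}
        System.out.println(forNode("node1", "/rack1"));
    }
}
```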

AppSchedulingInfo tracks the outstanding requests.  When a node is assigned a 
node-local container, allocateNodeLocal decrements the outstanding requests at 
each level - node, rack, and *.  If the outstanding rack requests reach 0, it 
removes the mapping for that rack.

A MapReduce task with multiple data-local nodes submits multiple container 
requests, one for each node.  It also submits one for each unique rack, and one 
for *.  If there are fewer unique racks than data-local nodes, fewer rack-local 
ResourceRequests are submitted than node-local ResourceRequests.  The 
rack-local mapping is then deleted before all the node-local requests are 
allocated, and an NPE occurs the next time a node-local request from that rack 
is allocated.
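
The failure can be modelled with a small simulation.  This is a simplified 
sketch under stated assumptions, not the real AppSchedulingInfo: the class 
name, the flat map of counts, and the node and rack names are all made up for 
illustration, and only the decrement-and-remove behaviour described above is 
reproduced.

```java
import java.util.HashMap;
import java.util.Map;

class RackMappingNpeSketch {
    static final String ANY = "*";

    // Outstanding container counts keyed by resource name (node, rack, or "*").
    final Map<String, Integer> outstanding = new HashMap<>();

    // Simplified stand-in for allocateNodeLocal: decrement node, rack, and "*",
    // and drop the rack entry once its count reaches zero.
    void allocateNodeLocal(String node, String rack) {
        outstanding.put(node, outstanding.get(node) - 1);
        // If the rack entry was already removed, get() returns null and the
        // unboxing on this line throws NullPointerException.
        int rackLeft = outstanding.get(rack) - 1;
        if (rackLeft == 0) {
            outstanding.remove(rack);
        } else {
            outstanding.put(rack, rackLeft);
        }
        outstanding.put(ANY, outstanding.get(ANY) - 1);
    }

    // One task whose data lives on two nodes of the same rack: two node-level
    // requests, but only one rack-level request and one "*" request.
    static boolean secondAllocationThrows() {
        RackMappingNpeSketch info = new RackMappingNpeSketch();
        info.outstanding.put("node1", 1);
        info.outstanding.put("node2", 1);
        info.outstanding.put("/rack1", 1);
        info.outstanding.put(ANY, 1);

        info.allocateNodeLocal("node1", "/rack1"); // rack count hits 0, entry removed
        try {
            info.allocateNodeLocal("node2", "/rack1"); // rack entry is gone
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondAllocationThrows()); // prints true
    }
}
```

In this model the second node-local allocation fails because the rack count 
was sized for one request while two node-level requests pointed at that rack.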

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
