Sandy Ryza created MAPREDUCE-4922:
-------------------------------------

             Summary: Request with multiple data local nodes can cause NPE in AppSchedulingInfo
                 Key: MAPREDUCE-4922
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4922
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 2.0.2-alpha
            Reporter: Sandy Ryza
            Assignee: Sandy Ryza

With the way the schedulers work, each request for a container on a node must consist of three ResourceRequests: one for the node, one for the rack, and one for *. AppSchedulingInfo tracks the outstanding requests. When a node is assigned a node-local container, allocateNodeLocal decrements the outstanding requests at each level: node, rack, and *. If the rack requests reach 0, it removes the mapping.

A MapReduce task with multiple data-local nodes submits multiple container requests, one for each node. It also submits one for each unique rack, and one for *. If there are fewer unique racks than data-local nodes, fewer rack-local ResourceRequests are submitted than node-local ResourceRequests. The rack-local mapping is therefore deleted before all the node-local requests have been allocated, and an NPE is thrown the next time a node-local request from that rack is allocated.
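
The failure sequence described in this issue can be reproduced with a small, self-contained model. The sketch below is not the AppSchedulingInfo code: the class OutstandingRequestsSketch, its methods, and the node and rack names are hypothetical, and it assumes the outstanding counts live in a single map keyed by location name. It also ignores any other checks the real schedulers perform before allocating. It only shows how two node-local requests that share one rack-local request lead to a lookup on a mapping that has already been removed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the bookkeeping described in this issue.
// It is NOT the real AppSchedulingInfo implementation.
class OutstandingRequestsSketch {
    static final String ANY = "*";

    // Location name (node, rack, or "*") -> outstanding container count.
    private final Map<String, Integer> outstanding = new HashMap<>();

    void addRequest(String location, int containers) {
        outstanding.merge(location, containers, Integer::sum);
    }

    // Mirrors the described behaviour: decrement at node, rack, and "*",
    // removing the rack mapping once its count reaches 0.
    void allocateNodeLocal(String node, String rack) {
        decrement(node, false);
        decrement(rack, true);
        decrement(ANY, false);
    }

    private void decrement(String location, boolean removeAtZero) {
        // If the mapping is gone, get() returns null and unboxing it
        // throws a NullPointerException.
        int remaining = outstanding.get(location) - 1;
        if (remaining == 0 && removeAtZero) {
            outstanding.remove(location);
        } else {
            outstanding.put(location, remaining);
        }
    }

    // One task whose data is on node1 and node2, both assumed to be on rack1:
    // two node-local requests, but only one rack-local and one "*" request.
    static boolean secondAllocationThrows() {
        OutstandingRequestsSketch info = new OutstandingRequestsSketch();
        info.addRequest("node1", 1);
        info.addRequest("node2", 1);
        info.addRequest("rack1", 1);
        info.addRequest(ANY, 1);

        // First allocation succeeds and removes the rack1 mapping.
        info.allocateNodeLocal("node1", "rack1");
        try {
            // Second allocation looks up rack1, which no longer exists.
            info.allocateNodeLocal("node2", "rack1");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE on second node-local allocation: "
                + secondAllocationThrows());
    }
}
```

In this model the first call to allocateNodeLocal drops the rack1 entry, so the second call fails on the rack lookup. A guard against a missing rack entry, or keeping the entry at 0 instead of removing it, would avoid the exception in the sketch; whether either is the right fix for the schedulers is outside what this issue states.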