Varun Vasudev created YARN-2628:
-----------------------------------

             Summary: Capacity scheduler with DominantResourceCalculator carries out reservation even though slots are free
                 Key: YARN-2628
                 URL: https://issues.apache.org/jira/browse/YARN-2628
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacityscheduler
    Affects Versions: 2.5.1
            Reporter: Varun Vasudev
            Assignee: Varun Vasudev

We've noticed that when the CapacityScheduler runs with the DominantResourceCalculator, apps sometimes end up with containers in a reserved state even though free slots are available. The root cause seems to be this piece of code from CapacityScheduler.java -
{noformat}
    // Try to schedule more if there are no reservations to fulfill
    if (node.getReservedContainer() == null) {
      if (Resources.greaterThanOrEqual(calculator, getClusterResource(),
          node.getAvailableResource(), minimumAllocation)) {
        if (LOG.isDebugEnabled()) {
          LOG.debug("Trying to schedule on node: " + node.getNodeName() +
              ", available: " + node.getAvailableResource());
        }
        root.assignContainers(clusterResource, node, false);
      }
    } else {
      LOG.info("Skipping scheduling since node " + node.getNodeID() +
          " is reserved by application " +
          node.getReservedContainer().getContainerId().getApplicationAttemptId()
          );
    }
{noformat}

The code is meant to check whether a node has any slots available for containers. Because it uses greaterThanOrEqual, and the DominantResourceCalculator compares resources by their dominant share of the cluster rather than resource by resource, the check can return true even though the node does not have enough CPU or memory to actually run a container.
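
To make the mismatch concrete, here is a minimal standalone sketch. It does not use the Hadoop classes: Res, dominantShare, dominantGreaterThanOrEqual and fitsIn are hypothetical stand-ins, the cluster and node numbers are made up, and the comparison is simplified to the dominant share only. It shows a node with plenty of vcores but no memory passing a dominant-share check against the minimum allocation, while a per-resource check rejects it.

```java
// Illustrative sketch only: simplified stand-ins, not the Hadoop YARN classes.
final class DominantShareDemo {

  /** Hypothetical two-dimensional resource: memory in MB and virtual cores. */
  static final class Res {
    final long memoryMb;
    final int vcores;

    Res(long memoryMb, int vcores) {
      this.memoryMb = memoryMb;
      this.vcores = vcores;
    }
  }

  /** Largest fraction of the cluster that r takes up across both resource types. */
  static double dominantShare(Res cluster, Res r) {
    return Math.max((double) r.memoryMb / cluster.memoryMb,
                    (double) r.vcores / cluster.vcores);
  }

  /** Simplified dominant-share comparison: true if lhs's dominant share is at least rhs's. */
  static boolean dominantGreaterThanOrEqual(Res cluster, Res lhs, Res rhs) {
    return dominantShare(cluster, lhs) >= dominantShare(cluster, rhs);
  }

  /** Per-resource check: true only if the request fits in every dimension. */
  static boolean fitsIn(Res request, Res available) {
    return request.memoryMb <= available.memoryMb
        && request.vcores <= available.vcores;
  }

  public static void main(String[] args) {
    Res cluster = new Res(102400, 100);  // assumed cluster: 100 GB, 100 vcores
    Res available = new Res(0, 8);       // node has free vcores but no free memory
    Res minAlloc = new Res(1024, 1);     // assumed minimum allocation: 1 GB, 1 vcore

    // Dominant share of available is 0.08 (vcores); of minAlloc it is 0.01.
    System.out.println("dominant-share check passes: "
        + dominantGreaterThanOrEqual(cluster, available, minAlloc));
    // The minimum allocation needs 1024 MB, but the node has 0 MB free.
    System.out.println("minimum allocation fits:     "
        + fitsIn(minAlloc, available));
  }
}
```

With these assumed numbers the first check passes and the second fails, which matches the situation described above: the scheduler believes the node has room, but no container can actually be placed on it.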