[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated MAPREDUCE-7100:
--------------------------------
    Description: 
We are using Hadoop 2.7.3, and the compute layer runs outside the storage 
cluster (that is, node managers run on a different set of nodes from data 
nodes). The problem we see is that container allocation is quite slow for 
some jobs.
After some debugging, we found the following in 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq() 
(the code below is from trunk, not 2.7.3):
{code}
protected void addContainerReq(ContainerRequest req) {
  // Create resource requests
  for (String host : req.hosts) {
    // Data-local
    if (!isNodeBlacklisted(host)) {
      addResourceRequest(req.priority, host, req.capability, null);
    }
  }

  // Nothing Rack-local for now
  for (String rack : req.racks) {
    addResourceRequest(req.priority, rack, req.capability, null);
  }

  // Off-switch
  addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
      req.nodeLabelExpression);
}
{code}

It seems that the data-local and rack-local requests could be skipped when 
the compute layer is separate from the storage cluster.
If I understand correctly, req.hosts and req.racks are provided by the 
InputFormat. If the mapper reads from HDFS, req.hosts holds the corresponding 
data nodes and req.racks holds their racks. The AM debug log looks like this:
{code}
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName=<data-node> numContainers=1 #asks=1
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName=<its rack> numContainers=1 #asks=2
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName=* numContainers=1 #asks=3
{code}
The resource request with resourceName=<data-node> will never be satisfied by 
the RM (because the data node is not a node manager), so it would be better if 
the AM did not make data-local or rack-local requests when we already know 
that the compute layer runs outside the storage cluster.
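
To illustrate the idea, here is a minimal standalone sketch of the proposed 
behavior. It is not a patch against RMContainerRequestor: the class name, the 
method resourceNames(), and the two boolean options skipDataLocal and 
skipRackLocal are hypothetical, and the blacklist check is left out for 
brevity. The only detail taken from YARN is that ResourceRequest.ANY is "*", 
which matches the debug log above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LocalitySkipSketch {
  // Stand-in for ResourceRequest.ANY, which is "*" in YARN.
  static final String ANY = "*";

  /**
   * Returns the resource names that would be requested for one container.
   * skipDataLocal and skipRackLocal are hypothetical options, not existing
   * Hadoop configuration keys.
   */
  static List<String> resourceNames(List<String> hosts, List<String> racks,
      boolean skipDataLocal, boolean skipRackLocal) {
    List<String> names = new ArrayList<>();
    if (!skipDataLocal) {
      names.addAll(hosts);   // data-local requests
    }
    if (!skipRackLocal) {
      names.addAll(racks);   // rack-local requests
    }
    names.add(ANY);          // the off-switch request is always sent
    return names;
  }

  public static void main(String[] args) {
    List<String> hosts = Arrays.asList("datanode-1");
    List<String> racks = Arrays.asList("/rack-1");

    // Current behavior: three asks per container.
    System.out.println(resourceNames(hosts, racks, false, false));
    // prints [datanode-1, /rack-1, *]

    // Both options enabled: only the off-switch ask remains.
    System.out.println(resourceNames(hosts, racks, true, true));
    // prints [*]
  }
}
```

With both options enabled, the AM would send one ask per container instead of 
three, and none of them would name a host or rack that has no node manager.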



  was:
We are using Hadoop 2.7.3, and the compute layer runs outside the storage 
cluster (that is, node managers run on a different set of nodes from data 
nodes). The problem we see is that container allocation is quite slow for 
some jobs.
After some debugging, we found the following in 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq() 
(the code below is from trunk, not 2.7.3):
{code}
protected void addContainerReq(ContainerRequest req) {
    // Create resource requests
    for (String host : req.hosts) {
      // Data-local
      if (!isNodeBlacklisted(host)) {
        addResourceRequest(req.priority, host, req.capability,
            null);
      }
    }

    // Nothing Rack-local for now
    for (String rack : req.racks) {
      addResourceRequest(req.priority, rack, req.capability,
          null);
    }

    // Off-switch
    addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
        req.nodeLabelExpression);
  }
{code}

It seems that the data-local and rack-local requests could be skipped when 
the compute layer is separate from the storage cluster.
If I understand correctly, req.hosts and req.racks are provided by the 
InputFormat. If the mapper reads from HDFS, req.hosts holds the corresponding 
data nodes and req.racks holds their racks. The AM debug log looks like this:
{code}
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName=<data-node> numContainers=1 #asks=1
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName=<its rack> numContainers=1 #asks=2
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName=* numContainers=1 #asks=3
{code}
The resource request with resourceName=<data-node> will never be satisfied 
(because the data node is not a node manager), so it would be better if the 
data-local and rack-local requests could be skipped (by options) at an 
earlier stage, when we already know that the compute layer runs outside the 
storage cluster.




> Provide options to skip adding resource request for data-local and rack-local 
> respectively
> ------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7100
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7100
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster
>            Reporter: Xiang Li
>            Priority: Minor


