[
https://issues.apache.org/jira/browse/MAPREDUCE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16759333#comment-16759333
]
Zhaohui Xin edited comment on MAPREDUCE-7100 at 2/3/19 8:26 AM:
----------------------------------------------------------------
Hi, I attached a patch that adds a job option to skip the locality resource requests.
In my opinion, skipping the local requests will also avoid rack resolution issues.
was (Author: uranus):
Hi, I added a patch to cancel locality resource request as an option in job.
> Provide options to skip adding resource request for data-local and rack-local
> respectively
> ------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-7100
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7100
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: applicationmaster
> Reporter: Xiang Li
> Assignee: Zhaohui Xin
> Priority: Critical
> Attachments: MAPREDUCE-7100.001.patch
>
>
> We are using Hadoop 2.7.3, and the computing layer runs outside the storage
> cluster (that is, node managers run on a different set of nodes from the
> data nodes). The problem we face is that container allocation is quite slow
> for some jobs.
> After some debugging, we traced this to
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq()
> (the following code is from trunk, not 2.7.3):
> {code}
>   protected void addContainerReq(ContainerRequest req) {
>     // Create resource requests
>     for (String host : req.hosts) {
>       // Data-local
>       if (!isNodeBlacklisted(host)) {
>         addResourceRequest(req.priority, host, req.capability,
>             null);
>       }
>     }
>     // Nothing Rack-local for now
>     for (String rack : req.racks) {
>       addResourceRequest(req.priority, rack, req.capability,
>           null);
>     }
>     // Off-switch
>     addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
>         req.nodeLabelExpression);
>   }
> {code}
> It seems that the data-local and rack-local requests could be skipped when
> the computing layer is separate from the storage cluster.
> If I understand correctly, req.hosts and req.racks are provided by the
> InputFormat. If the mapper reads from HDFS, req.hosts is the corresponding
> data node and req.racks is its rack. The debug log of the AM looks like:
> {code}
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor:
> addResourceRequest: applicationId=1 priority=20 resourceName=<data-node>
> numContainers=1 #asks=1
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor:
> addResourceRequest: applicationId=1 priority=20 resourceName=<its rack>
> numContainers=1 #asks=2
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor:
> addResourceRequest: applicationId=1 priority=20 resourceName=*
> numContainers=1 #asks=3
> {code}
> The resource request with resourceName=<data-node> will never be satisfied
> by the RM (because the data node is not a node manager), so it would be
> better if the AM did not request data-local or rack-local containers in the
> first place when we already know that the computing layer runs outside the
> storage cluster.
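
The behavior proposed in this issue can be illustrated with a minimal,
standalone sketch. It is not the attached patch and does not use the Hadoop
classes: the class name, the method name and the two boolean flags are
hypothetical, and the blacklist check from addContainerReq() is left out for
brevity. It only shows which resource names would be requested when the
data-local and rack-local requests are skipped independently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the request-building logic; not Hadoop code.
public class LocalityRequestSketch {
    // Mirrors the resourceName=* (off-switch) entry seen in the AM debug log.
    static final String ANY = "*";

    // Returns the resource names one container request would ask for.
    // skipDataLocal and skipRackLocal are assumed job-level options.
    static List<String> resourceNames(String[] hosts, String[] racks,
                                      boolean skipDataLocal,
                                      boolean skipRackLocal) {
        List<String> names = new ArrayList<>();
        if (!skipDataLocal) {
            for (String host : hosts) {
                names.add(host); // data-local request
            }
        }
        if (!skipRackLocal) {
            for (String rack : racks) {
                names.add(rack); // rack-local request
            }
        }
        names.add(ANY); // the off-switch request is always added
        return names;
    }

    public static void main(String[] args) {
        String[] hosts = {"dn1", "dn2"};
        String[] racks = {"/rack1"};
        // Default behavior: hosts, racks, then off-switch.
        System.out.println(resourceNames(hosts, racks, false, false));
        // Compute separated from storage: only the off-switch request remains.
        System.out.println(resourceNames(hosts, racks, true, true));
    }
}
```

With both flags set, each container request produces one ask instead of three,
which is the reduction the description above is asking for.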