[jira] [Commented] (MAPREDUCE-7100) Provide options to skip adding resource request for data-local and rack-local respectively

2018-05-31 Thread Xiang Li (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497485#comment-16497485
 ] 

Xiang Li commented on MAPREDUCE-7100:
-

[~owen.omalley] [~leftnoteasy]  Could you please review this at your convenience 
and provide some guidance? Are there any existing solutions?

> Provide options to skip adding resource request for data-local and rack-local 
> respectively
> --
>
> Key: MAPREDUCE-7100
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7100
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Reporter: Xiang Li
>Priority: Minor
>
> We are using Hadoop 2.7.3 and the computing layer runs outside of the 
> storage cluster (that is, node managers are running on a different set of 
> nodes from data nodes). The problem we are seeing is that container allocation 
> is quite slow for some jobs.
> After some debugging, we found that in 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq() 
> (the following code is from trunk, not 2.7.3)
> {code}
> protected void addContainerReq(ContainerRequest req) {
> // Create resource requests
> for (String host : req.hosts) {
>   // Data-local
>   if (!isNodeBlacklisted(host)) {
> addResourceRequest(req.priority, host, req.capability,
> null);
>   }
> }
> // Nothing Rack-local for now
> for (String rack : req.racks) {
>   addResourceRequest(req.priority, rack, req.capability,
>   null);
> }
> // Off-switch
> addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
> req.nodeLabelExpression);
>   }
> {code}
> It seems that the data-local and rack-local requests could be skipped when 
> the computing layer is separate from the storage cluster.
> If I get it correctly, req.hosts and req.racks are provided by InputFormat. 
> If the mapper is to read HDFS, req.hosts is the corresponding data node and 
> req.racks is its rack. The debug log of AM is like:
> {code}
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 
> addResourceRequest: applicationId=1 priority=20 resourceName= 
> numContainers=1 #asks=1
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 
> addResourceRequest: applicationId=1 priority=20 resourceName= 
> numContainers=1 #asks=2
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 
> addResourceRequest: applicationId=1 priority=20 resourceName=* 
> numContainers=1 #asks=3
> {code}
> Although the resource request with resourceName= will eventually not be 
> satisfied by the RM (because the data node is not a node manager), it would 
> be better if the AM did not request data-local or rack-local placement at the 
> very beginning, when we already know that the computing layer runs outside of 
> the storage cluster.
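The option-controlled skip that the report asks for could look roughly like the following self-contained sketch. It models only the decision of which resource names would be requested; the flag names (skipDataLocal, skipRackLocal) are hypothetical illustrations, not real Hadoop configuration keys, and the real change would live inside RMContainerRequestor#addContainerReq():

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed option: decide which resource names
// addContainerReq() would ask for if data-local and rack-local requests
// could be switched off. Flag names are illustrative, not real Hadoop keys.
public class LocalityRequestSketch {
    public static List<String> resourceNames(List<String> hosts, List<String> racks,
                                             boolean skipDataLocal, boolean skipRackLocal) {
        List<String> asks = new ArrayList<>();
        if (!skipDataLocal) {
            asks.addAll(hosts);   // data-local asks, one per input-split host
        }
        if (!skipRackLocal) {
            asks.addAll(racks);   // rack-local asks
        }
        asks.add("*");            // off-switch ask (ResourceRequest.ANY)
        return asks;
    }

    public static void main(String[] args) {
        // Compute and storage separated: only the off-switch ask remains.
        System.out.println(resourceNames(List.of("dn1"), List.of("/rack1"), true, true));
    }
}
```

With both flags on, the AM would send a single ANY ask instead of three, which is where the reporter expects the allocation speed-up to come from.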



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7099) Daily test result fails in MapReduce JobClient though there isn't any error

2018-05-31 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497202#comment-16497202
 ] 

Miklos Szegedi commented on MAPREDUCE-7099:
---

{code:java}
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.013 s 
- in org.apache.hadoop.mapred.TestIFile

[INFO] Running org.apache.hadoop.mapred.pipes.TestPipeApplication

[INFO] Running org.apache.hadoop.mapred.pipes.TestPipesNonJavaInputFormat{code}
It seems to be the test above. It reproduces locally.

> Daily test result fails in MapReduce JobClient though there isn't any error
> ---
>
> Key: MAPREDUCE-7099
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7099
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build, test
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Critical
>
> It looks like the MapReduce JobClient test run has been failing consistently 
> lately. Please see the results of hadoop-qbt-trunk-java8-linux-x86:
>  
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
> {noformat}
> [INFO] Results:
> [INFO] 
> [WARNING] Tests run: 565, Failures: 0, Errors: 0, Skipped: 10
> [INFO] 
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 02:06 h
> [INFO] Finished at: 2018-05-30T12:32:39+00:00
> [INFO] Final Memory: 25M/645M
> [INFO] 
> 
> [WARNING] The requested profile "parallel-tests" could not be activated 
> because it does not exist.
> [WARNING] The requested profile "shelltest" could not be activated because it 
> does not exist.
> [WARNING] The requested profile "native" could not be activated because it 
> does not exist.
> [WARNING] The requested profile "yarn-ui" could not be activated because it 
> does not exist.
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
> project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
> in the fork -> [Help 1]
> {noformat}






[jira] [Commented] (MAPREDUCE-7101) Revisit behavior of JHS scan file behavior

2018-05-31 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496959#comment-16496959
 ] 

Wangda Tan commented on MAPREDUCE-7101:
---

Regarding the semantics of a directory's modification time, adding [~steve_l] / 
[~arpitagarwal] for suggestions.

Regarding this scan behavior, I propose removing the if check entirely: 
{code}
  if (modTime != newModTime
  || (scanTime/1000) == (modTime/1000)
  || (scanTime/1000 + 1) == (modTime/1000)) {
  // ...
  }
{code} 

I am not sure how badly this could impact performance. I would like to hear 
thoughts from [~jlowe] / [~vinodkv]
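To make the trade-off concrete, here is a self-contained toy model (simplified stand-ins of my own, not the real JHS classes) contrasting the current modtime-gated scan with the proposed unconditional scan:

```java
// Illustrative only: contrasts the current modtime-gated scan with the
// proposed unconditional scan. Names are simplified stand-ins for the
// JHS code, not the actual classes or methods.
public class ScanPolicySketch {
    public long modTime = -1;
    public int scans = 0;

    // Current behavior: scan only when the directory modtime changed.
    public void scanIfNeeded(long newModTime) {
        if (modTime != newModTime) {
            modTime = newModTime;
            scans++; // stands in for scanIntermediateDirectory(p)
        }
    }

    // Proposed behavior: drop the check and scan every time, trading extra
    // listing work for correctness on filesystems (e.g. S3) whose directory
    // modtime never changes.
    public void scanAlways(long newModTime) {
        modTime = newModTime;
        scans++;
    }
}
```

On a store whose directory modtime is constant, the gated version scans once and then goes silent even as new files arrive, while the unconditional version keeps scanning; the open question above is what that extra scanning costs.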

> Revisit behavior of JHS scan file behavior
> --
>
> Key: MAPREDUCE-7101
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7101
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Wangda Tan
>Priority: Critical
>
> Currently, the JHS scans a directory if the modification time of the *directory* changed: 
> {code} 
> public synchronized void scanIfNeeded(FileStatus fs) {
>   long newModTime = fs.getModificationTime();
>   if (modTime != newModTime) {
> <... some logic omitted ...>
> // reset scanTime before scanning happens
> scanTime = System.currentTimeMillis();
> Path p = fs.getPath();
> try {
>   scanIntermediateDirectory(p);
> {code}
> This logic relies on the assumption that the directory's modification time 
> will be updated when a file is placed under the directory.
> However, the semantics of a directory's modification time are not consistent 
> across FS implementations. For example, MAPREDUCE-6680 fixed some issues 
> with truncated modification times, and HADOOP-12837 mentioned that on S3 a 
> directory's modification time is always 0.
> I think we need to revisit this logic to make it work more robustly across 
> different file systems.






[jira] [Created] (MAPREDUCE-7101) Revisit behavior of JHS scan file behavior

2018-05-31 Thread Wangda Tan (JIRA)
Wangda Tan created MAPREDUCE-7101:
-

 Summary: Revisit behavior of JHS scan file behavior
 Key: MAPREDUCE-7101
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7101
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Wangda Tan


Currently, the JHS scans a directory if the modification time of the *directory* changed: 

{code} 
public synchronized void scanIfNeeded(FileStatus fs) {
  long newModTime = fs.getModificationTime();
  if (modTime != newModTime) {
<... some logic omitted ...>
// reset scanTime before scanning happens
scanTime = System.currentTimeMillis();
Path p = fs.getPath();
try {
  scanIntermediateDirectory(p);
{code}

This logic relies on the assumption that the directory's modification time will 
be updated when a file is placed under the directory.

However, the semantics of a directory's modification time are not consistent 
across FS implementations. For example, MAPREDUCE-6680 fixed some issues with 
truncated modification times, and HADOOP-12837 mentioned that on S3 a 
directory's modification time is always 0.

I think we need to revisit this logic to make it work more robustly across 
different file systems.
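One possible direction (an assumption on my part, not something proposed in the issue) is to stop trusting the directory modtime entirely and instead diff a per-file snapshot of the listing; a minimal sketch:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical alternative: detect new or changed history files by comparing
// per-file (name, modtime) snapshots rather than trusting the directory's
// modification time, which is always 0 on S3 (HADOOP-12837).
public class ListingDiffSketch {
    private Map<String, Long> lastSeen = new HashMap<>();

    // Returns true if the current listing differs from the previous snapshot,
    // i.e. a scan is needed; also records the new snapshot.
    public boolean needsScan(Map<String, Long> currentListing) {
        boolean changed = !currentListing.equals(lastSeen);
        lastSeen = new HashMap<>(currentListing);
        return changed;
    }
}
```

This trades one extra directory listing per check for correctness on stores where the directory modtime is meaningless, at the cost of keeping the last listing in memory.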






[jira] [Updated] (MAPREDUCE-7100) Provide options to skip adding resource request for data-local and rack-local respectively

2018-05-31 Thread Xiang Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated MAPREDUCE-7100:

Description: 
We are using Hadoop 2.7.3 and the computing layer runs outside of the storage 
cluster (that is, node managers are running on a different set of nodes from 
data nodes). The problem we are seeing is that container allocation is quite 
slow for some jobs.
After some debugging, we found that in 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq() 
(the following code is from trunk, not 2.7.3)
{code}
protected void addContainerReq(ContainerRequest req) {
// Create resource requests
for (String host : req.hosts) {
  // Data-local
  if (!isNodeBlacklisted(host)) {
addResourceRequest(req.priority, host, req.capability,
null);
  }
}

// Nothing Rack-local for now
for (String rack : req.racks) {
  addResourceRequest(req.priority, rack, req.capability,
  null);
}

// Off-switch
addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
req.nodeLabelExpression);
  }
{code}

It seems that the data-local and rack-local requests could be skipped when the 
computing layer is separate from the storage cluster.
If I get it correctly, req.hosts and req.racks are provided by InputFormat. If 
the mapper is to read HDFS, req.hosts is the corresponding data node and 
req.racks is its rack. The debug log of AM is like:
{code}
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=1
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=2
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName=* numContainers=1 #asks=3
{code}
Although the resource request with resourceName= will eventually not be 
satisfied by the RM (because the data node is not a node manager), it would be 
better if the AM did not request data-local or rack-local placement at the very 
beginning, when we already know that the computing layer runs outside of the 
storage cluster.



  was:
We are using Hadoop 2.7.3 and the computing layer runs outside of the storage 
cluster (that is, node managers are running on a different set of nodes from 
data nodes). The problem we are seeing is that container allocation is quite 
slow for some jobs.
After some debugging, we found that in 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq() 
(the following code is from trunk, not 2.7.3)
{code}
protected void addContainerReq(ContainerRequest req) {
// Create resource requests
for (String host : req.hosts) {
  // Data-local
  if (!isNodeBlacklisted(host)) {
addResourceRequest(req.priority, host, req.capability,
null);
  }
}

// Nothing Rack-local for now
for (String rack : req.racks) {
  addResourceRequest(req.priority, rack, req.capability,
  null);
}

// Off-switch
addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
req.nodeLabelExpression);
  }
{code}

It seems that the data-local and rack-local requests could be skipped when the 
computing layer is separate from the storage cluster.
If I get it correctly, req.hosts and req.racks are provided by InputFormat. If 
the mapper is to read HDFS, req.hosts is the corresponding data node and 
req.racks is its rack. The debug log of AM is like:
{code}
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=1
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=2
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName=* numContainers=1 #asks=3
{code}
Although the resource request with resourceName= will eventually not be 
satisfied by the RM (because the data node is not a node manager), it would be 
better if the AM did not request data-local or rack-local placement, when we 
already know that the computing layer runs outside of the storage cluster.




> Provide options to skip adding resource request for data-local and rack-local 
> respectively
> --
>
> Key: MAPREDUCE-7100
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7100
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Reporter: Xiang Li
>Priority: Minor
>
> We are using hadoop 2.7.3 and the computing layer is running out of the 
> storage cluster (that is, node managers ar

[jira] [Updated] (MAPREDUCE-7100) Provide options to skip adding resource request for data-local and rack-local respectively

2018-05-31 Thread Xiang Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated MAPREDUCE-7100:

Description: 
We are using Hadoop 2.7.3 and the computing layer runs outside of the storage 
cluster (that is, node managers are running on a different set of nodes from 
data nodes). The problem we are seeing is that container allocation is quite 
slow for some jobs.
After some debugging, we found that in 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq() 
(the following code is from trunk, not 2.7.3)
{code}
protected void addContainerReq(ContainerRequest req) {
// Create resource requests
for (String host : req.hosts) {
  // Data-local
  if (!isNodeBlacklisted(host)) {
addResourceRequest(req.priority, host, req.capability,
null);
  }
}

// Nothing Rack-local for now
for (String rack : req.racks) {
  addResourceRequest(req.priority, rack, req.capability,
  null);
}

// Off-switch
addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
req.nodeLabelExpression);
  }
{code}

It seems that the data-local and rack-local requests could be skipped when the 
computing layer is separate from the storage cluster.
If I get it correctly, req.hosts and req.racks are provided by InputFormat. If 
the mapper is to read HDFS, req.hosts is the corresponding data node and 
req.racks is its rack. The debug log of AM is like:
{code}
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=1
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=2
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName=* numContainers=1 #asks=3
{code}
Although the resource request with resourceName= will eventually not be 
satisfied by the RM (because the data node is not a node manager), it would be 
better if the AM did not request data-local or rack-local placement, when we 
already know that the computing layer runs outside of the storage cluster.



  was:
We are using Hadoop 2.7.3 and the computing layer runs outside of the storage 
cluster (that is, node managers are running on a different set of nodes from 
data nodes). The problem we are seeing is that container allocation is quite 
slow for some jobs.
After some debugging, we found that in 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq() 
(the following code is from trunk, not 2.7.3)
{code}
protected void addContainerReq(ContainerRequest req) {
// Create resource requests
for (String host : req.hosts) {
  // Data-local
  if (!isNodeBlacklisted(host)) {
addResourceRequest(req.priority, host, req.capability,
null);
  }
}

// Nothing Rack-local for now
for (String rack : req.racks) {
  addResourceRequest(req.priority, rack, req.capability,
  null);
}

// Off-switch
addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
req.nodeLabelExpression);
  }
{code}

It seems that the data-local and rack-local requests could be skipped when the 
computing layer is separate from the storage cluster.
If I get it correctly, req.hosts and req.racks are provided by InputFormat. If 
the mapper is to read HDFS, req.hosts is the corresponding data node and 
req.racks is its rack. The debug log of AM is like:
{code}
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=1
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=2
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName=* numContainers=1 #asks=3
{code}
Although the resource request with resourceName= will eventually not be 
satisfied (because the data node is not a node manager), it would be better if 
the data-local and rack-local requests could be skipped (by options) at an 
earlier stage, when we already know that the computing layer runs outside of 
the storage cluster.




> Provide options to skip adding resource request for data-local and rack-local 
> respectively
> --
>
> Key: MAPREDUCE-7100
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7100
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Reporter: Xiang Li
>Priority: Minor
>
> We are using hadoop 2.7.3 and the computing layer is running out of the 
> storage cluster (that is

[jira] [Updated] (MAPREDUCE-7100) Provide options to skip adding resource request for data-local and rack-local respectively

2018-05-31 Thread Xiang Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated MAPREDUCE-7100:

Summary: Provide options to skip adding resource request for data-local and 
rack-local respectively  (was: Provide options to skip adding container request 
for data-local and rack-local respectively)

> Provide options to skip adding resource request for data-local and rack-local 
> respectively
> --
>
> Key: MAPREDUCE-7100
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7100
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Reporter: Xiang Li
>Priority: Minor
>
> We are using Hadoop 2.7.3 and the computing layer runs outside of the 
> storage cluster (that is, node managers are running on a different set of 
> nodes from data nodes). The problem we are seeing is that container allocation 
> is quite slow for some jobs.
> After some debugging, we found that in 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq() 
> (the following code is from trunk, not 2.7.3)
> {code}
> protected void addContainerReq(ContainerRequest req) {
> // Create resource requests
> for (String host : req.hosts) {
>   // Data-local
>   if (!isNodeBlacklisted(host)) {
> addResourceRequest(req.priority, host, req.capability,
> null);
>   }
> }
> // Nothing Rack-local for now
> for (String rack : req.racks) {
>   addResourceRequest(req.priority, rack, req.capability,
>   null);
> }
> // Off-switch
> addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
> req.nodeLabelExpression);
>   }
> {code}
> It seems that the data-local and rack-local requests could be skipped when 
> the computing layer is separate from the storage cluster.
> If I get it correctly, req.hosts and req.racks are provided by InputFormat. 
> If the mapper is to read HDFS, req.hosts is the corresponding data node and 
> req.racks is its rack. The debug log of AM is like:
> {code}
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 
> addResourceRequest: applicationId=1 priority=20 resourceName= 
> numContainers=1 #asks=1
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 
> addResourceRequest: applicationId=1 priority=20 resourceName= 
> numContainers=1 #asks=2
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 
> addResourceRequest: applicationId=1 priority=20 resourceName=* 
> numContainers=1 #asks=3
> {code}
> Although the resource request with resourceName= will eventually not be 
> satisfied (because the data node is not a node manager), it would be better 
> if the data-local and rack-local requests could be skipped (by options) at an 
> earlier stage, when we already know that the computing layer runs outside of 
> the storage cluster.






[jira] [Updated] (MAPREDUCE-7100) Provide options to skip adding container request for data-local and rack-local respectively

2018-05-31 Thread Xiang Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated MAPREDUCE-7100:

Description: 
We are using Hadoop 2.7.3 and the computing layer runs outside of the storage 
cluster (that is, node managers are running on a different set of nodes from 
data nodes). The problem we are seeing is that container allocation is quite 
slow for some jobs.
After some debugging, we found that in 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq() 
(the following code is from trunk, not 2.7.3)
{code}
protected void addContainerReq(ContainerRequest req) {
// Create resource requests
for (String host : req.hosts) {
  // Data-local
  if (!isNodeBlacklisted(host)) {
addResourceRequest(req.priority, host, req.capability,
null);
  }
}

// Nothing Rack-local for now
for (String rack : req.racks) {
  addResourceRequest(req.priority, rack, req.capability,
  null);
}

// Off-switch
addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
req.nodeLabelExpression);
  }
{code}

It seems that the data-local and rack-local requests could be skipped when the 
computing layer is separate from the storage cluster.
If I get it correctly, req.hosts and req.racks are provided by InputFormat. If 
the mapper is to read HDFS, req.hosts is the corresponding data node and 
req.racks is its rack. The debug log of AM is like:
{code}
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=1
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=2
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName=* numContainers=1 #asks=3
{code}
Although the resource request with resourceName= will eventually not be 
satisfied (because the data node is not a node manager), it would be better if 
the data-local and rack-local requests could be skipped (by options) at an 
earlier stage, when we already know that the computing layer runs outside of 
the storage cluster.



  was:
We are using Hadoop 2.7.3 and the computing layer runs outside of the storage 
cluster (that is, node managers are running on a different set of nodes from 
data nodes). The problem we are seeing is that container allocation is quite 
slow for some jobs.
After some debugging, we found that in 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq() 
(the following code is from trunk, not 2.7.3)
{code}
protected void addContainerReq(ContainerRequest req) {
// Create resource requests
for (String host : req.hosts) {
  // Data-local
  if (!isNodeBlacklisted(host)) {
addResourceRequest(req.priority, host, req.capability,
null);
  }
}

// Nothing Rack-local for now
for (String rack : req.racks) {
  addResourceRequest(req.priority, rack, req.capability,
  null);
}

// Off-switch
addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
req.nodeLabelExpression);
  }
{code}

It seems that the data-local and rack-local requests could be skipped when the 
computing layer is separate from the storage cluster.
If I get it correctly, req.hosts and req.racks are provided by InputFormat. If 
the mapper is to read HDFS, req.hosts is the corresponding data node and 
req.racks is its rack. The debug log of AM is like:
{code}
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=1
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=2
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName=* numContainers=1 #asks=3
{code}
Although the resource request with resourceName= will eventually not be 
satisfied (because the data node is not a node manager), it would be better if, 
when we know the computing layer is not the same as the storage cluster, the 
data-local and rack-local requests could be skipped (by options) at an earlier 
stage.




> Provide options to skip adding container request for data-local and 
> rack-local respectively
> ---
>
> Key: MAPREDUCE-7100
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7100
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Reporter: Xiang Li
>Priority: Minor
>
> We are using hadoop 2.7.3 and the computing layer is runn

[jira] [Commented] (MAPREDUCE-7100) Provide options to skip adding container request for data-local and rack-local respectively

2018-05-31 Thread Xiang Li (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496538#comment-16496538
 ] 

Xiang Li commented on MAPREDUCE-7100:
-

Please correct me if my understanding is wrong, or if there are existing 
solutions or options to achieve this.

> Provide options to skip adding container request for data-local and 
> rack-local respectively
> ---
>
> Key: MAPREDUCE-7100
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7100
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Reporter: Xiang Li
>Priority: Minor
>
> We are using Hadoop 2.7.3 and the computing layer runs outside of the 
> storage cluster (that is, node managers are running on a different set of 
> nodes from data nodes). The problem we are seeing is that container allocation 
> is quite slow for some jobs.
> After some debugging, we found that in 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq() 
> (the following code is from trunk, not 2.7.3)
> {code}
> protected void addContainerReq(ContainerRequest req) {
> // Create resource requests
> for (String host : req.hosts) {
>   // Data-local
>   if (!isNodeBlacklisted(host)) {
> addResourceRequest(req.priority, host, req.capability,
> null);
>   }
> }
> // Nothing Rack-local for now
> for (String rack : req.racks) {
>   addResourceRequest(req.priority, rack, req.capability,
>   null);
> }
> // Off-switch
> addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
> req.nodeLabelExpression);
>   }
> {code}
> It seems that the data-local and rack-local requests could be skipped when 
> the computing layer is separate from the storage cluster.
> If I get it correctly, req.hosts and req.racks are provided by InputFormat. 
> If the mapper is to read HDFS, req.hosts is the corresponding data node and 
> req.racks is its rack. The debug log of AM is like:
> {code}
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 
> addResourceRequest: applicationId=1 priority=20 resourceName= 
> numContainers=1 #asks=1
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 
> addResourceRequest: applicationId=1 priority=20 resourceName= 
> numContainers=1 #asks=2
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 
> addResourceRequest: applicationId=1 priority=20 resourceName=* 
> numContainers=1 #asks=3
> {code}
> Although the resource request with resourceName= will eventually not be 
> satisfied (because the data node is not a node manager), it would be better 
> if the data-local and rack-local requests could be skipped (by options) at an 
> earlier stage, when we already know that the computing layer runs outside of 
> the storage cluster.






[jira] [Updated] (MAPREDUCE-7100) Provide options to skip adding container request for data-local and rack-local respectively

2018-05-31 Thread Xiang Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated MAPREDUCE-7100:

Description: 
We are using Hadoop 2.7.3 and the computing layer runs outside of the storage 
cluster (that is, node managers are running on a different set of nodes from 
data nodes). The problem we are seeing is that container allocation is quite 
slow for some jobs.
After some debugging, we found that in 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq() 
(the following code is from trunk, not 2.7.3)
{code}
protected void addContainerReq(ContainerRequest req) {
// Create resource requests
for (String host : req.hosts) {
  // Data-local
  if (!isNodeBlacklisted(host)) {
addResourceRequest(req.priority, host, req.capability,
null);
  }
}

// Nothing Rack-local for now
for (String rack : req.racks) {
  addResourceRequest(req.priority, rack, req.capability,
  null);
}

// Off-switch
addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
req.nodeLabelExpression);
  }
{code}

It seems that the data-local and rack-local requests could be skipped when the 
computing layer is separate from the storage cluster.
If I get it correctly, req.hosts and req.racks are provided by InputFormat. If 
the mapper is to read HDFS, req.hosts is the corresponding data node and 
req.racks is its rack. The debug log of AM is like:
{code}
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=1
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName= numContainers=1 #asks=2
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=1 priority=20 resourceName=* numContainers=1 #asks=3
{code}
Although the resource requests with a data-node resourceName will eventually
not be satisfied (because the data nodes are not node managers), it would be
better if, when we know the computing layer is separate from the storage
cluster, the data-local and rack-local requests could be skipped (via options)
at an earlier stage.
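To illustrate the proposal, here is a minimal, hypothetical sketch (not the actual Hadoop API or a real patch): two assumed options decide whether data-local and rack-local resource names are requested, while the off-switch ("*", i.e. ResourceRequest.ANY) request is always kept so the job can still be scheduled anywhere. The class, method, and config-key names below are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed behavior of addContainerReq():
// build the list of resource names that would be requested, honoring
// two proposed options that skip data-local and rack-local requests.
public class LocalityRequestSketch {

    // Invented config keys, only to mirror the idea in this JIRA.
    public static final String SKIP_NODE_LOCAL = "mapreduce.job.skip-node-local-request";
    public static final String SKIP_RACK_LOCAL = "mapreduce.job.skip-rack-local-request";

    /**
     * Returns the resource names a container request would be issued for.
     * The off-switch request ("*") is always added, matching the existing
     * addContainerReq() behavior.
     */
    public static List<String> resourceNames(List<String> hosts,
                                             List<String> racks,
                                             boolean skipNodeLocal,
                                             boolean skipRackLocal) {
        List<String> names = new ArrayList<>();
        if (!skipNodeLocal) {
            names.addAll(hosts);   // data-local requests
        }
        if (!skipRackLocal) {
            names.addAll(racks);   // rack-local requests
        }
        names.add("*");            // off-switch (ResourceRequest.ANY)
        return names;
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("dn1", "dn2");
        List<String> racks = Arrays.asList("/rack1");
        // Storage and compute are separate: skip both locality levels,
        // leaving only the off-switch request.
        System.out.println(resourceNames(hosts, racks, true, true)); // prints "[*]"
    }
}
```

With both options off, the list is unchanged from today's behavior (hosts, then racks, then "*"), so the default would be fully backward compatible.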







> Provide options to skip adding container request for data-local and 
> rack-local respectively
> ---
>
> Key: MAPREDUCE-7100
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7100
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Reporter: Xiang Li
>Priority: Minor
>
> We are using hadoop 2.7.3 and the computing l

[jira] [Created] (MAPREDUCE-7100) Provide options to skip adding container request for data-local and rack-local respectively

2018-05-31 Thread Xiang Li (JIRA)
Xiang Li created MAPREDUCE-7100:
---

 Summary: Provide options to skip adding container request for 
data-local and rack-local respectively
 Key: MAPREDUCE-7100
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7100
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster
Reporter: Xiang Li






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7098) Upgrade common-langs version to 3.7 in hadoop-mapreduce-project

2018-05-31 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496401#comment-16496401
 ] 

Akira Ajisaka commented on MAPREDUCE-7098:
--

Thanks!

> Upgrade common-langs version to 3.7 in hadoop-mapreduce-project
> ---
>
> Key: MAPREDUCE-7098
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7098
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: MAPREDUCE-7098.1.patch
>
>
> commons-lang 2.6 is widely used. Let's upgrade to 3.6.
> This jira is separated from HADOOP-10783.






[jira] [Updated] (MAPREDUCE-7099) Daily test result fails in MapReduce JobClient though there isn't any error

2018-05-31 Thread Takanobu Asanuma (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated MAPREDUCE-7099:

Summary: Daily test result fails in MapReduce JobClient though there isn't 
any error  (was: Daily tests fail in MapReduce JobClient though there isn't any 
error)

> Daily test result fails in MapReduce JobClient though there isn't any error
> ---
>
> Key: MAPREDUCE-7099
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7099
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build, test
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Critical
>
> Looks like the test result in MapReduce JobClient always fails lately. Please 
> see the results of hadoop-qbt-trunk-java8-linux-x86:
>  
> [https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/]/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
> {noformat}
> [INFO] Results:
> [INFO] 
> [WARNING] Tests run: 565, Failures: 0, Errors: 0, Skipped: 10
> [INFO] 
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 02:06 h
> [INFO] Finished at: 2018-05-30T12:32:39+00:00
> [INFO] Final Memory: 25M/645M
> [INFO] 
> 
> [WARNING] The requested profile "parallel-tests" could not be activated 
> because it does not exist.
> [WARNING] The requested profile "shelltest" could not be activated because it 
> does not exist.
> [WARNING] The requested profile "native" could not be activated because it 
> does not exist.
> [WARNING] The requested profile "yarn-ui" could not be activated because it 
> does not exist.
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
> project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
> in the fork -> [Help 1]
> {noformat}






[jira] [Updated] (MAPREDUCE-7099) Daily tests fail in MapReduce JobClient though there isn't any error

2018-05-31 Thread Takanobu Asanuma (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated MAPREDUCE-7099:

Description: 
Looks like the test run in MapReduce JobClient has been failing lately. Please 
see the results of hadoop-qbt-trunk-java8-linux-x86:
 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86//artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
{noformat}
[INFO] Results:
[INFO] 
[WARNING] Tests run: 565, Failures: 0, Errors: 0, Skipped: 10
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:06 h
[INFO] Finished at: 2018-05-30T12:32:39+00:00
[INFO] Final Memory: 25M/645M
[INFO] ------------------------------------------------------------------------
[WARNING] The requested profile "parallel-tests" could not be activated because 
it does not exist.
[WARNING] The requested profile "shelltest" could not be activated because it 
does not exist.
[WARNING] The requested profile "native" could not be activated because it does 
not exist.
[WARNING] The requested profile "yarn-ui" could not be activated because it 
does not exist.
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
in the fork -> [Help 1]
{noformat}

  was:
Looks like the build always fails lately. Please see the results of 
hadoop-qbt-trunk-java8-linux-x86:
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86//artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt

{noformat}
[INFO] Results:
[INFO] 
[WARNING] Tests run: 565, Failures: 0, Errors: 0, Skipped: 10
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:06 h
[INFO] Finished at: 2018-05-30T12:32:39+00:00
[INFO] Final Memory: 25M/645M
[INFO] ------------------------------------------------------------------------
[WARNING] The requested profile "parallel-tests" could not be activated because 
it does not exist.
[WARNING] The requested profile "shelltest" could not be activated because it 
does not exist.
[WARNING] The requested profile "native" could not be activated because it does 
not exist.
[WARNING] The requested profile "yarn-ui" could not be activated because it 
does not exist.
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
in the fork -> [Help 1]
{noformat}


> Daily tests fail in MapReduce JobClient though there isn't any error
> -----------------------------------------------------------------------------
>
> Key: MAPREDUCE-7099
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7099
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build, test
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Critical
>
> Looks like the test run in MapReduce JobClient has been failing lately. Please 
> see the results of hadoop-qbt-trunk-java8-linux-x86:
>  
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86//artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
> {noformat}
> [INFO] Results:
> [INFO] 
> [WARNING] Tests run: 565, Failures: 0, Errors: 0, Skipped: 10
> [INFO] 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 02:06 h
> [INFO] Finished at: 2018-05-30T12:32:39+00:00
> [INFO] Final Memory: 25M/645M
> [INFO] ------------------------------------------------------------------------
> [WARNING] The requested profile "parallel-tests" could not be activated 
> because it does not exist.
> [WARNING] The requested profile "shelltest" could not be activated because it 
> does not exist.
> [WARNING] The requested profile "native" could not be activated because it 
> does not exist.
> [WARNING] The requested profile "yarn-ui" could not be activated because it 
> does not exist.
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
> project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
> in the fork -> [Help 1]
> {noformat}





[jira] [Updated] (MAPREDUCE-7099) Daily tests fail in MapReduce JobClient though there isn't any error

2018-05-31 Thread Takanobu Asanuma (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated MAPREDUCE-7099:

Summary: Daily tests fail in MapReduce JobClient though there isn't any 
error  (was: Daily build fails in MapReduce JobClient though there isn't any 
error)

> Daily tests fail in MapReduce JobClient though there isn't any error
> -----------------------------------------------------------------------------
>
> Key: MAPREDUCE-7099
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7099
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build, test
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Critical
>
> Looks like the build always fails lately. Please see the results of 
> hadoop-qbt-trunk-java8-linux-x86:
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86//artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
> {noformat}
> [INFO] Results:
> [INFO] 
> [WARNING] Tests run: 565, Failures: 0, Errors: 0, Skipped: 10
> [INFO] 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 02:06 h
> [INFO] Finished at: 2018-05-30T12:32:39+00:00
> [INFO] Final Memory: 25M/645M
> [INFO] ------------------------------------------------------------------------
> [WARNING] The requested profile "parallel-tests" could not be activated 
> because it does not exist.
> [WARNING] The requested profile "shelltest" could not be activated because it 
> does not exist.
> [WARNING] The requested profile "native" could not be activated because it 
> does not exist.
> [WARNING] The requested profile "yarn-ui" could not be activated because it 
> does not exist.
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
> project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
> in the fork -> [Help 1]
> {noformat}






[jira] [Updated] (MAPREDUCE-7099) Daily build fails in MapReduce JobClient though there isn't any error

2018-05-31 Thread Takanobu Asanuma (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated MAPREDUCE-7099:

Component/s: test

> Daily build fails in MapReduce JobClient though there isn't any error
> -----------------------------------------------------------------------------
>
> Key: MAPREDUCE-7099
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7099
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build, test
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Critical
>
> Looks like the build always fails lately. Please see the results of 
> hadoop-qbt-trunk-java8-linux-x86:
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86//artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
> {noformat}
> [INFO] Results:
> [INFO] 
> [WARNING] Tests run: 565, Failures: 0, Errors: 0, Skipped: 10
> [INFO] 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 02:06 h
> [INFO] Finished at: 2018-05-30T12:32:39+00:00
> [INFO] Final Memory: 25M/645M
> [INFO] ------------------------------------------------------------------------
> [WARNING] The requested profile "parallel-tests" could not be activated 
> because it does not exist.
> [WARNING] The requested profile "shelltest" could not be activated because it 
> does not exist.
> [WARNING] The requested profile "native" could not be activated because it 
> does not exist.
> [WARNING] The requested profile "yarn-ui" could not be activated because it 
> does not exist.
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
> project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
> in the fork -> [Help 1]
> {noformat}






[jira] [Commented] (MAPREDUCE-7098) Upgrade common-langs version to 3.7 in hadoop-mapreduce-project

2018-05-31 Thread Takanobu Asanuma (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496386#comment-16496386
 ] 

Takanobu Asanuma commented on MAPREDUCE-7098:
---------------------------------------------

Filed it in MAPREDUCE-7099.

> Upgrade common-langs version to 3.7 in hadoop-mapreduce-project
> -----------------------------------------------------------------------------
>
> Key: MAPREDUCE-7098
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7098
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: MAPREDUCE-7098.1.patch
>
>
> commons-lang 2.6 is widely used. Let's upgrade to 3.7.
> This jira is separated from HADOOP-10783.
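As background for this upgrade: commons-lang 2.x and 3.x are different artifacts, so the move is a coordinate change in the pom as well as a package rename (org.apache.commons.lang to org.apache.commons.lang3) in the sources. A minimal sketch of the dependency change, for illustration only; the exact pom edits are in the attached patch:

{code:xml}
<!-- Before: commons-lang 2.x -->
<dependency>
  <groupId>commons-lang</groupId>
  <artifactId>commons-lang</artifactId>
  <version>2.6</version>
</dependency>

<!-- After: commons-lang3 uses new Maven coordinates and a new Java package -->
<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-lang3</artifactId>
  <version>3.7</version>
</dependency>
{code}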






[jira] [Created] (MAPREDUCE-7099) Daily build fails in MapReduce JobClient though there isn't any error

2018-05-31 Thread Takanobu Asanuma (JIRA)
Takanobu Asanuma created MAPREDUCE-7099:
---------------------------------------

 Summary: Daily build fails in MapReduce JobClient though there 
isn't any error
 Key: MAPREDUCE-7099
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7099
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Reporter: Takanobu Asanuma
Assignee: Takanobu Asanuma


Looks like the build always fails lately. Please see the results of 
hadoop-qbt-trunk-java8-linux-x86:
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86//artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt

{noformat}
[INFO] Results:
[INFO] 
[WARNING] Tests run: 565, Failures: 0, Errors: 0, Skipped: 10
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:06 h
[INFO] Finished at: 2018-05-30T12:32:39+00:00
[INFO] Final Memory: 25M/645M
[INFO] ------------------------------------------------------------------------
[WARNING] The requested profile "parallel-tests" could not be activated because 
it does not exist.
[WARNING] The requested profile "shelltest" could not be activated because it 
does not exist.
[WARNING] The requested profile "native" could not be activated because it does 
not exist.
[WARNING] The requested profile "yarn-ui" could not be activated because it 
does not exist.
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
in the fork -> [Help 1]
{noformat}
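For context on the Surefire error above: "There was a timeout or other error in the fork" means a forked test JVM crashed, hung, or exceeded its timeout, which is why the build fails even though zero test failures and errors are reported. A hedged sketch of raising the fork timeout in a module pom.xml; the 7200-second value is an assumption for illustration, not the project's actual setting:

{code:xml}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <version>2.21.0</version>
  <configuration>
    <!-- Kill and fail the forked JVM if it runs longer than this many seconds (assumed value) -->
    <forkedProcessTimeoutInSeconds>7200</forkedProcessTimeoutInSeconds>
  </configuration>
</plugin>
{code}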






[jira] [Commented] (MAPREDUCE-7098) Upgrade common-langs version to 3.7 in hadoop-mapreduce-project

2018-05-31 Thread Takanobu Asanuma (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496356#comment-16496356
 ] 

Takanobu Asanuma commented on MAPREDUCE-7098:
---------------------------------------------

Thanks for reviewing and committing it, [~ajisakaa]!

bq. would you create a separate jira to track the failure of jenkins job?

Sure. I will file it soon.

> Upgrade common-langs version to 3.7 in hadoop-mapreduce-project
> -----------------------------------------------------------------------------
>
> Key: MAPREDUCE-7098
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7098
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: MAPREDUCE-7098.1.patch
>
>
> commons-lang 2.6 is widely used. Let's upgrade to 3.7.
> This jira is separated from HADOOP-10783.






[jira] [Commented] (MAPREDUCE-7098) Upgrade common-langs version to 3.7 in hadoop-mapreduce-project

2018-05-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496330#comment-16496330
 ] 

Hudson commented on MAPREDUCE-7098:
-----------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14321 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14321/])
MAPREDUCE-7098. Upgrade common-langs version to 3.7 in (aajisaka: rev 
d1e2b8098078af4af31392ed7f2fa350a7d1c3b2)
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/PathOutputCommitterFactory.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/TasksBlock.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/webapp/TestAppController.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestJobCounters.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/AppController.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/security/TokenCache.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/NotRunningJob.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/db/TestDBOutputFormat.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/tools/CLI.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/HumanReadableHistoryViewerPrinter.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/pom.xml
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/webapp/HsJobsBlock.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/TaskPage.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/webapp/HsTaskPage.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/StringUtils.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/checkpoint/RandomNameCNS.java
* (edit) 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestStringUtils.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/partition/TestRehashPartitioner.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ClientServiceDelegate.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ClientCache.java


> Upgrade common-langs version to 3.7 in hadoop-mapreduce-project
> -----------------------------------------------------------------------------
>
> Key: MAPREDUCE-7098
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7098
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: MAPREDUCE-7098.1.patch
>
>
> commons-lang 2.6 is widely used. Let's upgrade to 3.7.
> This jira is separated from HADOOP-10783.






[jira] [Updated] (MAPREDUCE-7098) Upgrade common-langs version to 3.7 in hadoop-mapreduce-project

2018-05-31 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated MAPREDUCE-7098:
-------------------------------------
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

Committed this to trunk. Thanks!

Hi [~tasanuma0829], would you create a separate jira to track the failure of 
jenkins job?

> Upgrade common-langs version to 3.7 in hadoop-mapreduce-project
> -----------------------------------------------------------------------------
>
> Key: MAPREDUCE-7098
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7098
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: MAPREDUCE-7098.1.patch
>
>
> commons-lang 2.6 is widely used. Let's upgrade to 3.7.
> This jira is separated from HADOOP-10783.






[jira] [Commented] (MAPREDUCE-7098) Upgrade common-langs version to 3.7 in hadoop-mapreduce-project

2018-05-31 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496220#comment-16496220
 ] 

Akira Ajisaka commented on MAPREDUCE-7098:
------------------------------------------

Thanks [~tasanuma0829]. LGTM, +1.

> Upgrade common-langs version to 3.7 in hadoop-mapreduce-project
> -----------------------------------------------------------------------------
>
> Key: MAPREDUCE-7098
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7098
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
> Attachments: MAPREDUCE-7098.1.patch
>
>
> commons-lang 2.6 is widely used. Let's upgrade to 3.7.
> This jira is separated from HADOOP-10783.


