[jira] [Commented] (MAPREDUCE-6771) Diagnostics information can be lost in .jhist if task containers are killed by Node Manager.

2016-08-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15440439#comment-15440439
 ] 

Hadoop QA commented on MAPREDUCE-6771:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 21s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 44s {color} | {color:green} hadoop-mapreduce-client-app in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 58s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12825749/mapreduce6771.001.patch |
| JIRA Issue | MAPREDUCE-6771 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 3f99f0b54520 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 19c743c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6700/testReport/ |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6700/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Diagnostics information can be lost in .jhist if task containers are killed 
> by Node Manager.
> 
>
> Key: MAPREDUCE-6771
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6771
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  

[jira] [Updated] (MAPREDUCE-6771) Diagnostics information can be lost in .jhist if task containers are killed by Node Manager.

2016-08-26 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6771:
--
Status: Patch Available  (was: Open)

> Diagnostics information can be lost in .jhist if task containers are killed 
> by Node Manager.
> 
>
> Key: MAPREDUCE-6771
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6771
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.7.3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6771.001.patch
>
>
> Task containers can go over their resource limit and be killed by the Node 
> Manager. The MR AM then gets notified of the container status and diagnostics 
> information through its heartbeat with the RM.  However, it is possible that 
> the diagnostics information never gets into the .jhist file, so when the job 
> completes, the diagnostics information associated with the failed task 
> attempts is empty.  This makes it hard for users to root-cause job failures, 
> which are often caused by memory leaks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6771) Diagnostics information can be lost in .jhist if task containers are killed by Node Manager.

2016-08-26 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6771:
--
Attachment: mapreduce6771.001.patch

Uploading a patch to fix this. Not sure how a unit test can be written. Any 
suggestion is greatly appreciated.

> Diagnostics information can be lost in .jhist if task containers are killed 
> by Node Manager.
> 
>
> Key: MAPREDUCE-6771
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6771
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.7.3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6771.001.patch
>
>
> Task containers can go over their resource limit and be killed by the Node 
> Manager. The MR AM then gets notified of the container status and diagnostics 
> information through its heartbeat with the RM.  However, it is possible that 
> the diagnostics information never gets into the .jhist file, so when the job 
> completes, the diagnostics information associated with the failed task 
> attempts is empty.  This makes it hard for users to root-cause job failures, 
> which are often caused by memory leaks.






[jira] [Commented] (MAPREDUCE-6771) Diagnostics information can be lost in .jhist if task containers are killed by Node Manager.

2016-08-26 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15440379#comment-15440379
 ] 

Haibo Chen commented on MAPREDUCE-6771:
---

If tasks are killed or fail on the NM before they can notify the AM, the user 
needs to dig through NM logs or task logs, hoping to find some useful 
information as to why the task attempt failed.

> Diagnostics information can be lost in .jhist if task containers are killed 
> by Node Manager.
> 
>
> Key: MAPREDUCE-6771
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6771
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.7.3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> Task containers can go over their resource limit and be killed by the Node 
> Manager. The MR AM then gets notified of the container status and diagnostics 
> information through its heartbeat with the RM.  However, it is possible that 
> the diagnostics information never gets into the .jhist file, so when the job 
> completes, the diagnostics information associated with the failed task 
> attempts is empty.  This makes it hard for users to root-cause job failures, 
> which are often caused by memory leaks.






[jira] [Commented] (MAPREDUCE-6771) Diagnostics information can be lost in .jhist if task containers are killed by Node Manager.

2016-08-26 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15440372#comment-15440372
 ] 

Haibo Chen commented on MAPREDUCE-6771:
---

Analysis:
{code:java}
RMContainerAllocator.getResources() {
  ...
  for (ContainerStatus cont : finishedContainers) {
    LOG.info("Received completed container " + cont.getContainerId());
    TaskAttemptId attemptID = assignedRequests.get(cont.getContainerId());
    if (attemptID == null) {
      LOG.error("Container complete event for unknown container id "
          + cont.getContainerId());
    } else {
      pendingRelease.remove(cont.getContainerId());
      assignedRequests.remove(attemptID);

      // send the container completed event to Task attempt
      eventHandler.handle(createContainerFinishedEvent(cont, attemptID));

      // Send the diagnostics
      String diagnostics = StringInterner.weakIntern(cont.getDiagnostics());
      eventHandler.handle(new TaskAttemptDiagnosticsUpdateEvent(attemptID,
          diagnostics));

      preemptionPolicy.handleCompletedContainer(attemptID);
    }
  }
  ...
}
{code}
The scenario in question is as follows: a job is running, and one of its task 
attempts running on an NM is killed by the NM because the container exceeds 
its resource limit. The container status and diagnostics are sent to the RM by 
the NM and later to the MR AM in its periodic heartbeat with the RM, as shown 
above. From the AM's perspective, the task attempt is still in the RUNNING 
state, since the task heartbeat has not timed out. 

Upon learning from the RM that the task attempt container has finished, the 
RMCommunicator thread will place a ContainerFinishedEvent and a 
TaskAttemptDiagnosticsUpdateEvent in the event queue. 

The ContainerFinishedEvent will cause the task attempt in the MR AM to 
transition from RUNNING to FAILED and a TaskAttemptUnsuccessfulCompletionEvent 
that contains the associated diagnostics information to be written to the 
.jhist file.  The TaskAttemptDiagnosticsUpdateEvent will update the 
diagnostics information associated with the task attempt. 

But since the ContainerFinishedEvent is placed and processed before the 
TaskAttemptDiagnosticsUpdateEvent, the TaskAttemptUnsuccessfulCompletionEvent 
written to the .jhist file will not contain the diagnostics info received from 
the RM.

After the job completes, when the user tries to access the failed task 
attempts through the JHS, the TaskAttemptUnsuccessfulCompletionEvent is parsed 
to generate the failed attempt page.  The page will not have the diagnostics 
info from the RM (such as container killed by Node Manager...) because it was 
never written to .jhist in the first place.
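
The ordering problem described above can be reproduced with a small, self-contained model. This is only a sketch: EventOrderSketch, Kind, Event and Attempt are hypothetical stand-ins, not the MR AM classes, and the "history" list merely plays the role of the .jhist record. It shows that a FIFO queue which sees the finished event first records empty diagnostics, while the reversed order records them. Whether the attached patch takes this approach is not stated in this comment.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class EventOrderSketch {
    // Hypothetical stand-ins for the two events the RMCommunicator thread enqueues.
    enum Kind { CONTAINER_FINISHED, DIAGNOSTICS_UPDATE }

    static final class Event {
        final Kind kind;
        final String diagnostics; // only meaningful for DIAGNOSTICS_UPDATE

        Event(Kind kind, String diagnostics) {
            this.kind = kind;
            this.diagnostics = diagnostics;
        }
    }

    // Toy task attempt: it snapshots its current diagnostics into a
    // "history" list at the moment it transitions to FAILED.
    static final class Attempt {
        private final StringBuilder diagnostics = new StringBuilder();
        final List<String> history = new ArrayList<String>();

        void handle(Event e) {
            if (e.kind == Kind.DIAGNOSTICS_UPDATE) {
                diagnostics.append(e.diagnostics);
            } else {
                // RUNNING -> FAILED: record whatever diagnostics are known right now.
                history.add(diagnostics.toString());
            }
        }
    }

    // Drains the events in FIFO order and returns what was written to history.
    static String recordedDiagnostics(List<Event> events) {
        Queue<Event> queue = new ArrayDeque<Event>(events);
        Attempt attempt = new Attempt();
        while (!queue.isEmpty()) {
            attempt.handle(queue.poll());
        }
        return attempt.history.get(0);
    }

    public static void main(String[] args) {
        Event finished = new Event(Kind.CONTAINER_FINISHED, null);
        Event update = new Event(Kind.DIAGNOSTICS_UPDATE, "Container killed by Node Manager");

        // Order described in the analysis: the history record ends up empty.
        System.out.println("[" + recordedDiagnostics(Arrays.asList(finished, update)) + "]");
        // Reversed order: the history record carries the diagnostics.
        System.out.println("[" + recordedDiagnostics(Arrays.asList(update, finished)) + "]");
    }
}
```

In this model the only difference between the two runs is the order in which the two events are enqueued, which is also the only thing the RMCommunicator thread controls in the snippet quoted above.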

> Diagnostics information can be lost in .jhist if task containers are killed 
> by Node Manager.
> 
>
> Key: MAPREDUCE-6771
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6771
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.7.3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> Task containers can go over their resource limit and be killed by the Node 
> Manager. The MR AM then gets notified of the container status and diagnostics 
> information through its heartbeat with the RM.  However, it is possible that 
> the diagnostics information never gets into the .jhist file, so when the job 
> completes, the diagnostics information associated with the failed task 
> attempts is empty.  This makes it hard for users to root-cause job failures, 
> which are often caused by memory leaks.






[jira] [Updated] (MAPREDUCE-6771) Diagnostics information can be lost in .jhist if task containers are killed by Node Manager.

2016-08-26 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6771:
--
Summary: Diagnostics information can be lost in .jhist if task containers 
are killed by Node Manager.  (was: Diagnostics information is lost in .jhist if 
task containers are killed by Node Manager.)

> Diagnostics information can be lost in .jhist if task containers are killed 
> by Node Manager.
> 
>
> Key: MAPREDUCE-6771
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6771
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.7.3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> Task containers can go over their resource limit and be killed by the Node 
> Manager. The MR AM then gets notified of the container status and diagnostics 
> information through its heartbeat with the RM.  However, it is possible that 
> the diagnostics information never gets into the .jhist file, so when the job 
> completes, the diagnostics information associated with the failed task 
> attempts is empty.  This makes it hard for users to root-cause job failures, 
> which are often caused by memory leaks.






[jira] [Created] (MAPREDUCE-6771) Diagnostics information is lost in .jhist if task containers are killed by Node Manager.

2016-08-26 Thread Haibo Chen (JIRA)
Haibo Chen created MAPREDUCE-6771:
-

 Summary: Diagnostics information is lost in .jhist if task 
containers are killed by Node Manager.
 Key: MAPREDUCE-6771
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6771
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.7.3
Reporter: Haibo Chen
Assignee: Haibo Chen


Task containers can go over their resource limit and be killed by the Node 
Manager. The MR AM then gets notified of the container status and diagnostics 
information through its heartbeat with the RM.  However, it is possible that 
the diagnostics information never gets into the .jhist file, so when the job 
completes, the diagnostics information associated with the failed task 
attempts is empty.  This makes it hard for users to root-cause job failures, 
which are often caused by memory leaks.






[jira] [Updated] (MAPREDUCE-6769) Fix forgotten conversion from "slave" to "worker" in mapred script

2016-08-26 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-6769:

Status: Patch Available  (was: Open)

> Fix forgotten conversion from "slave" to "worker" in mapred script
> --
>
> Key: MAPREDUCE-6769
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6769
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 3.0.0-alpha2
>Reporter: Albert Chu
>Assignee: Albert Chu
>Priority: Minor
>
> In HADOOP-13209 (commit 23c3ff85a9e73d8f0755e14f12cc7c89b72acddd), "slaves" 
> was replaced with "workers" including the function name change from 
> hadoop_common_slave_mode_execute to hadoop_common_worker_mode_execute and 
> environment variable name change from HADOOP_SLAVE_MODE to HADOOP_WORKER_MODE.
> It appears this change was forgotten in hadoop-mapreduce-project/bin/mapred.
> A GitHub pull request with the fix will be sent shortly.






[jira] [Updated] (MAPREDUCE-6769) Fix forgotten conversion from "slave" to "worker" in mapred script

2016-08-26 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-6769:

Assignee: Albert Chu

> Fix forgotten conversion from "slave" to "worker" in mapred script
> --
>
> Key: MAPREDUCE-6769
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6769
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 3.0.0-alpha2
>Reporter: Albert Chu
>Assignee: Albert Chu
>Priority: Minor
>
> In HADOOP-13209 (commit 23c3ff85a9e73d8f0755e14f12cc7c89b72acddd), "slaves" 
> was replaced with "workers" including the function name change from 
> hadoop_common_slave_mode_execute to hadoop_common_worker_mode_execute and 
> environment variable name change from HADOOP_SLAVE_MODE to HADOOP_WORKER_MODE.
> It appears this change was forgotten in hadoop-mapreduce-project/bin/mapred.
> A GitHub pull request with the fix will be sent shortly.






[jira] [Created] (MAPREDUCE-6770) NodeHealthScriptRunner#reportHealthStatus bug fix

2016-08-26 Thread Yufei Gu (JIRA)
Yufei Gu created MAPREDUCE-6770:
---

 Summary: NodeHealthScriptRunner#reportHealthStatus bug fix
 Key: MAPREDUCE-6770
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6770
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Reporter: Yufei Gu
Assignee: Yufei Gu


{code}
  case FAILED_WITH_EXIT_CODE:
    setHealthStatus(true, "", now);
    break;
{code}
should be 
{code}
  case FAILED_WITH_EXIT_CODE:
    setHealthStatus(false, "", now);
    break;
{code}
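
For context, the proposed behavior can be modeled in isolation. This is a rough sketch under assumptions: HealthStatusSketch is a hypothetical class, the ExitStatus enum lists only an assumed SUCCESS value plus the FAILED_WITH_EXIT_CODE value quoted above, and setHealthStatus here is a local stand-in that simply stores its arguments. It is not the NodeHealthScriptRunner implementation.

```java
public class HealthStatusSketch {
    // Hypothetical subset of exit statuses; only FAILED_WITH_EXIT_CODE
    // appears in the report, SUCCESS is assumed for contrast.
    enum ExitStatus { SUCCESS, FAILED_WITH_EXIT_CODE }

    private boolean healthy = true;
    private String report = "";
    private long lastReported;

    // Local stand-in for the setHealthStatus(boolean, String, long) call quoted above.
    void setHealthStatus(boolean isHealthy, String output, long time) {
        this.healthy = isHealthy;
        this.report = output;
        this.lastReported = time;
    }

    void reportHealthStatus(ExitStatus status) {
        long now = System.currentTimeMillis();
        switch (status) {
            case SUCCESS:
                setHealthStatus(true, "", now);
                break;
            case FAILED_WITH_EXIT_CODE:
                // Proposed change: mark the node unhealthy instead of healthy.
                setHealthStatus(false, "", now);
                break;
            default:
                break;
        }
    }

    boolean isHealthy() {
        return healthy;
    }

    public static void main(String[] args) {
        HealthStatusSketch runner = new HealthStatusSketch();
        runner.reportHealthStatus(ExitStatus.FAILED_WITH_EXIT_CODE);
        System.out.println("healthy after failed exit code: " + runner.isHealthy());
    }
}
```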







[jira] [Commented] (MAPREDUCE-6768) TestRecovery.testSpeculative failed with NPE

2016-08-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15439675#comment-15439675
 ] 

Hadoop QA commented on MAPREDUCE-6768:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 23s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 48s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s {color} | {color:red} hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: The patch generated 1 new + 118 unchanged - 1 fixed = 119 total (was 119) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 49s {color} | {color:green} hadoop-mapreduce-client-app in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m 39s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12825692/mapreduce6768.002.patch |
| JIRA Issue | MAPREDUCE-6768 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux cacaee0233c1 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / cde3a00 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6699/artifact/patchprocess/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6699/testReport/ |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6699/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> TestRecovery.testSpeculative failed with NPE
> 
>
> Key: MAPREDUCE-6768
> URL: 

[jira] [Commented] (MAPREDUCE-6768) TestRecovery.testSpeculative failed with NPE

2016-08-26 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15439643#comment-15439643
 ] 

Jason Lowe commented on MAPREDUCE-6768:
---

bq. I guess I must be following some bad practice I have seen in the code base.

It's not a terrible practice, just that I've been sensitive to unnecessarily 
long sleeps in unit tests lately.  See the discussion at YARN-5393 for details.

+1 pending Jenkins.  The patch still can't be used as-is on other branches 
since JDK7 will want task1Attempt2 to be final for use in the inner class, but 
that's something I can easily fix during the commit.


> TestRecovery.testSpeculative failed with NPE
> 
>
> Key: MAPREDUCE-6768
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6768
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6768.001.patch, mapreduce6768.002.patch
>
>
> 1 tests failed.
> REGRESSION:  org.apache.hadoop.mapreduce.v2.app.TestRecovery.testSpeculative
> Error Message:
> null
> Stack Trace:
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.mapreduce.v2.app.TestRecovery.testSpeculative(TestRecovery.java:1201)






[jira] [Commented] (MAPREDUCE-6768) TestRecovery.testSpeculative failed with NPE

2016-08-26 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15439589#comment-15439589
 ] 

Haibo Chen commented on MAPREDUCE-6768:
---

Thanks for the review, Jason! I guess I must be following some bad practice I 
have seen in the code base. In the new patch, I have increased the overall 
timeout to 10s and lowered the check interval to 10 milliseconds. I also 
removed the use of the lambda.
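
The polling approach described here (a 10 s overall timeout with a 10 ms check interval, and no lambda) might look roughly like the following. This is a minimal sketch, not the code in the patch: WaitForSketch, Condition and waitFor are hypothetical names, and the background thread only simulates the state the test waits for. The captured local is declared final and an anonymous inner class is used, so the same code would also compile on JDK 7.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class WaitForSketch {
    // Hypothetical condition interface; avoids java.util.function so that
    // the code does not depend on Java 8.
    interface Condition {
        boolean isMet();
    }

    // Polls the condition every checkEveryMillis until it holds, or throws
    // once timeoutMillis has elapsed.
    static void waitFor(Condition condition, long checkEveryMillis, long timeoutMillis)
            throws InterruptedException, TimeoutException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!condition.isMet()) {
            if (System.currentTimeMillis() >= deadline) {
                throw new TimeoutException(
                    "Condition not met within " + timeoutMillis + " ms");
            }
            Thread.sleep(checkEveryMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        // final is required on JDK 7 for use inside the anonymous classes below.
        final AtomicBoolean done = new AtomicBoolean(false);

        Thread worker = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    Thread.sleep(50); // simulated work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.set(true);
            }
        });
        worker.start();

        // Short check interval keeps the test fast; long timeout tolerates slow VMs.
        waitFor(new Condition() {
            @Override
            public boolean isMet() {
                return done.get();
            }
        }, 10, 10000);
        System.out.println("condition met");
    }
}
```

With this shape the test returns within about one check interval of the condition becoming true, so a generous timeout costs nothing in the passing case.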

> TestRecovery.testSpeculative failed with NPE
> 
>
> Key: MAPREDUCE-6768
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6768
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6768.001.patch, mapreduce6768.002.patch
>
>
> 1 tests failed.
> REGRESSION:  org.apache.hadoop.mapreduce.v2.app.TestRecovery.testSpeculative
> Error Message:
> null
> Stack Trace:
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.mapreduce.v2.app.TestRecovery.testSpeculative(TestRecovery.java:1201)






[jira] [Updated] (MAPREDUCE-6768) TestRecovery.testSpeculative failed with NPE

2016-08-26 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6768:
--
Attachment: mapreduce6768.002.patch

> TestRecovery.testSpeculative failed with NPE
> 
>
> Key: MAPREDUCE-6768
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6768
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6768.001.patch, mapreduce6768.002.patch
>
>
> 1 tests failed.
> REGRESSION:  org.apache.hadoop.mapreduce.v2.app.TestRecovery.testSpeculative
> Error Message:
> null
> Stack Trace:
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.mapreduce.v2.app.TestRecovery.testSpeculative(TestRecovery.java:1201)






[jira] [Commented] (MAPREDUCE-6769) Fix forgotten conversion from "slave" to "worker" in mapred script

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15439526#comment-15439526
 ] 

ASF GitHub Bot commented on MAPREDUCE-6769:
---

GitHub user chu11 opened a pull request:

https://github.com/apache/hadoop/pull/123

MAPREDUCE-6769. Fix forgotten name conversion from "slave" to "worker" in 
mapred script,

most notably fixing environment variable name change and function name
change.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chu11/hadoop MAPREDUCE-6769

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hadoop/pull/123.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #123


commit 2bdf1a0e3e993a1bd7b1dd94e4d4fd42b6d26907
Author: Albert Chu 
Date:   2016-08-26T18:19:09Z

Fix forgotten name conversion from "slave" to "worker" in mapred script,
most notably fixing environment variable name change and function name
change.




> Fix forgotten conversion from "slave" to "worker" in mapred script
> --
>
> Key: MAPREDUCE-6769
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6769
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 3.0.0-alpha2
>Reporter: Albert Chu
>Priority: Minor
>
> In HADOOP-13209 (commit 23c3ff85a9e73d8f0755e14f12cc7c89b72acddd), "slaves" 
> was replaced with "workers" including the function name change from 
> hadoop_common_slave_mode_execute to hadoop_common_worker_mode_execute and 
> environment variable name change from HADOOP_SLAVE_MODE to HADOOP_WORKER_MODE.
> It appears this change was forgotten in hadoop-mapreduce-project/bin/mapred.
> A GitHub pull request with the fix will be sent shortly.






[jira] [Created] (MAPREDUCE-6769) Fix forgotten conversion from "slave" to "worker" in mapred script

2016-08-26 Thread Albert Chu (JIRA)
Albert Chu created MAPREDUCE-6769:
-

 Summary: Fix forgotten conversion from "slave" to "worker" in 
mapred script
 Key: MAPREDUCE-6769
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6769
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.0.0-alpha2
Reporter: Albert Chu
Priority: Minor


In HADOOP-13209 (commit 23c3ff85a9e73d8f0755e14f12cc7c89b72acddd), "slaves" was 
replaced with "workers" including the function name change from 
hadoop_common_slave_mode_execute to hadoop_common_worker_mode_execute and 
environment variable name change from HADOOP_SLAVE_MODE to HADOOP_WORKER_MODE.

It appears this change was forgotten in hadoop-mapreduce-project/bin/mapred.

A GitHub pull request with the fix will be sent shortly.







[jira] [Commented] (MAPREDUCE-6740) Enforce mapreduce.task.timeout to be at least mapreduce.task.progress-report.interval

2016-08-26 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15439452#comment-15439452
 ] 

Karthik Kambatla commented on MAPREDUCE-6740:
-

The latest patch looks good. A couple of minor comments:
# TaskHeartbeatHandler: When declaring taskTimeout, avoid setting the value to 
{{5 * 60 * 1000}}.
# Do we need a test case for when we don't set TASK_REPORT_INTERVAL?
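
As an illustration of the constraint being reviewed, the sketch below clamps the task timeout so that it is never smaller than the progress report interval, and covers the case where the interval is not set. Everything except the two property names is an assumption: TaskTimeoutSketch and its helpers are hypothetical, a plain Map stands in for the job configuration, the default values are placeholders (the timeout default reuses the literal from the first comment as a named constant), and clamping is only one possible policy. The thread does not say which policy the patch uses.

```java
import java.util.HashMap;
import java.util.Map;

public class TaskTimeoutSketch {
    static final String TASK_TIMEOUT = "mapreduce.task.timeout";
    static final String TASK_PROGRESS_REPORT_INTERVAL =
        "mapreduce.task.progress-report.interval";

    // Placeholder defaults for illustration only; not taken from Hadoop.
    static final long ASSUMED_DEFAULT_TIMEOUT_MS = 5 * 60 * 1000L;
    static final long ASSUMED_DEFAULT_INTERVAL_MS = 3 * 1000L;

    // Reads a long from the map-based stand-in for the job configuration.
    static long getLong(Map<String, String> conf, String key, long defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Long.parseLong(value.trim());
    }

    // Returns a timeout that is at least as large as the report interval.
    static long effectiveTaskTimeout(Map<String, String> conf) {
        long timeout = getLong(conf, TASK_TIMEOUT, ASSUMED_DEFAULT_TIMEOUT_MS);
        long interval = getLong(conf, TASK_PROGRESS_REPORT_INTERVAL,
            ASSUMED_DEFAULT_INTERVAL_MS);
        return Math.max(timeout, interval);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<String, String>();
        conf.put(TASK_TIMEOUT, "1000");
        conf.put(TASK_PROGRESS_REPORT_INTERVAL, "5000");
        // The timeout is raised to the interval because 1000 < 5000.
        System.out.println(effectiveTaskTimeout(conf));

        // Interval not set: the placeholder default interval is used instead.
        conf.remove(TASK_PROGRESS_REPORT_INTERVAL);
        System.out.println(effectiveTaskTimeout(conf));
    }
}
```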

> Enforce mapreduce.task.timeout to be at least 
> mapreduce.task.progress-report.interval
> -
>
> Key: MAPREDUCE-6740
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6740
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.8.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
> Attachments: mapreduce6740.001.patch, mapreduce6740.002.patch, 
> mapreduce6740.003.patch, mapreduce6740.004.patch, mapreduce6740.005.patch, 
> mapreduce6740.006.patch
>
>
> MAPREDUCE-6242 makes the task status update interval configurable to ease the 
> pressure on the MR AM to process status updates, but it did not ensure that 
> mapreduce.task.timeout is no smaller than the configured value of the task 
> report interval. 






[jira] [Commented] (MAPREDUCE-6768) TestRecovery.testSpeculative failed with NPE

2016-08-26 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438934#comment-15438934
 ] 

Jason Lowe commented on MAPREDUCE-6768:
---

Thanks for the patch!

I suspect this patch is going to be appropriate for more than just trunk, so as 
such it'd be good to avoid the lambda use.

I think a wait of only 800 msec is going to be too short if the test runs on a 
slow VM or some other hiccup occurs.

Nit: Any reason to wait 100 msec instead of 10 per iteration?  Yes, I'm overly 
sensitive to sleeps lately with all the slow YARN tests. ;-)


> TestRecovery.testSpeculative failed with NPE
> 
>
> Key: MAPREDUCE-6768
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6768
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6768.001.patch
>
>
> 1 tests failed.
> REGRESSION:  org.apache.hadoop.mapreduce.v2.app.TestRecovery.testSpeculative
> Error Message:
> null
> Stack Trace:
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.mapreduce.v2.app.TestRecovery.testSpeculative(TestRecovery.java:1201)


