[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364279#comment-17364279
 ] 

Hadoop QA commented on MAPREDUCE-7353:
--------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 10s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 45s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 39s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 36s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 35s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 37s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 16s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 32s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 31s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 16m 17s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  0m 58s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 32s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 30s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 30s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 27s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 27s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 27s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 30s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 45s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 26s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 24s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  1m  6s{color} | {color:green}{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} || ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 17s{color} | {color:green}{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 30s{color} | {color:green}{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 79m 30s{color} | {color:black}{color} | {color:black}{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/PreCommit-MAPREDUCE-Build/80/artifact/out/Dockerfile |
| JIRA Issue | MAPREDUCE-7353 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13026912/MAPREDUCE-7353.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle spotbugs |
| uname | Linux 219b68df36c3 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 2b304ad6457 |
| Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| Test Results | https://ci-hadoop.apache.org/job/PreCommit-MAPREDUCE-Build/80/testReport/ |
| Max. process+thread count | 697 (vs. ulimit of 5500) |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app |
| Console output | https://ci-hadoop.apache.org/job/PreCommit-MAPREDUCE-Build/80/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |


This message was automatically generated.



> Mapreduce job fails when NM is stopped
> --------------------------------------
>
>                 Key: MAPREDUCE-7353
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7353
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: MAPREDUCE-7353.001.patch
>
>
> The job fails because a task fails due to too many fetch failures:
> {code:java}
> Line 48048: 2021-06-02 16:25:02,002 | INFO  | ContainerLauncher #6 | 
> Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container 
> container_e03_1622107691213_1054_01_000005 taskAttempt 
> attempt_1622107691213_1054_m_000000_0 | ContainerLauncherImpl.java:394
>       Line 48053: 2021-06-02 16:25:02,002 | INFO  | ContainerLauncher #6 | 
> KILLING attempt_1622107691213_1054_m_000000_0 | ContainerLauncherImpl.java:209
>       Line 58026: 2021-06-02 16:26:34,034 | INFO  | AsyncDispatcher event 
> handler | TaskAttempt killed because it ran on unusable node 
> node-group-1ZYEq0002:26009. AttemptId:attempt_1622107691213_1054_m_000000_0 | 
> JobImpl.java:1401
>       Line 58030: 2021-06-02 16:26:34,034 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_000000_0 of type TA_KILL | 
> TaskAttemptImpl.java:1390
>       Line 58035: 2021-06-02 16:26:34,034 | INFO  | RMCommunicator Allocator 
> | Killing taskAttempt:attempt_1622107691213_1054_m_000000_0 because it is 
> running on unusable node:node-group-1ZYEq0002:26009 | 
> RMContainerAllocator.java:1066
>       Line 58043: 2021-06-02 16:26:34,034 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_000000_0 of type TA_KILL | 
> TaskAttemptImpl.java:1390
>       Line 58054: 2021-06-02 16:26:34,034 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_000000_0 of type 
> TA_DIAGNOSTICS_UPDATE | TaskAttemptImpl.java:1390
>       Line 58055: 2021-06-02 16:26:34,034 | INFO  | AsyncDispatcher event 
> handler | Diagnostics report from attempt_1622107691213_1054_m_000000_0: 
> Container released on a *lost* node | TaskAttemptImpl.java:2649
>       Line 58057: 2021-06-02 16:26:34,034 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_000000_0 of type TA_KILL | 
> TaskAttemptImpl.java:1390
>       Line 60317: 2021-06-02 16:26:57,057 | INFO  | AsyncDispatcher event 
> handler | Too many fetch-failures for output of task attempt: 
> attempt_1622107691213_1054_m_000000_0 ... raising fetch failure to map | 
> JobImpl.java:2005
>       Line 60319: 2021-06-02 16:26:57,057 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_000000_0 of type 
> TA_TOO_MANY_FETCH_FAILURE | TaskAttemptImpl.java:1390
>       Line 60320: 2021-06-02 16:26:57,057 | INFO  | AsyncDispatcher event 
> handler | attempt_1622107691213_1054_m_000000_0 transitioned from state 
> SUCCESS_CONTAINER_CLEANUP to FAILED, event type is TA_TOO_MANY_FETCH_FAILURE 
> and nodeId=node-group-1ZYEq0002:26009 | TaskAttemptImpl.java:1411
>       Line 69487: 2021-06-02 16:30:02,002 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_000000_0 of type 
> TA_DIAGNOSTICS_UPDATE | TaskAttemptImpl.java:1390
>       Line 69527: 2021-06-02 16:30:02,002 | INFO  | AsyncDispatcher event 
> handler | Diagnostics report from attempt_1622107691213_1054_m_000000_0: 
> cleanup failed for container container_e03_1622107691213_1054_01_000005 : 
> java.net.ConnectException: Call From node-group-1ZYEq0001/192.168.0.66 to 
> node-group-1ZYEq0002:26009 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
>       Line 69607: 2021-06-02 16:30:02,002 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_000000_0 of type 
> TA_CONTAINER_CLEANED | TaskAttemptImpl.java:1390
>       Line 69609: 2021-06-02 16:30:02,002 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_000000_0 of type 
> TA_CONTAINER_CLEANED | TaskAttemptImpl.java:1390
>       Line 73645: 2021-06-02 16:23:56,056 | DEBUG | fetcher#9 | Fetcher 9 
> going to fetch from node-group-1ZYEq0002:26008 for: 
> [attempt_1622107691213_1054_m_000000_0] | Fetcher.java:318
>       Line 73646: 2021-06-02 16:23:56,056 | DEBUG | fetcher#9 | MapOutput URL 
> for node-group-1ZYEq0002:26008 -> 
> http://node-group-1ZYEq0002:26008/mapOutput?job=job_1622107691213_1054&reduce=4&map=attempt_1622107691213_1054_m_000000_0
>  | Fetcher.java:686
>       Line 74093: 2021-06-02 16:26:56,056 | INFO  | fetcher#9 | Reporting 
> fetch failure for attempt_1622107691213_1054_m_000000_0 to MRAppMaster. | 
> ShuffleSchedulerImpl.java:349
> {code}
> The logs show that the RM notified the AM of the node update at 16:26:34, but
> the resulting TA_KILL event was ignored because TaskAttemptImpl was in the
> SUCCESS_CONTAINER_CLEANUP state. The AM then received a
> TA_TOO_MANY_FETCH_FAILURE event, which moved the attempt to FAILED and caused
> the task to fail.
>  
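
The event ordering described in the issue can be illustrated with a toy state machine. This is only a sketch under stated assumptions: the class AttemptStateSketch, its methods, and the honorKill flag are hypothetical and are not Hadoop's TaskAttemptImpl, and the "honored" branch is just one conceivable alternative, not necessarily what the attached patch does. Only the state and event names come from the logs above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified model of the reported sequence; not Hadoop code. */
class AttemptStateSketch {
    // State and event names are taken from the log excerpt in the issue.
    enum State { SUCCESS_CONTAINER_CLEANUP, KILLED, FAILED }
    enum Event { TA_KILL, TA_TOO_MANY_FETCH_FAILURE }

    /**
     * Returns the next state for one event.
     * honorKill=false models the reported behavior (TA_KILL is ignored);
     * honorKill=true models an assumed alternative where TA_KILL wins.
     */
    static State next(State state, Event event, boolean honorKill) {
        if (state != State.SUCCESS_CONTAINER_CLEANUP) {
            return state; // KILLED and FAILED are treated as terminal in this sketch
        }
        switch (event) {
            case TA_KILL:
                return honorKill ? State.KILLED : state;
            case TA_TOO_MANY_FETCH_FAILURE:
                return State.FAILED;
            default:
                return state;
        }
    }

    /** Replays events in order, starting from SUCCESS_CONTAINER_CLEANUP. */
    static State replay(List<Event> events, boolean honorKill) {
        State state = State.SUCCESS_CONTAINER_CLEANUP;
        for (Event event : events) {
            state = next(state, event, honorKill);
        }
        return state;
    }

    public static void main(String[] args) {
        // Order seen in the logs: TA_KILL at 16:26:34, fetch failure at 16:26:57.
        List<Event> fromLogs =
            Arrays.asList(Event.TA_KILL, Event.TA_TOO_MANY_FETCH_FAILURE);
        System.out.println("KILL ignored: " + replay(fromLogs, false));
        System.out.println("KILL honored: " + replay(fromLogs, true));
    }
}
```

With TA_KILL ignored, the replay ends in FAILED, matching the transition logged at 16:26:57. With TA_KILL honored, the attempt is already KILLED when the fetch-failure event arrives, so the later event has no effect in this model.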



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
