[jira] [Commented] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump
[ https://issues.apache.org/jira/browse/MAPREDUCE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465895#comment-16465895 ]

Eric Payne commented on MAPREDUCE-7053:
---------------------------------------

bq. Thanks for the work here. I noticed that you reverted it from 3.0.2, but per your comment above, it's in branch-3.0.1.

[~yzhangal], It was reverted from branch-3.0.1 as well. Sorry about the confusion.

> Timed out tasks can fail to produce thread dump
> -----------------------------------------------
>
>                 Key: MAPREDUCE-7053
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7053
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>             Fix For: 3.1.0, 2.10.0, 2.9.1, 2.8.4, 3.0.3
>         Attachments: MAPREDUCE-7053-branch-2.001.patch, MAPREDUCE-7053.001.patch
>
> TestMRJobs#testThreadDumpOnTaskTimeout has been failing sporadically recently. When the AM times out a task it immediately removes it from the list of known tasks and then connects to the NM to request a thread dump followed by a kill. If the task heartbeats in after the task has been removed from the list of known tasks but before the thread dump signal arrives then the task can exit with a "org.apache.hadoop.mapred.Task: Parent died." message and no thread dump.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump
[ https://issues.apache.org/jira/browse/MAPREDUCE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464977#comment-16464977 ]

Yongjun Zhang commented on MAPREDUCE-7053:
------------------------------------------

Hi [~eepayne],

Thanks for the work here. I noticed that you reverted it from 3.0.2, but per your comment above, it's in branch-3.0.1.

{code:java}
commit 6b23e5dc24f92a1bebdd0af7877b33ee452c1842
Author: Eric Payne
Date:   Fri Feb 16 09:11:14 2018 -0600

    Revert "MAPREDUCE-7053: Timed out tasks can fail to produce thread dump. Contributed by Jason Lowe."

    This reverts commit a881a89f02434d8ada2ea6784cfd90de67fcc7bd.

commit a881a89f02434d8ada2ea6784cfd90de67fcc7bd
Author: Eric Payne
Date:   Fri Feb 16 08:15:09 2018 -0600

    MAPREDUCE-7053: Timed out tasks can fail to produce thread dump. Contributed by Jason Lowe.

    (cherry picked from commit 82f029f7b50679ea477a3a898e4ee400fa394adf)
{code}

Was there a concern that led to removing it from 3.0.2?

Thanks [~eddyxu] for changing the Fixed Version to 3.0.3.

> Timed out tasks can fail to produce thread dump
> -----------------------------------------------
>
>                 Key: MAPREDUCE-7053
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7053
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>             Fix For: 3.1.0, 2.10.0, 2.9.1, 2.8.4, 3.0.3
>         Attachments: MAPREDUCE-7053-branch-2.001.patch, MAPREDUCE-7053.001.patch
[jira] [Commented] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump
[ https://issues.apache.org/jira/browse/MAPREDUCE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367455#comment-16367455 ]

Eric Payne commented on MAPREDUCE-7053:
---------------------------------------

Thanks [~jlowe]. I committed MAPREDUCE-7053.001.patch to trunk, and cherry-picked to branch-3.1, branch-3.0, and branch-3.0.1. I committed MAPREDUCE-7053-branch-2.001.patch to branch-2, branch-2.9, and branch-2.8.

> Timed out tasks can fail to produce thread dump
> -----------------------------------------------
>
>                 Key: MAPREDUCE-7053
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7053
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>         Attachments: MAPREDUCE-7053-branch-2.001.patch, MAPREDUCE-7053.001.patch
[jira] [Commented] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump
[ https://issues.apache.org/jira/browse/MAPREDUCE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367387#comment-16367387 ]

Hudson commented on MAPREDUCE-7053:
-----------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13670 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13670/])
MAPREDUCE-7053: Timed out tasks can fail to produce thread dump. (epayne: rev 82f029f7b50679ea477a3a898e4ee400fa394adf)
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/TaskHeartbeatHandler.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestTaskHeartbeatHandler.java

> Timed out tasks can fail to produce thread dump
> -----------------------------------------------
>
>                 Key: MAPREDUCE-7053
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7053
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>         Attachments: MAPREDUCE-7053-branch-2.001.patch, MAPREDUCE-7053.001.patch
[jira] [Commented] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump
[ https://issues.apache.org/jira/browse/MAPREDUCE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16366430#comment-16366430 ]

Hadoop QA commented on MAPREDUCE-7053:
--------------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 20m 31s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || branch-2 Compile Tests ||
| +1 | mvninstall | 13m 13s | branch-2 passed |
| +1 | compile | 0m 29s | branch-2 passed |
| +1 | checkstyle | 0m 20s | branch-2 passed |
| +1 | mvnsite | 0m 37s | branch-2 passed |
| +1 | findbugs | 0m 51s | branch-2 passed |
| +1 | javadoc | 0m 23s | branch-2 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 30s | the patch passed |
| +1 | compile | 0m 26s | the patch passed |
| +1 | javac | 0m 26s | the patch passed |
| -0 | checkstyle | 0m 14s | hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: The patch generated 1 new + 26 unchanged - 1 fixed = 27 total (was 27) |
| +1 | mvnsite | 0m 30s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 58s | the patch passed |
| +1 | javadoc | 0m 18s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 8m 58s | hadoop-mapreduce-client-app in the patch passed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 49m 49s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 |
| JIRA Issue | MAPREDUCE-7053 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910809/MAPREDUCE-7053-branch-2.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 8b2acb9175d7 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / fe044e6 |
| maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) |
| Default Java | 1.7.0_151 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7344/artifact/out/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7344/testReport/ |
| Max. process+thread count | 481 (vs. ulimit of 5500) |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7344/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatical
[jira] [Commented] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump
[ https://issues.apache.org/jira/browse/MAPREDUCE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16366266#comment-16366266 ]

Jason Lowe commented on MAPREDUCE-7053:
---------------------------------------

Thanks for the reviews! Here's the equivalent patch for branch-2. There needs to be a separate one for branch-2.7 as well.

> Timed out tasks can fail to produce thread dump
> -----------------------------------------------
>
>                 Key: MAPREDUCE-7053
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7053
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4, 2.7.6
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>         Attachments: MAPREDUCE-7053-branch-2.001.patch, MAPREDUCE-7053.001.patch
[jira] [Commented] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump
[ https://issues.apache.org/jira/browse/MAPREDUCE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16366263#comment-16366263 ]

Eric Payne commented on MAPREDUCE-7053:
---------------------------------------

Thanks [~jlowe] for fixing this problem, and thanks [~pbacsko] for the review.

+1. The patch LGTM.

> Timed out tasks can fail to produce thread dump
> -----------------------------------------------
>
>                 Key: MAPREDUCE-7053
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7053
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4, 2.7.6
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>         Attachments: MAPREDUCE-7053.001.patch
[jira] [Commented] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump
[ https://issues.apache.org/jira/browse/MAPREDUCE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365521#comment-16365521 ]

Peter Bacsko commented on MAPREDUCE-7053:
-----------------------------------------

Patch looks good to me.

+1 (non-binding)

> Timed out tasks can fail to produce thread dump
> -----------------------------------------------
>
>                 Key: MAPREDUCE-7053
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7053
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4, 2.7.6
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>         Attachments: MAPREDUCE-7053.001.patch
[jira] [Commented] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump
[ https://issues.apache.org/jira/browse/MAPREDUCE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364921#comment-16364921 ]

Hadoop QA commented on MAPREDUCE-7053:
--------------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 43s | trunk passed |
| +1 | compile | 0m 27s | trunk passed |
| +1 | checkstyle | 0m 18s | trunk passed |
| +1 | mvnsite | 0m 30s | trunk passed |
| +1 | shadedclient | 10m 12s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 38s | trunk passed |
| +1 | javadoc | 0m 20s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 28s | the patch passed |
| +1 | compile | 0m 24s | the patch passed |
| +1 | javac | 0m 24s | the patch passed |
| -0 | checkstyle | 0m 14s | hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: The patch generated 1 new + 31 unchanged - 1 fixed = 32 total (was 32) |
| +1 | mvnsite | 0m 26s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 34s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 43s | the patch passed |
| +1 | javadoc | 0m 18s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 9m 18s | hadoop-mapreduce-client-app in the patch passed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 51m 19s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | MAPREDUCE-7053 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910638/MAPREDUCE-7053.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux be8f6b7bef7d 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8f66aff |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7341/artifact/out/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7341/testReport/ |
| Max. process+thread count | 597 (vs. ulimit of 5500) |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-
[jira] [Commented] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump
[ https://issues.apache.org/jira/browse/MAPREDUCE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364628#comment-16364628 ]

Peter Bacsko commented on MAPREDUCE-7053:
-----------------------------------------

[~jlowe] is this yet another {{System.exit()}}?! Wow, wondering how many times we have to touch the code to fix this completely :)

> Timed out tasks can fail to produce thread dump
> -----------------------------------------------
>
>                 Key: MAPREDUCE-7053
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7053
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4, 2.7.6
>            Reporter: Jason Lowe
>            Priority: Major
[jira] [Commented] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump
[ https://issues.apache.org/jira/browse/MAPREDUCE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364487#comment-16364487 ]

Jason Lowe commented on MAPREDUCE-7053:
---------------------------------------

The easiest "fix" for this issue is to have the AM ignore tasks that are unknown as it did before, although that could cause unknown tasks to linger on the cluster far longer than they should if somehow a task were to "escape."

> Timed out tasks can fail to produce thread dump
> -----------------------------------------------
>
>                 Key: MAPREDUCE-7053
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7053
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4, 2.7.6
>            Reporter: Jason Lowe
>            Priority: Major
[jira] [Commented] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump
[ https://issues.apache.org/jira/browse/MAPREDUCE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364484#comment-16364484 ]

Jason Lowe commented on MAPREDUCE-7053:
---------------------------------------

This is triggered by MAPREDUCE-5124. Before MAPREDUCE-5124 tasks that were unknown were not rejected. After MAPREDUCE-5124 the AM started proactively rejecting unknown tasks and exposed this latent issue.

> Timed out tasks can fail to produce thread dump
> -----------------------------------------------
>
>                 Key: MAPREDUCE-7053
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7053
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4, 2.7.6
>            Reporter: Jason Lowe
>            Priority: Major
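
The ordering described in this issue (task timed out and removed from the known list, late heartbeat arrives, thread dump signal arrives afterwards) can be replayed deterministically in a few lines. This is only an illustrative sketch and not the Hadoop code or the committed patch: the class HeartbeatRaceSketch, the Policy enum and all method names are hypothetical, and a plain set stands in for the AM's list of known tasks. IGNORE_UNKNOWN models the behavior before MAPREDUCE-5124 and REJECT_UNKNOWN the behavior after it.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of how an AM answers task heartbeats; not the Hadoop implementation. */
final class HeartbeatRaceSketch {

    /** How the AM treats a heartbeat from a task it does not know about. */
    enum Policy { IGNORE_UNKNOWN, REJECT_UNKNOWN }

    // Stand-in for the AM's list of known tasks (assumption: a simple concurrent set).
    private final Set<String> knownTasks = ConcurrentHashMap.newKeySet();
    private final Policy policy;

    HeartbeatRaceSketch(Policy policy) {
        this.policy = policy;
    }

    void register(String taskId) {
        knownTasks.add(taskId);
    }

    /** Timing out a task removes it from the known list immediately. */
    void timeOut(String taskId) {
        knownTasks.remove(taskId);
    }

    /** Returns true if the task may keep running, false if it is told to exit. */
    boolean heartbeat(String taskId) {
        if (knownTasks.contains(taskId)) {
            return true;
        }
        // Unknown task: the outcome depends on the policy in force.
        return policy == Policy.IGNORE_UNKNOWN;
    }

    /**
     * Replays register -> time out -> late heartbeat and reports whether the
     * task is still alive when the thread dump signal would arrive.
     */
    static boolean taskAliveForThreadDump(Policy policy) {
        HeartbeatRaceSketch am = new HeartbeatRaceSketch(policy);
        am.register("attempt_1");
        am.timeOut("attempt_1");
        return am.heartbeat("attempt_1");
    }

    public static void main(String[] args) {
        for (Policy p : Policy.values()) {
            System.out.println(p + ": task alive for thread dump = " + taskAliveForThreadDump(p));
        }
    }
}
```

Under REJECT_UNKNOWN the late heartbeat tells the task to exit before the dump can be taken, which is the failure reported here. Under IGNORE_UNKNOWN the task survives long enough for the dump, at the cost noted above that a task which has genuinely "escaped" is never told to exit.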