[jira] [Work started] (MAPREDUCE-5685) getCacheFiles() api doesn't work in WrappedReducer.java due to typo
[ https://issues.apache.org/jira/browse/MAPREDUCE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on MAPREDUCE-5685 started by Yi Song.

getCacheFiles() api doesn't work in WrappedReducer.java due to typo

Key: MAPREDUCE-5685
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5685
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client
Affects Versions: 2.2.0
Reporter: Yi Song
Assignee: Yi Song
Priority: Blocker
Attachments: MAPREDUCE-5685.patch

A typo in WrappedReducer.java makes getCacheFiles() delegate to getCacheArchives(), so the function returns null.

Java file: hadoop-common/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/reduce/WrappedReducer.java, line 140.

Erroneous code:
{code}
return reduceContext.getCacheArchives();
{code}
Should be:
{code}
return reduceContext.getCacheFiles();
{code}

--
This message was sent by Atlassian JIRA (v6.1.4#6159)
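The effect of the one-word typo reported in MAPREDUCE-5685 can be reproduced in isolation. The following is a minimal sketch, not the Hadoop source: CacheContext, FilesOnlyContext, BuggyWrapper, and FixedWrapper are hypothetical stand-ins for the wrapped reduce context and its wrapper, and the context is assumed to have cache files registered but no cache archives.

```java
import java.net.URI;
import java.util.Arrays;

// Hypothetical stand-in for the wrapped context; only the two relevant getters.
interface CacheContext {
    URI[] getCacheFiles();
    URI[] getCacheArchives();
}

// A context with cache files registered but no cache archives (assumption).
class FilesOnlyContext implements CacheContext {
    private final URI[] files;

    FilesOnlyContext(URI... files) {
        this.files = files;
    }

    @Override
    public URI[] getCacheFiles() {
        return files;
    }

    @Override
    public URI[] getCacheArchives() {
        return null; // nothing registered
    }
}

// Wrapper with the reported typo: the files getter forwards to the archives getter.
class BuggyWrapper implements CacheContext {
    private final CacheContext delegate;

    BuggyWrapper(CacheContext delegate) {
        this.delegate = delegate;
    }

    @Override
    public URI[] getCacheFiles() {
        return delegate.getCacheArchives(); // typo: wrong delegate method
    }

    @Override
    public URI[] getCacheArchives() {
        return delegate.getCacheArchives();
    }
}

// Wrapper with the one-line correction proposed in the report.
class FixedWrapper implements CacheContext {
    private final CacheContext delegate;

    FixedWrapper(CacheContext delegate) {
        this.delegate = delegate;
    }

    @Override
    public URI[] getCacheFiles() {
        return delegate.getCacheFiles();
    }

    @Override
    public URI[] getCacheArchives() {
        return delegate.getCacheArchives();
    }
}

class CacheFilesDemo {
    public static void main(String[] args) {
        CacheContext ctx = new FilesOnlyContext(URI.create("hdfs:///tmp/lookup.txt"));
        System.out.println("buggy: " + Arrays.toString(new BuggyWrapper(ctx).getCacheFiles()));
        System.out.println("fixed: " + Arrays.toString(new FixedWrapper(ctx).getCacheFiles()));
    }
}
```

Because both getters share the same return type, the compiler cannot catch this kind of mix-up; a test that registers files without archives is what exposes it.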
[jira] [Commented] (MAPREDUCE-5686) Found Class org.apache.hadoop.mapreduce.TaskAttemptContext,but interface was excepted
[ https://issues.apache.org/jira/browse/MAPREDUCE-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850312#comment-13850312 ]

Steve Loughran commented on MAPREDUCE-5686:
---

Have you searched for this error string online to see if others have found it before?

Found Class org.apache.hadoop.mapreduce.TaskAttemptContext,but interface was excepted

Key: MAPREDUCE-5686
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5686
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: ranjini

Hi, I am using Hadoop version 0.20. Please suggest how to fix the bug. Thanks in advance. Ranjini
[jira] [Updated] (MAPREDUCE-5655) Remote job submit from windows to a linux hadoop cluster fails due to wrong classpath
[ https://issues.apache.org/jira/browse/MAPREDUCE-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Pados updated MAPREDUCE-5655:

Description:
I was trying to run a Java class on my client, a Windows 7 developer environment, which submits a job to the remote Hadoop cluster, initiates a MapReduce run there, and then downloads the results back to the local machine.

The general use case is to use Hadoop services from a web application installed on a non-cluster computer, or as part of a developer environment.

The problem was that the ApplicationMaster's startup shell script (launch_container.sh) was generated with a wrong CLASSPATH entry. Together with the java process call at the bottom of the file, these entries were generated in Windows style, using % as the shell variable marker and ; as the CLASSPATH delimiter.

I tracked down the root cause and found that the MrApps.java and YarnRunner.java classes create these entries, which are then passed on to the ApplicationMaster, assuming that the OS running these classes matches the one running the ApplicationMaster. That is not the case: the two run in different JVMs, the OS can differ as well, and the strings are generated based on the client/submitter side's OS.

I made some workaround changes to these two files so that I could launch my job; however, there may be more problems ahead.

Update, error message:

13/12/04 16:33:15 INFO mapreduce.Job: Job job_1386170530016_0001 failed with state FAILED due to: Application application_1386170530016_0001 failed 2 times due to AM Container for appattempt_1386170530016_0001_02 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job control
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

Update 2: It is also required to add the following property to mapred-site.xml (or mapred-default.xml) on the Windows box, so that the job launcher knows that the job runner will be Linux:

    <property>
      <name>mapred.remote.os</name>
      <value>Linux</value>
      <description>Remote MapReduce framework's OS, can be either Linux or Windows</description>
    </property>

Without this entry, the patched jar behaves the same as the unpatched one, so the entry is required for the workaround to take effect.
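The root cause described in MAPREDUCE-5655 is that the launch strings are built with the submitting client's conventions rather than those of the OS that will run the container. The following is a minimal sketch of the alternative, assuming the target OS is known at submission time (for example from a setting such as the reporter's mapred.remote.os). TargetOs, LaunchClasspath, and the job.jar entry are hypothetical names for illustration, not Hadoop classes or the attached patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical description of the OS that will run the container.
enum TargetOs {
    LINUX(":", "$", ""),
    WINDOWS(";", "%", "%");

    final String pathSeparator;
    private final String varPrefix;
    private final String varSuffix;

    TargetOs(String pathSeparator, String varPrefix, String varSuffix) {
        this.pathSeparator = pathSeparator;
        this.varPrefix = varPrefix;
        this.varSuffix = varSuffix;
    }

    // Renders an environment variable reference in the target shell's syntax.
    String envRef(String name) {
        return varPrefix + name + varSuffix;
    }
}

class LaunchClasspath {
    // Joins entries with the target OS separator, not File.pathSeparator of the client JVM.
    static String build(TargetOs os, List<String> entries) {
        return String.join(os.pathSeparator, entries);
    }

    // Builds an illustrative two-entry classpath for the given target OS.
    static String forJob(TargetOs os) {
        return build(os, Arrays.asList(os.envRef("HADOOP_CONF_DIR"), "job.jar"));
    }
}

class LaunchClasspathDemo {
    public static void main(String[] args) {
        System.out.println(LaunchClasspath.forJob(TargetOs.LINUX));
        System.out.println(LaunchClasspath.forJob(TargetOs.WINDOWS));
    }
}
```

The point of the sketch is that the separator and variable syntax are parameters of the target, so a Windows client submitting to a Linux cluster produces a script that bash can parse.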
[jira] [Commented] (MAPREDUCE-5623) TestJobCleanup fails because of RejectedExecutionException and NPE.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850354#comment-13850354 ]

Hudson commented on MAPREDUCE-5623:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #424 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/424/])
MAPREDUCE-5623. TestJobCleanup fails because of RejectedExecutionException and NPE. Contributed by Jason Lowe (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1551285)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestJobCleanup.java

TestJobCleanup fails because of RejectedExecutionException and NPE.

Key: MAPREDUCE-5623
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5623
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Tsuyoshi OZAWA
Assignee: Jason Lowe
Fix For: trunk, 2.4.0, 0.23.11
Attachments: MAPREDUCE-5623.1.patch, MAPREDUCE-5623.2.patch, MAPREDUCE-5623.3.patch

org.apache.hadoop.mapred.TestJobCleanup can fail because of a RejectedExecutionException thrown by NonAggregatingLogHandler. This problem is described in YARN-1409. TestJobCleanup can still fail after the RejectedExecutionException is fixed, because of an NPE caused by Job#getCounters() returning null.
{code}
---
Test set: org.apache.hadoop.mapred.TestJobCleanup
---
Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 140.933 sec <<< FAILURE! - in org.apache.hadoop.mapred.TestJobCleanup
testCustomAbort(org.apache.hadoop.mapred.TestJobCleanup)  Time elapsed: 31.068 sec  <<< ERROR!
java.lang.NullPointerException: null
    at org.apache.hadoop.mapred.TestJobCleanup.testFailedJob(TestJobCleanup.java:199)
    at org.apache.hadoop.mapred.TestJobCleanup.testCustomAbort(TestJobCleanup.java:296)
{code}
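The second failure mode in MAPREDUCE-5623 is a caller dereferencing counters that come back as null. The following is a minimal sketch of a bounded wait that fails with a descriptive error instead of an NPE. It is not the committed fix: CounterFetch, awaitCounters, and the use of a plain Map in place of Hadoop's Counters type are assumptions made so the example is self-contained.

```java
import java.util.Map;
import java.util.function.Supplier;

class CounterFetch {
    // Polls the source up to maxAttempts times, sleeping between attempts,
    // and throws a descriptive exception if the counters never become available.
    static Map<String, Long> awaitCounters(Supplier<Map<String, Long>> source,
                                           int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Map<String, Long> counters = source.get();
            if (counters != null) {
                return counters;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and stop waiting.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting for counters", e);
                }
            }
        }
        throw new IllegalStateException(
                "counters were still null after " + maxAttempts + " attempt(s)");
    }
}
```

In a test, a call such as CounterFetch.awaitCounters(() -> lookup(), 10, 500L), where lookup() is whatever returns the possibly-null counters, turns an anonymous NullPointerException into a failure message that names the missing data.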
[jira] [Commented] (MAPREDUCE-5623) TestJobCleanup fails because of RejectedExecutionException and NPE.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850369#comment-13850369 ]

Hudson commented on MAPREDUCE-5623:
---

FAILURE: Integrated in Hadoop-Hdfs-0.23-Build #823 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/823/])
svn merge -c 1551285 FIXES: MAPREDUCE-5623. TestJobCleanup fails because of RejectedExecutionException and NPE. Contributed by Jason Lowe (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1551290)
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestJobCleanup.java
[jira] [Commented] (MAPREDUCE-5623) TestJobCleanup fails because of RejectedExecutionException and NPE.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850450#comment-13850450 ]

Hudson commented on MAPREDUCE-5623:
---

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1615 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1615/])
MAPREDUCE-5623. TestJobCleanup fails because of RejectedExecutionException and NPE. Contributed by Jason Lowe (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1551285)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestJobCleanup.java
[jira] [Resolved] (MAPREDUCE-5686) Found Class org.apache.hadoop.mapreduce.TaskAttemptContext,but interface was excepted
[ https://issues.apache.org/jira/browse/MAPREDUCE-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe resolved MAPREDUCE-5686.
---
Resolution: Invalid

As pointed out on the other JIRAs you have filed, _please_ ask these kinds of questions on the [user@ mailing list|http://hadoop.apache.org/mailing_lists.html#User]. JIRA is for Hadoop developers to track bugs and is not a channel for general user support. That is why the user@ mailing list exists.

The problem here is the same type of problem you reported in MAPREDUCE-5666 and MAPREDUCE-5668. It looks like you are compiling against a later version of Hadoop than you are running on, and that is not supported.
[jira] [Commented] (MAPREDUCE-5623) TestJobCleanup fails because of RejectedExecutionException and NPE.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850538#comment-13850538 ]

Hudson commented on MAPREDUCE-5623:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1641 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1641/])
MAPREDUCE-5623. TestJobCleanup fails because of RejectedExecutionException and NPE. Contributed by Jason Lowe (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1551285)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestJobCleanup.java
[jira] [Commented] (MAPREDUCE-5679) TestJobHistoryParsing has race condition
[ https://issues.apache.org/jira/browse/MAPREDUCE-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850648#comment-13850648 ]

Jason Lowe commented on MAPREDUCE-5679:
---

+1 to the latest patch. Committing this.

TestJobHistoryParsing has race condition

Key: MAPREDUCE-5679
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5679
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Liyin Liang
Assignee: Liyin Liang
Attachments: MAPREDUCE-5679-2.diff, MAPREDUCE-5679-3.diff, MAPREDUCE-5679.diff

org.apache.hadoop.mapreduce.v2.hs.TestJobHistoryParsing can fail because of a race condition.
{noformat}
testHistoryParsingWithParseErrors(org.apache.hadoop.mapreduce.v2.hs.TestJobHistoryParsing)  Time elapsed: 4.102 sec  <<< ERROR!
java.io.IOException: Unable to initialize History Viewer
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:520)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:137)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:798)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.<init>(JobHistoryParser.java:86)
    at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.<init>(HistoryViewer.java:85)
    at org.apache.hadoop.mapreduce.v2.hs.TestJobHistoryParsing.checkHistoryParsing(TestJobHistoryParsing.java:339)
    at org.apache.hadoop.mapreduce.v2.hs.TestJobHistoryParsing.testHistoryParsingWithParseErrors(TestJobHistoryParsing.java:125)
{noformat}
In the checkHistoryParsing() function, after
{code}
HistoryFileInfo fileInfo = jobHistory.getJobFileInfo(jobId);
{code}
a thread named MoveIntermediateToDone is launched to move the history file from the done_intermediate directory to the done directory. If the history file has already been moved,
{code}
HistoryViewer viewer = new HistoryViewer(fc.makeQualified(
    fileInfo.getHistoryFile()).toString(), conf, true);
{code}
throws an IOException, because the history file is no longer found at the old location.
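The race in MAPREDUCE-5679 is a read-after-lookup problem: a path is resolved, a background thread moves the file, and the later open fails. The following is a minimal sketch of one way to tolerate such a move, using java.nio.file rather than Hadoop's FileSystem API. It is not the committed fix, and MovedFileReader and both path parameters are hypothetical names for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

class MovedFileReader {
    // Reads from the intermediate location first; if a concurrent mover has
    // already relocated the file, falls back to the final location.
    static String read(Path intermediate, Path done) {
        try {
            try {
                return new String(Files.readAllBytes(intermediate), StandardCharsets.UTF_8);
            } catch (NoSuchFileException movedAway) {
                return new String(Files.readAllBytes(done), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Attempting the read and handling the failure avoids a separate existence check, which would only reopen the same window between check and use. A test can instead wait for the move to finish before opening the file, which removes the race rather than tolerating it.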
[jira] [Updated] (MAPREDUCE-5679) TestJobHistoryParsing has race condition
[ https://issues.apache.org/jira/browse/MAPREDUCE-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated MAPREDUCE-5679:
--
Resolution: Fixed
Fix Version/s: 2.4.0, 3.0.0
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Thanks, Liyin! I committed this to trunk and branch-2.
[jira] [Commented] (MAPREDUCE-5679) TestJobHistoryParsing has race condition
[ https://issues.apache.org/jira/browse/MAPREDUCE-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850662#comment-13850662 ]

Hudson commented on MAPREDUCE-5679:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #4898 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4898/])
MAPREDUCE-5679. TestJobHistoryParsing has race condition. Contributed by Liyin Liang (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1551616)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestJobHistoryParsing.java
[jira] [Commented] (MAPREDUCE-4443) MR AM and job history server should be resilient to jobs that exceed counter limits
[ https://issues.apache.org/jira/browse/MAPREDUCE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850769#comment-13850769 ]

Rahul Jain commented on MAPREDUCE-4443:
---

Agreed, we should get this patch in, and longer term we should remove the counter limits or at least use much higher limit values with YARN.

MR AM and job history server should be resilient to jobs that exceed counter limits

Key: MAPREDUCE-4443
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4443
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Rahul Jain
Assignee: Mayank Bansal
Labels: usability
Attachments: MAPREDUCE-4443-trunk-1.patch, MAPREDUCE-4443-trunk-2.patch, MAPREDUCE-4443-trunk-3.patch, MAPREDUCE-4443-trunk-draft.patch, am_failed_counter_limits.txt

We saw this problem while migrating applications to MapReduceV2:

Our applications use Hadoop counters extensively (1000+ counters for certain jobs). While this may not be one of the recommended best practices in Hadoop, the real issue here is the reliability of the framework when applications exceed counter limits.

The Hadoop servers (YARN, history server) were originally brought up with mapreduce.job.counters.max=1000 in core-site.xml.

We then ran a map-reduce job under an application using its own job-specific overrides, with mapreduce.job.counters.max=1

All the tasks for the job finished successfully; however, the overall job still failed because the AM encountered exceptions such as:
{code}
2012-07-12 17:31:43,485 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks : 71
2012-07-12 17:31:43,502 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 1001 max=1000
    at org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:58)
    at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:65)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounter(AbstractCounterGroup.java:77)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:94)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:105)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.incrAllCounters(AbstractCounterGroup.java:202)
    at org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:337)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.constructFinalFullcounters(JobImpl.java:1212)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.mayBeConstructFinalFullCounters(JobImpl.java:1198)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.createJobFinishedEvent(JobImpl.java:1179)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.logJobHistoryFinishedEvent(JobImpl.java:711)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.checkJobCompleteSuccess(JobImpl.java:737)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.checkJobForCompletion(JobImpl.java:1360)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.transition(JobImpl.java:1340)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.transition(JobImpl.java:1323)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:380)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:666)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:113)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:890)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:886)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:125)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:74)
    at java.lang.Thread.run(Thread.java:662)
2012-07-12 17:31:43,502 INFO [AsyncDispatcher event handler]
{code}
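The resilience requested in MAPREDUCE-4443 amounts to not letting an exceeded counter limit abort the component that aggregates the counters. The following is a minimal sketch of that idea, assuming a simple name-to-value store. BoundedCounters and its methods are hypothetical and are not Hadoop's Limits or AbstractCounterGroup classes, nor the attached patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedCounters {
    private final int max;
    private final Map<String, Long> values = new LinkedHashMap<>();
    private int rejected;

    BoundedCounters(int max) {
        this.max = max;
    }

    // Adds delta to the named counter. A new counter beyond the limit is
    // rejected and tallied instead of raising an exception in the caller.
    boolean increment(String name, long delta) {
        if (!values.containsKey(name) && values.size() >= max) {
            rejected++;
            return false;
        }
        values.merge(name, delta, Long::sum);
        return true;
    }

    // Returns the current value, or 0 for an unknown or rejected counter.
    long get(String name) {
        return values.getOrDefault(name, 0L);
    }

    // Number of distinct counters currently stored.
    int size() {
        return values.size();
    }

    // Number of increments dropped because the limit was reached.
    int rejected() {
        return rejected;
    }
}
```

With this shape, a job that defines too many counters finishes with a truncated counter set and a visible rejection count, rather than failing after all of its tasks have succeeded.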
[jira] [Commented] (MAPREDUCE-4443) MR AM and job history server should be resilient to jobs that exceed counter limits
[ https://issues.apache.org/jira/browse/MAPREDUCE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850746#comment-13850746 ]

Hadoop QA commented on MAPREDUCE-4443:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12579168/MAPREDUCE-4443-trunk-3.patch against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4263//console

This message is automatically generated.

MR AM and job history server should be resilient to jobs that exceed counter limits

Key: MAPREDUCE-4443
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4443
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Rahul Jain
Assignee: Mayank Bansal
Labels: usability
Attachments: MAPREDUCE-4443-trunk-1.patch, MAPREDUCE-4443-trunk-2.patch, MAPREDUCE-4443-trunk-3.patch, MAPREDUCE-4443-trunk-draft.patch, am_failed_counter_limits.txt

We saw this problem migrating applications to MapReduceV2: Our applications use hadoop counters extensively (1000+ counters for certain jobs). While this may not be one of recommended best practices in hadoop, the real issue here is reliability of the framework when applications exceed counter limits.
The hadoop servers (yarn, history server) were originally brought up with mapreduce.job.counters.max=1000 under core-site.xml. We then ran a map-reduce job under an application using its own job-specific overrides, with mapreduce.job.counters.max=1. All the tasks for the job finished successfully; however, the overall job still failed because the AM encountered exceptions such as:
{code}
2012-07-12 17:31:43,485 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks : 71
2012-07-12 17:31:43,502 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 1001 max=1000
    at org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:58)
    at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:65)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounter(AbstractCounterGroup.java:77)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:94)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:105)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.incrAllCounters(AbstractCounterGroup.java:202)
    at org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:337)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.constructFinalFullcounters(JobImpl.java:1212)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.mayBeConstructFinalFullCounters(JobImpl.java:1198)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.createJobFinishedEvent(JobImpl.java:1179)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.logJobHistoryFinishedEvent(JobImpl.java:711)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.checkJobCompleteSuccess(JobImpl.java:737)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.checkJobForCompletion(JobImpl.java:1360)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.transition(JobImpl.java:1340)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.transition(JobImpl.java:1323)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:380)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:666)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:113)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:890)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:886)
    at
[jira] [Commented] (MAPREDUCE-4443) MR AM and job history server should be resilient to jobs that exceed counter limits
[ https://issues.apache.org/jira/browse/MAPREDUCE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850744#comment-13850744 ]

Ashutosh Chauhan commented on MAPREDUCE-4443:

It seems this has been in Patch Available state for a long time. It looks useful to me. Let's get this in. Also, I have raised MAPREDUCE-5680, which is larger in scope than this.

MR AM and job history server should be resilient to jobs that exceed counter limits

Key: MAPREDUCE-4443
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4443
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Rahul Jain
Assignee: Mayank Bansal
Labels: usability
Attachments: MAPREDUCE-4443-trunk-1.patch, MAPREDUCE-4443-trunk-2.patch, MAPREDUCE-4443-trunk-3.patch, MAPREDUCE-4443-trunk-draft.patch, am_failed_counter_limits.txt

We saw this problem migrating applications to MapReduceV2: Our applications use hadoop counters extensively (1000+ counters for certain jobs). While this may not be one of recommended best practices in hadoop, the real issue here is reliability of the framework when applications exceed counter limits.
The hadoop servers (yarn, history server) were originally brought up with mapreduce.job.counters.max=1000 under core-site.xml. We then ran a map-reduce job under an application using its own job-specific overrides, with mapreduce.job.counters.max=1. All the tasks for the job finished successfully; however, the overall job still failed because the AM encountered exceptions such as:
{code}
2012-07-12 17:31:43,485 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks : 71
2012-07-12 17:31:43,502 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 1001 max=1000
    at org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:58)
    at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:65)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounter(AbstractCounterGroup.java:77)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:94)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:105)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.incrAllCounters(AbstractCounterGroup.java:202)
    at org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:337)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.constructFinalFullcounters(JobImpl.java:1212)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.mayBeConstructFinalFullCounters(JobImpl.java:1198)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.createJobFinishedEvent(JobImpl.java:1179)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.logJobHistoryFinishedEvent(JobImpl.java:711)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.checkJobCompleteSuccess(JobImpl.java:737)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.checkJobForCompletion(JobImpl.java:1360)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.transition(JobImpl.java:1340)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.transition(JobImpl.java:1323)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:380)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:666)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:113)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:890)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:886)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:125)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:74)
    at java.lang.Thread.run(Thread.java:662)
2012-07-12 17:31:43,502 INFO [AsyncDispatcher
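For readers following MAPREDUCE-4443, the behaviour being requested (the job-level aggregation should survive a counter overflow instead of killing the dispatcher thread) can be illustrated with a minimal sketch. This is not the attached patch and it does not use Hadoop's classes: `BoundedCounters` and all of its methods are hypothetical names invented for illustration, and the only assumption taken from the report is that a limit on the number of distinct counters exists.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a counter store with a hard limit; not a Hadoop class. */
final class BoundedCounters {
    private final int maxCounters;
    private final Map<String, Long> values = new LinkedHashMap<>();
    private long droppedIncrements = 0;

    BoundedCounters(int maxCounters) {
        this.maxCounters = maxCounters;
    }

    /**
     * Adds delta to the named counter. An increment for a new name that would
     * exceed the limit is counted as dropped instead of raising an exception.
     * Returns true if the increment was applied.
     */
    boolean increment(String name, long delta) {
        if (!values.containsKey(name) && values.size() >= maxCounters) {
            droppedIncrements++;
            return false;
        }
        values.merge(name, delta, Long::sum);
        return true;
    }

    /** Merges one task's counters into this job-level aggregate. */
    void incrementAll(Map<String, Long> taskCounters) {
        for (Map.Entry<String, Long> e : taskCounters.entrySet()) {
            increment(e.getKey(), e.getValue());
        }
    }

    long get(String name) {
        return values.getOrDefault(name, 0L);
    }

    int size() {
        return values.size();
    }

    /** Number of increments that were skipped because of the limit. */
    long droppedCount() {
        return droppedIncrements;
    }
}
```

The design choice in this sketch is to record how many increments were skipped, so a caller could surface a warning in the job history rather than fail a job whose tasks all succeeded. Whether the real fix truncates, warns, or raises the limit is not stated in this thread.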
[jira] [Updated] (MAPREDUCE-5550) Task Status message (reporter.setStatus) not shown in UI with Hadoop 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-5550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated MAPREDUCE-5550:

Description: The Hadoop 1.0 JobTracker UI displays the Job Status message when the mapper or reduce tasks are listed. This gives an idea of how that task is making progress. The Hadoop 2.0 AM/JHS UI does not have this. It would be good to have this on the AM/JHS UI.

(was: The Hadoop 1.0 JobTracker UI displays the Job Status message when the mapper or reduce tasks are listed. This gives an idea of how that task is making progress. The Hadoop 2.0 ResourceManager UI does not have this. It would be good to have this on the RM UI.)

Task Status message (reporter.setStatus) not shown in UI with Hadoop 2.0

Key: MAPREDUCE-5550
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5550
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.0.5-alpha
Reporter: Vrushali C
Assignee: Gera Shegalov
Attachments: MAPREDUCE-5550.v01.patch, MAPREDUCE-5550.v02.patch, Map_tasks_new_UI.png, Map_tasks_oldUI.png, Screen Shot 2013-10-15 at 11.15.24 AM.png, Screen Shot 2013-10-15 at 11.16.02 AM.png

The Hadoop 1.0 JobTracker UI displays the Job Status message when the mapper or reduce tasks are listed. This gives an idea of how that task is making progress. The Hadoop 2.0 AM/JHS UI does not have this. It would be good to have this on the AM/JHS UI.

--
This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (MAPREDUCE-5197) Checkpoint Service: a library component to facilitate checkpoint of task state
[ https://issues.apache.org/jira/browse/MAPREDUCE-5197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850963#comment-13850963 ]

Hudson commented on MAPREDUCE-5197:

SUCCESS: Integrated in Hadoop-trunk-Commit #4903 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4903/])
MAPREDUCE-5197. Add a service for checkpointing task state. Contributed by Carlo Curino (cdouglas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1551726)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/checkpoint
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/checkpoint/CheckpointID.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/checkpoint/CheckpointNamingService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/checkpoint/CheckpointService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/checkpoint/FSCheckpointID.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/checkpoint/FSCheckpointService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/checkpoint/RandomNameCNS.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/checkpoint/SimpleNamingService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/checkpoint
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/checkpoint/TestFSCheckpointID.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/checkpoint/TestFSCheckpointService.java

Checkpoint Service: a library component to facilitate checkpoint of task state

Key: MAPREDUCE-5197
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5197
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
Fix For: 3.0.0
Attachments: MAPREDUCE-5197.1.patch, MAPREDUCE-5197.2.patch, MAPREDUCE-5197.3.patch, MAPREDUCE-5197.patch, MAPREDUCE-5197.patch

A small library that abstracts the file API for the purpose of checkpointing.

--
This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850969#comment-13850969 ]

Gera Shegalov commented on MAPREDUCE-5044:

Our patch does not depend on YARN-445. In the specific scenario of a task timeout there is no need for an extra RPC.

Have AM trigger jstack on task attempts that timeout before killing them

Key: MAPREDUCE-5044
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mr-am
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Assignee: Gera Shegalov
Attachments: MAPREDUCE-5044.v01.patch, Screen Shot 2013-11-12 at 1.05.32 PM.png, Screen Shot 2013-11-12 at 1.06.04 PM.png

When an AM expires a task attempt it would be nice if it triggered a jstack output via SIGQUIT before killing the task attempt. This would be invaluable for helping users debug their hung tasks, especially if they do not have shell access to the nodes.

--
This message was sent by Atlassian JIRA (v6.1.4#6159)
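The sequence described in MAPREDUCE-5044 (SIGQUIT first so the JVM prints a thread dump, then the kill) can be sketched as a small helper that only builds the shell commands. This is an illustration under stated assumptions, not the contents of MAPREDUCE-5044.v01.patch: `TimeoutKillPlan`, its method, and the grace period are hypothetical, and the sketch assumes a POSIX `kill` utility and a HotSpot-style JVM that writes a thread dump to its standard output on SIGQUIT.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; builds, but does not run, the commands for a timed-out task. */
final class TimeoutKillPlan {
    private TimeoutKillPlan() {
    }

    /**
     * Returns the shell commands in execution order:
     * 1. SIGQUIT, asking the JVM for a thread dump,
     * 2. a pause so the dump can be written to the task's stdout log,
     * 3. SIGKILL, terminating the task attempt.
     */
    static List<String> commands(long pid, int graceSeconds) {
        if (pid <= 0) {
            throw new IllegalArgumentException("pid must be positive: " + pid);
        }
        if (graceSeconds < 0) {
            throw new IllegalArgumentException("graceSeconds must not be negative");
        }
        List<String> cmds = new ArrayList<>();
        cmds.add("kill -s QUIT " + pid);
        cmds.add("sleep " + graceSeconds);
        cmds.add("kill -s KILL " + pid);
        return cmds;
    }
}
```

For example, `TimeoutKillPlan.commands(1234L, 2)` yields `kill -s QUIT 1234`, `sleep 2`, `kill -s KILL 1234`. How the signal actually reaches the container (for instance through the NodeManager) is the point under discussion with YARN-445 and is left out here.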
[jira] [Commented] (MAPREDUCE-5648) Allow user-specified diagnostics and speculation diagnostics for killed tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850972#comment-13850972 ]

Sandy Ryza commented on MAPREDUCE-5648:

Can we split this into two JIRAs, one for the speculation change and one for the user-specified reason? I think the former is likely to be far less controversial than the latter.

Allow user-specified diagnostics and speculation diagnostics for killed tasks

Key: MAPREDUCE-5648
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5648
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: client, mr-am, mrv2
Affects Versions: 2.2.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
Attachments: MAPREDUCE-5648.v01.patch, Screen Shot 2013-11-23 at 11.12.15 AM.png

Our users and tools want to be able to supply additional custom diagnostic messages to mapreduce ClientProtocol killTask. We also need to clearly indicate when a task attempt is killed because another task attempt succeeded first when speculative execution is enabled.

--
This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (MAPREDUCE-5189) Basic AM changes to support preemption requests (per YARN-45)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-5189: - Status: Patch Available (was: Open) Basic AM changes to support preemption requests (per YARN-45) - Key: MAPREDUCE-5189 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5189 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Reporter: Carlo Curino Assignee: Carlo Curino Attachments: MAPREDUCE-5189.1.patch, MAPREDUCE-5189.2.patch, MAPREDUCE-5189.3.patch, MAPREDUCE-5189.4.patch, MAPREDUCE-5189.patch, MAPREDUCE-5189.patch This JIRA tracks the minimum amount of changes necessary in the mapreduce AM to receive preemption requests (per YARN-45) and invoke a local policy that manages preemption. (advanced policies and mechanisms will be tracked separately) -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (MAPREDUCE-5650) Job fails when hprof mapreduce.task.profile.map/reduce.params is specified
[ https://issues.apache.org/jira/browse/MAPREDUCE-5650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850985#comment-13850985 ]

Sandy Ryza commented on MAPREDUCE-5650:

If I understand correctly, the issue is that the default params (mapreduce.task.profile.params) are supplied even when more specific params are specified. Was the behavior different in MR1? I want to make sure this is not an incompatible change.

Job fails when hprof mapreduce.task.profile.map/reduce.params is specified

Key: MAPREDUCE-5650
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5650
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 2.2.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
Attachments: MAPREDUCE-5650.v01.patch, MAPREDUCE-5650.v02.patch

When one uses the dedicated hprof settings mapreduce.task.profile.map.params or mapreduce.task.profile.reduce.params, the profiled tasks fail to launch because hprof parameters are supplied to the child JVM twice.

--
This message was sent by Atlassian JIRA (v6.1.4#6159)
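The precedence question raised in MAPREDUCE-5650 can be made concrete with a minimal sketch: pick exactly one params string per task, preferring the task-specific key over the default. The three property names come from this thread; everything else (`ProfileParams`, `select`, using a plain `Map` in place of the job configuration) is a hypothetical illustration, and this is one plausible resolution rather than what the attached patches necessarily do.

```java
import java.util.Map;

/** Hypothetical helper; a plain Map stands in for the job configuration. */
final class ProfileParams {
    static final String DEFAULT_KEY = "mapreduce.task.profile.params";
    static final String MAP_KEY = "mapreduce.task.profile.map.params";
    static final String REDUCE_KEY = "mapreduce.task.profile.reduce.params";

    private ProfileParams() {
    }

    /**
     * Returns a single params string for the child JVM: the task-specific
     * value if it is set and non-empty, otherwise the default value,
     * otherwise the empty string. The two are never concatenated, so the
     * profiler options cannot be passed twice.
     */
    static String select(Map<String, String> conf, boolean isMapTask) {
        String specific = conf.get(isMapTask ? MAP_KEY : REDUCE_KEY);
        if (specific != null && !specific.isEmpty()) {
            return specific;
        }
        String fallback = conf.get(DEFAULT_KEY);
        return fallback == null ? "" : fallback;
    }
}
```

With only the default and the map-specific key set, a map task would get the map-specific value and a reduce task would fall back to the default. Whether MR1 behaved this way is exactly the compatibility question asked in the comment above, and this sketch does not answer it.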
[jira] [Commented] (MAPREDUCE-5189) Basic AM changes to support preemption requests (per YARN-45)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851017#comment-13851017 ] Hadoop QA commented on MAPREDUCE-5189: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12617529/MAPREDUCE-5189.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4264//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4264//console This message is automatically generated. 
Basic AM changes to support preemption requests (per YARN-45) - Key: MAPREDUCE-5189 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5189 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Reporter: Carlo Curino Assignee: Carlo Curino Attachments: MAPREDUCE-5189.1.patch, MAPREDUCE-5189.2.patch, MAPREDUCE-5189.3.patch, MAPREDUCE-5189.4.patch, MAPREDUCE-5189.patch, MAPREDUCE-5189.patch This JIRA tracks the minimum amount of changes necessary in the mapreduce AM to receive preemption requests (per YARN-45) and invoke a local policy that manages preemption. (advanced policies and mechanisms will be tracked separately) -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (MAPREDUCE-5189) Basic AM changes to support preemption requests (per YARN-45)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-5189: - Resolution: Fixed Fix Version/s: 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Basic AM changes to support preemption requests (per YARN-45) - Key: MAPREDUCE-5189 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5189 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Reporter: Carlo Curino Assignee: Carlo Curino Fix For: 3.0.0 Attachments: MAPREDUCE-5189.1.patch, MAPREDUCE-5189.2.patch, MAPREDUCE-5189.3.patch, MAPREDUCE-5189.4.patch, MAPREDUCE-5189.patch, MAPREDUCE-5189.patch This JIRA tracks the minimum amount of changes necessary in the mapreduce AM to receive preemption requests (per YARN-45) and invoke a local policy that manages preemption. (advanced policies and mechanisms will be tracked separately) -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Moved] (MAPREDUCE-5687) TestYARNRunner#testResourceMgrDelegate fails with NPE after YARN-1446
[ https://issues.apache.org/jira/browse/MAPREDUCE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli moved YARN-1511 to MAPREDUCE-5687:

Key: MAPREDUCE-5687 (was: YARN-1511)
Project: Hadoop Map/Reduce (was: Hadoop YARN)

TestYARNRunner#testResourceMgrDelegate fails with NPE after YARN-1446

Key: MAPREDUCE-5687
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5687
Project: Hadoop Map/Reduce
Issue Type: Test
Reporter: Ted Yu
Assignee: Jian He
Attachments: YARN-1511.patch, YARN-1511.patch

On trunk, I got:
{code}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.049 sec FAILURE! - in org.apache.hadoop.mapred.TestYARNRunner
testResourceMgrDelegate(org.apache.hadoop.mapred.TestYARNRunner) Time elapsed: 0.782 sec ERROR!
java.lang.NullPointerException: null
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.killApplication(YarnClientImpl.java:201)
    at org.apache.hadoop.mapred.ResourceMgrDelegate.killApplication(ResourceMgrDelegate.java:284)
    at org.apache.hadoop.mapred.TestYARNRunner.testResourceMgrDelegate(TestYARNRunner.java:212)
{code}

--
This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (MAPREDUCE-5189) Basic AM changes to support preemption requests (per YARN-45)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851048#comment-13851048 ]

Chris Douglas commented on MAPREDUCE-5189:

I committed this. Thanks Carlo

Basic AM changes to support preemption requests (per YARN-45)

Key: MAPREDUCE-5189
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5189
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mr-am, mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
Fix For: 3.0.0
Attachments: MAPREDUCE-5189.1.patch, MAPREDUCE-5189.2.patch, MAPREDUCE-5189.3.patch, MAPREDUCE-5189.4.patch, MAPREDUCE-5189.patch, MAPREDUCE-5189.patch

This JIRA tracks the minimum amount of changes necessary in the mapreduce AM to receive preemption requests (per YARN-45) and invoke a local policy that manages preemption. (advanced policies and mechanisms will be tracked separately)

--
This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (MAPREDUCE-5189) Basic AM changes to support preemption requests (per YARN-45)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851068#comment-13851068 ]

Hudson commented on MAPREDUCE-5189:

SUCCESS: Integrated in Hadoop-trunk-Commit #4905 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4905/])
MAPREDUCE-5189. Add policies and wiring to respond to preemption requests from YARN. Contributed by Carlo Curino. (cdouglas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1551748)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/preemption
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/preemption/AMPreemptionPolicy.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/preemption/KillAMPreemptionPolicy.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/preemption/NoopAMPreemptionPolicy.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRAppBenchmark.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFail.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobCounter.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/checkpoint/EnumCounter.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/checkpoint/TaskCheckpointID.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/org/apache/hadoop/mapreduce/JobCounter.properties

Basic AM changes to support preemption requests (per YARN-45)

Key: MAPREDUCE-5189
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5189
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mr-am, mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
Fix For: 3.0.0
Attachments: MAPREDUCE-5189.1.patch, MAPREDUCE-5189.2.patch, MAPREDUCE-5189.3.patch, MAPREDUCE-5189.4.patch, MAPREDUCE-5189.patch, MAPREDUCE-5189.patch

This JIRA tracks the minimum amount of changes necessary in the mapreduce AM to receive preemption requests (per YARN-45) and invoke a local policy that manages preemption. (advanced policies and mechanisms will be tracked separately)

--
This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-5044: --- Status: Open (was: Patch Available) I do think this is tied to YARN-445. But we can discuss that elsewhere. This patch has YARN changes. YARN and MapReduce are split into separate sub-modules of Hadoop. Please file a YARN ticket for the changes you need in YARN here: https://issues.apache.org/jira/browse/YARN Have AM trigger jstack on task attempts that timeout before killing them Key: MAPREDUCE-5044 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Gera Shegalov Attachments: MAPREDUCE-5044.v01.patch, Screen Shot 2013-11-12 at 1.05.32 PM.png, Screen Shot 2013-11-12 at 1.06.04 PM.png When an AM expires a task attempt it would be nice if it triggered a jstack output via SIGQUIT before killing the task attempt. This would be invaluable for helping users debug their hung tasks, especially if they do not have shell access to the nodes. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (MAPREDUCE-5687) TestYARNRunner#testResourceMgrDelegate fails with NPE after YARN-1446
[ https://issues.apache.org/jira/browse/MAPREDUCE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851139#comment-13851139 ]

Hadoop QA commented on MAPREDUCE-5687:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12619174/YARN-1511.patch against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient: org.apache.hadoop.mapreduce.security.TestJHSSecurity

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2683//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2683//console

This message is automatically generated.

TestYARNRunner#testResourceMgrDelegate fails with NPE after YARN-1446

Key: MAPREDUCE-5687
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5687
Project: Hadoop Map/Reduce
Issue Type: Test
Reporter: Ted Yu
Assignee: Jian He
Attachments: YARN-1511.patch, YARN-1511.patch

On trunk, I got:
{code}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.049 sec FAILURE! - in org.apache.hadoop.mapred.TestYARNRunner
testResourceMgrDelegate(org.apache.hadoop.mapred.TestYARNRunner) Time elapsed: 0.782 sec ERROR!
java.lang.NullPointerException: null
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.killApplication(YarnClientImpl.java:201)
    at org.apache.hadoop.mapred.ResourceMgrDelegate.killApplication(ResourceMgrDelegate.java:284)
    at org.apache.hadoop.mapred.TestYARNRunner.testResourceMgrDelegate(TestYARNRunner.java:212)
{code}

--
This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (MAPREDUCE-5687) TestYARNRunner#testResourceMgrDelegate fails with NPE after YARN-1446
[ https://issues.apache.org/jira/browse/MAPREDUCE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-5687: --- Resolution: Fixed Fix Version/s: 2.4.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I just committed this to trunk and branch-2. Thanks Jian! TestYARNRunner#testResourceMgrDelegate fails with NPE after YARN-1446 - Key: MAPREDUCE-5687 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5687 Project: Hadoop Map/Reduce Issue Type: Test Reporter: Ted Yu Assignee: Jian He Fix For: 2.4.0 Attachments: YARN-1511.patch, YARN-1511.patch On trunk, I got: {code} Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.049 sec FAILURE! - in org.apache.hadoop.mapred.TestYARNRunner testResourceMgrDelegate(org.apache.hadoop.mapred.TestYARNRunner) Time elapsed: 0.782 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.killApplication(YarnClientImpl.java:201) at org.apache.hadoop.mapred.ResourceMgrDelegate.killApplication(ResourceMgrDelegate.java:284) at org.apache.hadoop.mapred.TestYARNRunner.testResourceMgrDelegate(TestYARNRunner.java:212) {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (MAPREDUCE-5687) TestYARNRunner#testResourceMgrDelegate fails with NPE after YARN-1446
[ https://issues.apache.org/jira/browse/MAPREDUCE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851157#comment-13851157 ] Hudson commented on MAPREDUCE-5687: --- SUCCESS: Integrated in Hadoop-trunk-Commit #4906 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4906/]) MAPREDUCE-5687. Fixed failure in TestYARNRunner caused by YARN-1446. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1551774) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestYARNRunner.java TestYARNRunner#testResourceMgrDelegate fails with NPE after YARN-1446 - Key: MAPREDUCE-5687 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5687 Project: Hadoop Map/Reduce Issue Type: Test Reporter: Ted Yu Assignee: Jian He Fix For: 2.4.0 Attachments: YARN-1511.patch, YARN-1511.patch On trunk, I got: {code} Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.049 sec FAILURE! - in org.apache.hadoop.mapred.TestYARNRunner testResourceMgrDelegate(org.apache.hadoop.mapred.TestYARNRunner) Time elapsed: 0.782 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.killApplication(YarnClientImpl.java:201) at org.apache.hadoop.mapred.ResourceMgrDelegate.killApplication(ResourceMgrDelegate.java:284) at org.apache.hadoop.mapred.TestYARNRunner.testResourceMgrDelegate(TestYARNRunner.java:212) {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (MAPREDUCE-5687) TestYARNRunner#testResourceMgrDelegate fails with NPE after YARN-1446
[ https://issues.apache.org/jira/browse/MAPREDUCE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851193#comment-13851193 ] Hadoop QA commented on MAPREDUCE-5687: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12619174/YARN-1511.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient: org.apache.hadoop.mapreduce.security.TestJHSSecurity {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4265//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4265//console This message is automatically generated. TestYARNRunner#testResourceMgrDelegate fails with NPE after YARN-1446 - Key: MAPREDUCE-5687 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5687 Project: Hadoop Map/Reduce Issue Type: Test Reporter: Ted Yu Assignee: Jian He Fix For: 2.4.0 Attachments: YARN-1511.patch, YARN-1511.patch On trunk, I got: {code} Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.049 sec FAILURE! 
- in org.apache.hadoop.mapred.TestYARNRunner testResourceMgrDelegate(org.apache.hadoop.mapred.TestYARNRunner) Time elapsed: 0.782 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.killApplication(YarnClientImpl.java:201) at org.apache.hadoop.mapred.ResourceMgrDelegate.killApplication(ResourceMgrDelegate.java:284) at org.apache.hadoop.mapred.TestYARNRunner.testResourceMgrDelegate(TestYARNRunner.java:212) {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (MAPREDUCE-5685) getCacheFiles() api doesn't work in WrappedReducer.java due to typo
[ https://issues.apache.org/jira/browse/MAPREDUCE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Song updated MAPREDUCE-5685: --- Status: Patch Available (was: In Progress) getCacheFiles() api doesn't work in WrappedReducer.java due to typo Key: MAPREDUCE-5685 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5685 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 2.2.0 Reporter: Yi Song Assignee: Yi Song Priority: Blocker Attachments: MAPREDUCE-5685.patch A typo in WrappedReducer.java causes the getCacheFiles() function to return null. Java file: hadoop-common / hadoop-mapreduce-project / hadoop-mapreduce-client / hadoop-mapreduce-client-core / src / main / java / org / apache / hadoop / mapreduce / lib / reduce / WrappedReducer.java, line 140. Erroneous code: {code} return reduceContext.getCacheArchives(); {code} Should be: {code} return reduceContext.getCacheFiles(); {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (MAPREDUCE-5685) getCacheFiles() api doesn't work in WrappedReducer.java due to typo
[ https://issues.apache.org/jira/browse/MAPREDUCE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851267#comment-13851267 ] Hadoop QA commented on MAPREDUCE-5685: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12619044/MAPREDUCE-5685.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4266//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4266//console This message is automatically generated. 
getCacheFiles() api doesn't work in WrappedReducer.java due to typo Key: MAPREDUCE-5685 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5685 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 2.2.0 Reporter: Yi Song Assignee: Yi Song Priority: Blocker Attachments: MAPREDUCE-5685.patch A typo in WrappedReducer.java causes the getCacheFiles() function to return null. Java file: hadoop-common / hadoop-mapreduce-project / hadoop-mapreduce-client / hadoop-mapreduce-client-core / src / main / java / org / apache / hadoop / mapreduce / lib / reduce / WrappedReducer.java, line 140. Erroneous code: {code} return reduceContext.getCacheArchives(); {code} Should be: {code} return reduceContext.getCacheFiles(); {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
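For readers following the MAPREDUCE-5685 thread, here is a minimal, self-contained sketch of the delegation pattern the report describes. It is not the Hadoop source: CacheContext, WrappedContextSketch and FakeCacheContext are hypothetical stand-ins, and the real methods may declare exceptions that this simplified interface omits. It only illustrates why a wrapper that forwards getCacheFiles() to getCacheArchives() can hand back null when no archives are registered.

```java
import java.net.URI;

// Hypothetical stand-in for the wrapped reduce context; NOT the real Hadoop interface.
interface CacheContext {
    URI[] getCacheArchives();
    URI[] getCacheFiles();
}

// Simplified wrapper that forwards each call to the context it wraps.
class WrappedContextSketch implements CacheContext {
    private final CacheContext reduceContext;

    WrappedContextSketch(CacheContext reduceContext) {
        this.reduceContext = reduceContext;
    }

    @Override
    public URI[] getCacheArchives() {
        return reduceContext.getCacheArchives();
    }

    @Override
    public URI[] getCacheFiles() {
        // The reported typo forwarded this call to getCacheArchives();
        // forwarding to the matching method is the one-line fix.
        return reduceContext.getCacheFiles();
    }
}

// Hypothetical fake: one cache file registered, no cache archives.
class FakeCacheContext implements CacheContext {
    @Override
    public URI[] getCacheArchives() {
        return null; // assumption for this sketch: nothing registered yields null
    }

    @Override
    public URI[] getCacheFiles() {
        return new URI[] { URI.create("file:///tmp/lookup.txt") };
    }
}
```

With the typo in place, a job that registered only cache files would see the archive list (here null) from the wrapper, which matches the symptom in the report.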
[jira] [Updated] (MAPREDUCE-5196) CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing
[ https://issues.apache.org/jira/browse/MAPREDUCE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-5196: - Status: Patch Available (was: Open) CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing -- Key: MAPREDUCE-5196 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5196 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Reporter: Carlo Curino Assignee: Carlo Curino Attachments: MAPREDUCE-5196.1.patch, MAPREDUCE-5196.2.patch, MAPREDUCE-5196.patch, MAPREDUCE-5196.patch This JIRA tracks a checkpoint-based AM preemption policy. The policy handles propagation of the preemption requests received from the RM to the appropriate tasks, and bookkeeping of checkpoints. Actual checkpointing of the task state is handled in upcoming JIRAs. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (MAPREDUCE-5196) CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing
[ https://issues.apache.org/jira/browse/MAPREDUCE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-5196: - Attachment: MAPREDUCE-5196.2.patch CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing -- Key: MAPREDUCE-5196 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5196 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Reporter: Carlo Curino Assignee: Carlo Curino Attachments: MAPREDUCE-5196.1.patch, MAPREDUCE-5196.2.patch, MAPREDUCE-5196.patch, MAPREDUCE-5196.patch This JIRA tracks a checkpoint-based AM preemption policy. The policy handles propagation of the preemption requests received from the RM to the appropriate tasks, and bookkeeping of checkpoints. Actual checkpointing of the task state is handled in upcoming JIRAs. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (MAPREDUCE-5196) CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing
[ https://issues.apache.org/jira/browse/MAPREDUCE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851357#comment-13851357 ] Hadoop QA commented on MAPREDUCE-5196: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12619235/MAPREDUCE-5196.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient: org.apache.hadoop.mapreduce.security.TestJHSSecurity The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient: org.apache.hadoop.mapreduce.TestLocalRunner org.apache.hadoop.mapreduce.lib.jobcontrol.TestMapReduceJobControl org.apache.hadoop.mapred.TestClientRedirect {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4267//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4267//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4267//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-app.html Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4267//console This message is automatically generated. CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing -- Key: MAPREDUCE-5196 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5196 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Reporter: Carlo Curino Assignee: Carlo Curino Attachments: MAPREDUCE-5196.1.patch, MAPREDUCE-5196.2.patch, MAPREDUCE-5196.patch, MAPREDUCE-5196.patch This JIRA tracks a checkpoint-based AM preemption policy. 
The policy handles propagation of the preemption requests received from the RM to the appropriate tasks, and bookkeeping of checkpoints. Actual checkpointing of the task state is handled in upcoming JIRAs. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated MAPREDUCE-5044: - Attachment: MAPREDUCE-5044.v02.patch Moved YARN-related changes into YARN-1515 Have AM trigger jstack on task attempts that timeout before killing them Key: MAPREDUCE-5044 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Gera Shegalov Attachments: MAPREDUCE-5044.v01.patch, MAPREDUCE-5044.v02.patch, Screen Shot 2013-11-12 at 1.05.32 PM.png, Screen Shot 2013-11-12 at 1.06.04 PM.png When an AM expires a task attempt it would be nice if it triggered a jstack output via SIGQUIT before killing the task attempt. This would be invaluable for helping users debug their hung tasks, especially if they do not have shell access to the nodes. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
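The ordering requested in MAPREDUCE-5044 (dump first, kill second) can be sketched roughly as follows. This is not taken from the attached patches: SignalSender, KillCommandSender, RecordingSender and TimeoutDumpSketch are hypothetical names, the grace period is an assumed parameter, and the sketch assumes a POSIX host where the `kill` command is on the PATH. A HotSpot JVM that receives SIGQUIT prints a thread dump to its standard output, which is the effect the issue is after.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction over "send a signal to a process".
interface SignalSender {
    void send(long pid, String signal);
}

// Sends signals by shelling out to the POSIX `kill` command (assumed available).
class KillCommandSender implements SignalSender {
    @Override
    public void send(long pid, String signal) {
        try {
            new ProcessBuilder("kill", "-" + signal, Long.toString(pid))
                    .inheritIO()
                    .start()
                    .waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

// Records signals instead of sending them; useful for trying the sketch safely.
class RecordingSender implements SignalSender {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(long pid, String signal) {
        sent.add(signal + ":" + pid);
    }
}

class TimeoutDumpSketch {
    // Ask the timed-out process for a thread dump, wait briefly, then kill it.
    static void expireAttempt(long pid, long graceMillis, SignalSender sender) {
        sender.send(pid, "QUIT"); // a JVM answers SIGQUIT with a thread dump
        try {
            Thread.sleep(graceMillis); // give the dump time to reach the task log
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        sender.send(pid, "KILL");
    }
}
```

Calling `TimeoutDumpSketch.expireAttempt(pid, 1000L, new KillCommandSender())` would apply the same sequence to a real process; the one-second grace period is an arbitrary illustrative value.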