[jira] [Created] (MAPREDUCE-4912) Investigate ways to clean up double job commit prevention
Robert Joseph Evans created MAPREDUCE-4912:
----------------------------------------------

             Summary: Investigate ways to clean up double job commit prevention
                 Key: MAPREDUCE-4912
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4912
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv2
            Reporter: Robert Joseph Evans

Once MAPREDUCE-4819 goes in, it fixes the issue where an OutputCommitter can double commit a job, so that the output is never touched after the job has reported success or failure externally.

The code and design could potentially use some cleanup and refactoring. Issues brought up that should be investigated include:

# Reporting KILL, instead of error, for killed jobs that crash after the kill happens.
# Using the job history log to record the commit status instead of separate external files in HDFS.
# Placing the recovery/retry logic in the commit handler instead of the MRAppMaster, and having the recovery service replay the logs as it normally does for recovery.

These are not things that must be done, but alternatives that might clean up the code.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
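As a rough illustration of the "separate external files" approach that MAPREDUCE-4912 refers to, the sketch below guards a commit with marker files so that a restarted attempt can tell whether a commit has already started or finished. It is not the MAPREDUCE-4819 code: the class name CommitGuard, the marker file names, and the use of a local directory through java.nio.file in place of HDFS are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; none of these names come from Hadoop.
final class CommitGuard {
    enum State { NOT_STARTED, IN_PROGRESS, SUCCEEDED, FAILED }

    private final Path dir; // stands in for a job staging directory

    CommitGuard(Path dir) {
        this.dir = dir;
    }

    // What a recovering attempt would see when it inspects the markers.
    State state() {
        if (Files.exists(dir.resolve("COMMIT_SUCCESS"))) return State.SUCCEEDED;
        if (Files.exists(dir.resolve("COMMIT_FAIL"))) return State.FAILED;
        if (Files.exists(dir.resolve("COMMIT_STARTED"))) return State.IN_PROGRESS;
        return State.NOT_STARTED;
    }

    // Runs the commit at most once per directory.
    // Returns true if the commit ran in this call, false if it was refused
    // because an earlier attempt had already started committing.
    boolean commitOnce(Runnable commit) throws IOException {
        try {
            // createFile fails if the marker already exists, so only one
            // caller can get past this point.
            Files.createFile(dir.resolve("COMMIT_STARTED"));
        } catch (FileAlreadyExistsException e) {
            return false;
        }
        try {
            commit.run();
            Files.createFile(dir.resolve("COMMIT_SUCCESS"));
        } catch (RuntimeException e) {
            Files.createFile(dir.resolve("COMMIT_FAIL"));
            throw e;
        }
        return true;
    }
}
```

A second call to commitOnce on the same directory returns false without running the commit again, and state() then reports the outcome left behind by the first attempt. Recording the same transitions in the job history log, as the second item in the issue suggests, would replace the marker files with log events.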
[jira] [Resolved] (MAPREDUCE-4904) TestMultipleLevelCaching failed in branch-1
[ https://issues.apache.org/jira/browse/MAPREDUCE-4904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Lu resolved MAPREDUCE-4904.
--------------------------------

    Resolution: Fixed
      Assignee: Junping Du  (was: meng gong)

Committed to branch-1. Thanks Junping for the patch!

TestMultipleLevelCaching failed in branch-1
-------------------------------------------

                Key: MAPREDUCE-4904
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-4904
            Project: Hadoop Map/Reduce
         Issue Type: Bug
         Components: test
  Affects Versions: 1.2.0
           Reporter: meng gong
           Assignee: Junping Du
            Fix For: 1.2.0
        Attachments: MAPREDUCE-4904.patch, MAPREDUCE-4904-v2.patch, MAPREDUCE-4904-v2.patch

TestMultipleLevelCaching fails:
{noformat}
Testcase: testMultiLevelCaching took 30.406 sec
	FAILED
Number of local maps expected:0 but was:1
junit.framework.AssertionFailedError: Number of local maps expected:0 but was:1
	at org.apache.hadoop.mapred.TestRackAwareTaskPlacement.launchJobAndTestCounters(TestRackAwareTaskPlacement.java:78)
	at org.apache.hadoop.mapred.TestMultipleLevelCaching.testCachingAtLevel(TestMultipleLevelCaching.java:113)
	at org.apache.hadoop.mapred.TestMultipleLevelCaching.testMultiLevelCaching(TestMultipleLevelCaching.java:69)
{noformat}
[jira] [Resolved] (MAPREDUCE-4894) Renewal / cancellation of JobHistory tokens
[ https://issues.apache.org/jira/browse/MAPREDUCE-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Graves resolved MAPREDUCE-4894.
--------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.23.6
                   2.0.3-alpha
                   3.0.0

Renewal / cancellation of JobHistory tokens
-------------------------------------------

                Key: MAPREDUCE-4894
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-4894
            Project: Hadoop Map/Reduce
         Issue Type: Bug
         Components: jobhistoryserver, mrv2
  Affects Versions: 0.23.4
           Reporter: Siddharth Seth
           Assignee: Siddharth Seth
           Priority: Blocker
            Fix For: 3.0.0, 2.0.3-alpha, 0.23.6
        Attachments: MAPREDUCE-4894_wip.txt, MR-4894_branch0.23.txt, MR-4894_branch0.23.txt, MR-4894_branch0.23.txt, MR-4894_trunk.txt, MR-4894_trunk.txt, MR-4894.txt

Equivalent of YARN-50 for JobHistory tokens.
[jira] [Created] (MAPREDUCE-4913) TestMRAppMaster#testMRAppMasterMissingStaging occasionally exits
Jason Lowe created MAPREDUCE-4913:
-------------------------------------

             Summary: TestMRAppMaster#testMRAppMasterMissingStaging occasionally exits
                 Key: MAPREDUCE-4913
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4913
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mr-am
    Affects Versions: 2.0.3-alpha, 0.23.6
            Reporter: Jason Lowe

testMRAppMasterMissingStaging will sometimes cause the JVM to exit due to this error from AsyncDispatcher:

{noformat}
2013-01-05 02:14:54,682 FATAL [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(137)) - Error in dispatcher thread
java.lang.Exception: No handler for registered for class org.apache.hadoop.mapreduce.jobhistory.EventType, cannot deliver EventType: AM_STARTED
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:132)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
	at java.lang.Thread.run(Thread.java:662)
2013-01-05 02:14:54,682 INFO [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(140)) - Exiting, bbye..
{noformat}

This can cause a build to fail, since the test process exits without unregistering from surefire, which treats it as a build error rather than a test failure.
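To make the failure mode in MAPREDUCE-4913 concrete, here is a minimal sketch of a dispatcher that looks up handlers by the event's enum class and treats a missing handler as fatal. It is not YARN's AsyncDispatcher: MiniDispatcher, HistoryEvent, and the injectable onFatal callback are hypothetical names invented for this example, and the real dispatcher's behavior is only known here from the log above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical event type, standing in for a job history event enum.
enum HistoryEvent { AM_STARTED }

// Hypothetical dispatcher; not the YARN AsyncDispatcher API.
final class MiniDispatcher {
    private final Map<Class<?>, Consumer<Enum<?>>> handlers = new HashMap<>();
    private final Runnable onFatal;

    // onFatal is what runs when no handler is found. A production
    // dispatcher might exit the JVM here; a test can pass a callback
    // that only records the failure.
    MiniDispatcher(Runnable onFatal) {
        this.onFatal = onFatal;
    }

    void register(Class<? extends Enum<?>> type, Consumer<Enum<?>> handler) {
        handlers.put(type, handler);
    }

    void dispatch(Enum<?> event) {
        Consumer<Enum<?>> handler = handlers.get(event.getDeclaringClass());
        if (handler == null) {
            // No handler registered for this event class.
            onFatal.run();
            return;
        }
        handler.accept(event);
    }
}
```

With this shape, an event that arrives before its handler is registered triggers onFatal, which matches the kind of ordering problem the log suggests. Whether the eventual fix registers a handler in the test or changes how the dispatcher reacts is not stated in the report.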
[jira] [Created] (MAPREDUCE-4916) TestTrackerDistributedCacheManager is flaky due to other badly written tests
Arun C Murthy created MAPREDUCE-4916:
----------------------------------------

             Summary: TestTrackerDistributedCacheManager is flaky due to other badly written tests
                 Key: MAPREDUCE-4916
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4916
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Arun C Murthy
            Assignee: Xuan Gong

Credit to Xuan for figuring this out: TestTrackerDistributedCacheManager is flaky due to other badly written tests, since it checks up front for the existence of a directory, which might have bad permissions.
[jira] [Created] (MAPREDUCE-4917) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer
Jun Jin created MAPREDUCE-4917:
----------------------------------

             Summary: multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer
                 Key: MAPREDUCE-4917
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4917
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: contrib/raid
    Affects Versions: 0.22.0
            Reporter: Jun Jin
            Assignee: Jun Jin
             Fix For: 0.22.0

The current implementation can only run a single BlockFixer, since the fsck (in RaidDFSUtil.getCorruptFiles) only checks the whole DFS file system. If multiple BlockFixers are launched, they all do the same thing and try to fix the same files.

The change/fix will be mainly in BlockFixer.java and RaidDFSUtil.getCorruptFiles(), to enable fsck to check the different paths defined in a separate Raid.xml for each RaidNode/BlockFixer.
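The partitioning idea in MAPREDUCE-4917 can be sketched as a filter that keeps only the corrupt files under the paths one BlockFixer is responsible for. Note that this swaps in a simpler technique than the one proposed: the issue wants fsck itself to be scoped to those paths, whereas this example filters a full corrupt-file list after the fact. CorruptFileScope and forRoots are hypothetical names, not part of contrib/raid.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Hadoop's contrib/raid.
final class CorruptFileScope {
    // Returns the corrupt files that live at or under one of the given
    // root paths, preserving the input order.
    static List<String> forRoots(List<String> corruptFiles, List<String> roots) {
        List<String> mine = new ArrayList<>();
        for (String file : corruptFiles) {
            for (String root : roots) {
                // Match on a path boundary so /raid/a does not claim /raid/ab.
                String prefix = root.endsWith("/") ? root : root + "/";
                if (file.equals(root) || file.startsWith(prefix)) {
                    mine.add(file);
                    break;
                }
            }
        }
        return mine;
    }
}
```

If each BlockFixer is given a disjoint set of roots, no two of them select the same file, which is the property the issue is after. Scoping fsck directly, as proposed, would additionally avoid scanning the whole file system once per BlockFixer.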