[jira] [Commented] (MAPREDUCE-5562) MR AM should exit when unregister() throws exception
[ https://issues.apache.org/jira/browse/MAPREDUCE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788067#comment-13788067 ] Hudson commented on MAPREDUCE-5562: --- SUCCESS: Integrated in Hadoop-Yarn-trunk #355 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/355/]) MAPREDUCE-5562. Fixed MR App Master to perform pending tasks like staging-dir cleanup, sending job-end notification correctly when unregister with RM fails. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529682)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/AppContext.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMCommunicator.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockAppContext.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestJobEndNotifier.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRApp.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestStagingCleanup.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestJobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/local/TestLocalContainerAllocator.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistory.java

MR AM should exit when unregister() throws exception
Key: MAPREDUCE-5562
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5562
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Fix For: 2.2.0
Attachments: MAPREDUCE-5562.10.patch, MAPREDUCE-5562.11.patch, MAPREDUCE-5562.12.patch, MAPREDUCE-5562.1.patch, MAPREDUCE-5562.2.patch, MAPREDUCE-5562.3.patch, MAPREDUCE-5562.5.patch, MAPREDUCE-5562.6.patch, MAPREDUCE-5562.7.patch, MAPREDUCE-5562.8.patch, MAPREDUCE-5562.9.patch
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-5562) MR AM should exit when unregister() throws exception
[ https://issues.apache.org/jira/browse/MAPREDUCE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788136#comment-13788136 ] Hudson commented on MAPREDUCE-5562: --- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1545 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1545/]) MAPREDUCE-5562. Fixed MR App Master to perform pending tasks like staging-dir cleanup, sending job-end notification correctly when unregister with RM fails. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529682)
[jira] [Commented] (MAPREDUCE-5562) MR AM should exit when unregister() throws exception
[ https://issues.apache.org/jira/browse/MAPREDUCE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788176#comment-13788176 ] Hudson commented on MAPREDUCE-5562: --- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1571 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1571/]) MAPREDUCE-5562. Fixed MR App Master to perform pending tasks like staging-dir cleanup, sending job-end notification correctly when unregister with RM fails. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529682)
[jira] [Commented] (MAPREDUCE-5547) Job history should not be flushed to JHS until AM gets unregistered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788192#comment-13788192 ] Jason Lowe commented on MAPREDUCE-5547: ---

I'm not sure we should try to enforce YARN failure == MR failure because I don't think it's completely enforceable. The output committer is user code that can do arbitrary things, including custom job end notification, e.g. FileOutputCommitter and the _SUCCESS file. As such there will always be cases where downstream consumers of the job will think it succeeded and proceed as normal despite what the RM says. In addition this change creates a couple of new problems:

* The app can successfully unregister but fail to copy the history file, so now we have a case where the RM says the job succeeded but the history server will say "come back to me later" until the client times out. Would the history server no longer have a quick way to say it definitely doesn't know about a job?
* We're starting to pile quite a few things into the grace period, and I'm wondering if there will be enough time to get it all done if things aren't all working properly, e.g.: a slow network connection when trying to do job end notification, slow datanode(s) when copying the history file, etc. Deleting the staging directory must be in the grace period to allow reattempts if we crash before unregistering, but I'm not sure we need all this other stuff there as well.

I want to make sure we're not causing more problems than we're solving. Succeeding to perform job end notification and copy the history file but failing to unregister should be a very rare instance, and even if it occurs it's likely there will be a subsequent attempt that will be launched, read the previous history file, realize the job succeeded, and unregister successfully. It's only an issue if it also happens to be the last attempt, unless I'm missing something.

Moving all of the MR-specific job end stuff to after we unregister would be setting ourselves up for increased fault visibility. Anything that goes wrong during the grace period (e.g. an AM failure/crash) will not be reattempted since the RM thinks the app is done, where it would have been in the current setup if there were attempts remaining. Given that anything in the grace period is very fragile, I think we want to put as few things there as possible.

Since jobs can indicate success to downstream consumers in ways we can't always control, I think it would be better to embrace the fact that sometimes YARN state != MR state and act accordingly. I think this only requires one change to ClientServiceDelegate, as currently it assumes that a YARN state of FAILED means the job failed. The client should redirect to the history server if the app is in any terminal YARN state (i.e. FINISHED/FAILED/KILLED) and only use the YARN state as the job state if the history server doesn't know about the job.

Job history should not be flushed to JHS until AM gets unregistered
Key: MAPREDUCE-5547
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5547
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
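The ClientServiceDelegate change proposed above can be sketched as follows. This is a hedged illustration only; the enum and method names are illustrative stand-ins, not the actual Hadoop ClientServiceDelegate API:

```java
// Sketch of the proposed client-side logic: in any terminal YARN state the
// client should consult the history server first, and fall back to the
// YARN-reported state only when the history server does not know the job.
// All names below are illustrative, not the real ClientServiceDelegate code.
public class TerminalStateSketch {

    enum YarnAppState { NEW, SUBMITTED, RUNNING, FINISHED, FAILED, KILLED }

    // True when the app has reached a terminal YARN state, i.e. the history
    // server (not the now-gone AM, and not the raw YARN state) is the first
    // place to ask about the job's MR-level outcome.
    static boolean shouldRedirectToHistoryServer(YarnAppState state) {
        switch (state) {
            case FINISHED:
            case FAILED:
            case KILLED:
                return true;
            default:
                return false;
        }
    }
}
```

The key point of the sketch is that FAILED is treated the same as FINISHED and KILLED: it triggers a history-server lookup rather than being taken directly as the job state.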
[jira] [Created] (MAPREDUCE-5569) FloatSplitter is not generating correct splits
Nathan Roberts created MAPREDUCE-5569: -
Summary: FloatSplitter is not generating correct splits
Key: MAPREDUCE-5569
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5569
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.1.0-beta, trunk, 1.3.0
Reporter: Nathan Roberts
Assignee: Nathan Roberts

The closing split is not calculated correctly:
{code}
    // Catch any overage and create the closed interval for the last split.
    if (curLower <= maxVal || splits.size() == 1) {
      splits.add(new DataDrivenDBInputFormat.DataDrivenDBInputSplit(
-         lowClausePrefix + Double.toString(curUpper),
+         lowClausePrefix + Double.toString(curLower),
          colName + " <= " + Double.toString(maxVal)));
    }
{code}
For the case of min=5.0, max=7.0, 2 splits, the current code returns splits of (column1 >= 5.0, column1 < 6.0), (column1 >= 7.0, column1 <= 7.0). The second split is obviously not correct.
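The last-split logic can be reproduced in a standalone sketch with the proposed fix applied (use curLower, not curUpper, as the lower bound of the closed final split). This is an illustrative simplification, not the actual FloatSplitter source; the method and variable names mirror the snippet above but the class is hypothetical:

```java
// Standalone sketch of FloatSplitter's split generation with the fix applied:
// the last split is the closed interval [curLower, maxVal].
import java.util.ArrayList;
import java.util.List;

public class FloatSplitSketch {
    // Returns WHERE-clause fragments covering [minVal, maxVal] in numSplits pieces.
    static List<String> splits(String colName, double minVal, double maxVal, int numSplits) {
        List<String> out = new ArrayList<>();
        double splitSize = (maxVal - minVal) / numSplits;
        String lowClausePrefix = colName + " >= ";
        String highClausePrefix = colName + " < ";
        double curLower = minVal;
        double curUpper = curLower + splitSize;
        // Half-open intervals [curLower, curUpper) for all but the last split.
        while (curUpper < maxVal) {
            out.add(lowClausePrefix + curLower + " AND " + highClausePrefix + curUpper);
            curLower = curUpper;
            curUpper += splitSize;
        }
        // Catch any overage and create the closed interval for the last split,
        // starting at curLower (the buggy version started at curUpper).
        if (curLower <= maxVal || out.size() == 1) {
            out.add(lowClausePrefix + curLower + " AND " + colName + " <= " + maxVal);
        }
        return out;
    }
}
```

For min=5.0, max=7.0, 2 splits this yields `column1 >= 5.0 AND column1 < 6.0` and `column1 >= 6.0 AND column1 <= 7.0`, i.e. the second split now correctly starts at 6.0.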
[jira] [Created] (MAPREDUCE-5570) Map task attempt with fetch failure has incorrect attempt finish time
Jason Lowe created MAPREDUCE-5570: -
Summary: Map task attempt with fetch failure has incorrect attempt finish time
Key: MAPREDUCE-5570
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5570
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am, mrv2
Affects Versions: 2.1.1-beta, 0.23.9
Reporter: Jason Lowe

If a map task attempt is retroactively failed due to excessive fetch failures reported by reducers, then the attempt's finish time is set to the time the task was retroactively failed rather than when the task attempt completed. This causes the map task attempt to appear to have run for much longer than it actually did.
[jira] [Updated] (MAPREDUCE-5569) FloatSplitter is not generating correct splits
[ https://issues.apache.org/jira/browse/MAPREDUCE-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Roberts updated MAPREDUCE-5569: -- Affects Version/s: 0.23.9
[jira] [Updated] (MAPREDUCE-5569) FloatSplitter is not generating correct splits
[ https://issues.apache.org/jira/browse/MAPREDUCE-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Roberts updated MAPREDUCE-5569: -- Attachment: MAPREDUCE-5569-trunk.patch
[jira] [Updated] (MAPREDUCE-5569) FloatSplitter is not generating correct splits
[ https://issues.apache.org/jira/browse/MAPREDUCE-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Roberts updated MAPREDUCE-5569: -- Status: Patch Available (was: Open) Attached patch for trunk. Unit tests are part of HADOOP-5102.
[jira] [Commented] (MAPREDUCE-5569) FloatSplitter is not generating correct splits
[ https://issues.apache.org/jira/browse/MAPREDUCE-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788444#comment-13788444 ] Nathan Roberts commented on MAPREDUCE-5569: --- HADOOP-5102 above should have been MAPREDUCE-5102.
[jira] [Commented] (MAPREDUCE-5565) job clean up fails on secure cluster as the file system is not created in the context of the ugi running the job
[ https://issues.apache.org/jira/browse/MAPREDUCE-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788451#comment-13788451 ] Chris Nauroth commented on MAPREDUCE-5565: -- Agreed with Sandy. I think this was fixed in 1.3.0 and branch-1-win by MAPREDUCE-5508, which also fixed a memory leak related to this.

job clean up fails on secure cluster as the file system is not created in the context of the ugi running the job
Key: MAPREDUCE-5565
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5565
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobtracker
Affects Versions: 1.2.1
Reporter: Arpit Gupta
Assignee: Arun C Murthy
Priority: Critical
Fix For: 1.3.0
Attachments: MAPREDUCE-5565.patch, MAPREDUCE-5565.patch

On secure clusters we see the following exceptions in the jt log:
{code}
2013-10-04 04:52:31,753 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:tt/host@REALM cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
{code}
And after the job finishes the staging dir is not cleaned up. While debugging with [~acmurthy] we determined that the file system object needs to be created in the context of the user who ran the job. The job, however, successfully completes.
[jira] [Commented] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db
[ https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788447#comment-13788447 ] Nathan Roberts commented on MAPREDUCE-5102: --- Some comments:

* In testDBInputFormat - the "second parameter does not work" comment can be removed or reworded. The second parameter is just a hint and does not require the inputformat to follow it.
* testSplitters - it would be nice to also have test cases for NUM_MAPS==2 so that there is more than one split being calculated.
* testFloatSplitter - the expected results aren't correct. If minvalue is 5.0 and maxvalue is 7.0, the split should be column1 >= 5.0, column1 <= 7.0. There is actually a corresponding bug in FloatSplitter (filed MAPREDUCE-5569).

fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db
Key: MAPREDUCE-5102
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102
Project: Hadoop Map/Reduce
Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
Assignee: Andrey Klochkov
Attachments: MAPREDUCE-5102-branch-0.23.patch, MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-branch-2--n3.patch, MAPREDUCE-5102-trunk--n3.patch, MAPREDUCE-5102-trunk.patch, MAPREDUCE-5102-trunk-v1.patch

fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db: patch MAPREDUCE-5102-trunk.patch for trunk and branch-2, patch MAPREDUCE-5102-branch-0.23.patch for branch-0.23 only
[jira] [Commented] (MAPREDUCE-5569) FloatSplitter is not generating correct splits
[ https://issues.apache.org/jira/browse/MAPREDUCE-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788462#comment-13788462 ] Hadoop QA commented on MAPREDUCE-5569: --

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12607216/MAPREDUCE-5569-trunk.patch against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4100//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4100//console
This message is automatically generated.
[jira] [Updated] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db
[ https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-5102: --- Attachment: MAPREDUCE-5102-trunk--n4.patch MAPREDUCE-5102-branch-2--n4.patch

Improved patches according to the last comment. Also removed tests which do not have much value but add unneeded rigidity to the code.
[jira] [Moved] (MAPREDUCE-5571) allow access to the DFS job submission + staging directory by members of the job submitters group
[ https://issues.apache.org/jira/browse/MAPREDUCE-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers moved HADOOP- to MAPREDUCE-5571: --- Affects Version/s: (was: 2.0.5-alpha) (was: 1.2.1) 1.2.1 2.0.5-alpha Key: MAPREDUCE-5571 (was: HADOOP-) Project: Hadoop Map/Reduce (was: Hadoop Common)

allow access to the DFS job submission + staging directory by members of the job submitters group
Key: MAPREDUCE-5571
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5571
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.0.5-alpha, 1.2.1
Environment: linux
Reporter: bradley childs
Attachments: HADOOP-1.2-PERM.patch, hadoop-2.0.5-perm.patch

The job submission and staging directories are explicitly given 0700 permissions, restricting access of job submission files to the submitter UID only. This prevents hadoop daemon services running under different UIDs from reading the job submitter's files. It is common unix practice to run daemon services under their own UIDs for security purposes. This bug can be demonstrated by creating a single node configuration, which runs LocalFileSystem and not HDFS. Create two users and add them to a 'hadoop' group. Start the hadoop services with one of the users, then submit a map/reduce job with the other user (or run one of the examples). Job submission ultimately fails and the M/R job doesn't execute. The fix is simple enough and secure: change the staging directory permissions to 2750. I have demonstrated the patch against 2.0.5 (along with another fix for an incorrect decimal-octal conversion) and will attach the patch. This bug is present since very early versions. I would like to fix it at the lowest level as it's a simple file mode change in all versions, and localized to one file. Is this possible?
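The two modes discussed in the report can be written out as Java octal literals; this is purely illustrative (the class and helper are hypothetical, not Hadoop code), but it shows both the permission difference and why the leading 0 matters, which is the kind of decimal-octal mixup the reporter mentions fixing:

```java
// Illustrative only: 0700 vs 2750 as Java octal literals. A leading 0 is
// what makes an int literal octal; 700 without it is just decimal 700.
public class StagingPermSketch {
    static final int OWNER_ONLY = 0700;    // rwx------ : submitter UID only
    static final int GROUP_SHARED = 02750; // rwxr-s--- : setgid + group read/execute

    // True if the group read bit (octal 0040) is set in the mode.
    static boolean groupCanRead(int mode) {
        return (mode & 0040) != 0;
    }
}
```

With 0700 the group read check fails, which is exactly why a daemon in the submitter's group cannot read the staging files; 02750 grants it.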
[jira] [Created] (MAPREDUCE-5572) Provide alternative logic for getPos() implementation in custom RecordReader
jay vyas created MAPREDUCE-5572: ---
Summary: Provide alternative logic for getPos() implementation in custom RecordReader
Key: MAPREDUCE-5572
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5572
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: examples
Affects Versions: 1.2.1, 1.2.0, 1.1.1, 1.1.0, 1.1.3, 1.2.2
Reporter: jay vyas
Priority: Minor

The custom RecordReader class defines getPos() as follows:
{code}
long currentOffset = currentStream == null ? 0 : currentStream.getPos();
...
{code}
This is meant to prevent errors when the underlying stream is null, but it isn't guaranteed to work: the RawLocalFileSystem, for example, will correctly close the underlying file stream once it is consumed, and currentStream will thus throw a NullPointerException when trying to access the null stream. This is only seen in the context where the MapTask class, which is only relevant to the mapred.* API, calls getPos() twice in tandem, before and after reading a record. This custom record reader should be guarded, or else eliminated, since it assumes something which is not in the FileSystem contract: that getPos() will always return an integral value.
[jira] [Updated] (MAPREDUCE-5572) Provide alternative logic for getPos() implementation in custom RecordReader of mapred implementation of MultiFileWordCount
[ https://issues.apache.org/jira/browse/MAPREDUCE-5572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay vyas updated MAPREDUCE-5572: Description: The custom RecordReader class in MultiFileWordCount (MultiFileLineRecordReader) has been replaced in newer examples with a better implementation which uses the CombineFileInputFormat, which doesn't feature this bug. However, this bug nevertheless still exists in 1.x versions of the MultiFileWordCount which rely on the mapred API. The older MultiFileWordCount implementation defines the getPos() as follows: long currentOffset = currentStream == null ? 0 : currentStream.getPos(); ... This is meant to prevent errors when underlying stream is null. But it doesn't gaurantee to work: The RawLocalFileSystem, for example, currectly will close the underlying file stream once it is consumed, and the currentStream will thus throw a NullPointerException when trying to access the null stream. This is only seen when running this in the context where the MapTask class, which is only relevant in mapred.* API, calls getPos() twice in tandem, before and after reading a record. This custom record reader should be gaurded, or else eliminated, since it assumes something which is not in the FileSystem contract: That a getPos will always return a integral value. was: The custom RecordReader class defines the getPos() as follows: long currentOffset = currentStream == null ? 0 : currentStream.getPos(); ... This is meant to prevent errors when underlying stream is null. But it doesn't gaurantee to work: The RawLocalFileSystem, for example, currectly will close the underlying file stream once it is consumed, and the currentStream will thus throw a NullPointerException when trying to access the null stream. This is only seen when running this in the context where the MapTask class, which is only relevant in mapred.* API, calls getPos() twice in tandem, before and after reading a record. 
This custom record reader should be guarded, or else eliminated, since it assumes something that is not in the FileSystem contract: that getPos() will always return an integral value. Summary: Provide alternative logic for getPos() implementation in custom RecordReader of mapred implementation of MultiFileWordCount (was: Provide alternative logic for getPos() implementation in custom RecordReader) Provide alternative logic for getPos() implementation in custom RecordReader of mapred implementation of MultiFileWordCount --- Key: MAPREDUCE-5572 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5572 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Affects Versions: 1.1.0, 1.1.1, 1.2.0, 1.1.3, 1.2.1, 1.2.2 Reporter: jay vyas Priority: Minor The custom RecordReader class in MultiFileWordCount (MultiFileLineRecordReader) has been replaced in newer examples with a better implementation based on CombineFileInputFormat, which doesn't have this bug. However, the bug still exists in the 1.x versions of MultiFileWordCount, which rely on the mapred API. The older MultiFileWordCount implementation defines getPos() as follows: long currentOffset = currentStream == null ? 0 : currentStream.getPos(); ... This is meant to prevent errors when the underlying stream is null, but it is not guaranteed to work: the RawLocalFileSystem, for example, correctly closes the underlying file stream once it is consumed, and accessing the now-null stream then throws a NullPointerException. This is only seen in contexts where the MapTask class, which is only relevant to the mapred.* API, calls getPos() twice in tandem, before and after reading a record. This custom record reader should be guarded, or else eliminated, since it assumes something that is not in the FileSystem contract: that getPos() will always return an integral value. -- This message was sent by Atlassian JIRA (v6.1#6144)
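The defensive rewrite the reporter asks for can be sketched in plain Java. This is a hypothetical illustration with a stub stream type (the real reader holds an FSDataInputStream); only the `currentStream` field and `getPos()` come from the JIRA description, everything else is an assumption:

```java
import java.io.IOException;

public class SafeGetPos {
    /** Stand-in for the FSDataInputStream the real reader holds. */
    static class PositionedStream {
        boolean closed;
        long pos;
        long getPos() throws IOException {
            // In the real reader, calling getPos() on a stream that
            // RawLocalFileSystem closed after consumption surfaces as an NPE;
            // the stub throws IOException to keep the sketch simple.
            if (closed) throw new IOException("stream already closed");
            return pos;
        }
    }

    PositionedStream currentStream;
    private long lastKnownPos;

    /**
     * Unlike the original ternary (currentStream == null ? 0 :
     * currentStream.getPos()), this caches the last successfully read
     * position, so the second of MapTask's two back-to-back getPos() calls
     * still returns a sane offset after the stream has been closed.
     */
    public long getPos() {
        if (currentStream != null) {
            try {
                lastKnownPos = currentStream.getPos();
            } catch (IOException e) {
                // Stream consumed and closed; fall back to the cached offset.
            }
        }
        return lastKnownPos;
    }
}
```

The key design choice is that getPos() never lets a dead stream propagate an exception; it degrades to the last offset it observed, which satisfies MapTask's before/after bookkeeping.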
[jira] [Commented] (MAPREDUCE-5571) allow access to the DFS job submission + staging directory by members of the job submitters group
[ https://issues.apache.org/jira/browse/MAPREDUCE-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788689#comment-13788689 ] Hadoop QA commented on MAPREDUCE-5571: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606593/HADOOP-1.2-PERM.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4101//console This message is automatically generated. allow access to the DFS job submission + staging directory by members of the job submitters group - Key: MAPREDUCE-5571 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5571 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.2.1, 2.0.5-alpha Environment: linux Reporter: bradley childs Attachments: HADOOP-1.2-PERM.patch, hadoop-2.0.5-perm.patch The job submission and staging directories are explicitly given 0700 permissions, restricting access to job submission files to the submitter UID alone. This prevents Hadoop daemon services running under different UIDs from reading the job submitter's files, yet it is common Unix practice to run daemon services under their own UIDs for security purposes. This bug can be demonstrated by creating a single-node configuration, which runs LocalFileSystem and not HDFS. Create two users and add them to a 'hadoop' group. Start the Hadoop services with one of the users, then submit a map/reduce job as the other user (or run one of the examples). Job submission ultimately fails and the M/R job doesn't execute. The fix is simple enough and secure: change the staging directory permissions to 2750. I have demonstrated the patch against 2.0.5 (along with another fix for an incorrect decimal-octal conversion) and will attach the patch. This bug has been present since very early versions.
I would like to fix it at the lowest level, as it's a simple file mode change in all versions, localized to one file. Is this possible? -- This message was sent by Atlassian JIRA (v6.1#6144)
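The decimal-octal pitfall mentioned alongside the permissions fix is easy to reproduce in plain Java, where `700` and `0700` are different numbers; the `02750` literal matches the setgid mode proposed above (no Hadoop classes are needed for the demonstration):

```java
public class OctalPerms {
    public static void main(String[] args) {
        short wrong = 700;    // decimal 700, which prints as octal 1274 -- not rwx------
        short right = 0700;   // leading zero makes it an octal literal: owner rwx
        int grouped = 02750;  // setgid bit + rwxr-x---, the mode proposed in the patch
        // Printing each value as octal makes the conversion bug visible.
        System.out.printf("%o %o %o%n", wrong, right, grouped);
    }
}
```

Forgetting the leading zero (or converting a decimal "700" string straight to a short) is exactly the kind of decimal-octal slip the reporter says the 2.0.5 patch also corrects.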
[jira] [Created] (MAPREDUCE-5573) /ws/v1/mapreduce/blacklistednodes MR AM webservice link is broken.
Omkar Vinit Joshi created MAPREDUCE-5573: Summary: /ws/v1/mapreduce/blacklistednodes MR AM webservice link is broken. Key: MAPREDUCE-5573 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5573 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Omkar Vinit Joshi Run a map reduce job (sleep), then retrieve the MR AM tracking URL by running yarn application -list, and try accessing trackingurl + /ws/v1/mapreduce/blacklistednodes {code}
<RemoteException>
  <exception>ClassCastException</exception>
  <message>org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter cannot be cast to org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor</message>
  <javaClassName>java.lang.ClassCastException</javaClassName>
</RemoteException>
{code} -- This message was sent by Atlassian JIRA (v6.1#6144)
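The exception message suggests the AM registers an allocator *router* that wraps the real requestor, so the webservice's direct cast fails. A stripped-down, hypothetical reconstruction (the class names mirror the exception message, but these stand-ins are not the real AM code, and the unwrapping fix shown is only one possible remedy):

```java
import java.util.Collections;
import java.util.Set;

public class CastDemo {
    interface ContainerAllocator { }

    /** The type the webservice handler tries to cast to. */
    static class RMContainerRequestor implements ContainerAllocator {
        Set<String> getBlacklistedNodes() { return Collections.emptySet(); }
    }

    /** The AM actually holds a router that *wraps* the requestor. */
    static class ContainerAllocatorRouter implements ContainerAllocator {
        final ContainerAllocator delegate = new RMContainerRequestor();
    }

    public static void main(String[] args) {
        ContainerAllocator allocator = new ContainerAllocatorRouter();
        // This is effectively the failing cast: the router is not a requestor,
        // so a direct (RMContainerRequestor) cast throws ClassCastException.
        System.out.println(allocator instanceof RMContainerRequestor); // false
        // A fix would unwrap the router (or expose the blacklist through the
        // router itself) instead of casting the context's allocator directly.
        ContainerAllocator inner = ((ContainerAllocatorRouter) allocator).delegate;
        System.out.println(inner instanceof RMContainerRequestor); // true
    }
}
```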
[jira] [Updated] (MAPREDUCE-5573) /ws/v1/mapreduce/blacklistednodes MR AM webservice link is broken.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated MAPREDUCE-5573: - Attachment: error.log /ws/v1/mapreduce/blacklistednodes MR AM webservice link is broken. -- Key: MAPREDUCE-5573 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5573 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Omkar Vinit Joshi Attachments: error.log Run a map reduce job (sleep), then retrieve the MR AM tracking URL by running yarn application -list, and try accessing trackingurl + /ws/v1/mapreduce/blacklistednodes {code}
<RemoteException>
  <exception>ClassCastException</exception>
  <message>org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter cannot be cast to org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor</message>
  <javaClassName>java.lang.ClassCastException</javaClassName>
</RemoteException>
{code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db
[ https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788783#comment-13788783 ] Hadoop QA commented on MAPREDUCE-5102: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12607254/MAPREDUCE-5102-trunk--n4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient: org.apache.hadoop.mapreduce.lib.db.TestSplitters org.apache.hadoop.mapred.TestJobCleanup The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient: org.apache.hadoop.mapreduce.v2.TestUberAM {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4102//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4102//console This message is automatically generated.
fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db Key: MAPREDUCE-5102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102 Project: Hadoop Map/Reduce Issue Type: Test Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha Reporter: Aleksey Gorshkov Assignee: Andrey Klochkov Attachments: MAPREDUCE-5102-branch-0.23.patch, MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-branch-2--n3.patch, MAPREDUCE-5102-branch-2--n4.patch, MAPREDUCE-5102-trunk--n3.patch, MAPREDUCE-5102-trunk--n4.patch, MAPREDUCE-5102-trunk.patch, MAPREDUCE-5102-trunk-v1.patch fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db patch MAPREDUCE-5102-trunk.patch for trunk and branch-2 patch MAPREDUCE-5102-branch-0.23.patch for branch-0.23 only -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db
[ https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-5102: --- Attachment: MAPREDUCE-5102-trunk--n4.patch MAPREDUCE-5102-branch-2--n4.patch fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db Key: MAPREDUCE-5102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102 Project: Hadoop Map/Reduce Issue Type: Test Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha Reporter: Aleksey Gorshkov Assignee: Andrey Klochkov Attachments: MAPREDUCE-5102-branch-0.23.patch, MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-branch-2--n3.patch, MAPREDUCE-5102-branch-2--n4.patch, MAPREDUCE-5102-branch-2--n4.patch, MAPREDUCE-5102-trunk--n3.patch, MAPREDUCE-5102-trunk--n4.patch, MAPREDUCE-5102-trunk--n4.patch, MAPREDUCE-5102-trunk.patch, MAPREDUCE-5102-trunk-v1.patch fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db patch MAPREDUCE-5102-trunk.patch for trunk and branch-2 patch MAPREDUCE-5102-branch-0.23.patch for branch-0.23 only -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-4305) Implement delay scheduling in capacity scheduler for improving data locality
[ https://issues.apache.org/jira/browse/MAPREDUCE-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788868#comment-13788868 ] Karthik Kambatla commented on MAPREDUCE-4305: - [~acmurthy], can you take a look at this when you get a chance? It would be a nice-to-have addition. Implement delay scheduling in capacity scheduler for improving data locality Key: MAPREDUCE-4305 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4305 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Mayank Bansal Assignee: Mayank Bansal Attachments: MAPREDUCE-4305, MAPREDUCE-4305-1.patch, PATCH-MAPREDUCE-4305-MR1-1.patch, PATCH-MAPREDUCE-4305-MR1-2.patch, PATCH-MAPREDUCE-4305-MR1-3.patch, PATCH-MAPREDUCE-4305-MR1-6.patch, PATCH-MAPREDUCE-4305-MR1-7.patch, PATCH-MAPREDUCE-4305-MR1.patch With the Capacity Scheduler, only about 40%-50% of tasks are data-local, which is not good. In my tests with a 70-node cluster I consistently get data locality around 40-50% on a free cluster. I think we need to implement something like delay scheduling in the capacity scheduler to improve data locality. http://radlab.cs.berkeley.edu/publication/308 After implementing delay scheduling on Hadoop 22, I am getting 100% data locality on a free cluster and around 90% data locality on a busy cluster. Thanks, Mayank -- This message was sent by Atlassian JIRA (v6.1#6144)
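For reference, the core idea of delay scheduling from the cited paper can be sketched in a few lines: skip non-local scheduling opportunities for a bounded number of offers before relaxing the locality requirement. The class and threshold names here are illustrative, not actual CapacityScheduler code or configuration:

```java
public class DelayScheduler {
    private final int maxLocalityDelay; // offers to skip before accepting non-local
    private int missedOpportunities;

    public DelayScheduler(int maxLocalityDelay) {
        this.maxLocalityDelay = maxLocalityDelay;
    }

    /** Decide whether to launch a task on the offered node. */
    public boolean accept(boolean hasLocalDataOnNode) {
        if (hasLocalDataOnNode) {
            missedOpportunities = 0;
            return true; // always take a data-local slot
        }
        missedOpportunities++;
        // Only fall back to a non-local slot after the job has waited
        // through enough scheduling opportunities; this trades a small
        // launch delay for the much higher locality the reporter measured.
        return missedOpportunities > maxLocalityDelay;
    }
}
```

In practice the delay threshold is tuned so that the expected wait for a local slot is shorter than the cost of a remote read, which is why the paper (and the reporter's Hadoop 22 experiment) sees locality climb toward 90-100%.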