[jira] [Commented] (MAPREDUCE-5984) native-task: upgrade lz4 to latest version
[ https://issues.apache.org/jira/browse/MAPREDUCE-5984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069939#comment-14069939 ] Binglin Chang commented on MAPREDUCE-5984: -- bq. but I'm wondering if it's possible to reuse the lz4 source files that are already checked in for hadoop-common Sure, I will update the patch to copy the lz4 files to the build path. And we can upgrade the version in hadoop-common in trunk. native-task: upgrade lz4 to latest version --- Key: MAPREDUCE-5984 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5984 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: MAPREDUCE-5984.v1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-2841) Task level native optimization
[ https://issues.apache.org/jira/browse/MAPREDUCE-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069945#comment-14069945 ] Binglin Chang commented on MAPREDUCE-2841: -- Hi Sean, the test succeeded on Mac OS X but failed on Ubuntu 12; I updated the test a little in MAPREDUCE-5985. Task level native optimization -- Key: MAPREDUCE-2841 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2841 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Environment: x86-64 Linux/Unix Reporter: Binglin Chang Assignee: Sean Zhong Attachments: DESIGN.html, MAPREDUCE-2841.v1.patch, MAPREDUCE-2841.v2.patch, dualpivot-0.patch, dualpivotv20-0.patch, fb-shuffle.patch, hadoop-3.0-mapreduce-2841-2014-7-17.patch I'm currently working on native optimization for MapTask based on JNI. The basic idea is to add a NativeMapOutputCollector to handle k/v pairs emitted by the mapper, so that sort, spill, and IFile serialization can all be done in native code. A preliminary test (on Xeon E5410, jdk6u24) showed promising results: 1. Sort is about 3x-10x as fast as Java (only binary string compare is supported) 2. IFile serialization speed is about 3x that of Java, about 500MB/s; if hardware CRC32C is used, things can get much faster (1G/ 3. Merge code is not completed yet, so the test uses enough io.sort.mb to prevent mid-spill This leads to a total speedup of 2x~3x for the whole MapTask if IdentityMapper (a mapper that does nothing) is used. There are limitations of course: currently only Text and BytesWritable are supported, and I have not thought through many things yet, such as how to support map-side combine. I had some discussion with somebody familiar with Hive, and it seems these limitations won't be much of a problem for Hive to benefit from those optimizations, at least.
Advice or discussion about improving compatibility is most welcome :) Currently NativeMapOutputCollector has a static method called canEnable(), which checks whether the key/value types, comparator type, and combiner are all compatible; MapTask can then choose to enable NativeMapOutputCollector. This is only a preliminary test; more work needs to be done. I expect better final results, and I believe similar optimization can be adopted for the reduce task and shuffle too. -- This message was sent by Atlassian JIRA (v6.2#6252)
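The canEnable() gate described above can be sketched as follows. This is a hypothetical, simplified sketch: the class name, method signature, and string-based type checks are illustrative, not the actual MAPREDUCE-2841 patch, which would inspect the real JobConf.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a canEnable()-style compatibility gate; names and
// signatures are illustrative, not the MAPREDUCE-2841 patch itself.
public class NativeCollectorCheck {

    // Only these key/value types are described as supported by the native path.
    private static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList(
            "org.apache.hadoop.io.Text",
            "org.apache.hadoop.io.BytesWritable"));

    /** True only if everything the job uses is native-compatible. */
    static boolean canEnable(String keyClass, String valueClass,
                             boolean hasCustomComparator, boolean hasCombiner) {
        // Fall back to the Java collector on anything the native code cannot
        // handle: unsupported types, custom comparators, map-side combiners.
        return SUPPORTED.contains(keyClass)
                && SUPPORTED.contains(valueClass)
                && !hasCustomComparator
                && !hasCombiner;
    }

    public static void main(String[] args) {
        System.out.println(canEnable("org.apache.hadoop.io.Text",
                "org.apache.hadoop.io.BytesWritable", false, false)); // true
        // An unsupported key type disables the native collector.
        System.out.println(canEnable("org.apache.hadoop.io.IntWritable",
                "org.apache.hadoop.io.Text", false, false)); // false
    }
}
```

The point of a static check like this is that MapTask can fall back to the pure-Java collector transparently whenever any job setting is outside what the native code supports.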
[jira] [Created] (MAPREDUCE-5988) Fix dead links to the javadocs of o.a.h.mapreduce.counters
Akira AJISAKA created MAPREDUCE-5988: Summary: Fix dead links to the javadocs of o.a.h.mapreduce.counters Key: MAPREDUCE-5988 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 2.4.1 Reporter: Akira AJISAKA Priority: Minor In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, AbstractCounters and CounterGroupBase are listed, but not linked. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5988) Fix dead links to the javadocs of o.a.h.mapreduce.counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5988: - Attachment: MAPREDUCE-5988.patch Removing {{@InterfaceAudience.Private}} from the package-info to generate the javadocs of {{CounterGroupBase}} and {{AbstractCounters}}. Fix dead links to the javadocs of o.a.h.mapreduce.counters -- Key: MAPREDUCE-5988 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 2.4.1 Reporter: Akira AJISAKA Priority: Minor Attachments: MAPREDUCE-5988.patch In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, AbstractCounters and CounterGroupBase are listed, but not linked. -- This message was sent by Atlassian JIRA (v6.2#6252)
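For context on why removing the annotation works: Hadoop's public javadoc build excludes packages whose package-info.java is annotated {{@InterfaceAudience.Private}}, so classes in such packages appear in the class index but get no generated page. A minimal sketch of the package-info.java in question (the exact file contents are an assumption based on the patch description):

```java
// src/main/java/org/apache/hadoop/mapreduce/counters/package-info.java
// Before the patch: the package-level annotation marks everything in the
// package as private API, so the public javadoc build skips it and the
// index links to AbstractCounters / CounterGroupBase go dead.
@InterfaceAudience.Private
package org.apache.hadoop.mapreduce.counters;

import org.apache.hadoop.classification.InterfaceAudience;
```

The patch deletes the @InterfaceAudience.Private line so javadocs for the package's classes are generated again.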
[jira] [Updated] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project
[ https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5988: - Summary: Fix dead links to the javadocs in mapreduce project (was: Fix dead links to the javadocs of o.a.h.mapreduce.counters) Fix dead links to the javadocs in mapreduce project --- Key: MAPREDUCE-5988 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 2.4.1 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Attachments: MAPREDUCE-5988.patch In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some classes are listed, but not linked. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5988) Fix dead links to the javadocs of o.a.h.mapreduce.counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5988: - Description: In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some classes are listed, but not linked. (was: In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, AbstractCounters and CounterGroupBase are listed, but not linked.) Assignee: Akira AJISAKA Fix dead links to the javadocs of o.a.h.mapreduce.counters -- Key: MAPREDUCE-5988 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 2.4.1 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Attachments: MAPREDUCE-5988.patch In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some classes are listed, but not linked. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project
[ https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070062#comment-14070062 ] Akira AJISAKA commented on MAPREDUCE-5988: -- The below classes are linked, but undocumented.
- AbstractCounters
- CounterGroupBase
- CancelDelegationTokenRequest
- CancelDelegationTokenResponse
- GetDelegationTokenRequest
- RenewDelegationTokenRequest
- RenewDelegationTokenResponse
- HistoryFileManager
- HistoryStorage
Fix dead links to the javadocs in mapreduce project --- Key: MAPREDUCE-5988 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 2.4.1 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Attachments: MAPREDUCE-5988.patch In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some classes are listed, but not linked. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project
[ https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5988: - Description: In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some classes are listed, but not documented. (was: In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some classes are listed, but not linked.) Fix dead links to the javadocs in mapreduce project --- Key: MAPREDUCE-5988 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 2.4.1 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Attachments: MAPREDUCE-5988.patch In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some classes are listed, but not documented. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project
[ https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5988: - Attachment: MAPREDUCE-5988.2.patch Removed {{@InterfaceAudience.Private}} from each package-info. I confirmed the javadocs of the above classes were generated. Fix dead links to the javadocs in mapreduce project --- Key: MAPREDUCE-5988 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 2.4.1 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Attachments: MAPREDUCE-5988.2.patch, MAPREDUCE-5988.patch In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some classes are listed, but not documented. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project
[ https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5988: - Target Version/s: 2.6.0 Status: Patch Available (was: Open) Fix dead links to the javadocs in mapreduce project --- Key: MAPREDUCE-5988 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 2.4.1 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Attachments: MAPREDUCE-5988.2.patch, MAPREDUCE-5988.patch In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some classes are listed, but not documented. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5988) Fix dead links to the javadocs in mapreduce project
[ https://issues.apache.org/jira/browse/MAPREDUCE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070091#comment-14070091 ] Hadoop QA commented on MAPREDUCE-5988: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657101/MAPREDUCE-5988.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs: org.apache.hadoop.mapreduce.v2.hs.TestJobHistoryParsing org.apache.hadoop.mapreduce.v2.hs.webapp.dao.TestJobInfo {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4760//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4760//console This message is automatically generated. 
Fix dead links to the javadocs in mapreduce project --- Key: MAPREDUCE-5988 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5988 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 2.4.1 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Attachments: MAPREDUCE-5988.2.patch, MAPREDUCE-5988.patch In http://hadoop.apache.org/docs/r2.4.1/api/allclasses-frame.html, some classes are listed, but not documented. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070121#comment-14070121 ] Hudson commented on MAPREDUCE-5957: --- FAILURE: Integrated in Hadoop-Yarn-trunk #620 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/620/]) MAPREDUCE-5957. AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used. Contributed by Sangjin Lee (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612358) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/commit/CommitterEventHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used --- Key: MAPREDUCE-5957 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Sangjin Lee Fix For: 3.0.0, 2.6.0 Attachments: MAPREDUCE-5957.branch-2.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch With the job classloader enabled, the MR AM throws ClassNotFoundException if a custom output format class is specified. 
{noformat} org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389) Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895) at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469) ... 8 more Caused by: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893) ... 10 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
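The general shape of the fix for this class of failure is to make user classes visible where Configuration.getClass() resolves them, by installing the job classloader as the thread's context classloader around the call and always restoring it afterwards. The sketch below shows only that pattern with illustrative names; it is not the actual MAPREDUCE-5957 patch.

```java
import java.util.concurrent.Callable;

// Sketch of the context-classloader pattern only; method names are
// illustrative, not the actual MAPREDUCE-5957 change.
public class JobClassLoaderPattern {

    /**
     * Runs action with jobLoader installed as the thread's context
     * classloader, always restoring the previous one afterwards.
     */
    static <T> T callWithClassLoader(ClassLoader jobLoader, Callable<T> action)
            throws Exception {
        Thread t = Thread.currentThread();
        ClassLoader original = t.getContextClassLoader();
        t.setContextClassLoader(jobLoader); // user classes resolvable here
        try {
            return action.call();
        } finally {
            t.setContextClassLoader(original); // never leak the job loader
        }
    }

    public static void main(String[] args) throws Exception {
        // A stand-in for the job classloader that would hold the user's jars.
        ClassLoader jobLoader =
                new ClassLoader(JobClassLoaderPattern.class.getClassLoader()) {};
        boolean swappedInside = callWithClassLoader(jobLoader,
                () -> Thread.currentThread().getContextClassLoader() == jobLoader);
        System.out.println(swappedInside ? "swapped" : "not-swapped");
        System.out.println(Thread.currentThread().getContextClassLoader() == jobLoader
                ? "leaked" : "restored");
    }
}
```

Wrapping each AM step that may touch user classes (such as instantiating the output format and committer) in a helper like this keeps the job classloader scoped to exactly those calls.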
[jira] [Commented] (MAPREDUCE-5756) CombineFileInputFormat.getSplits() including directories in its results
[ https://issues.apache.org/jira/browse/MAPREDUCE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070123#comment-14070123 ] Hudson commented on MAPREDUCE-5756: --- FAILURE: Integrated in Hadoop-Yarn-trunk #620 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/620/]) MAPREDUCE-5756. CombineFileInputFormat.getSplits() including directories in its results. Contributed by Jason Dere (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612400) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java CombineFileInputFormat.getSplits() including directories in its results --- Key: MAPREDUCE-5756 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5756 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Fix For: 3.0.0, 2.6.0 Attachments: MAPREDUCE-5756.1.patch, MAPREDUCE-5756.2.patch Trying to track down HIVE-6401, where we see some "is not a file" errors because getSplits() is giving us directories. I believe the culprit is FileInputFormat.listStatus(): {code} if (recursive && stat.isDirectory()) { addInputPathRecursively(result, fs, stat.getPath(), inputFilter); } else { result.add(stat); } {code} which seems to allow directories to be added to the results when recursive is false. Is this meant to return directories? If not, I think it should look like this: {code} if (stat.isDirectory()) { if (recursive) { addInputPathRecursively(result, fs, stat.getPath(), inputFilter); } } else { result.add(stat); } {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
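To make the difference between the two branch shapes concrete, here is a self-contained model; FileStat is a stand-in for Hadoop's FileStatus, used only to show that with recursive false the original shape adds directories to the results while the proposed shape does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Standalone model of the two branch shapes; FileStat is a stand-in for
// Hadoop's FileStatus, used only to illustrate the bug, not the real API.
public class ListStatusBranchDemo {

    static final class FileStat {
        final String path;
        final boolean dir;
        FileStat(String path, boolean dir) { this.path = path; this.dir = dir; }
        boolean isDirectory() { return dir; }
    }

    // Original shape: with recursive == false a directory misses the first
    // branch and falls into the else, so it is added to the results.
    static List<FileStat> original(List<FileStat> input, boolean recursive) {
        List<FileStat> result = new ArrayList<>();
        for (FileStat stat : input) {
            if (recursive && stat.isDirectory()) {
                // addInputPathRecursively(...) elided
            } else {
                result.add(stat);
            }
        }
        return result;
    }

    // Proposed shape: directories are never added directly; they are only
    // descended into when recursive is true.
    static List<FileStat> proposed(List<FileStat> input, boolean recursive) {
        List<FileStat> result = new ArrayList<>();
        for (FileStat stat : input) {
            if (stat.isDirectory()) {
                if (recursive) {
                    // addInputPathRecursively(...) elided
                }
            } else {
                result.add(stat);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<FileStat> in = Arrays.asList(
                new FileStat("/warehouse/t1", true),      // a directory
                new FileStat("/warehouse/t1/f1", false)); // a file
        System.out.println(original(in, false).size()); // directory leaks in
        System.out.println(proposed(in, false).size()); // file only
    }
}
```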
[jira] [Created] (MAPREDUCE-5989) Add DeletionService in AM
Varun Saxena created MAPREDUCE-5989: --- Summary: Add DeletionService in AM Key: MAPREDUCE-5989 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5989 Project: Hadoop Map/Reduce Issue Type: Improvement Components: applicationmaster Reporter: Varun Saxena Assignee: Varun Saxena In the AM, for graceful cleanup, I propose the addition of a DeletionService which will do the following: 1. Cleanup of failed tasks (temporary data need not occupy space till NM's Deletion Service is invoked) 2. Staging directory deletion (During AM shutdown, it's better to place staging dir cleanup in the Deletion Service; refer to MAPREDUCE-4841) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5989) Add DeletionService in AM
[ https://issues.apache.org/jira/browse/MAPREDUCE-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070246#comment-14070246 ] Jason Lowe commented on MAPREDUCE-5989: --- Is this the same kind of DeletionService that the NM currently uses? If so, I'm unclear on the tangible benefits of this, since all that service does is potentially postpone deletions. And as for the staging directory cleanup, implementing a deletion service is not needed to fix that issue. Actually, I believe it's already fixed by MAPREDUCE-5476, which deletes the staging directory after unregistering, so we know no other AM attempts will be launched after removing the staging directory. If you could walk through an example scenario where the deletion service is used and how it's useful, that would help me understand why adding such a service would be helpful. Add DeletionService in AM - Key: MAPREDUCE-5989 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5989 Project: Hadoop Map/Reduce Issue Type: Improvement Components: applicationmaster Reporter: Varun Saxena Assignee: Varun Saxena In the AM, for graceful cleanup, I propose the addition of a DeletionService which will do the following: 1. Cleanup of failed tasks (temporary data need not occupy space till NM's Deletion Service is invoked) 2. Staging directory deletion (During AM shutdown, it's better to place staging dir cleanup in the Deletion Service; refer to MAPREDUCE-4841) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (MAPREDUCE-4841) Application Master Retries fail due to FileNotFoundException
[ https://issues.apache.org/jira/browse/MAPREDUCE-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned MAPREDUCE-4841: - Assignee: Jason Lowe (was: Devaraj K) Application Master Retries fail due to FileNotFoundException Key: MAPREDUCE-4841 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4841 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Affects Versions: 2.0.1-alpha Reporter: Devaraj K Assignee: Jason Lowe Priority: Critical Application attempt1 is deleting the job related files and these are not present in the HDFS for following retries. {code:xml} Application application_1353724754961_0001 failed 4 times due to AM Container for appattempt_1353724754961_0001_04 exited with exitCode: -1000 due to: RemoteTrace: java.io.FileNotFoundException: File does not exist: hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:752) at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:88) at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:49) at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:157) at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:155) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:153) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) at LocalTrace: org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: File does not exist: hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217) at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:822) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:492) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:221) at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46) at org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:924) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686) .Failing this attempt.. Failing the application. {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4841) Application Master Retries fail due to FileNotFoundException
[ https://issues.apache.org/jira/browse/MAPREDUCE-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070247#comment-14070247 ] Jason Lowe commented on MAPREDUCE-4841: --- I believe this has been fixed by MAPREDUCE-5476. [~devaraj.k] if you agree then we can mark this as a duplicate of that JIRA. Application Master Retries fail due to FileNotFoundException Key: MAPREDUCE-4841 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4841 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Affects Versions: 2.0.1-alpha Reporter: Devaraj K Assignee: Jason Lowe Priority: Critical Application attempt1 is deleting the job related files and these are not present in the HDFS for following retries. {code:xml} Application application_1353724754961_0001 failed 4 times due to AM Container for appattempt_1353724754961_0001_04 exited with exitCode: -1000 due to: RemoteTrace: java.io.FileNotFoundException: File does not exist: hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:752) at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:88) at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:49) at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:157) at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:155) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:153) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) 
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) at LocalTrace: org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: File does not exist: hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217) at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:822) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:492) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:221) at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46) at org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:924) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686) .Failing this attempt.. Failing the application. {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-4841) Application Master Retries fail due to FileNotFoundException
[ https://issues.apache.org/jira/browse/MAPREDUCE-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K resolved MAPREDUCE-4841. -- Resolution: Fixed It has been fixed by MAPREDUCE-5476, closing it as duplicate of MAPREDUCE-5476. Application Master Retries fail due to FileNotFoundException Key: MAPREDUCE-4841 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4841 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Affects Versions: 2.0.1-alpha Reporter: Devaraj K Assignee: Jason Lowe Priority: Critical Application attempt1 is deleting the job related files and these are not present in the HDFS for following retries. {code:xml} Application application_1353724754961_0001 failed 4 times due to AM Container for appattempt_1353724754961_0001_04 exited with exitCode: -1000 due to: RemoteTrace: java.io.FileNotFoundException: File does not exist: hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:752) at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:88) at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:49) at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:157) at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:155) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:153) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at 
java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) at LocalTrace: org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: File does not exist: hdfs://hacluster:8020/tmp/hadoop-yarn/staging/mapred/.staging/job_1353724754961_0001/appTokens at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217) at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:822) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:492) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:221) at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46) at org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:924) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686) .Failing this attempt.. Failing the application. {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Reopened] (MAPREDUCE-4841) Application Master Retries fail due to FileNotFoundException
[ https://issues.apache.org/jira/browse/MAPREDUCE-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K reopened MAPREDUCE-4841: -- Application Master Retries fail due to FileNotFoundException Key: MAPREDUCE-4841 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4841 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Affects Versions: 2.0.1-alpha Reporter: Devaraj K Assignee: Jason Lowe Priority: Critical -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-4841) Application Master Retries fail due to FileNotFoundException
[ https://issues.apache.org/jira/browse/MAPREDUCE-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K resolved MAPREDUCE-4841. -- Resolution: Duplicate Application Master Retries fail due to FileNotFoundException Key: MAPREDUCE-4841 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4841 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Affects Versions: 2.0.1-alpha Reporter: Devaraj K Assignee: Jason Lowe Priority: Critical -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5963) ShuffleHandler DB schema should be versioned with compatible/incompatible changes
[ https://issues.apache.org/jira/browse/MAPREDUCE-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-5963: -- Attachment: MAPREDUCE-5963-v2.1.patch The latest patch fixes the findbugs warning. ShuffleHandler DB schema should be versioned with compatible/incompatible changes - Key: MAPREDUCE-5963 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5963 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.4.1 Reporter: Junping Du Assignee: Junping Du Attachments: MAPREDUCE-5963-v2.1.patch, MAPREDUCE-5963-v2.patch, MAPREDUCE-5963.patch ShuffleHandler persists job shuffle info into a DB schema, which should be versioned with compatible/incompatible changes to support rolling upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5756) CombineFileInputFormat.getSplits() including directories in its results
[ https://issues.apache.org/jira/browse/MAPREDUCE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070284#comment-14070284 ] Hudson commented on MAPREDUCE-5756: --- FAILURE: Integrated in Hadoop-Hdfs-trunk #1812 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1812/]) MAPREDUCE-5756. CombineFileInputFormat.getSplits() including directories in its results. Contributed by Jason Dere (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612400) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java CombineFileInputFormat.getSplits() including directories in its results --- Key: MAPREDUCE-5756 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5756 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Fix For: 3.0.0, 2.6.0 Attachments: MAPREDUCE-5756.1.patch, MAPREDUCE-5756.2.patch Trying to track down HIVE-6401, where we see some "is not a file" errors because getSplits() is giving us directories. I believe the culprit is FileInputFormat.listStatus(): {code} if (recursive && stat.isDirectory()) { addInputPathRecursively(result, fs, stat.getPath(), inputFilter); } else { result.add(stat); } {code} Which seems to be allowing directories to be added to the results if recursive is false. Is this meant to return directories? If not, I think it should look like this: {code} if (stat.isDirectory()) { if (recursive) { addInputPathRecursively(result, fs, stat.getPath(), inputFilter); } } else { result.add(stat); } {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
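The restructuring proposed above can be sketched outside Hadoop with plain lists. This is an illustrative model only, assuming a simplified stand-in for FileStatus and addInputPathRecursively(); it is not FileInputFormat's actual implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the proposed listStatus() fix: a directory is only ever
// descended into (when "recursive" is set); it is never added to the
// result list itself, so getSplits() no longer sees directories.
class ListStatusSketch {
    static List<String> filter(List<String> names, List<Boolean> isDir, boolean recursive) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            if (isDir.get(i)) {
                if (recursive) {
                    // Stands in for addInputPathRecursively(result, fs, stat.getPath(), inputFilter)
                    result.add(names.get(i) + "/child");
                }
                // Non-recursive: the directory is skipped, not returned.
            } else {
                result.add(names.get(i));
            }
        }
        return result;
    }
}
```

With recursive=false a directory entry simply disappears from the results instead of leaking through the else branch, which is the behavior change the patch is after.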
[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070282#comment-14070282 ] Hudson commented on MAPREDUCE-5957: --- FAILURE: Integrated in Hadoop-Hdfs-trunk #1812 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1812/]) MAPREDUCE-5957. AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used. Contributed by Sangjin Lee (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612358) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/commit/CommitterEventHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used --- Key: MAPREDUCE-5957 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Sangjin Lee Fix For: 3.0.0, 2.6.0 Attachments: MAPREDUCE-5957.branch-2.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch With the job classloader enabled, the MR AM throws ClassNotFoundException if a custom output format class is specified. 
{noformat} org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389) Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895) at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469) ... 8 more Caused by: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893) ... 10 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
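The general shape of a fix for this class of problem is to resolve user-supplied classes through the job classloader explicitly, rather than through the loader that loaded the framework. A minimal sketch of that idea, with a hypothetical helper name (not MRAppMaster's actual code):

```java
// Hypothetical helper illustrating the general fix: pass the job
// classloader to Class.forName explicitly, so classes from job-supplied
// jars are visible, instead of relying on the framework's own loader.
class JobClassLoading {
    static Class<?> loadUserClass(String name, ClassLoader jobClassLoader) {
        try {
            return Class.forName(name, true, jobClassLoader);
        } catch (ClassNotFoundException e) {
            // Mirrors the RuntimeException wrapping seen in the trace above.
            throw new RuntimeException("Class " + name + " not found", e);
        }
    }
}
```

Configuration.getClass() uses its own classloader unless told otherwise, which is why the AM above cannot see com.foo.test.TestOutputFormat even though the job jar contains it.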
[jira] [Commented] (MAPREDUCE-5963) ShuffleHandler DB schema should be versioned with compatible/incompatible changes
[ https://issues.apache.org/jira/browse/MAPREDUCE-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070309#comment-14070309 ] Hadoop QA commented on MAPREDUCE-5963: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12657123/MAPREDUCE-5963-v2.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4761//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4761//console This message is automatically generated. 
ShuffleHandler DB schema should be versioned with compatible/incompatible changes - Key: MAPREDUCE-5963 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5963 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.4.1 Reporter: Junping Du Assignee: Junping Du Attachments: MAPREDUCE-5963-v2.1.patch, MAPREDUCE-5963-v2.patch, MAPREDUCE-5963.patch ShuffleHandler persists job shuffle info into a DB schema, which should be versioned with compatible/incompatible changes to support rolling upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
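Hadoop state stores commonly express "compatible vs incompatible change" with a major.minor version record: bumping the minor version marks a compatible change, bumping the major version an incompatible one. A rough sketch of that convention, with illustrative names that are not the ShuffleHandler patch itself:

```java
// Illustrative major.minor compatibility check, in the spirit of the
// versioning this issue asks for: same major version => data written by
// an older or newer minor revision can still be read; different major
// version => the store must refuse to load (or migrate) the data.
class SchemaVersion {
    final int major;
    final int minor;

    SchemaVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    static boolean isCompatible(SchemaVersion stored, SchemaVersion current) {
        return stored.major == current.major;
    }
}
```

During a rolling upgrade the new NodeManager reads the version record first and only loads the shuffle state when the check passes, which is what makes compatible schema evolution safe.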
[jira] [Commented] (MAPREDUCE-5756) CombineFileInputFormat.getSplits() including directories in its results
[ https://issues.apache.org/jira/browse/MAPREDUCE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070356#comment-14070356 ] Hudson commented on MAPREDUCE-5756: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1839 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1839/]) MAPREDUCE-5756. CombineFileInputFormat.getSplits() including directories in its results. Contributed by Jason Dere (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612400) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java CombineFileInputFormat.getSplits() including directories in its results --- Key: MAPREDUCE-5756 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5756 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Fix For: 3.0.0, 2.6.0 Attachments: MAPREDUCE-5756.1.patch, MAPREDUCE-5756.2.patch Trying to track down HIVE-6401, where we see some "is not a file" errors because getSplits() is giving us directories. I believe the culprit is FileInputFormat.listStatus(): {code} if (recursive && stat.isDirectory()) { addInputPathRecursively(result, fs, stat.getPath(), inputFilter); } else { result.add(stat); } {code} Which seems to be allowing directories to be added to the results if recursive is false. Is this meant to return directories? If not, I think it should look like this: {code} if (stat.isDirectory()) { if (recursive) { addInputPathRecursively(result, fs, stat.getPath(), inputFilter); } } else { result.add(stat); } {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070354#comment-14070354 ] Hudson commented on MAPREDUCE-5957: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1839 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1839/]) MAPREDUCE-5957. AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used. Contributed by Sangjin Lee (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612358) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/commit/CommitterEventHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used --- Key: MAPREDUCE-5957 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Sangjin Lee Fix For: 3.0.0, 2.6.0 Attachments: MAPREDUCE-5957.branch-2.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch With the job classloader enabled, the MR AM throws ClassNotFoundException if a custom output format class is specified. 
{noformat} org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389) Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895) at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469) ... 8 more Caused by: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893) ... 10 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-250) JobTracker should log the scheduling of setup/cleanup task
[ https://issues.apache.org/jira/browse/MAPREDUCE-250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-250. Resolution: Fixed Fairly confident this has been fixed. Closing as stale. JobTracker should log the scheduling of setup/cleanup task -- Key: MAPREDUCE-250 URL: https://issues.apache.org/jira/browse/MAPREDUCE-250 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Amar Kamat Setup/Cleanup is launched under (m+1)^th^ tip or (r+1)^th^ tip. It will be nice if jobtracker logs this info. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-2811) Adding Multiple Reducers implementations.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-2811: Description: Like HADOOP-372, we have a multi format Reducer too. Someone suggested that if we need different reducers and map implementations(like what i need) I was better of by writing 2 jobs. I dont quite agree. I am calculating 2 big matrices that must be calculated in the map step, summed in the reducers multiplied and then written to a file. The First mapper sums a matrix based on the i,j th index(key) into the file and the second mapper adds the N*1 dimension vector that uses a new line as key. These keys must be passed as such to the reduce process. (was: Like the Patch released here https://issues.apache.org/jira/browse/HADOOP-372 can we have a multi format Reducer too. Someone suggested that if we need different reducers and map implementations(like what i need) I was better of by writing 2 jobs. I dont quite agree. I am calculating 2 big matrices that must be calculated in the map step, summed in the reducers multiplied and then written to a file. The First mapper sums a matrix based on the i,j th index(key) into the file and the second mapper adds the N*1 dimension vector that uses a new line as key. These keys must be passed as such to the reduce process.) Adding Multiple Reducers implementations. - Key: MAPREDUCE-2811 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2811 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Sidharth Gupta Like HADOOP-372, we have a multi format Reducer too. Someone suggested that if we need different reducers and map implementations(like what i need) I was better of by writing 2 jobs. I dont quite agree. I am calculating 2 big matrices that must be calculated in the map step, summed in the reducers multiplied and then written to a file. 
The First mapper sums a matrix based on the i,j th index(key) into the file and the second mapper adds the N*1 dimension vector that uses a new line as key. These keys must be passed as such to the reduce process. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-126) Job history analysis showing wrong job runtime
[ https://issues.apache.org/jira/browse/MAPREDUCE-126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-126: --- Labels: newbie (was: ) Job history analysis showing wrong job runtime -- Key: MAPREDUCE-126 URL: https://issues.apache.org/jira/browse/MAPREDUCE-126 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.1 Reporter: Amar Kamat Labels: newbie Analysis of completed jobs shows wrong runtime. Here is the faulty code {code:title=analysisjobhistory.jsp|borderStyle=solid} <b>Finished At : </b> <%=StringUtils.getFormattedTimeWithDiff(dateFormat, job.getLong(Keys.FINISH_TIME), job.getLong(Keys.LAUNCH_TIME)) %><br/> {code} I think it should be {code:title=analysisjobhistory.jsp|borderStyle=solid} <b>Finished At : </b> <%=StringUtils.getFormattedTimeWithDiff(dateFormat, job.getLong(Keys.FINISH_TIME), job.getLong(Keys.SUBMIT_TIME)) %><br/> {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-126) Job history analysis showing wrong job runtime
[ https://issues.apache.org/jira/browse/MAPREDUCE-126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-126. Resolution: Incomplete This code is long gone in 2.x. Closing as stale. Job history analysis showing wrong job runtime -- Key: MAPREDUCE-126 URL: https://issues.apache.org/jira/browse/MAPREDUCE-126 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.1 Reporter: Amar Kamat Labels: newbie Analysis of completed jobs shows wrong runtime. Here is the faulty code {code:title=analysisjobhistory.jsp|borderStyle=solid} <b>Finished At : </b> <%=StringUtils.getFormattedTimeWithDiff(dateFormat, job.getLong(Keys.FINISH_TIME), job.getLong(Keys.LAUNCH_TIME)) %><br/> {code} I think it should be {code:title=analysisjobhistory.jsp|borderStyle=solid} <b>Finished At : </b> <%=StringUtils.getFormattedTimeWithDiff(dateFormat, job.getLong(Keys.FINISH_TIME), job.getLong(Keys.SUBMIT_TIME)) %><br/> {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
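The substance of the MAPREDUCE-126 report is which timestamp the runtime should be measured from: a job can sit queued between submit and launch, so measuring from launch time under-reports the total runtime. A toy comparison of the two calculations, using hypothetical method names rather than the JSP's code:

```java
// A job submitted at t=100 ms, launched at t=150 ms, and finished at
// t=400 ms ran for 300 ms from the user's point of view, not 250 ms.
class JobRuntime {
    static long fromSubmit(long submitTime, long finishTime) {
        return finishTime - submitTime; // proposed: total wall time including queue wait
    }

    static long fromLaunch(long launchTime, long finishTime) {
        return finishTime - launchTime; // faulty: ignores time spent queued
    }
}
```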
[jira] [Resolved] (MAPREDUCE-484) Logos for Hive and JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-484. Resolution: Fixed Stale. Closing. Logos for Hive and JobTracker - Key: MAPREDUCE-484 URL: https://issues.apache.org/jira/browse/MAPREDUCE-484 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Aaron Newton Priority: Trivial Attachments: hive job tracker icons (font outlines).ai, hive job tracker icons (font outlines).pdf, hive job tracker icons (font outlines).pdf, hive.png, hive.png, jobtracker.png Greetings fine Hadoop peoples, While working on a few projects here at Cloudera we found ourselves wanting for some sort of icon for both the JobTracker and for Hive. After checking on the project page for Hive (the JobTracker doesn't really have one) and finding that these items have no icons, we rolled up our sleeves and made some. We'd like to contribute these to the project, so if you want 'em, they're all yours. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-700) Too many copies of job-conf with the jobtracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-700. Resolution: Fixed Lots of changes here already. Closing this as stale. Too many copies of job-conf with the jobtracker --- Key: MAPREDUCE-700 URL: https://issues.apache.org/jira/browse/MAPREDUCE-700 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Amar Kamat Assignee: Amar Kamat As of today the jobtracker has job-conf copies in # mapred.system.dir : created while job-submission # jobtracker-subdir (created by JobInProgress upon creation) # log-dir : created upon job-init # history-dir : created upon job-init It's difficult to manage these conf files. The problem aggravates under restart. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-173) JobConf should also load resources from hdfs (or other filesystems)
[ https://issues.apache.org/jira/browse/MAPREDUCE-173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-173. Resolution: Fixed This is almost certainly fixed by now. JobConf should also load resources from hdfs (or other filesystems) --- Key: MAPREDUCE-173 URL: https://issues.apache.org/jira/browse/MAPREDUCE-173 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Amar Kamat {{JobConf conf = new JobConf(path)}} doesn't load the configuration if _path_ points to a resource on hdfs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-322) TaskTracker should run user tasks nicely in the local machine
[ https://issues.apache.org/jira/browse/MAPREDUCE-322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-322. Resolution: Fixed This has been fixed with both cgroups and task level niceness. TaskTracker should run user tasks nicely in the local machine - Key: MAPREDUCE-322 URL: https://issues.apache.org/jira/browse/MAPREDUCE-322 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Tsz Wo Nicholas Sze If one task tries to use all CPUs in a local machine, all other tasks or processes (including the tasktracker and datanode daemons) may hardly get a chance to run. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-275) Display lost tracker information on the jobtracker webui and persist it across restarts
[ https://issues.apache.org/jira/browse/MAPREDUCE-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-275. Resolution: Won't Fix I'm going to close this as Won't Fix. Display lost tracker information on the jobtracker webui and persist it across restarts --- Key: MAPREDUCE-275 URL: https://issues.apache.org/jira/browse/MAPREDUCE-275 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Amar Kamat Assignee: Amar Kamat As of today it's difficult to distinguish between active trackers and lost trackers (lost trackers are considered active). It will be nice if the jobtracker can display which trackers are lost and maintain that information across restarts. HADOOP-5643 does something similar for decommissioned trackers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-156) ProcessTree.destroy() is sleeping for 5 seconds holding the task slot
[ https://issues.apache.org/jira/browse/MAPREDUCE-156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-156. Resolution: Won't Fix This is intentional to potentially give time for the process to clean up. Closing as won't fix. ProcessTree.destroy() is sleeping for 5 seconds holding the task slot - Key: MAPREDUCE-156 URL: https://issues.apache.org/jira/browse/MAPREDUCE-156 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Ravi Gummadi Currently, in ProcessTree.destroy(), after sending SIGTERM to the task JVM, the TT sleeps for 5 seconds (the default value of mapred.tasktracker.tasks.sleeptime-before-sigkill) before sending SIGKILL. This seems to be blocking the task slot (it is not released) for 5 seconds. We should avoid this so that another task could be launched in that slot immediately. -- This message was sent by Atlassian JIRA (v6.2#6252)
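The change the reporter asks for — keep the SIGTERM-grace-SIGKILL escalation but move it off the caller's thread so the slot frees immediately — can be sketched like this. Illustrative Python only; the callables are hypothetical stand-ins for the real signal-sending and liveness checks:

```python
import threading

def destroy_async(send_sigterm, send_sigkill, is_alive, grace_seconds=5.0):
    """Send SIGTERM now and schedule SIGKILL after a grace period on a
    background timer, so the caller (and the task slot) is not blocked
    for the whole grace window."""
    send_sigterm()

    def escalate():
        # Only escalate if the process survived the grace period.
        if is_alive():
            send_sigkill()

    timer = threading.Timer(grace_seconds, escalate)
    timer.daemon = True
    timer.start()
    return timer  # caller returns immediately; the slot can be reused
```

The trade-off noted in the resolution still holds: the grace period exists so the task JVM can clean up before the hard kill.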
[jira] [Commented] (MAPREDUCE-327) Add explicit remote map count JobTracker metrics
[ https://issues.apache.org/jira/browse/MAPREDUCE-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14070734#comment-14070734 ] Allen Wittenauer commented on MAPREDUCE-327: In real-world scenarios, we've discovered that task locality as reported by the system can effectively be a lie because of CFIF/MFIF. Given 4 input splits, if the first is local but the rest are not, the task will still be considered local even though three-quarters of the data came off-rack! Add explicit remote map count JobTracker metrics Key: MAPREDUCE-327 URL: https://issues.apache.org/jira/browse/MAPREDUCE-327 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Hong Tang Labels: newbie I am proposing to add a counter REMOTE_MAPS in addition to the following counters: TOTAL_MAPS, DATA_LOCAL_MAPS, RACK_LOCAL_MAPS. A Map Task is considered a remote-map iff the input split returns a set of locations, but none is chosen to execute the map task. -- This message was sent by Atlassian JIRA (v6.2#6252)
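The proposed counter semantics can be made concrete with a small classification sketch. Illustrative Python, not Hadoop code; the bucket names mirror the counters in the proposal, and `rack_of` is a hypothetical host-to-rack lookup:

```python
def classify_map(split_locations, chosen_host, rack_of):
    """Pick the locality counter bucket for a scheduled map task.
    A task is a remote map iff the split reports locations but the
    chosen host is neither one of them nor on any of their racks."""
    if not split_locations:
        return "TOTAL_MAPS"  # no locality info: counted only in the total
    if chosen_host in split_locations:
        return "DATA_LOCAL_MAPS"
    if any(rack_of(h) == rack_of(chosen_host) for h in split_locations):
        return "RACK_LOCAL_MAPS"
    return "REMOTE_MAPS"
```

Note this classifies per split location set; the comment above points out that a combined-input split (CFIF/MFIF) can report a location that covers only a fraction of its data, which is exactly why a single "local" bucket can mislead.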
[jira] [Updated] (MAPREDUCE-327) Add explicit remote map count JobTracker metrics
[ https://issues.apache.org/jira/browse/MAPREDUCE-327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-327: --- Labels: newbie (was: ) Add explicit remote map count JobTracker metrics Key: MAPREDUCE-327 URL: https://issues.apache.org/jira/browse/MAPREDUCE-327 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Hong Tang Labels: newbie I am proposing to add a counter REMOTE_MAPS in addition to the following counters: TOTAL_MAPS, DATA_LOCAL_MAPS, RACK_LOCAL_MAPS. A Map Task is considered a remote-map iff the input split returns a set of locations, but none is chosen to execute the map task. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5990) If output directory can not be created, error message on stdout does not provide any clue.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5990: Component/s: examples If output directory can not be created, error message on stdout does not provide any clue. -- Key: MAPREDUCE-5990 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5990 Project: Hadoop Map/Reduce Issue Type: Improvement Components: examples Reporter: Suhas Gogate Labels: newbie In the following wordcount example, the output directory path cannot be created because /temp does not exist and the user does not have privileges to create the output path at /. hadoop --config ./clustdir/ jar /homes/gogate/wordcount.jar com..wordcount.WordCount /in-path/gogate/myfile /temp/mywc-gogate 09/04/28 23:00:32 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 09/04/28 23:00:32 INFO mapred.FileInputFormat: Total input paths to process : 1 09/04/28 23:00:32 INFO mapred.FileInputFormat: Total input paths to process : 1 09/04/28 23:00:33 INFO mapred.JobClient: Running job: job_200904282249_0004 java.io.IOException: Job failed! at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1113) at com..wordcount.WordCount.main(WordCount.java:55) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:155) at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68) -- This message was sent by Atlassian JIRA (v6.2#6252)
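The improvement the issue asks for amounts to a pre-flight check that fails with a descriptive error before submission, instead of the bare "Job failed!" afterwards. A minimal local-filesystem sketch in Python (the real check would go through Hadoop's FileSystem abstraction):

```python
import os

def check_output_dir(path):
    """Fail fast with a descriptive error if the output directory cannot
    be created, rather than surfacing a generic 'Job failed!' later."""
    if os.path.exists(path):
        raise FileExistsError(f"Output directory {path!r} already exists")
    parent = os.path.dirname(path.rstrip("/")) or "/"
    if not os.path.isdir(parent):
        raise FileNotFoundError(f"Parent directory {parent!r} does not exist")
    if not os.access(parent, os.W_OK):
        raise PermissionError(f"No permission to create {path!r} under {parent!r}")
```

In the reported scenario, the second branch would fire (`/temp` does not exist) and name the actual problem on stdout.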
[jira] [Commented] (MAPREDUCE-5963) ShuffleHandler DB schema should be versioned with compatible/incompatible changes
[ https://issues.apache.org/jira/browse/MAPREDUCE-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14070737#comment-14070737 ] Jason Lowe commented on MAPREDUCE-5963: --- +1 lgtm. Committing this. ShuffleHandler DB schema should be versioned with compatible/incompatible changes - Key: MAPREDUCE-5963 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5963 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.4.1 Reporter: Junping Du Assignee: Junping Du Attachments: MAPREDUCE-5963-v2.1.patch, MAPREDUCE-5963-v2.patch, MAPREDUCE-5963.patch ShuffleHandler persists job shuffle info into a DB schema, which should be versioned with compatible/incompatible changes to support rolling upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
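The compatible/incompatible distinction is conventionally encoded as a "major.minor" schema version: a minor bump means the stored state can still be loaded in place, a major bump means it cannot. A hedged sketch of that policy (illustrative Python; the patch itself should be consulted for the actual scheme):

```python
def is_compatible(stored_version, current_version):
    """Hypothetical 'major.minor' policy: same major means the stored
    DB state can be loaded in place (compatible change); a different
    major is incompatible and the state must be discarded or migrated."""
    stored_major = int(stored_version.split(".", 1)[0])
    current_major = int(current_version.split(".", 1)[0])
    return stored_major == current_major
```

This is what makes rolling upgrade possible: a new NodeManager reading state written by an older ShuffleHandler can tell at startup whether the data is usable.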
[jira] [Moved] (MAPREDUCE-5990) If output directory can not be created, error message on stdout does not provide any clue.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer moved HADOOP-5756 to MAPREDUCE-5990: - Affects Version/s: (was: 0.18.3) Issue Type: Improvement (was: Bug) Key: MAPREDUCE-5990 (was: HADOOP-5756) Project: Hadoop Map/Reduce (was: Hadoop Common) If output directory can not be created, error message on stdout does not provide any clue. -- Key: MAPREDUCE-5990 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5990 Project: Hadoop Map/Reduce Issue Type: Improvement Components: examples Reporter: Suhas Gogate Labels: newbie In the following wordcount example, the output directory path cannot be created because /temp does not exist and the user does not have privileges to create the output path at /. hadoop --config ./clustdir/ jar /homes/gogate/wordcount.jar com..wordcount.WordCount /in-path/gogate/myfile /temp/mywc-gogate 09/04/28 23:00:32 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 09/04/28 23:00:32 INFO mapred.FileInputFormat: Total input paths to process : 1 09/04/28 23:00:32 INFO mapred.FileInputFormat: Total input paths to process : 1 09/04/28 23:00:33 INFO mapred.JobClient: Running job: job_200904282249_0004 java.io.IOException: Job failed! 
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1113) at com..wordcount.WordCount.main(WordCount.java:55) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:155) at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-197) add new options to mapred job -list-attempt-ids to dump counters and diagnostic messages
[ https://issues.apache.org/jira/browse/MAPREDUCE-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-197: --- Description: It would be very nice when tracking down tasks that have strange values for their counters, if there was a command line tool to print out the task attempts and their counters and diagnostic messages. I propose adding switches to -list-attempt-ids to accomplish that: {quote} mapred job -list-attempt-ids [-counters] [-diagnostics] job type state {quote} was: It would be very nice when tracking down tasks that have strange values for their counters, if there was a command line tool to print out the task attempts and their counters and diagnostic messages. I propose adding switches to -list-attempt-ids to accomplish that: {quote} hadoop job -list-attempt-ids [-counters] [-diagnostics] job type state {quote} add new options to mapred job -list-attempt-ids to dump counters and diagnostic messages Key: MAPREDUCE-197 URL: https://issues.apache.org/jira/browse/MAPREDUCE-197 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Owen O'Malley Assignee: Owen O'Malley Labels: newbie It would be very nice when tracking down tasks that have strange values for their counters, if there was a command line tool to print out the task attempts and their counters and diagnostic messages. I propose adding switches to -list-attempt-ids to accomplish that: {quote} mapred job -list-attempt-ids [-counters] [-diagnostics] job type state {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
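The proposed command-line shape — optional `-counters` and `-diagnostics` switches followed by the job, type, and state arguments — can be sketched with a parser. Illustrative Python/argparse only, approximating the single-dash flag style of the `mapred` CLI:

```python
import argparse

def build_parser():
    """Hypothetical parser mirroring the proposal:
    mapred job -list-attempt-ids [-counters] [-diagnostics] <job> <type> <state>"""
    p = argparse.ArgumentParser(prog="mapred job -list-attempt-ids")
    p.add_argument("-counters", action="store_true",
                   help="also print each attempt's counters")
    p.add_argument("-diagnostics", action="store_true",
                   help="also print each attempt's diagnostic messages")
    p.add_argument("job")    # job ID
    p.add_argument("type")   # map | reduce
    p.add_argument("state")  # running | completed | ...
    return p
```

Both switches default to off, so the plain `-list-attempt-ids job type state` invocation keeps its current output.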
[jira] [Updated] (MAPREDUCE-197) add new options to mapred job -list-attempt-ids to dump counters and diagnostic messages
[ https://issues.apache.org/jira/browse/MAPREDUCE-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-197: --- Summary: add new options to mapred job -list-attempt-ids to dump counters and diagnostic messages (was: add new options to hadoop job -list-attempt-ids to dump counters and diagnostic messages) add new options to mapred job -list-attempt-ids to dump counters and diagnostic messages Key: MAPREDUCE-197 URL: https://issues.apache.org/jira/browse/MAPREDUCE-197 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Owen O'Malley Assignee: Owen O'Malley Labels: newbie It would be very nice when tracking down tasks that have strange values for their counters, if there was a command line tool to print out the task attempts and their counters and diagnostic messages. I propose adding switches to -list-attempt-ids to accomplish that: {quote} hadoop job -list-attempt-ids [-counters] [-diagnostics] job type state {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-168) hadoop job -list all should display the code for Killed also.
[ https://issues.apache.org/jira/browse/MAPREDUCE-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-168: --- Labels: newbie (was: ) hadoop job -list all should display the code for Killed also. - Key: MAPREDUCE-168 URL: https://issues.apache.org/jira/browse/MAPREDUCE-168 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Hemanth Yamijala Labels: newbie hadoop job -list all shows a legend for the states: PREP, SUCCEEDED, FAILED and RUNNING. It should also display the state for KILLED (as 5). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-168) mapred job -list all should display the code for Killed also.
[ https://issues.apache.org/jira/browse/MAPREDUCE-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-168: --- Description: mapred job -list all shows a legend for the states: PREP, SUCCEEDED, FAILED and RUNNING. It should also display the state for KILLED (as 5). (was: hadoop job -list all shows a legend for the states: PREP, SUCCEEDED, FAILED and RUNNING. It should also display the state for KILLED (as 5).) mapred job -list all should display the code for Killed also. - Key: MAPREDUCE-168 URL: https://issues.apache.org/jira/browse/MAPREDUCE-168 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Hemanth Yamijala Labels: newbie mapred job -list all shows a legend for the states: PREP, SUCCEEDED, FAILED and RUNNING. It should also display the state for KILLED (as 5). -- This message was sent by Atlassian JIRA (v6.2#6252)
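The requested fix is just to extend the printed legend with the KILLED state. A sketch of the mapping (illustrative Python; the numeric codes other than KILLED=5 are assumptions for illustration — the issue only confirms that KILLED should appear as 5):

```python
# Hypothetical numeric job-state codes; the issue confirms only KILLED=5.
JOB_STATES = {1: "RUNNING", 2: "SUCCEEDED", 3: "FAILED", 4: "PREP", 5: "KILLED"}

def legend():
    """Render the states legend printed by `mapred job -list all`,
    now including KILLED."""
    return "  ".join(f"{code}={name}" for code, name in sorted(JOB_STATES.items()))
```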
[jira] [Updated] (MAPREDUCE-197) add new options to hadoop job -list-attempt-ids to dump counters and diagnostic messages
[ https://issues.apache.org/jira/browse/MAPREDUCE-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-197: --- Labels: newbie (was: ) add new options to hadoop job -list-attempt-ids to dump counters and diagnostic messages Key: MAPREDUCE-197 URL: https://issues.apache.org/jira/browse/MAPREDUCE-197 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Owen O'Malley Assignee: Owen O'Malley Labels: newbie It would be very nice when tracking down tasks that have strange values for their counters, if there was a command line tool to print out the task attempts and their counters and diagnostic messages. I propose adding switches to -list-attempt-ids to accomplish that: {quote} hadoop job -list-attempt-ids [-counters] [-diagnostics] job type state {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-168) mapred job -list all should display the code for Killed also.
[ https://issues.apache.org/jira/browse/MAPREDUCE-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-168: --- Summary: mapred job -list all should display the code for Killed also. (was: hadoop job -list all should display the code for Killed also.) mapred job -list all should display the code for Killed also. - Key: MAPREDUCE-168 URL: https://issues.apache.org/jira/browse/MAPREDUCE-168 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Hemanth Yamijala Labels: newbie hadoop job -list all shows a legend for the states: PREP, SUCCEEDED, FAILED and RUNNING. It should also display the state for KILLED (as 5). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-403) ProcessTree can try and kill a null PID
[ https://issues.apache.org/jira/browse/MAPREDUCE-403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-403. Resolution: Incomplete I'm going to close this as stale. If this is still an issue, probably better to open a new jira. ProcessTree can try and kill a null PID - Key: MAPREDUCE-403 URL: https://issues.apache.org/jira/browse/MAPREDUCE-403 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Steve Loughran Priority: Minor Saw this in a test run, while trying to shut down a TaskTracker [sf-startdaemon-debug] 09/05/07 16:42:42 [Map-events fetcher for all reduce tasks on tracker_morzine.hpl.hp.com:localhost/127.0.0.1:36239] INFO mapred.TaskTracker : Shutting down: Map-events fetcher for all reduce tasks on tracker_morzine.hpl.hp.com:localhost/127.0.0.1:36239 [sf-startdaemon-debug] 09/05/07 16:42:42 [TerminatorThread] WARN util.ProcessTree : Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: ERROR: garbage process ID -null. -- This message was sent by Atlassian JIRA (v6.2#6252)
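The defensive fix the report suggests is to validate the PID before ever building the shell command, so a not-yet-known PID produces a clear error instead of the garbled `garbage process ID -null` shell failure. A minimal Python sketch (function name and shape hypothetical):

```python
def kill_command(pid, signal="TERM"):
    """Build the shell kill command only after validating the PID, so a
    missing/None PID fails with a clear error instead of a garbled
    'kill -TERM -null' process-group invocation."""
    if pid is None:
        raise ValueError("process ID is not yet available")
    pid = int(pid)  # raises ValueError on garbage like 'null'
    if pid <= 0:
        raise ValueError(f"invalid process ID: {pid}")
    return ["kill", f"-{signal}", f"-{pid}"]  # -pid targets the process group
```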
[jira] [Updated] (MAPREDUCE-403) ProcessTree can try and kill a null PID
[ https://issues.apache.org/jira/browse/MAPREDUCE-403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-403: --- Summary: ProcessTree can try and kill a null PID (was: ProcessTree can try and kill a null POD) ProcessTree can try and kill a null PID - Key: MAPREDUCE-403 URL: https://issues.apache.org/jira/browse/MAPREDUCE-403 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Steve Loughran Priority: Minor Saw this in a test run, while trying to shut down a TaskTracker [sf-startdaemon-debug] 09/05/07 16:42:42 [Map-events fetcher for all reduce tasks on tracker_morzine.hpl.hp.com:localhost/127.0.0.1:36239] INFO mapred.TaskTracker : Shutting down: Map-events fetcher for all reduce tasks on tracker_morzine.hpl.hp.com:localhost/127.0.0.1:36239 [sf-startdaemon-debug] 09/05/07 16:42:42 [TerminatorThread] WARN util.ProcessTree : Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: ERROR: garbage process ID -null. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-526) Sometimes job does not get removed from scheduler queue after it is killed
[ https://issues.apache.org/jira/browse/MAPREDUCE-526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-526. Resolution: Won't Fix Closing this as won't fix. Sometimes job does not get removed from scheduler queue after it is killed -- Key: MAPREDUCE-526 URL: https://issues.apache.org/jira/browse/MAPREDUCE-526 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Karam Singh Sometimes when we kill a job, it does not get removed from the waiting queue, while the job status is Killed with Job Setup and Cleanup: Successful. Also, the JobTracker webui shows the job under the failed jobs list, and hadoop job -list all and hadoop queue queuename -showJobs also show the job with state=5. Prior to killing, the job state was Running. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-635) IllegalArgumentException is thrown if mapred local dir is not writable.
[ https://issues.apache.org/jira/browse/MAPREDUCE-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-635: --- Labels: newbie (was: ) IllegalArgumentException is thrown if mapred local dir is not writable. --- Key: MAPREDUCE-635 URL: https://issues.apache.org/jira/browse/MAPREDUCE-635 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Reporter: Suman Sehgal Priority: Minor Labels: newbie If the specified mapred local directory doesn't have write permission or is non-existent, then IllegalArgumentException is thrown. The following error message was displayed while running a sleep job with a non-writable mapred local directory specified in mapred-site.xml. sleep job command : $hadoop_home/bin/hadoop jar hadoop-examples.jar sleep -m 100 -r 10 2009-05-12 05:36:46,491 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_200905120525_0001_m_00_0: java.lang.IllegalArgumentException: n must be positive at java.util.Random.nextInt(Random.java:250) at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:243) at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:289) at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124) at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1115) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1028) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:357) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305) at org.apache.hadoop.mapred.Child.main(Child.java:170) This error message (i.e. IllegalArgumentException), somehow, doesn't clearly indicate that the problem is with the mapred local directory. The error message should be more specific in this case. -- This message was sent by Atlassian JIRA (v6.2#6252)
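The stack trace shows the root cause: when no configured directory is usable, the allocator calls `Random.nextInt(0)`, which throws the unhelpful "n must be positive". Checking for an empty candidate list first lets the code name the real problem. An illustrative Python sketch of that idea (not the LocalDirAllocator code):

```python
import os
import random

def pick_local_dir(dirs):
    """Choose a writable local directory at random. Filtering first and
    checking for emptiness avoids the Java equivalent of calling
    Random.nextInt(0), and produces a descriptive error instead."""
    usable = [d for d in dirs if os.path.isdir(d) and os.access(d, os.W_OK)]
    if not usable:
        raise IOError("no writable local directory among: " + ", ".join(dirs))
    return random.choice(usable)
```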
[jira] [Commented] (MAPREDUCE-5985) native-task: Fix build on macosx
[ https://issues.apache.org/jira/browse/MAPREDUCE-5985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14070757#comment-14070757 ] Todd Lipcon commented on MAPREDUCE-5985: +1, looks good to me. This also fixed my Ubuntu build issue with the unistd.h inclusion. native-task: Fix build on macosx Key: MAPREDUCE-5985 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5985 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: MAPREDUCE-5985.v1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5963) ShuffleHandler DB schema should be versioned with compatible/incompatible changes
[ https://issues.apache.org/jira/browse/MAPREDUCE-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14070760#comment-14070760 ] Hudson commented on MAPREDUCE-5963: --- FAILURE: Integrated in Hadoop-trunk-Commit #5941 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5941/]) MAPREDUCE-5963. ShuffleHandler DB schema should be versioned with compatible/incompatible changes. Contributed by Junping Du (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1612652) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java ShuffleHandler DB schema should be versioned with compatible/incompatible changes - Key: MAPREDUCE-5963 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5963 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.4.1 Reporter: Junping Du Assignee: Junping Du Attachments: MAPREDUCE-5963-v2.1.patch, MAPREDUCE-5963-v2.patch, MAPREDUCE-5963.patch ShuffleHandler persist job shuffle info into DB schema, which should be versioned with compatible/incompatible changes to support rolling upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-257) Preventing node from swapping
[ https://issues.apache.org/jira/browse/MAPREDUCE-257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-257. Resolution: Fixed Preventing a node from swapping (don't exhaust memory; memory limits) has been fixed. Closing. Preventing node from swapping - Key: MAPREDUCE-257 URL: https://issues.apache.org/jira/browse/MAPREDUCE-257 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Hong Tang When a node swaps, it slows everything: maps running on that node, reducers fetching output from the node, and DFS clients reading from the DN. We should just treat it the same way as if the OS exhausts memory and kill some tasks to free up memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5963) ShuffleHandler DB schema should be versioned with compatible/incompatible changes
[ https://issues.apache.org/jira/browse/MAPREDUCE-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-5963: -- Resolution: Fixed Fix Version/s: 2.6.0 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks, Junping! I committed this to trunk and branch-2. ShuffleHandler DB schema should be versioned with compatible/incompatible changes - Key: MAPREDUCE-5963 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5963 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.4.1 Reporter: Junping Du Assignee: Junping Du Fix For: 3.0.0, 2.6.0 Attachments: MAPREDUCE-5963-v2.1.patch, MAPREDUCE-5963-v2.patch, MAPREDUCE-5963.patch ShuffleHandler persist job shuffle info into DB schema, which should be versioned with compatible/incompatible changes to support rolling upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-1283) Support including 3rd party jars supplied in lib/ folder of eclipse project in hadoop jar
[ https://issues.apache.org/jira/browse/MAPREDUCE-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-1283. - Resolution: Incomplete Likely stale. Support including 3rd party jars supplied in lib/ folder of eclipse project in hadoop jar - Key: MAPREDUCE-1283 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1283 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/eclipse-plugin Environment: Any Reporter: Amit Nithian Priority: Minor Attachments: jarmodule.patch Currently, the eclipse plugin only exports the generated class files to the hadoop jar but if there are any 3rd party jars specified in the lib/ folder, they should also get packaged in the jar for submission to the cluster. Currently this has to be done manually which can slow down development. I am working on a patch to the current plugin to support this. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-401) du fails on Ubuntu in TestJobHistory
[ https://issues.apache.org/jira/browse/MAPREDUCE-401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-401. Resolution: Fixed Likely fixed forever ago. du fails on Ubuntu in TestJobHistory Key: MAPREDUCE-401 URL: https://issues.apache.org/jira/browse/MAPREDUCE-401 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Ubuntu 8.10 x86_64, lots of RAM and HDD spare, clean SVN_HEAD of trunk Reporter: Steve Loughran Priority: Minor TestJobHistory.testJobHistoryUserLogLocation is failing, and there is an error in the log related to du failing in the mini MR cluster -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-111) JobTracker.getSystemDir throws NPE if it is called during initialization
[ https://issues.apache.org/jira/browse/MAPREDUCE-111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-111. Resolution: Not a Problem JobTracker.getSystemDir throws NPE if it is called during initialization --- Key: MAPREDUCE-111 URL: https://issues.apache.org/jira/browse/MAPREDUCE-111 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Amareshwari Sriramadasu JobTracker.getSystemDir throws NPE if it is called during initialization. It should check if fileSystem is null and throw IllegalStateException, as in the getFilesystemName method. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-5985) native-task: Fix build on macosx
[ https://issues.apache.org/jira/browse/MAPREDUCE-5985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved MAPREDUCE-5985. Resolution: Fixed Hadoop Flags: Reviewed native-task: Fix build on macosx Key: MAPREDUCE-5985 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5985 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: MAPREDUCE-5985.v1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (MAPREDUCE-311) JobClient should use multiple volumes as hadoop.tmp.dir
[ https://issues.apache.org/jira/browse/MAPREDUCE-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14070802#comment-14070802 ] Allen Wittenauer edited comment on MAPREDUCE-311 at 7/22/14 8:11 PM: - It's likely too late to change hadoop.tmp.dir. But this is still an issue. Debating opening a new JIRA (under YARN) that states the problem but not a solution so that hadoop.tmp.dir is left alone. was (Author: aw): It's likely too late to change hadoop.tmp.dir. But this is still an issue. Debating opening a new JIRA that states the problem but not a solution so that hadoop.tmp.dir is left alone. JobClient should use multiple volumes as hadoop.tmp.dir --- Key: MAPREDUCE-311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-311 Project: Hadoop Map/Reduce Issue Type: Improvement Environment: All Reporter: Milind Bhandarkar Currently, hadoop.tmp.dir configuration variable allows specification of only a single directory to be used as scratch space. In particular, on the job launcher nodes with multiple volumes, this fails the entire job if the tmp.dir is somehow unusable. When the job launcher nodes have multiple volumes, the tmp space availability can be improved by using multiple volumes (either randomly or in round-robin.) The code for choosing a volume from a comma-separated list of multiple volumes is already there for mapred.local.dir etc. That needs to be used by job client as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-1775) Streaming should use hadoop.tmp.dir instead of stream.tmpdir
[ https://issues.apache.org/jira/browse/MAPREDUCE-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-1775: Labels: newbie (was: ) Streaming should use hadoop.tmp.dir instead of stream.tmpdir Key: MAPREDUCE-1775 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1775 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/streaming Environment: All Reporter: Milind Bhandarkar Priority: Minor Labels: newbie Hadoop streaming currently uses stream.tmpdir (on the job-client side) to create jars to be submitted etc. This only adds complexity to site-specific configuration files. Instead, it should use hadoop.tmp.dir configuration variable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-5991) native-task should not run unit tests if native profile is not enabled
Todd Lipcon created MAPREDUCE-5991: -- Summary: native-task should not run unit tests if native profile is not enabled Key: MAPREDUCE-5991 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5991 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Todd Lipcon Currently, running mvn test without the 'native' profile enabled causes all of the native-task tests to fail. In order to integrate to trunk, we need to fix this - either using JUnit Assume commands in each test that depends on native code, or disabling the tests from the pom unless -Pnative is specified -- This message was sent by Atlassian JIRA (v6.2#6252)
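The first option mentioned — a JUnit `Assume`-style guard in each native-dependent test — has a direct analogue that can be sketched in Python's unittest. Illustrative only; the environment variable is a hypothetical stand-in for Maven's `-Pnative` profile flag:

```python
import os
import unittest

# Stand-in for detecting whether the native profile/library is available.
NATIVE_ENABLED = os.environ.get("NATIVE_PROFILE") == "1"

class NativeCollectorTest(unittest.TestCase):
    @unittest.skipUnless(NATIVE_ENABLED, "native profile not enabled")
    def test_native_collector(self):
        # Would exercise the JNI output collector here; skipped otherwise.
        self.assertTrue(True)
```

Either approach works; the guard keeps `mvn test` green for contributors who haven't built the native bits, while the pom-level alternative avoids even loading the test classes.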
[jira] [Commented] (MAPREDUCE-311) JobClient should use multiple volumes as hadoop.tmp.dir
[ https://issues.apache.org/jira/browse/MAPREDUCE-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14070802#comment-14070802 ] Allen Wittenauer commented on MAPREDUCE-311: It's likely too late to change hadoop.tmp.dir. But this is still an issue. Debating opening a new JIRA that states the problem but not a solution so that hadoop.tmp.dir is left alone. JobClient should use multiple volumes as hadoop.tmp.dir --- Key: MAPREDUCE-311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-311 Project: Hadoop Map/Reduce Issue Type: Improvement Environment: All Reporter: Milind Bhandarkar Currently, hadoop.tmp.dir configuration variable allows specification of only a single directory to be used as scratch space. In particular, on the job launcher nodes with multiple volumes, this fails the entire job if the tmp.dir is somehow unusable. When the job launcher nodes have multiple volumes, the tmp space availability can be improved by using multiple volumes (either randomly or in round-robin.) The code for choosing a volume from a comma-separated list of multiple volumes is already there for mapred.local.dir etc. That needs to be used by job client as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
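The round-robin selection over a comma-separated directory list that the issue describes (already used for mapred.local.dir) can be sketched as follows. Illustrative Python; the function name and config shape are hypothetical:

```python
import itertools

def volume_chooser(tmp_dirs):
    """Round-robin over a comma-separated hadoop.tmp.dir-style list so
    that a single bad or full volume does not fail every job-client
    write. Returns a zero-argument chooser function."""
    volumes = [v.strip() for v in tmp_dirs.split(",") if v.strip()]
    cycle = itertools.cycle(volumes)
    return lambda: next(cycle)
```

A real implementation would also skip volumes that fail a writability probe, as the local-dir allocator does.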
[jira] [Created] (MAPREDUCE-5992) native-task test logs should not write to console
Todd Lipcon created MAPREDUCE-5992: -- Summary: native-task test logs should not write to console Key: MAPREDUCE-5992 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5992 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Todd Lipcon Assignee: Todd Lipcon Most of our unit tests are configured with a log4j.properties test resource so they don't spout a bunch of output to the console. We need to do the same for native-task. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-537) Instrument events in the capacity scheduler for collecting metrics information
[ https://issues.apache.org/jira/browse/MAPREDUCE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-537. Resolution: Incomplete I'm going to close this as stale. Instrument events in the capacity scheduler for collecting metrics information -- Key: MAPREDUCE-537 URL: https://issues.apache.org/jira/browse/MAPREDUCE-537 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Hemanth Yamijala Attachments: metrics_implementation_With_time_window.patch We need to instrument various events in the capacity scheduler so that we can collect metrics about them. This data will help us determine improvements to scheduling strategies itself. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-528) NPE in jobqueue_details.jsp page if scheduler has not started
[ https://issues.apache.org/jira/browse/MAPREDUCE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-528. Resolution: Won't Fix NPE in jobqueue_details.jsp page if scheduler has not started - Key: MAPREDUCE-528 URL: https://issues.apache.org/jira/browse/MAPREDUCE-528 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Ramya Sunil Priority: Minor Attachments: screenshot-1.jpg NullPointerException is observed in jobqueue_details.jsp page if the scheduler has not yet started -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-5993) native-task: simplify/remove dead code
Todd Lipcon created MAPREDUCE-5993: -- Summary: native-task: simplify/remove dead code Key: MAPREDUCE-5993 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5993 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Todd Lipcon The native task code has a bunch of code in it which isn't related to the map output collector. I suspect much of this is dead code. Let's remove it before we merge, so that the amount of code we have to maintain going forward is more limited. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-2166) map.input.file is not set
[ https://issues.apache.org/jira/browse/MAPREDUCE-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-2166. - Resolution: Not a Problem map.input.file is not set --- Key: MAPREDUCE-2166 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2166 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Reporter: Rares Vernica Priority: Minor Hadoop does not set the map.input.file variable. I tried the following and all I get is null.
{code}
public class Map extends Mapper<Object, Text, LongWritable, Text> {
  public void map(Object key, Text value, Context context)
      throws IOException, InterruptedException {
    Configuration conf = context.getConfiguration();
    System.out.println(conf.get("map.input.file"));
  }

  protected void setup(Context context)
      throws IOException, InterruptedException {
    Configuration conf = context.getConfiguration();
    System.out.println(conf.get("map.input.file"));
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-301) mapred.child.classpath.extension property
[ https://issues.apache.org/jira/browse/MAPREDUCE-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-301. Resolution: Fixed Already fixed via other means. mapred.child.classpath.extension property - Key: MAPREDUCE-301 URL: https://issues.apache.org/jira/browse/MAPREDUCE-301 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Klaas Bosteels It would be useful to be able to extend the classpath for the task processes on a job per job basis via a {{mapred.child.classpath.extension}} property. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-352) Avoid creating JobInProgress objects before Access checks and Queues checks are done in JobTracker submitJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-352. Resolution: Incomplete Stale with YAWN. I mean YARN. Avoid creating JobInProgress objects before Access checks and Queues checks are done in JobTracker submitJob - Key: MAPREDUCE-352 URL: https://issues.apache.org/jira/browse/MAPREDUCE-352 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: rahul k singh In JobTracker submitJob, a JobInProgress instance gets created; only after this are the access and queue-state checks done. If the checks fail, there isn't any use for these JIP objects: the only reason they were created was to read conf data before being deleted. We need to fetch only the information required to do the checks instead of creating a JobInProgress object -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5992) native-task test logs should not write to console
[ https://issues.apache.org/jira/browse/MAPREDUCE-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14070843#comment-14070843 ] Todd Lipcon commented on MAPREDUCE-5992: I realized it's not a log4j issue at all. The native code logs directly to stderr without going through log4j. We should see if we can tie it into log4j via JNI. native-task test logs should not write to console - Key: MAPREDUCE-5992 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5992 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Todd Lipcon Assignee: Todd Lipcon Most of our unit tests are configured with a log4j.properties test resource so they don't spout a bunch of output to the console. We need to do the same for native-task. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-562) A single slow (but not dead) map TaskTracker impedes MapReduce progress
[ https://issues.apache.org/jira/browse/MAPREDUCE-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-562. Resolution: Incomplete This is still an interesting issue, but at this point, I feel the need to close this one. The big reason is that this problem needs to be generalized for YARN and made much less MR specific. A single slow (but not dead) map TaskTracker impedes MapReduce progress --- Key: MAPREDUCE-562 URL: https://issues.apache.org/jira/browse/MAPREDUCE-562 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Aaron Kimball We see cases where there may be a large number of mapper nodes running many tasks (e.g., a thousand). The reducers will pull 980 of the map task intermediate files down, but will be unable to retrieve the final intermediate shards from the last node. The TaskTracker on that node returns data to reducers either slowly or not at all, but its heartbeat messages make it back to the JobTracker -- so the JobTracker doesn't mark the tasks as failed. Manually stopping the offending TaskTracker works to migrate the tasks to other nodes, where the shuffling process finishes very quickly. Left on its own, the job can take hours to unjam itself. We need a mechanism for reducers to provide feedback to the JobTracker that one of the mapper nodes should be regarded as lost. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-5994) native-task: TestBytesUtil fails
Todd Lipcon created MAPREDUCE-5994: -- Summary: native-task: TestBytesUtil fails Key: MAPREDUCE-5994 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5994 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Todd Lipcon This class appears to have some bugs. Two tests fail consistently on my system. BytesUtil itself appears to duplicate a lot of code from guava - we should probably just use the Guava functions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-385) pipes does not allow jobconf values containing commas
[ https://issues.apache.org/jira/browse/MAPREDUCE-385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-385. Resolution: Won't Fix -jobconf be dead, yo. pipes does not allow jobconf values containing commas - Key: MAPREDUCE-385 URL: https://issues.apache.org/jira/browse/MAPREDUCE-385 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Christian Kunz Assignee: Christian Kunz Attachments: patch.HADOOP-6006, patch.HADOOP-6006.0.18 Currently hadoop pipes does not allow a -jobconf key=value,key=value... commandline parameter with one or more commas in one of the values of the key-value pairs. One use case is key=mapred.join.expr, where the value is required to have commas. And it is not always convenient to add this to a configuration file. Submitter.java could easily be changed to check for backslash in front of a comma before using it as a delimiter. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-3) Set mapred.child.ulimit automatically to the value of the RAM limits for a job, if they are set
[ https://issues.apache.org/jira/browse/MAPREDUCE-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-3. -- Resolution: Fixed Who sets ulimit anymore? No one. Why? cgroups and /proc-based memory limits. Closing as stale. Set mapred.child.ulimit automatically to the value of the RAM limits for a job, if they are set --- Key: MAPREDUCE-3 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Hemanth Yamijala Memory based monitoring and scheduling allow users to set memory limits for the tasks of their jobs. This parameter is the total memory taken by the task, and any children it may launch (e.g., in the case of streaming). A related parameter is mapred.child.ulimit, which is a hard limit on the memory used by a single process of the entire task tree. For user convenience, it would be sensible for the system to set the ulimit to at least the memory required by the task, if the user has specified the latter. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-171) TestJobTrackerRestartWithLostTracker sometimes fails while validating history.
[ https://issues.apache.org/jira/browse/MAPREDUCE-171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-171. Resolution: Fixed I'm closing this as stale at this point. TestJobTrackerRestartWithLostTracker sometimes fails while validating history. -- Key: MAPREDUCE-171 URL: https://issues.apache.org/jira/browse/MAPREDUCE-171 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker, test Reporter: Amareshwari Sriramadasu Attachments: TEST-org.apache.hadoop.mapred.TestJobTrackerRestartWithLostTracker.txt TestJobTrackerRestartWithLostTracker fails with the following error:
{code}
Duplicate START_TIME seen for task task_200906151249_0001_m_01 in history file at line 54
junit.framework.AssertionFailedError: Duplicate START_TIME seen for task task_200906151249_0001_m_01 in history file at line 54
	at org.apache.hadoop.mapred.TestJobHistory$TestListener.handle(TestJobHistory.java:161)
	at org.apache.hadoop.mapred.JobHistory.parseLine(JobHistory.java:335)
	at org.apache.hadoop.mapred.JobHistory.parseHistoryFromFS(JobHistory.java:299)
	at org.apache.hadoop.mapred.TestJobHistory.validateJobHistoryFileFormat(TestJobHistory.java:478)
	at org.apache.hadoop.mapred.TestJobTrackerRestartWithLostTracker.testRecoveryWithLostTracker(TestJobTrackerRestartWithLostTracker.java:116)
	at org.apache.hadoop.mapred.TestJobTrackerRestartWithLostTracker.testRestartWithLostTracker(TestJobTrackerRestartWithLostTracker.java:162)
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-103) The TaskTracker's shell environment should not be passed to the children.
[ https://issues.apache.org/jira/browse/MAPREDUCE-103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-103. Resolution: Fixed task-controller/container-executor should have fixed this. Closing. The TaskTracker's shell environment should not be passed to the children. - Key: MAPREDUCE-103 URL: https://issues.apache.org/jira/browse/MAPREDUCE-103 Project: Hadoop Map/Reduce Issue Type: Bug Components: security Reporter: Owen O'Malley HADOOP-2838 and HADOOP-5981 added support to make the TaskTracker's shell environment available to the tasks. This has two problems: 1. It makes the task tracker's environment part of the interface to the task, which is fairly brittle. 2. Security code typically only passes along whitelisted environment variables instead of everything to prevent accidental leakage from the administrator's account. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-459) Elegant decommission of lightly loaded tasktrackers from a map-reduce cluster
[ https://issues.apache.org/jira/browse/MAPREDUCE-459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-459. Resolution: Incomplete Me too. Closing. Elegant decommission of lightly loaded tasktrackers from a map-reduce cluster Key: MAPREDUCE-459 URL: https://issues.apache.org/jira/browse/MAPREDUCE-459 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: dhruba borthakur Assignee: Namit Jain There is a need to elegantly move some machines from one map-reduce cluster to another. This JIRA is to discuss how to find lightly loaded tasktrackers that are candidates for decommissioning and then to elegantly decommission them by waiting for existing tasks to finish. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-442) Ability to re-configure hadoop daemons online
[ https://issues.apache.org/jira/browse/MAPREDUCE-442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-442. Resolution: Duplicate I'm going to dupe this to HADOOP-7001, since it's closer to reality. Other jiras tend to point to it as well. Ability to re-configure hadoop daemons online - Key: MAPREDUCE-442 URL: https://issues.apache.org/jira/browse/MAPREDUCE-442 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Amar Kamat Example : Like we have _bin hadoop mradmin -refreshNodes_ we should also have _bin hadoop mradmin -reconfigure_ which re-configures mr while the cluster is online. Few parameters like job-expiry-interval etc can be changed in this way without having to restart the whole cluster. Master, once reconfigured, can ask the slaves to reconfigure (reload its config) from a well defined location on hdfs or via heartbeat. We can have some whitelisted configs that have _reloadable_ property. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-388) pipes combiner has a large memory footprint
[ https://issues.apache.org/jira/browse/MAPREDUCE-388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-388. Resolution: Incomplete Closing this as stale. pipes combiner has a large memory footprint --- Key: MAPREDUCE-388 URL: https://issues.apache.org/jira/browse/MAPREDUCE-388 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Christian Kunz Pipes combiner implementation can have a huge memory overhead compared to the spill size. How much, depends on the record size. E.g., an application asks for 2GB memory when io.sort.mb=500, key is 16 bytes, and value is 4 bytes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-441) TestMapReduceJobControl.testJobControlWithKillJob timed out in one of the hudson runs
[ https://issues.apache.org/jira/browse/MAPREDUCE-441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-441. Resolution: Incomplete Almost certainly stale. TestMapReduceJobControl.testJobControlWithKillJob timed out in one of the hudson runs Key: MAPREDUCE-441 URL: https://issues.apache.org/jira/browse/MAPREDUCE-441 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.21.0 Reporter: Amareshwari Sriramadasu TestMapReduceJobControl.testJobControlWithKillJob timed out in one of the hudson runs @ http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/530/testReport/org.apache.hadoop.mapreduce.lib.jobcontrol/TestMapReduceJobControl/testJobControlWithKillJob/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-653) distcp can support bandwidth limiting
[ https://issues.apache.org/jira/browse/MAPREDUCE-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-653. Resolution: Won't Fix distcpv2 does this now. closing as won't fix. distcp can support bandwidth limiting - Key: MAPREDUCE-653 URL: https://issues.apache.org/jira/browse/MAPREDUCE-653 Project: Hadoop Map/Reduce Issue Type: New Feature Components: distcp Reporter: Ravi Gummadi Assignee: Ravi Gummadi Attachments: d_bw.patch, d_bw.v1.patch, d_bw.v2.patch distcp should support an option for user to specify the bandwidth limit for the distcp job. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-533) Support task preemption in Capacity Scheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-533. Resolution: Duplicate Support task preemption in Capacity Scheduler - Key: MAPREDUCE-533 URL: https://issues.apache.org/jira/browse/MAPREDUCE-533 Project: Hadoop Map/Reduce Issue Type: New Feature Components: capacity-sched Reporter: Tsz Wo Nicholas Sze Without preemption, it is not possible to guarantee capacity since long running jobs may occupy task slots for an arbitrarily long time. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-563) Security features for Map/Reduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-563. Resolution: Fixed All your jobs are belong to someone who has a Kerberos principal. Security features for Map/Reduce Key: MAPREDUCE-563 URL: https://issues.apache.org/jira/browse/MAPREDUCE-563 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Owen O'Malley This is a top-level tracking JIRA for security work we are doing in Map/reduce. Please add reference to this when opening new security related JIRAs. Logically a subpiece of HADOOP-4487. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-564) Provide a way for the client to get the number of currently running maps/reduces
[ https://issues.apache.org/jira/browse/MAPREDUCE-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-564. Resolution: Incomplete Probably stale. See also comments about new API. Provide a way for the client to get the number of currently running maps/reduces Key: MAPREDUCE-564 URL: https://issues.apache.org/jira/browse/MAPREDUCE-564 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Affects Versions: 0.21.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Attachments: MR-564.patch, MR-564.v1.patch, MR-564.v2.patch, MR-564.v3.patch, MR-564.v4.1.patch, MR-564.v4.2.patch, MR-564.v4.patch Add counters for Number of Succeeded Maps and Number of Succeeded Reduces so that client can get this number without iterating through all the task reports while the job is in progress. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-736) Undefined variable is treated as string.
[ https://issues.apache.org/jira/browse/MAPREDUCE-736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved MAPREDUCE-736. Resolution: Incomplete Stale? Undefined variable is treated as string. Key: MAPREDUCE-736 URL: https://issues.apache.org/jira/browse/MAPREDUCE-736 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Suman Sehgal Priority: Minor Attachments: hadoop_env.txt This issue is related to HADOOP-2838. For X=$X:Y (append Y to X, where X should be taken from the tasktracker): if we append to an undefined variable, the undefined variable's value should expand to blank; e.g., NEW_PATH=$NEW_PATH2:/tmp should be displayed as :/tmp in the child's environment. Instead, the variable is displayed as the literal string ($NEW_PATH2:/tmp) in the environment. This happens with the default task-controller only; the scenario works fine with the linux task-controller. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-1767) Streaming infrastructures should provide statistics about job
[ https://issues.apache.org/jira/browse/MAPREDUCE-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071061#comment-14071061 ] Antonio Piccolboni commented on MAPREDUCE-1767: --- Such as? Streaming infrastructures should provide statistics about job --- Key: MAPREDUCE-1767 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1767 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/streaming Reporter: arkady borkovsky This should include -- the commands (mapper and reducer commands) executed -- time information (e.g. min, max, and avg start time, end time, elapsed time for tasks, total elapsed time ) -- sizes -- bytes and records, min, max, avg per task and total, input and output -- information about input and output data sets (all output data sets, if there are several) -- all user counters (when they are implemented for streaming) The information should be stored in a file -- e.g. in the working directory from where the job was launched, with a name derived from the job name -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5979) MR1 FairScheduler zero weight can cause sort failures
[ https://issues.apache.org/jira/browse/MAPREDUCE-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071082#comment-14071082 ] Karthik Kambatla commented on MAPREDUCE-5979: - +1 MR1 FairScheduler zero weight can cause sort failures - Key: MAPREDUCE-5979 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5979 Project: Hadoop Map/Reduce Issue Type: Bug Components: scheduler Affects Versions: 1.2.1 Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: MAPREDUCE-5979.001.patch, MAPREDUCE-5979.002.patch When the weight is set to zero (which is possible with a custom weight adjuster), we can get failures in comparing schedulables. This is because calculating the running-tasks-to-weight ratio can result in 0.0/0.0, which ends up as NaN. Comparisons with NaN are undefined: (int)Math.signum(NaN - anyNumber) will be 0, causing different criteria to be used in the comparison, which may not be consistent. This will result in {{IllegalArgumentException: Comparison method violates its general contract!}} -- This message was sent by Atlassian JIRA (v6.2#6252)
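The NaN failure mode described here is easy to demonstrate in isolation. The method below is a simplified stand-in for a ratio-based scheduler comparison, not the actual FairScheduler code:

```java
public class NaNCompareDemo {
  // Simplified stand-in for comparing two schedulables by their
  // running-tasks-to-weight ratio. With a zero weight, 0.0 / 0.0
  // produces NaN, and any arithmetic with NaN stays NaN.
  static int compareByRatio(double tasks1, double weight1,
                            double tasks2, double weight2) {
    double r1 = tasks1 / weight1;
    double r2 = tasks2 / weight2;
    // (int) Math.signum(NaN) is 0, so a NaN ratio reports "equal"
    // against every other value, breaking the comparator contract.
    return (int) Math.signum(r1 - r2);
  }

  public static void main(String[] args) {
    System.out.println(compareByRatio(0.0, 0.0, 5.0, 1.0)); // 0: NaN vs 5.0
    System.out.println(compareByRatio(5.0, 1.0, 3.0, 1.0)); // 1: 5.0 vs 3.0
  }
}
```

A zero-weight schedulable thus compares "equal" to everything while other pairs compare normally, which is exactly the inconsistency TimSort rejects with the "general contract" IllegalArgumentException.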
[jira] [Updated] (MAPREDUCE-5979) FairScheduler: zero weight can cause sort failures
[ https://issues.apache.org/jira/browse/MAPREDUCE-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-5979: Summary: FairScheduler: zero weight can cause sort failures (was: MR1 FairScheduler zero weight can cause sort failures) FairScheduler: zero weight can cause sort failures -- Key: MAPREDUCE-5979 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5979 Project: Hadoop Map/Reduce Issue Type: Bug Components: scheduler Affects Versions: 1.2.1 Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: MAPREDUCE-5979.001.patch, MAPREDUCE-5979.002.patch When the weight is set to zero (which is possible with a custom weight adjuster), we can get failures in comparing schedulables. This is because calculating the running-tasks-to-weight ratio can result in 0.0/0.0, which ends up as NaN. Comparisons with NaN are undefined: (int)Math.signum(NaN - anyNumber) will be 0, causing different criteria to be used in the comparison, which may not be consistent. This will result in {{IllegalArgumentException: Comparison method violates its general contract!}} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5979) FairScheduler: zero weight can cause sort failures
[ https://issues.apache.org/jira/browse/MAPREDUCE-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-5979: Resolution: Fixed Fix Version/s: 1.3.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Anubhav. Just committed this to branch-1. FairScheduler: zero weight can cause sort failures -- Key: MAPREDUCE-5979 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5979 Project: Hadoop Map/Reduce Issue Type: Bug Components: scheduler Affects Versions: 1.2.1 Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 1.3.0 Attachments: MAPREDUCE-5979.001.patch, MAPREDUCE-5979.002.patch When the weight is set to zero (which is possible with a custom weight adjuster), we can get failures in comparing schedulables. This is because calculating the running-tasks-to-weight ratio can result in 0.0/0.0, which ends up as NaN. Comparisons with NaN are undefined: (int)Math.signum(NaN - anyNumber) will be 0, causing different criteria to be used in the comparison, which may not be consistent. This will result in {{IllegalArgumentException: Comparison method violates its general contract!}} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5966) MR1 FairScheduler use of custom weight adjuster is not thread safe for comparisons
[ https://issues.apache.org/jira/browse/MAPREDUCE-5966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071106#comment-14071106 ] Karthik Kambatla commented on MAPREDUCE-5966: - Looks like this patch doesn't apply anymore, maybe due to MR-5979. Can you please update it? Also, I have the following minor comments: # Reword the following comment to say "Update demands and weights of jobs and pools"
{code}
+    // Update demands of jobs and pools and update weights
{code}
# In the test case, I don't think Math.max is required anymore.
{code}
+
+// Until MAPREDUCE-5966 gets fixed we cannot have zero weight set
+return Math.max(curWeight * random, 0.001);
{code}
# We should be able to fit the following in two lines, with throws on the line after the method name?
{code}
+  public void testJobSchedulableSortingWithCustomWeightAdjuster() throws
+      IOException,
+      InterruptedException {
{code}
# Can we make all these variables final and use capital letters? Also, I don't see the need for numRacks and numNodesPerRack.
{code}
+final int iterations = 100;
+int jobCount = 100;
+int numRacks = 100;
+int numNodesPerRack = 2;
+final int totalTaskTrackers = numNodesPerRack * numRacks;
{code}
# We should probably use pure camel-caps for this variable - {{randomTtid}} MR1 FairScheduler use of custom weight adjuster is not thread safe for comparisons -- Key: MAPREDUCE-5966 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5966 Project: Hadoop Map/Reduce Issue Type: Bug Components: scheduler Affects Versions: 1.2.1 Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: MAPREDUCE-5966.001.patch When comparing JobSchedulables, one of the factors is the weight. If someone uses a custom weight adjuster, it may be called multiple times during a sort and return different values each time. That causes sorting to fail because the weight may change during the sort. 
This repros as:
{code}
java.io.IOException: java.lang.IllegalArgumentException: Comparison method violates its general contract!
	at java.util.TimSort.mergeHi(TimSort.java:868)
	at java.util.TimSort.mergeAt(TimSort.java:485)
	at java.util.TimSort.mergeCollapse(TimSort.java:410)
	at java.util.TimSort.sort(TimSort.java:214)
	at java.util.TimSort.sort(TimSort.java:173)
	at java.util.Arrays.sort(Arrays.java:659)
	at java.util.Collections.sort(Collections.java:217)
	at org.apache.hadoop.mapred.PoolSchedulable.assignTask(PoolSchedulable.java:163)
	at org.apache.hadoop.mapred.FairScheduler.assignTasks(FairScheduler.java:499)
	at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2961)
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5974) Allow map output collector fallback
[ https://issues.apache.org/jira/browse/MAPREDUCE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071174#comment-14071174 ] Chris Douglas commented on MAPREDUCE-5974: -- bq. Doing fallback as the records are emitted would be pretty neat, but may also be somewhat difficult. [snip] *nod* Fair enough, though if each MapTask is making independent decisions about the collector, they still need to agree on the format for the shuffle. Spilling one collector to disk and changing strategies should be compatible, assuming there isn't a different format for intermediate spills. But yeah, this is very abstract, given the use cases we have. If the goal is to support a fallback collector when native libs aren't available; given the dependency on intermediate format, should the swap be internal to the native collector, even in init? If the interface were like the serialization, then one might use the keytype, etc. to pick the most-appropriate collector. As failover, I'm struggling to come up with a case that's not covered by making this an internal detail of the native collector. Allow map output collector fallback --- Key: MAPREDUCE-5974 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5974 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Affects Versions: 2.6.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: mapreduce-5974.txt Currently we only allow specifying a single MapOutputCollector implementation class in a job. It would be nice to allow a comma-separated list of classes: we should try each collector implementation in the user-specified order until we find one that can be successfully instantiated and initted. This is useful for cases where a particular optimized collector implementation cannot operate on all key/value types, or requires native code. The cluster administrator can configure the cluster to try to use the optimized collector and fall back to the default collector. 
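The "try each collector in the user-specified order" behavior proposed here is straightforward to sketch. The class names below are hypothetical stand-ins; in the real patch the candidates would be reflectively instantiated from a comma-separated class list in the job configuration:

```java
public class CollectorFallback {
  interface Collector { void init() throws Exception; }

  // Hypothetical optimized collector that fails to init, e.g. when
  // native libraries are missing on this node.
  static class NativeCollector implements Collector {
    public void init() throws Exception { throw new Exception("no native libs"); }
  }

  // Hypothetical pure-Java fallback that always initializes.
  static class DefaultCollector implements Collector {
    public void init() { }
  }

  // Try each configured collector in order and return the first one
  // that can be successfully initted.
  static Collector createCollector(Collector... candidates) {
    for (Collector c : candidates) {
      try {
        c.init();
        return c;
      } catch (Exception e) {
        // log the failure and fall through to the next candidate
      }
    }
    throw new RuntimeException("no usable map output collector");
  }

  public static void main(String[] args) {
    Collector chosen = createCollector(new NativeCollector(), new DefaultCollector());
    System.out.println(chosen.getClass().getSimpleName());
  }
}
```

On a node without native libs, the optimized collector's init throws and the loop silently falls back to the default collector, which is the administrator-facing behavior the JIRA describes.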
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-5995) native-task: revert changes which expose Text internals
Todd Lipcon created MAPREDUCE-5995: -- Summary: native-task: revert changes which expose Text internals Key: MAPREDUCE-5995 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5995 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor The current branch has some changes to the Text writable which allow it to manually set the backing array, capacity, etc. Rather than exposing these internals, we should use the newly-committed facility from HADOOP-10855 to implement this. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (MAPREDUCE-5994) native-task: TestBytesUtil fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reassigned MAPREDUCE-5994: -- Assignee: Todd Lipcon Working on removing the redundant functions here native-task: TestBytesUtil fails Key: MAPREDUCE-5994 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5994 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Todd Lipcon Assignee: Todd Lipcon This class appears to have some bugs. Two tests fail consistently on my system. BytesUtil itself appears to duplicate a lot of code from guava - we should probably just use the Guava functions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-5996) native-task: Rename system tests into standard directory layout
Todd Lipcon created MAPREDUCE-5996: -- Summary: native-task: Rename system tests into standard directory layout Key: MAPREDUCE-5996 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5996 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Todd Lipcon Assignee: Todd Lipcon Currently there are a number of tests in src/java/system. This confuses IDEs, which think that the package should then be system.org.apache.hadoop instead of just org.apache.hadoop.
[jira] [Commented] (MAPREDUCE-5996) native-task: Rename system tests into standard directory layout
[ https://issues.apache.org/jira/browse/MAPREDUCE-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071224#comment-14071224 ] Todd Lipcon commented on MAPREDUCE-5996: There's also a random file called testGlibcBugSpill.out which appears to be unused by any tests. I'll remove it in this patch as well. native-task: Rename system tests into standard directory layout --- Key: MAPREDUCE-5996 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5996 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Todd Lipcon Assignee: Todd Lipcon
[jira] [Updated] (MAPREDUCE-5994) native-task: TestBytesUtil fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated MAPREDUCE-5994: --- Attachment: mapreduce-5994.txt native-task: TestBytesUtil fails Key: MAPREDUCE-5994 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5994 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: mapreduce-5994.txt
[jira] [Updated] (MAPREDUCE-5996) native-task: Rename system tests into standard directory layout
[ https://issues.apache.org/jira/browse/MAPREDUCE-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated MAPREDUCE-5996: --- Attachment: mapreduce-5996.txt native-task: Rename system tests into standard directory layout --- Key: MAPREDUCE-5996 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5996 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: mapreduce-5996.txt
[jira] [Created] (MAPREDUCE-5997) native-task: Use DirectBufferPool from Hadoop Common
Todd Lipcon created MAPREDUCE-5997: -- Summary: native-task: Use DirectBufferPool from Hadoop Common Key: MAPREDUCE-5997 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5997 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor The native task code has its own direct buffer pool, but Hadoop already has an implementation. HADOOP-10882 will move that implementation into Common, and this JIRA is to remove the duplicate code and use that one instead.
[jira] [Commented] (MAPREDUCE-5974) Allow map output collector fallback
[ https://issues.apache.org/jira/browse/MAPREDUCE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071247#comment-14071247 ] Todd Lipcon commented on MAPREDUCE-5974: In the case of the native collector, it's still the same IFile format on disk, and the same reducer. I'm not sure whether that's the case with other map collectors out there (e.g. from vendors), but I seem to recall some folks working on a specialized collector for memcmp-able keys. In that case, it might be nice to have a priority list like MemcmpableKeyCollector,NativeCollector,DefaultCollector, and each one would just throw an exception if it didn't support the types involved. Implementing this inside the native collector init() method itself might be messy -- you'd have to essentially write a wrapper collector and have every method delegate to the real implementation. I would hope that the delegation would get devirtualized and inlined, but I'm not certain about that. If you're -0 or -1 on the current approach though, I'm willing to give it a go. Allow map output collector fallback --- Key: MAPREDUCE-5974 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5974 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Affects Versions: 2.6.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: mapreduce-5974.txt
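The wrapper collector Todd describes as the messy alternative would look roughly like this. This is a hedged, self-contained sketch under the assumption of a minimal collector interface; `Collector` and `FallbackCollector` are illustrative names, not the Hadoop API, and it exists to show the per-record delegation cost he mentions.

```java
import java.util.Arrays;
import java.util.List;

// Minimal stand-in for a map output collector interface.
interface Collector {
    void init() throws Exception;
    void collect(byte[] key, byte[] value) throws Exception;
    void close() throws Exception;
}

// Wrapper whose init() picks the first working candidate; every
// other method must then delegate to the chosen implementation.
class FallbackCollector implements Collector {
    private final List<Collector> candidates;
    private Collector delegate; // chosen in init()

    FallbackCollector(List<Collector> candidates) {
        this.candidates = candidates;
    }

    public void init() throws Exception {
        for (Collector c : candidates) {
            try {
                c.init();
                delegate = c; // first candidate whose init() succeeds wins
                return;
            } catch (Exception e) {
                // fall through to the next candidate
            }
        }
        throw new Exception("no collector could be initialized");
    }

    // Each data-path call adds a virtual dispatch per record -- this is
    // the delegation overhead Todd hopes the JIT would devirtualize.
    public void collect(byte[] key, byte[] value) throws Exception {
        delegate.collect(key, value);
    }

    public void close() throws Exception {
        delegate.close();
    }
}
```

Compared with the comma-separated-list approach, the selection logic is the same; the difference is that here it hides behind one collector class, at the price of forwarding every `collect()` call.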