[jira] [Commented] (MAPREDUCE-6460) TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707800#comment-14707800 ] zhihai xu commented on MAPREDUCE-6460: -- The test fails because it does not wait for the app attempt to be unregistered from ApplicationMasterService (ApplicationMasterService#unregisterAttempt). The fix is to wait for the app to enter state {{RMAppState.KILLED}}, which ensures that {{appAttempt.masterService.unregisterAttempt(appAttemptId)}} has been called. I uploaded the patch MAPREDUCE-6460.000.patch for review. TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails --- Key: MAPREDUCE-6460 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6460 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: zhihai xu Assignee: zhihai xu Attachments: MAPREDUCE-6460.000.patch TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails with the following logs:
---
 T E S T S
---
Running org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator
Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 94.525 sec FAILURE! - in org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator
testAttemptNotFoundCausesRMCommunicatorException(org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator) Time elapsed: 2.606 sec FAILURE!
java.lang.AssertionError: Expected exception: org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocationException
    at org.junit.internal.runners.statements.ExpectException.evaluate(ExpectException.java:32)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Results :
Failed tests:
  TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException Expected exception: org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocationException
Tests run: 24, Failures: 1, Errors: 0, Skipped: 0
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
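The fix described in the comment above is a "wait until a state is reached" step before the assertion. The following is only a rough sketch of that general pattern, not the code in MAPREDUCE-6460.000.patch: the class name StateWaiter, the method waitForState, and the timeout parameters are all hypothetical, and the real test would presumably rely on the helpers already available in the YARN test classes.

```java
import java.util.function.Supplier;

/** Hypothetical helper illustrating "poll until the expected state is seen". */
final class StateWaiter {
    /**
     * Polls current.get() until it equals expected or timeoutMs elapses.
     * Returns true if the expected state was observed, false on timeout or interrupt.
     */
    static <S> boolean waitForState(Supplier<S> current, S expected, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (expected.equals(current.get())) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A test would call something like this with a supplier that reads the app state and only proceed to the allocation call once it returns true, so that the unregistration has had a chance to happen.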
[jira] [Commented] (MAPREDUCE-6415) Create a tool to combine aggregated logs into HAR files
[ https://issues.apache.org/jira/browse/MAPREDUCE-6415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706979#comment-14706979 ] Robert Kanter commented on MAPREDUCE-6415: -- Thanks for the review [~asuresh]. This is just the preliminary patch. I still have to write unit tests, javadocs, and split out the yarn changes into a YARN JIRA. But it sounds like you're good with the approach. [~aw], any other comments? How about you [~jlowe]? Create a tool to combine aggregated logs into HAR files --- Key: MAPREDUCE-6415 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6415 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.8.0 Reporter: Robert Kanter Assignee: Robert Kanter Attachments: HAR-ableAggregatedLogs_v1.pdf, MAPREDUCE-6415_branch-2_prelim_001.patch, MAPREDUCE-6415_branch-2_prelim_002.patch, MAPREDUCE-6415_prelim_001.patch, MAPREDUCE-6415_prelim_002.patch While we wait for YARN-2942 to become viable, it would still be great to improve the aggregated logs problem. We can write a tool that combines aggregated log files into a single HAR file per application, which should solve the too many files and too many blocks problems. See the design document for details. See YARN-2942 for more context. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute
[ https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707443#comment-14707443 ] Hudson commented on MAPREDUCE-6357: --- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #294 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/294/]) MAPREDUCE-6357. MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute. Contributed by Dustin Cote. (aajisaka: rev 2ba90c93d71aa2d30ee9ed431750c10c685e5599) * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute -- Key: MAPREDUCE-6357 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 2.6.0 Reporter: Ivan Mitic Assignee: Dustin Cote Fix For: 2.8.0 Attachments: MAPREDUCE-6357-1.patch After spending the afternoon debugging a user job where reduce tasks were failing on retry with the below exception, I think it would be worthwhile to add a note in the MultipleOutputs.write() documentation, saying that absolute paths may cause improper execution of tasks on retry or when MR speculative execution is enabled. 
{code}
2015-04-28 23:13:10,452 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: File already exists:wasb://full20150...@bgtstoragefull.blob.core.windows.net/user/hadoop/some/path/block-r-00299.bz2
    at org.apache.hadoop.fs.azure.NativeAzureFileSystem.create(NativeAzureFileSystem.java:1354)
    at org.apache.hadoop.fs.azure.NativeAzureFileSystem.create(NativeAzureFileSystem.java:1195)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
    at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:135)
    at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:475)
    at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:433)
    at com.ancestry.bigtree.hadoop.LevelReducer.processValue(LevelReducer.java:91)
    at com.ancestry.bigtree.hadoop.LevelReducer.reduce(LevelReducer.java:69)
    at com.ancestry.bigtree.hadoop.LevelReducer.reduce(LevelReducer.java:14)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
{code}
As discussed in MAPREDUCE-3772, when the baseOutputPath passed to MultipleOutputs.write() is an absolute path (or more precisely a path that resolves outside of the job output-dir), the concept of output committing is not utilized.
In this case, the user read through the MultipleOutputs docs and assumed that everything would work fine, as there are blog posts saying that MultipleOutputs does handle output commit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
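To make the condition "a path that resolves outside of the job output-dir" concrete, here is a small user-side check. It is only a sketch under stated assumptions: OutputPathCheck and staysInsideOutputDir are hypothetical names, and it uses java.nio.file path semantics as a stand-in for Hadoop's own path resolution, which may differ (for example for URIs such as wasb://).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical pre-flight check; not part of MultipleOutputs. */
final class OutputPathCheck {
    /**
     * Returns true if baseOutputPath, resolved against jobOutputDir,
     * still lies under jobOutputDir. Absolute paths and paths that
     * escape via ".." return false.
     */
    static boolean staysInsideOutputDir(String jobOutputDir, String baseOutputPath) {
        Path root = Paths.get(jobOutputDir).normalize();
        // resolve() returns the argument itself when it is absolute.
        Path resolved = root.resolve(baseOutputPath).normalize();
        return resolved.startsWith(root);
    }
}
```

A job that gets false from a check like this is in the situation the issue describes: output written there is not covered by output committing, so a retried or speculative attempt can collide with files left by an earlier attempt.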
[jira] [Commented] (MAPREDUCE-6454) MapReduce doesn't set the HADOOP_CLASSPATH for jar lib in distributed cache.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707390#comment-14707390 ] Allen Wittenauer commented on MAPREDUCE-6454: - bq. This is because HADOOP_CLASSPATH is not part of the default white-listed environment that goes from YARN to the apps. If I have HADOOP_CLASSPATH=foo in hadoop-env.sh, when I run a shell command (say hadoop version) as part of my app, that's going to overwrite whatever Hadoop tries to set it to. The whitelist is completely irrelevant. MapReduce doesn't set the HADOOP_CLASSPATH for jar lib in distributed cache. Key: MAPREDUCE-6454 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6454 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Junping Du Assignee: Junping Du Priority: Critical Fix For: 2.7.2, 2.6.2 Attachments: MAPREDUCE-6454-v2.1.patch, MAPREDUCE-6454-v2.patch, MAPREDUCE-6454-v3.1.patch, MAPREDUCE-6454-v3.patch, MAPREDUCE-6454.patch We already set lib jars on distributed-cache to CLASSPATH. However, in some corner cases (like: MR local mode, Hive Map side local join, etc.), we need these jars on HADOOP_CLASSPATH so hadoop scripts can take it in launching runjar process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6423) MapOutput Sampler
[ https://issues.apache.org/jira/browse/MAPREDUCE-6423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-6423: - Status: Open (was: Patch Available) MapOutput Sampler - Key: MAPREDUCE-6423 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6423 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Ram Manohar Bheemana Assignee: Ram Manohar Bheemana Priority: Minor Attachments: MapOutputSampler.java Need a sampler based on the MapOutput keys. The current InputSampler implementation has a major drawback: it requires the input and output of a mapper to be the same, which is generally not the case. Approach:
1. Create a Sampler which samples the data based on the input.
2. Run a small MapReduce job in uber task mode, using the original job mapper and an identity reducer, to generate the required MapOutputSample keys.
3. Optionally, specify which input files to sample. For example, given input files A and B, we should be able to specify that only file A is used for sampling.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
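The approach in the issue description can be illustrated in miniature without Hadoop. The sketch below is not the attached MapOutputSampler.java; the class MapOutputKeySampler and its methods are hypothetical, the "mapper" is a plain java.util.function.Function, and taking every step-th record is an assumed sampling strategy chosen for simplicity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical in-memory model of sampling map output keys. */
final class MapOutputKeySampler {
    /** Applies the mapper to every step-th input record and returns the sorted output keys. */
    static <I, K extends Comparable<K>> List<K> sampleKeys(List<I> input, Function<I, K> mapper, int step) {
        if (step < 1) {
            throw new IllegalArgumentException("step must be >= 1");
        }
        List<K> keys = new ArrayList<>();
        for (int i = 0; i < input.size(); i += step) {
            keys.add(mapper.apply(input.get(i)));
        }
        Collections.sort(keys);
        return keys;
    }

    /** Picks numPartitions - 1 evenly spaced split points from the sorted sample. */
    static <K> List<K> splitPoints(List<K> sortedKeys, int numPartitions) {
        List<K> splits = new ArrayList<>();
        if (sortedKeys.isEmpty()) {
            return splits;
        }
        for (int p = 1; p < numPartitions; p++) {
            splits.add(sortedKeys.get(p * sortedKeys.size() / numPartitions));
        }
        return splits;
    }
}
```

The point of the design is that the split points are computed from what the mapper emits rather than from the raw input, so they remain valid when the mapper changes the key type or distribution.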
[jira] [Commented] (MAPREDUCE-6454) MapReduce doesn't set the HADOOP_CLASSPATH for jar lib in distributed cache.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707305#comment-14707305 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-6454: bq. Just so it's on record so when someone hits this problem: this is fragile and subject to breakage, regardless of the version of hadoop in play. It all depends upon how users have HADOOP_CLASSPATH configured in hadoop-env.sh and yarn-env.sh. It is a bit fragile, for sure, but it doesn't by default depend on what is configured in *-env.sh like you said. This is because HADOOP_CLASSPATH is not part of the default white-listed environment that goes from YARN to the apps. MapReduce doesn't set the HADOOP_CLASSPATH for jar lib in distributed cache. Key: MAPREDUCE-6454 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6454 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Junping Du Assignee: Junping Du Priority: Critical Fix For: 2.7.2, 2.6.2 Attachments: MAPREDUCE-6454-v2.1.patch, MAPREDUCE-6454-v2.patch, MAPREDUCE-6454-v3.1.patch, MAPREDUCE-6454-v3.patch, MAPREDUCE-6454.patch We already set lib jars on distributed-cache to CLASSPATH. However, in some corner cases (like: MR local mode, Hive Map side local join, etc.), we need these jars on HADOOP_CLASSPATH so hadoop scripts can take it in launching runjar process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute
[ https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707319#comment-14707319 ] Hudson commented on MAPREDUCE-6357: --- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #291 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/291/]) MAPREDUCE-6357. MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute. Contributed by Dustin Cote. (aajisaka: rev 2ba90c93d71aa2d30ee9ed431750c10c685e5599) * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute -- Key: MAPREDUCE-6357 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 2.6.0 Reporter: Ivan Mitic Assignee: Dustin Cote Fix For: 2.8.0 Attachments: MAPREDUCE-6357-1.patch After spending the afternoon debugging a user job where reduce tasks were failing on retry with the below exception, I think it would be worthwhile to add a note in the MultipleOutputs.write() documentation, saying that absolute paths may cause improper execution of tasks on retry or when MR speculative execution is enabled. 
[jira] [Commented] (MAPREDUCE-6454) MapReduce doesn't set the HADOOP_CLASSPATH for jar lib in distributed cache.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707401#comment-14707401 ] Allen Wittenauer commented on MAPREDUCE-6454: - Here's a test you can do to prove my point.
$ echo HADOOP_CLASSPATH=/tmp >> $HADOOP_CONF_DIR/hadoop-env.sh
$ hadoop classpath
You should see /tmp
$ HADOOP_CLASSPATH=/etc hadoop classpath
You'll still see /tmp. You won't see /etc. (Well, unless your classpath is really weird already.)
MapReduce doesn't set the HADOOP_CLASSPATH for jar lib in distributed cache. Key: MAPREDUCE-6454 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6454 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Junping Du Assignee: Junping Du Priority: Critical Fix For: 2.7.2, 2.6.2 Attachments: MAPREDUCE-6454-v2.1.patch, MAPREDUCE-6454-v2.patch, MAPREDUCE-6454-v3.1.patch, MAPREDUCE-6454-v3.patch, MAPREDUCE-6454.patch We already set lib jars on distributed-cache to CLASSPATH. However, in some corner cases (like: MR local mode, Hive Map side local join, etc.), we need these jars on HADOOP_CLASSPATH so hadoop scripts can take it in launching runjar process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
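The behaviour in the test above comes down to an unconditional assignment in a sourced script replacing whatever value the parent process exported. The sketch below models only that mechanism and does not invoke the real hadoop scripts: EnvOverrideDemo and its run method are hypothetical, the inline script stands in for hadoop-env.sh, and it assumes a POSIX sh is on the PATH.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo of a child shell overwriting an inherited variable. */
final class EnvOverrideDemo {
    /**
     * Runs "sh -c script" with HADOOP_CLASSPATH preset to inherited
     * and returns the first line the script prints.
     */
    static String run(String inherited, String script) {
        try {
            ProcessBuilder pb = new ProcessBuilder("sh", "-c", script);
            pb.environment().put("HADOOP_CLASSPATH", inherited);
            pb.redirectErrorStream(true);
            Process p = pb.start();
            String line;
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
                line = r.readLine();
            }
            p.waitFor();
            return line;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unconditional assignment, as in the hadoop-env.sh line from the comment.
        System.out.println(run("/etc", "HADOOP_CLASSPATH=/tmp; echo \"$HADOOP_CLASSPATH\""));
        // Assignment that keeps an inherited value if one is set.
        System.out.println(run("/etc", "HADOOP_CLASSPATH=\"${HADOOP_CLASSPATH:-/tmp}\"; echo \"$HADOOP_CLASSPATH\""));
    }
}
```

In this model the first script discards the parent's value and the second keeps it, which is why a value passed down by the framework is only as reliable as the way the env script assigns the variable.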
[jira] [Updated] (MAPREDUCE-6434) Add support for PartialFileOutputCommiter when checkpointing is an option during preemption
[ https://issues.apache.org/jira/browse/MAPREDUCE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Augusto Souza updated MAPREDUCE-6434: - Attachment: MAPREDUCE-6434.006.patch Add support for PartialFileOutputCommiter when checkpointing is an option during preemption --- Key: MAPREDUCE-6434 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6434 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Augusto Souza Assignee: Augusto Souza Attachments: MAPREDUCE-6434.001.patch, MAPREDUCE-6434.002.patch, MAPREDUCE-6434.003.patch, MAPREDUCE-6434.004.patch, MAPREDUCE-6434.005.patch, MAPREDUCE-6434.006.patch Finish up some renaming work related to the annotation @Preemptable (it should be @Checkpointable now) and help in the splitting of patch in MAPREDUCE-5269 that is too large for being reviewed or accepted by Jenkins CI scripts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6423) MapOutput Sampler
[ https://issues.apache.org/jira/browse/MAPREDUCE-6423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707436#comment-14707436 ] Chris Douglas commented on MAPREDUCE-6423: -- Thanks for taking a look at this. That the sampler only works on input data was always a weakness for jobs requiring their output be totally ordered. Could you generate a patch? The contribution wiki is [here|http://wiki.apache.org/hadoop/HowToContribute]. It might be easier for others to use if the Mapper was integrated with the InputSampler, but a separate tool is still an improvement. MapOutput Sampler - Key: MAPREDUCE-6423 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6423 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Ram Manohar Bheemana Assignee: Ram Manohar Bheemana Priority: Minor Attachments: MapOutputSampler.java Need a sampler based on the MapOutput keys. The current InputSampler implementation has a major drawback: it requires the input and output of a mapper to be the same, which is generally not the case. Approach:
1. Create a Sampler which samples the data based on the input.
2. Run a small MapReduce job in uber task mode, using the original job mapper and an identity reducer, to generate the required MapOutputSample keys.
3. Optionally, specify which input files to sample. For example, given input files A and B, we should be able to specify that only file A is used for sampling.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute
[ https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707349#comment-14707349 ] Hudson commented on MAPREDUCE-6357: --- FAILURE: Integrated in Hadoop-Yarn-trunk #1024 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1024/]) MAPREDUCE-6357. MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute. Contributed by Dustin Cote. (aajisaka: rev 2ba90c93d71aa2d30ee9ed431750c10c685e5599) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java * hadoop-mapreduce-project/CHANGES.txt MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute -- Key: MAPREDUCE-6357 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 2.6.0 Reporter: Ivan Mitic Assignee: Dustin Cote Fix For: 2.8.0 Attachments: MAPREDUCE-6357-1.patch After spending the afternoon debugging a user job where reduce tasks were failing on retry with the below exception, I think it would be worthwhile to add a note in the MultipleOutputs.write() documentation, saying that absolute paths may cause improper execution of tasks on retry or when MR speculative execution is enabled. 
[jira] [Updated] (MAPREDUCE-6456) Support configurable log aggregation policy
[ https://issues.apache.org/jira/browse/MAPREDUCE-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated MAPREDUCE-6456: - Assignee: Ming Ma Support configurable log aggregation policy --- Key: MAPREDUCE-6456 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6456 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Ming Ma Assignee: Ming Ma YARN-221 provides a way for a YARN application to specify its log aggregation policy via LogAggregationContext. This jira covers the changes needed in MR to use that feature, so that any MR job can specify its log aggregation policy via job configuration. That includes:
* Have MR define its own configurations to configure these policies.
* Change YarnRunner to retrieve these configurations and set the values via LogAggregationContext.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6434) Add support for PartialFileOutputCommiter when checkpointing is an option during preemption
[ https://issues.apache.org/jira/browse/MAPREDUCE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707130#comment-14707130 ] Chris Douglas commented on MAPREDUCE-6434: -- Offhand, I'd guess adding {{TaskType.REDUCE.equals(context.getTaskAttemptID().getTaskType())}} to the expression would prevent it from affecting more than reducers, but I haven't looked into it. Could you test with a map-only job, where {{context.getReducerClass()}} is undefined or not on the classpath? Add support for PartialFileOutputCommiter when checkpointing is an option during preemption --- Key: MAPREDUCE-6434 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6434 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Augusto Souza Assignee: Augusto Souza Attachments: MAPREDUCE-6434.001.patch, MAPREDUCE-6434.002.patch, MAPREDUCE-6434.003.patch, MAPREDUCE-6434.004.patch, MAPREDUCE-6434.005.patch Finish up some renaming work related to the annotation @Preemptable (it should be @Checkpointable now) and help in the splitting of patch in MAPREDUCE-5269 that is too large for being reviewed or accepted by Jenkins CI scripts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
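The suggestion in the comment above relies on short-circuit evaluation: if the task-type check comes first, the reducer class is never consulted for map tasks. The sketch below is not the patch under review. CommitterChoice, its nested TaskType enum (a stand-in for the Hadoop enum of the same name), the Choice enum, and the BooleanSupplier standing in for "the reducer class is checkpointable" are all hypothetical.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical model of guarding a committer choice by task type. */
final class CommitterChoice {
    /** Stand-in for org.apache.hadoop.mapreduce.TaskType, reduced to two values. */
    enum TaskType { MAP, REDUCE }

    enum Choice { PARTIAL, DEFAULT }

    /**
     * Chooses the partial committer only for reduce tasks whose reducer is
     * checkpointable. Because && short-circuits, reducerIsCheckpointable is
     * not evaluated for map tasks, even if evaluating it would throw.
     */
    static Choice choose(TaskType taskType, BooleanSupplier reducerIsCheckpointable) {
        if (TaskType.REDUCE.equals(taskType) && reducerIsCheckpointable.getAsBoolean()) {
            return Choice.PARTIAL;
        }
        return Choice.DEFAULT;
    }
}
```

Ordering the operands this way is what would protect a map-only job where looking up the reducer class fails, which is the case the comment asks to be tested.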
[jira] [Commented] (MAPREDUCE-6434) Add support for PartialFileOutputCommiter when checkpointing is an option during preemption
[ https://issues.apache.org/jira/browse/MAPREDUCE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707248#comment-14707248 ] Augusto Souza commented on MAPREDUCE-6434: -- Thank you very much [~chris.douglas]! I tested the version I submitted before with a map-only job. I think the IdentityReducer is used when the number of reduce tasks is set to zero and no reducer class is set, so the previous patch doesn't crash. Am I right in this assumption? Is there another way of defining jobs that forces {{context.getReducerClass()}} to be undefined? In any case, I think your feedback is valid, so I am adding another condition to the expression to make sure the PartialFileOutputCommiter can only be instantiated for reduce tasks. If there is a way of making {{context.getReducerClass()}} undefined, I can try to write better tests for the patch too. Add support for PartialFileOutputCommiter when checkpointing is an option during preemption --- Key: MAPREDUCE-6434 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6434 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Augusto Souza Assignee: Augusto Souza Attachments: MAPREDUCE-6434.001.patch, MAPREDUCE-6434.002.patch, MAPREDUCE-6434.003.patch, MAPREDUCE-6434.004.patch, MAPREDUCE-6434.005.patch Finish up some renaming work related to the annotation @Preemptable (it should be @Checkpointable now) and help in the splitting of patch in MAPREDUCE-5269 that is too large for being reviewed or accepted by Jenkins CI scripts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6458) Figure out the way to pass build-in classpath (files in distributed cache, etc.) from parent to spawned shells
[ https://issues.apache.org/jira/browse/MAPREDUCE-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6458: Attachment: MAPREDUCE-6458.00.patch -00:
* MAPREDUCE-6454 but with HADOOP_CLASSPATH renamed to the not-already-used HADOOP_TASK_CLASSPATH
* added finalize code for HADOOP_TASK_CLASSPATH via a new hadoop_add_task_classpath function.
* hadoop_add_task_classpath safely verifies the path is valid, puts it in a decent place order-wise, etc, etc
* added shell unit tests for hadoop_add_task_classpath
* modified shell unit tests for hadoop_finalize_classpath
* added HADOOP_TASK_CLASSPATH to hadoop-config.cmd for Windows
Figure out the way to pass build-in classpath (files in distributed cache, etc.) from parent to spawned shells -- Key: MAPREDUCE-6458 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6458 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Junping Du Assignee: Allen Wittenauer Attachments: MAPREDUCE-6458.00.patch In MAPREDUCE-6454 (targeted at branch-2.x), we provide a constrained way to pass the built-in classpath from the parent to the child shell, via HADOOP_CLASSPATH, so jars in the distributed cache can still work in child tasks. In trunk, we may want a different approach, such as introducing an additional env var to safely pass the built-in classpath. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6458) Figure out the way to pass build-in classpath (files in distributed cache, etc.) from parent to spawned shells
[ https://issues.apache.org/jira/browse/MAPREDUCE-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-6458: Status: Patch Available (was: Open) Figure out the way to pass build-in classpath (files in distributed cache, etc.) from parent to spawned shells -- Key: MAPREDUCE-6458 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6458 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Junping Du Assignee: Allen Wittenauer Attachments: MAPREDUCE-6458.00.patch In MAPREDUCE-6454 (targeted at branch-2.x), we provide a constrained way to pass the built-in classpath from the parent to the child shell, via HADOOP_CLASSPATH, so jars in the distributed cache can still work in child tasks. In trunk, we may want a different approach, such as introducing an additional env var to safely pass the built-in classpath. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute
[ https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707589#comment-14707589 ]

Hudson commented on MAPREDUCE-6357:
---
FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #283 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/283/])
MAPREDUCE-6357. MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute. Contributed by Dustin Cote. (aajisaka: rev 2ba90c93d71aa2d30ee9ed431750c10c685e5599)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java
* hadoop-mapreduce-project/CHANGES.txt

MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute
--
Key: MAPREDUCE-6357
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: documentation
Affects Versions: 2.6.0
Reporter: Ivan Mitic
Assignee: Dustin Cote
Fix For: 2.8.0
Attachments: MAPREDUCE-6357-1.patch

After spending the afternoon debugging a user job where reduce tasks were failing on retry with the below exception, I think it would be worthwhile to add a note in the MultipleOutputs.write() documentation, saying that absolute paths may cause improper execution of tasks on retry or when MR speculative execution is enabled.

{code}
2015-04-28 23:13:10,452 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: File already exists:wasb://full20150...@bgtstoragefull.blob.core.windows.net/user/hadoop/some/path/block-r-00299.bz2
	at org.apache.hadoop.fs.azure.NativeAzureFileSystem.create(NativeAzureFileSystem.java:1354)
	at org.apache.hadoop.fs.azure.NativeAzureFileSystem.create(NativeAzureFileSystem.java:1195)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
	at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:135)
	at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:475)
	at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:433)
	at com.ancestry.bigtree.hadoop.LevelReducer.processValue(LevelReducer.java:91)
	at com.ancestry.bigtree.hadoop.LevelReducer.reduce(LevelReducer.java:69)
	at com.ancestry.bigtree.hadoop.LevelReducer.reduce(LevelReducer.java:14)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
{code}

As discussed in MAPREDUCE-3772, when the baseOutputPath passed to MultipleOutputs.write() is an absolute path (or more precisely a path that resolves outside of the job output-dir), the concept of output committing is not utilized.

In this case, the user read thru the MultipleOutputs docs and was assuming that everything will be working fine, as there are blog posts saying that MultipleOutputs does handle output commit.
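The distinction drawn in the description, between a baseOutputPath that stays under the job output directory and one that resolves outside of it, can be illustrated with a small standalone check. This is only a sketch under stated assumptions: the class `OutputPathCheck`, the method `resolvesInsideOutputDir`, and the example `hdfs://namenode/...` location are hypothetical and not part of Hadoop, and plain `java.net.URI` resolution is used as a stand-in for the way Hadoop combines a work directory with the base output path.

```java
import java.net.URI;

// Hypothetical helper, not part of Hadoop: approximates the "resolves outside
// of the job output-dir" condition using plain java.net.URI resolution.
final class OutputPathCheck {

    // Returns true when baseOutputPath, resolved against outputDir, stays under outputDir.
    static boolean resolvesInsideOutputDir(String outputDir, String baseOutputPath) {
        // A trailing slash makes URI.resolve treat outputDir as a directory.
        String dir = outputDir.endsWith("/") ? outputDir : outputDir + "/";
        URI base = URI.create(dir).normalize();
        URI resolved = base.resolve(baseOutputPath).normalize();
        return resolved.toString().startsWith(base.toString());
    }

    public static void main(String[] args) {
        String outputDir = "hdfs://namenode/user/hadoop/out"; // assumed example location
        // Relative name: stays under the output directory.
        System.out.println(resolvesInsideOutputDir(outputDir, "level1/block"));
        // Absolute path: replaces the output directory entirely.
        System.out.println(resolvesInsideOutputDir(outputDir, "/user/hadoop/some/path/block"));
        // Relative name that climbs out of the output directory.
        System.out.println(resolvesInsideOutputDir(outputDir, "../elsewhere/block"));
    }
}
```

A job could run a check of this kind before calling MultipleOutputs.write() and treat a `false` result as a warning that retried or speculative task attempts may collide on the same file, as in the "File already exists" exception above.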
[jira] [Commented] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute
[ https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707605#comment-14707605 ]

Hudson commented on MAPREDUCE-6357:
---
SUCCESS: Integrated in Hadoop-Hdfs-trunk #2221 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2221/])
MAPREDUCE-6357. MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute. Contributed by Dustin Cote. (aajisaka: rev 2ba90c93d71aa2d30ee9ed431750c10c685e5599)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java

MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute
--
Key: MAPREDUCE-6357
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: documentation
Affects Versions: 2.6.0
Reporter: Ivan Mitic
Assignee: Dustin Cote
Fix For: 2.8.0
Attachments: MAPREDUCE-6357-1.patch
[jira] [Commented] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute
[ https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707542#comment-14707542 ]

Hudson commented on MAPREDUCE-6357:
---
SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2240 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2240/])
MAPREDUCE-6357. MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute. Contributed by Dustin Cote. (aajisaka: rev 2ba90c93d71aa2d30ee9ed431750c10c685e5599)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java

MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute
--
Key: MAPREDUCE-6357
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: documentation
Affects Versions: 2.6.0
Reporter: Ivan Mitic
Assignee: Dustin Cote
Fix For: 2.8.0
Attachments: MAPREDUCE-6357-1.patch
[jira] [Commented] (MAPREDUCE-6434) Add support for PartialFileOutputCommiter when checkpointing is an option during preemption
[ https://issues.apache.org/jira/browse/MAPREDUCE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707498#comment-14707498 ]

Chris Douglas commented on MAPREDUCE-6434:
--
Agreed, the NPE is usually not a problem since the default should be defined in mapred-defaults, though {{JobContextImpl::getReducerClass}} can return null. At least two cases shouldn't cause a problem for map-only jobs:
# The base {{mapreduce.Reducer}} is {{\@Checkpointable}}, so it would instantiate a {{PartialFileOutputCommitter}}
# A {{Reducer}} in the config shouldn't cause a map-only job to fail if it's not on the classpath (this may not be true in the current code, but this shouldn't add another case)

We also don't want to do anything surprising for setup/cleanup tasks.

Add support for PartialFileOutputCommiter when checkpointing is an option during preemption
---
Key: MAPREDUCE-6434
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6434
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Augusto Souza
Assignee: Augusto Souza
Attachments: MAPREDUCE-6434.001.patch, MAPREDUCE-6434.002.patch, MAPREDUCE-6434.003.patch, MAPREDUCE-6434.004.patch, MAPREDUCE-6434.005.patch, MAPREDUCE-6434.006.patch

Finish up some renaming work related to the annotation @Preemptable (it should be @Checkpointable now) and help split the patch in MAPREDUCE-5269, which is too large to be reviewed or accepted by the Jenkins CI scripts.
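The two map-only cases raised in the comment above can be pictured with a small standalone sketch. Everything in it is an assumption made for illustration: the `Checkpointable` annotation is a local stand-in declared inside the example, and `CommitterChoiceSketch`, `wantsPartialCommitter`, `CheckpointableReducer` and `PlainReducer` are hypothetical names rather than code from the patch. It only shows one possible ordering of checks, in which the reduce-task count is consulted before the reducer class, so that a null or irrelevant reducer class cannot affect a map-only job.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical sketch, not taken from the MAPREDUCE-6434 patch.
final class CommitterChoiceSketch {

    // Local stand-in for the real annotation; RUNTIME retention is needed for reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Checkpointable {}

    @Checkpointable
    static class CheckpointableReducer {}

    static class PlainReducer {}

    // Decides whether a partial (checkpoint-aware) committer would be wanted.
    static boolean wantsPartialCommitter(int numReduceTasks, Class<?> reducerClass) {
        if (numReduceTasks == 0) {
            return false; // map-only job: the reducer class is never consulted
        }
        if (reducerClass == null) {
            return false; // a missing reducer class must not raise an NPE
        }
        return reducerClass.isAnnotationPresent(Checkpointable.class);
    }

    public static void main(String[] args) {
        System.out.println(wantsPartialCommitter(0, CheckpointableReducer.class));
        System.out.println(wantsPartialCommitter(1, null));
        System.out.println(wantsPartialCommitter(1, CheckpointableReducer.class));
        System.out.println(wantsPartialCommitter(1, PlainReducer.class));
    }
}
```

Checking the reduce-task count first is a design choice of this sketch: it keeps the reducer class from being loaded or inspected at all for map-only jobs, which is the behaviour the comment asks not to break.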
[jira] [Created] (MAPREDUCE-6460) TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
zhihai xu created MAPREDUCE-6460:

Summary: TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
Key: MAPREDUCE-6460
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6460
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: zhihai xu
Assignee: zhihai xu

TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails with the following logs:

---
 T E S T S
---
Running org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator
Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 94.525 sec FAILURE! - in org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator
testAttemptNotFoundCausesRMCommunicatorException(org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator) Time elapsed: 2.606 sec FAILURE!
java.lang.AssertionError: Expected exception: org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocationException
	at org.junit.internal.runners.statements.ExpectException.evaluate(ExpectException.java:32)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

Results :

Failed tests:
  TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException Expected exception: org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocationException

Tests run: 24, Failures: 1, Errors: 0, Skipped: 0
[jira] [Updated] (MAPREDUCE-6460) TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated MAPREDUCE-6460:
-
    Component/s: test

TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
---
Key: MAPREDUCE-6460
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6460
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: test
Reporter: zhihai xu
Assignee: zhihai xu
[jira] [Updated] (MAPREDUCE-6460) TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated MAPREDUCE-6460:
-
    Attachment: MAPREDUCE-6460.000.patch

TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
---
Key: MAPREDUCE-6460
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6460
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: test
Reporter: zhihai xu
Assignee: zhihai xu
Attachments: MAPREDUCE-6460.000.patch
[jira] [Updated] (MAPREDUCE-6460) TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated MAPREDUCE-6460:
-
    Attachment: (was: MAPREDUCE-6460.000.patch)

TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
---
Key: MAPREDUCE-6460
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6460
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: test
Reporter: zhihai xu
Assignee: zhihai xu
Attachments: MAPREDUCE-6460.000.patch
[jira] [Commented] (MAPREDUCE-6458) Figure out the way to pass build-in classpath (files in distributed cache, etc.) from parent to spawned shells
[ https://issues.apache.org/jira/browse/MAPREDUCE-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707669#comment-14707669 ]

Hadoop QA commented on MAPREDUCE-6458:
--
\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 20m 18s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 47s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 3m 4s | There were no new checkstyle issues. |
| {color:green}+1{color} | shellcheck | 0m 6s | There were no new shellcheck (v0.3.3) issues. |
| {color:red}-1{color} | whitespace | 0m 1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 5m 40s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 23m 17s | Tests passed in hadoop-common. |
| {color:red}-1{color} | mapreduce tests | 0m 18s | Tests failed in hadoop-mapreduce-client-app. |
| {color:red}-1{color} | mapreduce tests | 0m 17s | Tests failed in hadoop-mapreduce-client-common. |
| {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-api. |
| | | 73m 13s | |
\\
\\
|| Reason || Tests ||
| Failed build | hadoop-mapreduce-client-app |
| | hadoop-mapreduce-client-common |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12751766/MAPREDUCE-6458.00.patch |
| Optional Tests | shellcheck javac unit javadoc findbugs checkstyle |
| git revision | trunk / 22de7c1 |
| whitespace | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5949/artifact/patchprocess/whitespace.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5949/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-mapreduce-client-app test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5949/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt |
| hadoop-mapreduce-client-common test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5949/artifact/patchprocess/testrun_hadoop-mapreduce-client-common.txt |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5949/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5949/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5949/console |

This message was automatically generated.

Figure out the way to pass build-in classpath (files in distributed cache, etc.) from parent to spawned shells
--
Key: MAPREDUCE-6458
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6458
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Junping Du
Assignee: Allen Wittenauer
Attachments: MAPREDUCE-6458.00.patch
[jira] [Updated] (MAPREDUCE-6460) TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated MAPREDUCE-6460:
-
    Status: Patch Available (was: Open)

TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
---
Key: MAPREDUCE-6460
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6460
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: test
Reporter: zhihai xu
Assignee: zhihai xu
Attachments: MAPREDUCE-6460.000.patch
[jira] [Updated] (MAPREDUCE-6460) TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated MAPREDUCE-6460:
-
    Attachment: MAPREDUCE-6460.000.patch

TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
---
Key: MAPREDUCE-6460
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6460
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: test
Reporter: zhihai xu
Assignee: zhihai xu
Attachments: MAPREDUCE-6460.000.patch
[jira] [Commented] (MAPREDUCE-6415) Create a tool to combine aggregated logs into HAR files
[ https://issues.apache.org/jira/browse/MAPREDUCE-6415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706281#comment-14706281 ]

Arun Suresh commented on MAPREDUCE-6415:

[~rkanter], The patch looks good to me. You might want to clean up the TODOs and add some javaDocs though. +1 pending that.

Create a tool to combine aggregated logs into HAR files

Key: MAPREDUCE-6415
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6415
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 2.8.0
Reporter: Robert Kanter
Assignee: Robert Kanter
Attachments: HAR-ableAggregatedLogs_v1.pdf, MAPREDUCE-6415_branch-2_prelim_001.patch, MAPREDUCE-6415_branch-2_prelim_002.patch, MAPREDUCE-6415_prelim_001.patch, MAPREDUCE-6415_prelim_002.patch

While we wait for YARN-2942 to become viable, it would still be great to improve the aggregated logs problem. We can write a tool that combines aggregated log files into a single HAR file per application, which should solve the too many files and too many blocks problems. See the design document for details. See YARN-2942 for more context.
[jira] [Updated] (MAPREDUCE-6459) native task crashes when merging spilled file on ppc64
[ https://issues.apache.org/jira/browse/MAPREDUCE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Jie updated MAPREDUCE-6459:
Attachment: ppc64_error.txt

native task crashes when merging spilled file on ppc64

Key: MAPREDUCE-6459
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6459
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.6.0
Environment: Linux version 2.6.32-431.el6.ppc64
Reporter: Tao Jie
Attachments: ppc64_error.txt

When running a native task on ppc64, merging spilled files fails because the local spill file cannot be deserialized correctly. In the readVLong function in WritableUtils.h and Buffers.h, we compare a char with a number and convert a char to int64_t. This does not work correctly on ppc64 because the definition of char differs between ppc64 and x86: on x86, char is signed, while on ppc64, char is unsigned. As a result, we write the EOF marker [-1, -1] at the end of a spill partition but deserialize the chars as [255, 255].
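
The signedness problem described above can be illustrated in Java, where byte is always signed. This is only a sketch under stated assumptions: the class and method names are hypothetical, and it models how a signed and an unsigned char would each interpret the marker byte rather than reproducing the native readVLong code. It also assumes the Hadoop variable-length encoding in which values from -112 to 127 occupy a single byte, so that -1 is stored as the byte 0xFF.

```java
// Hypothetical illustration of the char signedness issue; not the native task code.
final class CharSignednessSketch {

    // Assumed lower bound of the single-byte range in the variable-length long encoding.
    static final int SINGLE_BYTE_MIN = -112;

    // Interprets the raw byte the way a platform with signed char (x86) would.
    static long readAsSignedChar(byte raw) {
        return raw; // sign extension: 0xFF becomes -1
    }

    // Interprets the raw byte the way a platform with unsigned char (ppc64) would.
    static long readAsUnsignedChar(byte raw) {
        return raw & 0xFF; // zero extension: 0xFF becomes 255
    }

    // The single-byte check passes for both interpretations, so nothing fails at this step.
    static boolean isSingleByteValue(long firstByte) {
        return firstByte >= SINGLE_BYTE_MIN;
    }

    public static void main(String[] args) {
        byte raw = (byte) -1; // the byte written for the marker value -1
        long signed = readAsSignedChar(raw);
        long unsigned = readAsUnsignedChar(raw);
        System.out.println("signed char view:   " + signed + ", single byte: " + isSingleByteValue(signed));
        System.out.println("unsigned char view: " + unsigned + ", single byte: " + isSingleByteValue(unsigned));
    }
}
```

Both interpretations take the single-byte path, which is why the mismatch only shows up later, when the decoded value is compared with the expected marker. In C or C++, declaring the variable as signed char (or int8_t) instead of plain char is one common way to get the same behavior on both platforms; whether the attached patch does this is not stated here.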
[jira] [Updated] (MAPREDUCE-6458) Figure out the way to pass build-in classpath (files in distributed cache, etc.) from parent to spawned shells
[ https://issues.apache.org/jira/browse/MAPREDUCE-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6458:

Description: In MAPREDUCE-6454 (targeted at branch-2.x), we provide a constrained way to pass the built-in classpath from the parent to the child shell via HADOOP_CLASSPATH, so that jars in the distributed cache still work in child tasks. In trunk, we may consider a different approach, such as introducing an additional environment variable to pass the built-in classpath safely.

(was: In MAPREDUCE-6454 (targeted at branch-2.x), we provide an extremely fragile way to pass the built-in classpath from the parent to the child shell via HADOOP_CLASSPATH, so that jars in the distributed cache still work in child tasks. In trunk, we may consider a different approach, such as introducing an additional environment variable to pass the built-in classpath safely.)

Figure out the way to pass build-in classpath (files in distributed cache, etc.) from parent to spawned shells

Key: MAPREDUCE-6458
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6458
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Junping Du
Assignee: Allen Wittenauer

In MAPREDUCE-6454 (targeted at branch-2.x), we provide a constrained way to pass the built-in classpath from the parent to the child shell via HADOOP_CLASSPATH, so that jars in the distributed cache still work in child tasks. In trunk, we may consider a different approach, such as introducing an additional environment variable to pass the built-in classpath safely.
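
As a rough picture of the HADOOP_CLASSPATH approach mentioned above, the following Java sketch builds the environment a parent process might hand to a child. It is an assumption-laden illustration, not the MAPREDUCE-6454 change itself: the class name, the method name, and the choice to append rather than prepend are all hypothetical, and the real mechanism lives in the shell scripts.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: carry a built-in classpath to a child via HADOOP_CLASSPATH.
final class ChildClasspathSketch {

    static final String CLASSPATH_VAR = "HADOOP_CLASSPATH";

    // Returns a copy of the parent environment with the built-in classpath
    // appended to HADOOP_CLASSPATH; the parent map itself is not modified.
    static Map<String, String> childEnvironment(Map<String, String> parentEnv,
                                                String builtInClasspath) {
        Map<String, String> childEnv = new HashMap<>(parentEnv);
        String existing = childEnv.get(CLASSPATH_VAR);
        if (existing == null || existing.isEmpty()) {
            childEnv.put(CLASSPATH_VAR, builtInClasspath);
        } else {
            childEnv.put(CLASSPATH_VAR, existing + File.pathSeparator + builtInClasspath);
        }
        return childEnv;
    }

    public static void main(String[] args) {
        // "job.jar" stands in for a jar localized from the distributed cache.
        Map<String, String> env = childEnvironment(System.getenv(), "job.jar");
        System.out.println(CLASSPATH_VAR + "=" + env.get(CLASSPATH_VAR));
    }
}
```

The resulting map could be copied into ProcessBuilder.environment() before starting a child process. Reusing a variable that users may also set themselves is presumably part of why the description calls this approach constrained and suggests a dedicated variable for trunk.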
[jira] [Commented] (MAPREDUCE-6458) Figure out the way to pass build-in classpath (files in distributed cache, etc.) from parent to spawned shells
[ https://issues.apache.org/jira/browse/MAPREDUCE-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706488#comment-14706488 ]

Junping Du commented on MAPREDUCE-6458:

bq. Re-assigning this to me and updating the description to reflect reality, since I actually understand how bash works.

Please feel free to take it if you have bandwidth to work on it immediately.

Figure out the way to pass build-in classpath (files in distributed cache, etc.) from parent to spawned shells

Key: MAPREDUCE-6458
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6458
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Junping Du
Assignee: Allen Wittenauer

In MAPREDUCE-6454 (targeted at branch-2.x), we provide a constrained way to pass the built-in classpath from the parent to the child shell via HADOOP_CLASSPATH, so that jars in the distributed cache still work in child tasks. In trunk, we may consider a different approach, such as introducing an additional environment variable to pass the built-in classpath safely.
[jira] [Commented] (MAPREDUCE-6363) [NNBench] Lease mismatch error when running with multiple mappers
[ https://issues.apache.org/jira/browse/MAPREDUCE-6363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706491#comment-14706491 ]

Ajith S commented on MAPREDUCE-6363:

Hi [~ajisakaa], I think what [~uladz] is pointing out is that when we run the CREATE test, we create files with unique names thanks to the taskId, so CREATE is fine. But when we run rename or delete, the taskId will be new, so the job will not actually rename or delete the files created by the CREATE benchmark: it will not find the file names based on file_+taskId, because the taskId will be new. Right?

[NNBench] Lease mismatch error when running with multiple mappers

Key: MAPREDUCE-6363
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6363
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: benchmarks
Reporter: Brahma Reddy Battula
Assignee: Vlad Sharanhovich
Priority: Critical
Fix For: 2.8.0
Attachments: HDFS4929.patch, MAPREDUCE-6363-001.patch, MAPREDUCE-6363-002.patch, MAPREDUCE-6363-003.patch

Command:
./yarn jar ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.0.1-tests.jar nnbench -operation create_write -numberOfFiles 1000 -blockSize 268435456 -bytesToWrite 102400 -baseDir /benchmarks/NNBench`hostname -s` -replicationFactorPerFile 3 -maps 100 -reduces 10

Trace:
013-06-21 10:44:53,763 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9005, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 192.168.105.214:36320: error: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: Lease mismatch on /benchmarks/NNBenchlinux-185/data/file_linux-214__0 owned by DFSClient_attempt_1371782327901_0001_m_48_0_1383437860_1 but is accessed by DFSClient_attempt_1371782327901_0001_m_84_0_1880545303_1
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: Lease mismatch on /benchmarks/NNBenchlinux-185/data/file_linux-214__0 owned by DFSClient_attempt_1371782327901_0001_m_48_0_1383437860_1 but is accessed by DFSClient_attempt_1371782327901_0001_m_84_0_1880545303_1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2351)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2098)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2019)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:213)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:52012)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:435)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:925)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1710)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1706)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
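
The concern raised in the comment above can be sketched in a few lines of Java. This is a simplified model, not NNBench code: the class, the in-memory set standing in for the HDFS namespace, the task ids, and the exact name format (the "file_" + taskId prefix from the comment plus an assumed index suffix) are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of task-id-based file naming across two benchmark runs.
final class NnBenchNamingSketch {

    // Assumed name format: "file_" + taskId, followed by an index suffix.
    static String fileName(String taskId, int index) {
        return "file_" + taskId + "_" + index;
    }

    // Simulates the create run: records the names one task would create.
    static Set<String> createFiles(String taskId, int numberOfFiles) {
        Set<String> namespace = new HashSet<>();
        for (int i = 0; i < numberOfFiles; i++) {
            namespace.add(fileName(taskId, i));
        }
        return namespace;
    }

    // Simulates a later rename or delete run: counts how many of the names
    // it derives from its own task id actually exist.
    static int countFound(Set<String> namespace, String taskId, int numberOfFiles) {
        int found = 0;
        for (int i = 0; i < numberOfFiles; i++) {
            if (namespace.contains(fileName(taskId, i))) {
                found++;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String createTask = "attempt_1_0001_m_000000_0"; // task id in the create job
        String renameTask = "attempt_1_0002_m_000000_0"; // task id in a later job
        Set<String> namespace = createFiles(createTask, 3);
        System.out.println("found with create task id: " + countFound(namespace, createTask, 3));
        System.out.println("found with rename task id: " + countFound(namespace, renameTask, 3));
    }
}
```

Unique names remove the lease conflict between mappers of one job, but under these assumptions a follow-up job derives names that were never created, which is the question the comment puts to the patch author.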
[jira] [Created] (MAPREDUCE-6459) native task crashes when merging spilled file on ppc64
Tao Jie created MAPREDUCE-6459:

Summary: native task crashes when merging spilled file on ppc64
Key: MAPREDUCE-6459
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6459
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.6.0
Environment: Linux version 2.6.32-431.el6.ppc64
Reporter: Tao Jie
Attachments: ppc64_error.txt

When running a native task on ppc64, merging spilled files fails because the local spill file cannot be deserialized correctly. In the readVLong function in WritableUtils.h and Buffers.h, we compare a char with a number and convert a char to int64_t. This does not work correctly on ppc64 because the definition of char differs between ppc64 and x86: on x86, char is signed, while on ppc64, char is unsigned. As a result, we write the EOF marker [-1, -1] at the end of a spill partition but deserialize the chars as [255, 255].