[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586946#comment-13586946 ] Gelesh commented on MAPREDUCE-4974: --- [~snihalani], I think you referred to an old patch. Please look at MAPREDUCE-4974.4.patch. Optimising the LineRecordReader initialize() method --- Key: MAPREDUCE-4974 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1, mrv2, performance Affects Versions: 2.0.2-alpha, 0.23.5 Environment: Hadoop Linux Reporter: Arun A K Assignee: Gelesh Labels: patch, performance Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch Original Estimate: 1h Remaining Estimate: 1h I found there is scope for optimizing the code in initialize() if compressionCodecs and codec are instantiated only when the input is compressed. Meanwhile, Gelesh George Omathil suggested that we could also avoid the null check of key and value. This would save time, since the null check is done every time the next key/value pair is generated. The intention is to instantiate only once and to avoid an NPE as well. Both goals could be met by initializing key and value in the initialize() method. We both have worked on it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
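A minimal sketch of the two ideas described in this issue, under stated assumptions. It is not the attached patch and does not use Hadoop classes: `SketchLineReader`, `Codec`, `lookupCodec`, and `codecLookups` are hypothetical names, and plain Java types stand in for Hadoop's key and value types. The codec is looked up only when the file name suggests compressed input (here, a hypothetical `.gz` check), and the key and value holders are created once in `initialize()` so that `nextKeyValue()` needs no null check.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a line-oriented record reader (not Hadoop's LineRecordReader). */
final class SketchLineReader {
    /** Hypothetical stand-in for a compression codec. */
    interface Codec {
        String name();
    }

    private long[] key;           // reusable key holder (character offset of the current line)
    private StringBuilder value;  // reusable value holder (text of the current line)
    private Codec codec;          // stays null for uncompressed input
    private Iterator<String> lines;
    private long pos;
    int codecLookups;             // counts lookups, only to make the behavior observable

    void initialize(String fileName, List<String> content) {
        // Look up a codec only when the input appears to be compressed.
        if (fileName.endsWith(".gz")) {
            codec = lookupCodec();
        }
        // Create key and value once here, so nextKeyValue() needs no null check.
        key = new long[1];
        value = new StringBuilder();
        lines = content.iterator();
        pos = 0;
    }

    private Codec lookupCodec() {
        codecLookups++;
        return () -> "gzip";
    }

    boolean nextKeyValue() {
        if (!lines.hasNext()) {
            return false;
        }
        String line = lines.next();
        key[0] = pos;
        value.setLength(0);
        value.append(line);
        pos += line.length() + 1; // +1 for the newline that separated the lines
        return true;
    }

    long getCurrentKey() { return key[0]; }
    String getCurrentValue() { return value.toString(); }
    boolean isCompressed() { return codec != null; }
}
```

One consequence of this design: in the sketch, calling `getCurrentKey()` before `initialize()` would fail, so it relies on the caller always invoking `initialize()` first.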
[jira] [Updated] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gelesh updated MAPREDUCE-4974: -- Attachment: (was: MAPREDUCE-4974.1.patch) Optimising the LineRecordReader initialize() method --- Key: MAPREDUCE-4974 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1, mrv2, performance Affects Versions: 2.0.2-alpha, 0.23.5 Environment: Hadoop Linux Reporter: Arun A K Assignee: Gelesh Labels: patch, performance Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, MAPREDUCE-4974.4.patch Original Estimate: 1h Remaining Estimate: 1h I found there is scope for optimizing the code in initialize() if compressionCodecs and codec are instantiated only when the input is compressed. Meanwhile, Gelesh George Omathil suggested that we could also avoid the null check of key and value. This would save time, since the null check is done every time the next key/value pair is generated. The intention is to instantiate only once and to avoid an NPE as well. Both goals could be met by initializing key and value in the initialize() method. We both have worked on it.
[jira] [Commented] (MAPREDUCE-5029) Recursively take all files in the directories of a root directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586994#comment-13586994 ] Steve Loughran commented on MAPREDUCE-5029: --- Please can you see if this problem still exists on Hadoop trunk. There have been changes to do iterative enumeration of directory contents rather than the ls * approach; see {{FileSystem.listFiles()}}. If this method isn't being used in MR jobs, maybe it could be. Recursively take all files in the directories of a root directory - Key: MAPREDUCE-5029 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5029 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.20.2 Reporter: Abhilash S R Suppose we have a root directory with thousands of subdirectories, and each subdirectory can contain hundreds of files. When the root directory is specified as the input path of a MapReduce job, the program crashes because of the subdirectories in the root directory. It would be very helpful for programmers if this feature were included in the latest version.
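The iterative enumeration mentioned in the comment can be illustrated with a small sketch. It does not call Hadoop's `FileSystem.listFiles()`; as an assumption made to keep the example self-contained, it uses `java.nio.file` on the local filesystem as a stand-in, and `RecursiveInputLister` and `listFilesRecursively` are hypothetical names. Directories are visited one at a time from an explicit work queue, so nested subdirectories are descended into rather than treated as input files.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper: collects every regular file under a root path. */
final class RecursiveInputLister {
    static List<Path> listFilesRecursively(Path root) throws IOException {
        List<Path> files = new ArrayList<>();
        // A single file given as the root is returned as-is.
        if (Files.isRegularFile(root)) {
            files.add(root);
            return files;
        }
        Deque<Path> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Path dir = pending.pop();
            // List one directory at a time instead of expanding everything up front.
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                for (Path entry : entries) {
                    if (Files.isDirectory(entry)) {
                        pending.push(entry);   // descend later
                    } else if (Files.isRegularFile(entry)) {
                        files.add(entry);      // candidate input file
                    }
                }
            }
        }
        return files;
    }
}
```

The order of the returned files depends on the underlying directory listing and is not sorted in this sketch.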
[jira] [Created] (MAPREDUCE-5030) YARNClientImpl logging too aggressively
Karthik Kambatla created MAPREDUCE-5030: --- Summary: YARNClientImpl logging too aggressively Key: MAPREDUCE-5030 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5030 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.3-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Trivial Every time we execute bin/hadoop job etc, the following two lines show up: {noformat} 13/02/26 07:05:19 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited. 13/02/26 07:05:20 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started. {noformat}
[jira] [Created] (MAPREDUCE-5031) Maps hitting IndexOutOfBoundsException for higher values of mapreduce.task.io.sort.mb
Karthik Kambatla created MAPREDUCE-5031: --- Summary: Maps hitting IndexOutOfBoundsException for higher values of mapreduce.task.io.sort.mb Key: MAPREDUCE-5031 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5031 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.5, 2.0.3-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla While trying to reproduce MAPREDUCE-5028 on trunk, ran into what seems to be a different issue. To reproduce: Pseudo-dist mode: mapreduce.{map,reduce}.memory.mb=2048, mapreduce.{map,reduce}.java.opts=-Xmx2048m, mapreduce.task.io.sort.mb=1280 The map tasks fail with the following error: {noformat} Error: java.lang.IndexOutOfBoundsException at java.nio.Buffer.checkIndex(Buffer.java:512) at java.nio.ByteBufferAsIntBufferL.put(ByteBufferAsIntBufferL.java:113) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1141) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:686) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:47) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:36) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1488) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153) {noformat}
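The report above does not identify a root cause, and the following is not a diagnosis. As background only, this sketch works through the arithmetic for the configured value: 1280 MB still fits in a Java `int` when expressed in bytes, but it is already more than half of `Integer.MAX_VALUE`, so any 32-bit computation that adds or doubles quantities of this size can wrap around. `SortBufferArithmetic`, `megabytesToBytes`, and `sumOverflowsInt` are hypothetical names introduced for the illustration.

```java
/** Hypothetical helper illustrating 32-bit arithmetic limits for large buffer sizes. */
final class SortBufferArithmetic {
    /** Converts megabytes to bytes; throws ArithmeticException if the result does not fit in an int. */
    static int megabytesToBytes(int megabytes) {
        return Math.multiplyExact(megabytes, 1 << 20);
    }

    /** Returns true if a + b cannot be represented as an int. */
    static boolean sumOverflowsInt(int a, int b) {
        long sum = (long) a + (long) b;
        return sum > Integer.MAX_VALUE || sum < Integer.MIN_VALUE;
    }

    public static void main(String[] args) {
        int bytes = megabytesToBytes(1280);
        System.out.println(bytes);                         // 1342177280
        System.out.println(sumOverflowsInt(bytes, bytes)); // true
        System.out.println(bytes + bytes);                 // wraps to a negative int
    }
}
```

Doing such computations in `long`, or with the `Math.*Exact` methods as above, makes an overflow visible instead of silently producing a negative value.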
[jira] [Updated] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value
[ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-5028: Affects Version/s: (was: 0.23.5) (was: 2.0.3-alpha) Maps fail when io.sort.mb is set to high value -- Key: MAPREDUCE-5028 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Critical Attachments: mr-5028-branch1.patch Verified the problem exists on branch-1 with the following configuration: Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, io.sort.mb=1280, dfs.block.size=2147483648 Run teragen to generate 4 GB data Maps fail when you run wordcount on this configuration with the following error: {noformat} java.io.IOException: Spill failed at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692) at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38) at 
org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67) at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40) at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116) at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92) at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175) at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346) {noformat} Marked branch-0.23 and branch-2 also because the offending code seems to exist there too.
[jira] [Updated] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value
[ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-5028: Target Version/s: 1.2.0 (was: 1.2.0, 0.23.7, 2.0.4-beta) Maps fail when io.sort.mb is set to high value -- Key: MAPREDUCE-5028 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Critical Attachments: mr-5028-branch1.patch Verified the problem exists on branch-1 with the following configuration: Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, io.sort.mb=1280, dfs.block.size=2147483648 Run teragen to generate 4 GB data Maps fail when you run wordcount on this configuration with the following error: {noformat} java.io.IOException: Spill failed at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692) at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38) at 
org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67) at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40) at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116) at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92) at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175) at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346) {noformat} Marked branch-0.23 and branch-2 also because the offending code seems to exist there too.
[jira] [Updated] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value
[ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-5028: Description: Verified the problem exists on branch-1 with the following configuration: Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, io.sort.mb=1280, dfs.block.size=2147483648 Run teragen to generate 4 GB data Maps fail when you run wordcount on this configuration with the following error: {noformat} java.io.IOException: Spill failed at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692) at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38) at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67) at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40) at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116) at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92) at 
org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175) at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346) {noformat} was: Verified the problem exists on branch-1 with the following configuration: Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, io.sort.mb=1280, dfs.block.size=2147483648 Run teragen to generate 4 GB data Maps fail when you run wordcount on this configuration with the following error: {noformat} java.io.IOException: Spill failed at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692) at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38) at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67) at 
org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40) at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116) at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92) at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175) at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855) at
[jira] [Updated] (MAPREDUCE-5031) Maps hitting IndexOutOfBoundsException for higher values of mapreduce.task.io.sort.mb
[ https://issues.apache.org/jira/browse/MAPREDUCE-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-5031: Description: While trying to reproduce MAPREDUCE-5028 on trunk, ran into what seems to be a different issue. To reproduce: Pseudo-dist mode: mapreduce.{map,reduce}.memory.mb=2048, mapreduce.{map,reduce}.java.opts=-Xmx2048m, mapreduce.task.io.sort.mb=1280 The map tasks fail with the following error: {noformat} Error: java.lang.IndexOutOfBoundsException at java.nio.Buffer.checkIndex(Buffer.java:512) at java.nio.ByteBufferAsIntBufferL.put(ByteBufferAsIntBufferL.java:113) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1141) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:686) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:47) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:36) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1488) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153) {noformat} was: While trying to reproduce MAPREDUCE-5028 on trunk, ran into what seems to be a different issue. 
To reproduce: Pseudo-dist mode: mapreduce.{map,reduce}.memory.mb=2048, mapreduce.{map,reduce}.java.opts=-Xmx2048m, mapreduce.task.io.sort.mb=1280 The map tasks fail with the following error: {noformat} Error: java.lang.IndexOutOfBoundsException at java.nio.Buffer.checkIndex(Buffer.java:512) at java.nio.ByteBufferAsIntBufferL.put(ByteBufferAsIntBufferL.java:113) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1141) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:686) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:47) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:36) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1488) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153) {noformat} Maps hitting IndexOutOfBoundsException for higher values of mapreduce.task.io.sort.mb - Key: MAPREDUCE-5031 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5031 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.3-alpha, 0.23.5 Reporter: Karthik Kambatla Assignee: Karthik Kambatla While trying to reproduce MAPREDUCE-5028 on trunk, ran into what seems to be a different issue. 
To reproduce: Pseudo-dist mode: mapreduce.{map,reduce}.memory.mb=2048, mapreduce.{map,reduce}.java.opts=-Xmx2048m, mapreduce.task.io.sort.mb=1280 The map tasks fail with the following error: {noformat} Error: java.lang.IndexOutOfBoundsException at java.nio.Buffer.checkIndex(Buffer.java:512) at java.nio.ByteBufferAsIntBufferL.put(ByteBufferAsIntBufferL.java:113) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1141) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:686) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:47) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:36) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339) at
[jira] [Updated] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value
[ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-5028: Status: Patch Available (was: Open) Maps fail when io.sort.mb is set to high value -- Key: MAPREDUCE-5028 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Critical Attachments: mr-5028-branch1.patch Verified the problem exists on branch-1 with the following configuration: Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, io.sort.mb=1280, dfs.block.size=2147483648 Run teragen to generate 4 GB data Maps fail when you run wordcount on this configuration with the following error: {noformat} java.io.IOException: Spill failed at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692) at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38) at 
org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67) at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40) at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116) at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92) at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175) at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346) {noformat}
[jira] [Commented] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value
[ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587298#comment-13587298 ] Hadoop QA commented on MAPREDUCE-5028: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12570924/mr-5028-branch1.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3360//console This message is automatically generated. Maps fail when io.sort.mb is set to high value -- Key: MAPREDUCE-5028 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Critical Attachments: mr-5028-branch1.patch Verified the problem exists on branch-1 with the following configuration: Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, io.sort.mb=1280, dfs.block.size=2147483648 Run teragen to generate 4 GB data Maps fail when you run wordcount on this configuration with the following error: {noformat} java.io.IOException: Spill failed at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692) at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38) at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67) at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40) at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116) at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92) at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175) at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346) {noformat}
[jira] [Assigned] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker reassigned MAPREDUCE-5027: Assignee: Robert Parker Shuffle does not limit number of outstanding connections Key: MAPREDUCE-5027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5027 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.3-alpha, 0.23.5 Reporter: Jason Lowe Assignee: Robert Parker The ShuffleHandler does not have any configurable limit on the number of outstanding connections allowed. Therefore, on a node with many map outputs, many reducers in the cluster trying to fetch those outputs can cause the nodemanager to run out of file descriptors.
[jira] [Updated] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker updated MAPREDUCE-5027: - Status: Patch Available (was: Open) Netty seems to be geared more toward limiting connections per IP. Extended the idea provided by Jason (thanks for the code snippet). Shuffle does not limit number of outstanding connections Key: MAPREDUCE-5027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5027 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.5, 2.0.3-alpha Reporter: Jason Lowe Assignee: Robert Parker Attachments: MAPREDUCE-5027.patch The ShuffleHandler does not have any configurable limit on the number of outstanding connections allowed. Therefore, on a node with many map outputs, many reducers in the cluster trying to fetch those outputs can cause the nodemanager to run out of file descriptors.
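The general idea of capping outstanding connections can be sketched as below. This is a minimal illustration, not the attached MAPREDUCE-5027.patch and not Netty code: `ConnectionLimiter`, `tryAccept`, `release`, and `openCount` are hypothetical names, and treating a non-positive limit as "unlimited" is an assumption of the sketch. A server would call `tryAccept()` when a connection opens, close the connection immediately if it returns false, and call `release()` when an accepted connection closes.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical global cap on simultaneously open connections. */
final class ConnectionLimiter {
    private final int maxConnections;               // <= 0 means no limit (assumption of this sketch)
    private final AtomicInteger open = new AtomicInteger();

    ConnectionLimiter(int maxConnections) {
        this.maxConnections = maxConnections;
    }

    /** Returns true if the new connection may stay open, false if it should be rejected. */
    boolean tryAccept() {
        if (maxConnections <= 0) {
            open.incrementAndGet();
            return true;
        }
        while (true) {
            int current = open.get();
            if (current >= maxConnections) {
                return false;                       // at the cap: reject without counting it
            }
            if (open.compareAndSet(current, current + 1)) {
                return true;                        // reserved a slot atomically
            }
            // Another thread changed the count; retry with the fresh value.
        }
    }

    /** Must be called exactly once for each connection that tryAccept() admitted. */
    void release() {
        open.decrementAndGet();
    }

    int openCount() {
        return open.get();
    }
}
```

The compare-and-set loop keeps the count from exceeding the cap even when many connections open concurrently, which a separate check followed by an increment would not guarantee.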
[jira] [Updated] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker updated MAPREDUCE-5027: - Attachment: MAPREDUCE-5027.patch Shuffle does not limit number of outstanding connections Key: MAPREDUCE-5027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5027 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.3-alpha, 0.23.5 Reporter: Jason Lowe Assignee: Robert Parker Attachments: MAPREDUCE-5027.patch The ShuffleHandler does not have any configurable limit on the number of outstanding connections allowed. Therefore, on a node with many map outputs, many reducers in the cluster trying to fetch those outputs can cause the nodemanager to run out of file descriptors.
[jira] [Commented] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587422#comment-13587422 ] Hadoop QA commented on MAPREDUCE-5027: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571028/MAPREDUCE-5027.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 one of tests included doesn't have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle: org.apache.hadoop.mapred.TestShuffleHandler {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3361//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3361//console This message is automatically generated. 
Shuffle does not limit number of outstanding connections Key: MAPREDUCE-5027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5027 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.3-alpha, 0.23.5 Reporter: Jason Lowe Assignee: Robert Parker Attachments: MAPREDUCE-5027.patch The ShuffleHandler does not have any configurable limits to the number of outstanding connections allowed. Therefore a node with many map outputs and many reducers in the cluster trying to fetch those outputs can exhaust a nodemanager out of file descriptors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-5032) MapTask.MapOutputBuffer contains arithmetic overflows
Chris Douglas created MAPREDUCE-5032: Summary: MapTask.MapOutputBuffer contains arithmetic overflows Key: MAPREDUCE-5032 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5032 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Affects Versions: 0.23.5, 2.0.3-alpha, 1.1.1 Reporter: Chris Douglas Assignee: Chris Douglas There are several places where offsets into the collection buffer can overflow when applied to large buffers. These should be accommodated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
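As a rough illustration of the kind of overflow MAPREDUCE-5032 describes, the sketch below adds an offset and a length as int values and compares that with a widened computation. The class and method names are hypothetical and the numbers are made up; none of this is taken from MapTask.MapOutputBuffer.

```java
/** Hypothetical demo of int offset arithmetic on a large buffer. */
final class OffsetOverflowDemo {

    /** Naive int sum: wraps silently once the result passes Integer.MAX_VALUE. */
    static int naiveEnd(int start, int length) {
        return start + length;
    }

    /** Widening one operand to long first keeps the true sum. */
    static long safeEnd(int start, int length) {
        return (long) start + length;
    }

    /**
     * Bounds check written so that no intermediate value can overflow:
     * the length is compared with the space left after start.
     */
    static boolean fits(int start, int length, int capacity) {
        return start >= 0 && length >= 0
            && start <= capacity
            && length <= capacity - start;
    }

    public static void main(String[] args) {
        int start = 1_500_000_000;
        int length = 1_000_000_000;
        System.out.println(naiveEnd(start, length)); // -1794967296
        System.out.println(safeEnd(start, length));  // 2500000000
        System.out.println(fits(start, length, 2_000_000_000)); // false
    }
}
```

A check such as `start + length <= capacity` would wrongly pass here, because the wrapped sum is negative; the subtraction form in fits() avoids that.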
[jira] [Commented] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value
[ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587448#comment-13587448 ] Chris Douglas commented on MAPREDUCE-5028: -- bq. getLength() returns the size of the entire buffer, and not just the remaining part of the buffer Just to be clear, it's not the size of the backing array, but the index one greater than the last valid character in the input stream buffer. (ByteArrayInputStream) The change to {{DataInputBuffer}} implies the former, which is inaccurate. The patch corrects several misuses of DataInputBuffer, which is great. There's another misuse at {{ReduceContextImpl.ValueIterator::next}} that could be included with these changes. Most of this code doesn't check for overflow; it wasn't written for extremely large buffers. Just glancing at related code, MapTask.InMemValBytes contains code that could overflow and I'm sure there are others. Filed MAPREDUCE-5032. Maps fail when io.sort.mb is set to high value -- Key: MAPREDUCE-5028 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Critical Attachments: mr-5028-branch1.patch Verified the problem exists on branch-1 with the following configuration: Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, io.sort.mb=1280, dfs.block.size=2147483648 Run teragen to generate 4 GB data Maps fail when you run wordcount on this configuration with the following error: {noformat} java.io.IOException: Spill failed at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692) at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45) at 
org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38) at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67) at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40) at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116) at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92) at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175) at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
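The distinction drawn in the MAPREDUCE-5028 comment (end index versus backing-array size versus remaining bytes) can be sketched with a small subclass of java.io.ByteArrayInputStream, whose protected pos and count fields carry exactly those meanings. The class WindowedBuffer and its method names are hypothetical stand-ins and are not Hadoop's DataInputBuffer.

```java
import java.io.ByteArrayInputStream;

/**
 * Hypothetical stand-in that exposes ByteArrayInputStream's protected
 * fields: pos (next index to read) and count (one past the last valid index).
 */
final class WindowedBuffer extends ByteArrayInputStream {

    WindowedBuffer(byte[] data, int start, int length) {
        super(data, start, length);
    }

    /** Index of the next byte to be read. */
    int position() {
        return pos;
    }

    /** Index one greater than the last valid byte; not the array size. */
    int endIndex() {
        return count;
    }

    /** Bytes still unread: end index minus current position. */
    int remaining() {
        return count - pos;
    }

    public static void main(String[] args) {
        byte[] backing = new byte[10];
        // Window covering indices 4, 5 and 6 of a 10-byte array.
        WindowedBuffer buf = new WindowedBuffer(backing, 4, 3);
        System.out.println(backing.length);  // 10
        System.out.println(buf.endIndex());  // 7
        System.out.println(buf.remaining()); // 3
        buf.read();
        System.out.println(buf.remaining()); // 2
    }
}
```

In this sketch, treating endIndex() as a byte count would read past the window whenever the window does not start at index 0.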
[jira] [Updated] (MAPREDUCE-4693) Historyserver should provide counters for failed tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated MAPREDUCE-4693: - Attachment: MAPREDUCE-4693.4.patch Historyserver should provide counters for failed tasks -- Key: MAPREDUCE-4693 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4693 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 2.0.3-alpha, 0.23.6 Reporter: Jason Lowe Assignee: Xuan Gong Labels: usability Attachments: MAPREDUCE-4693.1.patch, MAPREDUCE-4693.2.patch, MAPREDUCE-4693.3.patch, MAPREDUCE-4693.4.patch Currently the historyserver is not providing counters for failed tasks, even though they are available via the AM as long as the job is still running. Those counters are lost when the client needs to redirect to the historyserver after the job completes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4693) Historyserver should provide counters for failed tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587509#comment-13587509 ] Hadoop QA commented on MAPREDUCE-4693: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571043/MAPREDUCE-4693.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 tests included appear to have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-tools/hadoop-rumen. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3362//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3362//console This message is automatically generated. 
Historyserver should provide counters for failed tasks -- Key: MAPREDUCE-4693 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4693 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 2.0.3-alpha, 0.23.6 Reporter: Jason Lowe Assignee: Xuan Gong Labels: usability Attachments: MAPREDUCE-4693.1.patch, MAPREDUCE-4693.2.patch, MAPREDUCE-4693.3.patch, MAPREDUCE-4693.4.patch Currently the historyserver is not providing counters for failed tasks, even though they are available via the AM as long as the job is still running. Those counters are lost when the client needs to redirect to the historyserver after the job completes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-5033) mapred shell script should respect usage flags (--help -help -h)
Andrew Wang created MAPREDUCE-5033: -- Summary: mapred shell script should respect usage flags (--help -help -h) Key: MAPREDUCE-5033 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5033 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.0.3-alpha Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Like in HADOOP-9267, the mapred shell script should respect the normal Unix-y help flags. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5033) mapred shell script should respect usage flags (--help -help -h)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated MAPREDUCE-5033: --- Attachment: mapreduce-5033-1.patch Little patch attached. Tested manually by running the mapred script. mapred shell script should respect usage flags (--help -help -h) Key: MAPREDUCE-5033 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5033 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.0.3-alpha Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: mapreduce-5033-1.patch Like in HADOOP-9267, the mapred shell script should respect the normal Unix-y help flags. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5033) mapred shell script should respect usage flags (--help -help -h)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated MAPREDUCE-5033: --- Status: Patch Available (was: Open) mapred shell script should respect usage flags (--help -help -h) Key: MAPREDUCE-5033 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5033 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.0.3-alpha Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: mapreduce-5033-1.patch Like in HADOOP-9267, the mapred shell script should respect the normal Unix-y help flags. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4693) Historyserver should provide counters for failed tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated MAPREDUCE-4693: - Status: Open (was: Patch Available) Historyserver should provide counters for failed tasks -- Key: MAPREDUCE-4693 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4693 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 2.0.3-alpha, 0.23.6 Reporter: Jason Lowe Assignee: Xuan Gong Labels: usability Attachments: MAPREDUCE-4693.1.patch, MAPREDUCE-4693.2.patch, MAPREDUCE-4693.3.patch, MAPREDUCE-4693.4.patch Currently the historyserver is not providing counters for failed tasks, even though they are available via the AM as long as the job is still running. Those counters are lost when the client needs to redirect to the historyserver after the job completes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4693) Historyserver should provide counters for failed tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated MAPREDUCE-4693: - Status: Patch Available (was: Open) Historyserver should provide counters for failed tasks -- Key: MAPREDUCE-4693 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4693 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 2.0.3-alpha, 0.23.6 Reporter: Jason Lowe Assignee: Xuan Gong Labels: usability Attachments: MAPREDUCE-4693.1.patch, MAPREDUCE-4693.2.patch, MAPREDUCE-4693.3.patch, MAPREDUCE-4693.4.patch Currently the historyserver is not providing counters for failed tasks, even though they are available via the AM as long as the job is still running. Those counters are lost when the client needs to redirect to the historyserver after the job completes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker updated MAPREDUCE-5027: - Attachment: MAPREDUCE-5027.patch Shuffle does not limit number of outstanding connections Key: MAPREDUCE-5027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5027 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.3-alpha, 0.23.5 Reporter: Jason Lowe Assignee: Robert Parker Attachments: MAPREDUCE-5027.patch, MAPREDUCE-5027.patch The ShuffleHandler does not have any configurable limits to the number of outstanding connections allowed. Therefore a node with many map outputs and many reducers in the cluster trying to fetch those outputs can exhaust a nodemanager out of file descriptors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5033) mapred shell script should respect usage flags (--help -help -h)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587624#comment-13587624 ] Hadoop QA commented on MAPREDUCE-5033: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571051/mapreduce-5033-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3363//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3363//console This message is automatically generated. mapred shell script should respect usage flags (--help -help -h) Key: MAPREDUCE-5033 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5033 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.0.3-alpha Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: mapreduce-5033-1.patch Like in HADOOP-9267, the mapred shell script should respect the normal Unix-y help flags. 
[jira] [Commented] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587654#comment-13587654 ] Hadoop QA commented on MAPREDUCE-5027: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571063/MAPREDUCE-5027.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 one of tests included doesn't have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle: org.apache.hadoop.mapred.TestShuffleHandler {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3364//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3364//console This message is automatically generated. 
Shuffle does not limit number of outstanding connections Key: MAPREDUCE-5027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5027 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.3-alpha, 0.23.5 Reporter: Jason Lowe Assignee: Robert Parker Attachments: MAPREDUCE-5027.patch, MAPREDUCE-5027.patch The ShuffleHandler does not have any configurable limits to the number of outstanding connections allowed. Therefore a node with many map outputs and many reducers in the cluster trying to fetch those outputs can exhaust a nodemanager out of file descriptors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-5034) Class cast exception in MergeManagerImpl.java
Mariappan Asokan created MAPREDUCE-5034: --- Summary: Class cast exception in MergeManagerImpl.java Key: MAPREDUCE-5034 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5034 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Mariappan Asokan When reduce side merge spills to disk, the following exception was thrown: org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl$CompressAwarePath cannot be cast to java.lang.Comparable at java.util.TreeMap.put(TreeMap.java:542) at java.util.TreeSet.add(TreeSet.java:238) at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.closeOnDiskFile(MergeManagerImpl.java:340) at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl$InMemoryMerger.merge(MergeManagerImpl.java:495) at org.apache.hadoop.mapreduce.task.reduce.MergeThread.run(MergeThread.java:94) It looks like a bug introduced by MAPREDUCE-2264 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2264) Job status exceeds 100% in some cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587714#comment-13587714 ] Mariappan Asokan commented on MAPREDUCE-2264: - The patch has introduced a class cast exception. Please see MAPREDUCE-5034 Job status exceeds 100% in some cases -- Key: MAPREDUCE-2264 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2264 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.2, 0.20.205.0 Reporter: Adam Kramer Assignee: Devaraj K Labels: critical-0.22.0 Fix For: 1.2.0, 2.0.3-alpha Attachments: MAPREDUCE-2264-0.20.205-1.patch, MAPREDUCE-2264-0.20.205.patch, MAPREDUCE-2264-0.20.3.patch, MAPREDUCE-2264-branch-1-1.patch, MAPREDUCE-2264-branch-1-2.patch, MAPREDUCE-2264-branch-1.patch, MAPREDUCE-2264-trunk-1.patch, MAPREDUCE-2264-trunk-1.patch, MAPREDUCE-2264-trunk-2.patch, MAPREDUCE-2264-trunk-3.patch, MAPREDUCE-2264-trunk-4.patch, MAPREDUCE-2264-trunk-5.patch, MAPREDUCE-2264-trunk-5.patch, MAPREDUCE-2264-trunk-addendum.patch, MAPREDUCE-2264-trunk.patch, more than 100%.bmp I'm looking now at my jobtracker's list of running reduce tasks. One of them is 120.05% complete, the other is 107.28% complete. I understand that these numbers are estimates, but there is no case in which an estimate of 100% for a non-complete task is better than an estimate of 99.99%, nor is there any case in which an estimate greater than 100% is valid. I suggest that whatever logic is computing these set 99.99% as a hard maximum. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5034) Class cast exception in MergeManagerImpl.java
[ https://issues.apache.org/jira/browse/MAPREDUCE-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587719#comment-13587719 ] Sandy Ryza commented on MAPREDUCE-5034: --- Which version is this found in? There was an issue in MAPREDUCE-2264 that looks very similar to this that was already discovered. Because of it the original patch was reverted and replaced with an update. Class cast exception in MergeManagerImpl.java - Key: MAPREDUCE-5034 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5034 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Mariappan Asokan When reduce side merge spills to disk, the following exception was thrown: org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl$CompressAwarePath cannot be cast to java.lang.Comparable at java.util.TreeMap.put(TreeMap.java:542) at java.util.TreeSet.add(TreeSet.java:238) at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.closeOnDiskFile(MergeManagerImpl.java:340) at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl$InMemoryMerger.merge(MergeManagerImpl.java:495) at org.apache.hadoop.mapreduce.task.reduce.MergeThread.run(MergeThread.java:94) It looks like a bug introduced by MAPREDUCE-2264 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker updated MAPREDUCE-5027: - Attachment: MAPREDUCE-5027.patch Shuffle does not limit number of outstanding connections Key: MAPREDUCE-5027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5027 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.3-alpha, 0.23.5 Reporter: Jason Lowe Assignee: Robert Parker Attachments: MAPREDUCE-5027.patch The ShuffleHandler does not have any configurable limits to the number of outstanding connections allowed. Therefore a node with many map outputs and many reducers in the cluster trying to fetch those outputs can exhaust a nodemanager out of file descriptors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker updated MAPREDUCE-5027: - Attachment: (was: MAPREDUCE-5027.patch) Shuffle does not limit number of outstanding connections Key: MAPREDUCE-5027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5027 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.3-alpha, 0.23.5 Reporter: Jason Lowe Assignee: Robert Parker Attachments: MAPREDUCE-5027.patch The ShuffleHandler does not have any configurable limits to the number of outstanding connections allowed. Therefore a node with many map outputs and many reducers in the cluster trying to fetch those outputs can exhaust a nodemanager out of file descriptors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker updated MAPREDUCE-5027: - Attachment: (was: MAPREDUCE-5027.patch) Shuffle does not limit number of outstanding connections Key: MAPREDUCE-5027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5027 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.3-alpha, 0.23.5 Reporter: Jason Lowe Assignee: Robert Parker Attachments: MAPREDUCE-5027.patch The ShuffleHandler does not have any configurable limits to the number of outstanding connections allowed. Therefore a node with many map outputs and many reducers in the cluster trying to fetch those outputs can exhaust a nodemanager out of file descriptors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2264) Job status exceeds 100% in some cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587725#comment-13587725 ] Sandy Ryza commented on MAPREDUCE-2264: --- This looks the same as the issue that was reported by Chris on January 27th - is it possible you're using the version of the patch that was reverted? The way to check would be to see whether CompressAwarePath extends Path. Job status exceeds 100% in some cases -- Key: MAPREDUCE-2264 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2264 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.2, 0.20.205.0 Reporter: Adam Kramer Assignee: Devaraj K Labels: critical-0.22.0 Fix For: 1.2.0, 2.0.3-alpha Attachments: MAPREDUCE-2264-0.20.205-1.patch, MAPREDUCE-2264-0.20.205.patch, MAPREDUCE-2264-0.20.3.patch, MAPREDUCE-2264-branch-1-1.patch, MAPREDUCE-2264-branch-1-2.patch, MAPREDUCE-2264-branch-1.patch, MAPREDUCE-2264-trunk-1.patch, MAPREDUCE-2264-trunk-1.patch, MAPREDUCE-2264-trunk-2.patch, MAPREDUCE-2264-trunk-3.patch, MAPREDUCE-2264-trunk-4.patch, MAPREDUCE-2264-trunk-5.patch, MAPREDUCE-2264-trunk-5.patch, MAPREDUCE-2264-trunk-addendum.patch, MAPREDUCE-2264-trunk.patch, more than 100%.bmp I'm looking now at my jobtracker's list of running reduce tasks. One of them is 120.05% complete, the other is 107.28% complete. I understand that these numbers are estimates, but there is no case in which an estimate of 100% for a non-complete task is better than an estimate of 99.99%, nor is there any case in which an estimate greater than 100% is valid. I suggest that whatever logic is computing these set 99.99% as a hard maximum. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
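The exception reported in MAPREDUCE-5034 is what a TreeSet without a Comparator throws when an element is not Comparable, which is why the check suggested above (does CompressAwarePath extend Path) is a useful one. The sketch below reproduces the behaviour with stand-in classes. BasePath, PlainWrapper and SizedPath are hypothetical and are not Hadoop's Path or CompressAwarePath.

```java
import java.util.TreeSet;

/** Hypothetical demo of TreeSet's Comparable requirement. */
final class TreeSetCastDemo {

    /** Stand-in for a path type that implements Comparable. */
    static class BasePath implements Comparable<BasePath> {
        final String name;

        BasePath(String name) {
            this.name = name;
        }

        @Override
        public int compareTo(BasePath other) {
            return name.compareTo(other.name);
        }
    }

    /** Holds a name but is not Comparable, like a wrapper that extends nothing. */
    static class PlainWrapper {
        final String name;

        PlainWrapper(String name) {
            this.name = name;
        }
    }

    /** Extends BasePath, so it inherits compareTo and can live in a TreeSet. */
    static class SizedPath extends BasePath {
        final long rawSize;

        SizedPath(String name, long rawSize) {
            super(name);
            this.rawSize = rawSize;
        }
    }

    /** Returns false if a comparator-less TreeSet rejects the element. */
    static boolean canAddToTreeSet(Object element) {
        TreeSet<Object> set = new TreeSet<>();
        try {
            set.add(element);
            // The second add forces a comparison even on runtimes that
            // do not type-check the first insertion.
            set.add(element);
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canAddToTreeSet(new PlainWrapper("spill0"))); // false
        System.out.println(canAddToTreeSet(new SizedPath("spill0", 10L))); // true
    }
}
```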
[jira] [Commented] (MAPREDUCE-5034) Class cast exception in MergeManagerImpl.java
[ https://issues.apache.org/jira/browse/MAPREDUCE-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587727#comment-13587727 ] Mariappan Asokan commented on MAPREDUCE-5034: - Hi Sandy, It could be a previous version of the patch that we picked up for testing. I will confirm soon. -- Asokan Class cast exception in MergeManagerImpl.java - Key: MAPREDUCE-5034 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5034 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Mariappan Asokan When reduce side merge spills to disk, the following exception was thrown: org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl$CompressAwarePath cannot be cast to java.lang.Comparable at java.util.TreeMap.put(TreeMap.java:542) at java.util.TreeSet.add(TreeSet.java:238) at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.closeOnDiskFile(MergeManagerImpl.java:340) at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl$InMemoryMerger.merge(MergeManagerImpl.java:495) at org.apache.hadoop.mapreduce.task.reduce.MergeThread.run(MergeThread.java:94) It looks like a bug introduced by MAPREDUCE-2264 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5034) Class cast exception in MergeManagerImpl.java
[ https://issues.apache.org/jira/browse/MAPREDUCE-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587738#comment-13587738 ] Mariappan Asokan commented on MAPREDUCE-5034: Hi Sandy, You are right. It is from a previous version of the patch. Sorry about the confusion. I will reject this Jira. -- Asokan
[jira] [Resolved] (MAPREDUCE-5034) Class cast exception in MergeManagerImpl.java
[ https://issues.apache.org/jira/browse/MAPREDUCE-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan resolved MAPREDUCE-5034. Resolution: Not A Problem
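The ClassCastException reported in MAPREDUCE-5034 is the standard behaviour of java.util.TreeSet when it is built without a Comparator and its elements do not implement Comparable. The following self-contained sketch reproduces that behaviour with a stand-in class; `PlainPath` is hypothetical and is not Hadoop's CompressAwarePath. Sandy's suggested check (whether CompressAwarePath extends Path) presumably matters because a supertype can supply the ordering.

```java
import java.util.Comparator;
import java.util.TreeSet;

class TreeSetCastDemo {
    // Hypothetical stand-in for a path-like class with no natural ordering.
    static final class PlainPath {
        final String name;
        PlainPath(String name) { this.name = name; }
    }

    // Without a Comparator, TreeSet casts elements to Comparable and fails.
    static boolean addFailsWithoutOrdering() {
        TreeSet<PlainPath> set = new TreeSet<>();
        try {
            // Two elements are added so that a comparison is forced
            // regardless of how the JDK handles the very first insert.
            set.add(new PlainPath("spill0.out"));
            set.add(new PlainPath("spill1.out"));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Supplying an explicit ordering avoids the cast entirely.
    static int sizeWithComparator() {
        TreeSet<PlainPath> set =
            new TreeSet<>(Comparator.comparing((PlainPath p) -> p.name));
        set.add(new PlainPath("spill0.out"));
        set.add(new PlainPath("spill1.out"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("fails without ordering: " + addFailsWithoutOrdering());
        System.out.println("size with comparator: " + sizeWithComparator());
    }
}
```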
[jira] [Updated] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value
[ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-5028: Attachment: mr-5028-branch1.patch
Maps fail when io.sort.mb is set to high value
Key: MAPREDUCE-5028 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Critical Attachments: mr-5028-branch1.patch, mr-5028-branch1.patch
Verified the problem exists on branch-1 with the following configuration: Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, io.sort.mb=1280, dfs.block.size=2147483648
Run teragen to generate 4 GB data
Maps fail when you run wordcount on this configuration with the following error:
{noformat}
java.io.IOException: Spill failed
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031)
 at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
 at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
 at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45)
 at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
 at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:375)
 at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38)
 at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
 at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
 at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
 at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
 at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
 at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346)
{noformat}
[jira] [Commented] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value
[ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587743#comment-13587743 ] Karthik Kambatla commented on MAPREDUCE-5028:
Thanks for your comments, Chris.
bq. Just to be clear, it's not the size of the backing array, but the index one greater than the last valid character in the input stream buffer. (ByteArrayInputStream) The change to DataInputBuffer implies the former, which is inaccurate.
Removed that comment, and added comments to {{DataInputBuffer#reset()}} to reflect {{ByteArrayInputStream}}.
bq. There's another misuse at ReduceContextImpl.ValueIterator::next that could be included with these changes.
If I am not mistaken, ReduceContextImpl is only in trunk versions. This patch is only for branch-1. Filed MAPREDUCE-5031 earlier to address the slightly different issues on trunk, can mark it as a duplicate to the one that you created. Have a half-baked patch, I can submit that there.
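The distinction Chris draws in the quoted remark (the size of the backing array versus the index one greater than the last valid byte) can be seen directly in java.io.ByteArrayInputStream, whose protected `pos` and `count` fields hold the current position and that end index. The sketch below uses only the JDK class; `CountDemo` and its methods are hypothetical names for illustration, and it assumes, as the comment suggests, that DataInputBuffer follows ByteArrayInputStream's semantics.

```java
import java.io.ByteArrayInputStream;

// Hypothetical subclass that exposes ByteArrayInputStream's protected fields.
class CountDemo extends ByteArrayInputStream {
    CountDemo(byte[] buf, int offset, int length) {
        super(buf, offset, length);
    }

    int position() { return pos; }    // index of the next byte to read
    int endIndex() { return count; }  // one past the last valid byte

    // Returns {position, endIndex, available, backingArraySize}.
    static int[] describe(int backingSize, int offset, int length) {
        CountDemo in = new CountDemo(new byte[backingSize], offset, length);
        return new int[] { in.position(), in.endIndex(), in.available(), backingSize };
    }

    public static void main(String[] args) {
        // A 16-byte array in which only bytes 4..11 are valid input.
        int[] d = describe(16, 4, 8);
        System.out.println("pos=" + d[0] + " count=" + d[1]
            + " available=" + d[2] + " backing=" + d[3]);
    }
}
```

With a non-zero offset, the end index (12 here) is neither the number of valid bytes (8) nor the array size (16), so code that treats it as a length reads past the valid region.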
[jira] [Commented] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value
[ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587744#comment-13587744 ] Karthik Kambatla commented on MAPREDUCE-5028: The -1 from Hadoop QA was because the uploaded patch is for branch-1. I have verified that ant test-core passes.
[jira] [Updated] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker updated MAPREDUCE-5027: Attachment: MAPREDUCE-5027.patch
Shuffle does not limit number of outstanding connections
Key: MAPREDUCE-5027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5027 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.3-alpha, 0.23.5 Reporter: Jason Lowe Assignee: Robert Parker Attachments: MAPREDUCE-5027.patch
The ShuffleHandler does not have any configurable limit on the number of outstanding connections allowed. Therefore a node with many map outputs, in a cluster where many reducers are trying to fetch those outputs, can cause a nodemanager to run out of file descriptors.
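One common way to bound the number of outstanding connections is a counting semaphore sized to a configured maximum. The sketch below shows that general technique only; `ConnectionLimiter` and its methods are hypothetical, and the thread does not say how MAPREDUCE-5027.patch actually enforces the limit inside ShuffleHandler.

```java
import java.util.concurrent.Semaphore;

// Hypothetical limiter; not the ShuffleHandler implementation.
class ConnectionLimiter {
    private final Semaphore permits;

    // maxConnections would come from configuration in a real server.
    ConnectionLimiter(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    // Called when a connection is opened; false means it should be rejected.
    boolean tryAccept() {
        return permits.tryAcquire();
    }

    // Called when an accepted connection is closed.
    void release() {
        permits.release();
    }

    int remaining() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        ConnectionLimiter limiter = new ConnectionLimiter(2);
        System.out.println(limiter.tryAccept()); // first connection
        System.out.println(limiter.tryAccept()); // second connection
        System.out.println(limiter.tryAccept()); // over the limit
    }
}
```

Rejecting the extra connection immediately, rather than queueing it, keeps the number of open file descriptors bounded; the fetching side then has to retry.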
[jira] [Updated] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker updated MAPREDUCE-5027: Attachment: (was: MAPREDUCE-5027.patch)
[jira] [Commented] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value
[ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587821#comment-13587821 ] Hadoop QA commented on MAPREDUCE-5028: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571090/mr-5028-branch1.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3365//console This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587836#comment-13587836 ] Hadoop QA commented on MAPREDUCE-5027: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571092/MAPREDUCE-5027.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 one of tests included doesn't have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3366//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3366//console This message is automatically generated. 
[jira] [Updated] (MAPREDUCE-4659) Confusing output when running hadoop version from one hadoop installation when HADOOP_HOME points to another
[ https://issues.apache.org/jira/browse/MAPREDUCE-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-4659: -- Attachment: MAPREDUCE-4659-branch-1.patch Confusing output when running hadoop version from one hadoop installation when HADOOP_HOME points to another -- Key: MAPREDUCE-4659 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4659 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.20.2, 2.0.1-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: MAPREDUCE-4659-2.patch, MAPREDUCE-4659-3.patch, MAPREDUCE-4659-4.patch, MAPREDUCE-4659-5.patch, MAPREDUCE-4659-branch-1.patch, MAPREDUCE-4659.patch Hadoop version X is downloaded to ~/hadoop-x, and Hadoop version Y is downloaded to ~/hadoop-y. HADOOP_HOME is set to hadoop-x. A user running hadoop-y/bin/hadoop might expect to be running the hadoop-y jars, but, because of HADOOP_HOME, will actually be running hadoop-x jars. hadoop version could help clear this up a little by reporting the current HADOOP_HOME. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
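The proposal in MAPREDUCE-4659 is for the version output to also report the HADOOP_HOME in effect, so that a mismatch between the launched script and the jars actually used becomes visible. A minimal sketch of that idea follows; `VersionReport`, its output format, and the example paths are hypothetical and are not taken from the attached patches.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical illustration; the real "hadoop version" output may differ.
class VersionReport {
    // Builds a two-line report from a version string and an environment map.
    static String describe(String version, Map<String, String> env) {
        String home = env.get("HADOOP_HOME");
        return "Hadoop " + version + "\n"
            + "HADOOP_HOME: " + (home == null ? "(not set)" : home);
    }

    public static void main(String[] args) {
        // Scenario from the issue: HADOOP_HOME points at the hadoop-x install.
        Map<String, String> env =
            Collections.singletonMap("HADOOP_HOME", "/home/user/hadoop-x");
        System.out.println(describe("X", env));
    }
}
```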
[jira] [Updated] (MAPREDUCE-4659) Confusing output when running hadoop version from one hadoop installation when HADOOP_HOME points to another
[ https://issues.apache.org/jira/browse/MAPREDUCE-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-4659: Attachment: MAPREDUCE-4659-6.patch
[jira] [Commented] (MAPREDUCE-4659) Confusing output when running hadoop version from one hadoop installation when HADOOP_HOME points to another
[ https://issues.apache.org/jira/browse/MAPREDUCE-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587927#comment-13587927 ] Sandy Ryza commented on MAPREDUCE-4659: Attached a branch-1 patch and a refreshed patch for trunk.
[jira] [Updated] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker updated MAPREDUCE-5027: Attachment: MAPREDUCE-5027.patch Added timeout parameter to test annotation.
[jira] [Commented] (MAPREDUCE-5033) mapred shell script should respect usage flags (--help -help -h)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587931#comment-13587931 ] Aaron T. Myers commented on MAPREDUCE-5033: --- +1, patch looks good to me. I'm going to commit this momentarily. mapred shell script should respect usage flags (--help -help -h) Key: MAPREDUCE-5033 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5033 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.0.3-alpha Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: mapreduce-5033-1.patch Like in HADOOP-9267, the mapred shell script should respect the normal Unix-y help flags. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5033) mapred shell script should respect usage flags (--help -help -h)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated MAPREDUCE-5033: Resolution: Fixed Fix Version/s: 2.0.4-beta Target Version/s: 2.0.4-beta Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've just committed this to trunk and branch-2. Thanks a lot for the contribution, Andrew.
[jira] [Commented] (MAPREDUCE-5027) Shuffle does not limit number of outstanding connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587940#comment-13587940 ] Hadoop QA commented on MAPREDUCE-5027: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571124/MAPREDUCE-5027.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 one of tests included doesn't have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3367//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3367//console This message is automatically generated. 
[jira] [Commented] (MAPREDUCE-5033) mapred shell script should respect usage flags (--help -help -h)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587941#comment-13587941 ] Hudson commented on MAPREDUCE-5033: Integrated in Hadoop-trunk-Commit #3388 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3388/]) MAPREDUCE-5033. mapred shell script should respect usage flags (--help -help -h). Contributed by Andrew Wang. (Revision 1450584) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450584
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred
[jira] [Commented] (MAPREDUCE-4659) Confusing output when running hadoop version from one hadoop installation when HADOOP_HOME points to another
[ https://issues.apache.org/jira/browse/MAPREDUCE-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587976#comment-13587976 ]

Hadoop QA commented on MAPREDUCE-4659:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12571123/MAPREDUCE-4659-6.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:red}-1 one of tests included doesn't have a timeout.{color}
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3368//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3368//console

This message is automatically generated.
Confusing output when running hadoop version from one hadoop installation when HADOOP_HOME points to another

Key: MAPREDUCE-4659
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4659
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client
Affects Versions: 0.20.2, 2.0.1-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Attachments: MAPREDUCE-4659-2.patch, MAPREDUCE-4659-3.patch, MAPREDUCE-4659-4.patch, MAPREDUCE-4659-5.patch, MAPREDUCE-4659-6.patch, MAPREDUCE-4659-branch-1.patch, MAPREDUCE-4659.patch

Hadoop version X is downloaded to ~/hadoop-x, and Hadoop version Y is downloaded to ~/hadoop-y. HADOOP_HOME is set to hadoop-x. A user running hadoop-y/bin/hadoop might expect to be running the hadoop-y jars, but, because of HADOOP_HOME, will actually be running hadoop-x jars. hadoop version could help clear this up a little by reporting the current HADOOP_HOME.
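The suggestion in MAPREDUCE-4659 amounts to printing the effective HADOOP_HOME next to the version string. The following is a minimal sketch of that idea, not the attached patch: the class name `VersionReport`, the `describe` method, the output wording, and the placeholder version string are all hypothetical.

```java
// Hypothetical illustration, not Hadoop's version-reporting code:
// shows the version together with the HADOOP_HOME that is in effect.
public class VersionReport {

    // Builds the report text. hadoopHome may be null or empty when the
    // environment variable is not set.
    public static String describe(String version, String hadoopHome) {
        String home = (hadoopHome == null || hadoopHome.isEmpty())
                ? "(not set)"
                : hadoopHome;
        return "Hadoop " + version + System.lineSeparator()
                + "HADOOP_HOME: " + home;
    }

    public static void main(String[] args) {
        // "x.y.z" is a placeholder; a real tool would use its build version.
        System.out.println(describe("x.y.z", System.getenv("HADOOP_HOME")));
    }
}
```

With output like this, a user who runs hadoop-y/bin/hadoop while HADOOP_HOME points at hadoop-x would see the mismatch directly in the version report.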