Re: MapReduce combiner issue : EOFException while reading Value
Hi Guys,

Can anyone please provide any suggestions on this? I am still facing this issue when running with a combiner. Please give your valuable inputs.

Regards,
Arpit Wanchoo | Sr. Software Engineer
Guavus Network Systems.
6th Floor, Enkay Towers, Tower B B1, Vanijya Nikunj, Udyog Vihar Phase - V, Gurgaon, Haryana.
Mobile Number +91-9899949788

On 28-May-2012, at 10:08 AM, Arpit Wanchoo wrote:
RE: MapReduce combiner issue : EOFException while reading Value
Can you check whether your ValueCollection.write(DataOutput) method is writing exactly what you expect to read back in the readFields() method?

Thanks
Devaraj

From: Arpit Wanchoo [arpit.wanc...@guavus.com]
Sent: Thursday, May 31, 2012 2:57 PM
To: common-user@hadoop.apache.org
Subject: Re: MapReduce combiner issue : EOFException while reading Value
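A common way for write() and readFields() to fall out of step, in line with Devaraj's suggestion, is a variable-length field serialized without a length prefix, or a reused Writable instance whose readFields() does not reset old state. Below is a minimal sketch of a symmetric, variable-length Writable; ValueCollection itself is not shown in this thread, so the class name and fields here are illustrative assumptions, not the actual Guavus code.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.Writable;

public class MeasureList implements Writable {
  private final List<String> measures = new ArrayList<String>();

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeInt(measures.size());    // length prefix so readFields knows when to stop
    for (String m : measures) {
      out.writeUTF(m);
    }
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    measures.clear();                 // reset: MapReduce reuses Writable instances
    int n = in.readInt();             // must mirror write() exactly, field for field
    for (int i = 0; i < n; i++) {
      measures.add(in.readUTF());
    }
  }
}

If readFields() ever consumes a different number of bytes than write() produced, the stream position is wrong for every record that follows, so the failure tends to surface on a later, perfectly healthy key-value pair, exactly as described above.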
Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......
I found the problem, but I am unable to solve it. I need to apply a filter for the _SUCCESS file while using the FileSystem.listStatus method. Can someone please guide me on how to filter out _SUCCESS files?

Thanks

On Tue, May 29, 2012 at 1:42 PM, waqas latif waqas...@gmail.com wrote:
Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......
When your code does a listStatus, you can pass a PathFilter object along that can do this filtering for you. See
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/FileSystem.html#listStatus(org.apache.hadoop.fs.Path,%20org.apache.hadoop.fs.PathFilter)
for the API javadocs on that.

On Wed, May 30, 2012 at 7:46 PM, waqas latif waqas...@gmail.com wrote:

--
Harsh J
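For reference, a minimal sketch of the PathFilter approach Harsh describes; the output path below is a placeholder. Skipping any name that starts with an underscore or a dot also covers _logs and hidden files, not only _SUCCESS.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

// List only the real output files of a job, ignoring _SUCCESS and _logs
FileSystem fs = FileSystem.get(new Configuration());
FileStatus[] parts = fs.listStatus(new Path("/user/waqas/output"), new PathFilter() {
  public boolean accept(Path p) {
    String name = p.getName();
    return !name.startsWith("_") && !name.startsWith(".");
  }
});

Alternatively, if the marker is not needed at all, setting mapreduce.fileoutputcommitter.marksuccessfuljobs to false in the job configuration stops _SUCCESS from being written in the first place (assuming a release that supports that property).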
Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......
Thanks Harsh. I got it running.

On Wed, May 30, 2012 at 5:58 PM, Harsh J ha...@cloudera.com wrote:
Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......
So my question is: do hadoop 0.20 and 1.0.3 differ in their support for writing or reading sequence files? The same code works fine with hadoop 0.20, but the problem occurs when I run it under hadoop 1.0.3.

On Sun, May 27, 2012 at 6:15 PM, waqas latif waqas...@gmail.com wrote:
Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......
I have seen this issue with large file writes using the SequenceFile writer. I have not found the same issue when testing with fairly small files (< 1GB).

On Fri, May 25, 2012 at 10:33 PM, Kasi Subrahmanyam kasisubbu...@gmail.com wrote:

Hi,

If you are using a custom writable object while passing data from the mapper to the reducer, make sure that readFields() and write() handle the same number of fields. It might be possible that you wrote data to a file using the custom writable but later modified the custom writable (for example, adding a new attribute to it) which the old data doesn't have. Please check that once.

On Friday, May 25, 2012, waqas latif wrote:
Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......
But the thing is, it works with hadoop 0.20, even with 100x100 (and even bigger) matrices, but when it comes to hadoop 1.0.3 there is a problem even with a 3x3 matrix.

On Sun, May 27, 2012 at 12:00 PM, Prashant Kommireddi prash1...@gmail.com wrote:
MapReduce combiner issue : EOFException while reading Value
Hi

I have been trying to set up a map reduce job with hadoop 0.20.203.1.

Scenario:
My mapper is writing key-value pairs where I have 13 types of keys in total, with corresponding value classes. For each input record I write all 13 key-value pairs to the context. Also, for one specific key (say K1) I want its mapper output to go to one file, and the output for all other keys to go to the rest of the files. To do this, I have defined my partitioner as:

public int getPartition(DimensionSet key, MeasureSet value, int numPartitions) {
  if (numPartitions < 2) {
    int x = (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    return x;
  }
  int cubeId = key.getCubeId();
  if (cubeId == CubeName.AT_COutgoing.ordinal()) {
    return 0;
  } else {
    int x = ((key.hashCode() & Integer.MAX_VALUE) % (numPartitions - 1)) + 1;
    return x;
  }
}

My combiner and reducer are doing the same thing.

Issue:
My job runs fine when I don't use a combiner, but when I run with a combiner I get an EOFException.

java.io.EOFException
at java.io.DataInputStream.readUnsignedShort(Unknown Source)
at java.io.DataInputStream.readUTF(Unknown Source)
at java.io.DataInputStream.readUTF(Unknown Source)
at com.guavus.mapred.common.collection.ValueCollection.readFieldsLong(ValueCollection.java:40)
at com.guavus.mapred.common.collection.ValueCollection.readFields(ValueCollection.java:21)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1420)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1435)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:852)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1343)

My finding:
On checking and debugging, what I found was that for the particular key-value pair (K1, which I want written to reducer number 0), the combiner reads the key successfully, but while trying to read the values it throws an EOFException because it doesn't find anything in the DataInput stream. This occurs only when the data is large and the combiner runs more than once; the combiner fails to get the value for this key when running for the second time. (I read somewhere that the combiner begins once some amount of data has been written by the mapper, even while the mapper is still writing data to the context.)

Actually the issue occurred with any key which the partitioner assigned to partition 0. I verified many times that my mapper writes no null values. The issue looks really strange because the combiner is able to read the key but doesn't get any value in the data stream.

Please suggest what the root cause could be, or what I can do to track it down.

Regards,
Arpit Wanchoo
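One way to chase a desync like this without rerunning the full job is to round-trip the value class through Hadoop's in-memory buffers, writing two records back to back the way a sorted spill does. This is only a diagnostic sketch: it assumes value is a populated instance of the value type that fails in the trace (ValueCollection / MeasureSet), with setup omitted.

import org.apache.hadoop.io.DataInputBuffer;
import org.apache.hadoop.io.DataOutputBuffer;

// value is a populated instance of the value class under suspicion
DataOutputBuffer out = new DataOutputBuffer();
value.write(out);
value.write(out);                 // two consecutive records, as in a spill
DataInputBuffer in = new DataInputBuffer();
in.reset(out.getData(), out.getLength());
value.readFields(in);
value.readFields(in);             // an EOFException here reproduces the bug in isolation

If the second readFields() fails, write() and readFields() disagree about the record boundary, which matches the observation that the combiner reads the key fine but finds nothing left for the value.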
Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......
I think that you have to ask the MapR experts.

On 05/25/2012 05:42 AM, waqas latif wrote:

Hi Experts,
I am fairly new to hadoop MapR and I was trying to run a matrix multiplication example presented by Mr. Norstadt under the following link: http://www.norstad.org/matrix-multiply/index.html. I can run it successfully with hadoop 0.20.2, but when I try to run it with hadoop 1.0.3 I get the following error. Is it a problem with my hadoop configuration, or is it a compatibility problem in the code, which was written for hadoop 0.20 by the author? Also, please guide me on how I can fix this error in either case.

The same code that you wrote for 0.20.2 should work in 1.0.3 too.

Here is the error I am getting:

Exception in thread "main" java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at java.io.DataInputStream.readFully(DataInputStream.java:152)
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1486)
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1475)
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1470)
at TestMatrixMultiply.fillMatrix(TestMatrixMultiply.java:60)
at TestMatrixMultiply.readMatrix(TestMatrixMultiply.java:87)
at TestMatrixMultiply.checkAnswer(TestMatrixMultiply.java:112)
at TestMatrixMultiply.runOneTest(TestMatrixMultiply.java:150)
at TestMatrixMultiply.testRandom(TestMatrixMultiply.java:278)
at TestMatrixMultiply.main(TestMatrixMultiply.java:308)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Thanks in advance
Regards,
waqas

Can you post the complete log for this?

Best wishes
--
Marcos Luis Ortíz Valmaseda
Data Engineer & Sr. System Administrator at UCI
http://marcosluis2186.posterous.com
http://www.linkedin.com/in/marcosluis2186
Twitter: @marcosluis2186
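Tying this thread together: the cause waqas eventually found (see the follow-ups above) was that newer releases write an empty _SUCCESS marker into the job output directory, and SequenceFile.Reader throws EOFException when handed a file that is not actually a sequence file; 0.20.2 did not create that marker, which is why the same code behaved differently across versions. A hedged sketch of dumping a single part file, with the path taken from the command line (never point it at _SUCCESS or _logs):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class DumpPartFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path part = new Path(args[0]); // e.g. output/part-00000
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, part, conf);
    try {
      // instantiate the key/value types recorded in the file header
      Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
      Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
      while (reader.next(key, value)) {
        System.out.println(key + "\t" + value);
      }
    } finally {
      reader.close();
    }
  }
}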
Re: EOFException
Hi,

In the write method, use writeInt() rather than write(). It should solve your problem.

On Mon, Apr 30, 2012 at 10:40 PM, Keith Thompson kthom...@binghamton.edu wrote:

--
https://github.com/zinnia-phatak-dev/Nectar
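To make the suggestion concrete: DataOutput.write(int) emits only the low-order byte of its argument, while UserTime.readFields() (in Keith's post below) reads each field back with readInt(), which consumes four bytes. A sketch of the corrected method, with one writeInt() per readInt():

@Override
public void write(DataOutput out) throws IOException {
  out.writeInt(id);     // 4 bytes, mirrors in.readInt() in readFields()
  out.writeInt(month);
  out.writeInt(day);
  out.writeInt(year);
  out.writeInt(hour);
  out.writeInt(min);
  out.writeInt(sec);
}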
EOFException
I have been running several MapReduce jobs on some input text files. They were working fine earlier, and then I suddenly started getting EOFException every time. Even the jobs that ran fine before (on the exact same input files) aren't running now. I am a bit perplexed as to what is causing this error. Here is the error:

12/04/30 12:55:55 INFO mapred.JobClient: Task Id : attempt_201202240659_6328_m_01_1, Status : FAILED
java.lang.RuntimeException: java.io.EOFException
at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:128)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare(MapTask.java:967)
at org.apache.hadoop.util.QuickSort.fix(QuickSort.java:30)
at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:83)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:59)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1253)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1154)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at com.xerox.twitter.bin.UserTime.readFields(UserTime.java:31)
at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:122)

Since the compare function seems to be involved, here is my custom key class. Note: I did not include year in the key because all keys have the same year.

public class UserTime implements WritableComparable<UserTime> {

  int id, month, day, year, hour, min, sec;

  public UserTime() {
  }

  public UserTime(int u, int mon, int d, int y, int h, int m, int s) {
    id = u;
    month = mon;
    day = d;
    year = y;
    hour = h;
    min = m;
    sec = s;
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    // TODO Auto-generated method stub
    id = in.readInt();
    month = in.readInt();
    day = in.readInt();
    year = in.readInt();
    hour = in.readInt();
    min = in.readInt();
    sec = in.readInt();
  }

  @Override
  public void write(DataOutput out) throws IOException {
    // TODO Auto-generated method stub
    out.write(id);
    out.write(month);
    out.write(day);
    out.write(year);
    out.write(hour);
    out.write(min);
    out.write(sec);
  }

  @Override
  public int compareTo(UserTime that) {
    // TODO Auto-generated method stub
    if (compareUser(that) == 0)
      return (compareTime(that));
    else if (compareUser(that) == 1)
      return 1;
    else
      return -1;
  }

  private int compareUser(UserTime that) {
    if (id > that.id)
      return 1;
    else if (id == that.id)
      return 0;
    else
      return -1;
  }

  // assumes all are from the same year
  private int compareTime(UserTime that) {
    if (month > that.month
        || (month == that.month && day > that.day)
        || (month == that.month && day == that.day && hour > that.hour)
        || (month == that.month && day == that.day && hour == that.hour && min > that.min)
        || (month == that.month && day == that.day && hour == that.hour && min == that.min && sec > that.sec))
      return 1;
    else if (month == that.month && day == that.day && hour == that.hour
        && min == that.min && sec == that.sec)
      return 0;
    else
      return -1;
  }

  public String toString() {
    String h, m, s;
    if (hour < 10) h = "0" + hour;
    else h = Integer.toString(hour);
    if (min < 10) m = "0" + min;
    else m = Integer.toString(hour);
    if (sec < 10) s = "0" + min;
    else s = Integer.toString(hour);
    return (id + "\t" + month + "/" + day + "/" + year + "\t" + h + ":" + m + ":" + s);
  }
}

Thanks for any help.

Regards,
Keith
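To see why the failure surfaces in the sort rather than in the reducer: with write(int), each serialized key is 7 bytes, while readFields() tries to consume 28, and WritableComparator deserializes keys straight from the spill buffer during QuickSort, which is exactly the path in the stack trace above. A tiny standalone check of the byte counts, using plain java.io (the class name is ad hoc):

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class WriteVsWriteInt {
  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.write(42);      // low-order byte only: 1 byte on the stream
    out.writeInt(42);   // full int: 4 bytes on the stream
    out.flush();
    System.out.println(bytes.size()); // prints 5; two readInt() calls would need 8
  }
}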
Re: EOFException
Hi,

Seems like HDFS is in safemode.

On Fri, Mar 16, 2012 at 1:37 AM, Mohit Anchlia mohitanch...@gmail.com wrote:

--
https://github.com/zinnia-phatak-dev/Nectar
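If safe mode is the suspect, it can be confirmed from the command line with the standard dfsadmin tool (the commands below are generic, not taken from this thread):

hadoop dfsadmin -safemode get
hadoop dfsadmin -safemode leave

The namenode normally leaves safe mode by itself once enough datanodes have reported their blocks, so forcing it out should be a last resort.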
Re: EOFException
On 03/15/2012 03:06 PM, Mohit Anchlia wrote:

When I start a job to read data from HDFS I start getting these errors. Does anyone know what this means and how to resolve it?

2012-03-15 10:41:31,402 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.204:50010 java.io.EOFException
2012-03-15 10:41:31,402 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-6402969611996946639_11837
2012-03-15 10:41:31,403 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.204:50010
2012-03-15 10:41:31,406 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.198:50010 java.io.EOFException
2012-03-15 10:41:31,406 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-5442664108986165368_11838
2012-03-15 10:41:31,407 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.197:50010 java.io.EOFException
2012-03-15 10:41:31,407 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-3373089616877234160_11838
2012-03-15 10:41:31,407 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.198:50010
2012-03-15 10:41:31,409 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.197:50010
2012-03-15 10:41:31,410 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.204:50010 java.io.EOFException
2012-03-15 10:41:31,410 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_4481292025401332278_11838
2012-03-15 10:41:31,411 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.204:50010
2012-03-15 10:41:31,412 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.200:50010 java.io.EOFException
2012-03-15 10:41:31,412 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-5326771177080888701_11838
2012-03-15 10:41:31,413 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.200:50010
2012-03-15 10:41:31,414 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.197:50010 java.io.EOFException
2012-03-15 10:41:31,414 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-8073750683705518772_11839
2012-03-15 10:41:31,415 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.197:50010
2012-03-15 10:41:31,416 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.199:50010 java.io.EOFException
2012-03-15 10:41:31,416 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.198:50010 java.io.EOFException
2012-03-15 10:41:31,416 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_441003866688859169_11838
2012-03-15 10:41:31,416 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-466858474055876377_11839
2012-03-15 10:41:31,417 [Thread-5] INFO org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.198:50010
2012-03-15 10:41:31,417 [Thread-5] WARN org.apache.hadoop.hdfs.DFSClient -

Try shutting down and restarting hbase.
Re: EOFException
This is actually just hadoop job over HDFS. I am assuming you also know why this is erroring out? On Thu, Mar 15, 2012 at 1:02 PM, Gopal absoft...@gmail.com wrote: On 03/15/2012 03:06 PM, Mohit Anchlia wrote: When I start a job to read data from HDFS I start getting these errors. Does anyone know what this means and how to resolve it? 2012-03-15 10:41:31,402 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Exception in createBlockOutputStream 164.28.62.204:50010java.io.** EOFException 2012-03-15 10:41:31,402 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Abandoning block blk_-6402969611996946639_11837 2012-03-15 10:41:31,403 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Excluding datanode 164.28.62.204:50010 2012-03-15 10:41:31,406 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Exception in createBlockOutputStream 164.28.62.198:50010java.io.** EOFException 2012-03-15 10:41:31,406 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Abandoning block blk_-5442664108986165368_11838 2012-03-15 10:41:31,407 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Exception in createBlockOutputStream 164.28.62.197:50010java.io.** EOFException 2012-03-15 10:41:31,407 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Abandoning block blk_-3373089616877234160_11838 2012-03-15 10:41:31,407 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Excluding datanode 164.28.62.198:50010 2012-03-15 10:41:31,409 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Excluding datanode 164.28.62.197:50010 2012-03-15 10:41:31,410 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Exception in createBlockOutputStream 164.28.62.204:50010java.io.** EOFException 2012-03-15 10:41:31,410 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Abandoning block blk_4481292025401332278_11838 2012-03-15 10:41:31,411 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Excluding datanode 164.28.62.204:50010 2012-03-15 10:41:31,412 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Exception in createBlockOutputStream 164.28.62.200:50010java.io.** EOFException 2012-03-15 10:41:31,412 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Abandoning block blk_-5326771177080888701_11838 2012-03-15 10:41:31,413 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Excluding datanode 164.28.62.200:50010 2012-03-15 10:41:31,414 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Exception in createBlockOutputStream 164.28.62.197:50010java.io.** EOFException 2012-03-15 10:41:31,414 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Abandoning block blk_-8073750683705518772_11839 2012-03-15 10:41:31,415 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Excluding datanode 164.28.62.197:50010 2012-03-15 10:41:31,416 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Exception in createBlockOutputStream 164.28.62.199:50010java.io.** EOFException 2012-03-15 10:41:31,416 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Exception in createBlockOutputStream 164.28.62.198:50010java.io.** EOFException 2012-03-15 10:41:31,416 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Abandoning block blk_441003866688859169_11838 2012-03-15 10:41:31,416 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Abandoning block blk_-466858474055876377_11839 2012-03-15 10:41:31,417 [Thread-5] INFO org.apache.hadoop.hdfs.**DFSClient - Excluding datanode 164.28.62.198:50010 2012-03-15 10:41:31,417 [Thread-5] WARN org.apache.hadoop.hdfs.**DFSClient - Try shutting down and restarting hbase.
EOFException thrown by a Hadoop pipes program
Hello,

I have a small Hadoop pipes program that throws java.io.EOFException. The program takes a small text file as input and uses hadoop.pipes.java.recordreader and hadoop.pipes.java.recordwriter. The input is very simple, like:

1 262144 42.8084 15.9157 4.1324 0.06 0.1

However, Hadoop throws an EOFException, and I can't see the reason. Below is the stack trace:

10/12/08 23:04:04 INFO mapred.JobClient: Running job: job_201012081252_0016
10/12/08 23:04:05 INFO mapred.JobClient: map 0% reduce 0%
10/12/08 23:04:16 INFO mapred.JobClient: Task Id : attempt_201012081252_0016_m_00_0, Status : FAILED
java.io.IOException: pipe child exception
at org.apache.hadoop.mapred.pipes.Application.abort(Application.java:151)
at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:101)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.io.EOFException
at java.io.DataInputStream.readByte(DataInputStream.java:267)
at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
at org.apache.hadoop.mapred.pipes.BinaryProtocol$UplinkReaderThread.run(BinaryProtocol.java:114)

BTW, I ran this in fully-distributed mode (a cluster with 3 worker nodes). I am stuck and any help is appreciated!

Thanks
- Peng
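An EOFException in BinaryProtocol$UplinkReaderThread underneath a "pipe child exception" generally means the C++ child process died before completing the binary-protocol handshake, so the Java parent hit end-of-stream on the uplink; the child's stderr in the failed task's logs is the first place to look. For reference, a hedged sketch of the Java-side driver that sets the two record reader/writer flags mentioned above; the executable path and I/O paths are placeholders, not Peng's actual setup:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.pipes.Submitter;

public class PipesDriver {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(PipesDriver.class);
    Submitter.setIsJavaRecordReader(conf, true);   // hadoop.pipes.java.recordreader=true
    Submitter.setIsJavaRecordWriter(conf, true);   // hadoop.pipes.java.recordwriter=true
    Submitter.setExecutable(conf, "bin/my_pipes_binary"); // HDFS path to the C++ binary
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    Submitter.runJob(conf);
  }
}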
Re: EOFException and BadLink, but file descriptors number is ok?
Yes, you're likely to see an error in the DN log. Do you see anything about max number of xceivers?

-Todd

On Thu, Feb 4, 2010 at 11:42 PM, Meng Mao meng...@gmail.com wrote:
Re: EOFException and BadLink, but file descriptors number is ok?
ack, after looking at the logs again, there are definitely xcievers errors. It's set to 256! I had thought I had cleared this a possible cause, but guess I was wrong. Gonna retest right away. Thanks! On Fri, Feb 5, 2010 at 11:05 AM, Todd Lipcon t...@cloudera.com wrote: Yes, you're likely to see an error in the DN log. Do you see anything about max number of xceivers? -Todd On Thu, Feb 4, 2010 at 11:42 PM, Meng Mao meng...@gmail.com wrote: not sure what else I could be checking to see where the problem lies. Should I be looking in the datanode logs? I looked briefly in there and didn't see anything from around the time exceptions started getting reported. lsof during the job execution? Number of open threads? I'm at a loss here. On Thu, Feb 4, 2010 at 2:52 PM, Meng Mao meng...@gmail.com wrote: I wrote a hadoop job that checks for ulimits across the nodes, and every node is reporting: core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 139264 max locked memory (kbytes, -l) 32 max memory size (kbytes, -m) unlimited open files (-n) 65536 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 139264 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited Is anything in there telling about file number limits? From what I understand, a high open files limit like 65536 should be enough. I estimate only a couple thousand part-files on HDFS being written to at once, and around 200 on the filesystem per node. On Wed, Feb 3, 2010 at 4:04 PM, Meng Mao meng...@gmail.com wrote: also, which is the ulimit that's important, the one for the user who is running the job, or the hadoop user that owns the Hadoop processes? On Tue, Feb 2, 2010 at 7:29 PM, Meng Mao meng...@gmail.com wrote: I've been trying to run a fairly small input file (300MB) on Cloudera Hadoop 0.20.1. The job I'm using probably writes to on the order of over 1000 part-files at once, across the whole grid. The grid has 33 nodes in it. 
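For reference, the fix this thread converged on is raising the datanode transceiver cap in hdfs-site.xml. The property name below is the real (misspelled) 0.20-era key; the value 4096 is a commonly used choice rather than anything specified in this thread, so treat it as a sketch:

    <!-- hdfs-site.xml on every datanode; takes effect after a datanode restart -->
    <property>
      <name>dfs.datanode.max.xcievers</name>
      <!-- default is 256; each active read/write pipeline on the node consumes one -->
      <value>4096</value>
    </property>

When the cap is hit, the datanode log carries a line along the lines of "xceiverCount N exceeds the limit of concurrent xcievers 256" -- the error Todd was pointing at.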
Re: EOFException and BadLink, but file descriptors number is ok?
I wrote a hadoop job that checks ulimits across the nodes, and every node is reporting:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 139264
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 139264
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Is anything in there telling about file number limits? From what I understand, a high open-files limit like 65536 should be enough. I estimate only a couple thousand part-files on HDFS being written to at once, and around 200 on the local filesystem per node.

On Wed, Feb 3, 2010 at 4:04 PM, Meng Mao meng...@gmail.com wrote: [...]
On Tue, Feb 2, 2010 at 7:29 PM, Meng Mao meng...@gmail.com wrote: [...]
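(A sketch of one way to run that kind of check, for anyone reproducing it -- not the exact job used here. A throwaway streaming job runs the shell command under the same account and environment the task JVMs actually get, which is the point of the exercise. The jar path assumes a stock 0.20 layout:)

    # a dummy input file so a few map tasks get scheduled across the grid
    seq 1 100 | hadoop fs -put - /tmp/limits-in

    # each mapper just reports its host's limits; no reduce needed
    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*streaming*.jar \
        -D mapred.reduce.tasks=0 \
        -input /tmp/limits-in \
        -output /tmp/limits-out \
        -mapper 'bash -c "hostname; ulimit -a"'

    hadoop fs -cat '/tmp/limits-out/part-*' | sort -u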
Re: EOFException and BadLink, but file descriptors number is ok?
Not sure what else I could be checking to see where the problem lies. Should I be looking in the datanode logs? I looked briefly in there and didn't see anything from around the time the exceptions started getting reported. lsof during the job execution? Number of open threads? I'm at a loss here.

On Thu, Feb 4, 2010 at 2:52 PM, Meng Mao meng...@gmail.com wrote: [...]
On Wed, Feb 3, 2010 at 4:04 PM, Meng Mao meng...@gmail.com wrote: [...]
On Tue, Feb 2, 2010 at 7:29 PM, Meng Mao meng...@gmail.com wrote: [...]
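(On the lsof idea: a cheap way to watch descriptor usage live while the job runs, assuming standard daemon names. Counting entries in /proc/<pid>/fd avoids lsof's slowness on busy boxes:)

    # sample open-descriptor counts for the Hadoop daemons every 5 seconds
    while true; do
        for proc in DataNode TaskTracker; do
            pid=$(pgrep -f $proc | head -1)
            [ -n "$pid" ] && echo "$(date +%T) $proc $(ls /proc/$pid/fd | wc -l)"
        done
        sleep 5
    done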
Re: EOFException and BadLink, but file descriptors number is ok?
Also, which ulimit is the important one: the one for the user who is running the job, or the one for the hadoop user that owns the Hadoop processes?

On Tue, Feb 2, 2010 at 7:29 PM, Meng Mao meng...@gmail.com wrote: [...]
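(As far as I understand it, it's the hadoop user's limits that matter on the cluster side: ulimit is a per-process attribute inherited at fork, the task JVMs are forked by the TaskTracker, and the TaskTracker runs as the hadoop user, so the submitting user's limit only applies to the local job client. A quick comparison, assuming a hadoop service account:)

    # limit for the account submitting the job (only the client JVM sees this)
    ulimit -n

    # limit a fresh shell for the hadoop account would get via limits.conf;
    # a daemon that is already running keeps whatever it started with, so
    # /proc/<pid>/limits is the ground truth for the live processes
    sudo -u hadoop bash -c 'ulimit -n'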
EOFException and BadLink, but file descriptors number is ok?
I've been trying to run a fairly small input file (300MB) on Cloudera Hadoop 0.20.1. The job I'm using probably writes to on the order of 1000 part-files at once, across the whole grid. The grid has 33 nodes in it.

I get the following exception in the run logs:

10/01/30 17:24:25 INFO mapred.JobClient: map 100% reduce 12%
10/01/30 17:24:25 INFO mapred.JobClient: Task Id : attempt_201001261532_1137_r_13_0, Status : FAILED
java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:250)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
        at org.apache.hadoop.io.Text.readString(Text.java:400)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2869)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2794)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2077)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2263)

lots of EOFExceptions

10/01/30 17:24:25 INFO mapred.JobClient: Task Id : attempt_201001261532_1137_r_19_0, Status : FAILED
java.io.IOException: Bad connect ack with firstBadLink 10.2.19.1:50010
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2871)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2794)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2077)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2263)

10/01/30 17:24:36 INFO mapred.JobClient: map 100% reduce 11%
10/01/30 17:24:42 INFO mapred.JobClient: map 100% reduce 12%
10/01/30 17:24:49 INFO mapred.JobClient: map 100% reduce 13%
10/01/30 17:24:55 INFO mapred.JobClient: map 100% reduce 14%
10/01/30 17:25:00 INFO mapred.JobClient: map 100% reduce 15%

From searching around, it seems the most common cause of BadLink and EOFExceptions is nodes that don't have enough file descriptors available. But across all the grid machines, fs.file-max has been set to 1573039, and we set ulimit -n to 65536 via hadoop-env.sh. Where else should I be looking for what's causing this?
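(For completeness, roughly how the two settings described above are typically applied. The values are the ones from this thread; the hadoop-env.sh mechanism is a sketch that relies on the daemon start scripts sourcing that file, so it only takes effect on daemon restart:)

    # /etc/sysctl.conf -- system-wide cap on open file handles
    fs.file-max = 1573039
    # load it without a reboot: sysctl -p

    # conf/hadoop-env.sh -- sourced when the daemons start, so the raised
    # per-process limit is inherited by the DataNode/TaskTracker JVMs
    ulimit -n 65536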