Hadoop version: 17.1
io.sort.factor = 10

The key is of the form "ID:DenseVector representation in Mahout", with dimensionality = 160K. For example:

C1:[0.00111111, 3.002, ..., 1.001, ...]

So the typical size of a mapper output key can be 160K * 6 bytes (assuming a double in string form takes 5 bytes, plus one byte for the comma separator) + 5 bytes for "C1:[]" + the overhead Text needs to record the object type, i.e. close to 1MB per key.
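A quick sketch of that arithmetic in Java (the class name and the stand-in 5-character component value are made up for illustration; it just builds one such key and prints its byte length):

import org.apache.hadoop.io.Text;

// Builds one key of the form "C1:[v, v, ..., v]" with 160K components
// and prints its size in bytes. "3.002" is a stand-in ~5-char double.
public class KeySizeEstimate {
    public static void main(String[] args) {
        int dims = 160000;
        StringBuilder sb = new StringBuilder("C1:[");
        for (int i = 0; i < dims; i++) {
            if (i > 0) sb.append(',');
            sb.append("3.002");
        }
        sb.append(']');
        Text key = new Text(sb.toString());
        // getLength() is the UTF-8 byte length of the Text payload:
        // roughly 160K * 6 bytes, i.e. close to 1MB per key.
        System.out.println("key bytes: " + key.getLength());
    }
}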
Thanks,
Pallavi

Devaraj Das wrote:
>
> On 9/17/08 6:06 PM, "Pallavi Palleti" <[EMAIL PROTECTED]> wrote:
>
>> Hi all,
>>
>> I am getting an out-of-memory error, as shown below, when I run map-red on a
>> huge amount of data:
>>
>> java.lang.OutOfMemoryError: Java heap space
>>   at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:52)
>>   at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:90)
>>   at org.apache.hadoop.io.SequenceFile$Reader.nextRawKey(SequenceFile.java:1974)
>>   at org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile.java:3002)
>>   at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2802)
>>   at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2511)
>>   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1040)
>>   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:698)
>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:220)
>>   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
>>
>> The above error comes almost at the end of the map job. I have set the heap
>> size to 1GB, but the problem persists. Can someone please help me figure out
>> how to avoid this error?
>
> What is the typical size of your key? What is the value of io.sort.factor?
> Hadoop version?
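In case it helps anyone hitting the same trace: the failure is in the final merge of spilled map output, where each open segment buffers one raw key at a time, so very large keys plus a high io.sort.factor squeeze the heap. Below is a minimal, untested sketch of the usual knobs, assuming the 0.17-era JobConf API; the class name and values are illustrative assumptions, not a verified fix:

import org.apache.hadoop.mapred.JobConf;

// Illustrative tuning only -- the values are assumptions, not a verified fix.
public class TuneJob {
    public static void main(String[] args) {
        JobConf conf = new JobConf(TuneJob.class);
        // Give the task child JVMs more heap than the 1GB that still OOMed.
        conf.set("mapred.child.java.opts", "-Xmx1536m");
        // Merge fewer spill segments at once; each open segment holds one
        // raw key (close to 1MB here) in memory during the merge in
        // MapTask$MapOutputBuffer.mergeParts.
        conf.setInt("io.sort.factor", 5);
        // In-memory buffer used while collecting map output before spilling.
        conf.setInt("io.sort.mb", 100);
        // ... set mapper/reducer and input/output paths here, then run
        // the job with JobClient.runJob(conf).
    }
}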
