If the vector size is too large, the current Hama will also run out of
memory. So I would like to add a 2D layout version for parallel matrix
multiplication to the 0.1 release plan.

Therefore, I'll be renaming some classes:

MultiplicationMap.java -> Mult1DLayoutMap.java
MultiplicationReduce.java -> Mult1DLayoutReduce.java
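
To sketch the 2D layout idea (the class below is illustrative only, not the
actual Hama code; the "row,col" key format and BLOCK_SIZE are assumptions):
each matrix entry is keyed by its (blockRow, blockCol) cell instead of by a
whole row, so no single task ever has to buffer a full row vector in memory.

import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Sketch only -- not the actual Mult2DLayoutMap.
public class Mult2DLayoutMap extends MapReduceBase
    implements Mapper<Text, DoubleWritable, Text, DoubleWritable> {

  private static final int BLOCK_SIZE = 1000; // entries per block side (assumed)

  // The input key is assumed to be "row,col" for a single matrix entry.
  // Emitting under the 2D block coordinate keeps each reduce group down
  // to one block instead of one full 160k-wide row.
  public void map(Text key, DoubleWritable value,
                  OutputCollector<Text, DoubleWritable> output,
                  Reporter reporter) throws IOException {
    String[] rc = key.toString().split(",");
    int row = Integer.parseInt(rc[0]);
    int col = Integer.parseInt(rc[1]);
    output.collect(new Text((row / BLOCK_SIZE) + "," + (col / BLOCK_SIZE)),
                   value);
  }
}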

/Edward

On Fri, Sep 19, 2008 at 5:41 PM, Edward J. Yoon <[EMAIL PROTECTED]> wrote:
> Great experience!
>
> /Edward
>
> On Fri, Sep 19, 2008 at 2:50 PM, Palleti, Pallavi
> <[EMAIL PROTECTED]> wrote:
>> Yeah, that was the problem. And Hama can surely be useful for large-scale
>> matrix operations.
>>
>> But for this problem, I have modified the code to pass just the ID
>> information and to read the vector data only when it is needed, which in
>> this case was only in the reducer phase. This avoids the out-of-memory
>> error, and the job is also faster now.
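>>
>> In case it helps others, the pattern is roughly the following (the
>> MapFile path and class names are placeholders, not the actual code):
>>
>> import java.io.IOException;
>> import java.util.Iterator;
>> import org.apache.hadoop.fs.FileSystem;
>> import org.apache.hadoop.io.MapFile;
>> import org.apache.hadoop.io.Text;
>> import org.apache.hadoop.mapred.JobConf;
>> import org.apache.hadoop.mapred.MapReduceBase;
>> import org.apache.hadoop.mapred.OutputCollector;
>> import org.apache.hadoop.mapred.Reducer;
>> import org.apache.hadoop.mapred.Reporter;
>>
>> // Sketch of "pass only the ID, load the vector lazily": map output
>> // carries just the small ID key, and the 160k-dimensional vector is
>> // looked up from a MapFile on HDFS only here in the reducer, so the
>> // sort/merge phase never has to buffer the big vectors.
>> public class LazyVectorReduce extends MapReduceBase
>>     implements Reducer<Text, Text, Text, Text> {
>>
>>   private MapFile.Reader vectors;
>>
>>   public void configure(JobConf job) {
>>     try {
>>       // "/vectors" is a placeholder: a MapFile of ID -> vector text.
>>       vectors = new MapFile.Reader(FileSystem.get(job), "/vectors", job);
>>     } catch (IOException e) {
>>       throw new RuntimeException(e);
>>     }
>>   }
>>
>>   public void reduce(Text id, Iterator<Text> values,
>>                      OutputCollector<Text, Text> output, Reporter reporter)
>>       throws IOException {
>>     Text vector = new Text();
>>     vectors.get(id, vector);    // lazy load: read only when actually needed
>>     output.collect(id, vector); // combine with the incoming values here
>>   }
>> }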
>>
>> Thanks
>> Pallavi
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Edward J. Yoon
>> Sent: Friday, September 19, 2008 10:35 AM
>> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [email protected]
>> Subject: Re: OutOfMemory Error
>>
>>> The key is of the form "ID :DenseVector Representation in mahout with
>>
>> I guess the vector size is too large, so large-scale matrix operations
>> will need a distributed vector architecture (or 2D partitioning
>> strategies). The Hama team is investigating these problem areas, so this
>> should improve if Hama can be used for Mahout in the future.
>>
>> /Edward
>>
>> On Thu, Sep 18, 2008 at 12:28 PM, Pallavi Palleti <[EMAIL PROTECTED]> wrote:
>>>
>>> Hadoop version: 0.17.1
>>> io.sort.factor = 10
>>> The key is of the form "ID:DenseVector representation in Mahout with
>>> dimensionality size = 160k".
>>> For example: C1:[0.00111111, 3.002, ...... 1.001,....]
>>> So, the typical size of a mapper output key can be 160K*6 bytes (assuming
>>> a double in string form takes 5 bytes, plus a separator) + 5 bytes for
>>> "C1:[]" + the size required to store that the object is of type Text.
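>>>
>>> Working that out: 160,000 dimensions * 6 bytes is roughly 960 KB, so
>>> each map output key alone is close to 1 MB, and buffering even a few
>>> hundred of them during the sort/merge would exhaust a 1 GB heap.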
>>>
>>> Thanks
>>> Pallavi
>>>
>>>
>>>
>>> Devaraj Das wrote:
>>>>
>>>>
>>>>
>>>>
>>>> On 9/17/08 6:06 PM, "Pallavi Palleti" <[EMAIL PROTECTED]> wrote:
>>>>
>>>>>
>>>>> Hi all,
>>>>>
>>>>>    I am getting an out-of-memory error, as shown below, when I run
>>>>> map-reduce on a huge amount of data:
>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>> at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:52)
>>>>> at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:90)
>>>>> at org.apache.hadoop.io.SequenceFile$Reader.nextRawKey(SequenceFile.java:1974)
>>>>> at org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile.java:3002)
>>>>> at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2802)
>>>>> at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2511)
>>>>> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1040)
>>>>> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:698)
>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:220)
>>>>> at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
>>>>> The above error comes almost at the end of the map job. I have set the
>>>>> heap size to 1GB, but the problem persists. Can someone please help me
>>>>> figure out how to avoid this error?
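>>>>>
>>>>> For reference, the job is configured roughly like this (MyJob and the
>>>>> exact values are placeholders):
>>>>>
>>>>> import org.apache.hadoop.mapred.JobConf;
>>>>>
>>>>> public class MyJob {
>>>>>   public static void main(String[] args) {
>>>>>     JobConf conf = new JobConf(MyJob.class);
>>>>>     conf.set("mapred.child.java.opts", "-Xmx1024m"); // 1 GB per task JVM
>>>>>     conf.setInt("io.sort.factor", 10); // merge width during sort
>>>>>     conf.setInt("io.sort.mb", 100);    // map-side sort buffer, in MB
>>>>>     // ... set mapper/reducer classes and call JobClient.runJob(conf) ...
>>>>>   }
>>>>> }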
>>>> What is the typical size of your key? What is the value of io.sort.factor?
>>>> Hadoop version?
>>>>
>>>>
>>>>
>>>>
>>>
>>> --
>>> View this message in context: 
>>> http://www.nabble.com/OutOfMemory-Error-tp19531174p19545298.html
>>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>>
>> --
>> Best regards, Edward J. Yoon
>> [EMAIL PROTECTED]
>> http://blog.udanax.org
>>
>
>
>
> --
> Best regards, Edward J. Yoon
> [EMAIL PROTECTED]
> http://blog.udanax.org
>



-- 
Best regards, Edward J. Yoon
[EMAIL PROTECTED]
http://blog.udanax.org
