Sorry, same result on the large cluster :( Let's fix this in the 0.7 release.

On Wed, Sep 26, 2012 at 10:46 AM, Edward J. Yoon <[email protected]> wrote:
> P.S., ignore the "+      data = null;" line in the patch below; setting
> data to null right before ByteBuffer.wrap(data) would throw a
> NullPointerException.
>
> On Wed, Sep 26, 2012 at 10:16 AM, Edward J. Yoon <[email protected]> 
> wrote:
>> FYI, the problem is fixed by the small patch below.
>>
>> Index: src/main/java/org/apache/hama/bsp/message/AvroMessageManagerImpl.java
>> ===================================================================
>> --- src/main/java/org/apache/hama/bsp/message/AvroMessageManagerImpl.java    (revision 1389695)
>> +++ src/main/java/org/apache/hama/bsp/message/AvroMessageManagerImpl.java    (working copy)
>> @@ -135,12 +135,15 @@
>>        ByteArrayInputStream inArray = new ByteArrayInputStream(byteArray);
>>        DataInputStream in = new DataInputStream(inArray);
>>        msg.readFields(in);
>> +      in.close();
>> +      inArray.close();
>>      } else {
>>        peer.incrementCounter(BSPPeerImpl.PeerCounter.COMPRESSED_BYTES_RECEIVED,
>>            byteArray.length);
>>        msg = compressor.decompressBundle(new BSPCompressedBundle(byteArray));
>>      }
>>
>> +    byteArray = null;
>>      return msg;
>>    }
>>
>> @@ -154,12 +157,14 @@
>>        byte[] byteArray = outArray.toByteArray();
>>        peer.incrementCounter(BSPPeerImpl.PeerCounter.MESSAGE_BYTES_TRANSFERED,
>>            byteArray.length);
>> +      outArray.close();
>>        return ByteBuffer.wrap(byteArray);
>>      } else {
>>        BSPCompressedBundle compMsgBundle = compressor.compressBundle(msg);
>>        byte[] data = compMsgBundle.getData();
>>        peer.incrementCounter(BSPPeerImpl.PeerCounter.COMPRESSED_BYTES_SENT,
>>            data.length);
>> +      data = null;
>>        return ByteBuffer.wrap(data);
>>      }
>>    }
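>>
>> For reference, if we can require Java 7, the same cleanup could use
>> try-with-resources instead of explicit close() calls; a minimal sketch
>> of the deserialization path only (it relies on the byteArray and msg
>> variables from the patch above, and is not the exact Hama code):
>>
>>   ByteArrayInputStream inArray = new ByteArrayInputStream(byteArray);
>>   try (DataInputStream in = new DataInputStream(inArray)) {
>>     // readFields() deserializes the message; both streams are closed
>>     // automatically, even if it throws.
>>     msg.readFields(in);
>>   }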
>>
>> On Thu, Sep 20, 2012 at 4:22 PM, Edward J. Yoon <[email protected]> 
>> wrote:
>>> It seems we didn't test the 0.5 release thoroughly.
>>>
>>> Let's give ourselves more time.
>>>
>>> On Wed, Sep 19, 2012 at 8:00 PM, Thomas Jungblut
>>> <[email protected]> wrote:
>>>> Wait, the problem had multiple roots. We just fixed a few. I will sit
>>>> down and take a look at the messaging.
>>>> If this is scalable, we can remove the multi-step partitioning.
>>>> On 19.09.2012 12:21, "Edward J. Yoon" <[email protected]> wrote:
>>>>
>>>>> P.S., since the memory issue of the graph job will be fixed by
>>>>> Thomas's HAMA-642, I'll remove my dirty multi-step partitioning code
>>>>> in the graph module if there's no problem w/ Hadoop RPC tomorrow.
>>>>>
>>>>> On Wed, Sep 19, 2012 at 5:53 PM, Thomas Jungblut
>>>>> <[email protected]> wrote:
>>>>> > I will give you more details on what I planned for the interface
>>>>> > changes once I'm back from my lecture.
>>>>> >
>>>>> > 2012/9/19 Suraj Menon <[email protected]>
>>>>> >
>>>>> >> As a beginning, we should have a spilling queue, and the same with
>>>>> >> a combiner running in batch if possible.
>>>>> >> I have been looking into implementing the spilling queue. Chalking
>>>>> >> out the requirements, we should look into the following:
>>>>> >>
>>>>> >> A queue should persist all the data if required by the framework
>>>>> >> for fault tolerance. (I feel it would be a bad idea for the
>>>>> >> framework to spend resources on making a separate copy of the file.)
>>>>> >> Asynchronous communication is our next important feature required
>>>>> >> for performance. Hence we would need this queue with a combiner on
>>>>> >> the sender side to batch the messages before sending. This implies
>>>>> >> we need to support both concurrent reads and writes.
>>>>> >>
>>>>> >> -Suraj
>>>>> >>
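>>>>> >> For illustration, a minimal sketch of the spilling idea (a
>>>>> >> hypothetical class, not the Hama API; it keeps messages in memory
>>>>> >> up to a threshold, appends the overflow to a local file, and uses
>>>>> >> synchronized methods as a crude stand-in for concurrent reads and
>>>>> >> writes):
>>>>> >>
>>>>> >>   import java.io.BufferedInputStream;
>>>>> >>   import java.io.BufferedOutputStream;
>>>>> >>   import java.io.DataInputStream;
>>>>> >>   import java.io.DataOutputStream;
>>>>> >>   import java.io.File;
>>>>> >>   import java.io.FileInputStream;
>>>>> >>   import java.io.FileOutputStream;
>>>>> >>   import java.io.IOException;
>>>>> >>   import java.util.ArrayDeque;
>>>>> >>   import java.util.ArrayList;
>>>>> >>   import java.util.List;
>>>>> >>   import java.util.Queue;
>>>>> >>
>>>>> >>   public class SpillingQueue {
>>>>> >>     private final Queue<String> memory = new ArrayDeque<String>();
>>>>> >>     private final int threshold;
>>>>> >>     private final File spillFile;
>>>>> >>     private DataOutputStream spillOut;
>>>>> >>     private int spilled = 0;
>>>>> >>
>>>>> >>     public SpillingQueue(int threshold, File spillFile) {
>>>>> >>       this.threshold = threshold;
>>>>> >>       this.spillFile = spillFile;
>>>>> >>     }
>>>>> >>
>>>>> >>     /** Keeps messages in memory up to the threshold, then spills. */
>>>>> >>     public synchronized void add(String msg) throws IOException {
>>>>> >>       if (memory.size() < threshold) {
>>>>> >>         memory.add(msg);
>>>>> >>         return;
>>>>> >>       }
>>>>> >>       if (spillOut == null) {
>>>>> >>         spillOut = new DataOutputStream(new BufferedOutputStream(
>>>>> >>             new FileOutputStream(spillFile)));
>>>>> >>       }
>>>>> >>       spillOut.writeUTF(msg); // on-disk copy doubles as a persistence log
>>>>> >>       spilled++;
>>>>> >>     }
>>>>> >>
>>>>> >>     /** Drains in insertion order: in-memory part, then the spill. */
>>>>> >>     public synchronized List<String> drain() throws IOException {
>>>>> >>       List<String> all = new ArrayList<String>(memory);
>>>>> >>       memory.clear();
>>>>> >>       if (spilled > 0) {
>>>>> >>         spillOut.close();
>>>>> >>         DataInputStream in = new DataInputStream(new BufferedInputStream(
>>>>> >>             new FileInputStream(spillFile)));
>>>>> >>         try {
>>>>> >>           for (int i = 0; i < spilled; i++) {
>>>>> >>             all.add(in.readUTF());
>>>>> >>           }
>>>>> >>         } finally {
>>>>> >>           in.close();
>>>>> >>         }
>>>>> >>         spilled = 0;
>>>>> >>         spillOut = null;
>>>>> >>       }
>>>>> >>       return all;
>>>>> >>     }
>>>>> >>   }
>>>>> >>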
>>>>> >> On Wed, Sep 19, 2012 at 4:21 AM, Thomas Jungblut
>>>>> >> <[email protected]>wrote:
>>>>> >>
>>>>> >> > Oh okay, very interesting. Just another argument for making the
>>>>> >> > messaging more scalable ;)
>>>>> >> >
>>>>> >> > 2012/9/19 Edward J. Yoon <[email protected]>
>>>>> >> >
>>>>> >> > > Didn't check memory usage because each machine's memory is 48
>>>>> >> > > GB, but I guess there's no big difference.
>>>>> >> > >
>>>>> >> > > In short, "bin/hama bench 16 10000 32" was the maximum capacity
>>>>> >> > > (see [1]). If message numbers or nodes are increased, the job
>>>>> >> > > always fails. Hadoop RPC is OK.
>>>>> >> > >
>>>>> >> > > Will need time to debug this.
>>>>> >> > >
>>>>> >> > > 1. http://wiki.apache.org/hama/Benchmarks#Random_Communication_Benchmark
>>>>> >> > > On 9/19/2012 4:34 PM, Thomas Jungblut wrote:
>>>>> >> > >
>>>>> >> > >> BTW, after HAMA-642
>>>>> >> > >> <https://issues.apache.org/jira/browse/HAMA-642> I will
>>>>> >> > >> redesign our messaging system to be completely disk-based with
>>>>> >> > >> caching. I will formulate a followup issue for this. However, I
>>>>> >> > >> plan to get rid of the RPC anyway; I think it is more efficient
>>>>> >> > >> to stream the messages from disk over the network to the other
>>>>> >> > >> host via NIO (we can later replace it with Netty). This also
>>>>> >> > >> saves us the time to do the checkpointing, because it can be
>>>>> >> > >> combined with this pretty well. RPC requires the whole bundle
>>>>> >> > >> to be in RAM, which is totally bad.
>>>>> >> > >> Will follow with more details later.
>>>>> >> > >>
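>>>>> >> > >> Roughly, the disk-to-socket streaming could look like the
>>>>> >> > >> following NIO sketch (a hypothetical helper, not existing Hama
>>>>> >> > >> code; FileChannel.transferTo lets the kernel push the file to
>>>>> >> > >> the socket without buffering the whole bundle in user space):
>>>>> >> > >>
>>>>> >> > >>   import java.io.FileInputStream;
>>>>> >> > >>   import java.io.IOException;
>>>>> >> > >>   import java.net.InetSocketAddress;
>>>>> >> > >>   import java.nio.channels.FileChannel;
>>>>> >> > >>   import java.nio.channels.SocketChannel;
>>>>> >> > >>
>>>>> >> > >>   public class SpillStreamer {
>>>>> >> > >>     /** Streams a spill file to a peer without loading it into RAM. */
>>>>> >> > >>     public static void streamToPeer(String path, InetSocketAddress peer)
>>>>> >> > >>         throws IOException {
>>>>> >> > >>       SocketChannel socket = SocketChannel.open(peer);
>>>>> >> > >>       FileChannel file = new FileInputStream(path).getChannel();
>>>>> >> > >>       try {
>>>>> >> > >>         long pos = 0, size = file.size();
>>>>> >> > >>         while (pos < size) {
>>>>> >> > >>           // transferTo may send less than requested; loop until done.
>>>>> >> > >>           pos += file.transferTo(pos, size - pos, socket);
>>>>> >> > >>         }
>>>>> >> > >>       } finally {
>>>>> >> > >>         file.close();
>>>>> >> > >>         socket.close();
>>>>> >> > >>       }
>>>>> >> > >>     }
>>>>> >> > >>   }
>>>>> >> > >>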
>>>>> >> > >> 2012/9/19 Thomas Jungblut <[email protected]>:
>>>>> >> > >>
>>>>> >> > >>> What is more memory efficient?
>>>>> >> > >>>
>>>>> >> > >>> On 19.09.2012 08:23, "Edward J. Yoon" <[email protected]> wrote:
>>>>> >> > >>>
>>>>> >> > >>>> Let's change the default value of RPC in hama-default.xml to
>>>>> >> > >>>> Hadoop RPC. I am testing Hadoop RPC and Avro RPC on a 4-rack
>>>>> >> > >>>> cluster. Avro RPC is criminal. There's no significant
>>>>> >> > >>>> performance difference.
>>>>> >> > >>>> --
>>>>> >> > >>>> Best Regards, Edward J. Yoon
>>>>> >> > >>>> @eddieyoon
>>>>> >> > >>>>
>>>>> >> > >>>>
>>>>> >> > > --
>>>>> >> > > Best Regards, Edward J. Yoon
>>>>> >> > > @eddieyoon
>>>>> >> > >
>>>>> >> > >
>>>>> >> >
>>>>> >>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards, Edward J. Yoon
>>>>> @eddieyoon
>>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards, Edward J. Yoon
>>> @eddieyoon
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon



-- 
Best Regards, Edward J. Yoon
@eddieyoon
