P.S., ignore the "+ data = null;" line.

On Wed, Sep 26, 2012 at 10:16 AM, Edward J. Yoon <[email protected]> wrote:
> FYI, the problem is fixed by the small patch below.
>
> Index: src/main/java/org/apache/hama/bsp/message/AvroMessageManagerImpl.java
> ===================================================================
> --- src/main/java/org/apache/hama/bsp/message/AvroMessageManagerImpl.java (revision 1389695)
> +++ src/main/java/org/apache/hama/bsp/message/AvroMessageManagerImpl.java (working copy)
> @@ -135,12 +135,15 @@
>        ByteArrayInputStream inArray = new ByteArrayInputStream(byteArray);
>        DataInputStream in = new DataInputStream(inArray);
>        msg.readFields(in);
> +      in.close();
> +      inArray.close();
>      } else {
>        peer.incrementCounter(BSPPeerImpl.PeerCounter.COMPRESSED_BYTES_RECEIVED,
>            byteArray.length);
>        msg = compressor.decompressBundle(new BSPCompressedBundle(byteArray));
>      }
>
> +    byteArray = null;
>      return msg;
>    }
>
> @@ -154,12 +157,14 @@
>      byte[] byteArray = outArray.toByteArray();
>      peer.incrementCounter(BSPPeerImpl.PeerCounter.MESSAGE_BYTES_TRANSFERED,
>          byteArray.length);
> +    outArray.close();
>      return ByteBuffer.wrap(byteArray);
>    } else {
>      BSPCompressedBundle compMsgBundle = compressor.compressBundle(msg);
>      byte[] data = compMsgBundle.getData();
>      peer.incrementCounter(BSPPeerImpl.PeerCounter.COMPRESSED_BYTES_SENT,
>          data.length);
> +    data = null;
>      return ByteBuffer.wrap(data);
>    }
>  }
>
> On Thu, Sep 20, 2012 at 4:22 PM, Edward J. Yoon <[email protected]> wrote:
>> It seems we didn't test the 0.5 release thoroughly.
>>
>> Let's give ourselves more time.
>>
>> On Wed, Sep 19, 2012 at 8:00 PM, Thomas Jungblut
>> <[email protected]> wrote:
>>> Wait, the problem had multiple roots. We just fixed a few. I will sit
>>> down and take a look at the messaging.
>>> If this is scalable, we can remove the multi-step partitioning.
>>> On 19.09.2012 12:21, "Edward J.
Yoon" <[email protected]>: >>> >>>> P.S., Since memory issue of graph job will be fixed by Thomas's >>>> HAMA-642, I'll remove my dirty multi-step partitioning code in graph >>>> module if there's no problem w/ Hadoop RPC tomorrow. >>>> >>>> On Wed, Sep 19, 2012 at 5:53 PM, Thomas Jungblut >>>> <[email protected]> wrote: >>>> > I will give you more details what I planned on the interface changes once >>>> > I'm back from my lecture. >>>> > >>>> > 2012/9/19 Suraj Menon <[email protected]> >>>> > >>>> >> As a beginning we should have a spilling queue and the same with >>>> combiner >>>> >> running in batch if possible. >>>> >> I have been looking into implementing the spilling queue. Chalking out >>>> the >>>> >> requirements, we should look into the following: >>>> >> >>>> >> A queue should persist all the data if required by the framework for >>>> fault >>>> >> tolerance. ( I feel it would be a bad idea for framework to spend >>>> resource >>>> >> on making a separate copy of the file ) >>>> >> Asynchronous communication is our next important feature required for >>>> >> performance.Hence we would need this queue with combiner on sender side >>>> to >>>> >> batch the messages before sending. This implies we need to support both >>>> >> concurrent reads and writes. >>>> >> >>>> >> -Suraj >>>> >> >>>> >> On Wed, Sep 19, 2012 at 4:21 AM, Thomas Jungblut >>>> >> <[email protected]>wrote: >>>> >> >>>> >> > Oh okay, very interesting. Just another argument for making the >>>> messaging >>>> >> > more scalable ;) >>>> >> > >>>> >> > 2012/9/19 Edward J. Yoon <[email protected]> >>>> >> > >>>> >> > > Didn't check memory usage because each machine's memory is 48 GB, >>>> but I >>>> >> > > guess there's no big difference. >>>> >> > > >>>> >> > > In short, "bin/hama bench 16 10000 32" was maximum capacity (See >>>> [1]). >>>> >> If >>>> >> > > message numbers or nodes are increased, job is always fails. Hadoop >>>> RPC >>>> >> > is >>>> >> > > OK. 
>>>> >> > >
>>>> >> > > Will need time to debug this.
>>>> >> > >
>>>> >> > > 1. http://wiki.apache.org/hama/Benchmarks#Random_Communication_Benchmark
>>>> >> > >
>>>> >> > > On 9/19/2012 4:34 PM, Thomas Jungblut wrote:
>>>> >> > >
>>>> >> > >> BTW, after HAMA-642 <https://issues.apache.org/jira/browse/HAMA-642>
>>>> >> > >> I will redesign our messaging system to be completely disk-based
>>>> >> > >> with caching. I will formulate a follow-up issue for this.
>>>> >> > >> However, I plan to get rid of the RPC anyway; I think it is more
>>>> >> > >> efficient to stream the messages from disk over the network to
>>>> >> > >> the other host via NIO (we can later replace it with Netty).
>>>> >> > >> Also, this saves us the time to do the checkpointing, because it
>>>> >> > >> can be combined with this pretty well. RPC requires the whole
>>>> >> > >> bundle to be in RAM, which is totally bad.
>>>> >> > >> Will follow with more details later.
>>>> >> > >>
>>>> >> > >> 2012/9/19 Thomas Jungblut <[email protected]>:
>>>> >> > >>
>>>> >> > >>> What is more memory efficient?
>>>> >> > >>>
>>>> >> > >>> On 19.09.2012 08:23, "Edward J. Yoon" <[email protected]>:
>>>> >> > >>>
>>>> >> > >>>> Let's change the default value of RPC in hama-default.xml to
>>>> >> > >>>> Hadoop RPC. I am testing Hadoop RPC and Avro RPC on a 4-rack
>>>> >> > >>>> cluster. Avro RPC is criminal. There's no significant
>>>> >> > >>>> performance difference.
>>>> >> > >>>>
>>>> >> > >>>> --
>>>> >> > >>>> Best Regards, Edward J.
>>>> >> > >>>> Yoon
>>>> >> > >>>> @eddieyoon
>>>> >> > >>>>
>>>> >> > >
>>>> >> > > --
>>>> >> > > Best Regards, Edward J. Yoon
>>>> >> > > @eddieyoon
>>>> >> > >
>>>> >> >
>>>> >>
>>>>
>>>> --
>>>> Best Regards, Edward J. Yoon
>>>> @eddieyoon
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
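[Editor's sketch] Thomas's plan quoted above — streaming spilled messages from disk to the remote peer via NIO instead of holding a whole RPC bundle in RAM — essentially comes down to `FileChannel.transferTo`, which lets the kernel move the bytes without a user-space copy. A minimal sketch under that assumption; the class name is hypothetical, and the target is generalized to `WritableByteChannel` (a `SocketChannel` in a real sender):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/**
 * Sketch of disk-to-network streaming via NIO. Only a bounded kernel
 * buffer is in flight at any time, never the whole message bundle.
 */
public class DiskStreamer {
  /** Streams the whole file into `out`; returns the byte count. */
  public static long stream(Path messageFile, WritableByteChannel out)
      throws IOException {
    try (FileChannel in =
        FileChannel.open(messageFile, StandardOpenOption.READ)) {
      long position = 0;
      long size = in.size();
      while (position < size) {
        // transferTo may move fewer bytes than requested, so loop
        // until the whole file has been pushed into the channel.
        position += in.transferTo(position, size - position, out);
      }
      return position;
    }
  }
}
```

Because the target is any `WritableByteChannel`, the same code can be exercised against a file channel in tests and pointed at a socket in production.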
--
Best Regards, Edward J. Yoon
@eddieyoon
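[Editor's sketch] The stream-closing half of the patch at the top of this thread can also be written with Java 7 try-with-resources, which closes the streams even when deserialization throws. (The P.S. retracts the `+ data = null;` line: nulling `data` immediately before `return ByteBuffer.wrap(data);` would throw a NullPointerException.) A standalone illustration, with Hama's Writable message replaced by a plain int payload and hypothetical method names:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/**
 * Mirrors the patched receive/send paths in AvroMessageManagerImpl:
 * try-with-resources guarantees both streams are closed, even if the
 * readFields-style deserialization throws midway.
 */
public class MessageCodec {
  /** Stand-in for the receive path (msg.readFields(in) in the patch). */
  public static int readIntMessage(byte[] byteArray) throws IOException {
    try (ByteArrayInputStream inArray = new ByteArrayInputStream(byteArray);
         DataInputStream in = new DataInputStream(inArray)) {
      return in.readInt();
    }
  }

  /** Stand-in for the send path (outArray.toByteArray() in the patch). */
  public static byte[] writeIntMessage(int value) throws IOException {
    try (ByteArrayOutputStream outArray = new ByteArrayOutputStream();
         DataOutputStream out = new DataOutputStream(outArray)) {
      out.writeInt(value);
      out.flush();
      return outArray.toByteArray();
    }
  }
}
```

For byte-array-backed streams the close is admittedly a no-op, so the real win of the patch is dropping the references early for the garbage collector; try-with-resources simply makes the close unconditional and exception-safe.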
