The outgoing messages manager is already applied to 0.6.4. Basically, if we can't avoid the step of bundling messages, DiskQueue and SpillingQueue are meaningless for outgoing messages. If we have to choose between elegant design and high performance, I'd choose high performance and efficiency.
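To make the bundling argument concrete, here is a minimal Java sketch. It assumes a bundle is just a length-prefixed byte array of already-serialized messages. The class and method names (BundleSketch, bundle, writeBundle, readBundle, unbundle) are hypothetical and use only java.io; this is not Hama's BSPMessageBundle or checkpoint code. The point it illustrates is that once messages are bundled, a checkpoint or a send can copy the bytes as an opaque blob with no per-message serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; names and layout are assumptions, not Hama's API.
public class BundleSketch {

  // Serialize every message exactly once into a single byte array.
  static byte[] bundle(List<String> messages) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeInt(messages.size());
      for (String m : messages) {
        out.writeUTF(m);
      }
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Checkpoint (or send) the bundle as an opaque blob: no per-message work here.
  static void writeBundle(byte[] bundle, DataOutputStream sink) {
    try {
      sink.writeInt(bundle.length);
      sink.write(bundle);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Read the blob back without touching individual messages.
  static byte[] readBundle(DataInputStream source) {
    try {
      byte[] bundle = new byte[source.readInt()];
      source.readFully(bundle);
      return bundle;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Deserialize only at the point where the receiver consumes the messages.
  static List<String> unbundle(byte[] bundle) {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(bundle));
      int count = in.readInt();
      List<String> messages = new ArrayList<>(count);
      for (int i = 0; i < count; i++) {
        messages.add(in.readUTF());
      }
      return messages;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] bundled = bundle(Arrays.asList("m1", "m2", "m3"));

    // Stand-in for a checkpoint file or a network stream.
    ByteArrayOutputStream checkpoint = new ByteArrayOutputStream();
    writeBundle(bundled, new DataOutputStream(checkpoint));

    byte[] restored = readBundle(
        new DataInputStream(new ByteArrayInputStream(checkpoint.toByteArray())));
    System.out.println("restored " + unbundle(restored).size() + " messages");
  }
}
```

In this sketch a queue that stores single messages would have to call unbundle on receipt and bundle again before each checkpoint or send, which is the extra serialization and deserialization cost being discussed.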

On Fri, Aug 15, 2014 at 7:21 PM, Suraj Menon <menonsur...@gmail.com> wrote:
> Hi Edward, sorry to enter the discussion so late.
>
> Bundling and Unbundling of message queue is not Spilling queue's
> responsibility, it was ended up there to be compatible with the existent
> implementation of BSP Peer communication. Remember Spilling Queue
> implementation was done to immediately remove some OutOfMemory issues on
> sender side first. Spilling Queue gives you a byte array (ByteBuffer) with
> a batch of serialized messages. This is effectively bundling the messages
> in byte array (hence the ByteArrayMessageBundle) and sending them for
> processing. The SpilledDataProcessor's are implemented as a pipeline of
> processing done using inheritance, something like what we may use trait for
> in Scala. So if we have a SpilledDataProcessor that sends this bundled
> message via RPC to the peer, there is no need to write them to file and
> read them back. As I previously mentioned this was done to be compatible
> with the existent implementation of peer.send.
>
> Also, the async checkpoint recovery code was written before spilling queue.
> Today we can remove the single message write and do this in "before peer
> sync" phase to just write the whole file to HDFS.
>
> I would say performance numbers and maintainability comes first and if you
> think removing spilling queue is a solution go for it. As far as async
> checkpointing is to be considered, that was a first proof of concept we did
> and it is high time we move forward from there.
>
> Chiahung, do you have some instruction on where and how I can build the
> scala version of your code?
>
> I am really finding it hard to dedicate time for Hama these days.
>
> - Suraj
>
> On Tue, Aug 12, 2014 at 7:15 AM, Edward J. Yoon <edwardy...@apache.org>
> wrote:
>
>> ChiaHung,
>>
>> Yes, I'm thinking similar things.
>>
>> On Tue, Aug 12, 2014 at 4:11 PM, Chia-Hung Lin <cli...@googlemail.com>
>> wrote:
>> > I am currently working on this part based on the superstep api,
>> > similar to the Superstep.java in the trunk.
>> >
>> > The checkpointer[1] saves bundle message instead of single message.
>> > Not very sure if this is what you are looking for?
>> >
>> > [1]. https://github.com/chlin501/hama/blob/peer-comm-mech-changed/core/src/main/scala/org/apache/hama/monitor/Checkpointer.scala
>> >
>> > On 12 August 2014 15:04, Edward J. Yoon <edwardy...@apache.org> wrote:
>> >> I think that transferring single messages at a time is not a wise way.
>> >> Bundle is used to avoid network overheads and contentions. So, if we
>> >> use Bundle, each processor always sends/receives an bundles.
>> >>
>> >> BSPMessageBundle is Writable (and Iterable). And it manages the
>> >> serialized message as a byte array. If we write an bundles when
>> >> checkpointing or using Disk-queue, it'll be more simple and faster.
>> >>
>> >> In Spilling Queue case, it always requires the process of unbundling
>> >> and putting messages into queue.
>> >>
>> >> On Tue, Aug 12, 2014 at 2:41 PM, Tommaso Teofili
>> >> <tommaso.teof...@gmail.com> wrote:
>> >>> -1, can't we first discuss? Also it'd be helpful to be more specific
>> >>> on the problems.
>> >>> Tommaso
>> >>>
>> >>> 2014-08-12 4:25 GMT+02:00 Edward J. Yoon <edwardy...@apache.org>:
>> >>>
>> >>>> All,
>> >>>>
>> >>>> I'll delete Spilling queue, and rewrite checkpoint/recovery
>> >>>> implementation (checkpointing bundles is better than checkpointing all
>> >>>> messages). Current implementation is quite mess :/ there are huge
>> >>>> deserialization/serialization overheads..
>> >>>>
>> >>>> --
>> >>>> Best Regards, Edward J. Yoon
>> >>>> CEO at DataSayer Co., Ltd.
>> >>>>
>> >>
>> >> --
>> >> Best Regards, Edward J. Yoon
>> >> CEO at DataSayer Co., Ltd.
>> >>
>>
>> --
>> Best Regards, Edward J. Yoon
>> CEO at DataSayer Co., Ltd.
>>

--
Best Regards, Edward J. Yoon
CEO at DataSayer Co., Ltd.