Re: Remove Spilling Queue and rewrite checkpoint/recovery

Chia-Hung Lin Mon, 18 Aug 2014 05:58:16 -0700

I will make the code stable first before merging it back.


On 18 August 2014 17:40, Edward J. Yoon <edwardy...@apache.org> wrote:
> Do you have any plan for merging them?
>
> This is a side opinion: if we want to use Git, I'm +1 now.
>
> On Sat, Aug 16, 2014 at 12:00 AM, Chia-Hung Lin <cli...@googlemail.com> wrote:
>> Code right now is at https://github.com/chlin501/hama.git
>>
>> Maven and a JDK are required to build the project.
>>
>> Command to have a clean build:
>> mvn clean install -DskipTests=true -Dmaven.javadoc.skip=true
>>
>> To test a specific test case:
>> mvn -DskipTests=false -Dtest=<TestCaseName> test
>>
>>
>> On 15 August 2014 18:21, Suraj Menon <menonsur...@gmail.com> wrote:
>>> Hi Edward, sorry to enter the discussion so late.
>>>
>>> Bundling and unbundling of messages is not the Spilling Queue's
>>> responsibility; it ended up there to stay compatible with the existing
>>> implementation of BSP peer communication. Remember that the Spilling Queue
>>> was implemented to immediately remove some OutOfMemory issues, on the
>>> sender side first. The Spilling Queue gives you a byte array (ByteBuffer)
>>> with a batch of serialized messages. This effectively bundles the messages
>>> in a byte array (hence the ByteArrayMessageBundle) and sends them for
>>> processing. The SpilledDataProcessors are implemented as a processing
>>> pipeline built with inheritance, something like what we might use traits
>>> for in Scala. So if we have a SpilledDataProcessor that sends this bundled
>>> message via RPC to the peer, there is no need to write the messages to a
>>> file and read them back. As I mentioned previously, this was done to be
>>> compatible with the existing implementation of peer.send.
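
To make the "pipeline of processing done using inheritance" idea concrete, here is a minimal Java sketch. It is not Hama's code: the class names (Processor, RpcSendProcessor, FileSpillProcessor) and the `steps` list are hypothetical, and the stages only record what they would do instead of touching a file or the network. Each stage handles the batch and then delegates to its superclass, so picking a class picks the stages that run.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not Hama's SpilledDataProcessor API.
final class ProcessorPipelineSketch {

    // End of the chain: does nothing with the batch.
    static class Processor {
        // Records which stages ran, so the effect of the chain is observable.
        final List<String> steps = new ArrayList<>();

        void handle(ByteBuffer batch) {
            // no-op terminal stage
        }
    }

    // Stage that would hand the serialized batch to the peer over RPC.
    static class RpcSendProcessor extends Processor {
        @Override
        void handle(ByteBuffer batch) {
            steps.add("send " + batch.remaining() + " bytes");
            super.handle(batch);
        }
    }

    // Stage that would spill the batch to a file first, then continue with
    // the inherited send stage.
    static class FileSpillProcessor extends RpcSendProcessor {
        @Override
        void handle(ByteBuffer batch) {
            steps.add("spill " + batch.remaining() + " bytes");
            super.handle(batch);
        }
    }
}
```

Under these assumptions, calling `handle` on an `RpcSendProcessor` runs only the send stage, while a `FileSpillProcessor` runs the spill stage and then the send stage. That mirrors the point above: a processor that sends the bundle directly does not need the write-to-file and read-back step.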
>>>
>>> Also, the async checkpoint recovery code was written before the Spilling
>>> Queue. Today we can remove the single-message write and instead, in the
>>> "before peer sync" phase, just write the whole file to HDFS.
>>>
>>> I would say performance numbers and maintainability come first, and if you
>>> think removing the Spilling Queue is a solution, go for it. As far as async
>>> checkpointing is concerned, that was a first proof of concept we did,
>>> and it is high time we moved forward from there.
>>>
>>> Chiahung, do you have instructions on where and how I can build the
>>> Scala version of your code?
>>>
>>> I am really finding it hard to dedicate time for Hama these days.
>>>
>>> - Suraj
>>>
>>>
>>> On Tue, Aug 12, 2014 at 7:15 AM, Edward J. Yoon <edwardy...@apache.org>
>>> wrote:
>>>
>>>> ChiaHung,
>>>>
>>>> Yes, I'm thinking similar things.
>>>>
>>>> On Tue, Aug 12, 2014 at 4:11 PM, Chia-Hung Lin <cli...@googlemail.com>
>>>> wrote:
>>>> > I am currently working on this part based on the Superstep API,
>>>> > similar to the Superstep.java in the trunk.
>>>> >
>>>> > The checkpointer[1] saves bundled messages instead of single messages.
>>>> > Not sure if this is what you are looking for.
>>>> >
>>>> > [1]. https://github.com/chlin501/hama/blob/peer-comm-mech-changed/core/src/main/scala/org/apache/hama/monitor/Checkpointer.scala
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On 12 August 2014 15:04, Edward J. Yoon <edwardy...@apache.org> wrote:
>>>> >> I think that transferring single messages one at a time is not wise.
>>>> >> Bundles are used to avoid network overhead and contention. So, if we
>>>> >> use bundles, each processor always sends/receives bundles.
>>>> >>
>>>> >> BSPMessageBundle is Writable (and Iterable), and it manages the
>>>> >> serialized messages as a byte array. If we write bundles when
>>>> >> checkpointing or using the disk queue, it'll be simpler and faster.
>>>> >>
>>>> >> In the Spilling Queue case, it always requires unbundling the
>>>> >> messages and putting them into the queue.
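
As a rough illustration of why writing whole bundles can be cheaper than handling messages one by one, here is a minimal Java sketch. It is not Hama's BSPMessageBundle: BundleSketch, its record layout (a 4-byte message count followed by length-prefixed UTF-8 payloads), and the use of strings as messages are all assumptions made for the example. Messages are serialized once when added, the checkpoint or disk-queue record is a single byte array, and deserialization happens only when the record is read back.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical stand-in, not Hama's BSPMessageBundle.
final class BundleSketch {
    // All messages live in one growing byte array, already serialized.
    private final ByteArrayOutputStream payload = new ByteArrayOutputStream();
    private int count = 0;

    // Serialize a message once: 4-byte length prefix, then UTF-8 bytes.
    void add(String message) {
        byte[] body = message.getBytes(StandardCharsets.UTF_8);
        byte[] length = ByteBuffer.allocate(4).putInt(body.length).array();
        payload.write(length, 0, length.length);
        payload.write(body, 0, body.length);
        count++;
    }

    // The whole bundle as one record (message count + raw payload). A
    // checkpointer or disk queue could write this array in a single call
    // without touching individual messages.
    byte[] toRecord() {
        byte[] body = payload.toByteArray();
        return ByteBuffer.allocate(4 + body.length).putInt(count).put(body).array();
    }

    // Recovery side: messages are deserialized only when actually read.
    static List<String> readRecord(byte[] record) {
        ByteBuffer in = ByteBuffer.wrap(record);
        int n = in.getInt();
        List<String> messages = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            byte[] body = new byte[in.getInt()];
            in.get(body);
            messages.add(new String(body, StandardCharsets.UTF_8));
        }
        return messages;
    }
}
```

In this sketch, adding "a" and "bc" and calling `toRecord()` yields one array that `readRecord` turns back into the same two messages in order. A per-message design would instead pay a serialize/deserialize round trip each time a message moves between the queue, the checkpoint, and the wire.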
>>>> >>
>>>> >>
>>>> >> On Tue, Aug 12, 2014 at 2:41 PM, Tommaso Teofili
>>>> >> <tommaso.teof...@gmail.com> wrote:
>>>> >>> -1, can't we first discuss? Also it'd be helpful to be more specific
>>>> >>> on the problems.
>>>> >>> Tommaso
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> 2014-08-12 4:25 GMT+02:00 Edward J. Yoon <edwardy...@apache.org>:
>>>> >>>
>>>> >>>> All,
>>>> >>>>
>>>> >>>> I'll delete the Spilling Queue and rewrite the checkpoint/recovery
>>>> >>>> implementation (checkpointing bundles is better than checkpointing
>>>> >>>> every message individually). The current implementation is quite a
>>>> >>>> mess :/ there are huge deserialization/serialization overheads.
>>>> >>>>
>>>> >>>> --
>>>> >>>> Best Regards, Edward J. Yoon
>>>> >>>> CEO at DataSayer Co., Ltd.
>>>> >>>>
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Best Regards, Edward J. Yoon
>>>> >> CEO at DataSayer Co., Ltd.
>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards, Edward J. Yoon
>>>> CEO at DataSayer Co., Ltd.
>>>>
>
>
>
> --
> Best Regards, Edward J. Yoon
> CEO at DataSayer Co., Ltd.
