+1 :)

Sent from my iPad

On Feb 15, 2012, at 4:55 AM, Thomas Jungblut <[email protected]> wrote:

>> Maybe 2~3 months later?
>
> I would love that schedule, but I don't think we can meet that timeline with our current throughput.
> Lin (may I call you that? :D) has to add more detailed descriptions to the tasks so that we can also work on them.
> So realistically we can make it in 5-6 months, our regular release schedule.
> I know there is a business behind this, but hurrying doesn't help us.
>
>> HAMA-511 should not be a blocker for the 0.5 release; it should be considered a long-term task, I think.
>
> +1.
>
>> We have to stabilize ourselves first rather than finding ways to differentiate ourselves from the competition or considering new paradigms.
>
> To be honest, the whole graph domain has been conquered by Giraph. And that is perfectly fine, because they are focused on it.
> Anyway, we have to push Hama in another direction. We can support graph processing, but our great success should be in iterative algorithms, which can easily be implemented with BSP.
> The first idea is to make my K-Means "tasteful" to Mahout; they are a great driver, especially for researchers.
> The second idea is to support this Dryad functionality; there is no framework out there with this ability, and since Hortonworks is supporting Microsoft, I think we can get some new people for Hama.
> And the third one is to improve real-time processing. This will be greatly driven by the second idea; however, we have to add a simpler API for these tasks. This must be evaluated then (let's say in 0.6.0).
>
>> speculative task execution
>
> Sorry, it seems I didn't answer your question at all in the mail you've linked.
> It is a cool feature, but I guess it should come along with fault tolerance, e.g. if we detect that a task is running longer than the others.
>
> A future target for Hama is a distributed cache like in BSPLib, where you can get and put objects.
> I am keeping an eye on Apache DirectMemory; however, they are at an early stage of incubation, so this may take a bit of time.
>
> Everything else has been targeted so far.
>
> What about graduation?
> In my opinion our community has stabilized; I expect two new committers soon, and a third one also seems to be on their way to contributing.
> The other tasks seem to be ticked off as well.
>
> 2012/2/14 Edward J. Yoon <[email protected]>
>
>> Are you looking for this link?
>> http://wiki.apache.org/hama/GroomServerFaultTolerance
>>
>>>> There are many tasks that need to be worked on and integrated in order to get (GroomServer) fault tolerance ready. Tasks include:
>>>> - GroomServer status/resource monitor
>>>> - Failure Detection
>>>> - Checkpointed data integration
>>>> - Refactoring bsp() (if necessary)
>>>> - Master decision making
>>
>> Hmm, yes. And I missed the message compressor.
>>
>> Could you please split them into smaller tasks so that we can help you?
>>
>>> I would also like to know why we rejected the idea of speculative task execution?
>>
>> I wanted to talk about speculative task execution before, but the idea has not been discussed/reported yet.
>> ( http://markmail.org/thread/sq7neayhstqufrsz )
>>
>> To support this, we should add a 'Progress' feature first. Currently, a job/task progress checker is not implemented.
>>
>>> How serious is the real-time processing feature for Hama? I am told that some are already using it for that purpose, and I have read Thomas's blog on the same. Are we deferring it until we have a design for offline processing, or should we keep it in mind for fault tolerance?
>>
>> I think yes, if possible. But in some cases, maybe turning off recovery mode is best.
>>
>> I don't understand it perfectly yet, so would you please describe the issues which must be discussed/considered?
>>
>> On Tue, Feb 14, 2012 at 3:15 AM, Suraj Menon <[email protected]> wrote:
>>> +1 on HAMA-511 not being a blocker.
>>>
>>> Also, I lost the wiki link that explains the fault-tolerant design. It would be helpful to understand the recovery design. I believe the recovery BSP tasks will be scheduled to start running (with high probability) on the nodes holding the data where the checkpointed messages are written on HDFS, with a single input split?
>>> I would also like to know why we rejected the idea of speculative task execution?
>>> I am currently working on HAMA-445 and HAMA-498. Thanks to Chiahung, I have 2-3 good papers to read already :).
>>>
>>> How serious is the real-time processing feature for Hama? I am told that some are already using it for that purpose, and I have read Thomas's blog on the same. Are we deferring it until we have a design for offline processing, or should we keep it in mind for fault tolerance?
>>>
>>> Thanks,
>>> Suraj
>>>
>>> On Mon, Feb 13, 2012 at 12:25 PM, Chia-Hung Lin <[email protected]> wrote:
>>>
>>>> There are many tasks that need to be worked on and integrated in order to get (GroomServer) fault tolerance ready. Tasks include:
>>>> - GroomServer status/resource monitor
>>>> - Failure Detection
>>>> - Checkpointed data integration
>>>> - Refactoring bsp() (if necessary)
>>>> - Master decision making
>>>>
>>>> Currently I am working on the first one, and a patch for the 2nd is on JIRA already. In my view, it might be difficult to get those tasks done within 2-3 months.
>>>>
>>>> On 13 February 2012 17:05, Edward J. Yoon <[email protected]> wrote:
>>>>> Hi,
>>>>>
>>>>> I think it's time to discuss our 0.5 roadmap more clearly.
>>>>>
>>>>> IMO, I'd like to release Hama 0.5 with only fault-tolerant processing and clearly defined BSP and Pregel interfaces. Maybe 2~3 months later?
>>>>> And HAMA-511 should not be a blocker for the 0.5 release; it should be considered a long-term task, I think.
>>>>>
>>>>> There are a lot of new M/R alternatives, but no stable ones and no dominant player at the moment. We have to stabilize ourselves first rather than finding ways to differentiate ourselves from the competition or considering new paradigms.
>>>>>
>>>>> Please feel free to leave your opinion!
>>>>>
>>>>> --
>>>>> Best Regards, Edward J. Yoon
>>>>> @eddieyoon
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>
>
> --
> Thomas Jungblut
> Berlin <[email protected]>
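
Thomas's point above about iterative algorithms being easy to express in BSP is the core of the proposed direction. For readers unfamiliar with the model, here is a minimal, illustrative sketch of the superstep pattern (local compute, message exchange, barrier sync, repeat), written against what Hama's BSP/BSPPeer API roughly looked like in the 0.4/0.5 era. The class name and the dummy computation are inventions for illustration, and the exact generic parameters and method signatures may differ from the released interfaces.

```java
import java.io.IOException;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hama.bsp.BSP;
import org.apache.hama.bsp.BSPPeer;
import org.apache.hama.bsp.sync.SyncException;

// Illustrative sketch only: an iterative job expressed as BSP supersteps,
// in the spirit of the K-Means / PageRank style jobs discussed in the thread.
public class IterativeMeanBSP extends
    BSP<LongWritable, DoubleWritable, Text, DoubleWritable, DoubleWritable> {

  private static final int MAX_ITERATIONS = 10;

  @Override
  public void bsp(
      BSPPeer<LongWritable, DoubleWritable, Text, DoubleWritable, DoubleWritable> peer)
      throws IOException, SyncException, InterruptedException {

    double localValue = 1.0; // in a real job this would come from the peer's input split

    for (int i = 0; i < MAX_ITERATIONS; i++) {
      // 1) Local computation on this peer's partition (stand-in for a
      //    K-Means assignment step, a PageRank update, etc.).
      localValue = localValue * 0.5;

      // 2) Exchange partial results with every other peer.
      for (String other : peer.getAllPeerNames()) {
        peer.send(other, new DoubleWritable(localValue));
      }

      // 3) Barrier synchronization: the next iteration starts only after
      //    all messages of this superstep have been delivered.
      peer.sync();

      // 4) Aggregate incoming partials into the new global state.
      double sum = 0;
      int count = 0;
      DoubleWritable msg;
      while ((msg = peer.getCurrentMessage()) != null) {
        sum += msg.get();
        count++;
      }
      localValue = (count == 0) ? localValue : sum / count;
      // ... check localValue against a convergence threshold and break early.
    }

    // Emit the final value; output types match the K2/V2 generic parameters.
    peer.write(new Text(peer.getPeerName()), new DoubleWritable(localValue));
  }
}
```

The barrier is what turns each loop iteration into a well-defined superstep, which is also the natural unit for the checkpointing and recovery work discussed in the thread.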
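
Thomas also mentions a distributed cache "like in BSPLib where you can get and put objects" as a future target. No such API exists in the code base discussed here; the interface below is a purely hypothetical illustration of the kind of get/put contract that idea points at, not an existing Hama or Apache DirectMemory API.

```java
import java.io.IOException;

// Purely hypothetical illustration -- not an existing Hama API.
// A BSPLib-style shared object store that peers could read and write
// between supersteps, possibly backed by something like Apache DirectMemory.
public interface DistributedObjectCache<K, V> {

  // Publish an object so that any peer in the job can see it.
  void put(K key, V value) throws IOException;

  // Fetch an object published by any peer; null if absent.
  V get(K key) throws IOException;

  // Remove an object once the job no longer needs it.
  void remove(K key) throws IOException;
}
```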
