We can also separate the issue into two parts: 1) cluster high
availability and 2) fault-tolerant job processing. Only HAMA-370 is
related to 1).

On Fri, Feb 3, 2012 at 10:23 AM, Edward J. Yoon <[email protected]> wrote:
> +1
>
> On Thu, Feb 2, 2012 at 8:39 PM, Thomas Jungblut
> <[email protected]> wrote:
>> Hey,
>>
>> I had a bit of time to go through the jira issues and sort out several
>> things related to Fault Tolerance.
>>
>> Here are my results:
>>
>> Fault Tolerance in Hama (all jiras related):
>>
>> [HAMA-199] Add fault tolerance to BSPPeer < CLOSE, too generic
>> [HAMA-445] Make configurable checkpointing
>> [HAMA-440] Features required in recovery procedure.
>> [HAMA-498] BSPTask should periodically ping its parent.
>>
>> Then I have split this into two main parts, "Detect Failure" and "Solve
>> Failure":
>>
>> Detect Failure:
>> [HAMA-370] Failure detector for Hama < Nearly complete?
>> [HAMA-498] BSPTask should periodically ping its parent.
>>
>> Solve Failure:
>> [HAMA-445] Make configurable checkpointing
>>> TODO:
>>> Groom needs functionality to restart a task
>>> BSPMaster needs functionality to restart a groom
>>
>> Also here is MISC, which is not strongly related.
>>
>> MISC:
>> [HAMA-445] Make configurable checkpointing
>> [HAMA-440] Features required in recovery procedure.
>>> TODO mainly discussion:
>>> New BSP "interface", with a chaining of supersteps to make restarting
>> tasks simpler (contained in 440)
>>
>>
>> Let's make an umbrella jira for this larger task and close 199, since this
>> is way too generic and too old.
>> We should also split 440, because it combines too many unrelated
>> things.
>>
>> Also, "Lin" has been assigned the majority of them. What is your progress? And do
>> you mind splitting these?
>>
>> [LINKS]
>> https://issues.apache.org/jira/browse/HAMA-440
>> https://issues.apache.org/jira/browse/HAMA-119
>> https://issues.apache.org/jira/browse/HAMA-445
>> https://issues.apache.org/jira/browse/HAMA-440
>> https://issues.apache.org/jira/browse/HAMA-370
>> https://issues.apache.org/jira/browse/HAMA-498
>>
>> --
>> Thomas Jungblut
>> Berlin <[email protected]>
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon



-- 
Best Regards, Edward J. Yoon
@eddieyoon
