Thanks. I just "refactored" our issue tracker ;) Hope it wasn't too spammy.

2012/2/4 Chia-Hung Lin <[email protected]>

> +1 It's good if we have an umbrella jira so we can track it easier.
>
> Failure detection (HAMA-370) was already done and tested on my
> machines previously.
>
> First point in HAMA-440 is not needed because it has been integrated
> into bsp task.
>
> On 3 February 2012 09:38, Edward J. Yoon <[email protected]> wrote:
> > We also can separate the issue into two parts: 1) cluster high
> > availability and 2) fault tolerant job processing. Only HAMA-370 is
> > related with 1).
> >
> > On Fri, Feb 3, 2012 at 10:23 AM, Edward J. Yoon <[email protected]> wrote:
> >> +1
> >>
> >> On Thu, Feb 2, 2012 at 8:39 PM, Thomas Jungblut
> >> <[email protected]> wrote:
> >>> Hey,
> >>>
> >>> I had a bit of time to go through the jira issues and sort out several
> >>> things related to Fault Tolerance.
> >>>
> >>> Here are my results:
> >>>
> >>> Fault Tolerance in Hama (all jiras related):
> >>>
> >>> [HAMA-199] Add fault tolerance to BSPPeer < CLOSE, too generic
> >>> [HAMA-445] Make configurable checkpointing
> >>> [HAMA-440] Features required in recovery procedure.
> >>> [HAMA-498] BSPTask should periodically ping its parent.
> >>>
> >>> Then I have splitted this in two main parts, "Detect Failure" and
> >>> "Solve Failure":
> >>>
> >>> Detect Failure:
> >>> [HAMA-370] Failure detector for Hama < Nearly complete?
> >>> [HAMA-498] BSPTask should periodically ping its parent.
> >>>
> >>> Solve Failure:
> >>> [HAMA-445] Make configurable checkpointing
> >>>> TODO:
> >>>> Groom needs functionality to restart a task
> >>>> BSPMaster needs functionality to restart a groom
> >>>
> >>> Also here is MISC, which is not strongly related.
> >>>
> >>> MISC:
> >>> [HAMA-445] Make configurable checkpointing
> >>> [HAMA-440] Features required in recovery procedure.
> >>>> TODO mainly discussion:
> >>>> New BSP "interface", with a chaining of supersteps to make restarting
> >>>> tasks more simpler (contained in 440)
> >>>
> >>> Let's make an umbrella jira for this larger task and close 199, since
> >>> this is way too generic and too old.
> >>> We should also split 440, because it combines too much unrelated things
> >>> together.
> >>>
> >>> Also "Lin" has assigned the majority of them. What is your progress?
> >>> And do you mind splitting these?
> >>>
> >>> [LINKS]
> >>> https://issues.apache.org/jira/browse/HAMA-440
> >>> https://issues.apache.org/jira/browse/HAMA-119
> >>> https://issues.apache.org/jira/browse/HAMA-445
> >>> https://issues.apache.org/jira/browse/HAMA-440
> >>> https://issues.apache.org/jira/browse/HAMA-370
> >>> https://issues.apache.org/jira/browse/HAMA-498
> >>>
> >>> --
> >>> Thomas Jungblut
> >>> Berlin <[email protected]>
> >>
> >> --
> >> Best Regards, Edward J. Yoon
> >> @eddieyoon
> >
> > --
> > Best Regards, Edward J. Yoon
> > @eddieyoon
>

--
Thomas Jungblut
Berlin <[email protected]>
