Re: Fault Tolerance in Hama
Internally, the framework checkpoints the messages transferred among BSP tasks during the BSP synchronization period. If a user wants to checkpoint anything else, they should use the HDFS APIs directly.

On Mon, Feb 29, 2016 at 11:15 PM, Behroz Sikander wrote:
> Ok. So, Hama does support FT but it is not thoroughly tested.
>
> Btw, how can a user checkpoint, or does Hama do that internally? Is there any
> method exposed using BSPPeer?
>
> Regards,
> Behroz
>
> On Mon, Feb 29, 2016 at 2:03 PM, Edward J. Yoon wrote:
>
>> If I remember correctly, the framework first changes the job status to
>> "recovering", and then simply restarts all the tasks from the
>> last checkpoint. It works well, but I only tested simple jobs (no
>> input/output) on my cluster (see also HAMA-973).
>>
>> To write a perfectly fault-tolerant application from the user side, every
>> state in the BSP program needs to be written to disk. So, some people
>> discussed and introduced a new Superstep API that provides a more abstract
>> interface, like Pregel.
>>
>> On Mon, Feb 29, 2016 at 8:09 PM, Behroz Sikander wrote:
>> > Hi,
>> > Just a quick question: is Hama fault tolerant? What happens if a Hama
>> > task fails?
>> >
>> > Regards,
>> > Behroz
>>
>> --
>> Best Regards, Edward J. Yoon

--
Best Regards, Edward J. Yoon
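As a rough illustration of checkpointing extra user state, here is a minimal sketch. It is not Hama's own mechanism: the class StateCheckpoint and its methods are hypothetical, and it writes to a local directory through java.nio.file so that it runs without a cluster. In a real job the same write-then-rename pattern would presumably go through Hadoop's FileSystem API against HDFS instead.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not part of Hama): persists user-level state once per superstep. */
final class StateCheckpoint {

    /** File that holds the state saved at the end of the given superstep. */
    static Path fileFor(Path dir, long superstep) {
        return dir.resolve("state-" + superstep + ".ckpt");
    }

    /**
     * Writes the state to a temporary file and then renames it, so the final
     * file name only appears once the write has completed.
     */
    static void save(Path dir, long superstep, Map<String, Long> state) throws IOException {
        Files.createDirectories(dir);
        Path tmp = dir.resolve("state-" + superstep + ".tmp");
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(tmp))) {
            out.writeInt(state.size());
            for (Map.Entry<String, Long> e : state.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeLong(e.getValue());
            }
        }
        Files.move(tmp, fileFor(dir, superstep), StandardCopyOption.REPLACE_EXISTING);
    }

    /** Reads back the state saved for the given superstep. */
    static Map<String, Long> load(Path dir, long superstep) throws IOException {
        Map<String, Long> state = new TreeMap<>();
        try (DataInputStream in = new DataInputStream(Files.newInputStream(fileFor(dir, superstep)))) {
            int entries = in.readInt();
            for (int i = 0; i < entries; i++) {
                String key = in.readUTF();
                long value = in.readLong();
                state.put(key, value);
            }
        }
        return state;
    }
}
```

A task would call StateCheckpoint.save(dir, superstep, state) just before each synchronization, and StateCheckpoint.load(dir, superstep) when it is restarted.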
Re: Fault Tolerance in Hama
Ok. So, Hama does support FT but it is not thoroughly tested.

Btw, how can a user checkpoint, or does Hama do that internally? Is there any method exposed using BSPPeer?

Regards,
Behroz

On Mon, Feb 29, 2016 at 2:03 PM, Edward J. Yoon wrote:
> If I remember correctly, the framework first changes the job status to
> "recovering", and then simply restarts all the tasks from the
> last checkpoint. It works well, but I only tested simple jobs (no
> input/output) on my cluster (see also HAMA-973).
>
> To write a perfectly fault-tolerant application from the user side, every
> state in the BSP program needs to be written to disk. So, some people
> discussed and introduced a new Superstep API that provides a more abstract
> interface, like Pregel.
>
> On Mon, Feb 29, 2016 at 8:09 PM, Behroz Sikander wrote:
> > Hi,
> > Just a quick question: is Hama fault tolerant? What happens if a Hama
> > task fails?
> >
> > Regards,
> > Behroz
>
> --
> Best Regards, Edward J. Yoon
Re: [VOTE] Apache Hama 0.7.1 release (RC2)
Hi guys,

This is a reminder: if you are a PMC member, please vote on the new Hama release. Thanks!

On Sat, Feb 27, 2016 at 10:36 PM, Edward J. Yoon wrote:
> I'm +1, this works well on my cluster.
>
> On Mon, Feb 15, 2016 at 8:35 AM, Edward J. Yoon wrote:
>> Hi all,
>>
>> I just created a 2nd release candidate for the Apache Hama 0.7.1 release.
>> This RC fixes a newly reported bug in the graph module. It was compiled
>> with Java 7.
>>
>> RC2 is available at:
>> http://people.apache.org/~edwardyoon/dist/0.7.1-RC2/
>>
>> Tags:
>> https://github.com/apache/hama/tree/0.7.1-RC2
>>
>> Please try it in your environment, run the tests, verify the checksum
>> files, etc., and vote.
>>
>> Thanks~
>>
>> --
>> Best Regards, Edward J. Yoon
>
> --
> Best Regards, Edward J. Yoon

--
Best Regards, Edward J. Yoon
Re: Fault Tolerance in Hama
If I remember correctly, the framework first changes the job status to "recovering", and then simply restarts all the tasks from the last checkpoint. It works well, but I only tested simple jobs (no input/output) on my cluster (see also HAMA-973).

To write a perfectly fault-tolerant application from the user side, every state in the BSP program needs to be written to disk. So, some people discussed and introduced a new Superstep API that provides a more abstract interface, like Pregel.

On Mon, Feb 29, 2016 at 8:09 PM, Behroz Sikander wrote:
> Hi,
> Just a quick question: is Hama fault tolerant? What happens if a Hama
> task fails?
>
> Regards,
> Behroz

--
Best Regards, Edward J. Yoon
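The restart-from-last-checkpoint behaviour described above can be sketched for a single task as follows. This is only a toy model under stated assumptions, not Hama's recovery code: the class RestartFromCheckpoint, the step-N.txt file naming, and the failAt parameter (which simulates a task failure) are all made up for illustration, and the "state" is just a running sum written to a local directory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical driver (not part of Hama): resumes a superstep loop from the last checkpoint on disk. */
final class RestartFromCheckpoint {

    /** Returns the highest superstep that has a checkpoint file in dir, or -1 if there is none. */
    static long lastCheckpoint(Path dir) throws IOException {
        long last = -1;
        if (!Files.isDirectory(dir)) {
            return last;
        }
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "step-*.txt")) {
            for (Path f : files) {
                String name = f.getFileName().toString();
                String number = name.substring("step-".length(), name.length() - ".txt".length());
                last = Math.max(last, Long.parseLong(number));
            }
        }
        return last;
    }

    /**
     * Runs supersteps up to (but not including) total, adding each superstep
     * number to a running sum and checkpointing the sum after every superstep.
     * If checkpoints already exist, the loop resumes after the latest one.
     * failAt simulates a task failure just before that superstep; pass -1 for none.
     */
    static long run(Path dir, long total, long failAt) throws IOException {
        Files.createDirectories(dir);
        long last = lastCheckpoint(dir);
        long sum = 0;
        if (last >= 0) {
            byte[] saved = Files.readAllBytes(dir.resolve("step-" + last + ".txt"));
            sum = Long.parseLong(new String(saved, StandardCharsets.UTF_8).trim());
        }
        for (long step = last + 1; step < total; step++) {
            if (step == failAt) {
                throw new IllegalStateException("simulated task failure at superstep " + step);
            }
            sum += step;
            byte[] data = Long.toString(sum).getBytes(StandardCharsets.UTF_8);
            Files.write(dir.resolve("step-" + step + ".txt"), data);
        }
        return sum;
    }
}
```

Calling run(dir, 5, 3) fails before superstep 3, leaving checkpoints for supersteps 0 to 2; a second call run(dir, 5, -1) on the same directory picks up at superstep 3 rather than starting over.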
Fault Tolerance in Hama
Hi,

Just a quick question: is Hama fault tolerant? What happens if a Hama task fails?

Regards,
Behroz