Right, perhaps we also need to preserve some DAG information? I am wondering
whether there is any existing work around this.
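The closest existing mechanism I know of is Spark Streaming's checkpoint-based
driver recovery, where a relaunched driver rebuilds its DStream DAG and
metadata from a checkpoint directory. A minimal sketch, assuming an HDFS
checkpoint path and a placeholder file-stream job (all names and paths here
are illustrative only):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object RecoverableDriver {
      // Hypothetical checkpoint location; any reliable, shared filesystem works.
      val checkpointDir = "hdfs:///spark/checkpoints/my-app"

      def createContext(): StreamingContext = {
        val conf = new SparkConf().setAppName("recoverable-driver")
        val ssc = new StreamingContext(conf, Seconds(10))
        ssc.checkpoint(checkpointDir)  // persists the DStream DAG and metadata
        // Placeholder job: count new files landing in a monitored directory.
        ssc.textFileStream("hdfs:///data/incoming").count().print()
        ssc
      }

      def main(args: Array[String]): Unit = {
        // First launch: builds a fresh context via createContext().
        // Relaunch after a driver failure: rebuilds the context, including
        // the DAG, from the checkpoint directory.
        val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
        ssc.start()
        ssc.awaitTermination()
      }
    }

That only covers the streaming case, though; I am not aware of an equivalent
for a general driver.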



                                                                       
Sandy Ryza <sandy.ryza@cloudera.com>
2014-12-11 01:34
To: Jun Feng Liu/China/IBM@IBMCN
cc: Reynold Xin <r...@databricks.com>, "dev@spark.apache.org" <dev@spark.apache.org>
Subject: Re: HA support for Spark




I think that if we were able to maintain the full set of created RDDs as
well as some scheduler and block manager state, it would be enough for most
apps to recover.
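
For the RDD piece, reliable checkpointing already puts partition data on a
shared filesystem, so the data itself can outlive the driver; what seems to be
missing, as far as I can tell, is a public way for a new driver to re-register
those checkpoints with a fresh SparkContext and scheduler. A rough sketch of
the part that works today (paths are placeholders):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("checkpoint-sketch")
    val sc = new SparkContext(conf)

    // Checkpoint data goes to reliable storage, so it survives a driver loss.
    sc.setCheckpointDir("hdfs:///spark/checkpoints/rdd-state")

    val cleaned = sc.textFile("hdfs:///data/events").filter(_.nonEmpty)
    cleaned.checkpoint()  // mark the RDD for reliable checkpointing
    cleaned.count()       // force materialization so the checkpoint files are written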

On Wed, Dec 10, 2014 at 5:30 AM, Jun Feng Liu <liuj...@cn.ibm.com> wrote:

> Well, it should not be mission impossible, considering how many HA
> solutions exist today. I would be interested to know whether there are any
> specific difficulties.
>
> Best Regards
>
>
> *Jun Feng Liu*
> IBM China Systems & Technology Laboratory in Beijing
>
> ------------------------------
> Phone: 86-10-82452683
> E-mail: liuj...@cn.ibm.com
> BLD 28, ZGC Software Park
> No.8 Rd. Dong Bei Wang West, Dist. Haidian, Beijing 100193
> China
>
>  *Reynold Xin <r...@databricks.com>*
>
> 2014/12/10 16:30
> To: Jun Feng Liu/China/IBM@IBMCN
> cc: "dev@spark.apache.org" <dev@spark.apache.org>
> Subject: Re: HA support for Spark
>
>
>
>
> This would be plausible for specific purposes such as Spark Streaming or
> Spark SQL, but I don't think it is doable for a general Spark driver, since
> it is just a normal JVM process with arbitrary program state.
>
> On Wed, Dec 10, 2014 at 12:25 AM, Jun Feng Liu <liuj...@cn.ibm.com> wrote:
>
> > Do we have any high-availability support at the Spark driver level? For
> > example, could the Spark driver move to another node and continue
> > execution when a failure happens? I can see that RDD checkpointing helps
> > to serialize the state of an RDD, and I can imagine loading the
> > checkpoint from another node when an error happens, but it seems we would
> > lose track of all task state, and even the executor information
> > maintained in the SparkContext. I am not sure whether there is any
> > existing work I can leverage for that. Thanks for any suggestions.
> >
> > Best Regards
> >
> >
> > *Jun Feng Liu*
> > IBM China Systems & Technology Laboratory in Beijing
> >
> >
>
>
