Re: JIRA-704 : TajoMaster High Availability .

Hyunsik Choi Wed, 16 Apr 2014 07:42:28 -0700

I'm sorry for the late response, and thank you, Alvin, for your understanding.

Best Regards,
Hyunsik



On Wed, Apr 16, 2014 at 11:19 PM, Alvin Henrick <[email protected]> wrote:

> Hi All,
>              Not a problem. I wasn't aware that 704 was overlapping with
> 611. Yes, I was planning to use Apache Curator as well; I did a small POC
> and posted it on GitHub. Apache Curator has a service discovery recipe that
> we can use.
>              As per Hyunsik, the only work left on 704 is Catalog
> replication across TajoMasters, which can be easily achieved via database
> replication.
>
>       Xuhui and Min,
>                                 Let me know if I can help, because I have
> done some good research on Apache Curator and ZooKeeper (how to
> utilize/configure the Apache Curator APIs).
>                                 Here is the Git repository where I did
> some work, [email protected]:alvinhenrick/zooKeeper-poc.git, for 704 before
> getting into the real implementation.
>
>               I will remove the in-progress status, associate 704 with
> 611, and move on to tackle another interesting/priority issue :). Let me know
> how you guys want to tackle this so that we don't duplicate the effort.
>
>               Have a wonderful day!!!
>
> Thanks!
> Warm Regards,
> Alvin.
>
>
> On Apr 16, 2014, at 6:56 AM, Hyunsik Choi wrote:
>
> > Hi Alvin,
> >
> > First of all, thank you, Alvin, for your contribution. Your proposal looks
> > nice and reasonable to me.
> >
> > BTW, as others mentioned, TAJO-704 and TAJO-611 seem to overlap
> > somewhat. We need to arrange the tasks to avoid duplicated work.
> >
> > In my opinion, the TajoMaster HA feature involves three sub-features:
> >  1) Leader election among multiple TajoMasters - one of the TajoMasters
> > is always the leader.
> >  2) Service discovery on the TajoClient side - TajoClient API calls should
> > be resilient even when the original TajoMaster is not available.
> >  3) Cluster resource management and Catalog information that TajoMaster
> > keeps in main memory - this information should not be lost.
> >
> > I think that (1) and (2) duplicate the service discovery work in TAJO-611.
> > So, it would be nice if TAJO-704 focused only on (3), because
> > TAJO-611 already started a few weeks ago and TAJO-704 may be at a
> > relatively earlier stage. *Instead, you can continue the work with Xuhui
> > and Min.* Someone can divide the service discovery issue into more subtasks.
> >
> > In addition, I'd like to discuss (3) further. Currently, a running
> > TajoMaster keeps two kinds of information: cluster resource information
> > for all workers and catalog information. In order to guarantee the HA of
> > this data, TajoMaster should either persistently materialize it or
> > consistently synchronize it across multiple TajoMasters. BTW, we will
> > replace the resource management feature of TajoMaster with a decentralized
> > one in the new scheduler issue. As a result, I think that TajoMaster HA
> > needs to focus only on the high availability of catalog information. The
> > HA of the catalog can be easily achieved by database replication, or we
> > can make our own module for it. I prefer the former.
> >
> > Hi Xuhui and Min,
> >
> > Could you briefly share the progress of the service discovery issue? If
> > so, we can easily figure out how to start the service discovery work together.
> >
> > Warm regards,
> > Hyunsik
> >
> >
> >
> > On Wed, Apr 16, 2014 at 3:36 PM, Min Zhou <[email protected]> wrote:
> >
> >> Actually, we are thinking not only about HA but also about a service
> >> discovery service that the future Tajo scheduler would rely on. The Tajo
> >> scheduler can get all the active workers from that service.
> >>
> >>
> >> Regards,
> >> Min
> >>
> >>
> >> On Tue, Apr 15, 2014 at 10:05 PM, Xuhui Liu <[email protected]> wrote:
> >>
> >>> Hi Alvin,
> >>>
> >>> TAJO-611 will introduce Curator as a service discovery service to Tajo,
> >>> and Curator is based on ZK. Maybe we can work together.
> >>>
> >>> Thanks,
> >>> Xuhui
> >>>
> >>>
> >>> On Wed, Apr 16, 2014 at 12:17 PM, Min Zhou <[email protected]> wrote:
> >>>
> >>>> Hi Alvin,
> >>>>
> >>>> I think this JIRA overlaps somewhat with TAJO-611. Could you cooperate
> >>>> on it?
> >>>>
> >>>> Thanks,
> >>>> Min
> >>>>
> >>>>
> >>>> On Tue, Apr 15, 2014 at 7:22 PM, Henry Saputra <[email protected]> wrote:
> >>>>
> >>>>> Jaehwa, I think we should consider a pluggable mechanism that would
> >>>>> allow some kind of distributed system like ZK to be used if wanted.
> >>>>>
> >>>>> - Henry
> >>>>>
> >>>>> On Tue, Apr 15, 2014 at 7:15 PM, Jaehwa Jung <[email protected]> wrote:
> >>>>>> Hi, Alvin
> >>>>>>
> >>>>>> I'm sorry for the late response, and thank you very much for your
> >>>>>> contribution.
> >>>>>> I agree with your opinion on ZooKeeper. However, ZooKeeper adds an
> >>>>>> additional dependency that some users may not want.
> >>>>>>
> >>>>>> I'd like to suggest adding an abstraction layer for handling
> >>>>>> TajoMaster HA.
> >>>>>> When I created TAJO-740, I hoped that TajoMaster HA would have a
> >>>>>> generic interface and a basic implementation using HDFS. Your
> >>>>>> proposed ZooKeeper implementation could then be added there. It would
> >>>>>> allow users to choose their desired implementation according to their
> >>>>>> environments.
> >>>>>>
> >>>>>> In addition, I'd like to propose that TajoMaster embed the HA module;
> >>>>>> it would be great if HA worked simply by launching a backup
> >>>>>> TajoMaster. Deploying an additional process besides the TajoMaster and
> >>>>>> TajoWorker processes may place more burden on users.
> >>>>>>
> >>>>>> *Cheers*
> >>>>>> *Jaehwa*
> >>>>>>
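The abstraction layer suggested above could take roughly the following shape. This is only a sketch: `MasterHaService`, `InMemoryHaService`, and `HaServiceRegistry` are hypothetical names, the thread does not define the interface's methods, and the in-memory backend merely stands in for the HDFS and ZooKeeper implementations discussed here.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical generic HA interface; its methods are assumptions, not Tajo's API. */
interface MasterHaService {
    /** Records the address ("HOSTNAME:RPC_PORT") of the TajoMaster that is now active. */
    void registerActive(String masterAddress);

    /** Returns the active master's address, or empty if none has registered yet. */
    Optional<String> activeMaster();
}

/** Stand-in backend that keeps state in memory; an HDFS or ZooKeeper backend
 *  would implement the same interface. */
final class InMemoryHaService implements MasterHaService {
    private volatile String active;

    @Override
    public void registerActive(String masterAddress) {
        active = masterAddress;
    }

    @Override
    public Optional<String> activeMaster() {
        return Optional.ofNullable(active);
    }
}

/** Maps a configuration value to a backend so users can choose one per environment. */
final class HaServiceRegistry {
    private final Map<String, Supplier<MasterHaService>> backends = new ConcurrentHashMap<>();

    void register(String name, Supplier<MasterHaService> factory) {
        backends.put(name, factory);
    }

    MasterHaService create(String name) {
        Supplier<MasterHaService> factory = backends.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown HA backend: " + name);
        }
        return factory.get();
    }
}
```

With this shape, adding a backend means registering one more factory under a new name; callers depend only on `MasterHaService`, so a deployment that does not want the ZooKeeper dependency never loads it.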
> >>>>>>
> >>>>>> 2014-04-13 14:36 GMT+09:00 Jihoon Son <[email protected]>:
> >>>>>>
> >>>>>>> Hi Alvin.
> >>>>>>> Thanks for your suggestion.
> >>>>>>>
> >>>>>>> Overall, your suggestion looks very reasonable to me!
> >>>>>>> I'll check the POC.
> >>>>>>>
> >>>>>>> Many thanks,
> >>>>>>> Jihoon
> >>>>>>> Hi All,
> >>>>>>>            After doing a lot of research, in my opinion we should
> >>>>>>> utilize ZooKeeper for TajoMaster HA. I have created a small POC and
> >>>>>>> shared it on my GitHub repository
> >>>>>>> ([email protected]:alvinhenrick/zooKeeper-poc.git).
> >>>>>>>
> >>>>>>>            Just to make things a little bit easier and more
> >>>>>>> maintainable, I am utilizing Apache Curator, the fluent ZooKeeper
> >>>>>>> client API developed at Netflix, which is now an Apache open source
> >>>>>>> project.
> >>>>>>>
> >>>>>>>            I have attached a diagram to convey my message to the
> >>>>>>> team members. I will upload it to JIRA once everyone agrees with the
> >>>>>>> proposed solution.
> >>>>>>>
> >>>>>>>            Here is what the flow is going to look like.
> >>>>>>>
> >>>>>>>            TajoMasterZkController   ==>
> >>>>>>>
> >>>>>>>      1. This component will start, connect to the ZooKeeper quorum,
> >>>>>>>      and fight ( :) ) to obtain the latch/lock to become the master.
> >>>>>>>      2. Once the lock is obtained, the Apache Curator API will invoke
> >>>>>>>      the takeLeadership() method, which will then start the TajoMaster.
> >>>>>>>      3. As long as the TajoMaster is running, the Controller will keep
> >>>>>>>      the lock and update the metadata on the ZooKeeper server with the
> >>>>>>>      HOSTNAME and RPC PORT.
> >>>>>>>      4. The other participants will keep waiting for the latch/lock to
> >>>>>>>      be released by ZooKeeper to obtain the leadership.
> >>>>>>>      5. The advantage is that we can have as many TajoMasters as we
> >>>>>>>      want, but only one can be the leader, and a TajoMaster will
> >>>>>>>      consume resources only after obtaining the latch/lock.
> >>>>>>>
> >>>>>>>
> >>>>>>>           TajoWorkerZkController ==>
> >>>>>>>
> >>>>>>>      1. This component will start, connect to ZooKeeper (creating an
> >>>>>>>      EPHEMERAL ZNODE), and wait for events from ZooKeeper.
> >>>>>>>      2. The first listener will listen for successful registration.
> >>>>>>>      3. The second listener, on the master node, will listen for any
> >>>>>>>      changes to the master node received from the ZooKeeper server.
> >>>>>>>      4. If a failover occurs, the data on the master ZNODE will be
> >>>>>>>      changed, the new HOSTNAME and RPC PORT can be obtained, and the
> >>>>>>>      TajoWorker can establish a new RPC connection with the TajoMaster.
> >>>>>>>
> >>>>>>>          To demonstrate, I have created a small Readme.txt file
> >>>>>>> on GitHub on how to run the example. Please read the log statements
> >>>>>>> on the console.
> >>>>>>>
> >>>>>>>          Similar to TajoWorkerZkController, we can also
> >>>>>>> implement TajoClientZkController.
> >>>>>>>
> >>>>>>>          Any help or advice is appreciated.
> >>>>>>>
> >>>>>>> Thanks!
> >>>>>>> Warm Regards,
> >>>>>>> Alvin.
> >>>>>>>
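The controller flow above can be simulated without a ZooKeeper quorum to make the hand-over explicit. The sketch below is an in-memory stand-in, not Curator or ZooKeeper code: `InMemoryCoordinator` and its methods are hypothetical, the queue plays the role of the latch, and any addresses passed to it are made-up "HOSTNAME:RPC_PORT" strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

/** In-memory stand-in for the coordination service; NOT ZooKeeper or Curator. */
final class InMemoryCoordinator {
    // Masters taking part in the election, in arrival order; the head holds the latch.
    private final Deque<String> candidates = new ArrayDeque<>();
    // Worker callbacks, standing in for the listener on the master znode.
    private final List<Consumer<String>> listeners = new ArrayList<>();
    // "HOSTNAME:RPC_PORT" of the current leader, or null when the seat is vacant.
    private String leaderAddress;

    /** A master joins the election; it becomes leader only if no leader exists. */
    synchronized void join(String address) {
        candidates.addLast(address);
        electIfVacant();
    }

    /** Simulates the loss of a master (leader or standby), e.g. an expired session. */
    synchronized void fail(String address) {
        candidates.remove(address);
        if (address.equals(leaderAddress)) {
            leaderAddress = null;
            electIfVacant();
        }
    }

    /** A worker registers a listener and immediately learns the current leader, if any. */
    synchronized void watchLeader(Consumer<String> listener) {
        listeners.add(listener);
        if (leaderAddress != null) {
            listener.accept(leaderAddress);
        }
    }

    synchronized String leader() {
        return leaderAddress;
    }

    /** Promotes the longest-waiting candidate and publishes its address to workers. */
    private void electIfVacant() {
        if (leaderAddress == null && !candidates.isEmpty()) {
            leaderAddress = candidates.peekFirst();
            for (Consumer<String> listener : listeners) {
                listener.accept(leaderAddress);
            }
        }
    }
}
```

Joining two masters and then calling `fail` on the first hands leadership to the second and delivers the new address to every registered listener, which corresponds to step 4 of the worker flow. In the real design the ordering and failure detection would come from ZooKeeper rather than from a local queue.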
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> My research interests are distributed systems, parallel computing and
> >>>> bytecode based virtual machine.
> >>>>
> >>>> My profile:
> >>>> http://www.linkedin.com/in/coderplay
> >>>> My blog:
> >>>> http://coderplay.javaeye.com
> >>>>
> >>>
> >>
> >>
> >>
> >>
>
>
