It seems ZK is based on a Paxos-like protocol (Zab). Then it will be much
simpler. We can focus on how to use ZK well.

Cheers,
Xuhui


On Thu, Apr 17, 2014 at 4:14 PM, Xuhui Liu <[email protected]> wrote:

> Talking about the HA of TajoMaster: keeping the primary master and the
> slave masters consistent will be a big challenge. Have we ever thought
> about the Paxos protocol? It's designed to maintain consistency in a
> distributed environment.
>
> Thanks,
> Daniel
>
>
> On Wed, Apr 16, 2014 at 7:56 PM, Hyunsik Choi <[email protected]> wrote:
>
>> Hi Alvin,
>>
>> First of all, thank you Alvin for your contribution. Your proposal looks
>> nice and reasonable to me.
>>
>> BTW, as others mentioned, TAJO-704 and TAJO-611 seem to overlap somewhat.
>> We need to arrange the tasks to avoid duplicated work.
>>
>> In my opinion, the TajoMaster HA feature involves three sub-features:
>>   1) Leader election among multiple TajoMasters - one of the TajoMasters
>> is always the leader TajoMaster.
>>   2) Service discovery on the TajoClient side - TajoClient API calls should
>> be resilient even when the original TajoMaster is not available.
>>   3) Cluster resource management and catalog information that TajoMaster
>> keeps in main memory - this information should not be lost.
>>
>> I think that (1) and (2) duplicate TAJO-611, which covers service
>> discovery. So it would be nice if TAJO-704 focused only on (3), because
>> TAJO-611 already started a few weeks ago and TAJO-704 may be at a relatively
>> earlier stage. *Instead, you can continue the work with Xuhui and Min.*
>> Someone can divide the service discovery issue into more subtasks.
>>
>> In addition, I'd like to discuss (3) further. Currently, a running
>> TajoMaster keeps two kinds of information: cluster resource information
>> for all workers and catalog information. In order to guarantee the HA of
>> this data, TajoMaster should either persistently materialize it or
>> consistently synchronize it across multiple TajoMasters. BTW, we will
>> replace the resource management feature of TajoMaster with a decentralized
>> one in the new scheduler issue. As a result, I think that TajoMaster HA
>> needs to focus only on the high availability of catalog information. The
>> HA of the catalog can easily be achieved by database replication, or we
>> can make our own module for it. I prefer the former.
>>
>> Hi Xuhui and Min,
>>
>> Could you briefly share the progress of the service discovery issue? Then
>> we can easily figure out how to start on service discovery together.
>>
>> Warm regards,
>> Hyunsik
>>
>>
>>
>> On Wed, Apr 16, 2014 at 3:36 PM, Min Zhou <[email protected]> wrote:
>>
>> > Actually, we are thinking not only about HA but also about service
>> > discovery, which the future Tajo scheduler would rely on. The Tajo
>> > scheduler can get all the active workers from that service.
>> >
>> >
>> > Regards,
>> > Min
>> >
>> >
>> > On Tue, Apr 15, 2014 at 10:05 PM, Xuhui Liu <[email protected]> wrote:
>> >
>> > > Hi Alvin,
>> > >
>> > > TAJO-611 will introduce Curator as a service discovery service to
>> > > Tajo, and Curator is based on ZK. Maybe we can work together.
>> > >
>> > > Thanks,
>> > > Xuhui
>> > >
>> > >
>> > > On Wed, Apr 16, 2014 at 12:17 PM, Min Zhou <[email protected]>
>> wrote:
>> > >
>> > > > Hi Alvin,
>> > > >
>> > > > I think this JIRA somewhat overlaps with TAJO-611. Could you
>> > > > cooperate on it?
>> > > >
>> > > > Thanks,
>> > > > Min
>> > > >
>> > > >
>> > > > On Tue, Apr 15, 2014 at 7:22 PM, Henry Saputra <
>> > [email protected]
>> > > > >wrote:
>> > > >
>> > > > > Jaehwa, I think we should think about a pluggable mechanism that
>> > > > > would allow some kind of distributed system like ZK to be used if
>> > > > > wanted.
>> > > > >
>> > > > > - Henry
>> > > > >
>> > > > > On Tue, Apr 15, 2014 at 7:15 PM, Jaehwa Jung <[email protected]
>> >
>> > > > wrote:
>> > > > > > Hi, Alvin
>> > > > > >
>> > > > > > I'm sorry for the late response, and thank you very much for
>> > > > > > your contribution. I agree with your opinion on ZooKeeper.
>> > > > > > However, ZooKeeper adds a dependency that some users may not
>> > > > > > want.
>> > > > > >
>> > > > > > I'd like to suggest adding an abstraction layer for handling
>> > > > > > TajoMaster HA. When I created TAJO-740, I wished that TajoMaster
>> > > > > > HA would have a generic interface and a basic implementation
>> > > > > > using HDFS. Your proposed ZooKeeper implementation could then be
>> > > > > > added there. This would allow users to choose their desired
>> > > > > > implementation according to their environments.
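
A minimal Java sketch of such an abstraction layer, with an in-memory
backend standing in for the HDFS and ZooKeeper implementations. HAService,
InMemoryHAService and HAServiceFactory are hypothetical names chosen for
illustration, not existing Tajo classes:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical HA abstraction; these names are illustrative, not Tajo APIs.
interface HAService {
    // Try to become the active master; returns true if this address is active.
    boolean tryBecomeActive(String masterAddress);

    // Address of the current active master, or null if there is none.
    String getActiveMaster();
}

// Toy backend kept in memory. An HDFS- or ZooKeeper-backed class would
// implement the same interface.
final class InMemoryHAService implements HAService {
    private String active;

    @Override
    public synchronized boolean tryBecomeActive(String masterAddress) {
        if (active == null) {
            active = masterAddress;
        }
        return active.equals(masterAddress);
    }

    @Override
    public synchronized String getActiveMaster() {
        return active;
    }
}

// Registry that lets users choose a backend by name (e.g. from configuration).
final class HAServiceFactory {
    private static final Map<String, Supplier<HAService>> BACKENDS = new HashMap<>();

    static {
        register("memory", InMemoryHAService::new);
    }

    static synchronized void register(String name, Supplier<HAService> backend) {
        BACKENDS.put(name, backend);
    }

    static synchronized HAService create(String name) {
        Supplier<HAService> backend = BACKENDS.get(name);
        if (backend == null) {
            throw new IllegalArgumentException("Unknown HA backend: " + name);
        }
        return backend.get();
    }
}
```

A TajoMaster would call HAServiceFactory.create(...) with a name taken from
its configuration, so a new backend only has to be registered under its own
name and nothing else in the master changes.
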
>> > > > > >
>> > > > > > In addition, I'd like to propose that TajoMaster embed the HA
>> > > > > > module; it would be great if HA worked just by launching a
>> > > > > > backup TajoMaster. Deploying an additional process besides the
>> > > > > > TajoMaster and TajoWorker processes may put more burden on
>> > > > > > users.
>> > > > > >
>> > > > > > *Cheers*
>> > > > > > *Jaehwa*
>> > > > > >
>> > > > > >
>> > > > > > 2014-04-13 14:36 GMT+09:00 Jihoon Son <[email protected]>:
>> > > > > >
>> > > > > >> Hi Alvin.
>> > > > > >> Thanks for your suggestion.
>> > > > > >>
>> > > > > >> Overall, your suggestion looks very reasonable to me!
>> > > > > >> I'll check the POC.
>> > > > > >>
>> > > > > >> Many thanks,
>> > > > > >> Jihoon
>> > > > > >>
>> > > > > >> Hi All,
>> > > > > >>             After doing a lot of research, in my opinion we
>> > > > > >> should utilize ZooKeeper for TajoMaster HA. I have created a
>> > > > > >> small POC and shared it on my GitHub repository ( [email protected]:
>> > > > > >> alvinhenrick/zooKeeper-poc.git).
>> > > > > >>
>> > > > > >>             To make things a little easier and more
>> > > > > >> maintainable, I am using Apache Curator, the fluent ZooKeeper
>> > > > > >> client API developed at Netflix, which is now an Apache open
>> > > > > >> source project.
>> > > > > >>
>> > > > > >>             I have attached a diagram to convey my message to
>> > > > > >> the team members. I will upload it to JIRA once everyone agrees
>> > > > > >> with the proposed solution.
>> > > > > >>
>> > > > > >>             Here is what the flow is going to look like.
>> > > > > >>
>> > > > > >>             TajoMasterZkController   ==>
>> > > > > >>
>> > > > > >>
>> > > > > >>    1. This component will start, connect to the ZooKeeper
>> > > > > >>       quorum, and fight ( :) ) to obtain the latch/lock to
>> > > > > >>       become the master.
>> > > > > >>    2. Once the lock is obtained, the Apache Curator API will
>> > > > > >>       invoke the takeLeadership() method, which will then start
>> > > > > >>       the TajoMaster.
>> > > > > >>    3. As long as the TajoMaster is running, the controller will
>> > > > > >>       keep the lock and update the metadata on the ZooKeeper
>> > > > > >>       server with the HOSTNAME and RPC PORT.
>> > > > > >>    4. The other participants will keep waiting for the
>> > > > > >>       latch/lock to be released by ZooKeeper so that they can
>> > > > > >>       obtain the leadership.
>> > > > > >>    5. The advantage is that we can have as many TajoMasters as
>> > > > > >>       we want, but only one can be the leader, and a TajoMaster
>> > > > > >>       will consume resources only after obtaining the
>> > > > > >>       latch/lock.
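
The five steps above can be sketched as a small single-threaded simulation.
This is only an in-memory stand-in for the ZooKeeper latch, not Curator
code: LeaderLatchSim and its methods are hypothetical, the host:port
strings are made up, and handing leadership to waiters in arrival order is
an assumption of the sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// In-memory stand-in for the latch/lock; hypothetical, not a Curator class.
final class LeaderLatchSim {
    private final Deque<String> waiting = new ArrayDeque<>();
    private String leader;      // current latch holder, or null
    private String masterInfo;  // "host:port" published by the leader

    // Steps 1 and 4: a candidate joins. Returns true if it became the
    // leader immediately; otherwise it waits for the latch to be released.
    boolean join(String hostPort) {
        if (leader == null) {
            takeLeadership(hostPort);
            return true;
        }
        waiting.addLast(hostPort);
        return false;
    }

    // Steps 2 and 3: the winner starts acting as master and publishes its
    // address. The name mirrors the callback mentioned in step 2.
    private void takeLeadership(String hostPort) {
        leader = hostPort;
        masterInfo = hostPort;
    }

    // The leader stops (or a waiter gives up). If the leader left, the
    // next waiting candidate takes over, so there is at most one leader.
    void release(String hostPort) {
        if (!hostPort.equals(leader)) {
            waiting.remove(hostPort);
            return;
        }
        leader = null;
        masterInfo = null;
        String next = waiting.pollFirst();
        if (next != null) {
            takeLeadership(next);
        }
    }

    String leader() {
        return leader;
    }

    String masterInfo() {
        return masterInfo;
    }
}
```

With two candidates, the first join() wins and the second waits. Calling
release() for the first hands leadership and the published address to the
second, which is the failover case.
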
>> > > > > >>
>> > > > > >>
>> > > > > >>            TajoWorkerZkController ==>
>> > > > > >>
>> > > > > >>    1. This component will start, connect to ZooKeeper (creating
>> > > > > >>       an EPHEMERAL ZNODE), and wait for events from ZooKeeper.
>> > > > > >>    2. The first listener will listen for successful
>> > > > > >>       registration.
>> > > > > >>    3. The second listener, on the master znode, will listen for
>> > > > > >>       any changes to that node reported by the ZooKeeper server.
>> > > > > >>    4. If a failover occurs, the data on the master ZNODE will
>> > > > > >>       change, the new HOSTNAME and RPC PORT can be obtained, and
>> > > > > >>       the TajoWorker can establish a new RPC connection with the
>> > > > > >>       new TajoMaster.
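
The worker side of the failover (steps 3 and 4 above) can be sketched in
the same spirit. MasterZnodeSim and WorkerSim are hypothetical in-memory
stand-ins, and the listener here stays registered after each event; real
ZooKeeper watches are one-time triggers, so a client or helper library has
to re-register them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// In-memory stand-in for the master znode; hypothetical.
final class MasterZnodeSim {
    private final List<Consumer<String>> watchers = new ArrayList<>();
    private String data;  // "host:port" of the current master, or null

    // Register a listener; it is told the current value right away, if any.
    void watch(Consumer<String> watcher) {
        watchers.add(watcher);
        if (data != null) {
            watcher.accept(data);
        }
    }

    // Called when a (new) leader publishes its address.
    void setData(String hostPort) {
        data = hostPort;
        for (Consumer<String> watcher : new ArrayList<>(watchers)) {
            watcher.accept(hostPort);
        }
    }
}

// Worker that reconnects whenever the master address changes.
final class WorkerSim {
    private String connectedTo;
    private int reconnects;

    WorkerSim(MasterZnodeSim masterZnode) {
        masterZnode.watch(this::onMasterChanged);
    }

    // Step 4: on failover, pick up the new address and "reconnect".
    private void onMasterChanged(String hostPort) {
        if (!hostPort.equals(connectedTo)) {
            connectedTo = hostPort;
            reconnects++;  // a real worker would open a new RPC connection here
        }
    }

    String connectedTo() {
        return connectedTo;
    }

    int reconnects() {
        return reconnects;
    }
}
```

A TajoClientZkController could reuse the same listener pattern, since a
client also only needs the current master address.
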
>> > > > > >>
>> > > > > >>           To demonstrate, I have created a small Readme.txt
>> > > > > >> file on GitHub that explains how to run the example. Please
>> > > > > >> read the log statements on the console.
>> > > > > >>
>> > > > > >>           Similar to TajoWorkerZkController, we can also
>> > > > > >> implement a TajoClientZkController.
>> > > > > >>
>> > > > > >>           Any help or advice is appreciated.
>> > > > > >>
>> > > > > >> Thanks!
>> > > > > >> Warm Regards,
>> > > > > >> Alvin.
>> > > > > >>
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > My research interests are distributed systems, parallel computing
>> and
>> > > > bytecode based virtual machine.
>> > > >
>> > > > My profile:
>> > > > http://www.linkedin.com/in/coderplay
>> > > > My blog:
>> > > > http://coderplay.javaeye.com
>> > > >
>> > >
>> >
>> >
>> >
>> >
>>
>
>
