Hi Alvin, Thank you for your understanding.
Xuhui, could you please share your design and current progress here?

Thanks,
Min

On Wed, Apr 16, 2014 at 7:42 AM, Hyunsik Choi <[email protected]> wrote:

> I'm sorry for the late response, and thank you Alvin for your understanding.
>
> Best Regards,
> Hyunsik
>
>
> On Wed, Apr 16, 2014 at 11:19 PM, Alvin Henrick <[email protected]> wrote:
>
> > Hi All,
> > Not a problem. I wasn't aware that 704 was overlapping with 611. Yes, I was planning to use Apache Curator as well; I did a small POC and posted it on GitHub. Apache Curator has a service discovery recipe which we can use.
> > As per Hyunsik, the only work left on 704 is Catalog replication across TajoMasters, which can be easily achieved via database replication.
> >
> > Xuhui and Min,
> > Let me know if I can help, because I have done some good research on Apache Curator and ZooKeeper (how to utilize/configure the Apache Curator APIs).
> > Here is the Git repository where I did some work for 704 before getting into the real implementation: [email protected]:alvinhenrick/zooKeeper-poc.git
> >
> > I will remove the in-progress status, associate 704 with 611, and move on to tackle another interesting/priority issue :). Let me know, guys, how you want to tackle this so that we don't duplicate the effort.
> >
> > Have a wonderful day!!!
> >
> > Thanks!
> > Warm Regards,
> > Alvin.
> >
> >
> > On Apr 16, 2014, at 6:56 AM, Hyunsik Choi wrote:
> >
> > > Hi Alvin,
> > >
> > > First of all, thank you Alvin for your contribution. Your proposal looks nice and reasonable to me.
> > >
> > > BTW, as others mentioned, TAJO-704 and TAJO-611 seem to overlap somewhat. We need to arrange the tasks to avoid duplicated work.
> > >
> > > In my opinion, the TajoMaster HA feature involves three sub-features:
> > > 1) Leader election among multiple TajoMasters - one of the TajoMasters is always the leader TajoMaster.
> > > 2) Service discovery on the TajoClient side - TajoClient API calls should be resilient even when the original TajoMaster is not available.
> > > 3) Cluster resource management and Catalog information that TajoMaster keeps in main memory - the information should not be lost.
> > >
> > > I think that (1) and (2) duplicate TAJO-611 for service discovery. So, it would be nice if TAJO-704 focused only on (3). That is because TAJO-611 already started a few weeks ago and TAJO-704 may be at a relatively earlier stage. *Instead, you can continue the work with Xuhui and Min.* Someone can divide the service discovery issue into more subtasks.
> > >
> > > In addition, I'd like to discuss (3) further. Currently, a running TajoMaster keeps two kinds of information: cluster resource information for all workers and catalog information. In order to guarantee the HA of the data, TajoMaster should either persistently materialize them or consistently synchronize them across multiple TajoMasters. BTW, we will replace the resource management feature of TajoMaster with a decentralized one in the new scheduler issue. As a result, I think that TajoMaster HA needs to focus only on the high availability of catalog information. The HA of the catalog can be easily achieved by database replication, or we can make our own module for it. In my view, I prefer the former.
> > >
> > > Hi Xuhui and Min,
> > >
> > > Could you share the brief progress of the service discovery issue? If so, we can easily figure out how to start the service discovery work together.
> > >
> > > Warm regards,
> > > Hyunsik
> > >
> > >
> > > On Wed, Apr 16, 2014 at 3:36 PM, Min Zhou <[email protected]> wrote:
> > >
> > >> Actually, we are thinking not only about HA, but also about service discovery, which the future Tajo scheduler would rely on.
> > >> The Tajo scheduler can get all the active workers from that service.
> > >>
> > >>
> > >> Regards,
> > >> Min
> > >>
> > >>
> > >> On Tue, Apr 15, 2014 at 10:05 PM, Xuhui Liu <[email protected]> wrote:
> > >>
> > >>> Hi Alvin,
> > >>>
> > >>> TAJO-611 will introduce Curator as a service discovery service to Tajo, and Curator is based on ZK. Maybe we can work together.
> > >>>
> > >>> Thanks,
> > >>> Xuhui
> > >>>
> > >>>
> > >>> On Wed, Apr 16, 2014 at 12:17 PM, Min Zhou <[email protected]> wrote:
> > >>>
> > >>>> Hi Alvin,
> > >>>>
> > >>>> I think this JIRA overlaps somewhat with TAJO-611. Can you cooperate on it?
> > >>>>
> > >>>> Thanks,
> > >>>> Min
> > >>>>
> > >>>>
> > >>>> On Tue, Apr 15, 2014 at 7:22 PM, Henry Saputra <[email protected]> wrote:
> > >>>>
> > >>>>> Jaehwa, I think we should think about a pluggable mechanism that would allow some kind of distributed system like ZK to be used if wanted.
> > >>>>>
> > >>>>> - Henry
> > >>>>>
> > >>>>> On Tue, Apr 15, 2014 at 7:15 PM, Jaehwa Jung <[email protected]> wrote:
> > >>>>>> Hi, Alvin
> > >>>>>>
> > >>>>>> I'm sorry for the late response, and thank you very much for your contribution.
> > >>>>>> I agree with your opinion on ZooKeeper. But ZooKeeper requires an additional dependency that some users may not want.
> > >>>>>>
> > >>>>>> I'd like to suggest adding an abstraction layer for handling TajoMaster HA. When I created TAJO-740, I wished that TajoMaster HA would have a generic interface and a basic implementation using HDFS. Next, your proposed ZooKeeper implementation will be added there. It will allow users to choose their desired implementation according to their environments.
> > >>>>>>
> > >>>>>> In addition, I'd like to propose that TajoMaster embed the HA module, and it would be great if HA worked well just by launching a backup TajoMaster. Deploying an additional process besides the TajoMaster and TajoWorker processes may put more burden on users.
> > >>>>>>
> > >>>>>> *Cheers*
> > >>>>>> *Jaehwa*
> > >>>>>>
> > >>>>>>
> > >>>>>> 2014-04-13 14:36 GMT+09:00 Jihoon Son <[email protected]>:
> > >>>>>>
> > >>>>>>> Hi Alvin.
> > >>>>>>> Thanks for your suggestion.
> > >>>>>>>
> > >>>>>>> Overall, your suggestion looks very reasonable to me!
> > >>>>>>> I'll check the POC.
> > >>>>>>>
> > >>>>>>> Many thanks,
> > >>>>>>> Jihoon
> > >>>>>>>
> > >>>>>>> Hi All,
> > >>>>>>> After doing a lot of research, in my opinion we should utilize ZooKeeper for Tajo Master HA. I have created a small POC and shared it on my GitHub repository ([email protected]:alvinhenrick/zooKeeper-poc.git).
> > >>>>>>>
> > >>>>>>> Just to make things a little bit easier and more maintainable, I am utilizing Apache Curator, the fluent ZooKeeper client API developed at Netflix, which is now an Apache open source project.
> > >>>>>>>
> > >>>>>>> I have attached a diagram to convey my message to the team members. I will upload it to JIRA once everyone agrees with the proposed solution.
> > >>>>>>>
> > >>>>>>> Here is what the flow is going to look like.
> > >>>>>>>
> > >>>>>>> TajoMasterZkController ==>
> > >>>>>>>
> > >>>>>>>    1. This component will start, connect to the ZooKeeper quorum, and fight ( :) ) to obtain the latch/lock to become the master.
> > >>>>>>>    2. Once the lock is obtained, the Apache Curator API will invoke the takeLeadership() method, which at this point will start the TajoMaster.
> > >>>>>>>    3. As long as the TajoMaster is running, the Controller will keep the lock and update the metadata on the ZooKeeper server with the HOSTNAME and RPC PORT.
> > >>>>>>>    4. The other participants will keep waiting for the latch/lock to be released by ZooKeeper to obtain the leadership.
> > >>>>>>>    5. The advantage is that we can have as many TajoMasters as we want, but only one can be the leader, and it will consume resources only after obtaining the latch/lock.
> > >>>>>>>
> > >>>>>>> TajoWorkerZkController ==>
> > >>>>>>>
> > >>>>>>>    1. This component will start and connect to ZooKeeper (it will create an EPHEMERAL ZNODE) and wait for events from ZooKeeper.
> > >>>>>>>    2. The first listener will listen for successful registration.
> > >>>>>>>    3. The second listener, on the master node, will listen for any changes to the master node received from the ZooKeeper server.
> > >>>>>>>    4. If a failover occurs, the data on the master ZNODE will be changed, the new HOSTNAME and RPC PORT can be obtained, and the TajoWorker can establish a new RPC connection with the TajoMaster.
> > >>>>>>>
> > >>>>>>> To demonstrate, I have created a small Readme.txt file on GitHub on how to run the example. Please read the log statements on the console.
> > >>>>>>>
> > >>>>>>> Similar to TajoWorkerZkController, we can also implement TajoClientZkController.
> > >>>>>>>
> > >>>>>>> Any help or advice is appreciated.
> > >>>>>>>
> > >>>>>>> Thanks!
> > >>>>>>> Warm Regards,
> > >>>>>>> Alvin.
> > >>>>>>>
> > >>>>
> > >>>> --
> > >>>> My research interests are distributed systems, parallel computing and bytecode based virtual machine.
> > >>>>
> > >>>> My profile:
> > >>>> http://www.linkedin.com/in/coderplay
> > >>>> My blog:
> > >>>> http://coderplay.javaeye.com
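
The TajoMasterZkController and TajoWorkerZkController flow quoted in this thread can be sketched as a small simulation. This is only an illustrative sketch, not the POC code and not the ZooKeeper or Curator API: InMemoryCoordinator is a hypothetical in-process stand-in for the ZooKeeper quorum, the class and method names are made up for the example, and the hostnames and ports are placeholders. It shows the first candidate winning the latch, the backup waiting, and a worker learning the new HOSTNAME and RPC PORT after a failover.

```python
from __future__ import annotations

from typing import Callable, List, Optional, Tuple

Address = Tuple[str, int]  # (hostname, RPC port)


class InMemoryCoordinator:
    """Hypothetical stand-in for a ZooKeeper quorum: one latch plus a master record."""

    def __init__(self) -> None:
        self._waiting: List[MasterController] = []  # candidates in arrival order
        self._leader: Optional[MasterController] = None
        self._master_address: Optional[Address] = None
        self._watchers: List[Callable[[Optional[Address]], None]] = []

    def join_election(self, candidate: MasterController) -> None:
        self._waiting.append(candidate)
        self._elect()

    def session_lost(self, candidate: MasterController) -> None:
        """Simulate a dead master: its claim on the latch disappears."""
        if candidate in self._waiting:
            self._waiting.remove(candidate)
        if candidate is self._leader:
            self._leader = None
            self._elect()

    def watch_master(self, callback: Callable[[Optional[Address]], None]) -> None:
        self._watchers.append(callback)
        callback(self._master_address)  # deliver the current value right away

    def _elect(self) -> None:
        if self._leader is not None:
            return  # latch is held; the other participants keep waiting
        if self._waiting:
            self._leader = self._waiting.pop(0)
            self._leader.take_leadership()
            self._publish(self._leader.address)
        else:
            self._publish(None)  # no master is available at the moment

    def _publish(self, address: Optional[Address]) -> None:
        self._master_address = address
        for notify in list(self._watchers):
            notify(address)


class MasterController:
    """Candidate that starts its master only after obtaining the latch."""

    def __init__(self, coordinator: InMemoryCoordinator, host: str, rpc_port: int) -> None:
        self.coordinator = coordinator
        self.address: Address = (host, rpc_port)
        self.master_running = False

    def start(self) -> None:
        self.coordinator.join_election(self)

    def take_leadership(self) -> None:
        # In the proposal, this is the point where the TajoMaster would be started.
        self.master_running = True

    def crash(self) -> None:
        self.master_running = False
        self.coordinator.session_lost(self)


class WorkerController:
    """Worker that tracks the current master address through a watch."""

    def __init__(self, coordinator: InMemoryCoordinator) -> None:
        self.master_address: Optional[Address] = None
        coordinator.watch_master(self._on_master_changed)

    def _on_master_changed(self, address: Optional[Address]) -> None:
        # A real worker would open a new RPC connection to this address here.
        self.master_address = address


if __name__ == "__main__":
    zk = InMemoryCoordinator()
    primary = MasterController(zk, "master-1", 26001)  # placeholder host and port
    backup = MasterController(zk, "master-2", 26002)   # placeholder host and port
    worker = WorkerController(zk)
    primary.start()
    backup.start()
    print(worker.master_address)  # ('master-1', 26001)
    primary.crash()
    print(worker.master_address)  # ('master-2', 26002)
```

In a real deployment the latch, the ephemeral registration, and the watch would be provided by ZooKeeper (for example through Curator recipes), so the stand-in class here would go away; only the controller-side behaviour is what the sketch tries to capture.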

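
The abstraction layer and pluggable mechanism suggested in this thread (a generic TajoMaster HA interface with interchangeable backends such as HDFS or ZooKeeper) could look roughly like the sketch below. Everything in it is an assumption made for illustration: the HAService interface, the InMemoryHAService backend, the HA_BACKENDS registry, and the "ha.backend" configuration key are hypothetical names, not existing Tajo classes or settings.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Optional, Tuple

Address = Tuple[str, int]  # (hostname, RPC port)


class HAService(ABC):
    """Hypothetical generic interface that every HA backend would implement."""

    @abstractmethod
    def register(self, address: Address) -> bool:
        """Register a master; return True if it is now the active one."""

    @abstractmethod
    def unregister(self, address: Address) -> None:
        """Remove a master, for example after it shuts down."""

    @abstractmethod
    def active_master(self) -> Optional[Address]:
        """Return the address of the active master, if any."""


class InMemoryHAService(HAService):
    """Toy backend; an HDFS or ZooKeeper backend would implement the same methods."""

    def __init__(self) -> None:
        self._active: Optional[Address] = None

    def register(self, address: Address) -> bool:
        if self._active is None:
            self._active = address  # first registrant becomes active
        return self._active == address

    def unregister(self, address: Address) -> None:
        if self._active == address:
            self._active = None

    def active_master(self) -> Optional[Address]:
        return self._active


# Hypothetical registry: users pick a backend by name in their configuration.
HA_BACKENDS: Dict[str, Callable[[], HAService]] = {
    "memory": InMemoryHAService,
}


def create_ha_service(conf: Dict[str, str]) -> HAService:
    name = conf.get("ha.backend", "memory")  # "ha.backend" is a made-up key
    if name not in HA_BACKENDS:
        raise ValueError(f"unknown HA backend: {name}")
    return HA_BACKENDS[name]()


if __name__ == "__main__":
    ha = create_ha_service({"ha.backend": "memory"})
    print(ha.register(("master-1", 26001)))  # True
    print(ha.register(("master-2", 26002)))  # False
    print(ha.active_master())                # ('master-1', 26001)
```

With this shape, adding a ZooKeeper-based implementation would mean registering one more entry in the backend table, and users who do not want the extra dependency could keep another backend selected.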