Hi Alvin.
Thanks for your suggestion.
Overall, your suggestion looks very reasonable to me!
I'll check the POC.
Many thanks,
Jihoon
Hi All,
After doing a lot of research, my opinion is that we should use
ZooKeeper for Tajo Master HA. I have created a small POC and shared it on my
GitHub repository ([email protected]:alvinhenrick/zooKeeper-poc.git).
To make things a little easier and more maintainable, I am
using Apache Curator, the fluent ZooKeeper client API that was developed at
Netflix and is now an Apache open source project.
I have attached a diagram to convey my message to the team
members. I will upload it to JIRA once everyone agrees with the proposed
solution.
Here is what the flow is going to look like.
TajoMasterZkController ==>
1. This component will start, connect to the ZooKeeper quorum, and fight
( :) ) to obtain the latch/lock to become the master.
2. Once the lock is obtained, the Apache Curator API will invoke the
takeLeadership() method, and at that point the TajoMaster will be started.
3. As long as the TajoMaster is running, the controller will keep the
lock and update the metadata on the ZooKeeper server with the
HOSTNAME and RPC PORT.
4. The other participants will keep waiting for the latch/lock to be
released so that one of them can obtain the leadership.
5. The advantage is that we can have as many Tajo Masters as we want, but
only one can be the leader, and a master will consume resources only after
obtaining the latch/lock.
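As a rough illustration of the master steps above, here is a small sketch that uses only the JDK. It is not the POC code: InMemoryLatch is a hypothetical in-memory stand-in for the ZooKeeper latch that Curator manages, and the host names and port are made up. It only models the ordering: the first candidate becomes leader and publishes its HOSTNAME and RPC PORT, the others wait, and the next one in line takes over when the leader goes away.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical in-memory stand-in for the ZooKeeper latch/lock; not part of the POC. */
class InMemoryLatch {
    private final Deque<String> waiting = new ArrayDeque<>(); // candidates in arrival order
    private String leader;        // current lock holder, or null if none
    private String masterAddress; // metadata published by the leader (HOSTNAME:RPC PORT)

    /** Step 1: a candidate joins the election; it leads only if the lock is free. */
    synchronized void join(String hostAndPort) {
        waiting.addLast(hostAndPort);
        electIfVacant();
    }

    /** The given master stops; if it was the leader, the next candidate takes over. */
    synchronized void release(String hostAndPort) {
        if (hostAndPort.equals(leader)) {
            leader = null;
            masterAddress = null;
            electIfVacant();
        } else {
            waiting.remove(hostAndPort); // a standby left before ever leading
        }
    }

    synchronized String leader() { return leader; }

    synchronized String masterAddress() { return masterAddress; }

    /** Steps 2 and 3: hand the lock to the oldest waiter and publish its address. */
    private void electIfVacant() {
        if (leader == null && !waiting.isEmpty()) {
            leader = waiting.removeFirst();
            masterAddress = leader;
        }
    }
}

public class MasterElectionSketch {
    public static void main(String[] args) {
        InMemoryLatch latch = new InMemoryLatch();
        latch.join("master-a:26001"); // hypothetical host and port
        latch.join("master-b:26001"); // waits, as in step 4
        System.out.println("leader: " + latch.leader());
        latch.release("master-a:26001"); // simulate the active master going down
        System.out.println("leader after failover: " + latch.leader());
    }
}
```

In the proposed design the queueing and the hand-over are done by ZooKeeper and Curator, so the controller would only need to react to the takeLeadership() callback rather than keep this state itself.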
TajoWorkerZkController ==>
1. This component will start, connect to ZooKeeper (it will create an
EPHEMERAL ZNODE), and wait for events from ZooKeeper.
2. The first listener will listen for successful registration.
3. The second listener, on the master node, will listen for any changes to
the master node received from the ZooKeeper server.
4. If a failover occurs, the data on the master ZNODE will be
changed, so the new HOSTNAME and RPC PORT can be obtained and the
TajoWorker can establish a new RPC connection with the TajoMaster.
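The worker side of the steps above can be sketched in the same spirit, again with only the JDK. MasterZnodeStub and WorkerStub are hypothetical names used for illustration, not classes from the POC, and the addresses are made up. The stub plays the role of the master ZNODE: whenever a master publishes a new HOSTNAME and RPC PORT, every registered listener is called and the worker records the address it should reconnect to.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical in-memory stand-in for the master ZNODE; not part of the POC. */
class MasterZnodeStub {
    private final List<Consumer<String>> watchers = new CopyOnWriteArrayList<>();
    private String address; // null until a master has taken leadership

    /** Registers a listener and, if a master is already known, notifies it at once. */
    synchronized void watch(Consumer<String> watcher) {
        watchers.add(watcher);
        if (address != null) {
            watcher.accept(address);
        }
    }

    /** Called by the master that holds the lock; notifies every listener. */
    synchronized void publish(String hostAndPort) {
        address = hostAndPort;
        for (Consumer<String> watcher : watchers) {
            watcher.accept(hostAndPort);
        }
    }
}

/** Hypothetical worker that follows the address stored on the master ZNODE. */
class WorkerStub {
    private String connectedTo; // address of the master this worker last connected to
    private int reconnects;     // how many times the master address was (re)applied

    WorkerStub(MasterZnodeStub znode) {
        znode.watch(this::onMasterChanged);
    }

    /** Step 4: a real worker would open a new RPC connection here. */
    private void onMasterChanged(String hostAndPort) {
        connectedTo = hostAndPort;
        reconnects++;
    }

    String connectedTo() { return connectedTo; }

    int reconnects() { return reconnects; }
}

public class WorkerFailoverSketch {
    public static void main(String[] args) {
        MasterZnodeStub znode = new MasterZnodeStub();
        WorkerStub worker = new WorkerStub(znode);
        znode.publish("master-a:26001"); // hypothetical first leader
        znode.publish("master-b:26001"); // failover to a new leader
        System.out.println("worker now talks to " + worker.connectedTo());
    }
}
```

With real ZooKeeper the notification would arrive through a watch on the ZNODE instead of a direct method call, but the worker logic stays the same: read the new address, then reconnect.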
To demonstrate, I have created a small Readme.txt file
on GitHub that explains how to run the example. Please read the log statements on the
console.
Similar to TajoWorkerZkController, we can also
implement TajoClientZkController.
Any help or advice is appreciated.
Thanks!
Warm Regards,
Alvin.