Hi Eran, I think I should take back my last email. When I look at Storm carefully, I have the following question.
How are we going to store the job statuses and relaunch the jobs that were running on failed nodes? It's true that Storm starts new workers, but someone in the system still needs a way to find the missing jobs. Since we do not have a data stream, there is no point in starting new workers unless we handle the missing jobs. I think we need better control of our components and should persist the state of the jobs each GFAC node is handling. Using ZooKeeper directly would let us do a proper fault-tolerance implementation.

Regards,
Lahiru

On Tue, Jun 17, 2014 at 3:14 PM, Lahiru Gunathilake <[email protected]> wrote: > Hi Supun, > > I think in this usecase we only use storm topology to do the communication > among workers and we are completely ignoring the stream processing part. > Orchestrator will talk to Nimbus and GFAC nodes will be Worker nodes in the > storm topology. But I think we can achieve extremely fault tolerance system > by directly using storm based on following statement in storm site with > minimum changes in airavata. > > Additionally, the Nimbus daemon and Supervisor daemons are fail-fast and > stateless; all state is kept in Zookeeper or on local disk. This means you > can kill -9 Nimbus or the Supervisors and they’ll start back up like > nothing happened. This design leads to Storm clusters being incredibly > stable. > > > > > On Tue, Jun 17, 2014 at 3:02 PM, Supun Kamburugamuva <[email protected]> > wrote: > >> Hi Eran, >> >> I'm using Storm every day and this is one of the strangest things I've >> heard about using Storm. My be there are more use cases for Storm other >> than Distributed Stream processing. AFAIK the Bolts, spouts are built to >> handle a stream of events that doesn't take much time to process. In >> Airavata we don't process the messages. Instead we run experiments based >> on >> the commands given. >> >> If you want process isolation, distributed execution, cluster resource >> management Yarn would be a better thing to explore. >> >> Thanks, >> Supun.. >> >> >> On Tue, Jun 17, 2014 at 2:27 PM, Eran Chinthaka Withana < >> [email protected]> wrote: >> >> > Hi Lahiru, >> > >> > good summarization. Thanks Lahiru. >> > >> > I think you are trying to stick to a model where Orchestrator >> distributing >> > to work for GFac worker and trying to do the impedance mismatch through >> a >> > messaging solution. If you step back and think, we don't even want the >> > orchestrator to handle everything. From its point of view, it should >> submit >> > jobs to the framework, and will wait or get notified once the job is >> done. >> > >> > There are multiple ways of doing this. And here is one method. >> > >> > Orchestrator submits all its jobs to Job queue (implemented using any MQ >> > impl like Rabbit or Kafka). A storm topology is implemented to dequeue >> > messages, process them (i.e. submit those jobs and get those executed) >> and >> > notify the Orchestrator with the status (either through another >> > JobCompletionQueue or direct invocation). >> > >> > With this approach, the MQ provider will help to match impedance between >> > job submission and consumption. Storm helps with worker coordination, >> load >> > balancing, throttling on your job execution framework, worker pool >> > management and fault tolerance. >> > >> > Of course, you can implement this based only on ZK and handle everything >> > else on your own but storm had done exactly that with the use of ZK >> > underneath.
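To make the idea of persisting per-node job state in ZooKeeper a little more concrete, here is a minimal sketch (not existing Airavata code) of how each GFAC worker could record the experiments it is currently handling under its own path, so another component can find and re-queue them if the worker dies. Apache Curator is assumed; the GfacJobStateStore name, the /airavata/gfac-jobs layout and the connection handling are only illustrative.

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.zookeeper.CreateMode;

import java.nio.charset.StandardCharsets;
import java.util.List;

/**
 * Hypothetical helper: each GFAC worker records the experiments it is
 * currently handling under /airavata/gfac-jobs/<workerId>/<experimentId>,
 * so the Orchestrator can find and re-queue them if the worker dies.
 */
public class GfacJobStateStore {

    private static final String BASE = "/airavata/gfac-jobs";
    private final CuratorFramework client;
    private final String workerId;

    public GfacJobStateStore(String zkHosts, String workerId) {
        this.client = CuratorFrameworkFactory.newClient(zkHosts, new ExponentialBackoffRetry(1000, 3));
        this.client.start();
        this.workerId = workerId;
    }

    /** Called when this worker accepts an experiment (cf. step 3 of the design summary further down the thread). */
    public void markActive(String experimentId, String state) throws Exception {
        client.create().creatingParentsIfNeeded()
              .withMode(CreateMode.PERSISTENT)   // must survive the worker's ZooKeeper session
              .forPath(BASE + "/" + workerId + "/" + experimentId,
                       state.getBytes(StandardCharsets.UTF_8));
    }

    /** Called when the experiment reaches a terminal state such as completed, failed or cancelled (step 4). */
    public void markDone(String experimentId) throws Exception {
        client.delete().forPath(BASE + "/" + workerId + "/" + experimentId);
    }

    /** Used by the Orchestrator to list the jobs a (possibly dead) worker was handling (step 5). */
    public List<String> activeExperiments(String someWorkerId) throws Exception {
        return client.getChildren().forPath(BASE + "/" + someWorkerId);
    }
}

The job znodes are deliberately PERSISTENT rather than EPHEMERAL: they have to outlive the worker's ZooKeeper session so the list of in-flight jobs is still readable after a crash.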
>> > >> > Finally, if you go for a model like this, then even beyond job >> submission, >> > you can use the same model to do anything within the framework for >> internal >> > communication. For example, the workflow engine will submit its jobs to >> > queues based on what it has to do. Storm topologies exists for each >> queues >> > to dequeue messages and carry out the work in a reliable manner. >> Consider >> > this as mini-workflows within a larger workflow framework. >> > >> > We can have a voice chat if its more convenient. But not at 7am PST :) >> > >> > >> > Thanks, >> > Eran Chinthaka Withana >> > >> > >> > On Tue, Jun 17, 2014 at 10:12 AM, Lahiru Gunathilake <[email protected] >> > >> > wrote: >> > >> > > Hi All, >> > > >> > > Ignoring the tool that we are going to use to implement fault >> tolerance I >> > > have summarized the model we have decided so far. I will use the tool >> > name >> > > as X, we can use Zookeeper or some other implementation. Following >> design >> > > assume tool X and Registry have high availability. >> > > >> > > 1. Orchestrator and GFAC worker node communication is going to be >> queue >> > > based and tool X is going to be used for this communication. (We have >> to >> > > implement this with considering race condition between different gfac >> > > workers). >> > > 2. We are having multiple instances of GFAC which are identical (In >> > future >> > > we can group gfac workers). Existence of each worker node is >> identified >> > > using X. If node goes down orchestrator will be notified by X. >> > > 3. When a particular request comes and accepted by one gfac worker >> that >> > > information will be replicated in tool X and a place where this >> > information >> > > is persisted even the worker failed. >> > > 4. When a job comes to a final state like failed or cancelled or >> > completed >> > > above information will be removed. So at a given time orchestrator can >> > poll >> > > active jobs in each worker by giving a worker ID. >> > > 5. Tool X will make sure that when a worker goes down it will notify >> > > orchestrator. During a worker failure, based on step 3 and 4 >> orchestrator >> > > can poll all the active jobs of that worker and do the same thing >> like in >> > > step 1 (store the experiment ID to the queue) and gfac worker will >> pick >> > the >> > > jobs. >> > > >> > > 6. When GFAC receive a job like in step 5 it have to carefully >> evaluate >> > the >> > > state from registry and decide what to be done (If the job is pending >> > then >> > > gfac just have to monitor, if job state is like input transferred not >> > even >> > > submitted gfac has to execute rest of the chain and submit the job to >> the >> > > resource and start monitoring). >> > > >> > > If we can find a tool X which supports all these features and tool >> itself >> > > is fault tolerance and support atomicity, high availability and simply >> > API >> > > to implement we can use that tool. >> > > >> > > WDYT ? >> > > >> > > Lahiru >> > > >> > > >> > > On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva < >> [email protected]> >> > > wrote: >> > > >> > > > Hi Lahiru, >> > > > >> > > > Before moving with an implementation it may be worth to consider >> some >> > of >> > > > the following aspects as well. >> > > > >> > > > 1. How to report the progress of an experiment as state in >> ZooKeeper? >> > > What >> > > > happens if a GFac instance crashes while executing an experiment? 
>> Are >> > > there >> > > > check-points we can save so that another GFac instance can take >> over? >> > > > 2. What is the threading model of GFac instances? (I consider this >> as a >> > > > very important aspect) >> > > > 3. What are the information needed to be stored in the ZooKeeper? >> You >> > may >> > > > need to store other information about an experiment apart from its >> > > > experiment ID. >> > > > 4. How to report errors? >> > > > 5. For GFac weather you need a threading model or worker process >> model? >> > > > >> > > > Thanks, >> > > > Supun.. >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake < >> [email protected] >> > > >> > > > wrote: >> > > > >> > > > > Hi All, >> > > > > >> > > > > I think the conclusion is like this, >> > > > > >> > > > > 1, We make the gfac as a worker not a thrift service and we can >> start >> > > > > multiple workers either with bunch of providers and handlers >> > configured >> > > > in >> > > > > each worker or provider specific workers to handle the class path >> > > issues >> > > > > (not the common scenario). >> > > > > >> > > > > 2. Gfac workers can be configured to watch for a given path in >> > > zookeeper, >> > > > > and multiple workers can listen to the same path. Default path >> can be >> > > > > /airavata/gfac or can configure paths like /airavata/gfac/gsissh >> > > > > /airavata/gfac/bes. >> > > > > >> > > > > 3. Orchestrator can configure with a logic to store experiment >> IDs in >> > > > > zookeeper with a path, and orchestrator can be configured to >> provider >> > > > > specific path logic too. So when a new request come orchestrator >> > store >> > > > the >> > > > > experimentID and these experiments IDs are stored in Zk as a >> queue. >> > > > > >> > > > > 4. Since gfac workers are watching they will be notified and as >> supun >> > > > > suggested can use a leader selection algorithm[1] and one gfac >> worker >> > > > will >> > > > > take the leadership for each experiment. If there are gfac >> instances >> > > for >> > > > > each provider same logic will apply among those nodes with same >> > > provider >> > > > > type. >> > > > > >> > > > > [1]http://curator.apache.org/curator-recipes/leader-election.html >> > > > > >> > > > > I would like to implement this if there are no objections. >> > > > > >> > > > > Lahiru >> > > > > >> > > > > >> > > > > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva < >> > > [email protected] >> > > > > >> > > > > wrote: >> > > > > >> > > > > > Hi Marlon, >> > > > > > >> > > > > > I think you are exactly correct. >> > > > > > >> > > > > > Supun.. >> > > > > > >> > > > > > >> > > > > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce < >> [email protected]> >> > > > wrote: >> > > > > > >> > > > > > > Let me restate this, and please tell me if I'm wrong. >> > > > > > > >> > > > > > > Orchestrator decides (somehow) that a particular job requires >> > > > JSDL/BES, >> > > > > > so >> > > > > > > it places the Experiment ID in Zookeeper's >> > /airavata/gfac/jsdl-bes >> > > > > node. >> > > > > > > GFAC servers associated with this instance notice the update. >> > The >> > > > > first >> > > > > > > GFAC to claim the job gets it, uses the Experiment ID to get >> the >> > > > > detailed >> > > > > > > information it needs from the Registry. ZooKeeper handles the >> > > > locking, >> > > > > > etc >> > > > > > > to make sure that only one GFAC at a time is trying to handle >> an >> > > > > > experiment. 
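A rough illustration of the "first GFAC to claim the job gets it" step Marlon describes, using nothing more than ZooKeeper's atomic create (Curator's InterProcessMutex or the leader-election recipe linked above would work too). The claim path and class name are made up for the example.

import org.apache.curator.framework.CuratorFramework;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;

import java.nio.charset.StandardCharsets;

public class ExperimentClaimer {

    private final CuratorFramework client;
    private final String workerId;

    public ExperimentClaimer(CuratorFramework client, String workerId) {
        this.client = client;
        this.workerId = workerId;
    }

    /**
     * Try to claim an experiment the Orchestrator published under e.g.
     * /airavata/gfac/jsdl-bes/<experimentId>. The create() call is atomic in
     * ZooKeeper, so exactly one GFAC instance will succeed.
     */
    public boolean tryClaim(String queuePath, String experimentId) throws Exception {
        try {
            client.create()
                  .creatingParentsIfNeeded()
                  .withMode(CreateMode.EPHEMERAL)
                  .forPath(queuePath + "/" + experimentId + "/claimed-by",
                           workerId.getBytes(StandardCharsets.UTF_8));
            return true;               // this instance now owns the experiment
        } catch (KeeperException.NodeExistsException alreadyClaimed) {
            return false;              // another GFAC instance got there first
        }
    }
}

Because the claim node is ephemeral, the claim disappears automatically if the claiming GFAC instance dies, so the Orchestrator (or another worker) can notice and re-drive the experiment.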
>> > > > > > > >> > > > > > > Marlon >> > > > > > > >> > > > > > > >> > > > > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote: >> > > > > > > >> > > > > > >> Hi Supun, >> > > > > > >> >> > > > > > >> Thanks for the clarification. >> > > > > > >> >> > > > > > >> Regards >> > > > > > >> Lahiru >> > > > > > >> >> > > > > > >> >> > > > > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva < >> > > > > > [email protected]> >> > > > > > >> wrote: >> > > > > > >> >> > > > > > >> Hi Lahiru, >> > > > > > >>> >> > > > > > >>> My suggestion is that may be you don't need a Thrift service >> > > > between >> > > > > > >>> Orchestrator and the component executing the experiment. >> When a >> > > new >> > > > > > >>> experiment is submitted, orchestrator decides who can >> execute >> > > this >> > > > > job. >> > > > > > >>> Then it put the information about this experiment execution >> in >> > > > > > ZooKeeper. >> > > > > > >>> The component which wants to executes the experiment is >> > listening >> > > > to >> > > > > > this >> > > > > > >>> ZooKeeper path and when it sees the experiment it will >> execute >> > > it. >> > > > So >> > > > > > >>> that >> > > > > > >>> the communication happens through an state change in >> ZooKeeper. >> > > > This >> > > > > > can >> > > > > > >>> potentially simply your architecture. >> > > > > > >>> >> > > > > > >>> Thanks, >> > > > > > >>> Supun. >> > > > > > >>> >> > > > > > >>> >> > > > > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake < >> > > > > > [email protected]> >> > > > > > >>> wrote: >> > > > > > >>> >> > > > > > >>> Hi Supun, >> > > > > > >>>> >> > > > > > >>>> So your suggestion is to create a znode for each thrift >> > service >> > > we >> > > > > > have >> > > > > > >>>> and >> > > > > > >>>> when the request comes that node gets modified with input >> data >> > > for >> > > > > > that >> > > > > > >>>> request and thrift service is having a watch for that node >> and >> > > it >> > > > > will >> > > > > > >>>> be >> > > > > > >>>> notified because of the watch and it can read the input >> from >> > > > > zookeeper >> > > > > > >>>> and >> > > > > > >>>> invoke the operation? >> > > > > > >>>> >> > > > > > >>>> Lahiru >> > > > > > >>>> >> > > > > > >>>> >> > > > > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva < >> > > > > > >>>> [email protected]> >> > > > > > >>>> wrote: >> > > > > > >>>> >> > > > > > >>>> Hi all, >> > > > > > >>>>> >> > > > > > >>>>> Here is what I think about Airavata and ZooKeeper. In >> > Airavata >> > > > > there >> > > > > > >>>>> are >> > > > > > >>>>> many components and these components must be stateless to >> > > achieve >> > > > > > >>>>> scalability and reliability.Also there must be a >> mechanism to >> > > > > > >>>>> >> > > > > > >>>> communicate >> > > > > > >>>> >> > > > > > >>>>> between the components. At the moment Airavata uses RPC >> calls >> > > > based >> > > > > > on >> > > > > > >>>>> Thrift for the communication. >> > > > > > >>>>> >> > > > > > >>>>> ZooKeeper can be used both as a place to hold state and >> as a >> > > > > > >>>>> >> > > > > > >>>> communication >> > > > > > >>>> >> > > > > > >>>>> layer between the components. I'm involved with a project >> > that >> > > > has >> > > > > > many >> > > > > > >>>>> distributed components like AIravata. Right now we use >> Thrift >> > > > > > services >> > > > > > >>>>> >> > > > > > >>>> to >> > > > > > >>>> >> > > > > > >>>>> communicate among the components. 
But we find it >> difficult to >> > > use >> > > > > RPC >> > > > > > >>>>> >> > > > > > >>>> calls >> > > > > > >>>> >> > > > > > >>>>> and achieve stateless behaviour and thinking of replacing >> > > Thrift >> > > > > > >>>>> >> > > > > > >>>> services >> > > > > > >>>> >> > > > > > >>>>> with ZooKeeper based communication layer. So I think it is >> > > better >> > > > > to >> > > > > > >>>>> explore the possibility of removing the Thrift services >> > between >> > > > the >> > > > > > >>>>> components and use ZooKeeper as a communication mechanism >> > > between >> > > > > the >> > > > > > >>>>> services. If you do this you will have to move the state >> to >> > > > > ZooKeeper >> > > > > > >>>>> >> > > > > > >>>> and >> > > > > > >>>> >> > > > > > >>>>> will automatically achieve the stateless behaviour in the >> > > > > components. >> > > > > > >>>>> >> > > > > > >>>>> Also I think trying to make ZooKeeper optional is a bad >> idea. >> > > If >> > > > we >> > > > > > are >> > > > > > >>>>> trying to integrate something fundamentally important to >> > > > > architecture >> > > > > > >>>>> as >> > > > > > >>>>> how to store state, we shouldn't make it optional. >> > > > > > >>>>> >> > > > > > >>>>> Thanks, >> > > > > > >>>>> Supun.. >> > > > > > >>>>> >> > > > > > >>>>> >> > > > > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka < >> > > > > > >>>>> [email protected]> wrote: >> > > > > > >>>>> >> > > > > > >>>>> Hi Lahiru, >> > > > > > >>>>>> >> > > > > > >>>>>> As i understood, not only reliability , you are trying >> to >> > > > achieve >> > > > > > >>>>>> some >> > > > > > >>>>>> other requirement by introducing zookeeper, like health >> > > > monitoring >> > > > > > of >> > > > > > >>>>>> >> > > > > > >>>>> the >> > > > > > >>>> >> > > > > > >>>>> services, categorization with service implementation etc >> ... >> > . >> > > In >> > > > > > that >> > > > > > >>>>>> case, i think we can get use of zookeeper's features but >> if >> > we >> > > > > only >> > > > > > >>>>>> >> > > > > > >>>>> focus >> > > > > > >>>> >> > > > > > >>>>> on reliability, i have little bit of concern, why can't we >> > use >> > > > > > >>>>>> >> > > > > > >>>>> clustering + >> > > > > > >>>> >> > > > > > >>>>> LB ? >> > > > > > >>>>>> >> > > > > > >>>>>> Yes it is better we add Zookeeper as a prerequisite if >> user >> > > need >> > > > > to >> > > > > > >>>>>> use >> > > > > > >>>>>> it. >> > > > > > >>>>>> >> > > > > > >>>>>> Thanks, >> > > > > > >>>>>> Shameera. >> > > > > > >>>>>> >> > > > > > >>>>>> >> > > > > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake < >> > > > > > >>>>>> [email protected] >> > > > > > >>>>>> wrote: >> > > > > > >>>>>> >> > > > > > >>>>>> Hi Gagan, >> > > > > > >>>>>>> >> > > > > > >>>>>>> I need to start another discussion about it, but I had >> an >> > > > offline >> > > > > > >>>>>>> discussion with Suresh about auto-scaling. I will start >> > > another >> > > > > > >>>>>>> thread >> > > > > > >>>>>>> about this topic too. >> > > > > > >>>>>>> >> > > > > > >>>>>>> Regards >> > > > > > >>>>>>> Lahiru >> > > > > > >>>>>>> >> > > > > > >>>>>>> >> > > > > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja < >> > > > > > >>>>>>> >> > > > > > >>>>>> [email protected] >> > > > > > >>>> >> > > > > > >>>>> wrote: >> > > > > > >>>>>>> >> > > > > > >>>>>>> Thanks Lahiru for pointing to nice library, added to my >> > > > > dictionary >> > > > > > >>>>>>>> >> > > > > > >>>>>>> :). 
>> > > > > > >>>> >> > > > > > >>>>> I would like to know how are we planning to start >> multiple >> > > > > servers. >> > > > > > >>>>>>>> 1. Spawning new servers based on load? Some times we >> call >> > it >> > > > as >> > > > > > auto >> > > > > > >>>>>>>> scalable. >> > > > > > >>>>>>>> 2. To make some specific number of nodes available >> such as >> > > we >> > > > > > want 2 >> > > > > > >>>>>>>> servers to be available at any time so if one goes down >> > > then I >> > > > > > need >> > > > > > >>>>>>>> >> > > > > > >>>>>>> to >> > > > > > >>>> >> > > > > > >>>>> spawn one new to make available servers count 2. >> > > > > > >>>>>>>> 3. Initially start all the servers. >> > > > > > >>>>>>>> >> > > > > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I >> don't >> > > > > believe >> > > > > > >>>>>>>> >> > > > > > >>>>>>> existing >> > > > > > >>>>>>> >> > > > > > >>>>>>>> architecture support this? >> > > > > > >>>>>>>> >> > > > > > >>>>>>>> Regards, >> > > > > > >>>>>>>> Gagan >> > > > > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" < >> > > > [email protected] >> > > > > > >> > > > > > >>>>>>>> >> > > > > > >>>>>>> wrote: >> > > > > > >>>>>>> >> > > > > > >>>>>>>> Hi Gagan, >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>>> Thanks for your response. Please see my inline >> comments. >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja < >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>> [email protected]> >> > > > > > >>>>>>> >> > > > > > >>>>>>>> wrote: >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>>> Hi Lahiru, >> > > > > > >>>>>>>>>> Just my 2 cents. >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>>> I am big fan of zookeeper but also against adding >> > multiple >> > > > > hops >> > > > > > in >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>> the >> > > > > > >>>>>>> >> > > > > > >>>>>>>> system which can add unnecessary complexity. Here I am >> not >> > > > able >> > > > > to >> > > > > > >>>>>>>>>> understand the requirement of zookeeper may be I am >> > wrong >> > > > > > because >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>> of >> > > > > > >>>> >> > > > > > >>>>> less >> > > > > > >>>>>>> >> > > > > > >>>>>>>> knowledge of the airavata system in whole. So I would >> like >> > > to >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>> discuss >> > > > > > >>>> >> > > > > > >>>>> following point. >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>>> 1. How it will help us in making system more >> reliable. >> > > > > Zookeeper >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>> is >> > > > > > >>>> >> > > > > > >>>>> not >> > > > > > >>>>>>> >> > > > > > >>>>>>>> able to restart services. At max it can tell whether >> > service >> > > > is >> > > > > up >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>> or not >> > > > > > >>>>>>> >> > > > > > >>>>>>>> which could only be the case if airavata service goes >> down >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>> gracefully and >> > > > > > >>>>>>> >> > > > > > >>>>>>>> we have any automated way to restart it. 
If this is >> just >> > > > matter >> > > > > of >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>> routing >> > > > > > >>>>>>> >> > > > > > >>>>>>>> client requests to the available thrift servers then >> this >> > > can >> > > > be >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>> achieved >> > > > > > >>>>>>> >> > > > > > >>>>>>>> with the help of load balancer which I guess is already >> > > there >> > > > in >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>> thrift >> > > > > > >>>>>>> >> > > > > > >>>>>>>> wish list. >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>>> We have multiple thrift services and currently we >> start >> > > > only >> > > > > > one >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>> instance >> > > > > > >>>>>>> >> > > > > > >>>>>>>> of them and each thrift service is a stateless >> service. To >> > > > keep >> > > > > > the >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>> high >> > > > > > >>>>>>> >> > > > > > >>>>>>>> availability we have to start multiple instances of >> them >> > in >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>> production >> > > > > > >>>> >> > > > > > >>>>> scenario. So for clients to get an available thrift >> service >> > we >> > > > can >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>> use >> > > > > > >>>> >> > > > > > >>>>> zookeeper znodes to represent each available service. >> There >> > > are >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>> some >> > > > > > >>>> >> > > > > > >>>>> libraries which is doing similar[1] and I think we can >> use >> > > them >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>> directly. >> > > > > > >>>>>>> >> > > > > > >>>>>>>> 2. As far as registering of different providers is >> > concerned >> > > > do >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>> you >> > > > > > >>>> >> > > > > > >>>>> think for that we really need external store. >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>>> Yes I think so, because its light weight and >> reliable >> > and >> > > > we >> > > > > > have >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>> to >> > > > > > >>>> >> > > > > > >>>>> do >> > > > > > >>>>>>> >> > > > > > >>>>>>>> very minimal amount of work to achieve all these >> features >> > to >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>> Airavata >> > > > > > >>>> >> > > > > > >>>>> because zookeeper handle all the complexity. >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>>> I have seen people using zookeeper more for state >> > > management >> > > > > in >> > > > > > >>>>>>>>>> distributed environments. >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>>> +1, we might not be the most effective users of >> > zookeeper >> > > > > > because >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>> all >> > > > > > >>>> >> > > > > > >>>>> of >> > > > > > >>>>>>> >> > > > > > >>>>>>>> our services are stateless services, but my point is to >> > > > achieve >> > > > > > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal >> > work. >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>>> I would like to understand more how can we leverage >> > > > > zookeeper >> > > > > > in >> > > > > > >>>>>>>>>> airavata to make system reliable. >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>>> [1]https://github.com/eirslett/thrift-zookeeper >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>>> Regards, >> > > > > > >>>>>>>>>> Gagan >> > > > > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" < >> > [email protected] >> > > > >> > > > > > wrote: >> > > > > > >>>>>>>>>> >> > > > > > >>>>>>>>>> Thanks for the summary, Lahiru. 
I'm cc'ing the >> > > Architecture >> > > > > > list >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>> for >> > > > > > >>>> >> > > > > > >>>>> additional comments. >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>> Marlon >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote: >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> Hi All, >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and >> > how >> > > to >> > > > > use >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> it >> > > > > > >>>> >> > > > > > >>>>> in >> > > > > > >>>>>>> >> > > > > > >>>>>>>> airavata. Its really a nice way to achieve fault >> > tolerance >> > > > and >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> reliable >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> communication between our thrift services and >> clients. >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> Zookeeper >> > > > > > >>>> >> > > > > > >>>>> is a >> > > > > > >>>>>>> >> > > > > > >>>>>>>> distributed, fault tolerant system to do a reliable >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> communication >> > > > > > >>>> >> > > > > > >>>>> between >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> distributed applications. This is like an in-memory >> > file >> > > > > > system >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> which >> > > > > > >>>>>>> >> > > > > > >>>>>>>> has >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> nodes in a tree structure and each node can have >> small >> > > > > amount >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> of >> > > > > > >>>> >> > > > > > >>>>> data >> > > > > > >>>>>>> >> > > > > > >>>>>>>> associated with it and these nodes are called znodes. >> > > Clients >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> can >> > > > > > >>>> >> > > > > > >>>>> connect >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> to a zookeeper server and add/delete and update >> these >> > > > > znodes. >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>>> In Apache Airavata we start multiple thrift >> > services >> > > > and >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> these >> > > > > > >>>> >> > > > > > >>>>> can >> > > > > > >>>>>>> >> > > > > > >>>>>>>> go >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> down for maintenance or these can crash, if we use >> > > > zookeeper >> > > > > > to >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> store >> > > > > > >>>>>>> >> > > > > > >>>>>>>> these >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> configuration(thrift service configurations) we can >> > > > achieve >> > > > > a >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> very >> > > > > > >>>> >> > > > > > >>>>> reliable >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> system. Basically thrift clients can dynamically >> > > discover >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> available >> > > > > > >>>>>>> >> > > > > > >>>>>>>> service >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to >> > change >> > > > the >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> generated >> > > > > > >>>>>>> >> > > > > > >>>>>>>> thrift client code but we have to change the >> locations we >> > > are >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> invoking >> > > > > > >>>>>>> >> > > > > > >>>>>>>> them). 
ephemeral znodes will be removed when the >> thrift >> > > > service >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> goes >> > > > > > >>>>>>> >> > > > > > >>>>>>>> down >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these >> > > > > > operations. >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> With >> > > > > > >>>>>>> >> > > > > > >>>>>>>> this >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> approach we can have a node hierarchy for multiple >> of >> > > > > > airavata, >> > > > > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services. >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>>> For specifically for gfac we can have different >> types >> > of >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> services >> > > > > > >>>> >> > > > > > >>>>> for >> > > > > > >>>>>>> >> > > > > > >>>>>>>> each >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> provider implementation. This can be achieved by >> using >> > > the >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> hierarchical >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> support in zookeeper and providing some logic in >> > > > gfac-thrift >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> service >> > > > > > >>>>>>> >> > > > > > >>>>>>>> to >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> register it to a defined path. Using the same logic >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> orchestrator >> > > > > > >>>> >> > > > > > >>>>> can >> > > > > > >>>>>>> >> > > > > > >>>>>>>> discover the provider specific gfac thrift service and >> > > route >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> the >> > > > > > >>>> >> > > > > > >>>>> message to >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> the correct thrift service. >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>>> With this approach I think we simply have write >> some >> > > > client >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> code >> > > > > > >>>> >> > > > > > >>>>> in >> > > > > > >>>>>>> >> > > > > > >>>>>>>> thrift >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> services and clients and zookeeper server >> installation >> > > can >> > > > > be >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> done as >> > > > > > >>>>>>> >> > > > > > >>>>>>>> a >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> separate process and it will be easier to keep the >> > > > Zookeeper >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> server >> > > > > > >>>>>>> >> > > > > > >>>>>>>> separate from Airavata because installation of >> Zookeeper >> > > > server >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> little >> > > > > > >>>>>>> >> > > > > > >>>>>>>> complex in production scenario. I think we have to >> make >> > > sure >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> everything >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex: >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> enable.zookeeper=false >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> should works fine and users doesn't have to >> download >> > and >> > > > > start >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> zookeeper. 
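As a sketch of the ephemeral-znode discovery described above (illustrative rather than existing Airavata code; Curator is assumed, and the path layout, host:port payload and class name are invented for the example):

import org.apache.curator.framework.CuratorFramework;
import org.apache.zookeeper.CreateMode;

import java.nio.charset.StandardCharsets;
import java.util.List;

public class ThriftServiceRegistry {

    private final CuratorFramework client;

    public ThriftServiceRegistry(CuratorFramework client) {
        this.client = client;
    }

    /**
     * Called by a thrift service (e.g. a gfac instance) at startup. The znode
     * is EPHEMERAL, so ZooKeeper removes it automatically when the service's
     * session ends, i.e. when the process crashes or shuts down.
     */
    public void register(String servicePath, String instanceId, String hostPort) throws Exception {
        client.create().creatingParentsIfNeeded()
              .withMode(CreateMode.EPHEMERAL)
              .forPath(servicePath + "/" + instanceId,
                       hostPort.getBytes(StandardCharsets.UTF_8));
    }

    /**
     * Called by a thrift client before opening a connection: list the live
     * instances under e.g. /airavata/gfac/gsissh and pick one.
     */
    public String discoverOne(String servicePath) throws Exception {
        List<String> instances = client.getChildren().forPath(servicePath);
        if (instances.isEmpty()) {
            throw new IllegalStateException("No live instance under " + servicePath);
        }
        byte[] data = client.getData().forPath(servicePath + "/" + instances.get(0));
        return new String(data, StandardCharsets.UTF_8);   // "host:port" of a live server
    }
}

A client could also keep a watch on the service path and refresh its view as instances come and go, which is roughly what libraries such as the thrift-zookeeper one linked earlier in the thread aim to do.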
>> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/ >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>>> Thanks >> > > > > > >>>>>>>>>>>> Lahiru >> > > > > > >>>>>>>>>>>> >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>>>> >> > > > > > >>>>>>>>> -- >> > > > > > >>>>>>>>> System Analyst Programmer >> > > > > > >>>>>>>>> PTI Lab >> > > > > > >>>>>>>>> Indiana University >> > > > > > >>>>>>>>> >> > > > > > >>>>>>>>> >> > > > > > >>>>>>> -- >> > > > > > >>>>>>> System Analyst Programmer >> > > > > > >>>>>>> PTI Lab >> > > > > > >>>>>>> Indiana University >> > > > > > >>>>>>> >> > > > > > >>>>>>> >> > > > > > >>>>>> >> > > > > > >>>>>> -- >> > > > > > >>>>>> Best Regards, >> > > > > > >>>>>> Shameera Rathnayaka. >> > > > > > >>>>>> >> > > > > > >>>>>> email: shameera AT apache.org , shameerainfo AT >> gmail.com >> > > > > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/ >> > > > > > >>>>>> >> > > > > > >>>>>> >> > > > > > >>>>> >> > > > > > >>>>> -- >> > > > > > >>>>> Supun Kamburugamuva >> > > > > > >>>>> Member, Apache Software Foundation; http://www.apache.org >> > > > > > >>>>> E-mail: [email protected]; Mobile: +1 812 369 6762 >> > > > > > >>>>> Blog: http://supunk.blogspot.com >> > > > > > >>>>> >> > > > > > >>>>> >> > > > > > >>>>> >> > > > > > >>>> -- >> > > > > > >>>> System Analyst Programmer >> > > > > > >>>> PTI Lab >> > > > > > >>>> Indiana University >> > > > > > >>>> >> > > > > > >>>> >> > > > > > >>> >> > > > > > >>> -- >> > > > > > >>> Supun Kamburugamuva >> > > > > > >>> Member, Apache Software Foundation; http://www.apache.org >> > > > > > >>> E-mail: [email protected]; Mobile: +1 812 369 6762 >> > > > > > >>> Blog: http://supunk.blogspot.com >> > > > > > >>> >> > > > > > >>> >> > > > > > >>> >> > > > > > >> >> > > > > > > >> > > > > > >> > > > > > >> > > > > > -- >> > > > > > Supun Kamburugamuva >> > > > > > Member, Apache Software Foundation; http://www.apache.org >> > > > > > E-mail: [email protected]; Mobile: +1 812 369 6762 >> > > > > > Blog: http://supunk.blogspot.com >> > > > > > >> > > > > >> > > > > >> > > > > >> > > > > -- >> > > > > System Analyst Programmer >> > > > > PTI Lab >> > > > > Indiana University >> > > > > >> > > > >> > > > >> > > > >> > > > -- >> > > > Supun Kamburugamuva >> > > > Member, Apache Software Foundation; http://www.apache.org >> > > > E-mail: [email protected]; Mobile: +1 812 369 6762 >> > > > Blog: http://supunk.blogspot.com >> > > > >> > > >> > > >> > > >> > > -- >> > > System Analyst Programmer >> > > PTI Lab >> > > Indiana University >> > > >> > >> >> >> >> -- >> Supun Kamburugamuva >> Member, Apache Software Foundation; http://www.apache.org >> E-mail: [email protected]; Mobile: +1 812 369 6762 >> Blog: http://supunk.blogspot.com >> > > > > -- > System Analyst Programmer > PTI Lab > Indiana University > -- System Analyst Programmer PTI Lab Indiana University
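Finally, tying the pieces of this thread together, here is one possible shape for the Orchestrator-to-GFAC handoff through ZooKeeper itself (the queue-based step 1 of the design summary, and Supun's "communication through a state change in ZooKeeper"). This is a sketch only: it assumes Curator's PathChildrenCache recipe, and every path and class name is illustrative.

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.PathChildrenCache;
import org.apache.curator.framework.recipes.cache.PathChildrenCacheEvent;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.zookeeper.CreateMode;

import java.nio.charset.StandardCharsets;

public class GfacQueueDemo {

    // Orchestrator side: publish an experiment ID under the provider-specific path.
    static void submit(CuratorFramework client, String experimentId) throws Exception {
        client.create().creatingParentsIfNeeded()
              .withMode(CreateMode.PERSISTENT)
              .forPath("/airavata/gfac/gsissh/" + experimentId,
                       experimentId.getBytes(StandardCharsets.UTF_8));
    }

    // GFAC worker side: watch the same path and react to newly queued experiments.
    static void watch(CuratorFramework client) throws Exception {
        PathChildrenCache cache = new PathChildrenCache(client, "/airavata/gfac/gsissh", true);
        cache.getListenable().addListener((c, event) -> {
            if (event.getType() == PathChildrenCacheEvent.Type.CHILD_ADDED) {
                String experimentId = new String(event.getData().getData(), StandardCharsets.UTF_8);
                // A real worker would first try to claim the experiment (see the claim
                // sketch earlier in the thread), load its details from the Registry by
                // experiment ID, and then run the GFAC chain; here we only log it.
                System.out.println("New experiment queued: " + experimentId);
            }
        });
        cache.start();
    }

    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();
        watch(client);
        submit(client, "experiment-001");   // one-shot demo submission
        Thread.sleep(5000);                 // give the watcher time to fire
        client.close();
    }
}

In a real deployment the worker would claim the experiment before acting on it and read the full experiment description from the Registry, rather than carrying any payload in the znode itself.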
