Re: Zookeeper in Airavata to achieve reliability

Lahiru Gunathilake Mon, 16 Jun 2014 08:43:07 -0700

Hi Supun,

Thanks for the clarification.


Regards
Lahiru


On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <[email protected]>
wrote:

> Hi Lahiru,
>
> My suggestion is that maybe you don't need a Thrift service between the
> Orchestrator and the component executing the experiment. When a new
> experiment is submitted, the Orchestrator decides who can execute the job
> and then puts the information about this experiment execution in ZooKeeper.
> The component that wants to execute the experiment is listening on this
> ZooKeeper path, and when it sees the experiment it executes it. The
> communication thus happens through a state change in ZooKeeper. This can
> potentially simplify your architecture.
>
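
A minimal sketch of the flow described above, under loud assumptions: FakeZnodeStore is an in-memory stand-in written for illustration, not the ZooKeeper client API, and the /experiments/<worker> path layout, the worker names, and the round-robin choice are all hypothetical. Standard ZooKeeper watches also fire only once and must be re-registered, which this toy callback list glosses over.

```python
from collections import defaultdict


class FakeZnodeStore:
    """In-memory stand-in for a ZooKeeper ensemble (NOT the real client API)."""

    def __init__(self):
        self.data = {}                           # znode path -> payload
        self.child_watchers = defaultdict(list)  # parent path -> callbacks

    def watch_children(self, parent, callback):
        """Call callback(child_path, payload) whenever a child of parent is created."""
        self.child_watchers[parent].append(callback)

    def create(self, path, payload):
        if path in self.data:
            raise KeyError(f"znode already exists: {path}")
        self.data[path] = payload
        parent = path.rsplit("/", 1)[0]
        for callback in list(self.child_watchers[parent]):
            callback(path, payload)


class Worker:
    """Component that executes experiments it sees under its own path."""

    def __init__(self, store, name):
        self.name = name
        self.executed = []
        store.watch_children(f"/experiments/{name}", self.on_experiment)

    def on_experiment(self, path, payload):
        # A real worker would launch the job here; the sketch only records it.
        self.executed.append((path, payload))


class Orchestrator:
    """Decides who runs an experiment and records that decision as a znode."""

    def __init__(self, store, worker_names):
        self.store = store
        self.worker_names = worker_names
        self.submitted = 0

    def submit(self, experiment_id, payload):
        # Hypothetical policy: plain round robin over the known workers.
        worker = self.worker_names[self.submitted % len(self.worker_names)]
        self.submitted += 1
        path = f"/experiments/{worker}/{experiment_id}"
        self.store.create(path, payload)  # the only "message" that is sent
        return path


store = FakeZnodeStore()
gfac_a = Worker(store, "gfac-a")
gfac_b = Worker(store, "gfac-b")
orchestrator = Orchestrator(store, ["gfac-a", "gfac-b"])
orchestrator.submit("exp-1", b"echo hello")
orchestrator.submit("exp-2", b"echo world")
print(gfac_a.executed)
print(gfac_b.executed)
```

Note that the Orchestrator never calls a worker directly: the only coupling between them is the agreed path and the payload stored there.
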
> Thanks,
> Supun.
>
>
> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <[email protected]>
> wrote:
>
>> Hi Supun,
>>
>> So your suggestion is to create a znode for each Thrift service we have.
>> When a request comes in, that node gets modified with the input data for
>> the request; the Thrift service has a watch on that node, so it is
>> notified, reads the input from ZooKeeper, and invokes the operation?
>>
>> Lahiru
>>
>>
>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <[email protected]>
>> wrote:
>>
>> > Hi all,
>> >
>> > Here is what I think about Airavata and ZooKeeper. In Airavata there are
>> > many components, and these components must be stateless to achieve
>> > scalability and reliability. There must also be a mechanism for the
>> > components to communicate. At the moment Airavata uses Thrift-based RPC
>> > calls for this communication.
>> >
>> > ZooKeeper can be used both as a place to hold state and as a communication
>> > layer between the components. I'm involved with a project that has many
>> > distributed components, like Airavata. Right now we use Thrift services to
>> > communicate among the components, but we find it difficult to use RPC calls
>> > and still achieve stateless behaviour, so we are thinking of replacing the
>> > Thrift services with a ZooKeeper-based communication layer. So I think it is
>> > worth exploring the possibility of removing the Thrift services between the
>> > components and using ZooKeeper as the communication mechanism between the
>> > services. If you do this you will have to move the state to ZooKeeper, and
>> > the components will automatically become stateless.
>> >
>> > Also, I think trying to make ZooKeeper optional is a bad idea. If we are
>> > integrating something as fundamentally important to the architecture as
>> > how state is stored, we shouldn't make it optional.
>> >
>> > Thanks,
>> > Supun..
>> >
>> >
>> > On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
>> > [email protected]> wrote:
>> >
>> >> Hi Lahiru,
>> >>
>> >> As I understand it, reliability is not the only goal: by introducing
>> >> ZooKeeper you are also trying to meet other requirements, such as health
>> >> monitoring of the services, categorization by service implementation,
>> >> etc. In that case I think we can make use of ZooKeeper's features. But if
>> >> we focus only on reliability, I have a small concern: why can't we use
>> >> clustering + LB?
>> >>
>> >> Yes, it is better to add ZooKeeper as a prerequisite if the user needs to
>> >> use it.
>> >>
>> >> Thanks,
>> >>  Shameera.
>> >>
>> >>
>> >> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <[email protected]>
>> >> wrote:
>> >>
>> >>> Hi Gagan,
>> >>>
>> >>> I need to start another discussion about it, but I had an offline
>> >>> discussion with Suresh about auto-scaling. I will start another thread
>> >>> about this topic too.
>> >>>
>> >>> Regards
>> >>> Lahiru
>> >>>
>> >>>
>> >>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <[email protected]>
>> >>> wrote:
>> >>>
>> >>> > Thanks, Lahiru, for pointing to a nice library; added to my dictionary :).
>> >>> >
>> >>> > I would like to know how we are planning to start multiple servers:
>> >>> > 1. Spawning new servers based on load? Sometimes we call this auto
>> >>> > scaling.
>> >>> > 2. Keeping a specific number of nodes available, e.g. we want 2 servers
>> >>> > to be available at any time, so if one goes down a new one needs to be
>> >>> > spawned to bring the available server count back to 2.
>> >>> > 3. Initially start all the servers.
>> >>> >
>> >>> > In scenarios 1 and 2 ZooKeeper does make sense, but I don't believe the
>> >>> > existing architecture supports this?
>> >>> >
>> >>> > Regards,
>> >>> > Gagan
>> >>> > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <[email protected]>
>> >>> > wrote:
>> >>> >
>> >>> >> Hi Gagan,
>> >>> >>
>> >>> >> Thanks for your response. Please see my inline comments.
>> >>> >>
>> >>> >>
>> >>> >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <[email protected]>
>> >>> >> wrote:
>> >>> >>
>> >>> >>> Hi Lahiru,
>> >>> >>> Just my 2 cents.
>> >>> >>>
>> >>> >>> I am a big fan of ZooKeeper but am also against adding multiple hops to
>> >>> >>> the system, which can add unnecessary complexity. Here I am not able to
>> >>> >>> understand the requirement for ZooKeeper; maybe I am wrong because I know
>> >>> >>> less about the Airavata system as a whole. So I would like to discuss
>> >>> >>> the following points.
>> >>> >>>
>> >>> >>> 1. How will it help us make the system more reliable? ZooKeeper is not
>> >>> >>> able to restart services. At most it can tell whether a service is up or
>> >>> >>> not, which would only help if the Airavata service goes down gracefully
>> >>> >>> and we have an automated way to restart it. If this is just a matter of
>> >>> >>> routing client requests to the available Thrift servers, then this can be
>> >>> >>> achieved with the help of a load balancer, which I guess is already on
>> >>> >>> the Thrift wish list.
>> >>> >>>
>> >>> >> We have multiple Thrift services; currently we start only one instance
>> >>> >> of each, and each Thrift service is stateless. To maintain high
>> >>> >> availability we have to start multiple instances of them in a production
>> >>> >> scenario. So, for clients to find an available Thrift service, we can use
>> >>> >> ZooKeeper znodes to represent each available service. There are some
>> >>> >> libraries that do something similar [1], and I think we can use them
>> >>> >> directly.
>> >>> >>
>> >>> >>> 2. As far as registering the different providers is concerned, do you
>> >>> >>> think we really need an external store for that?
>> >>> >>>
>> >>> >> Yes, I think so, because it is lightweight and reliable, and we have to do
>> >>> >> very little work to bring all these features to Airavata because
>> >>> >> ZooKeeper handles all the complexity.
>> >>> >>
>> >>> >>> I have seen people using zookeeper more for state management in
>> >>> >>> distributed environments.
>> >>> >>>
>> >>> >> +1. We might not be the most effective users of ZooKeeper because all of
>> >>> >> our services are stateless, but my point is that we can use ZooKeeper to
>> >>> >> achieve fault tolerance with minimal work.
>> >>> >>
>> >>> >>> I would like to understand more about how we can leverage ZooKeeper in
>> >>> >>> Airavata to make the system reliable.
>> >>> >>>
>> >>> >>>
>> >>> >> [1]https://github.com/eirslett/thrift-zookeeper
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>> Regards,
>> >>> >>> Gagan
>> >>> >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <[email protected]> wrote:
>> >>> >>>
>> >>> >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
>> >>> >>>> additional comments.
>> >>> >>>>
>> >>> >>>> Marlon
>> >>> >>>>
>> >>> >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>> >>> >>>> > Hi All,
>> >>> >>>> >
>> >>> >>>> > I did a little research on Apache ZooKeeper [1] and how to use it in
>> >>> >>>> > Airavata. It is a really nice way to achieve fault tolerance and reliable
>> >>> >>>> > communication between our Thrift services and clients. ZooKeeper is a
>> >>> >>>> > distributed, fault-tolerant system for reliable communication between
>> >>> >>>> > distributed applications. It is like an in-memory file system with
>> >>> >>>> > nodes in a tree structure; each node, called a znode, can have a small
>> >>> >>>> > amount of data associated with it. Clients can connect to a ZooKeeper
>> >>> >>>> > server and add, delete, and update these znodes.
>> >>> >>>> >
>> >>> >>>> > In Apache Airavata we start multiple Thrift services, and these can go
>> >>> >>>> > down for maintenance or crash. If we use ZooKeeper to store their
>> >>> >>>> > configuration (the Thrift service configurations), we can achieve a very
>> >>> >>>> > reliable system. Basically, Thrift clients can dynamically discover the
>> >>> >>>> > available services by using ephemeral znodes (here we do not have to
>> >>> >>>> > change the generated Thrift client code, but we do have to change the
>> >>> >>>> > locations from which we invoke it). Ephemeral znodes are removed when the
>> >>> >>>> > Thrift service goes down, and ZooKeeper guarantees the atomicity of these
>> >>> >>>> > operations. With this approach we can have a node hierarchy for multiple
>> >>> >>>> > instances of the airavata, orchestrator, appcatalog and gfac Thrift
>> >>> >>>> > services.
>> >>> >>>> >
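
A minimal sketch of discovery through ephemeral znodes, with every name an assumption: FakeRegistry is an in-memory stand-in rather than the ZooKeeper client API, and the /airavata/<service>/<host:port> layout, session ids, addresses, and ports are hypothetical. In real ZooKeeper an ephemeral znode disappears when the session that created it ends; close_session() imitates that here.

```python
import random


class FakeRegistry:
    """In-memory stand-in for ephemeral znodes (NOT the ZooKeeper client API)."""

    def __init__(self):
        self.nodes = {}  # znode path -> (owning session id, "host:port")

    def register(self, session_id, service, endpoint):
        """Create an 'ephemeral' znode owned by the given session."""
        path = f"/airavata/{service}/{endpoint}"
        self.nodes[path] = (session_id, endpoint)
        return path

    def close_session(self, session_id):
        """Simulate a crash or shutdown: drop every znode the session owns."""
        self.nodes = {
            path: value
            for path, value in self.nodes.items()
            if value[0] != session_id
        }

    def endpoints(self, service):
        """List the endpoints currently registered for a service."""
        prefix = f"/airavata/{service}/"
        return sorted(
            endpoint
            for path, (_, endpoint) in self.nodes.items()
            if path.startswith(prefix)
        )


def pick_endpoint(registry, service, rng=random):
    """Client side: choose any live instance instead of a hard-coded address."""
    live = registry.endpoints(service)
    if not live:
        raise LookupError(f"no live instance of {service}")
    return rng.choice(live)


registry = FakeRegistry()
registry.register("session-1", "orchestrator", "10.0.0.1:8940")
registry.register("session-2", "orchestrator", "10.0.0.2:8940")
before = registry.endpoints("orchestrator")

registry.close_session("session-1")  # the first instance goes down
after = registry.endpoints("orchestrator")
chosen = pick_endpoint(registry, "orchestrator")
print(before, after, chosen)
```

The generated Thrift client code is untouched in this picture; only the place where the client obtains its host and port changes.
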
>> >>> >>>> > Specifically for gfac, we can have different types of services for each
>> >>> >>>> > provider implementation. This can be achieved by using the hierarchical
>> >>> >>>> > support in ZooKeeper and adding some logic to the gfac Thrift service to
>> >>> >>>> > register itself under a defined path. Using the same logic, the
>> >>> >>>> > orchestrator can discover the provider-specific gfac Thrift service and
>> >>> >>>> > route the message to the correct Thrift service.
>> >>> >>>> >
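
A minimal sketch of provider-specific registration and routing, assuming a hypothetical /airavata/gfac/<provider>/<host:port> layout. FakeTree is an in-memory stand-in rather than the ZooKeeper client API, and the provider names and addresses are placeholders chosen only for illustration.

```python
class FakeTree:
    """In-memory stand-in for a znode hierarchy (NOT the ZooKeeper client API)."""

    def __init__(self):
        self.paths = set()

    def create(self, path):
        self.paths.add(path)

    def children(self, parent):
        """Return the names of the direct children of parent, sorted."""
        prefix = parent.rstrip("/") + "/"
        return sorted(
            {p[len(prefix):].split("/", 1)[0] for p in self.paths if p.startswith(prefix)}
        )


GFAC_ROOT = "/airavata/gfac"  # hypothetical root path


def register_gfac(tree, provider, endpoint):
    """A gfac instance registers itself under its provider's path."""
    path = f"{GFAC_ROOT}/{provider}/{endpoint}"
    tree.create(path)
    return path


def route(tree, provider):
    """The orchestrator looks up a gfac instance for the given provider."""
    instances = tree.children(f"{GFAC_ROOT}/{provider}")
    if not instances:
        raise LookupError(f"no gfac service registered for provider {provider!r}")
    return instances[0]  # simplest possible choice: first in sorted order


tree = FakeTree()
register_gfac(tree, "ssh", "10.0.0.5:8950")
register_gfac(tree, "ssh", "10.0.0.6:8950")
register_gfac(tree, "gram", "10.0.0.7:8950")

providers = tree.children(GFAC_ROOT)
ssh_target = route(tree, "ssh")
gram_target = route(tree, "gram")
print(providers, ssh_target, gram_target)
```

Both sides share only the path convention: the gfac service writes under its provider's branch, and the orchestrator reads that same branch to decide where to send the message.
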
>> >>> >>>> > With this approach I think we simply have to write some client code in
>> >>> >>>> > the Thrift services and clients. The ZooKeeper server installation can be
>> >>> >>>> > done as a separate process, and it will be easier to keep the ZooKeeper
>> >>> >>>> > server separate from Airavata because installing a ZooKeeper server is a
>> >>> >>>> > little complex in a production scenario. I think we have to make sure
>> >>> >>>> > everything works fine when there is no ZooKeeper running, e.g.
>> >>> >>>> > enable.zookeeper=false should work fine and users shouldn't have to
>> >>> >>>> > download and start ZooKeeper.
>> >>> >>>> >
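
A minimal sketch of how such a switch could behave. Only the enable.zookeeper key comes from the message above; the orchestrator.host and orchestrator.port keys, the default of "false" when the key is missing, and the discover callback are assumptions made for illustration.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve_endpoint(props, discover):
    """Use discovery only when enable.zookeeper=true; otherwise use static config."""
    if props.get("enable.zookeeper", "false").lower() == "true":
        return discover()
    # Hypothetical static keys used when ZooKeeper is switched off.
    return f'{props["orchestrator.host"]}:{props["orchestrator.port"]}'


STATIC_CONFIG = """
# ZooKeeper disabled: nothing to download or start
enable.zookeeper=false
orchestrator.host=localhost
orchestrator.port=8940
"""

props = parse_properties(STATIC_CONFIG)
static_endpoint = resolve_endpoint(props, discover=lambda: "unused:0")
dynamic_endpoint = resolve_endpoint(
    {"enable.zookeeper": "true"}, discover=lambda: "10.0.0.2:8940"
)
print(static_endpoint, dynamic_endpoint)
```
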
>> >>> >>>> >
>> >>> >>>> >
>> >>> >>>> > [1]http://zookeeper.apache.org/
>> >>> >>>> >
>> >>> >>>> > Thanks
>> >>> >>>> > Lahiru
>> >>> >>>>
>> >>> >>>>
>> >>> >>
>> >>> >>
>> >>> >> --
>> >>> >> System Analyst Programmer
>> >>> >> PTI Lab
>> >>> >> Indiana University
>> >>> >>
>> >>> >
>> >>>
>> >>>
>> >>> --
>> >>> System Analyst Programmer
>> >>> PTI Lab
>> >>> Indiana University
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Best Regards,
>> >> Shameera Rathnayaka.
>> >>
>> >> email: shameera AT apache.org , shameerainfo AT gmail.com
>> >> Blog : http://shameerarathnayaka.blogspot.com/
>> >>
>> >
>> >
>> >
>> > --
>> > Supun Kamburugamuva
>> > Member, Apache Software Foundation; http://www.apache.org
>> > E-mail: [email protected];  Mobile: +1 812 369 6762
>> > Blog: http://supunk.blogspot.com
>> >
>> >
>>
>>
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>>
>
>
>
> --
> Supun Kamburugamuva
> Member, Apache Software Foundation; http://www.apache.org
> E-mail: [email protected];  Mobile: +1 812 369 6762
> Blog: http://supunk.blogspot.com
>
>


-- 
System Analyst Programmer
PTI Lab
Indiana University
