> In addition to this, I have a more abstract question. Isn't this simply a pub-sub system we are talking about? The Orchestrator, acting as the publisher, puts a job (experiment) on the queue; the workers, acting as subscribers, pick up the work and execute it. So the main question is why we are trying to use ZooKeeper to act as a queue. I'm not saying it's bad, but there are other scalable and proven ways of doing this (like persistent messaging solutions), with the shared state kept in ZK.
>
> Thanks,
> Eran Chinthaka Withana

Going through this thread, the same question came to my mind. Why not consider one of the many *MQ solutions? What does ZooKeeper give us beyond them? Are we already using ZooKeeper in Airavata deployments for any other use case?
thanks,
Thilina

On Mon, Jun 16, 2014 at 11:22 AM, Lahiru Gunathilake <[email protected]> wrote:

Hi All,

I think the conclusion is this:

1. We make GFac a worker rather than a Thrift service, and we can start multiple workers, either with a bunch of providers and handlers configured in each worker or with provider-specific workers to handle the classpath issues (not the common scenario).

2. GFac workers can be configured to watch a given path in ZooKeeper, and multiple workers can listen to the same path. The default path can be /airavata/gfac, or we can configure paths like /airavata/gfac/gsissh and /airavata/gfac/bes.

3. The Orchestrator can be configured with logic to store experiment IDs in ZooKeeper under a path, and it can be configured with provider-specific path logic too. So when a new request comes in, the Orchestrator stores the experiment ID, and these experiment IDs are kept in ZK as a queue.

4. Since the GFac workers are watching, they will be notified, and as Supun suggested we can use a leader-election algorithm [1] so that one GFac worker takes the leadership for each experiment. If there are GFac instances per provider, the same logic applies among the nodes with the same provider type.

[1] http://curator.apache.org/curator-recipes/leader-election.html

I would like to implement this if there are no objections.

Lahiru

On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <[email protected]> wrote:

Hi Marlon,

I think you are exactly correct.

Supun..

On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <[email protected]> wrote:

Let me restate this, and please tell me if I'm wrong.

The Orchestrator decides (somehow) that a particular job requires JSDL/BES, so it places the experiment ID in ZooKeeper's /airavata/gfac/jsdl-bes node. The GFac servers associated with this instance notice the update. The first GFac to claim the job gets it and uses the experiment ID to fetch the detailed information it needs from the Registry. ZooKeeper handles the locking, etc., to make sure that only one GFac at a time is trying to handle an experiment.

Marlon

On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:

Hi Supun,

Thanks for the clarification.

Regards,
Lahiru

On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <[email protected]> wrote:

Hi Lahiru,

My suggestion is that maybe you don't need a Thrift service between the Orchestrator and the component executing the experiment. When a new experiment is submitted, the Orchestrator decides who can execute this job. Then it puts the information about this experiment execution in ZooKeeper. The component which wants to execute the experiment is listening to this ZooKeeper path, and when it sees the experiment it will execute it. So the communication happens through a state change in ZooKeeper. This can potentially simplify your architecture.

Thanks,
Supun.
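To make the four points above concrete, here is a minimal sketch of the worker side, using Apache Curator (the library whose leader-election recipe Lahiru links to above). Everything in it is illustrative rather than existing Airavata code: the connect string, the /airavata/gfac/gsissh queue path, the "claim" child znode, and the class name are assumptions, and a production version would use the Curator leader-election or lock recipe rather than the bare ephemeral create shown here.

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.cache.PathChildrenCache;
    import org.apache.curator.framework.recipes.cache.PathChildrenCacheEvent;
    import org.apache.curator.retry.ExponentialBackoffRetry;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;

    public class GfacWorkerSketch {

        private final CuratorFramework zk;
        private final String queuePath;   // e.g. /airavata/gfac/gsissh (illustrative)

        public GfacWorkerSketch(String connectString, String queuePath) {
            this.zk = CuratorFrameworkFactory.newClient(
                    connectString, new ExponentialBackoffRetry(1000, 3));
            this.queuePath = queuePath;
            this.zk.start();
        }

        // Point 2: every worker watches the same provider-specific path, so all
        // of them are notified when the Orchestrator queues a new experiment ID.
        public void start() throws Exception {
            PathChildrenCache cache = new PathChildrenCache(zk, queuePath, false);
            cache.getListenable().addListener((client, event) -> {
                if (event.getType() == PathChildrenCacheEvent.Type.CHILD_ADDED) {
                    String experimentNode = event.getData().getPath(); // .../<experimentId>
                    if (tryClaim(experimentNode)) {
                        // Point 4: this worker won the race. Fetch the experiment
                        // details from the Registry by ID and execute it (not shown).
                    }
                }
            });
            cache.start();
        }

        // "The first GFac to claim the job gets it": only one worker can create the
        // ephemeral claim child, and because it is EPHEMERAL it disappears if this
        // worker dies, so the experiment can be picked up again by someone else.
        private boolean tryClaim(String experimentNode) {
            try {
                zk.create().withMode(CreateMode.EPHEMERAL).forPath(experimentNode + "/claim");
                return true;
            } catch (KeeperException.NodeExistsException alreadyClaimed) {
                return false;
            } catch (Exception e) {
                return false;   // treat any other ZK error as "did not get the job"
            }
        }
    }

The Orchestrator side of point 3 would then amount to little more than a persistent create of /airavata/gfac/gsissh/<experimentId> once it has decided which provider the experiment needs.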
On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <[email protected]> wrote:

Hi Supun,

So your suggestion is to create a znode for each Thrift service we have; when a request comes in, that node gets modified with the input data for the request, the Thrift service has a watch on that node, it gets notified because of the watch, and it can then read the input from ZooKeeper and invoke the operation?

Lahiru

On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <[email protected]> wrote:

Hi all,

Here is what I think about Airavata and ZooKeeper. In Airavata there are many components, and these components must be stateless to achieve scalability and reliability. Also, there must be a mechanism to communicate between the components. At the moment Airavata uses RPC calls based on Thrift for the communication.

ZooKeeper can be used both as a place to hold state and as a communication layer between the components. I'm involved with a project that has many distributed components, like Airavata. Right now we use Thrift services to communicate among the components, but we find it difficult to use RPC calls and achieve stateless behaviour, and we are thinking of replacing the Thrift services with a ZooKeeper-based communication layer. So I think it is better to explore the possibility of removing the Thrift services between the components and using ZooKeeper as the communication mechanism between the services. If you do this you will have to move the state to ZooKeeper, and you will automatically achieve stateless behaviour in the components.

Also, I think trying to make ZooKeeper optional is a bad idea. If we are integrating something as fundamentally important to the architecture as how we store state, we shouldn't make it optional.

Thanks,
Supun..

On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <[email protected]> wrote:

Hi Lahiru,

As I understood it, you are trying to achieve not only reliability but also some other requirements by introducing ZooKeeper, like health monitoring of the services, categorization by service implementation, etc. In that case I think we can make use of ZooKeeper's features, but if we only focus on reliability I have a bit of a concern: why can't we use clustering + LB?

Yes, it is better to add ZooKeeper as a prerequisite if the user needs to use it.

Thanks,
Shameera.
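Supun's proposal and Lahiru's paraphrase of it a couple of messages above (one znode per Thrift service, written with the request input and watched by the service) can also be sketched in a few lines. Again, the paths, the payload format (just the experiment ID here), and the class name are hypothetical, and the heavyweight experiment state would stay in the Registry as discussed elsewhere in the thread.

    import java.nio.charset.StandardCharsets;
    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.recipes.cache.NodeCache;

    public class ZnodeRpcSketch {

        // Caller side (e.g. the Orchestrator): instead of a Thrift call, write the
        // request input into the service's znode.
        public static void submit(CuratorFramework zk, String serviceNode, String experimentId)
                throws Exception {
            byte[] payload = experimentId.getBytes(StandardCharsets.UTF_8);
            if (zk.checkExists().forPath(serviceNode) == null) {
                zk.create().creatingParentsIfNeeded().forPath(serviceNode, payload);
            } else {
                zk.setData().forPath(serviceNode, payload);
            }
        }

        // Service side: keep a watch on the same znode and treat every data change
        // as an incoming request, read from ZooKeeper rather than from a socket.
        public static void listen(CuratorFramework zk, String serviceNode) throws Exception {
            NodeCache cache = new NodeCache(zk, serviceNode);
            cache.getListenable().addListener(() -> {
                if (cache.getCurrentData() != null) {
                    String experimentId =
                            new String(cache.getCurrentData().getData(), StandardCharsets.UTF_8);
                    // invoke the actual operation with this input, e.g. launch the experiment
                }
            });
            cache.start();
        }
    }

One practical wrinkle with a single znode per service is that a second submit can overwrite the first before it has been processed, which is presumably part of why the conclusion earlier in the thread queues one child znode per experiment instead.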
On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <[email protected]> wrote:

Hi Gagan,

I need to start another discussion about it, but I had an offline discussion with Suresh about auto-scaling. I will start another thread about this topic too.

Regards,
Lahiru

On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <[email protected]> wrote:

Thanks, Lahiru, for pointing to a nice library; added to my dictionary :).

I would like to know how we are planning to start multiple servers.

1. Spawning new servers based on load? Sometimes we call this auto-scaling.
2. Keeping some specific number of nodes available, e.g. we want 2 servers available at any time, so if one goes down I need to spawn a new one to bring the available server count back to 2.
3. Initially starting all the servers.

In scenarios 1 and 2 ZooKeeper does make sense, but I don't believe the existing architecture supports this?

Regards,
Gagan

On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <[email protected]> wrote:

Hi Gagan,

Thanks for your response. Please see my inline comments.

On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <[email protected]> wrote:

> Hi Lahiru,
> Just my 2 cents.
>
> I am a big fan of ZooKeeper but am also against adding multiple hops to the system, which can add unnecessary complexity. Here I am not able to understand the requirement for ZooKeeper; maybe I am wrong because of my limited knowledge of the Airavata system as a whole. So I would like to discuss the following points.
>
> 1. How will it help us make the system more reliable? ZooKeeper is not able to restart services. At most it can tell whether a service is up or not, which could only be the case if the Airavata service goes down gracefully and we have some automated way to restart it.
> If this is just a matter of routing client requests to the available Thrift servers, then this can be achieved with the help of a load balancer, which I guess is already on the Thrift wish list.

We have multiple Thrift services, and currently we start only one instance of each, and each Thrift service is a stateless service. To keep high availability we have to start multiple instances of them in a production scenario. So, for clients to get hold of an available Thrift service, we can use ZooKeeper znodes to represent each available service. There are some libraries doing something similar [1], and I think we can use them directly.

[1] https://github.com/eirslett/thrift-zookeeper

> 2. As far as registering the different providers is concerned, do you think we really need an external store for that?

Yes, I think so, because it is lightweight and reliable, and we have to do only a very minimal amount of work to bring all these features to Airavata, because ZooKeeper handles all the complexity.

> I have seen people using ZooKeeper more for state management in distributed environments.

+1. We might not be the most effective users of ZooKeeper because all of our services are stateless, but my point is that we can achieve fault tolerance with ZooKeeper and with minimal work.

> I would like to understand more about how we can leverage ZooKeeper in Airavata to make the system reliable.
>
> Regards,
> Gagan

On 12-Jun-2014 12:33 am, "Marlon Pierce" <[email protected]> wrote:

Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for additional comments.

Marlon
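For the "one znode per available service instance" idea from the Lahiru/Gagan exchange above (the thing libraries such as thrift-zookeeper automate), the registration half could look roughly like this. The /airavata/services/... base path, the host:port payload, the port number, and the class name are made-up for illustration; this is only a sketch, not Airavata code.

    import java.net.InetAddress;
    import java.nio.charset.StandardCharsets;
    import org.apache.curator.framework.CuratorFramework;
    import org.apache.zookeeper.CreateMode;

    public class ThriftServiceRegistrationSketch {

        // Called once when a Thrift service instance (say, a GFac server for the
        // gsissh provider) has finished starting up. The znode is EPHEMERAL, so it
        // vanishes when this instance's ZooKeeper session dies, and the instance
        // drops out of the "available" list with no extra bookkeeping.
        public static String register(CuratorFramework zk, String serviceBasePath, int thriftPort)
                throws Exception {
            String hostPort = InetAddress.getLocalHost().getHostAddress() + ":" + thriftPort;
            return zk.create()
                     .creatingParentsIfNeeded()
                     .withMode(CreateMode.EPHEMERAL_SEQUENTIAL)
                     .forPath(serviceBasePath + "/instance-",
                              hostPort.getBytes(StandardCharsets.UTF_8));
        }

        // Usage (illustrative): register(zk, "/airavata/services/gfac/gsissh", 8950);
    }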
On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:

Hi All,

I did a little research about Apache ZooKeeper [1] and how to use it in Airavata. It is really a nice way to achieve fault tolerance and reliable communication between our Thrift services and clients. ZooKeeper is a distributed, fault-tolerant system for reliable communication between distributed applications. It is like an in-memory file system which has nodes in a tree structure, and each node can have a small amount of data associated with it; these nodes are called znodes. Clients can connect to a ZooKeeper server and add, delete and update these znodes.

In Apache Airavata we start multiple Thrift services, and these can go down for maintenance or can crash. If we use ZooKeeper to store this configuration (the Thrift service configurations), we can achieve a very reliable system. Basically, Thrift clients can dynamically discover the available services by using ephemeral znodes (here we do not have to change the generated Thrift client code, but we do have to change the locations at which we invoke them). Ephemeral znodes are removed when the Thrift service goes down, and ZooKeeper guarantees the atomicity of these operations. With this approach we can have a node hierarchy for multiple instances of the airavata, orchestrator, appcatalog and gfac Thrift services.

Specifically for GFac, we can have different types of services for each provider implementation. This can be achieved by using the hierarchical support in ZooKeeper and providing some logic in the gfac-thrift service to register itself under a defined path. Using the same logic, the Orchestrator can discover the provider-specific GFac Thrift service and route the message to the correct Thrift service.
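The discovery half that the paragraph above describes for the Orchestrator could look roughly like the sketch below, under the same made-up paths as the registration sketch earlier: list the live instances under the provider-specific path, pick one, and open the Thrift connection to the host:port stored in its znode. The generated Thrift client code itself stays untouched; only the code that chooses which address to connect to changes, which is the point made above.

    import java.nio.charset.StandardCharsets;
    import java.util.List;
    import java.util.concurrent.ThreadLocalRandom;
    import org.apache.curator.framework.CuratorFramework;

    public class GfacDiscoverySketch {

        // Returns the host:port of one live GFac instance registered under the
        // provider-specific path (e.g. /airavata/services/gfac/gsissh), or null if
        // none is up. Because registrations are ephemeral znodes, a crashed
        // instance simply stops showing up here.
        public static String pickInstance(CuratorFramework zk, String providerPath)
                throws Exception {
            if (zk.checkExists().forPath(providerPath) == null) {
                return null;
            }
            List<String> instances = zk.getChildren().forPath(providerPath);
            if (instances.isEmpty()) {
                return null;
            }
            String chosen = instances.get(ThreadLocalRandom.current().nextInt(instances.size()));
            byte[] data = zk.getData().forPath(providerPath + "/" + chosen);
            return new String(data, StandardCharsets.UTF_8);   // "host:port" for the Thrift transport
        }
    }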
With this approach I think we simply have to write some client code in the Thrift services and clients, and the ZooKeeper server installation can be done as a separate process. It will also be easier to keep the ZooKeeper server separate from Airavata, because installing a ZooKeeper server is a little complex in a production scenario. I think we have to make sure everything works fine when there is no ZooKeeper running; e.g., enable.zookeeper=false should work fine, and users shouldn't have to download and start ZooKeeper.

[1] http://zookeeper.apache.org/

Thanks,
Lahiru

--
https://www.cs.indiana.edu/~tgunarat/
http://www.linkedin.com/in/thilina
http://thilina.gunarathne.org
