Hi Marlon, I think you are exactly correct.
Supun.

On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <[email protected]> wrote:

> Let me restate this, and please tell me if I'm wrong.
>
> The Orchestrator decides (somehow) that a particular job requires JSDL/BES, so it places the Experiment ID in ZooKeeper's /airavata/gfac/jsdl-bes node. The GFAC servers associated with this instance notice the update. The first GFAC to claim the job gets it and uses the Experiment ID to get the detailed information it needs from the Registry. ZooKeeper handles the locking, etc., to make sure that only one GFAC at a time is trying to handle an experiment.
>
> Marlon

On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:

> Hi Supun,
>
> Thanks for the clarification.
>
> Regards,
> Lahiru

On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <[email protected]> wrote:

> Hi Lahiru,
>
> My suggestion is that you may not need a Thrift service between the Orchestrator and the component executing the experiment. When a new experiment is submitted, the Orchestrator decides who can execute this job, then puts the information about this experiment execution in ZooKeeper. The component that wants to execute the experiment listens on this ZooKeeper path, and when it sees the experiment it executes it. So the communication happens through a state change in ZooKeeper. This can potentially simplify your architecture.
>
> Thanks,
> Supun.

On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <[email protected]> wrote:

> Hi Supun,
>
> So your suggestion is to create a znode for each Thrift service we have; when a request comes, that node gets modified with the input data for the request, and because the Thrift service has a watch on that node it gets notified, reads the input from ZooKeeper, and invokes the operation?
>
> Lahiru

On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <[email protected]> wrote:

> Hi all,
>
> Here is what I think about Airavata and ZooKeeper. In Airavata there are many components, and these components must be stateless to achieve scalability and reliability. There must also be a mechanism for communication between the components. At the moment Airavata uses RPC calls based on Thrift for this communication.
>
> ZooKeeper can be used both as a place to hold state and as a communication layer between the components. I'm involved with a project that has many distributed components, like Airavata. Right now we use Thrift services to communicate among the components, but we find it difficult to use RPC calls and achieve stateless behaviour, and we are thinking of replacing the Thrift services with a ZooKeeper-based communication layer. So I think it is better to explore the possibility of removing the Thrift services between the components and using ZooKeeper as the communication mechanism between the services. If you do this you will have to move the state to ZooKeeper, and you will automatically achieve stateless behaviour in the components.
>
> Also, I think trying to make ZooKeeper optional is a bad idea. If we are integrating something as fundamentally important to the architecture as how we store state, we shouldn't make it optional.
>
> Thanks,
> Supun.
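To make the hand-off concrete, here is a minimal sketch of the pattern Marlon restates and Supun proposes, using the plain ZooKeeper Java API. The /airavata/gfac/jsdl-bes path comes from Marlon's mail; the "claimed-by" child node, the class and method names, and the GFAC id are illustrative assumptions, not existing Airavata code.

import org.apache.zookeeper.*;
import java.util.List;

// A GFAC worker watches /airavata/gfac/jsdl-bes for experiment IDs written
// by the Orchestrator and claims each one with an ephemeral child znode,
// so that exactly one GFAC instance handles a given experiment.
// Assumes the /airavata/gfac/jsdl-bes node already exists.
public class GfacJsdlBesWorker implements Watcher {

    private static final String QUEUE = "/airavata/gfac/jsdl-bes";
    private final ZooKeeper zk;

    public GfacJsdlBesWorker(String connectString) throws Exception {
        zk = new ZooKeeper(connectString, 30000, this);
    }

    @Override
    public void process(WatchedEvent event) {
        try {
            if (event.getType() == Event.EventType.None
                    && event.getState() == Event.KeeperState.SyncConnected) {
                claimPending(); // session is up: initial scan, which arms the watch
            } else if (event.getType() == Event.EventType.NodeChildrenChanged) {
                claimPending(); // a new experiment was queued
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private void claimPending() throws Exception {
        // 'true' re-registers this object as the watcher for the next change.
        List<String> experiments = zk.getChildren(QUEUE, true);
        for (String experimentId : experiments) {
            try {
                // Creating the same ephemeral child succeeds for exactly one
                // GFAC; the others get NodeExistsException and skip the job.
                zk.create(QUEUE + "/" + experimentId + "/claimed-by",
                        "gfac-1".getBytes(),
                        ZooDefs.Ids.OPEN_ACL_UNSAFE,
                        CreateMode.EPHEMERAL);
                execute(experimentId);
            } catch (KeeperException.NodeExistsException alreadyClaimed) {
                // Another GFAC instance got there first.
            }
        }
    }

    private void execute(String experimentId) {
        // Hypothetical: look up the full experiment details in the Registry
        // by Experiment ID, then run the JSDL/BES provider.
        System.out.println("Executing experiment " + experimentId);
    }
}

On the Orchestrator side the enqueue would just be a persistent create of /airavata/gfac/jsdl-bes/<experimentId>. Because the claim node is ephemeral, an experiment claimed by a GFAC that then crashes becomes claimable again, which is the fault-tolerance behaviour discussed below in the thread.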
On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <[email protected]> wrote:

> Hi Lahiru,
>
> As I understand it, you are trying to achieve not only reliability but some other requirements as well by introducing ZooKeeper, like health monitoring of the services, categorization by service implementation, etc. In that case I think we can make use of ZooKeeper's features, but if we only focus on reliability I have a bit of a concern: why can't we use clustering + LB?
>
> Yes, it is better that we add ZooKeeper as a prerequisite if the user needs to use it.
>
> Thanks,
> Shameera.

On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <[email protected]> wrote:

> Hi Gagan,
>
> I need to start another discussion about it, but I had an offline discussion with Suresh about auto-scaling. I will start another thread about that topic too.
>
> Regards,
> Lahiru

On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <[email protected]> wrote:

> Thanks, Lahiru, for pointing to a nice library; added to my dictionary :).
>
> I would like to know how we are planning to start multiple servers:
>
> 1. Spawning new servers based on load? Sometimes we call this auto-scaling.
> 2. Keeping some specific number of nodes available, e.g. we want 2 servers available at any time, so if one goes down we spawn a new one to bring the available server count back to 2.
> 3. Initially starting all the servers.
>
> In scenarios 1 and 2 ZooKeeper does make sense, but I don't believe the existing architecture supports this?
>
> Regards,
> Gagan
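For scenarios 1 and 2, what ZooKeeper itself can contribute is group membership: each server registers an ephemeral znode, and a watcher learns immediately when one disappears. The sketch below shows only that piece; the /airavata/live-servers path, the target count, and the class names are illustrative, and actually spawning a replacement server is left to an external supervisor, exactly as Gagan's point implies.

import org.apache.zookeeper.*;
import java.util.List;

// Liveness via ephemeral znodes: a server's node vanishes automatically
// when its ZooKeeper session ends, whether it stopped gracefully or crashed.
// Assumes /airavata/live-servers already exists and the session is established
// before register()/checkMembers() are called.
public class ServerMembership implements Watcher {

    private static final String MEMBERS = "/airavata/live-servers";
    private static final int TARGET = 2;
    private final ZooKeeper zk;

    public ServerMembership(String connectString) throws Exception {
        zk = new ZooKeeper(connectString, 30000, this);
    }

    // Called by each server on startup.
    public void register(String serverId) throws Exception {
        zk.create(MEMBERS + "/" + serverId, new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
    }

    // Called by whatever supervises the cluster.
    public void checkMembers() throws Exception {
        List<String> alive = zk.getChildren(MEMBERS, true); // watch for changes
        if (alive.size() < TARGET) {
            // ZooKeeper can only report the shortfall; starting a new
            // server (Gagan's scenario 2) is the supervisor's job.
            System.out.println("Only " + alive.size()
                    + " servers alive; target is " + TARGET);
        }
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeChildrenChanged) {
            try {
                checkMembers();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}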
On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <[email protected]> wrote:

> Hi Gagan,
>
> Thanks for your response. Please see my inline comments.
>
> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <[email protected]> wrote:
>
>> Hi Lahiru,
>>
>> Just my 2 cents.
>>
>> I am a big fan of ZooKeeper, but I am also against adding multiple hops to a system, because they can add unnecessary complexity. Here I am not able to understand the requirement for ZooKeeper; maybe I am wrong, because of my limited knowledge of the Airavata system as a whole. So I would like to discuss the following points.
>>
>> 1. How will it help us make the system more reliable? ZooKeeper is not able to restart services. At most it can tell whether a service is up or not, which only helps if the Airavata service goes down gracefully and we have some automated way to restart it. If this is just a matter of routing client requests to the available Thrift servers, then it can be achieved with the help of a load balancer, which I guess is already on the Thrift wish list.
>
> We have multiple Thrift services, and currently we start only one instance of each; every Thrift service is stateless. For high availability we would have to start multiple instances of them in a production scenario. So, for clients to get an available Thrift service, we can use ZooKeeper znodes to represent each available service. There are libraries doing something similar [1], and I think we can use them directly.
>
>> 2. As far as registering the different providers is concerned, do you think we really need an external store for that?
>
> Yes, I think so, because it is lightweight and reliable, and we have to do only a very minimal amount of work to bring all these features to Airavata, because ZooKeeper handles all the complexity.
>
>> I have seen people using ZooKeeper more for state management in distributed environments.
>
> +1. We might not be the most effective users of ZooKeeper, because all of our services are stateless, but my point is that ZooKeeper lets us achieve fault tolerance with minimal work.
>
>> I would like to understand more about how we can leverage ZooKeeper in Airavata to make the system reliable.
>>
>> Regards,
>> Gagan
>
> [1] https://github.com/eirslett/thrift-zookeeper

On 12-Jun-2014 12:33 am, "Marlon Pierce" <[email protected]> wrote:

> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for additional comments.
>
> Marlon

On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:

> Hi All,
>
> I did a little research on Apache ZooKeeper [1] and how to use it in Airavata. It is really a nice way to achieve fault tolerance and reliable communication between our Thrift services and clients. ZooKeeper is a distributed, fault-tolerant system for reliable communication between distributed applications. It is like an in-memory file system with nodes in a tree structure; each node can have a small amount of data associated with it, and these nodes are called znodes. Clients can connect to a ZooKeeper server and add, delete, and update these znodes.
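As a quick reference for these basic operations, here is a minimal sketch with the plain ZooKeeper Java client; the /demo path and its payload are illustrative only.

import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;
import java.util.concurrent.CountDownLatch;

// Create, read, update, and delete a znode.
public class ZnodeBasics {
    public static void main(String[] args) throws Exception {
        // The connection is asynchronous, so wait for the session to come up.
        final CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();

        // Create a znode carrying a small amount of data.
        zk.create("/demo", "hello".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        // Read it back; Stat receives the node's metadata, e.g. its version.
        Stat stat = new Stat();
        byte[] data = zk.getData("/demo", false, stat);
        System.out.println(new String(data) + " @ version " + stat.getVersion());

        // Update, passing the expected version as an optimistic-concurrency check.
        zk.setData("/demo", "updated".getBytes(), stat.getVersion());

        // Delete; -1 skips the version check.
        zk.delete("/demo", -1);

        zk.close();
    }
}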
> In Apache Airavata we start multiple Thrift services, and these can go down for maintenance or can crash. If we use ZooKeeper to store this configuration (the Thrift service configurations), we can achieve a very reliable system. Basically, Thrift clients can dynamically discover the available services by using ephemeral znodes (here we do not have to change the generated Thrift client code, only the locations we invoke it against). Ephemeral znodes are removed when the Thrift service goes down, and ZooKeeper guarantees the atomicity of these operations. With this approach we can have a node hierarchy for the multiple Airavata, Orchestrator, AppCatalog, and GFAC Thrift services.
>
> Specifically for GFAC we can have different types of services for each provider implementation. This can be achieved by using the hierarchical support in ZooKeeper and adding some logic to the GFAC Thrift service so that it registers itself under a defined path. Using the same logic, the Orchestrator can discover the provider-specific GFAC Thrift service and route the message to the correct Thrift service.
>
> With this approach I think we simply have to write some client code in the Thrift services and clients. The ZooKeeper server installation can be done as a separate process, and it will be easier to keep the ZooKeeper server separate from Airavata, because installing a ZooKeeper server is a little complex in a production scenario. I also think we have to make sure everything works fine when there is no ZooKeeper running; e.g., enable.zookeeper=false should work, and users shouldn't have to download and start ZooKeeper.
>
> [1] http://zookeeper.apache.org/
>
> Thanks,
> Lahiru
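The per-provider hierarchy Lahiru describes could look like the sketch below, again with the plain ZooKeeper Java API. The /airavata/gfac/<provider> layout echoes the /airavata/gfac/jsdl-bes node from Marlon's summary at the top of the thread; the instance-node naming, the host:port payload, and the class itself are illustrative assumptions. It assumes an already-connected ZooKeeper handle.

import org.apache.zookeeper.*;
import java.util.List;
import java.util.Random;

// GFAC instances register ephemeral znodes under a per-provider path;
// the Orchestrator discovers a live instance for the provider it needs.
public class GfacProviderRegistry {

    private final ZooKeeper zk; // assumed already connected

    public GfacProviderRegistry(ZooKeeper zk) {
        this.zk = zk;
    }

    // GFAC side: register this instance under /airavata/gfac/<provider>.
    // EPHEMERAL_SEQUENTIAL gives each instance a unique name, and the node
    // disappears automatically if this GFAC dies.
    public void register(String provider, String hostPort) throws Exception {
        String base = "/airavata/gfac/" + provider;
        ensurePath(base);
        zk.create(base + "/instance-", hostPort.getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
    }

    // Orchestrator side: pick a live GFAC for this provider and return its
    // host:port so the message can be routed to the right Thrift service.
    public String discover(String provider) throws Exception {
        String base = "/airavata/gfac/" + provider;
        List<String> instances = zk.getChildren(base, false);
        if (instances.isEmpty()) {
            throw new IllegalStateException("No live GFAC for " + provider);
        }
        String chosen = instances.get(new Random().nextInt(instances.size()));
        return new String(zk.getData(base + "/" + chosen, false, null));
    }

    // Create the persistent parent nodes if they don't exist yet.
    private void ensurePath(String path) throws Exception {
        StringBuilder sb = new StringBuilder();
        for (String part : path.substring(1).split("/")) {
            sb.append('/').append(part);
            try {
                zk.create(sb.toString(), new byte[0],
                        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
            } catch (KeeperException.NodeExistsException ignored) {
                // Already there.
            }
        }
    }
}

With this layout, routing a JSDL/BES experiment reduces to discover("jsdl-bes"), and keeping ZooKeeper optional (enable.zookeeper=false) would mean falling back to statically configured service locations instead of calling discover().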
--
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: [email protected]; Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com
