Guys,

Has there been any thought to using the Apache OODT file manager as the
Airavata registry? It would seem to fit the use cases.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-283, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Eran Chinthaka Withana <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Sunday, March 2, 2014 12:31 PM
To: "[email protected]" <[email protected]>
Subject: Re: Object Database Suggestions for Airavata Registry

>Hi Suresh,
>
>Sorry for the late reply. I don't think I can make it at 1pm PST today.
>Can we please re-schedule this to 5pm PST (8pm EST) or later?
>
>Thanks,
>Eran Chinthaka Withana
>
>
>On Sun, Mar 2, 2014 at 6:38 AM, Suresh Marru <[email protected]> wrote:
>
>> Hi All,
>>
>> Great to see we have a good quorum. So how about 4pm EST (1pm PST) today
>> with a hangout on air? It works best if we start a hangout then
>> (previous attempts to pre-schedule on-air events did not work well). So
>> please check this mailing list around 4pm EST for the hangout on air
>> link.
>>
>> Meanwhile, please join the Airavata Google Plus community, that might be
>> easier to share the link -
>> https://plus.google.com/communities/100700433662281905708
>>
>> Thanks all for being willing to take time on a Sunday,
>> Suresh
>>
>> On Feb 28, 2014, at 9:15 PM, Supun Kamburugamuva <[email protected]>
>> wrote:
>>
>> > +1 for Sunday afternoon. I can make it after 4 pm EST.
>> >
>> > Thanks,
>> > Supun..
>> >
>> >
>> > On Fri, Feb 28, 2014 at 5:04 PM, Shameera Rathnayaka <
>> > [email protected]> wrote:
>> >
>> >> +1
>> >>
>> >> Thanks,
>> >> Shameera.
>> >>
>> >>
>> >> On Sat, Mar 1, 2014 at 3:11 AM, Eran Chinthaka Withana <
>> >> [email protected]> wrote:
>> >>
>> >>> +1 for Sunday afternoon
>> >>>
>> >>> Thanks,
>> >>> Eran Chinthaka Withana
>> >>>
>> >>>
>> >>> On Fri, Feb 28, 2014 at 5:17 AM, Suresh Marru <[email protected]>
>> >>> wrote:
>> >>>
>> >>>> Hi Eran,
>> >>>>
>> >>>> This is a great idea. I myself owe a few replies on this thread and
>> >>>> have been unable to take time to collect my thoughts (and realized I
>> >>>> should take time to properly articulate the challenges, otherwise we
>> >>>> will be discussing orthogonal issues).
>> >>>>
>> >>>> A hangout will help us brainstorm more comprehensively. We can have
>> >>>> it on air so we can refer back for archival purposes. How is Sunday
>> >>>> afternoon for everyone willing to join and contribute?
>> >>>>
>> >>>> Thanks,
>> >>>> Suresh
>> >>>>
>> >>>> On Feb 28, 2014, at 1:45 AM, Eran Chinthaka Withana <
>> >>>> [email protected]> wrote:
>> >>>>
>> >>>>> Hi,
>> >>>>>
>> >>>>> Is there any chance of hosting a google hangout to talk about this?
>> >>>>> I think with long emails and multiple directions things are getting
>> >>>>> a little bit confusing in the thread (I'm partly responsible for
>> >>>>> this :) ). I can join a video chat during a weekend but let's make
>> >>>>> sure it's convenient for both east and west coasts :)
>> >>>>>
>> >>>>> WDYT?
>> >>>>>
>> >>>>> Thanks,
>> >>>>> Eran Chinthaka Withana
>> >>>>>
>> >>>>>
>> >>>>> On Mon, Feb 24, 2014 at 9:32 AM, Suresh Marru <[email protected]>
>> >>>>> wrote:
>> >>>>>
>> >>>>>> I could respond to each thread in detail, but I see the general
>> >>>>>> sense is inquiring on the use case, so let me try and explain this
>> >>>>>> and see if it comes across.
>> >>>>>> I am fully onboard with perceptions of relational vs nosql
>> >>>>>> and also agree current Airavata needs are not a direct map for
>> >>>>>> NoSQL migration. I will summarize the driving motivation:
>> >>>>>>
>> >>>>>> Background: The key problem Airavata needs to solve is getting the
>> >>>>>> API and associated data model right. The problem is that the
>> >>>>>> current relational database (with OpenJPA overlay) is severely
>> >>>>>> limiting the API evolution. Science Gateways by nature are very
>> >>>>>> science domain and use-case specific. But Airavata is tackling
>> >>>>>> this challenging problem of providing a generic API which will
>> >>>>>> meet and enable these use-case-centric integrations. The issue
>> >>>>>> here is, we are designing an API to handle a wide range of known
>> >>>>>> (and some foreseen) use cases, but at the same time trying to keep
>> >>>>>> it simple and yet flexible. The only way we can get to a
>> >>>>>> reasonable, normalized version of the API is by hands-on
>> >>>>>> programming against the API. Within the Airavata PMC itself, we
>> >>>>>> can solicit half-a-dozen different ways to visualize the data
>> >>>>>> model. And we need a few hackathons with real end users of
>> >>>>>> Airavata until we find common ground. All of this needs rapid
>> >>>>>> prototyping. Currently a slight change in the data model is taking
>> >>>>>> close to two weeks of re-architecting the OpenJPA-based registry.
>> >>>>>> There are many known problems with the current draft of the data
>> >>>>>> model which have to be put down in the interest of making overall
>> >>>>>> system progress.
>> >>>>>>
>> >>>>>> So the driving motivation is certainly not any of the classic
>> >>>>>> NoSQL needs.
>> >>>>>> But a simple one: can we have a registry which is schema-agnostic
>> >>>>>> and yet is queryable for most of the fields in the model? Can we
>> >>>>>> try 10 different variants of the data model (hence API) within the
>> >>>>>> next 3 months with focused hackathons and arrive at a stable 1.0
>> >>>>>> version of the API?
>> >>>>>>
>> >>>>>> Part one of the discussion was successful in that it raised
>> >>>>>> everyone's eyebrows. Now that we have everyone's attention, what
>> >>>>>> will be a good data store for Airavata which will meet these
>> >>>>>> needs?
>> >>>>>>
>> >>>>>> P.S: Additional background: The API has been in development for
>> >>>>>> close to 3 years and is falling short of pleasing a majority. Many
>> >>>>>> academic standardization efforts fail terribly trying to pretend
>> >>>>>> to understand all use cases and proposing a standard way (which
>> >>>>>> ends up unnecessarily complex and not usable). Science by nature
>> >>>>>> is evolutionary, and restricting the capabilities to a known set
>> >>>>>> of use cases prevents the use of middleware for real scientific
>> >>>>>> research (and it gets limited to proof-of-concept demonstrations,
>> >>>>>> papers, educational use). The only way of meeting the challenges
>> >>>>>> of these evolving needs is to have a framework which can evolve
>> >>>>>> with minimal disruption.
>> >>>>>>
>> >>>>>> Great thoughts so far, please keep 'em coming until we can find a
>> >>>>>> solution driven not by technical fancies but by the real need.
>> >>>>>>
>> >>>>>> Cheers,
>> >>>>>> Suresh
>> >>>>>>
>> >>>>>> On Feb 24, 2014, at 11:53 AM, Lahiru Gunathilake
>> >>>>>> <[email protected]> wrote:
>> >>>>>>
>> >>>>>>> On Mon, Feb 24, 2014 at 11:20 AM, Milinda Pathirage <
>> >>>>>>> [email protected]> wrote:
>> >>>>>>>
>> >>>>>>>> I also think that moving to Cassandra or any other NoSQL will
>> >>>>>>>> add unnecessary complexity to your solution. Also, designing
>> >>>>>>>> proper (easy to manage changes, easy to query) NoSQL data models
>> >>>>>>>> is hard (AFAIK, it requires lots of experience and understanding
>> >>>>>>>> of data structures and queries). Also, migrating from one NoSQL
>> >>>>>>>> technology to another can require a complete re-write. And
>> >>>>>>>> current relational databases can handle heavy loads, except
>> >>>>>>>> Google, Twitter, Amazon and Facebook like loads. I don't think
>> >>>>>>>> Airavata will see Google and Amazon like loads.
>> >>>>>>>>
>> >>>>>>> +1
>> >>>>>>>
>> >>>>>>>>
>> >>>>>>>> If the constant changes to the data model are the problem, I
>> >>>>>>>> think the best option is to abstract the registry implementation
>> >>>>>>>> to something like the collections and resources used in WSO2
>> >>>>>>>> Registry [1] or something suitable for the Airavata context.
>> >>>>>>>> That will make it easy to handle changes in the data model.
>> >>>>>>>>
>> >>>>>>>> Also don't let the technologies drive design decisions. It's
>> >>>>>>>> always better to let use cases drive the design decisions.
>> >>>>>>>>
>> >>>>>>> +1
>> >>>>>>>
>> >>>>>>> Regards
>> >>>>>>> Lahiru
>> >>>>>>>
>> >>>>>>>>
>> >>>>>>>> Thanks
>> >>>>>>>> Milinda
>> >>>>>>>>
>> >>>>>>>> [1] http://wso2.com/products/governance-registry/
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Mon, Feb 24, 2014 at 10:57 AM, Supun Kamburugamuva <
>> >>>>>>>> [email protected]> wrote:
>> >>>>>>>>
>> >>>>>>>>> Hi all,
>> >>>>>>>>>
>> >>>>>>>>> I'm not trying to discourage you on your exploration of NoSQL
>> >>>>>>>>> databases. I have the following concern.
>> >>>>>>>>>
>> >>>>>>>>> Your database schema is moderately complex - even for an RDBMS
>> >>>>>>>>> it seems complex - and the data size is relatively small. I'm
>> >>>>>>>>> not sure about the current tools available but I think you will
>> >>>>>>>>> need to write more code to support all your requirements in a
>> >>>>>>>>> NoSQL database. So writing more code and allowing redundancy to
>> >>>>>>>>> support *relatively small* and *structured data* doesn't seem
>> >>>>>>>>> right to me. Maybe I'm wrong and there are better tools in
>> >>>>>>>>> NoSQL than RDBMS, which I doubt.
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Supun..
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Sun, Feb 23, 2014 at 5:20 PM, Suresh Marru
>> >>>>>>>>> <[email protected]> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hi All,
>> >>>>>>>>>>
>> >>>>>>>>>> Airavata is actively migrating to use a Thrift API for the
>> >>>>>>>>>> RESTless design and to facilitate various language bindings
>> >>>>>>>>>> from client gateways. The programming language support in
>> >>>>>>>>>> Thrift has been so far very encouraging. The current
>> >>>>>>>>>> architecture is looking like Figure 1 at [1].
>> >>>>>>>>>>
>> >>>>>>>>>> Language specific clients will be released as Thrift SDKs
>> >>>>>>>>>> (similar to the Evernote SDKs [2]).
>> >>>>>>>>>> These clients will be integrated into gateway portals which
>> >>>>>>>>>> connect to the API Server. The API operations broker the
>> >>>>>>>>>> simple calls into one or more backend CPI calls (Airavata
>> >>>>>>>>>> internal component interfaces). An example set of mappings is
>> >>>>>>>>>> illustrated in Figure 2 at [1]. The current draft of the
>> >>>>>>>>>> Thrift API for version 0.12 is at [3]; please pay attention to
>> >>>>>>>>>> the experiment model at [4].
>> >>>>>>>>>>
>> >>>>>>>>>> For the persistent store, we had a few iterations of the
>> >>>>>>>>>> Airavata Registry, shifting from a legacy XRegistry to
>> >>>>>>>>>> JackRabbit to now an OpenJPA-based registry. To allow the API
>> >>>>>>>>>> and the associated data models to evolve, it will be useful to
>> >>>>>>>>>> explore object databases so we can store the serialized
>> >>>>>>>>>> version of Thrift objects directly. But it will be nice to
>> >>>>>>>>>> have all (or most) of the fields queryable. This calls for a
>> >>>>>>>>>> more column-family design of any NoSQL approaches.
>> >>>>>>>>>>
>> >>>>>>>>>> Any recommendations for a registry architecture?
>> >>>>>>>>>>
>> >>>>>>>>>> Quickly hacking through, I find the following approach a
>> >>>>>>>>>> viable one: ZombieDB [5] over Astyanax [6], which talks to
>> >>>>>>>>>> Cassandra. Airavata can benefit immediately from the
>> >>>>>>>>>> replication and reliability of Cassandra, and from its
>> >>>>>>>>>> scalability in the near future. Some of the model objects,
>> >>>>>>>>>> like experiment creation, will need to have strong consistency
>> >>>>>>>>>> and most of the monitoring can live with eventual consistency.
>> >>>>>>>>>>
>> >>>>>>>>>> Critical comments please?
>> >>>>>>>>>>
>> >>>>>>>>>> Thanks for your time,
>> >>>>>>>>>> Suresh
>> >>>>>>>>>>
>> >>>>>>>>>> [1] - https://cwiki.apache.org/confluence/display/AIRAVATA/2014/02/23/Brainstorming+Diagrams
>> >>>>>>>>>> [2] - https://dev.evernote.com/doc/
>> >>>>>>>>>> [3] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=tree;f=airavata-api/thrift-interface-descriptions;hb=HEAD
>> >>>>>>>>>> [4] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=blob_plain;f=airavata-api/thrift-interface-descriptions/experimentModel.thrift;hb=HEAD
>> >>>>>>>>>> [5] - https://github.com/MisterTea/ZombieDB
>> >>>>>>>>>> [6] - https://github.com/Netflix/astyanax
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> --
>> >>>>>>>>> Supun Kamburugamuva
>> >>>>>>>>> Member, Apache Software Foundation; http://www.apache.org
>> >>>>>>>>> E-mail: [email protected]; Mobile: +1 812 369 6762
>> >>>>>>>>> Blog: http://supunk.blogspot.com
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> Milinda Pathirage
>> >>>>>>>> PhD Student Indiana University, Bloomington;
>> >>>>>>>> E-mail: [email protected]
>> >>>>>>>> Web: http://mpathirage.com
>> >>>>>>>> Blog: http://blog.mpathirage.com
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> System Analyst Programmer
>> >>>>>>> PTI Lab
>> >>>>>>> Indiana University
>> >>>>>>
>> >>>>>>
>> >>>>
>> >>>>
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Best Regards,
>> >> Shameera Rathnayaka.
>> >>
>> >> email: shameera AT apache.org , shameerainfo AT gmail.com
>> >> Blog : http://shameerarathnayaka.blogspot.com/
>> >>
>> >
>> >
>> >
>> > --
>> > Supun Kamburugamuva
>> > Member, Apache Software Foundation; http://www.apache.org
>> > E-mail: [email protected]; Mobile: +1 812 369 6762
>> > Blog: http://supunk.blogspot.com
>>
>>
