Hi,

Is there any chance of hosting a Google Hangout to talk about this? I think with long emails and multiple directions, things are getting a little bit confusing in this thread (I'm partly responsible for this :) ). I can join a video chat during a weekend, but let's make sure it's convenient for both east and west coasts :)
WDYT?

Thanks,
Eran Chinthaka Withana

On Mon, Feb 24, 2014 at 9:32 AM, Suresh Marru <[email protected]> wrote:

> I could respond to each thread in detail, but I see the general sense is
> inquiring on the use case, so let me try and explain this and see if it
> comes across. I am fully on board with perceptions of relational vs NoSQL
> and also agree current Airavata needs are not a direct map for NoSQL
> migration. I will summarize the driving motivation:
>
> Background: The key problem Airavata needs to solve is getting the API and
> associated data model right. The problem is that the current relational
> database (with OpenJPA overlay) is severely limiting the API evolution.
> Science Gateways by nature are very science-domain and use-case specific.
> But Airavata is tackling this challenging problem of providing a generic API
> which will meet and enable these use-case-centric integrations. The issue
> here is, we are designing an API to handle a wide range of known (and some
> foreseen) use cases, while at the same time trying to keep it simple and yet
> flexible. The only way we can get to a reasonable, normalized version
> of the API is by hands-on programming against the API. Within the Airavata
> PMC itself, we can solicit half a dozen different ways to visualize
> the data model. And we need a few hackathons with real end users of Airavata
> until we find common ground. All of this needs rapid prototyping.
> Currently a slight change in the data model is taking close to two weeks of
> re-architecting the OpenJPA-based registry. There are many known problems
> with the current draft of the data model which have to be put aside in the
> interest of making overall system progress.
>
> So the driving motivation is certainly not any of the classic NoSQL needs,
> but a simple one: can we have a registry which is schema-agnostic and yet is
> queryable for most of the fields in the model?
> Can we try 10 different
> variants of the data model (hence API) within the next 3 months with focused
> hackathons and arrive at a stable 1.0 version of the API?
>
> Part one of the discussion is successful in that it raised everyone's
> eyebrows. Now that we have everyone's attention, what will be a good data
> store for Airavata which will meet these needs?
>
> P.S.: Additional background: The API has been in development for close to 3
> years and is falling short of pleasing a majority. Many academic
> standardization efforts fail terribly by pretending to understand all
> use cases and proposing a standard way (which ends up unnecessarily complex
> and not usable). Science by nature is evolutionary, and restricting the
> capabilities to a known set of use cases prevents the use of middleware for
> real scientific research (and limits it to proof-of-concept
> demonstrations, papers, educational use). The only way of meeting the
> challenges of these evolving needs is to have a framework which can
> evolve with minimal disruption.
>
> Great thoughts so far, please keep 'em coming until we can find a solution
> driven not by technical fancies but by the real need.
>
> Cheers,
> Suresh
>
> On Feb 24, 2014, at 11:53 AM, Lahiru Gunathilake <[email protected]>
> wrote:
>
> > On Mon, Feb 24, 2014 at 11:20 AM, Milinda Pathirage <
> > [email protected]> wrote:
> >
> >> I also think that moving to Cassandra or any other NoSQL store will add
> >> unnecessary complexity to your solution. Also, designing proper (easy to
> >> manage changes, easy to query) NoSQL data models is hard (AFAIK, it
> >> requires lots of experience and understanding of data structures and
> >> queries). Also, migrating from one NoSQL technology to another can require
> >> a complete rewrite. And current relational databases can handle heavy
> >> loads, except Google-, Twitter-, Amazon- and Facebook-like loads. I don't
> >> think Airavata will see Google- and Amazon-like loads.
> >>
> > +1
> >
> >>
> >> If the constant changes to the data model are the problem, I think the
> >> best option is to abstract the registry implementation to something like
> >> the collections and resources used in WSO2 Registry [1] or something
> >> suitable for the Airavata context. That will make it easy to handle
> >> changes in the data model.
> >>
> >> Also, don't let the technologies drive design decisions. It's always
> >> better to let use cases drive the design decisions.
> >>
> > +1
> >
> > Regards
> > Lahiru
> >
> >>
> >> Thanks
> >> Milinda
> >>
> >> [1] http://wso2.com/products/governance-registry/
> >>
> >>
> >> On Mon, Feb 24, 2014 at 10:57 AM, Supun Kamburugamuva <[email protected]>
> >> wrote:
> >>
> >>> Hi all,
> >>>
> >>> I'm not trying to discourage you from your exploration of NoSQL
> >>> databases. I have the following concern.
> >>>
> >>> Your database schema is moderately complex - even for an RDBMS it seems
> >>> complex - and the data size is relatively small. I'm not sure about the
> >>> current tools available, but I think you will need to write more code to
> >>> support all your requirements in a NoSQL database. So writing more code
> >>> and allowing redundancy to support *relatively small* and *structured
> >>> data* doesn't seem right to me. Maybe I'm wrong and there are better
> >>> tools in NoSQL than in RDBMS, which I doubt.
> >>>
> >>> Thanks,
> >>> Supun..
> >>>
> >>>
> >>>
> >>> On Sun, Feb 23, 2014 at 5:20 PM, Suresh Marru <[email protected]> wrote:
> >>>
> >>>> Hi All,
> >>>>
> >>>> Airavata is actively migrating to use Thrift API for the RESTless
> >>>> design and to facilitate various language bindings from client
> >>>> gateways. The programming language support in Thrift has been so far
> >>>> very encouraging. The current architecture is looking like Figure 1
> >>>> at [1].
> >>>>
> >>>> Language-specific clients will be released as Thrift SDKs (similar to
> >>>> Evernote SDKs [2]).
> >>>> These clients will be integrated into gateway
> >>>> portals which connect to the API Server. The API operations broker the
> >>>> simple calls into one or more backend CPI calls (Airavata internal
> >>>> component interfaces). An example set of mappings is illustrated in
> >>>> Figure 2 at [1]. The current draft of the Thrift API for version 0.12
> >>>> is at [3]; please pay attention to the experiment model at [4].
> >>>>
> >>>> For the persistent store, we have had a few iterations of the Airavata
> >>>> Registry, shifting from a legacy XRegistry to Jackrabbit to now an
> >>>> OpenJPA-based registry. To allow the API and the associated data models
> >>>> to evolve, it will be useful to explore object databases so we can
> >>>> store the serialized version of Thrift objects directly. But it will be
> >>>> nice to have all (or most) of the fields queryable. This calls for more
> >>>> of a column-family design among the NoSQL approaches.
> >>>>
> >>>> Any recommendations for a registry architecture?
> >>>>
> >>>> Quickly hacking through, I find the following approach a viable one:
> >>>> ZombieDB [5] over Astyanax [6], which talks to Cassandra. Airavata can
> >>>> benefit immediately from the replication and reliability of Cassandra,
> >>>> and from its scalability in the near future. Some of the model objects
> >>>> like experiment creation will need to have strong consistency and most
> >>>> of the monitoring can live with eventual consistency.
> >>>>
> >>>> Critical comments please?
> >>>>
> >>>> Thanks for your time,
> >>>> Suresh
> >>>>
> >>>> [1] - https://cwiki.apache.org/confluence/display/AIRAVATA/2014/02/23/Brainstorming+Diagrams
> >>>> [2] - https://dev.evernote.com/doc/
> >>>> [3] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=tree;f=airavata-api/thrift-interface-descriptions;hb=HEAD
> >>>> [4] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=blob_plain;f=airavata-api/thrift-interface-descriptions/experimentModel.thrift;hb=HEAD
> >>>> [5] - https://github.com/MisterTea/ZombieDB
> >>>> [6] - https://github.com/Netflix/astyanax
> >>>>
> >>>
> >>> --
> >>> Supun Kamburugamuva
> >>> Member, Apache Software Foundation; http://www.apache.org
> >>> E-mail: [email protected]; Mobile: +1 812 369 6762
> >>> Blog: http://supunk.blogspot.com
> >>>
> >>
> >> --
> >> Milinda Pathirage
> >> PhD Student Indiana University, Bloomington;
> >> E-mail: [email protected]
> >> Web: http://mpathirage.com
> >> Blog: http://blog.mpathirage.com
> >>
> >
> > --
> > System Analyst Programmer
> > PTI Lab
> > Indiana University
> >
