Re: Object Database Suggestions for Airavata Registry

Eran Chinthaka Withana Thu, 27 Feb 2014 22:47:19 -0800

Hi,

Is there any chance of hosting a Google Hangout to talk about this? I think
with long emails and multiple directions, things are getting a little bit
confusing in this thread (I'm partly responsible for this :) ). I can join a
video chat during a weekend, but let's make sure it's convenient for both
east and west coasts :)


WDYT?

Thanks,
Eran Chinthaka Withana


On Mon, Feb 24, 2014 at 9:32 AM, Suresh Marru <[email protected]> wrote:

> I could respond to each thread in detail, but I see the general question is
> about the use case, so let me try to explain it and see if it comes
> across. I am fully on board with the perceptions of relational vs NoSQL
> and also agree that current Airavata needs are not a direct map for a NoSQL
> migration. I will summarize the driving motivation:
>
> Background: The key problem Airavata needs to solve is getting the API and
> associated data model right. The problem is that the current relational
> database (with an OpenJPA overlay) is severely limiting the API evolution.
> Science Gateways by nature are very science-domain and use-case specific.
> But Airavata is tackling the challenging problem of providing a generic API
> which will meet and enable these use-case-centric integrations. The issue
> here is that we are designing an API to handle a wide range of known (and
> some foreseen) use cases, while at the same time trying to keep it simple
> and yet flexible. The only way we can get to a reasonable, normalized
> version of the API is by hands-on programming against the API. Within the
> Airavata PMC itself, we can solicit half a dozen different ways to
> visualize the data model. And we need a few hackathons with real end users
> of Airavata until we find common ground. All of this needs rapid
> prototyping. Currently a slight change in the data model takes close to two
> weeks of re-architecting the OpenJPA-based registry. There are many known
> problems with the current draft of the data model which have to be set
> aside in the interest of making overall system progress.
>
> So the driving motivation is certainly not any of the classic NoSQL needs,
> but a simple one: can we have a registry which is schema-agnostic and yet
> is queryable on most of the fields in the model? Can we try 10 different
> variants of the data model (hence the API) within the next 3 months with
> focused hackathons and arrive at a stable 1.0 version of the API?
>
> Part one of the discussion was successful in that it raised everyone's
> eyebrows. Now that we have everyone's attention, what would be a good data
> store for Airavata that meets these needs?
>
> P.S.: Additional background: The API has been in development for close to 3
> years and is falling short of pleasing a majority. Many academic
> standardization efforts fail terribly by pretending to understand all
> use cases and proposing a standard way (which ends up unnecessarily complex
> and not usable). Science by nature is evolutionary, and restricting the
> capabilities to a known set of use cases prevents the use of middleware for
> real scientific research (and limits it to proof-of-concept
> demonstrations, papers, and educational use). The only way to meet the
> challenges of these evolving needs is to have a framework which can
> evolve with minimal disruption.
>
> Great thoughts so far, please keep 'em coming until we can find a solution
> driven not by technical fancies but by the real need.
>
> Cheers,
> Suresh
>
> On Feb 24, 2014, at 11:53 AM, Lahiru Gunathilake <[email protected]>
> wrote:
>
> > On Mon, Feb 24, 2014 at 11:20 AM, Milinda Pathirage <
> > [email protected]> wrote:
> >
> >> I also think that moving to Cassandra or any other NoSQL store will add
> >> unnecessary complexity to your solution. Also, designing proper (easy to
> >> manage changes, easy to query) NoSQL data models is hard (AFAIK, it
> >> requires lots of experience and understanding of data structures and
> >> queries). Also, migrating from one NoSQL technology to another can
> >> require a complete rewrite. And current relational databases can handle
> >> heavy loads, except for Google-, Twitter-, Amazon- and Facebook-like
> >> loads. I don't think Airavata will see Google- and Amazon-like loads.
> >>
> > +1
> >
> >>
> >> If constant changes to the data model are the problem, I think the best
> >> option is to abstract the registry implementation to something like the
> >> collections and resources used in WSO2 Registry [1], or something
> >> suitable for the Airavata context. That will make it easy to handle
> >> changes in the data model.
> >>
> >> Also, don't let the technologies drive design decisions. It's always
> >> better to let use cases drive the design decisions.
> >>
> > +1
> >
> > Regards
> > Lahiru
> >
> >>
> >> Thanks
> >> Milinda
> >>
> >> [1] http://wso2.com/products/governance-registry/
> >>
> >>
> >> On Mon, Feb 24, 2014 at 10:57 AM, Supun Kamburugamuva <[email protected]>
> >> wrote:
> >>
> >>> Hi all,
> >>>
> >>> I'm not trying to discourage you on your exploration of NoSQL
> >>> databases. I have the following concern.
> >>>
> >>> Your database schema is moderately complex (even for an RDBMS it seems
> >>> complex) and the data size is relatively small. I'm not sure about the
> >>> current tools available, but I think you will need to write more code
> >>> to support all your requirements in a NoSQL database. So writing more
> >>> code and allowing redundancy to support *relatively small* and
> >>> *structured data* doesn't seem right to me. Maybe I'm wrong and there
> >>> are better tools in NoSQL than RDBMS, which I doubt.
> >>>
> >>> Thanks,
> >>> Supun..
> >>>
> >>>
> >>>
> >>> On Sun, Feb 23, 2014 at 5:20 PM, Suresh Marru <[email protected]>
> >>> wrote:
> >>>
> >>>> Hi All,
> >>>>
> >>>> Airavata is actively migrating to a Thrift API for the RESTless
> >>>> design and to facilitate various language bindings from client
> >>>> gateways. The programming language support in Thrift has so far been
> >>>> very encouraging. The current architecture looks like Figure 1 at [1].
> >>>>
> >>>> Language-specific clients will be released as Thrift SDKs (similar to
> >>>> the Evernote SDKs [2]). These clients will be integrated into gateway
> >>>> portals which connect to the API Server. The API operations broker the
> >>>> simple calls into one or more backend CPI calls (Airavata internal
> >>>> component interfaces). An example set of mappings is illustrated in
> >>>> Figure 2 at [1]. The current draft of the Thrift API for version 0.12
> >>>> is at [3]; please pay attention to the experiment model at [4].
> >>>>
> >>>> For the persistent store, we have had a few iterations of the Airavata
> >>>> Registry, shifting from a legacy XRegistry to JackRabbit to now an
> >>>> OpenJPA-based registry. To allow the API and the associated data
> >>>> models to evolve, it will be useful to explore object databases so we
> >>>> can store the serialized version of Thrift objects directly. But it
> >>>> will be nice to have all (or most) of the fields queryable. This
> >>>> calls for more of a column-family design among the NoSQL approaches.
> >>>>
> >>>> Any recommendations for a registry architecture?
> >>>>
> >>>> Quickly hacking through, I find the following approach a viable one:
> >>>> ZombieDB [5] over Astyanax [6], which talks to Cassandra. Airavata can
> >>>> benefit immediately from the replication and reliability of Cassandra,
> >>>> and from its scalability in the near future. Some of the model
> >>>> objects, like experiment creation, will need strong consistency, and
> >>>> most of the monitoring can live with eventual consistency.
> >>>>
> >>>> Critical comments please?
> >>>>
> >>>> Thanks for your time,
> >>>> Suresh
> >>>>
> >>>> [1] - https://cwiki.apache.org/confluence/display/AIRAVATA/2014/02/23/Brainstorming+Diagrams
> >>>> [2] - https://dev.evernote.com/doc/
> >>>> [3] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=tree;f=airavata-api/thrift-interface-descriptions;hb=HEAD
> >>>> [4] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=blob_plain;f=airavata-api/thrift-interface-descriptions/experimentModel.thrift;hb=HEAD
> >>>> [5] - https://github.com/MisterTea/ZombieDB
> >>>> [6] - https://github.com/Netflix/astyanax
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> Supun Kamburugamuva
> >>> Member, Apache Software Foundation; http://www.apache.org
> >>> E-mail: [email protected];  Mobile: +1 812 369 6762
> >>> Blog: http://supunk.blogspot.com
> >>>
> >>
> >>
> >>
> >> --
> >> Milinda Pathirage
> >> PhD Student Indiana University, Bloomington;
> >> E-mail: [email protected]
> >> Web: http://mpathirage.com
> >> Blog: http://blog.mpathirage.com
> >>
> >
> >
> >
> > --
> > System Analyst Programmer
> > PTI Lab
> > Indiana University
>
>
