Re: Object Database Suggestions for Airavata Registry

Milinda Pathirage Mon, 24 Feb 2014 12:02:26 -0800

Hi Suresh,

Collections are similar to directories and resources are similar to files.
WSO2 Registry implements various functionality on top of this
abstraction. In one of our projects we use this abstraction to implement
persistent storage for a text mining workflow. Our text mining workflow
starts with a workset, which is a collection of books. We represent this
workset as a collection in WSO2 Registry under the user's collection (which
can be thought of as a workspace specific to that user; other users can't
access it). A workset can contain one or more resources or
collections. The current implementation only supports a single resource,
which is a list of book identifiers. When a user starts a text analysis job
on this workset, the job manager reads the necessary information (currently
the list of books) from the workset, downloads the necessary files from an
API, runs the analysis algorithms on the downloaded files, and finally saves
the results in another registry collection. This model is pretty extensible
for our use case because if we want some additional files or data in the
future we just need to add another resource or another collection to the
workset collection. The application can then decide what to process and
what to skip.
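
As a rough illustration of the workset flow described above, here is a
minimal in-memory sketch in Python. It is not the WSO2 Registry API: the
Registry class, the run_job function, the /users/... path layout, and the
fetch and analyze callables are all hypothetical names made up for this
example.

```python
class Registry:
    """Hypothetical in-memory stand-in for a collections/resources registry."""

    def __init__(self):
        # Nested dicts play the role of collections; any other value is a resource.
        self._root = {}

    @staticmethod
    def _split(path):
        return [part for part in path.split("/") if part]

    def _walk(self, parts, create=False):
        node = self._root
        for part in parts:
            if part not in node:
                if not create:
                    raise KeyError(part)
                node[part] = {}
            node = node[part]
            if not isinstance(node, dict):
                raise ValueError(f"{part!r} is a resource, not a collection")
        return node

    def put(self, path, content):
        """Store a resource, creating parent collections as needed."""
        *parents, name = self._split(path)
        self._walk(parents, create=True)[name] = content

    def get(self, path):
        """Return the resource stored at path."""
        *parents, name = self._split(path)
        return self._walk(parents)[name]

    def list(self, path):
        """Return the sorted names of the children of a collection."""
        return sorted(self._walk(self._split(path)))


def run_job(registry, user, workset, fetch, analyze):
    """Read the book list from a workset, analyze each book, save the results."""
    base = f"/users/{user}/worksets/{workset}"
    book_ids = registry.get(f"{base}/books").split()
    results = {book_id: analyze(fetch(book_id)) for book_id in book_ids}
    for book_id, result in results.items():
        # Results go into a separate collection, not back into the workset.
        registry.put(f"/users/{user}/results/{workset}/{book_id}", result)
    return results


registry = Registry()
# The workset is a collection holding a single resource: the list of book ids.
registry.put("/users/alice/worksets/ws1/books", "book-1\nbook-2")
run_job(registry, "alice", "ws1",
        fetch=lambda book_id: f"text of {book_id}",  # stands in for the download API
        analyze=len)                                 # stands in for the analysis
```

Adding another kind of input later only means putting another resource or
collection under the workset path; run_job keeps reading just the entries
it knows about.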


I think you also need some abstraction like that. I am not sure whether the
collections and resources abstraction is the best one for you. The level of
abstraction will depend on your use cases and requirements.

Thanks
Milinda




On Mon, Feb 24, 2014 at 2:00 PM, Suresh Marru <[email protected]> wrote:

> On Feb 24, 2014, at 11:20 AM, Milinda Pathirage <
> [email protected]> wrote:
>
> > I also think that moving to Cassandra or any other NoSQL store will add
> > unnecessary complexity to your solution. Designing proper NoSQL data
> > models (easy to change, easy to query) is also hard; AFAIK it requires
> > a lot of experience and understanding of data structures and queries.
> > Migrating from one NoSQL technology to another can require a complete
> > rewrite. And current relational databases can handle heavy loads, short
> > of Google, Twitter, Amazon, or Facebook scale. I don't think Airavata
> > will see Google- or Amazon-like loads.
> >
> > If constant changes to the data model are the problem, I think the best
> > option is to abstract the registry implementation behind something like
> > the collections and resources used in WSO2 Registry [1], or something
> > suitable for the Airavata context. That will make it easy to handle
> > changes in the data model.
>
> You stated it right, Milinda: Airavata does not have scaling needs that
> go beyond RDBMS limits, but it does need this abstraction.
>
> Can anyone elaborate more on the collections and resources used in WSO2
> Registry?
>
> Suresh
>
> >
> > Also, don't let the technologies drive design decisions. It's always
> > better to let use cases drive the design decisions.
> >
> > Thanks
> > Milinda
> >
> > [1] http://wso2.com/products/governance-registry/
> >
> >
> > On Mon, Feb 24, 2014 at 10:57 AM, Supun Kamburugamuva <[email protected]>
> > wrote:
> >
> >> Hi all,
> >>
> >> I'm not trying to discourage you from exploring NoSQL databases, but I
> >> have the following concern.
> >>
> >> Your database schema is moderately complex (it seems complex even for
> >> an RDBMS) and the data size is relatively small. I'm not sure about the
> >> tools currently available, but I think you will need to write more code
> >> to support all your requirements in a NoSQL database. Writing more code
> >> and allowing redundancy to support *relatively small* and *structured
> >> data* doesn't seem right to me. Maybe I'm wrong and there are better
> >> tools in NoSQL than in RDBMS, which I doubt.
> >>
> >> Thanks,
> >> Supun..
> >>
> >>
> >>
> >> On Sun, Feb 23, 2014 at 5:20 PM, Suresh Marru <[email protected]>
> >> wrote:
> >>
> >>> Hi All,
> >>>
> >>> Airavata is actively migrating to a Thrift API for the RESTless design
> >>> and to facilitate various language bindings from client gateways. The
> >>> programming language support in Thrift has so far been very
> >>> encouraging. The current architecture looks like Figure 1 at [1].
> >>>
> >>> Language-specific clients will be released as Thrift SDKs (similar to
> >>> the Evernote SDKs [2]). These clients will be integrated into gateway
> >>> portals, which connect to the API Server. The API operations broker the
> >>> simple calls into one or more backend CPI calls (Airavata internal
> >>> component interfaces). An example set of mappings is illustrated in
> >>> Figure 2 at [1]. The current draft of the Thrift API for version 0.12
> >>> is at [3]; please pay attention to the experiment model at [4].
> >>>
> >>> For the persistent store, we have had a few iterations of the Airavata
> >>> Registry, shifting from a legacy XRegistry to JackRabbit to the current
> >>> OpenJPA-based registry. To allow the API and the associated data models
> >>> to evolve, it will be useful to explore object databases so we can
> >>> store the serialized version of Thrift objects directly. But it would
> >>> be nice to have all (or most) of the fields queryable. This calls for a
> >>> more column-family-oriented design in any NoSQL approach.
> >>>
> >>> Any recommendations for a registry architecture?
> >>>
> >>> From some quick hacking, I find the following approach viable:
> >>> ZombieDB [5] over Astyanax [6], which talks to Cassandra. Airavata can
> >>> benefit immediately from the replication and reliability of Cassandra,
> >>> and from its scalability in the near future. Some model objects, such
> >>> as those for experiment creation, will need strong consistency, while
> >>> most of the monitoring can live with eventual consistency.
> >>>
> >>> Critical comments please?
> >>>
> >>> Thanks for your time,
> >>> Suresh
> >>>
> >>> [1] - https://cwiki.apache.org/confluence/display/AIRAVATA/2014/02/23/Brainstorming+Diagrams
> >>> [2] - https://dev.evernote.com/doc/
> >>> [3] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=tree;f=airavata-api/thrift-interface-descriptions;hb=HEAD
> >>> [4] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=blob_plain;f=airavata-api/thrift-interface-descriptions/experimentModel.thrift;hb=HEAD
> >>> [5] - https://github.com/MisterTea/ZombieDB
> >>> [6] - https://github.com/Netflix/astyanax
> >>>
> >>>
> >>
> >>
> >> --
> >> Supun Kamburugamuva
> >> Member, Apache Software Foundation; http://www.apache.org
> >> E-mail: [email protected];  Mobile: +1 812 369 6762
> >> Blog: http://supunk.blogspot.com
> >>
> >
> >
> >
> > --
> > Milinda Pathirage
> > PhD Student Indiana University, Bloomington;
> > E-mail: [email protected]
> > Web: http://mpathirage.com
> > Blog: http://blog.mpathirage.com
>
>


-- 
Milinda Pathirage
PhD Student Indiana University, Bloomington;
E-mail: [email protected]
Web: http://mpathirage.com
Blog: http://blog.mpathirage.com
