Re: Object Database Suggestions for Airavata Registry

Marlon Pierce Tue, 25 Feb 2014 11:37:26 -0800

Please define the solid and broken line arrows.  Why doesn't the
orchestrator interact with the registry?



Marlon

On 2/25/14 2:29 PM, Saminda Wijeratne wrote:
> The diagrams at [1] depict the functional requirements (at an
> abstract level) for Airavata from the CIPRES and UltraScan gateways.
>
> 1. https://iu.app.box.com/s/52d2dmtfsd8mvlwvu9f3
>
>
> On Mon, Feb 24, 2014 at 3:01 PM, Milinda Pathirage <
> [email protected]> wrote:
>
>> Hi Suresh,
>>
>> Collections are similar to directories and resources are similar to files.
>> WSO2 Registry implements various functionalities on top of this
>> abstraction. In one of our projects we use this abstraction to implement
>> persistent storage for a text mining workflow. Our text mining workflow
>> starts with a workset, which is a collection of books. We represent this
>> workset as a collection in WSO2 Registry under the user's collection (which
>> can be thought of as a workspace specific to the user that other users
>> can't access). This workset can contain one or more resources or
>> collections. The current implementation only supports a single resource,
>> which is a list of book identifiers. When a user starts a text analysis job
>> on this workset, the job manager reads the necessary information (currently
>> the list of books) from the workset, downloads the necessary files from an
>> API, runs analysis algorithms on the downloaded files and finally saves the
>> results back in another registry collection. This model is pretty
>> extensible for our use case because if we want some additional files or
>> data in the future we just need to add another resource or another
>> collection to the workset collection. Then the application can decide what
>> to process and what not to process.
>>
>> I think you also need some abstraction like that. I am not sure whether
>> the collections-and-resources abstraction is the best one for you. The
>> level of abstraction will depend on your use cases and requirements.
>>
>> Thanks
>> Milinda
>>
>>
>>
>>
>> On Mon, Feb 24, 2014 at 2:00 PM, Suresh Marru <[email protected]> wrote:
>>
>>> On Feb 24, 2014, at 11:20 AM, Milinda Pathirage <
>>> [email protected]> wrote:
>>>
>>>> I also think that moving to Cassandra or any other NoSQL store will add
>>>> unnecessary complexity to your solution. Also, designing proper (easy to
>>>> manage changes, easy to query) NoSQL data models is hard (AFAIK, it
>>>> requires lots of experience and understanding of data structures and
>>>> queries). Also, migrating from one NoSQL technology to another can
>>>> require a complete rewrite. And current relational databases can handle
>>>> heavy loads, except for Google-, Twitter-, Amazon- and Facebook-like
>>>> loads. I don't think Airavata will see Google- and Amazon-like loads.
>>>>
>>>> If constant change to the data model is the problem, I think the best
>>>> option is to abstract the registry implementation to something like the
>>>> collections and resources used in WSO2 Registry [1] or something suitable
>>>> for the Airavata context. That will make it easy to handle changes in the
>>>> data model.
>>> You stated it right, Milinda: Airavata does not have scaling needs that
>>> will go beyond RDBMS limits, but it needs this abstraction.
>>>
>>> Can anyone elaborate more on the collections and resources used in WSO2
>>> Registry?
>>>
>>> Suresh
>>>
>>>> Also, don't let the technologies drive design decisions. It's always
>>>> better to let use cases drive the design decisions.
>>>>
>>>> Thanks
>>>> Milinda
>>>>
>>>> [1] http://wso2.com/products/governance-registry/
>>>>
>>>>
>>>> On Mon, Feb 24, 2014 at 10:57 AM, Supun Kamburugamuva <
>> [email protected]
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm not trying to discourage you from exploring NoSQL databases, but I
>>>>> have the following concern.
>>>>>
>>>>> Your database schema is moderately complex (it seems complex even for an
>>>>> RDBMS) and the data size is relatively small. I'm not sure about the
>>>>> tools currently available, but I think you will need to write more code
>>>>> to support all your requirements in a NoSQL database. Writing more code
>>>>> and allowing redundancy to support *relatively small* and *structured
>>>>> data* doesn't seem right to me. Maybe I'm wrong and there are better
>>>>> tools in NoSQL than in RDBMS, which I doubt.
>>>>>
>>>>> Thanks,
>>>>> Supun..
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Feb 23, 2014 at 5:20 PM, Suresh Marru <[email protected]>
>>> wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> Airavata is actively migrating to a Thrift API for the RESTless design
>>>>>> and to facilitate various language bindings from client gateways. The
>>>>>> programming language support in Thrift has so far been very
>>>>>> encouraging. The current architecture looks like Figure 1 at [1].
>>>>>>
>>>>>> Language-specific clients will be released as Thrift SDKs (similar to
>>>>>> the Evernote SDKs [2]). These clients will be integrated into gateway
>>>>>> portals which connect to the API Server. The API operations broker the
>>>>>> simple calls into one or more backend CPI calls (Airavata internal
>>>>>> component interfaces). An example set of mappings is illustrated in
>>>>>> Figure 2 at [1]. The current draft of the Thrift API for version 0.12
>>>>>> is at [3]; please pay attention to the experiment model at [4].
>>>>>>
>>>>>> For the persistent store, we have had a few iterations of the Airavata
>>>>>> Registry, shifting from a legacy XRegistry to JackRabbit to the current
>>>>>> OpenJPA-based registry. To allow the API and the associated data models
>>>>>> to evolve, it will be useful to explore object databases so we can
>>>>>> store the serialized version of Thrift objects directly. But it will be
>>>>>> nice to have all (or most) of the fields queryable. This calls for a
>>>>>> more column-family design in any NoSQL approach.
>>>>>>
>>>>>> Any recommendations for a registry architecture?
>>>>>>
>>>>>> From some quick hacking, I find the following approach a viable one:
>>>>>> ZombieDB [5] over Astyanax [6], which talks to Cassandra. Airavata can
>>>>>> benefit immediately from the replication and reliability of Cassandra,
>>>>>> and from its scalability in the near future. Some of the model objects,
>>>>>> like experiment creation, will need strong consistency, while most of
>>>>>> the monitoring can live with eventual consistency.
>>>>>>
>>>>>> Critical comments please?
>>>>>>
>>>>>> Thanks for your time,
>>>>>> Suresh
>>>>>>
>>>>>> [1] - https://cwiki.apache.org/confluence/display/AIRAVATA/2014/02/23/Brainstorming+Diagrams
>>>>>> [2] - https://dev.evernote.com/doc/
>>>>>> [3] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=tree;f=airavata-api/thrift-interface-descriptions;hb=HEAD
>>>>>> [4] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=blob_plain;f=airavata-api/thrift-interface-descriptions/experimentModel.thrift;hb=HEAD
>>>>>> [5] - https://github.com/MisterTea/ZombieDB
>>>>>> [6] - https://github.com/Netflix/astyanax
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Supun Kamburugamuva
>>>>> Member, Apache Software Foundation; http://www.apache.org
>>>>> E-mail: [email protected];  Mobile: +1 812 369 6762
>>>>> Blog: http://supunk.blogspot.com
>>>>>
>>>>
>>>>
>>>> --
>>>> Milinda Pathirage
>>>> PhD Student Indiana University, Bloomington;
>>>> E-mail: [email protected]
>>>> Web: http://mpathirage.com
>>>> Blog: http://blog.mpathirage.com
>>>
>>
>> --
>> Milinda Pathirage
>> PhD Student Indiana University, Bloomington;
>> E-mail: [email protected]
>> Web: http://mpathirage.com
>> Blog: http://blog.mpathirage.com
>>
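
The collections-and-resources abstraction Milinda describes above can be
sketched roughly as follows. This is only an in-memory illustration of the
idea (collections behave like directories, resources like files); it is not
the WSO2 Registry API, and the class names, the put/get methods and the
example paths are all assumptions made up for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Resource:
    """A leaf node holding content, analogous to a file."""
    name: str
    content: object


@dataclass
class Collection:
    """A container of resources and nested collections, analogous to a directory."""
    name: str
    children: dict[str, "Collection | Resource"] = field(default_factory=dict)

    def put(self, path: str, content: object) -> Resource:
        """Store content at a slash-separated path, creating collections as needed."""
        *parents, leaf = [p for p in path.split("/") if p]
        node = self
        for part in parents:
            child = node.children.setdefault(part, Collection(part))
            if not isinstance(child, Collection):
                raise ValueError(f"{part!r} is a resource, not a collection")
            node = child
        resource = Resource(leaf, content)
        node.children[leaf] = resource
        return resource

    def get(self, path: str) -> "Collection | Resource":
        """Return the collection or resource at a slash-separated path."""
        node: "Collection | Resource" = self
        for part in [p for p in path.split("/") if p]:
            if not isinstance(node, Collection):
                raise KeyError(path)
            node = node.children[part]
        return node


# Hypothetical workset layout, loosely following the text mining example.
root = Collection("root")
root.put("users/alice/workset-1/books", ["book-001", "book-002"])
# Adding new data later needs no schema change, only another resource.
root.put("users/alice/workset-1/stopwords", ["the", "a"])
# Results go into a separate collection.
root.put("users/alice/results-1/summary", {"tokens": 1234})

books = root.get("users/alice/workset-1/books").content
workset = root.get("users/alice/workset-1")
```

The point of the design is that the storage layer only knows about paths,
collections and resources, so a change in what a workset contains does not
ripple into the persistence code.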

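Suresh's requirement above (store serialized objects directly, but keep all
or most fields queryable) does not by itself require a NoSQL store. A
minimal sketch on a relational database follows, under loud assumptions:
JSON stands in for Thrift serialization, SQLite stands in for the registry
database, and the table names, the save/find helpers and the experiment
fields are invented for illustration rather than taken from
experimentModel.thrift.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per object: the serialized body is stored as an opaque blob of text.
conn.execute(
    "CREATE TABLE objects (id TEXT PRIMARY KEY, kind TEXT NOT NULL, body TEXT NOT NULL)"
)
# One row per scalar field, so fields stay queryable without a fixed schema.
conn.execute(
    "CREATE TABLE fields (object_id TEXT NOT NULL, name TEXT NOT NULL, value TEXT, "
    "PRIMARY KEY (object_id, name))"
)
conn.execute("CREATE INDEX idx_fields ON fields (name, value)")


def save(obj_id, kind, obj):
    """Store the serialized object and re-index its top-level scalar fields."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO objects VALUES (?, ?, ?)",
            (obj_id, kind, json.dumps(obj)),
        )
        conn.execute("DELETE FROM fields WHERE object_id = ?", (obj_id,))
        conn.executemany(
            "INSERT INTO fields VALUES (?, ?, ?)",
            [
                (obj_id, key, str(value))
                for key, value in obj.items()
                if isinstance(value, (str, int, float))
            ],
        )


def find(kind, name, value):
    """Return deserialized objects of a kind whose indexed field equals value."""
    rows = conn.execute(
        "SELECT o.body FROM objects o JOIN fields f ON f.object_id = o.id "
        "WHERE o.kind = ? AND f.name = ? AND f.value = ? ORDER BY o.id",
        (kind, name, str(value)),
    )
    return [json.loads(body) for (body,) in rows]


# Hypothetical experiment records.
save("exp-1", "experiment", {"name": "run A", "owner": "alice", "status": "CREATED"})
save("exp-2", "experiment", {"name": "run B", "owner": "bob", "status": "CREATED"})
save("exp-3", "experiment", {"name": "run C", "owner": "alice", "status": "COMPLETED"})
# A later model version can add a field without any schema migration.
save("exp-3", "experiment",
     {"name": "run C", "owner": "alice", "status": "COMPLETED", "queue": "normal"})

alice_runs = find("experiment", "owner", "alice")
queued = find("experiment", "queue", "normal")
```

The trade-off is the one raised in the thread: the field index is redundant
with the serialized body and has to be kept in sync by the application.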