I could respond to each thread in detail, but the general sense seems to be a
question about the use case, so let me try to explain it and see if it comes
across. I am fully on board with the points made about relational vs. NoSQL,
and I agree that current Airavata needs do not map directly onto a NoSQL
migration. I will summarize the driving motivation:

Background: The key problem Airavata needs to solve is getting the API and the
associated data model right. The current relational database (with the OpenJPA
overlay) is severely limiting the API's evolution. Science gateways are by
nature very specific to a science domain and use case, yet Airavata is tackling
the challenging problem of providing a generic API that will meet and enable
these use-case-centric integrations. We are designing an API to handle a wide
range of known (and some foreseen) use cases while trying to keep it simple yet
flexible. The only way we can arrive at a reasonable, normalized version of the
API is by hands-on programming against it. Within the Airavata PMC itself, we
can solicit half a dozen different ways to visualize the data model, and we
need a few hackathons with real end users of Airavata before we find common
ground. All of this needs rapid prototyping. Currently a slight change in the
data model takes close to two weeks of re-architecting the OpenJPA-based
registry. There are many known problems with the current draft of the data
model that have had to be set aside in the interest of overall system
progress.

So the driving motivation is certainly not any of the classic NoSQL needs, but
a simple one: can we have a registry that is schema-agnostic and yet queryable
on most of the fields in the model? Can we try 10 different variants of the
data model (and hence the API) within the next 3 months with focused
hackathons and arrive at a stable 1.0 version of the API?
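
To make "schema-agnostic yet queryable" concrete, here is a minimal in-memory
sketch in Python. It is only an illustration of the idea, not a proposal for
the actual registry: the class name SchemaAgnosticRegistry, the flatten helper,
and the sample experiment fields are all hypothetical and do not come from the
Airavata code base. Records are stored as free-form nested dicts, and every
leaf field is indexed by its dotted path, so changing the data model needs no
schema migration.

```python
from collections import defaultdict
from typing import Any, Dict, Iterator, List, Tuple


def flatten(doc: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_path, value) pairs for every leaf field of a nested dict."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


class SchemaAgnosticRegistry:
    """Hypothetical registry: stores arbitrary documents, indexes every field."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}
        # (dotted_path, repr(value)) -> ids of documents holding that value.
        # repr() is used so that unhashable values such as lists can be indexed.
        self._index = defaultdict(set)

    def put(self, doc_id: str, doc: Dict[str, Any]) -> None:
        """Insert or overwrite a document; no schema is declared anywhere."""
        self.delete(doc_id)
        self._docs[doc_id] = doc
        for path, value in flatten(doc):
            self._index[(path, repr(value))].add(doc_id)

    def delete(self, doc_id: str) -> None:
        """Remove a document and its index entries, if it exists."""
        old = self._docs.pop(doc_id, None)
        if old is not None:
            for path, value in flatten(old):
                self._index[(path, repr(value))].discard(doc_id)

    def find(self, criteria: Dict[str, Any]) -> List[str]:
        """Return sorted ids of documents matching every path/value pair."""
        matches = None
        for path, value in criteria.items():
            ids = self._index.get((path, repr(value)), set())
            matches = set(ids) if matches is None else matches & ids
        if matches is None:  # empty criteria: match everything
            matches = set(self._docs)
        return sorted(matches)


registry = SchemaAgnosticRegistry()
# Two variants of a made-up "experiment" model coexist without any migration.
registry.put("exp1", {"name": "run-a", "status": "CREATED",
                      "owner": {"gateway": "demo"}})
registry.put("exp2", {"name": "run-b", "status": "CREATED",
                      "scheduling": {"nodes": 4}})
created = registry.find({"status": "CREATED"})
four_nodes = registry.find({"scheduling.nodes": 4})
```

A real store would also need persistence, range queries, and the consistency
guarantees discussed earlier in the thread; the sketch only shows that exact
match queries on arbitrary fields do not require a fixed schema.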

Part one of the discussion was successful in that it raised everyone's
eyebrows. Now that we have everyone's attention, what would be a good data
store for Airavata that meets these needs?

P.S.: Additional background: The API has been in development for close to 3
years and is falling short of pleasing a majority. Many academic
standardization efforts fail terribly by pretending to understand all use
cases and proposing a standard way (which ends up unnecessarily complex and
unusable). Science is by nature evolutionary, and restricting capabilities to
a known set of use cases prevents the use of middleware for real scientific
research (limiting it to proof-of-concept demonstrations, papers, and
educational use). The only way to meet the challenges of these evolving needs
is to have a framework that can evolve with minimal disruption.

Great thoughts so far, please keep ’em coming until we find a solution driven
not by technical fancies but by the real need.

Cheers,
Suresh

On Feb 24, 2014, at 11:53 AM, Lahiru Gunathilake <[email protected]> wrote:

> On Mon, Feb 24, 2014 at 11:20 AM, Milinda Pathirage <
> [email protected]> wrote:
> 
>> I also think that moving to Cassandra or any other NoSQL store will add
>> unnecessary complexity to your solution. Also, designing proper NoSQL data
>> models (easy to change, easy to query) is hard (AFAIK, it requires
>> lots of experience and understanding of data structures and queries).
>> Also, migrating from one NoSQL technology to another can require a complete
>> rewrite. And current relational databases can handle heavy loads, short of
>> Google, Twitter, Amazon, and Facebook scale. I don't think Airavata
>> will see Google- or Amazon-like loads.
>> 
> +1
> 
>> 
>> If constant changes to the data model are the problem, I think the best
>> option is to abstract the registry implementation into something like the
>> collections and resources used in WSO2 Registry [1], or something suitable
>> for the Airavata context. That will make it easy to handle changes in the
>> data model.
>> 
>> Also, don't let the technologies drive design decisions. It's always better to
>> let use cases drive the design decisions.
>> 
> +1
> 
> Regards
> Lahiru
> 
>> 
>> Thanks
>> Milinda
>> 
>> [1] http://wso2.com/products/governance-registry/
>> 
>> 
>> On Mon, Feb 24, 2014 at 10:57 AM, Supun Kamburugamuva <[email protected]> wrote:
>> 
>>> Hi all,
>>> 
>>> I'm not trying to discourage you from exploring NoSQL databases, but I
>>> have the following concern.
>>> 
>>> Your database schema is moderately complex (even for an RDBMS it seems
>>> complex) and the data size is relatively small. I'm not sure about the
>>> current tools available, but I think you will need to write more code to
>>> support all your requirements in a NoSQL database. So writing more code and
>>> allowing redundancy to support *relatively small* and *structured data*
>>> doesn't seem right to me. Maybe I'm wrong and there are better tools in
>>> NoSQL than RDBMS, which I doubt.
>>> 
>>> Thanks,
>>> Supun..
>>> 
>>> 
>>> 
>>> On Sun, Feb 23, 2014 at 5:20 PM, Suresh Marru <[email protected]> wrote:
>>> 
>>>> Hi All,
>>>> 
>>>> Airavata is actively migrating to a Thrift API for the RESTless design
>>>> and to facilitate various language bindings from client gateways. The
>>>> programming language support in Thrift has so far been very encouraging.
>>>> The current architecture looks like Figure 1 at [1].
>>>> 
>>>> Language-specific clients will be released as Thrift SDKs (similar to the
>>>> Evernote SDKs [2]). These clients will be integrated into gateway portals
>>>> which connect to the API Server. The API operations broker the simple calls
>>>> into one or more backend CPI calls (Airavata internal component
>>>> interfaces). An example set of mappings is illustrated in Figure 2 at
>>>> [1]. The current draft of the Thrift API for version 0.12 is at [3]; please
>>>> pay attention to the experiment model at [4].
>>>> 
>>>> For the persistent store, we had a few iterations of the Airavata Registry,
>>>> shifting from a legacy XRegistry to JackRabbit to now an OpenJPA-based
>>>> registry. To allow the API and the associated data models to evolve, it
>>>> will be useful to explore object databases so we can store the serialized
>>>> version of Thrift objects directly. But it will be nice to have all (or
>>>> most) of the fields queryable. This calls for a more column-family design
>>>> among the NoSQL approaches.
>>>> 
>>>> Any recommendations for a registry architecture?
>>>> 
>>>> Quickly hacking through, I find the following approach a viable one:
>>>> ZombieDB [5] over Astyanax [6], which talks to Cassandra. Airavata can
>>>> benefit immediately from the replication and reliability of Cassandra, and
>>>> from its scalability in the near future. Some of the model objects, like
>>>> experiment creation, will need strong consistency, while most of the
>>>> monitoring can live with eventual consistency.
>>>> 
>>>> Critical comments please?
>>>> 
>>>> Thanks for your time,
>>>> Suresh
>>>> 
>>>> [1] - https://cwiki.apache.org/confluence/display/AIRAVATA/2014/02/23/Brainstorming+Diagrams
>>>> [2] - https://dev.evernote.com/doc/
>>>> [3] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=tree;f=airavata-api/thrift-interface-descriptions;hb=HEAD
>>>> [4] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=blob_plain;f=airavata-api/thrift-interface-descriptions/experimentModel.thrift;hb=HEAD
>>>> [5] - https://github.com/MisterTea/ZombieDB
>>>> [6] - https://github.com/Netflix/astyanax
>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> Supun Kamburugamuva
>>> Member, Apache Software Foundation; http://www.apache.org
>>> E-mail: [email protected];  Mobile: +1 812 369 6762
>>> Blog: http://supunk.blogspot.com
>>> 
>> 
>> 
>> 
>> --
>> Milinda Pathirage
>> PhD Student Indiana University, Bloomington;
>> E-mail: [email protected]
>> Web: http://mpathirage.com
>> Blog: http://blog.mpathirage.com
>> 
> 
> 
> 
> -- 
> System Analyst Programmer
> PTI Lab
> Indiana University
