+1 for Sunday afternoon.

Thanks,
Eran Chinthaka Withana

On Fri, Feb 28, 2014 at 5:17 AM, Suresh Marru <[email protected]> wrote:

> Hi Eran,
>
> This is a great idea. I myself owe a few replies on this thread and have
> been unable to take the time to collect my thoughts (and realized I should
> take time to properly articulate the challenges, otherwise we will be
> discussing orthogonal issues).
>
> A hangout will help us brainstorm more comprehensively. We can have it on
> air so we can refer back for archival purposes. How is Sunday afternoon for
> everyone willing to join and contribute?
>
> Thanks,
> Suresh
>
> On Feb 28, 2014, at 1:45 AM, Eran Chinthaka Withana <[email protected]> wrote:
>
> > Hi,
> >
> > Is there any chance of hosting a Google hangout to talk about this? I
> > think with long emails and multiple directions things are getting a
> > little bit confusing in the thread (I'm partly responsible for this :) ).
> > I can join a video chat during a weekend, but let's make sure it's
> > convenient for both east and west coasts :)
> >
> > WDYT?
> >
> > Thanks,
> > Eran Chinthaka Withana
> >
> >
> > On Mon, Feb 24, 2014 at 9:32 AM, Suresh Marru <[email protected]> wrote:
> >
> >> I could respond to each thread in detail, but I see the general sense is
> >> inquiring about the use case, so let me try to explain this and see if it
> >> comes across. I am fully on board with the perceptions of relational vs
> >> NoSQL and also agree that current Airavata needs are not a direct map for
> >> a NoSQL migration. I will summarize the driving motivation:
> >>
> >> Background: The key problem Airavata needs to solve is getting the API
> >> and associated data model right. The problem is that the current
> >> relational database (with the OpenJPA overlay) is severely limiting the
> >> API evolution. Science Gateways by nature are very science domain and
> >> use-case specific. But Airavata is tackling this challenging problem of
> >> providing a generic API which will meet and enable these use-case-centric
> >> integrations.
> >> The issue here is, we are designing an API to handle a wide range of
> >> known (and some foreseen) use cases, but at the same time trying to keep
> >> it simple and yet flexible. The only way we can get to a reasonable,
> >> normalized version of the API is by hands-on programming against the API.
> >> Within the Airavata PMC itself, we can solicit half a dozen different
> >> ways to visualize the data model. And we need a few hackathons with real
> >> end users of Airavata until we find common ground. All of this needs
> >> rapid prototyping. Currently a slight change in the data model is taking
> >> close to two weeks of re-architecting the OpenJPA based registry. There
> >> are many known problems with the current draft of the data model which
> >> have to be put down in the interest of making overall system progress.
> >>
> >> So the driving motivation is certainly not any of the classic NoSQL
> >> needs, but a simple one: can we have a registry which is schema-agnostic
> >> and yet is queryable for most of the fields in the model? Can we try 10
> >> different variants of the data model (hence API) within the next 3 months
> >> with focused hackathons and arrive at a stable 1.0 version of the API?
> >>
> >> Part one of the discussion is successful in that it raised everyone's
> >> eyebrows. Now that we have everyone's attention, what will be a good data
> >> store for Airavata which will meet these needs?
> >>
> >> P.S: Additional background: The API has been in development for close to
> >> 3 years and is falling short of pleasing a majority. Many academic
> >> standardization efforts fail terribly trying to pretend to understand all
> >> use cases and proposing a standard way (which ends up unnecessarily
> >> complex and not usable).
> >> Science by nature is evolutionary, and restricting the capabilities to a
> >> known set of use cases prevents the use of middleware for real scientific
> >> research (and it gets limited to proof of concept demonstrations, papers,
> >> educational use). The only way to meet the challenges of these evolving
> >> needs is to have a framework which can evolve with minimal disruption.
> >>
> >> Great thoughts so far, please keep 'em coming until we can find a
> >> solution driven not by technical fancies but by the real need.
> >>
> >> Cheers,
> >> Suresh
> >>
> >> On Feb 24, 2014, at 11:53 AM, Lahiru Gunathilake <[email protected]> wrote:
> >>
> >>> On Mon, Feb 24, 2014 at 11:20 AM, Milinda Pathirage <[email protected]> wrote:
> >>>
> >>>> I also think that moving to Cassandra or any other NoSQL will add
> >>>> unnecessary complexity to your solution. Also, designing proper (easy
> >>>> to manage changes, easy to query) NoSQL data models is hard (AFAIK, it
> >>>> requires lots of experience and understanding about data structures
> >>>> and queries). Also, migrating from one NoSQL technology to another can
> >>>> require a complete re-write. And current relational databases can
> >>>> handle heavy loads, except Google, Twitter, Amazon and Facebook like
> >>>> loads. I don't think Airavata will see Google and Amazon like loads.
> >>>>
> >>> +1
> >>>
> >>>>
> >>>> If the constant changes to the data model are the problem, I think the
> >>>> best option is to abstract the registry implementation to something
> >>>> like the collections and resources used in WSO2 Registry [1] or
> >>>> something suitable for the Airavata context. That will make it easy to
> >>>> handle changes in the data model.
> >>>>
> >>>> Also, don't let the technologies drive design decisions. It's always
> >>>> better to let use cases drive the design decisions.
> >>>>
> >>> +1
> >>>
> >>> Regards
> >>> Lahiru
> >>>
> >>>>
> >>>> Thanks
> >>>> Milinda
> >>>>
> >>>> [1] http://wso2.com/products/governance-registry/
> >>>>
> >>>>
> >>>> On Mon, Feb 24, 2014 at 10:57 AM, Supun Kamburugamuva <[email protected]> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I'm not trying to discourage you in your exploration of NoSQL
> >>>>> databases. I have the following concern.
> >>>>>
> >>>>> Your database schema is moderately complex (even for an RDBMS it
> >>>>> seems complex) and the data size is relatively small. I'm not sure
> >>>>> about the current tools available, but I think you will need to write
> >>>>> more code to support all your requirements in a NoSQL database. So
> >>>>> writing more code and allowing redundancy to support *relatively
> >>>>> small* and *structured data* doesn't seem right to me. Maybe I'm wrong
> >>>>> and there are better tools in NoSQL than RDBMS, which I doubt.
> >>>>>
> >>>>> Thanks,
> >>>>> Supun..
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Sun, Feb 23, 2014 at 5:20 PM, Suresh Marru <[email protected]> wrote:
> >>>>>
> >>>>>> Hi All,
> >>>>>>
> >>>>>> Airavata is actively migrating to use a Thrift API for the RESTless
> >>>>>> design and to facilitate various language bindings from client
> >>>>>> gateways. The programming language support in Thrift has so far been
> >>>>>> very encouraging. The current architecture looks like Figure 1 at
> >>>>>> [1].
> >>>>>>
> >>>>>> Language specific clients will be released as Thrift SDKs (similar
> >>>>>> to the Evernote SDKs [2]). These clients will be integrated into
> >>>>>> gateway portals which connect to the API Server. The API operations
> >>>>>> broker the simple calls into one or more backend CPI calls (Airavata
> >>>>>> internal component interfaces). An example set of mappings is
> >>>>>> illustrated in Figure 2 at [1].
> >>>>>> The current draft of the Thrift API for version 0.12 is at [3];
> >>>>>> please pay attention to the experiment model at [4].
> >>>>>>
> >>>>>> For the persistent store, we have had a few iterations of the
> >>>>>> Airavata Registry, shifting from a legacy XRegistry to JackRabbit to
> >>>>>> now an OpenJPA based registry. To allow the API and the associated
> >>>>>> data models to evolve, it will be useful to explore object databases
> >>>>>> so we can store the serialized version of Thrift objects directly.
> >>>>>> But it will be nice to have all (or most) of the fields queryable.
> >>>>>> This calls for a more column-family design among the NoSQL
> >>>>>> approaches.
> >>>>>>
> >>>>>> Any recommendations for a registry architecture?
> >>>>>>
> >>>>>> Quickly hacking through, I find the following approach a viable one:
> >>>>>> ZombieDB [5] over Astyanax [6], which talks to Cassandra. Airavata
> >>>>>> can benefit immediately from the replication and reliability of
> >>>>>> Cassandra, and from its scalability in the near future. Some of the
> >>>>>> model operations, like experiment creation, will need strong
> >>>>>> consistency, and most of the monitoring can live with eventual
> >>>>>> consistency.
> >>>>>>
> >>>>>> Critical comments please?
> >>>>>>
> >>>>>> Thanks for your time,
> >>>>>> Suresh
> >>>>>>
> >>>>>> [1] - https://cwiki.apache.org/confluence/display/AIRAVATA/2014/02/23/Brainstorming+Diagrams
> >>>>>> [2] - https://dev.evernote.com/doc/
> >>>>>> [3] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=tree;f=airavata-api/thrift-interface-descriptions;hb=HEAD
> >>>>>> [4] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=blob_plain;f=airavata-api/thrift-interface-descriptions/experimentModel.thrift;hb=HEAD
> >>>>>> [5] - https://github.com/MisterTea/ZombieDB
> >>>>>> [6] - https://github.com/Netflix/astyanax
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> Supun Kamburugamuva
> >>>>> Member, Apache Software Foundation; http://www.apache.org
> >>>>> E-mail: [email protected]; Mobile: +1 812 369 6762
> >>>>> Blog: http://supunk.blogspot.com
> >>>>>
> >>>>
> >>>> --
> >>>> Milinda Pathirage
> >>>> PhD Student Indiana University, Bloomington;
> >>>> E-mail: [email protected]
> >>>> Web: http://mpathirage.com
> >>>> Blog: http://blog.mpathirage.com
> >>>>
> >>>
> >>> --
> >>> System Analyst Programmer
> >>> PTI Lab
> >>> Indiana University
