I happened to look through the data model.
>
> https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=blob_plain;f=airavata-api/thrift-interface-descriptions/experimentModel.thrift;hb=HEAD
How is information on the input data and transfer method stored? Is that
freeform text in DataTransferDetails? Also, is there a place to describe
preprocessing of job input data or a custom submit command?

Sorry for the sidetrack.

Kenneth

On Wed, Feb 26, 2014 at 02:13:46AM +0530, Shameera Rathnayaka wrote:
> Hi all,
>
> Just thinking aloud here, sorry if I am moving this thread in another
> direction.
>
> If we are going to use our own registry implementation, should we
> consider providing a database layer where we can plug in different kinds
> of databases? (Maybe Supun is also suggesting the same in his previous
> reply.) As we are already separating SPIs and APIs for other components,
> we can do the same for the DB implementation too. NoSQL databases like
> Cassandra also have a CQL driver which is identical to the MySQL driver,
> so it is not difficult to implement a pluggable environment.
>
> The WSO2 registry already has the above capability, but as far as I know
> CQL is not yet implemented.
>
> Thanks,
> Shameera.
>
>
> On Wed, Feb 26, 2014 at 1:36 AM, Saminda Wijeratne <[email protected]> wrote:
>
> > Sorry I missed the arrow from Registry to Orchestrator. Thanks for
> > pointing it out Marlon. Updated the arrows and added a legend.
> >
> > The broken line arrow is involved in the MessageBox component, where it
> > gets triggered from time to time without external user intervention.
> > Also there are still some technical details we need to figure out on
> > how the MessageBox will function and expose itself in the new design.
> >
> >
> > On Tue, Feb 25, 2014 at 2:36 PM, Marlon Pierce <[email protected]> wrote:
> >
> > > Please define the solid and broken line arrows. Why doesn't the
> > > orchestrator interact with the registry?
> > >
> > >
> > > Marlon
> > >
> > > On 2/25/14 2:29 PM, Saminda Wijeratne wrote:
> > > > The diagrams @[1] will depict functional requirements (at an
> > > > abstract level) for Airavata from CIPRES and UltraScan gateways.
> > > >
> > > > 1. https://iu.app.box.com/s/52d2dmtfsd8mvlwvu9f3
> > > >
> > > >
> > > > On Mon, Feb 24, 2014 at 3:01 PM, Milinda Pathirage <[email protected]> wrote:
> > > >
> > > >> Hi Suresh,
> > > >>
> > > >> Collections are similar to directories and resources are similar
> > > >> to files. WSO2 Registry implements various functionalities on top
> > > >> of this abstraction. In one of our projects we use this
> > > >> abstraction to implement persistent storage for a text mining
> > > >> workflow. Our text mining workflow starts with a workset, which is
> > > >> a collection of books. We represent this workset as a collection
> > > >> in WSO2 Registry under the user's collection (which can be thought
> > > >> of as a workspace specific to the user that other users can't
> > > >> access). This workset can contain one or more resources or
> > > >> collections. The current implementation only supports a single
> > > >> resource, which is a list of book identifiers. When a user starts
> > > >> a text analysis job on this workset, the job manager reads the
> > > >> necessary information (currently the list of books) from the
> > > >> workset, downloads the necessary files from an API, runs analysis
> > > >> algorithms on the downloaded files, and finally saves the results
> > > >> back in another registry collection. This model is pretty
> > > >> extensible for our use case because if we want some additional
> > > >> files or data in the future we just need to add another resource
> > > >> or another collection to the workset collection. Then the
> > > >> application can decide what to process and what not to process.
> > > >>
> > > >> I think you also need some abstraction like that. I am not sure
> > > >> whether the collections and resources abstraction is the best for
> > > >> you. The level of abstraction will depend on your use cases and
> > > >> requirements.
> > > >>
> > > >> Thanks
> > > >> Milinda
> > > >>
> > > >>
> > > >> On Mon, Feb 24, 2014 at 2:00 PM, Suresh Marru <[email protected]> wrote:
> > > >>
> > > >>> On Feb 24, 2014, at 11:20 AM, Milinda Pathirage <[email protected]> wrote:
> > > >>>
> > > >>>> I also think that moving to Cassandra or any other NoSQL store
> > > >>>> will add unnecessary complexity to your solution. Also,
> > > >>>> designing proper (easy to manage changes, easy to query) NoSQL
> > > >>>> data models is hard (AFAIK, it requires lots of experience and
> > > >>>> understanding of data structures and queries). Also, migrating
> > > >>>> from one NoSQL technology to another can require a complete
> > > >>>> rewrite. And current relational databases can handle heavy
> > > >>>> loads, except Google, Twitter, Amazon and Facebook like loads. I
> > > >>>> don't think Airavata will see Google and Amazon like loads.
> > > >>>>
> > > >>>> If the constant changes to the data model are the problem, I
> > > >>>> think the best option is to abstract the registry implementation
> > > >>>> to something like the collections and resources used in WSO2
> > > >>>> Registry [1] or something suitable for the Airavata context.
> > > >>>> That will make it easy to handle changes in the data model.
> > > >>> You stated it right Milinda, Airavata does not have scaling needs
> > > >>> which will go beyond RDBMS limits, but needs this abstraction.
> > > >>>
> > > >>> Can anyone elaborate more on the collections and resources used in
> > > >>> WSO2 registry?
> > > >>>
> > > >>> Suresh
> > > >>>
> > > >>>> Also don't let the technologies drive design decisions. It's
> > > >>>> always better to let use cases drive the design decisions.
> > > >>>>
> > > >>>> Thanks
> > > >>>> Milinda
> > > >>>>
> > > >>>> [1] http://wso2.com/products/governance-registry/
> > > >>>>
> > > >>>>
> > > >>>> On Mon, Feb 24, 2014 at 10:57 AM, Supun Kamburugamuva <[email protected]> wrote:
> > > >>>>
> > > >>>>> Hi all,
> > > >>>>>
> > > >>>>> I'm not trying to discourage your exploration of NoSQL
> > > >>>>> databases, but I have the following concern.
> > > >>>>>
> > > >>>>> Your database schema is moderately complex - even for an RDBMS
> > > >>>>> it seems complex - and the data size is relatively small. I'm
> > > >>>>> not sure about the tools currently available, but I think you
> > > >>>>> will need to write more code to support all your requirements
> > > >>>>> in a NoSQL database. So writing more code and allowing
> > > >>>>> redundancy to support *relatively small* and *structured data*
> > > >>>>> doesn't seem right to me. Maybe I'm wrong and there are better
> > > >>>>> tools in NoSQL than RDBMS, which I doubt.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> Supun..
> > > >>>>>
> > > >>>>>
> > > >>>>> On Sun, Feb 23, 2014 at 5:20 PM, Suresh Marru <[email protected]> wrote:
> > > >>>>>> Hi All,
> > > >>>>>>
> > > >>>>>> Airavata is actively migrating to a Thrift API for the
> > > >>>>>> RESTless design and to facilitate various language bindings
> > > >>>>>> from client gateways. The programming language support in
> > > >>>>>> Thrift has so far been very encouraging. The current
> > > >>>>>> architecture looks like Figure 1 at [1].
> > > >>>>>>
> > > >>>>>> Language-specific clients will be released as Thrift SDKs
> > > >>>>>> (similar to the Evernote SDKs [2]). These clients will be
> > > >>>>>> integrated into gateway portals which connect to the API
> > > >>>>>> Server. The API operations broker the simple calls into one
> > > >>>>>> or more backend CPI calls (Airavata internal component
> > > >>>>>> interfaces). An example set of mappings is illustrated in
> > > >>>>>> Figure 2 at [1]. The current draft of the Thrift API for
> > > >>>>>> version 0.12 is at [3]; please pay attention to the
> > > >>>>>> experiment model at [4].
> > > >>>>>>
> > > >>>>>> For the persistent store, we had a few iterations of the
> > > >>>>>> Airavata Registry, shifting from a legacy XRegistry to
> > > >>>>>> JackRabbit to the current OpenJPA-based registry. To allow
> > > >>>>>> the API and the associated data models to evolve, it will be
> > > >>>>>> useful to explore object databases so we can store the
> > > >>>>>> serialized version of Thrift objects directly. But it will
> > > >>>>>> be nice to have all (or most) of the fields queryable. This
> > > >>>>>> calls for a more column-family design among the NoSQL
> > > >>>>>> approaches.
> > > >>>>>>
> > > >>>>>> Any recommendations for a registry architecture?
> > > >>>>>>
> > > >>>>>> From some quick hacking, I find the following approach a
> > > >>>>>> viable one: ZombieDB [5] over astyanax [6], which talks to
> > > >>>>>> Cassandra. Airavata can benefit immediately from the
> > > >>>>>> replication and reliability of Cassandra, and from its
> > > >>>>>> scalability in the near future. Some of the model objects,
> > > >>>>>> like experiment creation, will need strong consistency,
> > > >>>>>> while most of the monitoring can live with eventual
> > > >>>>>> consistency.
> > > >>>>>>
> > > >>>>>> Critical comments please?
> > > >>>>>>
> > > >>>>>> Thanks for your time,
> > > >>>>>> Suresh
> > > >>>>>>
> > > >>>>>> [1] - https://cwiki.apache.org/confluence/display/AIRAVATA/2014/02/23/Brainstorming+Diagrams
> > > >>>>>> [2] - https://dev.evernote.com/doc/
> > > >>>>>> [3] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=tree;f=airavata-api/thrift-interface-descriptions;hb=HEAD
> > > >>>>>> [4] - https://git-wip-us.apache.org/repos/asf?p=airavata.git;a=blob_plain;f=airavata-api/thrift-interface-descriptions/experimentModel.thrift;hb=HEAD
> > > >>>>>> [5] - https://github.com/MisterTea/ZombieDB
> > > >>>>>> [6] - https://github.com/Netflix/astyanax
> > > >>>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> Supun Kamburugamuva
> > > >>>>> Member, Apache Software Foundation; http://www.apache.org
> > > >>>>> E-mail: [email protected]; Mobile: +1 812 369 6762
> > > >>>>> Blog: http://supunk.blogspot.com
> > > >>>>>
> > > >>>>
> > > >>>> --
> > > >>>> Milinda Pathirage
> > > >>>> PhD Student Indiana University, Bloomington;
> > > >>>> E-mail: [email protected]
> > > >>>> Web: http://mpathirage.com
> > > >>>> Blog: http://blog.mpathirage.com
> > > >>>
> > > >>
> > > >> --
> > > >> Milinda Pathirage
> > > >> PhD Student Indiana University, Bloomington;
> > > >> E-mail: [email protected]
> > > >> Web: http://mpathirage.com
> > > >> Blog: http://blog.mpathirage.com
> > > >>
> > > >
> > >
> >
>
> --
> Best Regards,
> Shameera Rathnayaka.
>
> email: shameera AT apache.org , shameerainfo AT gmail.com
> Blog : http://shameerarathnayaka.blogspot.com/
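
For anyone skimming the quoted thread above, the collections-and-resources idea Milinda describes (collections behave like directories, resources like files, and a job reads a workset and writes its results into another collection) could be sketched roughly as below. This is only an illustration in Python: the names Resource, Collection, add, get and run_job are made up for this sketch, and it is not the WSO2 Registry API or anything that exists in Airavata.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Resource:
    """Leaf node, comparable to a file (hypothetical name)."""
    name: str
    content: str = ""


@dataclass
class Collection:
    """Container node, comparable to a directory (hypothetical name)."""
    name: str
    children: Dict[str, "Node"] = field(default_factory=dict)

    def add(self, node: "Node") -> "Node":
        # Store the child under its own name and hand it back for chaining.
        self.children[node.name] = node
        return node

    def get(self, path: str) -> "Node":
        # Walk a slash-separated path such as "alice/workset-1/books".
        node: "Node" = self
        for part in path.strip("/").split("/"):
            if not isinstance(node, Collection):
                raise KeyError(path)
            node = node.children[part]
        return node


Node = Union[Resource, Collection]


def run_job(workset: Collection) -> Collection:
    """Stand-in for the job manager: read the book list, emit one result each."""
    results = Collection("results")
    for book_id in workset.get("books").content.splitlines():
        results.add(Resource(book_id, f"analysis of {book_id}"))
    return results


# A per-user workspace holding one workset with a single "books" resource.
root = Collection("root")
user = root.add(Collection("alice"))
workset = user.add(Collection("workset-1"))
workset.add(Resource("books", "book-1\nbook-2"))

# Results are saved back as another collection next to the workset.
user.add(run_job(workset))
```

Adding a new kind of input later would only mean adding another resource or sub-collection under the workset, which is the extensibility argument made in the thread.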
