Excellent! Do you have any pointers to where I can look to start reading the code and begin adding the necessary features to the registry?
Thanks!

On Tue, Jun 30, 2015 at 5:05 AM, Supun Nakandala <[email protected]> wrote:

> Hi John,
>
> Even though the current thrift models have only one status entry, in the
> database we maintain all the state transitions (i.e. all the status
> entries). But when retrieving an experiment, process, task, or job, only the
> latest status is returned, based on the creation timestamp. So at the
> registry level we can support your requirement. What is needed are the
> thrift models required to transfer that data via the APIs/CPIs.
>
>
> On Tue, Jun 30, 2015 at 1:28 PM, John Weachock <[email protected]>
> wrote:
>
>> Hi Supun,
>>
>> Sorry for sending this message so late!
>>
>> Last week I discussed a change to the data models with Suresh regarding
>> task / job / experiment / etc. statuses. Currently, each item has a single
>> status ID that points to a status that's updated on every change. However, if
>> each item contained a *list* of status IDs, and each status change
>> created a *new* status entry, we could record data about experiment run
>> times, which could be used in future versions to assist in benchmark and
>> runtime prediction efforts. Additionally, users could be provided with
>> information about the progression of their experiment.
>>
>> Thanks,
>>
>> John
>>
>>
>> On Sun, Jun 14, 2015 at 1:56 PM, Supun Nakandala <[email protected]> wrote:
>>
>>> Hi All,
>>>
>>> I came up with the initial version of the schema for the new experiment
>>> catalog. It is very similar to the existing model and has a few changes:
>>>
>>> 1. In Experiments I have used one text field for email addresses with
>>> the intention of storing a comma-separated email list. The idea was to avoid
>>> another DB table join. Also, in the Errors tables I have used a single text
>>> field for storing parent error ids with the same intention.
>>>
>>> 2. I have used separate tables for ExperimentErrors, ProcessErrors, and
>>> TaskErrors rather than having a single Errors table. The idea is to avoid
>>> the use of composite ids (with some ids null) and to avoid filtering for
>>> the correct type of errors at the code level (for example when retrieving
>>> experiment errors). This also eases data retrieval at the JPA level. I
>>> have used the same concept for the Statuses, Inputs, and Outputs tables.
>>>
>>> 3. Since there are some performance issues in PGA related operations when
>>> retrieving experiment related data, I created a view called
>>> experiment_summaries which underneath joins several tables and gives the
>>> required data in one view. We can create a JPA model for this view and use
>>> it for PGA related (including some of the Admin Dashboard) operations. I
>>> hope this will solve the issue.
>>>
>>> I have attached the schema diagram herewith. Please check it and let me
>>> know if anything is wrong or needs to be changed or improved.
>>>
>>> If things look good, as the next step I would like to suggest that we
>>> brainstorm different queries that we will run on this data and check
>>> whether the data model can support those queries and the expected
>>> performance.
>>>
>>> Thanks
>>> Supun
>>>
>>> On Fri, Jun 12, 2015 at 6:39 PM, Suresh Marru <[email protected]> wrote:
>>>
>>>> Hi All,
>>>>
>>>> With the experience of adapting thrift data models for Airavata in the past
>>>> couple of years, it's time for us to revisit them. The most persistent
>>>> criticism has been that the data models are complex. Next, the data models
>>>> and architecture evolved in parallel and the implementations did not always
>>>> match the intended models. In an effort to address these issues, let's first
>>>> discuss the minimal required data models.
>>>>
>>>> We need to conform the models to the general principle of Experiments
>>>> deriving into a Process or a Workflow. For a single application, a process
>>>> can be directly derived from Experiment Details. For workflows, multiple
>>>> processes are created. Executing a process leads to the creation of multiple
>>>> Tasks. Task is a general type which is enacted at run time based on a
>>>> generic execution sequence of environment setup, data input staging,
>>>> application execution and monitoring, data output staging and environment
>>>> cleanup.
>>>>
>>>> Please review the initial draft:
>>>>
>>>> https://github.com/apache/airavata/tree/master/thrift-interface-descriptions/airavata-data-models
>>>>
>>>> Assume lazy consensus and update the models; let's iteratively review and
>>>> update these thrift IDLs. We don’t yet need to dive into code generation
>>>> until these are close to final.
>>>>
>>>> @Supun, maybe you can start thinking about the database representation
>>>> of these models and assume the details will change but the general
>>>> structure might remain.
>>>>
>>>> Cheers,
>>>> Suresh
>>>>
>>>
>>>
>>>
>>> --
>>> Thank you
>>> Supun Nakandala
>>> Dept. Computer Science and Engineering
>>> University of Moratuwa
>>>
>>
>>
>
>
> --
> Thank you
> Supun Nakandala
> Dept. Computer Science and Engineering
> University of Moratuwa
>
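
For reference, the behaviour discussed in this thread (keep every status entry, return only the latest one by creation timestamp, and derive how long an item spent in each state) could be sketched roughly as below. This is an illustration only, not Airavata code: `StatusEntry`, `latest_status`, `time_in_each_state`, and the state names are hypothetical and are not taken from the thrift models or the registry.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class StatusEntry:
    """One recorded state transition (hypothetical shape, not the thrift model)."""
    state: str
    created_at: datetime


def latest_status(history: List[StatusEntry]) -> Optional[StatusEntry]:
    """Return the most recent entry by creation timestamp, or None if empty."""
    if not history:
        return None
    return max(history, key=lambda entry: entry.created_at)


def time_in_each_state(history: List[StatusEntry]) -> List[Tuple[str, timedelta]]:
    """Pair each state with the time elapsed until the next transition.

    The final state has no successor, so it is not included.
    """
    ordered = sorted(history, key=lambda entry: entry.created_at)
    return [
        (current.state, following.created_at - current.created_at)
        for current, following in zip(ordered, ordered[1:])
    ]


# Illustrative data only; the state names are assumptions.
start = datetime(2015, 6, 30, 12, 0, 0)
history = [
    StatusEntry("CREATED", start),
    StatusEntry("EXECUTING", start + timedelta(minutes=2)),
    StatusEntry("COMPLETED", start + timedelta(minutes=45)),
]

print(latest_status(history).state)  # COMPLETED
for state, duration in time_in_each_state(history):
    print(state, duration)  # CREATED 0:02:00, then EXECUTING 0:43:00
```

With a full history available through the API, the existing "latest status" behaviour stays a one-line lookup, while the per-state durations are what would feed the run-time prediction idea.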
