I'm thinking of injecting code to save GFac job data from our Airavata code itself. So far it seems this is only possible at the Provider level, since the required data is only available at that point. For instance, as I see it, to record GRAM data I need to add code to the execute function in the GramProvider class, where the job id becomes available along with the other data. Is that really the right place, or is there a generic place where we can do this for all providers?
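To make the question concrete, here is a minimal sketch of what a "generic place" could look like: providers only announce job events, and a single listener does the recording. This is an assumption about one possible design, not how GFac currently works. Every name in it (JobEventListener, ReportingProvider, FakeGramProvider, RecordingListener) is hypothetical, and the job id is faked rather than obtained from GRAM.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout: none of these types exist in Airavata.

/** Callback a provider invokes whenever it learns something about a job. */
interface JobEventListener {
    void onJobSubmitted(String jobId, String jobData);
    void onJobStatusChanged(String jobId, String status);
}

/** Base class that owns the listener plumbing, so concrete providers do no persistence. */
abstract class ReportingProvider {
    private final List<JobEventListener> listeners = new ArrayList<>();

    void addListener(JobEventListener listener) {
        listeners.add(listener);
    }

    protected void fireSubmitted(String jobId, String jobData) {
        for (JobEventListener l : listeners) {
            l.onJobSubmitted(jobId, jobData);
        }
    }

    protected void fireStatus(String jobId, String status) {
        for (JobEventListener l : listeners) {
            l.onJobStatusChanged(jobId, status);
        }
    }

    abstract void execute(String jobDescription);
}

/** Stand-in for a GRAM-like provider; a real one would get the job id from the resource. */
class FakeGramProvider extends ReportingProvider {
    @Override
    void execute(String rsl) {
        String jobId = "job-" + Integer.toHexString(rsl.hashCode()); // faked id
        fireSubmitted(jobId, rsl);
        fireStatus(jobId, "FINISHED");
    }
}

/** The single place where saving would happen; here it only collects lines in memory. */
class RecordingListener implements JobEventListener {
    final List<String> log = new ArrayList<>();

    public void onJobSubmitted(String jobId, String jobData) {
        log.add("SUBMITTED " + jobId + " " + jobData);
    }

    public void onJobStatusChanged(String jobId, String status) {
        log.add(status + " " + jobId);
    }
}
```

With something like this, the execute function of each provider would only call fireSubmitted/fireStatus at the point where the job id becomes available, and the code that writes to the registry would live in one listener registered by GFac.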
On Thu, May 30, 2013 at 2:23 PM, Saminda Wijeratne <[email protected]> wrote:
> I updated the names in both the API function sets for GFacJobData and
> GFacJobErrorData in ExecutionManager and ProvenanceManager. If anyone has
> been using these functions outside of the Airavata trunk please update
> your code to reflect this change. Basically change "GFac" to "Application".
>
> Thanks,
> Saminda
>
>
> On Wed, May 29, 2013 at 5:19 PM, Saminda Wijeratne <[email protected]> wrote:
>
>> Hi Guys,
>>
>> Since there is no objection to the suggested name pattern
>> (addApplicationJob(...)) and since we are sort of running short of time
>> we are going ahead with that name.
>>
>> If you have a much better suggestion please respond within today or tomorrow
>> so that we can incorporate any changes for 0.8 release without delay.
>>
>> Thanks,
>> Saminda
>>
>>
>> On Wed, May 22, 2013 at 11:02 PM, Saminda Wijeratne <[email protected]> wrote:
>>
>>> Application. But in our case we may have to use both. eg:
>>> addApplicationJob(...) or addApplicationSubmission(...). The name
>>> addApplication(...) is misleading I think. wdyt?
>>>
>>>
>>> On Wed, May 22, 2013 at 1:43 PM, Amila Jayasekara <[email protected]> wrote:
>>>
>>>> What is more familiar ? "Application" or "Job" ?
>>>>
>>>> Thanks
>>>> Amila
>>>>
>>>>
>>>> On Wed, May 22, 2013 at 11:28 AM, Saminda Wijeratne <[email protected]> wrote:
>>>>
>>>> > On Wed, May 22, 2013 at 11:22 AM, Amila Jayasekara <[email protected]> wrote:
>>>> >
>>>> > > I am a bit concerned about the names. Are we assuming that API users have
>>>> > > knowledge about GFac ?
>>>> > > OR else we can just remove "GFac" substring and have method names like
>>>> > > "void updateJobMetadta(..)"
>>>> > >
>>>> > You have a point there Amila. Perhaps we can name them as "Application"
>>>> > rather than GFac since we already have the notion of an application
>>>> > descriptor in the API. wdyt?
>>>> >
>>>> >
>>>> > > Thanks
>>>> > > Amila
>>>> > >
>>>> > >
>>>> > > On Tue, May 21, 2013 at 11:28 PM, Saminda Wijeratne <[email protected]> wrote:
>>>> > >
>>>> > > > Following API functions are added for the ProvenanceManager[2],
>>>> > > >
>>>> > > > boolean isGFacJobExists(String gfacJobId)
>>>> > > > void addGFacJob(GFacJob job)
>>>> > > > void updateGFacJob(GFacJob job)
>>>> > > > void updateGFacJobStatus(String gfacJobId, GFacJobStatus status)
>>>> > > > void updateGFacJobData(String gfacJobId, String jobdata)
>>>> > > > void updateGFacJobSubmittedTime(String gfacJobId, Date submitted)
>>>> > > > void updateGFacJobCompletedTime(String gfacJobId, Date completed)
>>>> > > > void updateGFacJobMetadta(String gfacJobId, String metadata)
>>>> > > > GFacJob getGFacJob(String gfacJobId)
>>>> > > > List<GFacJob> getGFacJobsForDescriptors(String serviceDescriptionId, String hostDescriptionId, String applicationDescriptionId)
>>>> > > > List<GFacJob> getGFacJobs(String experimentId, String workflowExecutionId, String nodeId)
>>>> > > >
>>>> > > > Thoughts are welcome!!!
>>>> > > >
>>>> > > >
>>>> > > > 2.
>>>> > > > https://svn.apache.org/repos/asf/airavata/trunk/modules/airavata-client/src/main/java/org/apache/airavata/client/api/ProvenanceManager.java
>>>> > > >
>>>> > > >
>>>> > > > On Tue, May 21, 2013 at 5:04 PM, Saminda Wijeratne <[email protected]> wrote:
>>>> > > >
>>>> > > > > But I thought the providers are part of the GFac (not a separate
>>>> > > > > service). If not then the providers should report to GFac. Or else there is
>>>> > > > > no way the GFac knows what status to update, which data to update etc. Does
>>>> > > > > the current GFac implementation support this?
>>>> > > > >
>>>> > > > >
>>>> > > > > On Tue, May 21, 2013 at 4:47 PM, Amila Jayasekara <[email protected]> wrote:
>>>> > > > >
>>>> > > > >> I think that should be handled at a higher layer like the Workflow
>>>> > > > >> Interpreter or GFac. From an FT perspective it is better if providers are
>>>> > > > >> stateless. One reason is we don't have control over some providers, and
>>>> > > > >> there will be many places writing to disk if we implement the persistence
>>>> > > > >> logic at provider level.
>>>> > > > >>
>>>> > > > >> Thanks
>>>> > > > >> Amila
>>>> > > > >>
>>>> > > > >>
>>>> > > > >> On Tue, May 21, 2013 at 4:39 PM, Saminda Wijeratne <[email protected]> wrote:
>>>> > > > >>
>>>> > > > >> > On Tue, May 21, 2013 at 4:36 PM, Amila Jayasekara <[email protected]> wrote:
>>>> > > > >> >
>>>> > > > >> > > On Tue, May 21, 2013 at 3:51 PM, Saminda Wijeratne <[email protected]> wrote:
>>>> > > > >> > >
>>>> > > > >> > > > Thanks for the feedback Amila. A few comments inline
>>>> > > > >> > > >
>>>> > > > >> > > >
>>>> > > > >> > > > On Tue, May 21, 2013 at 12:29 PM, Amila Jayasekara <[email protected]> wrote:
>>>> > > > >> > > >
>>>> > > > >> > > > > Hi Saminda,
>>>> > > > >> > > > >
>>>> > > > >> > > > > Great suggestion. Also +1 for Danushka's proposal to have
>>>> > > > >> > > > > serialized/deserialized data.
>>>> > > > >> > > > > Few suggestions,
>>>> > > > >> > > > > 1. In addition to successful/error statuses we need other statuses for nodes
>>>> > > > >> > > > > and workflows.
>>>> > > > >> > > > > E.g.:
>>>> > > > >> > > > > node - started, submitted, in-progress, failed, successful etc ...
>>>> > > > >> > > > >
>>>> > > > >> > > > Sorry if I was too vague.
>>>> > > > >> > > > Yes we have more fine-grained statuses for workflow
>>>> > > > >> > > > and node[1]. We will have a much more fine-grained level of granularity for a
>>>> > > > >> > > > GFacJob status.
>>>> > > > >> > > > public static enum GFacJobStatus{
>>>> > > > >> > > >     SUBMITTED, //job is submitted, possibly waiting to start executing
>>>> > > > >> > > >     EXECUTING, //submitted job is being executed
>>>> > > > >> > > >     CANCELLED, //job was cancelled
>>>> > > > >> > > >     PAUSED, //job was paused
>>>> > > > >> > > >     WAITING_FOR_DATA, // job is waiting for data to continue executing
>>>> > > > >> > > >     FAILED, // error occurred while job was executing and the job stopped
>>>> > > > >> > > >     FINISHED, // job completed successfully
>>>> > > > >> > > >     UNKNOWN // unknown status. lookup the metadata for more details.
>>>> > > > >> > > > }
>>>> > > > >> > > >
>>>> > > > >> > > >
>>>> > > > >> > > > > 2. This data will be useful in implementing FT and Load Balancing in each
>>>> > > > >> > > > > component. Sometime back we had discussions to make GFac stateless. So who
>>>> > > > >> > > > > is going to populate this data structure and persist it ?
>>>> > > > >> > > > >
>>>> > > > >> > > > That is a very good question... :). This summer is going to be a long
>>>> > > > >> > > > one... ;)
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> > > What I meant is which component is doing persistence ? (GFac or WF
>>>> > > > >> > > Interpreter). Not the actual person who is going to implement it :).
>>>> > > > >> > >
>>>> > > > >> > hih hih....
>>>> > > > >> > Well it's going to be whichever provider is responsible for managing the job
>>>> > > > >> > lifecycle. For example GRAMProvider should be responsible for recording all
>>>> > > > >> > the data relating to the GRAM jobs it's working with.
>>>> > > > >> >
>>>> > > > >> > >
>>>> > > > >> > >
>>>> > > > >> > > >
>>>> > > > >> > > > 1.
>>>> > > > >> > > > https://svn.apache.org/repos/asf/airavata/trunk/modules/workflow-model/workflow-model-core/src/main/java/org/apache/airavata/workflow/model/graph/Node.java
>>>> > > > >> > > >
>>>> > > > >> > > > >
>>>> > > > >> > > > > Thanks
>>>> > > > >> > > > > Amila
>>>> > > > >> > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > > > On Tue, May 21, 2013 at 11:39 AM, Saminda Wijeratne <[email protected]> wrote:
>>>> > > > >> > > > >
>>>> > > > >> > > > > > That is an excellent idea. We can have the job data field be the
>>>> > > > >> > > > > > designated GFac job serialized data. Whichever GFacProvider is used should
>>>> > > > >> > > > > > adhere to it.
>>>> > > > >> > > > > >
>>>> > > > >> > > > > > I'm still inclined to keep the rest of the fields to ease querying for
>>>> > > > >> > > > > > the required data. For example if we wanted all attempts at executing a
>>>> > > > >> > > > > > particular node of a workflow or if we wanted to know which application
>>>> > > > >> > > > > > descriptions are faster in execution or more reliable etc. we can let the
>>>> > > > >> > > > > > query language deal with it. wdyt?
>>>> > > > >> > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > > > On Tue, May 21, 2013 at 11:24 AM, Danushka Menikkumbura <[email protected]> wrote:
>>>> > > > >> > > > > >
>>>> > > > >> > > > > > > Saminda,
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > I think the data container does not need to have a generic format. We can
>>>> > > > >> > > > > > > have a base class that facilitates object serialization/deserialization and
>>>> > > > >> > > > > > > let specific meta data structures implement them as required. We get the
>>>> > > > >> > > > > > > Registry API to serialize objects and save them in a meta data table (with
>>>> > > > >> > > > > > > just two columns?) and to deserialize as they are loaded off the
>>>> > > > >> > > > > > > registry.
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > Danushka
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > On Tue, May 21, 2013 at 8:34 PM, Saminda Wijeratne <[email protected]> wrote:
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > > It has become more and more apparent that saving the data related to
>>>> > > > >> > > > > > > > executing jobs from the GFac can be useful for many reasons such as,
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > debugging
>>>> > > > >> > > > > > > > retrying
>>>> > > > >> > > > > > > > to make smart decisions on reliability/cost etc.
>>>> > > > >> > > > > > > > statistical analysis
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > Thus we thought of saving the data related to GFac jobs in the registry in
>>>> > > > >> > > > > > > > order to facilitate features such as the above in the future.
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > However a GFac job is potentially any sort of computing resource access
>>>> > > > >> > > > > > > > (GRAM/UNICORE/EC2 etc.). Therefore we need to come up with a generalized
>>>> > > > >> > > > > > > > data structure that can hold the data of any type of resource. Following
>>>> > > > >> > > > > > > > are the suggested data to save for a single GFac job execution,
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > *experiment id, workflow instance id, node id* - pinpoint the node execution
>>>> > > > >> > > > > > > > *service, host, application description ids* - pinpoint the descriptors responsible
>>>> > > > >> > > > > > > > *local job id* - the unique job id retrieved/generated per execution [PRIMARY KEY]
>>>> > > > >> > > > > > > > *job data* - data related to executing the job (eg: the rsl in GRAM)
>>>> > > > >> > > > > > > > *submitted, completed time*
>>>> > > > >> > > > > > > > *completed status* - whether the job was successful or ran into errors etc.
>>>> > > > >> > > > > > > > *metadata* - custom field to add anything user wants
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > Your feedback is most welcome. The API related changes will also be
>>>> > > > >> > > > > > > > discussed once we have a proper data structure. We are hoping to implement
>>>> > > > >> > > > > > > > this within next few days.
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > Thanks,
>>>> > > > >> > > > > > > > Saminda
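For reference, the fields suggested in the original proposal above could be held in a record roughly like the one below. This is only a sketch under assumptions: JobRecord, JobStatus and InMemoryJobStore are hypothetical names rather than the classes in the Airavata registry, the status values are copied from the GFacJobStatus enum quoted earlier in the thread, and an in-memory map stands in for the registry table.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration; not the Airavata registry classes.

/** Status values taken from the GFacJobStatus enum quoted in the thread. */
enum JobStatus { SUBMITTED, EXECUTING, CANCELLED, PAUSED, WAITING_FOR_DATA, FAILED, FINISHED, UNKNOWN }

/** One row per job execution, keyed by the local job id. */
class JobRecord {
    final String jobId; // primary key
    final String experimentId, workflowInstanceId, nodeId;
    final String serviceDescriptionId, hostDescriptionId, applicationDescriptionId;
    String jobData;            // provider-specific payload, e.g. the RSL for GRAM
    Date submittedTime, completedTime;
    JobStatus status = JobStatus.UNKNOWN;
    String metadata;           // free-form custom field

    JobRecord(String jobId, String experimentId, String workflowInstanceId, String nodeId,
              String serviceDescriptionId, String hostDescriptionId, String applicationDescriptionId) {
        this.jobId = jobId;
        this.experimentId = experimentId;
        this.workflowInstanceId = workflowInstanceId;
        this.nodeId = nodeId;
        this.serviceDescriptionId = serviceDescriptionId;
        this.hostDescriptionId = hostDescriptionId;
        this.applicationDescriptionId = applicationDescriptionId;
    }
}

/** In-memory stand-in for the registry table. */
class InMemoryJobStore {
    private final Map<String, JobRecord> jobs = new LinkedHashMap<>();

    boolean exists(String jobId) {
        return jobs.containsKey(jobId);
    }

    void add(JobRecord job) {
        if (exists(job.jobId)) {
            throw new IllegalArgumentException("duplicate job id: " + job.jobId);
        }
        jobs.put(job.jobId, job);
    }

    void updateStatus(String jobId, JobStatus status) {
        JobRecord job = jobs.get(jobId);
        if (job == null) {
            throw new IllegalArgumentException("unknown job id: " + jobId);
        }
        job.status = status;
    }

    /** All attempts at executing one node of one workflow instance, in insertion order. */
    List<JobRecord> findByNode(String experimentId, String workflowInstanceId, String nodeId) {
        List<JobRecord> result = new ArrayList<>();
        for (JobRecord job : jobs.values()) {
            if (job.experimentId.equals(experimentId)
                    && job.workflowInstanceId.equals(workflowInstanceId)
                    && job.nodeId.equals(nodeId)) {
                result.add(job);
            }
        }
        return result;
    }
}
```

Keeping the ids as separate fields next to the opaque job data is what makes a lookup such as "all attempts for a particular node" possible without deserializing each job's data.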
