Hi Chathuri,

On Fri, Jan 17, 2014 at 11:40 AM, Chathuri Wimalasena <[email protected]> wrote:

> Orchestrator table has only the current state (updated state). Previous
> statuses should be saved in the GFac_Job_Status table.
>

Since the order of the steps is defined, do we need to store the previous
states?

Regards
Lahiru

>
> Regards,
> Chathuri
>
>
> On Fri, Jan 17, 2014 at 11:25 AM, Sachith Withana <[email protected]> wrote:
>
>> Thanks Saminda for this informative review.
>>
>> In the case of multi-threaded vs single-threaded, where should we
>> have the synchronization enforced?
>> To my knowledge, the NewJobWorkers (getting new jobs and submitting them)
>> and the HangedJobWorkers are accessing the Orchestrator table to select the
>> new and hanged jobs.
>> Right now, the NewJobWorkers are getting all the accepted jobs at once.
>> It's not focussed on one experiment.
>>
>> We need to reflect the changes in the GFac job statuses in the
>> Orchestrator table as well. So every time the status of a job changes
>> through the GFac, it will be accessing the Orchestrator table as well.
>> (I've sent an email previously describing the scenario.)
>>
>>
>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne
>> <[email protected]> wrote:
>>
>>> Following are a few thoughts I had during my review of the component,
>>>
>>> *Multi-threaded vs single threaded*
>>> If we are going to have multi-threaded job submission the implementation
>>> should work on handling race conditions. Essentially JobSubmitter should be
>>> able to "lock" an experiment request before continuing processing that
>>> request so that other JobSubmitters accessing the experiment requests at the
>>> same time would skip it.
>>>
>>> *Orchestrator service*
>>> We might want to think of the possibility in future where we will be
>>> having multiple deployments of an Airavata service. This could particularly
>>> be true for SciGaP.
>>> We may have to think how some of the internal data
>>> structures/SPIs should be updated to accommodate such requirements in future.
>>>
>>> *Orchestrator Component configurations*
>>> I see a lot of places where the orchestrator can have configurations. I
>>> think it's too early to finalize them, but I think we can start refactoring
>>> them out perhaps to the airavata-server.properties. I'm also seeing the
>>> orchestrator is now hardcoded to use default/admin gateway and username. I
>>> think it should come from the request itself.
>>>
>>> *Visibility of API functions*
>>> I think initialize(), shutdown() and startJobSubmitter() functions
>>> should not be part of the API because I don't see a scenario where the
>>> gateway developer would be responsible for using them. They serve a more
>>> internal purpose of managing the orchestrator component IMO. As Amila
>>> pointed out so long ago (wink) functions that do not concern outside
>>> parties should not be used as part of the API.
>>>
>>> *Return values of Orchestrator API*
>>> IMO unless it is specifically required to do so I think the functions
>>> do not necessarily need to return anything other than throw exceptions
>>> when needed. For example the launchExperiment can simply return void if all
>>> is successful and throw an exception if something fails. Handling issues
>>> with a try catch is not only simpler but also the explanations are readily
>>> available for the user.
>>>
>>> *Data persisted in registry*
>>> ExperimentRequest.getUsername() : I think we should clarify what this
>>> username denotes. In current API, in experiment submission we consider two
>>> types of users. Submission user (the user who submits the experiment to the
>>> Airavata Server - this is inferred by the request itself) and the execution
>>> user (the user who correlates to the application executions of the gateway -
>>> thus this user can be a different user for different gateways, eg: community
>>> user, gateway user).
>>> I think we should persist the date/time of the experiment request as
>>> well.
>>> Also when retrying API functions after a failure in a
>>> previous attempt there should be a way to not repeat already performed
>>> steps or gracefully roll back and redo those required steps as necessary.
>>> While such actions could be transparent to the user sometimes it might make
>>> sense to allow the user to be notified of success/failure of a retry. However
>>> this might mean keeping additional records at the registry level.
>>>
>>>
>>
>>
>> --
>> Thanks,
>> Sachith Withana
>>
>>
>

--
System Analyst Programmer
PTI Lab
Indiana University
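
To make the retry point in Saminda's review concrete, here is a minimal sketch of how completed steps might be recorded so that a retry skips them. Everything in it is hypothetical: the `RetryableLaunch` class, the step names, and the in-memory set that stands in for the "additional records at the registry level" are illustrations, not part of the Airavata orchestrator or registry API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: not the Airavata orchestrator or registry API.
class RetryableLaunch {
    // Stands in for per-experiment step records kept in the registry (assumption).
    private final Set<String> completed = new LinkedHashSet<>();

    /**
     * Runs the steps in map order and skips any step already recorded as done.
     * Returns the names of the steps executed during this attempt.
     * If a step throws, it is not recorded, so the next attempt resumes there.
     */
    List<String> run(Map<String, Runnable> steps) {
        List<String> executed = new ArrayList<>();
        for (Map.Entry<String, Runnable> step : steps.entrySet()) {
            if (completed.contains(step.getKey())) {
                continue; // performed in an earlier attempt, do not repeat
            }
            step.getValue().run();        // may throw and abort this attempt
            completed.add(step.getKey()); // record only after success
            executed.add(step.getKey());
        }
        return executed;
    }
}

class RetryDemo {
    public static void main(String[] args) {
        RetryableLaunch launch = new RetryableLaunch();
        boolean[] submitFails = {true}; // simulate a failure on the first attempt

        Map<String, Runnable> steps = new LinkedHashMap<>();
        steps.put("validate", () -> System.out.println("validating request"));
        steps.put("submit", () -> {
            if (submitFails[0]) {
                throw new IllegalStateException("submission failed");
            }
            System.out.println("submitting job");
        });

        try {
            launch.run(steps);
        } catch (IllegalStateException e) {
            System.out.println("first attempt failed: " + e.getMessage());
        }

        submitFails[0] = false;
        System.out.println("retry executed: " + launch.run(steps));
        System.out.println("all steps: " + Arrays.asList("validate", "submit"));
    }
}
```

The sketch only covers the "do not repeat" half of the suggestion. Rolling back steps that were already performed would need a compensating action per step, which is left out here.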

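
On the multi-threaded question raised in the thread, the "lock an experiment request so other JobSubmitters skip it" idea could look roughly like the sketch below. It is only an in-memory illustration under stated assumptions: `ExperimentClaims`, `tryClaim` and the worker names are made up, and a real deployment would presumably need the claim to live in the Orchestrator table (for example as a conditional status update) so that it also holds across multiple service instances.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch only: not Airavata's JobSubmitter or registry API.
class ExperimentClaims {
    // experimentId -> id of the worker that claimed it (in-memory stand-in)
    private final ConcurrentMap<String, String> owners = new ConcurrentHashMap<>();

    /** Returns true only for the single worker that wins the claim. */
    boolean tryClaim(String experimentId, String workerId) {
        // putIfAbsent is atomic: only one caller per key gets null back.
        return owners.putIfAbsent(experimentId, workerId) == null;
    }

    /** Releases the claim, but only if this worker is the current owner. */
    boolean release(String experimentId, String workerId) {
        return owners.remove(experimentId, workerId);
    }
}

class ClaimDemo {
    public static void main(String[] args) {
        ExperimentClaims claims = new ExperimentClaims();
        for (String worker : Arrays.asList("submitter-1", "submitter-2")) {
            for (String experiment : Arrays.asList("exp-A", "exp-B")) {
                if (claims.tryClaim(experiment, worker)) {
                    System.out.println(worker + " submits " + experiment);
                } else {
                    System.out.println(worker + " skips " + experiment);
                }
            }
        }
    }
}
```

With a claim step like this, a worker that fetches all accepted jobs at once can still proceed safely, because it only submits the experiments it managed to claim.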