Re: Orchestration Component implementation review

Raminder Singh Fri, 17 Jan 2014 10:06:34 -0800

+1 for returning a JobRequest object with a pre-populated ExperimentID and other 
details. I can extend this object to carry header data as well. That way we can make 
sure the user is setting the right information.
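
A minimal sketch of this idea in Java, under stated assumptions. Only the names createExperiment, launchExperiment and JobRequest come from this thread; the fields, the in-memory map of pending IDs, the class names and the exception type are hypothetical and are not the actual Airavata code. The experiment ID has no setter, so the client cannot replace it, and launchExperiment returns void and throws on a bad request, as suggested in the review quoted below.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class JobRequestSketch {

    /** Hypothetical request object; only the experiment ID is fixed by the orchestrator. */
    static final class JobRequest {
        private final String experimentId;               // set once by createExperiment()
        private String userName;                         // client may override
        private String applicationName = "default-app"; // placeholder default value

        JobRequest(String experimentId, String userName) {
            this.experimentId = experimentId;
            this.userName = userName;
        }

        String getExperimentId() { return experimentId; }
        String getUserName() { return userName; }
        void setUserName(String userName) { this.userName = userName; }
        String getApplicationName() { return applicationName; }
        void setApplicationName(String applicationName) { this.applicationName = applicationName; }
    }

    /** Hypothetical orchestrator facade exposing the two-step API. */
    static final class Orchestrator {
        // Experiment IDs that were issued but not yet launched.
        private final Map<String, Boolean> pending = new ConcurrentHashMap<>();

        JobRequest createExperiment(String userName) {
            String id = UUID.randomUUID().toString();
            pending.put(id, Boolean.TRUE);
            return new JobRequest(id, userName);
        }

        void launchExperiment(JobRequest request) {
            // remove() is atomic, so each issued ID can be launched at most once.
            if (pending.remove(request.getExperimentId()) == null) {
                throw new IllegalArgumentException(
                        "Unknown or already launched experiment: " + request.getExperimentId());
            }
            // Real job submission would start here.
        }
    }

    public static void main(String[] args) {
        Orchestrator orchestrator = new Orchestrator();
        JobRequest request = orchestrator.createExperiment("alice");
        request.setApplicationName("echo");     // override a default
        orchestrator.launchExperiment(request); // returns normally on success
        System.out.println("launched " + request.getExperimentId());
    }
}
```

Because the ID is issued and checked by the same component, a request carrying an ID that did not come from createExperiment() is rejected instead of being silently accepted.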


We have discussed two approaches to a multi-threaded orchestrator: pull-based 
(the request is first saved to the DB and a thread polls to run the jobs) and on-demand 
(the request is served right away and we update the status table to help with job 
management, such as recovery). The multi-threaded implementation needs to evolve, and I 
agree with Saminda about the improvements. 
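
To make the pull-based variant concrete, here is a rough Java sketch, assuming an in-memory map as a stand-in for the status table. The Status values, the pollOnce and runSubmitters helpers and the class name are all hypothetical and do not reflect the current implementation. It also shows one way a JobSubmitter could "lock" a request: an atomic compare-and-set on the status, so that only one thread claims each experiment.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PullBasedSketch {

    enum Status { ACCEPTED, IN_PROGRESS, DONE }

    /** Stand-in for the status table; a real deployment would use the registry DB. */
    static final ConcurrentMap<String, Status> statusTable = new ConcurrentHashMap<>();

    /** Counts how many times a job was actually run. */
    static final AtomicInteger runs = new AtomicInteger();

    /** One polling pass of a hypothetical JobSubmitter thread. */
    static void pollOnce() {
        for (String experimentId : statusTable.keySet()) {
            // Atomic compare-and-set: only one submitter wins the claim,
            // every other submitter sees IN_PROGRESS or DONE and skips it.
            if (statusTable.replace(experimentId, Status.ACCEPTED, Status.IN_PROGRESS)) {
                runs.incrementAndGet(); // job submission would happen here
                statusTable.put(experimentId, Status.DONE);
            }
        }
    }

    /** Saves the requests first, then lets several submitters poll concurrently. */
    static int runSubmitters(int experiments, int submitters) {
        statusTable.clear();
        runs.set(0);
        for (int i = 0; i < experiments; i++) {
            statusTable.put("exp-" + i, Status.ACCEPTED);
        }
        ExecutorService pool = Executors.newFixedThreadPool(submitters);
        for (int i = 0; i < submitters; i++) {
            pool.execute(PullBasedSketch::pollOnce);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("submitters did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for submitters", e);
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("jobs executed: " + runSubmitters(100, 4));
    }
}
```

With a database-backed status table, the same claim could be expressed as a conditional UPDATE on the status column; an experiment left IN_PROGRESS after a crash is then also visible for recovery.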

Thanks
Raminder 

On Jan 17, 2014, at 11:42 AM, Marlon Pierce <[email protected]> wrote:

> I have a little comment on the API.  The two step process that we came
> up with requires the user to first call createExperiment to get an
> experiment ID and then call launchExperiment(JobRequest jobRequest). 
> The jobRequest object should include the experimentID returned by
> createExperiment() but we have no way of enforcing this.
> 
> I think this will be confusing to a developer and may introduce other
> problems.  How about having createExperiment() return a JobRequest
> object with default values, including the correct experimentID?  The
> client code can update these as needed to override defaults and then
> send back to the orchestrator through launchExperiment().
> 
> 
> Marlon
> 
> On 1/17/14 10:32 AM, Saminda Wijeratne wrote:
>> Following are a few thoughts I had during my review of the component:
>> 
>> *Multi-threaded vs single threaded*
>> If we are going to have multi-threaded job submission the implementation
>> should work on handling race conditions. Essentially JobSubmitter should be
>> able to "lock" an experiment request before continuing processing that
>> request so that other JobSubmitters accessing the experiment requests at the
>> same time would skip it.
>> 
>> *Orchestrator service*
>> We might want to consider the possibility that in future we will have
>> multiple deployments of an Airavata service. This could particularly be
>> true for SciGaP. We may have to think about how some of the internal data
>> structures/SPIs should be updated to accommodate such requirements in future.
>> 
>> *Orchestrator Component configurations*
>> I see a lot of places where the orchestrator can have configurations. I
>> think it's too early to finalize them, but I think we can start refactoring
>> them out, perhaps to airavata-server.properties. I'm also seeing that the
>> orchestrator is now hardcoded to use the default/admin gateway and username. I
>> think these should come from the request itself.
>> 
>> *Visibility of API functions*
>> I think initialize(), shutdown() and startJobSubmitter() functions should
>> not be part of the API because I don't see a scenario where the gateway
>> developer would be responsible for using them. They serve a more internal
>> purpose of managing the orchestrator component IMO. As Amila pointed out so
>> long ago (wink) functions that do not concern outside parties should not be
>> used as part of the API.
>> 
>> *Return values of Orchestrator API*
>> IMO, unless it is specifically required, the functions do
>> not need to return anything; they can simply throw exceptions when
>> needed. For example, launchExperiment can simply return void if all is
>> successful and throw an exception if something fails. Handling issues with
>> a try/catch is not only simpler, but the explanations are also readily
>> available to the user.
>> 
>> *Data persisted in registry*
>> ExperimentRequest.getUsername() : I think we should clarify what this
>> username denotes. In current API, in experiment submission we consider two
>> types of users. Submission user (the user who submits the experiment to the
>> Airavata Server - this is inferred by the request itself) and the execution
>> user (the user who correlates to the application executions of the gateway -
>> thus this user can be a different user for different gateways, e.g. community
>> user, gateway user).
>> I think we should persist the date/time of the experiment request as well.
>> Also, when an API function is retried after a failure in a previous
>> attempt, there should be a way to avoid repeating steps already performed, or
>> to gracefully roll back and redo the required steps as necessary. While such
>> actions could be transparent to the user, sometimes it might make sense to
>> notify the user of the success/failure of a retry. However, this might
>> mean keeping additional records at the registry level.
>> 
>
