Re: Orchestration Component implementation review

Amila Jayasekara Sun, 19 Jan 2014 20:49:44 -0800

On Sun, Jan 19, 2014 at 5:33 PM, Lahiru Gunathilake <[email protected]> wrote:


> Hi Chathuri,
>
>
> On Fri, Jan 17, 2014 at 11:40 AM, Chathuri Wimalasena <
> [email protected]> wrote:
>
>> Orchestrator table has only the current state (updated state). Previous
>> statuses should be saved in the GFac_Job_Status table.
>>
> Since the order of the steps is defined, do we need to store the previous
> states?
>

I am also curious to know why we need to store previous states. We have a
defined state diagram for a job. Also, we have log files if we want to debug
a specific issue related to job statuses. So I am not sure why we need to
store previous job states. Also, who will access previous job states, and
when?
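
To illustrate the point about a defined order of states, here is a minimal
sketch, assuming a strictly linear state diagram. The state names and the
impliedPreviousStates() helper are hypothetical and are not taken from the
Airavata code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical job states; the real Airavata state names may differ.
enum JobState {
    ACCEPTED, SUBMITTED, RUNNING, FINISHED;

    // With a strictly linear order, every earlier constant is an
    // implied previous state of the current one.
    List<JobState> impliedPreviousStates() {
        List<JobState> previous = new ArrayList<>();
        for (JobState s : values()) {
            if (s.ordinal() < ordinal()) {
                previous.add(s);
            }
        }
        return previous;
    }
}
```

This derivation only holds while each state can be reached by exactly one
path. If the diagram branches (for example into failure or cancellation
states), the history can no longer be inferred from the current state alone.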

Thanks
Amila


>
> Regards
> Lahiru
>
>>
>> Regards,
>> Chathuri
>>
>>
>> On Fri, Jan 17, 2014 at 11:25 AM, Sachith Withana <[email protected]> wrote:
>>
>>> Thanks Saminda for this informative review.
>>>
>>> In the case of multi-threaded vs. single-threaded, where should we
>>> have the synchronization enforced?
>>> To my knowledge, the NewJobWorkers (getting new jobs and submitting
>>> them) and the HangedJobWorkers are accessing the Orchestrator table to
>>> select the new and hanged jobs.
>>> Right now, the NewJobWorkers are getting all the accepted jobs at once;
>>> they are not focused on one experiment.
>>>
>>> We need to reflect the changes in the GFac job statuses in the
>>> Orchestrator table as well. So every time the status of a job changes
>>> through GFac, it will be accessing the Orchestrator table as well.
>>> (I've sent an email previously describing the scenario.)
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne
>>> <[email protected]> wrote:
>>>
>>>> Following are a few thoughts I had during my review of the component:
>>>>
>>>> *Multi-threaded vs single threaded*
>>>> If we are going to have multi-threaded job submission, the
>>>> implementation should handle race conditions. Essentially, a
>>>> JobSubmitter should be able to "lock" an experiment request before
>>>> it continues processing that request, so that other JobSubmitters
>>>> accessing the experiment requests at the same time would skip it.
>>>>
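
The "lock and skip" idea above can be sketched as follows. This is only an
illustration under assumptions: the ExperimentLocks class and its method
names are hypothetical, and it assumes all JobSubmitters run as threads
inside a single JVM.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper; not part of the Airavata code base.
class ExperimentLocks {
    // Concurrent set of experiment IDs that some submitter has claimed.
    private final Set<String> claimed = ConcurrentHashMap.newKeySet();

    // Atomically claims the experiment. Returns true only for the first
    // caller; every other caller gets false and should skip the request.
    boolean tryLock(String experimentId) {
        return claimed.add(experimentId);
    }

    // Releases the claim, e.g. after submission finished or failed.
    void unlock(String experimentId) {
        claimed.remove(experimentId);
    }
}
```

A JobSubmitter would call tryLock() before processing a request and move on
to the next request when it returns false. An in-memory set like this does
not coordinate separate processes, so with multiple deployments the claim
would presumably have to be made in shared storage such as the Orchestrator
table.
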
>>>> *Orchestrator service*
>>>> We might want to think about the possibility that in future we will
>>>> have multiple deployments of an Airavata service. This could particularly
>>>> be true for SciGaP. We may have to think about how some of the internal
>>>> data structures/SPIs should be updated to accommodate such requirements
>>>> in future.
>>>>
>>>> *Orchestrator Component configurations*
>>>> I see a lot of places where the orchestrator can have configurations. I
>>>> think it's too early to finalize them, but I think we can start refactoring
>>>> them out, perhaps into airavata-server.properties. I'm also seeing that the
>>>> orchestrator is now hardcoded to use the default/admin gateway and username.
>>>> I think these should come from the request itself.
>>>>
>>>> *Visibility of API functions*
>>>> I think the initialize(), shutdown() and startJobSubmitter() functions
>>>> should not be part of the API because I don't see a scenario where the
>>>> gateway developer would be responsible for using them. They serve a more
>>>> internal purpose of managing the orchestrator component, IMO. As Amila
>>>> pointed out so long ago (wink), functions that do not concern outside
>>>> parties should not be part of the API.
>>>>
>>>> *Return values of Orchestrator API*
>>>> IMO, unless it is specifically required, the functions do not
>>>> necessarily need to return anything; they can simply throw exceptions
>>>> when needed. For example, launchExperiment can simply return void if all
>>>> is successful and throw an exception if something fails. Handling issues
>>>> with a try/catch is not only simpler, but the explanations are also readily
>>>> available to the user.
>>>>
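
The void-plus-exception style suggested above might look like the sketch
below. All names here (OrchestratorException, the Orchestrator interface,
SimpleOrchestrator, the String experiment ID parameter) are assumptions made
for illustration and do not reflect the actual Orchestrator API.

```java
// Hypothetical names throughout; this is not the actual Orchestrator API.
class OrchestratorException extends Exception {
    OrchestratorException(String message) {
        super(message);
    }
}

interface Orchestrator {
    // Returns normally on success; any failure is reported as an exception.
    void launchExperiment(String experimentId) throws OrchestratorException;
}

class SimpleOrchestrator implements Orchestrator {
    @Override
    public void launchExperiment(String experimentId) throws OrchestratorException {
        if (experimentId == null || experimentId.isEmpty()) {
            // The message carries the explanation to the caller.
            throw new OrchestratorException("experiment ID must not be empty");
        }
        // Submission logic would go here.
    }
}
```

A caller wraps the call in try/catch and reads the reason for a failure from
the exception message, so no status code has to be returned or checked.
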
>>>> *Data persisted in registry*
>>>> ExperimentRequest.getUsername(): I think we should clarify what this
>>>> username denotes. In the current API, we consider two types of users in
>>>> experiment submission: the submission user (the user who submits the
>>>> experiment to the Airavata Server; this is inferred from the request
>>>> itself) and the execution user (the user who correlates to the
>>>> application executions of the gateway; this can be a different user for
>>>> different gateways, e.g. community user, gateway user).
>>>> I think we should persist the date/time of the experiment request as
>>>> well.
>>>> Also, when an API function is retried after a failure in a previous
>>>> attempt, there should be a way either to avoid repeating already performed
>>>> steps or to gracefully roll back and redo the required steps as necessary.
>>>> While such actions could be transparent to the user, sometimes it might
>>>> make sense to notify the user of the success/failure of a retry. However,
>>>> this might mean keeping additional records at the registry level.
>>>>
>>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Sachith Withana
>>>
>>>
>>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>
