Post a URL to a Google doc with world-commentable permissions.  If
someone wants write access, they can request it.


Marlon

On 12/6/13 12:34 PM, Raminder Singh wrote:
> Lahiru: Can you please start a document to record this conversation? There
> are very valuable points to record, and we don't want to lose anything in
> email threads.
>
> My comments are inline with the prefix RS>>:
>
> On Dec 5, 2013, at 10:12 PM, Lahiru Gunathilake <[email protected]> wrote:
>
>> Hi Amila,
>>
>> I have answered the questions you raised, except for some of the how-to
>> questions (for those we need to figure out solutions, and before that we
>> need to come up with a good design).
>>
>>
>> On Thu, Dec 5, 2013 at 7:58 PM, Amila Jayasekara <[email protected]> 
>> wrote:
>>
>>
>>
>> On Thu, Dec 5, 2013 at 2:34 PM, Lahiru Gunathilake <[email protected]> wrote:
>> Hi All,
>>
>> We are thinking of implementing an Airavata Orchestrator component to
>> replace the WorkflowInterpreter, so that gateway developers do not have to
>> deal with workflows when they simply have single, independent jobs to run in
>> their gateways. This component mainly focuses on how to invoke GFAC and
>> accept requests from the client API.
>>
>> I have the following features in mind for this component.
>>
>> 1. It exposes a web service or REST interface so that we can implement a
>> client that invokes it to submit jobs.
> RS >> We need an API method to handle this, and the protocol interfacing of
> the API can be handled separately using Thrift or web services.
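> To make this concrete, the method behind such an API could look roughly like
> the sketch below (only a sketch for discussion; the names and types here are
> placeholders, not the final Airavata API):
>
>     import java.util.Map;
>
>     // Rough sketch of an Orchestrator-facing submission API; every name is
>     // a placeholder, not a proposal for the final interface.
>     public interface JobSubmissionService {
>
>         /**
>          * Verify and persist the request, then return the generated
>          * experiment ID that all other components use to track the job.
>          */
>         String submitJob(String applicationName, Map<String, String> inputs)
>                 throws InvalidRequestException;
>
>         /** Cancel a previously submitted job by its experiment ID. */
>         void cancelJob(String experimentId);
>
>         /** Placeholder exception for requests that fail input verification. */
>         class InvalidRequestException extends Exception {
>             public InvalidRequestException(String message) { super(message); }
>         }
>     }
>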
>> 2. It accepts a job request and parses the input types; if the input types
>> are correct, it creates an Airavata experiment ID.
> RS >> In my view, we need to save every request to the registry before
> verification, and record an input-configuration error if the inputs are not
> correct. That will help us find out whether there were any API invocation
> errors.
>> 3. The Orchestrator then stores the job information in the registry against
>> the generated experiment ID (all the other components identify the job using
>> this experiment ID).
>>
>> 4. After that, the Orchestrator pulls up all the descriptors related to this
>> request, does some scheduling to decide where to run the job, and submits
>> the job to a GFAC node (handling multiple GFAC nodes is going to be a future
>> improvement in the Orchestrator).
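>>
>> Roughly, in code, the flow I have in mind for steps 2-4 is something like the
>> following (only a sketch; the Registry, Scheduler and GFACClient interfaces
>> here are made up for illustration, and the real interfaces still need to be
>> designed):
>>
>>     import java.util.UUID;
>>
>>     // Sketch of the Orchestrator submit path (steps 2-4 above).
>>     public class Orchestrator {
>>
>>         private final Registry registry;
>>         private final Scheduler scheduler;
>>
>>         public Orchestrator(Registry registry, Scheduler scheduler) {
>>             this.registry = registry;
>>             this.scheduler = scheduler;
>>         }
>>
>>         /** Returns the generated experiment ID, or throws if the inputs are invalid. */
>>         public String submit(JobRequest request) throws InvalidInputException {
>>             // Step 2: parse/verify the input types before accepting the job.
>>             if (!request.inputsAreValid()) {
>>                 throw new InvalidInputException("input types do not match the application");
>>             }
>>             String experimentId = UUID.randomUUID().toString();
>>
>>             // Step 3: store the job information against the experiment ID so
>>             // that every other component can identify the job by this ID.
>>             registry.saveJob(experimentId, request);
>>
>>             // Step 4: pull the descriptors, decide where to run, submit to a GFAC node.
>>             Descriptors descriptors = registry.loadDescriptors(request);
>>             GFACClient gfac = scheduler.pickGfacNode(descriptors);
>>             gfac.launch(experimentId);
>>             return experimentId;
>>         }
>>
>>         // --- placeholder types, only to make the sketch self-contained ---
>>         public interface JobRequest { boolean inputsAreValid(); }
>>         public interface Descriptors { }
>>         public interface Registry {
>>             void saveJob(String experimentId, JobRequest request);
>>             Descriptors loadDescriptors(JobRequest request);
>>         }
>>         public interface Scheduler { GFACClient pickGfacNode(Descriptors descriptors); }
>>         public interface GFACClient { void launch(String experimentId); }
>>         public static class InvalidInputException extends Exception {
>>             public InvalidInputException(String message) { super(message); }
>>         }
>>     }
>>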
>>
>> If we are trying to do pull-based job submission, it might be a good way to
>> handle errors: if we store jobs in the registry and GFAC pulls jobs and
>> executes them, the Orchestrator component really doesn't have to worry about
>> error handling.
>>
>> I did not quite understand what you meant by "pull based job submission". I
>> believe it means saving the job in the registry and having GFAC periodically
>> look for new jobs and submit them.
>> Yes. 
> RS >> I think the Orchestrator should call GFAC to invoke the job rather than
> have GFAC poll for jobs. The Orchestrator should decide which GFAC instance
> to submit the job to, and if there is a system error it should bring up or
> communicate with another instance. I think a pull-based model for GFAC will
> add overhead, and we would add another point of failure.
>
>> Further, why are you saying you don't need to worry about error handling?
>> What sort of errors are you considering?
>> I am considering GFAC failures, or the connection between the Orchestrator
>> and GFAC going down.
>>  
>>
>> Because we can implement logic in GFAC so that if a particular job is not
>> updating its status for a given time, it assumes the job is hung or that the
>> GFAC node which handles that job has failed, and another GFAC instance pulls
>> that job (we definitely need a locking mechanism here, to avoid two
>> instances executing the hung job) and starts executing it. (Even if GFAC is
>> handling a long-running job, it still has to update the job status
>> frequently, with the same status, to show that the GFAC node is running.)
>>
>> I have some comments/questions in this regard:
>>
>> 1. How are you going to detect that a job is hung?
>>
>> 2. We clearly need to distinguish between faulty jobs and faulty GFAC
>> instances, because a GFAC replica should not pick up a job whose own logic
>> is what leads to the hang.
>> I haven't seen a situation where the job logic itself hangs, but maybe
>> there are such cases.
>> A GFAC replica should pick up the job only if the primary GFAC instance is
>> down. I believe you proposed the locking mechanism to handle this scenario,
>> but I don't see how a locking mechanism is going to resolve this situation.
>> Can you explain more?
>> For example, if GFAC has logic to pick up a job which didn't respond within
>> a given time, there could be a scenario where two GFAC instances try to pick
>> up the same job. E.g. there are 3 GFAC nodes working and one goes down while
>> holding a given job, and the two other nodes recognize this at the same time
>> and try to launch the same job. I was talking about locks to fix this issue.
> RS >> One way to handle this is to look at the job walltime. If the walltime
> for a running job has expired and we still don't have the status of the job,
> then we can go ahead and check the status and start cleaning up the job.
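> Either way (status-update timeout or walltime expiry), the lock itself could
> be as simple as a conditional UPDATE on the registry database: whichever GFAC
> instance flips the owner column first wins, and the other one sees zero
> updated rows and backs off. Roughly (sketch only; the table and column names
> here are made up):
>
>     import java.sql.Connection;
>     import java.sql.PreparedStatement;
>     import java.sql.SQLException;
>     import java.sql.Timestamp;
>
>     // Claim an orphaned job with a single conditional UPDATE so that two GFAC
>     // instances can never take over the same job.
>     public class OrphanJobClaimer {
>
>         /**
>          * Try to take over a job whose owner has not updated its status within
>          * the allowed interval. Returns true only for the one instance whose
>          * UPDATE actually changed the row.
>          */
>         public boolean tryClaim(Connection registryDb, String experimentId,
>                                 String myGfacId, long staleAfterMillis)
>                 throws SQLException {
>             String sql =
>                 "UPDATE job_status "
>               + "   SET owner_gfac = ?, last_update = ? "
>               + " WHERE experiment_id = ? "
>               + "   AND state NOT IN ('FINISHED', 'FAILED') "
>               + "   AND last_update < ?";        // previous owner looks dead or hung
>             Timestamp now = new Timestamp(System.currentTimeMillis());
>             Timestamp staleBefore = new Timestamp(now.getTime() - staleAfterMillis);
>             try (PreparedStatement ps = registryDb.prepareStatement(sql)) {
>                 ps.setString(1, myGfacId);
>                 ps.setTimestamp(2, now);
>                 ps.setString(3, experimentId);
>                 ps.setTimestamp(4, staleBefore);
>                 return ps.executeUpdate() == 1;  // 0: another instance won, or the job is alive
>             }
>         }
>     }
>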
>>  
>> 2. According to your description, it seems there is no communication between
>> the GFAC instances and the Orchestrator, so GFAC and the Orchestrator
>> exchange data through the registry (database). Performance might drop since
>> we are going through a persistent medium.
>> Yes, you are correct. I am assuming we are mostly focusing on implementing a
>> more reliable system; most of these jobs run for hours, and we don't need a
>> high-performance design for a system with long-running jobs.
> RS >> We need to discuss this. I think the Orchestrator should only maintain
> the state of the request, not of GFAC.
>> 3. What is the strategy for dividing jobs among GFAC instances?
>> Not sure, we have to discuss it. 
>>
>> 4. How do we identify that a GFAC instance has failed?
>>
>> 5. How should GFAC instances be registered with the Orchestrator?
> RS >> We need to have a mechanism which records how many GFAC instances are
> running and how many jobs each instance is handling.
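> Something as simple as the following would do as a starting point (sketch
> only; in practice this state would probably live in the registry so that it
> survives an Orchestrator restart):
>
>     import java.util.Map;
>     import java.util.concurrent.ConcurrentHashMap;
>     import java.util.concurrent.atomic.AtomicInteger;
>
>     // Track which GFAC instances are registered with the Orchestrator and
>     // how many jobs each one is currently handling.
>     public class GfacInstanceTracker {
>
>         private final Map<String, AtomicInteger> jobsPerInstance =
>                 new ConcurrentHashMap<>();
>
>         /** Called when a GFAC instance registers itself (or is configured). */
>         public void register(String gfacId) {
>             jobsPerInstance.putIfAbsent(gfacId, new AtomicInteger(0));
>         }
>
>         /** Called when the Orchestrator hands a job to an instance. */
>         public void jobSubmitted(String gfacId) {
>             jobsPerInstance.get(gfacId).incrementAndGet();
>         }
>
>         /** Called when a job finishes, fails, or is cancelled. */
>         public void jobCompleted(String gfacId) {
>             jobsPerInstance.get(gfacId).decrementAndGet();
>         }
>
>         /** Pick the least-loaded instance; a trivial stand-in for real scheduling. */
>         public String leastLoadedInstance() {
>             return jobsPerInstance.entrySet().stream()
>                     .min((a, b) -> Integer.compare(a.getValue().get(), b.getValue().get()))
>                     .map(Map.Entry::getKey)
>                     .orElseThrow(() ->
>                             new IllegalStateException("no GFAC instances registered"));
>         }
>     }
>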
>> 6. How are job cancellations handled?
> RS >> Cancelling a single job is simple; we should have an API function to
> cancel based on the experiment ID and/or the local job ID.
>> 7. What happens if the Orchestrator goes down?
>> This is under the assumption that the Orchestrator doesn't go down (e.g.
>> like the head node in MapReduce).
> RS >> I think the registration of the job happens outside the Orchestrator,
> and the Orchestrator/GFAC progress the states.
>> 8. Does the monitoring execution path go through the Orchestrator?
>> I intentionally didn't mention monitoring; how about we discuss it
>> separately?
>>
>> 9. How does failover work?
>>
>> What do you mean, and whose failover?
>>  
>>
>> 5. GFAC creates its execution chain and stores it back in the registry with
>> the experiment ID, and GFAC updates its state using checkpointing.
>>
>>
>> 6. If we are not doing pull-based submission, then during a GFAC failure the
>> Orchestrator has to identify it and submit the active jobs from the failed
>> GFAC node to other nodes.
>>
>> I think more communication needs to happen here.
>> 1. When the Orchestrator first deposits the job, it should be in the
>> unsubmitted state.
>> 2. GFAC should only update the state to active after actually submitting it
>> to the resource.
>> I agree, there could be a few important states, like:
>> input transferred, job submitted, job finished, output transferred.
>>
>> In case of a GFAC instance failure, the secondary GFAC should go through all
>> unfinished jobs relevant to the failed instance and get their state by
>> consulting the resource. If those jobs are still in the active state, a
>> monitoring mechanism should be established for them. We only need to
>> re-submit jobs if they are in the unsubmitted state.
>> +1. 
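>>
>> Written down as an enum, the state set we have been discussing would be
>> roughly the following (names are not final, and FAILED is an addition we
>> would presumably need):
>>
>>     // Sketch of the job states discussed above. A secondary GFAC re-submits
>>     // only UNSUBMITTED jobs and attaches monitoring to jobs that already
>>     // reached SUBMITTED or ACTIVE.
>>     public enum JobState {
>>         UNSUBMITTED,        // deposited by the Orchestrator, not yet handed to a resource
>>         INPUT_TRANSFERRED,  // input staging finished
>>         SUBMITTED,          // really submitted to the compute resource
>>         ACTIVE,             // running on the resource
>>         FINISHED,           // the resource reports completion
>>         OUTPUT_TRANSFERRED, // output staging finished
>>         FAILED              // unrecoverable error recorded in the registry
>>     }
>>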
>>
>> To implement this precisely, we need a 2-phase-commit-like mechanism. Then
>> we can make sure jobs will not be duplicated.
>> +1.
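>>
>> A lightweight version of that, without a full transaction manager, could be
>> a prepare/commit handshake driven through the registry. This is only a
>> sketch of the idea (the Registry and GFACClient interfaces are placeholders,
>> and the real protocol still needs to be designed):
>>
>>     // 2-phase-commit-like handoff between the Orchestrator and GFAC, so a
>>     // job can never end up launched twice.
>>     public class TwoPhaseSubmitter {
>>
>>         public interface Registry {
>>             // Atomically move the job between states; returns false if the job
>>             // was no longer in the expected state (someone else acted on it).
>>             boolean compareAndSetState(String experimentId, String expected, String next);
>>         }
>>
>>         public interface GFACClient {
>>             // Phase 1: GFAC persists everything it needs to run the job and
>>             // acknowledges, but does not launch anything yet.
>>             boolean prepare(String experimentId);
>>             // Phase 2: GFAC actually submits the prepared job to the resource.
>>             void commit(String experimentId);
>>         }
>>
>>         private final Registry registry;
>>
>>         public TwoPhaseSubmitter(Registry registry) {
>>             this.registry = registry;
>>         }
>>
>>         public boolean submit(String experimentId, GFACClient gfac) {
>>             // Phase 1: reserve the job; only one caller can win this transition.
>>             if (!registry.compareAndSetState(experimentId, "UNSUBMITTED", "SUBMITTING")) {
>>                 return false; // somebody else already took it
>>             }
>>             if (!gfac.prepare(experimentId)) {
>>                 // Roll back so the job can safely be picked up again later.
>>                 registry.compareAndSetState(experimentId, "SUBMITTING", "UNSUBMITTED");
>>                 return false;
>>             }
>>             // Phase 2: record the decision, then let GFAC launch the job.
>>             registry.compareAndSetState(experimentId, "SUBMITTING", "SUBMITTED");
>>             gfac.commit(experimentId);
>>             return true;
>>         }
>>     }
>>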
>>
>>
>> Thanks, Amila, for compiling the email carefully.
>>
>> Regards
>> Lahiru 
>>
>>  
>> This might cause job duplication in case the Orchestrator raises a false
>> alarm about a GFAC failure (so we have to handle it carefully).
>>
>> We have a lot more to discuss about GFAC, but I will limit our discussion to
>> the Orchestrator component for now.
>>
>> WDYT about this design?
>>
>> Lahiru
>>
>> -- 
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>>
>
