Lahiru: Can you please start a document to record this conversation? There are very valuable points to record and I don't want to lose anything in email threads.
My comments are inline with prefix RS>>:

On Dec 5, 2013, at 10:12 PM, Lahiru Gunathilake <[email protected]> wrote:

> Hi Amila,
>
> I have answered the questions you raised, except some of the how-to
> questions (for those we need to figure out solutions, and before that we
> need to come up with a good design).
>
> On Thu, Dec 5, 2013 at 7:58 PM, Amila Jayasekara <[email protected]> wrote:
>
> On Thu, Dec 5, 2013 at 2:34 PM, Lahiru Gunathilake <[email protected]> wrote:
>
> Hi All,
>
> We are thinking of implementing an Airavata Orchestrator component to
> replace the WorkflowInterpreter, so that gateway developers do not have to
> deal with workflows when they simply have a single independent job to run
> in their gateways. This component mainly focuses on how to invoke GFAC and
> accept requests from the client API.
>
> I have the following features in mind for this component.
>
> 1. It provides a web service or REST interface for which we can implement
> a client to invoke it and submit jobs.

RS>> We need an API method to handle this; the protocol interfacing of the API can be handled separately using Thrift or web services.

> 2. It accepts a job request and parses the input types; if the input types
> are correct, it creates an Airavata experiment ID.

RS>> In my view, we need to save every request to the registry before verification, and record an input configuration error if the inputs are not correct. That will help us find any API invocation errors.

> 3. The Orchestrator then stores the job information in the registry
> against the generated experiment ID (all the other components identify the
> job using this experiment ID).
>
> 4. After that the Orchestrator pulls up all the descriptors related to
> this request, does some scheduling to decide where to run the job, and
> submits the job to a GFAC node (handling multiple GFAC nodes is going to
> be a future improvement in the Orchestrator).
>
> If we are trying to do pull-based job submission it might be a good idea
> for handling errors: if we store jobs in the registry and GFAC pulls jobs
> and executes them, the Orchestrator component really doesn't have to worry
> about error handling.
>
> I did not quite understand what you meant by "pull-based job submission".
> I believe it is saving the job in the registry and GFAC periodically
> looking up new jobs and submitting them.
>
> Yes.

RS>> I think the Orchestrator should call GFAC to invoke the job rather than GFAC polling for jobs. The Orchestrator should make the decision about which GFAC instance it submits the job to, and if there is a system error then bring up or communicate with another instance. I think a pull-based model for GFAC will add overhead; we would add another point of failure.

> Further, why are you saying you don't need to worry about error handling?
> What sort of errors are you considering?
>
> I am considering GFAC failures, or the connection between the Orchestrator
> and GFAC going down.
>
> Because we can implement logic in GFAC so that if a particular job is not
> updating its status for a given time, we assume the job is hanged or the
> GFAC node which handles that job has failed, so another GFAC pulls that
> job (we definitely need a locking mechanism here, to make sure two
> instances do not execute the hanged job) and starts executing it. (If GFAC
> is handling a long-running job it still has to update the job status
> frequently, even with the same status, to show that the GFAC node is
> running; a heartbeat sketch follows below.)
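A minimal sketch of how such heartbeat-based hang detection could look, assuming a hypothetical JobRegistry interface and an arbitrary staleness threshold (the actual Airavata registry API and timing would differ):

    import java.time.Duration;
    import java.time.Instant;
    import java.util.List;
    import java.util.stream.Collectors;

    // Hypothetical registry interface; the real Airavata registry API differs.
    interface JobRegistry {
        void touchHeartbeat(String experimentId, Instant now);
        Instant lastHeartbeat(String experimentId);
        List<String> runningExperimentIds();
    }

    class HangDetector {
        // Assumed threshold: a job whose heartbeat is older than this is
        // treated as hanged, or its GFAC node as failed.
        private static final Duration STALE_AFTER = Duration.ofMinutes(10);

        private final JobRegistry registry;

        HangDetector(JobRegistry registry) {
            this.registry = registry;
        }

        // Called periodically by the GFAC instance running the job: even a
        // long-running job keeps re-writing its (possibly unchanged) status
        // so the heartbeat stays fresh.
        void heartbeat(String experimentId) {
            registry.touchHeartbeat(experimentId, Instant.now());
        }

        // Called by peer GFAC instances: list jobs whose heartbeat went
        // stale and are therefore candidates for takeover.
        List<String> findStaleJobs() {
            Instant cutoff = Instant.now().minus(STALE_AFTER);
            return registry.runningExperimentIds().stream()
                    .filter(id -> registry.lastHeartbeat(id).isBefore(cutoff))
                    .collect(Collectors.toList());
        }
    }

Detection alone is not enough; as the thread notes next, takeover needs a lock so two instances do not both claim the same stale job.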
> I have some comments/questions in this regard:
>
> 1. How are you going to detect that a job is hanged?
>
> 2. We clearly need to distinguish between faulty jobs and faulty GFAC
> instances, because GFAC replication should not pick up a job if its own
> logic is leading to the hang situation.
>
> I haven't seen a hanged-logic situation; maybe there are some.
>
> GFAC replication should pick up the job only if the primary GFAC instance
> is down. I believe you proposed the locking mechanism to handle this
> scenario, but I don't see how a locking mechanism is going to resolve this
> situation. Can you explain more?
>
> For example, if GFAC has logic for picking up a job which didn't respond
> in a given time, there could be a scenario where two GFAC instances try to
> pick up the same job. E.g., there are 3 GFAC nodes working and one goes
> down with a given job, and the two other nodes recognize this at the same
> time and try to launch the same job. I was talking about locks to fix this
> issue (a takeover sketch follows after this message).

RS>> One way to handle this is to look at the job walltime. If the walltime for a running job has expired and we still don't have the status of the job, then we can go ahead and check the status and start cleaning up the job.

> 2. According to your description, it seems there is no communication
> between a GFAC instance and the Orchestrator, so GFAC and the Orchestrator
> exchange data through the registry (database). Performance might drop
> since we are going through a persistent medium.
>
> Yes, you are correct. I am assuming we are mostly focusing on implementing
> a more reliable system, and most of these jobs run for hours; we don't
> need a high-performance system for long-running jobs.

RS>> We need to discuss this. I think the Orchestrator should only maintain the state of the request, not of GFAC.

> 3. What is the strategy to divide jobs among GFAC instances?
>
> Not sure; we have to discuss it.
>
> 4. How do we identify that a GFAC instance has failed?
>
> 5. How should GFAC instances be registered with the Orchestrator?

RS>> We need a mechanism which records how many GFAC instances are running and how many jobs each instance has.

> 6. How are job cancellations handled?

RS>> Cancelling a single job is simple; there should be an API function to cancel based on the experiment ID and/or the local job ID.

> 7. What happens if the Orchestrator goes down?
>
> This is under the assumption that the Orchestrator doesn't go down (e.g.,
> like a head node in MapReduce).

RS>> I think registration of the job happens outside the Orchestrator, and the Orchestrator/GFAC progress the states.

> 8. Do monitoring execution paths go through the Orchestrator?
>
> I intentionally didn't mention monitoring; how about we discuss it
> separately.
>
> 9. How does failover work?
>
> What do you mean, and whose failover?
>
> 5. GFAC creates its execution chain and stores it back in the registry
> with the experiment ID, and GFAC updates its state using checkpointing.
>
> 6. If we are not doing pull-based submission, during a GFAC failure the
> Orchestrator has to identify it and submit the active jobs from the failed
> GFAC node to other nodes.
>
> I think more communication needs to happen here:
> 1. When the Orchestrator first deposits the job it should be in the
> unsubmitted state.
> 2. GFAC should only update the state to active after really submitting it
> to the resource.
>
> I agree; there could be a few important states, like input transferred,
> job submitted, job finished, output transferred.
>
> In case of a GFAC instance failure the secondary GFAC should go through
> all unfinished jobs relevant to the failed instance and get their state by
> consulting the resource. If those jobs are still in the active state, a
> monitoring mechanism should be established.
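The states proposed above and the takeover lock could look roughly like the following sketch. The enum names, table name, and column names are illustrative assumptions, not the actual Airavata schema; the conditional UPDATE is what makes the claim atomic, so if two secondary GFAC instances race, only one sees an affected row count of 1:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    // Names the states discussed in the thread; purely illustrative.
    enum JobState {
        UNSUBMITTED, INPUT_TRANSFERRED, JOB_SUBMITTED, ACTIVE,
        JOB_FINISHED, OUTPUT_TRANSFERRED
    }

    class JobTakeoverDao {
        private final Connection conn;

        JobTakeoverDao(Connection conn) {
            this.conn = conn;
        }

        // Atomically transfer ownership of a job from a failed GFAC instance
        // to a new one. The WHERE clause acts as the lock: the UPDATE matches
        // only while the row still names the failed instance as owner, so of
        // two racing instances exactly one gets executeUpdate() == 1.
        boolean tryTakeOver(String experimentId, String failedGfacId,
                            String newGfacId) throws SQLException {
            String sql = "UPDATE experiment SET owner_gfac = ? "
                       + "WHERE experiment_id = ? AND owner_gfac = ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, newGfacId);
                ps.setString(2, experimentId);
                ps.setString(3, failedGfacId);
                return ps.executeUpdate() == 1; // true only for the winner
            }
        }
    }

Only the instance for which tryTakeOver returns true would proceed to consult the resource and resume monitoring; the loser simply moves on.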
> We only need to re-submit jobs if they are in the unsubmitted state.
>
> +1.
>
> To implement this precisely we need a 2-phase-commit-like mechanism. Then
> we can make sure jobs will not be duplicated.
>
> +1. (A sketch of such a two-step claim-then-confirm submission follows at
> the end of this thread.)
>
> Thanks Amila for compiling the email carefully.
>
> Regards,
> Lahiru
>
> This might cause job duplication in case the Orchestrator raises a false
> alarm about a GFAC failure (so it has to be handled carefully).
>
> We have a lot more to discuss about GFAC, but I will limit our discussion
> to the Orchestrator component for now.
>
> WDYT about this design?
>
> Lahiru
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
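A minimal sketch of the 2-phase-commit-like submission agreed on above, assuming hypothetical StateRegistry and ComputeResource interfaces and an intermediate SUBMITTING state (all additions for the sketch; the real Airavata interfaces differ). The idea is to claim the job with an atomic state transition before touching the resource, so that a false alarm or a retry can never submit the same job twice:

    // Hypothetical interfaces for illustration; the real Airavata registry
    // and GFAC interfaces differ.
    interface StateRegistry {
        // Atomic compare-and-set: returns true only if the stored state
        // still equals `expected` at the moment of the update.
        boolean compareAndSetState(String experimentId, String expected, String next);
        void saveLocalJobId(String experimentId, String localJobId);
    }

    interface ComputeResource {
        String submit(String experimentId) throws Exception; // returns local job id
    }

    class TwoPhaseSubmitter {
        private final StateRegistry registry;
        private final ComputeResource resource;

        TwoPhaseSubmitter(StateRegistry registry, ComputeResource resource) {
            this.registry = registry;
            this.resource = resource;
        }

        void submit(String experimentId) throws Exception {
            // Phase 1 (prepare): claim the job. A second submitter, or a
            // retry after a false GFAC-failure alarm, fails this CAS and
            // backs off, so the job is never submitted twice.
            if (!registry.compareAndSetState(experimentId, "UNSUBMITTED", "SUBMITTING")) {
                return; // someone else already owns the submission
            }
            try {
                String localJobId = resource.submit(experimentId);
                // Phase 2 (commit): record the outcome of the submission.
                registry.saveLocalJobId(experimentId, localJobId);
                registry.compareAndSetState(experimentId, "SUBMITTING", "JOB_SUBMITTED");
            } catch (Exception e) {
                // Roll back the claim so a later re-submission is safe.
                registry.compareAndSetState(experimentId, "SUBMITTING", "UNSUBMITTED");
                throw e;
            }
        }
    }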
