Hi Shameera, I created an epic to track all work related to scheduling: https://issues.apache.org/jira/browse/AIRAVATA-1436
Suresh

On Sep 11, 2014, at 6:56 PM, Shameera Rathnayaka <[email protected]> wrote:

> Hi Suresh,
>
> Here is how I think we can use the suggested improvement to handle real-time
> scheduling. If the user selects a resource when submitting the experiment, the
> validate-allocation step can check the job restriction and whether a new job
> can be submitted to the target resource under the selected username. If there
> is no room to submit a new job to the target resource, we inform the user with
> a message saying the experiment was rejected (or failed) because of the job
> count restriction of the target resource.
>
> If the user wants to auto-schedule the experiment, we can move it to the
> buffered queue and use real-time job count details to decide when it is
> possible to submit a new job to the target machine, or to find the best-fit
> machine and submit the experiment there.
>
> Thanks,
> Shameera.
>
>
> On Thu, Sep 11, 2014 at 2:42 PM, Suresh Marru <[email protected]> wrote:
> Hi Shameera,
>
> Can you please map this to the diagram at [1]? Will the HPCPullMonitor be
> equivalent to the BufferedQueue we discussed on the architecture list?
>
> Suresh
> [1] - https://cwiki.apache.org/confluence/display/AIRAVATA/Airavata+Metascheduler
>
> On Sep 11, 2014, at 10:29 AM, Shameera Rathnayaka <[email protected]> wrote:
> >
> > Hi devs,
> >
> > I am going to implement the $Subject.
> >
> > Requirement: Introduce a maximum job submission count for a given resource
> > under a given username.
> >
> > Abstraction: When a user submits a new experiment to Airavata, the user
> > selects the resource (machine) where Airavata should run that experiment
> > (job). That resource may have a job count restriction, e.g. a single user
> > may only have X jobs in either the Q (queued) or R (running) state. We need
> > to handle this at the Orchestrator level rather than handing the experiment
> > over to GFac to submit the job, where it would be rejected because of that
> > restriction. To do that, the Orchestrator needs to know the job count of a
> > particular user on the given resource.
> >
> > Implementation: HPCPullMonitor will write stat data to ZooKeeper; the
> > ZooKeeper path would be something like /stat/{username}/{machine}/jobs/{count}.
> > The Orchestrator will register a watcher for this data, and the watcher will
> > fire whenever any GFac node (monitor component) updates a job status in real
> > time. Finished jobs will immediately decrement the count, and these changes
> > will be replicated to the Orchestrator via ZooKeeper watches.
> >
> > Thanks,
> > Shameera.
> >
> > --
> > Best Regards,
> > Shameera Rathnayaka.
> >
> > email: shameera AT apache.org , shameerainfo AT gmail.com
> > Blog : http://shameerarathnayaka.blogspot.com/
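[Editor's note] The validate-allocation check discussed in the thread could be sketched as below. This is an illustrative sketch only, not Airavata code: the class name `AllocationValidator`, its methods, and the example resource/user names are all hypothetical; the real check would read the per-user restriction from the resource description and the current count from the monitor data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: reject a new experiment when the user's queued +
// running job count on the target resource has reached that resource's
// per-user job count restriction.
public class AllocationValidator {

    // per-user job limit for each resource (machine)
    private final Map<String, Integer> maxJobs = new HashMap<>();
    // key "user@resource" -> jobs currently in Q or R state
    private final Map<String, Integer> currentJobs = new HashMap<>();

    public void setRestriction(String resource, int max) {
        maxJobs.put(resource, max);
    }

    public void setJobCount(String user, String resource, int count) {
        currentJobs.put(user + "@" + resource, count);
    }

    /** True if a new job may be submitted for this user on this resource. */
    public boolean canSubmit(String user, String resource) {
        Integer max = maxJobs.get(resource);
        if (max == null) {
            return true; // no restriction configured for this resource
        }
        int active = currentJobs.getOrDefault(user + "@" + resource, 0);
        return active < max;
    }
}
```

On a `false` result the Orchestrator would either fail the experiment with a "job count restriction" message (user-selected resource) or park it in the buffered queue (auto-scheduling), as described above.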
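[Editor's note] The monitor-to-Orchestrator flow (monitor updates a per-user job count, a registered watcher notifies the Orchestrator on every change) can be illustrated with an in-memory stand-in for the `/stat/{username}/{machine}/jobs` tree. A real implementation would use `org.apache.zookeeper` watches; the class and method names below are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// In-memory simulation of the /stat/{username}/{machine}/jobs znodes.
// The monitor adjusts a count; a registered callback fires on each change,
// the way a ZooKeeper watch would notify the Orchestrator.
public class JobCountStore {

    private final Map<String, Integer> counts = new ConcurrentHashMap<>();
    private BiConsumer<String, Integer> watcher; // (path, new count)

    public void registerWatcher(BiConsumer<String, Integer> w) {
        this.watcher = w;
    }

    private String path(String user, String machine) {
        return "/stat/" + user + "/" + machine + "/jobs";
    }

    /** Called by the monitor when a job enters Q or R state (count + 1). */
    public void jobSubmitted(String user, String machine) {
        update(path(user, machine), +1);
    }

    /** Called by the monitor when a job finishes (count - 1). */
    public void jobFinished(String user, String machine) {
        update(path(user, machine), -1);
    }

    private void update(String p, int delta) {
        int next = counts.merge(p, delta, Integer::sum);
        if (watcher != null) {
            watcher.accept(p, next); // simulate the ZK watch firing
        }
    }

    public int count(String user, String machine) {
        return counts.getOrDefault(path(user, machine), 0);
    }
}
```

One design caveat worth noting for the real implementation: ZooKeeper watches are one-shot, so the Orchestrator would have to re-register the watch after every notification (or use a Curator recipe such as NodeCache that does this automatically).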
