Hi Shameera, I created an epic to track all work related to scheduling: https://issues.apache.org/jira/browse/AIRAVATA-1436
Suresh

On Sep 11, 2014, at 6:56 PM, Shameera Rathnayaka <[email protected]> wrote:

> Hi Suresh,
>
> Here is how I think we can use the suggested improvement to handle real-time
> scheduling. If the user selects a resource when submitting the experiment, the
> validate-allocation step can check the job restriction and whether a new job
> can be submitted to the target resource under the selected username. If there
> is no room to submit a new job to the target resource, we inform the user with
> a message saying the experiment was rejected (or failed) because of the job
> count restriction of the target resource.
>
> If the user wants to auto-schedule the experiment, we can move it to the
> buffered queue and use real-time job count details to decide when it is
> possible to submit a new job to the target machine, or to find the best-fit
> machine and submit the experiment there.
>
> Thanks,
> Shameera.
>
>
> On Thu, Sep 11, 2014 at 2:42 PM, Suresh Marru <[email protected]> wrote:
> Hi Shameera,
>
> Can you please map this to the diagram at [1]? Will the HPCPullMonitor be
> equivalent to the BufferedQueue we discussed on the architecture list?
>
> Suresh
> [1] - https://cwiki.apache.org/confluence/display/AIRAVATA/Airavata+Metascheduler
>
> On Sep 11, 2014, at 10:29 AM, Shameera Rathnayaka <[email protected]> wrote:
> >
> > Hi devs,
> >
> > I am going to implement the $Subject.
> >
> > Requirement: Introduce a maximum job submission count for a given resource
> > under a given username.
> >
> > Abstraction: When a user submits a new experiment to Airavata, the user
> > selects the resource (machine) where Airavata should run that experiment
> > (job). That resource may have a job count restriction, e.g. a single user
> > may only have X jobs in either the Q (queued) or R (running) state. We need
> > to handle this at the Orchestrator level rather than handing the experiment
> > over to GFac to submit the job, where it would be rejected because of that
> > restriction. To do that, the Orchestrator needs to know the job count of a
> > particular user on the given resource.
> >
> > Implementation: HPCPullMonitor will write stat data to ZooKeeper; the
> > ZooKeeper path would be something like /stat/{username}/{machine}/jobs/{count}.
> > The Orchestrator will register a watcher for this data, and the watcher will
> > fire whenever any GFac node (monitor component) updates a job status in real
> > time. Finished jobs will immediately decrement the count, and these changes
> > will be replicated to the Orchestrator via ZooKeeper watches.
> >
> > Thanks,
> > Shameera.
> >
> > --
> > Best Regards,
> > Shameera Rathnayaka.
> >
> > email: shameera AT apache.org , shameerainfo AT gmail.com
> > Blog : http://shameerarathnayaka.blogspot.com/
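[Editor's note] The validate-allocation check discussed in the thread could be sketched as below. This is an illustrative sketch only, not Airavata code: the class name `AllocationValidator`, its methods, and the example resource/user names are all hypothetical; the real check would read the per-user restriction from the resource description and the current count from the monitor data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: reject a new experiment when the user's queued +
// running job count on the target resource has reached that resource's
// per-user job count restriction.
public class AllocationValidator {

    // per-user job limit for each resource (machine)
    private final Map<String, Integer> maxJobs = new HashMap<>();
    // key "user@resource" -> jobs currently in Q or R state
    private final Map<String, Integer> currentJobs = new HashMap<>();

    public void setRestriction(String resource, int max) {
        maxJobs.put(resource, max);
    }

    public void setJobCount(String user, String resource, int count) {
        currentJobs.put(user + "@" + resource, count);
    }

    /** True if a new job may be submitted for this user on this resource. */
    public boolean canSubmit(String user, String resource) {
        Integer max = maxJobs.get(resource);
        if (max == null) {
            return true; // no restriction configured for this resource
        }
        int active = currentJobs.getOrDefault(user + "@" + resource, 0);
        return active < max;
    }
}
```

On a `false` result the Orchestrator would either fail the experiment with a "job count restriction" message (user-selected resource) or park it in the buffered queue (auto-scheduling), as described above.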
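[Editor's note] The monitor-to-Orchestrator flow (monitor updates a per-user job count, a registered watcher notifies the Orchestrator on every change) can be illustrated with an in-memory stand-in for the `/stat/{username}/{machine}/jobs` tree. A real implementation would use `org.apache.zookeeper` watches; the class and method names below are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// In-memory simulation of the /stat/{username}/{machine}/jobs znodes.
// The monitor adjusts a count; a registered callback fires on each change,
// the way a ZooKeeper watch would notify the Orchestrator.
public class JobCountStore {

    private final Map<String, Integer> counts = new ConcurrentHashMap<>();
    private BiConsumer<String, Integer> watcher; // (path, new count)

    public void registerWatcher(BiConsumer<String, Integer> w) {
        this.watcher = w;
    }

    private String path(String user, String machine) {
        return "/stat/" + user + "/" + machine + "/jobs";
    }

    /** Called by the monitor when a job enters Q or R state (count + 1). */
    public void jobSubmitted(String user, String machine) {
        update(path(user, machine), +1);
    }

    /** Called by the monitor when a job finishes (count - 1). */
    public void jobFinished(String user, String machine) {
        update(path(user, machine), -1);
    }

    private void update(String p, int delta) {
        int next = counts.merge(p, delta, Integer::sum);
        if (watcher != null) {
            watcher.accept(p, next); // simulate the ZK watch firing
        }
    }

    public int count(String user, String machine) {
        return counts.getOrDefault(path(user, machine), 0);
    }
}
```

One design caveat worth noting for the real implementation: ZooKeeper watches are one-shot, so the Orchestrator would have to re-register the watch after every notification (or use a Curator recipe such as NodeCache that does this automatically).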
