I think it would be best not to maintain our own record of the job limit - we need to remember that jobs will also be submitted to these resources through the community accounts by other methods. I recall someone mentioning that it would be ideal to poll the resources for their limits instead - can anyone confirm that we can do this?
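To make the polling idea concrete, here is a minimal sketch of checking the live job count directly on the resource rather than keeping our own counter. It assumes a SLURM-style scheduler whose `squeue -h -u <community_account>` output (one line per job, no header) we can parse; the function names and sample output are illustrative, not existing Airavata APIs:

```python
# Hypothetical sketch: poll the resource for the community account's
# live job count instead of maintaining a local counter that other
# submission paths would silently invalidate.

def parse_active_job_count(squeue_output: str) -> int:
    """Count jobs in `squeue -h -u <community_account>` output.

    With -h, squeue prints one non-empty line per job and no header,
    so the active job count is just the number of non-blank lines.
    """
    return sum(1 for line in squeue_output.splitlines() if line.strip())

def can_submit(squeue_output: str, queue_job_limit: int) -> bool:
    """True when the live job count is below the queue's limit."""
    return parse_active_job_count(squeue_output) < queue_job_limit

# Illustrative squeue output for a community account (two jobs):
sample = """\
123 normal job1 cgateway  R 0:10 1 node1
124 normal job2 cgateway PD 0:00 1 (Priority)
"""
print(parse_active_job_count(sample))  # -> 2
print(can_submit(sample, 10))          # -> True
```

The important property is that the count reflects everything running under the community account, whichever system submitted it.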
On Mon, Aug 3, 2015 at 12:24 PM, Douglas Chau <[email protected]> wrote:
> Hmm @shameera, that's very true. Perhaps we can store the submission
> requests in the registry. In the event that the orchestrator goes down,
> we can recover them through the registry afterwards.
>
> @Yoshimoto, I didn't think about that - will take it into
> consideration. Thanks for the insight!
>
> On Mon, Aug 3, 2015 at 12:11 PM, K Yoshimoto <[email protected]> wrote:
>
>> I think you also want to put in a check for successful submission,
>> then take appropriate action on failed submission. It can be
>> difficult to keep the submission limit up-to-date.
>>
>> On Mon, Aug 03, 2015 at 11:03:46AM -0400, Douglas Chau wrote:
>> > Hey Devs,
>> >
>> > Just wanted to get some input on our plan to implement the queue
>> > throttling feature.
>> >
>> > Batch Queue Throttling:
>> > - In Orchestrator, the current submit() function in GFACPassiveJobSubmitter
>> >   publishes jobs to RabbitMQ immediately.
>> > - Instead of publishing immediately, we should pass the messages to a new
>> >   component, call it BatchQueueClass.
>> > - This new BatchQueueClass component will periodically check to see when
>> >   we can unload jobs to submit.
>> >
>> > Adding BatchQueueClass:
>> > - Set up a new table (or tables) containing compute resource names and their
>> >   corresponding queues' current job counts and maximum job limits.
>> > - Data models in Airavata have information on the maximum job submission
>> >   limit for a queue, but no data on how many jobs are currently running.
>> > - The current job count will effectively act as a counter, incremented
>> >   when a job is submitted and decremented when a job is completed.
>> > - Once that is done, BatchQueueClass needs to periodically check the new
>> >   table to see if the user's requested queue's current job count is below
>> >   the queue job limit. If it is, we can pop jobs off and submit them until
>> >   we hit the job limit; if not, we wait until we're under the job limit.
>> >
>> > How does this sound?
>> >
>> > Doug
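The counter-plus-drain loop proposed above can be sketched as follows. This is a language-agnostic illustration, not Airavata code: the in-memory counter and deque stand in for the proposed registry table and the RabbitMQ publisher, and all names (`BatchQueue`, `drain`, `publish`) are hypothetical:

```python
# Minimal sketch of the proposed BatchQueueClass: hold jobs back instead
# of publishing immediately, and drain them only while the per-queue
# counter stays under the queue's job limit.
from collections import deque

class BatchQueue:
    def __init__(self, job_limit: int):
        self.job_limit = job_limit    # queue's maximum concurrent jobs
        self.current_jobs = 0         # counter: ++ on submit, -- on completion
        self.pending = deque()        # jobs held back from immediate publishing

    def enqueue(self, job):
        """Called by Orchestrator instead of publishing to RabbitMQ directly."""
        self.pending.append(job)

    def on_job_completed(self):
        """Completion callback frees a slot for the next drain pass."""
        self.current_jobs -= 1

    def drain(self, publish):
        """Periodic pass: submit pending jobs while under the job limit."""
        while self.pending and self.current_jobs < self.job_limit:
            publish(self.pending.popleft())  # e.g. hand off to the submitter
            self.current_jobs += 1

# Usage: with a limit of 2, the third job waits until a slot frees up.
q = BatchQueue(job_limit=2)
for j in ("a", "b", "c"):
    q.enqueue(j)
sent = []
q.drain(sent.append)      # submits "a" and "b"; "c" stays pending
q.on_job_completed()      # one job finishes, freeing a slot
q.drain(sent.append)      # submits "c"
print(sent)               # -> ['a', 'b', 'c']
```

Note this sketch only covers the happy path; per Yoshimoto's point above, a real implementation would also need to handle failed submissions (e.g. not incrementing the counter, or re-queuing the job).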
