On Tue, Sep 02, 2014 at 01:02:18PM -0400, Suresh Marru wrote:
> Hi Kenneth,
>
> On Sep 2, 2014, at 12:44 PM, K Yoshimoto <[email protected]> wrote:
>
> >
> > The tricky thing is the need to maintain an internal queue of
> > jobs when the Stampede queued jobs limit is reached. If airavata
> > has an internal representation for jobs to be submitted, I think you
> > are most of the way there.
>
> Airavata has an internal representation of jobs, but there is no good global
> view of all the jobs running on a given resource for a given community
> account. We are trying to fix this; once this is done, as you say, the FIFO
> implementation should be straightforward.
>
> > It is tricky to do resource-matching scheduling when the job mix
> > is not known. For example, the scheduler does not know whether
> > to preserve memory vs cores when deciding where to place a job.
> > Also, the interactions of the gateway scheduler and the local
> > schedulers may be complicated to predict.
> >
> > Fair share is probably not a good idea. In practice, it tends
> > to disrupt the other scheduling policies such that one group is
> > penalized and the others don't run much earlier.
>
> Interesting. What do you think of the capacity-based scheduling algorithm
> (linked below)?
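[To make the internal-queue idea above concrete, here is a minimal sketch of a gateway-side FIFO queue that holds jobs back once a resource's queued-job limit (e.g. Stampede's 50 jobs per user) is reached. This is not Airavata code; the class and method names are hypothetical.]

```python
from collections import deque

class ThrottledFifoQueue:
    """Gateway-internal FIFO queue that respects a per-resource queued-job
    limit. Hypothetical sketch, not Airavata's actual internals."""

    def __init__(self, max_queued_jobs):
        self.max_queued_jobs = max_queued_jobs  # e.g. 50 on Stampede's normal queue
        self.pending = deque()   # jobs held inside the gateway
        self.submitted = set()   # jobs currently queued/running on the resource

    def enqueue(self, job):
        self.pending.append(job)

    def job_finished(self, job):
        # Called when the resource reports a job done; frees a remote slot.
        self.submitted.discard(job)

    def drain(self, submit):
        """Submit pending jobs in FIFO order until the remote limit is hit."""
        while self.pending and len(self.submitted) < self.max_queued_jobs:
            job = self.pending.popleft()
            submit(job)
            self.submitted.add(job)

# Toy demonstration with a limit of 2:
q = ThrottledFifoQueue(max_queued_jobs=2)
for j in ("job-a", "job-b", "job-c"):
    q.enqueue(j)
sent = []
q.drain(sent.append)       # only two jobs go out; job-c stays pending
q.job_finished("job-a")    # a remote slot frees up
q.drain(sent.append)       # job-c is now submitted
```

[The key point is that `drain` only needs the global count of jobs already on the resource, which is exactly the per-community-account view Suresh says is currently missing.]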
I scanned through the YARN stuff, and it was not clear to me what their
scheduling algorithm is. It looks like they only do resource-based scheduling
for memory requirements. Also, it looks more like a way to schedule a cluster
than a metascheduler.

> >
> > One option is to maintain the gateway job queue internally,
> > then use the MCP brute-force approach: submit to all resources,
> > then cancel after the first job start. You may also want to
> > allow the gateway to set per-resource policy limits on
> > number of jobs, job duration, job core size, SUs, etc.
>
> MCP is something we should try. The limits per gateway per resource exist,
> but we need to exercise these capabilities.

I don't think there's a need to use any of the MCP Python code. Instead, just
implement the simple brute-force approach in the airavata scheduling routines.

Kenneth

>
> Suresh
>
> >
> > On Tue, Sep 02, 2014 at 07:50:12AM -0400, Suresh Marru wrote:
> >> Hi All,
> >>
> >> Need some guidance on identifying a scheduling strategy and a pluggable
> >> third-party implementation for airavata scheduling needs. For context let
> >> me describe the use cases for scheduling within airavata:
> >>
> >> * If a gateway/user is submitting a series of jobs, airavata is currently
> >> not throttling them and is sending them to compute clusters (in a FIFO
> >> way). Resources enforce a per-user job limit within a queue to ensure fair
> >> use of the clusters (example: stampede allows 50 jobs per user in the
> >> normal queue [1]). Airavata will need to implement queues and throttle
> >> jobs, respecting the max-jobs-per-queue limits of an underlying resource
> >> queue.
> >>
> >> * The current version of Airavata is also not performing job scheduling
> >> across available computational resources, and expects gateways/users to
> >> pick resources during experiment launch. Airavata will need to implement
> >> schedulers which become aware of existing loads on the clusters and spread
> >> jobs efficiently.
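[The MCP brute-force approach Kenneth describes can be sketched in a few lines: submit the same job to every resource, wait for one copy to start, then cancel the redundant copies. All names below are hypothetical callbacks, not Airavata or MCP APIs.]

```python
# Sketch of the brute-force multi-submit: the gateway submits everywhere
# and keeps only the first copy that starts running.

def brute_force_submit(job, resources, submit, cancel, wait_for_first_start):
    """Submit `job` to every resource; cancel all but the first to start."""
    handles = {r: submit(r, job) for r in resources}   # one submission per resource
    winner = wait_for_first_start(handles)             # blocks until some copy starts
    for resource, handle in handles.items():
        if resource != winner:
            cancel(handle)                             # withdraw the redundant copies
    return winner

# Toy demonstration with stub callbacks:
cancelled = []
winner = brute_force_submit(
    "job-1",
    ["stampede", "comet"],
    submit=lambda r, j: (r, j),
    cancel=cancelled.append,
    wait_for_first_start=lambda handles: "comet",  # pretend comet started first
)
```

[In practice `cancel` must be robust to the race where a second copy starts before the cancellation lands, and the per-resource policy limits Kenneth mentions (job count, duration, core size, SUs) would be checked before each `submit` call.]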
> >> The scheduler should be able to get access to heuristics
> >> on previous executions and current requirements, which include job size
> >> (number of nodes/cores), memory requirements, wall-time estimates, and so
> >> forth.
> >>
> >> * As Airavata is mapping multiple individual user jobs into one or more
> >> community-account submissions, it also becomes critical to implement
> >> fair-share scheduling among these users to ensure fair use of allocations
> >> as well as allowable queue limits.
> >>
> >> Other use cases?
> >>
> >> We would greatly appreciate it if folks on this list can shed light on
> >> experiences using schedulers implemented in hadoop, mesos, storm or other
> >> frameworks outside of their intended use. For instance, the hadoop (yarn)
> >> capacity [2] and fair schedulers [3][4][5] seem to meet the needs of
> >> airavata. Is it a good idea to attempt to reuse these implementations? Any
> >> other pluggable third-party alternatives?
> >>
> >> Thanks in advance for your time and insights,
> >>
> >> Suresh
> >>
> >> [1] - https://www.tacc.utexas.edu/user-services/user-guides/stampede-user-guide#running
> >> [2] - http://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
> >> [3] - http://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/FairScheduler.html
> >> [4] - https://issues.apache.org/jira/browse/HADOOP-3746
> >> [5] - https://issues.apache.org/jira/browse/YARN-326
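[A minimal form of the fair-share idea from the community-account use case above: when several users' jobs funnel through one allocation, pick the next job from the user who has consumed the least recently. This is an illustrative sketch with hypothetical names; real fair-share (as in the YARN Fair Scheduler [3]) also decays usage over time and weights users.]

```python
# Pick the next job from the user with the lowest recent usage (in SUs),
# so no single user of the community account starves the others.

def pick_next_job(pending_by_user, recent_usage):
    """pending_by_user: {user: [jobs...]}; recent_usage: {user: SUs used}."""
    candidates = [u for u, jobs in pending_by_user.items() if jobs]
    if not candidates:
        return None
    user = min(candidates, key=lambda u: recent_usage.get(u, 0.0))
    return user, pending_by_user[user].pop(0)

# Toy demonstration: bob has used fewer SUs, so his job goes first.
pending = {"alice": ["a1", "a2"], "bob": ["b1"]}
usage = {"alice": 120.0, "bob": 30.0}
first = pick_next_job(pending, usage)
```

[Note Kenneth's caveat earlier in the thread: in his experience fair share can interact badly with the local schedulers' own policies, so this selection step would sit on top of, not replace, the per-resource throttling.]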
