The tricky thing is the need to maintain an internal queue of
jobs once the Stampede queued-job limit is reached.  If Airavata
already has an internal representation of jobs waiting to be
submitted, I think you are most of the way there.
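
Very roughly, what I have in mind is something like the sketch below
(plain Java, nothing Airavata-specific; the class and method names are
made up): hold jobs locally and only drain them to a resource while it
is under its per-user queued-job limit.

    import java.util.ArrayDeque;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Queue;

    public class GatewayJobQueue {
        // Hypothetical per-resource caps, e.g. 50 for Stampede's normal queue.
        private final Map<String, Integer> maxQueuedJobs = new HashMap<>();
        private final Map<String, Integer> currentlyQueued = new HashMap<>();
        private final Queue<String> pendingJobs = new ArrayDeque<>();

        public GatewayJobQueue() {
            maxQueuedJobs.put("stampede", 50);
        }

        public void enqueue(String jobId) {
            pendingJobs.add(jobId);
        }

        // Called periodically: submit from the internal queue only while
        // the remote resource has headroom under its per-user limit.
        public void drainTo(String resource) {
            int limit = maxQueuedJobs.getOrDefault(resource, Integer.MAX_VALUE);
            while (!pendingJobs.isEmpty()
                    && currentlyQueued.getOrDefault(resource, 0) < limit) {
                submitToResource(resource, pendingJobs.poll());
                currentlyQueued.merge(resource, 1, Integer::sum);
            }
        }

        public void jobFinished(String resource) {
            currentlyQueued.merge(resource, -1, Integer::sum);
        }

        private void submitToResource(String resource, String jobId) {
            // Placeholder for the actual submission path (GRAM/SSH/Slurm/etc.).
            System.out.println("submit " + jobId + " to " + resource);
        }
    }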

It is tricky to do resource-matching scheduling when the job mix
is not known in advance.  For example, the scheduler cannot tell
whether to preserve memory or cores when deciding where to place
a job.  Also, the interactions between the gateway scheduler and
the local schedulers may be hard to predict.
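
As a toy illustration (made-up numbers): the same two resources can
rank in opposite orders depending on whether the placement score tries
to preserve free memory or free cores.

    public class PlacementScore {
        public static void main(String[] args) {
            // resource A: 512 GB free,   8 cores free
            // resource B:  64 GB free, 128 cores free
            double[] a = {512, 8};
            double[] b = {64, 128};
            System.out.println("memory-preserving picks: " + (a[0] > b[0] ? "A" : "B"));
            System.out.println("core-preserving picks:   " + (a[1] > b[1] ? "A" : "B"));
        }
    }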

Fair share is probably not a good idea.  In practice, it tends
to disrupt the other scheduling policies: one group gets penalized
while the others don't actually run much earlier.

One option is to maintain the gateway job queue internally,
then use the MCP brute-force approach: submit to all resources,
then cancel the remaining copies once the first one starts.  You
may also want to allow the gateway to set per-resource policy
limits on number of jobs, job duration, job core size, SUs, etc.
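
A rough sketch of that brute-force loop, with submitJob, hasStarted,
and cancelJob as stand-ins for whatever the real middleware calls
would be:

    import java.util.ArrayList;
    import java.util.List;

    public class RedundantSubmitter {
        // Submit one copy of the job to every candidate resource, wait
        // for the first copy to start, then cancel the rest.
        public void run(String jobSpec, List<String> resources) throws InterruptedException {
            List<String> submissions = new ArrayList<>();
            for (String r : resources) {
                submissions.add(submitJob(r, jobSpec));
            }
            String winner = null;
            while (winner == null) {
                for (String id : submissions) {
                    if (hasStarted(id)) { winner = id; break; }
                }
                if (winner == null) Thread.sleep(30_000);  // poll the remote queues
            }
            for (String id : submissions) {
                if (!id.equals(winner)) cancelJob(id);     // withdraw the duplicates
            }
        }

        // Placeholders for the real middleware calls.
        private String submitJob(String resource, String spec) { return resource + ":" + spec; }
        private boolean hasStarted(String submissionId) { return true; }
        private void cancelJob(String submissionId) { }
    }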

On Tue, Sep 02, 2014 at 07:50:12AM -0400, Suresh Marru wrote:
> Hi All,
> 
> Need some guidance on identifying a scheduling strategy and a pluggable third 
> party implementation for airavata scheduling needs. For context let me 
> describe the use cases for scheduling within airavata:
> 
> * If a gateway/user is submitting a series of jobs, Airavata is currently 
> not throttling them and is sending them to compute clusters in a FIFO way. 
> Resources enforce per-user job limits within a queue to ensure fair use of 
> the clusters (example: Stampede allows 50 jobs per user in the normal queue 
> [1]). Airavata will need to implement queues and throttle jobs respecting 
> the max-jobs-per-queue limits of an underlying resource queue. 
>  
> * The current version of Airavata also does not perform job scheduling 
> across available computational resources; it expects gateways/users to pick 
> resources during experiment launch. Airavata will need to implement 
> schedulers that are aware of the existing loads on the clusters and spread 
> jobs efficiently. The scheduler should have access to heuristics on 
> previous executions and to current requirements, including job size (number 
> of nodes/cores), memory requirements, wall time estimates, and so forth. 
> 
> * As Airavata maps multiple individual user jobs into one or more 
> community-account submissions, it also becomes critical to implement 
> fair-share scheduling among these users to ensure fair use of allocations 
> as well as of the allowable queue limits.
> 
> Other use cases? 
> 
> We would greatly appreciate it if folks on this list could shed light on 
> experiences using schedulers implemented in Hadoop, Mesos, Storm, or other 
> frameworks outside of their intended use. For instance, the Hadoop (YARN) 
> capacity [2] and fair schedulers [3][4][5] seem to meet the needs of 
> Airavata. Is it a good idea to attempt to reuse these implementations? Are 
> there any other pluggable third-party alternatives? 
> 
> Thanks in advance for your time and insights,
> 
> Suresh
> 
> [1] - 
> https://www.tacc.utexas.edu/user-services/user-guides/stampede-user-guide#running
> [2] - 
> http://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
> [3] - 
> http://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/FairScheduler.html
> [4] - https://issues.apache.org/jira/browse/HADOOP-3746
> [5] - https://issues.apache.org/jira/browse/YARN-326
> 
> 
