oops, sorry. Here it is: http://www.mail-archive.com/[email protected]/msg01417.html
Thanks,
Eran Chinthaka Withana

On Thu, Sep 4, 2014 at 2:22 PM, Suresh Marru <[email protected]> wrote:

> Hi Eran, Jijoe,
>
> Can you share the missing reference you indicate below?
>
> Of course, by all means it is good for Airavata to build over projects like Mesos; that's my motivation for this discussion. I am not yet suggesting implementing a scheduler; that would be a distraction. The metascheduler I illustrated is mere routing to be injected into Airavata job management with a simple FIFO. We look forward to hearing from you all on what the right third-party software is. Manu Singh, a first-year graduate student at IU, has volunteered to do an academic study of these solutions, so we will appreciate pointers.
>
> Suresh
>
> On Sep 3, 2014, at 11:59 AM, Eran Chinthaka Withana <[email protected]> wrote:
>
> > Hi,
> >
> > Before you go ahead and implement on your own, consider reading this mail thread [1] and looking at how frameworks like Apache Aurora do it on top of Apache Mesos. These may provide good inputs for this implementation.
> >
> > (Thanks also to Jijoe, who provided input for this.)
> >
> > Thanks,
> > Eran Chinthaka Withana
> >
> > On Wed, Sep 3, 2014 at 5:50 AM, Suresh Marru <[email protected]> wrote:
> >
> >> Thank you all for comments and suggestions. I summarized the discussion as an implementation plan on a wiki page:
> >>
> >> https://cwiki.apache.org/confluence/display/AIRAVATA/Airavata+Metascheduler
> >>
> >> If this is amenable, we can take this to the dev list to plan the development in two phases: first implement the Throttle-Job capability in the short term, and then plan the Auto-Scheduling capabilities.
> >>
> >> Suresh
> >>
> >> On Sep 2, 2014, at 1:50 PM, Gary E. Gorbet <[email protected]> wrote:
> >>
> >>> It seems to me that among the many possible functions a metascheduler (MS) would provide, there are two separate ones that must be addressed first. The two use cases implied are as follows.
> >>>
> >>> (1) The gateway submits a group of jobs to a specified resource where the count of jobs exceeds the resource's queued-job limit. Let's say 300 very quick jobs are submitted, where the limit is 50 per community user. The MS must maintain an internal queue and release jobs to the resource in groups with job counts under the limit (say, 40 at a time).
> >>>
> >>> (2) The gateway submits a job or set of jobs with a flag that specifies that Airavata choose the resource. Here, MCP or some other mechanism arrives eventually at the specific resource that completes the job(s).
> >>>
> >>> Where both uses are needed - an unspecified resource and a group of jobs with count exceeding limits - the MS action would be best defined by knowing the definitions and mechanisms employed in the two separate functions. For example, if MCP is employed, the initial brute-force test submissions might need to be done using the determined number of jobs at a time (e.g., 40). But the design here must adhere to design criteria arrived at for both function (1) and function (2).
> >>>
> >>> In UltraScan's case, the most immediate need is for (1). The user could manually determine the best resource or just make a reasonable guess. What the user does not want to do is manually release jobs 40 at a time.
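A minimal sketch of the kind of internal FIFO throttle described in (1), assuming a hypothetical JobSubmitter abstraction and a fixed release batch size; none of these class or method names are existing Airavata APIs, and a real implementation would also need the global per-resource job view discussed further down the thread:

import java.util.ArrayDeque;
import java.util.Queue;

/**
 * Illustrative only: releases queued gateway jobs to a resource in batches,
 * keeping the number of jobs queued on the resource under a per-community-user
 * limit (e.g. 50 on Stampede, releasing roughly 40 at a time).
 * JobSubmitter and the job id type are hypothetical placeholders.
 */
public class FifoJobThrottle {

    public interface JobSubmitter {
        /** Submits one job to the target resource; returns once it is queued there. */
        void submit(String jobId);
        /** Number of this community user's jobs currently queued/running on the resource. */
        int queuedJobCount();
    }

    private final Queue<String> pendingJobs = new ArrayDeque<>();
    private final JobSubmitter submitter;
    private final int maxQueuedPerUser;   // resource policy, e.g. 50
    private final int releaseBatchSize;   // stay safely under the limit, e.g. 40

    public FifoJobThrottle(JobSubmitter submitter, int maxQueuedPerUser, int releaseBatchSize) {
        this.submitter = submitter;
        this.maxQueuedPerUser = maxQueuedPerUser;
        this.releaseBatchSize = releaseBatchSize;
    }

    /** Called when the gateway hands over a group of jobs (e.g. 300 at once). */
    public synchronized void enqueue(Iterable<String> jobIds) {
        jobIds.forEach(pendingJobs::add);
        releaseEligibleJobs();
    }

    /** Called periodically, or whenever a job on the resource finishes. */
    public synchronized void releaseEligibleJobs() {
        int headroom = maxQueuedPerUser - submitter.queuedJobCount();
        int toRelease = Math.min(headroom, releaseBatchSize);
        while (toRelease-- > 0 && !pendingJobs.isEmpty()) {
            submitter.submit(pendingJobs.poll());
        }
    }
}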
> >>> The gateway interface allows specification of a group of 300 jobs, and the user does not care what is going on under the covers to effect the running of all of them eventually. So, I guess I am lobbying for addressing (1) first, both to meet UltraScan's immediate need and to elucidate the design of more sophisticated functionality.
> >>>
> >>> - Gary
> >>>
> >>> On Sep 2, 2014, at 12:02 PM, Suresh Marru <[email protected]> wrote:
> >>>
> >>>> Hi Kenneth,
> >>>>
> >>>> On Sep 2, 2014, at 12:44 PM, K Yoshimoto <[email protected]> wrote:
> >>>>
> >>>>> The tricky thing is the need to maintain an internal queue of jobs when the Stampede queued-jobs limit is reached. If Airavata has an internal representation for jobs to be submitted, I think you are most of the way there.
> >>>>
> >>>> Airavata has an internal representation of jobs, but there is no good global view of all the jobs running on a given resource for a given community account. We are trying to fix this; once that is done, as you say, the FIFO implementation should be straightforward.
> >>>>
> >>>>> It is tricky to do resource-matching scheduling when the job mix is not known. For example, the scheduler does not know whether to preserve memory vs. cores when deciding where to place a job. Also, the interactions of the gateway scheduler and the local schedulers may be complicated to predict.
> >>>>>
> >>>>> Fair share is probably not a good idea. In practice, it tends to disrupt the other scheduling policies such that one group is penalized and the others don't run much earlier.
> >>>>
> >>>> Interesting. What do you think of the capacity-based scheduling algorithm (linked below)?
> >>>>
> >>>>> One option is to maintain the gateway job queue internally, then use the MCP brute-force approach: submit to all resources, then cancel after the first job start. You may also want to allow the gateway to set per-resource policy limits on number of jobs, job duration, job core size, SUs, etc.
> >>>>
> >>>> MCP is something we should try. The limits per gateway per resource exist, but we need to exercise these capabilities.
> >>>>
> >>>> Suresh
> >>>>
> >>>>> On Tue, Sep 02, 2014 at 07:50:12AM -0400, Suresh Marru wrote:
> >>>>>> Hi All,
> >>>>>>
> >>>>>> Need some guidance on identifying a scheduling strategy and a pluggable third-party implementation for Airavata's scheduling needs. For context, let me describe the use cases for scheduling within Airavata:
> >>>>>>
> >>>>>> * If a gateway/user is submitting a series of jobs, Airavata is currently not throttling them and is sending them to compute clusters (in a FIFO way). Resources enforce per-user job limits within a queue to ensure fair use of the clusters (example: Stampede allows 50 jobs per user in the normal queue [1]). Airavata will need to implement queues and throttle jobs respecting the max-jobs-per-queue limits of an underlying resource queue.
> >>>>>>
> >>>>>> * The current version of Airavata is also not performing job scheduling across available computational resources, and expects gateways/users to pick resources during experiment launch. Airavata will need to implement schedulers which become aware of existing loads on the clusters and spread jobs efficiently.
> >>>>>> The scheduler should be able to get access to heuristics on previous executions and current requirements, which include job size (number of nodes/cores), memory requirements, wall-time estimates and so forth.
> >>>>>>
> >>>>>> * As Airavata is mapping multiple individual user jobs into one or more community-account submissions, it also becomes critical to implement fair-share scheduling among these users to ensure fair use of allocations as well as allowable queue limits.
> >>>>>>
> >>>>>> Other use cases?
> >>>>>>
> >>>>>> We will greatly appreciate it if folks on this list can shed light on experiences using schedulers implemented in Hadoop, Mesos, Storm or other frameworks outside of their intended use. For instance, the Hadoop (YARN) capacity [2] and fair schedulers [3][4][5] seem to meet the needs of Airavata. Is it a good idea to attempt to reuse these implementations? Are there any other pluggable third-party alternatives?
> >>>>>>
> >>>>>> Thanks in advance for your time and insights,
> >>>>>>
> >>>>>> Suresh
> >>>>>>
> >>>>>> [1] - https://www.tacc.utexas.edu/user-services/user-guides/stampede-user-guide#running
> >>>>>> [2] - http://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
> >>>>>> [3] - http://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/FairScheduler.html
> >>>>>> [4] - https://issues.apache.org/jira/browse/HADOOP-3746
> >>>>>> [5] - https://issues.apache.org/jira/browse/YARN-326
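A minimal sketch of the MCP-style brute-force placement suggested in the thread: submit the same job to every candidate resource and cancel the redundant copies once the first one starts. The ResourceClient interface and all names below are hypothetical placeholders, not existing Airavata or MCP APIs.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative only: "submit everywhere, keep the first copy that starts" placement.
 * ResourceClient is a hypothetical abstraction over one compute resource.
 */
public class BruteForcePlacement {

    public interface ResourceClient {
        String name();
        /** Submits the job description and returns the resource-local job id. */
        String submit(String jobSpec);
        void cancel(String localJobId);
    }

    private final Map<String, ResourceClient> clients = new HashMap<>();    // resource name -> client
    private final Map<String, String> localJobIds = new HashMap<>();        // resource name -> local job id
    private String winningResource;

    /** Submit the same job description to every candidate resource. */
    public synchronized void submitEverywhere(String jobSpec, List<ResourceClient> candidates) {
        for (ResourceClient resource : candidates) {
            clients.put(resource.name(), resource);
            localJobIds.put(resource.name(), resource.submit(jobSpec));
        }
    }

    /** Invoked from job-status monitoring when one copy reports RUNNING. */
    public synchronized void onJobStarted(String resourceName) {
        if (winningResource != null) {
            return; // another copy already won the race
        }
        winningResource = resourceName;
        for (Map.Entry<String, ResourceClient> entry : clients.entrySet()) {
            if (!entry.getKey().equals(resourceName)) {
                String localId = localJobIds.get(entry.getKey());
                if (localId != null) {
                    entry.getValue().cancel(localId);
                }
            }
        }
    }
}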
