On Thu, Sep 4, 2014 at 5:55 PM, Suresh Marru <[email protected]> wrote:
> Eran,
>
> This is a good read and in fact sounds very similar in situation (picking a well-known solution vs. writing our own).
>
> "As you may recollect, Airavata’s key challenge is in identifying the resources which have the shortest queue time across many resources."
>
> Well... to be precise, Airavata needs to identify the resource which allows the user application to execute with a minimum time. Queue time is only one factor which decides that. Resources accessible to the community account are another factor. There are more factors the scheduler needs to take into account, e.g. speed, memory, number of cores per node, etc. If you want to make the scheduler more interesting you can also consider parameters such as job placement within nodes, network connectivity, NUMA patterns, etc. But I think those are too much, at least for an initial version of the scheduler.
>
> Thanks,
> -Amila
>
> And of course, it will have use cases like re-using cloud resources for individual jobs that are part of a larger workflow (a flavor of your thesis topic, if you still remember) and so on. So my question is: are Mesos' or Aurora's use cases limited to managing a fixed set of resources, that is, the challenge of spreading M jobs across N resources efficiently with fair-share and varying memory and I/O requirements? Or did you also come across examples which will resonate with meta-schedulers interacting with multiple lower-level schedulers?
>
> Thanks,
> Suresh
>
> On Sep 4, 2014, at 5:38 PM, Eran Chinthaka Withana <[email protected]> wrote:
>
> > oops, sorry. Here it is:
> > http://www.mail-archive.com/[email protected]/msg01417.html
> >
> > Thanks,
> > Eran Chinthaka Withana
> >
> > On Thu, Sep 4, 2014 at 2:22 PM, Suresh Marru <[email protected]> wrote:
> >
> >> Hi Eran, Jijoe,
> >>
> >> Can you share the missing reference you indicate below?
> >>
> >> Of course, by all means it is good for Airavata to build over projects like Mesos; that is my motivation for this discussion.
> >> I am not yet suggesting implementing a scheduler; that would be a distraction. The meta-scheduler I illustrated is mere routing to be injected into Airavata job management with a simple FIFO. We look forward to hearing options from you all on what the right third-party software is. Manu Singh, a first-year graduate student at IU, volunteers to do an academic study of these solutions, so we will appreciate pointers.
> >>
> >> Suresh
> >>
> >> On Sep 3, 2014, at 11:59 AM, Eran Chinthaka Withana <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> Before you go ahead and implement this on your own, consider reading this mail thread [1] and looking at how frameworks like Apache Aurora do it on top of Apache Mesos. These may provide good inputs for this implementation.
> >>>
> >>> (thanks also to Jijoe, who provided input for this)
> >>>
> >>> Thanks,
> >>> Eran Chinthaka Withana
> >>>
> >>> On Wed, Sep 3, 2014 at 5:50 AM, Suresh Marru <[email protected]> wrote:
> >>>
> >>>> Thank you all for comments and suggestions. I summarized the discussion as an implementation plan on a wiki page:
> >>>>
> >>>> https://cwiki.apache.org/confluence/display/AIRAVATA/Airavata+Metascheduler
> >>>>
> >>>> If this is amenable, we can take this to the dev list to plan the development in two phases: first implement the Throttle-Job in the short term, and then plan the Auto-Scheduling capabilities.
> >>>>
> >>>> Suresh
> >>>>
> >>>> On Sep 2, 2014, at 1:50 PM, Gary E. Gorbet <[email protected]> wrote:
> >>>>
> >>>>> It seems to me that among the many possible functions a metascheduler (MS) would provide, there are two separate ones that must be addressed first. The two use cases implied are as follows.
> >>>>>
> >>>>> (1) The gateway submits a group of jobs to a specified resource where the count of jobs exceeds the resource’s queued-job limit.
> >>>>> Let’s say 300 very quick jobs are submitted, where the limit is 50 per community user. The MS must maintain an internal queue and release jobs to the resource in groups with job counts under the limit (say, 40 at a time).
> >>>>>
> >>>>> (2) The gateway submits a job or set of jobs with a flag that specifies that Airavata choose the resource. Here, MCP or some other mechanism arrives eventually at the specific resource that completes the job(s).
> >>>>>
> >>>>> Where both uses are needed - unspecified resource and a group of jobs with count exceeding limits - the MS action would be best defined by knowing the definitions and mechanisms employed in the two separate functions. For example, if MCP is employed, the initial brute-force test submissions might need to be done using the determined number of jobs at a time (e.g., 40). But the design here must adhere to design criteria arrived at for both function (1) and function (2).
> >>>>>
> >>>>> In UltraScan’s case, the most immediate need is for (1). The user could manually determine the best resource or just make a reasonable guess. What the user does not want to do is manually release jobs 40 at a time. The gateway interface allows specification of a group of 300 jobs, and the user does not care what is going on under the covers to effect the running of all of them eventually. So, I guess I am lobbying for addressing (1) first; both to meet UltraScan’s immediate need and to elucidate the design of more sophisticated functionality.
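[The internal-queue behaviour described in use case (1) can be sketched roughly as follows. This is a minimal illustration, not Airavata code: the class and method names are hypothetical, and the limit of 50 and batch size of 40 come from the Stampede example in the thread.]

```python
from collections import deque

class ThrottleQueue:
    """Holds jobs the gateway has accepted but not yet released, so the
    per-community-user queued-job limit on the resource is never exceeded."""

    def __init__(self, queued_job_limit=50, batch_size=40):
        self.pending = deque()   # jobs waiting inside the gateway (FIFO)
        self.submitted = set()   # jobs currently queued/running on the resource
        self.limit = queued_job_limit
        self.batch_size = batch_size

    def enqueue(self, job_id):
        self.pending.append(job_id)

    def on_job_finished(self, job_id):
        # Called when the resource reports a job done; frees a slot.
        self.submitted.discard(job_id)

    def release(self):
        """Return the next batch of job ids to submit to the resource,
        never exceeding the resource's queued-job limit."""
        room = min(self.batch_size, self.limit - len(self.submitted))
        batch = [self.pending.popleft()
                 for _ in range(min(max(room, 0), len(self.pending)))]
        self.submitted.update(batch)
        return batch
```

[With 300 jobs enqueued, the first `release()` yields 40 jobs, the second only 10 (hitting the 50-job limit), and further batches flow as completions free slots.]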
> >>>>>
> >>>>> - Gary
> >>>>>
> >>>>> On Sep 2, 2014, at 12:02 PM, Suresh Marru <[email protected]> wrote:
> >>>>>
> >>>>>> Hi Kenneth,
> >>>>>>
> >>>>>> On Sep 2, 2014, at 12:44 PM, K Yoshimoto <[email protected]> wrote:
> >>>>>>
> >>>>>>> The tricky thing is the need to maintain an internal queue of jobs when the Stampede queued-jobs limit is reached. If Airavata has an internal representation for jobs to be submitted, I think you are most of the way there.
> >>>>>>
> >>>>>> Airavata has an internal representation of jobs, but there is no good global view of all the jobs running on a given resource for a given community account. We are trying to fix this; once this is done, as you say, the FIFO implementation should be straightforward.
> >>>>>>
> >>>>>>> It is tricky to do resource-matching scheduling when the job mix is not known. For example, the scheduler does not know whether to preserve memory vs. cores when deciding where to place a job. Also, the interactions of the gateway scheduler and the local schedulers may be complicated to predict.
> >>>>>>>
> >>>>>>> Fair share is probably not a good idea. In practice, it tends to disrupt the other scheduling policies such that one group is penalized and the others don't run much earlier.
> >>>>>>
> >>>>>> Interesting. What do you think of the capacity-based scheduling algorithm (linked below)?
> >>>>>>
> >>>>>>> One option is to maintain the gateway job queue internally, then use the MCP brute-force approach: submit to all resources, then cancel after the first job start. You may also want to allow the gateway to set per-resource policy limits on number of jobs, job duration, job core size, SUs, etc.
> >>>>>>
> >>>>>> MCP is something we should try.
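[The MCP-style brute force Kenneth describes (submit everywhere, keep whichever starts first, cancel the rest) reduces to a small polling loop. A sketch under assumptions: `submit`, `has_started`, and `cancel` are hypothetical stand-ins for whatever per-resource adapters the gateway exposes, not real Airavata APIs.]

```python
import time

def mcp_submit(job, resources, submit, has_started, cancel, poll_interval=5.0):
    """Submit `job` to every resource; when the first copy starts running,
    cancel the duplicates and return the winning (resource, handle) pair.

    submit(resource, job) -> handle
    has_started(handle)   -> bool
    cancel(handle)        -> None
    All three are caller-supplied adapter callables."""
    handles = {r: submit(r, job) for r in resources}
    while True:
        for resource, handle in handles.items():
            if has_started(handle):
                # Winner found: withdraw the redundant submissions.
                for other, h in handles.items():
                    if other != resource:
                        cancel(h)
                return resource, handle
        time.sleep(poll_interval)  # none started yet; poll again later
```

[Note the obvious cost: N-1 cancelled submissions per job, which is why the thread also discusses per-resource policy limits to keep the probing polite.]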
> >>>>>> The limits per gateway per resource exist, but we need to exercise these capabilities.
> >>>>>>
> >>>>>> Suresh
> >>>>>>
> >>>>>>> On Tue, Sep 02, 2014 at 07:50:12AM -0400, Suresh Marru wrote:
> >>>>>>>> Hi All,
> >>>>>>>>
> >>>>>>>> We need some guidance on identifying a scheduling strategy and a pluggable third-party implementation for Airavata's scheduling needs. For context, let me describe the use cases for scheduling within Airavata:
> >>>>>>>>
> >>>>>>>> * If a gateway/user is submitting a series of jobs, Airavata is currently not throttling them and is sending them to compute clusters (in a FIFO way). Resources enforce a per-user job limit within a queue to ensure fair use of the clusters (example: Stampede allows 50 jobs per user in the normal queue [1]). Airavata will need to implement queues and throttle jobs respecting the max-jobs-per-queue limits of an underlying resource queue.
> >>>>>>>>
> >>>>>>>> * The current version of Airavata is also not performing job scheduling across available computational resources, and expects gateways/users to pick resources during experiment launch. Airavata will need to implement schedulers which become aware of existing loads on the clusters and spread jobs efficiently. The scheduler should be able to get access to heuristics on previous executions and current requirements, which include job size (number of nodes/cores), memory requirements, wall-time estimates, and so forth.
> >>>>>>>>
> >>>>>>>> * As Airavata is mapping multiple individual user jobs into one or more community-account submissions, it also becomes critical to implement fair-share scheduling among these users to ensure fair use of allocations as well as allowable queue limits.
> >>>>>>>>
> >>>>>>>> Other use cases?
> >>>>>>>>
> >>>>>>>> We will greatly appreciate it if folks on this list can shed light on experiences using schedulers implemented in Hadoop, Mesos, Storm, or other frameworks outside of their intended use. For instance, the Hadoop (YARN) capacity [2] and fair schedulers [3][4][5] seem to meet the needs of Airavata. Is it a good idea to attempt to reuse these implementations? Are there any other pluggable third-party alternatives?
> >>>>>>>>
> >>>>>>>> Thanks in advance for your time and insights,
> >>>>>>>>
> >>>>>>>> Suresh
> >>>>>>>>
> >>>>>>>> [1] - https://www.tacc.utexas.edu/user-services/user-guides/stampede-user-guide#running
> >>>>>>>> [2] - http://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
> >>>>>>>> [3] - http://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/FairScheduler.html
> >>>>>>>> [4] - https://issues.apache.org/jira/browse/HADOOP-3746
> >>>>>>>> [5] - https://issues.apache.org/jira/browse/YARN-326
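[For the fair-share use case in the thread (many gateway users multiplexed onto one community account), the core decision reduces to: each time a submission slot opens, pick the pending job of whichever user is currently furthest below their share. A minimal sketch of that selection rule, for illustration only; the YARN Fair Scheduler linked in [3] implements a far more complete version, and all names here are hypothetical.]

```python
def pick_next_user(pending, running, shares):
    """Pick the user whose next job should be released to the resource.

    pending: user -> number of jobs queued inside the gateway
    running: user -> number of that user's jobs already on the resource
    shares:  user -> relative share weight (all 1.0 for equal shares)

    Returns the user with the smallest usage-to-share ratio among users
    who actually have work waiting, or None if nobody does."""
    candidates = [u for u, n in pending.items() if n > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda u: running.get(u, 0) / shares[u])
```

[For example, if alice already has 10 jobs running and bob only 2, with equal shares, bob's pending job is released first, even if alice's jobs were enqueued earlier. This is the behaviour Kenneth cautions about: fair-share reorders the plain FIFO policy, so one user's backlog can be deferred in favour of lighter users.]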
