Ok, will do. Thanks for providing some context on this topic.
Alex

On Sun, Jan 11, 2015 at 8:34 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> Priority scheduling isn't something we've supported in Spark; we've
> opted to support FIFO and fair scheduling and asked users to fit
> these to the needs of their applications.
>
> In practice, what I've seen of priority schedulers, such as the
> Linux CPU scheduler, is that strict priority scheduling is never used
> because of priority starvation and other issues. So you end up with a
> second tier of heuristics to deal with issues like starvation,
> priority inversion, etc., and these become very complex over time.
>
> That said, I looked at this a bit with @kayousterhout, and I don't think
> it would be very hard to implement a simple priority scheduler in the
> current architecture. My main concern would be the additional complexity
> that would develop over time, based on previous implementations in the
> wild.
>
> Alessandro, would you be able to open a JIRA and list some of your
> requirements there? That way we could hear whether other people have
> similar needs.
>
> - Patrick
>
> On Sun, Jan 11, 2015 at 10:07 AM, Mark Hamstra <m...@clearstorydata.com> wrote:
> > Yes, if you are asking about developing a new priority-queue job
> > scheduling feature and not just about how job scheduling currently
> > works in Spark, then that's a dev-list issue. The current job
> > scheduling priority is at the granularity of pools containing jobs,
> > not the jobs themselves; so if you require strictly job-level
> > priority queuing, that would require a new development effort -- and
> > one that I expect will involve a lot of tricky corner cases.
> >
> > Sorry for misreading the nature of your initial inquiry.
> > On Sun, Jan 11, 2015 at 7:36 AM, Alessandro Baretta <alexbare...@gmail.com> wrote:
> >> Cody,
> >>
> >> I might be able to improve the scheduling of my jobs by using a few
> >> different pools with weights equal to, say, 1, 1e3, and 1e6,
> >> effectively getting a small handful of priority classes. Still, this
> >> is really not quite what I am describing, which is why my original
> >> post was on the dev list. Let me then ask if there is any interest in
> >> having priority-queue job scheduling in Spark. This is something I
> >> might be able to pull off.
> >>
> >> Alex
> >>
> >> On Sun, Jan 11, 2015 at 6:21 AM, Cody Koeninger <c...@koeninger.org> wrote:
> >>> If you set up a number of pools equal to the number of different
> >>> priority levels you want, make the relative weights of those pools
> >>> very different, and submit each job to the pool representing its
> >>> priority, I think you'll get behavior equivalent to a priority
> >>> queue. Try it and see.
> >>>
> >>> If I'm misunderstanding what you're trying to do, then I don't know.
> >>>
> >>> On Sunday, January 11, 2015, Alessandro Baretta <alexbare...@gmail.com> wrote:
> >>>> Cody,
> >>>>
> >>>> Maybe I'm not getting this, but it doesn't look like this page is
> >>>> describing a priority-queue scheduling policy. What this section
> >>>> discusses is how resources are shared between queues. A weight-1000
> >>>> pool will get 1000 times more resources allocated to it than a
> >>>> weight-1 pool. Great, but not what I want. I want to be able to
> >>>> define an Ordering on my tasks representing their priority, and
> >>>> have Spark allocate all resources to the job that has the highest
> >>>> priority.
> >>>> Alex
> >>>>
> >>>> On Sat, Jan 10, 2015 at 10:11 PM, Cody Koeninger <c...@koeninger.org> wrote:
> >>>>> http://spark.apache.org/docs/latest/job-scheduling.html#configuring-pool-properties
> >>>>>
> >>>>> "Setting a high weight such as 1000 also makes it possible to
> >>>>> implement *priority* between pools--in essence, the weight-1000
> >>>>> pool will always get to launch tasks first whenever it has jobs
> >>>>> active."
> >>>>>
> >>>>> On Sat, Jan 10, 2015 at 11:57 PM, Alessandro Baretta <alexbare...@gmail.com> wrote:
> >>>>>> Mark,
> >>>>>>
> >>>>>> Thanks, but I don't see how this documentation solves my problem.
> >>>>>> You are referring me to documentation on fair scheduling, whereas
> >>>>>> I am asking about as unfair a scheduling policy as can be: a
> >>>>>> priority queue.
> >>>>>>
> >>>>>> Alex
> >>>>>>
> >>>>>> On Sat, Jan 10, 2015 at 5:00 PM, Mark Hamstra <m...@clearstorydata.com> wrote:
> >>>>>>> -dev, +user
> >>>>>>>
> >>>>>>> http://spark.apache.org/docs/latest/job-scheduling.html
> >>>>>>>
> >>>>>>> On Sat, Jan 10, 2015 at 4:40 PM, Alessandro Baretta <alexbare...@gmail.com> wrote:
> >>>>>>>> Is it possible to specify a priority level for a job, such that
> >>>>>>>> the active jobs might be scheduled in order of priority?
> >>>>>>>>
> >>>>>>>> Alex
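[For archive readers: Cody's pool-weight workaround can be sketched concretely. This is a hedged sketch, not from the thread itself: the pool names ("high", "normal", "low"), the specific weights, and the file path are illustrative choices; the fairscheduler.xml format, the spark.scheduler.mode and spark.scheduler.allocation.file settings, and the spark.scheduler.pool local property are the standard mechanisms described in the job-scheduling docs linked above.]

```xml
<!-- conf/fairscheduler.xml: pools with widely separated weights,
     approximating a few priority classes (names/weights illustrative) -->
<allocations>
  <pool name="high">
    <schedulingMode>FIFO</schedulingMode>
    <weight>1000000</weight>
    <minShare>0</minShare>
  </pool>
  <pool name="normal">
    <schedulingMode>FIFO</schedulingMode>
    <weight>1000</weight>
    <minShare>0</minShare>
  </pool>
  <pool name="low">
    <schedulingMode>FIFO</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
</allocations>
```

Each submitting thread then routes its jobs to the pool matching their "priority":

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Enable the fair scheduler and point it at the allocation file.
val conf = new SparkConf()
  .setAppName("priority-pools-sketch")
  .set("spark.scheduler.mode", "FAIR")
  .set("spark.scheduler.allocation.file", "conf/fairscheduler.xml")
val sc = new SparkContext(conf)

// Jobs submitted from this thread now go to the "high" pool.
sc.setLocalProperty("spark.scheduler.pool", "high")
// ... run high-priority jobs ...

// Revert this thread to the default pool.
sc.setLocalProperty("spark.scheduler.pool", null)
```

Note this is still weighted fair sharing, not strict priority: a weight-1 pool continues to receive a (tiny) share of resources, which is exactly the distinction Alessandro draws above.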