Fenzo integration should be considered. Old thread: http://markmail.org/message/bjifjyvhvs2en3ts
If there are no volunteers by summer, I will take it up. Thx
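For context, a rough sketch of what driving Fenzo looks like, based on my
recollection of its README; treat the exact builder methods and types as
assumptions to verify against the Fenzo docs rather than as a definitive
integration:

    import java.util.List;
    import com.netflix.fenzo.SchedulingResult;
    import com.netflix.fenzo.TaskRequest;
    import com.netflix.fenzo.TaskScheduler;
    import com.netflix.fenzo.VirtualMachineLease;

    final class FenzoSketch {
      // Fenzo keeps cluster state inside the TaskScheduler, so one
      // instance is built once and reused on every scheduling iteration.
      private static final TaskScheduler SCHEDULER = new TaskScheduler.Builder()
          .withLeaseOfferExpirySecs(10)
          .withLeaseRejectAction(lease ->
              System.out.println("Declining offer on " + lease.hostname()))
          .build();

      // Pending work implements Fenzo's TaskRequest; Mesos offers are
      // wrapped as VirtualMachineLease. The result maps hosts to the
      // tasks assigned to them.
      static SchedulingResult schedule(
          List<TaskRequest> pending, List<VirtualMachineLease> offers) {
        return SCHEDULER.scheduleOnce(pending, offers);
      }
    }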
> On Mar 23, 2017, at 5:14 PM, Zameer Manji <zma...@apache.org> wrote:
>
> Hey,
>
> Sorry for the late reply.
>
> It is possible to make this configurable. For example, we could implement
> multiple algorithms and switch between them using different flags. If the
> flag value is just a class on the classpath that implements an interface,
> it can be 100% pluggable.
>
> The primary part of the scheduling code is `TaskScheduler` and
> `TaskAssigner`.
>
> `TaskScheduler` receives requests to schedule tasks and does some
> validation and preparation. `TaskAssigner` implements the first-fit
> algorithm.
>
> However, I feel the best move for the project would be to move away from
> first fit in order to support soft constraints. I think it is a very
> valid feature request, and I believe it can be done without degrading
> performance. Ideally, we should just use an existing Java library that
> implements a well-known algorithm. For example, Netflix's Fenzo
> <https://github.com/Netflix/Fenzo> could be used here.
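To make the "flag value is a class on the classpath" idea concrete, here is
a minimal sketch. Every name in it is invented for illustration (Aurora's
real TaskAssigner has a different shape); the point is only that the whole
algorithm hides behind one interface that a startup flag can select:

    import java.util.Optional;

    // Placeholder types standing in for Aurora's task and offer models.
    interface TaskDescriptor {}
    interface ResourceOffer {}

    // Hypothetical interface: the entire scheduling algorithm behind one
    // method. First fit and a Fenzo-backed assigner would simply be two
    // implementations of it.
    interface OfferAssigner {
      // Returns the offer the task should launch on, or empty to veto all.
      Optional<ResourceOffer> assign(TaskDescriptor task,
                                     Iterable<ResourceOffer> offers);
    }

    final class AssignerLoader {
      // Instantiates whatever class a flag such as -task_assigner_impl
      // (hypothetical name) points at, so operators can swap algorithms
      // without rebuilding the scheduler.
      static OfferAssigner fromFlag(String className)
          throws ReflectiveOperationException {
        return Class.forName(className)
            .asSubclass(OfferAssigner.class)
            .getDeclaredConstructor()
            .newInstance();
      }
    }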
> On Wed, Mar 15, 2017 at 11:10 AM, Mauricio Garavaglia
> <mauriciogaravag...@gmail.com> wrote:
>
>> Hi,
>>
>> Rather than changing the scheduling algorithm, I think we should be open
>> to supporting multiple algorithms. First fit is certainly a great
>> solution for humongous clusters with homogeneous workloads, but for
>> smaller clusters we can have more optimized scheduling without
>> sacrificing scheduling performance.
>>
>> How difficult do you think it would be to start exploring that option? I
>> haven't looked into the scheduling side of the code :)
>>
>> On Mon, Mar 6, 2017 at 2:57 PM, Zameer Manji <zma...@apache.org> wrote:
>>
>>> Something similar was proposed on a Dynamic Reservations review, and
>>> there is a ticket for it here
>>> <https://issues.apache.org/jira/browse/AURORA-173>.
>>>
>>> I think it is feasible, but it is important to note that this is a
>>> large change, because we would be moving Aurora from first fit to some
>>> other algorithm.
>>>
>>> If we do this, we need to ensure it scales to very large clusters and
>>> keeps the latency of assigning tasks to offers reasonably low.
>>>
>>> I support the idea of "spread", but it would need to come after a
>>> change to the scheduling algorithm.
>>>
>>> On Mon, Mar 6, 2017 at 11:44 AM, Mauricio Garavaglia
>>> <mauriciogaravag...@gmail.com> wrote:
>>>
>>>> Hello!
>>>>
>>>> I have a job with many instances (>100) that I'd like to spread across
>>>> the hosts in a cluster. Using a constraint such as "limit=host:1"
>>>> doesn't work quite well, as I have more instances than nodes.
>>>>
>>>> As a workaround, I increased the limit value to something like
>>>> ceil(instances/nodes). But now a problem appears if a bunch of nodes
>>>> go down (think of a whole rack dying): the instances will not run
>>>> until those nodes are back, even though we may have spare capacity on
>>>> the rest of the hosts that we'd like to use. In that scenario, the
>>>> job's availability may be affected because it is running with fewer
>>>> instances than expected. On a smaller scale, the same issue applies if
>>>> you want to spread tasks across racks or availability zones: I'd like
>>>> to have one instance of a job per rack (failure domain), but if a rack
>>>> goes down, the instance can be spawned on a different rack.
>>>>
>>>> I thought we could have a scheduling constraint to "spread" instances
>>>> across a particular host attribute; instead of vetoing an offer right
>>>> away, we check where the other instances of the task are running,
>>>> looking at a particular attribute of the host. We try to maximize the
>>>> number of distinct values of a particular attribute (rack, hostname,
>>>> etc.) across the task's instance assignments.
>>>>
>>>> What do you think? Did something like this come up in the past? Is it
>>>> feasible?
>>>>
>>>> Mauricio
>>>
>>> --
>>> Zameer Manji
>
> --
> Zameer Manji
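The "maximize distinct attribute values" proposal quoted above reduces to a
small scoring rule: among the offers that survive hard constraints, prefer
the one whose attribute value currently hosts the fewest instances of the
job. A minimal sketch, with all names invented for illustration (this is
not Aurora's scheduler API):

    import java.util.Collection;
    import java.util.Comparator;
    import java.util.Map;
    import java.util.Optional;

    // Sketch of the proposed "spread" soft constraint: rather than
    // vetoing offers outright, rank them so instances land on the
    // least-used attribute values (racks, hosts, zones) first.
    final class SpreadScorer {

      // Stand-in for an offer exposing host attributes, e.g.
      // attribute("rack") -> "rack-a".
      interface Offer {
        String attribute(String name);
      }

      // activeCounts maps an attribute value to how many instances of
      // this job already run there. Picking the minimum maximizes the
      // number of distinct values across the job's instances.
      static Optional<Offer> pickOffer(Collection<Offer> offers,
          String attribute, Map<String, Integer> activeCounts) {
        return offers.stream()
            .min(Comparator.comparingInt(
                o -> activeCounts.getOrDefault(o.attribute(attribute), 0)));
      }
    }

Unlike limit=host:N, this degrades gracefully: if a whole rack dies, new
instances simply land on the least-loaded surviving values instead of
waiting for the dead nodes to come back.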