What do you mean by GC burden? What I'm proposing is effectively a Map<String, String>. Even with an extremely forgetful operator (even more so than Joe!), it would take a huge oversight to put a dent in heap usage. I'm sure we could even expose a useful stat to flag such an oversight.
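To make that concrete, here's a rough sketch of the kind of structure I have in mind. Everything in it is hypothetical: the class name, the method names, and the matching rule are made up for illustration and are not existing Aurora code. It also assumes the caller can supply a set of currently known hostnames from somewhere, which is an open question in this thread.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not Aurora code: an in-scheduler registry of reserved hosts.
final class HostReservations {
  // hostname -> reservation label (e.g. "gpu"); effectively a Map<String, String>.
  private final Map<String, String> reservations = new ConcurrentHashMap<>();

  // Marks a host as reserved; stands in for the out-of-band API an operator would call.
  void reserve(String hostname, String label) {
    reservations.put(hostname, label);
  }

  // Drops a reservation, e.g. when a host is decommissioned.
  void release(String hostname) {
    reservations.remove(hostname);
  }

  // Assumed matching rule: an unreserved host only accepts tasks that request no
  // reservation; a reserved host only accepts tasks requesting the same label.
  boolean allows(String hostname, String requestedLabel) {
    String label = reservations.get(hostname);
    return label == null ? requestedLabel == null : label.equals(requestedLabel);
  }

  // Candidate stat: number of reservations whose host is not in the known set.
  // How knownHosts is obtained is left to the caller (an assumption of this sketch).
  long staleCount(Set<String> knownHosts) {
    return reservations.keySet().stream()
        .filter(host -> !knownHosts.contains(host))
        .count();
  }
}
```

Each entry is a couple of short strings, so even thousands of forgotten hosts would be a small amount of heap, and a nonzero staleCount would be the signal that someone forgot to clean up.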

On Tue, Jan 19, 2016 at 8:31 PM, Maxim Khutornenko <ma...@apache.org> wrote:
> Right, that's what I thought. Yes, it sounds interesting. My only
> concern is the GC burden of getting rid of hostnames that are obsolete
> and no longer exist. Relying on offers to update hostname 'relevance'
> may not work as dedicated hosts may be fully packed and not release
> any resources for a very long time. Let me explore this idea a bit to
> see what it would take to implement.
>
> On Tue, Jan 19, 2016 at 8:22 PM, Bill Farner <wfar...@apache.org> wrote:
> > Not a host->attribute mapping (attribute in the mesos sense, anyway). Rather
> > an out-of-band API for marking machines as reserved. For task->offer
> > mapping it's just a matter of another data source. Does that make sense?
> >
> > On Tuesday, January 19, 2016, Maxim Khutornenko <ma...@apache.org> wrote:
> >
> >> >
> >> > Can't this just be any old Constraint (not named "dedicated"). In other
> >> > words, doesn't this code already deal with non-dedicated constraints?:
> >> >
> >> > https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java#L193-L197
> >>
> >> Not really. There is a subtle difference here. A regular (non-dedicated)
> >> constraint does not prevent other tasks from landing on a given machine set
> >> whereas dedicated keeps other tasks away by only allowing those matching
> >> the dedicated attribute. What this proposal targets is allowing exclusive
> >> machine pool matching any job that has this new constraint while keeping
> >> all other tasks that don't have that attribute away.
> >>
> >> Following an example from my original post, imagine a GPU machine pool. Any
> >> job (from any role) requiring GPU resource would be allowed while all other
> >> jobs that don't have that constraint would be vetoed.
> >>
> >> > Also, regarding dedicated constraints necessitating a slave restart - i've
> >> > pondered moving dedicated machine management to the scheduler for similar
> >> > purposes. There's not really much forcing that behavior to be managed with
> >> > a slave attribute.
> >>
> >> Would you mind giving a few more hints on the mechanics behind this? How
> >> would scheduler know about dedicated hw without the slave attributes set?
> >> Are you proposing storing hostname->attribute mapping in the scheduler
> >> store?
> >>
> >> On Tue, Jan 19, 2016 at 7:53 PM, Bill Farner <wfar...@apache.org> wrote:
> >>
> >> > Joe - if you want to pursue this, I suggest you start another thread to
> >> > keep this thread's discussion in tact. I will not be able to lead this
> >> > change, but can certainly shepherd!
> >> >
> >> > On Tuesday, January 19, 2016, Joe Smith <yasumo...@gmail.com> wrote:
> >> >
> >> > > As an operator, that'd be a relatively simple change in tooling, and the
> >> > > benefits of not forcing a slave restart would be _huge_.
> >> > >
> >> > > Keeping the dedicated semantics (but adding non-exclusive) would be ideal
> >> > > if possible.
> >> > >
> >> > > > On Jan 19, 2016, at 19:09, Bill Farner <wfar...@apache.org> wrote:
> >> > > >
> >> > > > Also, regarding dedicated constraints necessitating a slave restart - i've
> >> > > > pondered moving dedicated machine management to the scheduler for similar
> >> > > > purposes. There's not really much forcing that behavior to be managed with
> >> > > > a slave attribute.
> >> > > >
> >> > > > On Tue, Jan 19, 2016 at 7:05 PM, John Sirois <j...@conductant.com> wrote:
> >> > > >
> >> > > >> On Tue, Jan 19, 2016 at 7:22 PM, Maxim Khutornenko <ma...@apache.org>
> >> > > >> wrote:
> >> > > >>
> >> > > >>> Has anyone explored an idea of having a non-exclusive (wrt job role)
> >> > > >>> dedicated constraint in Aurora before?
> >> > > >>
> >> > > >>> We do have a dedicated constraint now but it assumes a 1:1
> >> > > >>> relationship between a job role and a slave attribute [1]. For
> >> > > >>> example: a 'www-data/prod/hello' job with a dedicated constraint of
> >> > > >>> 'dedicated': 'www-data/hello' may only be pinned to a particular set
> >> > > >>> of slaves if all of them have 'www-data/hello' attribute set. No other
> >> > > >>> role tasks will be able to land on those slaves unless their
> >> > > >>> 'role/name' pair is added into the slave attribute set.
> >> > > >>>
> >> > > >>> The above is very limiting as it prevents carving out subsets of a
> >> > > >>> shared pool cluster to be used by multiple roles at the same time.
> >> > > >>> Would it make sense to have a free-form dedicated constraint not bound
> >> > > >>> to a particular role? Multiple jobs could then use this type of
> >> > > >>> constraint dynamically without modifying the slave command line (and
> >> > > >>> requiring slave restart).
> >> > > >>
> >> > > >> Can't this just be any old Constraint (not named "dedicated"). In other
> >> > > >> words, doesn't this code already deal with non-dedicated constraints?:
> >> > > >>
> >> > > >> https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java#L193-L197
> >> > > >>
> >> > > >>> This could be quite useful for experimenting purposes (e.g. different
> >> > > >>> host OS) or to target a different hardware offering (e.g. GPUs). In
> >> > > >>> other words, only those jobs that explicitly opt-in to participate in
> >> > > >>> an experiment or hw offering would be landing on that slave set.
> >> > > >>>
> >> > > >>> Thanks,
> >> > > >>> Maxim
> >> > > >>>
> >> > > >>> [1]- https://github.com/apache/aurora/blob/eec985d948f02f46637d87cd4d212eb2a70ef8d0/src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java#L272-L276
> >> > > >>
> >> > > >> --
> >> > > >> John Sirois
> >> > > >> 303-512-3301