Thanks for the input, guys! I was wondering if you have any thoughts on how the API should look.
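To keep the API discussion concrete, here is a minimal, self-contained Python sketch of the per-instance expansion the scheduler (or client) would have to perform under the proposal quoted below. The `instanceParameters` name and the `{{instanceParameters.*}}` template syntax come from the proposal in this thread and are not an existing Aurora API; the helper function is hypothetical.

```python
def expand_instance_parameters(template, instance_values, instances):
    """Expand a per-instance parameter template into one concrete
    docker parameter value per task instance - the substitution step
    some component would have to run before launching each task.

    Hypothetical sketch only; not an existing Aurora API.
    """
    expanded = []
    for i in range(instances):
        value = template
        for name, values in instance_values.items():
            # Replace the proposed {{instanceParameters.<name>}} marker
            # with this instance's value from the supplied list.
            value = value.replace('{{instanceParameters.%s}}' % name, values[i])
        expanded.append(value)
    return expanded

volumes = expand_instance_parameters(
    'service-{{instanceParameters.volume}}:/foo',
    {'volume': ['foo', 'bar', 'zaa']},
    instances=3,
)
# volumes == ['service-foo:/foo', 'service-bar:/foo', 'service-zaa:/foo']
```

The sketch sidesteps the open design question (whether the client or the scheduler does this expansion); it only shows the data flow implied by the proposed config shape.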
On Tue, Jan 12, 2016 at 1:00 PM, John Sirois <john.sir...@gmail.com> wrote:

> On Mon, Jan 11, 2016 at 11:02 PM, John Sirois <john.sir...@gmail.com> wrote:
>
> > On Mon, Jan 11, 2016 at 11:00 PM, Bill Farner <wfar...@apache.org> wrote:
> >
> > > In the log, tasks are denormalized anyhow:
> > >
> > > https://github.com/apache/aurora/blob/master/api/src/main/thrift/org/apache/aurora/gen/storage.thrift#L43-L45
> >
> > Right - but now we'd be making that denormalization systemically
> > ineffective. IIUC it's a values-equals based denorm; I'd think we'd need
> > diffing in a cluster using, for example, ceph + docker ~exclusively.
>
> I was being generally confusing here. To be more precise, the issue I'm
> concerned about is the newish log snapshot deduping feature [1] being
> foiled by all TaskConfigs for a job's tasks now being unique via
> `ExecutorConfig.data` [2].
> This is an optimization concern only, and IIUC it only becomes a concern
> in very large clusters, as evidenced by the fact that the log dedup
> feature came late in Twitter's use of Aurora.
>
> This could definitely be worked out.
>
> [1] https://github.com/apache/aurora/blob/master/api/src/main/thrift/org/apache/aurora/gen/storage.thrift#L196-L208
> [2] https://github.com/apache/aurora/blob/master/api/src/main/thrift/org/apache/aurora/gen/api.thrift#L167
>
> > > On Mon, Jan 11, 2016 at 9:54 PM, John Sirois <j...@conductant.com> wrote:
> > >
> > > > On Mon, Jan 11, 2016 at 10:40 PM, Bill Farner <wfar...@apache.org> wrote:
> > > >
> > > > > Funny, that's actually how the scheduler API originally worked. I
> > > > > think this is worth exploring, and would indeed completely sidestep
> > > > > the paradigm shift I mentioned above.
> > > > I think the crux might be handling a structural diff of the thrift
> > > > for the Tasks to keep the log dedupe optimizations in play for the
> > > > most part; i.e. store Task0 in full, and Task1-N as thrift struct
> > > > diffs against 0. Maybe something simpler like a binary diff would be
> > > > enough too.
> > > >
> > > > > On Mon, Jan 11, 2016 at 9:20 PM, John Sirois <j...@conductant.com> wrote:
> > > > >
> > > > > > On Mon, Jan 11, 2016 at 10:10 PM, Bill Farner <wfar...@apache.org> wrote:
> > > > > >
> > > > > > > There's a chicken and egg problem though. That variable will
> > > > > > > only be filled in on the executor, when we're already running
> > > > > > > in the docker environment. In this case, the parameter is used
> > > > > > > to *define* the docker environment.
> > > > > >
> > > > > > So, from a naive standpoint, the fact that a Job is exploded into
> > > > > > Tasks by the scheduler, but that explosion is not exposed to the
> > > > > > client, seems to be the impedance mismatch here.
> > > > > > I have not thought through this much at all, but say that
> > > > > > fundamentally the scheduler took a Job that was a list of Tasks -
> > > > > > possibly heterogeneous. The current expansion of a Job into
> > > > > > homogeneous Tasks could be just a standard convenience.
> > > > > >
> > > > > > In that sort of world, the customized params could be injected
> > > > > > client side to form a list of heterogeneous tasks, and the
> > > > > > scheduler could stay dumb - at least wrt Task parameterization.
> > > > > >
> > > > > > > On Mon, Jan 11, 2016 at 9:07 PM, ben...@gmail.com <ben...@gmail.com> wrote:
> > > > > > >
> > > > > > > > As a starting point, you might be able to cook up something
> > > > > > > > involving {{mesos.instance}} as a lookup key to a pystachio
> > > > > > > > list.
> > > > > > > > You do have a unique integer task number per instance to
> > > > > > > > work with.
> > > > > > > >
> > > > > > > > cf. http://aurora.apache.org/documentation/latest/configuration-reference/#template-namespaces
> > > > > > > >
> > > > > > > > On Mon, Jan 11, 2016 at 8:05 PM Bill Farner <wfar...@apache.org> wrote:
> > > > > > > >
> > > > > > > > > I agree that this appears necessary when parameters are
> > > > > > > > > needed to define the runtime environment of the task (in
> > > > > > > > > this case, setting up the docker container).
> > > > > > > > >
> > > > > > > > > What's particularly interesting here is that this would
> > > > > > > > > call for the scheduler to fill in the parameter values
> > > > > > > > > prior to launching each task. Using pystachio variables for
> > > > > > > > > this is certainly the most natural in the DSL, but becomes
> > > > > > > > > a paradigm shift since the scheduler is currently ignorant
> > > > > > > > > of pystachio.
> > > > > > > > >
> > > > > > > > > Possibly only worth mentioning for shock value, but in the
> > > > > > > > > DSL this starts to look like lambdas pretty quickly.
> > > > > > > > >
> > > > > > > > > On Mon, Jan 11, 2016 at 7:46 PM, Mauricio Garavaglia <mauriciogaravag...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi guys,
> > > > > > > > > >
> > > > > > > > > > We are using the docker rbd volume plugin
> > > > > > > > > > <https://ceph.com/planet/getting-started-with-the-docker-rbd-volume-plugin>
> > > > > > > > > > to provide persistent storage to the aurora jobs that run
> > > > > > > > > > in the containers.
> > > > > > > > > > Something like:
> > > > > > > > > >
> > > > > > > > > > p = [Parameter(name='volume', value='my-ceph-volume:/foo'), ...]
> > > > > > > > > > jobs = [ Service(..., container = Container(docker =
> > > > > > > > > >   Docker(..., parameters = p)))]
> > > > > > > > > >
> > > > > > > > > > But in the case of jobs with multiple instances it's
> > > > > > > > > > required to start each container using different volumes,
> > > > > > > > > > in our case different ceph images. This could be achieved
> > > > > > > > > > by deploying, for example, 10 instances and then updating
> > > > > > > > > > each one independently to use the appropriate volume. Of
> > > > > > > > > > course this is quite inconvenient, error prone, and adds
> > > > > > > > > > a lot of logic and state outside aurora.
> > > > > > > > > >
> > > > > > > > > > We were thinking it would make sense to have a way to
> > > > > > > > > > parameterize the task instances, in a similar way to what
> > > > > > > > > > is done with portmapping, for example. In the job
> > > > > > > > > > definition, have something like:
> > > > > > > > > >
> > > > > > > > > > params = [
> > > > > > > > > >   Parameter( name='volume',
> > > > > > > > > >     value='service-{{instanceParameters.volume}}:/foo' )
> > > > > > > > > > ]
> > > > > > > > > > ...
> > > > > > > > > > jobs = [
> > > > > > > > > >   Service(
> > > > > > > > > >     name = 'logstash',
> > > > > > > > > >     ...
> > > > > > > > > >     instanceParameters = { "volume" : ["foo", "bar", "zaa"] },
> > > > > > > > > >     instances = 3,
> > > > > > > > > >     container = Container(
> > > > > > > > > >       docker = Docker(
> > > > > > > > > >         image = 'image',
> > > > > > > > > >         parameters = params
> > > > > > > > > >       )
> > > > > > > > > >     )
> > > > > > > > > >   )
> > > > > > > > > > ]
> > > > > > > > > >
> > > > > > > > > > Something like that would create 3 instances of the task,
> > > > > > > > > > each one running in a container that uses the volumes
> > > > > > > > > > foo, bar, and zaa.
> > > > > > > > > >
> > > > > > > > > > Does it make sense? I'd be glad to work on it, but I want
> > > > > > > > > > to validate the idea with you first and hear comments
> > > > > > > > > > about the api/implementation.
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > Mauricio
> > > > > >
> > > > > > --
> > > > > > John Sirois
> > > > > > 303-512-3301
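For comparison, the `{{mesos.instance}}`-as-lookup-key suggestion quoted above can be sketched as a plain lookup by instance number. This is a hypothetical simulation in plain Python rather than pystachio (the volume names and the helper function are illustrative only), and - as noted in the thread - the real binding happens on the executor, which is too late when the parameter must *define* the docker environment.

```python
# Simulated sketch of the {{mesos.instance}} lookup-key idea from the
# thread. In a real .aurora config, pystachio would bind the instance
# number at runtime; here we index the list directly to show the shape.
CEPH_VOLUMES = ['foo', 'bar', 'zaa']  # illustrative volume names

def volume_parameter_for(mesos_instance):
    # {{mesos.instance}} binds to a unique integer per task instance;
    # use it to pick this instance's ceph volume from a fixed list.
    return 'service-%s:/foo' % CEPH_VOLUMES[mesos_instance]

# One docker volume parameter value per instance:
params = [volume_parameter_for(i) for i in range(3)]
# params == ['service-foo:/foo', 'service-bar:/foo', 'service-zaa:/foo']
```

This workaround keeps the scheduler ignorant of pystachio, but it only helps for values consumed inside the task, not for parameters that configure the container itself - which is exactly the gap the `instanceParameters` proposal is trying to close.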