Semi off-the-cuff thought, but one option is to re-define the instance ID -> TaskConfig association in JobConfiguration:
struct JobConfiguration {
  ...
  6: TaskConfig taskConfig

  /**
   * The number of instances in the job. Generated instance IDs for tasks
   * will be in the range [0, instances).
   */
  8: i32 instanceCount
}

Some prior art you could draw from is JobUpdateInstructions, which models a
heterogeneous set of tasks (while supporting normalization):

struct JobUpdateInstructions {
  /** Actual InstanceId -> TaskConfig mapping when the update was requested. */
  1: set<InstanceTaskConfig> initialState

  /** Desired configuration when the update completes. */
  2: InstanceTaskConfig desiredState

  ...
}

struct InstanceTaskConfig {
  /** A TaskConfig associated with instances. */
  1: TaskConfig task

  /** Instances associated with the TaskConfig. */
  2: set<Range> instances
}

So you could imagine JobConfiguration containing a set<InstanceTaskConfig> as
the eventual replacement of the taskConfig and instanceCount fields. If we
proceed this way, it suggests that we should change
JobUpdateInstructions.desiredState to also be a set<InstanceTaskConfig> for
parity.

On Tue, Jan 12, 2016 at 8:10 PM, Mauricio Garavaglia
<mauriciogaravag...@gmail.com> wrote:

> Thanks for the input, guys! I was wondering if you have any thoughts about
> what the API should look like.
>
> On Tue, Jan 12, 2016 at 1:00 PM, John Sirois <john.sir...@gmail.com> wrote:
>
> > On Mon, Jan 11, 2016 at 11:02 PM, John Sirois <john.sir...@gmail.com> wrote:
> >
> > > On Mon, Jan 11, 2016 at 11:00 PM, Bill Farner <wfar...@apache.org> wrote:
> > >
> > > > In the log, tasks are denormalized anyhow:
> > > > https://github.com/apache/aurora/blob/master/api/src/main/thrift/org/apache/aurora/gen/storage.thrift#L43-L45
> > >
> > > Right - but now we'd be making that denormalization systemically
> > > ineffective. IIUC it's values-equals based denorm; I'd think we'd need
> > > diffing in a cluster using, for example, ceph + docker ~exclusively.
> >
> > I was being generally confusing here.
> > To be more precise, the issue I'm concerned about is the newish log
> > snapshot deduping feature [1] being foiled by all TaskConfigs for a
> > job's tasks now being unique via `ExecutorConfig.data` [2].
> > This is an optimization concern only, and IIUC it only becomes a concern
> > in very large clusters, as evidenced by the fact that the log dedup
> > feature came late in the use of Aurora by Twitter.
> >
> > This could definitely be worked out.
> >
> > [1] https://github.com/apache/aurora/blob/master/api/src/main/thrift/org/apache/aurora/gen/storage.thrift#L196-L208
> > [2] https://github.com/apache/aurora/blob/master/api/src/main/thrift/org/apache/aurora/gen/api.thrift#L167
> >
> > > > On Mon, Jan 11, 2016 at 9:54 PM, John Sirois <j...@conductant.com> wrote:
> > > >
> > > > > On Mon, Jan 11, 2016 at 10:40 PM, Bill Farner <wfar...@apache.org> wrote:
> > > > >
> > > > > > Funny, that's actually how the scheduler API originally worked. I
> > > > > > think this is worth exploring, and would indeed completely
> > > > > > sidestep the paradigm shift I mentioned above.
> > > > >
> > > > > I think the crux might be handling a structural diff of the thrift
> > > > > for the Tasks to keep the log dedupe optimizations in play for the
> > > > > most part; i.e. store Task0 in full, and Task1-N as thrift struct
> > > > > diffs against 0. Maybe something simpler like a binary diff would
> > > > > be enough too.
> > > > >
> > > > > > On Mon, Jan 11, 2016 at 9:20 PM, John Sirois <j...@conductant.com> wrote:
> > > > > >
> > > > > > > On Mon, Jan 11, 2016 at 10:10 PM, Bill Farner <wfar...@apache.org> wrote:
> > > > > > >
> > > > > > > > There's a chicken and egg problem though. That variable will
> > > > > > > > only be filled in on the executor, when we're already running
> > > > > > > > in the docker environment.
> > > > > > > > In this case, the parameter is used to *define* the docker
> > > > > > > > environment.
> > > > > > >
> > > > > > > So, from a naive standpoint, the fact that Job is exploded into
> > > > > > > Tasks by the scheduler, but that explosion is not exposed to
> > > > > > > the client, seems to be the impedance mismatch here.
> > > > > > > I have not thought through this much at all, but say that
> > > > > > > fundamentally the scheduler took a Job that was a list of
> > > > > > > Tasks - possibly heterogeneous. The current expansion of a Job
> > > > > > > into homogeneous Tasks could be just a standard convenience.
> > > > > > >
> > > > > > > In that sort of world, the customized params could be injected
> > > > > > > client side to form a list of heterogeneous tasks, and the
> > > > > > > Scheduler could stay dumb - at least wrt Task parameterization.
> > > > > > >
> > > > > > > > On Mon, Jan 11, 2016 at 9:07 PM, ben...@gmail.com <ben...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > As a starting point, you might be able to cook up something
> > > > > > > > > involving {{mesos.instance}} as a lookup key to a pystachio
> > > > > > > > > list. You do have a unique integer task number per instance
> > > > > > > > > to work with.
> > > > > > > > >
> > > > > > > > > cf.
> > > > > > > > > http://aurora.apache.org/documentation/latest/configuration-reference/#template-namespaces
> > > > > > > > >
> > > > > > > > > On Mon, Jan 11, 2016 at 8:05 PM Bill Farner <wfar...@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > > I agree that this appears necessary when parameters are
> > > > > > > > > > needed to define the runtime environment of the task (in
> > > > > > > > > > this case, setting up the docker container).
> > > > > > > > > >
> > > > > > > > > > What's particularly interesting here is that this would
> > > > > > > > > > call for the scheduler to fill in the parameter values
> > > > > > > > > > prior to launching each task. Using pystachio variables
> > > > > > > > > > for this is certainly the most natural in the DSL, but it
> > > > > > > > > > becomes a paradigm shift since the scheduler is currently
> > > > > > > > > > ignorant of pystachio.
> > > > > > > > > >
> > > > > > > > > > Possibly only worth mentioning for shock value, but in
> > > > > > > > > > the DSL this starts to look like lambdas pretty quickly.
> > > > > > > > > >
> > > > > > > > > > On Mon, Jan 11, 2016 at 7:46 PM, Mauricio Garavaglia <mauriciogaravag...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi guys,
> > > > > > > > > > >
> > > > > > > > > > > We are using the docker rbd volume plugin
> > > > > > > > > > > <https://ceph.com/planet/getting-started-with-the-docker-rbd-volume-plugin>
> > > > > > > > > > > to provide persistent storage to the aurora jobs that
> > > > > > > > > > > run in the containers.
> > > > > > > > > > > Something like:
> > > > > > > > > > >
> > > > > > > > > > > p = [Parameter(name='volume', value='my-ceph-volume:/foo'), ...]
> > > > > > > > > > > jobs = [Service(..., container = Container(docker = Docker(..., parameters = p)))]
> > > > > > > > > > >
> > > > > > > > > > > But in the case of jobs with multiple instances it's
> > > > > > > > > > > required to start each container using different
> > > > > > > > > > > volumes, in our case different ceph images. This could
> > > > > > > > > > > be achieved by deploying, for example, 10 instances and
> > > > > > > > > > > then updating each one independently to use the
> > > > > > > > > > > appropriate volume. Of course this is quite
> > > > > > > > > > > inconvenient, error-prone, and adds a lot of logic and
> > > > > > > > > > > state outside aurora.
> > > > > > > > > > >
> > > > > > > > > > > We were wondering if it would make sense to have a way
> > > > > > > > > > > to parameterize the task instances, in a similar way to
> > > > > > > > > > > what is done with portmapping, for example. In the job
> > > > > > > > > > > definition, have something like:
> > > > > > > > > > >
> > > > > > > > > > > params = [
> > > > > > > > > > >   Parameter(name='volume',
> > > > > > > > > > >             value='service-{{instanceParameters.volume}}:/foo')
> > > > > > > > > > > ]
> > > > > > > > > > > ...
> > > > > > > > > > > jobs = [
> > > > > > > > > > >   Service(
> > > > > > > > > > >     name = 'logstash',
> > > > > > > > > > >     ...
> > > > > > > > > > >     instanceParameters = {"volume": ["foo", "bar", "zaa"]},
> > > > > > > > > > >     instances = 3,
> > > > > > > > > > >     container = Container(
> > > > > > > > > > >       docker = Docker(
> > > > > > > > > > >         image = 'image',
> > > > > > > > > > >         parameters = params
> > > > > > > > > > >       )
> > > > > > > > > > >     )
> > > > > > > > > > >   )
> > > > > > > > > > > ]
> > > > > > > > > > >
> > > > > > > > > > > Something like that would create 3 instances of the
> > > > > > > > > > > task, each one running in a container that uses the
> > > > > > > > > > > volumes foo, bar, and zaa.
> > > > > > > > > > >
> > > > > > > > > > > Does it make sense? I'd be glad to work on it, but I
> > > > > > > > > > > want to validate the idea with you first and hear
> > > > > > > > > > > comments about the api/implementation.
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > > >
> > > > > > > > > > > Mauricio
> > > > > > >
> > > > > > > --
> > > > > > > John Sirois
> > > > > > > 303-512-3301
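For concreteness, the JobConfiguration end state sketched at the top of this
mail could look roughly like the following. This is purely a sketch: the
field id 9 and the name taskConfigs are invented for illustration, and only
taskConfig and instanceCount exist in api.thrift today.

```thrift
struct JobConfiguration {
  ...
  /** Existing fields, eventually replaced by taskConfigs below. */
  6: TaskConfig taskConfig
  8: i32 instanceCount

  /**
   * Hypothetical: InstanceId -> TaskConfig association allowing
   * heterogeneous instances, mirroring JobUpdateInstructions.initialState.
   */
  9: set<InstanceTaskConfig> taskConfigs
}
```

During a migration, the scheduler could presumably normalize the legacy
(taskConfig, instanceCount) pair into a single-element taskConfigs set.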
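And for the client-side expansion idea (inject the per-instance parameters
on the client, keeping the scheduler dumb), a rough plain-Python sketch of
the mechanics follows. Everything here - expand_instances, the dicts
standing in for the DSL types, the template syntax - is invented for
illustration and is not the actual Aurora client code.

```python
# Sketch: expand a job-level parameter template plus per-instance values
# into one concrete parameter dict per instance, entirely client side.
# The scheduler would then receive N already-materialized task configs.

def expand_instances(param_template, instance_params, instances):
    """Produce one parameter dict per instance id in [0, instances)."""
    if any(len(v) != instances for v in instance_params.values()):
        raise ValueError(
            "each instanceParameters list must have one entry per instance")
    tasks = []
    for i in range(instances):
        params = {}
        for name, template in param_template.items():
            value = template
            # Substitute each {{instanceParameters.<key>}} occurrence with
            # this instance's value for <key>.
            for key, values in instance_params.items():
                value = value.replace(
                    "{{instanceParameters.%s}}" % key, values[i])
            params[name] = value
        tasks.append(params)
    return tasks

# Mirroring the example from the thread:
tasks = expand_instances(
    param_template={"volume": "service-{{instanceParameters.volume}}:/foo"},
    instance_params={"volume": ["foo", "bar", "zaa"]},
    instances=3,
)
# tasks[1] == {"volume": "service-bar:/foo"}
```

The interesting design question this sidesteps is exactly the one raised
above: with client-side expansion the scheduler never needs to learn
pystachio, at the cost of N denormalized task configs in the log.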