Re: 答复: Discussion on review request 51536

Joshua Cohen Tue, 06 Sep 2016 09:37:35 -0700

I'm +1 to this as well. Thanks for driving Zameer!

On Sat, Sep 3, 2016 at 1:31 PM, Renan DelValle <rdelv...@binghamton.edu>
wrote:


> +1 On Zameer's solution as that keeps the scheduler executor agnostic. In
> my opinion, it is a reasonable requirement to expect an executor to send
> the RUNNING status update only when it is confident that the task has been
> launched successfully. This also provides great flexibility for those
> creating custom executors because a task entering a healthy running state
> can mean different things depending on the executor. For example,
> docker-compose-executor needs to launch multiple containers. This can take
> a while and should only be considered to be healthy if all containers are
> running properly.
>
> Thanks for the proposal Zameer, and thanks for the work on this Kai.
>
> On Sat, Sep 3, 2016 at 1:39 AM, 黄 凯 <texasred2...@hotmail.com> wrote:
>
> > +1 to Zameer's idea. Now we have three persons on board.
> >
> >
> > ------------------------------
> > *From:* Maxim Khutornenko <ma...@apache.org>
> > *Sent:* Friday, September 2, 2016 18:36
> > *To:* dev@aurora.apache.org
> > *Cc:* 黄 凯; Joshua Cohen; s...@apache.org; cald...@gmail.com;
> > rdelv...@binghamton.edu
> > *Subject:* Re: 答复: Discussion on review request 51536
> >
> > Just to summarize Zameer's proposal buried deep in the quotation
> > stream: we keep watch_secs but let users set it to 0 iff they have
> > health checks enabled. No scheduler (updater) changes needed at all.
> > Users will need to opt-in to use the new feature by changing
> > watch_secs to 0 in their configs (or skipping it completely, which
> > will set the default to 0 automatically).
> >
> > I am +1 to this idea. Thanks Zameer!
> >
> > On Fri, Sep 2, 2016 at 6:28 PM, Zameer Manji <zma...@apache.org> wrote:
> > > Resending my emails from a domain that has mailing list friendly DMARC
> > > settings.
> > >
> > > It seems that we have achieved some consensus on the design, others
> feel
> > > free to weigh in.
> > >
> > > On Fri, Sep 2, 2016 at 5:29 PM, Zameer Manji <zma...@uber.com> wrote:
> > >
> > >> Kai,
> > >>
> > >> We have had coupled deploys before, I don't think it's too terrible.
> > It's
> > >> something to note in the release notes and some operational pain for
> > large
> > >> users.
> > >>
> > >> On Fri, Sep 2, 2016 at 4:42 PM, 黄 凯 <texasred2...@hotmail.com> wrote:
> > >>
> > >> > Another concern is that once we rolled out the new executor, we
> should
> > >> > rolled out a new client in order to use the health-check feature.
> > Hence
> > >> the
> > >> > executor and client rolling out process seem to be coupled.
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > ------------------------------
> > >> > *发件人:* 黄 凯 <texasred2...@hotmail.com>
> > >> > *发送时间:* 2016年9月3日 7:23
> > >> > *收件人:* Zameer Manji; dev@aurora.apache.org
> > >> > *抄送:* Joshua Cohen; s...@apache.org; cald...@gmail.com;
> > >> > rdelv...@binghamton.edu
> > >> > *主题:* 答复: Discussion on review request 51536
> > >> >
> > >> >
> > >> > Thanks for the new proposal, Zameer. It sounds good to me. The
> > benefit is
> > >> > that it does not alter the current infrastructure too much.
> > >> >
> > >> >
> > >> > However, there is one thing to keep in mind:
> > >> >
> > >> > we currently do a check to ensure watch_sec is longer than
> > >> > initial_interval_secs. We will have to remove the alert message if
> we
> > >> > choose to skip watch_sec by setting it as zero.
> > >> >
> > >> >
> > >> > So the new configuration will not support executor-driven health
> check
> > >> > unless the executors are rolled out 100%.
> > >> >
> > >> >
> > >> > Does this tradeoff seems OK for us, Maxim?
> > >> >
> > >> >
> > >> > Kai
> > >> >
> > >> >
> > >> > ------------------------------
> > >> > *发件人:* Zameer Manji <zma...@uber.com>
> > >> > *发送时间:* 2016年9月3日 6:53
> > >> > *收件人:* dev@aurora.apache.org
> > >> > *抄送:* 黄 凯; Joshua Cohen; s...@apache.org; cald...@gmail.com;
> > >> > rdelv...@binghamton.edu
> > >> > *主题:* Re: Discussion on review request 51536
> > >> >
> > >> >
> > >> >
> > >> > On Fri, Sep 2, 2016 at 3:24 PM, Maxim Khutornenko <ma...@apache.org
> >
> > >> > wrote:
> > >> >
> > >> >> Need to correct a few previous statements:
> > >> >>
> > >> >> > Also we do not want to expose this message to users.
> > >> >> This is incorrect. The original design proposal suggested to show
> > this
> > >> >> message in the UI as: "Task is healthy"
> > >> >>
> > >> >
> > >> > Does this mean the message in the status update is going to be
> > exactly,
> > >> > "Task is healthy" and the scheduler is going to check for this
> string
> > in
> > >> > the `TASK_RUNNING` status update? This means we are going to
> > establish a
> > >> > communication
> > >> > mechanism between the executor and scheduler that's not defined by a
> > >> > schema. I feel that's worse than putting JSON in there and having
> the
> > >> > scheduler parse it.
> > >> >
> > >> >
> > >> >> > The Mesos API isn't designed for packing arbitrary data
> > >> >> > in the status update message.
> > >> >> Don't think I agree, this is exactly what this field is for [1] and
> > we
> > >> >> already use it for other states [2].
> > >> >>
> > >> >
> > >> > I guess I should have said 'structured arbitrary data'. The
> > >> informational,
> > >> > messages are fine and we plumb them blindly into our logging and UI.
> > I'm
> > >> > not convinced we should start putting JSON or something more
> > structured
> > >> in
> > >> > there. That's yet another schema we have and yet another versioning
> > story
> > >> > we have to go though. This also complicates matters for custom
> > executor
> > >> > authors.
> > >> >
> > >> >
> > >> >>
> > >> >> > I would be open to just saying that scheduler version
> > >> >> > 0.16 (or 0.17) just assumes the executor transitions to
> > >> >> > RUNNING once a task is healthy and dropping
> > >> >> > `watch_secs`entirely.
> > >> >> We can't drop 'watch_secs' entirely as we still have to babysit job
> > >> >> updates that don't have health checks enabled.
> > >> >>
> > >> >
> > >> > Understood. I guess we can keep it but I'm now frustrated that we
> > have a
> > >> > parameter that is ignored if we set some json in
> ExecutorConfig.data.
> > >> > Ideally, we don't accept `watch_secs` if we want health check driven
> > >> > updates. As mentioned before I don't like this implicit tightening
> of
> > the
> > >> > executor and the scheduler.
> > >> >
> > >> >
> > >> >>
> > >> >> As for my take on the above, I favor #1 as the simplest answer to
> an
> > >> >> already simple question: "Should we use watch_secs for this
> instance
> > >> >> or not?". That's pretty much it. Scheduler does not need any schema
> > >> >> changes, know what health checks are or if a job has them enabled.
> At
> > >> >> least not until we attempt to move to centralized health checks
> > >> >> (AURORA-279) but that will be an entirely different design
> > discussion.
> > >> >>
> > >> >> [1] - https://github.com/apache/mesos/blob/master/include/mesos/
> > <https://github.com/apache/mesos/blob/master/include/mesos/>
> > apache/mesos <https://github.com/apache/mesos/blob/master/include/mesos/
> >
> > github.com
> > mesos - Mirror of Apache Mesos
> >
> >
> > >> >> mesos.proto#L1605.
> > >> >> [2] - https://github.com/apache/aurora/blob/5cad046fc0f0c4bb79a456
> > >> >> 3cfcff0442b7bf8383/src/main/python/apache/aurora/executor/
> > >> >> aurora_executor.py#L97
> > >> >
> > >> >
> > >> >
> > >> > With all of this in mind, I have another proposal. Why can't we have
> > the
> > >> > executor changes (wait until the task is healthy for RUNNING) *and*
> > read
> > >> > `watch_secs` if it is set? Why not have both of these features and
> if
> > >> users
> > >> > want purely health checking driven updates they can set this value
> to
> > 0
> > >> and
> > >> > enable health checks. If they want to have both health checking and
> > time
> > >> > driven updates they can set this to value to the time that they care
> > >> about.
> > >> > If they just want time driven updates they can disable health
> checking
> > >> and
> > >> > set this value.
> > >> >
> > >> > Then there is no coupling between the executor and the scheduler
> > except
> > >> > for status updates and there is no dependency on the `message` field
> > of
> > >> the
> > >> > status update.
> > >> >
> > >> > We could even treat `watch_secs` as minimum time in STARTING +
> RUNNING
> > >> > instead of RUNNING with this change and it becomes the lower bound
> in
> > the
> > >> > update transition speed. This can ensure that users don't deploy
> "too
> > >> fast"
> > >> > and end up overwhelming other services if they are deployed too
> > quickly.
> > >> >
> > >> >
> > >> >
> > >> >>
> > >> >>
> > >> >> On Fri, Sep 2, 2016 at 2:26 PM, Zameer Manji <zma...@apache.org>
> > wrote:
> > >> >> > *cc: Renan*
> > >> >> >
> > >> >> > I think there is some disagreement/discussion on the review
> > because we
> > >> >> have
> > >> >> > not achieved consensus on the design. Since the design doc was
> > >> written,
> > >> >> > Aurora adopted multiple executor support as well non HTTP based
> > >> >> > healthchecking. This invalidates some parts of the original
> > design. I
> > >> >> think
> > >> >> > all of the solutions here are possible amendments to the design
> > doc.
> > >> >> >
> > >> >> > I am not in favor of Solution 2 at all because status updates
> > between
> > >> >> > executor <-> agent <-> master <-> scheduler are designed to
> update
> > the
> > >> >> > framework of updates to the task and not really designed to send
> > >> >> arbitrary
> > >> >> > information. Just because the Mesos API provides us with a string
> > >> field
> > >> >> > doesn't mean we should try to pack in arbitrary data. Also, it
> > isn't
> > >> >> clear
> > >> >> > what other capabilities we might add in the future so I'm
> > unconvinced
> > >> >> that
> > >> >> > capabilities needs to exist at all. My fear is that we will
> create
> > the
> > >> >> > infrastructure for capabilities just to serve this need and
> nothing
> > >> >> else.
> > >> >> >
> > >> >> > I object to Solution 1 along the same lines. The Mesos API isn't
> > >> >> designed
> > >> >> > for packing arbitrary data in the status update message and I
> don't
> > >> >> think
> > >> >> > we should abuse that and rely on that. Also our current
> > infrastructure
> > >> >> just
> > >> >> > plumbs the message to the UI and I think displaying capabilities
> is
> > >> not
> > >> >> > something we should do.
> > >> >> >
> > >> >> > I am in favor of Solution 3 which is as close as possible to the
> > >> >> original
> > >> >> > design in the design doc. The design doc says the following:
> > >> >> >
> > >> >> > Scheduler updater will skip the minWaitInInstanceMs (aka
> watch_secs
> > >> >> >> <https://github.com/apache/aurora/blob/4b43305b33cd8bebdd802
> > >> >> 25a3987b7cc7a8389a2/docs/configuration-reference.md#
> > >> updateconfig-objects
> > >> >> >)
> > >> >> >> grace period any time it detects a named port ‘health’ in task
> > >> >> >> configuration. A RUNNING instance status will signify the end of
> > >> >> instance
> > >> >> >> update.
> > >> >> >
> > >> >> >
> > >> >> > Instead of detecting the 'health' port in the task configuration,
> > we
> > >> >> make
> > >> >> > enabling this feature explicitly by enabling a bit in the task
> > >> >> > configuration with the `executorDrivenUpdates` bit.
> > >> >> >
> > >> >> > I understand this option makes this feature more complex because
> it
> > >> >> > requires a schema change and requires operators to deploy the
> > executor
> > >> >> to
> > >> >> > all agents before upgrading the client. However, I think that's a
> > one
> > >> >> time
> > >> >> > operational cost as a opposed to long lived design choices that
> > will
> > >> >> affect
> > >> >> > the code.
> > >> >> >
> > >> >> > Further Solution 3 is the most amenable to custom executors and
> > >> >> continues
> > >> >> > our tradition of treating executors as opaque black boxes. I
> think
> > >> >> there is
> > >> >> > a lot of value in treating executors as black boxes as it leaves
> > the
> > >> >> door
> > >> >> > open to switching our executor to something else and doesn't
> > impose a
> > >> >> > burden to others that want to write their own.
> > >> >> >
> > >> >> > Alternatively, if amending the schema is too much work, I would
> be
> > >> open
> > >> >> to
> > >> >> > just saying that scheduler version 0.16 (or 0.17) just assumes
> the
> > >> >> executor
> > >> >> > transitions to RUNNING once a task is healthy and dropping
> > >> `watch_secs`
> > >> >> > entirely. We can put it in the release notes that operators must
> > >> deploy
> > >> >> the
> > >> >> > executor to 100% before deploying the scheduler.
> > >> >> >
> > >> >> >
> > >> >> > On Thu, Sep 1, 2016 at 6:40 PM, 黄 凯 <texasred2...@hotmail.com>
> > wrote:
> > >> >> >
> > >> >> >> Hi Folks,
> > >> >> >>
> > >> >> >> I'm currently working on a feature on aurora scheduler and
> > executor.
> > >> >> The
> > >> >> >> implementation strategy became controversial on the review
> board,
> > so
> > >> I
> > >> >> was
> > >> >> >> wondering if I should broadcast it to more audience and
> initiate a
> > >> >> >> discussion. Please feel free to let me know your thoughts, your
> > help
> > >> is
> > >> >> >> greatly appreciated!
> > >> >> >>
> > >> >> >> The high level goal of this feature is to improve reliability
> and
> > >> >> >> performance of the Aurora scheduler job updater, by relying on
> > health
> > >> >> check
> > >> >> >> status rather than watch_secs timeout when deciding an
> individual
> > >> >> instance
> > >> >> >> update state.
> > >> >> >>
> > >> >> >> Please see the original review request *
> > >> https://reviews.apache.org/r/
> > >> >> 51536/
> > >> >> >> <https://reviews.apache.org/r/51536/> *
> > >> >> >> aurora JIRA ticket *https://issues.apache.org/
> > jira/browse/AURORA-894
> > >> >> >> <https://issues.apache.org/jira/browse/AURORA-894>*
> > >> >> >> design doc *https://docs.google.com/docum
> > >> >> ent/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
> > >> >> >> <https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm
> > >> >> 10NXSxEWR0a-21FP5d94/edit#>*
> > >> >> >> for more details and background.
> > >> >> >>
> > >> >> >> Note: The design doc becomes a little bit outdated on the
> > "scheduler
> > >> >> >> change summary" part (this is what the review request trying to
> > >> >> address).
> > >> >> >> As a result, I've left some comment to clarify the latest
> proposed
> > >> >> >> implementation plan for scheduler change.
> > >> >> >>
> > >> >> >> There are two questions I'm trying to address here:
> > >> >> >> *1. How does the scheduler infer the executor version and be
> > backward
> > >> >> >> compatible?*
> > >> >> >> *2. Where do we determine if health check is enabled?*
> > >> >> >>
> > >> >> >> In short, there are 3 different solutions proposed on the review
> > >> board.
> > >> >> >>
> > >> >> >> In the first two approaches, the scheduler will rely on a string
> > to
> > >> >> >> determine the executor version. We determine whether health
> check
> > is
> > >> >> >> enabled merely on executor side. There will be communication
> > between
> > >> >> the
> > >> >> >> executor and the scheduler.
> > >> >> >> *Solution 1: *
> > >> >> >> *vCurrent executor sends a message in its health check thread
> > during
> > >> >> >> RUNNING state transition, and the vCurrent updater will infer
> the
> > >> >> executor
> > >> >> >> version from the presence of this message, and skip the
> > watch_secs if
> > >> >> >> necessary.*
> > >> >> >>
> > >> >> >> *Solution 2:*
> > >> >> >> *Instead of relying on the presence of an arbitrary string in
> the
> > >> >> message,
> > >> >> >> rely on the presence of a string like:
> > >> >> >> "capabilities:CAPABILITY_1,CAPABILITY-2" where CAPABILITY_1 and
> > >> >> >> CAPABILITY_2 (etc.) are constants defined in api.thrift.
> Basically
> > >> just
> > >> >> >> formalizing the mechanism and making it a bit more future
> proof.*
> > >> >> >>
> > >> >> >> In the third solution, the scheduler infers the executor version
> > from
> > >> >> the
> > >> >> >> JobUpdateSettings on scheduler side.
> > >> >> >> *Solution 3:*
> > >> >> >> *Adding a bit to JobUpdateSettings which is
> > ‘executorDrivenUpdates',
> > >> if
> > >> >> >> that is set, the scheduler assumes that the transition from
> > STARTING
> > >> ->
> > >> >> >> RUNNING makes the executor healthy and concurrently, we release
> > >> >> thermos and
> > >> >> >> change HealthCheckConfig to say that it should only go to
> running
> > >> after
> > >> >> >> healthy*.
> > >> >> >>
> > >> >> >> *Pros and Cons:*
> > >> >> >> The main benefit of Solution 1 is:
> > >> >> >> 1. By using the message in task status update, we don't have to
> > make
> > >> >> any
> > >> >> >> schema change, which makes the design simple.
> > >> >> >> 2. The feature is fully backward-compatible. When we roll out
> the
> > >> >> vCurrent
> > >> >> >> schedulers and executors, we do not have to instruct the users
> to
> > >> >> provide
> > >> >> >> additional field in the Job or Update configs, which could
> > confuses
> > >> >> >> customers when the vPrev and vCurrent executor coexist in the
> > >> cluster.
> > >> >> >>
> > >> >> >> Concerns:
> > >> >> >> Relying on the presence of a message makes things brittle. Also
> > we do
> > >> >> not
> > >> >> >> want to expose this message to users.
> > >> >> >>
> > >> >> >> The benefit of Solution 2 is making the feature more future
> proof.
> > >> >> >> However, if we do not envision a new executor feature in the
> short
> > >> >> term,
> > >> >> >> it's not too much different from Solution 1.
> > >> >> >>
> > >> >> >> The benefits of Solution 3 include:
> > >> >> >> 1. We support more than just thermos now (and others rely on
> > custom
> > >> >> >> executors).
> > >> >> >> 2. A lot of things in Aurora treat the executor as opaque. The
> > status
> > >> >> >> update message sent by executor should not be visible to users
> > only
> > >> if
> > >> >> it's
> > >> >> >> an error message.
> > >> >> >>
> > >> >> >> Concerns:
> > >> >> >> 1. In addition to the ‘executorDrivenUpdates' bit that
> identifies
> > the
> > >> >> >> executor version, we still need to notify the scheduler if
> health
> > >> >> check is
> > >> >> >> enabled on vCurrent executor, if not, the scheduler must be able
> > to
> > >> >> fall
> > >> >> >> back to use watch_secs.
> > >> >> >> 2. The users have to provide an additional field in their
> .aurora
> > >> >> config
> > >> >> >> files. The feature wouldn't be available unless new clients are
> > >> rolled
> > >> >> out
> > >> >> >> as well.
> > >> >> >>
> > >> >> >> Please let me know if I understand your suggestions correctly
> and
> > >> >> >> hopefully everyone is on the same page!
> > >> >> >>
> > >> >> >> Thanks,
> > >> >> >>
> > >> >> >> Kai
> > >> >> >>
> > >> >>
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > Zameer Manji
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >> Zameer Manji
> > >>
> >
>

Re: 答复: Discussion on review request 51536

Reply via email to