Need to correct a few previous statements: > Also we do not want to expose this message to users. This is incorrect. The original design proposal suggested to show this message in the UI as: "Task is healthy"
> The Mesos API isn't designed for packing arbitrary data > in the status update message. Don't think I agree, this is exactly what this field is for [1] and we already use it for other states [2]. > I would be open to just saying that scheduler version > 0.16 (or 0.17) just assumes the executor transitions to > RUNNING once a task is healthy and dropping > `watch_secs`entirely. We can't drop 'watch_secs' entirely as we still have to babysit job updates that don't have health checks enabled. As for my take on the above, I favor #1 as the simplest answer to an already simple question: "Should we use watch_secs for this instance or not?". That's pretty much it. Scheduler does not need any schema changes, know what health checks are or if a job has them enabled. At least not until we attempt to move to centralized health checks (AURORA-279) but that will be an entirely different design discussion. [1] - https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L1605. [2] - https://github.com/apache/aurora/blob/5cad046fc0f0c4bb79a4563cfcff0442b7bf8383/src/main/python/apache/aurora/executor/aurora_executor.py#L97 On Fri, Sep 2, 2016 at 2:26 PM, Zameer Manji <zma...@apache.org> wrote: > *cc: Renan* > > I think there is some disagreement/discussion on the review because we have > not achieved consensus on the design. Since the design doc was written, > Aurora adopted multiple executor support as well non HTTP based > healthchecking. This invalidates some parts of the original design. I think > all of the solutions here are possible amendments to the design doc. > > I am not in favor of Solution 2 at all because status updates between > executor <-> agent <-> master <-> scheduler are designed to update the > framework of updates to the task and not really designed to send arbitrary > information. Just because the Mesos API provides us with a string field > doesn't mean we should try to pack in arbitrary data. Also, it isn't clear > what other capabilities we might add in the future so I'm unconvinced that > capabilities needs to exist at all. My fear is that we will create the > infrastructure for capabilities just to serve this need and nothing else. > > I object to Solution 1 along the same lines. The Mesos API isn't designed > for packing arbitrary data in the status update message and I don't think > we should abuse that and rely on that. Also our current infrastructure just > plumbs the message to the UI and I think displaying capabilities is not > something we should do. > > I am in favor of Solution 3 which is as close as possible to the original > design in the design doc. The design doc says the following: > > Scheduler updater will skip the minWaitInInstanceMs (aka watch_secs >> <https://github.com/apache/aurora/blob/4b43305b33cd8bebdd80225a3987b7cc7a8389a2/docs/configuration-reference.md#updateconfig-objects>) >> grace period any time it detects a named port ‘health’ in task >> configuration. A RUNNING instance status will signify the end of instance >> update. > > > Instead of detecting the 'health' port in the task configuration, we make > enabling this feature explicitly by enabling a bit in the task > configuration with the `executorDrivenUpdates` bit. > > I understand this option makes this feature more complex because it > requires a schema change and requires operators to deploy the executor to > all agents before upgrading the client. However, I think that's a one time > operational cost as a opposed to long lived design choices that will affect > the code. > > Further Solution 3 is the most amenable to custom executors and continues > our tradition of treating executors as opaque black boxes. I think there is > a lot of value in treating executors as black boxes as it leaves the door > open to switching our executor to something else and doesn't impose a > burden to others that want to write their own. > > Alternatively, if amending the schema is too much work, I would be open to > just saying that scheduler version 0.16 (or 0.17) just assumes the executor > transitions to RUNNING once a task is healthy and dropping `watch_secs` > entirely. We can put it in the release notes that operators must deploy the > executor to 100% before deploying the scheduler. > > > On Thu, Sep 1, 2016 at 6:40 PM, 黄 凯 <texasred2...@hotmail.com> wrote: > >> Hi Folks, >> >> I'm currently working on a feature on aurora scheduler and executor. The >> implementation strategy became controversial on the review board, so I was >> wondering if I should broadcast it to more audience and initiate a >> discussion. Please feel free to let me know your thoughts, your help is >> greatly appreciated! >> >> The high level goal of this feature is to improve reliability and >> performance of the Aurora scheduler job updater, by relying on health check >> status rather than watch_secs timeout when deciding an individual instance >> update state. >> >> Please see the original review request *https://reviews.apache.org/r/51536/ >> <https://reviews.apache.org/r/51536/> * >> aurora JIRA ticket *https://issues.apache.org/jira/browse/AURORA-894 >> <https://issues.apache.org/jira/browse/AURORA-894>* >> design doc >> *https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# >> <https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#>* >> for more details and background. >> >> Note: The design doc becomes a little bit outdated on the "scheduler >> change summary" part (this is what the review request trying to address). >> As a result, I've left some comment to clarify the latest proposed >> implementation plan for scheduler change. >> >> There are two questions I'm trying to address here: >> *1. How does the scheduler infer the executor version and be backward >> compatible?* >> *2. Where do we determine if health check is enabled?* >> >> In short, there are 3 different solutions proposed on the review board. >> >> In the first two approaches, the scheduler will rely on a string to >> determine the executor version. We determine whether health check is >> enabled merely on executor side. There will be communication between the >> executor and the scheduler. >> *Solution 1: * >> *vCurrent executor sends a message in its health check thread during >> RUNNING state transition, and the vCurrent updater will infer the executor >> version from the presence of this message, and skip the watch_secs if >> necessary.* >> >> *Solution 2:* >> *Instead of relying on the presence of an arbitrary string in the message, >> rely on the presence of a string like: >> "capabilities:CAPABILITY_1,CAPABILITY-2" where CAPABILITY_1 and >> CAPABILITY_2 (etc.) are constants defined in api.thrift. Basically just >> formalizing the mechanism and making it a bit more future proof.* >> >> In the third solution, the scheduler infers the executor version from the >> JobUpdateSettings on scheduler side. >> *Solution 3:* >> *Adding a bit to JobUpdateSettings which is ‘executorDrivenUpdates', if >> that is set, the scheduler assumes that the transition from STARTING -> >> RUNNING makes the executor healthy and concurrently, we release thermos and >> change HealthCheckConfig to say that it should only go to running after >> healthy*. >> >> *Pros and Cons:* >> The main benefit of Solution 1 is: >> 1. By using the message in task status update, we don't have to make any >> schema change, which makes the design simple. >> 2. The feature is fully backward-compatible. When we roll out the vCurrent >> schedulers and executors, we do not have to instruct the users to provide >> additional field in the Job or Update configs, which could confuses >> customers when the vPrev and vCurrent executor coexist in the cluster. >> >> Concerns: >> Relying on the presence of a message makes things brittle. Also we do not >> want to expose this message to users. >> >> The benefit of Solution 2 is making the feature more future proof. >> However, if we do not envision a new executor feature in the short term, >> it's not too much different from Solution 1. >> >> The benefits of Solution 3 include: >> 1. We support more than just thermos now (and others rely on custom >> executors). >> 2. A lot of things in Aurora treat the executor as opaque. The status >> update message sent by executor should not be visible to users only if it's >> an error message. >> >> Concerns: >> 1. In addition to the ‘executorDrivenUpdates' bit that identifies the >> executor version, we still need to notify the scheduler if health check is >> enabled on vCurrent executor, if not, the scheduler must be able to fall >> back to use watch_secs. >> 2. The users have to provide an additional field in their .aurora config >> files. The feature wouldn't be available unless new clients are rolled out >> as well. >> >> Please let me know if I understand your suggestions correctly and >> hopefully everyone is on the same page! >> >> Thanks, >> >> Kai >>