Thanks Stephan. Please read inline.

On Sat, Jan 20, 2018 at 5:03 AM, Stephan Erb <[email protected]> wrote:
> Q1: Does Aurora use COMMAND or DEFAULT executor?
>
> Aurora is currently using neither. In Mesos terms Thermos is a CUSTOM
> executor. On top, Aurora supports alternative custom executors [1] such as
> the Docker compose executor [2].
>
> Mesos seems to be betting on the new DEFAULT executor. It should be
> possible to make Thermos fit the DEFAULT executor model (as it supports
> task groups), but I have no real estimate how much refactoring this would
> require.

This was about a point Bill made earlier. I am wondering if "without an executor" is COMMAND or DEFAULT.

```
> But do we really need the command line option? *Aurora can run tasks without an executor.* I'm assuming the shutdown call is incompatible with that mode.
```

> Q2: I think that this is ok as Aurora's reconciliation will still work...
> Right?
>
> Aurora assumes a correspondence of one task per executor, so I believe
> this is correct.

Great.

> Q3: Does thermos executor need any changes to respond to SHUTDOWN or does
> it already handle that?
>
> I have never tried it, but I believe it should work out of the box [3].

Indeed looks like it is already handled.

> [1] https://github.com/apache/aurora/blob/master/docs/features/custom-executors.md
> [2] https://github.com/mesos/docker-compose-executor
> [3] https://github.com/apache/aurora/blob/8af269f52f162faa36cd2778979626eefcbe8181/src/main/python/apache/aurora/executor/aurora_executor.py#L301-L313
>
> Best regards,
> Stephan
>
> On Wed, 2018-01-17 at 16:45 -0800, Mohit Jaggi wrote:
>
> FYI....I had a quick chat with Vinod from the Mesos team. I have some
> questions for Aurora users inline:
>
> *Originally the default was the COMMAND executor.
> In this world the scheduler has no visibility into the command executor.*
> *More recently, we added a DEFAULT executor which is used by frameworks
> when they want to launch pod like task groups*
>
> *The SHUTDOWN executor call is only applicable if a scheduler uses CUSTOM
> or DEFAULT executor *and* uses v1 scheduler API.*
>
> Q1: Does Aurora use COMMAND or DEFAULT executor?
>
> *note that SHUTDOWN is not as robust as you might think
> :slightly_smiling_face:*
> *for one, there is no reconciliation API for the executor state. it is
> very much best effort.*
> *KILL is more robust for killing tasks, because task status updates are
> reliably delivered and there is reconciliation API*
>
> Q2: I think that this is ok as Aurora's reconciliation will still work as
> we don't have "executor state". "task state" will be a good and correct
> proxy for that. Aurora will send SHUTDOWN again and again until it succeeds
> in the same way as it does now with KILL. Right?
>
> Q3: Does thermos executor need any changes to respond to SHUTDOWN or does
> it already handle that?
>
> On Tue, Jan 16, 2018 at 4:48 PM, Mohit Jaggi <[email protected]> wrote:
>
> So that is pretty much what I proposed...
>
> If the method signature has to change, we can keep the executorId as it
> is, unless we want to take this opportunity to clean that up. I will check
> if the SHUTDOWN works in non-executor cases also.
>
> On Tue, Jan 16, 2018 at 3:03 PM, Bill Farner <[email protected]> wrote:
>
> We still need "Agent ID" for the shutdown call.
>
> Darn. In that case, how about we change the method signature in Driver to
> accept agentId and ignore that param in MesosSchedulerDriver.
>
> But do we really need the command line option?
>
> Aurora can run tasks without an executor. I'm assuming the shutdown call
> is incompatible with that mode.
>
> On Tue, Jan 16, 2018 at 1:57 PM, Mohit Jaggi <[email protected]> wrote:
>
> We still need "Agent ID" for the shutdown call.
>
> On Tue, Jan 16, 2018 at 1:57 PM, Mohit Jaggi <[email protected]> wrote:
>
> Sounds good. But do we really need the command line option? One can use an
> older Driver if KILL is preferred for some reason.
>
> On Tue, Jan 16, 2018 at 1:51 PM, Bill Farner <[email protected]> wrote:
>
> This situation is much simpler if task ID == executor ID. I can't come up
> with a good reason why this is not the case today. Our executor IDs
> originally included a static prefix, though I do not recall any rationale for
> this. When Renan added custom executor support, this static prefix was
> made configurable. Again, I do not believe there was any rationale for the
> utility of executor IDs.
>
> I propose the following:
> - Change relevant code in MesosTaskFactory to
>   setExecutorId(task.getTaskId())
> - Add a command line parameter (default false) to toggle use of executor
>   shutdown in VersionedSchedulerDriverService.killTask
>
> Does anyone see an issue with this approach?
>
> On Tue, Jan 16, 2018 at 11:15 AM, Mohit Jaggi <[email protected]> wrote:
>
> To do this in a backward compatible manner, one way is:
>
> ```
> void destroy(taskId, executorId, agentId) {
>   if (driver instanceof Versioned....)
>     ((Versioned...) driver).shutdown(executorId, agentId)
>   else
>     driver.kill(taskId)
> }
> ```
>
> Any other opinions?
>
> On Tue, Jan 16, 2018 at 11:12 AM, David McLaughlin <[email protected]> wrote:
>
> Nope, I support getting SHUTDOWN in for users of the new API.
>
> On Tue, Jan 16, 2018 at 11:06 AM, Mohit Jaggi <[email protected]> wrote:
>
> Are you suggesting that we delay the switch to SHUTDOWN call until this
> working group can resolve the API perf issue?
>
> On Mon, Jan 15, 2018 at 3:55 PM, David McLaughlin <[email protected]> wrote:
>
> We are working with Mesos folks to resolve it.
> There is a Mesos performance working group that folks can join if they'd
> like to contribute:
> http://mesos.apache.org/blog/performance-working-group-progress-report/
>
> I'm not sure what you mean by branch. Everything we used to scale test is
> on master.
>
> On Mon, Jan 15, 2018 at 10:08 AM, Meghdoot bhattacharya <[email protected]> wrote:
>
> David, should Twitter try against Mesos 1.5 to see if things are better
> with the new API instead of libmesos? This is going to be a drift over time
> that will stop us from adopting new features.
>
> If it was sometime back it would be good to rerun the tests and open a
> ticket in Mesos if issues exist. All Aurora users can then push for
> resolution.
>
> Also, details on the branch etc. that has the API integration?
>
> Thx
>
> On Jan 12, 2018, at 11:39 AM, David McLaughlin <[email protected]> wrote:
>
> I'm not sure I agree with the summary. Bill's proposal was using shutdown
> only when using the new API. I would also support this if it's possible.
>
> On Fri, Jan 12, 2018 at 11:14 AM, Mohit Jaggi <[email protected]> wrote:
>
> Summary so far:
> - Bill supports making this change
> - This change cannot be made in a backward compatible manner
> - David (Twitter) does not want to use HTTP APIs due to performance
>   concerns. I conclude that folks from Twitter don't support this change
>
> Question:
> - Are there other users that want this change?
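
For reference, a minimal Java sketch of the approach discussed in this thread: Bill's proposal to make the executor ID equal to the task ID and to gate executor shutdown behind a flag (default false), combined with the instanceof fallback to KILL from the pseudocode earlier in the thread. All types below (Driver, ShutdownCapableDriver, TaskDestroyer, RecordingDriver) are hypothetical stand-ins invented for illustration; they are not Aurora's or Mesos's actual classes or method signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the scheduler's driver abstraction (not Aurora's real interface).
interface Driver {
    void killTask(String taskId);
}

// Hypothetical stand-in for a driver backed by the v1 scheduler API, which can also send SHUTDOWN.
interface ShutdownCapableDriver extends Driver {
    void shutdownExecutor(String executorId, String agentId);
}

// Chooses between SHUTDOWN and KILL, assuming executor ID == task ID.
final class TaskDestroyer {
    private final Driver driver;
    // Mirrors the proposed command line parameter; false keeps today's KILL behavior.
    private final boolean useExecutorShutdown;

    TaskDestroyer(Driver driver, boolean useExecutorShutdown) {
        this.driver = driver;
        this.useExecutorShutdown = useExecutorShutdown;
    }

    void destroy(String taskId, String agentId) {
        if (useExecutorShutdown && driver instanceof ShutdownCapableDriver) {
            // Executor ID is assumed to equal the task ID, so the task ID is passed as-is.
            ((ShutdownCapableDriver) driver).shutdownExecutor(taskId, agentId);
        } else {
            // Older drivers, or the flag left at its default, fall back to KILL.
            driver.killTask(taskId);
        }
    }
}

// Test double that records calls instead of talking to Mesos.
final class RecordingDriver implements ShutdownCapableDriver {
    final List<String> calls = new ArrayList<>();

    @Override
    public void killTask(String taskId) {
        calls.add("kill:" + taskId);
    }

    @Override
    public void shutdownExecutor(String executorId, String agentId) {
        calls.add("shutdown:" + executorId + "@" + agentId);
    }
}
```

With the flag enabled and a shutdown-capable driver, destroy("task-1", "agent-1") issues a SHUTDOWN for executor "task-1" on "agent-1"; in every other case it issues a KILL, so a driver without shutdown support behaves as it does today.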
