Re: shutdown vs kill API is Mesos

2018-01-20 Thread Stephan Erb
> Q1: Does Aurora use COMMAND or DEFAULT executor? Aurora is currently using neither. In Mesos terms Thermos is a CUSTOM executor. On top, Aurora supports alternative custom executors [1] such as the Docker compose executor [2]. Mesos seems to be betting on the new DEFAULT executor. It should

Re: shutdown vs kill API is Mesos

2018-01-17 Thread Mohit Jaggi
FYI, I had a quick chat with Vinod from the Mesos team. I have some questions for Aurora users inline: *Originally the default was the COMMAND executor. In this world the scheduler has no visibility into the command executor.* *More recently, we added a DEFAULT executor which is used by

Re: shutdown vs kill API is Mesos

2018-01-16 Thread Mohit Jaggi
So that is pretty much what I proposed... If the method signature has to change, we can keep the executorId as it is, unless we want to take this opportunity to clean that up. I will check if the SHUTDOWN works in non-executor cases also. On Tue, Jan 16, 2018 at 3:03 PM, Bill Farner

Re: shutdown vs kill API is Mesos

2018-01-16 Thread Bill Farner
> > We still need "Agent ID" for the shutdown call. Darn. In that case, how about we change the method signature in Driver to accept agentId and ignore that param in MesosSchedulerDriver. But do we really need the command line option? Aurora can run tasks without an executor. I'm assuming
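
Editor's sketch of the signature change suggested in this message: the shared driver interface accepts an agent ID, the legacy driver ignores it and issues a KILL, and the newer driver uses it for a SHUTDOWN. This is only an illustration under assumptions. The class and method names below are hypothetical and are not Aurora's or Mesos's actual API, the calls are returned as strings instead of being sent to a master, and the sketch assumes the executor ID equals the task ID.

```python
from abc import ABC, abstractmethod


class Driver(ABC):
    """Hypothetical common interface shared by both driver implementations."""

    @abstractmethod
    def kill_task(self, task_id: str, agent_id: str) -> str:
        """Stop a task; returns a description of the call that would be made."""


class LegacyDriver(Driver):
    """Stands in for the older driver: only KILL is available."""

    def kill_task(self, task_id: str, agent_id: str) -> str:
        # agent_id is accepted for interface compatibility and ignored.
        return f"KILL task={task_id}"


class VersionedDriver(Driver):
    """Stands in for the newer API, where SHUTDOWN needs an agent ID."""

    def kill_task(self, task_id: str, agent_id: str) -> str:
        # Assumption: executor ID == task ID, so no separate lookup is needed.
        executor_id = task_id
        return f"SHUTDOWN executor={executor_id} agent={agent_id}"


if __name__ == "__main__":
    for driver in (LegacyDriver(), VersionedDriver()):
        print(driver.kill_task("task-1", "agent-7"))
```

Callers pass the agent ID unconditionally, so the choice between KILL and SHUTDOWN stays inside the driver implementation and would not need a separate command line option.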

Re: shutdown vs kill API is Mesos

2018-01-16 Thread Mohit Jaggi
We still need "Agent ID" for the shutdown call. On Tue, Jan 16, 2018 at 1:57 PM, Mohit Jaggi wrote: > Sounds good. But do we really need the command line option? One can use an > older Driver if KILL is preferred for some reason. > > On Tue, Jan 16, 2018 at 1:51 PM, Bill

Re: shutdown vs kill API is Mesos

2018-01-16 Thread Mohit Jaggi
Sounds good. But do we really need the command line option? One can use an older Driver if KILL is preferred for some reason. On Tue, Jan 16, 2018 at 1:51 PM, Bill Farner wrote: > This situation is much simpler if task ID == executor ID. I can't come up > with a good reason

Re: shutdown vs kill API is Mesos

2018-01-16 Thread Bill Farner
This situation is much simpler if task ID == executor ID. I can't come up with a good reason why this is not the case today. Our executor IDs originally included a static prefix, though I do not recall any rationale for this. When Renan added custom executor support, this static prefix was made

Re: shutdown vs kill API is Mesos

2018-01-15 Thread David McLaughlin
We are working with Mesos folks to resolve it. There is a Mesos performance working group that folks can join if they'd like to contribute: http://mesos.apache.org/blog/performance-working-group-progress-report/ I'm not sure what you mean by branch. Everything we used to scale test is on master.

Re: shutdown vs kill API is Mesos

2018-01-15 Thread Meghdoot bhattacharya
David, should Twitter test against Mesos 1.5 to see if things are better with the new API instead of libmesos? This is going to be a drift over time that will stop us from adopting new features. If it was sometime back it would be good to rerun the tests and open a ticket in Mesos if issues

Re: shutdown vs kill API is Mesos

2018-01-12 Thread David McLaughlin
I'm not sure I agree with the summary. Bill's proposal was using shutdown only when using the new API. I would also support this if it's possible. On Fri, Jan 12, 2018 at 11:14 AM, Mohit Jaggi wrote: > Summary so far: > - Bill supports making this change > - This change

Re: shutdown vs kill API is Mesos

2018-01-12 Thread Mohit Jaggi
Summary so far: - Bill supports making this change - This change cannot be made in a backward compatible manner - David (Twitter) does not want to use HTTP APIs due to performance concerns. I conclude that folks from Twitter don't support this change Question: - Are there other users that want

Re: shutdown vs kill API is Mesos

2018-01-11 Thread Renan DelValle
Sorry, I guess referring to it as the libmesos way of talking to the Mesos master is a bit misleading. And I stand corrected, the V0 is only an adaptor to the V1 interface which still uses the undocumented RPC way of talking to the master (

Re: shutdown vs kill API is Mesos

2018-01-11 Thread Mohit Jaggi
David, - LCD makes sense. Does that mean that Twitter is using the SCHEDULER_DRIVER version? - I don't see Bill's proposal on this

Re: shutdown vs kill API is Mesos

2018-01-11 Thread David McLaughlin
Sorry, the other approach outlined by Bill would in theory work too, but it sounds like in practice it also needs more changes on the Mesos side. On Thu, Jan 11, 2018 at 1:55 PM, David McLaughlin wrote: > Right. In order to keep the current abstraction in Aurora (both

Re: shutdown vs kill API is Mesos

2018-01-11 Thread David McLaughlin
Right. In order to keep the current abstraction in Aurora (both APIs), we obviously have to bind to the lowest common denominator API methods. So the only way to integrate with shutdown will be to fix the performance issues so we can switch to the new API. The performance issue we ran into at

Re: shutdown vs kill API is Mesos

2018-01-11 Thread Renan DelValle
The HTTP API is what is used under the hood for V0 and V1 (instead of libmesos), I believe that's what David was referencing when he mentioned the HTTP performance issues. Here's a better explanation from the original patch submitted by Zameer:

Re: shutdown vs kill API is Mesos

2018-01-11 Thread Mohit Jaggi
Thanks Renan. I saw that code. "Driver" interface does not have SHUTDOWN...so it is not "compatible". I was trying to change to VersionedSchedulerDriverService all over the code (that wreaks havoc across the tests!) but Mesos's Java wrapper

Re: shutdown vs kill API is Mesos

2018-01-11 Thread Renan DelValle
https://github.com/apache/aurora/blob/aae2b0dc73b7534c66982ed07b1f029150e245de/src/main/java/org/apache/aurora/scheduler/mesos/SchedulerDriverModule.java

Re: shutdown vs kill API is Mesos

2018-01-09 Thread Mohit Jaggi
David, Where can I find this code? Mohit. On Sat, Dec 9, 2017 at 4:27 PM, David McLaughlin wrote: > The new API is present in Aurora in a compatibility layer, but the HTTP > performance issues still exist so we can't make it the default. > > On Sat, Dec 9, 2017 at 4:24

Re: shutdown vs kill API is Mesos

2017-12-09 Thread Mohit Jaggi
Filed https://issues.apache.org/jira/browse/AURORA-1960 On Sat, Dec 9, 2017 at 4:45 PM, Bill Farner wrote: > The new API is present in Aurora in a compatibility layer > > > Aha! I had not explored that code >

Re: shutdown vs kill API is Mesos

2017-12-09 Thread Bill Farner
> > The new API is present in Aurora in a compatibility layer Aha! I had not explored that code yet. It does seem that