David,
- LCD makes sense. Does that mean that Twitter is using the SCHEDULER_DRIVER
<https://github.com/apache/aurora/blob/aae2b0dc73b7534c66982ed07b1f029150e245de/src/main/java/org/apache/aurora/scheduler/mesos/SchedulerDriverModule.java#L72>
 version?
- I don't see Bill's proposal on this thread. Did I miss it?

Renan,
VersionedDriverFactory
<https://github.com/apache/aurora/blob/2e1ca42887bc8ea1e8c6cddebe9d1cf29268c714/src/main/java/org/apache/aurora/scheduler/mesos/VersionedDriverFactory.java#L24>'s
comments indicate that libmesos is still used. What am I missing?

BTW, with the patch for Thermos (from Stephan I think), the need for
switching to SHUTDOWN is reduced.
Mohit.

On Thu, Jan 11, 2018 at 2:01 PM, David McLaughlin <dmclaugh...@apache.org>
wrote:

> Sorry, the other approach outlined by Bill would in theory work too, but
> it sounds like in practice it also needs more changes on the Mesos side.
>
> On Thu, Jan 11, 2018 at 1:55 PM, David McLaughlin <dmclaugh...@apache.org>
> wrote:
>
>> Right. In order to keep the current abstraction in Aurora (both APIs), we
>> obviously have to bind to the lowest common denominator API methods. So the
>> only way to integrate with shutdown will be to fix the performance issues
>> so we can switch to the new API.
>>
>> The performance issue we ran into at Twitter was that with status updates
>> that were similar to our production volume, they started to get dropped and
>> tasks ended up being LOST and unnecessarily killed. So it's a definite
>> blocker for us to adopt in its current state. We have someone who has
>> fixing this on the Mesos side in their backlog, but it's currently not the
>> highest priority for us.
>>
>> On Thu, Jan 11, 2018 at 1:45 PM, Renan DelValle <renanidelva...@gmail.com>
>> wrote:
>>
>>> The HTTP API is what is used under the hood for V0 and V1 (instead of
>>> libmesos). I believe that's what David was referencing when he mentioned
>>> the HTTP performance issues. Here's a better explanation from the original
>>> patch submitted by Zameer:
>>> https://github.com/apache/aurora/commit/705dbc7cd7c3ff477bcf766cdafe49a68ab47dee#diff-75bd5a98db87502a2332e9110d2eafc6
>>>
>>> I'm not sure about the Shutdown call. As you mentioned, the versioned
>>> driver seems to have the method but the driver interface does not. This
>>> might get tricky from here on in since Mesos has V1-only calls.
>>>
>>> On Thu, Jan 11, 2018 at 1:24 PM, Mohit Jaggi <mohit.ja...@uber.com>
>>> wrote:
>>>
>>>> Thanks Renan. I saw that code. "Driver" interface does not have
>>>> SHUTDOWN...so it is not "compatible". I was trying to change to
>>>> VersionedSchedulerDriverService all over the code (that wreaks havoc
>>>> across the tests!) but Mesos's Java wrapper doesn't seem to have that
>>>> call either. Perhaps, that is why David referred to the HTTP API.
>>>>
>>>> On Thu, Jan 11, 2018 at 1:14 PM, Renan DelValle <
>>>> renanidelva...@gmail.com> wrote:
>>>>
>>>>> https://github.com/apache/aurora/blob/aae2b0dc73b7534c66982ed07b1f029150e245de/src/main/java/org/apache/aurora/scheduler/mesos/SchedulerDriverModule.java
>>>>>
>>>>> https://github.com/apache/aurora/blob/aae2b0dc73b7534c66982ed07b1f029150e245de/src/main/java/org/apache/aurora/scheduler/mesos/VersionedSchedulerDriverService.java#L50
>>>>>
>>>>> On Tue, Jan 9, 2018 at 1:21 PM, Mohit Jaggi <mohit.ja...@uber.com>
>>>>> wrote:
>>>>>
>>>>>> David,
>>>>>> Where can I find this code?
>>>>>>
>>>>>> Mohit.
>>>>>>
>>>>>> On Sat, Dec 9, 2017 at 4:27 PM, David McLaughlin <
>>>>>> dmclaugh...@apache.org> wrote:
>>>>>>
>>>>>>> The new API is present in Aurora in a compatibility layer, but the
>>>>>>> HTTP performance issues still exist so we can't make it the default.
>>>>>>>
>>>>>>> On Sat, Dec 9, 2017 at 4:24 PM, Bill Farner <wfar...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Aurora pre-dates SHUTDOWN by several years, so the option was not
>>>>>>>> present.  Additionally, the SHUTDOWN call is not available in the API
>>>>>>>> used by Aurora.  Last I knew, Aurora could not use the "new" API because
>>>>>>>> of performance issues in the implementation, but I do not know where
>>>>>>>> that stands today.
>>>>>>>>
>>>>>>>> https://mesos.apache.org/documentation/latest/scheduler-http-api/#shutdown
>>>>>>>>
>>>>>>>>> NOTE: This is a new call that was not present in the old API
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Dec 9, 2017 at 4:11 PM, Mohit Jaggi <mohit.ja...@uber.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Folks,
>>>>>>>>> Our Mesos team is wondering why Aurora chose KILL over SHUTDOWN
>>>>>>>>> for killing tasks. As Aurora has an executor per task, won't SHUTDOWN
>>>>>>>>> work better? It will avoid zombie executors.
>>>>>>>>>
>>>>>>>>> Mohit.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
