Hi Joris,

Fair point: I didn't deliberately set out to change the behavior for
duplicate task IDs. Rather, it was a consequence of switching from
boost::circular_buffer to a hashmap for managing completed
tasks. Using a hashmap has a few minor advantages [1], but we can
certainly continue using circular_buffer (or a multi-hashmap) if we
want to keep the current behavior.
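
To make the difference concrete, here is a rough sketch of the two
storage strategies. It is not the actual master code: CompletedTask,
ListCache, and MapCache are hypothetical names, and I'm using standard
containers (std::deque, std::unordered_map) as stand-ins for
boost::circular_buffer and the Mesos hashmap type.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the metadata kept per completed task.
struct CompletedTask {
  std::string taskId;
  std::string state;
};

// List-like storage (analogous to a circular_buffer, minus the size
// bound): every completed task is appended, so duplicate IDs coexist.
struct ListCache {
  std::deque<CompletedTask> tasks;

  void add(const CompletedTask& task) { tasks.push_back(task); }

  std::size_t count(const std::string& id) const {
    std::size_t n = 0;
    for (const auto& task : tasks) {
      if (task.taskId == id) {
        ++n;
      }
    }
    return n;
  }
};

// Hashmap storage keyed by task ID: adding a task whose ID is already
// present overwrites the earlier entry, so only the latest survives.
struct MapCache {
  std::unordered_map<std::string, CompletedTask> tasks;

  void add(const CompletedTask& task) { tasks[task.taskId] = task; }

  std::size_t count(const std::string& id) const { return tasks.count(id); }
};
```

If the same ID (say "t1") is added twice, first as TASK_FINISHED and
then as TASK_FAILED, ListCache reports two entries for "t1" while
MapCache reports one, holding only the TASK_FAILED record. A
multi-hashmap would behave like ListCache in this respect.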

I think we have the following options:

(1) Keep the current behavior: reusing task IDs is discouraged but supported.

(2) Per Alex's suggestion, we can say that frameworks are no longer
allowed to reuse task IDs. However, because the master only keeps a
limited-size cache of completed tasks (which is not preserved across
master restart or failover), we wouldn't be able to detect and reject
every attempt by a framework to reuse a task ID.
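
As an illustration of why option (2) can't be enforced completely, here
is a minimal sketch of a bounded record of completed task IDs. The
class name, its methods, and the oldest-first eviction policy are all
assumptions for the sake of the example, not how the master is
implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_set>

// Hypothetical bounded record of recently completed task IDs.
class RecentTaskIds {
public:
  explicit RecentTaskIds(std::size_t capacity) : capacity_(capacity) {}

  // Remembers a completed task ID, evicting the oldest one when full.
  void complete(const std::string& id) {
    if (capacity_ == 0 || ids_.count(id) > 0) {
      return;  // nothing to store, or already remembered
    }
    if (order_.size() == capacity_) {
      ids_.erase(order_.front());
      order_.pop_front();
    }
    order_.push_back(id);
    ids_.insert(id);
  }

  // A reuse attempt can only be rejected while the ID is still cached.
  bool wouldReject(const std::string& id) const {
    return ids_.count(id) > 0;
  }

private:
  std::size_t capacity_;
  std::deque<std::string> order_;        // insertion order, oldest first
  std::unordered_set<std::string> ids_;  // O(1) membership check
};
```

With a capacity of 2, completing "a", "b", and then "c" evicts "a", so a
later attempt to reuse "a" would go undetected. A master restart or
failover has the same effect as starting over with an empty cache.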

If we pursue option (2), we might need a deprecation period or a master
capability to give framework authors some time to migrate.

For the moment, I'll avoid changing the behavior for duplicate task
IDs; I've opened https://issues.apache.org/jira/browse/MESOS-6779 to
track this issue. If you have an opinion on this change, please
weigh in, either on this thread or on JIRA.

Neil

[1] Specifically, making the management of completed and unreachable
tasks more symmetric and avoiding some bugs/undefined behavior in
boost::circular_buffer. O(1) lookup of completed tasks might be useful
in the future but isn't used right now.

On Fri, Dec 9, 2016 at 2:13 PM, Joris Van Remoortere
<jo...@mesosphere.io> wrote:
> Hey Neil,
>
> I concur that using duplicate task IDs is bad practice and asking for
> trouble.
>
> Could you please clarify *why* you want to use a hashmap? Is your goal to
> remove duplicate task IDs or is this just a side-effect and you have a
> different reason (e.g. performance) for using a hashmap?
>
> I'm wondering why a multi-hashmap is not sufficient. This would be clear if
> you were explicitly *trying* to get rid of duplicates of course :-)
>
> Thanks,
> Joris
>
> —
> *Joris Van Remoortere*
> Mesosphere
>
> On Fri, Dec 9, 2016 at 7:08 AM, Neil Conway <neil.con...@gmail.com> wrote:
>
>> Folks,
>>
>> The master stores a cache of metadata about recently completed tasks;
>> for example, this information can be accessed via the "/tasks" HTTP
>> endpoint or the "GET_TASKS" call in the new Operator API.
>>
>> The master currently stores this metadata using a list; this means
>> that duplicate task IDs are permitted. We're considering [1] changing
>> this to use a hashmap instead. Using a hashmap would mean that
>> duplicate task IDs would be discarded: if two completed tasks have the
>> same task ID, only the metadata for the most recently completed task
>> would be retained by the master.
>>
>> If this behavior change would cause problems for your framework or
>> other software that relies on Mesos, please let me know.
>>
>> (Note that if you do have two completed tasks with the same ID, you'd
>> need an unambiguous way to tell them apart. As a recommendation, I
>> would strongly encourage framework authors to never reuse task IDs.)
>>
>> Neil
>>
>> [1] https://reviews.apache.org/r/54179/
>>
