[ https://issues.apache.org/jira/browse/MESOS-7832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16100772#comment-16100772 ]
Yan Xu commented on MESOS-7832:
-------------------------------
/cc [~neilc] [~vinodkone]
> Mesos master during failover may not re-add completed tasks from agents
> belonging to frameworks that have yet to reregister
> ---------------------------------------------------------------------------------------------------------------------------
>
> Key: MESOS-7832
> URL: https://issues.apache.org/jira/browse/MESOS-7832
> Project: Mesos
> Issue Type: Bug
> Reporter: Yan Xu
>
> Relevant code:
> https://github.pie.apple.com/pie/mesos/blob/cd3380c4e9521b4b26f9030658816eee7a4b89a1/src/master/master.cpp#L8611-L8617
> Info about these completed tasks is discarded, and when the framework later
> subscribes, the tasks are not recovered.
> This is not ideal: after a master failover, the new master does not recover
> all the info that the previous master had, and the web UI shows incomplete
> information.
> In the short term we can store the info for such tasks temporarily but delete
> it after a timeout if the related frameworks don't reregister.
> Of course, once we persist the frameworks in the registry, the master will
> have better knowledge of whether a framework has completed.
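
A minimal sketch of the short-term idea above (hold the completed tasks temporarily, hand them over when the framework subscribes, drop them after a timeout). This is not the actual master code: `CompletedTask`, `OrphanedCompletedTasks`, and the methods `add`, `claim`, and `expire` are hypothetical names used only for illustration, and the timeout value is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the completed-task info reported by an agent.
struct CompletedTask {
  std::string taskId;
  std::string state;  // e.g. "TASK_FINISHED"
};

using Clock = std::chrono::steady_clock;

// Hypothetical holder for completed tasks whose framework has not yet
// reregistered with the new master.
class OrphanedCompletedTasks {
public:
  explicit OrphanedCompletedTasks(std::chrono::seconds timeout)
    : timeout_(timeout) {
    assert(timeout.count() >= 0);
  }

  // Store a completed task instead of discarding it. The timeout for a
  // framework starts when its first task is stored.
  void add(const std::string& frameworkId,
           CompletedTask task,
           Clock::time_point now) {
    Entry& entry = entries_[frameworkId];
    if (entry.tasks.empty()) {
      entry.firstSeen = now;
    }
    entry.tasks.push_back(std::move(task));
  }

  // When the framework subscribes, hand over its stored tasks and forget
  // them. Returns an empty vector if nothing is stored.
  std::vector<CompletedTask> claim(const std::string& frameworkId) {
    std::vector<CompletedTask> tasks;
    auto it = entries_.find(frameworkId);
    if (it != entries_.end()) {
      tasks = std::move(it->second.tasks);
      entries_.erase(it);
    }
    return tasks;
  }

  // Drop the tasks of frameworks that did not reregister within the
  // timeout. Returns the number of frameworks dropped.
  std::size_t expire(Clock::time_point now) {
    std::size_t dropped = 0;
    for (auto it = entries_.begin(); it != entries_.end();) {
      if (now - it->second.firstSeen >= timeout_) {
        it = entries_.erase(it);
        ++dropped;
      } else {
        ++it;
      }
    }
    return dropped;
  }

private:
  struct Entry {
    Clock::time_point firstSeen;
    std::vector<CompletedTask> tasks;
  };

  std::chrono::seconds timeout_;
  std::unordered_map<std::string, Entry> entries_;
};
```

In this sketch the master would call `add` where it currently discards the info, `claim` when the framework subscribes, and `expire` periodically. The caller passes in the current time so the timeout can be exercised without waiting.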