[ https://issues.apache.org/jira/browse/MESOS-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ilya Pronin updated MESOS-7795:
-------------------------------
Shepherd: Yan Xu
> Remove "latest" symlink after agent reboot
> ------------------------------------------
>
> Key: MESOS-7795
> URL: https://issues.apache.org/jira/browse/MESOS-7795
> Project: Mesos
> Issue Type: Improvement
> Components: agent
> Reporter: Ilya Pronin
> Assignee: Ilya Pronin
> Priority: Minor
>
> Currently, when the agent detects that the host was rebooted, it doesn't
> recover agent info. New agent info is not checkpointed until the agent
> successfully registers with a master. If the agent crashes before
> registering, on restart it will recover the old agent info that was
> checkpointed before the host reboot.
> This can lead to problems. E.g. the agent may flap due to incompatible agent
> info if its resources change after the reboot. Or the use of the old
> agent ID in the reregistration process may cause crashes like MESOS-7432.
> We can remove the "latest" symlink when we detect that the current boot ID
> is different from the checkpointed one, in order to prevent the agent from
> recovering stale info after we checkpoint the new boot ID. Or we can
> postpone boot ID checkpointing until we have checkpointed the new agent info.
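
The first option in the description could be sketched roughly as follows.
This is only an illustration, not the actual Mesos agent code: the helper
names (readFile, dropStaleLatest) are hypothetical, and the on-disk layout
(a "boot_id" file and a "slaves/latest" symlink under the meta directory) is
an assumption made for the example. The point it tries to show is the
ordering: the symlink is removed before the new boot ID is written, so a
crash between the two steps still leaves the reboot detectable on the next
start.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: returns the file's contents, or an empty string if
// the file is missing or unreadable.
std::string readFile(const fs::path& path) {
  std::ifstream in(path);
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}

// Hypothetical sketch of "remove the 'latest' symlink on reboot".
// Assumed layout: <metaDir>/boot_id and <metaDir>/slaves/latest.
// Returns true if a reboot was detected and the stale symlink was dropped.
bool dropStaleLatest(const fs::path& metaDir,
                     const std::string& currentBootId) {
  const fs::path bootIdPath = metaDir / "boot_id";
  const fs::path latest = metaDir / "slaves" / "latest";

  const std::string checkpointed = readFile(bootIdPath);
  if (checkpointed.empty() || checkpointed == currentBootId) {
    return false;  // No checkpointed boot ID, or no reboot: nothing to do.
  }

  // Step 1: remove the symlink itself (its target directory is untouched).
  // A missing symlink is not an error here.
  std::error_code ec;
  fs::remove(latest, ec);
  if (ec) {
    return false;  // Keep the old boot ID so the reboot is detected again.
  }

  // Step 2: only now checkpoint the new boot ID.
  std::ofstream out(bootIdPath, std::ios::trunc);
  out << currentBootId;
  return true;
}
```

With this ordering, an agent that crashes before registering would find no
"latest" symlink on restart and so would have no old agent info to recover.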