Gastón Kleiman created MESOS-9573:
-------------------------------------
Summary: Agent should not try to recover operation status update
streams that haven't been created yet.
Key: MESOS-9573
URL: https://issues.apache.org/jira/browse/MESOS-9573
Project: Mesos
Issue Type: Bug
Components: agent
Reporter: Gastón Kleiman
If the agent fails over after having checkpointed a new operation but before
the operation status update stream is created, the recovery process will fail.
This happens because the agent tries to recover operation status update
streams even if they haven't been created yet.
To prevent recovery failures, the agent should obtain the ids of the
streams to recover by walking the directory in which operation status update
streams are stored.
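
The directory walk proposed above could look roughly like the following
sketch. It is written in Python for brevity and is not the agent's actual
code. The on-disk layout (one subdirectory per operation UUID, containing a
file named `operation.updates`) and the names `UPDATES_FILE` and
`discover_stream_ids` are assumptions made for illustration only.

```python
import tempfile
import uuid
from pathlib import Path
from typing import Set

# Hypothetical name of the file that backs a status update stream.
UPDATES_FILE = "operation.updates"


def discover_stream_ids(operations_dir: Path) -> Set[uuid.UUID]:
    """Return the ids of the streams that actually exist on disk.

    Assumed layout: <operations_dir>/<operation uuid>/operation.updates
    """
    ids: Set[uuid.UUID] = set()
    if not operations_dir.is_dir():
        # Nothing was ever checkpointed, so there is nothing to recover.
        return ids

    for entry in operations_dir.iterdir():
        if not entry.is_dir():
            continue
        try:
            operation_id = uuid.UUID(entry.name)
        except ValueError:
            # Not an operation directory; ignore it.
            continue
        # The operation may have been checkpointed before its stream
        # was created. Only report streams whose file exists.
        if (entry / UPDATES_FILE).is_file():
            ids.add(operation_id)
    return ids
```

With this approach, an operation that was checkpointed without a stream is
simply absent from the result, so recovery never attempts to open a stream
file that does not exist.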
The agent should also garbage collect streams if the checkpointed state doesn't
contain a corresponding operation.
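
A minimal sketch of how recovery and garbage collection could be split,
again in Python and under assumed names: `plan_recovery` and
`garbage_collect` are hypothetical helpers, stream ids are plain strings,
and each stream is assumed to live in its own subdirectory named after
its id. Removing the whole subdirectory is also an assumption; the real
agent may need to be more selective.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Set, Tuple


def plan_recovery(
    on_disk: Set[str], checkpointed: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Split the stream ids found on disk into (to_recover, orphaned).

    Ids that are checkpointed but have no stream on disk appear in
    neither set: their streams were never created, so they are skipped.
    """
    to_recover = on_disk & checkpointed
    orphaned = on_disk - checkpointed
    return to_recover, orphaned


def garbage_collect(operations_dir: Path, orphaned: Set[str]) -> None:
    """Delete the directories of streams that have no known operation."""
    for stream_id in orphaned:
        shutil.rmtree(operations_dir / stream_id, ignore_errors=True)
```

For example, with streams `a` and `b` on disk and operations `b` and `c`
in the checkpointed state, `b` is recovered, `a` is garbage collected, and
`c` is skipped because its stream was never created.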
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)