James DeFelice created MESOS-8693:
-------------------------------------
Summary: agent: update_resource_provider w/ identical RP info
should not always force-restart plugin
Key: MESOS-8693
URL: https://issues.apache.org/jira/browse/MESOS-8693
Project: Mesos
Issue Type: Task
Affects Versions: 1.5.0
Reporter: James DeFelice
Currently when the UPDATE_RESOURCE_PROVIDER call is sent to an agent, and the
RP info of the request is identical to that of the running configuration, the
agent force-restarts the related CSI plugin. This is surprising on two counts:
First, because it increases the complexity of the client that wants to ensure
the latest RP configuration is pushed to the agent. A CSI plugin may take a
long time to become ready after being reconfigured. It's likely that a caller
will experience a timeout while waiting for the RP to come into a healthy state
w/ the desired configuration. Upon retrying the update, a client DOES NOT
always wish to restart an ongoing reconfiguration effort, especially for
long-running reconfiguration operations. Mesos should NOT restart the related
CSI plugin by default if the new RP info matches the existing one, and instead
should either return 409 or some other, more appropriate error code (409 would
be nice/consistent, see below).
Second, because it differs from the idempotent nature of the
ADD_RESOURCE_PROVIDER call, which does NOT change the state of the plugin in
case of a duplicate request. The ADD_RESOURCE_PROVIDER call returns a 409
response, which allows callers to simply re-issue redundant requests without
concern for interrupting the state of a running plugin.
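
To illustrate why a 409 on duplicates keeps clients simple, here is a rough
sketch of a retry loop. Everything in it is an assumption made for
illustration: push_rp_config and the send callable are hypothetical names, not
part of any Mesos client, and treating 200 as the success status is a guess.
The only behavior taken from this ticket is that a duplicate request yields
409 without touching the plugin.

```python
def push_rp_config(send, rp_info, attempts=3):
    """Re-issue a resource provider call until the agent acknowledges it.

    `send` is a hypothetical callable that performs the HTTP call and
    returns the status code, or raises TimeoutError if the caller gave
    up waiting. Returns True once the config is known to be in place.
    """
    for _ in range(attempts):
        try:
            status = send(rp_info)
        except TimeoutError:
            # The plugin may still be coming up; re-issuing is only safe
            # if a duplicate request does not restart it.
            continue
        if status == 200:  # assumed success code
            return True
        if status == 409:  # duplicate: the agent already has this RP info
            return True
        raise RuntimeError("unexpected status: %d" % status)
    return False
```

With the current UPDATE_RESOURCE_PROVIDER behavior, the retry in this loop
would restart the plugin each time, which is the problem described above.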
In the event that a caller DOES want to force the restart of an underlying CSI
plugin, I suggest that we extend the UPDATE_RESOURCE_PROVIDER call w/ a
"force_restart" field (sibling to the "info" field). "force_restart == true"
would only have meaning for updates that involve unchanged RP info; otherwise
it would go unused.
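
A minimal sketch of the decision I have in mind, written in Python only for
brevity (the agent itself is C++). The names Outcome and decide_update are
hypothetical, RP info is modeled as a plain dict, and CONFLICT stands in for
"409 or some other, more appropriate error code":

```python
from enum import Enum


class Outcome(Enum):
    RECONFIGURE = "reconfigure"      # RP info changed: apply the new config
    CONFLICT = "conflict"            # RP info unchanged, no force: e.g. 409
    FORCE_RESTART = "force_restart"  # RP info unchanged, restart requested


def decide_update(current_info, requested_info, force_restart=False):
    """Hypothetical handling of UPDATE_RESOURCE_PROVIDER on the agent."""
    if requested_info != current_info:
        # force_restart goes unused when the info actually changes.
        return Outcome.RECONFIGURE
    if force_restart:
        return Outcome.FORCE_RESTART
    # Identical info and no force: leave the running plugin alone.
    return Outcome.CONFLICT
```

The default of force_restart=False is what makes a blind retry of the same
update safe for a plugin that is still starting up.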
/cc [~jieyu] [~chhsia0]