Deployment script retries are brain-dead
----------------------------------------
Key: LIBCLOUD-157
URL: https://issues.apache.org/jira/browse/LIBCLOUD-157
Project: Libcloud
Issue Type: Bug
Components: Core
Affects Versions: 0.8.0
Reporter: Mark Nottingham
In common/base, NodeDriver._run_deployment_script has the following retry
wrapper:
    tries = 0
    while tries < max_tries:
        try:
            node = task.run(node, ssh_client)
        except Exception:
            tries += 1
            if tries >= max_tries:
                raise LibcloudError(value='Failed after %d tries'
                                    % (max_tries), driver=self)
        else:
            ssh_client.close()
            return node
The except Exception swallows *all* errors, making debugging very hard.
Furthermore, max_tries is effectively hard-coded in deploy_node():
    self._run_deployment_script(task=kwargs['deploy'],
                                node=node,
                                ssh_client=ssh_client,
                                max_tries=3)
... forcing people who want to control retries to roll their own deploy_node().
Suggestions:
- at a minimum, log or warn about the error that's caught in the retry loop
- better yet, make the catch more fine-grained, so that errors we know are
  not retryable fail immediately
- think about making the default number of max_tries 1
- make max_tries controllable from deploy_node
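
To make the suggestions above concrete, here is a minimal standalone sketch of what such a retry loop might look like. It is not a patch against Libcloud: DeploymentError is a hypothetical stand-in for LibcloudError, RETRYABLE_ERRORS is an assumed (and debatable) choice of which exceptions are worth retrying, and the function is written as a free function rather than a NodeDriver method so it can run on its own.

```python
import logging

logger = logging.getLogger(__name__)


class DeploymentError(Exception):
    """Hypothetical stand-in for LibcloudError, for illustration only."""


# Assumption: only OS/network-level failures are treated as transient.
# Anything else (e.g. a bug in the deployment task) propagates untouched.
RETRYABLE_ERRORS = (OSError,)


def run_deployment_script(task, node, ssh_client, max_tries=1):
    """Run task.run(node, ssh_client), retrying only on RETRYABLE_ERRORS."""
    if max_tries < 1:
        raise ValueError('max_tries must be at least 1')

    for attempt in range(1, max_tries + 1):
        try:
            node = task.run(node, ssh_client)
        except RETRYABLE_ERRORS as exc:
            # Log every caught error so the cause is never silently lost.
            logger.warning('Deployment attempt %d of %d failed: %r',
                           attempt, max_tries, exc)
            if attempt == max_tries:
                # Include the last error in the message to aid debugging.
                raise DeploymentError('Failed after %d tries: %r'
                                      % (max_tries, exc))
        else:
            ssh_client.close()
            return node
```

With something along these lines, deploy_node() could forward a caller-supplied value, for instance kwargs.get('max_tries', 1), instead of the literal 3. As in the original loop, the SSH client is closed only on success; whether it should also be closed on failure is left open here.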