GitHub user oddshocks opened a pull request:

    https://github.com/apache/libcloud/pull/331

    Add a delay to SSH connection to fix the deploy_node race condition

    This patch might look a little "hackish", but it has completely solved the 
`deploy_node` race condition for me: I've been using libcloud with it for a 
few days with a 100% success rate. The `timeout` argument for 
`_ssh_client_connect` appears to be insufficient. It is set to 300 seconds by 
default, but the operation fails long before that, so the timeout cannot be 
what fixes the `deploy_node` race condition. *This* fix, however, resolves 
the issue. 60 seconds is more than enough time to get the SSH key installed 
on the node, even with the recent addition of `ssh_alternate_usernames`, 
which we suspect is the cause of this new race condition.
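
    For illustration only, the idea can be sketched roughly as below. This is 
not the actual diff from this pull request: `connect_with_initial_delay` and 
its `connect` argument are hypothetical names, not libcloud APIs, and the 
sketch assumes the delay is applied once, before the connection attempt.

```python
import time


def connect_with_initial_delay(connect, initial_delay=60, sleep=time.sleep):
    """Wait `initial_delay` seconds, then call `connect()` and return its result.

    `connect` is a hypothetical zero-argument callable standing in for the
    SSH connection attempt; it is not part of libcloud.
    """
    # Give the node time to finish installing the SSH key before connecting.
    sleep(initial_delay)
    return connect()
```

    The `sleep` parameter exists only so the wait can be replaced in tests; 
the 60-second default mirrors the value mentioned above.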
    
    Please let me know if there's anything I can do to improve this patch and 
get it merged. This is a critical bug that needs to be resolved quickly.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/oddshocks/libcloud fix-deploy-race-condition

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/libcloud/pull/331.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #331
    
----
commit 05c846285c40bcc8e52e25a247d07315092c1f1d
Author: David Gay <[email protected]>
Date:   2014-06-28T18:41:17Z

    Add a delay to SSH connection to fix the deploy_node race condition

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
