Hi Ashley
> Any insight as to why this is happening? Is this a deliberate implementation decision? Is there a better way around this?
I haven't had a chance to look at your scenario in detail, but this sounds very similar to a common problem related to a nohup/ssh race condition.
Basically, what can happen in many remote automation scenarios is as follows:
* You open an SSH connection to a box and invoke something like a "service start" script
* The "service start" script returns as soon as it has run its final command, which is something like "nohup /my/service/run.sh &"
Now there is a race between nohup doing its job and SSH closing the connection. More specifically, there is a short window before nohup has made the service process immune to the hangup signal (SIGHUP) that is delivered when the session ends. If the sshd process terminates before that has happened, it kills the service, which is (still) a child process of sshd.
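To make that concrete, a start script with this problem might look something like the sketch below (the paths and service name are made up for illustration):

```sh
#!/bin/sh
# start.sh: hypothetical "service start" script exhibiting the race.
# It backgrounds the service under nohup and returns immediately,
# before nohup has necessarily shielded the process from SIGHUP.
nohup /my/service/run.sh > /var/log/myservice.log 2>&1 &
```

Invoked as something like "ssh host /my/service/start.sh", the script returns at once, sshd starts tearing down the session, and the race is on.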
The solution here is to ensure that nohup has had a chance to kick in before the sshd process terminates. Probably the best way to do that is to make your service start script wait until the service is actually up before returning.
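A sketch of what that could look like, assuming the service writes a PID file (the file path and the 10-second timeout are assumptions, not details from your setup):

```sh
#!/bin/sh
# start.sh: hypothetical start script that only returns once the
# service is actually up, closing the race window described above.
nohup /my/service/run.sh > /var/log/myservice.log 2>&1 &

# Wait up to 10 seconds for the service to come up. Here "up" means a
# live process behind a PID file; polling a TCP port or a health
# endpoint would work just as well.
i=0
while [ "$i" -lt 10 ]; do
    if [ -f /var/run/myservice.pid ] \
       && kill -0 "$(cat /var/run/myservice.pid)" 2>/dev/null; then
        exit 0
    fi
    sleep 1
    i=$((i + 1))
done

echo "service did not come up within 10 seconds" >&2
exit 1
```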
Another common way is to simply change the command you're running via SSH from "service start" to "service start && sleep 2", although all that's really doing is giving nohup two more seconds to do its job.
Without knowing more about what your service script is doing, I can't say whether this is actually what you're seeing, but the symptoms certainly sound comparable.
Regards,
ap