[ https://issues.apache.org/jira/browse/VCL-503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andy Kurth resolved VCL-503.
----------------------------
Resolution: Fixed
The timeout was added a while back. This issue also covered making SSH
connections persistent for the lifetime of a forked vcld process. OS.pm contains
code to do this, but it is not stable enough to use in production. A separate
issue will be created for persistent SSH connections.
> Add timeout to hung SSH processes
> ---------------------------------
>
> Key: VCL-503
> URL: https://issues.apache.org/jira/browse/VCL-503
> Project: VCL
> Issue Type: Improvement
> Components: vcld (backend)
> Affects Versions: 2.2.1
> Reporter: Andy Kurth
> Assignee: Andy Kurth
> Fix For: 2.4
>
>
> SSH processes issued from the management node to the computer being loaded
> occasionally hang for a very long time or indefinitely. This causes the
> reservation process to hang.
> This problem usually occurs soon after the computer begins to respond to SSH
> after it has been reloaded. vcld detects that it is responding and begins to
> issue commands. The SSH service/daemon is probably still being initialized
> on the computer. The SSH command hangs rather than fails because it makes an
> initial connection, a hiccup occurs, and then the SSH service on the computer
> runs normally. Setting SSH options such as ServerAlive* or TCPKeepAlive doesn't
> help because the computer still responds to these messages.
> Code should be added to time out the SSH command process after a configurable
> amount of time.
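
The requested behavior can be sketched as a wall-clock limit around the SSH child process. This is a minimal illustration in Python, not the vcld (Perl) implementation; the names run_with_timeout and build_ssh_command are hypothetical, and the host and timeout values are placeholders. The limit is enforced by the caller rather than by SSH itself, since the description notes that keepalive options do not catch this hang.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run a command, killing it if it exceeds timeout_seconds.

    Returns (exit_status, output). exit_status is None when the
    process was killed because the time limit was reached.
    """
    try:
        completed = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired as exc:
        # subprocess.run() kills the child before re-raising, so no
        # hung process is left behind; keep whatever output was captured.
        partial = exc.output or b""
        return None, partial.decode(errors="replace")
    return completed.returncode, completed.stdout.decode(errors="replace")


def build_ssh_command(host, remote_command, connect_timeout=30):
    """Hypothetical helper: assemble an ssh argument list.

    ConnectTimeout only bounds connection setup; the wall-clock limit
    in run_with_timeout covers a hang after the connection is made.
    """
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", f"ConnectTimeout={connect_timeout}",
        host,
        remote_command,
    ]


if __name__ == "__main__":
    # Placeholder host and limit; both would come from configuration.
    status, output = run_with_timeout(
        build_ssh_command("node.example.org", "uptime"), timeout_seconds=60
    )
    if status is None:
        print("SSH command timed out", file=sys.stderr)
    else:
        print(status, output)
```

A caller would treat a None status as a failed attempt and decide whether to retry, which keeps the reservation process from blocking on a single hung command.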