Hello,
One of the issues we have recently been experiencing with Jenkins is that the
slaves (nodes) go offline for no apparent reason and do not reconnect
automatically.
When a slave appears offline, we try to launch/reconnect it manually, but that
does not work either. However, we are still able to SSH into the machine using
PuTTY.
The only workaround is to restart the Jenkins server, which helps until the
problem surfaces again (typically after about a week).
Instance Information
--------------------
Jenkins Server: 1.562
SSH Credentials Plugin: 1.6.1
SSH Slaves Plugin: 1.6
Thread dump for the slave nodes:
{dump}
"Channel reader thread: qa-linbuild-02" prio=5 WAITING
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:485)
com.trilead.ssh2.channel.ChannelManager.waitUntilChannelOpen(ChannelManager.java:109)
com.trilead.ssh2.channel.ChannelManager.openSessionChannel(ChannelManager.java:583)
com.trilead.ssh2.Session.<init>(Session.java:41)
com.trilead.ssh2.Connection.openSession(Connection.java:1129)
com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:99)
com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:119)
hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:1160)
hudson.slaves.SlaveComputer$2.onClosed(SlaveComputer.java:437)
hudson.remoting.Channel.terminate(Channel.java:819)
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:76)
"Channel reader thread: qa-linbuild-03" prio=5 WAITING
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:485)
com.trilead.ssh2.channel.ChannelManager.waitUntilChannelOpen(ChannelManager.java:109)
com.trilead.ssh2.channel.ChannelManager.openSessionChannel(ChannelManager.java:583)
com.trilead.ssh2.Session.<init>(Session.java:41)
com.trilead.ssh2.Connection.openSession(Connection.java:1129)
com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:99)
com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:119)
hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:1160)
hudson.slaves.SlaveComputer$2.onClosed(SlaveComputer.java:437)
hudson.remoting.Channel.terminate(Channel.java:819)
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:76)
{dump}
Also concerning is the number of threads in the BLOCKED state (126!).
This doesn't seem normal, as there are no BLOCKED threads right after the
server is restarted.
{dump}
// 118 instances
"Computer.threadPoolForRemoting [#26]" daemon prio=5 BLOCKED
hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:1152)
hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:542)
jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
java.util.concurrent.FutureTask.run(FutureTask.java:138)
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
java.lang.Thread.run(Thread.java:662)
// 8 instances
"Computer.threadPoolForRemoting [#2922]" daemon prio=5 BLOCKED
hudson.plugins.sshslaves.SSHLauncher.launch(SSHLauncher.java:639)
hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222)
jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
java.util.concurrent.FutureTask.run(FutureTask.java:138)
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
java.lang.Thread.run(Thread.java:662)
{dump}
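In case it helps anyone reproduce the counts above from a saved dump, here is a
minimal sketch that groups threads by state and topmost stack frame. It assumes
the dump looks like the excerpts in this message: a header line that starts
with a double quote and ends with the thread state, followed directly by the
stack frames. The class name ThreadDumpSummary and the method summarize are
made up for illustration; they are not part of Jenkins or any plugin.

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadDumpSummary {

    /**
     * Counts threads per "STATE @ top frame".
     * Assumption: each thread begins with a header line that starts with a
     * double quote and ends with the state name; the next non-empty line is
     * its topmost frame. Threads without any frames are not counted.
     */
    static Map<String, Integer> summarize(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        String pendingState = null; // state of a header whose top frame is still unseen
        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("\"")) {
                // Header line: the state is the last space-separated token.
                pendingState = line.substring(line.lastIndexOf(' ') + 1);
            } else if (pendingState != null && !line.isEmpty()) {
                // First frame after a header: record it, skip the deeper frames.
                counts.merge(pendingState + " @ " + line, 1, Integer::sum);
                pendingState = null;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Shortened sample taken from the dump excerpts in this message.
        String sample = String.join("\n",
            "\"Computer.threadPoolForRemoting [#26]\" daemon prio=5 BLOCKED",
            "hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:1152)",
            "hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:542)",
            "\"Computer.threadPoolForRemoting [#2922]\" daemon prio=5 BLOCKED",
            "hudson.plugins.sshslaves.SSHLauncher.launch(SSHLauncher.java:639)");
        summarize(sample).forEach((key, count) -> System.out.println(count + " x " + key));
    }
}
```

Run against the full dump text, this should make it easy to see which frames
the BLOCKED threads pile up on, and whether the numbers grow over time.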
Looking forward to any ideas or suggestions.
Thank you.
Charles Chan