"hudson.remoting.ChannelClosedException: channel is already closed" indicates an unexpected loss of connection to the slave. The nested "Caused by: java.io.EOFException" indicates that the slave side has shut down the communication with the master.
The thing is, the communication to the slave (InputStream that Channel reads) is tunneled over several layers, and the way this part of the code discovers the problem is by InputStream.read() returning -1.
This design of InputStream does not allow us to report the underlying cause of the communication problem through a chained exception, so we really can't properly report the root cause.
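To make the limitation concrete, here is a minimal, self-contained sketch. It is not the actual hudson.remoting code: `EofDemo`, `TunnelStream` and `readFully` are made-up names, and the sketch assumes an intermediate layer that can only signal a transport failure as end-of-stream. Once read() has returned -1, the reader has nothing to chain to the EOFException it raises.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class EofDemo {
    /** Hypothetical tunnel layer: after the simulated failure it can only report end-of-stream. */
    static class TunnelStream extends InputStream {
        private final InputStream in;
        private final int failAfter; // bytes delivered before the simulated transport failure
        private int delivered;

        TunnelStream(InputStream in, int failAfter) {
            this.in = in;
            this.failAfter = failAfter;
        }

        @Override
        public int read() throws IOException {
            if (delivered >= failAfter) {
                // The real reason (say, "connection reset") would be known only here,
                // but the int return value has no room to carry it.
                return -1;
            }
            delivered++;
            return in.read();
        }
    }

    /** Reads exactly len bytes, the way a framed protocol reader would. */
    static byte[] readFully(InputStream in, int len) throws IOException {
        byte[] buf = new byte[len];
        for (int i = 0; i < len; i++) {
            int b = in.read();
            if (b < 0) {
                // All the reader knows is "no more bytes"; there is no cause to chain.
                throw new EOFException("unexpected end of stream after " + i + " bytes");
            }
            buf[i] = (byte) b;
        }
        return buf;
    }

    public static void main(String[] args) {
        InputStream in = new TunnelStream(new ByteArrayInputStream(new byte[8]), 3);
        try {
            readFully(in, 8);
        } catch (IOException e) {
            // The exception carries a message, but getCause() is null.
            System.out.println(e + " / cause: " + e.getCause());
        }
    }
}
```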
The slave console log does normally capture the last dying message from the slave JVM or transport-level errors, but it gets rotated as soon as the next connection attempt starts. The file is still available under $JENKINS_HOME, but there's no way to look at it from the web UI. Jenkins also auto-reconnects failed slaves pretty aggressively, and it takes some time for someone to notice a build failure caused by ChannelClosedException and try to understand what's going on, which makes the troubleshooting even trickier.
I was just sweeping the ssh-slaves plugin ticket backlog, and there are many reports of this same issue, so this clearly is a gap in the diagnosability of the slave connectivity.
If anyone has a good idea of how to capture the errors, that'd be greatly appreciated.
One approach I'm thinking about is to introduce a proper log rotation mechanism (one that handles LargeText.doProgressText() correctly), and somehow use that to let people scroll back through the slave console log.
Perhaps another possibility is to let the ComputerLauncher record a connection loss as an Exception on a failing Channel.
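A rough sketch of what that second idea could look like, under heavy assumptions: `SketchChannel`, `recordConnectionLoss` and `send` below are hypothetical stand-ins, not the real hudson.remoting.Channel or ComputerLauncher API. The only point is that whoever observes the connection loss stores the cause, and any later use of the channel chains it.

```java
import java.io.IOException;

public class CauseRecordingSketch {
    /** Hypothetical stand-in for a remoting channel; not the real hudson.remoting.Channel. */
    static class SketchChannel {
        private volatile Throwable closeCause; // set by whoever observes the connection loss
        private volatile boolean closed;

        /** Called by the launcher (or transport layer) when it sees the connection die. */
        void recordConnectionLoss(Throwable cause) {
            this.closeCause = cause;
            this.closed = true;
        }

        /** Any later use of the channel reports the recorded cause as a chained exception. */
        void send(String command) throws IOException {
            if (closed) {
                throw new IOException("channel is already closed", closeCause);
            }
            // A real channel would write the command to the wire here.
        }
    }

    public static void main(String[] args) {
        SketchChannel ch = new SketchChannel();
        // Made-up failure message, standing in for whatever the transport reported.
        ch.recordConnectionLoss(new IOException("ssh session dropped: connection reset by peer"));
        try {
            ch.send("delete script file");
        } catch (IOException e) {
            // The stack trace now includes a "Caused by:" section with the recorded cause.
            e.printStackTrace();
        }
    }
}
```

With something along these lines, a build that fails on a closed channel would show the transport-level reason in its own stack trace, without anyone having to dig up the slave log.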
On 04/17/2013 02:41 PM, hajush wrote:
The intermittent failure of slave jobs due to issue 12235 <https://issues.jenkins-ci.org/browse/JENKINS-12235> looks like it might start undoing progress in getting my work teams to adopt Jenkins. Has anyone given any thought to the issue and how to address it? Some folks had luck by increasing the ClientInterval on unix masters - but others did not. I see that late last month Kohsuke increased the pipe window size in hudson.remoting.Channel - though I'm not sure that would address this - and since it's intermittent - it's hard to test.

Here's what our stack trace failure looks like.

FATAL: Unable to delete script file c:\temp\hudson985794291407431615.bat
hudson.util.IOException2: remote file operation failed: c:\temp\hudson985794291407431615.bat at hudson.remoting.Channel@e553b0:vcvmwin061
    at hudson.FilePath.act(FilePath.java:848)
    at hudson.FilePath.act(FilePath.java:825)
    at hudson.FilePath.delete(FilePath.java:1202)
    at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:101)
    at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
    at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
    at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:810)
    at hudson.model.Build$BuildExecution.build(Build.java:199)
    at hudson.model.Build$BuildExecution.doRun(Build.java:160)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:592)
    at hudson.model.Run.execute(Run.java:1543)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:236)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
    at hudson.remoting.Channel.send(Channel.java:494)
    at hudson.remoting.Request.call(Request.java:129)
    at hudson.remoting.Channel.call(Channel.java:672)
    at hudson.FilePath.act(FilePath.java:841)

-- View this message in context: http://jenkins.361315.n4.nabble.com/Any-ideas-how-to-fix-JENKINS-12235-tp4663279.html
Sent from the Jenkins dev mailing list archive at Nabble.com.
-- Kohsuke Kawaguchi | CloudBees, Inc. | http://cloudbees.com/
