Change By: Arcadiy Ivanov (19/Nov/14 10:33 PM)
Description: Whenever a remote connection to a NIO JNLP slave is abruptly terminated, Channel.terminate() does not clean up the exported FastPipedOutputStream. As a result the FastPipedInputStream leaks and loops forever waiting for its buffer to be filled.

Example:

{noformat}
"Channel reader thread: Channel to Maven [/home/ec2-user/devhome/current/jdk/bin/java, -Xmx1g, -XX:MaxPermSize=1g, -Djava.awt.headless=true, -cp, /home/ec2-user/maven31-agent.jar:/home/ec2-user/devhome/current/maven/boot/plexus-classworlds-2.5.1.jar:/home/ec2-user/devhome/current/maven/conf/logging, jenkins.maven3.agent.Maven31Main, /home/ec2-user/devhome/current/maven, /tmp/slave.jar, /home/ec2-user/maven31-interceptor.jar, /home/ec2-user/maven3-interceptor-commons.jar, 26853]" prio=10 tid=0x00002b143d5a6800 nid=0xf0cb in Object.wait() [0x00002b1434e0c000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x000000070559dce8> (a [B)
        at hudson.remoting.FastPipedInputStream.read(FastPipedInputStream.java:175)
        - locked <0x000000070559dce8> (a [B)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
        - locked <0x000000070559c378> (a java.io.BufferedInputStream)
        at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82)
        at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72)
        at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103)
        at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33)
        at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

"Executor #0 for Primary Koji Slave Build Machine (i-26ed9ecc) : executing Koji - WildFly #13 / waiting for hudson.remoting.Channel@5f9da360:Channel to Maven [/home/ec2-user/devhome/current/jdk/bin/java, -Xmx1g, -XX:MaxPermSize=1g, -Djava.awt.headless=true, -cp, /home/ec2-user/maven31-agent.jar:/home/ec2-user/devhome/current/maven/boot/plexus-classworlds-2.5.1.jar:/home/ec2-user/devhome/current/maven/conf/logging, jenkins.maven3.agent.Maven31Main, /home/ec2-user/devhome/current/maven, /tmp/slave.jar, /home/ec2-user/maven31-interceptor.jar, /home/ec2-user/maven3-interceptor-commons.jar, 26853]" prio=10 tid=0x00002b144b97f000 nid=0xf0bf in Object.wait() [0x00002b1436624000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x000000070559d660> (a hudson.remoting.UserRequest)
        at hudson.remoting.Request.call(Request.java:146)
        - locked <0x000000070559d660> (a hudson.remoting.UserRequest)
        at hudson.remoting.Channel.call(Channel.java:751)
        at hudson.maven.ProcessCache$MavenProcess.call(ProcessCache.java:161)
        at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.doRun(MavenModuleSetBuild.java:840)
        at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:533)
        at hudson.model.Run.execute(Run.java:1745)
        at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:529)
        at hudson.model.ResourceController.execute(ResourceController.java:89)
        at hudson.model.Executor.run(Executor.java:240)
{noformat}

Slave computer log:
{noformat}
JNLP agent connected from /10.40.1.117
<===[JENKINS REMOTING CAPACITY]===>Slave.jar version: 2.47
This is a Unix slave
Slave successfully connected and online
JNLP agent connected
ERROR: Connection terminated
java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@f4c5cfc[name=Primary Koji Slave Build Machine (i-26ed9ecc)]
at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:211)
at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:631)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
at org.jenkinsci.remoting.nio.FifoBuffer$Pointer.receive(FifoBuffer.java:136)
at org.jenkinsci.remoting.nio.FifoBuffer.receive(FifoBuffer.java:306)
at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:564)
... 6 more
{noformat}

Discussion:

The problem, ultimately, lies in how the Channel is terminated.

# NioTransport.abort(e) calls ByteArrayReceiver.terminate(e)
# ByteArrayReceiver.terminate(e) calls CommandReceiver.terminate(e)
# CommandReceiver is constructed as an anonymous type in the Channel constructor, in transportSetup, and redirects the call to Channel.this.terminate(e).
# Channel.this.terminate(e) does NOT clean up any of the exported streams in the ExportTable.
# Since the channel does not remove the FastPipedOutputStream from the ExportTable, FastPipedInputStream.source() always returns a reference to the FastPipedOutputStream, and the IN side of the pipe waits on the buffer forever.
# Since FastPipedInputStream.read() waits for data in the buffer forever, the SynchronousCommandTransport$ReaderThread never terminates.
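
The steps above can be illustrated with a minimal, self-contained sketch. It does not use the real hudson.remoting classes: PipeOut, PipeIn, Channel and every method name below are hypothetical stand-ins. It assumes only what is described above: the reader tracks its writer through a weak reference and polls an empty buffer with timed waits. The polling loop is capped at a few iterations so that the sketch terminates instead of hanging. It is one possible way to picture the leak and a cleanup on terminate, not the actual fix.

```java
import java.io.IOException;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

final class PipeLeakSketch {

    /** Hypothetical writer side; only its reachability and close() matter here. */
    static final class PipeOut {
        private final PipeIn in;

        PipeOut(PipeIn in) {
            this.in = in;
            in.writer = new WeakReference<>(this);
        }

        /** Marks the pipe closed and wakes up a waiting reader. */
        void close() {
            synchronized (in.lock) {
                in.closed = true;
                in.lock.notifyAll();
            }
        }
    }

    /** Hypothetical reader side that polls a buffer which never receives data. */
    static final class PipeIn {
        final Object lock = new Object();
        WeakReference<PipeOut> writer;
        boolean closed;

        /**
         * Returns -1 on end of stream, or -2 if maxPolls timed waits elapse while
         * the writer is still reachable and open (a bounded stand-in for the hang).
         */
        int read(int maxPolls) throws IOException, InterruptedException {
            synchronized (lock) {
                for (int i = 0; i < maxPolls; i++) {
                    if (closed) return -1;
                    if (writer.get() == null) throw new IOException("writer abandoned");
                    lock.wait(10); // timed wait, like the TIMED_WAITING frame in the dump
                }
                return closed ? -1 : -2;
            }
        }
    }

    /** Hypothetical channel whose export list holds strong references to writers. */
    static final class Channel {
        private final List<PipeOut> exported = new ArrayList<>();

        void export(PipeOut out) { exported.add(out); }

        int exportedCount() { return exported.size(); }

        /** Terminates without touching the exports, as the report describes. */
        void terminateLeaky() { /* exported writers stay reachable and open */ }

        /** Terminates and closes every exported writer (one possible cleanup). */
        void terminateWithCleanup() {
            for (PipeOut out : exported) out.close();
            exported.clear();
        }
    }

    /** Runs one scenario and reports what the reader saw after termination. */
    static String demo(boolean cleanup) {
        PipeIn in = new PipeIn();
        Channel channel = new Channel();
        channel.export(new PipeOut(in));
        if (cleanup) channel.terminateWithCleanup(); else channel.terminateLeaky();
        try {
            int result = in.read(3);
            return "read=" + result + ", exported=" + channel.exportedCount();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // read=-2, exported=1
        System.out.println(demo(true));  // read=-1, exported=0
    }
}
```

In the leaky case the export list keeps the writer strongly reachable, so the weak reference never clears and the reader only ever times out and retries. Closing the exported writers during termination lets the reader observe end of stream and return.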