Hello,

Are you trying to write a fix for this? If not, please use the users ML for
such discussions.
If you're trying to fix it, do not hesitate to ask for feedback/help about
specific code changes in Jenkins core or a plugin.

Thanks!

Le mar. 6 nov. 2018 à 23:07, 'Daniel Weber' via Jenkins Developers <
jenkinsci-dev@googlegroups.com> a écrit :

> Hi all,
>
> we are trying to find the root cause for ChannelClosedExceptions we keep
> getting in about 10% of our builds (which is far too often).
>
> We use one-time agents deployed in AWS (instances are created and
> destroyed for each build) in this scenario. I set the default log level to
> FINEST on the agents and found this
>
> Nov 06, 2018 5:38:57 PM
> org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader run
> FINER: [JNLP4-connect connection to frcv001/10.179.171.222:49394]
> Unexpected I/O exception
> java.io.IOException: A connection attempt failed because the connected
> party did not properly respond after a period of time, or established
> connection failed because connected host has failed to respond
> at sun.nio.ch.SocketDispatcher.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(Unknown Source)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
> at sun.nio.ch.IOUtil.read(Unknown Source)
> at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
> at hudson.remoting.SocketChannelStream$1.read(SocketChannelStream.java:35)
> at sun.nio.ch.ChannelInputStream.read(Unknown Source)
> at sun.nio.ch.ChannelInputStream.read(Unknown Source)
> at sun.nio.ch.ChannelInputStream.read(Unknown Source)
> at java.nio.channels.Channels$ReadableByteChannelImpl.read(Unknown Source)
> at
> org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:245)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:93)
> at java.lang.Thread.run(Unknown Source)
>
> Nov 06, 2018 5:38:57 PM org.jenkinsci.remoting.protocol.NetworkLayer
> onRecvClosed
> FINEST: [JNLP4-connect connection to frcv001/10.179.171.222:49394] RECV
> Closed
> Nov 06, 2018 5:38:57 PM org.jenkinsci.remoting.protocol.FilterLayer
> onRecvClosed
> FINEST: [JNLP4-connect connection to frcv001/10.179.171.222:49394] RECV
> Closed
> Nov 06, 2018 5:38:57 PM org.jenkinsci.remoting.protocol.ApplicationLayer
> onRecvClosed
> FINE: [JNLP4-connect connection to frcv001/10.179.171.222:49394] RECV
> Closed
> Nov 06, 2018 5:38:57 PM com.trilead.ssh2.log.Logger log
> FINEST: Received SSH_MSG_CHANNEL_DATA 64511 bytes payload
> Nov 06, 2018 5:38:57 PM com.trilead.ssh2.log.Logger log
> FINEST: Got SSH_MSG_CHANNEL_DATA (channel 100, 64502)
> Nov 06, 2018 5:38:57 PM hudson.remoting.jnlp.Main$CuiListener status
> INFO: Terminated
>
> and then later
>
> FINER: HandleMessage: got shutdown
> Nov 06, 2018 5:38:57 PM hudson.remoting.Request$2 run
> WARNING: Failed to send back a reply to the request
> hudson.remoting.Request$2@503986c5
> hudson.remoting.ChannelClosedException: Channel
> "hudson.remoting.Channel@798747f3:JNLP4-connect connection to frcv001/
> 10.179.171.222:49394": channel is already closed
> at hudson.remoting.Channel.send(Channel.java:717)
> at hudson.remoting.Request$2.run(Request.java:382)
> at
> hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
> at java.util.concurrent.FutureTask.run(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:93)
> at java.lang.Thread.run(Unknown Source)
> Caused by: java.nio.channels.ClosedChannelException
> at
> org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
> at
> org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$1800(BIONetworkLayer.java:48)
> at
> org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:264)
> ... 4 more
>
>
> The agents successfully reconnect after this, but that does not help us
> because at that time, the build has already
> failed and the machine will be destroyed.
>
> I also read this:
> https://support.cloudbees.com/hc/en-us/articles/115001416548-dedicated-JNLP-agents-formerly-slaves-get-disconnected
>
>    - agent.jar is downloaded from the master before startup to make sure
>    they use the correct version
>    - I did not find any log messaged indicating the ping thread
>    terminated the connection
>    - We are currently using Jenkins LTS 2.121.3, Remoting version: 3.21.1
>    - Both master and agents are running under Windows Server 2012.
>
> Any help / hints would be greatly appreciated. I'd also be happy to open
> an issue if you think it makes sense :)
>
> Best Regards,
> Daniel
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to jenkinsci-dev+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/jenkinsci-dev/CA%2Bj7gMQKbij5Oq8nsm%3Do9sBk6nBf6Dk5k74stHAse1ci0%2BkZ1A%40mail.gmail.com
> <https://groups.google.com/d/msgid/jenkinsci-dev/CA%2Bj7gMQKbij5Oq8nsm%3Do9sBk6nBf6Dk5k74stHAse1ci0%2BkZ1A%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/CANWgJS4HQ-P5BAXNpmN1Px2NodWTD-S6JBRqTNnGuQGOHQzp1Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to