I ran jstack on Jenkins, and many of the threads had state BLOCKED. However, most of the threads are also BLOCKED right after a restart, so I am not sure whether that is the issue here.
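To compare a dump taken right after a restart with one taken when the agents drop off, a small script along these lines could tally the thread states. This is only a minimal sketch: it assumes the dump was saved to a text file first (for example with `jstack <pid> > dump.txt`), and the function name `count_thread_states` is made up for illustration.

```python
import re
from collections import Counter

# jstack prints one "java.lang.Thread.State: <STATE>" line for each Java thread.
STATE_LINE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")


def count_thread_states(dump_text):
    """Return a Counter mapping each thread state to the number of threads in it."""
    return Counter(STATE_LINE.findall(dump_text))


def format_summary(counts):
    """Render the counts as aligned text lines, most common state first."""
    return "\n".join(f"{state:<15}{n}" for state, n in counts.most_common())
```

Running `print(format_summary(count_thread_states(open("dump.txt").read())))` on both dumps would show whether the share of BLOCKED threads actually changes, or whether only the total grows.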
After a restart, Jenkins starts with approximately 200 threads open. When I had the problem with disconnected agents, the thread count had reached 500.

On Wednesday, July 17, 2019 at 12:40:14 UTC+2, Sverre Moe wrote:
>
> It seems to be the monitoring that gets the agents disconnected.
>
> I got this in my log file the last time they got disconnected.
>
> Jul 17, 2019 11:58:22 AM hudson.init.impl.InstallUncaughtExceptionHandler$DefaultUncaughtExceptionHandler uncaughtException
> SEVERE: A thread (Timer-3450/103166) died unexpectedly due to an uncaught exception, this may leave your Jenkins in a bad way and is usually indicative of a bug in the code.
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:717)
>         at java.util.Timer.<init>(Timer.java:160)
>         at java.util.Timer.<init>(Timer.java:132)
>         at org.jenkinsci.plugins.ssegateway.sse.EventDispatcher.scheduleRetryQueueProcessing(EventDispatcher.java:296)
>         at org.jenkinsci.plugins.ssegateway.sse.EventDispatcher.processRetries(EventDispatcher.java:437)
>         at org.jenkinsci.plugins.ssegateway.sse.EventDispatcher$1.run(EventDispatcher.java:299)
>         at java.util.TimerThread.mainLoop(Timer.java:555)
>         at java.util.TimerThread.run(Timer.java:505)
>
> Jul 17, 2019 11:58:31 AM hudson.init.impl.InstallUncaughtExceptionHandler$DefaultUncaughtExceptionHandler uncaughtException
> SEVERE: A thread (Thread-30062/98187) died unexpectedly due to an uncaught exception, this may leave your Jenkins in a bad way and is usually indicative of a bug in the code.
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:717)
>         at com.trilead.ssh2.transport.TransportManager.sendAsynchronousMessage(TransportManager.java:649)
>         at com.trilead.ssh2.channel.ChannelManager.msgChannelRequest(ChannelManager.java:1213)
>         at com.trilead.ssh2.channel.ChannelManager.handleMessage(ChannelManager.java:1466)
>         at com.trilead.ssh2.transport.TransportManager.receiveLoop(TransportManager.java:809)
>         at com.trilead.ssh2.transport.TransportManager$1.run(TransportManager.java:502)
>         at java.lang.Thread.run(Thread.java:748)
>
> Now I have a catastrophic failure: I cannot relaunch any agents anymore.
>
> [07/17/19 12:04:10] [SSH] Opening SSH connection to jbssles120x64r12.spacetec.no:22.
> ERROR: Unexpected error in launching a agent. This is probably a bug in Jenkins.
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:717)
>         at com.trilead.ssh2.transport.TransportManager.initialize(TransportManager.java:545)
>         at com.trilead.ssh2.Connection.connect(Connection.java:774)
>         at hudson.plugins.sshslaves.SSHLauncher.openConnection(SSHLauncher.java:817)
>         at hudson.plugins.sshslaves.SSHLauncher$1.call(SSHLauncher.java:419)
>         at hudson.plugins.sshslaves.SSHLauncher$1.call(SSHLauncher.java:406)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> [07/17/19 12:04:10] Launch failed - cleaning up connection
> [07/17/19 12:04:10] [SSH] Connection closed.
>
> My Jenkins server has over 500 threads open:
> Threads: 506 total, 0 running, 506 sleeping, 0 stopped, 0 zombie
>
> On Wednesday, July 17, 2019 at 10:24:12 UTC+2, Sverre Moe wrote:
>>
>> We have had two blissful days of stable Jenkins. Today two nodes are disconnected and they will not come back online.
>>
>> What is strange is that it is the same two or three nodes every time.
>> Running disconnect on them through the URL
>> http://jenkins.example.com/jenkins/computer/NODE_NAME/disconnect
>> does not work.
>> I have to enter the configuration, save, and then relaunch to get them up and running.
>>
>> I tried setting the ulimit values as suggested in
>> https://support.cloudbees.com/hc/en-us/articles/222446987-Prepare-Jenkins-for-Support#bulimitsettingsjustforlinuxos
>>
>> I have also added additional JVM options as suggested in
>> https://support.cloudbees.com/hc/en-us/articles/222446987-Prepare-Jenkins-for-Support#ajavaparameters
>> https://go.cloudbees.com/docs/solutions/jvm-troubleshooting/
>>
>> The Jenkins server currently has 265 threads. Yesterday, when all was fine, this was up to 300.
>>
>> Maybe related, maybe not:
>> When this happens, some builds on other nodes stop working. They are aborted, but still show as running. The only thing that works is deleting the agent and creating it again, or restarting Jenkins.
>>
>> On Sunday, July 14, 2019 at 13:31:51 UTC+2, Sverre Moe wrote:
>>>
>>> I suspected it might be related, but was not sure.
>>>
>>> The odd thing is that this only started being a problem a week ago. As far as I can see, nothing has changed on the Jenkins server.
>>>
>>> On Saturday, July 13, 2019 at 13:04:44 UTC+2, Ivan Fernandez Calvo wrote:
>>>>
>>>> I saw that you have another question related to OOM errors in Jenkins. If it is the same instance, this is your real issue with the agents: until you have a stable Jenkins instance, the agent disconnections will be a side effect.
