[JIRA] [core] (JENKINS-35477) Important memory usage if a slave running on Java 6 tries to connect on Jenkins core >= 1.612

[email protected] (JIRA) Wed, 08 Jun 2016 09:05:21 -0700


Arnaud Héritier created an issue

JENKINS-35477

Issue Type:	Bug
Assignee:	Unassigned
Attachments:	Screen Shot 2016-05-31 at 14.36.35.png, Screen Shot 2016-05-31 at 14.38.24.png, Screen Shot 2016-05-31 at 14.40.01.png, Screen Shot 2016-05-31 at 14.41.41.png
Components:	core
Created:	2016/Jun/08 4:04 PM
Priority:	Major
Reporter:	Arnaud Héritier

Since Jenkins core 1.612, Java 7 is required for both core and agents. During a migration, a user may forget to upgrade the JVM of an agent. This configuration is not supported, but the annoying part is that it causes significant memory consumption, because the connection fails repeatedly with a Java compatibility error which isn't correctly caught. The problem was originally

Here is the analysis done by stephenconnolly:

I suspect that the J6 may be causing other leaks as it is probably blowing up in unexpected places

May 31, 2016 2:21:13 PM hudson.TcpSlaveAgentListener$ConnectionHandler run
INFO: Accepted connection #64 from /127.0.0.1:54507
May 31, 2016 2:21:13 PM hudson.TcpSlaveAgentListener$ConnectionHandler run
WARNING: Connection #64 failed
java.io.IOException: Remote call on jnlp failed
    at hudson.remoting.Channel.call(Channel.java:789)
    at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:508)
    at jenkins.slaves.JnlpSlaveAgentProtocol$Handler.jnlpConnect(JnlpSlaveAgentProtocol.java:126)
    at jenkins.slaves.DefaultJnlpSlaveReceiver.handle(DefaultJnlpSlaveReceiver.java:70)
    at jenkins.slaves.JnlpSlaveAgentProtocol2$Handler2.run(JnlpSlaveAgentProtocol2.java:57)
    at jenkins.slaves.JnlpSlaveAgentProtocol2.handle(JnlpSlaveAgentProtocol2.java:30)
    at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:156)
Caused by: java.lang.ClassFormatError: Failed to load hudson.slaves.SlaveComputer$SlaveVersion
    at hudson.remoting.RemoteClassLoader.loadClassFile(RemoteClassLoader.java:340)
    at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:251)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:249)
    at hudson.remoting.MultiClassLoaderSerializer$Input.resolveClass(MultiClassLoaderSerializer.java:114)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1591)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1496)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1750)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
    at hudson.remoting.UserRequest.deserialize(UserRequest.java:184)
    at hudson.remoting.UserRequest.perform(UserRequest.java:98)
    at hudson.remoting.UserRequest.perform(UserRequest.java:48)
    at hudson.remoting.Request$2.run(Request.java:326)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at hudson.remoting.Engine$1$1.run(Engine.java:62)
    at java.lang.Thread.run(Thread.java:695)
    at ......remote call to jnlp(Native Method)
    at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1416)
    at hudson.remoting.UserResponse.retrieve(UserRequest.java:220)
    at hudson.remoting.Channel.call(Channel.java:781)
    ... 6 more
Caused by: java.lang.UnsupportedClassVersionError: hudson/slaves/SlaveComputer$SlaveVersion : Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:637)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:621)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:471)
    at hudson.remoting.RemoteClassLoader.loadClassFile(RemoteClassLoader.java:338)
    at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:251)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:249)
    at hudson.remoting.MultiClassLoaderSerializer$Input.resolveClass(MultiClassLoaderSerializer.java:114)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1591)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1496)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1750)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
    at hudson.remoting.UserRequest.deserialize(UserRequest.java:184)
    at hudson.remoting.UserRequest.perform(UserRequest.java:98)
    at hudson.remoting.UserRequest.perform(UserRequest.java:48)
    at hudson.remoting.Request$2.run(Request.java:326)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at hudson.remoting.Engine$1$1.run(Engine.java:62)
    at java.lang.Thread.run(Thread.java:695)
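
The "major.minor version 51.0" in the last cause is the class file format version: 51 is the format produced for Java 7, which a Java 6 JVM (maximum 50) refuses to load. As a rough illustration of where that number comes from, the sketch below reads the major version from the first eight bytes of a class file. The class and method names (`ClassFileVersion`, `majorVersion`, `javaRelease`) are hypothetical and not part of Jenkins or remoting, and the release mapping is only meant for major versions 49 (Java 5) and later.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not a Jenkins or remoting API.
class ClassFileVersion {
    /** Returns the major version stored in bytes 6-7 of a class file header. */
    static int majorVersion(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header); // ByteBuffer is big-endian by default
        if (header.length < 8 || buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file header");
        }
        buf.getShort();                 // minor version, not needed here
        return buf.getShort() & 0xFFFF; // major version as an unsigned 16-bit value
    }

    /** Maps a major version to a Java release (assumes major >= 49, i.e. Java 5+). */
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) {
        // Header of a class compiled for Java 7: magic, minor 0, major 51.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 51};
        int major = majorVersion(header);
        System.out.println("major " + major + " -> Java " + javaRelease(major));
    }
}
```

Run as is, this prints `major 51 -> Java 7`, matching the version reported in the trace.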

I think this is an issue in Jenkins Core, to wit:

All of this should be in a try ... catch block and we should probably close the channel if any of that fails.

Instead what is happening is that the channel remains semi-half-open:

The slave side thinks it is closed but the Jenkins side does not.
Because we have not set the slave's channel field, subsequent connection attempts will not be rejected due to an existing connection. In fact nothing is really retaining a reference to the channel, and we never got to set up the ping thread, so at best we are awaiting the OS to decide the socket is dead.
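
The close-on-failure idea described above can be sketched as follows. This is only an illustration under stated assumptions: `CloseOnFailedSetup`, `SetupStep` and `setUpOrClose` are hypothetical names, and a plain `java.io.Closeable` stands in for the remoting channel. It is not the actual Jenkins core code, nor necessarily how a fix would be written.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration, not Jenkins core code.
class CloseOnFailedSetup {
    /** Stand-in for the post-handshake setup steps that may fail. */
    interface SetupStep {
        void run() throws IOException;
    }

    /**
     * Runs the setup; if it fails for any reason, closes the channel before
     * rethrowing, so that no half-open connection is left behind.
     */
    static void setUpOrClose(Closeable channel, SetupStep setup) throws IOException {
        try {
            setup.run();
        } catch (IOException | RuntimeException | Error e) {
            try {
                channel.close();
            } catch (IOException closeFailure) {
                e.addSuppressed(closeFailure); // keep the original failure as primary
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        AtomicBoolean closed = new AtomicBoolean(false);
        Closeable channel = () -> closed.set(true); // fake channel that records close()
        try {
            setUpOrClose(channel, () -> {
                throw new IOException("Remote call on jnlp failed");
            });
        } catch (IOException e) {
            System.out.println("setup failed; channel closed = " + closed.get());
        }
    }
}
```

With the fake channel above, the failure is still reported to the caller, but the channel has been closed by the time the exception propagates.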

Using a `while true ; do java -jar slave.jar -noReconnect -jnlpUrl ... ; done` loop you can trigger the issue faster:

The memory will be reclaimed once the connection is old enough to have been deemed dead by the TCP stack, but I had one slave with at most one partially set-up connection and the Channel instances just kept on growing. Every so often you can get a few connections to drop off through a full GC, but there would still be loads of them "live".

after a short while

after some more time

(next I let it run a little more then stopped the slave and triggered a full GC)

notice GC doesn't make much of a dent

1m40s later we were able to get GC to collect another instance, leaving loads still hanging around:

after another forced GC

The workaround is obviously not to have a J6 slave.
