The "JMX client heartbeat n" threads are created by com.sun.jmx.remote.internal.ClientCommunicatorAdmin.
GFSH registers JMXConnectionListener which is a non-inner class hiding inside ./geode-core/src/main/java/org/apache/geode/management/internal/cli/shell/JmxOperationInvoker.java -- we should extract all such classes to their own java file or make then static inner classes. JMXConnectionListener implements javax.management.NotificationListener and is registered with JMXConnector. "gfshFileLogger.severe" is the line in notifyDisconnect that is causing these fatal level log messages. It's not clear to me what would cause GFSH to keep the JDK's "JMX client heartbeat n" threads around (I haven't actually seen this yet). Maybe the way JmxOperationInvoker is implemented is incorrect and leaks or keeps some JMX resources around. GFSH does keep a reference to JmxOperationInvoker and I don't see that cleared anywhere (such as when it might disconnect from a JMX server). It seems reasonable to lower the log level for the log messages (I hate to suppress it entirely though). Beyond that we should file a bug for the leaking of "JMX client heartbeat n" threads. I suspect leaking these threads would mostly occur in tests that are connecting/disconnecting a GFSH instance repeatedly. The typical user of GFSH isn't going to repeatedly connect/disconnect like this. Maybe the tests that are hitting this are closing the manager 1st and then GFSH's JMXConnectionListener detects this but then never frees the JMXConnectionListener or JmxOperationInvoker. -Kirk On Thu, Oct 27, 2016 at 11:46 AM, Bruce Schuchardt <bschucha...@pivotal.io> wrote: > I think the code is in Gfsh.notifyDisconnect(). I think the test that's > polluting others is ConnectToLocatorSSLDUnitTest. I had a failure with > this suspect string in a test that ran just after it. I added a thread dump > in the @After of that test and see that it's leaving one of these threads > behind along with some Gfsh Launcher threads. > > > Le 10/27/2016 à 10:54 AM, Kirk Lund a écrit : > >> Bruce, can you please point me at the JDK code you think is generating the >> fatal message? I looked in RMIConnector.java and can't find it in there. >> >> Maybe we can fix ordering of tearDown or something else in order to "fix" >> this message. >> >> Thanks, >> Kirk >> >> >> On Wed, Oct 26, 2016 at 5:00 PM, Dan Smith <dsm...@pivotal.io> wrote: >> >> Just looking at a couple of these bugs, these are fatal level errors. Do >>> you know what is causing them or what affect this might have? >>> >>> [fatal 2016/09/29 12:12:03.891 PDT <JMX client heartbeat 3> tid=0x18d] >>> (tid=397 msgId=39) No longer connected to cc6-co6.gemstone.com[27162]. >>> >>> -Dan >>> >>> On Wed, Oct 26, 2016 at 4:50 PM, Bruce Schuchardt < >>> bschucha...@pivotal.io> >>> wrote: >>> >>> We have 23 CI failure tickets concerning suspect strings from "JMX client >>>> heartbeat" threads. I think we should add this to ExpectedStrings.java >>>> >>> and >>> >>>> close these tickets. The suspect strings are coming from >>>> javax.management.remote.rmi.RMIConnector and I don't think there's >>>> anything we can do about them. Most, if not all, are assigned to >>>> >>> security >>> >>>> or JMX components. >>>> >>>> >>>> GEODE-1858 >>>> GEODE-1955 >>>> GEODE-1492 >>>> GEODE-1539 >>>> GEODE-1538 >>>> GEODE-1773 >>>> GEODE-1922 >>>> GEODE-1482 >>>> GEODE-1475 >>>> GEODE-1480 >>>> GEODE-1481 >>>> GEODE-1826 >>>> GEODE-1825 >>>> GEODE-1820 >>>> GEODE-1878 >>>> GEODE-1879 >>>> GEODE-1877 >>>> GEODE-1875 >>>> GEODE-1876 >>>> GEODE-1499 >>>> GEODE-1476 >>>> GEODE-1869 >>>> GEODE-1769 >>>> >>>> >>>> >