Guido Iaquinti created ZOOKEEPER-4540:
-----------------------------------------

             Summary: Zookeeper 3.7.0: /metrics java.lang.IllegalArgumentException: Invalid thread ID parameter: 0
                 Key: ZOOKEEPER-4540
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4540
             Project: ZooKeeper
          Issue Type: Bug
            Reporter: Guido Iaquinti


A ZooKeeper node running version 3.7.0 is part of a cluster with 2 other nodes.

I'm getting the following log while accessing the /metrics (Prometheus) endpoint:

{code:java}
2022-05-10 16:11:01,122 [myid:0] - WARN [qtp850551034-16:HttpChannel@677] - /metrics
java.lang.IllegalArgumentException: Invalid thread ID parameter: 0
    at java.management/sun.management.ThreadImpl.verifyThreadId(ThreadImpl.java:165)
    at java.management/sun.management.ThreadImpl.verifyThreadIds(ThreadImpl.java:174)
    at java.management/sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:180)
    at io.prometheus.client.hotspot.ThreadExports.getThreadStateCountMap(ThreadExports.java:98)
    at io.prometheus.client.hotspot.ThreadExports.addThreadMetrics(ThreadExports.java:86)
    at io.prometheus.client.hotspot.ThreadExports.collect(ThreadExports.java:123)
    at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:190)
    at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:223)
    at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:144)
    at io.prometheus.client.exporter.common.TextFormat.write004(TextFormat.java:22)
    at io.prometheus.client.exporter.MetricsServlet.doGet(MetricsServlet.java:49)
    at org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider$MetricsServletImpl.doGet(PrometheusMetricsProvider.java:406)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:791)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:550)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
    at org.eclipse.jetty.server.Server.handle(Server.java:516)
    at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
    at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:279)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
    at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
    at java.base/java.lang.Thread.run(Thread.java:829)
{code}

The response of the endpoint is also partial:

{code:java}
ssm-user@dev-iad-zk-0:/var/snap/amazon-ssm-agent/5165$ curl http://localhost:7000/metrics
# HELP learner_proposal_received_count learner_proposal_received_count
# TYPE learner_proposal_received_count counter
learner_proposal_received_count 0.0
{code}

A few additional things:
# the other 2 nodes don't experience the same issue (same config)
# the node is otherwise fine (the service is working)
# the overall cluster health is good

This is not a critical issue, but it is causing us problems from a service visibility perspective.

Thank you!

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
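
For reference, the top frames of the trace (ThreadImpl.verifyThreadIds called from ThreadImpl.getThreadInfo) indicate that an array of thread IDs containing 0 was handed to the JDK's ThreadMXBean. The sketch below is only an illustration of that JDK-side check in isolation: it passes the ID 0 to ThreadMXBean.getThreadInfo by hand and reports whether it is rejected. The class name InvalidThreadIdDemo and the helper probe are hypothetical names introduced here, and the sketch makes no claim about why ThreadExports ended up with a 0 in its ID array on this node.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class InvalidThreadIdDemo {

    // Hypothetical helper: asks the platform ThreadMXBean for info on a single
    // thread ID. Returns the exception message if the ID is rejected, or null
    // if the call is accepted.
    static String probe(long threadId) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        try {
            // Same overload that appears in the stack trace: (long[] ids, int maxDepth).
            bean.getThreadInfo(new long[] {threadId}, 0);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Thread IDs must be positive, so 0 is expected to be rejected.
        System.out.println("id 0           -> " + probe(0L));
        // A live thread's ID is valid, so no exception is expected here.
        System.out.println("current thread -> " + probe(Thread.currentThread().getId()));
    }
}
```

On a JVM matching the one in the trace, the first line should carry the same "Invalid thread ID parameter" text as the log above, while the second call goes through. Because one bad ID makes the whole getThreadInfo call fail, this would be consistent with the /metrics output stopping partway through, although that link is an inference from the trace and not something verified on the affected node.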