Martin Andersson created KAFKA-19697:
----------------------------------------

             Summary: NPE Cannot invoke org.apache.kafka.connect.runtime.ConnectMetrics$MetricGroup.close()
                 Key: KAFKA-19697
                 URL: https://issues.apache.org/jira/browse/KAFKA-19697
             Project: Kafka
          Issue Type: Bug
          Components: connect
    Affects Versions: 4.0.0
         Environment: Kafka Connect cluster with 20 workers running in Kubernetes, on custom-built Kafka images based on eclipse-temurin:21-jre-alpine-3.21
            Reporter: Martin Andersson


Several tasks of a sink connector in a long-running Connect cluster failed spontaneously with the following stack trace:
{code:java}
java.lang.NullPointerException: Cannot invoke "org.apache.kafka.connect.runtime.ConnectMetrics$MetricGroup.close()" because the return value of "java.util.concurrent.ConcurrentMap.get(Object)" is null
    at org.apache.kafka.connect.runtime.Worker$ConnectorStatusMetricsGroup.recordTaskRemoved(Worker.java:2333)
    at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:707)
    at org.apache.kafka.connect.runtime.Worker.startSinkTask(Worker.java:568)
    at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:2009)
    at org.apache.kafka.connect.runtime.distributed.DistributedHerder.lambda$getTaskStartingCallable$39(DistributedHerder.java:2059)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
{code}
Restarting the failed tasks through the REST API led to another task failure, with the following stack trace:
{code:java}
java.lang.NullPointerException: Cannot invoke "java.util.Map.size()" because "inputMap" is null
    at org.apache.kafka.common.utils.Utils.castToStringObjectMap(Utils.java:1476)
    at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:112)
    at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:146)
    at org.apache.kafka.connect.runtime.TaskConfig.<init>(TaskConfig.java:51)
    at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:661)
    at org.apache.kafka.connect.runtime.Worker.startSinkTask(Worker.java:568)
    at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:2009)
    at org.apache.kafka.connect.runtime.distributed.DistributedHerder.lambda$getTaskStartingCallable$39(DistributedHerder.java:2059)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
{code}
The failed tasks were not counted in the _connector-failed-task-count_ metric (or in any of the _restarting/paused/failed_ task metrics), yet they disappeared from the _connector-running-task-count_ metric.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
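
For readers triaging the first stack trace: its message says that close() was invoked directly on the value returned by ConcurrentMap.get(Object), and that value was null. The following is a minimal sketch of that general pattern next to a null-safe variant. All names in it (MetricGroupRegistrySketch, Group, recordRemovedUnguarded, recordRemovedGuarded) are hypothetical; it is not the Kafka source, it is not a proposed patch, and it does not explain why the entry was missing from the map in the reported cluster.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class MetricGroupRegistrySketch {

    /** Hypothetical stand-in for a closeable per-task metric group. */
    public static final class Group implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    private final ConcurrentMap<String, Group> groups = new ConcurrentHashMap<>();

    /** Registers a group for the task id if none exists yet. */
    public void recordAdded(String taskId) {
        groups.putIfAbsent(taskId, new Group());
    }

    /** Unguarded removal: get() returns null for an unknown id, so close() throws NPE. */
    public void recordRemovedUnguarded(String taskId) {
        groups.get(taskId).close();
        groups.remove(taskId);
    }

    /** Guarded removal: returns false instead of throwing when the id is unknown. */
    public boolean recordRemovedGuarded(String taskId) {
        Group group = groups.remove(taskId); // single atomic lookup-and-remove
        if (group == null) {
            return false;
        }
        group.close();
        return true;
    }

    public static void main(String[] args) {
        MetricGroupRegistrySketch registry = new MetricGroupRegistrySketch();
        try {
            // "sink-0" is a made-up task id that was never registered.
            registry.recordRemovedUnguarded("sink-0");
        } catch (NullPointerException e) {
            System.out.println("unguarded removal threw: " + e.getMessage());
        }
        System.out.println("guarded removal succeeded: " + registry.recordRemovedGuarded("sink-0"));
    }
}
```

The guarded variant only avoids the exception; whether a missing entry should be tolerated silently, logged, or treated as an error is a design decision this sketch does not settle.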