[
https://issues.apache.org/jira/browse/KAFKA-18790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17926968#comment-17926968
]
Chia-Ping Tsai commented on KAFKA-18790:
----------------------------------------
The error stack is shown below. The root cause of the issue is the unexpected
termination of the AsyncConsumer background thread, which fails while creating
the NetworkClient. The testing framework is able to update the trust store,
which suggests that, when SSL is enabled, the background thread may be reading
the trust store during NetworkClient creation while the file is still being
updated.
It seems we need to handle the following issues.
1. Rewrite the test to avoid updating the trust store file. We can do that by
not creating the admin client repeatedly.
2. ConsumerNetworkThread#run should log the error thrown by
`initializeResources`.
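A minimal sketch of point 2, using a hypothetical stand-in class rather than the real ConsumerNetworkThread: the initialization step is wrapped in a try/catch so the failure is logged before the thread exits. Failing a pending future is an extra assumption of this sketch (it is not proposed above); it only illustrates how a caller could avoid blocking forever. `BackgroundThreadSketch`, `resourceSupplier`, and `ready` are illustrative names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for a consumer background thread; not Kafka's actual class.
class BackgroundThreadSketch extends Thread {
    private static final Logger LOG =
        Logger.getLogger(BackgroundThreadSketch.class.getName());

    // Stands in for the resource creation done during initialization.
    private final Supplier<String> resourceSupplier;
    // Completed once initialization succeeds or fails.
    final CompletableFuture<String> ready = new CompletableFuture<>();

    BackgroundThreadSketch(Supplier<String> resourceSupplier) {
        super("background-thread-sketch");
        this.resourceSupplier = resourceSupplier;
    }

    @Override
    public void run() {
        try {
            // May throw, for example when a trust store cannot be loaded.
            ready.complete(resourceSupplier.get());
        } catch (Throwable t) {
            // Log the failure instead of letting the thread die silently.
            LOG.log(Level.SEVERE, "Failed to initialize background thread resources", t);
            // Assumption of this sketch: also fail the future so callers stop waiting.
            ready.completeExceptionally(t);
        }
    }

    public static void main(String[] args) throws Exception {
        BackgroundThreadSketch thread = new BackgroundThreadSketch(() -> {
            throw new IllegalStateException("Failed to create new NetworkClient");
        });
        thread.start();
        try {
            thread.ready.get(5, TimeUnit.SECONDS);
        } catch (ExecutionException e) {
            System.out.println("Caller observed: " + e.getCause().getMessage());
        }
    }
}
```

With the catch block in place, the initialization error shows up in the log at the point of failure instead of only surfacing later as a hung caller.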
```
org.apache.kafka.common.KafkaException: Failed to create new NetworkClient
    at org.apache.kafka.clients.ClientUtils.createNetworkClient(ClientUtils.java:256) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.ClientUtils.createNetworkClient(ClientUtils.java:163) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.consumer.internals.NetworkClientDelegate$1.create(NetworkClientDelegate.java:460) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.consumer.internals.NetworkClientDelegate$1.create(NetworkClientDelegate.java:457) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.consumer.internals.CachedSupplier.get(CachedSupplier.java:39) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.consumer.internals.RequestManagers$1.create(RequestManagers.java:166) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.consumer.internals.RequestManagers$1.create(RequestManagers.java:163) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.consumer.internals.CachedSupplier.get(CachedSupplier.java:39) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.consumer.internals.events.ApplicationEventProcessor$1.create(ApplicationEventProcessor.java:675) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.consumer.internals.events.ApplicationEventProcessor$1.create(ApplicationEventProcessor.java:672) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.consumer.internals.CachedSupplier.get(CachedSupplier.java:39) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkThread.initializeResources(ConsumerNetworkThread.java:121) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkThread.run(ConsumerNetworkThread.java:100) [kafka-clients-4.1.0-SNAPSHOT.jar:?]
Caused by: org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: Failed to load SSL keystore /tmp/truststore8413027689144989558.jks of type JKS
    at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:184) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:188) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:79) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:120) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.ClientUtils.createNetworkClient(ClientUtils.java:224) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    ... 12 more
Caused by: org.apache.kafka.common.KafkaException: Failed to load SSL keystore /tmp/truststore8413027689144989558.jks of type JKS
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory$FileBasedStore.load(DefaultSslEngineFactory.java:380) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory$FileBasedStore.<init>(DefaultSslEngineFactory.java:352) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory.createTruststore(DefaultSslEngineFactory.java:325) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory.configure(DefaultSslEngineFactory.java:169) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.security.ssl.SslFactory.instantiateSslEngineFactory(SslFactory.java:147) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.security.ssl.SslFactory.configure(SslFactory.java:100) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:180) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:188) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:79) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:120) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.ClientUtils.createNetworkClient(ClientUtils.java:224) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    ... 12 more
Caused by: java.io.EOFException
    at java.base/java.io.DataInputStream.readFully(DataInputStream.java:210) ~[?:?]
    at java.base/java.io.DataInputStream.readInt(DataInputStream.java:385) ~[?:?]
    at java.base/sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:674) ~[?:?]
    at java.base/sun.security.util.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:228) ~[?:?]
    at java.base/java.security.KeyStore.load(KeyStore.java:1499) ~[?:?]
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory$FileBasedStore.load(DefaultSslEngineFactory.java:377) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory$FileBasedStore.<init>(DefaultSslEngineFactory.java:352) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory.createTruststore(DefaultSslEngineFactory.java:325) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory.configure(DefaultSslEngineFactory.java:169) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.security.ssl.SslFactory.instantiateSslEngineFactory(SslFactory.java:147) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.security.ssl.SslFactory.configure(SslFactory.java:100) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:180) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:188) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:79) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:120) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.ClientUtils.createNetworkClient(ClientUtils.java:224) ~[kafka-clients-4.1.0-SNAPSHOT.jar:?]
```
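The innermost `java.io.EOFException` is consistent with a reader opening the trust store while it is empty or only partially written. The sketch below reproduces that in isolation with the JDK's `KeyStore` API: loading a zero-length JKS file fails with an `IOException`, while a file that is written to a sibling temp file and then renamed into place loads cleanly. The atomic rename is a general technique shown for illustration, not the fix proposed in this comment (which is to stop rewriting the file); `TrustStoreRaceSketch`, `load`, and `writeAtomically` are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.KeyStore;

// Hypothetical helper, not part of Kafka or its test framework.
class TrustStoreRaceSketch {

    // Load a JKS store from disk, as a client would when SSL is configured.
    static KeyStore load(Path path, char[] password) throws Exception {
        KeyStore ks = KeyStore.getInstance("JKS");
        try (InputStream in = Files.newInputStream(path)) {
            ks.load(in, password);
        }
        return ks;
    }

    // Write the store to a temp file in the same directory, then rename it
    // over the target so readers never see a half-written file.
    static void writeAtomically(KeyStore ks, Path target, char[] password) throws Exception {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "truststore", ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            ks.store(out, password);
        }
        // On POSIX file systems this is a rename that replaces the target.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws Exception {
        char[] password = "changeit".toCharArray();
        // createTempFile leaves an empty file, like a store truncated mid-update.
        Path store = Files.createTempFile("truststore", ".jks");
        try {
            load(store, password);
        } catch (IOException e) {
            System.out.println("Load of empty store failed: " + e.getClass().getName());
        }

        KeyStore empty = KeyStore.getInstance("JKS");
        empty.load(null, password);          // initialize an empty in-memory store
        writeAtomically(empty, store, password);
        System.out.println("Entries after atomic write: " + load(store, password).size());
        Files.deleteIfExists(store);
    }
}
```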
> testCustomQuotaCallback time out build failure
> ----------------------------------------------
>
> Key: KAFKA-18790
> URL: https://issues.apache.org/jira/browse/KAFKA-18790
> Project: Kafka
> Issue Type: Test
> Reporter: Lianet Magrans
> Assignee: Chia-Ping Tsai
> Priority: Major
>
> Recent transient build failure due to testCustomQuotaCallback timeout,
> recovered on the next run. The build was for a PR just fixing a flaky test
> unrelated to testCustomQuotaCallback (PR test passed successfully on that
> same failed build).
> Investigating the transient build failure showed that the quarantine task
> had timed out. In the logs I could see that testCustomQuotaCallback was
> started but never finished:
> {noformat}
> > Task :core:quarantinedTest
> ...
> Gradle Test Run :core:quarantinedTest > Gradle Test Executor 112 >
> CustomQuotaCallbackTest > testCustomQuotaCallback(String, String) >
> testCustomQuotaCallback(String, String).quorum=kraft.groupProtocol=classic
> STARTED
> ...
> > Task :core:quarantinedTest
> ...
> Gradle Test Run :core:quarantinedTest > Gradle Test Executor 112 >
> CustomQuotaCallbackTest > testCustomQuotaCallback(String, String) >
> testCustomQuotaCallback(String, String).quorum=kraft.groupProtocol=classic
> PASSED
> ...
> Gradle Test Run :core:quarantinedTest > Gradle Test Executor 112 >
> CustomQuotaCallbackTest > testCustomQuotaCallback(String, String) >
> testCustomQuotaCallback(String, String).quorum=kraft.groupProtocol=consumer
> STARTED
> ...{noformat}
>
> This last one never finished. And the thread dump showed:
> {code:java}
> "Test worker" #1 [7287] prio=5 os_prio=0 cpu=60168.75ms elapsed=10224.12s tid=0x00007f17d802fe70 nid=7287 waiting on condition [0x00007f17dd7f9000]
>    java.lang.Thread.State: WAITING (parking)
>     at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
>     parking to wait for <0x0000000088705fa8> (a java.util.concurrent.CompletableFuture$Signaller)
>     at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:221)
>     at java.util.concurrent.CompletableFuture$Signaller.block([email protected]/CompletableFuture.java:1908)
>     at java.util.concurrent.ForkJoinPool.unmanagedBlock([email protected]/ForkJoinPool.java:4023)
>     at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3969)
>     at java.util.concurrent.CompletableFuture.waitingGet([email protected]/CompletableFuture.java:1942)
>     at java.util.concurrent.CompletableFuture.get([email protected]/CompletableFuture.java:2116)
>     at org.apache.kafka.clients.consumer.internals.ConsumerUtils.getResult(ConsumerUtils.java:238)
>     at org.apache.kafka.clients.consumer.internals.events.ApplicationEventHandler.addAndGet(ApplicationEventHandler.java:131)
>     at org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.subscribeInternal(AsyncKafkaConsumer.java:1965)
>     at org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.subscribe(AsyncKafkaConsumer.java:1817)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.subscribe(KafkaConsumer.java:717)
>     at kafka.api.QuotaTestClients.consumeUntilThrottled(BaseQuotaTest.scala:238)
>     at kafka.api.CustomQuotaCallbackTest$GroupedUser.produceConsume(CustomQuotaCallbackTest.scala:322)
>     at kafka.api.CustomQuotaCallbackTest.testCustomQuotaCallback(CustomQuotaCallbackTest.scala:147)
> {code}
>
> Build with the test timeout:
> [https://github.com/apache/kafka/actions/runs/13273058547/job/37110549871?pr=18866]
>