[ 
https://issues.apache.org/jira/browse/KAFKA-17524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17881887#comment-17881887
 ] 

Chia-Ping Tsai commented on KAFKA-17524:
----------------------------------------

[~mjsax] StreamThreadTest hangs even though JUnit tries to interrupt it. 
The possible root cause is as follows:
1. The JUnit timeout fires (interrupt).
2. TaskExecutorThread#runOnce receives the InterruptedException, but the 
exception is ignored [0].
3. TaskExecutorThread enters the next loop iteration because 
`shutdownRequested` is still false.
4. TaskExecutorThread#awaitProcessableTasks hangs.
5. StreamThreadTest shuts down and sets `shutdownRequested` to true, but it is 
too late ...
6. StreamThreadTest#tearDown [1] hangs because TaskExecutorThread is still 
running and the timeout is Long.MAX.

The simple solution is to make `requestShutdown` call `Thread#interrupt` after 
`shutdownRequested` is set to true. WDYT?
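
To illustrate the idea, here is a minimal, self-contained sketch. It is not the 
Kafka code: `Worker`, `awaitWork`, and `stopsAfterRequestShutdown` are 
hypothetical stand-ins for TaskExecutorThread, awaitProcessableTasks, and the 
test flow. It only shows that interrupting after the flag is set lets a thread 
that swallows InterruptedException leave a blocking wait and exit its loop.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class InterruptOnShutdownSketch {

    // Hypothetical stand-in for TaskExecutorThread; not the actual Kafka class.
    static class Worker extends Thread {
        private final AtomicBoolean shutdownRequested = new AtomicBoolean(false);
        private final Object lock = new Object();

        @Override
        public void run() {
            while (!shutdownRequested.get()) {
                try {
                    awaitWork();
                } catch (InterruptedException e) {
                    // Swallowed, as in step 2 above: only the loop condition
                    // decides whether the thread exits.
                }
            }
        }

        // Blocks until notified or interrupted; stands in for
        // awaitProcessableTasks. Nothing ever notifies in this sketch.
        private void awaitWork() throws InterruptedException {
            synchronized (lock) {
                lock.wait();
            }
        }

        // Proposed ordering: set the flag first, then interrupt, so the
        // thread sees shutdownRequested == true once it wakes up.
        void requestShutdown() {
            shutdownRequested.set(true);
            interrupt();
        }
    }

    // Returns true if the worker terminates after requestShutdown().
    static boolean stopsAfterRequestShutdown() throws InterruptedException {
        Worker worker = new Worker();
        worker.setDaemon(true);   // do not keep the JVM alive if the sketch is wrong
        worker.start();
        worker.interrupt();       // early interrupt (like the JUnit timeout) is ignored
        Thread.sleep(100);        // give the worker time to block again
        worker.requestShutdown();
        worker.join(5000);        // bounded wait instead of an unbounded one
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker stopped: " + stopsAfterRequestShutdown());
    }
}
```

Without the `interrupt()` call in `requestShutdown`, the worker in this sketch 
would stay blocked in `awaitWork` and never observe the flag, which mirrors 
steps 4 to 6 above.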


[0] 
https://github.com/apache/kafka/blob/344d8a60af8ba7e4ae646501494b17b6987a95d3/streams/src/main/java/org/apache/kafka/streams/processor/internals/tasks/DefaultTaskExecutor.java#L119
[1] 
https://github.com/apache/kafka/blob/344d8a60af8ba7e4ae646501494b17b6987a95d3/streams/src/test/java/org/apache/kafka/streams/processor/internals/StreamThreadTest.java#L235

> StreamThreadTest shouldReturnErrorIfProducerInstanceIdNotInitialized hanging
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-17524
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17524
>             Project: Kafka
>          Issue Type: Test
>          Components: streams, unit tests
>            Reporter: David Arthur
>            Priority: Major
>              Labels: flaky-test
>
> A trunk build had a timeout and it appears that this test was the cause.
> [https://github.com/apache/kafka/actions/runs/10798234851/job/29953919232]
> In the Gradle log, we see
> {code:java}
> 2024-09-10T20:31:26.6830206Z Gradle Test Run :streams:test > Gradle Test 
> Executor 47 > StreamThreadTest > 
> shouldReturnErrorIfProducerInstanceIdNotInitialized(boolean, boolean) > 
> "shouldReturnErrorIfProducerInstanceIdNotInitialized(boolean, 
> boolean).stateUpdaterEnabled=true, processingThreadsEnabled=true" STARTED 
> {code}
> but no "FINISHED" or "FAILED" later in the log.
> It seems that this test was running for around 50 minutes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
