[ 
https://issues.apache.org/jira/browse/KAFKA-17789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Michaud updated KAFKA-17789:
------------------------------------
    Description: 
In an application with multiple Kafka Streams clients, each running multiple 
threads, when the app is started with empty local storage (without resetting 
the whole application), only some of the clients restore the changelog topics.

The non-restoring clients are also unable to shut down gracefully.

 

Reproduction steps

> I'm listing all the details from the actual environment here. I'm also going 
> to put together a project to reproduce the issue locally, and I'll link it 
> in this ticket.
 * Run the app in a Kubernetes environment with multiple pods (5), giving 5 
Streams clients, with enough data or little enough CPU that restoration takes 
a long time (long enough to see the issue after 1 or 2 minutes)
 * Let the app consume the input topics until it is live (no lag on input or 
internal topics)
 * Stop the app
 * Clear out the local storage
 * Restart the app and observe that only 2 or 3 clients are restoring, while 
the others consume nothing
 * Bonus: stop the clients; the stuck clients will not close and will keep 
sending heartbeats and responding to rebalance assignments
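
As a rough illustration of the "multiple threads" and "clear out the local 
storage" steps above, here is a minimal sketch in plain Java. It is not the 
reporter's code. The class and method names (EmptyStateRepro, streamsProps, 
clearLocalState) are hypothetical, and it assumes the local state lives under 
<state.dir>/<application.id>, which may not match a given deployment (for 
example when the pods use their own volumes). The config keys are written as 
plain strings so the sketch does not need the Kafka client jars.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; not part of Kafka.
final class EmptyStateRepro {

    // Builds the settings relevant to the report: several stream threads
    // per client and an explicit local state directory.
    static Properties streamsProps(String appId, String bootstrapServers,
                                   Path stateDir, int threads) {
        Properties props = new Properties();
        props.put("application.id", appId);
        props.put("bootstrap.servers", bootstrapServers);
        props.put("num.stream.threads", Integer.toString(threads));
        props.put("state.dir", stateDir.toString());
        return props;
    }

    // Deletes <stateDir>/<appId> recursively while the app is stopped, so the
    // next start has to restore every store from its changelog topic.
    // Returns the number of files and directories removed.
    static int clearLocalState(Path stateDir, String appId) throws IOException {
        Path appDir = stateDir.resolve(appId);
        if (!Files.exists(appDir)) {
            return 0;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(appDir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder())
                        .collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
        return paths.size();
    }
}
```

In a real reproduction the properties would be passed to the Streams client 
on each pod, and clearLocalState would run between the "stop the app" and 
"restart" steps. Committed offsets and internal topics are left untouched, 
which matches "without resetting the whole application".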

Related slack discussion: 
https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1728296887560369

  was:
In an application with multiple clients, each having multiple threads, when the 
app is started with an empty storage (without resetting the whole application), 
only a part of the clients are restoring the changelog topics.

Those non-restoring clients are also not able to shutdown gracefully.

 

Reproduction steps

> I'm putting all the actual details, while I'm going to make a project to 
> reproduce it locally, and I'll link it inside this ticket.
 * Having the app in a kubernetes environment, with multiple pods (5) so 
finally having 5 streams clients, and also enough data or poor cpu to have long 
restoration (enough to see the issue after 1 or 2 minutes)
 * Already consumed input topics and be live (no lag on input or internal 
topics)
 * then stop the app
 * clear out the local storage
 * finally restart and see that only 2 or 3 clients are restoring, the others 
consuming nothing
 * Bonus: stop the clients, then the stuck clients should not close and should 
continue sending heartbeats and answering any rebalance assignment


> State updater stuck when starting with empty state folder
> ---------------------------------------------------------
>
>                 Key: KAFKA-17789
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17789
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 3.8.0
>            Reporter: Antoine Michaud
>            Priority: Critical
>             Fix For: 4.0.0
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
