[jira] [Commented] (KAFKA-13501) Avoid state restore via rebalance if standbys are enabled

Matthias J. Sax (Jira) Thu, 25 Jul 2024 14:27:08 -0700


    [ 
https://issues.apache.org/jira/browse/KAFKA-13501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868807#comment-17868807
 ]


Matthias J. Sax commented on KAFKA-13501:
-----------------------------------------

These cases are not so easy to reproduce... There is literally a 
`TaskCorruptedException` in the code base that tells you when this could 
happen. It's often EOS related (i.e., state store corrupted), but retriable 
errors on `send()` and an invalid offset from a local .checkpoint file 
can also cause it (from a quick look into the code). Best to check the code 
directly to see which scenario might be easiest for you to reproduce/trigger.

> Avoid state restore via rebalance if standbys are enabled
> ---------------------------------------------------------
>
>                 Key: KAFKA-13501
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13501
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Matthias J. Sax
>            Priority: Major
>              Labels: new-streams-runtime-should-fix
>
> There are certain scenarios in which Kafka Streams wipes out local state and 
> rebuilds it from scratch. This is a thread-local cleanup, i.e., no rebalance 
> is triggered, and we end up with an offline task until state restoration 
> has finished.
> If standby tasks are enabled, it might actually make sense to trigger a 
> rebalance instead, to get the task re-assigned to the instance hosting the 
> standby, so that the task becomes active again quickly.
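
For anyone trying to set up a reproduction, here is a minimal sketch of a configuration with standbys and EOS enabled, which is the setup this ticket is about. It only builds a plain `java.util.Properties` with the documented Kafka Streams config keys and does not start a Streams instance. The application id, the bootstrap address, and the class and method names are placeholders picked for illustration, not anything from the ticket.

```java
import java.util.Properties;

public class StandbyConfigSketch {

    // Builds a Properties object using Kafka Streams config keys.
    // Values for application.id and bootstrap.servers are hypothetical placeholders.
    static Properties standbyEnabledConfig() {
        Properties props = new Properties();
        props.setProperty("application.id", "standby-demo-app");   // placeholder
        props.setProperty("bootstrap.servers", "localhost:9092");  // placeholder
        // Keep one standby replica per stateful task on another instance.
        props.setProperty("num.standby.replicas", "1");
        // EOS is the setup in which a corrupted task is most often reported.
        props.setProperty("processing.guarantee", "exactly_once_v2");
        return props;
    }

    public static void main(String[] args) {
        Properties props = standbyEnabledConfig();
        // Print the settings in a stable (sorted) order.
        props.stringPropertyNames().stream().sorted()
             .forEach(k -> System.out.println(k + "=" + props.getProperty(k)));
    }
}
```

With a config like this, the standby already holds a copy of the state on another instance, which is why re-assigning the task there could be quicker than restoring locally from scratch.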



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
