cadonna opened a new pull request, #16570: URL: https://github.com/apache/kafka/pull/16570
When Streams tries to transition a restored active task to RUNNING, the first thing it does is get the committed offsets for this task. If fetching the offsets times out, Streams does not re-throw the error immediately but retries getting the committed offsets later, until a Streams-specific timeout is hit.

Restored active tasks are removed from the state updater's output queue of restored tasks before this transition is attempted. If a timeout occurs, the restored task is neither added to the task registry nor re-added to the state updater. The task is lost because it is no longer maintained anywhere, which also means it is never closed. Since the stream thread does not know about the lost task, the same task can be created again on the same stream thread. When that happens, the state stores are opened a second time and RocksDB throws the "No locks available" error.

This commit re-adds the task to the state updater if the request for the committed offsets times out.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
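
The failure mode described in this pull request can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the actual Kafka Streams code: the class `LostTaskSketch`, the exception `OffsetFetchTimeout`, and the methods `handleRestoredTasks`, `fetchCommittedOffsets`, and `isTracked` are hypothetical names invented for illustration. The model drains a queue of restored tasks, simulates a timeout while fetching committed offsets, and shows that a task is tracked afterwards only if it is re-added to the state updater on timeout.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Hypothetical model of the bug; none of these names are Kafka Streams APIs.
class LostTaskSketch {

    /** Stand-in for the error raised when fetching committed offsets times out. */
    public static class OffsetFetchTimeout extends RuntimeException { }

    // Models the state updater's output queue of restored tasks.
    public final Queue<String> restoredQueue = new ArrayDeque<>();
    // Models tasks that were handed back to the state updater.
    public final Set<String> stateUpdater = new LinkedHashSet<>();
    // Models the task registry that owns running tasks.
    public final Set<String> taskRegistry = new LinkedHashSet<>();
    // Tasks whose offset fetch times out in this simulation.
    private final Set<String> timingOut;

    public LostTaskSketch(List<String> restored, Set<String> timingOut) {
        this.restoredQueue.addAll(restored);
        this.timingOut = timingOut;
    }

    // Simulated fetch of committed offsets: throws for tasks configured to time out.
    private void fetchCommittedOffsets(String taskId) {
        if (timingOut.contains(taskId)) {
            throw new OffsetFetchTimeout();
        }
    }

    /** Drains the restored queue and tries to move each task to the registry. */
    public void handleRestoredTasks(boolean reAddOnTimeout) {
        String taskId;
        while ((taskId = restoredQueue.poll()) != null) { // task leaves the queue here
            try {
                fetchCommittedOffsets(taskId);
                taskRegistry.add(taskId);
            } catch (OffsetFetchTimeout e) {
                if (reAddOnTimeout) {
                    // Behaviour after the fix: hand the task back so it is retried later.
                    stateUpdater.add(taskId);
                }
                // Without the re-add, the timeout is swallowed and nothing owns the task.
            }
        }
    }

    /** True if any component still knows about the task. */
    public boolean isTracked(String taskId) {
        return restoredQueue.contains(taskId)
            || stateUpdater.contains(taskId)
            || taskRegistry.contains(taskId);
    }

    public static void main(String[] args) {
        for (boolean reAdd : new boolean[] {false, true}) {
            LostTaskSketch sketch = new LostTaskSketch(List.of("0_0", "0_1"), Set.of("0_1"));
            sketch.handleRestoredTasks(reAdd);
            System.out.println("reAddOnTimeout=" + reAdd
                + " -> task 0_1 tracked: " + sketch.isTracked("0_1"));
        }
    }
}
```

In this model, running with `reAddOnTimeout` set to `false` leaves task `0_1` untracked, which corresponds to the lost task that is never closed. Setting it to `true` keeps the task in the modelled state updater, so a later attempt can transition it again.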