cadonna opened a new pull request, #16570:
URL: https://github.com/apache/kafka/pull/16570

   When Streams tries to transition a restored active task to RUNNING, the first 
thing it does is fetch the committed offsets for this task. If fetching the 
offsets times out, Streams does not re-throw the error immediately, but 
retries fetching the committed offsets later until a Streams-specific timeout is 
hit.
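
The deferred-timeout behavior described above can be sketched as follows. This is a minimal illustration, not the Streams implementation: the class `TaskTimeoutSketch`, its methods, and the use of `java.util.concurrent.TimeoutException` and `IllegalStateException` as stand-ins are all assumptions made for the example.

```java
import java.util.concurrent.TimeoutException;

// Hypothetical model of a per-task timeout: the first timeout only starts
// a clock; the error is surfaced once the budget has been used up.
class TaskTimeoutSketch {
    private static final long NO_DEADLINE = -1L;

    private final long taskTimeoutMs;        // assumed Streams-specific budget
    private long deadlineMs = NO_DEADLINE;   // unset until the first timeout

    TaskTimeoutSketch(long taskTimeoutMs) {
        this.taskTimeoutMs = taskTimeoutMs;
    }

    // Called whenever fetching the committed offsets timed out.
    void maybeInitDeadlineOrThrow(long nowMs, TimeoutException cause) {
        if (deadlineMs == NO_DEADLINE) {
            // First timeout: remember when to give up, but do not throw yet.
            deadlineMs = nowMs + taskTimeoutMs;
        } else if (nowMs > deadlineMs) {
            // Budget exhausted: surface the error to the caller.
            throw new IllegalStateException("task timeout exceeded", cause);
        }
        // Otherwise: still within budget, the caller retries later.
    }

    // Called after a successful attempt so the next timeout starts fresh.
    void clearDeadline() {
        deadlineMs = NO_DEADLINE;
    }

    public static void main(String[] args) {
        TaskTimeoutSketch timeout = new TaskTimeoutSketch(100L);
        TimeoutException cause = new TimeoutException("fetching committed offsets timed out");
        timeout.maybeInitDeadlineOrThrow(0L, cause);   // starts the clock
        timeout.maybeInitDeadlineOrThrow(50L, cause);  // within budget, retry later
        try {
            timeout.maybeInitDeadlineOrThrow(101L, cause);
        } catch (IllegalStateException e) {
            System.out.println("gave up: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that between the first timeout and the deadline the task is still alive and will be retried, so something has to keep track of it in the meantime.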
   
   When Streams picks up restored active tasks, it removes them from the state 
updater's output queue of restored tasks. If a timeout occurs, the restored 
task is neither added to the task registry nor re-added to the state updater. 
The task is lost because nothing tracks it anymore, which also means it is 
never closed. Since the stream thread does not know about the lost task, it can 
create the same task again on the same stream thread. The state stores are then 
opened a second time and RocksDB throws the "No locks available" error.
   
   This commit re-adds the task to the state updater if the request for the 
committed offsets times out.
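
As a rough illustration of the fix, the sketch below models the hand-off with plain collections. All names here (`RestoredTaskHandoffSketch`, `Task`, `restoredTasks`, `stateUpdaterInput`, `taskRegistry`, `handleRestoredTasks`) are hypothetical and are not the Streams classes or methods; `java.util.concurrent.TimeoutException` stands in for the timeout raised while fetching committed offsets.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.TimeoutException;

// Hypothetical model of handing restored tasks over to the stream thread.
class RestoredTaskHandoffSketch {

    // Stand-in for a restored active task.
    static final class Task {
        final String id;
        private boolean timeOutOnce;

        Task(String id, boolean timeOutOnce) {
            this.id = id;
            this.timeOutOnce = timeOutOnce;
        }

        // Simulates fetching committed offsets, which may time out once.
        void transitionToRunning() throws TimeoutException {
            if (timeOutOnce) {
                timeOutOnce = false;
                throw new TimeoutException("fetching committed offsets timed out");
            }
        }
    }

    final Queue<Task> restoredTasks = new ArrayDeque<>();      // state updater output
    final Queue<Task> stateUpdaterInput = new ArrayDeque<>();  // tasks handed back
    final Map<String, Task> taskRegistry = new HashMap<>();    // running tasks

    void handleRestoredTasks() {
        Task task;
        // Polling removes the task from the output queue, so from here on
        // this method is the only place that still references it.
        while ((task = restoredTasks.poll()) != null) {
            try {
                task.transitionToRunning();
                taskRegistry.put(task.id, task);
            } catch (TimeoutException e) {
                // Without this line the task would be tracked nowhere and
                // would never be closed. Handing it back keeps it owned.
                stateUpdaterInput.add(task);
            }
        }
    }

    public static void main(String[] args) {
        RestoredTaskHandoffSketch sketch = new RestoredTaskHandoffSketch();
        sketch.restoredTasks.add(new Task("0_0", false));
        sketch.restoredTasks.add(new Task("0_1", true));
        sketch.handleRestoredTasks();
        System.out.println("running: " + sketch.taskRegistry.keySet());
        System.out.println("handed back: " + sketch.stateUpdaterInput.size());
    }
}
```

In this model every task drained from the output queue ends up either in the registry or back with the state updater, so a later attempt to create the same task never finds an unclosed copy still holding its state stores.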
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
