divijvaidya commented on PR #14242:
URL: https://github.com/apache/kafka/pull/14242#issuecomment-1779404361

   > If we ignore producer-state-flush failure here, recovery-point might be 
incremented even with stale on-disk producer state snapshot. So, in case of 
restart after power failure, the broker might restore stale producer state 
without rebuilding (since recovery point is incremented) which could cause 
idempotency issues.
   
   Great point. May I suggest that we document the consistency expectations 
between the producer snapshot and the segment on disk? From what you mentioned, 
it sounds like "Kafka expects the producer snapshot to be strongly consistent 
with the segment data on disk before the recovery checkpoint, but does not 
expect this after the checkpoint. The inconsistency after the checkpoint is 
acceptable because....blah blah"
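   
   To make the concern concrete, here is a minimal sketch of the ordering 
problem. It is not Kafka's actual code: `RecoveryPointSketch`, 
`SnapshotFlusher`, `flushStrict` and `flushQuietly` are hypothetical names 
used only for illustration, and the failing flusher simulates an fsync error. 
It assumes the recovery point should only advance once the snapshot flush has 
succeeded.

```java
import java.io.IOException;

public class RecoveryPointSketch {
    /** Hypothetical stand-in for flushing a producer state snapshot to disk. */
    public interface SnapshotFlusher {
        void flush(long offset) throws IOException;
    }

    private long recoveryPoint = 0L;

    public long recoveryPoint() {
        return recoveryPoint;
    }

    /** Strict variant: the recovery point moves only after a successful flush. */
    public void flushStrict(long offset, SnapshotFlusher flusher) throws IOException {
        flusher.flush(offset); // if this throws, recoveryPoint is left unchanged
        recoveryPoint = Math.max(recoveryPoint, offset);
    }

    /** Quiet variant: the failure is swallowed, yet the recovery point still moves. */
    public void flushQuietly(long offset, SnapshotFlusher flusher) {
        try {
            flusher.flush(offset);
        } catch (IOException e) {
            // Ignored: the on-disk snapshot may now be stale.
        }
        recoveryPoint = Math.max(recoveryPoint, offset);
    }

    public static void main(String[] args) {
        // Simulated flush failure (assumption for the example).
        SnapshotFlusher failing = offset -> {
            throw new IOException("simulated fsync failure");
        };

        RecoveryPointSketch strict = new RecoveryPointSketch();
        try {
            strict.flushStrict(100L, failing);
        } catch (IOException e) {
            // Caller sees the failure and can react.
        }
        System.out.println("strict recovery point: " + strict.recoveryPoint()); // 0

        RecoveryPointSketch quiet = new RecoveryPointSketch();
        quiet.flushQuietly(100L, failing);
        System.out.println("quiet recovery point: " + quiet.recoveryPoint()); // 100
    }
}
```

   In the quiet variant the recovery point reaches 100 even though nothing was 
flushed, which is the situation described in the quoted comment: after a 
restart, data below the recovery point would not be rebuilt.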
   
   We can then verify those expectations with experts such as Justine and Jun. 
Based on that, we can make a decision on quietly vs. async, etc. The 
documentation will also help future contributors reason about the code base. 
Initially, you can put the documentation in the description of this PR itself, 
and later we can find a home for it in the Kafka website docs. 
   
   We need to do the same exercise for the other files that you are changing 
in this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

