YutaLin opened a new pull request, #10218:
URL: https://github.com/apache/ozone/pull/10218

   ## What changes were proposed in this pull request?
   1. Move resetInFlightSnapshotCount() from notifyLeaderReady() to notifyNotLeader()
        The counter tracks requests for which preExecute() has run but
        validateAndUpdateCache() has not yet completed. When an OM steps down as
        leader, it should reset its counter because those tracked requests are no
        longer its responsibility. Previously, resetting in notifyLeaderReady() was
        incorrect: the new leader never ran preExecute() for the pending requests,
        so its counter should start at 0.
   2. Override handleRequestFailure() in OMSnapshotCreateRequest
        When a request failed after preExecute() (e.g., a PrepareState rejection),
        the counter was never decremented. This could cause the counter to grow
        without bound while the OM is in prepare mode for an upgrade.
   
   3. Add a safety-net check in validateAndUpdateCache()
        Call assertSnapshotLimitNotExceeded() as a hard guarantee that the snapshot
        limit is never exceeded, even if in-flight tracking has bugs during leader
        transitions or other edge cases.
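
The three changes above can be pictured with a small stand-alone model. This is only a sketch under assumptions, not the code in this PR: the class `InFlightSnapshotTracker`, its fields, and the clamping of the counter at zero are hypothetical, and only the method names (preExecute, validateAndUpdateCache, handleRequestFailure, notifyNotLeader) mirror the hooks named in the description.

```java
/**
 * Hypothetical stand-in for the in-flight snapshot accounting described above.
 * It is NOT the Ozone Manager implementation; names and fields are assumptions.
 */
final class InFlightSnapshotTracker {
  private final int maxSnapshots; // assumed configured snapshot limit
  private int inFlight;           // preExecute() ran, validateAndUpdateCache() not yet done
  private int committed;          // snapshots already recorded

  InFlightSnapshotTracker(int maxSnapshots) {
    this.maxSnapshots = maxSnapshots;
  }

  /** Leader-side admission: reserve a slot, or reject if the limit would be exceeded. */
  synchronized void preExecute() {
    if (committed + inFlight >= maxSnapshots) {
      throw new IllegalStateException("snapshot limit reached");
    }
    inFlight++;
  }

  /** Apply step: re-check the hard limit (change 3), then release the reserved slot. */
  synchronized void validateAndUpdateCache() {
    try {
      if (committed >= maxSnapshots) {
        throw new IllegalStateException("snapshot limit exceeded");
      }
      committed++;
    } finally {
      release();
    }
  }

  /** Change 2: a request rejected after preExecute() must give its slot back. */
  synchronized void handleRequestFailure() {
    release();
  }

  /** Change 1: a node that stops being leader drops its in-flight accounting. */
  synchronized void notifyNotLeader() {
    inFlight = 0;
  }

  synchronized int inFlightCount() {
    return inFlight;
  }

  synchronized int committedCount() {
    return committed;
  }

  // Clamp at zero (an assumption of this sketch): a node may apply requests
  // that it never pre-executed, so there may be nothing to release.
  private void release() {
    if (inFlight > 0) {
      inFlight--;
    }
  }
}
```

In this model, a request that is rejected after admission leaves the counter unchanged overall, a step-down clears any reservations, and the check in the apply step still rejects a snapshot beyond the limit even when the counter is zero.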
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-13357
   
   ## How was this patch tested?
   Added tests. CI run: https://github.com/YutaLin/ozone/actions/runs/25562315743
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

