errose28 opened a new pull request #2288:
URL: https://github.com/apache/ozone/pull/2288


   ## What changes were proposed in this pull request?
   
   Fix intermittent test failures related to Ozone Manager preparation, and 
include numerous prepare improvements:
   
   - A prepare index is now valid as long as it is less than or equal to the 
current log index.
       - This is the fix for the flaky test.
       - Previously, if OMs were prepared and the leader changed, Ratis would 
push a conf entry to the log, increasing the OMs' log indices. On restart, the 
log index would no longer match the prepare index, and the OM would restart 
unprepared.
   
       - As a side effect, there is no longer such a thing as a "stale" prepare 
marker.
           - Previously, a stale marker would exist on disk but be ignored, 
because its prepare index was smaller than the existing log index.
           - This required changes to handling disk structures on startup (with 
the --upgrade flag to cancel prepare), and on snapshot install.
   
   - Clarified logic for handling prepare marker file and DB during snapshot 
install or startup
       - There should be no possibility of DB divergence or unexpected prepare 
behavior when cancelling preparation with the startup flag (--upgrade) or via a 
Ratis request, unless only some OMs are restarted with the --upgrade flag.
           - This case is now added to the upgrade documentation.
           - If OMs are accidentally started this way, issuing a cancel prepare 
request will fix potential divergence.
   
   - Clarified purge log flow and prepare index.
       1. OM waits for the txn index in the DB to be at least that of its 
prepare request.
           - OM depends on the pre-append gate to guarantee that no write txns 
have entered after the prepare request is received.
           - If the index is larger, it is due to a snapshot or Ratis-specific 
txns.
       2. OM waits for the ratis state machine index to be at least the prepare 
request index + 1.
           - This indicates that Ratis has applied the commit entry for the 
prepare request.
       3. OM takes a snapshot of its database.
           - This moves the database's transaction index to the Ratis commit 
index (one ahead of the prepare request index), and guarantees that the 
prepare request has been added to the state machine as well.
           - If we did not wait for the Ratis commit index, it could be lost 
when we purge the logs, since all that is left after this operation is the 
DB.
               - If the commit index was lost on only some OMs, we could have 
state machine divergence.
       4. OM purges its logs, and ensures that the returned purge index is at 
least as large as the DB snapshot index.
           - If it is not, some logs for OM requests may remain, which could be 
applied across versions and violate the purpose of prepare.
   
   - Fail fast approach taken to preparation issues.
       - Previous logic attempted to recover from preparation issues like 
mismatched indices or faulty marker files, usually resulting in logged warnings 
and/or OMs subtly cancelling their prepare state on problems.
       - After improvements to prepare index management, we should be confident 
that the failure cases represent rare but serious issues in the system 
warranting failures and explicit messages.
   
   - Improved logging in light of debugging various prepare failures.
   
   - General refactoring aimed at making the system easier to understand.
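
The prepare index rule and the four-step purge flow described above can be sketched roughly as follows. This is an illustrative outline only, not the code in this PR: `PrepareSketch`, `OmState`, and every method name are hypothetical stand-ins for the OM's DB, Ratis state machine, and log, and the polling limits are arbitrary.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch; none of these names come from the Ozone code base. */
final class PrepareSketch {
  private static final int MAX_ATTEMPTS = 50;   // arbitrary limit for the sketch
  private static final long POLL_MILLIS = 10L;  // arbitrary poll interval

  private PrepareSketch() { }

  /** Hypothetical view of the OM state that the flow needs. */
  interface OmState {
    long dbTxnIndex();               // last txn index flushed to the DB
    long stateMachineIndex();        // last index applied by the state machine
    long takeSnapshot();             // snapshots the DB, returns the snapshot index
    long purgeLogs(long upToIndex);  // purges logs, returns the purge index
  }

  /** A prepare index is valid as long as it does not exceed the log index. */
  static boolean isPrepareIndexValid(long prepareIndex, long currentLogIndex) {
    return prepareIndex <= currentLogIndex;
  }

  /** Runs the four steps and fails fast if any expectation is not met. */
  static long prepare(OmState om, long prepareRequestIndex) {
    // Step 1: the DB must have caught up to the prepare request.
    waitUntil(() -> om.dbTxnIndex() >= prepareRequestIndex, "DB txn index");
    // Step 2: the commit entry for the prepare request must be applied.
    waitUntil(() -> om.stateMachineIndex() >= prepareRequestIndex + 1,
        "state machine index");
    // Step 3: snapshot the DB so the commit index survives the purge.
    long snapshotIndex = om.takeSnapshot();
    // Step 4: purge logs and verify nothing at or below the snapshot remains.
    long purgeIndex = om.purgeLogs(snapshotIndex);
    if (purgeIndex < snapshotIndex) {
      throw new IllegalStateException("Purge index " + purgeIndex
          + " is smaller than snapshot index " + snapshotIndex);
    }
    return purgeIndex;
  }

  /** Polls the condition and throws instead of silently giving up. */
  private static void waitUntil(BooleanSupplier condition, String what) {
    for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
      if (condition.getAsBoolean()) {
        return;
      }
      try {
        Thread.sleep(POLL_MILLIS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted waiting for " + what, e);
      }
    }
    throw new IllegalStateException("Timed out waiting for " + what);
  }
}
```

Throwing on a timeout or an undersized purge index, rather than logging a warning and continuing, mirrors the fail-fast approach listed above.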
   
   ## What is the link to the Apache JIRA
   
   HDDS-5109
   
   ## How was this patch tested?
   
   Existing unit and integration tests updated.
   




