[ 
https://issues.apache.org/jira/browse/HDDS-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-4227:
------------------------------------
    Target Version/s: 1.2.0

> Implement a "prepareForUpgrade" step that applies all committed transactions 
> onto the OM state machine.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-4227
>                 URL: https://issues.apache.org/jira/browse/HDDS-4227
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>          Components: Ozone Manager
>            Reporter: Aravindan Vijayan
>            Assignee: Aravindan Vijayan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.1.0
>
>
> *Why is this needed?*
> Through HDDS-4143, we have a generic factory to handle multiple versions of 
> apply transaction implementations based on layout version. Hence, this 
> factory can be used to handle versioned requests across layout versions, 
> whenever both the versions need to exist in the code (Let's say for 
> HDDS-2939). 
> However, it has been noticed that the OM Ratis requests are still undergoing 
> a lot of minor changes (HDDS-4007, HDDS-4007, HDDS-3903), and in these cases 
> it will become hard to maintain 2 versions of the code just to support clean 
> upgrades. 
> Hence, the plan is to build a pre-upgrade utility (client API) that makes 
> sure that an OM instance has no "un-applied" transactions in its Raft log. 
> Invoking this client API makes sure that the upgrade starts with a clean 
> state. Of course, this would be needed only in an HA setup. In a non-HA 
> setup, this can either be skipped, or when invoked will be a no-op 
> (non-Ratis) or cause no harm (single-node Ratis).
> *How does it work?*
> Before updating the software bits, our goal is to bring every OM to the 
> latest state with respect to apply transaction (AT). We want this so that 
> the same version of the code executes the AT step on all 3 OMs. At a high 
> level, the flow will be as follows.
> * Before upgrade, *stop* the OMs.
> * Start the OMs with a special flag --prepareUpgrade (similar to --init: a 
> special mode in which the ephemeral OM instance does some work and then 
> stops).
> * When OM is started with the --prepareUpgrade flag, it does not start the 
> RPC server, so no new requests can get in.
> * In this state, we give every OM time to apply txns up to the last txn.
> * We know that at least 2 OMs would have gotten the last client request 
> transaction committed into their log. Hence, those 2 OMs are expected to 
> apply transaction to that index faster.
> * At every OM, the Raft log will be purged after this wait period (so that 
> the replay does not happen), and a Ratis snapshot will be taken at the last 
> txn.
> * Even if there is a lagging OM which is unable to get to the last applied 
> txn index, its logs will be purged after the wait time expires.
> * Now when the OMs are started with the newer version, all the OMs will 
> start using the new code.
> * The lagging OM will get the new Ratis snapshot since there are no logs to 
> replay from.
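
The flow above can be illustrated with a small toy simulation. This is only a
sketch under stated assumptions, not the Ozone or Ratis implementation: the
names ToyOM, prepare_for_upgrade, install_snapshot_from and prepare_cluster are
hypothetical, the wait period is modeled as a count of transactions an OM
manages to apply, and taking the snapshot before purging the log is an
assumption made for the illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyOM:
    """Toy stand-in for one Ozone Manager (hypothetical, not the real OM)."""
    name: str
    log: List[int] = field(default_factory=list)  # committed txn indexes still in the log
    last_applied: int = 0                         # highest txn index applied so far
    snapshot_index: Optional[int] = None          # txn index of the latest snapshot

    def prepare_for_upgrade(self, apply_budget: int) -> None:
        """Apply what fits in the budget, snapshot, then purge the log."""
        pending = [i for i in self.log if i > self.last_applied]
        # The budget stands in for the wait period: a lagging OM may not finish.
        for index in pending[:apply_budget]:
            self.last_applied = index
        # Assumption: snapshot at whatever index this OM reached, then purge,
        # so nothing is left to replay when the new version starts.
        self.snapshot_index = self.last_applied
        self.log.clear()

    def install_snapshot_from(self, peer: "ToyOM") -> None:
        """Catch up from a peer's snapshot because there is no log to replay."""
        self.last_applied = peer.snapshot_index
        self.snapshot_index = peer.snapshot_index


def prepare_cluster(oms: List[ToyOM], apply_budget: int) -> int:
    """Run the prepare step on every OM, then let lagging OMs catch up."""
    for om in oms:
        om.prepare_for_upgrade(apply_budget)
    # Models the restart on the newer version: the most advanced snapshot wins.
    newest = max(oms, key=lambda om: om.snapshot_index)
    for om in oms:
        if om.snapshot_index < newest.snapshot_index:
            om.install_snapshot_from(newest)
    return newest.snapshot_index
```

For example, with two OMs that hold txns 1..10 in their logs and a third that
only received txns 1..7 and has applied up to index 2, calling
prepare_cluster with a budget of 3 leaves every log empty and every OM at the
same applied index, with the third OM catching up through the snapshot rather
than through log replay.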



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
