[ 
https://issues.apache.org/jira/browse/HDDS-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368592#comment-17368592
 ] 

Bharat Viswanadham commented on HDDS-5338:
------------------------------------------

We need to download the checkpoint when converting a non-HA, Ratis-based 
cluster to an HA-enabled cluster, for example when we add 2 more nodes to make 
it HA. In this case, the old single-node OM is first converted to 
Ratis-enabled, and then, when we add 2 more nodes, only the older one can 
become leader, so we can download the checkpoint from it.

{quote}Let's say there are 3 existing OMs - om1, om2 and om3. om1 is network 
partitioned from the other 2 and assumes itself to be the leader. We try to 
bootstrap a new OM om4 and it contacts om1 first and downloads a checkpoint 
from it (since om1 replies that it is the leader). But since om1 was network 
partitioned, it does not have the correct DB snapshot. After this, om4 contacts 
the OM ring again to do a SetConfiguration. This request now goes to the 
correct leader OM - om2. om2 assumes that the bootstrapping OM has already got 
the non-ratis transactions through the DB checkpoint and sends it only the 
ratis logs. This will lead to inconsistent state in om4.{quote} 

If the cluster has more than 1 node, that means it is already a Ratis-enabled 
cluster, so why do we need to download a checkpoint at all in this scenario?

> Handle Bootstrap when original OM has non-ratis transactions
> ------------------------------------------------------------
>
>                 Key: HDDS-5338
>                 URL: https://issues.apache.org/jira/browse/HDDS-5338
>             Project: Apache Ozone
>          Issue Type: Sub-task
>    Affects Versions: 1.2.0
>            Reporter: Hanisha Koneru
>            Assignee: Hanisha Koneru
>            Priority: Major
>
> When a non-Ratis OM is converted to a Ratis-enabled OM, there could be 
> transactions in the RocksDB which are not part of the Ratis logs. If the 
> Ratis logs are not purged when a new OM is bootstrapped, it will just get all 
> the Ratis logs from the old OM. The non-Ratis transactions in the RocksDB 
> will not be transferred to the new OM, as Ratis will not know that there are 
> transactions in the DB not present in the logs. 
> So when a new OM is bootstrapping, we should check the DB for non-Ratis 
> transactions and, if any are present, the new OM should download the DB from 
> an existing OM before the setConf request is sent out.
> Thanks [~bharat] for identifying this scenario 
> [here|https://github.com/apache/ozone/pull/1494#issuecomment-859329558] .
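
The ordering proposed in the issue description above (check for non-Ratis 
transactions, download the DB if any exist, then send setConf) could be 
sketched roughly as follows. This is only an illustration and not Ozone code: 
the class BootstrapPlanSketch, the Step enum and the boolean input are all 
hypothetical names, and how the existing OM's DB is actually inspected for 
non-Ratis transactions is assumed rather than shown.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch, not Ozone code: decides which steps a bootstrapping
 * OM takes before it asks to join the Ratis ring.
 */
public class BootstrapPlanSketch {

  /** Steps a new OM may take, in order. Names are illustrative only. */
  enum Step { DOWNLOAD_DB_CHECKPOINT, SEND_SET_CONF }

  /**
   * @param existingOmHasNonRatisTxns assumed result of checking the existing
   *        OM's RocksDB for transactions that are absent from the Ratis logs
   * @return the ordered list of steps for the bootstrapping OM
   */
  static List<Step> plan(boolean existingOmHasNonRatisTxns) {
    List<Step> steps = new ArrayList<>();
    if (existingOmHasNonRatisTxns) {
      // Replaying Ratis logs alone would miss these transactions, so the DB
      // is copied before the membership change is requested.
      steps.add(Step.DOWNLOAD_DB_CHECKPOINT);
    }
    // The setConf request always comes last.
    steps.add(Step.SEND_SET_CONF);
    return steps;
  }

  public static void main(String[] args) {
    System.out.println(plan(true));
    System.out.println(plan(false));
  }
}
```

The sketch only captures the ordering constraint. It does not address which 
OM the checkpoint should be downloaded from, which is the concern raised in 
the quoted network-partition scenario.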


