errose28 commented on code in PR #3841:
URL: https://github.com/apache/ozone/pull/3841#discussion_r1003901404


##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/upgrade/TestDatanodeUpgradeToScmHA.java:
##########
@@ -103,6 +104,8 @@ public TestDatanodeUpgradeToScmHA(boolean scmHAAlreadyEnabled) {
     this.scmHAAlreadyEnabled = scmHAAlreadyEnabled;
     conf = new OzoneConfiguration();
     conf.setBoolean(ScmConfigKeys.OZONE_SCM_HA_ENABLE_KEY, scmHAAlreadyEnabled);
+    // DATANODE_SCHEMA_V3 has higher feature version than SCM_HA
+    conf.setBoolean(DatanodeConfiguration.CONTAINER_SCHEMA_V3_ENABLED, false);

Review Comment:
   > But in this test, it only upgrades like from 1.0.0 to 1.1.0, only the 
SCM-HA related finalization action is executed,
   
   Finalization always moves the component from its current layout version to 
the latest one. There is no way to finalize only up to the SCM HA layout 
feature, so the DN is also finalized for Schema V3 and the Schema V3 upgrade 
action runs. See the test logs:
   ```log
   2022-10-24 16:49:59,462 [Listener at localhost/61869] INFO  upgrade.DatanodeSchemaV3FinalizeAction (DatanodeSchemaV3FinalizeAction.java:execute(50)) - Upgrading Datanode volume layout for Schema V3 support.
   ```
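
To make the "always to the latest version" point concrete, here is a minimal sketch of the behavior. It is not the actual upgrade framework code: `FinalizeSketch`, the `Feature` enum, its version numbers, and `finalizeFrom` are all hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of component finalization; not Ozone's real classes. */
final class FinalizeSketch {

  /** Illustrative layout features, ordered by an assumed layout version. */
  enum Feature {
    INITIAL(0), SCM_HA(1), SCHEMA_V3(2);

    final int version;

    Feature(int version) {
      this.version = version;
    }
  }

  /**
   * Returns every feature whose upgrade action would run when finalizing
   * from currentVersion. There is no "stop at" argument: finalization
   * always continues to the latest feature.
   */
  static List<Feature> finalizeFrom(int currentVersion) {
    List<Feature> executed = new ArrayList<>();
    for (Feature f : Feature.values()) {
      if (f.version > currentVersion) {
        executed.add(f);
      }
    }
    return executed;
  }

  public static void main(String[] args) {
    // Starting before SCM HA, both later actions run, including Schema V3.
    System.out.println(finalizeFrom(Feature.INITIAL.version));
  }
}
```

Under this model, a test that starts before SCM HA cannot avoid the Schema V3 action unless the feature is disabled by config, which is what the change in this diff does.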
   
   I tried this test out locally and the failure is in 
`TestDatanodeUpgradeToScmHA#testFailedVolumeDuringFinalization`. In this test, 
a volume is failed while the datanode is finalizing, so its disk layout does 
not get upgraded. When the volume is restored and the DN is restarted, it looks 
like the volume was never formatted for Schema V3 because it was unavailable 
during finalization, and that causes the error.
   
   Unfortunately the upgrade framework currently tracks the layout version only 
for a whole component, not per volume. This means that if a volume is 
unavailable while the DN is finalized, startup logic must be added by hand to 
make sure the volume gets formatted if it is later restored. There is a [Jira to improve 
this](https://issues.apache.org/jira/browse/HDDS-5475), but for now layout 
features that change the volume format need to handle this case themselves. For 
the SCM HA change from SCM ID to cluster ID, this is done by calling 
`VersionedDatanodeFeatures.ScmHA.upgradeVolumeIfNeeded(volume, clusterId)` in 
`StorageVolumeUtil#checkVolume`. It looks like Schema V3 may need a similar 
hook to create the RocksDB on startup, after the cluster ID has been obtained 
and the volume has been formatted with it.
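
As a rough illustration of the kind of startup check I mean, here is a self-contained sketch. It does not use the real Ozone classes: `VolumeUpgradeSketch`, `Volume`, `finalizeSchemaV3`, and this `upgradeVolumeIfNeeded` signature are hypothetical stand-ins, and the boolean `hasSchemaV3Db` stands in for the per-volume RocksDB actually existing on disk.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of per-volume upgrade on startup; not Ozone's API. */
final class VolumeUpgradeSketch {

  /** Stand-in for a datanode storage volume. */
  static final class Volume {
    final String name;
    boolean failed;        // volume is currently unavailable
    boolean hasSchemaV3Db; // stands in for the per-volume RocksDB existing

    Volume(String name) {
      this.name = name;
    }
  }

  /** Finalization only reaches volumes that are healthy at that moment. */
  static void finalizeSchemaV3(List<Volume> volumes) {
    for (Volume v : volumes) {
      if (!v.failed) {
        v.hasSchemaV3Db = true;
      }
    }
  }

  /**
   * Startup check for a single volume. If the component is already finalized
   * for Schema V3 but this volume missed finalization, format it now.
   * Returns true if the volume had to be upgraded.
   */
  static boolean upgradeVolumeIfNeeded(Volume v, boolean schemaV3Finalized) {
    if (schemaV3Finalized && !v.failed && !v.hasSchemaV3Db) {
      v.hasSchemaV3Db = true;
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    Volume healthy = new Volume("vol1");
    Volume flaky = new Volume("vol2");
    flaky.failed = true; // fails during finalization

    List<Volume> volumes = new ArrayList<>();
    volumes.add(healthy);
    volumes.add(flaky);
    finalizeSchemaV3(volumes);

    flaky.failed = false; // volume restored, DN restarted
    System.out.println("upgraded on startup: "
        + upgradeVolumeIfNeeded(flaky, true));
  }
}
```

The check is idempotent, so it can run for every volume on every startup: volumes that were healthy during finalization are skipped, and only a restored volume gets formatted.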



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

