errose28 commented on code in PR #3841:
URL: https://github.com/apache/ozone/pull/3841#discussion_r1003901404
##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/upgrade/TestDatanodeUpgradeToScmHA.java:
##########
@@ -103,6 +104,8 @@ public TestDatanodeUpgradeToScmHA(boolean
scmHAAlreadyEnabled) {
this.scmHAAlreadyEnabled = scmHAAlreadyEnabled;
conf = new OzoneConfiguration();
conf.setBoolean(ScmConfigKeys.OZONE_SCM_HA_ENABLE_KEY,
scmHAAlreadyEnabled);
+ // DATANODE_SCHEMA_V3 has higher feature version than SCM_HA
+ conf.setBoolean(DatanodeConfiguration.CONTAINER_SCHEMA_V3_ENABLED, false);
Review Comment:
> But in this test, it only upgrades like from 1.0.0 to 1.1.0, only the
SCM-HA related finalization action is executed,

Finalization always moves a component from its current layout version to the
latest one. There is no way to finalize only up to the SCM HA layout feature,
so the DN is also finalized for Schema V3 and the Schema V3 upgrade action
runs. See the test logs:
```log
2022-10-24 16:49:59,462 [Listener at localhost/61869] INFO
upgrade.DatanodeSchemaV3FinalizeAction
(DatanodeSchemaV3FinalizeAction.java:execute(50)) - Upgrading Datanode volume
layout for Schema V3 support.
```
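To illustrate why the Schema V3 action shows up in that log, here is a minimal sketch of "finalize always runs to the latest version". It is a simplified model, not the real upgrade framework: the `FinalizationSketch` class, the `finalizeFrom` method, and the version numbers are all assumptions made for illustration. Only the relative order of `SCM_HA` and `DATANODE_SCHEMA_V3` comes from the comment in the diff above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of layout finalization.
// This is NOT the real Ozone upgrade framework.
final class FinalizationSketch {

  // Illustrative features ordered by layout version.
  // The numeric versions are assumptions, not Ozone's real values.
  enum Feature {
    INITIAL_VERSION(0),
    SCM_HA(1),
    DATANODE_SCHEMA_V3(2);

    final int layoutVersion;

    Feature(int layoutVersion) {
      this.layoutVersion = layoutVersion;
    }
  }

  // Returns every feature whose action would run when finalizing from
  // currentVersion. There is no "target version" parameter: the loop
  // always continues to the last feature.
  static List<Feature> finalizeFrom(int currentVersion) {
    List<Feature> executed = new ArrayList<>();
    for (Feature feature : Feature.values()) {
      if (feature.layoutVersion > currentVersion) {
        executed.add(feature);
      }
    }
    return executed;
  }

  public static void main(String[] args) {
    // A datanode at the initial version runs both actions.
    System.out.println(finalizeFrom(0)); // prints [SCM_HA, DATANODE_SCHEMA_V3]
  }
}
```

In this model a test that only intends to exercise SCM HA still triggers the later feature's action, which matches what the log shows.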
I ran this test locally, and the failure is in
`TestDatanodeUpgradeToScmHA#testFailedVolumeDuringFinalization`. In this test,
a volume is failed while the datanode is finalizing, so its disk layout does
not get upgraded. When the volume is restored and the DN is restarted, it looks
like the volume is still not formatted for Schema V3 because it was offline
during finalization, which causes the error.
Unfortunately, the upgrade framework currently tracks the layout version only
for a whole component, not per volume. This means that if a volume is offline
while the DN is finalized, startup logic must be added by hand to make sure
the volume gets formatted if it is restored. There is a [jira to improve
this](https://issues.apache.org/jira/browse/HDDS-5475), but for now layout
features that change the volume format need to handle this case manually. For
the SCM HA SCM ID to cluster ID change, this is done by calling
`VersionedDatanodeFeatures.ScmHA.upgradeVolumeIfNeeded(volume, clusterId)` in
`StorageVolumeUtil#checkVolume`. It looks like Schema V3 may need a similar
hook to create the RocksDB on startup, after the cluster ID has been obtained
and the volume formatted with it.
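
A rough sketch of the kind of startup check I mean, assuming a per-volume "formatted" flag. Everything here is hypothetical: `VolumeStartupSketch`, its `Volume` class, and this `upgradeVolumeIfNeeded` signature are stand-ins for illustration and are not the existing Ozone classes or the SCM HA method mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; none of these are real Ozone types.
final class VolumeStartupSketch {

  static final class Volume {
    final String name;
    boolean failed;
    boolean schemaV3Formatted;

    Volume(String name) {
      this.name = name;
    }
  }

  // Models finalization: only healthy volumes get the new format, while the
  // component as a whole is still marked finalized.
  static void finalizeSchemaV3(List<Volume> volumes) {
    for (Volume volume : volumes) {
      if (!volume.failed) {
        volume.schemaV3Formatted = true;
      }
    }
  }

  // Startup check: if the component is finalized but this volume missed the
  // format change, apply it now. Returns true if the volume was upgraded.
  static boolean upgradeVolumeIfNeeded(Volume volume, boolean finalized) {
    if (volume.failed || !finalized || volume.schemaV3Formatted) {
      return false;
    }
    volume.schemaV3Formatted = true;
    return true;
  }

  public static void main(String[] args) {
    List<Volume> volumes = new ArrayList<>();
    Volume healthy = new Volume("vol1");
    Volume flaky = new Volume("vol2");
    volumes.add(healthy);
    volumes.add(flaky);

    flaky.failed = true;        // volume is out during finalization
    finalizeSchemaV3(volumes);
    flaky.failed = false;       // volume restored before restart

    // On restart, the check catches the volume that was skipped.
    System.out.println(upgradeVolumeIfNeeded(flaky, true));   // prints true
    System.out.println(upgradeVolumeIfNeeded(healthy, true)); // prints false
  }
}
```

The check is idempotent, so it would be safe to run it for every volume on every startup.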
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]