[
https://issues.apache.org/jira/browse/HDDS-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612925#comment-17612925
]
Navin Kumar commented on HDDS-7088:
-----------------------------------
Thanks [~szetszwo] for pointing out the related Jira HDDS-7103.
I reproduced the issue by setting both the OM and SCM Ratis storage
directories to the same path, e.g. "/var/lib/hadoop-ozone/ratis/shared".
Only one Raft group directory is visible under that path; the other one gets
overwritten by whichever service (OM or SCM) was started later (or by the
auto-format behaviour).
{code:java}
[root@c4694-node2 hadoop-ozone]# cd /var/lib/hadoop-ozone/ratis/shared
[root@c4694-node2 shared]# ls -lart
total 0
drwxr-xr-x 3 root root 28 Oct 2 14:18 ..
drwxr-xr-x 3 hdfs hdfs 50 Oct 2 14:18 .
drwxr-xr-x 4 hdfs hdfs 66 Oct 4 06:12 ad32d39d-9fd8-48ab-abcb-d0d1e913367f
[root@c4694-node2 shared]# cd ad32d39d-9fd8-48ab-abcb-d0d1e913367f
[root@c4694-node2 ad32d39d-9fd8-48ab-abcb-d0d1e913367f]# ls -lart
total 4
drwxr-xr-x 3 hdfs hdfs 50 Oct 2 14:18 ..
drwxr-xr-x 2 hdfs hdfs 6 Oct 2 14:18 sm
drwxr-xr-x 2 hdfs hdfs 31 Oct 2 14:18 current
drwxr-xr-x 4 hdfs hdfs 66 Oct 4 06:12 .
-rw-r--r-- 1 hdfs hdfs 38 Oct 4 06:12 in_use.lock
{code}
After initial analysis I thought this scenario could be handled by appending
"SCM" or "HA" to the path configured as the shared directory (the local
directory where the Ratis log is stored). But I later learned this will not
help, since the group IDs are already different.
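To illustrate the group ID point: the ID in the reported error
(7f6848f2-63c2-3ce4-a1d2-9e88065dfcc3) is a version 3, name-based UUID, while
the directory in the listing above (ad32d39d-9fd8-48ab-abcb-d0d1e913367f) is a
version 4, random UUID. The sketch below assumes the expected ID is derived
with java.util.UUID.nameUUIDFromBytes over the UTF-8 bytes of the service ID.
That derivation, the class name GroupDirCheck and both helper methods are
assumptions for illustration only, not the actual OM code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

// Hypothetical helper, not part of Ozone: shows a check that looks only for
// the expected group directory instead of treating any directory as its own.
class GroupDirCheck {

    // Assumption: the expected group ID is a name-based (version 3) UUID
    // computed from the UTF-8 bytes of the service ID.
    static UUID expectedGroupId(String serviceId) {
        return UUID.nameUUIDFromBytes(serviceId.getBytes(StandardCharsets.UTF_8));
    }

    // Returns true if one of the directory names equals the expected group ID.
    // Any other entries (for example another service's group) are ignored.
    static boolean containsExpectedGroup(List<String> dirNames, String serviceId) {
        return dirNames.contains(expectedGroupId(serviceId).toString());
    }

    public static void main(String[] args) {
        // "ozone1" is the service ID from the reported error message.
        UUID expected = expectedGroupId("ozone1");
        System.out.println("UUID version: " + expected.version());

        // Directory name taken from the listing in this comment.
        List<String> onDisk = Arrays.asList("ad32d39d-9fd8-48ab-abcb-d0d1e913367f");
        System.out.println("expected group present: "
            + containsExpectedGroup(onDisk, "ozone1"));
    }
}
```

A check of this shape would not fail merely because a directory belonging to
another group is present, but it does not address the overwrite seen above.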
Also, how would we handle the upgrade scenario, where the path before and
after the upgrade would be different?
Any suggestions on how to fix this?
> OM incorrectly detects SCM Ratis Group ID when OM and SCM are colocated with
> same Ratis storage directory
> ---------------------------------------------------------------------------------------------------------
>
> Key: HDDS-7088
> URL: https://issues.apache.org/jira/browse/HDDS-7088
> Project: Apache Ozone
> Issue Type: Bug
> Components: OM HA, SCM HA
> Affects Versions: 1.2.1
> Reporter: Ethan Rose
> Assignee: Navin Kumar
> Priority: Major
>
> When OM and SCM are colocated and ozone.om.ratis.storage.dir and
> ozone.scm.ha.ratis.storage.dir are set to the same directory, the OM will
> incorrectly detect SCM's Ratis group ID and think that there was an erroneous
> change to the OM service ID.
> {code}
> 2022-07-21 12:57:24,691 ERROR
> org.apache.hadoop.ozone.om.OzoneManagerStarter: OM start failed with exception
> java.io.IOException: Ratis group Dir on disk scm does not match with
> RaftGroupID7f6848f2-63c2-3ce4-a1d2-9e88065dfcc3 generated from service id
> ozone1. Looks like there is a change to ozone.om.service.ids value after the
> cluster is setup. Currently change to this value is not supported.
> {code}
> This Jira is to support setting the two configurations to the same directory
> when OM and SCM are colocated while still maintaining the OM service ID check.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)