[ 
https://issues.apache.org/jira/browse/HDDS-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612925#comment-17612925
 ] 

Navin Kumar commented on HDDS-7088:
-----------------------------------

Thanks [~szetszwo] for pointing out the related Jira HDDS-7103.

While reproducing the issue by setting both the OM and SCM Ratis storage 
directories to the same path, for example 
"/var/lib/hadoop-ozone/ratis/shared", I see only one Raft group directory 
under that path. The other one gets overwritten by whichever service (OM or 
SCM) was started later (or by the auto-format behaviour).

 
{code:java}
[root@c4694-node2 hadoop-ozone]# cd /var/lib/hadoop-ozone/ratis/shared
[root@c4694-node2 shared]# ls -lart
total 0
drwxr-xr-x 3 root root 28 Oct  2 14:18 ..
drwxr-xr-x 3 hdfs hdfs 50 Oct  2 14:18 .
drwxr-xr-x 4 hdfs hdfs 66 Oct  4 06:12 ad32d39d-9fd8-48ab-abcb-d0d1e913367f
[root@c4694-node2 shared]# cd ad32d39d-9fd8-48ab-abcb-d0d1e913367f
[root@c4694-node2 ad32d39d-9fd8-48ab-abcb-d0d1e913367f]# ls -lart
total 4
drwxr-xr-x 3 hdfs hdfs 50 Oct  2 14:18 ..
drwxr-xr-x 2 hdfs hdfs  6 Oct  2 14:18 sm
drwxr-xr-x 2 hdfs hdfs 31 Oct  2 14:18 current
drwxr-xr-x 4 hdfs hdfs 66 Oct  4 06:12 .
-rw-r--r-- 1 hdfs hdfs 38 Oct  4 06:12 in_use.lock
{code}
After initial analysis I thought this scenario could be handled by appending 
"SCM" or "HA" to the path configured as the shared directory (the local 
directory where the Ratis log is stored). But I later learned that this will 
not help, since the group IDs are already different.
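
For reference, the comparison behind the error quoted in the issue description can be modeled roughly as below. This is only a minimal sketch, not Ozone's actual code: it assumes the OM group ID is a name-based UUID derived from the service ID (the version-3 UUID in the logged error is consistent with that), and the class name, method names, and the scan over sub-directories are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class GroupDirCheckSketch {

  // Assumption: the group ID is a name-based (version 3) UUID of the service ID.
  static UUID expectedGroupId(String serviceId) {
    return UUID.nameUUIDFromBytes(serviceId.getBytes(StandardCharsets.UTF_8));
  }

  // Returns the names of sub-directories that do not match the expected group ID.
  static List<String> unexpectedGroupDirs(Path storageDir, String serviceId)
      throws IOException {
    String expected = expectedGroupId(serviceId).toString();
    List<String> unexpected = new ArrayList<>();
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(storageDir)) {
      for (Path entry : entries) {
        String name = entry.getFileName().toString();
        if (Files.isDirectory(entry) && !name.equals(expected)) {
          unexpected.add(name);
        }
      }
    }
    return unexpected;
  }

  public static void main(String[] args) throws IOException {
    // Temporary directory standing in for the shared Ratis storage path.
    Path shared = Files.createTempDirectory("ratis-shared");
    // A directory created by the other colocated service (name is illustrative).
    Files.createDirectory(shared.resolve("ad32d39d-9fd8-48ab-abcb-d0d1e913367f"));
    // The directory OM itself would expect for service ID "ozone1".
    Files.createDirectory(shared.resolve(expectedGroupId("ozone1").toString()));
    // Prints the directories that a strict per-service check would reject.
    System.out.println(unexpectedGroupDirs(shared, "ozone1"));
  }
}
```

With both services writing under the same path, a directory created by the other service is reported as unexpected from OM's point of view, even though OM's own group directory is present.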

 

Also, how would we handle the upgrade scenario, where the path before and 
after the upgrade would be different?

Any suggestions on how to fix this?

> OM incorrectly detects SCM Ratis Group ID when OM and SCM are colocated with 
> same Ratis storage directory
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-7088
>                 URL: https://issues.apache.org/jira/browse/HDDS-7088
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: OM HA, SCM HA
>    Affects Versions: 1.2.1
>            Reporter: Ethan Rose
>            Assignee: Navin Kumar
>            Priority: Major
>
> When OM and SCM are colocated and ozone.om.ratis.storage.dir and 
> ozone.scm.ha.ratis.storage.dir are set to the same directory, the OM will 
> incorrectly detect SCM's Ratis group ID and think that there was an erroneous 
> change to the OM service ID. 
> {code}
> 2022-07-21 12:57:24,691 ERROR 
> org.apache.hadoop.ozone.om.OzoneManagerStarter: OM start failed with exception
> java.io.IOException: Ratis group Dir on disk scm does not match with 
> RaftGroupID7f6848f2-63c2-3ce4-a1d2-9e88065dfcc3 generated from service id 
> ozone1. Looks like there is a change to ozone.om.service.ids value after the 
> cluster is setup. Currently change to this value is not supported.
> {code}
> This Jira is to support setting the two configurations to the same directory 
> when OM and SCM are colocated while still maintaining the OM service ID check.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
