[
https://issues.apache.org/jira/browse/HDDS-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17579827#comment-17579827
]
Tsz-wo Sze commented on HDDS-7103:
----------------------------------
[~NeilJoshi], [~erose], after much thought, a server should never create a new
directory unless RaftStorage.StartupOption.FORMAT is specified. Otherwise, it
can cause data loss, since the Raft log entries already committed on that
server are (temporarily) lost. Consider the following scenario:
# Server A and Server B were fast and had commit index 100 while server C had
commit index 90.
# The disk of Server A was bad and it was restarted with a new directory.
Then, its commit index was reset and became -1.
# Server C started a leader election and Server A voted for it. The commit
index became 90; the commits from index 91 to index 100 were lost.
Note that there was only one failure in the example above. Note also that, had
Server A not been restarted with a new directory, its commit index would still
have been 100 and it would not have voted for Server C.
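The scenario above can be sketched as a small simulation. This is only an
illustration under simplifying assumptions: the Peer class and the
winsElection method are hypothetical and not part of Ratis, terms are ignored,
and a peer grants its vote whenever the candidate's index is at least its own.

```java
import java.util.List;

public class LostCommitSketch {
    /** Hypothetical stand-in for a Raft peer; not a Ratis class. */
    static final class Peer {
        final String name;
        final long index;

        Peer(String name, long index) {
            this.name = name;
            this.index = index;
        }

        // Simplified vote rule (an assumption of this sketch): grant the vote
        // only if the candidate's index is at least our own. Terms are ignored.
        boolean grantsVoteTo(Peer candidate) {
            return candidate.index >= this.index;
        }
    }

    /** Returns true if the candidate collects votes from a majority of the group. */
    static boolean winsElection(Peer candidate, List<Peer> group) {
        long votes = group.stream()
            .filter(p -> p.grantsVoteTo(candidate))
            .count();
        return votes > group.size() / 2;
    }

    public static void main(String[] args) {
        Peer b = new Peer("B", 100);
        Peer c = new Peer("C", 90);

        // Server A kept its directory: index 100, so it refuses to vote for C.
        Peer aIntact = new Peer("A", 100);
        System.out.println("C wins with A intact: "
            + winsElection(c, List.of(aIntact, b, c)));

        // Server A restarted with a new directory: index -1, so it votes for C.
        Peer aReformatted = new Peer("A", -1);
        System.out.println("C wins with A reformatted: "
            + winsElection(c, List.of(aReformatted, b, c)));
    }
}
```

In this model, Server C can win only in the second case, which is when the
entries from index 91 to index 100 are at risk.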
Ozone, like any other application, should start with
RaftStorage.StartupOption.FORMAT only the first time. On every restart, it
should use RaftStorage.StartupOption.RECOVER.
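One possible way for an application to make that choice is sketched below.
Everything here is an assumption made for illustration: the StartupOption enum
is a local stand-in that only mirrors the names in
RaftStorage.StartupOption, and the marker file kept outside the Raft storage
directory is a hypothetical mechanism, not something Ozone or Ratis is known
to provide. The point of the sketch is that the decision must not depend on
whether the Raft directory itself exists, since a replaced disk would then
look like a first start.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class StartupOptionSketch {
    /** Local stand-in mirroring the names in RaftStorage.StartupOption; not the Ratis enum. */
    enum StartupOption { FORMAT, RECOVER }

    /**
     * Hypothetical policy: the marker lives outside the Raft storage directory
     * and records that the group was formatted once. If it exists, this is a
     * restart and only RECOVER is allowed.
     */
    static StartupOption choose(Path initializedMarker) {
        return Files.exists(initializedMarker)
            ? StartupOption.RECOVER
            : StartupOption.FORMAT;
    }

    /** Records that the first start has completed; safe to call more than once. */
    static void markInitialized(Path initializedMarker) throws IOException {
        if (Files.notExists(initializedMarker)) {
            Files.createFile(initializedMarker);
        }
    }

    public static void main(String[] args) throws IOException {
        Path metaDir = Files.createTempDirectory("app-meta");
        Path marker = metaDir.resolve("raft-group.initialized");

        System.out.println("first start: " + choose(marker));
        markInitialized(marker);
        System.out.println("restart: " + choose(marker));
    }
}
```

With such a policy, a restart whose Raft directory is missing would fail under
RECOVER instead of silently creating an empty directory.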
> Ratis log storage directories unchecked causing unhandled exception on
> datanode restart
> ---------------------------------------------------------------------------------------
>
> Key: HDDS-7103
> URL: https://issues.apache.org/jira/browse/HDDS-7103
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Neil Joshi
> Priority: Major
>
> When the Ratis storage logs are configured to be on multiple disks and a
> corruption causes the same directory to be found on each disk, Ratis throws
> an unhandled exception. The unhandled exception prevents the datanode from
> creating pipelines. The datanode remains up, and the user can detect the
> failure only through the datanode logs.
> The error can be seen in an Ozone cluster with the configuration property
> _*dfs.container.ratis.datanode.storage.dir*_ set to two volume locations, i.e.
> _dn1,dn2_, with the same directory present on both disks. On datanode start,
> the error is logged when bringing up the XceiverServerRatis.
> Snippet of logged error:
> {code:java}
> ozone-datanode-1 | 2022-08-03 22:05:54 INFO XceiverServerRatis:481 -
> Starting XceiverServerRatis feb90744-e0e7-4b2e-8d57-02213ce29693
> ozone-datanode-1 | 2022-08-03 22:05:54 WARN EndpointStateMachine:236 -
> Unable to communicate to SCM server at scm:9861 for past 0 seconds.
> ozone-datanode-1 | java.io.IOException: More than one directories found for
> 01a173a0-6bd2-478a-8598-05df3a6f318a:
> [/mydata/dn1/01a173a0-6bd2-478a-8598-05df3a6f318a,
> /mydata/dn2/01a173a0-6bd2-478a-8598-05df3a6f318a]
> ozone-datanode-1 | at
> org.apache.ratis.server.impl.ServerState.chooseStorageDir(ServerState.java:177)
> ozone-datanode-1 | at
> org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:113)
> ozone-datanode-1 | at
> org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:201){code}
> This jira is filed to track the issue and to resolve it. This issue had been
> identified and discussed in a previous PR for the hdds volume diskchecker, PR
> #2158, https://github.com/apache/ozone/pull/2158#issuecomment-836580999.
> The idea from the PR was to omit the directories with the problem and
> continue. This could be done either
> i.) with a checker that runs prior to the XceiverServerRatis; if this already
> exists in the current Ozone, how should it be configured to resolve this issue?
> ii.) by modifying the Ratis code to remove the affected directories and
> continue instead of throwing an unhandled IOException, see
> https://github.com/apache/ratis/blob/040bc52e19a5e36f5710ccd4fc1981e862e691e8/ratis-server/src/main/java/org/apache/ratis/server/impl/ServerState.java#L107-L117.