[ https://issues.apache.org/jira/browse/HDDS-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17650491#comment-17650491 ]
mingchao zhao commented on HDDS-2696:
-------------------------------------
Ozone 1.3.0 has been released, and more than 600 open issues are still
targeted for 1.3.0. I am moving the Target Version field to 1.4.0.
If anything about the Target Version needs to be discussed, please
reach out to me via Apache email or Slack.
> Document recovery from RATIS-677
> --------------------------------
>
> Key: HDDS-2696
> URL: https://issues.apache.org/jira/browse/HDDS-2696
> Project: Apache Ozone
> Issue Type: Improvement
> Components: Ozone Datanode
> Reporter: István Fajth
> Priority: Critical
> Labels: Triaged
>
> RATIS-677 was resolved in a way that requires changing a setting so that
> the RatisServer implementation ignores the corruption. At the moment, due
> to HDDS-2647, we do not have a clear recovery path from Ratis corruption in
> the pipeline data.
> We should document how to recover from this. I have an idea that involves
> closing the pipeline in SCM and removing the Ratis metadata for the pipeline
> on the DataNodes, which effectively clears the corrupted pipeline out of the
> system.
> I have two problems with finding and documenting a recovery path:
> - I am not sure we have strong enough guarantees that the writes happened
> properly if the Ratis metadata could become corrupt, so this needs to be
> investigated.
> - At the moment I cannot validate this approach: when I follow the steps
> (stop the 3 DNs, move out the Ratis data for the pipeline, close the pipeline
> with scmcli, then restart the DNs), the pipeline is not closed properly and
> SCM fails as described in HDDS-2695.