[ https://issues.apache.org/jira/browse/HDDS-11955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17906629#comment-17906629 ]
Sammi Chen commented on HDDS-11955:
-----------------------------------
[~szetszwo], is it possible for the Raft leader to know whether it's a Raft log
replay due to a restart or a normal leader-to-follower request sync?
> ContainerStateMachine.readStateMachineData may throw NoSuchFileException
> ------------------------------------------------------------------------
>
> Key: HDDS-11955
> URL: https://issues.apache.org/jira/browse/HDDS-11955
> Project: Apache Ozone
> Issue Type: Bug
> Components: Ozone Datanode
> Reporter: Tsz-wo Sze
> Priority: Major
>
> Suppose we have the following Raft log entries:
> - index 110 is a writeChunk,
> - index 120 is a deleteBlock for the chunk above, and
> - one of the datanode followers has nextIndex 100.
> Then, the datanode leader has to send log entries to that follower starting
> from index 100. If the leader has already applied log entry 120, it has to
> read a deleted block in ContainerStateMachine.readStateMachineData, which
> leads to NoSuchFileException.
> Also, since the Ozone datanode does not support Ratis snapshots, this problem
> cannot be worked around by sending a snapshot.
> Potential fix: The datanode leader has to return something instead of
> throwing NoSuchFileException. The datanode leader might return a proto
> indicating that the block is already deleted. When the datanode follower
> receives the proto, it will just mark the block as deleted. We need a design
> for this problem.
> Thanks [~sammichen] for pointing out the problem.
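
For illustration only, the potential fix quoted above could be sketched in
Java roughly as follows. This is a toy model under stated assumptions, not
Ozone or Ratis code: ReadResult, readChunk, and applyOnFollower are
hypothetical names, a plain local file stands in for the chunk, and a null
payload stands in for the "already deleted" proto.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical sketch; none of these names exist in Ozone or Ratis.
final class DeletedBlockSketch {

  /** Stand-in for the reply proto: chunk bytes, or a "deleted" marker. */
  static final class ReadResult {
    private final byte[] data; // null means the block was already deleted

    private ReadResult(byte[] data) {
      this.data = data;
    }

    static ReadResult ofData(byte[] data) {
      return new ReadResult(data);
    }

    static ReadResult deleted() {
      return new ReadResult(null);
    }

    boolean isDeleted() {
      return data == null;
    }

    byte[] getData() {
      return data;
    }
  }

  /**
   * Leader side: read the chunk for an old writeChunk entry. If a later
   * deleteBlock entry has already removed the file, return the marker
   * instead of propagating NoSuchFileException.
   */
  static ReadResult readChunk(Path chunkFile) throws IOException {
    try {
      return ReadResult.ofData(Files.readAllBytes(chunkFile));
    } catch (NoSuchFileException e) {
      return ReadResult.deleted();
    }
  }

  /** Follower side: decide what to do with the leader's reply. */
  static String applyOnFollower(ReadResult result) {
    return result.isDeleted() ? "MARK_DELETED" : "WRITE_CHUNK";
  }
}
```

In this model, the follower that is still at index 100 would receive the
marker for index 110 and mark the block as deleted, and the deleteBlock at
index 120 would then have nothing left to remove. Whether the real design
should look like this is exactly the open question in the issue.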