[ https://issues.apache.org/jira/browse/HDDS-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arpit Agarwal reassigned HDDS-1595:
-----------------------------------
Assignee: Supratim Deka
> Handling IO Failures on the Datanode
> ------------------------------------
>
> Key: HDDS-1595
> URL: https://issues.apache.org/jira/browse/HDDS-1595
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Major
> Attachments: Handling IO Failures on the Datanode.pdf, Raft IO v2.svg
>
>
> This Jira covers all the changes required to handle IO failures on the
> Datanode. Handling an IO failure on the Datanode involves detecting failures
> as they happen and propagating each failure to the appropriate component in
> the system, possibly the Client and/or the SCM, depending on the type of
> failure.
> At a high level, IO failure handling has the following goals:
> 1. Prevent inconsistencies and corruption caused by unhandled or mishandled
> failures.
> 2. Prevent any data loss: detect failures promptly and propagate the correct
> error back to the initiator, instead of silently dropping the data while the
> client assumes the operation is committed.
> 3. Contain the disruption in the system: if a disk volume fails on a DN,
> operations on the other nodes and volumes should not be affected.
> Details of the design and the required changes are covered in the attached
> PDF document.
> A sequence diagram used to analyse the Datanode IO path is also attached, in
> SVG format.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)