[
https://issues.apache.org/jira/browse/KUDU-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Wong reassigned KUDU-2135:
---------------------------------
Assignee: Andrew Wong
> Persist disk health state to disk
> ---------------------------------
>
> Key: KUDU-2135
> URL: https://issues.apache.org/jira/browse/KUDU-2135
> Project: Kudu
> Issue Type: Improvement
> Components: fs
> Reporter: Andrew Wong
> Assignee: Andrew Wong
>
> When a tablet server disk fails, it is marked FAILED in memory and is not
> touched for the rest of the tablet server's lifetime. The next time the
> tablet server starts, however, if the disk happens to start up successfully,
> it will be used as is.
> This may be risky, as the disk may be corrupted or more prone to runtime
> failures. Additionally, when we begin striping metadata or WALs, we may end
> up with more than one of them for a single tablet (e.g. one that was on the
> failed disk, and another created if the tablet was reassigned to the same
> tablet server). As such, the contents of a previously failed disk should not
> be used. It is thus necessary to persist the health of each disk so that
> unhealthy disks are not used after a restart.
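
A minimal sketch of the idea above, not Kudu's actual implementation: the
failed state is recorded in a small file kept in a directory that is still
healthy, and that file is consulted at startup before any data directory is
opened. The file name disk_health.json, the JSON layout, and the helper names
(load_health, mark_failed, usable_dirs) are all hypothetical and chosen only
for illustration.

```python
import json
import os
import tempfile

# Hypothetical file name; it lives in a directory assumed to be healthy.
HEALTH_FILE = "disk_health.json"


def load_health(meta_dir):
    """Return the persisted {data_dir: state} map, or {} if none exists yet."""
    path = os.path.join(meta_dir, HEALTH_FILE)
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


def mark_failed(meta_dir, data_dir):
    """Durably record that data_dir has failed."""
    state = load_health(meta_dir)
    state[data_dir] = "FAILED"
    # Write to a temporary file and rename it into place, so that a crash
    # mid-write cannot leave a truncated health file behind.
    fd, tmp_path = tempfile.mkstemp(dir=meta_dir)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, os.path.join(meta_dir, HEALTH_FILE))


def usable_dirs(meta_dir, data_dirs):
    """At startup, skip any directory previously marked FAILED, even if it
    happens to open successfully this time."""
    state = load_health(meta_dir)
    return [d for d in data_dirs if state.get(d) != "FAILED"]
```

The health file is kept outside the failed disk because a failed disk cannot
be trusted to store its own state. A real design would also need to cover the
case where the directory holding the health file is the one that fails.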
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)