[ 
https://issues.apache.org/jira/browse/KUDU-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wong reassigned KUDU-2135:
---------------------------------

    Assignee: Andrew Wong

> Persist disk health state to disk
> ---------------------------------
>
>                 Key: KUDU-2135
>                 URL: https://issues.apache.org/jira/browse/KUDU-2135
>             Project: Kudu
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Andrew Wong
>            Assignee: Andrew Wong
>
> When a tablet server disk fails, it is marked FAILED in memory and is not
> touched again for the lifetime of the tablet server. The next time the
> tablet server is started, however, if the disk happens to start up
> successfully, it will be used as is.
> This is risky, as the disk may be corrupted or more prone to runtime
> failures. Additionally, when we begin striping metadata or WALs, we may end
> up with more than one set of them for a single tablet (e.g. one that was on
> the failed disk, and another created if the tablet was reassigned to the
> same tablet server). As such, the contents of a previously failed disk
> should not be used. It is thus necessary to persist the health of each disk
> so that unhealthy disks are not used after a restart.
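A minimal sketch of the idea described above, written in Python for brevity. It is not Kudu's implementation. Every name in it (DiskHealthStore, mark_failed, usable_dirs, the JSON state file) is hypothetical, and it assumes the health record is kept somewhere other than the failed disk itself, since a failed disk cannot be trusted to hold its own state.

```python
import json
import os
import tempfile
from pathlib import Path

HEALTHY = "HEALTHY"
FAILED = "FAILED"


class DiskHealthStore:
    """Hypothetical on-disk record of data directory health."""

    def __init__(self, state_file):
        self.state_file = Path(state_file)
        self.states = {}
        # On startup, reload whatever was persisted by an earlier run.
        if self.state_file.exists():
            self.states = json.loads(self.state_file.read_text())

    def mark_failed(self, data_dir):
        """Record a failure and persist it before returning."""
        self.states[str(data_dir)] = FAILED
        self._flush()

    def usable_dirs(self, configured_dirs):
        """Return the configured directories not recorded as FAILED."""
        return [d for d in configured_dirs
                if self.states.get(str(d), HEALTHY) != FAILED]

    def _flush(self):
        # Write to a temporary file, fsync it, then rename it over the
        # old state file so readers never see a partially written file.
        fd, tmp = tempfile.mkstemp(dir=str(self.state_file.parent))
        with os.fdopen(fd, "w") as f:
            json.dump(self.states, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.state_file)
```

With a record like this, a restarted server would consult usable_dirs() before opening any data directory, so a disk that failed in a previous run stays excluded even if it happens to mount cleanly.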



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
