[ 
https://issues.apache.org/jira/browse/KUDU-616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367479#comment-16367479
 ] 

Grant Henke commented on KUDU-616:
----------------------------------

[~andrew.wong] Was this work handled in other JIRAs?

> Mitigate tablet damage when disks are lost
> ------------------------------------------
>
>                 Key: KUDU-616
>                 URL: https://issues.apache.org/jira/browse/KUDU-616
>             Project: Kudu
>          Issue Type: Sub-task
>          Components: fs
>    Affects Versions: M5
>            Reporter: Adar Dembo
>            Assignee: Andrew Wong
>            Priority: Major
>
> Disk loss is an unfortunate fact of life, and Kudu should provide mechanisms 
> for mitigating disk loss.
> # Make it possible to isolate specific tablets to some subset of the 
> machine's disks, so that if one disk dies it doesn't take out all the tablets 
> with it. This is more complicated than it looks:
> ** We need a concrete way of describing disk groups. It can be per-node, or 
> abstract enough that it makes sense across the entire cluster, or perhaps we 
> aggregate information (e.g. ten machines have 5 disks and the other forty 
> machines have 6 disks).
> ** This mechanism needs to be used for both data blocks and other bits of 
> metadata (master blocks, superblocks, and other random files).
> ** Presumably it needs to be provided when a table is created (or a tablet is 
> split), and it needs to be persisted as part of tablet metadata. It might be 
> sufficient to express it in Kudu configuration (i.e. complex gflags), but 
> since it can be associated with tablet metadata, it's hard to see how that 
> would work.
> # When a disk fails, the server needs to handle it appropriately (mark it as 
> failed, put affected tablets in a failed state, etc.).
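
The two items in the description (confining each tablet to a subset of the disks, and failing only the affected tablets when a disk dies) can be sketched roughly as follows. This is an illustrative Python model, not Kudu code: `DiskGroupManager`, `assign`, `mark_disk_failed`, the round-robin placement policy, and the in-memory dict standing in for persisted tablet metadata are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DiskGroupManager:
    """Hypothetical per-node model of tablet-to-disk-group assignment."""
    disks: List[str]
    group_size: int
    # Stand-in for the disk group persisted in each tablet's metadata.
    tablet_groups: Dict[str, List[str]] = field(default_factory=dict)
    failed_disks: Set[str] = field(default_factory=set)
    failed_tablets: Set[str] = field(default_factory=set)
    _next_start: int = 0

    def assign(self, tablet_id: str) -> List[str]:
        """Pick a group of healthy disks for a new tablet (round-robin)."""
        healthy = [d for d in self.disks if d not in self.failed_disks]
        if not healthy:
            raise RuntimeError("no healthy disks available")
        size = min(self.group_size, len(healthy))
        start = self._next_start % len(healthy)
        self._next_start += 1
        group = [healthy[(start + i) % len(healthy)] for i in range(size)]
        self.tablet_groups[tablet_id] = group
        return group

    def mark_disk_failed(self, disk: str) -> Set[str]:
        """Mark a disk as failed and fail only the tablets that use it."""
        self.failed_disks.add(disk)
        affected = {t for t, g in self.tablet_groups.items() if disk in g}
        self.failed_tablets |= affected
        return affected
```

In this model, with six disks, groups of two, and six tablets, losing one disk fails only the two tablets whose groups include it; the other four keep running, and new tablets are placed only on the remaining healthy disks.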



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
