[ https://issues.apache.org/jira/browse/KUDU-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Wong updated KUDU-2359:
------------------------------
Status: In Review (was: Open)
> tserver should allow starting with a small number of missing data dirs
> ----------------------------------------------------------------------
>
> Key: KUDU-2359
> URL: https://issues.apache.org/jira/browse/KUDU-2359
> Project: Kudu
> Issue Type: Improvement
> Components: fs, tserver
> Reporter: Todd Lipcon
> Assignee: Andrew Wong
> Priority: Major
>
> Often when a disk fails, its mount point will not come back up when the
> server is restarted. Currently, Kudu will respond to this by failing to
> restart with an error like:
> F0314 18:23:39.353916 112051 tablet_server_main.cc:80] Check failed: _s.ok()
> Bad status: Already present: FS layout already exists; not overwriting
> existing layout. See
> https://kudu.apache.org/releases/1.8.0-SNAPSHOT/docs/troubleshooting.html:
> unable to create file system roots: FSManager roots already exist:
> /data/1/kudu,/data/2/kudu,/data/3/kudu,/data/5/kudu,/data/6/kudu,/data/7/kudu,/data/8/kudu,/data/1/kudu-wal
> However, this defeats some of the advantages of the "allow single disk
> failure" work. One could use the update_data_dirs tool to remove the missing
> disk, but you'd also need to persistently change the configuration of the
> daemon, which is hard to do when configuration management keeps the
> configuration consistent across servers.
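
A minimal sketch of the behavior the issue asks for, written in Python for
brevity. It is not Kudu's implementation. The names partition_data_dirs,
can_start, and max_missing are hypothetical, as is the rule that startup is
allowed only while at least one directory is present and the number of
missing directories stays within the threshold.

```python
import os
from typing import List, Tuple


def partition_data_dirs(configured: List[str]) -> Tuple[List[str], List[str]]:
    """Split the configured data dirs into (present, missing), keeping order."""
    present = [d for d in configured if os.path.isdir(d)]
    missing = [d for d in configured if not os.path.isdir(d)]
    return present, missing


def can_start(configured: List[str], max_missing: int = 1) -> bool:
    """Hypothetical policy: tolerate up to max_missing absent data dirs.

    Startup is refused when no data dir is present at all, or when more
    directories are missing than the threshold allows.
    """
    present, missing = partition_data_dirs(configured)
    return bool(present) and len(missing) <= max_missing
```

Under this sketch, a server configured with eight data dirs whose one failed
mount point is absent would still start, instead of aborting as in the log
above.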