[ https://issues.apache.org/jira/browse/KUDU-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wong reassigned KUDU-2359:
---------------------------------

    Assignee: Andrew Wong

> tserver should allow starting with a small number of missing data dirs
> ----------------------------------------------------------------------
>
>                 Key: KUDU-2359
>                 URL: https://issues.apache.org/jira/browse/KUDU-2359
>             Project: Kudu
>          Issue Type: Improvement
>          Components: fs, tserver
>            Reporter: Todd Lipcon
>            Assignee: Andrew Wong
>            Priority: Major
>
> Often when a disk fails, its mount point will not come back up when the
> server is restarted. Currently, Kudu responds to this by failing to
> restart with an error like:
>
> F0314 18:23:39.353916 112051 tablet_server_main.cc:80] Check failed: _s.ok()
> Bad status: Already present: FS layout already exists; not overwriting
> existing layout. See
> https://kudu.apache.org/releases/1.8.0-SNAPSHOT/docs/troubleshooting.html:
> unable to create file system roots: FSManager roots already exist:
> /data/1/kudu,/data/2/kudu,/data/3/kudu,/data/5/kudu,/data/6/kudu,/data/7/kudu,/data/8/kudu,/data/1/kudu-wal
>
> However, this defeats some of the advantages of the "allow single disk
> failure" work. One could use the update_data_dirs tool to remove the missing
> disk, but you'd also need to persistently change the configuration of the
> daemon, which is hard to do with consistent configuration management.
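
For illustration only, the requested behavior could be sketched roughly as
below. This is not Kudu's implementation (Kudu is written in C++), and every
name here (split_data_dirs, may_start, max_missing, is_present) is
hypothetical. The sketch assumes that "a small number" means a configurable
cap on missing directories and that at least one directory must remain.

```python
import os


def split_data_dirs(configured_dirs, is_present=os.path.isdir):
    """Partition the configured data dirs into (present, missing).

    `is_present` is a hypothetical hook so the check can be replaced
    in tests; by default it asks the filesystem.
    """
    present = [d for d in configured_dirs if is_present(d)]
    missing = [d for d in configured_dirs if not is_present(d)]
    return present, missing


def may_start(configured_dirs, max_missing=1, is_present=os.path.isdir):
    """Return True if startup should proceed despite missing data dirs.

    Assumption: startup is allowed when at least one data dir is still
    present and no more than `max_missing` configured dirs are absent.
    """
    present, missing = split_data_dirs(configured_dirs, is_present)
    return len(present) > 0 and len(missing) <= max_missing
```

With the directories from the report, where /data/4/kudu is configured but its
mount point did not come back, such a check would let the server start with
the remaining seven directories instead of aborting.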