[
https://issues.apache.org/jira/browse/KUDU-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Wong resolved KUDU-2627.
-------------------------------
Fix Version/s: 1.12.0
Resolution: Duplicate
This is a dupe of KUDU-2993, which is resolved as of Kudu 1.12.
> Automatically "fix" inconsistent data directories
> -------------------------------------------------
>
> Key: KUDU-2627
> URL: https://issues.apache.org/jira/browse/KUDU-2627
> Project: Kudu
> Issue Type: Improvement
> Components: fs
> Reporter: Andrew Wong
> Priority: Major
> Fix For: 1.12.0
>
>
> Currently, Kudu checks the integrity of its FS layout by verifying that all
> data dirs exist where they're expected and that each of them "knows" about
> the rest of the data dirs in the FS layout. When a data dir is missing on
> disk (e.g. because the underlying disk was yanked and a new one was put in),
> every other data dir still expects the one that is now missing. Following
> KUDU-2359, Kudu accepts this and starts up, but labels the data dir as
> "failed", alerting users that something on disk is inconsistent with their
> FS config. At that point, they can run `kudu fs update_dirs` with the
> expected directories.
> This isn't a great user experience for a couple of reasons: 1) it adds more
> legwork and more downtime when recovering from disk failures, performing
> hardware upgrades, etc.; 2) if the user _is_ repairing a disk failure, the
> "new" directories input to the `kudu fs update_dirs` tool will be identical
> to the old ones (or, more cautiously, the update will be done as a removal
> followed by an addition), which is somewhat confusing. The `kudu fs
> update_dirs` tool is already smart enough to tell users when attention is
> needed (e.g. if removing directories with tablets striped across them); it
> wouldn't be unreasonable to run it (or mirror its behavior) ahead of server
> startup.
> For administrators who prefer tooling, it probably makes sense to maintain
> the current, more conservative, less automatic codepaths, with the choice
> between the two behaviors gated by some flag.
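
For readers unfamiliar with the check described in the quoted issue, here is a minimal Python sketch of the idea. It is not Kudu's implementation. It assumes each data dir carries a record of its own ID and the IDs of all dirs it expects. `DirInstance`, `LayoutReport`, `check_layout`, `plan_startup`, and the `auto_fix` switch are hypothetical names invented for illustration, and `auto_fix` merely stands in for the unnamed flag the issue proposes.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class DirInstance:
    """Hypothetical per-directory record: own ID plus every ID it expects."""
    uuid: str
    all_uuids: FrozenSet[str]


@dataclass
class LayoutReport:
    failed_paths: List[str]   # configured paths with no usable record on disk
    missing_uuids: List[str]  # IDs the surviving dirs expect but nobody has


def check_layout(configured: Dict[str, Optional[DirInstance]]) -> LayoutReport:
    """Compare what is on disk with what the surviving dirs expect."""
    present = [inst for inst in configured.values() if inst is not None]
    failed_paths = sorted(p for p, inst in configured.items() if inst is None)
    on_disk = {inst.uuid for inst in present}
    expected = set()
    for inst in present:
        expected |= inst.all_uuids
    return LayoutReport(failed_paths, sorted(expected - on_disk))


def plan_startup(report: LayoutReport, auto_fix: bool) -> str:
    """Pick a startup action; auto_fix is an assumed opt-in switch."""
    if not report.failed_paths and not report.missing_uuids:
        return "start"
    # Conservative path: start up and mark the dir failed, leaving the repair
    # to the operator. Automatic path: repair the layout before starting.
    return "repair_then_start" if auto_fix else "start_with_failed_dirs"


ids = frozenset({"a", "b", "c"})
layout = {
    "/data/1": DirInstance("a", ids),
    "/data/2": DirInstance("b", ids),
    "/data/3": None,  # disk was swapped; nothing readable here
}
report = check_layout(layout)
print(report, plan_startup(report, auto_fix=False))
```

With the switch off, the sketch mirrors the behavior the issue attributes to KUDU-2359 (start up and mark the dir as failed). With it on, it models the proposed automatic repair. A real repair step would also need the safety checks the issue mentions, such as refusing when tablets are striped across a removed directory.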