Hello Andrew Wong, Todd Lipcon,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/8978

to review the following change.


Change subject: KUDU-2202: support for removing data directories (take two)
......................................................................

KUDU-2202: support for removing data directories (take two)

This commit extends "kudu fs update_dirs" to allow the removal of data
directories. A data dir may be removed provided there are no tablets
configured to use it. The --force flag may be used to override this safety
check; tablets that depend on the removed data dir will fail to bootstrap
the next time the server is started.

I'll be the first to admit that this functionality isn't terribly useful at
the moment because Kudu deployments are configured to stripe data across all
data directories, so removing a data dir will fail all tablets, which is
equivalent to reformatting the node. Nevertheless, it will become more
useful as users customize the new --fs_target_data_dirs_per_tablet gflag.

This is the second attempt at implementing this functionality. The first
used a different interface and was vulnerable to a data loss issue
documented in KUDU-2202. Andrew fixed that in commit 47b81c4, which reopens
the door for safely removing data directories.

I considered a stronger check that only prohibits data dir removal if there
were data blocks on the to-be-removed data dir, but decided against it:
- It's rare to find a tablet configured to use a data dir but without any
  actual blocks on it. In fact, that's only likely for empty tables, or
  tables with empty range partitions.
- We'd still need to rewrite any unaffected tablet superblocks and remove
  the data dir from their data dir groups.

There are several interesting modifications here:
1. PathInstanceMetadataFile::CheckIntegrity can now be run without checking
   for missing instance files. This is used when removing a data dir (its
   instance file will be missing).
2. The DataDirManager and FsManager may now be constructed with both
   read_only=true and update_on_disk=true. The CLI tool leverages this
   functionality to create a "speculative" FsManager using the new data dir
   configuration that is used to check whether existing tablets depend on
   the data dir to be removed.
3. The DataDirManager now checks for integrity after making on-disk changes.
   While not strictly necessary, this seems like a good sanity check to
   avoid introducing bugs.
4. The "kudu fs update_dirs" action was updated for data dir removal. It now
   accepts the --force option, without which removal will only be allowed if
   no tablets are configured to use the removed data dir.

Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
---
M src/kudu/fs/block_manager_util-test.cc
M src/kudu/fs/block_manager_util.cc
M src/kudu/fs/block_manager_util.h
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
10 files changed, 317 insertions(+), 81 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/78/8978/1
--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>

Reply via email to