Hello Tidy Bot, Kudu Jenkins, Andrew Wong, Adar Dembo,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14654

to look at the new patch set (#20).

Change subject: KUDU-2975: Spread WAL across multiple directories
......................................................................

KUDU-2975: Spread WAL across multiple directories

Add a new gflag named "fs_wal_dirs" to support spreading WAL across
multiple directories.

Use a new class named "WalDirManager" to manage the WAL directories. Every
WAL directory has a UUID. We record the directory's own UUID and the UUIDs
of all WAL directories in a file named "wal_manager_instance". We also
record the UUID of the tablet's WAL directory in the tablet's metadata. On
restart, we determine the tablet's WAL location from the directory UUID
recorded in its metadata.
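
The UUID-based lookup described above can be sketched roughly as follows.
This is a simplified illustration, not the code in this patch: the class
name "WalDirIndex" and its methods are hypothetical stand-ins for whatever
WalDirManager actually exposes.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical, simplified stand-in for the UUID-to-directory mapping that
// a WAL directory manager would keep. Not the real WalDirManager API.
class WalDirIndex {
 public:
  // Remembers that the WAL directory with 'dir_uuid' lives at 'path'.
  void Register(const std::string& dir_uuid, const std::string& path) {
    assert(!dir_uuid.empty());
    uuid_to_path_[dir_uuid] = path;
  }

  // Returns the WAL directory for the UUID recorded in a tablet's metadata,
  // or std::nullopt if no directory with that UUID is known.
  std::optional<std::string> FindByUuid(const std::string& dir_uuid) const {
    auto it = uuid_to_path_.find(dir_uuid);
    if (it == uuid_to_path_.end()) return std::nullopt;
    return it->second;
  }

 private:
  std::map<std::string, std::string> uuid_to_path_;
};
```

Keying on the UUID rather than the path means a tablet can still find its
WAL if the directory is remounted at a different path.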

To switch from 'fs_wal_dir' to 'fs_wal_dirs', first use the "kudu fs
update_dirs" tool to update the WAL directories. The new configuration
must include every previously configured WAL directory; otherwise some
tablets may fail after startup.
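
The "new configuration includes the old ones" requirement amounts to a
superset check. A minimal sketch, assuming directories are compared as
plain strings; the function name "MissingOldWalDirs" is hypothetical and
does not come from this patch:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns the previously configured WAL directories
// that are absent from the new configuration. An empty result means the
// new configuration is a superset of the old one.
std::vector<std::string> MissingOldWalDirs(
    const std::vector<std::string>& old_dirs,
    const std::vector<std::string>& new_dirs) {
  std::vector<std::string> missing;
  for (const auto& dir : old_dirs) {
    if (std::find(new_dirs.begin(), new_dirs.end(), dir) == new_dirs.end()) {
      missing.push_back(dir);
    }
  }
  return missing;
}
```

A tool could refuse to proceed, or at least warn, when the result is not
empty.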

Because tablet metadata written by older versions does not record a WAL
directory, we look for the tablet's WAL directory under all of the new WAL
directories. If it is found, we register the tablet's WAL directory with
WalDirManager. However, flushing the metadata may cause a deadlock in the
tool code, so we don't persist the metadata immediately.
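
The fallback search for tablets with old-style metadata can be sketched as
below. This is an illustration only: the function name and the injected
"dir_exists" callback are hypothetical, and the "wals/<tablet id>" layout
is taken from the directory structure described in this message.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: scans every configured WAL root for a tablet whose
// metadata has no WAL directory UUID. 'dir_exists' is injected so that the
// search logic does not depend on a particular filesystem API.
std::optional<std::string> FindLegacyTabletWalRoot(
    const std::vector<std::string>& wal_roots,
    const std::string& tablet_id,
    const std::function<bool(const std::string&)>& dir_exists) {
  assert(dir_exists != nullptr);
  for (const auto& root : wal_roots) {
    // Assumed layout: <root>/wals/<tablet id>.
    if (dir_exists(root + "/wals/" + tablet_id)) return root;
  }
  return std::nullopt;  // The tablet's WAL is not under any known root.
}
```

The returned root would then be registered in memory for the tablet, with
the metadata flush deferred as described above.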

In this version, the structure of a WAL directory looks like this:

  ----wal
  --------instance
  --------wals
  ------------wal_manager_instance
  ------------tablet1_uuid
  ----------------index.0
  ----------------wal.0
  ------------tablet2_uuid
  ----------------index.0
  ----------------wal.0

Some WAL directories are allowed to be missing at startup, and disks that
hold WALs are allowed to have failed at startup. Tablets located on failed
WAL directories can be recovered by the master.
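
Working out which tablets are affected by failed WAL directories can be
sketched as a simple filter. The function name and the input containers are
hypothetical; the patch may track this state differently.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: given each tablet's recorded WAL directory UUID and
// the set of WAL directory UUIDs that failed or were missing at startup,
// returns the IDs of the tablets that cannot open their WAL locally.
std::vector<std::string> TabletsOnFailedWalDirs(
    const std::map<std::string, std::string>& tablet_to_dir_uuid,
    const std::set<std::string>& failed_dir_uuids) {
  std::vector<std::string> affected;
  for (const auto& [tablet_id, dir_uuid] : tablet_to_dir_uuid) {
    if (failed_dir_uuids.count(dir_uuid) > 0) {
      affected.push_back(tablet_id);
    }
  }
  return affected;
}
```

Tablets in the returned list would be marked as failed so that the master
can re-replicate them elsewhere, while the remaining tablets start
normally.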

Tablet deletion has also been modified: we now delete the WAL before
deleting the metadata. This ordering may not be safe and needs discussion.

This change does not yet implement the following:
1. Handling of WAL disk failures is not supported.
2. Some WAL-related tools have not been updated.

Change-Id: Ied496804421d91ff1fa63d49979fde971071506e
---
M src/kudu/consensus/consensus_peers-test.cc
M src/kudu/consensus/consensus_queue-test.cc
M src/kudu/consensus/log-test-base.h
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_cache-test.cc
M src/kudu/consensus/log_reader.cc
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs.proto
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/fs_report.cc
M src/kudu/fs/fs_report.h
A src/kudu/fs/wal_dirs-test.cc
A src/kudu/fs/wal_dirs.cc
A src/kudu/fs/wal_dirs.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/mini_cluster_fs_inspector.cc
M src/kudu/integration-tests/mini_cluster_fs_inspector.h
M src/kudu/integration-tests/multidir_cluster-itest.cc
M src/kudu/integration-tests/open-readonly-fs-itest.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/integration-tests/timestamp_advancement-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/master/mini_master-test.cc
M src/kudu/master/mini_master.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/mini-cluster/internal_mini_cluster.cc
M src/kudu/mini-cluster/internal_mini_cluster.h
M src/kudu/mini-cluster/mini_cluster.h
M src/kudu/server/server_base.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_local_replica.cc
M src/kudu/tserver/mini_tablet_server-test.cc
M src/kudu/tserver/mini_tablet_server.cc
M src/kudu/tserver/mini_tablet_server.h
M src/kudu/tserver/tablet_copy-test-base.h
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_server-stress-test.cc
M src/kudu/tserver/tablet_server-test-base.cc
M src/kudu/tserver/tablet_server-test-base.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
M src/kudu/util/path_util.h
58 files changed, 3,217 insertions(+), 323 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/54/14654/20
--
To view, visit http://gerrit.cloudera.org:8080/14654
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ied496804421d91ff1fa63d49979fde971071506e
Gerrit-Change-Number: 14654
Gerrit-PatchSet: 20
Gerrit-Owner: YangSong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: YangSong <[email protected]>

Reply via email to