Hello Tidy Bot, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8395

to look at the new patch set (#13).

Change subject: error_manager: synchronize/serialize handling
......................................................................

error_manager: synchronize/serialize handling

The FsErrorManager is helper infrastructure that other classes can use
to provide API contracts related to error handling.

Here is an example of an error-handling contract that a future patch
will provide to the TSTabletManager: if a disk failure Status is
returned during a tablet operation, either:
1) the tablet server will crash and we can rely on Kudu's crash-consistency
   mechanisms for safety, or
2) any affected Tablet will transition to the 'kStopped' state via an error
   manager callback. Most components will simply pass the non-OK Status up
   the call chain. However, if a method expects an IO operation to
   succeed for correctness purposes, and it receives an error Status,
   then it should check that the Tablet is stopped. If so, it can be
   assumed that the error was handled. Otherwise, Kudu must rely on the
   crash-consistency mechanisms and crash.
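
A rough caller-side sketch of contract (2), using hypothetical stand-in
types (the real Tablet, Status, and CHECK machinery in Kudu differ):

```cpp
#include <cassert>

// Minimal illustrative stand-ins; not the actual Kudu types.
struct Status {
  bool ok;
  static Status OK() { return {true}; }
  static Status DiskFailure() { return {false}; }
};

enum class TabletState { kRunning, kStopped };

struct Tablet {
  TabletState state = TabletState::kRunning;
  bool IsStopped() const { return state == TabletState::kStopped; }
};

// A method that needs the IO to succeed for correctness. On an error
// Status it checks that error-handling already stopped the tablet;
// otherwise it would have to crash and rely on crash-consistency
// (modeled here with assert; Kudu would use a CHECK).
Status PropagateAfterIO(Tablet* tablet, Status io_result) {
  if (!io_result.ok) {
    assert(tablet->IsStopped());
  }
  return io_result;
}
```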

Given the above contract, the state of a tablet server post-disk-failure
depends significantly on the completion of disk-failure-handling
callbacks. Error-handling _must_ finish before anything is propagated
back to the offending caller.

To ensure error-handling completes correctly even in a concurrent
setting, this patch extends the error manager with a mutex and a second
error-handling callback type. This lets errors caused indirectly by
disk failures be handled by non-disk-specific handlers, with all
failure-handling serialized in the same fashion.
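
A minimal sketch of the serialization idea (the class and method names
here are illustrative, not necessarily the patch's API): one handler is
registered per error type, and every handler runs under a single mutex,
so a handler invocation returns only after handling has completed:

```cpp
#include <functional>
#include <mutex>
#include <string>

// Illustrative sketch of serialized error-handling; the real
// FsErrorManager differs in detail.
enum class ErrorHandlerType { kDisk, kTablet };

class ErrorManagerSketch {
 public:
  using ErrorNotificationCb = std::function<void(const std::string&)>;

  void SetErrorNotificationCb(ErrorHandlerType type, ErrorNotificationCb cb) {
    std::lock_guard<std::mutex> l(lock_);
    if (type == ErrorHandlerType::kDisk) {
      disk_cb_ = std::move(cb);
    } else {
      tablet_cb_ = std::move(cb);
    }
  }

  // Runs the handler under the lock, serializing all error-handling.
  // By the time this returns, handling has finished, so the caller can
  // propagate the error Status safely.
  void RunErrorNotificationCb(ErrorHandlerType type, const std::string& uuid) {
    std::lock_guard<std::mutex> l(lock_);
    const auto& cb =
        (type == ErrorHandlerType::kDisk) ? disk_cb_ : tablet_cb_;
    if (cb) cb(uuid);
  }

 private:
  std::mutex lock_;
  ErrorNotificationCb disk_cb_;
  ErrorNotificationCb tablet_cb_;
};
```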

As an example of where this is necessary, say a tablet has data in a
single directory and hits a bad disk. That directory is immediately
marked failed and handling starts to fail all tablets in the directory.
Previously, if the tablet created a new block before being failed, the
creation would fail immediately, complaining that no directories are
available,
and would eventually fail a CHECK that translates roughly to: "Has error
handling for this tablet completed?" By wrapping block creation with
tablet-specific error handling and with serialized error-handling, this
CHECK will pass.
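
The scenario above can be modeled in a self-contained way (all names
here are invented for illustration and do not match the actual Kudu
code): disk-level and tablet-specific handling share one lock, so by
the time a failed block creation returns its error, the tablet is
guaranteed to be stopped:

```cpp
#include <cassert>
#include <mutex>

struct MiniTablet {
  bool stopped = false;
};

class MiniErrorManager {
 public:
  // Disk-level handling: fail every tablet in the bad directory.
  void HandleDiskFailure(MiniTablet* t) {
    std::lock_guard<std::mutex> l(lock_);
    t->stopped = true;
  }
  // Tablet-specific handling for errors caused indirectly by the
  // failure (e.g. a block creation that found no directories).
  void HandleTabletError(MiniTablet* t) {
    std::lock_guard<std::mutex> l(lock_);
    t->stopped = true;  // no-op if disk handling already stopped it
  }

 private:
  // One lock serializes both kinds of handling.
  std::mutex lock_;
};

// Block creation wrapped with tablet-specific error handling: when it
// fails, handling runs (serialized with disk handling) before the
// error is returned, so the caller's "is the tablet stopped?" CHECK
// passes.
bool CreateBlock(MiniErrorManager* em, MiniTablet* t) {
  bool created = false;  // simulate: no directories are available
  if (!created) {
    em->HandleTabletError(t);
    assert(t->stopped);
  }
  return created;
}
```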

A previous revision accomplished this by using a single-threaded
threadpool to trigger error-handling and ensuring its completion by
waiting on the threadpool. I ultimately went with the mutex-based
implementation since it's a bit more straightforward with regard to
when to make the different calls (i.e. "trigger disk error-handling"
vs. "trigger tablet error-handling", instead of "trigger
error-handling" vs. "wait for error-handling to finish"). This also has
the benefit of being extensible to other kinds of errors in the future.

Change-Id: Ie61c408a0b4424f933f40a31147568c2f906be0e
---
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/data_dirs-test.cc
A src/kudu/fs/error_manager-test.cc
A src/kudu/fs/error_manager.cc
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/tserver/tablet_server.cc
M src/kudu/util/status.h
12 files changed, 394 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/95/8395/13
--
To view, visit http://gerrit.cloudera.org:8080/8395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie61c408a0b4424f933f40a31147568c2f906be0e
Gerrit-Change-Number: 8395
Gerrit-PatchSet: 13
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
