Andrew Wong has submitted this change and it was merged.
( http://gerrit.cloudera.org:8080/11249 )

Change subject: KUDU-2469 pt 2: fail replicas on CFile corruption
......................................................................

KUDU-2469 pt 2: fail replicas on CFile corruption

This adds handling for CFile corruption errors via the error manager. If
CFile corruption is encountered, the affected replica is failed and
scheduled to be shut down (the tablet id of interest is pulled from the
IOContext), which eventually results in its re-replication.
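
The flow above can be sketched roughly as follows. This is a minimal,
self-contained illustration and not the code in this patch: the
IOContext, ErrorManager, and TabletServer types here are simplified
stand-ins, and ErrorType, HandleCorruption(), and IsFailed() are
hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

// Simplified stand-in for an I/O context: it only carries the id of the
// tablet on whose behalf the read is happening.
struct IOContext {
  std::string tablet_id;
};

// Hypothetical error categories routed through the error manager.
enum class ErrorType { kDisk, kCFileCorruption };

// Minimal error manager: maps an error type to a single callback.
class ErrorManager {
 public:
  using Callback = std::function<void(const std::string&)>;

  void SetCallback(ErrorType type, Callback cb) {
    callbacks_[type] = std::move(cb);
  }

  // Runs the callback registered for 'type', if any, with the affected id.
  void RunCallback(ErrorType type, const std::string& id) const {
    auto it = callbacks_.find(type);
    if (it != callbacks_.end()) {
      it->second(id);
    }
  }

 private:
  std::map<ErrorType, Callback> callbacks_;
};

// Toy server that only records which replicas have been marked failed.
// A real server would also schedule the shutdown of the replica.
class TabletServer {
 public:
  explicit TabletServer(ErrorManager* em) {
    assert(em != nullptr);
    em->SetCallback(ErrorType::kCFileCorruption,
                    [this](const std::string& tablet_id) {
                      failed_replicas_.insert(tablet_id);
                    });
  }

  bool IsFailed(const std::string& tablet_id) const {
    return failed_replicas_.count(tablet_id) > 0;
  }

 private:
  std::set<std::string> failed_replicas_;
};

// Called by a reader that detected corruption while serving 'io_context'.
void HandleCorruption(const ErrorManager& em, const IOContext* io_context) {
  if (io_context == nullptr) {
    return;  // No tablet to attribute the error to.
  }
  em.RunCallback(ErrorType::kCFileCorruption, io_context->tablet_id);
}
```

In this sketch the reader never touches the replica directly; it only
reports the tablet id, and whoever registered the callback decides what
failing the replica means.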

Corruption handling is entirely delegated to the CFileReaders, which
have access to the error manager. Because checksum errors are detected
in VerifyChecksum(), the methods that wrap VerifyChecksum(), namely
ReadBlock() and Init(), must expect the corruption and handle it.
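
As a rough illustration of that wrapping pattern, the sketch below shows
a toy reader whose Init() and ReadBlock() both funnel the result of
VerifyChecksum() through one handler. Everything here is an assumption
made for the example: ToyReader, ToyChecksum(), HandleIfCorrupt(), and
the Status type are hypothetical, the method signatures do not match the
real CFileReader, and the checksum is a stand-in for a real one.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

enum class Code { kOk, kCorruption };

// Tiny status type, loosely modeled on the usual OK/Corruption split.
struct Status {
  Code code;
  std::string message;

  static Status OK() { return Status{Code::kOk, ""}; }
  static Status Corruption(std::string msg) {
    return Status{Code::kCorruption, std::move(msg)};
  }
  bool ok() const { return code == Code::kOk; }
  bool IsCorruption() const { return code == Code::kCorruption; }
};

// Toy checksum used only for this example.
uint32_t ToyChecksum(const std::string& data) {
  uint32_t sum = 0;
  for (unsigned char c : data) {
    sum = sum * 31 + c;
  }
  return sum;
}

// Hypothetical reader: every public method that verifies a checksum
// reports corruption to the handler before returning the error.
class ToyReader {
 public:
  using CorruptionHandler = std::function<void(const std::string&)>;

  ToyReader(std::string tablet_id, CorruptionHandler handler)
      : tablet_id_(std::move(tablet_id)), handler_(std::move(handler)) {}

  Status Init(const std::string& header, uint32_t expected) {
    return HandleIfCorrupt(VerifyChecksum(header, expected));
  }

  Status ReadBlock(const std::string& block, uint32_t expected,
                   std::string* out) {
    assert(out != nullptr);
    Status s = HandleIfCorrupt(VerifyChecksum(block, expected));
    if (s.ok()) {
      *out = block;
    }
    return s;
  }

 private:
  static Status VerifyChecksum(const std::string& data, uint32_t expected) {
    if (ToyChecksum(data) != expected) {
      return Status::Corruption("checksum mismatch");
    }
    return Status::OK();
  }

  // Notifies the handler on corruption; passes other statuses through.
  Status HandleIfCorrupt(Status s) {
    if (s.IsCorruption() && handler_) {
      handler_(tablet_id_);
    }
    return s;
  }

  std::string tablet_id_;
  CorruptionHandler handler_;
};
```

Keeping the notification in a single helper means a newly added wrapper
around VerifyChecksum() only has to route its result through the same
helper to get the same handling.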

This patch also adds a fault injection flag to facilitate testing, and
plumbs IOContexts through places that were found to lack coverage: the
IndexTreeIterator and the BloomCache.
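
For readers unfamiliar with fault injection flags, the general idea can
be sketched as below. This is only an assumed shape, not the flag added
by this patch: FLAGS_toy_inject_corruption_fraction,
ShouldInjectFault(), and VerifyChecksumOrInject() are hypothetical
names, and a plain global stands in for a real runtime flag.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Hypothetical flag: fraction of verifications that should report
// corruption regardless of the data. 0.0 disables injection.
double FLAGS_toy_inject_corruption_fraction = 0.0;

// Returns true if a fault should be injected on this call.
bool ShouldInjectFault(double fraction, std::mt19937* rng) {
  assert(rng != nullptr);
  if (fraction <= 0.0) return false;
  if (fraction >= 1.0) return true;
  std::uniform_real_distribution<double> dist(0.0, 1.0);
  return dist(*rng) < fraction;
}

// Toy verification: the size check stands in for a real checksum
// comparison. Returns true if the data is considered intact.
bool VerifyChecksumOrInject(const std::string& data,
                            std::size_t expected_size,
                            std::mt19937* rng) {
  if (ShouldInjectFault(FLAGS_toy_inject_corruption_fraction, rng)) {
    return false;  // Pretend the data is corrupt.
  }
  return data.size() == expected_size;
}
```

A test can set the fraction to 1.0 to force every verification to fail,
which exercises the corruption-handling path without damaging any data
on disk.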

Change-Id: I63d541443bc68c83fd0ca6d51315143fee04d50f
Reviewed-on: http://gerrit.cloudera.org:8080/11249
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <[email protected]>
Reviewed-by: Grant Henke <[email protected]>
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/cfile/index_btree.cc
M src/kudu/cfile/index_btree.h
M src/kudu/fs/error_manager.cc
M src/kudu/fs/error_manager.h
M src/kudu/fs/io_context.h
M src/kudu/integration-tests/disk_failure-itest.cc
M src/kudu/tablet/deltafile.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_server.cc
13 files changed, 342 insertions(+), 77 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Adar Dembo: Looks good to me, but someone else must approve
  Grant Henke: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/11249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I63d541443bc68c83fd0ca6d51315143fee04d50f
Gerrit-Change-Number: 11249
Gerrit-PatchSet: 13
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
