Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11249 )

Change subject: KUDU-2469 pt 2: fail replicas on CFile corruption
......................................................................

KUDU-2469 pt 2: fail replicas on CFile corruption

This adds handling for CFile corruption errors via the error manager.
If a CFile corruption is encountered, the affected replica is failed
and scheduled to be shut down (the tablet id of interest is pulled from
the IOContext), eventually resulting in its re-replication.

Corruption handling is entirely delegated to the CFileReaders, which
have access to the error manager. Given that checksum errors are
detected in VerifyChecksum(), the methods that wrap VerifyChecksum(),
namely ReadBlock() and Init(), must expect the corruption and handle it.

This patch also includes a fault injection flag that helped facilitate
testing, and some extra plumbing of IOContexts in places that were
caught without coverage: the IndexTreeIterator and the BloomCache.

Change-Id: I63d541443bc68c83fd0ca6d51315143fee04d50f
Reviewed-on: http://gerrit.cloudera.org:8080/11249
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <[email protected]>
Reviewed-by: Grant Henke <[email protected]>
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/cfile/index_btree.cc
M src/kudu/cfile/index_btree.h
M src/kudu/fs/error_manager.cc
M src/kudu/fs/error_manager.h
M src/kudu/fs/io_context.h
M src/kudu/integration-tests/disk_failure-itest.cc
M src/kudu/tablet/deltafile.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_server.cc
13 files changed, 342 insertions(+), 77 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Adar Dembo: Looks good to me, but someone else must approve
  Grant Henke: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/11249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I63d541443bc68c83fd0ca6d51315143fee04d50f
Gerrit-Change-Number: 11249
Gerrit-PatchSet: 13
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
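
The flow described in the commit message (a reader detects a checksum mismatch, then reports the tablet id carried by the IO context to an error manager, which fails the replica) can be sketched roughly as follows. This is a toy illustration and not the code from this patch: Status, IoContext, ErrorManager, ToyCFileReader, HandleCorruption(), and ToyChecksum() are all hypothetical stand-ins, and the checksum is a simple byte sum rather than whatever Kudu actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a status type; not Kudu's real Status class.
struct Status {
  bool ok;
  bool corruption;
  std::string msg;
  static Status OK() { return {true, false, ""}; }
  static Status Corruption(std::string m) { return {false, true, std::move(m)}; }
};

// Hypothetical IO context: carries the id of the tablet doing the read.
struct IoContext {
  std::string tablet_id;
};

// Hypothetical error manager: forwards corruption reports to a callback
// that is expected to fail (and later shut down) the affected replica.
class ErrorManager {
 public:
  using Callback = std::function<void(const std::string& tablet_id)>;
  void SetCorruptionCallback(Callback cb) { cb_ = std::move(cb); }
  void NotifyCorruption(const std::string& tablet_id) const {
    if (cb_) cb_(tablet_id);
  }

 private:
  Callback cb_;
};

// Toy checksum (sum of bytes); an assumption made for this sketch only.
uint32_t ToyChecksum(const std::string& data) {
  uint32_t sum = 0;
  for (unsigned char c : data) sum += c;
  return sum;
}

struct Block {
  std::string data;
  uint32_t checksum;  // checksum stored alongside the data
};

// Toy reader: ReadBlock() wraps VerifyChecksum() and handles corruption
// itself, mirroring the delegation described in the commit message.
class ToyCFileReader {
 public:
  ToyCFileReader(std::vector<Block> blocks, const ErrorManager* em)
      : blocks_(std::move(blocks)), error_manager_(em) {}

  Status ReadBlock(const IoContext* io_context, size_t idx,
                   std::string* out) const {
    assert(idx < blocks_.size());
    const Block& b = blocks_[idx];
    Status s = VerifyChecksum(b);
    if (!s.ok) {
      HandleCorruption(io_context, s);
      return s;
    }
    *out = b.data;
    return Status::OK();
  }

 private:
  Status VerifyChecksum(const Block& b) const {
    if (ToyChecksum(b.data) != b.checksum) {
      return Status::Corruption("checksum mismatch");
    }
    return Status::OK();
  }

  // Reports the tablet id from the IO context, if one was plumbed through.
  void HandleCorruption(const IoContext* io_context, const Status& s) const {
    if (s.corruption && io_context != nullptr && error_manager_ != nullptr) {
      error_manager_->NotifyCorruption(io_context->tablet_id);
    }
  }

  std::vector<Block> blocks_;
  const ErrorManager* error_manager_;
};
```

In this sketch a caller would register a callback that records or fails the tablet, then read blocks with an IoContext naming that tablet. A read that passes through a path with no IoContext would report nothing, which is presumably why the patch plumbs IOContexts into the IndexTreeIterator and the BloomCache.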
