Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8607 )
Change subject: handle disk failures during tablet copies
......................................................................

handle disk failures during tablet copies

There are two components in a tablet copy: the copy client (which receives
data) and the copy source session (which sends data).

Coarse-grained handling of disk failures during tablet copies is done for
the tablet copy client as follows:
* Before starting a copy client, if no disks are available to place the
  tablet, simply return (instead of failing a CHECK).
* Before downloading each WAL segment or block, check that the tablet is in
  a healthy group.

And for the tablet copy source session:
* Before sending a block or log segment, check whether the tablet has an
  error.

When an error is returned, the tablet copy client shuts down the replica,
leaving it in a failed state.

A test is added to ensure that both copy clients and source sessions with
failed disks return errors to the copying client.

Change-Id: Iacbfe446d01dd523fb2f2f81880e5af2551e979f
---
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_copy-test-base.h
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_service-test.cc
M src/kudu/tserver/tablet_copy_source_session.cc
M src/kudu/tserver/tablet_copy_source_session.h
8 files changed, 119 insertions(+), 8 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/8607/1
--
To view, visit http://gerrit.cloudera.org:8080/8607
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iacbfe446d01dd523fb2f2f81880e5af2551e979f
Gerrit-Change-Number: 8607
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <[email protected]>
