Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-1605. Blocks can be incorrectly deleted if TS crashes 

KUDU-1605. Blocks can be incorrectly deleted if TS crashes mid-copy

This fixes a bug in the way we handle tablet copies while replacing
existing tombstoned tablets:

- a tablet exists in TABLET_DATA_TOMBSTONED state
- we begin copying a new replica on top of this one
-- this calls TabletMetadata::ReplaceSuperBlock() using the remote
   superblock (importantly, this remote superblock contains remote block
   IDs)
- we crash mid-copy
- on restart, we see the "TABLET_DATA_COPYING" state and "roll forward"
  the deletion of this tablet. However, the block IDs here are the IDs from
  the remote machine, so we incorrectly delete a bunch of local blocks.
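
The sequence above can be reproduced with a toy model. This is only a
sketch: SuperBlock, BlockStore, StartCopyBuggy() and RollForwardDelete()
are hypothetical names made up for illustration, not the real Kudu
classes or functions, and the block store is reduced to a map from
block ID to owning tablet.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only; every name below is hypothetical, not Kudu's API.
enum class DataState { kTombstoned, kCopying, kReady };

struct SuperBlock {
  DataState state;
  std::vector<uint64_t> block_ids;  // blocks this replica claims to own
};

// Local block store: block ID -> name of the tablet that owns the block.
using BlockStore = std::map<uint64_t, std::string>;

// Old behavior: persist the remote superblock when the copy starts, so
// block IDs from the remote machine end up in the local superblock.
void StartCopyBuggy(SuperBlock* on_disk, const SuperBlock& remote) {
  *on_disk = remote;
  on_disk->state = DataState::kCopying;
}

// Restart path: a replica found in the copying state has its deletion
// rolled forward, which deletes every block listed in its superblock.
// Returns the IDs that were actually removed from the local store.
std::set<uint64_t> RollForwardDelete(const SuperBlock& on_disk,
                                     BlockStore* store) {
  std::set<uint64_t> deleted;
  if (on_disk.state != DataState::kCopying) return deleted;
  for (uint64_t id : on_disk.block_ids) {
    if (store->erase(id) > 0) deleted.insert(id);
  }
  return deleted;
}
```

If the remote superblock lists IDs 2 and 3 and the local store happens to
hold blocks 2 and 3 for other tablets, a crash after StartCopyBuggy()
followed by RollForwardDelete() removes both of those unrelated blocks.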

This has always been an issue, but was made worse in 0.10 by the fix for
KUDU-1538. After fixing KUDU-1538, the likelihood of a remote block ID
matching a local one is quite high, whereas before we'd usually not see
this bug.

The fix here is relatively simple: rather than writing the remote
superblock to disk when starting the copy, we just change the state of
the existing superblock to indicate 'copying'.
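
A minimal sketch of the idea behind the fix, under the same caveat: the
types and the MarkCopying() / BlocksToDeleteOnRestart() helpers are
hypothetical and only model the behavior described here, not the actual
patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model only; every name below is hypothetical, not Kudu's API.
enum class DataState { kTombstoned, kCopying, kReady };

struct SuperBlock {
  DataState state;
  std::vector<uint64_t> block_ids;  // always local block IDs in this model
};

// Fixed behavior: leave the existing superblock's block list untouched
// and only flip its state to indicate that a copy is in progress.
void MarkCopying(SuperBlock* on_disk) {
  on_disk->state = DataState::kCopying;
}

// Blocks that a restart would delete when rolling the deletion forward.
// Because remote IDs are never written, this can only name local blocks.
std::vector<uint64_t> BlocksToDeleteOnRestart(const SuperBlock& on_disk) {
  if (on_disk.state != DataState::kCopying) return {};
  return on_disk.block_ids;
}
```

With this approach a crash mid-copy leaves a superblock whose block list
still refers only to blocks the local replica owned before the copy
began, so rolling the deletion forward cannot touch other tablets' data.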

Change-Id: Ica25c5e4e5894ea80e416d9a4ad44dd25e0c6d53
Reviewed-by: Mike Percy <>
Tested-by: Todd Lipcon <>
M src/kudu/integration-tests/
M src/kudu/integration-tests/
M src/kudu/integration-tests/test_workload.h
M src/kudu/tablet/
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tserver/
M src/kudu/tserver/
7 files changed, 165 insertions(+), 65 deletions(-)

  Mike Percy: Looks good to me, approved
  Todd Lipcon: Verified


Gerrit-MessageType: merged
Gerrit-Change-Id: Ica25c5e4e5894ea80e416d9a4ad44dd25e0c6d53
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <>
Gerrit-Reviewer: Mike Percy <>
Gerrit-Reviewer: Todd Lipcon <>
