[kudu-CR] KUDU-1538: prevent block ID reuse to avoid potential data loss

Todd Lipcon (Code Review) Wed, 20 Jul 2016 22:15:07 -0700

Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit


    http://gerrit.cloudera.org:8080/3719

to look at the new patch set (#2).

Change subject: KUDU-1538: prevent block ID reuse to avoid potential data loss
......................................................................

KUDU-1538: prevent block ID reuse to avoid potential data loss

This changes the block managers to allocate block IDs sequentially
rather than randomly. Given our 64-bit block IDs, this prevents ever
reusing an ID (it would take thousands of years even at unrealistically
high allocation rates).
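
As a rough illustration of the claim above, here is a minimal sketch of
a sequential allocator together with the exhaustion arithmetic. The names
SequentialBlockIdAllocator and YearsToExhaust, and the rate of one
million allocations per second, are assumptions made for illustration;
they are not taken from the patch.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical allocator (not the patch's code): hands out strictly
// increasing 64-bit IDs, so no ID is returned twice until the counter wraps.
class SequentialBlockIdAllocator {
 public:
  explicit SequentialBlockIdAllocator(uint64_t first_id) : next_id_(first_id) {}

  // Returns the next unused ID; safe to call from multiple threads.
  uint64_t Next() { return next_id_.fetch_add(1, std::memory_order_relaxed); }

 private:
  std::atomic<uint64_t> next_id_;
};

// Years needed to consume the whole 64-bit ID space at a constant rate.
inline double YearsToExhaust(double allocations_per_second) {
  const double kIdSpace = 18446744073709551616.0;        // 2^64
  const double kSecondsPerYear = 365.25 * 24 * 60 * 60;  // 31,557,600
  return kIdSpace / allocations_per_second / kSecondsPerYear;
}
```

At the assumed rate of one million allocations per second,
YearsToExhaust(1e6) comes to roughly 585,000 years.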

The tricky part of this patch is that, in many unit tests, the BlockCache
singleton persists across multiple separate block managers. Even though
the test has torn down the old block manager and created a new one, the
BlockCache continues to cache entries from the previous block manager.
With block IDs always starting from 1, collisions were guaranteed and
many tests failed.
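
The collision described above can be sketched with a toy model. Everything
here (GlobalCache, ToyBlockManager, a cache keyed only by block ID) is a
simplified, hypothetical stand-in and not Kudu's actual BlockCache or
block manager API.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a process-wide cache keyed only by block ID.
// It outlives any individual block manager, like a singleton would.
inline std::unordered_map<uint64_t, std::string>& GlobalCache() {
  static std::unordered_map<uint64_t, std::string> cache;
  return cache;
}

// Hypothetical block manager that allocates IDs sequentially from first_id.
class ToyBlockManager {
 public:
  explicit ToyBlockManager(uint64_t first_id) : next_id_(first_id) {}

  // Stores data under a freshly allocated ID and returns that ID.
  uint64_t Write(const std::string& data) {
    const uint64_t id = next_id_++;
    blocks_[id] = data;
    return id;
  }

  // Consults the shared cache first; on a miss, reads this manager's own
  // storage and populates the cache.
  std::string Read(uint64_t id) {
    auto& cache = GlobalCache();
    auto it = cache.find(id);
    if (it != cache.end()) return it->second;
    const std::string data = blocks_.at(id);
    cache[id] = data;
    return data;
  }

 private:
  uint64_t next_id_;
  std::unordered_map<uint64_t, std::string> blocks_;
};
```

In this model, if one manager starting at 1 writes and reads a block, and
a second manager also starting at 1 then writes different data, the second
manager's Read() of its first block returns the first manager's cached
contents.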

The workaround is for the LBM to notice when it is running in a gtest
(by way of some weak symbol magic) and start its allocation at a
random point in block space, rather than starting at 1.

For the FBM, I did a simpler workaround and just always start allocation
at a random point. The FBM doesn't scan its block list at startup, so it
cannot know which IDs are already in use, and would therefore likely
suffer collisions at startup even in the 'normal' (non-test) case.
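
The two starting-point policies described above might look roughly like
the sketch below. The function names, the running_in_test parameter, and
the upper bound kMaxRandomStart are all assumptions for illustration. How
the patch actually detects gtest (the weak symbol mechanism) is not shown
here; the flag is simply passed in.

```cpp
#include <cstdint>
#include <random>

// Assumed upper bound for a random starting ID. Capping at 2^63 leaves at
// least half of the ID space for sequential allocation after the start.
constexpr uint64_t kMaxRandomStart = uint64_t{1} << 63;

// Picks a uniformly random starting ID in [1, kMaxRandomStart].
inline uint64_t RandomStart(std::mt19937_64& rng) {
  std::uniform_int_distribution<uint64_t> dist(1, kMaxRandomStart);
  return dist(rng);
}

// Hypothetical LBM policy: start at 1 normally, at a random point when
// running inside a test.
inline uint64_t ChooseLogBlockManagerStart(bool running_in_test,
                                           std::mt19937_64& rng) {
  return running_in_test ? RandomStart(rng) : 1;
}

// Hypothetical FBM policy: always start at a random point.
inline uint64_t ChooseFileBlockManagerStart(std::mt19937_64& rng) {
  return RandomStart(rng);
}
```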

Unfortunately there's no real way to write a regression test for this:
the bug would only reproduce after inserting tens of terabytes of data
in the presence of lots of remote bootstraps, etc.

Change-Id: Id45bf81bd6bccd51937c358716ace895ccee469c
---
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/util/test_util.cc
6 files changed, 50 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/3719/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3719
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id45bf81bd6bccd51937c358716ace895ccee469c
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
