[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-06 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8465 )

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 06 Nov 2017 23:03:37 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-06 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8465 )

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8465/8/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/8465/8/src/kudu/tablet/tablet_metadata.cc@420
PS8, Line 420: fs_manager()->block_manager()->NotifyBlockId(max_block_id);
> it seems possible that max_block_id is uninitialized at this point if there
Yeah, it gets initialized to 0 (see fs/block_id.cc:32)



--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 06 Nov 2017 22:49:21 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-06 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8465 )

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8465/8/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/8465/8/src/kudu/tablet/tablet_metadata.cc@420
PS8, Line 420: fs_manager()->block_manager()->NotifyBlockId(max_block_id);
it seems possible that max_block_id is uninitialized at this point if there 
were no blocks. Does it have a well-defined uninitialized value that makes this 
safe?



--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 06 Nov 2017 22:47:13 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8465 )

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8465/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8465/7//COMMIT_MSG@24
PS7, Line 24: the
> Double the
Done


http://gerrit.cloudera.org:8080/#/c/8465/7/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

http://gerrit.cloudera.org:8080/#/c/8465/7/src/kudu/fs/block_manager.h@265
PS7, Line 265:   // externally-referenced ids that they may not have previously 
found.
> Getting better. Perhaps it would be good to add an example that will help p
Done


http://gerrit.cloudera.org:8080/#/c/8465/7/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/8465/7/src/kudu/tablet/tablet_metadata.cc@404
PS7, Line 404: if a disk
 : // was not read
> Can you reword this example so it's more relatable to the relevant Kudu con
Done



--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Sat, 04 Nov 2017 05:37:18 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8465

to look at the new patch set (#8).

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..

KUDU-2202 avoid block ID reuse for missing dirs

The block manager is initially only aware of the blocks it reads when
first loading the FS layout. If the opening of a directory fails (e.g.
due to a disk failure), the server can still start up, but will fail any
tablet configured to use that directory. By not scanning the failed
directory, the server will "forget" about some blocks.

Since LBM block IDs are sequentially allocated just past the end of the
greatest known ID, it's possible for a new tablet to create blocks with
IDs already assigned to a failed tablet. If that failed tablet is
deleted, the live data belonging to the new tablet will be deleted.

To prevent this, a new API has been added to the block managers to grant
external components the ability to notify the block manager of block
IDs. In loading a tablet's metadata, once all rowsets and orphaned
blocks have been seen, the block manager is notified of the highest
block ID in the superblock.

This patch only targets the LBM, which assigns blocks sequentially,
and therefore only needs to know about the largest block ID referenced
by tablets.

A unit test is added to log_block_manager-test and an integration test
is added to ts_recovery-itest to ensure block ID reuse does not happen.
Also includes some C++11/stylistic cleanup.

Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
---
M src/kudu/fs/block_id.h
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
15 files changed, 247 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/8465/8
--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8465 )

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..


Patch Set 7:

(3 comments)

Looks good, just some nits on comments.

http://gerrit.cloudera.org:8080/#/c/8465/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8465/7//COMMIT_MSG@24
PS7, Line 24: the
Double the


http://gerrit.cloudera.org:8080/#/c/8465/7/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

http://gerrit.cloudera.org:8080/#/c/8465/7/src/kudu/fs/block_manager.h@265
PS7, Line 265:   // externally-referenced ids that they may not have previously 
found.
Getting better. Perhaps it would be good to add an example that will help 
people understand how the block manager may have previously not found a block? 
Like "...avoid reusing externally-referenced ids that they may not have 
previously found (e.g. because those ids' blocks were on a data directory that 
failed)".


http://gerrit.cloudera.org:8080/#/c/8465/7/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/8465/7/src/kudu/tablet/tablet_metadata.cc@404
PS7, Line 404: if a disk
 : // was not read
Can you reword this example so it's more relatable to the relevant Kudu 
concepts? Like, it should probably talk about data directories rather than 
disks, and maybe it should explain why a directory would not be "read" (because 
it failed).



--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Sat, 04 Nov 2017 05:16:04 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8465 )

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8465/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8465/5//COMMIT_MSG@16
PS5, Line 16: for new
> for a new
Missed this, sorry



--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Sat, 04 Nov 2017 05:01:30 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8465

to look at the new patch set (#7).

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..

KUDU-2202 avoid block ID reuse for missing dirs

The block manager is initially only aware of the blocks it reads when
first loading the FS layout. If the opening of a directory fails (e.g.
due to a disk failure), the server can still start up, but will fail any
tablet configured to use that directory. By not scanning the failed
directory, the server will "forget" about some blocks.

Since LBM block IDs are sequentially allocated just past the end of the
greatest known ID, it's possible for a new tablet to create blocks with
IDs already assigned to a failed tablet. If that failed tablet is
deleted, the live data belonging to the new tablet will be deleted.

To prevent this, a new API has been added to the block managers to grant
external components the ability to notify the block manager of block
IDs. In loading a tablet's metadata, once all rowsets and orphaned
blocks have been seen, the block manager is notified of the highest
block ID in the the superblock.

This patch only targets the LBM, which assigns blocks sequentially,
and therefore only needs to know about the largest block ID referenced
by tablets.

A unit test is added to log_block_manager-test and an integration test
is added to ts_recovery-itest to ensure block ID reuse does not happen.
Also includes some C++11/stylistic cleanup.

Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
---
M src/kudu/fs/block_id.h
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
15 files changed, 246 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/8465/7
--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8465 )

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..


Patch Set 5:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/fs/block_manager.h@264
PS5, Line 264: should use
> The new wording isn't quite right; it suggests that the block manager itsel
Done


http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/fs/log_block_manager-test.cc@500
PS5, Line 500:   // Simulate a complete reset of the block manager's block ID 
record, e.g.
 :   // restarting but with all the blocks gone.
 :   bm_->next_block_id_.Store(1);
> Rather than simulating it in this way, how about deleting the containers fr
I agree that'd be not too complicated: DoGetContainers() and then delete some 
files.

There's a hack in log_block_manager.cc:1706 that makes this not the case.

I could maybe add a new block manager option to turn of the seeding, but I'll 
hold off given we get the same behavior reaching in.


http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/integration-tests/ts_recovery-itest.cc
File src/kudu/integration-tests/ts_recovery-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/integration-tests/ts_recovery-itest.cc@149
PS5, Line 149:   // Set up a basic server that flushes often so we create 
blocks quickly.
> BTW, it might be useful to wait for some compactions too, or some other ope
Done


http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/tablet/tablet_metadata.cc@403
PS5, Line 403: BlockId max_block_id;
> Probably worth adding a comment to this block of code (where you aggregate
Done


http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/tablet/tablet_metadata.cc@405
PS5, Line 405: if (!block_ids.empty()) {
> Is this check needed?
With some restructuring, no. But there could be no block_ids, and there would 
be a segfault a L406.


http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/tablet/tablet_metadata.cc@418
PS5, Line 418: fs_manager()->block_manager()->NotifyBlockId(max_block_id);
> What happens if we make it down here and max_block_id is still uninitialize
The default is 0 (invalid), so it should no-op.



--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Sat, 04 Nov 2017 04:05:20 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8465

to look at the new patch set (#6).

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..

KUDU-2202 avoid block ID reuse for missing dirs

The block manager is initially only aware of the blocks it reads when
first loading the FS layout. If the opening of a directory fails (e.g.
due to a disk failure), the server can still start up, but will fail any
tablet configured to use that directory. By not scanning the failed
directory, the server will "forget" about some blocks.

Since LBM block IDs are sequentially allocated just past the end of the
greatest known ID, it's possible for new tablet to create blocks with
IDs already assigned to a failed tablet. If that failed tablet is
deleted, the live data belonging to the second tablet will be deleted.

To prevent this, a new API has been added to the block managers to grant
external components the ability to notify the block manager of block
IDs. In loading a tablet's metadata, once all rowsets and orphaned
blocks have been seen, the block manager is notified of the highest
block ID in the the superblock.

This patch only targets the LBM, which assigns blocks sequentially,
and therefore only needs to know about the largest block ID referenced
by tablets.

A unit test is added to log_block_manager-test and an integration test
is added to ts_recovery-itest to ensure block ID reuse does not happen.
Also includes some C++11/stylistic cleanup.

Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
---
M src/kudu/fs/block_id.h
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
15 files changed, 246 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/8465/6
--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8465 )

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..


Patch Set 5:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/8465/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8465/5//COMMIT_MSG@16
PS5, Line 16: for new
for a new


http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/fs/block_manager.h@264
PS5, Line 264: should use
The new wording isn't quite right; it suggests that the block manager itself 
"uses" NotifyBlockId, but it's actually other components that do.


http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/fs/log_block_manager-test.cc@500
PS5, Line 500:   // Simulate a complete reset of the block manager's block ID 
record, e.g.
 :   // restarting but with all the blocks gone.
 :   bm_->next_block_id_.Store(1);
Rather than simulating it in this way, how about deleting the containers from 
disk and reloading the block manager? That's a little bit more "real world" 
without making the test too complicated, I hope. Would let you avoid direct 
access to next_block_id_ too.


http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/integration-tests/ts_recovery-itest.cc
File src/kudu/integration-tests/ts_recovery-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/integration-tests/ts_recovery-itest.cc@149
PS5, Line 149:   // Set up a basic server that flushes often so we create 
blocks quickly.
BTW, it might be useful to wait for some compactions too, or some other 
operation that would delete some blocks, so that some block IDs make it into 
the orphaned block list.


http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/tablet/tablet_metadata.cc@403
PS5, Line 403: BlockId max_block_id;
Probably worth adding a comment to this block of code (where you aggregate all 
the block IDs and compute the max) explaining the intent.


http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/tablet/tablet_metadata.cc@405
PS5, Line 405: if (!block_ids.empty()) {
Is this check needed?


http://gerrit.cloudera.org:8080/#/c/8465/5/src/kudu/tablet/tablet_metadata.cc@418
PS5, Line 418: fs_manager()->block_manager()->NotifyBlockId(max_block_id);
What happens if we make it down here and max_block_id is still uninitialized? 
What value is stored by the block manager?



--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Sat, 04 Nov 2017 02:58:30 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8465

to look at the new patch set (#5).

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..

KUDU-2202 avoid block ID reuse for missing dirs

The block manager is initially only aware of the blocks it reads when
first loading the FS layout. If the opening of a directory fails (e.g.
due to a disk failure), the server can still start up, but will fail any
tablet configured to use that directory. By not scanning the failed
directory, the server will "forget" about some blocks.

Since LBM block IDs are sequentially allocated just past the end of the
greatest known ID, it's possible for new tablet to create blocks with
IDs already assigned to a failed tablet. If that failed tablet is
deleted, the live data belonging to the second tablet will be deleted.

To prevent this, a new API has been added to the block managers to grant
external components the ability to notify the block manager of block
IDs. In loading a tablet's metadata, once all rowsets and orphaned
blocks have been seen, the block manager is notified of the highest
block ID in the its superblock.

This patch only targets the LBM, which assigns blocks sequentially,
and therefore only needs to know about the largest block ID referenced
by tablets.

A unit test is added to log_block_manager-test and an integration test
is added to ts_recovery-itest to ensure block ID reuse does not happen.
Also includes some C++11/stylistic cleanup.

Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
---
M src/kudu/fs/block_id.h
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
15 files changed, 233 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/8465/5
--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8465

to look at the new patch set (#4).

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..

KUDU-2202 avoid block ID reuse for missing dirs

The block manager is initially only aware of the blocks it reads when
first loading the FS layout. If the opening of a directory fails (e.g.
due to a disk failure), the server can still start up, but will fail any
tablet configured to use that directory. By not scanning the failed
directory, the server will "forget" about some blocks.

Since LBM block IDs are sequentially allocated just past the end of the
greatest known ID, it's possible for new tablet to create blocks with
IDs already assigned to a failed tablet. If that failed tablet is
deleted, the live data belonging to the second tablet will be deleted.

To prevent this, a new API has been added to the block managers to grant
external components the ability to notify the block manager of block
IDs. In loading a tablet's metadata, once all rowsets and orphaned
blocks have been seen, the block manager is notified of the highest
block ID in the its superblock.

This patch only targets the LBM, which assigns blocks sequentially,
and therefore only needs to know about the largest block ID referenced
by tablets.

A unit test is added to log_block_manager-test and an integration test
is added to ts_recovery-itest to ensure block ID reuse does not happen.
Also includes some C++11/stylistic cleanup.

Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
---
M src/kudu/fs/block_id.h
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
15 files changed, 240 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/8465/4
--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8465 )

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..


Patch Set 3:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8465/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8465/1//COMMIT_MSG@10
PS1, Line 10: If the opening of a directory fails (e.g.
: due to a disk failure), the server can still start up, but will 
fail any
: tablet configured to use that directory. By not scanning the 
failed
: directory, the server will "forget" about some blocks.
:
> I'd say that the server will still start up but fail any tablet configured
Done


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/fs/block_manager.h@264
PS1, Line 264:  to
> should
Done


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/fs/block_manager.h@266
PS1, Line 266:   virtual void NotifyBlockId(BlockId block_id) = 0;
> Since BlockIds are basically uint64_ts, we should just pass them by copy.
Done


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc
File src/kudu/integration-tests/ts_recovery-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc@144
PS1, Line 144:   if (GetParam() != "log") {
 : LOG(INFO) << "Missin
> Where's the multi-directory part?
Was originally structured to fail a directory. No longer!


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc@199
PS1, Line 199:   for (const string& dir : ts->data_dirs()) {
 : const string& data_dir = JoinPathSegments(dir, "data");
 : vector children;
> This will fail the old tablet, right? Is that worth asserting for?
Done


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc@207
PS1, Line 207:
> You can get this special string from data_dirs.h (or should be able to, any
Done


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc@208
PS1, Line 208:
> ASSERT_OK?
Done


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc@233
PS1, Line 233: new_block_ids.begin(), 
new_block_ids.end(),
> You could SCOPED_TRACE this (and the first/second sets) to avoid logging th
Done


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/tablet/tablet_metadata.cc@399
PS1, Line 399:   next_rowset_idx_ = std::max(next_rowset_idx_, 
rowset_meta->id() + 1);
> I'm curious whether this plumbing the max "down" like this is more efficien
Ran a few runs of dense_node-itest and it seems like it doesn't make a huge 
difference.
Good call though, bootstrap is dominated by disk access, the difference in the 
O(N) copy here is pretty negligible. WiIl go with that approach.


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/tablet/tablet_metadata.cc@413
PS1, Line 413: ager of blocks it may not be aware of (e.g. if a
 : // disk was not read).
> Not really necessary for this comment.
Done



--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Sat, 04 Nov 2017 01:13:07 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8465

to look at the new patch set (#3).

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..

KUDU-2202 avoid block ID reuse for missing dirs

The block manager is initially only aware of the blocks it reads when
first loading the FS layout. If the opening of a directory fails (e.g.
due to a disk failure), the server can still start up, but will fail any
tablet configured to use that directory. By not scanning the failed
directory, the server will "forget" about some blocks.

Since LBM block IDs are sequentially allocated just past the end of the
greatest known ID, it's possible for new tablet to create blocks with
IDs already assigned to a failed tablet. If that failed tablet is
deleted, the live data belonging to the second tablet will be deleted.

To prevent this, a new API has been added to the block managers to grant
external components the ability to notify the block manager of block
IDs. The largest block ID is kept track of when opening and reading a
tablet's metadata. Once all rowsets and orphaned blocks have been seen,
the block manager is notified of the highest block ID in the tablet's
superblock.

This patch only targets the LBM, which assigns blocks sequentially,
and therefore only needs to know about the largest block ID referenced
by tablets.

A unit test is added to log_block_manager-test and an integration test
is added to ts_recovery-itest to ensure block ID reuse does not happen.
Also includes some C++11/stylistic cleanup.

Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
---
M src/kudu/fs/block_id.h
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
15 files changed, 237 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/8465/3
--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8465 )

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/fs/file_block_manager.cc@942
PS1, Line 942: void FileBlockManager::NotifyBlockId(const BlockId& /* block_id 
*/) {
> warning: parameter 'block_id' is unused [misc-unused-parameters]
Done


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc
File src/kudu/integration-tests/ts_recovery-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc@103
PS1, Line 103: using tablet::TabletSuperBlockPB;
> warning: using decl 'Substitute' is unused [misc-unused-using-decls]
Done


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc@104
PS1, Line 104:
> warning: using decl 'RowSetDataPB' is unused [misc-unused-using-decls]
Done


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/tablet/rowset_metadata.cc
File src/kudu/tablet/rowset_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/tablet/rowset_metadata.cc@49
PS1, Line 49:   *metadata = std::move(ret);
> warning: prefer ptr1 = std::move(ptr2) over ptr1.reset(ptr2.release()) [mis
Done


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/tablet/tablet_metadata.cc@678
PS1, Line 678:   AtomicWord rowset_idx = 
Barrier_AtomicIncrement(_rowset_idx_, 1) - 1;
> warning: parameter 'schema' is unused [misc-unused-parameters]
Done



--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 03 Nov 2017 23:32:38 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8465 )

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..


Patch Set 1:

(11 comments)

I think it would also be nice to have an LBM-level test that ensures block IDs 
are not reused. It should be far simpler to write.

http://gerrit.cloudera.org:8080/#/c/8465/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8465/1//COMMIT_MSG@10
PS1, Line 10: If the opening of a directory fails (e.g.
: due to a disk failure), the server can still start up, but it may 
not
: know about some blocks, and may end up assigning them to a new 
tablet.
: If the blocks ID is used by two tablets, we can get into 
situations
: where deleting one will delete data for the other.
I'd say that the server will still start up but fail any tablet configured to 
use that directory. Additionally, by not scanning the failed directory, the 
server will "forget" about some blocks, and since LBM block IDs are 
sequentially allocated beginning just past the block with the greatest ID, it's 
possible for a second tablet to create a block and be allocated a block ID 
assigned to a failed tablet. Then, if the failed tablet is deleted, it'll also 
delete live data belonging to that second tablet.

You can put this more concisely, but the keys are to talk about how block ID 
selection works, and to draw a more explicit connection between the reuse by a 
second tablet and the data corruption when deleting the failed tablet.


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/fs/block_manager.h@264
PS1, Line 264: can
should


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/fs/block_manager.h@266
PS1, Line 266:   virtual void NotifyBlockId(const BlockId& block_id) = 0;
Since BlockIds are basically uint64_ts, we should just pass them by copy.


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc
File src/kudu/integration-tests/ts_recovery-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc@144
PS1, Line 144:   // Set up a basic multi-directory server with some rows that 
flushes often so
 :   // we get some blocks.
Where's the multi-directory part?


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc@199
PS1, Line 199:
 :   // Now empty the data directories of blocks. The server will 
start up, not
 :   // read any blocks, but should still avoid the old tablet's 
blocks.
This will fail the old tablet, right? Is that worth asserting for?


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc@207
PS1, Line 207: "block_manager_instance"
You can get this special string from data_dirs.h (or should be able to, anyway).


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc@208
PS1, Line 208: WARN_NOT_OK
ASSERT_OK?


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc@233
PS1, Line 233:   LOG(INFO) << "Intersection: " << 
BlockId::JoinStrings(block_id_intersection);
You could SCOPED_TRACE this (and the first/second sets) to avoid logging this 
except in the event of failure.


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/integration-tests/ts_recovery-itest.cc@233
PS1, Line 233:   LOG(INFO) << "Intersection: " << 
BlockId::JoinStrings(block_id_intersection);
You could SCOPED_TRACE this (and the first/second sets) to avoid logging this 
except in the event of failure.


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/tablet/tablet_metadata.cc@399
PS1, Line 399:   RETURN_NOT_OK(RowSetMetadata::Load(this, rowset_pb, 
_block_id, _meta));
I'm curious whether this plumbing the max "down" like this is more efficient 
than using RowSetMetadata::GetAllBlocks (or even 
TabletMetadata::CollectAllBlockIDs) to fetch all the IDs, and selecting the 
maximum amongst them. If not, the bulk approach is certainly less code.


http://gerrit.cloudera.org:8080/#/c/8465/1/src/kudu/tablet/tablet_metadata.cc@413
PS1, Line 413: This is only relevant for block managers that assign
 : // sequential block IDs,
Not really necessary for this comment.



--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 

[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8465

to look at the new patch set (#2).

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..

KUDU-2202 avoid block ID reuse for missing dirs

The block manager is initially only aware of the blocks it reads when
first loading the FS layout. If the opening of a directory fails (e.g.
due to a disk failure), the server can still start up, but it may not
know about some blocks, and may end up assigning them to a new tablet.
If the blocks ID is used by two tablets, we can get into situations
where deleting one will delete data for the other.

To prevent this, a new API has been added to the block managers to grant
external components the ability to notify the block manager of block
IDs. The largest block ID is kept track of when opening and reading a
tablet's metadata. Once all rowsets and orphaned blocks have been seen,
the block manager is notified of the highest block ID in the tablet's
superblock.

This patch only targets the LBM, which assigns blocks sequentially,
and therefore only needs to know about the largest block ID referenced
by tablets.

An integration test is added to ts_recovery-itest to ensure block ID
reuse does not happen.

Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
---
M src/kudu/fs/block_id.h
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
14 files changed, 213 insertions(+), 65 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/8465/2
--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Andrew Wong (Code Review)
Andrew Wong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8465


Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..

KUDU-2202 avoid block ID reuse for missing dirs

The block manager is initially only aware of the blocks it reads when
first loading the FS layout. If the opening of a directory fails (e.g.
due to a disk failure), the server can still start up, but it may not
know about some blocks, and may end up assigning them to a new tablet.
If the blocks ID is used by two tablets, we can get into situations
where deleting one will delete data for the other.

To prevent this, a new API has been added to the block managers to grant
external components the ability to notify the block manager of block
IDs. The largest block ID is kept track of when opening and reading a
tablet's metadata. Once all rowsets and orphaned blocks have been seen,
the block manager is notified of the highest block ID in the tablet's
superblock.

This patch only targets the LBM, which assigns blocks sequentially,
and therefore only needs to know about the largest block ID referenced
by tablets.

An integration test is added to ts_recovery-itest to ensure block ID
reuse does not happen.

Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
---
M src/kudu/fs/block_id.h
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tablet/tablet_metadata.cc
11 files changed, 210 insertions(+), 58 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/8465/1
--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong