[kudu-CR] fs: generate report during Open

2017-04-27 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged.

Change subject: fs: generate report during Open
..


fs: generate report during Open

This patch modifies the FsManager and block managers to produce a filesystem
report when opened. The report includes various filesystem statistics as
well as all inconsistencies found and repaired.

The heart of the patch is the FsReport nested data structure. Originally
implemented as a protobuf to take advantage of DebugString(), I found it to
be a poor fit once I began customizing the log output, so I rewrote it in
its current form. The report includes fs-wide statistics as well as optional
consistency checks that may or may not be performed by the block managers.
Reports can be merged; the LBM takes advantage of this as it processes data
directories in parallel. The consistency checks have structure, so as to
simplify testing and ToString() customization. A sample report can be found
at the end of this commit message.

Thanks to the level of detail in the FsReport, the LBM now separates the act
of finding inconsistencies from repairing them. This makes it much easier to
enforce that repairs aren't done on a read-only filesystem, or if there were
any fatal and irreparable errors.

Thorough testing of the inconsistencies in the report is deferred to a
different patch as this one was already quite large.

Other changes of note:
- The LBM treats data files without matching metadata files as incomplete
  containers; previously only metadata files without matching data files
  were treated in this way.
- The LBM maps all containers by name so they can be looked up during the
  repair process; previously they were stored in a vector.
- The checks performed by "kudu fs check" are in FsReport. I think this
  makes sense, and it's an acknowledgement that they should be performed by
  the block manager in the future. The code in tool_action_fs.cc was
  restructured to write its findings into the report before logging it.

Block manager report

1 data directories: 
/tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059
Total live blocks: 47
Total live bytes: 221524
Total live bytes (after alignment): 323582
Total number of LBM containers: 53 (21 full)
Did not check for missing blocks
Did not check for orphaned blocks
Total full LBM containers with extra space: 0 (0 repaired)
Total full LBM container extra space in bytes: 0 (0 repaired)
Total incomplete LBM containers: 3 (3 repaired)
Misaligned block in container 
/tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059/038a826c27db4e6da8237ce79853ece2:
 block_id {
  id: 9225972546088407965
}
op_type: CREATE
timestamp_us: 0
offset: 1052673
length: 16384

Misaligned block in container 
/tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059/0612d7730bdc4cfd83afbaed271d5289:
 block_id {
  id: 3148617980286357277
}
op_type: CREATE
timestamp_us: 0
offset: 1048577
length: 16384

Total LBM partial records: 7 (7 repaired)

Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Reviewed-on: http://gerrit.cloudera.org:8080/6581
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon 
---
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
A src/kudu/fs/fs_report.cc
A src/kudu/fs/fs_report.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/server/server_base.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/util/pb_util.cc
17 files changed, 1,104 insertions(+), 217 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: generate report during Open

2017-04-27 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 10: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] fs: generate report during Open

2017-04-27 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 10: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] fs: generate report during Open

2017-04-24 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 9:

There were two real changes from v8:
- Rename of the "full space preallocation check" (to generalize for hole 
repunching).
- Move of misaligned block check so that deleted misaligned blocks aren't 
flagged.

-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] fs: generate report during Open

2017-04-24 Thread Adar Dembo (Code Review)
Hello David Ribeiro Alves, Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6581

to look at the new patch set (#9).

Change subject: fs: generate report during Open
..

fs: generate report during Open

This patch modifies the FsManager and block managers to produce a filesystem
report when opened. The report includes various filesystem statistics as
well as all inconsistencies found and repaired.

The heart of the patch is the FsReport nested data structure. Originally
implemented as a protobuf to take advantage of DebugString(), I found it to
be a poor fit once I began customizing the log output, so I rewrote it in
its current form. The report includes fs-wide statistics as well as optional
consistency checks that may or may not be performed by the block managers.
Reports can be merged; the LBM takes advantage of this as it processes data
directories in parallel. The consistency checks have structure, so as to
simplify testing and ToString() customization. A sample report can be found
at the end of this commit message.

Thanks to the level of detail in the FsReport, the LBM now separates the act
of finding inconsistencies from repairing them. This makes it much easier to
enforce that repairs aren't done on a read-only filesystem, or if there were
any fatal and irreparable errors.

Thorough testing of the inconsistencies in the report is deferred to a
different patch as this one was already quite large.

Other changes of note:
- The LBM treats data files without matching metadata files as incomplete
  containers; previously only metadata files without matching data files
  were treated in this way.
- The LBM maps all containers by name so they can be looked up during the
  repair process; previously they were stored in a vector.
- The checks performed by "kudu fs check" are in FsReport. I think this
  makes sense, and it's an acknowledgement that they should be performed by
  the block manager in the future. The code in tool_action_fs.cc was
  restructured to write its findings into the report before logging it.

Block manager report

1 data directories: 
/tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059
Total live blocks: 47
Total live bytes: 221524
Total live bytes (after alignment): 323582
Total number of LBM containers: 53 (21 full)
Did not check for missing blocks
Did not check for orphaned blocks
Total full LBM containers with extra space: 0 (0 repaired)
Total full LBM container extra space in bytes: 0 (0 repaired)
Total incomplete LBM containers: 3 (3 repaired)
Misaligned block in container 
/tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059/038a826c27db4e6da8237ce79853ece2:
 block_id {
  id: 9225972546088407965
}
op_type: CREATE
timestamp_us: 0
offset: 1052673
length: 16384

Misaligned block in container 
/tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059/0612d7730bdc4cfd83afbaed271d5289:
 block_id {
  id: 3148617980286357277
}
op_type: CREATE
timestamp_us: 0
offset: 1048577
length: 16384

Total LBM partial records: 7 (7 repaired)

Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
---
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
A src/kudu/fs/fs_report.cc
A src/kudu/fs/fs_report.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/server/server_base.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/util/pb_util.cc
17 files changed, 1,104 insertions(+), 217 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/9
-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: generate report during Open

2017-04-19 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 8: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] fs: generate report during Open

2017-04-19 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 8: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] fs: generate report during Open

2017-04-13 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 7: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

PS4, Line 69: FsReport report;
> Not sure where you got 120 from. The Google Style Guide (https://google.git
right 100, not 120. guess I didn't recall the mostly-80 recommendation.

Still see it as worthless since I can't really have a less than 100 cols editor 
given how often there are spills.

To keep from constantly scrolling horizontally in most parts of the code base I 
have at least 100 cols visible at which point vertical space is more important 
to me.

anyway, not worth blocking work for comment wrapping. my personal preference, 
if we're taking those into account, has been noted, moving on.


http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS4, Line 584: ile_si
> But that's obvious from L585-588, isn't it?
I guess my personal preference is precision vs color. For me I would prefer a: 
"First verify that the record's offset/length aren't non-existent or negative." 
or somesuch than a " First verify that the record's offset/length aren't wildly 
incorrect." that forces me to go read the if conditions to understand what that 
means. again, moving on.


-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] fs: generate report during Open

2017-04-13 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

PS4, Line 69: // unnecessarily.
> I don't think that it makes a lot of sense for people to have different wra
Not sure where you got 120 from. The Google Style Guide 
(https://google.github.io/styleguide/cppguide.html#Line_Length) mandates 80. 
Our style guide relaxes that to 100 (and that's enforced by LINT precommit 
runs), but recommends staying under 80 
(http://kudu.apache.org/docs/contributing.html#_line_length).


http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS4, Line 324: truly catastrophic
> can you spell that out then? it's unlikely that the next person using this 
Done


PS4, Line 584: wildly
> can you spell that out then, please? (that's kind of what I meant from my c
But that's obvious from L585-588, isn't it?

This comment isn't here to explain the checks in detail (since that's obvious 
from the code), but to contrast with the comment on L594. That is, after 
reading L594, you should know that it's only OK to compare the data_file_size 
with the record offset+length thanks to the checks on L585-588, and if you were 
modifying this function, you shouldn't reorder the comparison to precede those 
checks.


PS4, Line 1726: Truncation is performed even if the container's logical file 
size
  :   // (available by proxy via preallocated_offset_) and 
total_bytes_written_
  :   // agree, because XFS's speculative preallocation feature 
may artificially
  :   // enlarge the file without updating its file size.
> Yeah, but if it starts with "For XFS filesystems" or somesuch, then you kno
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] fs: generate report during Open

2017-04-13 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6581

to look at the new patch set (#7).

Change subject: fs: generate report during Open
..

fs: generate report during Open

This patch modifies the FsManager and block managers to produce a filesystem
report when opened. The report includes various filesystem statistics as
well as all inconsistencies found and repaired.

The heart of the patch is the FsReport nested data structure. Originally
implemented as a protobuf to take advantage of DebugString(), I found it to
be a poor fit once I began customizing the log output, so I rewrote it in
its current form. The report includes fs-wide statistics as well as optional
consistency checks that may or may not be performed by the block managers.
Reports can be merged; the LBM takes advantage of this as it processes data
directories in parallel. The consistency checks have structure, so as to
simplify testing and ToString() customization. A sample report can be found
at the end of this commit message.

Thanks to the level of detail in the FsReport, the LBM now separates the act
of finding inconsistencies from repairing them. This makes it much easier to
enforce that repairs aren't done on a read-only filesystem, or if there were
any fatal and irreparable errors.

Thorough testing of the inconsistencies in the report is deferred to a
different patch as this one was already quite large.

Other changes of note:
- The LBM treats data files without matching metadata files as incomplete
  containers; previously only metadata files without matching data files
  were treated in this way.
- The LBM maps all containers by name so they can be looked up during the
  repair process; previously they were stored in a vector.
- The checks performed by "kudu fs check" are in FsReport. I think this
  makes sense, and it's an acknowledgement that they should be performed by
  the block manager in the future. The code in tool_action_fs.cc was
  restructured to write its findings into the report before logging it.

Block manager report

1 data directories: 
/tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059
Total live blocks: 47
Total live bytes: 221524
Total live bytes (after alignment): 323582
Total number of LBM containers: 53 (21 full)
Did not check for missing blocks
Did not check for orphaned blocks
Total full LBM containers with preallocated space: 0 (0 repaired)
Total full LBM container preallocated space in bytes: 0 (0 repaired)
Total incomplete LBM containers: 3 (3 repaired)
Misaligned block in container 
/tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059/038a826c27db4e6da8237ce79853ece2:
 block_id {
  id: 9225972546088407965
}
op_type: CREATE
timestamp_us: 0
offset: 1052673
length: 16384

Misaligned block in container 
/tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059/0612d7730bdc4cfd83afbaed271d5289:
 block_id {
  id: 3148617980286357277
}
op_type: CREATE
timestamp_us: 0
offset: 1048577
length: 16384

Total LBM partial records: 7 (7 repaired)

Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
---
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
A src/kudu/fs/fs_report.cc
A src/kudu/fs/fs_report.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/server/server_base.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/util/pb_util.cc
17 files changed, 1,092 insertions(+), 216 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: generate report during Open

2017-04-13 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

PS4, Line 69: // unnecessarily.
> Generally I wrap at 80 characters, and this extends to 81 if unwrapped. But
I don't think that it makes a lot of sense for people to have different 
wrapping policies. Not saying that 80 chars is wrong or right, just that we 
should all do it or none of us should do it. Given that code style mandates 
wrapping at 120 I usually point as nits forced wrapping well below that where 
there's a single word in the last line. particularly when wrapping occurs in 
places where you can find surrounding code that doesn't, like this one.

In general, if devs need to have their displays adjusted to 120 cols anyway, 
vertical space is more precious than horizontal space, IMO


http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS4, Line 324: truly catastrophic
> This isn't for fatal and irreparable inconsistencies though; this is for "w
can you spell that out then? it's unlikely that the next person using this will 
know what you mean by "truly catastrophic"


PS4, Line 584: wildly
> Non-existent or negative values (what we're testing for in the subsequent c
can you spell that out then, please? (that's kind of what I meant from my 
comment)


PS4, Line 1726: Truncation is performed even if the container's logical file 
size
  :   // (available by proxy via preallocated_offset_) and 
total_bytes_written_
  :   // agree, because XFS's speculative preallocation feature 
may artificially
  :   // enlarge the file without updating its file size.
> Why? It's just one sentence and you can't read it without seeing "XFS specu
Yeah, but if it starts with "For XFS filesystems" or somesuch, then you know 
you don't need to read it at all if what you're concerned about is another 
filesystem


PS4, Line 1753: // TODO(adar): track as an inconsistency?
> I hesitated to track it as an inconsistency for it for a couple reasons:
sounds good


-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] fs: generate report during Open

2017-04-12 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 4:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/6581/4//COMMIT_MSG
Commit Message:

PS4, Line 9: This patch modifies the FsManager and block managers to produce a 
filesystem
   : report when opened. The report includes various filesystem 
statistics as
   : well as all inconsistencies found and repaired.
   : 
   : The heart of the patch is the FsReport nested data structure. 
Originally
   : implemented as a protobuf to take advantage of DebugString(), I 
found it to
   : be a poor fit once I began customizing the log output, so I 
rewrote it in
   : its current form. The report includes fs-wide statistics as well 
as optional
   : consistency checks that may or may not be performed by the block 
managers.
   : Reports can be merged; the LBM takes advantage of this as it 
processes data
   : directories in parallel. The consistency checks have structure, so 
as to
   : simplify testing and ToString() customization.
   : 
   : Thanks to the level of detail in the FsReport, the LBM now 
separates the act
   : of finding inconsistencies from repairing them. This makes it much 
easier to
   : enforce that repairs aren't done on a read-only filesystem, or if 
there were
   : any fatal and irreparable errors.
> could you post an example of the report on the commit message?
Done


http://gerrit.cloudera.org:8080/#/c/6581/5/src/kudu/fs/fs_report.h
File src/kudu/fs/fs_report.h:

Line 30: 
> More generalized documentation of Check struct usage would be nice.
Done


http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

PS4, Line 69: // unnecessarily.
> nit no need to wrap
Generally I wrap at 80 characters, and this extends to 81 if unwrapped. But, I 
agree that a single word hanging out on its own line is distracting, so I'll 
unwrap.


http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS4, Line 305: preallocated_window()
> docs
Gonna punt; a subsequent patch removes the method.


PS4, Line 324: truly catastrophic
> replace with the verbage you've been using elsewhere (i guess fatal and irr
This isn't for fatal and irreparable inconsistencies though; this is for "we 
have to stop NOW" errors. There's just one such case right now: not being able 
to get the size of the container's data file. But that's why I called it "truly 
catastrophic".


PS4, Line 554: _file_size, _malformed_record));
> nit no need to wrap
See earlier comment; I wrap at 80 characters.


PS4, Line 571: pb_reader.offset());
> same
See earlier.


PS4, Line 584: wildly
> what's "wildly incorrect"?
Non-existent or negative values (what we're testing for in the subsequent 
condition).


PS4, Line 1652: We are going to perform all of these checks
> are these all the possible checks? if so please say so, if not explain why
These aren't all of the checks. We don't emplace the missing/orphaned block 
checks, because the LBM can't do them on its own.

I'll add a comment to that effect but I prefer to keep this somewhat vague so 
that it won't get stale as the set of possible checks changes.


PS4, Line 1726: Truncation is performed even if the container's logical file 
size
  :   // (available by proxy via preallocated_offset_) and 
total_bytes_written_
  :   // agree, because XFS's speculative preallocation feature 
may artificially
  :   // enlarge the file without updating its file size.
> mention that this is XFS specific earlier in the comment
Why? It's just one sentence and you can't read it without seeing "XFS 
speculative preallocation" in there.


PS4, Line 1753: // TODO(adar): track as an inconsistency?
> yeah. can we do that as part of this patch? how "possible" is this? should 
I hesitated to track it as an inconsistency for it for a couple reasons:
1. It's scoped to multiple containers (other inconsistencies are all single 
container), so a little more complicated.
2. I can't think of a "natural" way it'd occur in the wild. The only 
possibilities involve operator error (i.e. duplicating a container).

We can always convert it into an inconsistency in the future.


PS4, Line 1821: report->malformed_record_check->entries.emplace_back(
  : container->ToString(), record);
> could we provide more information as to what exactly the malformedness was?
If we wanted that, I'd probably track it as a different kind of inconsistency 
altogether (instead of malformed record).

If you feel strongly, I'll do that now; otherwise I was planning to abide by 
the TODO and leave it...to do.


PS4, Line 1841: if (read_only_ || 

[kudu-CR] fs: generate report during Open

2017-04-12 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6581/5/src/kudu/fs/fs_report.h
File src/kudu/fs/fs_report.h:

Line 30: 
More generalized documentation of Check struct usage would be nice.


-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] fs: generate report during Open

2017-04-11 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 4:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/6581/4//COMMIT_MSG
Commit Message:

PS4, Line 9: This patch modifies the FsManager and block managers to produce a 
filesystem
   : report when opened. The report includes various filesystem 
statistics as
   : well as all inconsistencies found and repaired.
   : 
   : The heart of the patch is the FsReport nested data structure. 
Originally
   : implemented as a protobuf to take advantage of DebugString(), I 
found it to
   : be a poor fit once I began customizing the log output, so I 
rewrote it in
   : its current form. The report includes fs-wide statistics as well 
as optional
   : consistency checks that may or may not be performed by the block 
managers.
   : Reports can be merged; the LBM takes advantage of this as it 
processes data
   : directories in parallel. The consistency checks have structure, so 
as to
   : simplify testing and ToString() customization.
   : 
   : Thanks to the level of detail in the FsReport, the LBM now 
separates the act
   : of finding inconsistencies from repairing them. This makes it much 
easier to
   : enforce that repairs aren't done on a read-only filesystem, or if 
there were
   : any fatal and irreparable errors.
could you post an example of the report on the commit message?


http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/fs_report.cc
File src/kudu/fs/fs_report.cc:

PS4, Line 216: s += Substitute("Misaligned block in container $0: $1\n",
 : mb.container, SecureDebugString(mb.record));
nit use SubstituteAndAppend for consistency


http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

PS4, Line 69: // unnecessarily.
nit no need to wrap


http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS4, Line 305: preallocated_window()
docs


PS4, Line 324: truly catastrophic
replace with the verbage you've been using elsewhere (i guess fatal and 
irreparable?)


PS4, Line 554: _file_size, _malformed_record));
nit no need to wrap


PS4, Line 571: pb_reader.offset());
same


PS4, Line 584: wildly
what's "wildly incorrect"?


PS4, Line 1652: We are going to perform all of these checks
are these all the possible checks? if so please say so, if not explain why


PS4, Line 1726: Truncation is performed even if the container's logical file 
size
  :   // (available by proxy via preallocated_offset_) and 
total_bytes_written_
  :   // agree, because XFS's speculative preallocation feature 
may artificially
  :   // enlarge the file without updating its file size.
mention that this is XFS specific earlier in the comment


PS4, Line 1753: // TODO(adar): track as an inconsistency?
yeah. can we do that as part of this patch? how "possible" is this? should 
maybe make this part of the report as a fatal and irreparable inconsistency but 
still print the report before fataling?


PS4, Line 1821: report->malformed_record_check->entries.emplace_back(
  : container->ToString(), record);
could we provide more information as to what exactly the malformedness was?


PS4, Line 1841: if (read_only_ || report->HasFatalErrors()) {
  : return Status::OK();
  :   }
LOG(WARNING) here


PS4, Line 1878: / Technically we've "repaired" the inconsistency if the 
truncation
  :   // succeeded, even if the following logic fails.
  :   pr.repaired = true;
did we? even if we never ended up syncing to disk? (I'm assuming that ftruncate 
doesn't mandatorily fsync, please correct me if I'm wrong)


http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

PS4, Line 126:   // Add missing blocks to the report.
 :   report.missing_block_check.emplace();
 :   for (const auto& id : missing_block_ids) {
 : report.missing_block_check->entries.emplace_back(
 : id, FindOrDie(live_block_id_to_tablet, id));
 :   }
 : 
 :   // Add orphaned blocks to the report after attempting to 
repair them.
 :   report.orphaned_block_check.emplace();
 :   for (const auto& id : orphaned_block_ids) {
 : // Opening a block isn't free, but the number of orphaned 
blocks shouldn't
 : // be extraordinarily high.
 : uint64_t size;
 : {
 :   unique_ptr block;
 :   RETURN_NOT_OK(fs_manager.OpenBlock(id, 

[kudu-CR] fs: generate report during Open

2017-04-11 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6581

to look at the new patch set (#5).

Change subject: fs: generate report during Open
..

fs: generate report during Open

This patch modifies the FsManager and block managers to produce a filesystem
report when opened. The report includes various filesystem statistics as
well as all inconsistencies found and repaired.

The heart of the patch is the FsReport nested data structure. Originally
implemented as a protobuf to take advantage of DebugString(), I found it to
be a poor fit once I began customizing the log output, so I rewrote it in
its current form. The report includes fs-wide statistics as well as optional
consistency checks that may or may not be performed by the block managers.
Reports can be merged; the LBM takes advantage of this as it processes data
directories in parallel. The consistency checks have structure, so as to
simplify testing and ToString() customization.

Thanks to the level of detail in the FsReport, the LBM now separates the act
of finding inconsistencies from repairing them. This makes it much easier to
enforce that repairs aren't done on a read-only filesystem, or if there were
any fatal and irreparable errors.

Thorough testing of the inconsistencies in the report is deferred to a
different patch as this one was already quite large.

Other changes of note:
- The LBM treats data files without matching metadata files as incomplete
  containers; previously only metadata files without matching data files
  were treated in this way.
- The LBM maps all containers by name so they can be looked up during the
  repair process; previously they were stored in a vector.
- The checks performed by "kudu fs check" are in FsReport. I think this
  makes sense, and it's an acknowledgement that they should be performed by
  the block manager in the future. The code in tool_action_fs.cc was
  restructured to write its findings into the report before logging it.

Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
---
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
A src/kudu/fs/fs_report.cc
A src/kudu/fs/fs_report.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/server/server_base.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/util/pb_util.cc
17 files changed, 1,072 insertions(+), 212 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: generate report during Open

2017-04-11 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/fs_report.cc
File src/kudu/fs/fs_report.cc:

PS4, Line 56:   auto& missing_blocks = 
LookupOrInsert(_blocks_by_tablet_id,
:   mb.tablet_id, 
vector());
: missing_blocks.emplace_back(mb.block_id.ToString());
> I think you can use operator[]'s auto-construction magic here and just do:
Ah right, forgot about that. Thanks.


-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] fs: generate report during Open

2017-04-10 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/fs_report.cc
File src/kudu/fs/fs_report.cc:

PS4, Line 56:   auto& missing_blocks = 
LookupOrInsert(_blocks_by_tablet_id,
:   mb.tablet_id, 
vector());
: missing_blocks.emplace_back(mb.block_id.ToString());
I think you can use operator[]'s auto-construction magic here and just do:

missing_blocks_by_tablet_id[mb.tablet_id].emplace_back(mb.block_id.ToString())


-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] fs: generate report during Open

2017-04-10 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6581

to look at the new patch set (#4).

Change subject: fs: generate report during Open
..

fs: generate report during Open

This patch modifies the FsManager and block managers to produce a filesystem
report when opened. The report includes various filesystem statistics as
well as all inconsistencies found and repaired.

The heart of the patch is the FsReport nested data structure. Originally
implemented as a protobuf to take advantage of DebugString(), I found it to
be a poor fit once I began customizing the log output, so I rewrote it in
its current form. The report includes fs-wide statistics as well as optional
consistency checks that may or may not be performed by the block managers.
Reports can be merged; the LBM takes advantage of this as it processes data
directories in parallel. The consistency checks have structure, so as to
simplify testing and ToString() customization.

Thanks to the level of detail in the FsReport, the LBM now separates the act
of finding inconsistencies from repairing them. This makes it much easier to
enforce that repairs aren't done on a read-only filesystem, or if there were
any fatal and irreparable errors.

Thorough testing of the inconsistencies in the report is deferred to a
different patch as this one was already quite large.

Other changes of note:
- The LBM treats data files without matching metadata files as incomplete
  containers; previously only metadata files without matching data files
  were treated in this way.
- The LBM maps all containers by name so they can be looked up during the
  repair process; previously they were stored in a vector.
- The checks performed by "kudu fs check" are in FsReport. I think this
  makes sense, and it's an acknowledgement that they should be performed by
  the block manager in the future. The code in tool_action_fs.cc was
  restructured to write its findings into the report before logging it.

Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
---
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
A src/kudu/fs/fs_report.cc
A src/kudu/fs/fs_report.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/server/server_base.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/util/pb_util.cc
17 files changed, 1,074 insertions(+), 212 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: generate report during Open

2017-04-10 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6581/3/src/kudu/fs/fs_report.cc
File src/kudu/fs/fs_report.cc:

Line 251:   offset(std::move(o)),
> warning: std::move of the variable of a trivially-copyable type has no effe
Done


Line 288:   if (c && other.c) { \
> warning: macro argument should be enclosed in parentheses [misc-macro-paren
Done


Line 289: c->MergeFrom(other.c.get()); \
> warning: macro argument should be enclosed in parentheses [misc-macro-paren
Done


Line 291: c = other.c; \
> warning: macro argument should be enclosed in parentheses [misc-macro-paren
Done


Line 315: s += c->ToString(); \
> warning: macro argument should be enclosed in parentheses [misc-macro-paren
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] fs: generate report during Open

2017-04-07 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6581

to look at the new patch set (#3).

Change subject: fs: generate report during Open
..

fs: generate report during Open

This patch modifies the FsManager and block managers to produce a filesystem
report when opened. The report includes various filesystem statistics as
well as all inconsistencies found and repaired.

The heart of the patch is the FsReport nested data structure. Originally
implemented as a protobuf to take advantage of DebugString(), I found it to
be a poor fit once I began customizing the log output, so I rewrote it in
its current form. The report includes fs-wide statistics as well as optional
consistency checks that may or may not be performed by the block managers.
Reports can be merged; the LBM takes advantage of this as it processes data
directories in parallel. The consistency checks have structure, so as to
simplify testing and ToString() customization.

Thanks to the level of detail in the FsReport, the LBM now separates the act
of finding inconsistencies from repairing them. This makes it much easier to
enforce that repairs aren't done on a read-only filesystem, or if there were
any fatal and irreparable errors.

Thorough testing of the inconsistencies in the report is deferred to a
different patch as this one was already quite large.

Other changes of note:
- The LBM treats data files without matching metadata files as incomplete
  containers; previously only metadata files without matching data files
  were treated in this way.
- The LBM maps all containers by name so they can be looked up during the
  repair process; previously they were stored in a vector.
- The checks performed by "kudu fs check" are in FsReport. I think this
  makes sense, and it's an acknowledgement that they should be performed by
  the block manager in the future. The code in tool_action_fs.cc was
  restructured to write its findings into the report before logging it.

Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
---
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
A src/kudu/fs/fs_report.cc
A src/kudu/fs/fs_report.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/server/server_base.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/util/pb_util.cc
17 files changed, 1,074 insertions(+), 212 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: generate report during Open

2017-04-07 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6581

to look at the new patch set (#2).

Change subject: fs: generate report during Open
..

fs: generate report during Open

This patch modifies the FsManager and block managers to produce a filesystem
report when opened. The report includes various filesystem statistics as
well as all inconsistencies found and repaired.

The heart of the patch is the FsReport nested data structure. Originally
implemented as a protobuf to take advantage of DebugString(), I found it to
be a poor fit once I began customizing the log output, so I rewrote it in
its current form. The report includes fs-wide statistics as well as optional
consistency checks that may or may not be performed by the block managers.
Reports can be merged; the LBM takes advantage of this as it processes data
directories in parallel. The consistency checks have structure, so as to
simplify testing and ToString() customization.

Thanks to the level of detail in the FsReport, the LBM now separates the act
of finding inconsistencies from repairing them. This makes it much easier to
enforce that repairs aren't done on a read-only filesystem, or if there were
any fatal and irreparable errors.

Thorough testing of the inconsistencies in the report is deferred to a
different patch as this one was already quite large.

Other changes of note:
- The LBM treats data files without matching metadata files as incomplete
  containers; previously only metadata files without matching data files
  were treated in this way.
- The LBM maps all containers by name so they can be looked up during the
  repair process; previously they were stored in a vector.
- The checks performed by "kudu fs check" are in FsReport. I think this
  makes sense, and it's an acknowledgement that they should be performed by
  the block manager in the future. The code in tool_action_fs.cc was
  restructured to write its findings into the report before logging it.

Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
---
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
A src/kudu/fs/fs_report.cc
A src/kudu/fs/fs_report.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/server/server_base.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/util/pb_util.cc
17 files changed, 1,073 insertions(+), 211 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: generate report during Open

2017-04-07 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 1:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/6581/1//COMMIT_MSG
Commit Message:

Line 7: fs: generate report during Open
> do you have any measurement of whether this affects startup time on a syste
I don't expect it to, so no. There is no additional IO or system calls, and on 
a steady-state node (i.e. no inconsistencies), there are no allocations in the 
report either.

If you're alluding to the use of FS_IOC_FIEMAP on container files at startup, 
that's in a different patch. :)


PS1, Line 32:  previously only the opposite was true.
> what is the opposite?
metadata files without matching data files. I'll reword.


http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.cc
File src/kudu/fs/fs_report.cc:

Line 65:it = missing_block_by_tablet_id.equal_range(it->first).second) {
> Do we ever expect this to be a big map?
Missing blocks are fatal, so no, I don't expect there to be any. Put another 
way, I would expect none, or all (in the event of someone wiping their data 
directories but preserving superblocks), and optimizing for the latter isn't 
that useful.

I was trying to hard to emulate Guava's MultiMap.asMap(), but I agree that this 
is pretty inscrutable. I'll rework into map.


PS1, Line 68: bool first = true;
: for (auto it = range.first; it != range.second; it++) {
:   if (first) {
: first = false;
:   } else {
: block_str += ", ";
:   }
:   block_str += it->second;
: }
> if you did the map> above, you could save a bunch of LOC here
Done


Line 78: s += Substitute("Fatal error: tablet $0 missing blocks: [ $1 ]\n",
> can use SubstituteAndAppend
Done


Line 228: s += Substitute("Misaligned block in container $0: $1\n",
> SubstituteAndAppend
Done


PS1, Line 324:   if (malformed_record_check && other.malformed_record_check) {
 : 
malformed_record_check->MergeFrom(other.malformed_record_check.get());
 :   } else if (other.malformed_record_check) {
 : malformed_record_check = other.malformed_record_check;
 :   }
> hrm, this is very repetitive. you think it's worth doing a templated functi
I'll do a macro. In ToString() too.


Line 405:   LOG(INFO) << ToString();
> LOG has the downisde of a maximum 32kb (maybe 64kb?) limit, which I could s
Think I should log each line individually? That makes the result a little 
tougher to cut and paste without also removing all the log prefixes.


http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.h
File src/kudu/fs/fs_report.h:

Line 44: Entry(BlockId block_id, std::string tablet_id);
> if this is just a simple assignment constructor, maybe it's not necessary a
Doesn't work with emplace_back(), unfortunately: 
http://stackoverflow.com/questions/13812703/c11-emplace-back-on-vectorstruct?noredirect=1=1


-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] fs: generate report during Open

2017-04-07 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 1:

(8 comments)

only skimmed through so far since I see David took a look

http://gerrit.cloudera.org:8080/#/c/6581/1//COMMIT_MSG
Commit Message:

Line 7: fs: generate report during Open
do you have any measurement of whether this affects startup time on a system 
with lots of containers?


http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.cc
File src/kudu/fs/fs_report.cc:

Line 65:it = missing_block_by_tablet_id.equal_range(it->first).second) {
Do we ever expect this to be a big map?

It seems like this loop might end up being klog(n) instead of k + log(n) since 
you are calling equal_range for every element that you hit?

The nested iterator with the same name as the outer iterator is also pretty 
confusing. From reading the code I can't quite tell if it's correct.

If we expect this to usually be small anyway, maybe it would be simpler to just 
use a map so we can use simpler foreach loops?


PS1, Line 68: bool first = true;
: for (auto it = range.first; it != range.second; it++) {
:   if (first) {
: first = false;
:   } else {
: block_str += ", ";
:   }
:   block_str += it->second;
: }
if you did the map> above, you could save a bunch of LOC here too 
and just use a JoinStrings() call


Line 78: s += Substitute("Fatal error: tablet $0 missing blocks: [ $1 ]\n",
can use SubstituteAndAppend


Line 228: s += Substitute("Misaligned block in container $0: $1\n",
SubstituteAndAppend


PS1, Line 324:   if (malformed_record_check && other.malformed_record_check) {
 : 
malformed_record_check->MergeFrom(other.malformed_record_check.get());
 :   } else if (other.malformed_record_check) {
 : malformed_record_check = other.malformed_record_check;
 :   }
hrm, this is very repetitive. you think it's worth doing a templated function 
or macro?


Line 405:   LOG(INFO) << ToString();
LOG has the downisde of a maximum 32kb (maybe 64kb?) limit, which I could see 
being annoying if you get a truncated report


http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.h
File src/kudu/fs/fs_report.h:

Line 44: Entry(BlockId block_id, std::string tablet_id);
if this is just a simple assignment constructor, maybe it's not necessary and 
you could just use brace-initialization?


-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] fs: generate report during Open

2017-04-07 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 1:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

Line 209:   virtual Status Open(FsReport* report = nullptr) = 0;
> warning: default arguments on virtual or override methods are prohibited [g
Done


http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 539: Status FileBlockManager::Open(FsReport* report) {
> warning: default arguments on virtual or override methods are prohibited [g
Done


http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/file_block_manager.h
File src/kudu/fs/file_block_manager.h:

Line 84:   Status Open(FsReport* report = nullptr) override;
> warning: default arguments on virtual or override methods are prohibited [g
Done


http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_manager.cc
File src/kudu/fs/fs_manager.cc:

Line 83: using kudu::fs::CreateBlockOptions;
> warning: using decl 'CreateBlockOptions' is unused [misc-unused-using-decls
Done


http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.cc
File src/kudu/fs/fs_report.cc:

Line 36: using std::vector;
> warning: using decl 'vector' is unused [misc-unused-using-decls]
Done


Line 212:   record(std::move(record)) {
> warning: passing result of std::move() as a const reference argument; no mo
Done


Line 236:   record(std::move(record)) {
> warning: passing result of std::move() as a const reference argument; no mo
Done


Line 263:   offset(std::move(offset)),
> warning: std::move of the variable of a trivially-copyable type has no effe
Done


http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.h
File src/kudu/fs/fs_report.h:

PS1, Line 34:  
> prefix with: "Error type:"
Done


Line 45: 
> nit no need for an empty line, same for the structs below
Done


PS1, Line 56: deletes
> this doesn't delete anything right? maybe rephrase to: "(can be repaired by
I like the information on how a repairable error is repaired, but I'll reword 
this.


PS1, Line 77: container
> containers
Done


PS1, Line 77: preallocated
> pre-allocated
Disagree:

$ git grep -i pre-allocated | grep // | wc -l
5
$ git grep -i preallocated | grep // | wc -l
22


PS1, Line 80: truncates the container data files
> same comment as in the previous struct
Done


PS1, Line 101: Checks for LBM containers where one or both files are too short 
to be of use.
> care to elaborate on this? not sure I understand how files can be too short
Suppose we crash right after we've created a new data file but before we create 
the metadata file (or vice versa). Or suppose we get as far as creating both 
files but crash before writing the PBC container header into the metadata file. 
In either case, this is an "incomplete" container that contains no data. It's 
more annoying to reuse it than it's worth, so we just throw it out.

That's the kind of inconsistency this is targeting. I'll try to clarify.


PS1, Line 199: Fatal and repairable
> isn't a repairable inconsistency by definition non-fatal?
Unfortunately no.

The "partial record" inconsistency must be repaired in order for the LBM to 
function. If it isn't repaired, we won't be able to append new records to the 
metadata file. That's not so bad for CREATE records (we could get by with 
prohibiting new blocks in the container), but it means DELETEs can't be 
written, so existing blocks in the container can't be deleted.


PS1, Line 217:   // Returns whether this report describes at least one fatal 
and irreparable
 :   // inconsistency.
 :   bool HasFatalErrors() const;
> this method name and comment seems to agree with my comment on the struct h
Perhaps, but the implementation only looks for errors that are  
fatal+irreparable. That's what the comment says too.

If you think the method name isn't precise enough, to what would you have me 
change it?


PS1, Line 226:   // Like CheckForFatalErrors(), but also writes the report to 
LOG(INFO).
 :   Status LogAndCheckForFatalErrors() const;
 : 
 :   // Like CheckForFatalErrors(), but also writes the report to 
stdout.
 :   Status PrintAndCheckForFatalErrors() const;
> do there really belong here?, why not just pass a report all the time and t
They're just convenience methods since I've already repeated those patterns 
(four times for LogAndCheck and twice for PrintAndCheck).


http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1304: Status LogBlockManager::Open(FsReport* report) {
> warning: default arguments on virtual or override methods are prohibited [g
Done


http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/log_block_manager.h
File src/kudu/fs/log_block_manager.h:

Line 

[kudu-CR] fs: generate report during Open

2017-04-07 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: fs: generate report during Open
..


Patch Set 1:

(11 comments)

first round of reviews. this patch is pretty heftly so I'll review it in 2-3 
rounds or less. rather do that than leave no feedback until I'm done

http://gerrit.cloudera.org:8080/#/c/6581/1//COMMIT_MSG
Commit Message:

PS1, Line 32:  previously only the opposite was true.
what is the opposite?


http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.h
File src/kudu/fs/fs_report.h:

PS1, Line 34:  
prefix with: "Error type:"
same for the cases below


Line 45: 
nit no need for an empty line, same for the structs below


PS1, Line 56: deletes
this doesn't delete anything right? maybe rephrase to: "(can be repaired by 
deleting the block) though I'd be okay to just omit this altogether


PS1, Line 77: preallocated
pre-allocated


PS1, Line 77: container
containers


PS1, Line 80: truncates the container data files
same comment as in the previous struct


PS1, Line 101: Checks for LBM containers where one or both files are too short 
to be of use.
care to elaborate on this? not sure I understand how files can be too short. 
were they truncated? (in which case how is this repairable?)


PS1, Line 199: Fatal and repairable
isn't a repairable inconsistency by definition non-fatal?
how about just:
- fatal
- repairable
- irreparable


PS1, Line 217:   // Returns whether this report describes at least one fatal 
and irreparable
 :   // inconsistency.
 :   bool HasFatalErrors() const;
this method name and comment seems to agree with my comment on the struct header


PS1, Line 226:   // Like CheckForFatalErrors(), but also writes the report to 
LOG(INFO).
 :   Status LogAndCheckForFatalErrors() const;
 : 
 :   // Like CheckForFatalErrors(), but also writes the report to 
stdout.
 :   Status PrintAndCheckForFatalErrors() const;
do there really belong here?, why not just pass a report all the time and then 
let the called decide what to do with it? (print it or log it or whatever)


-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] fs: generate report during Open

2017-04-06 Thread Adar Dembo (Code Review)
Hello David Ribeiro Alves, Todd Lipcon,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/6581

to review the following change.

Change subject: fs: generate report during Open
..

fs: generate report during Open

This patch modifies the FsManager and block managers to produce a filesystem
report when opened. The report includes various filesystem statistics as
well as all inconsistencies found and repaired.

The heart of the patch is the FsReport nested data structure. Originally
implemented as a protobuf to take advantage of DebugString(), I found it to
be a poor fit once I began customizing the log output, so I rewrote it in
its current form. The report includes fs-wide statistics as well as optional
consistency checks that may or may not be performed by the block managers.
Reports can be merged; the LBM takes advantage of this as it processes data
directories in parallel. The consistency checks have structure, so as to
simplify testing and ToString() customization.

Thanks to the level of detail in the FsReport, the LBM now separates the act
of finding inconsistencies from repairing them. This makes it much easier to
enforce that repairs aren't done on a read-only filesystem, or if there were
any fatal and irreparable errors.

Thorough testing of the inconsistencies in the report is deferred to a
different patch as this one was already quite large.

Other changes of note:
- The LBM treats data files without matching metadata files as incomplete
  containers; previously only the opposite was true.
- The LBM maps all containers by name so they can be looked up during the
  repair process; previously they were stored in a vector.
- The checks performed by "kudu fs check" are in FsReport. I think this
  makes sense, and it's an acknowledgement that they should be performed by
  the block manager in the future. The code in tool_action_fs.cc was
  restructured to write its findings into the report before logging it.

Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
---
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
A src/kudu/fs/fs_report.cc
A src/kudu/fs/fs_report.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/server/server_base.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
M src/kudu/util/pb_util.cc
17 files changed, 1,144 insertions(+), 208 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Todd Lipcon