[kudu-CR] fs: generate report during Open
Adar Dembo has submitted this change and it was merged. Change subject: fs: generate report during Open .. fs: generate report during Open This patch modifies the FsManager and block managers to produce a filesystem report when opened. The report includes various filesystem statistics as well as all inconsistencies found and repaired. The heart of the patch is the FsReport nested data structure. Originally implemented as a protobuf to take advantage of DebugString(), I found it to be a poor fit once I began customizing the log output, so I rewrote it in its current form. The report includes fs-wide statistics as well as optional consistency checks that may or may not be performed by the block managers. Reports can be merged; the LBM takes advantage of this as it processes data directories in parallel. The consistency checks have structure, so as to simplify testing and ToString() customization. A sample report can be found at the end of this commit message. Thanks to the level of detail in the FsReport, the LBM now separates the act of finding inconsistencies from repairing them. This makes it much easier to enforce that repairs aren't done on a read-only filesystem, or if there were any fatal and irreparable errors. Thorough testing of the inconsistencies in the report is deferred to a different patch as this one was already quite large. Other changes of note: - The LBM treats data files without matching metadata files as incomplete containers; previously only metadata files without matching data files were treated in this way. - The LBM maps all containers by name so they can be looked up during the repair process; previously they were stored in a vector. - The checks performed by "kudu fs check" are in FsReport. I think this makes sense, and it's an acknowledgement that they should be performed by the block manager in the future. The code in tool_action_fs.cc was restructured to write its findings into the report before logging it. Block manager report 1 data directories: /tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059 Total live blocks: 47 Total live bytes: 221524 Total live bytes (after alignment): 323582 Total number of LBM containers: 53 (21 full) Did not check for missing blocks Did not check for orphaned blocks Total full LBM containers with extra space: 0 (0 repaired) Total full LBM container extra space in bytes: 0 (0 repaired) Total incomplete LBM containers: 3 (3 repaired) Misaligned block in container /tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059/038a826c27db4e6da8237ce79853ece2: block_id { id: 9225972546088407965 } op_type: CREATE timestamp_us: 0 offset: 1052673 length: 16384 Misaligned block in container /tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059/0612d7730bdc4cfd83afbaed271d5289: block_id { id: 3148617980286357277 } op_type: CREATE timestamp_us: 0 offset: 1048577 length: 16384 Total LBM partial records: 7 (7 repaired) Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Reviewed-on: http://gerrit.cloudera.org:8080/6581 Tested-by: Kudu Jenkins Reviewed-by: Todd Lipcon--- M src/kudu/fs/CMakeLists.txt M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h A src/kudu/fs/fs_report.cc A src/kudu/fs/fs_report.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/server/server_base.cc M src/kudu/tools/kudu-tool-test.cc M src/kudu/tools/tool_action_fs.cc M src/kudu/util/pb_util.cc 17 files changed, 1,104 insertions(+), 217 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 11 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] fs: generate report during Open
Todd Lipcon has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 10: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] fs: generate report during Open
Todd Lipcon has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] fs: generate report during Open
Adar Dembo has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 9: There were two real changes from v8: - Rename of the "full space preallocation check" (to generalize for hole repunching). - Move of misaligned block check so that deleted misaligned blocks aren't flagged. -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] fs: generate report during Open
Hello David Ribeiro Alves, Todd Lipcon, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6581 to look at the new patch set (#9). Change subject: fs: generate report during Open .. fs: generate report during Open This patch modifies the FsManager and block managers to produce a filesystem report when opened. The report includes various filesystem statistics as well as all inconsistencies found and repaired. The heart of the patch is the FsReport nested data structure. Originally implemented as a protobuf to take advantage of DebugString(), I found it to be a poor fit once I began customizing the log output, so I rewrote it in its current form. The report includes fs-wide statistics as well as optional consistency checks that may or may not be performed by the block managers. Reports can be merged; the LBM takes advantage of this as it processes data directories in parallel. The consistency checks have structure, so as to simplify testing and ToString() customization. A sample report can be found at the end of this commit message. Thanks to the level of detail in the FsReport, the LBM now separates the act of finding inconsistencies from repairing them. This makes it much easier to enforce that repairs aren't done on a read-only filesystem, or if there were any fatal and irreparable errors. Thorough testing of the inconsistencies in the report is deferred to a different patch as this one was already quite large. Other changes of note: - The LBM treats data files without matching metadata files as incomplete containers; previously only metadata files without matching data files were treated in this way. - The LBM maps all containers by name so they can be looked up during the repair process; previously they were stored in a vector. - The checks performed by "kudu fs check" are in FsReport. I think this makes sense, and it's an acknowledgement that they should be performed by the block manager in the future. The code in tool_action_fs.cc was restructured to write its findings into the report before logging it. Block manager report 1 data directories: /tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059 Total live blocks: 47 Total live bytes: 221524 Total live bytes (after alignment): 323582 Total number of LBM containers: 53 (21 full) Did not check for missing blocks Did not check for orphaned blocks Total full LBM containers with extra space: 0 (0 repaired) Total full LBM container extra space in bytes: 0 (0 repaired) Total incomplete LBM containers: 3 (3 repaired) Misaligned block in container /tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059/038a826c27db4e6da8237ce79853ece2: block_id { id: 9225972546088407965 } op_type: CREATE timestamp_us: 0 offset: 1052673 length: 16384 Misaligned block in container /tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059/0612d7730bdc4cfd83afbaed271d5289: block_id { id: 3148617980286357277 } op_type: CREATE timestamp_us: 0 offset: 1048577 length: 16384 Total LBM partial records: 7 (7 repaired) Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e --- M src/kudu/fs/CMakeLists.txt M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h A src/kudu/fs/fs_report.cc A src/kudu/fs/fs_report.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/server/server_base.cc M src/kudu/tools/kudu-tool-test.cc M src/kudu/tools/tool_action_fs.cc M src/kudu/util/pb_util.cc 17 files changed, 1,104 insertions(+), 217 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/9 -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] fs: generate report during Open
David Ribeiro Alves has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] fs: generate report during Open
Todd Lipcon has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 8: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] fs: generate report during Open
David Ribeiro Alves has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 7: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: PS4, Line 69: FsReport report; > Not sure where you got 120 from. The Google Style Guide (https://google.git right 100, not 120. guess I didn't recall the mostly-80 recommendation. Still see it as worthless since I can't really have a less than 100 cols editor given how often there are spills. To keep from constantly scrolling horizontally in most parts of the code base I have at least 100 cols visible at which point vertical space is more important to me. anyway, not worth blocking work for comment wrapping. my personal preference, if we're taking those into account, has been noted, moving on. http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS4, Line 584: ile_si > But that's obvious from L585-588, isn't it? I guess my personal preference is precision vs color. For me I would prefer a: "First verify that the record's offset/length aren't non-existent or negative." or somesuch than a " First verify that the record's offset/length aren't wildly incorrect." that forces me to go read the if conditions to understand what that means. again, moving on. -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] fs: generate report during Open
Adar Dembo has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: PS4, Line 69: // unnecessarily. > I don't think that it makes a lot of sense for people to have different wra Not sure where you got 120 from. The Google Style Guide (https://google.github.io/styleguide/cppguide.html#Line_Length) mandates 80. Our style guide relaxes that to 100 (and that's enforced by LINT precommit runs), but recommends staying under 80 (http://kudu.apache.org/docs/contributing.html#_line_length). http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS4, Line 324: truly catastrophic > can you spell that out then? it's unlikely that the next person using this Done PS4, Line 584: wildly > can you spell that out then, please? (that's kind of what I meant from my c But that's obvious from L585-588, isn't it? This comment isn't here to explain the checks in detail (since that's obvious from the code), but to contrast with the comment on L594. That is, after reading L594, you should know that it's only OK to compare the data_file_size with the record offset+length thanks to the checks on L585-588, and if you were modifying this function, you shouldn't reorder the comparison to precede those checks. PS4, Line 1726: Truncation is performed even if the container's logical file size : // (available by proxy via preallocated_offset_) and total_bytes_written_ : // agree, because XFS's speculative preallocation feature may artificially : // enlarge the file without updating its file size. > Yeah, but if it starts with "For XFS filesystems" or somesuch, then you kno Done -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] fs: generate report during Open
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6581 to look at the new patch set (#7). Change subject: fs: generate report during Open .. fs: generate report during Open This patch modifies the FsManager and block managers to produce a filesystem report when opened. The report includes various filesystem statistics as well as all inconsistencies found and repaired. The heart of the patch is the FsReport nested data structure. Originally implemented as a protobuf to take advantage of DebugString(), I found it to be a poor fit once I began customizing the log output, so I rewrote it in its current form. The report includes fs-wide statistics as well as optional consistency checks that may or may not be performed by the block managers. Reports can be merged; the LBM takes advantage of this as it processes data directories in parallel. The consistency checks have structure, so as to simplify testing and ToString() customization. A sample report can be found at the end of this commit message. Thanks to the level of detail in the FsReport, the LBM now separates the act of finding inconsistencies from repairing them. This makes it much easier to enforce that repairs aren't done on a read-only filesystem, or if there were any fatal and irreparable errors. Thorough testing of the inconsistencies in the report is deferred to a different patch as this one was already quite large. Other changes of note: - The LBM treats data files without matching metadata files as incomplete containers; previously only metadata files without matching data files were treated in this way. - The LBM maps all containers by name so they can be looked up during the repair process; previously they were stored in a vector. - The checks performed by "kudu fs check" are in FsReport. I think this makes sense, and it's an acknowledgement that they should be performed by the block manager in the future. The code in tool_action_fs.cc was restructured to write its findings into the report before logging it. Block manager report 1 data directories: /tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059 Total live blocks: 47 Total live bytes: 221524 Total live bytes (after alignment): 323582 Total number of LBM containers: 53 (21 full) Did not check for missing blocks Did not check for orphaned blocks Total full LBM containers with preallocated space: 0 (0 repaired) Total full LBM container preallocated space in bytes: 0 (0 repaired) Total incomplete LBM containers: 3 (3 repaired) Misaligned block in container /tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059/038a826c27db4e6da8237ce79853ece2: block_id { id: 9225972546088407965 } op_type: CREATE timestamp_us: 0 offset: 1052673 length: 16384 Misaligned block in container /tmp/kudutest-1000/block_manager-stress-test.BlockManagerStressTest_1.StressTest.1491963221016069-10059/0612d7730bdc4cfd83afbaed271d5289: block_id { id: 3148617980286357277 } op_type: CREATE timestamp_us: 0 offset: 1048577 length: 16384 Total LBM partial records: 7 (7 repaired) Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e --- M src/kudu/fs/CMakeLists.txt M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h A src/kudu/fs/fs_report.cc A src/kudu/fs/fs_report.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/server/server_base.cc M src/kudu/tools/kudu-tool-test.cc M src/kudu/tools/tool_action_fs.cc M src/kudu/util/pb_util.cc 17 files changed, 1,092 insertions(+), 216 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/7 -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] fs: generate report during Open
David Ribeiro Alves has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: PS4, Line 69: // unnecessarily. > Generally I wrap at 80 characters, and this extends to 81 if unwrapped. But I don't think that it makes a lot of sense for people to have different wrapping policies. Not saying that 80 chars is wrong or right, just that we should all do it or none of us should do it. Given that code style mandates wrapping at 120 I usually point as nits forced wrapping well below that where there's a single word in the last line. particularly when wrapping occurs in places where you can find surrounding code that doesn't, like this one. In general, if devs need to have their displays adjusted to 120 cols anyway, vertical space is more precious than horizontal space, IMO http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS4, Line 324: truly catastrophic > This isn't for fatal and irreparable inconsistencies though; this is for "w can you spell that out then? it's unlikely that the next person using this will know what you mean by "truly catastrophic" PS4, Line 584: wildly > Non-existent or negative values (what we're testing for in the subsequent c can you spell that out then, please? (that's kind of what I meant from my comment) PS4, Line 1726: Truncation is performed even if the container's logical file size : // (available by proxy via preallocated_offset_) and total_bytes_written_ : // agree, because XFS's speculative preallocation feature may artificially : // enlarge the file without updating its file size. > Why? It's just one sentence and you can't read it without seeing "XFS specu Yeah, but if it starts with "For XFS filesystems" or somesuch, then you know you don't need to read it at all if what you're concerned about is another filesystem PS4, Line 1753: // TODO(adar): track as an inconsistency? > I hesitated to track it as an inconsistency for it for a couple reasons: sounds good -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] fs: generate report during Open
Adar Dembo has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 4: (15 comments) http://gerrit.cloudera.org:8080/#/c/6581/4//COMMIT_MSG Commit Message: PS4, Line 9: This patch modifies the FsManager and block managers to produce a filesystem : report when opened. The report includes various filesystem statistics as : well as all inconsistencies found and repaired. : : The heart of the patch is the FsReport nested data structure. Originally : implemented as a protobuf to take advantage of DebugString(), I found it to : be a poor fit once I began customizing the log output, so I rewrote it in : its current form. The report includes fs-wide statistics as well as optional : consistency checks that may or may not be performed by the block managers. : Reports can be merged; the LBM takes advantage of this as it processes data : directories in parallel. The consistency checks have structure, so as to : simplify testing and ToString() customization. : : Thanks to the level of detail in the FsReport, the LBM now separates the act : of finding inconsistencies from repairing them. This makes it much easier to : enforce that repairs aren't done on a read-only filesystem, or if there were : any fatal and irreparable errors. > could you post an example of the report on the commit message? Done http://gerrit.cloudera.org:8080/#/c/6581/5/src/kudu/fs/fs_report.h File src/kudu/fs/fs_report.h: Line 30: > More generalized documentation of Check struct usage would be nice. Done http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: PS4, Line 69: // unnecessarily. > nit no need to wrap Generally I wrap at 80 characters, and this extends to 81 if unwrapped. But, I agree that a single word hanging out on its own line is distracting, so I'll unwrap. http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS4, Line 305: preallocated_window() > docs Gonna punt; a subsequent patch removes the method. PS4, Line 324: truly catastrophic > replace with the verbage you've been using elsewhere (i guess fatal and irr This isn't for fatal and irreparable inconsistencies though; this is for "we have to stop NOW" errors. There's just one such case right now: not being able to get the size of the container's data file. But that's why I called it "truly catastrophic". PS4, Line 554: _file_size, _malformed_record)); > nit no need to wrap See earlier comment; I wrap at 80 characters. PS4, Line 571: pb_reader.offset()); > same See earlier. PS4, Line 584: wildly > what's "wildly incorrect"? Non-existent or negative values (what we're testing for in the subsequent condition). PS4, Line 1652: We are going to perform all of these checks > are these all the possible checks? if so please say so, if not explain why These aren't all of the checks. We don't emplace the missing/orphaned block checks, because the LBM can't do them on its own. I'll add a comment to that effect but I prefer to keep this somewhat vague so that it won't get stale as the set of possible checks changes. PS4, Line 1726: Truncation is performed even if the container's logical file size : // (available by proxy via preallocated_offset_) and total_bytes_written_ : // agree, because XFS's speculative preallocation feature may artificially : // enlarge the file without updating its file size. > mention that this is XFS specific earlier in the comment Why? It's just one sentence and you can't read it without seeing "XFS speculative preallocation" in there. PS4, Line 1753: // TODO(adar): track as an inconsistency? > yeah. can we do that as part of this patch? how "possible" is this? should I hesitated to track it as an inconsistency for it for a couple reasons: 1. It's scoped to multiple containers (other inconsistencies are all single container), so a little more complicated. 2. I can't think of a "natural" way it'd occur in the wild. The only possibilities involve operator error (i.e. duplicating a container). We can always convert it into an inconsistency in the future. PS4, Line 1821: report->malformed_record_check->entries.emplace_back( : container->ToString(), record); > could we provide more information as to what exactly the malformedness was? If we wanted that, I'd probably track it as a different kind of inconsistency altogether (instead of malformed record). If you feel strongly, I'll do that now; otherwise I was planning to abide by the TODO and leave it...to do. PS4, Line 1841: if (read_only_ ||
[kudu-CR] fs: generate report during Open
Andrew Wong has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/6581/5/src/kudu/fs/fs_report.h File src/kudu/fs/fs_report.h: Line 30: More generalized documentation of Check struct usage would be nice. -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] fs: generate report during Open
David Ribeiro Alves has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 4: (15 comments) http://gerrit.cloudera.org:8080/#/c/6581/4//COMMIT_MSG Commit Message: PS4, Line 9: This patch modifies the FsManager and block managers to produce a filesystem : report when opened. The report includes various filesystem statistics as : well as all inconsistencies found and repaired. : : The heart of the patch is the FsReport nested data structure. Originally : implemented as a protobuf to take advantage of DebugString(), I found it to : be a poor fit once I began customizing the log output, so I rewrote it in : its current form. The report includes fs-wide statistics as well as optional : consistency checks that may or may not be performed by the block managers. : Reports can be merged; the LBM takes advantage of this as it processes data : directories in parallel. The consistency checks have structure, so as to : simplify testing and ToString() customization. : : Thanks to the level of detail in the FsReport, the LBM now separates the act : of finding inconsistencies from repairing them. This makes it much easier to : enforce that repairs aren't done on a read-only filesystem, or if there were : any fatal and irreparable errors. could you post an example of the report on the commit message? http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/fs_report.cc File src/kudu/fs/fs_report.cc: PS4, Line 216: s += Substitute("Misaligned block in container $0: $1\n", : mb.container, SecureDebugString(mb.record)); nit use SubstituteAndAppend for consistency http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: PS4, Line 69: // unnecessarily. nit no need to wrap http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS4, Line 305: preallocated_window() docs PS4, Line 324: truly catastrophic replace with the verbage you've been using elsewhere (i guess fatal and irreparable?) PS4, Line 554: _file_size, _malformed_record)); nit no need to wrap PS4, Line 571: pb_reader.offset()); same PS4, Line 584: wildly what's "wildly incorrect"? PS4, Line 1652: We are going to perform all of these checks are these all the possible checks? if so please say so, if not explain why PS4, Line 1726: Truncation is performed even if the container's logical file size : // (available by proxy via preallocated_offset_) and total_bytes_written_ : // agree, because XFS's speculative preallocation feature may artificially : // enlarge the file without updating its file size. mention that this is XFS specific earlier in the comment PS4, Line 1753: // TODO(adar): track as an inconsistency? yeah. can we do that as part of this patch? how "possible" is this? should maybe make this part of the report as a fatal and irreparable inconsistency but still print the report before fataling? PS4, Line 1821: report->malformed_record_check->entries.emplace_back( : container->ToString(), record); could we provide more information as to what exactly the malformedness was? PS4, Line 1841: if (read_only_ || report->HasFatalErrors()) { : return Status::OK(); : } LOG(WARNING) here PS4, Line 1878: / Technically we've "repaired" the inconsistency if the truncation : // succeeded, even if the following logic fails. : pr.repaired = true; did we? even if we never ended up syncing to disk? (I'm assuming that ftruncate doesn't mandatorily fsync, please correct me if I'm wrong) http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/tools/tool_action_fs.cc File src/kudu/tools/tool_action_fs.cc: PS4, Line 126: // Add missing blocks to the report. : report.missing_block_check.emplace(); : for (const auto& id : missing_block_ids) { : report.missing_block_check->entries.emplace_back( : id, FindOrDie(live_block_id_to_tablet, id)); : } : : // Add orphaned blocks to the report after attempting to repair them. : report.orphaned_block_check.emplace(); : for (const auto& id : orphaned_block_ids) { : // Opening a block isn't free, but the number of orphaned blocks shouldn't : // be extraordinarily high. : uint64_t size; : { : unique_ptr block; : RETURN_NOT_OK(fs_manager.OpenBlock(id,
[kudu-CR] fs: generate report during Open
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6581 to look at the new patch set (#5). Change subject: fs: generate report during Open .. fs: generate report during Open This patch modifies the FsManager and block managers to produce a filesystem report when opened. The report includes various filesystem statistics as well as all inconsistencies found and repaired. The heart of the patch is the FsReport nested data structure. Originally implemented as a protobuf to take advantage of DebugString(), I found it to be a poor fit once I began customizing the log output, so I rewrote it in its current form. The report includes fs-wide statistics as well as optional consistency checks that may or may not be performed by the block managers. Reports can be merged; the LBM takes advantage of this as it processes data directories in parallel. The consistency checks have structure, so as to simplify testing and ToString() customization. Thanks to the level of detail in the FsReport, the LBM now separates the act of finding inconsistencies from repairing them. This makes it much easier to enforce that repairs aren't done on a read-only filesystem, or if there were any fatal and irreparable errors. Thorough testing of the inconsistencies in the report is deferred to a different patch as this one was already quite large. Other changes of note: - The LBM treats data files without matching metadata files as incomplete containers; previously only metadata files without matching data files were treated in this way. - The LBM maps all containers by name so they can be looked up during the repair process; previously they were stored in a vector. - The checks performed by "kudu fs check" are in FsReport. I think this makes sense, and it's an acknowledgement that they should be performed by the block manager in the future. The code in tool_action_fs.cc was restructured to write its findings into the report before logging it. Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e --- M src/kudu/fs/CMakeLists.txt M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h A src/kudu/fs/fs_report.cc A src/kudu/fs/fs_report.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/server/server_base.cc M src/kudu/tools/kudu-tool-test.cc M src/kudu/tools/tool_action_fs.cc M src/kudu/util/pb_util.cc 17 files changed, 1,072 insertions(+), 212 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/5 -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] fs: generate report during Open
Adar Dembo has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/fs_report.cc File src/kudu/fs/fs_report.cc: PS4, Line 56: auto& missing_blocks = LookupOrInsert(_blocks_by_tablet_id, : mb.tablet_id, vector()); : missing_blocks.emplace_back(mb.block_id.ToString()); > I think you can use operator[]'s auto-construction magic here and just do: Ah right, forgot about that. Thanks. -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] fs: generate report during Open
Todd Lipcon has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/6581/4/src/kudu/fs/fs_report.cc File src/kudu/fs/fs_report.cc: PS4, Line 56: auto& missing_blocks = LookupOrInsert(_blocks_by_tablet_id, : mb.tablet_id, vector()); : missing_blocks.emplace_back(mb.block_id.ToString()); I think you can use operator[]'s auto-construction magic here and just do: missing_blocks_by_tablet_id[mb.tablet_id].emplace_back(mb.block_id.ToString()) -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] fs: generate report during Open
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6581 to look at the new patch set (#4). Change subject: fs: generate report during Open .. fs: generate report during Open This patch modifies the FsManager and block managers to produce a filesystem report when opened. The report includes various filesystem statistics as well as all inconsistencies found and repaired. The heart of the patch is the FsReport nested data structure. Originally implemented as a protobuf to take advantage of DebugString(), I found it to be a poor fit once I began customizing the log output, so I rewrote it in its current form. The report includes fs-wide statistics as well as optional consistency checks that may or may not be performed by the block managers. Reports can be merged; the LBM takes advantage of this as it processes data directories in parallel. The consistency checks have structure, so as to simplify testing and ToString() customization. Thanks to the level of detail in the FsReport, the LBM now separates the act of finding inconsistencies from repairing them. This makes it much easier to enforce that repairs aren't done on a read-only filesystem, or if there were any fatal and irreparable errors. Thorough testing of the inconsistencies in the report is deferred to a different patch as this one was already quite large. Other changes of note: - The LBM treats data files without matching metadata files as incomplete containers; previously only metadata files without matching data files were treated in this way. - The LBM maps all containers by name so they can be looked up during the repair process; previously they were stored in a vector. - The checks performed by "kudu fs check" are in FsReport. I think this makes sense, and it's an acknowledgement that they should be performed by the block manager in the future. The code in tool_action_fs.cc was restructured to write its findings into the report before logging it. Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e --- M src/kudu/fs/CMakeLists.txt M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h A src/kudu/fs/fs_report.cc A src/kudu/fs/fs_report.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/server/server_base.cc M src/kudu/tools/kudu-tool-test.cc M src/kudu/tools/tool_action_fs.cc M src/kudu/util/pb_util.cc 17 files changed, 1,074 insertions(+), 212 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/4 -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] fs: generate report during Open
Adar Dembo has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/6581/3/src/kudu/fs/fs_report.cc File src/kudu/fs/fs_report.cc: Line 251: offset(std::move(o)), > warning: std::move of the variable of a trivially-copyable type has no effe Done Line 288: if (c && other.c) { \ > warning: macro argument should be enclosed in parentheses [misc-macro-paren Done Line 289: c->MergeFrom(other.c.get()); \ > warning: macro argument should be enclosed in parentheses [misc-macro-paren Done Line 291: c = other.c; \ > warning: macro argument should be enclosed in parentheses [misc-macro-paren Done Line 315: s += c->ToString(); \ > warning: macro argument should be enclosed in parentheses [misc-macro-paren Done -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] fs: generate report during Open
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6581 to look at the new patch set (#3). Change subject: fs: generate report during Open .. fs: generate report during Open This patch modifies the FsManager and block managers to produce a filesystem report when opened. The report includes various filesystem statistics as well as all inconsistencies found and repaired. The heart of the patch is the FsReport nested data structure. Originally implemented as a protobuf to take advantage of DebugString(), I found it to be a poor fit once I began customizing the log output, so I rewrote it in its current form. The report includes fs-wide statistics as well as optional consistency checks that may or may not be performed by the block managers. Reports can be merged; the LBM takes advantage of this as it processes data directories in parallel. The consistency checks have structure, so as to simplify testing and ToString() customization. Thanks to the level of detail in the FsReport, the LBM now separates the act of finding inconsistencies from repairing them. This makes it much easier to enforce that repairs aren't done on a read-only filesystem, or if there were any fatal and irreparable errors. Thorough testing of the inconsistencies in the report is deferred to a different patch as this one was already quite large. Other changes of note: - The LBM treats data files without matching metadata files as incomplete containers; previously only metadata files without matching data files were treated in this way. - The LBM maps all containers by name so they can be looked up during the repair process; previously they were stored in a vector. - The checks performed by "kudu fs check" are in FsReport. I think this makes sense, and it's an acknowledgement that they should be performed by the block manager in the future. The code in tool_action_fs.cc was restructured to write its findings into the report before logging it. Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e --- M src/kudu/fs/CMakeLists.txt M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h A src/kudu/fs/fs_report.cc A src/kudu/fs/fs_report.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/server/server_base.cc M src/kudu/tools/kudu-tool-test.cc M src/kudu/tools/tool_action_fs.cc M src/kudu/util/pb_util.cc 17 files changed, 1,074 insertions(+), 212 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/3 -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] fs: generate report during Open
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6581 to look at the new patch set (#2). Change subject: fs: generate report during Open .. fs: generate report during Open This patch modifies the FsManager and block managers to produce a filesystem report when opened. The report includes various filesystem statistics as well as all inconsistencies found and repaired. The heart of the patch is the FsReport nested data structure. Originally implemented as a protobuf to take advantage of DebugString(), I found it to be a poor fit once I began customizing the log output, so I rewrote it in its current form. The report includes fs-wide statistics as well as optional consistency checks that may or may not be performed by the block managers. Reports can be merged; the LBM takes advantage of this as it processes data directories in parallel. The consistency checks have structure, so as to simplify testing and ToString() customization. Thanks to the level of detail in the FsReport, the LBM now separates the act of finding inconsistencies from repairing them. This makes it much easier to enforce that repairs aren't done on a read-only filesystem, or if there were any fatal and irreparable errors. Thorough testing of the inconsistencies in the report is deferred to a different patch as this one was already quite large. Other changes of note: - The LBM treats data files without matching metadata files as incomplete containers; previously only metadata files without matching data files were treated in this way. - The LBM maps all containers by name so they can be looked up during the repair process; previously they were stored in a vector. - The checks performed by "kudu fs check" are in FsReport. I think this makes sense, and it's an acknowledgement that they should be performed by the block manager in the future. The code in tool_action_fs.cc was restructured to write its findings into the report before logging it. Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e --- M src/kudu/fs/CMakeLists.txt M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h A src/kudu/fs/fs_report.cc A src/kudu/fs/fs_report.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/server/server_base.cc M src/kudu/tools/kudu-tool-test.cc M src/kudu/tools/tool_action_fs.cc M src/kudu/util/pb_util.cc 17 files changed, 1,073 insertions(+), 211 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/2 -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] fs: generate report during Open
Adar Dembo has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 1: (9 comments) http://gerrit.cloudera.org:8080/#/c/6581/1//COMMIT_MSG Commit Message: Line 7: fs: generate report during Open > do you have any measurement of whether this affects startup time on a syste I don't expect it to, so no. There is no additional IO or system calls, and on a steady-state node (i.e. no inconsistencies), there are no allocations in the report either. If you're alluding to the use of FS_IOC_FIEMAP on container files at startup, that's in a different patch. :) PS1, Line 32: previously only the opposite was true. > what is the opposite? metadata files without matching data files. I'll reword. http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.cc File src/kudu/fs/fs_report.cc: Line 65:it = missing_block_by_tablet_id.equal_range(it->first).second) { > Do we ever expect this to be a big map? Missing blocks are fatal, so no, I don't expect there to be any. Put another way, I would expect none, or all (in the event of someone wiping their data directories but preserving superblocks), and optimizing for the latter isn't that useful. I was trying to hard to emulate Guava's MultiMap.asMap(), but I agree that this is pretty inscrutable. I'll rework into map. PS1, Line 68: bool first = true; : for (auto it = range.first; it != range.second; it++) { : if (first) { : first = false; : } else { : block_str += ", "; : } : block_str += it->second; : } > if you did the map > above, you could save a bunch of LOC here Done Line 78: s += Substitute("Fatal error: tablet $0 missing blocks: [ $1 ]\n", > can use SubstituteAndAppend Done Line 228: s += Substitute("Misaligned block in container $0: $1\n", > SubstituteAndAppend Done PS1, Line 324: if (malformed_record_check && other.malformed_record_check) { : malformed_record_check->MergeFrom(other.malformed_record_check.get()); : } else if (other.malformed_record_check) { : malformed_record_check = other.malformed_record_check; : } > hrm, this is very repetitive. you think it's worth doing a templated functi I'll do a macro. In ToString() too. Line 405: LOG(INFO) << ToString(); > LOG has the downisde of a maximum 32kb (maybe 64kb?) limit, which I could s Think I should log each line individually? That makes the result a little tougher to cut and paste without also removing all the log prefixes. http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.h File src/kudu/fs/fs_report.h: Line 44: Entry(BlockId block_id, std::string tablet_id); > if this is just a simple assignment constructor, maybe it's not necessary a Doesn't work with emplace_back(), unfortunately: http://stackoverflow.com/questions/13812703/c11-emplace-back-on-vectorstruct?noredirect=1=1 -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] fs: generate report during Open
Todd Lipcon has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 1: (8 comments) only skimmed through so far since I see David took a look http://gerrit.cloudera.org:8080/#/c/6581/1//COMMIT_MSG Commit Message: Line 7: fs: generate report during Open do you have any measurement of whether this affects startup time on a system with lots of containers? http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.cc File src/kudu/fs/fs_report.cc: Line 65:it = missing_block_by_tablet_id.equal_range(it->first).second) { Do we ever expect this to be a big map? It seems like this loop might end up being klog(n) instead of k + log(n) since you are calling equal_range for every element that you hit? The nested iterator with the same name as the outer iterator is also pretty confusing. From reading the code I can't quite tell if it's correct. If we expect this to usually be small anyway, maybe it would be simpler to just use a mapso we can use simpler foreach loops? PS1, Line 68: bool first = true; : for (auto it = range.first; it != range.second; it++) { : if (first) { : first = false; : } else { : block_str += ", "; : } : block_str += it->second; : } if you did the map > above, you could save a bunch of LOC here too and just use a JoinStrings() call Line 78: s += Substitute("Fatal error: tablet $0 missing blocks: [ $1 ]\n", can use SubstituteAndAppend Line 228: s += Substitute("Misaligned block in container $0: $1\n", SubstituteAndAppend PS1, Line 324: if (malformed_record_check && other.malformed_record_check) { : malformed_record_check->MergeFrom(other.malformed_record_check.get()); : } else if (other.malformed_record_check) { : malformed_record_check = other.malformed_record_check; : } hrm, this is very repetitive. you think it's worth doing a templated function or macro? Line 405: LOG(INFO) << ToString(); LOG has the downisde of a maximum 32kb (maybe 64kb?) limit, which I could see being annoying if you get a truncated report http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.h File src/kudu/fs/fs_report.h: Line 44: Entry(BlockId block_id, std::string tablet_id); if this is just a simple assignment constructor, maybe it's not necessary and you could just use brace-initialization? -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] fs: generate report during Open
Adar Dembo has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 1: (22 comments) http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/block_manager.h File src/kudu/fs/block_manager.h: Line 209: virtual Status Open(FsReport* report = nullptr) = 0; > warning: default arguments on virtual or override methods are prohibited [g Done http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/file_block_manager.cc File src/kudu/fs/file_block_manager.cc: Line 539: Status FileBlockManager::Open(FsReport* report) { > warning: default arguments on virtual or override methods are prohibited [g Done http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/file_block_manager.h File src/kudu/fs/file_block_manager.h: Line 84: Status Open(FsReport* report = nullptr) override; > warning: default arguments on virtual or override methods are prohibited [g Done http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_manager.cc File src/kudu/fs/fs_manager.cc: Line 83: using kudu::fs::CreateBlockOptions; > warning: using decl 'CreateBlockOptions' is unused [misc-unused-using-decls Done http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.cc File src/kudu/fs/fs_report.cc: Line 36: using std::vector; > warning: using decl 'vector' is unused [misc-unused-using-decls] Done Line 212: record(std::move(record)) { > warning: passing result of std::move() as a const reference argument; no mo Done Line 236: record(std::move(record)) { > warning: passing result of std::move() as a const reference argument; no mo Done Line 263: offset(std::move(offset)), > warning: std::move of the variable of a trivially-copyable type has no effe Done http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.h File src/kudu/fs/fs_report.h: PS1, Line 34: > prefix with: "Error type:" Done Line 45: > nit no need for an empty line, same for the structs below Done PS1, Line 56: deletes > this doesn't delete anything right? maybe rephrase to: "(can be repaired by I like the information on how a repairable error is repaired, but I'll reword this. PS1, Line 77: container > containers Done PS1, Line 77: preallocated > pre-allocated Disagree: $ git grep -i pre-allocated | grep // | wc -l 5 $ git grep -i preallocated | grep // | wc -l 22 PS1, Line 80: truncates the container data files > same comment as in the previous struct Done PS1, Line 101: Checks for LBM containers where one or both files are too short to be of use. > care to elaborate on this? not sure I understand how files can be too short Suppose we crash right after we've created a new data file but before we create the metadata file (or vice versa). Or suppose we get as far as creating both files but crash before writing the PBC container header into the metadata file. In either case, this is an "incomplete" container that contains no data. It's more annoying to reuse it than it's worth, so we just throw it out. That's the kind of inconsistency this is targeting. I'll try to clarify. PS1, Line 199: Fatal and repairable > isn't a repairable inconsistency by definition non-fatal? Unfortunately no. The "partial record" inconsistency must be repaired in order for the LBM to function. If it isn't repaired, we won't be able to append new records to the metadata file. That's not so bad for CREATE records (we could get by with prohibiting new blocks in the container), but it means DELETEs can't be written, so existing blocks in the container can't be deleted. PS1, Line 217: // Returns whether this report describes at least one fatal and irreparable : // inconsistency. : bool HasFatalErrors() const; > this method name and comment seems to agree with my comment on the struct h Perhaps, but the implementation only looks for errors that are fatal+irreparable. That's what the comment says too. If you think the method name isn't precise enough, to what would you have me change it? PS1, Line 226: // Like CheckForFatalErrors(), but also writes the report to LOG(INFO). : Status LogAndCheckForFatalErrors() const; : : // Like CheckForFatalErrors(), but also writes the report to stdout. : Status PrintAndCheckForFatalErrors() const; > do there really belong here?, why not just pass a report all the time and t They're just convenience methods since I've already repeated those patterns (four times for LogAndCheck and twice for PrintAndCheck). http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: Line 1304: Status LogBlockManager::Open(FsReport* report) { > warning: default arguments on virtual or override methods are prohibited [g Done http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: Line
[kudu-CR] fs: generate report during Open
David Ribeiro Alves has posted comments on this change. Change subject: fs: generate report during Open .. Patch Set 1: (11 comments) first round of reviews. this patch is pretty heftly so I'll review it in 2-3 rounds or less. rather do that than leave no feedback until I'm done http://gerrit.cloudera.org:8080/#/c/6581/1//COMMIT_MSG Commit Message: PS1, Line 32: previously only the opposite was true. what is the opposite? http://gerrit.cloudera.org:8080/#/c/6581/1/src/kudu/fs/fs_report.h File src/kudu/fs/fs_report.h: PS1, Line 34: prefix with: "Error type:" same for the cases below Line 45: nit no need for an empty line, same for the structs below PS1, Line 56: deletes this doesn't delete anything right? maybe rephrase to: "(can be repaired by deleting the block) though I'd be okay to just omit this altogether PS1, Line 77: preallocated pre-allocated PS1, Line 77: container containers PS1, Line 80: truncates the container data files same comment as in the previous struct PS1, Line 101: Checks for LBM containers where one or both files are too short to be of use. care to elaborate on this? not sure I understand how files can be too short. were they truncated? (in which case how is this repairable?) PS1, Line 199: Fatal and repairable isn't a repairable inconsistency by definition non-fatal? how about just: - fatal - repairable - irreparable PS1, Line 217: // Returns whether this report describes at least one fatal and irreparable : // inconsistency. : bool HasFatalErrors() const; this method name and comment seems to agree with my comment on the struct header PS1, Line 226: // Like CheckForFatalErrors(), but also writes the report to LOG(INFO). : Status LogAndCheckForFatalErrors() const; : : // Like CheckForFatalErrors(), but also writes the report to stdout. : Status PrintAndCheckForFatalErrors() const; do there really belong here?, why not just pass a report all the time and then let the called decide what to do with it? (print it or log it or whatever) -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] fs: generate report during Open
Hello David Ribeiro Alves, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/6581 to review the following change. Change subject: fs: generate report during Open .. fs: generate report during Open This patch modifies the FsManager and block managers to produce a filesystem report when opened. The report includes various filesystem statistics as well as all inconsistencies found and repaired. The heart of the patch is the FsReport nested data structure. Originally implemented as a protobuf to take advantage of DebugString(), I found it to be a poor fit once I began customizing the log output, so I rewrote it in its current form. The report includes fs-wide statistics as well as optional consistency checks that may or may not be performed by the block managers. Reports can be merged; the LBM takes advantage of this as it processes data directories in parallel. The consistency checks have structure, so as to simplify testing and ToString() customization. Thanks to the level of detail in the FsReport, the LBM now separates the act of finding inconsistencies from repairing them. This makes it much easier to enforce that repairs aren't done on a read-only filesystem, or if there were any fatal and irreparable errors. Thorough testing of the inconsistencies in the report is deferred to a different patch as this one was already quite large. Other changes of note: - The LBM treats data files without matching metadata files as incomplete containers; previously only the opposite was true. - The LBM maps all containers by name so they can be looked up during the repair process; previously they were stored in a vector. - The checks performed by "kudu fs check" are in FsReport. I think this makes sense, and it's an acknowledgement that they should be performed by the block manager in the future. The code in tool_action_fs.cc was restructured to write its findings into the report before logging it. Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e --- M src/kudu/fs/CMakeLists.txt M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h A src/kudu/fs/fs_report.cc A src/kudu/fs/fs_report.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/server/server_base.cc M src/kudu/tools/kudu-tool-test.cc M src/kudu/tools/tool_action_fs.cc M src/kudu/util/pb_util.cc 17 files changed, 1,144 insertions(+), 208 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6581/1 -- To view, visit http://gerrit.cloudera.org:8080/6581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Idcc1d512e2bed103c8d41623a63b4d071532b35e Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Todd Lipcon