Adar Dembo has posted comments on this change. Change subject: tool: port kudu-fs_dump, remove kudu-fs_list ......................................................................
Patch Set 5: (22 comments) http://gerrit.cloudera.org:8080/#/c/4305/5/docs/release_notes.adoc File docs/release_notes.adoc: PS5, Line 49: - The `cfile-dump` tool has been removed. The same functionality is now : implemented as `kudu fs cfile_dump`. Update this. PS5, Line 58: The `kudu-fs_list` tool has been removed as it was replicating most of the : functionalities already available under 'kudu fs dump'. This should be reworded now that some functionality was preserved, right? http://gerrit.cloudera.org:8080/#/c/4305/5/src/kudu/tools/kudu-tool-test.cc File src/kudu/tools/kudu-tool-test.cc: Line 202: const vector<string> kFsDumpModeRegexes = { Need to add uuid to this list. PS5, Line 501: const string kTestTablet = "test-tablet"; : const string kTestTable = "test-table"; Nit: can you add "Id" as a suffix to these variable names, so it's clear that it's a table ID (as opposed to the table name, below)? In the other new tests too. PS5, Line 517: string fs_paths = Substitute("--fs_wal_dir=$0 --fs_data_dirs=$1", : kTestDir, kTestDir); Nit: not sure what this partial substitution buys you; fs_paths is only used in one place here and in the next test. Line 534: StripWhiteSpace(&debug_str); What whitespace is being stripped here? PS5, Line 568: std::ostringstream Nit: add this to using statements at the top of the file. Should also add the right include too. Line 570: ASSERT_STR_MATCHES(stdout, tree_out.str()); We can't do ASSERT_EQ() here? Line 582: unique_ptr<TabletHarness> harness(new TabletHarness(kSchemaWithIds, opts)); Can't we just declare the TabletHarness on the stack and avoid the unique_ptr? Line 583: CHECK_OK(harness->Create(true)); These CHECK_OK() statements should be ASSERT_OK() instead. CHECK_OK() is only needed when being called on a separate thread in a unit test (here we're on the main thread). PS5, Line 587: gscoped_ptr<LocalTabletWriter> writer( : new LocalTabletWriter(harness->tablet().get(), &kSchema)); : gscoped_ptr<KuduPartialRow> row(new KuduPartialRow(&kSchemaWithIds)); Likewise, can't we declare these on the stack? PS5, Line 606: vector<string> rows = strings::Split(stdout, "\n", strings::SkipEmpty()); Why not use RunActionStdoutLines to get this? PS5, Line 621: if (stdout.find("No rowsets found") == string::npos) { AFAICT this test is deterministic, so will we get "No rowsets found" or not? Given that the bulk of the interesting testing is when we don't get that error, we should do whatever we need to do up-front (i.e. write more data, or flush, or whatever) so that we can convert this into an ASSERT instead. Below too. http://gerrit.cloudera.org:8080/#/c/4305/5/src/kudu/tools/tool_action_common.h File src/kudu/tools/tool_action_common.h: Line 63: class FsTool { My mistake, I wasn't clear when I asked you to remove fs_tool. I didn't mean "move FsTool from the fs_tool.* code files into tool_action.*"; I meant "remove the FsTool abstraction altogether and replace it with stateless functions". Given the simplicity of what we're trying to do, I don't think the abstraction buys us anything, and I'd like to use this opportunity (of writing a new tool) to get rid of old abstractions and assumptions. http://gerrit.cloudera.org:8080/#/c/4305/5/src/kudu/tools/tool_action_fs.cc File src/kudu/tools/tool_action_fs.cc: PS5, Line 138: Print Nit: Change to "Dump" to be consistent with other dump actions. http://gerrit.cloudera.org:8080/#/c/4305/5/src/kudu/tools/tool_action_tablet.cc File src/kudu/tools/tool_action_tablet.cc: Line 19: #include "kudu/tools/tool_action_common.h" This actually belongs where it was previously. The top-most include is reserved for the primary header of this compilation unit. See https://google.github.io/styleguide/cppguide.html#Names_and_Order_of_Includes for details. PS5, Line 331: dump_wals When you add a dump submode, move this into it too. PS5, Line 343: list_blocks How about "dump_block_ids"? Then you can add it to a dump submode, rename the associated function "DumpTabletBlockIds", and change the description to "Dump the IDs of all blocks belonging to a tablet". Line 351: ActionBuilder("list_tablets", &ListAllTablets) Nit: let's just call this one "list", because "kudu tablet list" makes intuitive sense. PS5, Line 385: tablet Nit: "of a tablet" Above too. PS5, Line 385: rowset Nit: rowset contents PS5, Line 399: .AddAction(std::move(list_blocks)) : .AddAction(std::move(list_tablets)) : .AddAction(std::move(dump_blocks)) : .AddAction(std::move(dump_data)) : .AddAction(std::move(dump_meta)) : .AddAction(std::move(dump_rowset)) How about a 'dump' mode to consolidate? Maybe even a 'list' mode, since "dump list ..." is a little weird. -- To view, visit http://gerrit.cloudera.org:8080/4305 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1ec628b65613011d8c48b6239c13762276425966 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dinesh Bhat <din...@cloudera.com> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Dinesh Bhat <din...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-HasComments: Yes