Adar Dembo has posted comments on this change.

Change subject: tool: port kudu-fs_dump, remove kudu-fs_list, fs_tool
......................................................................


Patch Set 8:

(38 comments)

http://gerrit.cloudera.org:8080/#/c/4305/8/docs/release_notes.adoc
File docs/release_notes.adoc:

PS8, Line 71: local_repica
local_replica


http://gerrit.cloudera.org:8080/#/c/4305/5/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

PS5, Line 517:       ASSERT_STR_MATCHES(stdout, "Header:");
             :       ASSERT_STR_MATCHES(stdout, "1\\.1@1");
> Fixed, it was a blind copy-paste effect.
Hmm, not exactly what I meant. What I mean is: you're already doing a 
substitute on L521; include the fs_wal_dir and fs_data_dirs substitution in 
there rather than doing it out here.

That way the entire command line is more clear; don't need to look in two 
different places.


http://gerrit.cloudera.org:8080/#/c/4305/8/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

Line 20: #include <sstream>
Nit: should precede string.


Line 215:     const vector<string> kLocalReplicaModeRegexes = {
Shouldn't there be a dump mode in here? And something for "list all replicas"?


PS8, Line 594:   string fs_paths = "--fs_wal_dir=" + kTestDir + " "
             :       "--fs_data_dirs=" + kTestDir;
Same comment about partial substitution here.


Line 596:   LOG(INFO) <<fs_paths;
This was to help you debug, right? Can it be removed now?


PS8, Line 621:   string fs_paths = "--fs_wal_dir=" + kTestDir + " "
             :       "--fs_data_dirs=" + kTestDir;
Nit: bring this down to L634, where it's first used.


Line 626:   for (int i= 0; i< 10; i++) {
Nit: int i = 0


http://gerrit.cloudera.org:8080/#/c/4305/8/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

Line 19: #include "kudu/tools/tool_action_common.h"
As before, this include doesn't belong up here.


http://gerrit.cloudera.org:8080/#/c/4305/8/src/kudu/tools/tool_action_local_replica.cc
File src/kudu/tools/tool_action_local_replica.cc:

Line 28: #include "kudu/common/schema.h"
Nit: should go after row*


Line 33: #include "kudu/consensus/consensus.pb.h"
Nit: shoudl go after consensus_meta.h.


Line 53: #include "kudu/tablet/rowset_metadata.h"
Nit: should come before tablet.h.


Line 55: #include "kudu/tserver/tserver.pb.h"
Nit: should go after tablet_copy_client.h.


PS8, Line 58: #include "kudu/util/logging.h"
Nit: should go after env_util.h.


PS8, Line 71: information(if any)
Nit: information (if any)


PS8, Line 114: DumpOptions
I'd drop this struct altogether, because:
1. start_key and end_key are never used.
2. nrows and metadata_only are always set to FLAGS_nrows and 
FLAGS_metadata_only, so you can just use those gflags directly where needed 
instead of marshaling them into the struct and passing the struct around.


PS8, Line 286: static
Nit: don't need this; the function is already in an anonymous namespace.


Line 287:                                   const RowSetMetadata& rs_meta) {
Nit: fix the indentation on this line.


Line 292:     std::cout << "Column block for column ID " << col_id;
std::cout and std::endl are already in the 'using' blocks above, so you can 
drop the std:: prefixes.


PS8, Line 318:   const string* tablet_id = FindOrNull(context.required_args, 
"tablet_id");
             :   if (tablet_id == nullptr) {
             :     LOG(INFO) << "No tablet_id specified, dumping all tablets:";
             :   }
I understand the existing tool allowed this 'no tablet_id means dump all 
tablets' behavior, but let's not allow that here. Actually, as written this can 
never be null because it's a required arg, so it never worked anyway.


PS8, Line 346: FsManager& fs_manager
Google style frowns on passing arguments by ref. Your options are:
- Pass by const ref. Appropriate when the argument is not modified by the 
callee.
- Pass by pointer. Appropriate when the argument is modified by the callee, or 
if the callee stores an additional copy of the pointer somewhere.
- Pass by value, and std::move() in the call-site. Appropriate in a hot path 
when the argument is no longer needed by the caller, and should be handed over 
to the callee.

Given the circumstances here, you should use const ref.


Line 381:       std::cout << "\t" << tablet << std::endl;
In this case, let's not bother with the leading tab. It'd be easier to parse 
the command line if a tablet ID was the only thing on its output line.


PS8, Line 398: scoped_refptr<server::Clock>(nullptr)
I think this can just be "scoped_refptr<server::Clock>()".


PS8, Line 399: nullptr
When passing a nullptr like this, consider the following style:

  Tablet t(foo,
           bar,
           nullptr, // very short comment describing this argument's type
           reg.get());

So here it would be:

  Tablet t(meta,
           scoped_refptr<server::Clock>(),
           shared_ptr<MemTracker>(),
           nullptr, // MetricRegistry
           reg.get());

You don't need those comments for args 2 and 3 because the type is already in 
the template argument.


PS8, Line 432: FsManager& fs_manager
Const ref here too.


PS8, Line 476:   // NewDeltaIterator returns Status::OK() iff a new 
DeltaIterator is created. Thus,
             :   // it's safe to have a gscoped_ptr take possesion of 
'raw_iter' here.
             :   gscoped_ptr<DeltaIterator> delta_iter(raw_iter);
This is correct, but let's use unique_ptr now; this was written before we 
transitioned to C++11.


PS8, Line 540: FsManager& fs_manager
Let's change this to const ref.


PS8, Line 545: tablet::RowSetDataPB
You can add "using tablet::RowSetDataPB" at the top and omit the prefix.


PS8, Line 608:   DumpOptions opts;
             :   opts.nrows = FLAGS_nrows;
             :   opts.metadata_only = FLAGS_metadata_only;
Move this to just before the loop on L625 (no point in reconstructing it with 
every loop iteration).


PS8, Line 621:   Schema schema = meta->schema();
I don't see us using this anywhere, and it's not clear why we want to make a 
local _copy_ of it either (vs. a local const ref).


Line 640:   uint32_t rowset_idx;
Let's parse this as a 64-bit integer; that's how large it can be in 
RowSetMetadata.


PS8, Line 648:   DumpOptions opts;
             :   opts.nrows = FLAGS_nrows;
             :   opts.metadata_only = FLAGS_metadata_only;
Nit: move this to just before L657, as it's not needed before.


PS8, Line 667:   FsManagerOpts fs_opts;
             :   fs_opts.read_only = true;
             :   FsManager fs_manager(Env::Default(), fs_opts);
             :   RETURN_NOT_OK(fs_manager.Open());
Given how often this is repeated, consider decomposing it into a local 
function. You'll probably need to have that function return a 
unique_ptr<FsManager> given that FsManager is not copyable or assignable.


Line 673:   CHECK_OK(PrintTabletMeta(fs_manager, tablet_id, 0));
Why CHECK_OK? Shouldn't we RETURN_NOT_OK() on this? Or just "return 
PrintTabletMeta(...)"?


PS8, Line 673: PrintTabletMeta
Nit: let's rename this to DumpTabletMeta to be consistent with the other 
functions here.


Line 679: static unique_ptr<Mode> BuildDumpMode() {
Nit: move this into the anonymous namespace above and then you won't need 
'static'.

Basically, 'static' is the C-style way of making a function local to its 
compilation unit, while anonymous namespaces are the C++-style way of doing it.


PS8, Line 682:       .Description("Dump the data of a tablet")
This needs to be differentiated from dump_data in some way. Presumably this 
only dumps data from the replica's data blocks while dump_data dumps all data 
(including data from WALs)?


PS8, Line 786: tablets
Nit: "Kudu replicas in the local filesystem."

Please make similar substitutions elsewhere in the file, to emphasize that 
we're looking at replicas that are on this machine (rather than remote).


-- 
To view, visit http://gerrit.cloudera.org:8080/4305
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1ec628b65613011d8c48b6239c13762276425966
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dinesh Bhat <din...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dinesh Bhat <din...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes

Reply via email to