Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9490 )
Change subject: [tools] Add a tool to recover master data from tablet servers
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9490/4/src/kudu/tools/master_rebuilder.cc
File src/kudu/tools/master_rebuilder.cc:

http://gerrit.cloudera.org:8080/#/c/9490/4/src/kudu/tools/master_rebuilder.cc@289
PS4, Line 289:   // Start up the master and syscatalog.
When thinking about the layers involved, I agree that recreating the master metadata via SysCatalogTable::Write seems preferable. Any lower and you'd need to start your own TabletReplica and manage the master schema yourself. Any higher and you end up starting a full master process.

But, that's actually what you're doing here, and it's concerning for a few reasons:
1) By calling Master::Init(), do we bind to some ports? Will we start responding to incoming RPCs? A CLI tool shouldn't do either.
2) By calling SysCatalogTable::CreateNew, you end up baking particulars about _this_ process into the on-disk data. For example (maybe the only example), a cmeta file will be generated with the first RPC address of _this process_ (well, of Master::Init I guess) in it. That seems like a bad idea since the CLI tool and the actual master are likely to run differently (i.e. different UNIX users, maybe different machines too if the only goal here is to generate some on-disk data).

So here are some avenues to explore:
1) Continuing the thread of "reconstruction via generating physical on-disk data directly", could we instantiate a Tablet, load the master's schema, perform tablet writes directly to the Tablet, then Flush() at the end? Then there's no TabletReplica, no cmeta, no WAL, etc. TabletHarness is a test-only class that you may be able to reuse. The big question is whether a master could load such a tablet afterwards.
2) Or, if we go in the other direction, could we start an empty master and "import" reconstructed metadata into it via RPC?

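To make avenue 1 a bit more concrete, here is a rough, self-contained sketch of the shape I have in mind. Everything in it is hypothetical: SysRow, BareTablet, and RebuildSysCatalog are stand-ins invented for illustration, not Kudu's real Tablet or TabletHarness API, and the "flushed" map merely models on-disk data. The point is only that the tool buffers writes and flushes them, with no RPC server, no cmeta, and no WAL in the picture.

```cpp
// Hypothetical sketch: these types are NOT Kudu's real classes or API.
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for one row of master metadata recovered from a tablet server.
struct SysRow {
  int entry_type;        // assumed encoding, e.g. 1 = table, 2 = tablet
  std::string entry_id;
  std::string metadata;  // opaque serialized payload
};

// Stand-in for a bare tablet: writes are buffered in memory, and Flush()
// moves them into a map that models flushed on-disk data.
class BareTablet {
 public:
  using Key = std::pair<int, std::string>;

  // Upsert a row keyed by (entry_type, entry_id); the last write wins.
  void Write(const SysRow& row) {
    buffered_[Key(row.entry_type, row.entry_id)] = row.metadata;
  }

  // Move all buffered rows into the flushed set; returns how many moved.
  std::size_t Flush() {
    const std::size_t n = buffered_.size();
    for (const auto& kv : buffered_) flushed_[kv.first] = kv.second;
    buffered_.clear();
    return n;
  }

  std::size_t buffered_count() const { return buffered_.size(); }
  std::size_t flushed_count() const { return flushed_.size(); }

  // Look up a flushed row; returns an empty string if it is absent.
  std::string Get(int entry_type, const std::string& entry_id) const {
    auto it = flushed_.find(Key(entry_type, entry_id));
    return it == flushed_.end() ? std::string() : it->second;
  }

 private:
  std::map<Key, std::string> buffered_;
  std::map<Key, std::string> flushed_;
};

// Write every reported row directly to the tablet, then flush once at the
// end. Duplicate reports (e.g. from several replicas) collapse via upsert.
BareTablet RebuildSysCatalog(const std::vector<SysRow>& reported) {
  BareTablet tablet;
  for (const SysRow& row : reported) tablet.Write(row);
  tablet.Flush();
  assert(tablet.buffered_count() == 0);  // nothing left unflushed
  return tablet;
}
```

Nothing about the process that runs this ends up in the output, which is the property I care about. Whether a real master could load a tablet produced this way is still the open question.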
--
To view, visit http://gerrit.cloudera.org:8080/9490
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If29e421d466a531ebad72e281ae27e74e458f8c6
Gerrit-Change-Number: 9490
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Thu, 22 Mar 2018 23:08:56 +0000
Gerrit-HasComments: Yes
