Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9490 )

Change subject: [tools] Add a tool to recover master data from tablet servers
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9490/4/src/kudu/tools/master_rebuilder.cc
File src/kudu/tools/master_rebuilder.cc:

http://gerrit.cloudera.org:8080/#/c/9490/4/src/kudu/tools/master_rebuilder.cc@289
PS4, Line 289:   // Start up the master and syscatalog.
> 1) By calling Master::Init(), do we bind to some ports?

Yes. Master::Init() calls KuduServer::Init(), which calls ServerBase::Init(), which builds the messenger and binds to ports. We also start the webserver, though that can be disabled with -webserver_enabled=no.

> Will we start responding to incoming RPCs?

I don't think so. It doesn't look like it from the test logs. It might respond to some base RPCs like Ping?

> A CLI tool shouldn't do either.

Agreed.

> 2) By calling SysCatalogTable::CreateNew, you end up baking
> particulars about _this_ process into the on-disk data.

I imagined that as par for the course. In my head, this tool was used to recreate master data in situ so that a real master process could then be started on top of it. Part of running the tool correctly would be running it as the correct user, as it is for some other tools.

To get back a distributed master, one would migrate from the reconstructed single master to multimaster. Alternatively, one could use a remote reconstructed master to bootstrap "real" masters by migrating to a distributed master and then dropping the reconstructed master.

> 1) Continuing the thread of "reconstruction via generating physical on-disk
> data directly", could we instantiate a Tablet, load the master's schema,
> perform tablet writes directly to the Tablet, then Flush() at the end?

I'm going to evaluate more carefully how feasible this is. My first impression is that it would be the ideal way to do things, but a half-cocked pop-up process gets us what we need with relatively little work, while the extra work would buy a cleaner implementation with no more capability.

Also, don't we require WAL segments whenever we find data? I think that if a Kudu process found just a tablet and nothing else, it would not function right, and it would take some rejiggering to make it work, or to fool it into working.

> 2) Or, if we go in the other direction, could we start an empty
> master and "import" reconstructed metadata into it via RPC?

This could work if we added a couple of RPCs, like "AdoptTable" and "AdoptTablet", that accepted table and tablet metadata and wrote it to the syscatalog. The basic operation of the tool would be the same, except that it would collect the data into the PBs and then send it via RPC to be written, rather than controlling a pop-up master and writing to its syscatalog.

Overall, I think 1 is more work than it is worth compared to a quicker and dirtier solution, but 2 might be nice. Would it satisfy you if I tried an implementation in the style of 2?


-- 
To view, visit http://gerrit.cloudera.org:8080/9490
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If29e421d466a531ebad72e281ae27e74e458f8c6
Gerrit-Change-Number: 9490
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Fri, 23 Mar 2018 05:53:31 +0000
Gerrit-HasComments: Yes
