I replied on PR#2237, but let me write it down here as well. Thanks Duo, I agree with you about the inconsistency between the meta table and the ZNode: an InitMetaProcedure was submitted because we could not find the last server host on the ZNode and the meta region was offline (this rewords your comments, and thanks for pointing it out in the PR). Although I was thinking of not throwing an exception and continuing the meta bootstrap, your concern about inconsistency between the ZNode and the HFiles makes sense.
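To make the "check before delete" idea concrete, here is a minimal sketch of such a guard. It is not the change under discussion in PR#2237. The class and method names (MetaDirGuard, containsData, prepareTableDir) are hypothetical, java.nio.file stands in for the Hadoop FileSystem API so the sketch runs without extra dependencies, and "contains at least one regular file" is only an assumed heuristic for "already initialized", since we cannot really tell from the files alone whether the meta is partial.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of HBase. java.nio.file is used here
// instead of the Hadoop FileSystem API to keep the sketch self-contained.
final class MetaDirGuard {

  // Assumed heuristic: a table directory holding at least one regular
  // file is treated as an already initialized meta.
  static boolean containsData(Path tableDir) throws IOException {
    if (!Files.isDirectory(tableDir)) {
      return false;
    }
    try (Stream<Path> paths = Files.walk(tableDir)) {
      return paths.anyMatch(Files::isRegularFile);
    }
  }

  // Removes an empty leftover directory tree, but throws instead of
  // deleting when the directory already contains data.
  static void prepareTableDir(Path tableDir) throws IOException {
    if (!Files.exists(tableDir)) {
      return;
    }
    if (containsData(tableDir)) {
      throw new IOException("Meta table directory " + tableDir
          + " already contains data; refusing to delete it");
    }
    List<Path> toDelete;
    try (Stream<Path> paths = Files.walk(tableDir)) {
      // Reverse order puts children before their parent directories.
      toDelete = paths.sorted(Comparator.reverseOrder())
          .collect(Collectors.toList());
    }
    for (Path p : toDelete) {
      Files.delete(p);
    }
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical layout under a temp dir, for demonstration only.
    Path tableDir = Files.createDirectories(
        Files.createTempDirectory("meta-guard-demo").resolve("meta"));
    Files.write(tableDir.resolve("hfile"), new byte[] {1});
    try {
      prepareTableDir(tableDir);
    } catch (IOException e) {
      System.out.println("Refused: " + e.getMessage());
    }
  }
}
```

With a guard like this, an operator would see a failure and could repair the inconsistency with HBCK, instead of silently losing a meta directory that still has the right content.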
Interestingly, after another night of thinking, I found that HBASE-24388 moves the server location of the meta table into the master region. That seems to resolve our disagreement, since with the master region InitMetaProcedure should not be entered and the meta will not be deleted. But before splittable meta in HBASE-11288 is complete (or even with it), adding an exception should still protect the cluster from deleting a big meta if there are any corner cases.

Thanks again, have a good weekend.

-Stephen

On Thu, Aug 13, 2020 at 10:48 PM 张铎(Duo Zhang) <[email protected]> wrote:
>
> I'm +1 on adding a check to see if the meta region is really empty or
> partial. If it is not, just leave the meta region there and let the users
> use HBCK to fix the inconsistency, as we should not schedule
> InitMetaProcedure if the meta has already been initialized.
>
> Thanks.
>
> Tak-Lon (Stephen) Wu <[email protected]> wrote on Fri, Aug 14, 2020 at 1:16 PM:
>
> > Hi guys,
> >
> > Sorry to bother everyone, but we need some help on this discussion
> > about a recent change in HBASE-24471 that adds a new state
> > `INIT_META_WRITE_FS_LAYOUT` to InitMetaProcedure. Within that state, it
> > introduces new logic to remove the meta directory if it exists.
> >
> > private static void writeFsLayout(Path rootDir, Configuration conf)
> > throws IOException {
> >   LOG.info("BOOTSTRAP: creating hbase:meta region");
> >   FileSystem fs = rootDir.getFileSystem(conf);
> >   Path tableDir = CommonFSUtils.getTableDir(rootDir,
> > TableName.META_TABLE_NAME);
> >   if (fs.exists(tableDir) && !fs.delete(tableDir, true)) {
> >     LOG.warn("Can not delete partial created meta table, continue...");
> >   }
> >
> > HBASE-24471 is an incompatible change, as mentioned in its release note:
> > if an HM restarts and hits InitMetaProcedure#INIT_META_WRITE_FS_LAYOUT,
> > it considers the meta `partial` and deletes it, even if
> > the meta may not be partial (however, we cannot tell from the HFiles
> > or the table data itself whether the table is partial or inconsistent).
> >
> > So, I’m wondering if we can keep the meta without deleting it, and
> > leave it to a repair action, e.g. using HBCK, if any inconsistency
> > shows up after the meta bootstrap.
> >
> > Apologies in advance to Duo. I would like some ideas from a broader
> > audience on how we can move forward from the discussion on PR#2237.
> >
> > P.S. I need to be honest about our use case: we’re restarting a cluster
> > on fresh ZK data (the cloud use case of restarting with no ZK data and
> > no WALs, only HFiles). That leads to resubmitting
> > InitMetaProcedure and triggers its first state,
> > INIT_META_WRITE_FS_LAYOUT, which deletes the meta. As a result, we’re
> > suffering from the other side of the problem: even if the meta directory
> > has the right content, we need to rebuild it.
> >
> > Related JIRAs
> > * https://issues.apache.org/jira/browse/HBASE-24471
> > * https://issues.apache.org/jira/browse/HBASE-24833
> >
> > Related PRs
> > * PR#1806,
> > https://github.com/apache/hbase/commit/4d5efec76718032a1e55024fd5133409e4be3cb8#
> > * PR#2237, still under discussion,
> > https://github.com/apache/hbase/pull/2237
> >
> >
> > Thanks,
> > Stephen
> >
