[ https://issues.apache.org/jira/browse/HBASE-13260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612226#comment-14612226 ]
stack commented on HBASE-13260:
-------------------------------

Resolve as interesting experiment from which we learned a bunch?

> Bootstrap Tables for fun and profit
> ------------------------------------
>
>                 Key: HBASE-13260
>                 URL: https://issues.apache.org/jira/browse/HBASE-13260
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0, 1.3.0
>
>         Attachments: hbase-13260_bench.patch, hbase-13260_prototype.patch
>
>
> Over at the ProcV2 discussions (HBASE-12439) and elsewhere I was mentioning an
> idea where we may want to use regular old regions to store/persist some data
> needed for HBase master to operate.
> We regularly use system tables for storing system data. acl, meta, namespace,
> quota are some examples. We also store the table state in meta now. Some data
> is persisted in zk only (replication peers and replication state, etc). We
> are moving away from zk as a permanent storage. As any self-respecting
> database does, we should store almost all of our data in HBase itself.
> However, we have an "availability" dependency between different kinds of
> data. For example, all system tables need meta to be assigned first. All
> master operations need the ns table to be assigned, etc.
> For at least two types of data, (1) procedure v2 states and (2) RS groups in
> HBASE-6721, we cannot depend on meta being assigned, since "assignment" itself
> will depend on accessing this data. The solution in (1) is to implement a
> custom WAL format, with custom lease recovery and WAL recovery. The solution in
> (2) is to have a table store this data, but also cache it in zk for
> bootstrapping initial assignments.
> For solving both of the above (and possible future use cases, if any), I
> propose we add a "bootstrap table" concept, which is:
> - A set of predefined tables hosted in a separate dir in HDFS.
> - A table is only 1 region, not splittable.
> - Not assigned through regular assignment.
> - Hosted only on 1 server (typically master).
> - Has a dedicated WAL.
> - A service does WAL recovery + fencing for these tables.
> This has the benefit of using a region to keep the data, which frees us from
> re-implementing caching, and we can use the same WAL / Memstore / Recovery
> mechanisms that are battle-tested.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)