[ https://issues.apache.org/jira/browse/HBASE-13260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517632#comment-14517632 ]

Enis Soztutar commented on HBASE-13260:
---------------------------------------

Thanks [~mbertozzi] for pursuing this. I pushed the code that I am running to 
https://github.com/enis/hbase/tree/hbase-13260-review and committed your test as 
TestProcedureStorePerf. 

{code}
mvn test -Dtest=TestProcedureStorePerf
{code}
Results from my MBP are below; the region-backed store comes out ahead at most 
thread counts. Maybe that is because of the relatively fast SSD in my laptop? 
Let me try a Linux box without SSDs. 
{code}
  TestProcedureStorePerf.runTestWith10ThreadsAndProcV2Wal:225->runTest:207 Wrote 1000000 procedures with 10 threads with useProcV2Wal=true hsync=false in 36.8390sec (36.839sec)
  TestProcedureStorePerf.runTestWith10ThreadsAndRegionStore:250->runTest:207 Wrote 1000000 procedures with 10 threads with useProcV2Wal=false hsync=false in 17.1810sec (17.181sec)
  TestProcedureStorePerf.runTestWith30ThreadsAndProcV2Wal:230->runTest:207 Wrote 1000000 procedures with 30 threads with useProcV2Wal=true hsync=false in 22.3340sec (22.334sec)
  TestProcedureStorePerf.runTestWith30ThreadsAndRegionStore:255->runTest:207 Wrote 1000000 procedures with 30 threads with useProcV2Wal=false hsync=false in 25.1180sec (25.118sec)
  TestProcedureStorePerf.runTestWith4ThreadsAndProcV2Wal:220->runTest:207 Wrote 1000000 procedures with 5 threads with useProcV2Wal=true hsync=false in 53.6920sec (53.692sec)
  TestProcedureStorePerf.runTestWith50ThreadsAndProcV2Wal:235->runTest:207 Wrote 1000000 procedures with 50 threads with useProcV2Wal=true hsync=false in 20.8590sec (20.859sec)
  TestProcedureStorePerf.runTestWith50ThreadsAndRegionStore:260->runTest:207 Wrote 1000000 procedures with 50 threads with useProcV2Wal=false hsync=false in 19.1450sec (19.145sec)
  TestProcedureStorePerf.runTestWith5ThreadsAndRegionStore:245->runTest:207 Wrote 1000000 procedures with 5 threads with useProcV2Wal=false hsync=false in 15.8760sec (15.876sec)
{code}
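
For anyone who does not want to check out the branch: the harness is essentially N 
writer threads racing to insert 1M procedure records into the store under test 
while we time the wall clock. A rough sketch of that loop (illustrative names only, 
not the actual TestProcedureStorePerf code): 
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class StorePerfSketch {
  private static final long NUM_PROCS = 1000000;

  /** Stand-in for either the ProcV2 WAL store or the region-backed store. */
  interface Store {
    void insert(long procId);
  }

  /** Returns elapsed seconds for numThreads threads to write NUM_PROCS records. */
  static double runTest(final Store store, int numThreads) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    final AtomicLong nextId = new AtomicLong();
    long start = System.nanoTime();
    for (int i = 0; i < numThreads; i++) {
      pool.submit(new Runnable() {
        @Override
        public void run() {
          long id;
          // All threads pull ids from a shared counter until 1M inserts are done.
          while ((id = nextId.getAndIncrement()) < NUM_PROCS) {
            store.insert(id); // each insert appends to the store's WAL (hsync=false above)
          }
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
    return (System.nanoTime() - start) / 1e9;
  }
}
{code}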

> Bootstrap Tables for fun and profit 
> ------------------------------------
>
>                 Key: HBASE-13260
>                 URL: https://issues.apache.org/jira/browse/HBASE-13260
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: hbase-13260_bench.patch, hbase-13260_prototype.patch
>
>
> Over at the ProcV2 discussions (HBASE-12439) and elsewhere, I was mentioning an 
> idea: we may want to use regular old regions to store/persist some data needed 
> for the HBase master to operate. 
> We regularly use system tables for storing system data; acl, meta, namespace, 
> and quota are some examples. We also store the table state in meta now. Some 
> data is persisted in zk only (replication peers, replication state, etc.). We 
> are moving away from zk as permanent storage. As any self-respecting database 
> does, we should store almost all of our data in HBase itself. However, we have 
> an "availability" dependency between different kinds of data. For example, all 
> system tables need meta to be assigned first, and all master operations need 
> the ns table to be assigned, etc. 
> For at least two types of data, (1) procedure v2 state and (2) RS groups in 
> HBASE-6721, we cannot depend on meta being assigned, since "assignment" itself 
> depends on accessing this data. The solution for (1) is to implement a custom 
> WAL format with custom lease recovery and WAL recovery. The solution for (2) is 
> to have a table store this data, but also cache it in zk for bootstrapping 
> initial assignments. 
> To solve both of the above (and possible future use cases, if any), I propose 
> we add a "bootstrap table" concept, which is: 
>  - A set of predefined tables hosted in a separate dir in HDFS. 
>  - Each table is a single region and is not splittable. 
>  - Not assigned through regular assignment. 
>  - Hosted on only 1 server (typically the master). 
>  - Has a dedicated WAL. 
>  - A service does WAL recovery + fencing for these tables. 
> This has the benefit of using a region to keep the data while sparing us from 
> re-implementing caching, and we can reuse the same WAL / Memstore / Recovery 
> mechanisms that are battle-tested. 
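> A rough sketch of the shape such a facade could take (hypothetical API, just to 
> make the idea concrete; none of this is committed code): 
> {code}
> import java.io.Closeable;
> import java.io.IOException;
> 
> import org.apache.hadoop.hbase.TableName;
> import org.apache.hadoop.hbase.client.Delete;
> import org.apache.hadoop.hbase.client.Get;
> import org.apache.hadoop.hbase.client.Put;
> import org.apache.hadoop.hbase.client.Result;
> 
> // Hypothetical facade for a bootstrap table: a single, unsplittable region
> // with a dedicated WAL, opened directly by the hosting server (typically
> // the master) instead of going through the regular assignment path.
> public interface BootstrapTable extends Closeable {
>   Result get(Get get) throws IOException;
>   void put(Put put) throws IOException;   // goes through the dedicated WAL + memstore
>   void delete(Delete delete) throws IOException;
> }
> 
> // Hypothetical service that replays the dedicated WAL and fences out any
> // previous host before handing the single region to the caller.
> interface BootstrapTableService {
>   BootstrapTable open(TableName name) throws IOException;
> }
> {code}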
>  


