Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The "Hbase/MasterRewrite" page has been changed by stack.
http://wiki.apache.org/hadoop/Hbase/MasterRewrite?action=diff&rev1=6&rev2=7

--------------------------------------------------

  = Design Notes for Master Rewrite =
  
  Initial Master Rewrite design came out of conversations at the hbase 
hackathon held at StumbleUpon, August 5-7, 2009 
([[https://issues.apache.org/jira/secure/attachment/12418561/HBase+Hackathon+Notes+-+Sunday.pdf|Jon Gray kept notes]]).
  The umbrella issue for the master rewrite is 
[[https://issues.apache.org/jira/browse/HBASE-1816|HBASE-1816]].  Timeline is 
hbase 0.20.1.
+ 
+ == Table of Contents ==
+  * What does the Master do now?
+  * Problems with current Master
+  * Design
+   * 
  
  == What does the Master do now? ==
  Here's a bit of a refresher on what Master currently does:
@@ -17, +23 @@

   * Watches ZK for its own lease and for regionservers so it knows when to 
run recovery
  
  == Problems with current Master ==
+ There is a good list in the 
[[https://issues.apache.org/jira/secure/ManageLinks.jspa?id=12434794|Issue 
Links]] section of HBASE-1816.
-  * Balancer is not testable unless you spin up full cluster
-  * And here is a small selection of the issues a rewrite should address...
-   * HBASE-1422 Refactor to Server Manager
-   * HBASE-1750 Region opens and assignment running independent of shutdown 
processing
-   * HBASE-1439 race between master and regionserver after missed heartbeat
-   * HBASE-1736 If RS can't talk to master, pause; more importantly, don't 
split (Currently we do and splits are lost and table is wounded)
-   * HBASE-869 On split, if failure updating of .META., table subsequently 
broke
-   * HBASE-934 Assigning all regions to one server only
-   * HBASE-1666 'safe mode' is broken; MetaScanner initial scan completes 
without scanning .META.
-   * HBASE-1679 Flapping DNS does us more harm than it need to
-   * HBASE-1700 Regionserver should be parsimonious regards messages sent 
master
-   * HBASE-1676 load balancing on a large cluster doesn't work very well
-   * HBASE-1742 Region lost (disabled) when -ROOT- offline or hosting server 
dies just before it tells master successful
-   * HBASE-1364 [performance] Distributed splitting of regionserver commit logs
-   * HBASE-451 Remove HTableDescriptor from HRegionInfo
-   * HBASE-1111 [performance] Crash recovery takes way too long
-   * HBASE-1730 online schema updates
-   * HBASE-1502 Remove need for heartbeats in HBase
-   * HBASE-1451 Redo master management of state transitions coalescing and 
keeping transition state over in ZK
          
  
  == Design ==
+ 
+ === Move all region state transitions to zookeeper ===
+ Run state transitions by changing state in zookeeper rather than inside the 
Master
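
A minimal sketch of what running a transition through zookeeper might look like, assuming each region has a node holding its current state plus a version number, and that a transition is a conditional write against the version last read. Nothing here comes from the design notes themselves: the state names, the `FakeZk` in-memory stand-in, the `RegionTransitions` helper and the `/regions/r1` path used below are all hypothetical, and real ZooKeeper is not used so the example stays self-contained.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical region states; the names are illustrative, not taken from HBase. */
enum RegionState { UNASSIGNED, OPENING, OPEN, CLOSING, CLOSED }

/** In-memory stand-in for a versioned node store. Real ZooKeeper is NOT used here. */
final class FakeZk {
    /** Immutable (state, version) pair, mimicking node data plus its version. */
    static final class Node {
        final RegionState state;
        final int version;

        Node(RegionState state, int version) {
            this.state = state;
            this.version = version;
        }
    }

    private final Map<String, Node> nodes = new ConcurrentHashMap<>();

    /** Creates a node at version 0; fails if the path already exists. */
    void create(String path, RegionState initial) {
        if (nodes.putIfAbsent(path, new Node(initial, 0)) != null) {
            throw new IllegalStateException("node exists: " + path);
        }
    }

    /** Returns the current node, or null if the path does not exist. */
    Node get(String path) {
        return nodes.get(path);
    }

    /** Conditional write: succeeds only if the caller saw the latest version. */
    boolean setIfVersion(String path, RegionState next, int expectedVersion) {
        Node current = nodes.get(path);
        if (current == null || current.version != expectedVersion) {
            return false;
        }
        // replace() compares by identity here, so a concurrent writer makes this fail.
        return nodes.replace(path, current, new Node(next, expectedVersion + 1));
    }
}

/** Hypothetical helper: a transition is a read followed by a conditional write. */
final class RegionTransitions {
    private final FakeZk zk;

    RegionTransitions(FakeZk zk) {
        this.zk = zk;
    }

    /** Moves a region from one state to another; false if another actor got there first. */
    boolean transition(String regionPath, RegionState from, RegionState to) {
        FakeZk.Node seen = zk.get(regionPath);
        if (seen == null || seen.state != from) {
            return false;
        }
        return zk.setIfVersion(regionPath, to, seen.version);
    }
}
```

With a node created at the hypothetical path `/regions/r1` in state `UNASSIGNED`, the first call to `transition(path, UNASSIGNED, OPENING)` succeeds and a second identical call returns false, because the stored state has already moved on. The point of the sketch is that the authoritative state lives in the shared store rather than in the Master's memory, so any participant can check or advance it.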
+ 
+ === In Zookeeper, a State and a Schema section ===
  
  == Notes ==
  To be organized...
