Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "Hbase/MasterRewrite" page has been changed by stack.
http://wiki.apache.org/hadoop/Hbase/MasterRewrite?action=diff&rev1=6&rev2=7

--------------------------------------------------

  = Design Notes for Master Rewrite =
  Initial Master Rewrite design came of conversations had at the hbase hackathon held at StumbleUpon, August 5-7, 2009 ([[https://issues.apache.org/jira/secure/attachment/12418561/HBase+Hackathon+Notes+-+Sunday.pdf|Jon Gray kept notes]]). The umbrella issue for the master rewrite is [[https://issues.apache.org/jira/browse/HBASE-1816|HBASE-1816]]. Timeline is hbase 0.20.1.
+ 
+ == Table of Contents ==
+  * What does the Master do now?
+  * Problems with current Master
+  * Design
+  * 
  
  == What does the Master do now? ==
  Here's a bit of a refresher on what Master currently does:
@@ -17, +23 @@

   * Watches ZK for its own lease and for regionservers so knows when to run recovery
  
  == Problems with current Master ==
+ There is a good list in the [[https://issues.apache.org/jira/secure/ManageLinks.jspa?id=12434794|Issue Links]] section of HBASE-1816.
-  * Balancer is not testable unless you spin up full cluster
-  * And here is a small selection of the issues a rewrite should address...
-   * HBASE-1422 Refactor to Server Manager
-   * HBASE-1750 Region opens and assignment running independent of shutdown processing
-   * HBASE-1439 race between master and regionserver after missed heartbeat
-   * HBASE-1736 If RS can't talk to master, pause; more importantly, don't split (Currently we do and splits are lost and table is wounded)
-   * HBASE-869 On split, if failure updating of .META., table subsequently broke
-   * HBASE-934 Assigning all regions to one server only
-   * HBASE-1666 'safe mode' is broken; MetaScanner initial scan completes without scanning .META.
-   * HBASE-1679 Flapping DNS does us more harm than it need to
-   * HBASE-1700 Regionserver should be parsimonious regards messages sent master
-   * HBASE-1676 load balancing on a large cluster doesn't work very well
-   * HBASE-1742 Region lost (disabled) when -ROOT- offline or hosting server dies just before it tells master successful
-   * HBASE-1364 [performance] Distributed splitting of regionserver commit logs
-   * HBASE-451 Remove HTableDescriptor from HRegionInfo
-   * HBASE-1111 [performance] Crash recovery takes way too long
-   * HBASE-1730 online schema updates
-   * HBASE-1502 Remove need for heartbeats in HBase
-   * HBASE-1451 Redo master management of state transitions coalescing and keeping transition state over in ZK
  
  == Design ==
+ 
+ === Move all region state transitions to zookeeper ===
+ Run state transitions by changing state in zookeeper rather than inside in Master
+ 
+ === In Zookeeper, a State and a Schema section ===
  
  == Notes ==
  To be organized...
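
The Design item "Move all region state transitions to zookeeper" could be pictured roughly as below. This is only a minimal sketch, not the design itself: the state names, the `RegionTransitions` class and its methods are all hypothetical, and an in-memory map stands in for ZooKeeper so that the example runs on its own. The idea shown is that each region's state lives in a versioned record, and a transition is a conditional write that succeeds only if the writer saw the latest version (similar in spirit to a version-checked `setData` on a znode).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: region state held in a versioned store that stands in for ZooKeeper. */
final class RegionTransitions {
    /** Illustrative states only; the design notes do not name the real ones. */
    enum State { UNASSIGNED, OPENING, OPEN, CLOSING, CLOSED }

    /** Immutable (state, version) pair, mimicking a znode's data plus its version counter. */
    static final class Versioned {
        final State state;
        final int version;
        Versioned(State state, int version) {
            this.state = state;
            this.version = version;
        }
    }

    // In-memory stand-in for a "State" section in ZooKeeper, keyed by region name.
    private final Map<String, Versioned> znodes = new ConcurrentHashMap<>();

    /** Assumed legal moves; a real design would define its own transition table. */
    private static boolean allowed(State from, State to) {
        switch (from) {
            case UNASSIGNED: return to == State.OPENING;
            case OPENING:    return to == State.OPEN || to == State.UNASSIGNED;
            case OPEN:       return to == State.CLOSING;
            case CLOSING:    return to == State.CLOSED;
            case CLOSED:     return to == State.UNASSIGNED;
            default:         return false;
        }
    }

    /** Registers a region in the UNASSIGNED state at version 0 if it is not already known. */
    void create(String region) {
        znodes.putIfAbsent(region, new Versioned(State.UNASSIGNED, 0));
    }

    /** Returns the current record for the region, or null if it was never created. */
    Versioned read(String region) {
        return znodes.get(region);
    }

    /** Conditional update: succeeds only if the caller saw the latest version and the move is legal. */
    boolean transition(String region, int expectedVersion, State to) {
        Versioned current = znodes.get(region);
        if (current == null || current.version != expectedVersion || !allowed(current.state, to)) {
            return false;
        }
        // Atomic swap; fails if another writer replaced the record in the meantime.
        return znodes.replace(region, current, new Versioned(to, current.version + 1));
    }
}
```

With state kept this way, a master or regionserver that acts on stale information simply has its write rejected and must re-read, rather than racing with in-memory state held inside the Master.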
