The "ZooKeeper/HBaseUseCases" page has been changed by PatrickHunt.
http://wiki.apache.org/hadoop/ZooKeeper/HBaseUseCases?action=diff&rev1=5&rev2=6

--------------------------------------------------

=== Case 2 ===
Summary: HBase region transitions from unassigned to open and from open to unassigned, with some intermediate states
+ Expected scale: 100k regions across thousands of RegionServers
+
+ [PDH start]
+
+ This sounds like two recipes: "dynamic configuration" ("dynamic sharding" is the same thing, except the data may be a bit larger) and "group membership". Basically, you want a list of region servers that are available to do work, and a master that coordinates the work among those region servers. You also want to ensure that the work handed to an RS is acted upon in order (state transitions), and you would like to know the status of the work at any point in time. So really I see two recipes here.
+
+ Here's an idea; let me know whether I've understood the problem correctly. It would obviously have to be fleshed out more, but this is the general shape. I've chosen arbitrary paths below; you'd obviously want some sort of prefix, better names, etc.
+
+ 1) group membership:
+  # have a /regionservers znode
+  # master watches /regionservers for any child changes
+  # as each region server becomes available to do work (or to track state, if it is up but not available) it creates an ephemeral node
+   * /regionservers/<host:port> = <status>
+  # master watches /regionservers/<host:port> and cleans up if the RS goes away
+
+ 2) task assignment (i.e. dynamic configuration):
+  # have a /tables znode
+  # /tables/<regionserver by host:port>, which is created when the master notices a new region server
+   * RS host:port watches this node for any child changes
+  # /tables/<regionserver by host:port>/<regionX> znode for each region assigned to RS host:port
+   * RS host:port watches this node in case the region is reassigned by the master or changes state
+  # /tables/<regionserver by host:port>/<regionX>/<state>-<seq#> znode created by master
+   * seq ensures the order seen by the RS
+   * RS deletes old state znodes; the oldest remaining entry is the current state, so there is always at least one znode here (the current state)
+
+ [PDH end]

General recipe implemented: None yet. Need help. We were thinking of keeping queues in ZK, one queue per regionserver, for it to open/close regions, etc. However, the list of all regions is currently kept elsewhere, in our .META. catalog table, and probably will be for the foreseeable future. Some further description can be found here: [[http://wiki.apache.org/hadoop/Hbase/MasterRewrite#regionstate|Master Rewrite: Region State]]
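
As a rough illustration of the group membership recipe in the [PDH] notes, the master's reaction to a child-change event on /regionservers could be sketched as below. This is only a sketch under assumptions: it does not talk to ZooKeeper at all, the child listing is presumed to come from something like ZooKeeper's `getChildren` call with a watch set, and the names `diff_membership`, `MembershipTracker`, and the `rs1:60020`-style host:port values are hypothetical.

```python
from typing import Iterable, Set, Tuple


def diff_membership(previous: Set[str], current: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (joined, departed) region servers between two child listings."""
    return current - previous, previous - current


class MembershipTracker:
    """Hypothetical master-side helper that remembers the last listing
    of the /regionservers children (one child per live region server)."""

    def __init__(self) -> None:
        self.known: Set[str] = set()

    def on_children_changed(self, children: Iterable[str]) -> Tuple[Set[str], Set[str]]:
        # Materialize once so generators are not consumed twice.
        current = set(children)
        joined, departed = diff_membership(self.known, current)
        self.known = current
        return joined, departed


if __name__ == "__main__":
    tracker = MembershipTracker()
    print(tracker.on_children_changed(["rs1:60020", "rs2:60020"]))
    # rs1's ephemeral node disappears, rs3 registers.
    print(tracker.on_children_changed(["rs2:60020", "rs3:60020"]))
```

In this sketch the departed set is where the master would do its cleanup (for example, reassigning the regions under the corresponding /tables entry), and the joined set is where it would create the new /tables/<regionserver by host:port> znode.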

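For the task assignment recipe, the region server has to turn the children of /tables/<regionserver by host:port>/<regionX> into an ordered list of states. A minimal sketch of that step follows, assuming the master creates the `<state>-<seq#>` znodes with ZooKeeper's sequential flag, which appends a zero-padded, monotonically increasing counter to the name. The helper names and the "opening" intermediate state are hypothetical; the sketch only parses names and does not call ZooKeeper.

```python
from typing import Iterable, List, Tuple


def parse_state_node(name: str) -> Tuple[int, str]:
    """Split a '<state>-<seq#>' child name into (sequence number, state)."""
    state, _, seq = name.rpartition("-")
    if not state or not seq.isdigit():
        raise ValueError(f"not a <state>-<seq#> znode name: {name!r}")
    return int(seq), state


def order_states(children: Iterable[str]) -> List[str]:
    """Return the states in the order the master created them."""
    return [state for _, state in sorted(parse_state_node(c) for c in children)]


def current_and_pending(children: Iterable[str]) -> Tuple[str, List[str]]:
    """The oldest remaining entry is the current state; the rest are
    transitions the region server has not yet acted upon."""
    ordered = order_states(children)
    if not ordered:
        raise ValueError("expected at least one state znode")
    return ordered[0], ordered[1:]


if __name__ == "__main__":
    # Child listings come back in no particular order.
    kids = ["open-0000000007", "unassigned-0000000005", "opening-0000000006"]
    print(current_and_pending(kids))
```

Sorting on the numeric suffix rather than on the whole name matters here, because a plain lexical sort would order the children by state name first and lose the creation order.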