[
https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787446#comment-13787446
]
stack commented on HBASE-5487:
------------------------------
bq. We have talked a little bit here, and agreed on 2 key points.
What about Enis's 'requirements'? Do we agree on his list? That makes 3 key
points. I think we should add a fourth point where we rehearse what is wrong
w/ the current system (Jon's suggestion), as it will help ensure we don't
repeat the mistakes of the past.
Make a subtask 'New Assignment Manager'? Or make this a subtask of a new issue
called 'New Assignment Manager'. An issue named so will be easier to find than
this one. Also, others are interested in this effort (@honghua and
[~xieliang007]) and it'll catch their attention.
I think it is too big a change to be done for the pending 0.98. Let's not rush
it. It could even land post hbase 1.0 if 0.98 is to become 1.0.
On the design doc:
+ The doc needs an author and date. I would expect a section situating the
document -- context -- that at least referred to the current 'design' -- see
https://issues.apache.org/jira/browse/HBASE-2485 (it has a 'state' machine
that looks like this one)
+ The problem section is too short (state kept in multiple places and all have
to agree...); we need a fuller list so we can be sure the proposal addresses
them all
+ How is the proposal different from what we currently have? I see us tying
regionstate to table state. That is new. But the rest, where we have a record
and it is atomically changed looks like our RegionState in Master memory?
There is an increasing 'version' which should help ensure a 'direction' for
change which should help.
+ A single atomically mutable record of the regionstate is well and good but
how then to get the cluster to align w/ what this record says? For example,
table record says it is disabled. It has 10k regions. How do we get the
regions to agree w/ the Table record which says it is disabled? We can send
the closes, but how are we sure the close happened on all 10k regions?
+ I don't get this bit: "This record is the only source of truth about the
region, and is never removed while the region is relevant. This simplifies
current situation where ZK state, master state and META state can all conflict
in various special ways..." It's fine having a source of truth, but ain't the
hard part bringing the system along? (meta edits, clients, etc.).
Experience has zk as messy to reason with. It is also an indirection having RS
and M go to zk to do 'state'.
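To make the last few points concrete, here is a minimal sketch of a versioned
record that is only changed by compare-and-set, plus a loop that keeps sending
closes until every region is confirmed closed. It is illustrative only: the
names (RegionRecord, RecordStore, reconcile_disable, send_close) and the state
strings are assumptions made up for this example, not HBase classes and not
anything taken from the design doc, and the store is a plain in-memory dict.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Iterable, Set


@dataclass(frozen=True)
class RegionRecord:
    """Hypothetical single record of truth for one region."""
    region: str
    state: str        # assumed states: "OPEN", "CLOSING", "CLOSED"
    version: int = 0  # only ever increases, giving changes a direction


class RecordStore:
    """In-memory stand-in for wherever the record would actually live."""

    def __init__(self) -> None:
        self._records: Dict[str, RegionRecord] = {}

    def put_new(self, region: str, state: str) -> RegionRecord:
        record = RegionRecord(region, state)
        self._records[region] = record
        return record

    def get(self, region: str) -> RegionRecord:
        return self._records[region]

    def compare_and_set(self, expected: RegionRecord, new_state: str) -> bool:
        """Apply the change only if nobody bumped the version since `expected` was read."""
        current = self._records.get(expected.region)
        if current is None or current.version != expected.version:
            return False  # stale read: caller must re-read and decide again
        self._records[expected.region] = replace(
            current, state=new_state, version=current.version + 1)
        return True


def reconcile_disable(store: RecordStore,
                      regions: Iterable[str],
                      send_close: Callable[[str], bool],
                      max_rounds: int = 3) -> Set[str]:
    """Drive regions toward CLOSED; return the ones still unconfirmed.

    `send_close` is a hypothetical callback that returns True only when the
    hosting server acknowledges the close.
    """
    pending = set(regions)
    for _ in range(max_rounds):
        for region in sorted(pending):  # sorted() copies, so discarding is safe
            record = store.get(region)
            if record.state == "CLOSED":
                pending.discard(region)
            elif send_close(region) and store.compare_and_set(record, "CLOSED"):
                pending.discard(region)
        if not pending:
            break
    return pending
```

A non-empty return value is the interesting case: the table record may say
'disabled', but those regions have not been confirmed closed, so something
still has to retry or escalate. The sketch says nothing about how clients or
meta edits catch up, which is the harder part raised above.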
Thanks for writing this up, Sergey.
> Generic framework for Master-coordinated tasks
> ----------------------------------------------
>
> Key: HBASE-5487
> URL: https://issues.apache.org/jira/browse/HBASE-5487
> Project: HBase
> Issue Type: New Feature
> Components: master, regionserver, Zookeeper
> Affects Versions: 0.94.0
> Reporter: Mubarak Seyed
> Priority: Critical
> Attachments: Region management in Master.pdf
>
>
> Need a framework to execute master-coordinated tasks in a fault-tolerant
> manner.
> Master-coordinated tasks such as online-schema change and delete-range
> (deleting region(s) based on start/end key) can make use of this framework.
> The advantages of the framework are:
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for
> master-coordinated tasks
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plug in new master-coordinated tasks without adding code to core
> components