[
https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13615901#comment-13615901
]
Sergey Shelukhin commented on HBASE-5487:
-----------------------------------------
bq. Is the assignment manager the only "master coordinated" task in scope?
Only for the current document version... tables could be added.
bq. Instead of asserting it is not clear if table (+region) locks scale, let's
find out.
Hmm... that would require implementing region locks, and having a very large
cluster. I am talking more about unacceptable blocking of user operations, and
management of expiring locks in the presence of real-life failures.
bq. Master operations and processes can clash and we should understand where we
need concurrency control. (I'm working on a table – here's a draft distilled
version [1], there exists an overly detailed version that I'll share once I get
it fixed)
Comments below.
bq. Should there be a notion of queuing operations? (locking, or an actual
queue) Should these operations be generically logged so they can complete if a
master goes down in the middle? (ex: master goes down during a "move" operation
after the close but before the open on the new rs).
You mean like WAL for operations?
bq. The "design principles" is actually more of a proposed design.
Yeah, sorry, wanted to split it into two sections but never did. Will rename.
bq. how do we deal with operations where we need "locks" on multiple regions
because we are reading or modifying multiple regions – e.g. splits, merges,
snapshots? Matteo Bertozzi had suggested in another jira making a meta row
per table, or maybe part of the solution is using the multi-row single meta
region transaction.
Depends on where we store it, but yeah these have to be transactional. Last
section (very short :)) suggests using ZK, which already supports that.
bq. What are alternatives? why this approach vs others?
I can expand the doc... the implicitly mentioned existing alternatives are
locks, which I would argue scale worse and are harder to manage; or the
transaction approach that is currently used (although not unified), for
example via transient transaction nodes.
Actually, one alternative approach I saw used for such things is to simplify
concurrency of operations etc. with an actor-like model, where the master has
the logical cluster state and the previously saved target state, and
periodically (often) takes an epic lock, looks at them quickly, and based on
what it is doing, outputs a new target cluster state and a list of physical
things to do. Then it releases the epic lock, the new target state is saved,
and the operations are performed.
That way all state-management code becomes simple, because it runs in one place
with no concurrency, and recovery just has to compare real cluster state with
destination state.
But this will require thinking about this differently.
Also, usually that would mean RSes won't be able to initiate operations (like
split) - they will have to go through the master (which I would argue is ok).
Also it's not clear whether this will become too much of a bottleneck.
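To make the actor-like alternative a bit more concrete, here is a minimal
in-memory sketch of one pass of such a loop. All names in it (MasterLoop, Plan,
observe, requestMove, tick) are made up for illustration and are not HBase
APIs; region-to-server maps stand in for the logical cluster state and the
saved target state, and actions are plain strings rather than real RPCs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the "epic lock" loop; none of these names are HBase APIs.
final class MasterLoop {
    /** Result of one pass: the new target state plus the physical steps to reach it. */
    static final class Plan {
        final Map<String, String> newTarget; // region -> server it should be open on
        final List<String> actions;          // ordered physical operations

        Plan(Map<String, String> newTarget, List<String> actions) {
            this.newTarget = newTarget;
            this.actions = actions;
        }
    }

    private final Object epicLock = new Object();
    private final Map<String, String> clusterState = new TreeMap<>(); // region -> current server
    private Map<String, String> savedTarget = new TreeMap<>();
    private final List<String[]> pendingMoves = new ArrayList<>();

    /** Record what the cluster actually looks like; a null server means "not open anywhere". */
    void observe(String region, String server) {
        synchronized (epicLock) {
            if (server == null) {
                clusterState.remove(region);
            } else {
                clusterState.put(region, server);
            }
        }
    }

    /** Queue a request; nothing happens until the next tick. */
    void requestMove(String region, String destination) {
        synchronized (epicLock) {
            pendingMoves.add(new String[] {region, destination});
        }
    }

    /** One pass of the loop: all decisions are made here, in one place, under the lock. */
    Plan tick() {
        synchronized (epicLock) {
            // Fold queued requests into a copy of the previously saved target.
            Map<String, String> newTarget = new TreeMap<>(savedTarget);
            for (String[] move : pendingMoves) {
                newTarget.put(move[0], move[1]);
            }
            pendingMoves.clear();

            // Diff the observed state against the target to get the physical steps.
            List<String> actions = new ArrayList<>();
            for (Map.Entry<String, String> e : newTarget.entrySet()) {
                String region = e.getKey();
                String want = e.getValue();
                String have = clusterState.get(region);
                if (want.equals(have)) {
                    continue; // already where it should be
                }
                if (have != null) {
                    actions.add("CLOSE " + region + " on " + have);
                }
                actions.add("OPEN " + region + " on " + want);
            }
            // In-memory stand-in for saving the target; a real master would persist it
            // durably (e.g. in ZK or META) after releasing the lock, then run the actions.
            savedTarget = newTarget;
            return new Plan(newTarget, actions);
        }
    }
}
```

With this shape, the "master dies after the close but before the open" case
needs no special handling in the sketch: the next tick sees the region open
nowhere, compares that with the saved target, and emits only the missing OPEN.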
bq. Where do you think the new information will be, META table?
It seems to me that ZK would be better (see last section), but META is also an
option.
From the spreadsheet:
bq. Enabling and disabling table operations should be blocked when any of
these simple region operations are in progress
Not clear why (logically).
bq. move
Move is close and open, doesn't require consistency, right?
bq. Regionserver Processes ... However, the individual operations must
maintain the table integrity property.
Not clear what this means for snapshots.
> Generic framework for Master-coordinated tasks
> ----------------------------------------------
>
> Key: HBASE-5487
> URL: https://issues.apache.org/jira/browse/HBASE-5487
> Project: HBase
> Issue Type: New Feature
> Components: master, regionserver, Zookeeper
> Affects Versions: 0.94.0
> Reporter: Mubarak Seyed
> Attachments: Region management in Master.pdf
>
>
> Need a framework to execute master-coordinated tasks in a fault-tolerant
> manner.
> Master-coordinated tasks such as online schema change and delete-range
> (deleting region(s) based on start/end key) can make use of this framework.
> The advantages of the framework are:
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for
> master-coordinated tasks
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plug in new master-coordinated tasks without adding code to core
> components