[
https://issues.apache.org/jira/browse/HBASE-12439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207125#comment-14207125
]
Nick Dimiduk commented on HBASE-12439:
--------------------------------------
This will be a huge feature for MTTR and online reliability -- why the Minor
label?
I'm not clear on some of the abstractions. Please confirm whether each of the
observations below is true or false.
- a client submits a "procedure" whose execution progress it is interested in
observing
- a procedure is defined as a DAG of sub-procedures that are required to
complete procedure execution
- multiple sub-procedures can be executed in parallel
- a sub-procedure can define an action that must be taken on multiple hosts
- DAG execution progress is tracked through a storage system
- procedure execution can be halted and reverted at any time
- completed DAG sub-procedures must be able to roll back in the event of
procedure revert
- procedure execution is tied to transitions through a persisted state machine
- all procedures have the same set of states through which they can transition
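A minimal Java sketch of the model these observations describe may help frame the answers. It is only an illustration: `ProcedureSketch`, `Step`, `State` and `step(...)` are hypothetical names, not taken from the patch or from ProcedureV2.pdf. The sub-procedures form a linear chain (the simplest DAG), an in-memory list stands in for the persisted store, and parallel and multi-host execution are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from HBASE-12439. */
final class ProcedureSketch {
    /** One fixed set of states shared by every procedure (assumed). */
    enum State { RUNNABLE, FINISHED, ROLLED_BACK }

    /** A sub-procedure: a forward action plus the undo that reverts it. */
    interface Step {
        String name();
        void execute() throws Exception;
        void rollback();
    }

    private final List<Step> steps;
    // Stand-in for the persisted state store: every transition is appended.
    private final List<String> log = new ArrayList<>();
    private State state = State.RUNNABLE;

    ProcedureSketch(List<Step> steps) {
        this.steps = steps;
    }

    /** Runs the steps in order; on failure, undoes completed steps in reverse. */
    State run() {
        List<Step> done = new ArrayList<>();
        for (Step s : steps) {
            try {
                s.execute();
                done.add(s);
                log.add("done:" + s.name());
            } catch (Exception e) {
                log.add("failed:" + s.name());
                for (int i = done.size() - 1; i >= 0; i--) {
                    done.get(i).rollback();
                    log.add("undone:" + done.get(i).name());
                }
                state = State.ROLLED_BACK;
                return state;
            }
        }
        state = State.FINISHED;
        return state;
    }

    State state() {
        return state;
    }

    List<String> log() {
        return log;
    }

    /** Test helper: a step with a no-op rollback that optionally fails. */
    static Step step(String name, boolean fails) {
        return new Step() {
            public String name() { return name; }
            public void execute() throws Exception {
                if (fails) throw new Exception(name + " failed");
            }
            public void rollback() { }
        };
    }
}
```

Running steps "a", "b" and a failing "c" through `run()` ends in `ROLLED_BACK` with the log `done:a, done:b, failed:c, undone:b, undone:a`. Whether the real design undoes completed sub-procedures in this order is exactly what the observations above ask.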
Why implement a separate store? Can we not use a system table for the procedure
state store?
> Procedure V2
> ------------
>
> Key: HBASE-12439
> URL: https://issues.apache.org/jira/browse/HBASE-12439
> Project: HBase
> Issue Type: New Feature
> Components: master
> Affects Versions: 2.0.0
> Reporter: Matteo Bertozzi
> Assignee: Matteo Bertozzi
> Priority: Minor
> Attachments: ProcedureV2.pdf
>
>
> Procedure v2 (aka Notification Bus) aims to provide a unified way to build:
> * multi-steps procedure with a rollback/rollforward ability in case of
> failure (e.g. create/delete table)
> ** HBASE-12070
> * notifications across multiple machines (e.g. ACLs/Labels/Quotas cache
> updates)
> ** Make sure that every machine has the grant/revoke/label
> ** Enforce "space limit" quota across the namespace
> ** HBASE-10295 eliminate permanent replication zk node
> * procedures across multiple machines (e.g. Snapshots)
> * coordinated long-running procedures (e.g. compactions, splits, ...)
> * Synchronous calls, with the ability to see the state/result in case of
> failure.
> ** HBASE-11608 sync split
> still work in progress/initial prototype: https://reviews.apache.org/r/27703/