[ 
https://issues.apache.org/jira/browse/HBASE-12439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207125#comment-14207125
 ] 

Nick Dimiduk commented on HBASE-12439:
--------------------------------------

This will be a huge feature for MTTR and online reliability -- why the Minor 
label?

I'm not clear on some of the abstractions. Please confirm whether each of the 
observations below is true or false.
 - client submits a "procedure" that it's interested in observing through its 
execution progress
 - a procedure is defined as a DAG of sub-procedures that are required to 
complete procedure execution
 - multiple sub-procedures can be executed in parallel
 - a sub-procedure can define an action that must be taken on multiple hosts
 - DAG execution progress is tracked through a storage system
 - procedure execution can be halted and reverted at any time
 - completed DAG sub-procedures must be able to roll back in the event of 
procedure revert
 - procedure execution is tied to transitions through a persisted state machine
 - all procedures have the same set of states through which they can transition
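
To make the observations above concrete, here is a minimal Java sketch of the 
model as I currently read it. It is not taken from the patch or the attached 
design doc: the names ProcedureSketch, Step, State and store are hypothetical, 
the "store" is an in-memory list standing in for whatever persistence is 
actually used, and the DAG is reduced to a linear chain of steps for brevity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class ProcedureSketch {
    /** Hypothetical states; whether all procedures share one set is an open question. */
    enum State { FINISHED, ROLLEDBACK }

    /** Hypothetical sub-procedure: a unit of work plus its compensating action. */
    interface Step {
        String name();
        void execute() throws Exception;
        void rollback();
    }

    /** In-memory stand-in for the persisted procedure state store (assumption). */
    static final List<String> store = new ArrayList<>();

    /** Runs steps in order; on failure, undoes completed steps in reverse order. */
    static State run(List<Step> steps) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.execute();
                completed.push(step);
                store.add("done:" + step.name());
            } catch (Exception e) {
                store.add("failed:" + step.name());
                while (!completed.isEmpty()) {
                    Step done = completed.pop();
                    done.rollback();
                    store.add("undone:" + done.name());
                }
                return State.ROLLEDBACK;
            }
        }
        return State.FINISHED;
    }

    /** Test helper: a step that optionally fails and has a no-op rollback. */
    static Step step(final String id, final boolean fail) {
        return new Step() {
            public String name() { return id; }
            public void execute() throws Exception {
                if (fail) throw new Exception(id + " failed");
            }
            public void rollback() { }
        };
    }

    public static void main(String[] args) {
        State result = run(Arrays.asList(
            step("a", false), step("b", false), step("c", true)));
        // prints: ROLLEDBACK [done:a, done:b, failed:c, undone:b, undone:a]
        System.out.println(result + " " + store);
    }
}
```

If this reading is right, every transition recorded in the store would need to 
be durable before the next step runs, which is why the choice of store matters.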

Why implement a separate store? Can we not use a system table for the procedure 
state store?

> Procedure V2
> ------------
>
>                 Key: HBASE-12439
>                 URL: https://issues.apache.org/jira/browse/HBASE-12439
>             Project: HBase
>          Issue Type: New Feature
>          Components: master
>    Affects Versions: 2.0.0
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>            Priority: Minor
>         Attachments: ProcedureV2.pdf
>
>
> Procedure v2 (aka Notification Bus) aims to provide a unified way to build:
> * multi-step procedures with a rollback/rollforward ability in case of 
> failure (e.g. create/delete table)
> ** HBASE-12070
> * notifications across multiple machines (e.g. ACLs/Labels/Quotas cache 
> updates)
> ** Make sure that every machine has the grant/revoke/label
> ** Enforce "space limit" quota across the namespace
> ** HBASE-10295 eliminate permanent replication zk node
> * procedures across multiple machines (e.g. Snapshots)
> * coordinated long-running procedures (e.g. compactions, splits, ...)
> * Synchronous calls, with the ability to see the state/result in case of 
> failure.
> ** HBASE-11608 sync split
> still work in progress/initial prototype: https://reviews.apache.org/r/27703/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
