[ 
https://issues.apache.org/jira/browse/HBASE-12439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302746#comment-14302746
 ] 

Enis Soztutar commented on HBASE-12439:
---------------------------------------

Thanks Matteo. This is good. Similar to what has been discussed in other jiras, 
but with some implementation this time. 
bq. Suggest add JIRA number to doc. Suggest a sentence on how PV2 is NOT FATE. 
Add the word 'idempotent' around this text "...in a way that each step must 
be able to be executed multiple times (generating the same result) a..." 
(although if a rollback, I suppose it is not idempotent?)
The way I see it, FATE execution is a stack, whereas here it is a DAG. FATE 
calls the above "adompotent", since a step can be left partially done or 
failed. So a step should be able to work on top of the result of a partial 
execution from a previous attempt. For example, a step that creates the 
table's directory in HDFS should not fail if the directory is already there. 
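
As a rough illustration of that last example (a sketch only: it uses the local 
filesystem as a stand-in for HDFS, and `create_table_dir` and the `data/` 
layout are made-up names, not anything from the patch), a step that tolerates 
being re-run could look like this:

```python
import os
import tempfile


def create_table_dir(root: str, table: str) -> str:
    """Hypothetical procedure step: make sure the table directory exists."""
    path = os.path.join(root, "data", table)
    # exist_ok=True turns a retry after a partial or completed run into a
    # no-op instead of an error, so the step can safely be executed again.
    os.makedirs(path, exist_ok=True)
    return path


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    first = create_table_dir(root, "t1")
    # Simulate the step being executed again after a master restart.
    second = create_table_dir(root, "t1")
    print(first == second, os.path.isdir(second))
```

The point is only that the step checks for (or tolerates) the state left by an 
earlier attempt rather than assuming it starts from scratch.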

What is the diagram that talks about "branch coordinators"? It does not seem 
to be mentioned in the text. 

I think we should address fencing as a top-level goal, and mention it in the 
state store implementation. If we make it explicit in the store, any 
alternative implementations have to take it into account. Fencing is really 
important because the current master lacks it, and that is a potential cause 
of havoc on the cluster. Proper fencing can only be achieved through the 
store, and only if the active master does a state store operation for every 
action. For example, the master can run a "register master" procedure as a way 
to commit its state and prevent the previous master from doing any more 
operations. I could not see any use of fencing through the WAL (or recover 
lease, etc.) in the patch. 
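
To make the "register master" idea concrete, here is a toy in-memory sketch. 
It assumes the store hands out a monotonically increasing epoch on 
registration and rejects writes that carry an older one; `StateStore`, 
`register_master`, `append` and `FencedError` are hypothetical names for 
illustration, not APIs from the patch.

```python
class FencedError(Exception):
    """Raised when a stale master tries to write to the store."""


class StateStore:
    """Toy store: every write is checked against the current master epoch."""

    def __init__(self):
        self.epoch = 0   # epoch of the master that registered last
        self.log = []    # (epoch, entry) pairs accepted by the store

    def register_master(self) -> int:
        # Bumping the epoch fences off every master that registered earlier.
        self.epoch += 1
        return self.epoch

    def append(self, epoch: int, entry: str) -> None:
        # A write from a master with an old epoch is rejected, not applied.
        if epoch != self.epoch:
            raise FencedError(f"epoch {epoch} is stale, current is {self.epoch}")
        self.log.append((epoch, entry))


if __name__ == "__main__":
    store = StateStore()
    old_master = store.register_master()
    store.append(old_master, "create table t1: step 1")

    new_master = store.register_master()   # failover: a new master takes over
    try:
        store.append(old_master, "create table t1: step 2")
    except FencedError as e:
        print("old master fenced:", e)
    store.append(new_master, "create table t1: step 2")
```

This only works as fencing if every action of the active master goes through 
the store, which is why I think it belongs in the store contract.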

bq. The main problem of using a table is that you end up with the chicken egg 
problem.
This is easy to work around. We can have two state store implementations: a 
smaller-scale ZK-based one for bootstrap, and another for usual operations. 
However, I think we do not need a table yet; the state store can be 
implemented as a region opened in the master. This way, we do not have to 
implement yet another WAL and custom in-memory data structures. Let me 
experiment with this approach on top of this patch. 

I'll take a closer look at the patch as well. 

> Procedure V2
> ------------
>
>                 Key: HBASE-12439
>                 URL: https://issues.apache.org/jira/browse/HBASE-12439
>             Project: HBase
>          Issue Type: New Feature
>          Components: master
>    Affects Versions: 2.0.0
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>            Priority: Minor
>         Attachments: ProcedureV2.pdf, Procedurev2Notification-Bus.pdf
>
>
> Procedure v2 (aka Notification Bus) aims to provide a unified way to build:
> * multi-step procedures with rollback/rollforward ability in case of 
> failure (e.g. create/delete table)
> ** HBASE-12070
> * notifications across multiple machines (e.g. ACLs/Labels/Quotas cache 
> updates)
> ** Make sure that every machine has the grant/revoke/label
> ** Enforce "space limit" quota across the namespace
> ** HBASE-10295 eliminate permanent replication zk node
> * procedures across multiple machines (e.g. Snapshots)
> * coordinated long-running procedures (e.g. compactions, splits, ...)
> * Synchronous calls, with the ability to see the state/result in case of 
> failure.
> ** HBASE-11608 sync split
> still work in progress/initial prototype: https://reviews.apache.org/r/27703/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
