[
https://issues.apache.org/jira/browse/HBASE-12439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303017#comment-14303017
]
Matteo Bertozzi commented on HBASE-12439:
-----------------------------------------
{quote}FATE calls the above "adompotent", since the step can be partially
done or failed. So the step should work over the result of a partial execution
from a previous run. For example, a step for creating a dir for the table in
hdfs should not fail if the directory is already there.{quote}
Here the logic is the same: once you execute a step, if there is a
non-retryable failure in the code, a rollback step will be called.
The logic to revert a partial step is the responsibility of the
execute()/rollback() implementation, not of the framework. The framework only
knows whether a step is supposed to be executed or rolled back; it has no
knowledge of what you are doing.
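As a rough illustration of that split of responsibilities, here is a minimal
sketch. The names Step, CreateTableDirStep and StepRunner are hypothetical and
are not the Procedure V2 API from the patch; the local filesystem stands in
for hdfs. The runner only decides whether to execute or roll back, while the
step itself tolerates the leftovers of a partial earlier attempt.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical interface for illustration only (not the Procedure V2 API).
interface Step {
    void execute() throws Exception;
    void rollback() throws Exception;
}

// A step that can safely run again on top of a partial previous execution.
final class CreateTableDirStep implements Step {
    private final Path dir;

    CreateTableDirStep(Path dir) {
        this.dir = dir;
    }

    @Override
    public void execute() throws IOException {
        // Does not fail if the directory is already there.
        Files.createDirectories(dir);
    }

    @Override
    public void rollback() throws IOException {
        // Does not fail if the directory was never created or is already gone.
        Files.deleteIfExists(dir);
    }
}

// The "framework": it knows nothing about what the steps actually do.
final class StepRunner {
    // Returns true if every step executed, false if the steps were rolled back.
    static boolean run(List<Step> steps) throws Exception {
        Deque<Step> started = new ArrayDeque<>();
        for (Step step : steps) {
            // Track the step before executing it: it may fail half-way through.
            started.push(step);
            try {
                step.execute();
            } catch (Exception e) {
                // Undo in reverse order, including the step that just failed.
                while (!started.isEmpty()) {
                    started.pop().rollback();
                }
                return false;
            }
        }
        return true;
    }
}
```

Calling execute() twice on CreateTableDirStep leaves the same directory in
place, and StepRunner.run() removes it again if a later step throws. A real
framework would also persist which step it is on so that it can resume after a
crash; that part is left out of this sketch.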
{quote}I think we should address fencing as a first level goal, and mention it
in the state store implementation. If we make it explicit in the store,
alternative implementations, if any, have to take that into account.{quote}
Agreed, but I'm not yet at this point. I'm still making sure that
execution/rollback works as expected.
{quote}This is easy to work around. We can have two state store implementations.
One is a smaller scale zk based one, for doing bootstrap. The other is for
usual operations. However, I think we still do not need a table yet, but a
state store can be implemented as a region opened in master. This way, we do
not have to re-implement yet another wal, and custom in-memory data structures.
Let me experiment with this approach on top of this patch.{quote}
The reason I chose the WAL was to support assignment: all the logged events
will probably trigger too many flushes and compactions, and we don't really
need this data to be compacted. But maybe simple tuning on the region to avoid
compaction and rely on TTL would be just fine and avoid the problem. I didn't
look into it too much; if you have time to experiment with it, feel free to
post a patch or just suggestions on how to change it.
> Procedure V2
> ------------
>
> Key: HBASE-12439
> URL: https://issues.apache.org/jira/browse/HBASE-12439
> Project: HBase
> Issue Type: New Feature
> Components: master
> Affects Versions: 2.0.0
> Reporter: Matteo Bertozzi
> Assignee: Matteo Bertozzi
> Priority: Minor
> Attachments: ProcedureV2.pdf, Procedurev2Notification-Bus.pdf
>
>
> Procedure v2 (aka Notification Bus) aims to provide a unified way to build:
> * multi-step procedures with rollback/rollforward ability in case of
> failure (e.g. create/delete table)
> ** HBASE-12070
> * notifications across multiple machines (e.g. ACLs/Labels/Quotas cache
> updates)
> ** Make sure that every machine has the grant/revoke/label
> ** Enforce "space limit" quota across the namespace
> ** HBASE-10295 eliminate permanent replication zk node
> * procedures across multiple machines (e.g. Snapshots)
> * coordinated long-running procedures (e.g. compactions, splits, ...)
> * Synchronous calls, with the ability to see the state/result in case of
> failure.
> ** HBASE-11608 sync split
> Still a work in progress/initial prototype: https://reviews.apache.org/r/27703/