[
https://issues.apache.org/jira/browse/FLINK-11809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Till Rohrmann closed FLINK-11809.
---------------------------------
Resolution: Abandoned
> Implement etcd based StateHandleStore
> -------------------------------------
>
> Key: FLINK-11809
> URL: https://issues.apache.org/jira/browse/FLINK-11809
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: MalcolmSanders
> Priority: Minor
>
> Implement StateHandleStore using jetcd.
> Previously ZooKeeperStateHandleStore stores data in a dfs file while records
> its metadata to a zookeeper node in order to keep the data size of a
> zookeeper node small. EtcdStateHandleStore should work in the same way while
> there is a corner case that should be carefully dealt with using etcd.
> As described in FLINK-6612, the ZooKeeperStateHandleStore does not guard
> against concurrent delete operations which could happen in case of a lost
> leadership and a new leadership grant. The problem is that checkpoint nodes
> can get deleted even after they have been recovered by another
> ZooKeeperCompletedCheckpointStore. This corrupts the recovered checkpoint and
> thwarts future recoveries. In order to guard against deletions of ZooKeeper
> nodes which are still being used by a different ZooKeeperStateHandleStore, a
> locking mechanism has been introduced to make sure that zookeeper nodes are
> allowed to be deleted only after all ZooKeeperStateHandleStores have released
> their lock. The locking mechanism is implemented via ephemeral child nodes of
> the respective ZooKeeper node. Whenever a ZooKeeperStateHandleStore wants to
> lock a ZNode, thus, protecting it from being deleted, it creates an ephemeral
> child node. The node's name is unique to the ZooKeeperStateHandleStore
> instance. The delete operations will then only delete the node if it does not
> have any children associated. In order to guard against orphaned lock nodes,
> they are created as ephemeral nodes. This means that they will be deleted by
> ZooKeeper once the connection of the ZooKeeper client which created the node
> timed out.
> The solution leverages that a zookeeper directory cannot be deleted if it
> still has child nodes and a ephemeral node will be deleted by zookeeper
> server once the corresponding client is disconnected. Since etcd doesn’t have
> hierarchical structure like zookeeper, there is no actual relations between a
> path and its so-called parent directory. Suppose we create a persistent
> key-value to store metadata and then create an ephemeral key-value using
> previous key as the prefix. Once the client disconnects with etcd server, the
> ephemeral key-value pair will be deleted from etcd server while there’ll be
> no effect on its parent key-value pair.
> My proposal to tackle this case is illustrated in Part 6 Implementation of
> etcd based StateHandleStore in [design
> doc|https://docs.google.com/document/d/12-gIZDuT4IOWG7gmwSqNFsOHuGlkdRHge0ahJ7M311Y/edit#heading=h.sqkj9zjvgicu].
> Any comments or suggestions will be appreciated.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)