[
https://issues.apache.org/jira/browse/FLINK-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701780#comment-16701780
]
Stephan Ewen commented on FLINK-10333:
--------------------------------------
Forwarding a discussion with the Curator folks from a while ago:
{code}
Hi Curators!
We are using the curator LeaderLatch recipe for leader election - so far works
great!
We need to additionally attach a "fencing token" to the leader latch -
basically a random unique ID that can be used to identify actions taken under
the assumption of a certain node holding the leader latch. This token should be
retrieved atomically with the leader status and live and die with the leader
status.
It seems natural to use the leader latch znode for that.
We were thinking to either use the latch znode's zxid, or to write some bytes
to the znode (or to attach a child node to it).
The recipe does not provide any access to the znode, however. What would you
recommend to do here? Is it possible to add (read only) access to the znode in
the LeaderLatch?
{code}
{code}
There’s a constructor in LeaderLatch that takes an “id”. That id is the payload
for the lock node.
public LeaderLatch(CuratorFramework client, String latchPath, String id)
You can call getParticipants() to get the IDs of all participants in the latch.
{code}
{code}
From what I understood from the API docs, that id is the same every time that
specific contender becomes leader.
What I am looking for is a unique value for every time leader status changes
(which is when a different znode gets created). We need that to separate
actions taken by a participant across being leader, losing leadership, and
re-gaining its leadership.
{code}
{code}
Why not just create a new LeaderLatch each time? Other than that, a PR with a
settable id would be nice.
{code}
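The per-term latch suggested above pairs naturally with a per-term fencing token: each time a participant (re)gains leadership it mints a fresh unique ID, and actions carrying a stale token can be rejected. A minimal pure-Java sketch of just the token bookkeeping (the Curator `LeaderLatch`/`LeaderLatchListener` wiring is omitted; the class and method names here are illustrative, not part of Curator):

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicReference;

/** Tracks a fencing token that lives and dies with leadership status. */
class FencedLeadership {
    private final AtomicReference<UUID> currentToken = new AtomicReference<>(null);

    /** Call when leadership is acquired, e.g. from LeaderLatchListener#isLeader(). */
    public UUID onLeadershipAcquired() {
        UUID token = UUID.randomUUID(); // fresh token for this leadership term
        currentToken.set(token);
        return token;
    }

    /** Call when leadership is lost, e.g. from LeaderLatchListener#notLeader(). */
    public void onLeadershipLost() {
        currentToken.set(null);
    }

    /** An action is valid only if it carries the token of the current term. */
    public boolean isValid(UUID actionToken) {
        UUID current = currentToken.get();
        return current != null && current.equals(actionToken);
    }
}
```

If a participant loses and later re-gains leadership, the two terms get different tokens, so actions started under the old term are fenced out even though the participant's latch id is unchanged.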
> Rethink ZooKeeper based stores (SubmittedJobGraph, MesosWorker,
> CompletedCheckpoints)
> -------------------------------------------------------------------------------------
>
> Key: FLINK-10333
> URL: https://issues.apache.org/jira/browse/FLINK-10333
> Project: Flink
> Issue Type: Bug
> Components: Distributed Coordination
> Affects Versions: 1.5.3, 1.6.0, 1.7.0
> Reporter: Till Rohrmann
> Priority: Major
> Fix For: 1.8.0
>
>
> While going over the ZooKeeper based stores
> ({{ZooKeeperSubmittedJobGraphStore}}, {{ZooKeeperMesosWorkerStore}},
> {{ZooKeeperCompletedCheckpointStore}}) and the underlying
> {{ZooKeeperStateHandleStore}} I noticed several inconsistencies which were
> introduced with past incremental changes.
> * Depending on whether {{ZooKeeperStateHandleStore#getAllSortedByNameAndLock}}
> or {{ZooKeeperStateHandleStore#getAllAndLock}} is called, deserialization
> problems either lead to the znode being removed or not
> * {{ZooKeeperStateHandleStore}} leaves inconsistent state in case of
> exceptions (e.g. {{#getAllAndLock}} won't release the acquired locks in case
> of a failure)
> * {{ZooKeeperStateHandleStore}} has too many responsibilities. It would be
> better to move {{RetrievableStateStorageHelper}} out of it for a better
> separation of concerns
> * {{ZooKeeperSubmittedJobGraphStore}} overwrites a stored {{JobGraph}} even
> if it is locked. This should not happen since it could leave another system
> in an inconsistent state (imagine a changed {{JobGraph}} which restores from
> an old checkpoint)
> * Redundant but also somewhat inconsistent put logic in the different stores
> * Shadowing of ZooKeeper specific exceptions in {{ZooKeeperStateHandleStore}}
> which were expected to be caught in {{ZooKeeperSubmittedJobGraphStore}}
> * Getting rid of the {{SubmittedJobGraphListener}} would be helpful
> These problems made me question how reliably these components actually work.
> Since these components are very important, I propose to refactor them.
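For the overwrite-while-locked bullet, the desired behavior can be sketched generically: a put should fail (or be a no-op) when another holder has locked the entry, rather than silently replacing it. A hypothetical in-memory sketch of that contract (names are illustrative; this is not the actual {{ZooKeeperStateHandleStore}} API, and real ZooKeeper locking uses ephemeral lock znodes rather than a local set):

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative store that refuses to overwrite locked entries. */
class LockedStore<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Set<K> locked = ConcurrentHashMap.newKeySet();

    public void lock(K key) { locked.add(key); }

    public void unlock(K key) { locked.remove(key); }

    /** Returns false instead of overwriting an existing, locked entry. */
    public boolean put(K key, V value) {
        if (locked.contains(key) && entries.containsKey(key)) {
            return false; // entry is locked by a holder; refuse to overwrite
        }
        entries.put(key, value);
        return true;
    }

    public V get(K key) { return entries.get(key); }
}
```

Under this contract, a changed {{JobGraph}} could never replace a locked one behind the back of the component holding the lock.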
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)