Re: Node being there and not at the same time

Alexander Shraer Thu, 30 Aug 2012 23:05:15 -0700

This sounds like a good idea. I'm not sure how easy it would be to
implement as the client may need to be in a new sort of "conditional" state.


Alex
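
For illustration only, here is a minimal sketch of the policy Bill
suggests in the quoted message below: try a sync() before refusing, and
fail the client outright if it is still ahead afterwards. The Server
interface, ToyServer, the Decision enum and onConnect are all hypothetical
names made up for this sketch; none of them are ZooKeeper classes, and the
real connection path would differ.

```java
public class ReconnectPolicySketch {

    enum Decision { ACCEPT, ACCEPT_AFTER_SYNC, FAIL_CLIENT }

    /** Hypothetical view of a server; these are not ZooKeeper method names. */
    interface Server {
        long lastProcessedZxid();
        void syncWithLeader();
    }

    /** Decide what to do with a connecting client that reports the last zxid it saw. */
    static Decision onConnect(long clientLastZxidSeen, Server server) {
        if (clientLastZxidSeen <= server.lastProcessedZxid()) {
            return Decision.ACCEPT;               // server is at least as current as the client
        }
        server.syncWithLeader();                  // catch up instead of refusing right away
        if (clientLastZxidSeen <= server.lastProcessedZxid()) {
            return Decision.ACCEPT_AFTER_SYNC;
        }
        // Still "in the future" after a sync: something is seriously wrong,
        // so fail the client rather than let it retry forever.
        return Decision.FAIL_CLIENT;
    }

    /** Toy server that lags behind a leader until it syncs. */
    static class ToyServer implements Server {
        long localZxid;
        final long leaderZxid;

        ToyServer(long localZxid, long leaderZxid) {
            this.localZxid = localZxid;
            this.leaderZxid = leaderZxid;
        }

        public long lastProcessedZxid() { return localZxid; }

        public void syncWithLeader() { localZxid = Math.max(localZxid, leaderZxid); }
    }

    public static void main(String[] args) {
        // Server at 0x3, leader at 0x7, client has seen 0x5: the sync is enough.
        System.out.println(onConnect(0x5L, new ToyServer(0x3L, 0x7L)));
        // Client claims 0x9, which even the leader has not reached.
        System.out.println(onConnect(0x9L, new ToyServer(0x3L, 0x7L)));
    }
}
```

The sketch leaves open the question of what state the client is in while
the server catches up.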

On Thu, Aug 30, 2012 at 10:50 PM, Bill Bridge <[email protected]> wrote:

>  Nothing to be sorry about, I was wrong to suggest a client could see an
> old state by reconnecting. When you said that it should not be allowed I
> realized that had to be the case. I saw that email too and realized it had
> something to do with this subject.
>
> It would seem nicer to simply do a sync() when this happens rather than
> refusing the connection. We could destroy the connection if the client is
> still in the future after a sync(). There is something seriously wrong if
> the client is still in the future after a sync(). If this happened with the
> current code the client would just keep trying until the connection finally
> worked and we would not find out that something is wrong. I suppose the
> client's last zxid could have been corrupted in his memory causing this
> problem. It would be good to have this disconnect and fail the client
> rather than spin.
>
> Without the connection you cannot do the sync() yourself. It is
> conceivable that it will be a few seconds before there is another server
> that is current enough to connect with. Maybe the other servers are in
> different data centers and it would not be efficient to connect to them.
>
> Bill
>
> On 8/30/2012 10:21 PM, Alexander Shraer wrote:
>
> Bill,
>
>  I'm sorry - you were right and I totally quoted the wrong place in the
> code. The code that ensures that a client doesn't "go back in time" by
> connecting to a server that is less up to date than that client is most
> probably this one from ZooKeeperServer.java. I realized it after looking at
> Simon's question on the mailing list today...
>
>       if (connReq.getLastZxidSeen() > zkDb.dataTree.lastProcessedZxid) {
>           String msg = "Refusing session request for client "
>               + cnxn.getRemoteSocketAddress()
>               + " as it has seen zxid 0x"
>               + Long.toHexString(connReq.getLastZxidSeen())
>               + " our last zxid is 0x"
>               + Long.toHexString(getZKDatabase().getDataTreeLastProcessedZxid())
>               + " client must try another server";
>           ...
>       }
>
> On Mon, Aug 27, 2012 at 10:22 AM, Bill Bridge <[email protected]> wrote:
>
>> Alex,
>> You certainly know the code much better than I, so I may be mistaken
>> here. It looks to me like waitForEpochAck() is about changes in the set of
>> peers, and is not related to client connect/disconnects. I do not see how
>> this would be called if a client disconnected due to some problem of his
>> own, such as too slow to heartbeat, then reconnected to a different peer or
>> observer.
>>
>> You suggest that a reconnecting client should ensure the new server has
>> seen all transactions that the client has seen. This sounds like the right
>> thing to do. This would certainly eliminate the race condition I
>> postulated. This sounds like the kind of thing someone would have already
>> thought of. If this is not already done then it would be a good change to
>> make. I do not know where the code to do that would be. It could be part of
>> the server reconnect code or it could be a sync() in the client library.
>>
>> If Mattias's code creates a new session when reconnecting, rather than
>> reconnecting to the same session, then he could have the problem described
>> even if reconnect ensures the client is not ahead of the server. He could
>> fix this either by reconnecting to the same session, or simply doing a
>> sync() when necessary.
>>
>> Thanks,
>> Bill
>>
>>
>> On 8/24/2012 6:11 PM, Alexander Shraer wrote:
>>
>>> Bill,  if I understand correctly this shouldn't be possible - the
>>> client will not be able to connect to a server that is
>>> less up-to-date than that same client. So if the create completed at
>>> the client before it disconnects the new server will have to know
>>> about it too otherwise the connection will fail. See
>>> Leader.waitForEpochAck:
>>>
>>> if (ss.isMoreRecentThan(leaderStateSummary)) {
>>>     throw new IOException("Follower is ahead of the leader, leader summary: "
>>>             + leaderStateSummary.getCurrentEpoch()
>>>             + " (current epoch), "
>>>             + leaderStateSummary.getLastZxid()
>>>             + " (last zxid)");
>>> }
>>>
>>> Of course, it's possible that another client connected to a different
>>> server doesn't see the create.
>>>
>>> Alex
>>>
>>>
>>> On Fri, Aug 24, 2012 at 5:15 PM, Bill Bridge <[email protected]> wrote:
>>>
>>>> Mattias,
>>>>
>>>> Is it possible that after you get NODEEXISTS from creation and before you do
>>>> the second getData(), you reconnect to another ZooKeeper instance? If so,
>>>> maybe the new connection is to a follower that has not yet seen the
>>>> creation. If this is what is happening, then a sync() after the second
>>>> NONODE, followed by a third getData(), should work. By only doing the sync()
>>>> when you hit the unusual race condition it will have no performance impact.
>>>>
>>>> Bill
>>>>
>>>>
>>>> On 8/23/2012 8:21 AM, Mattias Persson wrote:
>>>>
>>>>> Hi David,
>>>>>
>>>>> There is nowhere in the code where that node gets deleted. If we set that
>>>>> suspicion aside, could there be something else?
>>>>>
>>>>> 2012/8/23 David Nickerson <[email protected]>
>>>>>
>>>>>> It's a little difficult to guess what your application is doing, but it
>>>>>> sounds like there's "someone else" who can create and delete the nodes
>>>>>> you're trying to work with. So when you create the node and check its data,
>>>>>> someone else might have deleted it before you got the chance to check the
>>>>>> data. The same is true when you check that it exists and then check the
>>>>>> data. You could ensure that the node won't be deleted by using ACLs or
>>>>>> giving the node a sequential ephemeral child.
>>>>>>
>>>>>> On Thu, Aug 23, 2012 at 6:30 AM, Mattias Persson <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I've got a problem that I've seen on only a few occasions and which
>>>>>>> confuses me a bit. Basically I construct a ZooKeeper client (I'm running
>>>>>>> version 3.3.2) where there's a ZK quorum of size 3 running. I get a
>>>>>>> SyncConnected event in a Watcher of mine and in that watcher I do a
>>>>>>> get-or-create(-if-absent) behaviour where I first do a:
>>>>>>>
>>>>>>>     zooKeeper.getData( myPath, false, null );
>>>>>>>
>>>>>>> if that produces a NONODE code I'll try to create it with:
>>>>>>>
>>>>>>>     zooKeeper.create( myPath, smallByteArray, OPEN_ACL_UNSAFE, PERSISTENT );
>>>>>>>
>>>>>>> If that fails with NODEEXISTS code I'll just get it, assuming someone else
>>>>>>> made it before me. What I see from this getData call that I do after
>>>>>>> getting this NODEEXISTS code, which is the same as the first one btw, is
>>>>>>> that I'll get a NONODE code back. A given in this scenario is that I'm 100%
>>>>>>> certain that this node already exists in the quorum at myPath in the first
>>>>>>> place.
>>>>>>>
>>>>>>> Questions:
>>>>>>> 1) How can this happen?
>>>>>>> 2) Do I use ZooKeeper here in an improper way?
>>>>>>> 3) Will a later version fix any potential issue I might have hit?
>>>>>>> 4) What are the guarantees around the state of my ZooKeeper instance after
>>>>>>> I receive a SyncConnected event? Is it fully synced with the master at that
>>>>>>> point, or will a call to sync() be necessary first?
>>>>>>>
>>>>>>> Best,
>>>>>>> Mattias
>>>>>>>
>>>>>>> --
>>>>>>> Mattias Persson, [[email protected]]
>>>>>>> Hacker, Neo Technology
>>>>>>> www.neotechnology.com
>>>>>>>
>>>>>>>
>>>>>
>>
>
>
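
As a rough illustration of the workaround Bill suggests in the thread
above (a sync() after the second NONODE, followed by a third getData()),
here is a minimal, self-contained sketch. The Client interface and
LaggingClient are hypothetical stand-ins invented for this example so it
runs without a quorum; they are not the ZooKeeper API. With the real
client, a missing node is reported through KeeperException.NoNodeException
and sync() is asynchronous, so actual code would need to wait for its
callback before the third read.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class GetOrCreateSketch {

    /** Hypothetical stand-in for the three client calls used here; not the ZooKeeper API. */
    interface Client {
        Optional<byte[]> getData(String path);     // empty result stands for NONODE
        boolean create(String path, byte[] data);  // false stands for NODEEXISTS
        void sync(String path);                    // bring the serving replica up to date
    }

    /** Get the node's data, creating it if absent; sync() only when the rare race is hit. */
    static byte[] getOrCreate(Client zk, String path, byte[] initial) {
        Optional<byte[]> data = zk.getData(path);
        if (data.isPresent()) {
            return data.get();
        }
        if (zk.create(path, initial)) {
            return initial;                        // we created it ourselves
        }
        data = zk.getData(path);                   // NODEEXISTS: someone else made it, read again
        if (data.isPresent()) {
            return data.get();
        }
        zk.sync(path);                             // second NONODE: the replica may be lagging
        return zk.getData(path).orElseThrow(
                () -> new IllegalStateException("node still missing after sync: " + path));
    }

    /** Toy client: creates go to the leader, reads come from a follower that lags until sync(). */
    static class LaggingClient implements Client {
        final Map<String, byte[]> leader = new HashMap<>();
        final Map<String, byte[]> follower = new HashMap<>();
        int syncCalls = 0;

        public Optional<byte[]> getData(String path) {
            return Optional.ofNullable(follower.get(path));
        }

        public boolean create(String path, byte[] data) {
            return leader.putIfAbsent(path, data) == null;
        }

        public void sync(String path) {
            syncCalls++;
            follower.putAll(leader);
        }
    }

    public static void main(String[] args) {
        LaggingClient zk = new LaggingClient();
        zk.leader.put("/myPath", new byte[] {42});  // another client already created the node
        byte[] result = getOrCreate(zk, "/myPath", new byte[] {1});
        System.out.println(result[0] + " after " + zk.syncCalls + " sync call(s)");
    }
}
```

Because the sync() sits on the fallback path only, the common case (node
already visible, or created by this client) pays nothing extra.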
