Nothing to be sorry about. I was wrong to suggest a client could see an old state by reconnecting. When you said that it should not be allowed, I realized that had to be the case. I saw that email too and realized it had something to do with this subject.

It would seem nicer to simply do a sync() when this happens rather than refusing the connection. We could destroy the connection if the client is still in the future after the sync(), since that would mean something is seriously wrong. With the current code the client would just keep retrying until a connection finally worked, and we would never find out that something is wrong. I suppose the client's last zxid could have been corrupted in its memory, causing this problem. It would be better to disconnect and fail the client than to let it spin.

Without the connection you cannot do the sync() yourself. It is conceivable that it will be a few seconds before there is another server that is current enough to connect to. Maybe the other servers are in different data centers and it would not be efficient to connect to them.
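
The policy proposed above (sync first, then fail the client only if it is still ahead) can be sketched roughly as follows. This is an illustration only and not ZooKeeper code: the `Server` interface, `syncWithLeader()`, `ToyServer`, and the `Decision` values are all hypothetical names made up for this sketch.

```java
public class ConnectPolicySketch {
    /** Possible outcomes of a connection attempt (hypothetical names). */
    enum Decision { ACCEPT, ACCEPT_AFTER_SYNC, FAIL_CLIENT }

    /** Hypothetical view of a server; not a real ZooKeeper interface. */
    interface Server {
        long lastProcessedZxid();
        void syncWithLeader(); // assumed to block until the server has caught up
    }

    /** Sync instead of refusing; fail the client only if it is still ahead. */
    static Decision onConnect(long clientLastZxid, Server server) {
        if (clientLastZxid <= server.lastProcessedZxid()) {
            return Decision.ACCEPT;
        }
        server.syncWithLeader();
        if (clientLastZxid <= server.lastProcessedZxid()) {
            return Decision.ACCEPT_AFTER_SYNC;
        }
        // Still "in the future" after a sync: something is seriously wrong,
        // for example a corrupted last zxid on the client side.
        return Decision.FAIL_CLIENT;
    }

    /** Toy server whose local state lags the leader until it syncs. */
    static class ToyServer implements Server {
        long local;
        final long leader;
        ToyServer(long local, long leader) { this.local = local; this.leader = leader; }
        public long lastProcessedZxid() { return local; }
        public void syncWithLeader() { local = leader; }
    }

    public static void main(String[] args) {
        System.out.println(onConnect(5, new ToyServer(7, 9)));  // server already ahead
        System.out.println(onConnect(8, new ToyServer(7, 9)));  // server catches up
        System.out.println(onConnect(12, new ToyServer(7, 9))); // client ahead of leader
    }
}
```

The third case is the one that would otherwise spin forever under a refuse-and-retry scheme; here it is surfaced as an explicit failure.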

Bill

On 8/30/2012 10:21 PM, Alexander Shraer wrote:
Bill,

I'm sorry - you were right and I quoted the wrong place in the code entirely. The code that ensures that a client doesn't "go back in time" by connecting to a server that is less up to date than the client is most probably this snippet from ZooKeeperServer.java. I realized it after looking at Simon's question on the mailing list today...

if (connReq.getLastZxidSeen() > zkDb.dataTree.lastProcessedZxid)
    String msg = "Refusing session request for client "
        + cnxn.getRemoteSocketAddress()
        + " as it has seen zxid 0x"
        + Long.toHexString(connReq.getLastZxidSeen())
        + " our last zxid is 0x"
        + Long.toHexString(getZKDatabase().getDataTreeLastProcessedZxid())
        + " client must try another server";
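
For readers who want to try the comparison in isolation, here is a small standalone sketch of the same check. It is not the ZooKeeperServer code: `refusalMessage` and `zxid` are hypothetical helpers written for this example. The only outside fact relied on is that a zxid is a 64-bit number with the epoch in the high 32 bits and a counter in the low 32 bits, so a plain numeric comparison orders transactions.

```java
public class RefusalCheckSketch {
    /** Builds a zxid from an epoch (high 32 bits) and a counter (low 32 bits). */
    static long zxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    /**
     * Returns the refusal message if the client has seen a newer zxid than
     * the server has processed, or null if the connection can be accepted.
     * Hypothetical helper; the message text mirrors the snippet quoted above.
     */
    static String refusalMessage(String clientAddr, long lastZxidSeen, long serverLastZxid) {
        if (lastZxidSeen > serverLastZxid) {
            return "Refusing session request for client " + clientAddr
                + " as it has seen zxid 0x" + Long.toHexString(lastZxidSeen)
                + " our last zxid is 0x" + Long.toHexString(serverLastZxid)
                + " client must try another server";
        }
        return null;
    }

    public static void main(String[] args) {
        // Client has seen counter 5 in epoch 1; server has only processed counter 3.
        System.out.println(refusalMessage("client-a", zxid(1, 5), zxid(1, 3)));
        // Client is not ahead of the server, so no refusal.
        System.out.println(refusalMessage("client-b", zxid(1, 3), zxid(1, 3)));
    }
}
```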


On Mon, Aug 27, 2012 at 10:22 AM, Bill Bridge <[email protected]> wrote:

    Alex,
    You certainly know the code much better than I, so I may be
    mistaken here. It looks to me like waitForEpochAck() is about
    changes in the set of peers, and is not related to client
    connect/disconnects. I do not see how this would be called if a
    client disconnected due to some problem of his own, such as too
    slow to heartbeat, then reconnected to a different peer or observer.

    You suggest that a reconnecting client should ensure the new
    server has seen all transactions that the client has seen. This
    sounds like the right thing to do. This would certainly eliminate
    the race condition I postulated. This sounds like the kind of
    thing someone would have already thought of. If this is not
    already done then it would be a good change to make. I do not know
    where the code to do that would be. It could be part of the server
    reconnect code or it could be a sync() in the client library.

    If Mattias's code creates a new session when reconnecting, rather
    than reconnecting to the same session, then he could have the
    problem described even if reconnect ensures the client is not
    ahead of the server. He could fix this either by reconnecting to
    the same session, or simply doing a sync() when necessary.

    Thanks,
    Bill


    On 8/24/2012 6:11 PM, Alexander Shraer wrote:

        Bill, if I understand correctly this shouldn't be possible - the
        client will not be able to connect to a server that is less
        up-to-date than that same client. So if the create completed at
        the client before it disconnects, the new server will have to know
        about it too, otherwise the connection will fail. See
        Leader.waitForEpochAck:

        if (ss.isMoreRecentThan(leaderStateSummary)) {
            throw new IOException("Follower is ahead of the leader, leader summary: "
                    + leaderStateSummary.getCurrentEpoch()
                    + " (current epoch), "
                    + leaderStateSummary.getLastZxid()
                    + " (last zxid)");
        }

        Of course it's possible that another client connected to a
        different server doesn't see the create.

        Alex


        On Fri, Aug 24, 2012 at 5:15 PM, Bill Bridge <[email protected]> wrote:

            Mattias,

            Is it possible that after you get NODEEXISTS from creation and
            before you do the second getData(), you reconnect to another
            ZooKeeper instance? If so, maybe the new connection is to a
            follower that has not yet seen the creation. If this is what is
            happening, then a sync() after the second NONODE with a third
            getData() should work. By only doing the sync() when you hit the
            unusual race condition it will have no performance impact.
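
            A rough sketch of that retry order is below. To stay
            self-contained it does not use the real ZooKeeper client: the
            `Client` interface and the `LaggingFollower` toy are hypothetical
            stand-ins, with NONODE modelled as a null return and NODEEXISTS
            as a false return. Note also that the real ZooKeeper.sync() is
            asynchronous (it takes a path, a VoidCallback and a context
            object), whereas this sketch assumes a blocking call.

```java
import java.util.HashMap;
import java.util.Map;

public class GetOrCreateSketch {
    /** Hypothetical stand-in for the few client calls discussed in this thread. */
    interface Client {
        byte[] getData(String path);              // null models NONODE
        boolean create(String path, byte[] data); // false models NODEEXISTS
        void sync(String path);                   // assumed blocking in this sketch
    }

    /** Get-or-create that only pays for a sync() when the race is actually hit. */
    static byte[] getOrCreate(Client zk, String path, byte[] initial) {
        byte[] data = zk.getData(path);
        if (data != null) return data;
        if (zk.create(path, initial)) return initial;
        // NODEEXISTS: someone else created the node, so read it.
        data = zk.getData(path);
        if (data != null) return data;
        // Second NONODE: our server may be behind. Sync, then read a third time.
        zk.sync(path);
        data = zk.getData(path);
        if (data == null) {
            throw new IllegalStateException("node still missing after sync: " + path);
        }
        return data;
    }

    /** Toy follower: reads are local and may be stale, creates go to the leader. */
    static class LaggingFollower implements Client {
        final Map<String, byte[]> leader = new HashMap<>();
        final Map<String, byte[]> local = new HashMap<>();
        int syncCalls = 0;

        public byte[] getData(String path) { return local.get(path); }

        public boolean create(String path, byte[] data) {
            if (leader.containsKey(path)) return false;
            leader.put(path, data);
            local.put(path, data);
            return true;
        }

        public void sync(String path) {
            syncCalls++;
            local.putAll(leader);
        }
    }

    public static void main(String[] args) {
        LaggingFollower zk = new LaggingFollower();
        zk.leader.put("/node", new byte[] {42}); // exists in the quorum, not yet locally
        byte[] data = getOrCreate(zk, "/node", new byte[] {1});
        System.out.println("value=" + data[0] + " syncCalls=" + zk.syncCalls);
    }
}
```

            In the common case the first or second getData() succeeds and
            sync() is never called.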

            Bill


            On 8/23/2012 8:21 AM, Mattias Persson wrote:

                Hi David,

                There is nowhere in the code where that node gets
                deleted. If we refrain
                from that suspicion, could there be something else?

                2012/8/23 David Nickerson <[email protected]>

                    It's a little difficult to guess what your application is
                    doing, but it sounds like there's "someone else" who can
                    create and delete the nodes you're trying to work with. So
                    when you create the node and check its data, someone else
                    might have deleted it before you got the chance to check
                    the data. The same is true when you check that it exists
                    and then check the data. You could ensure that the node
                    won't be deleted by using ACLs or giving the node a
                    sequential ephemeral child.

                    On Thu, Aug 23, 2012 at 6:30 AM, Mattias Persson <[email protected]> wrote:

                        Hi,

                        I've got a problem that I've seen at only a
                        few occasions and which
                        confuses me a bit. Basically I construct a
                        ZooKeeper client (I'm running
                        version 3.3.2) where there's a ZK quorum of
                        size 3 running. I get a
                        SyncConnected event in a Watcher of mine and
                        in that watcher I do a
                        get-or-create(-if-absent) behaviour where I
                        first do a:

                            zooKeeper.getData( myPath, false, null );

                        if that produces a NONODE code I'll try to
                        create it with:

                            zooKeeper.create( myPath, smallByteArray, OPEN_ACL_UNSAFE, PERSISTENT );

                        If that fails with a NODEEXISTS code I'll just get it,
                        assuming someone else made it before me. What I see
                        from the getData call that I do after getting this
                        NODEEXISTS code, which is the same as the first one
                        btw, is that I get a NONODE code back. A given in this
                        scenario is that I'm 100% certain that this node
                        already exists in the quorum at myPath in the first
                        place.

                        Questions:
                        1) How can this happen?
                        2) Do I use ZooKeeper here in an improper way?
                        3) Will a later version fix any potential
                        issue I might have hit?
                        4) What are the guarantees around the state of my
                        ZooKeeper instance after I receive a SyncConnected
                        event? Is it fully synced with the master at that
                        point, or will a call to sync() be necessary first?

                        Best,
                        Mattias

                        --
                        Mattias Persson, [[email protected]]
                        Hacker, Neo Technology
                        www.neotechnology.com




