Re: adding a separate thread to detect network timeouts faster

Jeremy Stribling Tue, 10 Sep 2013 23:33:25 -0700

Hi Germán,

A very quick scan of that JIRA makes me think you're talking about
server->server heartbeats, and not client->server heartbeats (which is
what I'm talking about). I have not tested it explicitly or inspected
that part of the code, but I've hit many cases in testing and production
where client session expirations coincide with long fsync times as
logged by the server.


Jeremy

On 09/10/2013 10:40 PM, German Blanco wrote:

Hello Jeremy and all,

my understanding was that the current implementation of ping handling
already does not wait on disk IO.
I am even working on a JIRA issue that is related to this:
https://issues.apache.org/jira/browse/ZOOKEEPER-87
And I have also made some tests that seem to confirm that ping handling is
done in a different thread than transaction handling.
But actually, I don't have any confirmation from any person in this
project. Are you sure that ping handling waits on IO for anything? Have you
tested it?

Regards,
Germán Blanco.



On Tue, Sep 10, 2013 at 11:05 PM, Jeremy Stribling <[email protected]> wrote:

Good suggestion, thanks.  At the very least, I think what we have in mind
would be off by default, so users would turn it on only if they know they
have relatively few clients and slow disks.  An adaptive scheme would be
even better, obviously.


On 09/10/2013 02:04 PM, Ted Dunning wrote:

Perhaps you should be suggesting a design that is adaptive rather than
configured and guarantees low overhead at the cost of notification time in
extreme scenarios.

For instance, the server can send no more than 1000 (or whatever number)
HB's per second and never more than one per second to any client.  This
caps the cost nicely.
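
A minimal sketch of the cap described above, assuming a fixed one-second window for the global limit and a per-client minimum interval. The class name `HeartbeatBudget` and its parameters are hypothetical and do not come from ZooKeeper; the caller passes the current time explicitly so the behaviour is easy to check.

```python
class HeartbeatBudget:
    """Allow at most `global_per_sec` heartbeats per one-second window,
    and at most one heartbeat per `per_client_interval` seconds per client."""

    def __init__(self, global_per_sec=1000, per_client_interval=1.0):
        self.global_per_sec = global_per_sec
        self.per_client_interval = per_client_interval
        self.window_start = 0.0   # start time of the current window
        self.sent_in_window = 0   # heartbeats sent in the current window
        self.last_sent = {}       # client id -> time of its last heartbeat

    def try_send(self, client_id, now):
        """Return True if a heartbeat to `client_id` is allowed at time `now`."""
        # Open a fresh window once the previous one has elapsed.
        if now - self.window_start >= 1.0:
            self.window_start = now
            self.sent_in_window = 0
        # Global cap: refuse once the window's budget is spent.
        if self.sent_in_window >= self.global_per_sec:
            return False
        # Per-client cap: refuse if this client was served too recently.
        last = self.last_sent.get(client_id)
        if last is not None and now - last < self.per_client_interval:
            return False
        self.sent_in_window += 1
        self.last_sent[client_id] = now
        return True
```

With this shape, the heartbeat cost is bounded by the global budget no matter how many clients connect; the price is that some clients wait longer for a heartbeat when the budget is exhausted.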



On Tue, Sep 10, 2013 at 1:59 PM, Ted Dunning <[email protected]> wrote:


     Since you are talking about client connection failure detection,
     no, I don't think that there is a major barrier other than
     actually implementing a reliable check.

     Keep in mind the cost.  There are ZK installs with 100,000
     clients.  If these are heartbeating every 2 seconds, you have
     50,000 packets per second hitting the quorum or 10,000 per server
     if all connections are well balanced.

     If you only have 10 clients, the network burden is nominal.
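
The arithmetic behind these figures can be sketched as below. It assumes one packet per heartbeat and an evenly balanced five-server ensemble; the server count is inferred from the 10,000-per-server figure and is not stated in the thread. The function name `heartbeat_load` is hypothetical.

```python
def heartbeat_load(clients, interval_s, servers):
    """Return (total packets/sec, packets/sec per server), assuming one
    packet per heartbeat and connections spread evenly across servers."""
    total = clients / interval_s
    return total, total / servers


# 100,000 clients heartbeating every 2 seconds across 5 servers (assumed).
large_total, large_per_server = heartbeat_load(100_000, 2, 5)

# 10 clients under the same assumptions.
small_total, small_per_server = heartbeat_load(10, 2, 5)
```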



     On Tue, Sep 10, 2013 at 1:34 PM, Jeremy Stribling
     <[email protected]> wrote:

         I mostly agree, but let's assume that a ~5x speedup in
         detecting those types of failures is considered significant
         for some people. Are there technical reasons that would
         prevent this idea from working?

         On 09/10/2013 01:31 PM, Ted Dunning wrote:

             I don't see the strong value here.  A few failures would be
             detected more quickly, but I am not convinced that this would
             actually improve functionality significantly.


             On Tue, Sep 10, 2013 at 1:01 PM, Jeremy Stribling
             <[email protected]> wrote:

                 Hi all,

                 Let's assume that you wanted to deploy ZK in a virtualized
                 environment, despite all of the known drawbacks.  Assume we
                 could deploy it such that the ZK servers were all using
                 independent CPUs and storage (though not dedicated disks).
                 Obviously, the shared disks (shared with other, non-ZK VMs
                 on the same hypervisor) will cause ZK to hit the default
                 session timeout occasionally, so you would need to raise
                 the existing session timeout to something like 30 seconds.

                 I'm curious if there would be any technical drawbacks to
                 adding an additional heartbeat mechanism between the
                 clients and the servers, which would have the goal of
                 detecting network-only failures faster than the existing
                 heartbeat mechanism.  The idea is that there would be a new
                 thread dedicated to processing these heartbeats, which
                 would not get blocked on I/O.  Then the clients could
                 configure a second, smaller timeout value, and it would be
                 assumed that any such timeout indicated a real problem.
                 The existing mechanism would still be in place to catch
                 I/O-related errors.
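
A rough sketch of the client-side logic this proposal implies, with two independent timeouts. Everything here is hypothetical: the class name `DualTimeoutMonitor`, the status strings, and the 6-second network timeout are illustrative only, and none of it reflects the existing ZooKeeper client. The sketch assumes that a reply on the regular session path also proves the network is reachable.

```python
class DualTimeoutMonitor:
    """Track two heartbeat streams: a fast one that never waits on disk,
    and the regular session ping that may be delayed by slow I/O."""

    def __init__(self, network_timeout, session_timeout, start=0.0):
        self.network_timeout = network_timeout
        self.session_timeout = session_timeout
        self.last_network = start   # last reply on the fast heartbeat path
        self.last_session = start   # last reply on the regular ping path

    def on_network_heartbeat(self, now):
        self.last_network = now

    def on_session_ping(self, now):
        # A regular ping reply also shows the network path is alive.
        self.last_session = now
        self.last_network = now

    def status(self, now):
        # The smaller timeout fires first and signals a real outage.
        if now - self.last_network > self.network_timeout:
            return "network_failure"
        # The larger timeout still catches I/O-related stalls.
        if now - self.last_session > self.session_timeout:
            return "session_timeout"
        return "ok"
```

For example, a client configured with `DualTimeoutMonitor(6.0, 30.0)` would fail over after about 6 seconds of total silence, but would tolerate up to 30 seconds of delayed session pings as long as the fast heartbeats keep arriving.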

                 I understand the philosophy that there should be some
                 heartbeat mechanism that takes the disk into account, but
                 I'm having trouble coming up with technical reasons not to
                 add a second mechanism.  Obviously, the advantage would be
                 that the clients could detect network failures and system
                 crashes more quickly in an environment with slow disks, and
                 fail over to other servers more quickly.  The only
                 disadvantages I can come up with are:

                 1) More code complexity, and slightly more heartbeat
                    traffic on the wire
                 2) I think the servers have to log session expirations to
                    disk, so if the sessions expire at a faster rate than
                    the disk can handle, it might lead to a large backlog.

                 Are there other drawbacks I am missing?  Would a patch that
                 added something like this be considered, or is it dead from
                 the start?  Thanks,

                 Jeremy
