Re: [VOTE] Re: Help detecting client disconnects for network server

Samuel Andrew McIntyre 12 Oct 2004 07:19:44 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Oct 11, 2004, at 4:38 PM, Jan Hlavatý wrote:

> You seem to misunderstand keepalive mechanism.

I admit that exactly what keepalive does and how it interacts with SoTimeout is confusing. :) Neither the javadoc nor the available Unix documentation for the native methods is especially clear, and it wasn't until I tried each option with various settings and looked at how the underlying timer was implemented that I got a grasp on what these options really do.

I'm +1 to enabling keepalive by default. At this point, I'm just trying to understand what would be the problem with making it a configurable option. Here are the reasons why I think it's a good thing for us to make it configurable:

a) keepalive is opaque. Because we cannot get or set the parameters which control its behavior, we don't know exactly what the behavior will be when it's turned on. So, it might be desirable to turn it off, in the interest of consistency and having exact control over the behavior of the application.

b) the implementation of keepalive is OS-dependent. That implementation could be buggy, or worse, non-existent. This compounds (a), where you can't know exactly what the behavior will be when you enable it. So, another plus for control of behavior and for configurable timeouts on the server end where the mechanism is defined by us and completely in our domain of control.

c) interaction with SoTimeout can lead to unexpected behavior if keepalive is not configurable. As described earlier, SoTimeout(0) doesn't do what we expect if we can't turn off keepalive. The same applies if the machine's keepalive timeout plus its probe intervals add up to less than the SoTimeout value. But because of (a), we can never tell whether SoTimeout is greater than the keepalive timeout. So, another plus one for configurable timeouts on the server end.

d) keepalive could be keeping a zombie connection alive. I've personally dealt with two systems in the last few months that suffered a catastrophic disk/filesystem problem but where the network layer of the kernel was still active and happy. I'll spare you the details, but network services from these machines were behaving erratically depending on what files the services accessed, and it wasn't until I stood in front of the console that the cause of the failure became clear. In this case, keepalive would have had the opposite of the intended effect: a bad high-level connection is kept alive by the low-level mechanism. I admit that this is a minor detail, but it can be frustrating to determine the cause of failure in such a situation. So, another plus one for configurable timeouts on the server end.

e) Dan's points concerning dropping good connections, bandwidth use, and charge-by-packet.

f) The use of keepalive is always offered as an option in other development environments. Why would we require it?
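To make "configurable timeouts on the server end" in points (b) through (d) a bit more concrete, here is a rough sketch of an idle check that lives entirely in our own code and relies only on SoTimeout. The class name IdleProbe, the Status enum, and the poll method are made up for illustration and are not taken from any existing code; this is one possible shape, not a proposal for the actual implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical helper: an application-level idle check built on SoTimeout,
// independent of whatever the OS keepalive implementation does.
public class IdleProbe {
    public enum Status { DATA, IDLE, CLOSED }

    // Waits up to idleMillis for one byte from the peer.
    // idleMillis must be > 0; a SoTimeout of 0 means "block forever".
    // Note: this consumes the byte it reads; a real server would hand it
    // to its protocol parser instead of discarding it.
    public static Status poll(Socket s, int idleMillis) throws IOException {
        s.setSoTimeout(idleMillis);
        InputStream in = s.getInputStream();
        try {
            int b = in.read();
            // read() returns -1 when the peer has closed its side cleanly
            return b < 0 ? Status.CLOSED : Status.DATA;
        } catch (SocketTimeoutException e) {
            // No data within idleMillis; the socket itself is still usable.
            return Status.IDLE;
        }
    }
}
```

The server decides what to do with IDLE (drop the session, release locks, or keep waiting), so the timeout value and the policy are both ours to configure. It does not detect a peer that vanished without closing; it only bounds how long we wait.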

Like I said, I'm +1 to enabling keepalive by default. In most cases it will do what is expected, and the effort in making it a configurable option is minimal. I just think there are good reasons for making the use of keepalive an option and not a requirement. I also think an interesting opportunity lies in finding an alternative way of solving the locking issue in the original post of this thread.
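For what it's worth, here is roughly how small the configurable version could be. This is only a sketch: the class name KeepAliveConfig and the two property names are invented for this example and do not correspond to any existing setting. The only real API calls are Socket.setKeepAlive and Socket.setSoTimeout.

```java
import java.net.Socket;
import java.net.SocketException;

// Hypothetical sketch: keepalive on by default, with an opt-out switch.
public class KeepAliveConfig {
    // Made-up property names, used here only for illustration.
    public static final String KEEPALIVE_PROP = "example.server.keepAlive";
    public static final String TIMEOUT_PROP = "example.server.soTimeoutMillis";

    // Defaults to true when the property is not set.
    public static boolean keepAliveEnabled() {
        return Boolean.parseBoolean(System.getProperty(KEEPALIVE_PROP, "true"));
    }

    // Defaults to 0, which for SoTimeout means reads block indefinitely.
    public static int soTimeoutMillis() {
        return Integer.getInteger(TIMEOUT_PROP, 0);
    }

    // Applies both settings to a client socket, e.g. right after accept().
    public static void configure(Socket s) throws SocketException {
        s.setKeepAlive(keepAliveEnabled());
        s.setSoTimeout(soTimeoutMillis());
    }
}
```

With nothing set, a socket passed to configure() gets keepalive turned on; starting the JVM with -Dexample.server.keepAlive=false turns it off without touching the code.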

andrew
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Darwin)

iD8DBQFBa4VJDfB0XauCH7wRAkNzAJ45Ke8oki5XGrkMRBP52/Y7iY8VgwCfaoYe
a8lHDsIR+brrlW3JGM0Z0qg=
=SKyO
-----END PGP SIGNATURE-----
