On 16 Mar 2001 10:30:49 +0000, Damion Parry wrote:
> That's a problem with the bind call within vsd-redirect hanging on to port 80
> for a while after the process has been killed. It has been suggested on
> this mailing list that bind holds the port for two minutes after it has
> been killed, however I couldn't find any documentation to confirm that
> (if anyone knows, let me know and I shall adjust the wait within
> rebootvs to take account of this).
The kernel will keep the address in use for a while after the socket is
closed (the TCP TIME_WAIT period, usually quoted as two minutes) unless
the SO_REUSEADDR socket option is set, and vsdredirect does set it.
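For anyone who hasn't used the option, here is a minimal sketch of how it is normally set. This is not copied from the vsdredirect source; make_reusable_listener() is a name I made up for illustration, and the backlog of 5 is arbitrary. The important detail is that setsockopt() has to come before bind().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from vsdredirect): open a TCP listening socket
 * on `port` with SO_REUSEADDR set. Returns the descriptor, or -1 on error
 * with errno left as set by the failing call. Port 0 asks the kernel to
 * pick a free port. */
int make_reusable_listener(int port)
{
    struct sockaddr_in addr;
    int on = 1;
    int sock;

    assert(port >= 0 && port <= 65535);

    sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0)
        return -1;

    /* Must be done before bind(); it has no effect on a bind that has
     * already happened. */
    if (setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
        close(sock);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons((unsigned short)port);

    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(sock, 5) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```

Note that SO_REUSEADDR only lets bind() succeed over connections left in TIME_WAIT. It does not help if another socket is still listening on the same address and port.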
I believe the problem may simply be the speed at which a vsreboot can
happen. I get the bind() error a lot if one of the vs's is busy serving
up web pages, but if I do a vsboot --stop, wait a second, then do a
vsboot --start from the main server, it rarely ever happens. I think
what's happening is this: when the vs restarts while vsdredirect is
actually in use, the running vsdredirect process does a close() on the
socket, and the new vsdredirect process starts and calls bind() while
the socket from the old process is still in a FIN_WAIT2 or LAST_ACK
state, waiting for the client to acknowledge.
While that may not be exactly the case, it is the speed that causes the
problem; it never happens for me if I wait a second or two. Perhaps
vsdredirect should check the cause of failure in the bind() call, and
maybe sleep for a second or two and try again. Something like:
vsdredirect.c: 204
    if (bind (serversock, (struct sockaddr *)&server, serverlen) < 0) {
        /* EADDRINUSE is what bind() reports when the address is still
           in use; EINVAL would mean this socket is already bound. */
        if (errno == EADDRINUSE) {
            /* address may still be held, wait a sec and retry. */
            sleep(2);
            if (bind (serversock, (struct sockaddr *)&server, serverlen) < 0)
                fatal ("bind()");
        } else {
            fatal ("bind()");
        }
    }
I could be completely wrong, though; it's just a thought.
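If a single retry turns out not to be enough, the same idea can be written as a small bounded loop. This is only a sketch: bind_with_retry() and its attempts/delay_secs parameters are hypothetical and not part of vsdredirect. It checks for EADDRINUSE, which is the errno bind() reports for an address that is still in use, and gives up straight away on any other error.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from vsdredirect): call bind() up to `attempts`
 * times, sleeping `delay_secs` between tries, but only while the failure
 * is EADDRINUSE. Returns 0 on success, -1 on failure with errno left as
 * set by the last bind() call. */
int bind_with_retry(int sock, const struct sockaddr *addr, socklen_t len,
                    int attempts, unsigned int delay_secs)
{
    int i;

    assert(attempts > 0);

    for (i = 0; i < attempts; i++) {
        if (bind(sock, addr, len) == 0)
            return 0;
        if (errno != EADDRINUSE)
            return -1;          /* some other failure; retrying won't help */
        if (i + 1 < attempts)
            sleep(delay_secs);  /* no sleep after the final attempt */
    }
    return -1;                  /* still EADDRINUSE after all attempts */
}
```

In vsdredirect that would presumably be called as something like bind_with_retry(serversock, (struct sockaddr *)&server, serverlen, 3, 1), followed by fatal ("bind()") if it returns -1.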
--
Matt Kennedy
Programmer Jumpline.com, Inc.