Actually 100 :)
This patch can't be applied... it actually introduces a denial-of-service problem if folks can simply early-disconnect from a server some half dozen times in a row... It isn't hard to work up such a tool.
If it is possible for someone to externally tickle the TransmitFile socket recycle bug, then I agree.
Better: what if we test *which* socket failed? We are sort of helpless when the
error could be on either the Listen or the Accept socket. If the
error is on the Listen socket, we should exit, signaling the parent to do
a restart with new listeners; if the error is on the Accept socket, we can
just keep going.
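A minimal sketch of that decision, for illustration only. The enum and function names are hypothetical and not taken from the MPM source, and how we learn which socket failed is still open:

```c
/* Illustrative only: these names are hypothetical, not from the MPM code. */
typedef enum { SOCK_LISTEN, SOCK_ACCEPT } failed_sock_t;
typedef enum { CHILD_KEEP_GOING, CHILD_EXIT_AND_RESTART } child_action_t;

/* Map the socket that reported the error to what the child should do. */
child_action_t action_for_failed_socket(failed_sock_t which)
{
    if (which == SOCK_LISTEN) {
        /* Listener is bad: exit so the parent can restart with new listeners. */
        return CHILD_EXIT_AND_RESTART;
    }
    /* Only this accept socket is bad: discard it and keep accepting. */
    return CHILD_KEEP_GOING;
}
```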
Based on the IP address renewal scenario you mention below, testing the Listen socket (somehow, tbd) sounds like a good idea.
Just to summarize, there are three conditions we need to consider:
1) we hit the TransmitFile recycle bug many times in a row
2) we have encountered an incompatible firewall or VPN
3) the IP address has changed
Instead, can we find some patch that will test AcceptEx? Perhaps we create a
single local listen socket, attempt to connect and write to it, test
that the AcceptEx succeeds, and otherwise emit some nasty warnings
and throw a flag that puts us into the Win9x listener code?
Testing AcceptEx is not easy; the failure only occurs when duplicating the socket between processes. But maybe testing the Listen socket provides us with enough information to indicate what the problem might be and to suggest or perform corrective action.
Does accept() also fail? Can we use the 9x code to work around these sorts of problems?
No, accept() is fine. Using the 9x path *may* work, but I haven't tested it. The other option Bill S. suggested was to add a directive that forces the 9x path. I tend to think that is preferable to a run-time decision because I'm not sure we can reliably determine which path to take at runtime. Note: taking the 9x path is only relevant to case 2) above.
OK, so we can log a msg & continue instead of exiting.
I don't so much mind the Sleep(100) or even Sleep(0) so that we relinquish clock cycles. It's the arbitrary "foil the server 100 times and it will exit" problem.
Since we may not be able to rule out a false positive, maybe we should modify the error message to say that "if NO requests are being served it is probably a firewall or VPN problem", but continue the accept loop.
However, prior to logging this message we would need to test the Listen
socket and, if it is bad, log a message saying that the IP address has probably become invalid, then exit the child and let the parent renew the Listeners.
Because those only occur once the listen socket becomes
invalidated, due to DHCP or some other change. You can trigger
this by reconfiguring TCP/IP to switch between two IP addresses.
Again, we can recover gracefully if we ask the parent to do a respawn upon recreating all of *its* listeners.
I.e., whenever we hit some threshold of consecutive AcceptEx errors, test the listening socket (tbd somehow), and exit the child if it is bad.
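Roughly, and only as a sketch: the threshold value, the type and function names, and the probe callback below are all assumptions for illustration. The real listen-socket test is still tbd, so it is stubbed out here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed value; the thread mentions both "half dozen" and 100. */
#define ACCEPT_ERR_THRESHOLD 6

/* Hypothetical probe: returns true if the listen socket still looks usable.
 * How to implement the real test is still to be determined. */
typedef bool (*listen_probe_fn)(void *ctx);

typedef enum {
    ACCEPT_RETRY,           /* below threshold: just try AcceptEx again     */
    ACCEPT_WARN_AND_RETRY,  /* listener OK: log firewall/VPN hint, continue */
    ACCEPT_EXIT_CHILD       /* listener bad: exit, parent renews listeners  */
} accept_action_t;

typedef struct {
    int consecutive_errors;
} accept_state_t;

/* Any successful accept breaks the run of consecutive errors. */
void accept_succeeded(accept_state_t *st)
{
    st->consecutive_errors = 0;
}

/* Call after each AcceptEx failure; probes the listener only at the threshold. */
accept_action_t accept_failed(accept_state_t *st, listen_probe_fn probe, void *ctx)
{
    st->consecutive_errors++;
    if (st->consecutive_errors < ACCEPT_ERR_THRESHOLD) {
        return ACCEPT_RETRY;
    }
    st->consecutive_errors = 0;  /* start a fresh run before the next probe */
    if (!probe(ctx)) {
        return ACCEPT_EXIT_CHILD;
    }
    return ACCEPT_WARN_AND_RETRY;
}

/* Stub probes standing in for the real (tbd) listen-socket test. */
bool probe_listener_ok(void *ctx)  { (void)ctx; return true; }
bool probe_listener_bad(void *ctx) { (void)ctx; return false; }
```

The point of this shape is that an outsider who forces repeated AcceptEx failures only causes a warning; the child exits only when the listener itself tests bad.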
Allan