William A. Rowe, Jr. wrote:
This patch can't be applied... it actually introduces a denial of service
problem if folks can simply early-disconnect a server some half dozen
[actually 100 :)]
times in a row... It isn't hard to work up such a tool.
If it is possible for someone to externally tickle the TransmitFile socket recycle bug then I agree.

Better; what if we test *which* socket failed? We are sort of helpless when
the error could be on either the Listen or the Accept socket. If the
error is on the Listen socket, we should exit, signaling the parent to do
a restart with new listeners; if the error is on the Accept socket, we can
just keep going.
Based on the IP address renewal scenario you mention below, testing the Listen
socket (somehow, tbd) sounds like a good idea.

Just to summarize, there are three conditions we need to consider:
1) we hit the TransmitFile recycle bug many times in a row
2) we have encountered an incompatible firewall or VPN
3) the IP address has changed

Instead, can we find some patch that will test AcceptEx? Perhaps we create
a single local listener and attempt to connect and write to it, test
that the AcceptEx succeeds, and otherwise emit some nasty warnings
and throw a flag that puts us into the Win9x listener code?
Testing AcceptEx is not easy; the failure only occurs when duplicating
the socket between processes. But maybe testing the Listen socket
provides us with enough information to indicate what the problem might be
and suggest or perform corrective action.

Does accept() also fail?  Can we use the 9x code to work around these
sorts of problems?
No, accept() is fine. Using the 9x path *may* work but I haven't
tested it. The other option Bill S. suggested was to add a directive
that forces the 9x path. I tend to think that is preferable to a
run-time decision because I'm not sure we can reliably determine
which path to take at runtime.
Note: taking the 9x path is only relevant to case 2) above.
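
To illustrate the directive option, here is a minimal sketch in C of picking the accept path once at startup from configuration instead of deciding at run time. Every name in it (accept_path, listener_config, force_9x_path, choose_accept_path) is hypothetical and not taken from the patch or the server source; it only assumes that a directive would set a flag that the listener code consults.

```c
/* Hypothetical sketch: none of these names exist in the server source. */

typedef enum {
    ACCEPT_PATH_ACCEPTEX,   /* NT-style path built on AcceptEx */
    ACCEPT_PATH_WIN9X       /* 9x-style path, assumed to use plain accept() */
} accept_path;

typedef struct {
    int force_9x_path;      /* assumed to be set by a new config directive */
} listener_config;

/* Decide the path once, at startup. os_supports_acceptex is nonzero on
 * platforms where the AcceptEx path is available at all. */
accept_path choose_accept_path(const listener_config *cfg,
                               int os_supports_acceptex)
{
    if (!os_supports_acceptex || cfg->force_9x_path) {
        return ACCEPT_PATH_WIN9X;
    }
    return ACCEPT_PATH_ACCEPTEX;
}
```

The point of the design is that the admin who knows about the firewall or VPN makes the call, so the server never has to guess.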

I don't as much mind the Sleep(100) or even Sleep(0) so that we
relinquish clock cycles.  It's the arbitrary "foil the server 100 times
and it will exit" problem.
OK, so we can log a msg & continue instead of exiting.

Since we may not be able to rule out a false positive,
maybe we should modify the error message to say that
"if NO requests are being served it is probably a firewall
or VPN problem", but continue the accept loop.

However, prior to logging this message we would need to test the Listen
socket and, if it is bad, log a message saying that the IP address has
probably become invalid, then exit the child and let the parent renew
the Listeners.


Because those only occur once the listen socket becomes
invalidated, due to DHCP or some other change. You can trigger this
by reconfiguring TCP/IP to switch between two IP addresses.
Again, we can recover gracefully if we ask the parent to do a respawn
upon recreating all of *its* listeners.
i.e. whenever we hit some threshold of consecutive AcceptEx errors
test the Listening socket (tbd somehow), and exit the child if it is bad.
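
The policy discussed in this thread could be sketched roughly as follows in C. This is only an illustration under stated assumptions: all names (accept_state, record_accept_result, action_at_threshold, ACCEPT_ERROR_THRESHOLD) are made up for the example, the threshold of 100 is the figure mentioned earlier in the thread, and the listen socket test itself is still tbd, so its result is simply passed in as a flag.

```c
/* Hypothetical sketch: names are illustrative, not from the patch. */

#define ACCEPT_ERROR_THRESHOLD 100  /* consecutive failures before we react */

typedef enum {
    ACTION_WARN_AND_CONTINUE, /* log the "firewall or VPN?" hint, keep looping */
    ACTION_EXIT_CHILD         /* listener is bad: exit so the parent can
                                 recreate its listeners and respawn */
} accept_action;

typedef struct {
    int consecutive_errors;   /* reset on every successful AcceptEx */
} accept_state;

/* Record one AcceptEx outcome. Returns 1 exactly when the threshold of
 * consecutive failures has just been reached, otherwise 0. The counter is
 * reset at the threshold so the check repeats every N failures. */
int record_accept_result(accept_state *st, int accept_ok)
{
    if (accept_ok) {
        st->consecutive_errors = 0;
        return 0;
    }
    st->consecutive_errors++;
    if (st->consecutive_errors < ACCEPT_ERROR_THRESHOLD) {
        return 0;
    }
    st->consecutive_errors = 0;
    return 1;
}

/* Called only at the threshold. listener_ok is the result of the
 * (still to be designed) listen socket test. */
accept_action action_at_threshold(int listener_ok)
{
    return listener_ok ? ACTION_WARN_AND_CONTINUE : ACTION_EXIT_CHILD;
}
```

With this shape the child never exits just because a client foiled AcceptEx 100 times; it exits only when the listen socket test says the listener itself is bad.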

Allan


