Aaron Bannert wrote:
>
> On Thu, Jan 17, 2002 at 10:20:00AM -0500, Greg Ames wrote:
> [snip]
> > As a matter of fact, I typically listen on two ports while testing to disable
> > S_L_U_A, so I can easily figure out which process will get the next connection
> > in case I want to gdb it. While trying out ktrace on my test config, I saw that
> > the fcntl() accept mutex has got a thundering herd problem on daedalus. After
> > releasing the fcntl mutex, you see the kernel context switching to all of the
> > idle httpd processes. The first process that wakes up gets the mutex, the rest
> > of the context switches simply burn a little CPU, then block again. Moral: the
> > default cure looks as bad as the disease.
>
> Is there anywhere else that we've started using cross process locks
> since 2.0.28? If fcntl() is known to cause this behaviour, why is it
> enabled at all on this version of FreeBSD?

I didn't know about this until yesterday. Nobody else mentioned it AFAIK.

> Based on your ktrace output from a couple days ago, I have a working
> theory that I have yet to reproduce: I noticed that there is a very
> high occurrence of sendfile returning with errno 35 (Resource temporarily
> unavailable).

On FreeBSD, sendfile will almost always return -1 errno 35 for big files.
That simply means the file is bigger than the socket buffers and the disk
i/o bandwidth is higher than the network bandwidth to the user.

But we call sendfile twice as often as we need to on FreeBSD and probably
Solaris. I consistently see a pattern of two sendfiles then a select. When
apr sees that sendfile sent some bytes, we change the retval from -1 and
quickly forget that it told us it would block. We do need to exit from
apr_sendfile after it sends bytes so that the app can update the offset &
length etc. But before exiting, apr should put a mark on the wall that
tells us to issue select() first next time apr_sendfile is called, because
the kernel just told us what was likely to happen.

We already have this kind of logic in apr_recv. It uses the
APR_INCOMPLETE_READ flag to predict whether the next call will return
EAGAIN/EWOULDBLOCK, so we know to try select() first on the next apr call.
Perhaps we should rename this to APR_INCOMPLETE_IO and use it in
apr_sendfile, or get really crazy and use a new flag.

> Unfortunately, I don't think this will account for the short bursts of run
> queue growth we're talking about here, but it is something to look into.

Right. The double sendfile() calls have been happening for ages, so they
are not the cause of the load spike problem. They certainly need to be
addressed, along with a number of other extra syscall problems. However,
if you look at the change log for apr_sendfile, you'll see that we've gone
round and round trying to get it right on FreeBSD. So this change needs to
be coded, reviewed and tested very carefully.

Greg
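
A rough sketch of the "mark on the wall" idea described above, for
illustration only. This is not apr code: the kernel is simulated, and
SKETCH_INCOMPLETE_WRITE, sketch_sock, fake_sendfile, fake_wait_writable,
sketch_sendfile and send_file are all hypothetical names invented for this
example. It only counts how many simulated syscalls each strategy issues
when the file is larger than the socket buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag, modeled on the APR_INCOMPLETE_READ idea. */
#define SKETCH_INCOMPLETE_WRITE 0x1
#define SOCKBUF 4096            /* assumed socket buffer size */

typedef struct {
    int flags;                  /* prediction carried between calls */
    size_t buf_space;           /* simulated free socket buffer space */
    int n_sendfile;             /* simulated sendfile() calls issued */
    int n_select;               /* simulated select() calls issued */
} sketch_sock;

/* Simulated kernel: sends what fits, fails with EAGAIN if not everything fit. */
static int fake_sendfile(sketch_sock *s, size_t len, size_t *sent)
{
    s->n_sendfile++;
    *sent = len < s->buf_space ? len : s->buf_space;
    s->buf_space -= *sent;
    if (*sent < len) {
        errno = EAGAIN;
        return -1;
    }
    return 0;
}

/* Simulated select(): returns once the socket buffer has drained. */
static void fake_wait_writable(sketch_sock *s)
{
    s->n_select++;
    s->buf_space = SOCKBUF;
}

/* One wrapper call; use_flag toggles the proposed prediction logic. */
static int sketch_sendfile(sketch_sock *s, size_t len, size_t *sent,
                           int use_flag)
{
    int rv;

    /* The kernel said last time that it would block: select() first. */
    if (use_flag && (s->flags & SKETCH_INCOMPLETE_WRITE)) {
        s->flags &= ~SKETCH_INCOMPLETE_WRITE;
        fake_wait_writable(s);
    }

    rv = fake_sendfile(s, len, sent);
    while (rv == -1 && errno == EAGAIN && *sent == 0) {
        /* Nothing was sent: wait for writability and retry. */
        fake_wait_writable(s);
        rv = fake_sendfile(s, len, sent);
    }

    if (rv == -1 && errno == EAGAIN && *sent > 0) {
        /* Partial send: report success so the caller can update its
         * offset, but leave the mark on the wall for the next call. */
        if (use_flag)
            s->flags |= SKETCH_INCOMPLETE_WRITE;
        rv = 0;
    }
    return rv;
}

/* Caller loop: keeps calling until the whole "file" has been sent. */
static void send_file(sketch_sock *s, size_t total, int use_flag)
{
    size_t off = 0, sent = 0;

    assert(s != NULL);
    while (off < total) {
        if (sketch_sendfile(s, total - off, &sent, use_flag) != 0)
            return;
        off += sent;
    }
}
```

Under these assumptions, sending a file four times the buffer size without
the flag shows the "two sendfiles then a select" pattern, while the flagged
variant issues one sendfile per select. Both variants issue the same number
of selects; only the wasted sendfile calls go away.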
