On Mon, Aug 1, 2011 at 8:45 AM, azurIt <[email protected]> wrote:
>
> Hi,
>
> I came across a serious problem with Apache server when it is NOT able to
> read/write from/to the network immediately.
>
> We are using network throttling patches in our Linux kernel which are able to
> emulate low network throughput (we are using this to throttle users). When a
> process reaches its limit, read/write will result in EAGAIN, which is
> completely OK because it's standard behavior. The problem is that Apache is
> not able to process this correctly and keeps trying to read/write in a loop
> WITHOUT any delay. This, of course, results in 100% CPU consumption. This bug
> can lead to DoS of the whole server.
>
> Here is the strace output when Apache's child is throttled:
>
> writev(6409, [{"n1n1n1n1n1n1n1n1n1n1n1n1n1n1n1n1n"..., 2900}], 1) = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=6409, events=POLLOUT}], 1, 100000) = 1 ([{fd=6409, revents=POLLOUT}])
> writev(6409, [{"n1n1n1n1n1n1n1n1n1n1n1n1n1n1n1n1n"..., 2900}], 1) = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=6409, events=POLLOUT}], 1, 100000) = 1 ([{fd=6409, revents=POLLOUT}])
> writev(6409, [{"n1n1n1n1n1n1n1n1n1n1n1n1n1n1n1n1n"..., 2900}], 1) = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=6409, events=POLLOUT}], 1, 100000) = 1 ([{fd=6409, revents=POLLOUT}])
> writev(6409, [{"n1n1n1n1n1n1n1n1n1n1n1n1n1n1n1n1n"..., 2900}], 1) = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=6409, events=POLLOUT}], 1, 100000) = 1 ([{fd=6409, revents=POLLOUT}])
> writev(6409, [{"n1n1n1n1n1n1n1n1n1n1n1n1n1n1n1n1n"..., 2900}], 1) = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=6409, events=POLLOUT}], 1, 100000) = 1 ([{fd=6409, revents=POLLOUT}])
>
> ...and so on until it is able to send data again.
>
> I suggest inserting a little delay every time an EAGAIN is returned.
It looks like your kernel patches cause poll() to tell Apache the socket is immediately writable even though the kernel will return EAGAIN on the subsequent write. Without the patches, and with a client that doesn't read the [large] response, I'd expect you not to see this behavior: poll() should simply block (not spin) until the client does some reading.
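
To illustrate the behavior I'd expect from an unpatched kernel, here is a minimal sketch in Python rather than Apache's actual code. It assumes a Unix-like system where select.poll() is available, and it uses a Unix-domain socketpair as a stand-in for the client connection; the helper names (fill_send_buffer, writable_within, demo) are made up for this example. It writes until EAGAIN, checks that poll() does not report POLLOUT while the peer is not reading, then drains the peer and checks again.

```python
import select
import socket


def fill_send_buffer(sock):
    """Write to a non-blocking socket until EAGAIN; return bytes accepted."""
    chunk = b"x" * 4096
    total = 0
    while True:
        try:
            total += sock.send(chunk)
        except BlockingIOError:  # raised for EAGAIN / EWOULDBLOCK
            return total


def writable_within(sock, timeout_ms):
    """Return True if poll() reports POLLOUT before the timeout expires."""
    poller = select.poll()
    poller.register(sock, select.POLLOUT)
    events = poller.poll(timeout_ms)
    return any(revents & select.POLLOUT for _fd, revents in events)


def demo():
    # Assumption: a socketpair stands in for the server/client connection.
    writer, reader = socket.socketpair()
    writer.setblocking(False)
    try:
        queued = fill_send_buffer(writer)

        # The peer has read nothing, so poll() should time out here
        # instead of claiming the socket is writable.
        before = writable_within(writer, 200)

        # Now the "client" reads everything that was queued.
        remaining = queued
        while remaining > 0:
            remaining -= len(reader.recv(min(65536, remaining)))

        # With the buffer drained, poll() should report POLLOUT.
        after = writable_within(writer, 200)
        return queued, before, after
    finally:
        writer.close()
        reader.close()


if __name__ == "__main__":
    queued, before, after = demo()
    print("bytes queued before EAGAIN:", queued)
    print("POLLOUT while peer idle:", before)
    print("POLLOUT after peer read:", after)
```

If the first check came back writable while the next write still failed with EAGAIN, you would get exactly the writev/poll loop shown in the strace output, which is why I suspect the patches rather than Apache.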
