On Mon, Mar 07, 2005 at 04:14:37PM +1100, Nick Piggin wrote: > I think you would have better luck in reproducing this problem if you > did the full sendfile thing. > > I think it is becoming disk bound due to page reclaim problems, which > is causing the slowdown. > > In that case, writing the network only test would help to confirm the > problem is not a networking one - so not useless by any means.
Not necessarily, Nick. I have written an HTTP testing tool which matches the description of Ben's : non-blocking, single-threaded, no disk I/O, etc... It works flawlessly under 2.4, and gives me random numbers in 2.6, especially if I start some CPU activity on the system, I can get pauses of up to 13 seconds without this tool doing anything !!! At first I believed it was because of the scheduler, but it might also be related to what is described here since I had somewhat the same setup (gigE, 1500, thousands of sockets). I never had enough time to investigate more, so I went back to 2.4. It makes me think that for the problem described here, we have no indication of CPU & I/O activity, which might help Ben try to reproduce. Cheers, Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

