Lawrence Stewart
Tue, 23 Sep 2008 19:05:20 -0700
On Sep 23, 2008, at 9:18 PM, Perry E. Metzger wrote:
> Greg Lindahl <[EMAIL PROTECTED]> writes:
>> On Tue, Sep 23, 2008 at 07:43:19PM -0400, Perry E. Metzger wrote:
>>> As for the daemons, remember that with a proper scheduler, you will
>>> switch straight from an incoming network interrupt to a high priority
>>> process that is expecting the incoming packet, and that even works
>>> correctly on some (but not all) Linux kernels. A user process cannot
>>> take priority over other tasks, at least not without someone being
>>> quite deliberate about it.
>>
>> You realize that most big HPC systems are using interconnects that
>> don't generate many or any interrupts, right?
>
> Of course. Usually one even uses interrupt pacing/mitigation even in
> gig ethernet on a modern machine -- otherwise you're not going to get
> reasonable performance. (For 10Gig, you have to do even uglier tricks.)
> However, my argument still holds without any change. Until you actually
> process the packet, which happens in the kernel, userland won't see it
> anyway, and when the kernel processes it, it is free to switch to
> whatever userland process it wishes, and (under normal circumstances)
> it will do the right thing.

I think Greg is talking about HPC interconnects that do OS bypass, and Perry is talking about the kernel IP stack. Different things.

IB, Quadrics, Myrinet, and SiCortex stuff does not go through the kernel, does not interrupt, does not schedule. Typically the application thread calling SEND directly interacts with the NIC, and at the other end, the thread calling RECV directly polls the NIC queue to receive a packet.

A sufficiently fancy ethernet controller could do similar things, "sockets direct" and so forth.

In our code, the fast path from application calling SEND to the application returning from RECV at the other end is 250 machine instructions. There is no time for the kernel to get in the way, no time to switch contexts or address spaces or save registers.

I'm sure the OS kernel is a fine thing. We throw it an occasional TLB miss as a bribe not to bother the application. Useful for initialization and ECC, but you wouldn't loan it your car.

-L

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
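
For anyone who wants the SEND/RECV polling idea from the message above in concrete form, here is a rough sketch in C. It is not the SiCortex code or any vendor's API. It assumes a single sender and a single receiver and stands in for the NIC queue with an ordinary in-memory ring; real hardware would expose memory-mapped device queues instead. Every name here (struct ring, ring_send, ring_poll, ring_recv, RING_SLOTS, PAYLOAD_BYTES) is made up for illustration. The point is only that neither side makes a system call, takes an interrupt, or asks a scheduler for anything.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 8        /* hypothetical queue depth */
#define PAYLOAD_BYTES 56    /* hypothetical maximum packet size */

/* Slot indices wrap with the 32-bit counters, so the depth must divide 2^32. */
static_assert((RING_SLOTS & (RING_SLOTS - 1)) == 0, "RING_SLOTS must be a power of two");

/* One slot of a stand-in "NIC queue" that lives in user-space memory. */
struct slot {
    atomic_uint ready;              /* 0 = empty, 1 = packet present */
    uint32_t len;
    char payload[PAYLOAD_BYTES];
};

struct ring {
    struct slot slots[RING_SLOTS];
    uint32_t head;                  /* next slot to fill, touched only by the sender */
    uint32_t tail;                  /* next slot to poll, touched only by the receiver */
};

void ring_init(struct ring *r)
{
    for (int i = 0; i < RING_SLOTS; i++)
        atomic_init(&r->slots[i].ready, 0);
    r->head = 0;
    r->tail = 0;
}

/* SEND: copy the packet straight into the queue.
 * Returns 0 on success, -1 if the queue is full or the packet is too large. */
int ring_send(struct ring *r, const void *buf, uint32_t len)
{
    struct slot *s = &r->slots[r->head % RING_SLOTS];
    if (len > PAYLOAD_BYTES)
        return -1;
    if (atomic_load_explicit(&s->ready, memory_order_acquire))
        return -1;                  /* receiver has not drained this slot yet */
    memcpy(s->payload, buf, len);
    s->len = len;
    /* Release: the payload must be visible before the ready flag is. */
    atomic_store_explicit(&s->ready, 1, memory_order_release);
    r->head++;
    return 0;
}

/* One poll of the queue. Returns the packet length, -1 if nothing has
 * arrived, or -2 if the caller's buffer is too small (packet stays queued). */
int ring_poll(struct ring *r, void *buf, uint32_t cap)
{
    struct slot *s = &r->slots[r->tail % RING_SLOTS];
    if (!atomic_load_explicit(&s->ready, memory_order_acquire))
        return -1;
    uint32_t len = s->len;
    if (len > cap)
        return -2;
    memcpy(buf, s->payload, len);
    /* Release: finish reading the payload before handing the slot back. */
    atomic_store_explicit(&s->ready, 0, memory_order_release);
    r->tail++;
    return (int)len;
}

/* RECV: spin on the queue until a packet shows up. No sleeping, no kernel. */
int ring_recv(struct ring *r, void *buf, uint32_t cap)
{
    int n;
    while ((n = ring_poll(r, buf, cap)) == -1)
        ;                           /* busy-wait */
    return n;
}
```

Typical use would be one thread calling ring_send while another sits in ring_recv on the same ring. The busy-wait burns a core, which is the trade being made: the receiver is already running when the packet lands, so there is no context switch to pay for.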