Roch - PAE wrote:
jason jiang writes:
 > From my experience, using softintr to distribute the packets to the
 > upper layer gives worse latency and throughput than handling them in
 > a single interrupt thread. And you want to make sure you do not
 > handle too many packets in one interrupt.


I see that both interrupt schemes suffer from the same
drawback of pinning whatever thread happens to be running on the
interrupt/softintr CPU. The problem gets really annoying
when the incoming inter-packet interval is smaller than the
handling time under the interrupt. Even if the code is set
to return after handling N packets, a new interrupt will be
signaled _immediately_ and the pinning will keep on going.

That depends on the device and the driver. It's entirely possible to acknowledge multiple interrupts at once. On most hardware I've worked with, if multiple packets arrive while the device's interrupt has not been acknowledged, you don't receive multiple interrupts. So if you don't acknowledge the interrupt until you think you're done processing, you probably won't take another interrupt when you exit. (You do need to check the ring one last time after acknowledging the interrupt, though, to avoid a lost-packet race.)
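
A rough sketch of that pattern for an RX handler, in Solaris DDI style; every xx_* helper, the xx_softc_t type, and the XX_INTR_RX flag are made up for illustration, not any real driver's interface:

    #include <sys/types.h>
    #include <sys/ddi.h>
    #include <sys/sunddi.h>

    /*
     * Hypothetical RX interrupt handler: drain the ring, acknowledge
     * the interrupt last, then re-check the ring once to close the
     * lost-packet race mentioned above.
     */
    static uint_t
    xx_intr(caddr_t arg)
    {
            xx_softc_t *sc = (xx_softc_t *)arg;

            if (!(xx_read_status(sc) & XX_INTR_RX))
                    return (DDI_INTR_UNCLAIMED);

            /* Drain everything currently sitting in the RX ring. */
            xx_rx_ring_drain(sc);

            /*
             * Acknowledge only now, so the device doesn't raise new
             * interrupts for packets we already processed.
             */
            xx_ack_intr(sc, XX_INTR_RX);

            /*
             * A packet may have landed between the last ring check and
             * the ack; drain once more so it isn't stranded until the
             * next interrupt.
             */
            xx_rx_ring_drain(sc);

            return (DDI_INTR_CLAIMED);
    }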
Now, the per-packet handling time is not a well-defined
entity. The software stack can choose to do more work (say, push up
through TCP/IP) or less (just queue and wake a kernel
thread) on each packet. All this needs to be managed based
on the load, and we're moving in that direction.

There are other changes in the works... when the stack can't keep up with the inbound packets at _interrupt_ rate, the stack will have the ability to turn off interrupts on the device (if it supports that), and run the receive thread in "polling mode". This means you get no inter-packet context switches. It will stay in this mode until the poller empties the receive ring.
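
Very roughly, the driver side of that switch could look like the sketch below; the xx_* names, the sc_polling flag, and the threshold are invented for illustration, and the real decision is driven from above the driver rather than hard-coded like this:

    #include <sys/stream.h>
    #include <sys/mac.h>

    /*
     * Hypothetical receive path: mask RX interrupts and stay in a
     * polling loop while the ring keeps filling, then re-enable
     * interrupts once the ring is empty.
     */
    static void
    xx_rx_process(xx_softc_t *sc)
    {
            if (xx_rx_ring_depth(sc) > XX_POLL_THRESHOLD) {
                    /* Falling behind at interrupt rate: stop taking interrupts. */
                    xx_disable_rx_intr(sc);
                    sc->sc_polling = B_TRUE;
            }

            do {
                    /* Pull whatever is in the ring as one mblk chain. */
                    mblk_t *chain = xx_rx_ring_drain(sc);

                    if (chain == NULL) {
                            /* Ring is empty: go back to interrupt mode. */
                            if (sc->sc_polling) {
                                    sc->sc_polling = B_FALSE;
                                    xx_enable_rx_intr(sc);
                            }
                            break;
                    }
                    /* Hand the whole chain up; no per-packet wakeups. */
                    mac_rx(sc->sc_mh, NULL, chain);
            } while (sc->sc_polling);
    }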

At the driver level, if you reach a point where you have a
large queue in the HW receive rings, that is a good
indication that deferring the processing to a non-interrupt
kernel thread would pay off. Under that condition the thread wakeup cost is amortized over the handling of many packets.
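
A minimal sketch of that hand-off, again with invented xx_* names and an arbitrary threshold; the worker thread itself is assumed to block on sc_rx_cv and drain the ring when woken:

    #include <sys/mutex.h>
    #include <sys/condvar.h>

    /*
     * Hypothetical dispatch decision: process inline when the ring is
     * shallow, otherwise do a single wakeup of a worker thread and let
     * it drain everything, amortizing the wakeup over many packets.
     */
    static void
    xx_rx_dispatch(xx_softc_t *sc)
    {
            if (xx_rx_ring_depth(sc) < XX_DEFER_THRESHOLD) {
                    /* Shallow ring: handle inline to keep latency low. */
                    xx_rx_ring_drain(sc);
                    return;
            }

            /* Deep ring: one wakeup, many packets. */
            mutex_enter(&sc->sc_rx_lock);
            sc->sc_rx_work = B_TRUE;
            cv_signal(&sc->sc_rx_cv);
            mutex_exit(&sc->sc_rx_lock);
    }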

Hmm... but you still have the initial latency for the first packet in the ring. It's not fatal, but it's not nice to add 10 msec of latency if you don't have to, either. The details of this decision are moving up-stack though, in the form of squeues and polling with Crossbow.

We are looking at other ways to reduce per-packet processing overhead as well... stay tuned.

   -- Garrett