Roch - PAE wrote:
> jason jiang writes:
> > From my experience, using softintr to distribute the packets to the
> > upper layer gives worse latency and throughput than handling them in
> > a single interrupt thread. And you want to make sure not to handle
> > too many packets in one interrupt.
> I see that both interrupt schemes suffer from the same
> drawback of pinning whatever thread happens to be running on
> the interrupt/softintr CPU. The problem gets really annoying
> when the incoming inter-packet time interval is smaller than the
> handling time under the interrupt. Even if the code is set
> to return after handling N packets, a new interrupt will be
> _immediately_ signaled and the pinning will keep on going.
That depends on the device and the driver. It's entirely possible to
acknowledge multiple interrupts at once. With most hardware I've worked
with, if multiple packets arrive while the interrupt on the device is
not yet acknowledged, multiple interrupts are not raised. So if you
don't acknowledge the interrupt until you think you're done processing,
you probably won't take another interrupt when you exit. (You do need
to check the ring one last time after acknowledging the interrupt, to
prevent a lost-packet race, though.)
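
In rough C, that pattern looks something like the sketch below. All the
names are made up (the ring and register accesses are device specific);
it just shows the drain / ack / re-check ordering, with a tiny
simulation standing in for the hardware so it runs standalone:

#include <stdio.h>
#include <stdbool.h>

/* Hypothetical stand-ins for device state; a real driver would read
 * the descriptor ring and an interrupt cause register instead. */
static int rx_pending = 5;              /* frames sitting in the RX ring */

static bool ring_has_work(void)      { return (rx_pending > 0); }
static void process_one_packet(void) { rx_pending--; printf("rx frame\n"); }
static void ack_interrupt(void)      { printf("ack\n"); /* clear cause */ }

static void
rx_intr(void)
{
        for (;;) {
                /* Drain the ring before acking, so the device does not
                 * raise one interrupt per frame while we work. */
                while (ring_has_work())
                        process_one_packet();

                ack_interrupt();

                /* Re-check after the ack: a frame landing between the
                 * last ring check and the ack would otherwise sit
                 * until the next interrupt (the lost-packet race). */
                if (!ring_has_work())
                        break;
        }
}

int
main(void)
{
        rx_intr();              /* simulate one interrupt delivery */
        return (0);
}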
> Now, the per-packet handling time is not a well-defined
> entity. The software stack can choose to do more work (say, push up
> through TCP/IP) or less (just queue and wake a kernel
> thread) on each packet. All this needs to be managed based
> on the load, and we're moving in that direction.
There are other changes in the works... when the stack can't keep up
with the inbound packets at _interrupt_ rate, it will have the ability
to turn off interrupts on the device (if the device supports it) and
run the receive thread in "polling mode". This means that you have no
inter-packet context switches. It will stay in this mode until the
poller empties the receive ring.
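
Shape-wise, the handoff looks roughly like this. Again the names are
invented, since the real interfaces aren't settled; it just shows why
there are no inter-packet context switches while polling:

#include <stdio.h>
#include <stdbool.h>

/* Invented stand-ins; a tiny simulation so the sketch runs. */
static int rx_pending = 64;

static bool ring_has_work(void)      { return (rx_pending > 0); }
static void process_one_packet(void) { rx_pending--; }
static void disable_rx_intr(void)    { printf("interrupts off\n"); }
static void enable_rx_intr(void)     { printf("interrupts on\n"); }
static bool wait_for_rx_intr(void)   { return (false); /* sim: done */ }

static void
rx_poll_loop(void)
{
        do {
                disable_rx_intr();

                /* Polling mode: stay here until the ring is empty.
                 * No per-packet interrupts, no context switches. */
                while (ring_has_work())
                        process_one_packet();

                enable_rx_intr();
        } while (ring_has_work() ||     /* a frame raced the re-enable */
            wait_for_rx_intr());        /* else sleep for the next one */

        printf("ring drained, back to interrupt mode\n");
}

int
main(void)
{
        rx_poll_loop();
        return (0);
}

(In the real thing the switch into polling mode would itself be gated
on the stack falling behind at interrupt rate; the loop above just
shows the steady state.)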
> At the driver level, if you reach a point where you have a
> large queue in the HW receive rings, that is a nice
> indication that deferring the processing to a non-interrupt
> kernel thread would be good. Under this condition the thread
> wakeup cost is amortized over the handling of many packets.
Hmm... but you still have the initial latency for the first packet in
the ring. It's not fatal, but it's not nice to add 10 msec of latency
if you don't have to, either. The details of this decision are moving
up-stack, though, in the form of squeues and polling with Crossbow.
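
One way to split the difference is to take the first few frames inline
and only pay the wakeup when the backlog is deep. A made-up sketch (the
quota and threshold numbers are arbitrary, and the worker is a stub):

#include <stdio.h>
#include <stdbool.h>

#define INLINE_QUOTA    8       /* frames handled right in the interrupt */
#define DEFER_DEPTH     32      /* backlog that justifies a thread wakeup */

/* Invented stand-ins, as before. */
static int rx_pending = 100;

static int  ring_depth(void)         { return (rx_pending); }
static bool ring_has_work(void)      { return (rx_pending > 0); }
static void process_one_packet(void) { rx_pending--; }
static void wake_rx_worker(void)     { printf("deferred %d frames\n",
                                           rx_pending); }

static void
rx_intr(void)
{
        int n;

        /* Handle the head of the burst inline, so the first packet
         * does not pay the worker-thread wakeup latency. */
        for (n = 0; n < INLINE_QUOTA && ring_has_work(); n++)
                process_one_packet();

        if (ring_depth() >= DEFER_DEPTH) {
                /* Deep backlog: one wakeup amortized over many frames. */
                wake_rx_worker();
        } else {
                /* Shallow backlog: cheaper to just finish it here. */
                while (ring_has_work())
                        process_one_packet();
        }
}

int
main(void)
{
        rx_intr();      /* simulate one interrupt with a 100-frame burst */
        return (0);
}

Where exactly the threshold sits would have to come from measurement;
the point is just that the wakeup is only paid when there is enough
backlog to amortize it.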
We are looking at other ways to reduce per-packet processing overhead as
well... stay tuned.
-- Garrett
_______________________________________________
networking-discuss mailing list
[email protected]