Re: Some interrupt coalescing tests

2001-10-22 Thread Mike Silbersack


On Thu, 18 Oct 2001, Terry Lambert wrote:

> In the non-LRP case, the percentage drop in interrupt overhead
> is ~10% (as has been observed by others).  This makes sense,
> too, if you consider that NETISR driving of receives means
> less time in interrupt processing.  If we multiply the 15%
> (100% - 85% = 15% in transmit) by 3 (12000/(12000-8000) =
> 100% / 33% = 3), then we get 45% in transmit in the non-LRP
> case.

Hrm, so the reduction I saw is repeatable; good.  If there are no
objections, I'll fix up the whitespace later this week, fix the warning,
and commit it.  (The whitespace problem applies to the unified diff I created;
your context diff may have correct whitespace.)

> It would be nice if someone could confirm that slightly less
> than 1/2 of the looping is on the transmit side for a non-LRP
> kernel, but that's about what we should expect...

If it is, I wonder if we could put a larger number of packets in the queue
and disable transmit interrupts for a while.  MMmm, dynamic queues.
Sounds like something that would take a lot of work to get right,
unfortunately.  (Presumably, this tactic could be applied to most network
cards, although all the better ones probably have really good transmit
interrupt mitigation.)
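
One common way to get most of that effect is to only ask the chip for a
TX-done interrupt on every Nth descriptor queued.  A minimal sketch, with
made-up names (the flag, the softc field and the threshold are not the
real dc(4) ones):

#include <stdint.h>

/* Hypothetical, minimal stand-ins for the real driver structures. */
struct tx_desc { uint32_t flags; };
struct softc   { int tx_since_intr; };

#define TX_WANT_INTR   0x80000000u  /* "interrupt on completion" descriptor bit */
#define TX_INTR_EVERY  8            /* request an interrupt every 8 packets */

void
queue_tx_desc(struct softc *sc, struct tx_desc *d)
{
        if (++sc->tx_since_intr >= TX_INTR_EVERY) {
                d->flags |= TX_WANT_INTR;    /* chip interrupts when this one completes */
                sc->tx_since_intr = 0;
        } else {
                d->flags &= ~TX_WANT_INTR;   /* reclaimed on a later interrupt instead */
        }
        /* ...hand the descriptor to the chip as usual... */
}

The catch is the same one the dynamic-queue idea runs into: the longer
you defer the interrupt, the longer transmitted mbufs stay tied up
before they can be reclaimed.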

> I'm really surprised abuse of the HTTP protocol itself in
> denial of service attacks isn't more common.

Well, the attack you proposed would require symmetric bandwidth (if I
understood it correctly), and is of course traceable.  My guess would be
that even script kiddies are smart enough to avoid attacks which could
easily be traced back to their drones.

> Even ignoring this, there's a pretty clear off the shelf
> hardware path to a full 10 gigabits, with PCI-X (8 gigabits
> times 2 busses gets you there, which is 25 times the largest
> UUNet hosting center pipe size today).

Are you sure about that?  I recently heard that Internet2 will be moving
to 2.4Gbps backbones in the near future, and I assume that Qwest wouldn't
be willing to donate that bandwidth unless they had similar capabilities
already.  ("They" being generalized to all backbone providers.)

> Fair share is more a problem for slower interfaces without
> hardware coalescing, and software is an OK band-aid for
> them (IMO).
>
> I suspect that you will want to spend most of your CPU time
> doing processing, rather than interrupt handling, in any case.
>
> -- Terry

Yep, probably.  Are you implementing fair sharing soon?

Mike "Silby" Silbersack





Re: Some interrupt coalescing tests

2001-10-18 Thread Terry Lambert

Mike Silbersack wrote:
> What probably should be done, if you have time, is to add a bit of
> profiling to your patch to find out how it helps most.  I'm curious how
> many times it ends up looping, and also why it is looping (whether this is
> due to receive or transmit.)  I think knowing this information would help
> optimize the drivers further, and perhaps suggest a tack we haven't
> thought of.

On 960 megabits per second on a Tigon III (full wire speed,
non-jumbogram), the looping is almost entirely (~85%) on
the receive side.

It loops for 75% of the hardware interrupts in the LRP case
(reduction of interrupts from 12,000 to 8,000 -- 33%).

This is really expected, since in the LRP case, the receive
processing is significantly higher, and even in that case,
we are not driving the CPU to the wall in interrupt processing.

In the non-LRP case, the percentage drop in interrupt overhead
is ~10% (as has been observed by others).  This makes sense,
too, if you consider that NETISR driving of receives means
less time in interrupt processing.  If we multiply the 15%
(100% - 85% = 15% in transmit) by 3 (12000/(12000-8000) =
100% / 33% = 3), then we get 45% in transmit in the non-LRP
case.
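
To spell that arithmetic out:

    interrupt rate:            12,000/sec -> 8,000/sec  (a 33% reduction)
    scaling factor:            12,000 / 4,000 = 3
    transmit share, LRP:       100% - 85% = 15% of the looping
    transmit share, non-LRP:   15% * 3 = ~45%  (estimated)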

It would be nice if someone could confirm that slightly less
than 1/2 of the looping is on the transmit side for a non-LRP
kernel, but that's about what we should expect...


> > I don't know if anyone has tested what happens to apache in
> > a denial of service attack consisting of a huge number of
> > partial "GET" requests that are incomplete, and so leave state
> > hanging around in the HTTP server...
> 
> I'm sure it would keel over and die, since it needs a process
> per socket.  If you're talking about sockets in TIME_WAIT or
> such, see netkill.pl.

I was thinking in terms of connections not getting dropped.

The most correct way to handle this is probably an accept
filter for the end of the request headers, indicating a complete GET
request (still leaves POST, though, which has a body), with
dropping of long-duration incomplete requests.  Unfortunately,
without going into the "Content-Length:" parsing, we are
pretty much screwed on POST, and very big POSTs still screw
you badly (imagine a "Content-Length: 10").  You
can mitigate that by limiting request size, but you are
still talking about putting HTTP parsing in the kernel,
above and beyond simple accept filters.
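
For the GET half of this, something close to it already exists as the
accf_http accept filter; if I remember the interface right, from the
application side it is just a setsockopt() on the listening socket,
roughly:

#include <sys/types.h>
#include <sys/socket.h>
#include <string.h>

int
set_http_accept_filter(int listen_fd)
{
        struct accept_filter_arg afa;

        memset(&afa, 0, sizeof(afa));
        strcpy(afa.af_name, "httpready");       /* accf_http must be in the kernel */
        /*
         * Set after listen(2); accept(2) then only returns sockets
         * that already have a complete request buffered.
         */
        return (setsockopt(listen_fd, SOL_SOCKET, SO_ACCEPTFILTER,
            &afa, sizeof(afa)));
}

That still leaves the POST problem exactly as described above.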

I'm really surprised abuse of the HTTP protocol itself in
denial of service attacks isn't more common.


> > Yes.  Floyd and Druschel recommend using high and low
> > watermarks on the amount of data pending processing in
> > user space.  The most common approach is to use a fair
> > share scheduling algorithm, which reserves a certain
> > amount of CPU for user space processing, but this is
> > somewhat wasteful, if there is no work, since it denies
> > quantum to the interrupt processing, potentially wrongly.
> 
> I'm not sure such an algorithm would be wasteful - there must be data
> coming in to trigger such a huge amount of interrupts.  I guess this would
> depend on how efficient your application is, how you set the limits, etc.

Yes.  The "waste" comment is aimed at the idea that you
will most likely have a heterogeneous loading, so you can
not accurately predict ahead of time that you will spend
80% of your time in the kernel, and 20% processing in user
space, or whatever ratio you come up with.  This becomes
much more of an issue when you have an attack, which will,
by definition, end up being asymmetric.

In practice, however, no one out there has a pipe size in
excess of 400 Mbits outside of a lab, so most people never
really need 1Gbit of throughput, anyway.  If you can make
your system handle full wire speed for 1Gbit, you are pretty
much safe from any attack that someone might want to throw
at you, at least until the pipes get larger.

Even ignoring this, there's a pretty clear off the shelf
hardware path to a full 10 gigabits, with PCI-X (8 gigabits
times 2 busses gets you there, which is 25 times the largest
UUNet hosting center pipe size today).
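
(Spelled out: 2 busses x 8 Gbits = 16 Gbits of raw PCI-X bandwidth,
which comfortably covers a 10 Gbit wire rate, and 10 Gbits / 400 Mbits
= 25.)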

Fair share is more a problem for slower interfaces without
hardware coalescing, and software is an OK band-aid for
them (IMO).

I suspect that you will want to spend most of your CPU time
doing processing, rather than interrupt handling, in any case.

-- Terry




Re: Some interrupt coalescing tests

2001-10-17 Thread Mike Silbersack


On Sun, 14 Oct 2001, Terry Lambert wrote:

> The one thing I _would_ add -- though I'm waiting for it to
> be a problem before doing it -- is to limit the total number
> of packets processed per interrupt by keeping a running count.
>
> You would have to be _AMAZINGLY_ loaded to hit this, though;
> since it would mean absolutely continuous DMAs.  I think it
> is self-limiting, should that happen, since once you are out
> of mbufs, you're out.  The correct thing to do is probably to
> let it run out, but keep a separate transmit reserve, so that
> you can process requests to completion.

What probably should be done, if you have time, is to add a bit of
profiling to your patch to find out how it helps most.  I'm curious how
many times it ends up looping, and also why it is looping (whether this is
due to receive or transmit.)  I think knowing this information would help
optimize the drivers further, and perhaps suggest a tack we haven't
thought of.

> I don't know if anyone has tested what happens to apache in
> a denial of service attack consisting of a huge number of
> partial "GET" requests that are incomplete, and so leave state
> hanging around in the HTTP server...

I'm sure it would keel over and die, since it needs a process per socket.
If you're talking about sockets in TIME_WAIT or such, see netkill.pl.

> Yes.  Floyd and Druschel recommend using high and low
> watermarks on the amount of data pending processing in
> user space.  The most common approach is to use a fair
> share scheduling algorithm, which reserves a certain
> amount of CPU for user space processing, but this is
> somewhat wasteful, if there is no work, since it denies
> quantum to the interrupt processing, potentially wrongly.

I'm not sure such an algorithm would be wasteful - there must be data
coming in to trigger such a huge amount of interrupts.  I guess this would
depend on how efficient your application is, how you set the limits, etc.

Mike "Silby" Silbersack






Re: Some interrupt coalescing tests

2001-10-14 Thread Terry Lambert

Mike Silbersack wrote:
> Hm, true, I guess the improvement is respectable.  My thought is mostly
> that I'm not sure how much it's extending the performance range of a
> system; testing with more varied packet loads as suggested by Alfred would
> help tell us the answer to this.

I didn't respond to Alfred's post, and I probably should have;
he had some very good comments, including varying the load.

My main interest has been in increasing throughput as much as
possible; as such, my packet load has been geared towards moving
the most data possible.

The tests we did were with just connections per second, 1k HTTP
transfers, and 10k HTTP transfers.  Unfortunately, I can't give
you separate numbers without the LRP; after the connection rate
went from ~7000/second to 23500/second with LRP, we didn't bother --
it wasn't worth it.


> The extra polling of the bus in cases where there are no additional
> packets to grab is what I was wondering about.  I guess in comparison to
> the quantity of packet data going by, it's not a real issue.

It could be, if you were doing something that was network
triggered, relatively low cost, but CPU intensive; on the
whole, though, there's very little these days that isn't going
to be network related, and what there is will end up not taking
the overhead unless you are also doing networking.

Maybe it should be a tunable?  But these days, everything is
pretty much I/O bound, not CPU bound.
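
If it did become a tunable, it would only take a few lines; something
like this, where the knob name and default are made up:

#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/sysctl.h>

/* Hypothetical knob: 0 disables the extra coalescing loop, >0 bounds it. */
static int dc_coal_max = 64;
SYSCTL_INT(_hw, OID_AUTO, dc_coal_max, CTLFLAG_RW,
    &dc_coal_max, 0, "max extra rx/tx passes per interrupt (0 = off)");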

The one thing I _would_ add -- though I'm waiting for it to
be a problem before doing it -- is to limit the total number
of packets processed per interrupt by keeping a running count.

You would have to be _AMAZINGLY_ loaded to hit this, though;
since it would mean absolutely continuous DMAs.  I think it
is self-limiting, should that happen, since once you are out
of mbufs, you're out.  The correct thing to do is probably to
let it run out, but keep a separate transmit reserve, so that
you can process requests to completion.
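
In sketch form, it is just a budget carried through the completion
loops -- everything here is invented except the idea that the eof
routines return a count, the way the patch makes dc_rxeof() and
dc_txeof() do:

struct softc;                        /* stands in for struct dc_softc */
int device_has_work(struct softc *); /* hypothetical: chip says rx/tx pending */
int rxeof(struct softc *);           /* packets received, like the patched dc_rxeof() */
int txeof(struct softc *);           /* descriptors reclaimed, like the patched dc_txeof() */

#define INTR_PKT_BUDGET 256          /* tune: max packets handled per interrupt */

void
example_intr(struct softc *sc)
{
        int budget = INTR_PKT_BUDGET;

        while (budget > 0 && device_has_work(sc)) {
                budget -= rxeof(sc);         /* receive completions first */
                budget -= txeof(sc);         /* then reclaim transmit descriptors */
        }
        /*
         * If the budget ran out, work is still pending; it gets picked
         * up on the next interrupt (or a scheduled poll) instead of
         * starving everything else on this CPU.
         */
}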

I don't know if anyone has tested what happens to apache in
a denial of service attack consisting of a huge number of
partial "GET" requests that are incomplete, and so leave state
hanging around in the HTTP server...


[ ... polling vs. interrupt load ... ]

> Straight polling isn't necessarily the solution I was thinking of, but
> rather some form of interrupt disabling at high rates.  For example, if
> the driver were to keep track of how many interrupts/second it was taking,
> perhaps it could up the number of receive buffers from 64 to something
> higher, then disable the card's interrupt and set a callback to run in a
> short bit of time at which point interrupts would be reenabled and the
> interrupt handler would be run.  Ideally, this could reduce the number of
> interrupts greatly, increasing efficiency under load.  Paired with this
> could be receive polling during transmit, something which does not seem to
> be done at current, if I'm reading correctly.
> 
> I'm not sure of the feasibility of the above, unfortunately - it would
> seem highly dependent on how short of a timeout we can realistically get
> along with how many mbufs we can spare for receive buffers.

Yes.  Floyd and Druschel recommend using high and low
watermarks on the amount of data pending processing in
user space.  The most common approach is to use a fair
share scheduling algorithm, which reserves a certain
amount of CPU for user space processing, but this is
somewhat wasteful, if there is no work, since it denies
quantum to the interrupt processing, potentially wrongly.
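
Reduced to a sketch, the watermark part is just hysteresis on the
user-space backlog; the thresholds and the enable/disable hooks here
are invented:

#include <stddef.h>

#define BACKLOG_HIGH (512 * 1024)    /* bytes pending in user space: stop rx interrupts */
#define BACKLOG_LOW  (128 * 1024)    /* resume once user space has drained to here */

struct softc;
void nic_disable_rx_intr(struct softc *);    /* hypothetical hooks */
void nic_enable_rx_intr(struct softc *);

static int rx_deferred;              /* currently holding off receive interrupts? */

void
backlog_changed(struct softc *sc, size_t pending_bytes)
{
        if (!rx_deferred && pending_bytes > BACKLOG_HIGH) {
                nic_disable_rx_intr(sc);     /* give user space its quantum */
                rx_deferred = 1;
        } else if (rx_deferred && pending_bytes < BACKLOG_LOW) {
                nic_enable_rx_intr(sc);      /* resume interrupt-driven receive */
                rx_deferred = 0;
        }
}

The fair share scheduler is the harder half, since it has to decide how
big that quantum should be when the load is not known ahead of time.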


-- Terry




Re: Some interrupt coalescing tests

2001-10-14 Thread Mike Silbersack


On Sat, 13 Oct 2001, Terry Lambert wrote:

> Mike Silbersack wrote:
>
> One issue to be careful of here is that the removal of the
> tcptmpl actually causes a performance hit that wasn't there
> in the 4.3 code.  My original complaint about tcptmpl taking
> up 256 instead of 60 bytes stands, but I'm more than half
> convinced that making it take up 60 bytes is OK... or at
> least is more OK than allocating and deallocating each time,
> and I don't yet have a better answer to the problem.  4.3
> doesn't have this change, but 4.4 does.

I need benchmarks to prove the slowdown, Terry.  The testing I performed
(which is limited, of course) showed no measurable speed difference.
Remember that the only time the tcptempl mbuf ever gets allocated now is
when a keepalive is sent, which is a rare event.  The rest of the time,
it's just copying the data from the preexisting structures over to the new
packet.  If you can show me that this method is slower, I will move it
over to a zone allocated setup like you proposed.

> > I'm not sure if the number was lower because the celeron couldn't run the
> > flooder as quickly, or if the -current box was dropping packets.  I
> > suspect the latter, as the -current box was NOTICEABLY slowed down; I
> > could watch systat refresh the screen.
>
> This is unfortunate; it's an effect that I expected with
> the -current code, because of the change to the interrupt
> processing path.
>
> To clarify here, the slowdown occurred both with and without
> the patch, right?
>
> The problem here is that when you hit livelock (full CPU
> utilization), then you are pretty much unable to do anything
> at all, unless the code path goes all the way to the top of
> the stack.

Yep, the -current box livelocked with and without the patch.  I'm not sure
if -current is solely to blame, though.  My -current box is using a PNIC,
which incurs additional overhead relative to other tulip clones, according
to the driver's comments.  And the 3com in that box hasn't worked in a
while... maybe I should try debugging that so I have an additional test
point.

> > The conclusion?  I think that the dc driver does a good enough job of
> > grabbing multiple packets at once, and won't be helped by Terry's patch
> > except in a very few cases.
>
> 10% is a good improvement; my gut feeling is that it would
> have been less than that.  This is actually good news for
> me, since it means that my 30% number is bounded by the
> user space program not being run (in other words, I should
> be able to get considerably better performance, using a
> weighted fair share scheduler).  As long as it doesn't
> damage performance, I think that it's proven itself.

Hm, true, I guess the improvement is respectable.  My thought is mostly
that I'm not sure how much it's extending the performance range of a
system; testing with more varied packet loads as suggested by Alfred would
help tell us the answer to this.

> > In fact, I have a sneaky suspicion that Terry's patch may
> > increase bus traffic slightly.  I'm not sure how much of
> > an issue this is, perhaps Bill or Luigi could comment.
>
> This would be interesting to me, as well.  I gave Luigi an
> early copy of the patch to play with a while ago, and also
> copied Bill.
>
> I'm interested in how you think it could increase traffic;
> the only credible reason I've been able to come up with is
> the ability to push more packets through, when they would
> otherwise end up being dropped because of the queue full
> condition -- if this is the case, the bus traffic is real
> work, and not additional overhead.

The extra polling of the bus in cases where there are no additional
packets to grab is what I was wondering about.  I guess in comparison to
the quantity of packet data going by, it's not a real issue.

> > In short, if we're going to try to tackle high interrupt load,
> > it should be done by disabling interrupts and going to polling
> > under high load;
>
> I would agree with this, except that it's only really a
> useful observation if FreeBSD is being used as purely a
> network processor.  Without interrupts, the polling will
> take a significant portion of the available CPU to do, and
> you can't burn that CPU if, for example, you have an SSL
> card that does your handshakes, but you need to run the SSL
> sessions themselves up in user space.

Straight polling isn't necessarily the solution I was thinking of, but
rather some form of interrupt disabling at high rates.  For example, if
the driver were to keep track of how many interrupts/second it was taking,
perhaps it could up the number of receive buffers from 64 to something
higher, then disable the card's interrupt and set a callback to run in a
short bit of time at which point interrupts would be reenabled and the
interrupt handler would be run.  Ideally, this could reduce the number of
interrupts greatly, increasing efficiency under load.  Paired with this
could be receive polling during transmit, something which does
not seem to be done at current, if I'm reading correctly.

I'm not sure of the feasibility of the above, unfortunately - it would
seem highly dependent on how short of a timeout we can realistically get
along with how many mbufs we can spare for receive buffers.
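
As a sketch against the 4.x timeout(9) interface (everything except
timeout() and hz is invented here):

#include <sys/param.h>
#include <sys/systm.h>

struct softc;
void nic_disable_intr(struct softc *);   /* hypothetical: mask the card's interrupt */
void nic_enable_intr(struct softc *);
void nic_intr(void *);                   /* the driver's normal interrupt handler */

#define HOLDOFF_THRESHOLD 20000          /* interrupts/sec before backing off */
/*
 * One tick of holdoff; at the default hz=100 that is 10ms, which is
 * probably too coarse -- this is the "how short of a timeout can we
 * realistically get" problem.
 */
#define HOLDOFF_TICKS     1

static void
holdoff_expired(void *arg)
{
        struct softc *sc = arg;

        nic_enable_intr(sc);
        nic_intr(sc);                    /* drain whatever accumulated while masked */
}

void
maybe_holdoff(struct softc *sc, int intrs_last_second)
{
        if (intrs_last_second > HOLDOFF_THRESHOLD) {
                nic_disable_intr(sc);
                timeout(holdoff_expired, sc, HOLDOFF_TICKS);
        }
}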

Re: Some interrupt coalescing tests

2001-10-13 Thread Terry Lambert

Mike Silbersack wrote:
> Well, I've been watching everyone argue about the value of interrupt
> coalescing in the net drivers, so I decided to port terry's patch to 4.4 &
> -current to see what the results are.

Thanks!


> The network is 100mbps, switched.  To simulate load, I used a syn flooder
> aimed at an unused port.  icmp/rst response limiting was enabled.
> 
> With the -current box attacking the -stable box, I was able to notice a
> slight drop in interrupts/second with the patch applied.  The number of
> packets was ~57000/second.
> 
> Before: ~46000 ints/sec, 57-63% processor usage due to interrupts.
> After: ~38000 ints/sec, 50-60% processor usage due to interrupts.
> 
> In both cases, the box felt responsive.

One issue to be careful of here is that the removal of the
tcptmpl actually causes a performance hit that wasn't there
in the 4.3 code.  My original complaint about tcptmpl taking
up 256 instead of 60 bytes stands, but I'm more than half
convinced that making it take up 60 bytes is OK... or at
least is more OK than allocating and deallocating each time,
and I don't yet have a better answer to the problem.  4.3
doesn't have this change, but 4.4 does.


> With the -stable box attacking the -current box, the patch made no
> difference.  The box bogged down at only ~25000 ints/sec, and response
> limiting reported the number of packets to be ~44000/second.
> 
> I'm not sure if the number was lower because the celeron couldn't run the
> flooder as quickly, or if the -current box was dropping packets.  I
> suspect the latter, as the -current box was NOTICEABLY slowed down; I
> could watch systat refresh the screen.

This is unfortunate; it's an effect that I expected with
the -current code, because of the change to the interrupt
processing path.

To clarify here, the slowdown occurred both with and without
the patch, right?

The problem here is that when you hit livelock (full CPU
utilization), then you are pretty much unable to do anything
at all, unless the code path goes all the way to the top of
the stack.


> The conclusion?  I think that the dc driver does a good enough job of
> grabbing multiple packets at once, and won't be helped by Terry's patch
> except in a very few cases.

10% is a good improvement; my gut feeling is that it would
have been less than that.  This is actually good news for
me, since it means that my 30% number is bounded by the
user space program not being run (in other words, I should
be able to get considerably better performance, using a
weighted fair share scheduler).  As long as it doesn't
damage performance, I think that it's proven itself.


> In fact, I have a sneaky suspicion that Terry's patch may
> increase bus traffic slightly.  I'm not sure how much of
> an issue this is, perhaps Bill or Luigi could comment.

This would be interesting to me, as well.  I gave Luigi an
early copy of the patch to play with a while ago, and also
copied Bill.

I'm interested in how you think it could increase traffic;
the only credible reason I've been able to come up with is
the ability to push more packets through, when they would
otherwise end up being dropped because of the queue full
condition -- if this is the case, the bus traffic is real
work, and not additional overhead.

If you weren't getting any packets, or had a very slow
packet rate, it might increase bus traffic, in that doing
an extra check might always return a negative response (in
the test case in question, that's not true, since it's not
doing more work than it would with the same load, using
interrupts to trigger the same bus traffic).  Note that it
is only a consideration in the case that there is bus
traffic involved when polling an empty ring to see if DMA
has been done to a particular mbuf or cluster, so it takes
an odd card for it to be a problem.
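
For a typical descriptor-ring card, the "any more work?" check is just
a read of a status word the card has already DMA'd back into host
memory, so no extra PCI cycles are generated -- something like this,
with invented names:

#include <stdint.h>

#define DESC_OWN 0x80000000u         /* set: descriptor still owned by the NIC */

struct rx_desc { volatile uint32_t status; };

int
ring_has_packet(struct rx_desc *d)
{
        /*
         * Plain memory read; the card cleared the OWN bit via DMA when
         * it finished with the descriptor.  Only a card that makes you
         * read a chip register to find this out would add bus traffic.
         */
        return ((d->status & DESC_OWN) == 0);
}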


> In short, if we're going to try to tackle high interrupt load,
> it should be done by disabling interrupts and going to polling
> under high load;

I would agree with this, except that it's only really a
useful observation if FreeBSD is being used as purely a
network processor.  Without interrupts, the polling will
take a significant portion of the available CPU to do, and
you can't burn that CPU if, for example, you have an SSL
card that does your handshakes, but you need to run the SSL
sessions themselves up in user space.

For example, the current ClickArray "Array 1000" product
does around 700 1024-bit SSL connection setups a second, and,
since it uses a Broadcom card, the card is only doing the
handshaking, and not the rest of the crypto processing.  The
crypto stream processing has to be done in user space, in
the SSL proxy code living there, and as such, would suffer
from doing polling.


> the patch proposed here isn't worth the extra complexity.

I'd argue that the complexity is coming, no matter what.  If
you separate out the tx_eof and rx_eof entry points, and
externalize them into the ethernet driver interface, in order
to enable polling, you 

Re: Some interrupt coalescing tests

2001-10-12 Thread Mike Silbersack


On Fri, 12 Oct 2001, Alfred Perlstein wrote:

> > The network is 100mbps, switched.  To simulate load, I used a syn flooder
> > aimed at an unused port.  icmp/rst response limiting was enabled.
>
> Actually, you might want to leave that on, it will generate more load.

I considered leaving it on, but I'm not sure if that would be constructive
or not.  The primary problem with doing that is related to my test setup -
as we see from the stable -> current attack, my current box couldn't take
the interrupt load of that many incoming packets, which would slow down
the outgoing packets.  If I had a better test setup, I'd like to try that.

> > Before: ~46000 ints/sec, 57-63% processor usage due to interrupts.
> > After: ~38000 ints/sec, 50-60% processor usage due to interrupts.
> >
> > In both cases, the box felt responsive.
>
> You need to get real hardware to run these tests, obviously you aren't
> saturating your line.  I would suspect a better test would be to see
> how many pps you can get at the point where cpu utilization reaches
> 100%.  Basically start at a base of 60,000pps, and see how many more
> it takes to drive them both to 100%.
>
> Even your limited tests show a mean improvement of something like
> 10%.
>
> 10% isn't earth shattering, but it is a significant improvement.

Yes, there is some improvement, but I'm not sure that the actual effect is
worthwhile.  Even with the 10% decrease, you're still going to kill the
box if the interrupt count goes much higher.

If you can setup a 4.4+this patch test of some sort with varying loads to
see the effect, maybe we could characterize the effect of the patch more.
With my setup, I don't think I can really take this testing any further.

Mike "Silby" Silbersack





Re: Some interrupt coalescing tests

2001-10-12 Thread Alfred Perlstein

* Mike Silbersack <[EMAIL PROTECTED]> [011012 01:30] wrote:
> 
> Well, I've been watching everyone argue about the value of interrupt
> coalescing in the net drivers, so I decided to port Terry's patch to 4.4 &
> -current to see what the results are.  The patch included applies cleanly
> to 4.4's if_dc, and will apply to -current with a one line change.
> Whitespace is horrible, I copied and pasted the original patch, used patch
> -l, etc.
> 
> The test setup I used was as follows:
> Duron 600, PNIC, running -current
> Celeron 450, ADMtek tulip-clone, running -stable
> 
> The network is 100mbps, switched.  To simulate load, I used a syn flooder
> aimed at an unused port.  icmp/rst response limiting was enabled.

Actually, you might want to leave that on, it will generate more load.

> 
> With the -current box attacking the -stable box, I was able to notice a
> slight drop in interrupts/second with the patch applied.  The number of
> packets was ~57000/second.
> 
> Before: ~46000 ints/sec, 57-63% processor usage due to interrupts.
> After: ~38000 ints/sec, 50-60% processor usage due to interrupts.
> 
> In both cases, the box felt responsive.

You need to get real hardware to run these tests, obviously you aren't
saturating your line.  I would suspect a better test would be to see
how many pps you can get at the point where cpu utilization reaches
100%.  Basically start at a base of 60,000pps, and see how many more
it takes to drive them both to 100%.

Even your limited tests show a mean improvement of something like
10%.

10% isn't earth shattering, but it is a significant improvement.

-Alfred




Some interrupt coalescing tests

2001-10-11 Thread Mike Silbersack


Well, I've been watching everyone argue about the value of interrupt
coalescing in the net drivers, so I decided to port Terry's patch to 4.4 &
-current to see what the results are.  The patch included applies cleanly
to 4.4's if_dc, and will apply to -current with a one line change.
Whitespace is horrible, I copied and pasted the original patch, used patch
-l, etc.

The test setup I used was as follows:
Duron 600, PNIC, running -current
Celeron 450, ADMtek tulip-clone, running -stable

The network is 100mbps, switched.  To simulate load, I used a syn flooder
aimed at an unused port.  icmp/rst response limiting was enabled.

With the -current box attacking the -stable box, I was able to notice a
slight drop in interrupts/second with the patch applied.  The number of
packets was ~57000/second.

Before: ~46000 ints/sec, 57-63% processor usage due to interrupts.
After: ~38000 ints/sec, 50-60% processor usage due to interrupts.

In both cases, the box felt responsive.

With the -stable box attacking the -current box, the patch made no
difference.  The box bogged down at only ~25000 ints/sec, and response
limiting reported the number of packets to be ~44000/second.

I'm not sure if the number was lower because the celeron couldn't run the
flooder as quickly, or if the -current box was dropping packets.  I
suspect the latter, as the -current box was NOTICEABLY slowed down; I
could watch systat refresh the screen.

The conclusion?  I think that the dc driver does a good enough job of
grabbing multiple packets at once, and won't be helped by Terry's patch
except in a very few cases.  In fact, I have a sneaky suspicion that
Terry's patch may increase bus traffic slightly.  I'm not sure how much of
an issue this is, perhaps Bill or Luigi could comment.

In short, if we're going to try to tackle high interrupt load, it should
be done by disabling interrupts and going to polling under high load;
the patch proposed here isn't worth the extra complexity.

I suppose this would all change if we were using LRP and doing lots of
processing in the interrupt handler... but we aren't.

Mike "Silby" Silbersack


--- if_dc.c.orig    Thu Oct 11 01:39:05 2001
+++ if_dc.c Thu Oct 11 01:39:30 2001
@@ -193,8 +193,8 @@
 static int dc_coal __P((struct dc_softc *, struct mbuf **));
 static void dc_pnic_rx_bug_war __P((struct dc_softc *, int));
 static int dc_rx_resync    __P((struct dc_softc *));
-static void dc_rxeof   __P((struct dc_softc *));
-static void dc_txeof   __P((struct dc_softc *));
+static int dc_rxeof   __P((struct dc_softc *));
+static int dc_txeof   __P((struct dc_softc *));
 static void dc_tick    __P((void *));
 static void dc_tx_underrun __P((struct dc_softc *));
 static void dc_intr    __P((void *));
@@ -2302,7 +2302,7 @@
  * A frame has been uploaded: pass the resulting mbuf chain up to
  * the higher level protocols.
  */
-static void dc_rxeof(sc)
+static int dc_rxeof(sc)
struct dc_softc *sc;
 {
 struct ether_header*eh;
@@ -2311,6 +2311,7 @@
struct dc_desc  *cur_rx;
int i, total_len = 0;
u_int32_t   rxstat;
+  int cnt = 0;
 
ifp = &sc->arpcom.ac_if;
i = sc->dc_cdata.dc_rx_prod;
@@ -2355,7 +2356,7 @@
continue;
} else {
dc_init(sc);
-   return;
+  return(cnt);
}
}
 
@@ -2379,6 +2380,7 @@
/* Remove header from mbuf and pass it on. */
m_adj(m, sizeof(struct ether_header));
ether_input(ifp, eh, m);
+  cnt++;
}
 
sc->dc_cdata.dc_rx_prod = i;
@@ -2389,12 +2391,13 @@
  * the list buffers.
  */
 
-static void dc_txeof(sc)
+static int dc_txeof(sc)
struct dc_softc *sc;
 {
struct dc_desc  *cur_tx = NULL;
struct ifnet*ifp;
int idx;
+  int cnt = 0;
 
ifp = &sc->arpcom.ac_if;
 
@@ -2452,7 +2455,7 @@
ifp->if_collisions++;
if (!(txstat & DC_TXSTAT_UNDERRUN)) {
dc_init(sc);
-   return;
+  return(cnt);
}
}
 
@@ -2466,13 +2469,14 @@
 
sc->dc_cdata.dc_tx_cnt--;
DC_INC(idx, DC_TX_LIST_CNT);
+  cnt++;
}
 
sc->dc_cdata.dc_tx_cons = idx;
if (cur_tx != NULL)
ifp->if_flags &= ~IFF_OACTIVE;
 
-   return;
+  return(cnt);
 }
 
 static void dc_tick(xsc)
@@ -2612,6 +2616,7 @@
struct dc_softc *sc