Re: Allocate a page at interrupt time

2001-08-10 Thread Terry Lambert

Mike Smith wrote:
 The basic problem here is that you have decided what interrupt threads
 are, and aren't interested in the fact that what FreeBSD calls interrupt
 threads are not the same thing, despite being told this countless times,
 and despite it being embodied in the code that's right under your nose.
 
 You believe that an interrupt results in a make-runnable event, and at
 some future time, the interrupt thread services the interrupt request.
 
 This is not the case, and never was.  The entire point of having
 interrupt threads is to allow interrupt handling routines to block in the
 case where the handler/driver design does not allow for nonblocking
 synchronisation between the top and bottom halves.

So enlighten me, since the code right under my nose often
does not run on my dual CPU system, and I like prose anyway,
preferably backed by data and repeatable research results.

What do interrupt threads buy you that isn't there in 4.x,
besides being one hammer among dozens that can hit the SMP
nail?

Why don't I want to run my interrupt to completion, and want
to use an interrupt thread to do the work instead?

On what context do they block?

Why is it not better to change the handler/driver design to
allow for nonblocking synchronization?


Personally, when I get an ACK from a SYN/ACK I sent in
response to a SYN, and the connection completes, I think
that running the stack at interrupt all the way up to
the point of putting the completed new socket connection
on the associated listening socket's accept list is the
correct thing to do; likewise anything else that would
result in a need for upper level processing, _at all_.
This lets me process everything I can, and drop everything
I can't, as early as possible, before I've invested a lot
of futile effort in processing that will come to naught.
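
In outline, the idea looks something like this (a minimal
userspace sketch of the early-demultiplex-and-drop approach;
every name in it is hypothetical, not FreeBSD code):

/*
 * Classify the packet at "interrupt time" and drop it immediately
 * if its destination cannot accept more work, before any protocol
 * processing is invested.  Illustrative only.
 */
#include <stdint.h>

#define SOCKQ_LIMIT 8

struct fake_socket {
        int queued;             /* packets awaiting the application */
};

/* Classify a packet to its destination socket (stub classifier). */
static struct fake_socket *
early_demux(const void *pkt, struct fake_socket *tbl, int n)
{
        /* real code would hash the protocol 4-tuple */
        return &tbl[(uintptr_t)pkt % (uintptr_t)n];
}

/* Interrupt-time receive path: shed load as early as possible. */
static int
rx_packet(const void *pkt, struct fake_socket *tbl, int n)
{
        struct fake_socket *so = early_demux(pkt, tbl, n);

        if (so->queued >= SOCKQ_LIMIT)
                return -1;      /* drop now; no futile work later */
        so->queued++;           /* full stack processing goes here */
        return 0;
}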

This is what LRP does.

This is what Van Jacobson's stack ([EMAIL PROTECTED])
does.

Why are you right, and Mohit Aron, Jeff Mogul, Peter
Druschel, and Van Jacobson, wrong?


 Most of the issues you raise regarding livelock can be
 mitigated with thoughtful driver design.  Eventually,
 however, the machine hits the wall, and something has to
 break.  You can't avoid this, no matter how you try; the
 goal is to put it off as long as possible.
 
 So.  Now you've been told again.

Tell me why it has to break, instead of me disabling receipt
of the packets by the card in order to shed load before it
becomes an issue for the host machine's bus, interrupt
processing system, etc.?
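
Concretely, I mean something like this (a hedged sketch; the
hardware hooks are invented stand-ins, not any real driver's
API):

#define RX_HIWAT 512            /* stop receiving above this */
#define RX_LOWAT 128            /* resume below this */

static unsigned rx_backlog;     /* queued, not yet consumed */
static int rx_enabled = 1;

static void nic_set_rx(int on) { (void)on; }    /* hypothetical */

static void
rx_isr(void)
{
        /* ... drain the descriptor ring into the backlog ... */
        if (++rx_backlog >= RX_HIWAT && rx_enabled) {
                nic_set_rx(0);  /* shed load at the card itself */
                rx_enabled = 0;
        }
}

static void
rx_consume(void)                /* upper half took a packet */
{
        if (rx_backlog > 0 && --rx_backlog <= RX_LOWAT && !rx_enabled) {
                nic_set_rx(1);  /* safe to take traffic again */
                rx_enabled = 1;
        }
}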

Are you claiming that dropping packets that are physically
impossible to handle, as early as possible, while handling
_all_ packets that are physically possible to handle, is
broken, or is somehow impossible?

Thanks for any light you can shed on the subject,
-- Terry

PS: If you want to visit me at work, I'll show you code
running in a significantly modified FreeBSD 4.3 kernel.




Re: Allocate a page at interrupt time

2001-08-09 Thread Terry Lambert

Weiguang SHI wrote:
 
 I found an article on livelock at
 
 http://www.research.compaq.com/wrl/people/mogul/mogulpubsextern.html
 
 Just go there and search for livelock.
 
 But I don't agree with Terry about the interrupt-thread-is-bad
 thing, because, if I read it correctly, the authors themselves
 implemented their ideas in an interrupt thread of Digital Unix.

Not quite.  These days, we are not necessarily talking about
just interrupt load limitations.

Feel free to take the following with a grain of salt, but
realize: I have personally achieved more simultaneous connections
on a FreeBSD box than anyone else out there without my code in
hand, and this was using gigabit ethernet controllers on modern
hardware, and further, this code is in shipping product today.

--

The number one way of dealing with excess load is to load-shed
it before the load causes problems.

In an interrupt threads implementation, you can't really do
this, since the only option you have is when to schedule a
polling operation.  This leads to several inefficiencies,
all of which negatively impact the top end performance you
are going to be able to achieve.

Use of interrupt threads suffers from a drastically increased
latency in reenabling of interrupts, and can generally only
perform a single polling cycle, without running into the problem
of not making forward progress at the application level (they
run at IPL 0, which is effectively the same time at which NETISR
is currently run).  This leads to a tradeoff in increased
interrupt handling latency (e.g. the Tigon II Gigabit ethernet
driver in FreeBSD sets the Tigon II card firmware to coalesce at
most 32 interrupts), vs. the transmit starvation problem noted in
section 4.4 of the paper.
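
The shape of that tradeoff is easy to see with a little
arithmetic (illustrative numbers only; the Tigon's real knobs
live in its firmware):

#include <stdio.h>

int
main(void)
{
        double pps = 148809.0;  /* minimum-size frames at 100Mbit/s */
        int n;

        /* Coalescing n frames per interrupt divides the interrupt
           rate by n, but can delay a frame by up to n-1 frame
           times. */
        for (n = 1; n <= 32; n *= 2)
                printf("n=%2d: %8.0f intr/s, up to %5.1f us latency\n",
                    n, pps / n, (n - 1) * 1e6 / pps);
        return (0);
}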

It should also be noted that, even if you have not reenabled
interrupts, the DMA engine on the card will still be DMA'ing
data into your receiver ring buffer.  The burst data rate on
a 66MHz, 64 bit PCI bus is just over 4Gbits/S, and the sustainable
data rate is much lower than that.

This means a machine acting as a switch or firewall with two
of these cards on board will not really have much time for
doing anything at all, except DMA transfers, if they are run
at full burst speed all the time (not possible).  Running an
application which requires disk activity will further eat into
the available bandwidth.

So this raises the spectre of DMA-based bus transfer livelock:
not just interrupt based livelock, if one is scheduling interrupt
threads to do event polling, instead of using one of the other
approaches outlined in the paper.

In the DEC UNIX case, they mitigated the problem by getting rid
of the IP input queue, and getting rid of NETISR (I agree that
these are required of any code with these goals).  The use of
the polling thread is really just their way of implementing the
polling approach, from section 5.3.  This does not address the
problems I noted above, and in particular, does not address the
latency vs. bus livelock tradeoff problem with modern hardware
(they were using an AMD LANCE Ethernet chip; this was a 10Mb
chip, and it doesn't support interrupt coalescing).  They also
assumed the use of a user space forwarding agent (screend):
a single process.
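
Their polling thread amounts to something like the following (a
hedged sketch of the section 5.3 approach; the device hooks are
stubs, not Digital UNIX source):

#define POLL_QUOTA 32           /* bounded work per pass */

static int  device_has_work(void)    { return 0; }  /* stub */
static void device_process_one(void) { }            /* stub */
static void yield_cpu(void)          { }            /* stub */

static void
polling_thread(void)
{
        for (;;) {
                int budget = POLL_QUOTA;

                while (budget-- > 0 && device_has_work())
                        device_process_one();
                /* quota spent or queue empty: let others run */
                yield_cpu();
        }
}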

Further, I think that the feedback mechanism selected is not
really workable, without rewriting the card firmware, and
having a significant memory buffer on the card, something which
is not yet available on the market today.  This is because,
in practice, you can't stop all incoming packet processing just
because one user space program out of dozens has a full input
queue that it has not yet processed.  It's
not reasonable to ignore new incoming requests to a web server,
or to disable card interrupts, or to (for example) drop all ARP
packets until TCP processing for that one application is complete:
their basic assumption -- which they admit in section 6.6.1 --
is that screend is the only application running on the system.
This is simply not the case with a high traffic web server, a
database system, or any other work-to-do-engine model of several
processes (or threads) with identical capability to service the
incoming requests.

Further, these applications use TCP, and thus have explicitly
application bound socket endpoints, and there is no way to
guarantee client load.  We could trivially DOS attack an Apache
server running SSL via mod_proxy, for example, by sending a flood
of intentionally bad packets.  The computation expense would keep
its input queue full, and therefore, the feedback mechanism noted
would starve the other Apache processes of legitimate input.
There are other obvious attacks, which are no less damaging in
their results, which attack other points in the assumption of a
single process queue feedback mechanism.

Their scheduler in section 7, which is in effect identical to the
fixed scheduling class in SVR4 (which was used by USL to avoid
the "move mouse, wiggle cursor" problem when using the 

Re: Allocate a page at interrupt time

2001-08-09 Thread Terry Lambert

Greg Lehey wrote:
  Solaris hits the wall a little later, but it still hits the
  wall.
 
 Every SMP system experiences performance degradation at some point.
 The question is a matter of the extent.

IMO, 16 processors is not unreasonable, even with standard APIC
based SMP.  32 is out of the question, but that's mostly because
you can't have more than 32 APIC IDs in the current 32 bit
processors, and still give one or more away to an IO APIC.  8-).


  On Intel hardware, it has historically hit it at the same 4 CPUs
  where everyone else tends to hit it, for the same reasons;
 
 This is a very broad statement.  You contradict it further down.

I contradict it for SPARC; I don't think I contradicted it
for Intel, but am willing to take correction...


  Solaris claims to scale to 64 processors while maintaining SMP,
  rather than real or virtual NUMA.  It's been my own experience that
  this scaling claim is not entirely accurate, if what you are doing
  is a lot of kernel processing.
 
 I think that depends on how you interpret the claim.  It can only mean
 that adding a 64th processor can still be of benefit.

The 4 processor Intel claim is a point of diminishing
returns, and is well enough known that it has almost passed
into folklore (which might not bode well for finding people
building boards with more, which would be unfortunate).  My
SPARC experience likewise shows diminishing returns: it
becomes cheaper to buy another box to get the performance
increment than to stick more processors in the same box.
It's definitely anecdotal on my part.


  On the other hand, if you are running a lot of non-intersecting user
  space code (e.g. JVM's or CGI's), it's not as bad (and realized that
  FreeBSD is not that bad in the same situation, either: it's just not
  as common in practice as it is in theory).
 
 You're just describing a fact of life about UNIX SMP support.

Practice vs. Theory?  Or the inevitability of UNIX SMP support
having the performance characteristics it has most places?  I don't
buy the "we must live with the performance because it's UNIX"
argument, if you meant the latter.


  It should be noted that Solaris Interrupt threads are only
  used for interrupts of priority 10 and below: higher priority
  interrupts are _NOT_ handled by threads (interrupts at a
  priority level from 11 to 15).  10 is the clock interrupt.
 
 FreeBSD also has provision for not using interrupt threads for
 everything.  It's clearly too early to decide which interrupts should
 be left as traditional interrupts, and we've done some shifting back
 and forth to get things to work.  Note that the priority numbers are
 noise.  In this statement, they're just a convenient way to
 distinguish between threaded and non-threaded interrupts.

FreeBSD masks, Solaris IPLs.  In context, this was meant to
show why Solaris' approach is not directly translatable to
FreeBSD.

I really can't buy the idea that interrupt threads are a good
idea for anything that can flood your bus or interrupt bandwidth,
or have tiny/non-existent FIFOs, relative to the speeds they are
being pushed; right now that means it might be OK for disks; not OK
for really fast network controllers, not OK for sorta fast network
controllers without a lot of adapter RAM, not OK for serial ports
and floppies, at least in my mind.



 I think somebody else has pointed out that we're very conscious of CPU
 affinity.

I think affinity isn't enough; I've expressed this to Alfred on a
number of occasions already, when I see him in the hallway or at
lunch.  Dealing with the problem is kind of an all-or-nothing bid.


  In the 32 processor Sequent boxes, the actual system bus was
  different, and directly supported message passing.
 
 Was this better or worse?

For the intent, much better.  It meant that non-intersecting
CPU/peripheral paths could run simultaneously.  The Harris
H1000 and H1200 had a similar thing (big iron real time
systems used on Navy ships and at the college where Wes and
I went to school).


  Also, the Sun system is still an IPL system, using level based
  blocking, rather than masking, and these threads can find
 themselves blocked on a mutex or condition variable for a
  relatively long time; if this happens, it resumes the previous
  thread _but does not drop its IPL below that of the suspended
 thread_, which is basically the Dijkstra Banker's Algorithm
  method of avoiding priority inversion on interrupts (i.e. ugly).
 
 So you're saying we're doing it better?

Long term priority lending is the real problem I'm noting; this
is an artifact of context borrowing, more than anything else
(more below).

I think the FreeBSD use of masking is better than IPL'ing, and is
an obvious win in the case of multiple cards, since you can run
multiple interrupt handlers at the same time, but wonder what will
happen when/if it gets to UltraSPARC hardware.  I think the Dijkstra
algorithm, in which contended resources are prereserved based on an
anticipated need, 

Re: Allocate a page at interrupt time

2001-08-09 Thread Mike Smith

 I really can't buy the idea that interrupt threads are a good
 idea for anything that can flood your bus or interrupt bandwidth,
 or have tiny/non-existent FIFOs, relative to the speeds they are
 being pushed; right now that means it might be OK for disks; not OK
 for really fast network controllers, not OK for sorta fast network
 controllers without a lot of adapter RAM, not OK for serial ports
 and floppies, at least in my mind.

The basic problem here is that you have decided what interrupt threads 
are, and aren't interested in the fact that what FreeBSD calls interrupt 
threads are not the same thing, despite being told this countless times, 
and despite it being embodied in the code that's right under your nose.

You believe that an interrupt results in a make-runnable event, and at 
some future time, the interrupt thread services the interrupt request.

This is not the case, and never was.  The entire point of having 
interrupt threads is to allow interrupt handling routines to block in the 
case where the handler/driver design does not allow for nonblocking 
synchronisation between the top and bottom halves.  Most of the issues 
you raise regarding livelock can be mitigated with thoughtful driver 
design.  Eventually, however, the machine hits the wall, and something 
has to break.  You can't avoid this, no matter how you try; the goal is 
to put it off as long as possible.

So.  Now you've been told again.


-- 
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also.  But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view.  [Dr. Fritz Todt]
   V I C T O R Y   N O T   V E N G E A N C E






Re: Allocate a page at interrupt time

2001-08-08 Thread Terry Lambert

void wrote:
  Can you name one SMP OS implementation that uses an
  interrupt threads approach that doesn't hit a scaling
  wall at 4 (or fewer) CPUs, due to heavier weight thread
  context switch overhead?
 
 Solaris, if I remember my Vahalia book correctly (isn't that a favorite
 of yours?).

As usual, IMO...

Yes, I like the Vahalia book; I did technical review of
it for Prentice Hall before its publication.

Solaris hits the wall a little later, but it still hits the
wall.  On Intel hardware, it has historically hit it at the
same 4 CPUs where everyone else tends to hit it, for the same
reasons; as of Solaris 2.6, they have adopted the hybrid per
CPU pool model recommended in Vahalia (Chapter 12).

While I'm at it, I suppose I should recommend reading the
definitive Solaris internals book, to date:

Solaris Internals, Core Kernel Architecture
Jim Mauro, Richard McDougall
Prentice Hall
ISBN: 0-13-022496-0

Solaris does use interrupt threads for some interrupts; I
don't like the idea, for the reasons stated previously.

Solaris claims to scale to 64 processors while maintaining
SMP, rather than real or virtual NUMA.  It's been my own
experience that this scaling claim is not entirely accurate,
if what you are doing is a lot of kernel processing.  On the
other hand, if you are running a lot of non-intersecting
user space code (e.g. JVM's or CGI's), it's not as bad (and
realized that FreeBSD is not that bad in the same situation,
either: it's just not as common in practice as it is in
theory).

It should be noted that Solaris Interrupt threads are only
used for interrupts of priority 10 and below: higher priority
interrupts are _NOT_ handled by threads (interrupts at a
priority level from 11 to 15).  10 is the clock interrupt.

It should also be noted that Solaris maintains a per processor
pool of interrupt threads for each of the lower priority
interrupts, with a global thread that is used for handling of
the clock interrupt.  This is _very_ different than taking an
interrupt thread, and rescheduling it on an arbitrary CPU,
and as others have pointed out, the hardware used to do the
scheduling is very different.
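
Structurally, the arrangement described is something like this
(an illustration only, not Solaris source):

#define NCPU    4
#define NLOWPRI 10                      /* levels 1..10 use threads */

struct ithread { int busy; };

struct cpu_intr_pool {
        struct ithread level[NLOWPRI];  /* pinned to this CPU */
};

static struct cpu_intr_pool pool[NCPU]; /* per-processor pools */
static struct ithread clock_thread;     /* single, global */

static struct ithread *
pick_ithread(int cpu, int pri)
{
        if (pri == 10)
                return &clock_thread;   /* the clock's global thread */
        return &pool[cpu].level[pri - 1];  /* local; never migrates */
}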

In the 32 processor Sequent boxes, the actual system bus was
different, and directly supported message passing.

There is also specific hardware support for handling interrupts
via threads, which is really not applicable to x86 or even the
Alpha architectures on which FreeBSD currently runs, nor to the
IA64 architecture (port in progress).  In particular, there is
a single system wide table, introduced with the UltraSPARC, that
doesn't need to be locked to support interrupt handling.

Also, the Sun system is still an IPL system, using level based
blocking, rather than masking, and these threads can find
themselves blocked on a mutex or condition variable for a
relatively long time; if this happens, it resumes the previous
thread _but does not drop its IPL below that of the suspended
thread_, which is basically the Dijkstra Banker's Algorithm
method of avoiding priority inversion on interrupts (i.e. ugly).

Finally, the Sun system borrows the context of the interrupted
process (thread) for interrupt handling (the LWP).  This is very
similar to the technique employed with kernel vs. user space
thread associations within the Windows kernels (this was one of
the steps I was referring to when I said that NT had dealt with
a number of scaling issues before it needed to, so that they
would not turn into problems on 8-way and higher systems).

Personally, I think that the Sun system is extremely susceptible
to receiver livelock (Network interrupts are at 7, and disk
interrupts are at 5, which means that so long as you are getting
pounded with network interrupts for e.g. NFS read or write
requests, you're not going to service the disk interrupts that
will let you dispose of the traffic, nor will you run the user
space code for things like CGI's or Apache servers trying to
service a heavy load of requests for content).

I'm also not terrifically impressed with their callout mechanism,
when applied to networking, which has a preponderance of fixed,
known interval timers, but FreeBSD's isn't really any better,
when it comes to huge numbers of network connections, since it
will end up hashing 2/4/6/8/... into the same bucket, unordered,
which means traversing a large list of timers which are not
going to end up expiring (callout wheels are not a good thing to
mix with fixed interval timers of relatively long durations,
like the 2MSL timers that live in the networking code, or most
especially the TIME_WAIT timers).
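
A toy model makes the complaint concrete (illustrative numbers,
not FreeBSD's actual callout implementation):

#include <stdio.h>

#define WHEEL_SIZE 512          /* one bucket per tick slot */
#define HZ         100
#define TIMER_LEN  (60 * HZ)    /* a fixed 60 second 2MSL-ish timer */

int
main(void)
{
        int chain[WHEEL_SIZE] = { 0 };
        int t;

        /* one connection arms the same fixed-interval timer per tick */
        for (t = 0; t < TIMER_LEN; t++)
                chain[(t + TIMER_LEN) % WHEEL_SIZE]++;

        /* each tick now walks a chain this long to fire one timer;
           the rest are whole wheel laps away from expiring */
        printf("timers per bucket: %d, due per lap: 1\n", chain[0]);
        return (0);
}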

-- Terry




Re: Allocate a page at interrupt time

2001-08-08 Thread Terry Lambert

Mike Smith wrote:
 Terry; all this thinking you're doing is *really*bad*.
 
 I appreciate that you believe you're trying to educate us somehow. But
 what you're really doing right now is filling our list archives with
 convincing-sounding crap.  People that are curious about this issue are
 likely to turn up your postings, and get *really* confused.
 
 Please.  Just stop, ok?

Mike, I know you are convinced you know everything; but of all
the people who have worked professionally on SMP systems,
FreeBSD has only one person I'm aware of in a design position
for the SMP project, plus a lot of students who think they know
what they are doing even though they can't cite the literature.
So please...

Read the email threads all the way through before commenting
on my postings; the IPI issue is real for TLB shootdown, as
was pointed out by others; it was quite late, and it's very
understandable, given that I have aphasic dyslexia, that I
substituted the wrong word.

Rather than correcting things, as others have done, you have
insisted that no issue exists.

Effectively calling me an idiot in a public forum doesn't
help your credibility, and you're doing more damage by
denying that there is any issue whatsoever to be concerned
about, and being pedantic about precise word usage, instead
of addressing the issues and correcting my unintentional
spoonerisms out of concern for the archives.


Also please read the white paper reference I gave you about
receiver livelock: interrupt threads were, and are, a bad
idea, particularly on stock Intel SMP hardware -- so Solaris
using that approach doesn't justify it any more than antique
versions of IRIX using that approach do.

If you don't want to believe me, then believe Jeff Mogul;
but don't pretend that, simply because I chose the wrong word,
there is no issue to consider.

Thanks,
-- Terry




Re: Allocate a page at interrupt time

2001-08-08 Thread Weiguang SHI

I found an article on livelock at

http://www.research.compaq.com/wrl/people/mogul/mogulpubsextern.html

Just go there and search for livelock.

But I don't agree with Terry about the interrupt-thread-is-bad
thing, because, if I read it correctly, the authors themselves
implemented their ideas in an interrupt thread of Digital Unix.

Weiguang

From: Greg Lehey [EMAIL PROTECTED]
To: Terry Lambert [EMAIL PROTECTED]
CC: Bosko Milekic [EMAIL PROTECTED], Matt Dillon 
[EMAIL PROTECTED], Zhihui Zhang [EMAIL PROTECTED], 
[EMAIL PROTECTED]
Subject: Re: Allocate a page at interrupt time
Date: Wed, 8 Aug 2001 13:34:14 +0930

On Tuesday,  7 August 2001 at  1:58:21 -0700, Terry Lambert wrote:
  Bosko Milekic wrote:
  I keep wondering about the sagacity of running interrupts in
  threads... it still seems like an incredibly bad idea to me.
 
  I guess my major problem with this is that by running in
  threads, it's made it nearly impossible to avoid receiver
  livelock situations, using any of the classical techniques
  (e.g. Mogul's work, etc.).
 
  References to published works?
 
  Just do an NCSTRL search on "receiver livelock"; you will get
  over 90 papers...
 
  http://ncstrl.mit.edu/
 
  See also the list of participating institutions:
 
  http://ncstrl.mit.edu/Dienst/UI/2.0/ListPublishers
 
  It won't be that hard to find... Mogul has only published 92
  papers.  8-)

So much data, in fact, that you could hide anything behind it.  Would
you like to be more specific?

  It also has the unfortunate property of locking us into virtual
  wire mode, when in fact Microsoft demonstrated that wiring down
  interrupts to particular CPUs was good practice, in terms of
  assuring best performance.  Specifically, running in virtual
 
  Can you point us at any concrete information that shows
  this?  Specifically, without being Microsoft biased (as is most
  data published by Microsoft)? -- i.e. preferably third-party
  performance testing that attributes wiring down of interrupts to
  particular CPUs as _the_ performance advantage.
 
  FreeBSD was tested, along with Linux and NT, by Ziff Davis
  Labs, in Foster City, with the participation of Jordan
  Hubbard and Mike Smith.  You can ask either of them for the
  results of the test; only the Linux and NT numbers were
  actually released.  This was done to provide a non-biased
  baseline, in reaction to the Mindcraft benchmarks, where
  Linux showed so poorly.  They ran quad ethernet cards, with
  quad CPUs; the NT drivers wired the cards down to separate
  INT A/B/C/D interrupts, one per CPU.

You carefully neglect to point out that this was the old SMP
implementation.  I think this completely invalidates any point you may
have been trying to make.

  wire mode means that all your CPUs get hit with the interrupt,
  whereas running with the interrupt bound to a particular CPU
  reduces the overall overhead.  Even what we have today, with
 
  Obviously.
 
  I mention it because this is the direction FreeBSD appears
  to be moving in.  Right now, Intel is shipping with separate
  PCI busses; there is one motherboard from their ServerWorks
  division that has 16 separate PCI busses -- which means that
  you can do simultaneous gigabit card DMA to and from memory,
  without running into bus contention, so long as the memory is
  logically seperate.  NT can use this hardware to its full
  potential; FreeBSD as it exists, can not, and FreeBSD as it
  appears to be heading today (interrupt threads, etc.) seems
  to be in the same boat as Linux, et al.  PCI-X will only
  make things worse (8.4 gigabit, burst rate).

What do interrupt threads have to do with this?

Terry, we've done a lot of thinking about performance implications
over the last 2 years, including addressing all of the points that you
appear to raise.  A lot of it is in the archives.

It's quite possible that we've missed something important that you
haven't.  But if that's the case, we'd like you to state it.  All I
see is you coming in, waving your hands and shouting generalities
which don't really help much.  The fact that people are still
listening is very much an indication of the hope that you might come
up with something useful.  But pointing to 92 papers and saying it's
in there [somewhere] isn't very helpful.

Greg
--
See complete headers for address and phone numbers



_
Get your FREE download of MSN Explorer at http://explorer.msn.com/intl.asp





Re: Allocate a page at interrupt time

2001-08-08 Thread Greg Lehey

On Wednesday,  8 August 2001 at  0:27:23 -0700, Terry Lambert wrote:
 void wrote:
 Can you name one SMP OS implementation that uses an
 interrupt threads approach that doesn't hit a scaling
 wall at 4 (or fewer) CPUs, due to heavier weight thread
 context switch overhead?

 Solaris, if I remember my Vahalia book correctly (isn't that a favorite
 of yours?).

 As usual, IMO...

 Yes, I like the Vahalia book; I did technical review of
 it for Prentice Hall before its publication.

 Solaris hits the wall a little later, but it still hits the
 wall.

Every SMP system experiences performance degradation at some point.
The question is a matter of the extent.

 On Intel hardware, it has historically hit it at the same 4 CPUs
 where everyone else tends to hit it, for the same reasons; 

This is a very broad statement.  You contradict it further down.

 as of Solaris 2.6, they have adopted the hybrid per CPU pool model
 recommended in Vahalia (Chapter 12).

 While I'm at it, I suppose I should recommend reading the
 definitive Solaris internals book, to date:

   Solaris Internals, Core Kernel Architecture
   Jim Mauro, Richard McDougall
   Prentice Hall
   ISBN: 0-13-022496-0

Yes, I have this book.  It looks very good, but I haven't found time
to read it.

 Solaris claims to scale to 64 processors while maintaining SMP,
 rather than real or virtual NUMA.  It's been my own experience that
 this scaling claim is not entirely accurate, if what you are doing
 is a lot of kernel processing.

I think that depends on how you interpret the claim.  It can only mean
that adding a 64th processor can still be of benefit.

 On the other hand, if you are running a lot of non-intersecting user
 space code (e.g. JVM's or CGI's), it's not as bad (and realized that
 FreeBSD is not that bad in the same situation, either: it's just not
 as common in practice as it is in theory).

You're just describing a fact of life about UNIX SMP support.

 It should be noted that Solaris Interrupt threads are only
 used for interrupts of priority 10 and below: higher priority
 interrupts are _NOT_ handled by threads (interrupts at a
 priority level from 11 to 15).  10 is the clock interrupt.

FreeBSD also has provision for not using interrupt threads for
everything.  It's clearly too early to decide which interrupts should
be left as traditional interrupts, and we've done some shifting back
and forth to get things to work.  Note that the priority numbers are
noise.  In this statement, they're just a convenient way to
distinguish between threaded and non-threaded interrupts.

 It should also be noted that Solaris maintains a per processor pool
 of interrupt threads for each of the lower priority interrupts, with
 a global thread that is used for handling of the clock interrupt.
 This is _very_ different than taking an interrupt thread, and
 rescheduling it on an arbitrary CPU, and as others have pointed out,
 the hardware used to do the scheduling is very different.

I think somebody else has pointed out that we're very conscious of CPU
affinity.

 In the 32 processor Sequent boxes, the actual system bus was
 different, and directly supported message passing.

Was this better or worse?

 There is also specific hardware support for handling interrupts
 via threads, which is really not applicable to x86 or even the
 Alpha architectures on which FreeBSD currently runs, nor to the
 IA64 architecture (port in progress).  In particular, there is
 a single system wide table, introduced with the UltraSPARC, that
 doesn't need to be locked to support interrupt handling.

 Also, the Sun system is still an IPL system, using level based
 blocking, rather than masking, and these threads can find
 themselves blocked on a mutex or condition variable for a
 relatively long time; if this happens, it resumes the previous
 thread _but does not drop its IPL below that of the suspended
 thread_, which is basically the Dijkstra Banker's Algorithm
 method of avoiding priority inversion on interrupts (i.e. ugly).

So you're saying we're doing it better?

 Finally, the Sun system borrows the context of the interrupted
 process (thread) for interrupt handling (the LWP).  This is very
 similar to the technique employed with kernel vs. user space thread
 associations within the Windows kernels (this was one of the steps I
 was referring to when I said that NT had dealt with a number of
 scaling issues before it needed to, so that they would not turn into
 problems on 8-way and higher systems).

This is also the method we're planning to use, as I'm sure you're
aware from previous messages on the -smp list.

 Personally, I think that the Sun system is extremely susceptible to
 receiver livelock (Network interrupts are at 7, and disk interrupts
 are at 5, which means that so long as you are getting pounded with
 network interrupts for e.g. NFS read or write requests, you're not
 going to service the disk interrupts that will let you dispose of
 the traffic, nor 

Re: Allocate a page at interrupt time

2001-08-07 Thread Terry Lambert

Matt Dillon wrote:
 Yes, that is precisely the reason.  In -current this all changes, though,
 since interrupts are now threads.  *But*, that said, interrupts cannot
 really afford to hold mutexes that might end up blocking them for
 long periods of time so I would still recommend that interrupt code not
 attempt to allocate pages out of PQ_CACHE.

I keep wondering about the sagacity of running interrupts in
threads... it still seems like an incredibly bad idea to me.

I guess my major problem with this is that by running in
threads, it's made it nearly impossible to avoid receiver
livelock situations, using any of the classical techniques
(e.g. Mogul's work, etc.).

It also has the unfortunate property of locking us into virtual
wire mode, when in fact Microsoft demonstrated that wiring down
interrupts to particular CPUs was good practice, in terms of
assuring best performance.  Specifically, running in virtual
wire mode means that all your CPUs get hit with the interrupt,
whereas running with the interrupt bound to a particular CPU
reduces the overall overhead.  Even what we have today, with
the big giant lock and redirecting interrupts to the CPU in
the kernel is better than that...

-- Terry




Re: Allocate a page at interrupt time

2001-08-07 Thread Bosko Milekic


On Mon, Aug 06, 2001 at 11:27:56PM -0700, Terry Lambert wrote:
 I keep wondering about the sagacity of running interrupts in
 threads... it still seems like an incredibly bad idea to me.
 
 I guess my major problem with this is that by running in
 threads, it's made it nearly impossible to avoid receiver
 livelock situations, using any of the classical techniques
 (e.g. Mogul's work, etc.).

References to published works?
 
 It also has the unfortunate property of locking us into virtual
 wire mode, when in fact Microsoft demonstrated that wiring down
 interrupts to particular CPUs was good practice, in terms of
 assuring best performance.  Specifically, running in virtual

Can you point us at any concrete information that shows this?
Specifically, without being Microsoft biased (as is most data published by
Microsoft)? -- i.e. preferably third-party performance testing that attributes
wiring down of interrupts to particular CPUs as _the_ performance advantage.

 wire mode means that all your CPUs get hit with the interrupt,
 whereas running with the interrupt bound to a particular CPU
 reduces the overall overhead.  Even what we have today, with

Obviously.

 the big giant lock and redirecting interrupts to the CPU in
 the kernel is better than that...
 
 -- Terry

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: Allocate a page at interrupt time

2001-08-07 Thread Alfred Perlstein

* Bosko Milekic [EMAIL PROTECTED] [010807 02:16] wrote:
 
 On Mon, Aug 06, 2001 at 11:27:56PM -0700, Terry Lambert wrote:
  I keep wondering about the sagacity of running interrupts in
  threads... it still seems like an incredibly bad idea to me.
  
  I guess my major problem with this is that by running in
  threads, it's made it nearly impossible to avoid receiver
  livelock situations, using any of the classical techniques
  (e.g. Mogul's work, etc.).
 
   References to published works?
  
  It also has the unfortunate property of locking us into virtual
  wire mode, when in fact Microsoft demonstrated that wiring down
  interrupts to particular CPUs was good practice, in terms of
  assuring best performance.  Specifically, running in virtual
 
   Can you point us at any concrete information that shows this?
 Specifically, without being Microsoft biased (as is most data published by
 Microsoft)? -- i.e. preferably third-party performance testing that attributes
 wiring down of interrupts to particular CPUs as _the_ performance advantage.
 
  wire mode means that all your CPUs get hit with the interrupt,
  whereas running with the interrupt bound to a particular CPU
  reduces the overall overhead.  Even what we have today, with
 
   Obviously.
 
  the big giant lock and redirecting interrupts to the CPU in
  the kernel is better than that...

I really don't see what part of the current design specifically
disallows one to both:

1) force interrupts to be taken on a particular cpu.
2) if that thread gets switched out, have it put on a per-cpu
   runqueue when it becomes runnable, preventing another cpu from
   snatching it up.

I've already implemented #2, #1 requires touching hardware
which isn't something I like doing. :)
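
For what it's worth, (2) reduces to something like this (a
hedged sketch; the structures are illustrative, not the real
scheduler's):

#define NCPU 4

struct thread {
        int bound_cpu;                  /* -1 if free to migrate */
        struct thread *next;
};

static struct thread *cpu_runq[NCPU];   /* per-cpu queues */
static struct thread *global_runq;      /* anyone may dequeue */

static void
setrunnable(struct thread *td)
{
        if (td->bound_cpu >= 0) {
                /* only the bound CPU's scheduler looks here, so
                   no other cpu can snatch the thread up */
                td->next = cpu_runq[td->bound_cpu];
                cpu_runq[td->bound_cpu] = td;
        } else {
                td->next = global_runq;
                global_runq = td;
        }
}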

-- 
-Alfred Perlstein [[EMAIL PROTECTED]]
Ok, who wrote this damn function called '??'?
And why do my programs keep crashing in it?




Re: Allocate a page at interrupt time

2001-08-07 Thread Mike Smith

 It also has the unfortunate property of locking us into virtual
 wire mode, when in fact Microsoft demonstrated that wiring down
 interrupts to particular CPUs was good practice, in terms of
 assuring best performance.  Specifically, running in virtual
 wire mode means that all your CPUs get hit with the interrupt,
 whereas running with the interrupt bound to a particular CPU
 reduces the overall overhead.  Even what we have today, with
 the big giant lock and redirecting interrupts to the CPU in
 the kernel is better than that...

Terry, this is *total* garbage.

Just so you know, ok?

-- 
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also.  But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view.  [Dr. Fritz Todt]
   V I C T O R Y   N O T   V E N G E A N C E






Re: Allocate a page at interrupt time

2001-08-07 Thread Terry Lambert

Bosko Milekic wrote:
  I keep wondering about the sagacity of running interrupts in
  threads... it still seems like an incredibly bad idea to me.
 
  I guess my major problem with this is that by running in
  threads, it's made it nearly impossible to avoid receiver
  livelock situations, using any of the classical techniques
  (e.g. Mogul's work, etc.).
 
 References to published works?

Just do an NCSTRL search on "receiver livelock"; you will get
over 90 papers...

http://ncstrl.mit.edu/

See also the list of participating institutions:

http://ncstrl.mit.edu/Dienst/UI/2.0/ListPublishers

It won't be that hard to find... Mogul has only published 92
papers.  8-)


  It also has the unfortunate property of locking us into virtual
  wire mode, when in fact Microsoft demonstrated that wiring down
  interrupts to particular CPUs was good practice, in terms of
  assuring best performance.  Specifically, running in virtual
 
 Can you point us at any concrete information that shows
 this?  Specifically, without being Microsoft biased (as is most
 data published by Microsoft)? -- i.e. preferably third-party
 performance testing that attributes wiring down of interrupts to
 particular CPUs as _the_ performance advantage.

FreeBSD was tested, along with Linux and NT, by Ziff Davis
Labs, in Foster City, with the participation of Jordan
Hubbard and Mike Smith.  You can ask either of them for the
results of the test; only the Linux and NT numbers were
actually released.  This was done to provide a non-biased
baseline, in reaction to the Mindcraft benchmarks, where
Linux showed so poorly.  They ran quad ethernet cards, with
quad CPUs; the NT drivers wired the cards down to separate
INT A/B/C/D interrupts, one per CPU.


  wire mode means that all your CPUs get hit with the interrupt,
  whereas running with the interrupt bound to a particular CPU
  reduces the overall overhead.  Even what we have today, with
 
 Obviously.

I mention it because this is the direction FreeBSD appears
to be moving in.  Right now, Intel is shipping with separate
PCI busses; there is one motherboard from their ServerWorks
division that has 16 separate PCI busses -- which means that
you can do simultaneous gigabit card DMA to and from memory,
without running into bus contention, so long as the memory is
logically seperate.  NT can use this hardware to its full
potential; FreeBSD as it exists, can not, and FreeBSD as it
appears to be heading today (interrupt threads, etc.) seems
to be in the same boat as Linux, et al.  PCI-X will only
make things worse (8.4 gigabit, burst rate).

-- Terry




Re: Allocate a page at interrupt time

2001-08-07 Thread Terry Lambert

Mike Smith wrote:
 
  It also has the unfortunate property of locking us into virtual
  wire mode, when in fact Microsoft demonstrated that wiring down
  interrupts to particular CPUs was good practice, in terms of
  assuring best performance.  Specifically, running in virtual
  wire mode means that all your CPUs get hit with the interrupt,
  whereas running with the interrupt bound to a particular CPU
  reduces the overall overhead.  Even what we have today, with
  the big giant lock and redirecting interrupts to the CPU in
  the kernel is better than that...
 
 Terry, this is *total* garbage.
 
 Just so you know, ok?

What this, exactly?

That virtual wire mode is actually a bad idea for some
applications -- specifically, high speed networking with
multiple gigabit ethernet cards?

That Microsoft demonstrated that wiring down interrupts
to a particular CPU was a good idea, and kicked both Linux'
and FreeBSD's butt in the test at ZD Labs?

That taking interrupts on a single directed CPU is better
than taking an IPI on all your CPUs, and then sorting out
who's going to handle the interrupt?

Can you name one SMP OS implementation that uses an
interrupt threads approach that doesn't hit a scaling
wall at 4 (or fewer) CPUs, due to heavier weight thread
context switch overhead?

Can you tell me how, in the context of having an interrupt
thread doing scheduled processing, how you could avoid an
interrupt overhead livelock, where the thread doesn't get
opportunity to run because you're too busy taking interrupts
to be able to get any work done?

FWIW, I would be happy to cite sources to you, off the
general list.

-- Terry




Re: Allocate a page at interrupt time

2001-08-07 Thread Matt Dillon


:  It also has the unfortunate property of locking us into virtual
:  wire mode, when in fact Microsoft demonstrated that wiring down
:  interrupts to particular CPUs was good practice, in terms of
:  assuring best performance.  Specifically, running in virtual
:  wire mode means that all your CPUs get hit with the interrupt,
:  whereas running with the interrupt bound to a particular CPU
:  reduces the overall overhead.  Even what we have today, with
:  the big giant lock and redirecting interrupts to the CPU in
:  the kernel is better than that...
: 
: Terry, this is *total* garbage.
: 
: Just so you know, ok?
:
:What this, exactly?
:
:That virtual wire mode is actually a bad idea for some
:applications -- specifically, high speed networking with
:multiple gigabit ethernet cards?

All the CPUs don't get the interrupt, only one does.

:That Microsoft demonstrated that wiring down interrupts
:to a particular CPU was a good idea, and kicked both Linux'
:and FreeBSD's butt in the test at ZD Labs?

Well, if you happen to have four NICs and four CPUs, and
you are running them all full bore, I would say that
wiring the NICs to the CPUs would be a good idea.  That
seems like a rather specialized situation, though.

-Matt

:That taking interrupts on a single directed CPU is better
:than taking an IPI on all your CPUs, and then sorting out
:who's going to handle the interrupt?
:...
:
:-- Terry







Re: Allocate a page at interrupt time

2001-08-07 Thread Garance A Drosihn

At 12:39 AM -0700 8/7/01, Mike Smith wrote:
   It also has the unfortunate property of locking us into virtual
  wire mode, when in fact Microsoft demonstrated that wiring down
  interrupts to particular CPUs was good practice, in terms of
  assuring best performance.  Specifically, running in virtual
  wire mode means that all your CPUs get hit with the interrupt,
  whereas running with the interrupt bound to a particular CPU
  reduces the overall overhead.  Even what we have today, with
  the big giant lock and redirecting interrupts to the CPU in
  the kernel is better than that...

Terry, this is *total* garbage.

Just so you know, ok?

There are people on this list besides Terry.  Terry has taken
the time to refer to a few URL's, and remind us of a benchmark
that I (for one) do remember, and I do remember Windows doing
quite well on it.  Maybe that benchmark was bogus for some
reason, but I seem to remember several freebsd developers taking
it seriously at the time.

So, could you at least fill in what part of the above is total
garbage?  Throw in a few insults to Terry if it makes you feel
better for some reason, but raise the level of information
content a little for the rest of us?  You quoted several
distinct comments of Terry's -- were all of them garbage?

It might very well be that all of Terry's comments were in fact
garbage, but from the sidelines I'd appreciate a little more
in the way of technical details.

-- 
Garance Alistair Drosehn=   [EMAIL PROTECTED]
Senior Systems Programmer   or  [EMAIL PROTECTED]
Rensselaer Polytechnic Instituteor  [EMAIL PROTECTED]




Re: Allocate a page at interrupt time

2001-08-07 Thread Garance A Drosihn

At 9:55 AM -0700 8/7/01, Matt Dillon wrote:
:  It also has the unfortunate property of locking us into virtual
:  wire mode, when in fact Microsoft demonstrated that wiring down
:  interrupts to particular CPUs was good practice, in terms of
:  assuring best performance. [...]
:
: Terry, this is *total* garbage.
:
: Just so you know, ok?
:
:What this, exactly?
:
:That virtual wire mode is actually a bad idea for some
:applications -- specifically, high speed networking with
:multiple gigabit ethernet cards?

 All the CPUs don't get the interrupt, only one does.

:That Microsoft demonstrated that wiring down interrupts
:to a particular CPU was a good idea, and kicked both Linux'
:and FreeBSD's butt in the test at ZD Labs?

 Well, if you happen to have four NICs and four CPUs, and
 you are running them all full bore, I would say that
 wiring the NICs to the CPUs would be a good idea.  That
 seems like a rather specialized situation, though.

Okay, that's helpful to sort out the discussion.

I'd agree that is a specialized situation, one which wouldn't
be critical to many freebsd users.  Is Terry right that the
current strategy will lock us into virtual wire mode, in
some way which means that this specialized situation CANNOT
be handled?

(it would be fine if it were handled via some specialized
kernel option, imo.  I'm just wondering what the limitations
are.  I do not mean to imply we should follow some different
strategy here, I'm just wondering...)

-- 
Garance Alistair Drosehn=   [EMAIL PROTECTED]
Senior Systems Programmer   or  [EMAIL PROTECTED]
Rensselaer Polytechnic Instituteor  [EMAIL PROTECTED]




Re: Allocate a page at interrupt time

2001-08-07 Thread Matt Dillon

:I'd agree that is a specialized situation, one which wouldn't
:be critical to many freebsd users.  Is Terry right that the
:current strategy will lock us into virtual wire mode, in
:some way which means that this specialized situation CANNOT
:be handled?
:
:(it would be fine if it were handled via some specialized
:kernel option, imo.  I'm just wondering what the limitations
:are.  I do not mean to imply we should follow some different
:strategy here, I'm just wondering...)
:
:-- 
:Garance Alistair Drosehn=   [EMAIL PROTECTED]

In -current there is nothing preventing us from wiring
interrupt *threads* to cpus.  Wiring the actual interrupts
themselves might or might not yield a performance improvement
beyond that.

-Matt




Re: Allocate a page at interrupt time

2001-08-07 Thread Zach Brown

 That Microsoft demonstrated that wiring down interrupts
 to a particular CPU was a good idea, and kicked both Linux'
 and FreeBSD's butt in the test at ZD Labs?

No, Terry, this is not what was demonstrated by those tests.  Will this
myth never die?  Do Mike and I have to write up a nice white paper? :)

The environment was rigidly specified:  quad cpu box, four eepro 100mb
interfaces, and a _heavy_ load of short lived connections fetching static
cached content.  The test was clearly designed to stress concurrency in
the network stack, with heavy low latency interrupt load.  Neither Linux
nor FreeBSD could do this well at the time.  There was a service pack
issued a few months before the test that 'threaded' NT's stack.

It was not a mistake that the rules of the tests forbade doing the sane
thing and running on a system with a single very fast cpu, lots of mem,
and gigabit interface with an actual published interface for coalescing
interrupts.  That would have performed better and been cheaper.

That's what pisses me off about the tests to this day.  The problem people
are faced with is "how do I serve this static content reliably and
cheaply", not "what OS should I serve my content with, now that I've
bought this ridiculous machine?".  It's sad that people consistently
insist on drawing insane conclusions from these benchmark events.

-- 
 zach




Re: Allocate a page at interrupt time

2001-08-07 Thread Terry Lambert

Matt Dillon wrote:
 :What this, exactly?
 :
 :That virtual wire mode is actually a bad idea for some
 :applications -- specifically, high speed networking with
 :multiple gigabit ethernet cards?
 
 All the CPUs don't get the interrupt, only one does.

I think that you will end up taking an IPI (Inter Processor
Interrupt) to shoot down the cache line during an invalidate
cycle, when moving an interrupt processing thread from one
CPU to another.  For multiple high speed interfaces (disk or
network; doesn't matter), you will end up burning a *lot*
of time, without a lockdown.

You might be able to avoid this by doing some of the tricks
I've discussed with Alfred to ensure that there is no lock
contention in the non-migratory case for KSEs (or kernel
interrupt threads) to handle per CPU scheduling, but I
think that the interrupt masking will end up being very hard
to manage, and you will get the same effect as locking the
interrupt to a particular CPU... if you are lucky.

Any case which _did_ invoke a lock and resulted in contention
would require at least a barrier instruction; I guess you
could do it in a non-cacheable page to avoid the TLB
interaction, and another IPI for an update or invalidate
cycle for the lock, but then you are limited to memory speed,
which is getting down to around a factor of 10 (133MHz) slower
than CPU speed, these days, and that's actually one heck of a
stall hit to take.
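
One common way to sidestep the ping-pong, for the per-CPU data
at least, is padding (a hedged sketch, not a claim about what
-current does):

#include <stdint.h>

#define NCPU       4
#define CACHE_LINE 64

/* Pad each CPU's counters to a full cache line so an interrupt
   path updating its own slot never invalidates a line another
   CPU is holding (no false sharing). */
struct pcpu_stats {
        uint64_t packets;
        uint64_t drops;
        char     pad[CACHE_LINE - 2 * sizeof(uint64_t)];
};

static struct pcpu_stats stats[NCPU]
    __attribute__((aligned(CACHE_LINE)));

static void
count_packet(int cpu)
{
        stats[cpu].packets++;   /* stays in this CPU's cache */
}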


 :That Microsoft demonstrated that wiring down interrupts
 :to a particular CPU was a good idea, and kicked both Linux'
 :and FreeBSD's butt in the test at ZD Labs?
 
 Well, if you happen to have four NICs and four CPUs, and
 you are running them all full bore, I would say that
 wiring the NICs to the CPUs would be a good idea.  That
 seems like a rather specialized situation, though.

I don't think so.  These days, interrupt overhead can come
from many places, including intentional denial of service
attacks.  If you have an extra box around, I'd suggest that
you install QLinux, and benchmark it side by side against
FreeBSD, under an extreme load, and watch the FreeBSD system's
performance fall off when interrupt overhead becomes so high
that NETISR effectively never gets a chance to run.

I also suggest using 100Base-T cards, since the interrupt
coalescing on Gigabit cards could prevent you from observing
the livelock from interrupt overload, unless you could load
your machine to full wire speed (~950Mbits/S) so that your
PCI bus transfer rate becomes a barrier.

I know you were involved in some of the performance tuning
that was attempted immediately after the ZD Labs tests, so I
know you know this was a real issue; I think it still is.

-- Terry




Re: Allocate a page at interrupt time

2001-08-07 Thread Terry Lambert

Zach Brown wrote:
  That Microsoft demonstrated that wiring down interrupts
  to a particular CPU was a good idea, and kicked both Linux'
  and FreeBSD's butt in the test at ZD Labs?
 
 No, Terry, this is not what was demonstrated by those tests.  Will this
 myth never die?  Do Mike and I have to write up a nice white paper? :)

That would be nice, actually.

 
 The environment was rigidly specified:  quad cpu box, four eepro 100mb
 interfaces, and a _heavy_ load of short lived connections fetching static
 cached content.  The test was clearly designed to stress concurrency in
 the network stack, with heavy low latency interrupt load.  Neither Linux
 nor FreeBSD could do this well at the time.  There was a service pack
 issued a few months before the test that 'threaded' NT's stack.
 
 It was not a mistake that the rules of the tests forbade doing the sane
 thing and running on a system with a single very fast cpu, lots of mem,
 and gigabit interface with an actual published interface for coalescing
 interrupts.  That would have performed better and been cheaper.

I have soft interrupt coalescing changes for most FreeBSD
drivers written by Bill Paul; the operation is trivial, and Bill
has structured his drivers well for doing that sort of thing.
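
The operation really is trivial; in outline (a hedged sketch
with stub hardware accessors, not Bill Paul's actual changes):

static int  ring_not_empty(void) { return 0; }  /* stub */
static void process_ring(void)   { }            /* stub */
static void intr_mask(int on)    { (void)on; }  /* stub */

static void
soft_coalesce_isr(void)
{
        intr_mask(1);           /* keep the source quiet while we work */
        do {
                process_ring(); /* drain everything visible now */
        } while (ring_not_empty());  /* late arrivals ride along free */
        intr_mask(0);           /* one unmask per burst, not per packet */
}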

I personally don't think the test was unfair; it seems to me
to be representative of most web traffic, which averages 8k a
page for most static content, according to published studies.

 That's what pisses me off about the tests to this day.  The problem
 people are faced with is "how do I serve this static content
 reliably and cheaply", not "what OS should I serve my content
 with, now that I've bought this ridiculous machine?".

8-) 8-).


  It's sad that people consistently insist on drawing insane
 conclusions from these benchmark events.

I think that concurrency in the TCP stack is something that
needs to be addressed; I'm glad they ran the benchmark, if
only for that.

Even if we both agree on the conclusions, agreeing isn't
going to change people's perceptions, but beating them on
their terms _will_, so it's a worthwhile pursuit.

I happen to agree that their test indicated some shortcomings
in the OS designs; regardless of whether we think they were
carefully chosen to specifically emphasize those shortcomings,
it doesn't change the fact that they are shortcomings.

There's no use crying over spilt milk: the question is what
can be done about it, besides trying to deny the validity of
the tests.

-- Terry




Re: Allocate a page at interrupt time

2001-08-07 Thread Rik van Riel

On Tue, 7 Aug 2001, Terry Lambert wrote:
 Matt Dillon wrote:

  All the cpu's don't get the interrupt, only one does.

 I think that you will end up taking an IPI (Inter Processor
 Interrupt) to shoot down the cache line during an invalidate
 cycle, when moving an interrupt processing thread from one
 CPU to another.

You have quite an imagination today.  You may want to consider
reading one of the white papers you referred us to with so
much enthusiasm and trying again later ;)

  Well, if you happen to have four NICs and four CPUs, and
  you are running them all full bore, I would say that
  wiring the NICs to the CPUs would be a good idea.  That
  seems like a rather specialized situation, though.

 I don't think so.  These days, interrupt overhead can come
 from many places,

Exactly. You never know where your interrupts come from, so
wiring them in a fixed setup really isn't going to do you
much good in the generic case.

Now if you want to optimise your source code for something
like a Mindcraft benchmark ...

regards,

Rik
--
Executive summary of a recent Microsoft press release:
   we are concerned about the GNU General Public License (GPL)


http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com/





Re: Allocate a page at interrupt time

2001-08-07 Thread Matt Dillon


:
:Matt Dillon wrote:
: :What is this, exactly?
: :
: :That virtual wire mode is actually a bad idea for some
: :applications -- specifically, high speed networking with
: :multiple gigabit ethernet cards?
: 
: All the CPUs don't get the interrupt; only one does.
:
:I think that you will end up taking an IPI (Inter Processor
:Interrupt) to shoot down the cache line during an invalidate
:cycle, when moving an interrupt processing thread from one
:CPU to another.  For multiple high speed interfaces (disk or
:network; doesn't matter), you will end up burning a *lot*
:of time, without a lockdown.

Cache line invalidation does not require an IPI.  TLB
shootdowns require IPIs.  TLB shootdowns are unrelated to
interrupt threads; they only occur when shared MMU mappings
change.  Cache line invalidation can waste CPU cycles
when cache mastership changes occur between CPUs, i.e. when
threads are switched from one CPU to another.  I consider this
a serious problem in -current.
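
A self-contained illustration of that mastership cost (the 64-byte
line size and the counter layout are assumptions for the example;
compile with -pthread, then swap counters[] for shared[] to watch the
line bounce):

    #include <pthread.h>
    #include <stdint.h>
    #include <stdio.h>

    #define CACHE_LINE 64                   /* assumed line size */

    struct padded {
        volatile uint64_t n;                /* one counter per cache line */
        char pad[CACHE_LINE - sizeof(uint64_t)];
    };

    static struct padded counters[2];       /* private lines: no bouncing */
    static volatile uint64_t shared[2];     /* same line: mastership moves
                                               on every write; substitute
                                               shared[id] below to see it */

    static void *
    spin(void *arg)
    {
        int id = *(int *)arg;
        uint64_t i;

        for (i = 0; i < 100000000ULL; i++)
            counters[id].n++;
        return (NULL);
    }

    int
    main(void)
    {
        pthread_t t[2];
        int id[2] = { 0, 1 };
        int i;

        for (i = 0; i < 2; i++)
            pthread_create(&t[i], NULL, spin, &id[i]);
        for (i = 0; i < 2; i++)
            pthread_join(t[i], NULL);
        printf("%ju %ju\n", (uintmax_t)counters[0].n,
            (uintmax_t)counters[1].n);
        return (0);
    }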

-Matt





Re: Allocate a page at interrupt time

2001-08-07 Thread Bosko Milekic


On Tue, Aug 07, 2001 at 12:19:01PM -0700, Matt Dillon wrote:
 Cache line invalidation does not require an IPI.  TLB
 shootdowns require IPIs.  TLB shootdowns are unrelated to
 interrupt threads; they only occur when shared MMU mappings
 change.  Cache line invalidation can waste CPU cycles
 when cache mastership changes occur between CPUs, i.e. when
 threads are switched from one CPU to another.  I consider this
 a serious problem in -current.

I don't think it's fair to consider this a serious problem, seeing as,
as far as I'm aware, we've intended to eventually introduce code that will
favor keeping a thread that runs on one CPU on that same CPU for as long as
it is reasonable to do so (which should be most of the time).
After briefly discussing this with Alfred on IRC, I gather that Alfred has
some CPU affinity patches on the way, but I'm not sure if they address
thread scheduling with the above intent in mind or if they merely introduce
an _interface_ to bind a thread to a single CPU.
 
   -Matt

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: Allocate a page at interrupt time

2001-08-07 Thread Alfred Perlstein

* Bosko Milekic [EMAIL PROTECTED] [010807 14:51] wrote:
 
 On Tue, Aug 07, 2001 at 12:19:01PM -0700, Matt Dillon wrote:
  Cache line invalidation does not require an IPI.  TLB
  shootdowns require IPIs.  TLB shootdowns are unrelated to
  interrupt threads; they only occur when shared MMU mappings
  change.  Cache line invalidation can waste CPU cycles
  when cache mastership changes occur between CPUs, i.e. when
  threads are switched from one CPU to another.  I consider this
  a serious problem in -current.
 
   I don't think it's fair to consider this a serious problem, seeing as,
 as far as I'm aware, we've intended to eventually introduce code that will
 favor keeping a thread that runs on one CPU on that same CPU for as long
 as it is reasonable to do so (which should be most of the time).
   After briefly discussing this with Alfred on IRC, I gather that Alfred
 has some CPU affinity patches on the way, but I'm not sure if they address
 thread scheduling with the above intent in mind or if they merely introduce
 an _interface_ to bind a thread to a single CPU.

They do both. :)  You can bind a process to a per-CPU runqueue _and_,
at the same time, as long as a process on another CPU doesn't have a
much higher priority, we'll take from our local pool.

Basically, we give processes that last ran on our own CPU a false
priority boost.

http://people.freebsd.org/~alfred/bind_cpu.diff

+   cpu = PCPU_GET(cpuid);
+   pricpu = runq_findbit(&runqcpu[cpu]);
+   pri = runq_findbit(rq);
+   CTR2(KTR_RUNQ, "runq_choose: pri=%d cpupri=%d", pri, pricpu);
+   if (pricpu != -1 && (pricpu < pri || pri == -1)) {
+   pri = pricpu;
+   rqh = &runqcpu[cpu].rq_queues[pri];
+   } else if (pri != -1) {
+   rqh = &rq->rq_queues[pri];
+   } else {
+   CTR1(KTR_RUNQ, "runq_choose: idleproc pri=%d", pri);
+   return (PCPU_GET(idleproc));
+   }
+   p = TAILQ_FIRST(rqh);

Actually, I think this patch is stale; it doesn't have
the priority boost, but basically you can put it in the

if (pricpu != -1 && (pricpu < pri || pri == -1)) {

clause, sort of like this:

if (pricpu != -1 && (pricpu - FUDGE < pri || pri == -1)) {

Where FUDGE is the priority boost you want to give local processes.
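
For concreteness (illustrative numbers, not taken from the patch):
priorities here are "lower is better", so with FUDGE = 4 a process on
the local per-CPU queue at priority 20 is compared as 20 - 4 = 16 and
wins against a global-queue process at priority 17; the bet is that
its still-warm cache on this CPU is worth a priority level or two.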

-- 
-Alfred Perlstein [[EMAIL PROTECTED]]
Ok, who wrote this damn function called '??'?
And why do my programs keep crashing in it?




Re: Allocate a page at interrupt time

2001-08-07 Thread void

On Tue, Aug 07, 2001 at 02:11:10AM -0700, Terry Lambert wrote:
 
 Can you name one SMP OS implementation that uses an
 interrupt-threads approach that doesn't hit a scaling
 wall at 4 (or fewer) CPUs, due to heavier-weight thread
 context-switch overhead?

Solaris, if I remember my Vahalia book correctly (isn't that a favorite
of yours?).

-- 
 Ben

"An art scene of delight
 I created this to be ..."  -- Sun Ra




Re: Allocate a page at interrupt time

2001-08-07 Thread Greg Lehey

On Tuesday,  7 August 2001 at  1:58:21 -0700, Terry Lambert wrote:
 Bosko Milekic wrote:
 I keep wondering about the sagacity of running interrupts in
 threads... it still seems like an incredibly bad idea to me.

 I guess my major problem with this is that by running in
 threads, it's made it nearly impossible to avoid receiver
 livelock situations using any of the classical techniques
 (e.g. Mogul's work, etc.).

 References to published works?

 Just do an NCSTRL search on "receiver livelock"; you will get
 over 90 papers...

   http://ncstrl.mit.edu/

 See also the list of participating institutions:

   http://ncstrl.mit.edu/Dienst/UI/2.0/ListPublishers

 It won't be that hard to find... Mogul has only published 92
 papers.  8-)

So much data, in fact, that you could hide anything behind it.  Would
you like to be more specific?

 It also has the unfortunate property of locking us into virtual
 wire mode, when in fact Microsoft demonstrated that wiring down
 interrupts to particular CPUs was good practice, in terms of
 assuring best performance.  Specifically, running in virtual

 Can you point us at any concrete information that shows
 this?  Specifically, without being Microsoft biased (as is most
 data published by Microsoft)? -- i.e. preferably third-party
 performance testing that attributes wiring down of interrupts to
 particular CPUs as _the_ performance advantage.

 FreeBSD was tested, along with Linux and NT, by Ziff Davis
 Labs, in Foster City, with the participation of Jordan
 Hubbard and Mike Smith.  You can ask either of them for the
 results of the test; only the Linux and NT numbers were
 actually released.  This was done to provide a non-biased
 baseline, in reaction to the Mindcraft benchmarks, where
 Linux showed so poorly.  They ran quad ethernet cards, with
 quad CPUs; the NT drivers wired the cards down to separate
 INT A/B/C/D interrupts, one per CPU.

You carefully neglect to point out that this was the old SMP
implementation.  I think this completely invalidates any point you may
have been trying to make.

 wire mode means that all your CPUs get hit with the interrupt,
 whereas running with the interrupt bound to a particular CPU
 reduces the overall overhead.  Even what we have today, with

 Obviously.

 I mention it because this is the direction FreeBSD appears
 to be moving in.  Right now, Intel is shipping with separate
 PCI busses; there is one motherboard from their ServerWorks
 division that has 16 separate PCI busses -- which means that
 you can do simultaneous gigabit card DMA to and from memory,
 without running into bus contention, so long as the memory is
 logically separate.  NT can use this hardware to its full
 potential; FreeBSD, as it exists, cannot, and FreeBSD as it
 appears to be heading today (interrupt threads, etc.) seems
 to be in the same boat as Linux, et al.  PCI-X will only
 make things worse (8.4 gigabit, burst rate).

What do interrupt threads have to do with this?

Terry, we've done a lot of thinking about performance implications
over the last 2 years, including addressing all of the points that you
appear to raise.  A lot of it is in the archives.

It's quite possible that we've missed something important that you
haven't.  But if that's the case, we'd like you to state it.  All I
see is you coming in, waving your hands and shouting generalities
which don't really help much.  The fact that people are still
listening is very much an indication of the hope that you might come
up with something useful.  But pointing to 92 papers and saying "it's
in there [somewhere]" isn't very helpful.

Greg
--
See complete headers for address and phone numbers




Re: Allocate a page at interrupt time

2001-08-05 Thread Matt Dillon

:I should have guessed the reason. Matthew Dillon answered this question on 
:Fri, 2 Jun 2000 as follows:
:
:
:The VM routines that manage pages associated with objects are not
:protected against interrupts, so interrupts aren't allowed to change
:page-object associations.  Otherwise an interrupt at just the wrong
:time could corrupt the mainline kernel VM code.
:
:
:On Thu, 2 Aug 2001, Zhihui Zhang wrote:
:
: 
 FreeBSD cannot allocate from the PQ_CACHE queue in an interrupt context.
 Can anyone explain to me why this is the case?
: 
: 
: Thanks,

Yes, that is precisely the reason.  In -current this all changes, though,
since interrupts are now threads.  *But*, that said, interrupts cannot
really afford to hold mutexes that might end up blocking them for
long periods of time, so I would still recommend that interrupt code not
attempt to allocate pages out of PQ_CACHE.
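
A hedged sketch of the safe pattern in that era (illustrative only,
not code from this thread; M_DEVBUF is just an example malloc type):

    #include <sys/param.h>
    #include <sys/malloc.h>

    static void *
    intr_alloc(size_t size)
    {
        void *p;

        /*
         * M_NOWAIT: fail rather than block.  At the page level the
         * analogous choice is vm_page_alloc(..., VM_ALLOC_INTERRUPT),
         * which draws from the free queue only and never reclaims a
         * page from PQ_CACHE, so no page/object association is
         * touched from interrupt context.
         */
        p = malloc(size, M_DEVBUF, M_NOWAIT);
        if (p == NULL) {
            /* e.g. drop the packet and recover from a software context */
        }
        return (p);
    }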

-Matt





Allocate a page at interrupt time

2001-08-03 Thread Zhihui Zhang


FreeBSD cannot allocate from the PQ_CACHE queue in an interrupt context.
Can anyone explain to me why this is the case?


Thanks,
  
-Zhihui




