Re: Allocate a page at interrupt time
Mike Smith wrote: The basic problem here is that you have decided what interrupt threads are, and aren't interested in the fact that what FreeBSD calls interrupt threads are not the same thing, despite being told this countless times, and despite it being embodied in the code that's right under your nose. You believe that an interrupt results in a make-runnable event, and at some future time, the interrupt thread services the interrupt request. This is not the case, and never was. The entire point of having interrupt threads is to allow interrupt handling routines to block in the case where the handler/driver design does not allow for nonblocking synchronisation between the top and bottom halves.

So enlighten me, since the code right under my nose often does not run on my dual-CPU system, and I like prose anyway, preferably backed by data and repeatable research results. What do interrupt threads buy you that isn't there in 4.x, besides being one hammer among dozens that can hit the SMP nail? Why don't I want to run my interrupt to completion, and why would I want to use an interrupt thread to do the work instead? In what context do they block? Why is it not better to change the handler/driver design to allow for nonblocking synchronization?

Personally, when I get an ACK for the SYN/ACK I sent in response to a SYN, and the connection completes, I think that running the stack at interrupt all the way up to the point of putting the completed new socket connection on the associated listening socket's accept list is the correct thing to do; likewise anything else that would result in a need for upper-level processing, _at all_. This lets me process everything I can, and drop everything I can't, as early as possible, before I've invested a lot of futile effort in processing that will come to naught. This is what LRP does. This is what Van Jacobson's stack ([EMAIL PROTECTED]) does. Why are you right, and Mohit Aron, Jeff Mogul, Peter Druschel, and Van Jacobson wrong?
Most of the issues you raise regarding livelock can be mitigated with thoughtful driver design. Eventually, however, the machine hits the wall, and something has to break. You can't avoid this, no matter how you try; the goal is to put it off as long as possible. So. Now you've been told again.

Tell me why it has to break, instead of me disabling receipt of packets at the card in order to shed load before it becomes an issue for the host machine's bus, interrupt processing system, etc. Are you claiming that dropping packets that are physically impossible to handle, as early as possible, while handling _all_ packets that are physically possible to handle, is broken, or somehow impossible?

Thanks for any light you can shed on the subject,

-- Terry

PS: If you want to visit me at work, I'll show you code running in a significantly modified FreeBSD 4.3 kernel.

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: Allocate a page at interrupt time
Weiguang SHI wrote: I found an article on livelock at http://www.research.compaq.com/wrl/people/mogul/mogulpubsextern.html Just go there and search for livelock. But I don't agree with Terry about the interrupt-thread-is-bad thing, because, if I read it correctly, the authors themselves implemented their ideas in an interrupt thread of Digital Unix.

Not quite. These days, we are not necessarily talking about just interrupt load limitations. Feel free to take the following with a grain of salt; but realize, I have personally achieved more simultaneous connections on a FreeBSD box than anyone else out there without my code in hand, and this was using gigabit ethernet controllers on modern hardware, and further, this code is in shipping product today.

--

The number one way of dealing with excess load is to shed it before the load causes problems. In an interrupt threads implementation, you can't really do this, since the only option you have is when to schedule a polling operation. This leads to several inefficiencies, all of which negatively impact the top-end performance you are going to be able to achieve.

Use of interrupt threads suffers from a drastically increased latency in re-enabling of interrupts, and can generally only perform a single polling cycle without running into the problem of not making forward progress at the application level (they run at IPL 0, which is effectively the same time at which NETISR is currently run). This leads to a tradeoff between increased interrupt handling latency (e.g. the Tigon II gigabit ethernet driver in FreeBSD sets the Tigon II card firmware to coalesce at most 32 interrupts) and the transmit starvation problem noted in section 4.4 of the paper. It should also be noted that, even if you have not re-enabled interrupts, the DMA engine on the card will still be DMA'ing data into your receive ring buffer.
The burst data rate on a 66 MHz, 64-bit PCI bus is just over 4 Gbit/s, and the sustainable data rate is much lower than that. This means a machine acting as a switch or firewall with two of these cards on board will not really have much time for doing anything at all except DMA transfers, if they are run at full burst speed all the time (not possible). Running an application which requires disk activity will further eat into the available bandwidth. So this raises the spectre of DMA-based bus transfer livelock, not just interrupt-based livelock, if one is scheduling interrupt threads to do event polling, instead of using one of the other approaches outlined in the paper.

In the DEC UNIX case, they mitigated the problem by getting rid of the IP input queue, and getting rid of NETISR (I agree that these are required of any code with these goals). The use of the polling thread is really just their way of implementing the polling approach from section 5.3. This does not address the problems I noted above, and in particular, does not address the latency vs. bus livelock tradeoff problem with modern hardware (they were using an AMD LANCE ethernet chip; this was a 10Mb chip, and it doesn't support interrupt coalescing). They also assumed the use of a user space forwarding agent (screend): a single process.

Further, I think that the feedback mechanism selected is not really workable without rewriting the card firmware and having a significant memory buffer on the card, something which is not yet available on the market today. This is because, in practice, you can't stop all incoming packet processing just because one user space program out of dozens has a full input queue that it has not processed yet.
It's not reasonable to ignore new incoming requests to a web server, or to disable card interrupts, or to (for example) drop all ARP packets until TCP processing for that one application is complete: their basic assumption -- which they admit in section 6.6.1 -- is that screend is the only application running on the system. This is simply not the case with a high-traffic web server, a database system, or any other work-to-do engine model of several processes (or threads) with identical capability to service the incoming requests.

Further, these applications use TCP, and thus have explicitly application-bound socket endpoints, and there is no way to guarantee client load. We could trivially DOS attack an Apache server running SSL via mod_proxy, for example, by sending a flood of intentionally bad packets. The computational expense would keep its input queue full, and therefore the feedback mechanism noted would starve the other Apache processes of legitimate input. There are other obvious attacks, no less damaging in their results, which attack other points in the assumption of a single-process queue feedback mechanism.

Their scheduler in section 7, which is in effect identical to the fixed scheduling class in SVR4 (which was used by USL to avoid the move mouse, wiggle cursor problem when using the
Re: Allocate a page at interrupt time
Greg Lehey wrote: Solaris hits the wall a little later, but it still hits the wall. Every SMP system experiences performance degradation at some point. The question is a matter of the extent.

IMO, 16 processors is not unreasonable, even with standard APIC-based SMP. 32 is out of the question, but that's mostly because you can't have more than 32 APIC IDs in the current 32-bit processors, and still give one or more away to an IO APIC. 8-).

On Intel hardware, it has historically hit it at the same 4 CPUs where everyone else tends to hit it, for the same reasons; This is a very broad statement. You contradict it further down. I contradict it for SPARC; I don't think I contradicted it for Intel, but am willing to take correction...

Solaris claims to scale to 64 processors while maintaining SMP, rather than real or virtual NUMA. It's been my own experience that this scaling claim is not entirely accurate, if what you are doing is a lot of kernel processing. I think that depends on how you interpret the claim. It can only mean that adding a 64th processor can still be of benefit. The 4-processor Intel claim is a point of diminishing returns, and is well enough known that it has almost passed into folklore (which might not bode well for finding people building boards with more, which would be unfortunate). My SPARC experience is likewise one of diminishing returns, where it becomes cheaper to buy another box to get the performance increment than to stick more processors in the same box. It's definitely anecdotal on my part.

On the other hand, if you are running a lot of non-intersecting user space code (e.g. JVMs or CGIs), it's not as bad (and realize that FreeBSD is not that bad in the same situation, either: it's just not as common in practice as it is in theory). You're just describing a fact of life about UNIX SMP support. Practice vs. theory? Or the inevitability of UNIX SMP support having the performance characteristics it has most places?
I don't buy the we must live with the performance because it's UNIX argument, if you meant the latter.

It should be noted that Solaris interrupt threads are only used for interrupts of priority 10 and below: higher priority interrupts (at priority levels from 11 to 15) are _NOT_ handled by threads. 10 is the clock interrupt. FreeBSD also has provision for not using interrupt threads for everything. It's clearly too early to decide which interrupts should be left as traditional interrupts, and we've done some shifting back and forth to get things to work. Note that the priority numbers are noise. In this statement, they're just a convenient way to distinguish between threaded and non-threaded interrupts. FreeBSD masks, Solaris IPLs. In context, this was meant to show why Solaris' approach is not directly translatable to FreeBSD.

I really can't buy the idea that interrupt threads are a good idea for anything that can flood your bus or interrupt bandwidth, or that has tiny or non-existent FIFOs relative to the speeds at which it is being pushed; right now that means: might be OK for disks; not OK for really fast network controllers; not OK for sorta fast network controllers without a lot of adapter RAM; not OK for serial ports and floppies. At least in my mind.

I think somebody else has pointed out that we're very conscious of CPU affinity. I think affinity isn't enough; I've expressed this to Alfred on a number of occasions already, when I see him in the hallway or at lunch. Dealing with the problem is kind of an all-or-nothing bid.

In the 32-processor Sequent boxes, the actual system bus was different, and directly supported message passing. Was this better or worse? For the intent, much better. It meant that non-intersecting CPU/peripheral paths could run simultaneously. The Harris H1000 and H1200 had a similar thing (big iron real-time systems used on Navy ships and at the college where Wes and I went to school).
Also, the Sun system is still an IPL system, using level-based blocking rather than masking, and these threads can find themselves blocked on a mutex or condition variable for a relatively long time; if this happens, it resumes the previous thread _but does not drop its IPL below that of the suspended thread_, which is basically the Dijkstra Banker's Algorithm method of avoiding priority inversion on interrupts (i.e. ugly). So you're saying we're doing it better? Long-term priority lending is the real problem I'm noting; this is an artifact of context borrowing, more than anything else (more below). I think the FreeBSD use of masking is better than IPL'ing, and is an obvious win in the case of multiple cards, since you can run multiple interrupt handlers at the same time, but I wonder what will happen when/if it gets to UltraSPARC hardware. I think the Dijkstra algorithm, in which contended resources are prereserved based on an anticipated need,
Re: Allocate a page at interrupt time
I really can't buy the idea that interrupt threads are a good idea for anything that can flood your bus or interrupt bandwidth, or that has tiny or non-existent FIFOs relative to the speeds at which it is being pushed; right now that means: might be OK for disks; not OK for really fast network controllers; not OK for sorta fast network controllers without a lot of adapter RAM; not OK for serial ports and floppies. At least in my mind.

The basic problem here is that you have decided what interrupt threads are, and aren't interested in the fact that what FreeBSD calls interrupt threads are not the same thing, despite being told this countless times, and despite it being embodied in the code that's right under your nose. You believe that an interrupt results in a make-runnable event, and at some future time, the interrupt thread services the interrupt request. This is not the case, and never was. The entire point of having interrupt threads is to allow interrupt handling routines to block in the case where the handler/driver design does not allow for nonblocking synchronisation between the top and bottom halves.

Most of the issues you raise regarding livelock can be mitigated with thoughtful driver design. Eventually, however, the machine hits the wall, and something has to break. You can't avoid this, no matter how you try; the goal is to put it off as long as possible. So. Now you've been told again.

-- ... every activity meets with opposition, everyone who acts has his rivals and unfortunately opponents also. But not because people want to be opponents, rather because the tasks and relationships force people to take different points of view. [Dr. Fritz Todt] V I C T O R Y N O T V E N G E A N C E
Re: Allocate a page at interrupt time
void wrote: Can you name one SMP OS implementation that uses an interrupt threads approach that doesn't hit a scaling wall at 4 (or fewer) CPUs, due to heavier-weight thread context switch overhead? Solaris, if I remember my Vahalia book correctly (isn't that a favorite of yours?). As usual, IMO...

Yes, I like the Vahalia book; I did technical review of it for Prentice Hall before its publication. Solaris hits the wall a little later, but it still hits the wall. On Intel hardware, it has historically hit it at the same 4 CPUs where everyone else tends to hit it, for the same reasons; as of Solaris 2.6, they have adopted the hybrid per-CPU pool model recommended in Vahalia (Chapter 12). While I'm at it, I suppose I should recommend reading the definitive Solaris internals book to date:

Solaris Internals: Core Kernel Architecture Jim Mauro, Richard McDougall Prentice Hall ISBN: 0-13-022496-0

Solaris does use interrupt threads for some interrupts; I don't like the idea, for the reasons stated previously. Solaris claims to scale to 64 processors while maintaining SMP, rather than real or virtual NUMA. It's been my own experience that this scaling claim is not entirely accurate, if what you are doing is a lot of kernel processing. On the other hand, if you are running a lot of non-intersecting user space code (e.g. JVMs or CGIs), it's not as bad (and realize that FreeBSD is not that bad in the same situation, either: it's just not as common in practice as it is in theory).

It should be noted that Solaris interrupt threads are only used for interrupts of priority 10 and below: higher priority interrupts (at priority levels from 11 to 15) are _NOT_ handled by threads. 10 is the clock interrupt. It should also be noted that Solaris maintains a per-processor pool of interrupt threads for each of the lower priority interrupts, with a global thread used for handling the clock interrupt.
This is _very_ different from taking an interrupt thread and rescheduling it on an arbitrary CPU, and as others have pointed out, the hardware used to do the scheduling is very different. In the 32-processor Sequent boxes, the actual system bus was different, and directly supported message passing. There is also specific hardware support for handling interrupts via threads, which is really not applicable to the x86 or even the Alpha architectures on which FreeBSD currently runs, nor to the IA64 architecture (port in progress). In particular, there is a single system-wide table, introduced with the UltraSPARC, that doesn't need to be locked to support interrupt handling.

Also, the Sun system is still an IPL system, using level-based blocking rather than masking, and these threads can find themselves blocked on a mutex or condition variable for a relatively long time; if this happens, it resumes the previous thread _but does not drop its IPL below that of the suspended thread_, which is basically the Dijkstra Banker's Algorithm method of avoiding priority inversion on interrupts (i.e. ugly).

Finally, the Sun system borrows the context of the interrupted process (thread) for interrupt handling (the LWP). This is very similar to the technique employed for kernel vs. user space thread associations within the Windows kernels (this was one of the steps I was referring to when I said that NT had dealt with a number of scaling issues before it needed to, so that they would not turn into problems on 8-way and higher systems).

Personally, I think that the Sun system is extremely susceptible to receiver livelock (network interrupts are at 7, and disk interrupts are at 5, which means that so long as you are getting pounded with network interrupts for e.g.
NFS read or write requests, you're not going to service the disk interrupts that will let you dispose of the traffic, nor will you run the user space code for things like CGIs or Apache servers trying to service a heavy load of requests for content). I'm also not terrifically impressed with their callout mechanism when applied to networking, which has a preponderance of fixed, known-interval timers; but FreeBSD's isn't really any better when it comes to huge numbers of network connections, since it will end up hashing 2/4/6/8/... into the same bucket, unordered, which means traversing a large list of timers which are not going to end up expiring (callout wheels are not a good thing to mix with fixed-interval timers of relatively long durations, like the 2MSL timers that live in the networking code, or most especially the TIME_WAIT timers).

-- Terry
Re: Allocate a page at interrupt time
Mike Smith wrote: Terry; all this thinking you're doing is *really*bad*. I appreciate that you believe you're trying to educate us somehow. But what you're really doing right now is filling our list archives with convincing-sounding crap. People that are curious about this issue are likely to turn up your postings, and get *really* confused. Please. Just stop, ok?

Mike, I know you are convinced you know everything, and that of all the people who have worked professionally on SMP systems before, FreeBSD has only one guy I'm aware of in a design position for the SMP project, and a lot of students who think they know what they are doing, even though they can't cite the literature, but please... read the email threads all the way through before commenting on my postings. The IPI issue is real for TLB shootdown, as was pointed out by others; it was quite late, and it's very understandable, given that I have aphasic dyslexia, that I substituted the wrong word. Rather than correcting things, as others have done, you have insisted that no issue exists. Effectively calling me an idiot in a public forum doesn't help your credibility, and you're doing more damage by denying that there is any issue whatsoever to be concerned about, and being pedantic about precise word usage, instead of addressing the issues and correcting my unintentional spoonerisms out of concern for the archives.

Also please read the white paper reference I gave you about receiver livelock: interrupt threads were, and are, a bad idea, particularly on stock Intel SMP hardware -- so Solaris using that approach doesn't justify it any more than antique versions of IRIX using that approach do. If you don't want to believe me, then believe Jeff Mogul; but don't pretend that, simply because I chose the wrong word, there is no issue to consider.

Thanks,

-- Terry
Re: Allocate a page at interrupt time
I found an article on livelock at http://www.research.compaq.com/wrl/people/mogul/mogulpubsextern.html Just go there and search for livelock. But I don't agree with Terry about the interrupt-thread-is-bad thing, because, if I read it correctly, the authors themselves implemented their ideas in an interrupt thread of Digital Unix. Weiguang

From: Greg Lehey [EMAIL PROTECTED] To: Terry Lambert [EMAIL PROTECTED] CC: Bosko Milekic [EMAIL PROTECTED], Matt Dillon [EMAIL PROTECTED], Zhihui Zhang [EMAIL PROTECTED], [EMAIL PROTECTED] Subject: Re: Allocate a page at interrupt time Date: Wed, 8 Aug 2001 13:34:14 +0930

On Tuesday, 7 August 2001 at 1:58:21 -0700, Terry Lambert wrote: Bosko Milekic wrote: I keep wondering about the sagacity of running interrupts in threads... it still seems like an incredibly bad idea to me. I guess my major problem with this is that by running in threads, it's made it nearly impossible to avoid receiver livelock situations, using any of the classical techniques (e.g. Mogul's work, etc.). References to published works? Just do an NCSTRL search on receiver livelock; you will get over 90 papers... http://ncstrl.mit.edu/ See also the list of participating institutions: http://ncstrl.mit.edu/Dienst/UI/2.0/ListPublishers It won't be that hard to find... Mogul has only published 92 papers. 8-)

So much data, in fact, that you could hide anything behind it. Would you like to be more specific?

It also has the unfortunate property of locking us into virtual wire mode, when in fact Microsoft demonstrated that wiring down interrupts to particular CPUs was good practice, in terms of assuring best performance. Specifically, running in virtual Can you point us at any concrete information that shows this? Specifically, without being Microsoft-biased (as is most data published by Microsoft)? -- i.e. preferably third-party performance testing that attributes wiring down of interrupts to particular CPUs as _the_ performance advantage.
FreeBSD was tested, along with Linux and NT, by Ziff-Davis Labs in Foster City, with the participation of Jordan Hubbard and Mike Smith. You can ask either of them for the results of the test; only the Linux and NT numbers were actually released. This was done to provide a non-biased baseline, in reaction to the Mindcraft benchmarks, where Linux showed so poorly. They ran quad ethernet cards with quad CPUs; the NT drivers wired the cards down to separate INT A/B/C/D interrupts, one per CPU. You carefully neglect to point out that this was the old SMP implementation. I think this completely invalidates any point you may have been trying to make.

wire mode means that all your CPUs get hit with the interrupt, whereas running with the interrupt bound to a particular CPU reduces the overall overhead. Even what we have today, with Obviously. I mention it because this is the direction FreeBSD appears to be moving in.

Right now, Intel is shipping with separate PCI busses; there is one motherboard from their ServerWorks division that has 16 separate PCI busses -- which means that you can do simultaneous gigabit card DMA to and from memory, without running into bus contention, so long as the memory is logically separate. NT can use this hardware to its full potential; FreeBSD as it exists can not, and FreeBSD as it appears to be heading today (interrupt threads, etc.) seems to be in the same boat as Linux et al. PCI-X will only make things worse (8.4 gigabit burst rate). What do interrupt threads have to do with this?

Terry, we've done a lot of thinking about performance implications over the last 2 years, including addressing all of the points that you appear to raise. A lot of it is in the archives. It's quite possible that we've missed something important that you haven't. But if that's the case, we'd like you to state it. All I see is you coming in, waving your hands and shouting generalities which don't really help much.
The fact that people are still listening is very much an indication of the hope that you might come up with something useful. But pointing to 92 papers and saying it's in there [somewhere] isn't very helpful. Greg -- See complete headers for address and phone numbers
Re: Allocate a page at interrupt time
On Wednesday, 8 August 2001 at 0:27:23 -0700, Terry Lambert wrote: void wrote: Can you name one SMP OS implementation that uses an interrupt threads approach that doesn't hit a scaling wall at 4 (or fewer) CPUs, due to heavier-weight thread context switch overhead? Solaris, if I remember my Vahalia book correctly (isn't that a favorite of yours?). As usual, IMO... Yes, I like the Vahalia book; I did technical review of it for Prentice Hall before its publication. Solaris hits the wall a little later, but it still hits the wall.

Every SMP system experiences performance degradation at some point. The question is a matter of the extent.

On Intel hardware, it has historically hit it at the same 4 CPUs where everyone else tends to hit it, for the same reasons; This is a very broad statement. You contradict it further down.

as of Solaris 2.6, they have adopted the hybrid per-CPU pool model recommended in Vahalia (Chapter 12). While I'm at it, I suppose I should recommend reading the definitive Solaris internals book to date: Solaris Internals: Core Kernel Architecture Jim Mauro, Richard McDougall Prentice Hall ISBN: 0-13-022496-0 Yes, I have this book. It looks very good, but I haven't found time to read it.

Solaris claims to scale to 64 processors while maintaining SMP, rather than real or virtual NUMA. It's been my own experience that this scaling claim is not entirely accurate, if what you are doing is a lot of kernel processing. I think that depends on how you interpret the claim. It can only mean that adding a 64th processor can still be of benefit.

On the other hand, if you are running a lot of non-intersecting user space code (e.g. JVMs or CGIs), it's not as bad (and realize that FreeBSD is not that bad in the same situation, either: it's just not as common in practice as it is in theory). You're just describing a fact of life about UNIX SMP support.
It should be noted that Solaris interrupt threads are only used for interrupts of priority 10 and below: higher priority interrupts (at priority levels from 11 to 15) are _NOT_ handled by threads. 10 is the clock interrupt. FreeBSD also has provision for not using interrupt threads for everything. It's clearly too early to decide which interrupts should be left as traditional interrupts, and we've done some shifting back and forth to get things to work. Note that the priority numbers are noise. In this statement, they're just a convenient way to distinguish between threaded and non-threaded interrupts.

It should also be noted that Solaris maintains a per-processor pool of interrupt threads for each of the lower priority interrupts, with a global thread used for handling the clock interrupt. This is _very_ different from taking an interrupt thread and rescheduling it on an arbitrary CPU, and as others have pointed out, the hardware used to do the scheduling is very different. I think somebody else has pointed out that we're very conscious of CPU affinity.

In the 32-processor Sequent boxes, the actual system bus was different, and directly supported message passing. Was this better or worse?

There is also specific hardware support for handling interrupts via threads, which is really not applicable to the x86 or even the Alpha architectures on which FreeBSD currently runs, nor to the IA64 architecture (port in progress). In particular, there is a single system-wide table, introduced with the UltraSPARC, that doesn't need to be locked to support interrupt handling.
Also, the Sun system is still an IPL system, using level-based blocking rather than masking, and these threads can find themselves blocked on a mutex or condition variable for a relatively long time; if this happens, it resumes the previous thread _but does not drop its IPL below that of the suspended thread_, which is basically the Dijkstra Banker's Algorithm method of avoiding priority inversion on interrupts (i.e. ugly). So you're saying we're doing it better?

Finally, the Sun system borrows the context of the interrupted process (thread) for interrupt handling (the LWP). This is very similar to the technique employed for kernel vs. user space thread associations within the Windows kernels (this was one of the steps I was referring to when I said that NT had dealt with a number of scaling issues before it needed to, so that they would not turn into problems on 8-way and higher systems). This is also the method we're planning to use, as I'm sure you're aware from previous messages on the -smp list.

Personally, I think that the Sun system is extremely susceptible to receiver livelock (network interrupts are at 7, and disk interrupts are at 5, which means that so long as you are getting pounded with network interrupts for e.g. NFS read or write requests, you're not going to service the disk interrupts that will let you dispose of the traffic, nor
Re: Allocate a page at interrupt time
Matt Dillon wrote: Yes, that is precisely the reason. In -current this all changes, though, since interrupts are now threads. *But*, that said, interrupts cannot really afford to hold mutexes that might end up blocking them for long periods of time, so I would still recommend that interrupt code not attempt to allocate pages out of PQ_CACHE.

I keep wondering about the sagacity of running interrupts in threads... it still seems like an incredibly bad idea to me. I guess my major problem with this is that by running in threads, it's made it nearly impossible to avoid receiver livelock situations, using any of the classical techniques (e.g. Mogul's work, etc.). It also has the unfortunate property of locking us into virtual wire mode, when in fact Microsoft demonstrated that wiring down interrupts to particular CPUs was good practice, in terms of assuring best performance. Specifically, running in virtual wire mode means that all your CPUs get hit with the interrupt, whereas running with the interrupt bound to a particular CPU reduces the overall overhead. Even what we have today, with the big giant lock and redirecting interrupts to the CPU in the kernel, is better than that...

-- Terry
Re: Allocate a page at interrupt time
On Mon, Aug 06, 2001 at 11:27:56PM -0700, Terry Lambert wrote: I keep wondering about the sagacity of running interrupts in threads... it still seems like an incredibly bad idea to me. I guess my major problem with this is that by running in threads, it's made it nearly impossible to avoid receiver livelock situations, using any of the classical techniques (e.g. Mogul's work, etc.). References to published works? It also has the unfortunate property of locking us into virtual wire mode, when in fact Microsoft demonstrated that wiring down interrupts to particular CPUs was good practice, in terms of assuring best performance. Specifically, running in virtual Can you point us at any concrete information that shows this? Specifically, without being Microsoft biased (as is most data published by Microsoft)? -- i.e. preferably third-party performance testing that attributes wiring down of interrupts to particular CPUs as _the_ performance advantage. wire mode means that all your CPUs get hit with the interrupt, whereas running with the interrupt bound to a particular CPU reduces the overall overhead. Even what we have today, with Obviously. the big giant lock and redirecting interrupts to the CPU in the kernel is better than that... -- Terry -- Bosko Milekic [EMAIL PROTECTED]
Re: Allocate a page at interrupt time
* Bosko Milekic [EMAIL PROTECTED] [010807 02:16] wrote: On Mon, Aug 06, 2001 at 11:27:56PM -0700, Terry Lambert wrote: I keep wondering about the sagacity of running interrupts in threads... it still seems like an incredibly bad idea to me. I guess my major problem with this is that by running in threads, it's made it nearly impossible to avoid receiver livelock situations, using any of the classical techniques (e.g. Mogul's work, etc.). References to published works? It also has the unfortunate property of locking us into virtual wire mode, when in fact Microsoft demonstrated that wiring down interrupts to particular CPUs was good practice, in terms of assuring best performance. Specifically, running in virtual Can you point us at any concrete information that shows this? Specifically, without being Microsoft biased (as is most data published by Microsoft)? -- i.e. preferably third-party performance testing that attributes wiring down of interrupts to particular CPUs as _the_ performance advantage. wire mode means that all your CPUs get hit with the interrupt, whereas running with the interrupt bound to a particular CPU reduces the overall overhead. Even what we have today, with Obviously. the big giant lock and redirecting interrupts to the CPU in the kernel is better than that... I really don't see what part of the current design specifically disallows one to both: 1) force interrupts to be taken on a particular cpu. 2) if that thread gets switched out, have it put on a per-cpu runqueue when it becomes runnable, preventing another cpu from snatching it up. I've already implemented #2; #1 requires touching hardware, which isn't something I like doing. :) -- -Alfred Perlstein [[EMAIL PROTECTED]] Ok, who wrote this damn function called '??'? And why do my programs keep crashing in it?
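Alfred's point #2 — enqueueing a runnable interrupt thread only on the run queue of the CPU it is bound to, so no other CPU snatches it up — can be sketched in user-space C roughly as follows. All names here (struct ithread, runq_add, runq_choose) are illustrative stand-ins, not the actual FreeBSD interfaces, and real queues would of course need locking:

```c
#include <assert.h>
#include <stddef.h>

#define NCPU 4

struct ithread {
    int bound_cpu;              /* CPU this interrupt thread is wired to */
    struct ithread *next;
};

static struct ithread *runq[NCPU];  /* one run queue per CPU */

/* Make-runnable: the thread goes on its bound CPU's queue only. */
static void runq_add(struct ithread *td)
{
    td->next = runq[td->bound_cpu];
    runq[td->bound_cpu] = td;
}

/* A CPU picks work from its own queue; bound threads are never stolen. */
static struct ithread *runq_choose(int cpu)
{
    struct ithread *td = runq[cpu];
    if (td != NULL)
        runq[cpu] = td->next;
    return td;
}
```

The point of the sketch is only that the make-runnable path consults the thread's binding rather than a single global queue; which CPU actually *takes* the interrupt (Alfred's point #1) is a hardware question the scheduler cannot answer.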
Re: Allocate a page at interrupt time
It also has the unfortunate property of locking us into virtual wire mode, when in fact Microsoft demonstrated that wiring down interrupts to particular CPUs was good practice, in terms of assuring best performance. Specifically, running in virtual wire mode means that all your CPUs get hit with the interrupt, whereas running with the interrupt bound to a particular CPU reduces the overall overhead. Even what we have today, with the big giant lock and redirecting interrupts to the CPU in the kernel is better than that... Terry, this is *total* garbage. Just so you know, ok? -- ... every activity meets with opposition, everyone who acts has his rivals and unfortunately opponents also. But not because people want to be opponents, rather because the tasks and relationships force people to take different points of view. [Dr. Fritz Todt] V I C T O R Y N O T V E N G E A N C E
Re: Allocate a page at interrupt time
Bosko Milekic wrote: I keep wondering about the sagacity of running interrupts in threads... it still seems like an incredibly bad idea to me. I guess my major problem with this is that by running in threads, it's made it nearly impossible to avoid receiver livelock situations, using any of the classical techniques (e.g. Mogul's work, etc.). References to published works? Just do an NCSTRL search on receiver livelock; you will get over 90 papers... http://ncstrl.mit.edu/ See also the list of participating institutions: http://ncstrl.mit.edu/Dienst/UI/2.0/ListPublishers It won't be that hard to find... Mogul has only published 92 papers. 8-) It also has the unfortunate property of locking us into virtual wire mode, when in fact Microsoft demonstrated that wiring down interrupts to particular CPUs was good practice, in terms of assuring best performance. Specifically, running in virtual Can you point us at any concrete information that shows this? Specifically, without being Microsoft biased (as is most data published by Microsoft)? -- i.e. preferably third-party performance testing that attributes wiring down of interrupts to particular CPUs as _the_ performance advantage. FreeBSD was tested, along with Linux and NT, by Ziff Davis Labs, in Foster City, with the participation of Jordan Hubbard and Mike Smith. You can ask either of them for the results of the test; only the Linux and NT numbers were actually released. This was done to provide a non-biased baseline, in reaction to the Mindcraft benchmarks, where Linux showed so poorly. They ran quad ethernet cards, with quad CPUs; the NT drivers wired the cards down to separate INT A/B/C/D interrupts, one per CPU. wire mode means that all your CPUs get hit with the interrupt, whereas running with the interrupt bound to a particular CPU reduces the overall overhead. Even what we have today, with Obviously. I mention it because this is the direction FreeBSD appears to be moving in. 
Right now, Intel is shipping with separate PCI busses; there is one motherboard from their ServerWorks division that has 16 separate PCI busses -- which means that you can do simultaneous gigabit card DMA to and from memory, without running into bus contention, so long as the memory is logically separate. NT can use this hardware to its full potential; FreeBSD as it exists can not, and FreeBSD as it appears to be heading today (interrupt threads, etc.) seems to be in the same boat as Linux, et al. PCI-X will only make things worse (8.4 gigabit, burst rate). -- Terry
Re: Allocate a page at interrupt time
Mike Smith wrote: It also has the unfortunate property of locking us into virtual wire mode, when in fact Microsoft demonstrated that wiring down interrupts to particular CPUs was good practice, in terms of assuring best performance. Specifically, running in virtual wire mode means that all your CPUs get hit with the interrupt, whereas running with the interrupt bound to a particular CPU reduces the overall overhead. Even what we have today, with the big giant lock and redirecting interrupts to the CPU in the kernel is better than that... Terry, this is *total* garbage. Just so you know, ok? What this, exactly? That virtual wire mode is actually a bad idea for some applications -- specifically, high speed networking with multiple gigabit ethernet cards? That Microsoft demonstrated that wiring down interrupts to a particular CPU was a good idea, and kicked both Linux' and FreeBSD's butt in the test at ZD Labs? That taking interrupts on a single directed CPU is better than taking an IPI on all your CPUs, and then sorting out who's going to handle the interrupt? Can you name one SMP OS implementation that uses an interrupt threads approach that doesn't hit a scaling wall at 4 (or fewer) CPUs, due to heavier weight thread context switch overhead? Can you tell me how, in the context of having an interrupt thread doing scheduled processing, how you could avoid an interrupt overhead livelock, where the thread doesn't get opportunity to run because you're too busy taking interrupts to be able to get any work done? FWIW, I would be happy to cite sources to you, off the general list. -- Terry
Re: Allocate a page at interrupt time
: It also has the unfortunate property of locking us into virtual
: wire mode, when in fact Microsoft demonstrated that wiring down
: interrupts to particular CPUs was good practice, in terms of
: assuring best performance. Specifically, running in virtual
: wire mode means that all your CPUs get hit with the interrupt,
: whereas running with the interrupt bound to a particular CPU
: reduces the overall overhead. Even what we have today, with
: the big giant lock and redirecting interrupts to the CPU in
: the kernel is better than that...
:
: Terry, this is *total* garbage.
:
: Just so you know, ok?
:
:What this, exactly?
:
:That virtual wire mode is actually a bad idea for some
:applications -- specifically, high speed networking with
:multiple gigabit ethernet cards?

    All the cpu's don't get the interrupt, only one does.

:That Microsoft demonstrated that wiring down interrupts
:to a particular CPU was a good idea, and kicked both Linux'
:and FreeBSD's butt in the test at ZD Labs?

    Well, if you happen to have four NICs and four CPUs, and you are
    running them all full bore, I would say that wiring the NICs to
    the CPUs would be a good idea. That seems like a rather
    specialized situation, though.

					-Matt

:That taking interrupts on a single directed CPU is better
:than taking an IPI on all your CPUs, and then sorting out
:who's going to handle the interrupt?
:...
:
:-- Terry
Re: Allocate a page at interrupt time
At 12:39 AM -0700 8/7/01, Mike Smith wrote: It also has the unfortunate property of locking us into virtual wire mode, when in fact Microsoft demonstrated that wiring down interrupts to particular CPUs was good practice, in terms of assuring best performance. Specifically, running in virtual wire mode means that all your CPUs get hit with the interrupt, whereas running with the interrupt bound to a particular CPU reduces the overall overhead. Even what we have today, with the big giant lock and redirecting interrupts to the CPU in the kernel is better than that... Terry, this is *total* garbage. Just so you know, ok? There are people on this list besides Terry. Terry has taken the time to refer to a few URLs, and remind us of a benchmark that I (for one) do remember, and I do remember Windows doing quite well on it. Maybe that benchmark was bogus for some reason, but I seem to remember several FreeBSD developers taking it seriously at the time. So, could you at least fill in what part of the above is total garbage? Throw in a few insults to Terry if it makes you feel better for some reason, but raise the level of information content a little for the rest of us? You quoted several distinct comments of Terry's -- were all of them garbage? It might very well be that all of Terry's comments were in fact garbage, but from the sidelines I'd appreciate a little more in the way of technical details. -- Garance Alistair Drosehn = [EMAIL PROTECTED] Senior Systems Programmer or [EMAIL PROTECTED] Rensselaer Polytechnic Institute or [EMAIL PROTECTED]
Re: Allocate a page at interrupt time
At 9:55 AM -0700 8/7/01, Matt Dillon wrote: : It also has the unfortunate property of locking us into virtual : wire mode, when in fact Microsoft demonstrated that wiring down : interrupts to particular CPUs was good practice, in terms of : assuring best performance. [...] : : Terry, this is *total* garbage. : : Just so you know, ok? : :What this, exactly? : :That virtual wire mode is actually a bad idea for some :applications -- specifically, high speed networking with :multiple gigabit ethernet cards? All the cpu's don't get the interrupt, only one does. :That Microsoft demonstrated that wiring down interrupts :to a particular CPU was a good idea, and kicked both Linux' :and FreeBSD's butt in the test at ZD Labs? Well, if you happen to have four NICs and four CPUs, and you are running them all full bore, I would say that wiring the NICs to the CPUs would be a good idea. That seems like a rather specialized situation, though. Okay, that's helpful to sort out the discussion. I'd agree that is a specialized situation, one which wouldn't be critical to many freebsd users. Is Terry right that the current strategy will lock us into virtual wire mode, in some way which means that this specialized situation CANNOT be handled? (it would be fine if it were handled via some specialized kernel option, imo. I'm just wondering what the limitations are. I do not mean to imply we should follow some different strategy here, I'm just wondering...) -- Garance Alistair Drosehn = [EMAIL PROTECTED] Senior Systems Programmer or [EMAIL PROTECTED] Rensselaer Polytechnic Institute or [EMAIL PROTECTED]
Re: Allocate a page at interrupt time
:I'd agree that is a specialized situation, one which wouldn't
:be critical to many freebsd users. Is Terry right that the
:current strategy will lock us into virtual wire mode, in
:some way which means that this specialized situation CANNOT
:be handled?
:
:(it would be fine if it were handled via some specialized
:kernel option, imo. I'm just wondering what the limitations
:are. I do not mean to imply we should follow some different
:strategy here, I'm just wondering...)
:
:--
:Garance Alistair Drosehn = [EMAIL PROTECTED]

    In -current there is nothing preventing us from wiring interrupt
    *threads* to cpus. Wiring the actual interrupts themselves might
    or might not yield a performance improvement beyond that.

					-Matt
Re: Allocate a page at interrupt time
That Microsoft demonstrated that wiring down interrupts to a particular CPU was a good idea, and kicked both Linux' and FreeBSD's butt in the test at ZD Labs? No, Terry, this is not what was demonstrated by those tests. Will this myth never die? Do Mike and I have to write up a nice white paper? :) The environment was rigidly specified: quad cpu box, four eepro 100mb interfaces, and a _heavy_ load of short lived connections fetching static cached content. The test was clearly designed to stress concurrency in the network stack, with heavy low latency interrupt load. Neither Linux nor FreeBSD could do this well at the time. There was a service pack issued a few months before the test that 'threaded' NT's stack. It was not a mistake that the rules of the tests forbade doing the sane thing and running on a system with a single very fast cpu, lots of mem, and a gigabit interface with an actual published interface for coalescing interrupts. That would have performed better and been cheaper. That's what pisses me off about the tests to this day. The problem people are faced with is "how do I serve this static content reliably and cheaply", not "what OS should I serve my content with, now that I've bought this ridiculous machine?". It's sad that people consistently insist on drawing insane conclusions from these benchmark events. -- zach
Re: Allocate a page at interrupt time
Matt Dillon wrote:

:What this, exactly?
:
:That virtual wire mode is actually a bad idea for some
:applications -- specifically, high speed networking with
:multiple gigabit ethernet cards?

    All the cpu's don't get the interrupt, only one does.

I think that you will end up taking an IPI (Inter Processor Interrupt) to shoot down the cache line during an invalidate cycle, when moving an interrupt processing thread from one CPU to another. For multiple high speed interfaces (disk or network; doesn't matter), you will end up burning a *lot* of time, without a lockdown. You might be able to avoid this by doing some of the tricks I've discussed with Alfred to ensure that there is no lock contention in the non-migratory case for KSEs (or kernel interrupt threads) to handle per-CPU scheduling, but I think that the interrupt masking will end up being very hard to manage, and you will get the same effect as locking the interrupt to a particular CPU... if you are lucky. Any case which _did_ invoke a lock and resulted in contention would require at least a barrier instruction; I guess you could do it in a non-cacheable page to avoid the TLB interaction, and another IPI for an update or invalidate cycle for the lock, but then you are limited to memory speed, which is getting down to around a factor of 10 (133MHz) slower than CPU speed these days, and that's actually one heck of a stall hit to take.

:That Microsoft demonstrated that wiring down interrupts
:to a particular CPU was a good idea, and kicked both Linux'
:and FreeBSD's butt in the test at ZD Labs?

    Well, if you happen to have four NICs and four CPUs, and you are
    running them all full bore, I would say that wiring the NICs to
    the CPUs would be a good idea. That seems like a rather
    specialized situation, though.

I don't think so. These days, interrupt overhead can come from many places, including intentional denial of service attacks.
If you have an extra box around, I'd suggest that you install QLinux, and benchmark it side by side against FreeBSD, under an extreme load, and watch the FreeBSD system's performance fall off when interrupt overhead becomes so high that NETISR effectively never gets a chance to run. I also suggest using 100Base-T cards, since the interrupt coalescing on Gigabit cards could prevent you from observing the livelock from interrupt overload, unless you could load your machine to full wire speed (~950Mbits/S) so that your PCI bus transfer rate becomes a barrier. I know you were involved in some of the performance tuning that was attempted immediately after the ZD Labs tests, so I know you know this was a real issue; I think it still is. -- Terry
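The mitigation Terry keeps pointing at (Mogul and Ramakrishnan's receiver-livelock work) boils down to bounding how much work one interrupt may do and finishing the backlog from a scheduled or polled context, so NETISR-style processing still gets CPU time. A minimal user-space sketch of that idea follows; INTR_BUDGET, rx_intr, and the other names are assumptions for illustration, not any real driver's interface:

```c
#include <assert.h>

#define INTR_BUDGET 8   /* max packets handled per interrupt (assumed knob) */

static int rx_pending;  /* packets waiting in the receive ring */
static int need_poll;   /* set when the budget is hit: finish via polling */

/* Interrupt handler: drain at most INTR_BUDGET packets, then back off. */
static int rx_intr(void)
{
    int handled = 0;

    while (rx_pending > 0 && handled < INTR_BUDGET) {
        rx_pending--;            /* process one packet */
        handled++;
    }
    if (rx_pending > 0)
        need_poll = 1;           /* leave the rest for a scheduled pass */
    return handled;
}
```

Under sustained overload the handler does a fixed amount of work per interrupt and the remainder is drained at a rate the scheduler controls, which is what prevents interrupt arrivals from starving everything else.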
Re: Allocate a page at interrupt time
Zach Brown wrote: That Microsoft demonstrated that wiring down interrupts to a particular CPU was a good idea, and kicked both Linux' and FreeBSD's butt in the test at ZD Labs? No, Terry, this is not what was demonstrated by those tests. Will this myth never die? Do Mike and I have to write up a nice white paper? :) That would be nice, actually. The environment was rigidly specified: quad cpu box, four eepro 100mb interfaces, and a _heavy_ load of short lived connections fetching static cached content. The test was clearly designed to stress concurrency in the network stack, with heavy low latency interrupt load. Neither Linux nor FreeBSD could do this well at the time. There was a service pack issued a few months before the test that 'threaded' NT's stack. It was not a mistake that the rules of the tests forbade doing the sane thing and running on a system with a single very fast cpu, lots of mem, and a gigabit interface with an actual published interface for coalescing interrupts. That would have performed better and been cheaper. I have soft interrupt coalescing changes for most FreeBSD drivers written by Bill Paul; the operation is trivial, and Bill has structured his drivers well for doing that sort of thing. I personally don't think the test was unfair; it seems to me to be representative of most web traffic, which averages 8k a page for most static content, according to published studies. That's what pisses me off about the tests to this day. The problem people are faced with is "how do I serve this static content reliably and cheaply", not "what OS should I serve my content with, now that I've bought this ridiculous machine?". 8-) 8-). It's sad that people consistently insist on drawing insane conclusions from these benchmark events. I think that concurrency in the TCP stack is something that needs to be addressed; I'm glad they ran the benchmark, if only for that. 
Even if we both agree on the conclusions, agreeing isn't going to change people's perceptions, but beating them on their terms _will_, so it's a worthwhile pursuit. I happen to agree that their test indicated some shortcomings in the OS designs; regardless of whether we think they were carefully chosen to specifically emphasize those shortcomings, it doesn't change the fact that they are shortcomings. There's no use crying over spilt milk: the question is what can be done about it, besides trying to deny the validity of the tests. -- Terry
Re: Allocate a page at interrupt time
On Tue, 7 Aug 2001, Terry Lambert wrote: Matt Dillon wrote: All the cpu's don't get the interrupt, only one does. I think that you will end up taking an IPI (Inter Processor Interrupt) to shoot down the cache line during an invalidate cycle, when moving an interrupt processing thread from one CPU to another. You have a lot of fantasy today. You may want to consider reading one of the white papers you referred us to with so much enthusiasm and trying again later ;) Well, if you happen to have four NICs and four CPUs, and you are running them all full bore, I would say that wiring the NICs to the CPUs would be a good idea. That seems like a rather specialized situation, though. I don't think so. These days, interrupt overhead can come from many places, Exactly. You never know where your interrupts come from, so wiring them in a fixed setup really isn't going to do you much good in the generic case. Now if you want to optimise your source code for something like a Mindcraft benchmark ... regards, Rik -- Executive summary of a recent Microsoft press release: we are concerned about the GNU General Public License (GPL) http://www.surriel.com/ http://www.conectiva.com/ http://distro.conectiva.com/
Re: Allocate a page at interrupt time
:Matt Dillon wrote:
: :What this, exactly?
: :
: :That virtual wire mode is actually a bad idea for some
: :applications -- specifically, high speed networking with
: :multiple gigabit ethernet cards?
:
: All the cpu's don't get the interrupt, only one does.
:
:I think that you will end up taking an IPI (Inter Processor
:Interrupt) to shoot down the cache line during an invalidate
:cycle, when moving an interrupt processing thread from one
:CPU to another. For multiple high speed interfaces (disk or
:network; doesn't matter), you will end up burning a *lot*
:of time, without a lockdown.

    Cache line invalidation does not require an IPI. TLB shootdowns
    require IPIs. TLB shootdowns are unrelated to interrupt threads,
    they only occur when shared mmu mappings change. Cache line
    invalidation can waste cpu cycles -- when cache mastership changes
    occur between cpus due to threads being switched between cpus. I
    consider this a serious problem in -current.

					-Matt
Re: Allocate a page at interrupt time
On Tue, Aug 07, 2001 at 12:19:01PM -0700, Matt Dillon wrote: Cache line invalidation does not require an IPI. TLB shootdowns require IPIs. TLB shootdowns are unrelated to interrupt threads, they only occur when shared mmu mappings change. Cache line invalidation can waste cpu cycles -- when cache mastership changes occur between cpus due to threads being switched between cpus. I consider this a serious problem in -current. I don't think it's fair to consider this a serious problem seeing as how, as far as I'm aware, we've intended to eventually introduce code that will favor keeping threads running on one CPU on that same CPU as long as it is reasonable to do so (which should be most of the time). I think after briefly discussing with Alfred on IRC that Alfred has some CPU affinity patches on the way, but I'm not sure if they address thread scheduling with the above intent in mind or if they merely introduce an _interface_ to bind a thread to a single CPU. -- Bosko Milekic [EMAIL PROTECTED]
Re: Allocate a page at interrupt time
* Bosko Milekic [EMAIL PROTECTED] [010807 14:51] wrote: On Tue, Aug 07, 2001 at 12:19:01PM -0700, Matt Dillon wrote: Cache line invalidation does not require an IPI. TLB shootdowns require IPIs. TLB shootdowns are unrelated to interrupt threads, they only occur when shared mmu mappings change. Cache line invalidation can waste cpu cycles -- when cache mastership changes occur between cpus due to threads being switched between cpus. I consider this a serious problem in -current. I don't think it's fair to consider this a serious problem seeing as how, as far as I'm aware, we've intended to eventually introduce code that will favor keeping threads running on one CPU on that same CPU as long as it is reasonable to do so (which should be most of the time). I think after briefly discussing with Alfred on IRC that Alfred has some CPU affinity patches on the way, but I'm not sure if they address thread scheduling with the above intent in mind or if they merely introduce an _interface_ to bind a thread to a single CPU. They do both. :) You can bind a process to a runqueue, _and_ at the same time, as long as a process on another CPU doesn't have a much higher priority, we'll take from our local pool. Basically we give processes that last ran on our own CPU a false priority boost. 
http://people.freebsd.org/~alfred/bind_cpu.diff

+	cpu = PCPU_GET(cpuid);
+	pricpu = runq_findbit(&runqcpu[cpu]);
+	pri = runq_findbit(rq);
+	CTR2(KTR_RUNQ, "runq_choose: pri=%d cpupri=%d", pri, pricpu);
+	if (pricpu != -1 && (pricpu < pri || pri == -1)) {
+		pri = pricpu;
+		rqh = &runqcpu[cpu].rq_queues[pri];
+	} else if (pri != -1) {
+		rqh = &rq->rq_queues[pri];
+	} else {
+		CTR1(KTR_RUNQ, "runq_choose: idleproc pri=%d", pri);
+		return (PCPU_GET(idleproc));
+	}
+	p = TAILQ_FIRST(rqh);

Actually I think this patch is stale, it doesn't have the priority boost, but basically you can put it in the

	if (pricpu != -1 && (pricpu < pri || pri == -1)) {

clause, sort of like this:

	if (pricpu != -1 && (pricpu - FUDGE < pri || pri == -1)) {

where FUDGE is the priority boost you want to give local processes. -- -Alfred Perlstein [[EMAIL PROTECTED]]
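The effect of the FUDGE boost Alfred describes can be shown in isolation with a small user-space function. Lower numeric values mean better priority here, mirroring the patch; the function name and the value FUDGE = 4 are arbitrary choices for illustration:

```c
#include <assert.h>

#define FUDGE 4   /* affinity boost, in priority levels (assumed value) */

/*
 * pricpu: best priority on this CPU's local queue, or -1 if empty.
 * pri:    best priority on the global queue, or -1 if empty.
 * Returns 1 to take the local process, 0 to take the global one,
 * and -1 when both queues are empty (run the idle process).
 */
static int choose_local(int pricpu, int pri)
{
    if (pricpu != -1 && (pricpu - FUDGE < pri || pri == -1))
        return 1;    /* local wins: affinity boost applied */
    if (pri != -1)
        return 0;    /* global process is more than FUDGE levels better */
    return -1;       /* nothing runnable anywhere */
}
```

With FUDGE = 4, a local process at priority 10 still beats a global one at priority 8 (10 - 4 = 6 < 8), which is exactly the "false priority boost" for cache-warm processes; a global process at priority 3 would still win, so the boost cannot starve clearly higher-priority work.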
Re: Allocate a page at interrupt time
On Tue, Aug 07, 2001 at 02:11:10AM -0700, Terry Lambert wrote: Can you name one SMP OS implementation that uses an interrupt threads approach that doesn't hit a scaling wall at 4 (or fewer) CPUs, due to heavier weight thread context switch overhead? Solaris, if I remember my Vahalia book correctly (isn't that a favorite of yours?). -- Ben An art scene of delight I created this to be ... -- Sun Ra
Re: Allocate a page at interrupt time
On Tuesday, 7 August 2001 at 1:58:21 -0700, Terry Lambert wrote: Bosko Milekic wrote: I keep wondering about the sagacity of running interrupts in threads... it still seems like an incredibly bad idea to me. I guess my major problem with this is that by running in threads, it's made it nearly impossible to avoid receiver livelock situations, using any of the classical techniques (e.g. Mogul's work, etc.). References to published works? Just do an NCSTRL search on receiver livelock; you will get over 90 papers... http://ncstrl.mit.edu/ See also the list of participating institutions: http://ncstrl.mit.edu/Dienst/UI/2.0/ListPublishers It won't be that hard to find... Mogul has only published 92 papers. 8-) So much data, in fact, that you could hide anything behind it. Would you like to be more specific? It also has the unfortunate property of locking us into virtual wire mode, when in fact Microsoft demonstrated that wiring down interrupts to particular CPUs was good practice, in terms of assuring best performance. Specifically, running in virtual Can you point us at any concrete information that shows this? Specifically, without being Microsoft biased (as is most data published by Microsoft)? -- i.e. preferably third-party performance testing that attributes wiring down of interrupts to particular CPUs as _the_ performance advantage. FreeBSD was tested, along with Linux and NT, by Ziff Davis Labs, in Foster City, with the participation of Jordan Hubbard and Mike Smith. You can ask either of them for the results of the test; only the Linux and NT numbers were actually released. This was done to provide a non-biased baseline, in reaction to the Mindcraft benchmarks, where Linux showed so poorly. They ran quad ethernet cards, with quad CPUs; the NT drivers wired the cards down to separate INT A/B/C/D interrupts, one per CPU. You carefully neglect to point out that this was the old SMP implementation. 
I think this completely invalidates any point you may have been trying to make. wire mode means that all your CPUs get hit with the interrupt, whereas running with the interrupt bound to a particular CPU reduces the overall overhead. Even what we have today, with Obviously. I mention it because this is the direction FreeBSD appears to be moving in. Right now, Intel is shipping with separate PCI busses; there is one motherboard from their ServerWorks division that has 16 separate PCI busses -- which means that you can do simultaneous gigabit card DMA to and from memory, without running into bus contention, so long as the memory is logically separate. NT can use this hardware to its full potential; FreeBSD as it exists can not, and FreeBSD as it appears to be heading today (interrupt threads, etc.) seems to be in the same boat as Linux, et al. PCI-X will only make things worse (8.4 gigabit, burst rate). What do interrupt threads have to do with this? Terry, we've done a lot of thinking about performance implications over the last 2 years, including addressing all of the points that you appear to raise. A lot of it is in the archives. It's quite possible that we've missed something important that you haven't. But if that's the case, we'd like you to state it. All I see is you coming in, waving your hands and shouting generalities which don't really help much. The fact that people are still listening is very much an indication of the hope that you might come up with something useful. But pointing to 92 papers and saying it's in there [somewhere] isn't very helpful. Greg -- See complete headers for address and phone numbers
Re: Allocate a page at interrupt time
:I should have guessed the reason. Matthew Dillon answered this question
:on Fri, 2 Jun 2000 as follows:
:
: The VM routines that manage pages associated with objects are not
: protected against interrupts, so interrupts aren't allowed to change
: page-object associations. Otherwise an interrupt at just the wrong
: time could corrupt the mainline kernel VM code.
:
:On Thu, 2 Aug 2001, Zhihui Zhang wrote:
:
: FreeBSD can not allocate from the PQ_CACHE queue in an interrupt
: context. Can anyone explain it to me why this is the case?
:
: Thanks,

    Yes, that is precisely the reason. In -current this all changes,
    though, since interrupts are now threads. *But*, that said,
    interrupts cannot really afford to hold mutexes that might end up
    blocking them for long periods of time so I would still recommend
    that interrupt code not attempt to allocate pages out of PQ_CACHE.

					-Matt
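The discipline Matt describes is the classic 4.x one: the page-queue manipulation is not atomic, so mainline VM code raises the interrupt priority level around it, and no interrupt handler can observe the half-changed queue. A toy sketch of that bracketing pattern follows; the spl functions here are simulated stand-ins, not the real kernel primitives, and the "queue" is reduced to a single flag:

```c
#include <assert.h>

static int ipl;   /* simulated "interrupt priority level": 0 = open */

static int splvm(void)  { int s = ipl; ipl = 1; return s; } /* block interrupts */
static void splx(int s) { ipl = s; }                        /* restore old level */

static int page_on_queue = 1;

/* Remove a page from a queue with interrupts blocked around the update. */
static void vm_page_unqueue(void)
{
    int s = splvm();
    assert(ipl == 1);      /* interrupts are off while links are inconsistent */
    page_on_queue = 0;     /* the non-atomic queue manipulation */
    splx(s);
}
```

Under interrupt threads this spl bracket becomes a mutex, which is exactly why Matt warns that an interrupt thread allocating from PQ_CACHE could block on it for a long time.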
Allocate a page at interrupt time
FreeBSD can not allocate from the PQ_CACHE queue in an interrupt context. Can anyone explain it to me why this is the case? Thanks, -Zhihui