Re: Patch for critical_enter()/critical_exit() & interrupt assembly revamp, please review!
:So, your great deal of time and effort over the last week is more important
:than our time and effort over the last few months?
:
:Cheers,
:-Peter
:--
:Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]

    And how the hell did you come to that conclusion?  Nothing I have done
    has interfered in the least with what you are doing.  Not one thing.
    Even my kern_prot want-to-commit, supposedly duplicating some of JHB's
    work (but it turns out not, since JHB didn't bother instrumenting
    Giant), would not have cost John more than 5 seconds of merge work.

    In short, if you are going to make such a statement you had damn well
    better start backing it up with hard fact, because I have *HAD* *IT*
    with all the shit flying around lately that has taken what should be
    10 or 20 man-hours of work and turned it into 80 or 90.  I am not
    playing this game any more.  If you have ephemeral, unspecified
    problems with my work you can address them *AFTER* I commit.  If you
    have hard facts showing probable interference then you can address
    them *BEFORE* I commit.  So far I have not heard a single hard fact in
    all the shit flying back and forth.

    -Matt
    Matthew Dillon
    <[EMAIL PROTECTED]>

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message
Re: Patch for critical_enter()/critical_exit() & interrupt assembly revamp, please review!
Matthew Dillon wrote:
> :
> :On Mon, 25 Feb 2002, Matthew Dillon wrote:
> :
> :> Unless an unforeseen problem arises, I am going to commit this tomorrow
> :> and then start working on a cleanup patch.  I have decided to
> :
> :Please wait for jhb's opinion on it.  He seems to be offline again.
> :I think he has plans and maybe even code for more code in critical_enter().
> :I think we don't agree with these plans, but they are just as valid
> :as ours, and our versions undo many of his old changes.
>
>     I am not going to predicate my every move on permission from JHB, nor
>     do I intend to repeat the last debacle which held up (and is still
>     holding up) potential commits for a week and a half now.  JHB hasn't
>     even committed *HIS* patches and I am beginning to wonder what the
>     point is when *NOTHING* goes in.  If he had code he damn well should
>     have said something on the lists two days ago.  As it is, I have
>     invested a great deal of time and effort on this patch and it is damn
>     well going to go in so I can move on.

So, your great deal of time and effort over the last week is more important
than our time and effort over the last few months?

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
"All of this is for nothing if we don't go to the stars" - JMS/B5
Re: Patch for critical_enter()/critical_exit() & interrupt assembly revamp, please review!
:> :is written out, so I don't see much point in changing this to do the
:> :masking in software and avoid the soft interrupt.
:>
:>     I have no idea what you are talking about, Jake.  Could you supply
:>     some context?
:
:Sorry, maybe I got ahead of myself.  I was responding to your suggestion
:that other architectures should pick up this design.  This does not make
:sense for sparc64.

    All I am advocating is that either cpu_fork() or the trampoline code
    be responsible for setting up the '1 level of critical_enter()' state
    that fork_exit() is currently hacking together with twine and baling
    wire.  That seems pretty straightforward to me.  There is a window of
    opportunity between the trampoline code and the point where
    fork_exit() installs td_critnest where an interrupt can occur and mess
    things up.  In order to get around this window, fork_exit() is making
    assumptions about the underlying hardware that it should not be
    making.

    For IA32 I will probably fix it in cpu_fork() itself in a later commit
    and not have to touch the trampoline code at all, but for this current
    commit I am fixing it in the trampoline code for i386 and have a
    compatibility shim in fork_exit() for the other architectures.

    I will also be moving critical_enter()/exit() from MI to MD at some
    point (not this commit, some later commit), but for the other
    architectures it will be the same code.  I am only optimizing i386.

:Fine.  As long as their functionality with respect to MI code does not
:change I don't care.
:

    The functionality does not change unless you are making an assumption
    in regards to interrupt disablement vs interrupt deferral in the MI
    code.  Only fork_exit() makes that assumption at the moment, and the
    commit will only correct the assumption for i386.  It will not change
    anything for the other architectures (i.e. the assumption can still be
    made for the other architectures).
    -Matt
    Matthew Dillon
    <[EMAIL PROTECTED]>
Re: Patch for critical_enter()/critical_exit() & interrupt assembly revamp, please review!
Apparently, On Sun, Feb 24, 2002 at 09:03:53PM -0800, Matthew Dillon said
words to the effect of;

> :...
> :interrupts; hard interrupts are rarely masked.  The queueing is written
> :in assembler and runs outside of the kernel so it is fast.  Traps in
> :general are fast because they don't touch memory until the trapframe
> :is written out, so I don't see much point in changing this to do the
> :masking in software and avoid the soft interrupt.
>
>     I have no idea what you are talking about, Jake.  Could you supply
>     some context?

Sorry, maybe I got ahead of myself.  I was responding to your suggestion
that other architectures should pick up this design.  This does not make
sense for sparc64.

> :In regards to the patch I think that making critical_enter/exit MD in
> :certain cases is a mistake.  cpu_critical_enter/exit are already MD
> :and it looks like you could hide this behind them with some minor changes.
> :
> :Like in the critical_enter case, make cpu_critical_enter take a thread
> :as its argument.  It could be a null macro in the i386 case, which would
> :optimize out the test.
> :
> :if (td->td_critnest == 0)
> :        cpu_critical_enter(td);
> :td->td_critnest++;
> :
> :Likewise with cpu_critical_exit, make it take the thread as its argument.
> :
> :if (td->td_critnest == 1) {
> :        td->td_critnest = 0;
> :        cpu_critical_exit(td);
> :} else
> :        td->td_critnest--;
> :
> :(This is equivalent to if (--td->td_critnest == 0); it's a matter of taste.)
>
>     I'm afraid I have to strongly disagree.  I believe that
>     critical_enter()/exit() is *ALREADY* being severely misused in
>     -current.  For example, fork_exit() makes assumptions about the
>     underlying hardware in order to initialize td_critnest and the
>     scheduler mutex safely.  This is just plain broken.  fork_exit()
>     should not make any such assumptions.  It is the clear
>     responsibility of either the trampoline code or cpu_fork() to
>     properly set up td_critnest.  One can make an argument that the
>     sched_lock fixup is in fork_exit()'s domain, but you cannot make the
>     argument that the initialization of critnest (and any associated
>     hardware state) is MI.  It just isn't.
>
>     Additionally, critical_enter()/exit() themselves are certainly much
>     more MD than MI.  It makes little sense to abstract out an MI
>     interface that forces unnecessary overhead when all you have to do
>     is define an MD API that has a few MI requirements, like td_critnest
>     being a critical nesting count that the MI code can count on.  But
>     as MI code the existing critical_enter()/exit() impose additional MD
>     fields and do not give the MD code the option of not using them
>     (except in a degenerate fashion).  Specifically, the use of the
>     td_savecrit field is under the partial control of the MI code when
>     it shouldn't be, not just in fork_exit() but also in
>     critical_enter() and critical_exit() themselves.
>
>     The existing critical_enter/exit code is far too abstracted for its
>     low-level purpose and I *WILL* be moving them into MD sections of
>     the system where they belong.  These are really MD routines, not MI
>     routines.

Fine.  As long as their functionality with respect to MI code does not
change I don't care.
Re: Patch for critical_enter()/critical_exit() & interrupt assembly revamp, please review!
:Removing cli and sti from the critical functions saves us 300 ns on

    I'm going to make a correction here... I didn't realize I had WITNESS
    turned on.  300 ns of savings makes a lot of sense when WITNESS is
    turned on, because witness manipulates a number of spin locks.  I am
    going to rerun the test on a 2xCPU box without WITNESS.

    -Matt
Re: Patch for critical_enter()/critical_exit() & interrupt assembly revamp, please review!
:...
:interrupts; hard interrupts are rarely masked.  The queueing is written
:in assembler and runs outside of the kernel so it is fast.  Traps in
:general are fast because they don't touch memory until the trapframe
:is written out, so I don't see much point in changing this to do the
:masking in software and avoid the soft interrupt.

    I have no idea what you are talking about, Jake.  Could you supply
    some context?

:In regards to the patch I think that making critical_enter/exit MD in
:certain cases is a mistake.  cpu_critical_enter/exit are already MD
:and it looks like you could hide this behind them with some minor changes.
:
:Like in the critical_enter case, make cpu_critical_enter take a thread
:as its argument.  It could be a null macro in the i386 case, which would
:optimize out the test.
:
:if (td->td_critnest == 0)
:        cpu_critical_enter(td);
:td->td_critnest++;
:
:Likewise with cpu_critical_exit, make it take the thread as its argument.
:
:if (td->td_critnest == 1) {
:        td->td_critnest = 0;
:        cpu_critical_exit(td);
:} else
:        td->td_critnest--;
:
:(This is equivalent to if (--td->td_critnest == 0); it's a matter of taste.)

    I'm afraid I have to strongly disagree.  I believe that
    critical_enter()/exit() is *ALREADY* being severely misused in
    -current.  For example, fork_exit() makes assumptions about the
    underlying hardware in order to initialize td_critnest and the
    scheduler mutex safely.  This is just plain broken.  fork_exit()
    should not make any such assumptions.  It is the clear responsibility
    of either the trampoline code or cpu_fork() to properly set up
    td_critnest.  One can make an argument that the sched_lock fixup is
    in fork_exit()'s domain, but you cannot make the argument that the
    initialization of critnest (and any associated hardware state) is MI.
    It just isn't.

    Additionally, critical_enter()/exit() themselves are certainly much
    more MD than MI.  It makes little sense to abstract out an MI
    interface that forces unnecessary overhead when all you have to do is
    define an MD API that has a few MI requirements, like td_critnest
    being a critical nesting count that the MI code can count on.  But as
    MI code the existing critical_enter()/exit() impose additional MD
    fields and do not give the MD code the option of not using them
    (except in a degenerate fashion).  Specifically, the use of the
    td_savecrit field is under the partial control of the MI code when it
    shouldn't be, not just in fork_exit() but also in critical_enter()
    and critical_exit() themselves.

    The existing critical_enter/exit code is far too abstracted for its
    low-level purpose and I *WILL* be moving them into MD sections of the
    system where they belong.  These are really MD routines, not MI
    routines.

:I notice that you are using cpu_critical_exit for the purpose of disabling
:interrupts only, like cli, sti.  I think that you should use the api that
:tmm wrote for sparc64 for this so that it can do more.  i386 does not
:really have cli/sti equivalents otherwise.
:...
:s = intr_disable()   disables interrupts and returns the old state.
:intr_restore(s)      restores the state s.
:
:I don't really like this change in general because it complicates things
:more than I'd like, but I also don't have a bearing on how expensive
:cli/sti are.  It would seem to me that it just takes a few more clock
:cycles, which isn't that important.  I understand that it increases
:latency for the fast part of the interrupt handler, but they are not
:able to run in critical sections anyway due to the software masking.
:
:Jake

    Removing cli and sti from the critical functions saves us 300 ns on
    every single system call, interrupt, and most traps.  And that's with
    the sysctl instrumentation and without them being inlined.  With the
    sysctl instrumentation removed and with critical_enter() and
    critical_exit() inlined, my guess is that we will save on the order
    of 500 ns.  This is not insignificant on a P3, and it will probably
    save even more on a P4.

    With critical_enter() and critical_exit() moved into MD sections, we
    can do away with cpu_critical_enter() and cpu_critical_exit() (that
    is, make them specific to the MD implementation of critical_enter()
    and critical_exit() on those platforms that want to keep the
    two-layer abstraction).  But for i386 I suppose I could just rename
    them to intr_disable() and intr_restore().  I don't understand your
    comment about functionality; they work just fine.  It's just the name
    that needs changing.

    -Matt
    Matthew Dillon
    <[EMAIL PROTECTED]>
Re: Patch for critical_enter()/critical_exit() & interrupt assembly revamp, please review!
Apparently, On Sun, Feb 24, 2002 at 11:12:22AM -0800, Matthew Dillon said
words to the effect of;

> NOTES:
>
>     I would like to thank Bruce for supplying the sample code that
>     allowed me to do this in a day instead of several days.
>
>     * debug.critical_mode sysctl.  This will not be in the final
>       commit, nor will any of the code that tests the variable.  The
>       final commit will use code as if the critical_mode were '1'.
>
>       The default is 1, which means to use the new streamlined
>       interrupt and cpu_critical_enter/exit() code.  Setting it to 0
>       will revert to the old hard-interrupt-disablement operation.  You
>       can change the mode at any time.
>
>     * Additional cpu_critical_enter/exit() calls around icu_lock.
>       Since critical_enter() no longer disables interrupts, special
>       care must be taken when dealing with the icu_lock spin mutex
>       because it is the one thing the interrupt code needs to be able
>       to defer the interrupt.
>
>     * MACHINE_CRITICAL_ENTER define.  This exists to maintain
>       compatibility with other architectures.  i386 defines this to
>       cause fork_exit to use the new API and to allow the i386 MD code
>       to supply the critical_enter() and critical_exit() procedures
>       rather than kern_switch.c.
>
>       I would much prefer it if the other architectures were brought
>       around to use this new mechanism.  The old mechanism makes
>       assumptions in regards to hard disablement that are no longer
>       correct for i386.

For sparc64 we basically already have this in hardware.  There is both an
interrupt enable (IE) bit and a soft interrupt priority.  Hardware
interrupts are masked by the IE bit.  In all cases they just queue the
interrupt packet that is sent by the hardware in a per-cpu interrupt
queue and post a soft interrupt.  The critical_* stuff only masks soft
interrupts; hard interrupts are rarely masked.  The queueing is written
in assembler and runs outside of the kernel so it is fast.  Traps in
general are fast because they don't touch memory until the trapframe is
written out, so I don't see much point in changing this to do the masking
in software and avoid the soft interrupt.

In regards to the patch I think that making critical_enter/exit MD in
certain cases is a mistake.  cpu_critical_enter/exit are already MD and
it looks like you could hide this behind them with some minor changes.

Like in the critical_enter case, make cpu_critical_enter take a thread as
its argument.  It could be a null macro in the i386 case, which would
optimize out the test.

if (td->td_critnest == 0)
        cpu_critical_enter(td);
td->td_critnest++;

Likewise with cpu_critical_exit, make it take the thread as its argument.

if (td->td_critnest == 1) {
        td->td_critnest = 0;
        cpu_critical_exit(td);
} else
        td->td_critnest--;

(This is equivalent to if (--td->td_critnest == 0); it's a matter of
taste.)

The fork case is a little different, hmm.  I'd have to think about it
more.  An md function/macro would probably suffice.

I notice that you are using cpu_critical_exit for the purpose of
disabling interrupts only, like cli, sti.  I think that you should use
the api that tmm wrote for sparc64 for this so that it can do more.  i386
does not really have cli/sti equivalents otherwise.
...

s = intr_disable()   disables interrupts and returns the old state.
intr_restore(s)      restores the state s.

I don't really like this change in general because it complicates things
more than I'd like, but I also don't have a bearing on how expensive
cli/sti are.  It would seem to me that it just takes a few more clock
cycles, which isn't that important.  I understand that it increases
latency for the fast part of the interrupt handler, but they are not able
to run in critical sections anyway due to the software masking.

Jake
Re: Patch for critical_enter()/critical_exit() & interrupt assembly revamp, please review!
Matt,

Thanks for the great explanation!  Hopefully, I'll get a chance to try
this for alpha (though I won't mind if somebody beats me to it).

I've also been meaning to get iprobe (kernel profiler) going again too,
to find out why -current on alpha sucks so badly right now.  (Your
gettimeofday() syscall test shows that -current has over 2x the overhead
for syscalls on alpha, even w/o INVARIANTS.)

Drew
Re: Patch for critical_enter()/critical_exit() & interrupt assembly revamp, please review!
:I think you're making critical_enter()/critical_exit() cheaper by not
:actually messing with the interrupt hardware when they're called.
:Very much like what we do in 4-stable.
:
:Instead, you set/clear a per-cpu flag (or incr/decr a counter).  If an
:interrupt comes in when you're inside a critical section, you make
:note of it & mask interrupts on that cpu.  In critical_exit(),
:you run the interrupt handler if there's a pending interrupt, and
:unmask interrupts.  If there's not a pending interrupt, you just clear
:the flag (or decrement a counter).
:
:Is that basically it?

    Right.  And it turns out to be a whole lot easier to do in -current
    because we do not have the _cpl baggage.  Basically critical_enter()
    devolves down into:

        ++td->td_critnest;

    And critical_exit() devolves down into:

        if (--td->td_critnest == 0 && PCPU_GET(int_pending))
                _deal_with_interrupts_marked_pending();

    As we found with the lazy spl()s in -stable, this is a huge win-win
    situation because no interrupts typically occur while one is in
    short-lived critical sections.  On -current, critical sections are
    for spin locks (i.e. mainly the sched_lock) and a few other tiny
    pieces of code, and that is pretty much it.  It is a far better
    situation than the massive use of spl*()'s we see in -stable, and
    once Giant is gone it is going to be pretty awesome.

:If so, I'm wondering if this is even possible on alpha, where we don't
:have direct access to the hardware.  However, according to the pseudo
:code for rti in the Brown book, munging with the saved PSL in the
:trapframe to set the IPL should work.  Hmm..
:
:Drew

    Well, on IA32 there are effectively two types of interrupts (edge
    and level triggered), three interrupt masks (the processor interrupt
    disable, a master bit mask, and then the per-device masks), and two
    ways of handling interrupts (as a scheduled interrupt thread or as a
    'fast' interrupt).  Note that the masking requirement is already
    part and parcel of -current because most interrupts are now
    scheduled.  The hardware interrupt routine must schedule the
    interrupt thread and then mask the interrupt, since it is not
    dealing with it right then and there and must be able to return to
    whatever was running before.  The interrupt thread will unmask the
    interrupt when it is through processing it.

    But *which* mask one is able to use determines how efficiently one
    can write critical_enter() and critical_exit().  On IA32 we can use
    the master bit mask instead of the processor interrupt disable,
    which allows the optimization you see in this patch set.

    On alpha, if you are only able to use the processor interrupt
    disable you can still optimize critical_enter() and critical_exit().
    That is, you can do this:

    critical_enter():
        ++td->td_critnest;

    critical_exit():
        if (--td->td_critnest == 0 && PCPU_GET(int_pending)) {
                PCPU_SET(int_pending, 0);
                if (PCPU_GET(ipending))
                        deal_with_pending_edge_triggered_ints();
                enable_hardware_interrupts();
                (level triggered ints will immediately cause an
                interrupt which, since critnest is 0, can now be taken)
        }

    hard interrupt vector code:
        if (curthread->td_critnest) {
                Save any edge triggered interrupts to a bitmap somewhere
                (i.e. ipending |= 1 << irq_that_was_taken).
                Adjust the return frame such that the restored PSL has
                interrupts disabled, so it does not immediately re-enter
                the hardware interrupt vector code.
        }

    As you can see, it should be possible to make critical_enter() and
    critical_exit() streamlined even on architectures where you may not
    have direct access to interrupt masks.  In fact, I considered doing
    exactly this for the i386 but decided that it was cleaner to mask
    level interrupts and throw everything into the same ipending hopper.

    -Matt
    Matthew Dillon
    <[EMAIL PROTECTED]>
Re: Patch for critical_enter()/critical_exit() & interrupt assembly revamp, please review!
I'm not fluent in x86 asm, so can you confirm for me what you're
attempting to do here?

I think you're making critical_enter()/critical_exit() cheaper by not
actually messing with the interrupt hardware when they're called.  Very
much like what we do in 4-stable.

Instead, you set/clear a per-cpu flag (or incr/decr a counter).  If an
interrupt comes in when you're inside a critical section, you make note
of it & mask interrupts on that cpu.  In critical_exit(), you run the
interrupt handler if there's a pending interrupt, and unmask interrupts.
If there's not a pending interrupt, you just clear the flag (or decrement
a counter).

Is that basically it?

If so, I'm wondering if this is even possible on alpha, where we don't
have direct access to the hardware.  However, according to the pseudo
code for rti in the Brown book, munging with the saved PSL in the
trapframe to set the IPL should work.  Hmm..

Drew
Patch for critical_enter()/critical_exit() & interrupt assembly revamp, please review!
NOTES:

    I would like to thank Bruce for supplying the sample code that
    allowed me to do this in a day instead of several days.

    * debug.critical_mode sysctl.  This will not be in the final commit,
      nor will any of the code that tests the variable.  The final
      commit will use code as if the critical_mode were '1'.

      The default is 1, which means to use the new streamlined interrupt
      and cpu_critical_enter/exit() code.  Setting it to 0 will revert
      to the old hard-interrupt-disablement operation.  You can change
      the mode at any time.

    * Additional cpu_critical_enter/exit() calls around icu_lock.  Since
      critical_enter() no longer disables interrupts, special care must
      be taken when dealing with the icu_lock spin mutex because it is
      the one thing the interrupt code needs to be able to defer the
      interrupt.

    * MACHINE_CRITICAL_ENTER define.  This exists to maintain
      compatibility with other architectures.  i386 defines this to
      cause fork_exit to use the new API and to allow the i386 MD code
      to supply the critical_enter() and critical_exit() procedures
      rather than kern_switch.c.

      I would much prefer it if the other architectures were brought
      around to use this new mechanism.  The old mechanism makes
      assumptions in regards to hard disablement that are no longer
      correct for i386.

    * Trampoline 'sti'.  In the final commit, the trampoline will simply
      'sti' after setting up td_critnest.  The other junk to handle the
      hard-disablement case will be gone.

    * PSL save/restore in cpu_switch().  In the original code,
      interrupts were always hard-disabled due to holding the
      sched_lock, so cpu_switch never bothered to save/restore the hard
      interrupt enable/disable bit (the PSL).  In the new code, hard
      disablement has no relationship to the holding of spin mutexes,
      so we have to save/restore the PSL.  If we don't, one thread's
      interrupt disablement will propagate to another thread
      unexpectedly.

    * Additional STIs.  It may be possible to emplace additional STIs in
      the code.  For example, we should be able to enable interrupts in
      the dounpend() code after we complete processing of FAST
      interrupts and start processing normal interrupts.

    * Additional cpu_critical_enter()/exit() calls in CY and TIMER code.
      Bruce had additional hard interrupt disablements in these modules.
      I'm not sure why, so if I need to do that as well I would like to
      know.

    * Additional optimization and work.  This is ongoing work, but this
      basic patch set, with some cleanups, is probably what I will
      commit initially.

    This code will give us a huge amount of flexibility in regards to
    handling interrupts.

    -Matt

Index: i386/i386/exception.s
===================================================================
RCS file: /home/ncvs/src/sys/i386/i386/exception.s,v
retrieving revision 1.91
diff -u -r1.91 exception.s
--- i386/i386/exception.s	11 Feb 2002 03:41:58 -	1.91
+++ i386/i386/exception.s	24 Feb 2002 08:41:40 -
@@ -222,6 +222,18 @@
 	pushl	%esp		/* trapframe pointer */
 	pushl	%ebx		/* arg1 */
 	pushl	%esi		/* function */
+	movl	PCPU(CURTHREAD),%ebx	/* setup critnest */
+	movl	$1,TD_CRITNEST(%ebx)
+	cmpl	$0,critical_mode
+	jne	1f
+	pushfl
+	popl	TD_SAVECRIT(%ebx)
+	orl	$PSL_I,TD_SAVECRIT(%ebx)
+	jmp	2f
+1:
+	movl	$-1,TD_SAVECRIT(%ebx)
+	sti			/* enable interrupts */
+2:
 	call	fork_exit
 	addl	$12,%esp	/* cut from syscall */

Index: i386/i386/genassym.c
===================================================================
RCS file: /home/ncvs/src/sys/i386/i386/genassym.c,v
retrieving revision 1.121
diff -u -r1.121 genassym.c
--- i386/i386/genassym.c	17 Feb 2002 17:40:27 -	1.121
+++ i386/i386/genassym.c	24 Feb 2002 09:06:56 -
@@ -89,6 +89,8 @@
 ASSYM(TD_KSE, offsetof(struct thread, td_kse));
 ASSYM(TD_PROC, offsetof(struct thread, td_proc));
 ASSYM(TD_INTR_NESTING_LEVEL, offsetof(struct thread, td_intr_nesting_level));
+ASSYM(TD_CRITNEST, offsetof(struct thread, td_critnest));
+ASSYM(TD_SAVECRIT, offsetof(struct thread, td_savecrit));
 ASSYM(P_MD, offsetof(struct proc, p_md));
 ASSYM(MD_LDT, offsetof(struct mdproc, md_ldt));
@@ -134,6 +136,7 @@
 ASSYM(PCB_DR3, offsetof(struct pcb, pcb_dr3));
 ASSYM(PCB_DR6, offsetof(struct pcb, pcb_dr6));
 ASSYM(PCB_DR7, offsetof(struct pcb, pcb_dr7));
+ASSYM(PCB_PSL, offsetof(struct pcb, pcb_psl));
 ASSYM(PCB_DBREGS, PCB_DBREGS);
 ASSYM(PCB_EXT, offsetof(struct pcb, pcb_ext));
@@ -176,6 +179,10 @@
 ASSYM(PC_SIZEOF, sizeof(struct pcpu));
 ASSYM(PC_PRVSPACE, offsetof(struct pcp