Re: i586 FP optimizations hosed.
On Tue, 3 Apr 2001, John Baldwin wrote:

> On 03-Apr-01 Bruce Evans wrote:
> > There are many other possibilities:
> > ...
> > - don't attempt to save the FPU state reentrantly, since this doesn't work with preemptive context switching unless interrupt handlers also save the state reentrantly, which they shouldn't do because it is too wasteful. Instead, save the state in the pcb as is already done in copy{in,out} so that cpu_switch() handles it. This may be too wasteful too.
> > - as in previous possibility, but avoid switching the entire state. For the FPU, the entire state must be switched, but for SSE individual registers can be saved and restored. Saving and restoring individual registers reentrantly would be easy but no longer works for the SMP case. Switching a subset of the state would not be so easy.
>
> Hm. I think I'm liking the next to last. Even if there is additional overhead, it should still outperform generic_bcopy and friends on the CPUs in question, right?

Not clear. We now have heavyweight context switches that switch the FPU for every interrupt. If we make switching the FPU for ithreads fundamental instead of just a source of bugs, then it will be harder to implement lightweight context switches for ithreads (context switches won't be lightweight if they switch the FPU, and we might have to do extra work to avoid them). Also, if the FPU is actually used a lot, then it will have to be switched a lot.

Bruce

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: i586 FP optimizations hosed.
On Mon, 2 Apr 2001, John Baldwin wrote:

> On 31-Mar-01 Bruce Evans wrote:
> > [about i586-optimized copying and bzeroing]
> > - we start using the FPU on a CPU with a free FPU (we used to free the FPU in some cases; now we only use optimizations in bcopy/bzero if the FPU was free to begin with).
> > - we do a preemptive context switch and come back using a different FPU. The different CPU might even be unfree, and that case is now detected. In other cases, we just corrupt data by using different FPU registers :-(.
>
> Ugh. Hrm, then we need to either disable interrupts inside of i586_* or set a

This would break fast interrupts :-).

> hard affinity flag in the process such that all other CPUs will ignore it and only p_lastcpu will run it next.

There are many other possibilities:
- don't use these routines.
- don't use these routines for the SMP case.
- disable preemptive context switching for the CPU that is using the FPU. The hard affinity flag could be used for this as a special case.
- acquire sched_lock so that all sorts of context switching are disabled for all CPUs.
- don't attempt to save the FPU state reentrantly, since this doesn't work with preemptive context switching unless interrupt handlers also save the state reentrantly, which they shouldn't do because it is too wasteful. Instead, save the state in the pcb as is already done in copy{in,out} so that cpu_switch() handles it. This may be too wasteful too.
- as in previous possibility, but avoid switching the entire state. For the FPU, the entire state must be switched, but for SSE individual registers can be saved and restored. Saving and restoring individual registers reentrantly would be easy but no longer works for the SMP case. Switching a subset of the state would not be so easy.

Bruce
Re: i586 FP optimizations hosed.
On 03-Apr-01 Bruce Evans wrote:
> On Mon, 2 Apr 2001, John Baldwin wrote:
> > On 31-Mar-01 Bruce Evans wrote:
> > > [about i586-optimized copying and bzeroing]
> > > - we start using the FPU on a CPU with a free FPU (we used to free the FPU in some cases; now we only use optimizations in bcopy/bzero if the FPU was free to begin with).
> > > - we do a preemptive context switch and come back using a different FPU. The different CPU might even be unfree, and that case is now detected. In other cases, we just corrupt data by using different FPU registers :-(.
> >
> > Ugh. Hrm, then we need to either disable interrupts inside of i586_* or set a
>
> This would break fast interrupts :-).
>
> > hard affinity flag in the process such that all other CPUs will ignore it and only p_lastcpu will run it next.
>
> There are many other possibilities:
> - don't use these routines.
> - don't use these routines for the SMP case.
> - disable preemptive context switching for the CPU that is using the FPU. The hard affinity flag could be used for this as a special case.
> - acquire sched_lock so that all sorts of context switching are disabled for all CPUs.
> - don't attempt to save the FPU state reentrantly, since this doesn't work with preemptive context switching unless interrupt handlers also save the state reentrantly, which they shouldn't do because it is too wasteful. Instead, save the state in the pcb as is already done in copy{in,out} so that cpu_switch() handles it. This may be too wasteful too.
> - as in previous possibility, but avoid switching the entire state. For the FPU, the entire state must be switched, but for SSE individual registers can be saved and restored. Saving and restoring individual registers reentrantly would be easy but no longer works for the SMP case. Switching a subset of the state would not be so easy.

Hm. I think I'm liking the next to last. Even if there is additional overhead, it should still outperform generic_bcopy and friends on the CPUs in question, right?

> Bruce

--
John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.Baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
Re: i586 FP optimizations hosed.
How about this:

    * a per-process fp-in-use flag
    * a per-cpu fp-save-block ID (incrementing serial number)

When a context switch from process A to process B occurs:

    if (process-A-using-FP) {
        increment ID
        save FP state for process A and ID
        fp-save-block-id = ID
    }
    if (process-B-using-FP) {
        if (fp-save-block-id != B-saved_fp_state-ID) {
            restore FP state for B
        }
    }

So let's take a few test cases:

    * Interrupt occurs
      - fp context saved
      - interrupt is scheduled on same cpu (let's say)
      - interrupt completes, original process switched back to
      - ID matches, no restore operation required.

    * Multiple interrupts occur or several context switches occur from the point of view of some cpu.
      - fp context saved
      - multiple interrupts or context switches occur, but none require the use of the FPU
      - switch back to original process
      - ID matches, no restore operation required.

    * Interrupt occurs, original process woken up on different cpu
      - fp context saved
      - all sorts of things happen
      - original process woken up on different cpu
      - ID does not match (ID is per-cpu), restore FP context

    * Kernel bcopy operation in case where no process switch occurs
      (i.e. current process was using the FP and while we went into
      kernel mode, we have not yet saved the FP state anywhere).

        if (process-A-using-FP) {
            push FP state for process A on stack
            push process-A-using-FP flag
        }
        set process-A-using-FP flag
        use FP registers to do bcopy

        restore process-A-using-FP flag from stack
        restore FP state from stack

This solution isn't perfect: we still wind up saving the FP state in the case of an interrupt, but at least we don't have to restore it if it is determined to still be valid. Additionally, it is possible that we might be able to switch between several processes under normal operating conditions and not have to restore the FP state of the original one when we get back to it, if none of the intervening processes (or the kernel) use the FP.

I can't think of anything more optimal that isn't overly complex.
-Matt
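
The context-switch rule in the message above can be modelled in a few lines of C. This is only a user-space sketch of the idea, not kernel code: the struct and function names are hypothetical, an `int` stands in for the real FPU register file, and the `saved_cpu` field is an assumption added here so that serial numbers from two different CPUs never compare equal (the proposal only says the ID is per-cpu).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct cpu {
    int      index;             /* hypothetical CPU number */
    uint64_t fp_save_block_id;  /* per-cpu incrementing serial number */
    int      fpu_regs;          /* stand-in for the hardware FPU state */
    unsigned saves, restores;   /* bookkeeping for this sketch only */
};

struct proc {
    bool     fp_in_use;   /* per-process fp-in-use flag */
    uint64_t saved_id;    /* ID recorded with the saved FP state */
    int      saved_cpu;   /* assumption: CPU that did the save; -1 = never */
    int      saved_regs;  /* saved copy of the FP state */
};

/* Switch CPU c from process `from` to process `to` (either may be NULL). */
static void fp_switch(struct cpu *c, struct proc *from, struct proc *to)
{
    if (from != NULL && from->fp_in_use) {
        /* Always save on the way out, tagging the save with a new ID. */
        c->fp_save_block_id++;
        from->saved_regs = c->fpu_regs;
        from->saved_id = c->fp_save_block_id;
        from->saved_cpu = c->index;
        c->saves++;
    }
    if (to != NULL && to->fp_in_use) {
        /* Restore only if this CPU's registers no longer hold `to`'s state. */
        if (to->saved_cpu != c->index ||
            to->saved_id != c->fp_save_block_id) {
            c->fpu_regs = to->saved_regs;
            c->restores++;
        }
    }
}
```

Switching A out to an interrupt thread that does not touch the FPU and then back to A on the same CPU costs one save and no restore; waking A on another CPU forces a restore, which matches the first and third test cases listed in the message.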
Re: i586 FP optimizations hosed.
:* Kernel bcopy operation in case where no process switch occurs
:  (i.e. current process was using the FP and while we went into
:  kernel mode, we have not yet saved the FP state anywhere).
:
:    if (process-A-using-FP) {
:        push FP state for process A on stack
:        push process-A-using-FP flag
:    }
:    set process-A-using-FP flag
:    use FP registers to do bcopy
:
:    restore process-A-using-FP flag from stack
:    restore FP state from stack

Oops. I meant:

    if (process-A-using-FP) {
        push FP state for process A on stack
    }
    push process-A-using-FP flag
    set process-A-using-FP flag
    use FP registers to do bcopy
    restore process-A-using-FP flag from stack
    if (process-A-using-FP) {
        restore FP state from stack
    }

-Matt
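
The corrected sequence can be written out as a small C sketch. It is a user-space model under stated assumptions, not the i586 routines themselves: `struct fp_ctx` and `fp_bcopy` are hypothetical names, an `int` stands in for the FP register contents, and `memcpy` stands in for the FP-register copy loop, which is modelled as clobbering the registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct fp_ctx {
    bool in_use;  /* the process-A-using-FP flag */
    int  regs;    /* stand-in for the FP register contents */
};

/* Argument order follows bcopy(src, dst, len). */
static void fp_bcopy(struct fp_ctx *fp, const void *src, void *dst, size_t len)
{
    bool saved_flag = fp->in_use;  /* "push process-A-using-FP flag" */
    int  saved_regs = 0;

    if (fp->in_use)
        saved_regs = fp->regs;     /* "push FP state ... on stack" */

    fp->in_use = true;             /* FP is now claimed by the copy */
    fp->regs = -1;                 /* model: the copy loop clobbers the FP state */
    memcpy(dst, src, len);         /* stand-in for the FP-based copy */

    fp->in_use = saved_flag;       /* "restore ... flag from stack" */
    if (fp->in_use)
        fp->regs = saved_regs;     /* restore FP state only if it was saved */
}
```

The flag is saved unconditionally and the register state only when the flag was set, which is the difference between the corrected version and the first one.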
Re: i586 FP optimizations hosed.
-On [20010331 05:30], John Baldwin ([EMAIL PROTECTED]) wrote:
> It looks like it is just broken in the SMP case.

Note: I got an i586_bzero_oops on a UP box.

It was invoked through the random_process and the random_kthread.

--
Jeroen Ruigrok van der Werven/Asmodai .oUo. asmodai@[wxs.nl|freebsd.org]
Documentation nutter/C-rated Coder BSD: Technical excellence at its best
D78D D0AD 244D 1D12 C9CA 7152 035C 1138 546A B867
How the gods kill...
Re: i586 FP optimizations hosed.
On Tue, 3 Apr 2001, Jeroen Ruigrok/Asmodai wrote:

> -On [20010331 05:30], John Baldwin ([EMAIL PROTECTED]) wrote:
> > It looks like it is just broken in the SMP case.
>
> Note: I got a i586_bzero_oops on an UP box.
>
> It was invoked through the random_process and the random_kthread.

This means that i586_bzero was interrupted and the scheduler gave control to something else that used the FPU before returning control to i586_bzero. I think I can see how this can happen :-(. i586_bzero may be being used by a (non-ithread) process. Control may be switched to another (non-ithread) process that uses the FPU for copyin/copyout. The locking for bzero/bcopy doesn't help because copyin/copyout don't use it (they switch the context).

The fix should be to switch the context in all cases. This wasn't done before because it didn't work in interrupt handlers in SMPng. It still doesn't work in fast interrupt handlers, but this shouldn't be a problem since bcopy/bzero are not in the (almost empty) set of functions that may be called from a fast interrupt handler.

Bruce
Re: i586 FP optimizations hosed.
On 31-Mar-01 Bruce Evans wrote:
> On Fri, 30 Mar 2001, John Baldwin wrote:
> > On 30-Mar-01 David O'Brien wrote:
> > > On Fri, Mar 30, 2001 at 07:45:43AM +0200, Mark Murray wrote:
> > > > I thought the 586 FP stuff was disabled?
> > >
> > > Nope. Depending on how current you are, it was either left broken. I committed BDE's fix to exception.s that fixed things for K6-2 users.
> >
> > It looks like it is just broken in the SMP case.
>
> It is more just broken than before in the SMP case. Preemptive context switching in the kernel did most of the breaking, and recent changes added sanity checks that detected a very broken case. With preemptive context switching, the following can happen:
> - we start using the FPU on a CPU with a free FPU (we used to free the FPU in some cases; now we only use optimizations in bcopy/bzero if the FPU was free to begin with).
> - we do a preemptive context switch and come back using a different FPU. The different CPU might even be unfree, and that case is now detected. In other cases, we just corrupt data by using different FPU registers :-(.

Ugh. Hrm, then we need to either disable interrupts inside of i586_* or set a hard affinity flag in the process such that all other CPUs will ignore it and only p_lastcpu will run it next.

> Bruce

--
John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
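
The hard affinity idea in the reply above amounts to a one-line scheduler test. The following is a rough illustration only: `p_lastcpu` is the field named in the message, while `p_hardaffinity`, `cpu_may_run`, `fpu_section_enter` and `fpu_section_exit` are hypothetical names invented for this sketch and are not taken from the FreeBSD sources.

```c
#include <assert.h>
#include <stdbool.h>

struct proc {
    int  p_lastcpu;       /* CPU the process last ran on */
    bool p_hardaffinity;  /* hypothetical flag: only p_lastcpu may run it */
};

/* Scheduler-side check: may `cpu` pick this process to run next? */
static bool cpu_may_run(const struct proc *p, int cpu)
{
    return !p->p_hardaffinity || p->p_lastcpu == cpu;
}

/* Bracket an FPU-using routine so the process cannot migrate meanwhile. */
static void fpu_section_enter(struct proc *p, int curcpu)
{
    p->p_lastcpu = curcpu;
    p->p_hardaffinity = true;
}

static void fpu_section_exit(struct proc *p)
{
    p->p_hardaffinity = false;
}
```

Pinning of this kind only keeps the process from resuming on a different CPU. It does nothing about another process using the FPU on the same CPU in the meantime, so it would not by itself explain or prevent a failure on a uniprocessor machine.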
Re: i586 FP optimizations hosed.
On Sat, Mar 31, 2001 at 06:34:12PM +0200, Leif Neland wrote:
> > CPU1 stopping CPUs: 0x0001... Stopped.
> > Stopped at i586_bzero_oops+0x1: jmp i586_bzero_oops
>
> I get panics on that instruction too on my old 60MHz P5.
> I've got a core dump, will tell more when I can interpret it...

Mark's machine is a dual SMP machine. I don't recall there being dual-P5-60's, but this is me guessing. You're going to have to provide a lot more details. Is your machine a UP or MP machine? The current sources are believed to work fine in a UP kernel.

--
-- David ([EMAIL PROTECTED])
Re: i586 FP optimizations hosed.
On Fri, Mar 30, 2001 at 07:45:43AM +0200, Mark Murray wrote:
> I thought the 586 FP stuff was disabled?

Nope. Depending on how current you are, it was either left broken. I committed BDE's fix to exception.s that fixed things for K6-2 users.
Re: i586 FP optimizations hosed.
On 30-Mar-01 David O'Brien wrote:
> On Fri, Mar 30, 2001 at 07:45:43AM +0200, Mark Murray wrote:
> > I thought the 586 FP stuff was disabled?
>
> Nope. Depending on how current you are, it was either left broken. I committed BDE's fix to exception.s that fixed things for K6-2 users.

It looks like it is just broken in the SMP case.

--
John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
Re: i586 FP optimizations hosed.
On Fri, 30 Mar 2001, John Baldwin wrote:

> On 30-Mar-01 David O'Brien wrote:
> > On Fri, Mar 30, 2001 at 07:45:43AM +0200, Mark Murray wrote:
> > > I thought the 586 FP stuff was disabled?
> >
> > Nope. Depending on how current you are, it was either left broken. I committed BDE's fix to exception.s that fixed things for K6-2 users.
>
> It looks like it is just broken in the SMP case.

It is more just broken than before in the SMP case. Preemptive context switching in the kernel did most of the breaking, and recent changes added sanity checks that detected a very broken case. With preemptive context switching, the following can happen:
- we start using the FPU on a CPU with a free FPU (we used to free the FPU in some cases; now we only use optimizations in bcopy/bzero if the FPU was free to begin with).
- we do a preemptive context switch and come back using a different FPU. The different CPU might even be unfree, and that case is now detected. In other cases, we just corrupt data by using different FPU registers :-(.

Bruce
i586 FP optimizations hosed.
Hi

I have an SMP kernel with I586 and I686 support.

If I boot it on a 686 it works. On a 586 it craps out with

CPU1 stopping CPUs: 0x0001... Stopped.
Stopped at i586_bzero_oops+0x1: jmp i586_bzero_oops

late into the boot (during network/RPC init stuff).

I thought the 586 FP stuff was disabled?

M
--
Mark Murray
Warning: this .sig is umop ap!sdn