Author: kib
Date: Fri Nov 19 09:49:14 2010
New Revision: 215513
URL: http://svn.freebsd.org/changeset/base/215513

Log:
  Merge the fpu_kern_enter/fpu_kern_leave KPI and followup fixes for the
  amd64 suspend/resume support.
  
  Tested by:    Mike Tancsa
  Also tested by:       Dewayne Geraghty <dewayne.geraghty heuristicsystems com au>,
       Daryl Richards <daryl isletech net>
  
  Below is the svn log of the merged revisions.
  ------------------------------------------------------------------------
  r197455 | emaste | 2009-09-24 17:26:42 +0300 (Thu, 24 Sep 2009) | 5 lines
  
  Add a backtrace to the "fpudna in kernel mode!" case, to help track down
  where this comes from.
  
  Reviewed by:  bde
  
  ------------------------------------------------------------------------
  r197863 | jkim | 2009-10-08 20:41:53 +0300 (Thu, 08 Oct 2009) | 8 lines
  
  Clean up amd64 suspend/resume code.
  
  - Allocate memory for wakeup code after ACPI bus is attached.  The early
  memory allocation hack was inherited from i386 but amd64 does not need it.
  - Exclude real mode IVT and BDA explicitly.  Improve comments about memory
  allocation and reason for the exclusions.  It is a no-op in reality, though.
  - Remove an unnecessary CLD from wakeup code and re-align.
  
  ------------------------------------------------------------------------
  r198931 | jkim | 2009-11-05 00:39:18 +0200 (Thu, 05 Nov 2009) | 2 lines
  
  Tweak memory allocation for amd64 suspend/resume CPU context.
  
  ------------------------------------------------------------------------
  r200280 | jkim | 2009-12-09 00:38:42 +0200 (Wed, 09 Dec 2009) | 2 lines
  
  Simplify a macro so that it does not generate unnecessary symbols.
  
  ------------------------------------------------------------------------
  r205444 | emaste | 2010-03-22 13:52:53 +0200 (Mon, 22 Mar 2010) | 7 lines
  
  Merge r197455 from amd64:
  
    Add a backtrace to the "fpudna in kernel mode!" case, to help track down
    where this comes from.
  
    Reviewed by:        bde
  
  ------------------------------------------------------------------------
  r208833 | kib | 2010-06-05 18:59:59 +0300 (Sat, 05 Jun 2010) | 15 lines
  
  Introduce the x86 kernel interfaces to allow kernel code to use
  FPU/SSE hardware. Caller should provide a save area that is chained
  into the stack of the areas; pcb save_area for usermode FPU state is
  on top. The pcb now contains a pointer to the current FPU saved area,
  used during FPUDNA handling and context switches.  There is also a
  facility to allow the kernel thread to use pcb save_area.
  
  Change the dreaded "npxdna in kernel mode!" warnings into panics
  when FPU usage is not registered.
  
  KPI discussed with:   fabient
  Tested by:    pho, fabient
  Hardware provided by: Sentex Communications
  MFC after:    1 month
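
  The chaining described above can be pictured with a small userspace
  model. This is only a sketch under stated assumptions: every name in it
  (struct save_area, struct model_ctx, struct model_pcb, model_enter(),
  model_leave()) is hypothetical and is not the kernel's real type or KPI.
  It shows a caller-provided save area being linked in front of the
  previously active one, with the pcb's own usermode area ending the chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
struct save_area {
	int regs;			/* pretend FPU register contents */
};

struct model_ctx {
	struct save_area hwstate;	/* caller-provided save area */
	struct save_area *prev;		/* area that was active before enter */
};

struct model_pcb {
	struct save_area user_save;	/* usermode FPU state, ends the chain */
	struct save_area *save;		/* currently active save area */
};

void
model_pcb_init(struct model_pcb *pcb)
{
	pcb->user_save.regs = 0;
	pcb->save = &pcb->user_save;
}

/* Link the caller's area in: later saves would go to ctx->hwstate. */
void
model_enter(struct model_pcb *pcb, struct model_ctx *ctx)
{
	ctx->prev = pcb->save;
	pcb->save = &ctx->hwstate;
}

/* Unlink the caller's area: the previous area becomes active again. */
void
model_leave(struct model_pcb *pcb, struct model_ctx *ctx)
{
	/* A leave must match the most recent enter. */
	assert(pcb->save == &ctx->hwstate);
	pcb->save = ctx->prev;
	ctx->prev = NULL;
}
```

  In this model, nested enters must be undone in reverse order, after
  which the active pointer is back at the pcb's usermode area.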
  
  ------------------------------------------------------------------------
  r208834 | kib | 2010-06-05 19:00:53 +0300 (Sat, 05 Jun 2010) | 13 lines
  
  Use the fpu_kern_enter() interface to properly separate usermode FPU
  context from in-kernel execution of padlock instructions and to handle
  spurious FPUDNA exceptions that are sometimes raised when doing padlock
  calculations.
  
  Globally mark crypto(9) kthread as using FPU.
  
  Reviewed by:  pjd
  Hardware provided by: Sentex Communications
  Tested by:      pho
  PR:    amd64/135014
  MFC after:    1 month
  
  ------------------------------------------------------------------------
  r208877 | kib | 2010-06-06 19:13:50 +0300 (Sun, 06 Jun 2010) | 5 lines
  
  Style-compliant order of declarations.
  
  Noted by:     bde
  MFC after:    1 month
  
  ------------------------------------------------------------------------
  r209174 | jkim | 2010-06-14 23:08:26 +0300 (Mon, 14 Jun 2010) | 3 lines
  
  Fix ACPI suspend/resume on amd64, which was broken since r208833.
  We need actual storage for FPU state to save and restore.
  
  ------------------------------------------------------------------------
  r209198 | kib | 2010-06-15 12:19:33 +0300 (Tue, 15 Jun 2010) | 10 lines
  
  Use critical sections instead of disabling local interrupts to ensure
  the consistency between PCPU fpcurthread and the state of the FPU.
  
  Explicitly assert that the calling conventions for fpudrop() are
  adhered to. In cpu_thread_exit(), add the missing critical section entry.
  
  Reviewed by:  bde
  Tested by:    pho
  MFC after:    1 month
  
  ------------------------------------------------------------------------
  r209204 | kib | 2010-06-15 17:59:35 +0300 (Tue, 15 Jun 2010) | 5 lines
  
  Rename CRITSECT_ASSERT to CRITICAL_ASSERT.
  
  Suggested by: jhb
  MFC after:    1 month
  
  ------------------------------------------------------------------------
  r209208 | kib | 2010-06-15 21:16:04 +0300 (Tue, 15 Jun 2010) | 4 lines
  
  Remove two obsoleted comments, add a note about 32bit compatibility.
  
  MFC after:    1 month
  
  ------------------------------------------------------------------------
  r209252 | kib | 2010-06-17 15:35:17 +0300 (Thu, 17 Jun 2010) | 6 lines
  
  In the ia32_{get,set}_fpcontext(), use fpu{get,set}userregs instead
  of fpu{get,set}regs.
  
  Noted by:     bde
  MFC after:    1 month
  
  ------------------------------------------------------------------------
  r209460 | kib | 2010-06-23 13:40:28 +0300 (Wed, 23 Jun 2010) | 8 lines
  
  Remove unused i586 optimized bcopy/bzero/etc implementations that utilize
  FPU registers for copying. Remove the switch table and jumps from
  bcopy/bzero/... to the actual implementation.
  As a side-effect, i486-optimized bzero is removed.
  
  Reviewed by:  bde
  Tested by:    pho (previous version)
  
  ------------------------------------------------------------------------
  r209461 | kib | 2010-06-23 14:12:58 +0300 (Wed, 23 Jun 2010) | 8 lines
  
  Remove the support for int13 FPU exception reporting on i386. It is
  believed that all 486-class CPUs FreeBSD is capable of running on either
  have no FPU and cannot use an external coprocessor, or have the FPU on
  the package and can use #MF.
  
  Reviewed by:  bde
  Tested by:    pho (previous version)
  
  ------------------------------------------------------------------------
  r209462 | kib | 2010-06-23 14:21:19 +0300 (Wed, 23 Jun 2010) | 8 lines
  
  Now that FPU use requires a working #MF because INT13 FPU exception
  handling was removed, MFi386 r209198:
      Use critical sections instead of disabling local interrupts to ensure
      the consistency between PCPU fpcurthread and the state of FPU.
  
  Reviewed by:  bde
  Tested by:    pho
  
  ------------------------------------------------------------------------
  r210514 | jkim | 2010-07-26 22:53:09 +0300 (Mon, 26 Jul 2010) | 6 lines
  
  Re-implement FPU suspend/resume for amd64.  This removes superfluous uses
  of critical_enter(9) and critical_exit(9) by fpugetregs() and fpusetregs().
  Also, we do not touch PCB flags any more.
  
  MFC after:    1 month
  
  ------------------------------------------------------------------------
  r210517 | jkim | 2010-07-27 00:24:52 +0300 (Tue, 27 Jul 2010) | 4 lines
  
  FNSTSW instruction can use AX register as an operand.
  
  Obtained from:        fenv.h
  
  ------------------------------------------------------------------------
  r210518 | jkim | 2010-07-27 01:16:36 +0300 (Tue, 27 Jul 2010) | 5 lines
  
  Reduce diff against fenv.h:
  
  Mark all inline asms as volatile for safety.  No object file change after
  this commit (verified with md5).
  
  ------------------------------------------------------------------------
  r210519 | jkim | 2010-07-27 01:55:14 +0300 (Tue, 27 Jul 2010) | 2 lines
  
  Remove a macro that has been unused since r189418.
  
  ------------------------------------------------------------------------
  r210520 | jkim | 2010-07-27 02:02:18 +0300 (Tue, 27 Jul 2010) | 2 lines
  
  Add missing ldmxcsr() prototype for lint case.
  
  ------------------------------------------------------------------------
  r210521 | jkim | 2010-07-27 02:20:55 +0300 (Tue, 27 Jul 2010) | 3 lines
  
  Simplify fldcw() macro.  There is no reason to use pointer here.  No object
  file change after this commit (verified with md5).
  
  ------------------------------------------------------------------------
  r210614 | jkim | 2010-07-29 19:41:21 +0300 (Thu, 29 Jul 2010) | 2 lines
  
  Rename PCB_USER_FPU to PCB_USERFPU to avoid a clash with a macro from fpu.h.
  
  ------------------------------------------------------------------------
  r210615 | jkim | 2010-07-29 19:49:20 +0300 (Thu, 29 Jul 2010) | 5 lines
  
  Fix another fallout from r208833.  savectx() is used to save CPU context
  for crash dump (dumppcb) and kdb (stoppcbs).  In both cases, there cannot
  be a valid pointer in pcb_save.  This should restore the previous
  behaviour.
  
  ------------------------------------------------------------------------
  r210777 | jkim | 2010-08-02 20:35:00 +0300 (Mon, 02 Aug 2010) | 13 lines
  
  - Merge savectx2() with savectx() and struct xpcb with struct pcb. [1]
  savectx() is only used for panic dump (dumppcb) and kdb (stoppcbs).  Thus,
  saving additional information does not hurt and it may be even beneficial.
  Unfortunately, struct pcb has grown larger to accommodate more data.
  Move 512-byte long pcb_user_save to the end of struct pcb while I am here.
  - savectx() now saves FPU state unconditionally and copies it to the PCB of
  FPU thread if necessary.  This gives panic dump and kdb a chance to take
  a look at the current FPU state even if the FPU is "supposedly" not used.
  - Resuming CPU now unconditionally reinitializes FPU.  If the saved FPU
  state was irrelevant, it could be in an unknown state.
  
  Suggested by: bde [1]
  
  ------------------------------------------------------------------------
  r210804 | jkim | 2010-08-03 18:32:08 +0300 (Tue, 03 Aug 2010) | 6 lines
  
  savectx() has not been used for fork(2) for about 15 years. [1]
  Do not clobber FPU thread's PCB as it is more harmful.  When we resume CPU,
  unconditionally reload FPU state.
  
  Pointed out by:       bde [1]
  
  ------------------------------------------------------------------------
  r212026 | jkim | 2010-08-31 00:19:42 +0300 (Tue, 31 Aug 2010) | 3 lines
  
  Save MSR_FSBASE, MSR_GSBASE and MSR_KGSBASE directly to PCB as we do not use
  these values in the function.
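
  Why the direct store is equivalent can be checked in userspace. The
  sketch below assumes a little-endian host such as amd64; the helper names
  store_msr_halves() and store_msr_combined() are hypothetical and exist
  only for this illustration. rdmsr returns the value split across EDX:EAX,
  so writing EAX at the field offset and EDX four bytes later leaves the
  same bytes in memory as combining both halves and storing 64 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store EAX and EDX as two 32-bit halves. */
void
store_msr_halves(unsigned char *buf, uint32_t eax, uint32_t edx)
{
	assert(buf != NULL);
	memcpy(buf, &eax, sizeof(eax));		/* low half at offset 0 */
	memcpy(buf + 4, &edx, sizeof(edx));	/* high half at offset 4 */
}

/* Hypothetical helper: combine the halves first, then store 64 bits. */
void
store_msr_combined(unsigned char *buf, uint32_t eax, uint32_t edx)
{
	uint64_t v = ((uint64_t)edx << 32) | eax;

	assert(buf != NULL);
	memcpy(buf, &v, sizeof(v));
}
```

  On a big-endian host the two helpers would produce different byte
  layouts, so the equivalence is specific to little-endian machines.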
  
  ------------------------------------------------------------------------
  r214347 | jhb | 2010-10-25 18:31:13 +0300 (Mon, 25 Oct 2010) | 5 lines
  
  Use 'saveintr' instead of 'savecrit' or 'eflags' to hold the state returned
  by intr_disable().
  
  Requested by: bde
  
  ------------------------------------------------------------------------

Modified:
  stable/8/sys/amd64/acpica/acpi_machdep.c
  stable/8/sys/amd64/acpica/acpi_switch.S
  stable/8/sys/amd64/acpica/acpi_wakecode.S
  stable/8/sys/amd64/acpica/acpi_wakeup.c
  stable/8/sys/amd64/amd64/cpu_switch.S
  stable/8/sys/amd64/amd64/fpu.c
  stable/8/sys/amd64/amd64/genassym.c
  stable/8/sys/amd64/amd64/machdep.c
  stable/8/sys/amd64/amd64/mp_machdep.c
  stable/8/sys/amd64/amd64/trap.c
  stable/8/sys/amd64/amd64/vm_machdep.c
  stable/8/sys/amd64/ia32/ia32_reg.c
  stable/8/sys/amd64/ia32/ia32_signal.c
  stable/8/sys/amd64/include/fpu.h
  stable/8/sys/amd64/include/pcb.h
  stable/8/sys/crypto/via/padlock.c
  stable/8/sys/crypto/via/padlock.h
  stable/8/sys/crypto/via/padlock_cipher.c
  stable/8/sys/crypto/via/padlock_hash.c
  stable/8/sys/dev/fb/fbreg.h
  stable/8/sys/dev/random/nehemiah.c
  stable/8/sys/i386/i386/identcpu.c
  stable/8/sys/i386/i386/initcpu.c
  stable/8/sys/i386/i386/machdep.c
  stable/8/sys/i386/i386/perfmon.c
  stable/8/sys/i386/i386/ptrace_machdep.c
  stable/8/sys/i386/i386/support.s
  stable/8/sys/i386/i386/swtch.s
  stable/8/sys/i386/i386/trap.c
  stable/8/sys/i386/i386/vm_machdep.c
  stable/8/sys/i386/include/md_var.h
  stable/8/sys/i386/include/npx.h
  stable/8/sys/i386/include/pcb.h
  stable/8/sys/i386/isa/npx.c
  stable/8/sys/i386/linux/linux_ptrace.c
  stable/8/sys/kern/subr_trap.c
  stable/8/sys/opencrypto/crypto.c
  stable/8/sys/pc98/include/npx.h
  stable/8/sys/pc98/pc98/machdep.c
  stable/8/sys/x86/x86/local_apic.c
Directory Properties:
  stable/8/sys/   (props changed)
  stable/8/sys/amd64/include/xen/   (props changed)
  stable/8/sys/cddl/contrib/opensolaris/   (props changed)
  stable/8/sys/contrib/dev/acpica/   (props changed)
  stable/8/sys/contrib/pf/   (props changed)
  stable/8/sys/dev/xen/xenpci/   (props changed)

Modified: stable/8/sys/amd64/acpica/acpi_machdep.c
==============================================================================
--- stable/8/sys/amd64/acpica/acpi_machdep.c    Fri Nov 19 09:26:39 2010        (r215512)
+++ stable/8/sys/amd64/acpica/acpi_machdep.c    Fri Nov 19 09:49:14 2010        (r215513)
@@ -32,6 +32,7 @@ __FBSDID("$FreeBSD$");
 #include <sys/kernel.h>
 #include <sys/module.h>
 #include <sys/sysctl.h>
+
 #include <vm/vm.h>
 #include <vm/pmap.h>
 
@@ -71,7 +72,6 @@ acpi_machdep_init(device_t dev)
        STAILQ_INSERT_TAIL(&sc->apm_cdevs, &acpi_clone, entries);
        ACPI_UNLOCK(acpi);
        sc->acpi_clone = &acpi_clone;
-       acpi_install_wakeup_handler(sc);
 
        if (intr_model != ACPI_INTR_PIC)
                acpi_SetIntrModel(intr_model);
@@ -363,13 +363,20 @@ nexus_acpi_probe(device_t dev)
 static int
 nexus_acpi_attach(device_t dev)
 {
+       device_t acpi_dev;
+       int error;
 
        nexus_init_resources();
        bus_generic_probe(dev);
-       if (BUS_ADD_CHILD(dev, 10, "acpi", 0) == NULL)
+       acpi_dev = BUS_ADD_CHILD(dev, 10, "acpi", 0);
+       if (acpi_dev == NULL)
                panic("failed to add acpi0 device");
 
-       return (bus_generic_attach(dev));
+       error = bus_generic_attach(dev);
+       if (error == 0)
+               acpi_install_wakeup_handler(device_get_softc(acpi_dev));
+
+       return (error);
 }
 
 static device_method_t nexus_acpi_methods[] = {

Modified: stable/8/sys/amd64/acpica/acpi_switch.S
==============================================================================
--- stable/8/sys/amd64/acpica/acpi_switch.S     Fri Nov 19 09:26:39 2010        (r215512)
+++ stable/8/sys/amd64/acpica/acpi_switch.S     Fri Nov 19 09:49:14 2010        (r215513)
@@ -1,7 +1,7 @@
 /*-
  * Copyright (c) 2001 Takanori Watanabe <takaw...@jp.freebsd.org>
  * Copyright (c) 2001 Mitsuru IWASAKI <iwas...@jp.freebsd.org>
- * Copyright (c) 2008-2009 Jung-uk Kim <j...@freebsd.org>
+ * Copyright (c) 2008-2010 Jung-uk Kim <j...@freebsd.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
@@ -34,26 +34,11 @@
 #include "acpi_wakedata.h"
 #include "assym.s"
 
-#define        WAKEUP_DECL(member)     \
-    .set WAKEUP_ ## member, wakeup_ ## member - wakeup_ctx
-
-       WAKEUP_DECL(xpcb)
-       WAKEUP_DECL(gdt)
-       WAKEUP_DECL(efer)
-       WAKEUP_DECL(pat)
-       WAKEUP_DECL(star)
-       WAKEUP_DECL(lstar)
-       WAKEUP_DECL(cstar)
-       WAKEUP_DECL(sfmask)
-       WAKEUP_DECL(cpu)
-
-#define        WAKEUP_CTX(member)      WAKEUP_ ## member (%rdi)
-#define        WAKEUP_PCB(member)      PCB_ ## member(%r11)
-#define        WAKEUP_XPCB(member)     XPCB_ ## member(%r11)
+#define        WAKEUP_CTX(member)      wakeup_ ## member - wakeup_ctx(%rsi)
 
 ENTRY(acpi_restorecpu)
        /* Switch to KPML4phys. */
-       movq    %rsi, %rax
+       movq    %rdi, %rax
        movq    %rax, %cr3
 
        /* Restore GDT. */
@@ -62,7 +47,7 @@ ENTRY(acpi_restorecpu)
 1:
 
        /* Fetch PCB. */
-       movq    WAKEUP_CTX(xpcb), %r11
+       movq    WAKEUP_CTX(pcb), %rdi
 
        /* Force kernel segment registers. */
        movl    $KDSEL, %eax
@@ -75,16 +60,16 @@ ENTRY(acpi_restorecpu)
        movw    %ax, %gs
 
        movl    $MSR_FSBASE, %ecx
-       movl    WAKEUP_PCB(FSBASE), %eax
-       movl    4 + WAKEUP_PCB(FSBASE), %edx
+       movl    PCB_FSBASE(%rdi), %eax
+       movl    4 + PCB_FSBASE(%rdi), %edx
        wrmsr
        movl    $MSR_GSBASE, %ecx
-       movl    WAKEUP_PCB(GSBASE), %eax
-       movl    4 + WAKEUP_PCB(GSBASE), %edx
+       movl    PCB_GSBASE(%rdi), %eax
+       movl    4 + PCB_GSBASE(%rdi), %edx
        wrmsr
        movl    $MSR_KGSBASE, %ecx
-       movl    WAKEUP_XPCB(KGSBASE), %eax
-       movl    4 + WAKEUP_XPCB(KGSBASE), %edx
+       movl    PCB_KGSBASE(%rdi), %eax
+       movl    4 + PCB_KGSBASE(%rdi), %edx
        wrmsr
 
        /* Restore EFER. */
@@ -115,17 +100,21 @@ ENTRY(acpi_restorecpu)
        movl    WAKEUP_CTX(sfmask), %eax
        wrmsr
 
-       /* Restore CR0, CR2 and CR4. */
-       movq    WAKEUP_XPCB(CR0), %rax
+       /* Restore CR0 except for FPU mode. */
+       movq    PCB_CR0(%rdi), %rax
+       movq    %rax, %rcx
+       andq    $~(CR0_EM | CR0_TS), %rax
        movq    %rax, %cr0
-       movq    WAKEUP_XPCB(CR2), %rax
+
+       /* Restore CR2 and CR4. */
+       movq    PCB_CR2(%rdi), %rax
        movq    %rax, %cr2
-       movq    WAKEUP_XPCB(CR4), %rax
+       movq    PCB_CR4(%rdi), %rax
        movq    %rax, %cr4
 
        /* Restore descriptor tables. */
-       lidt    WAKEUP_XPCB(IDT)
-       lldt    WAKEUP_XPCB(LDT)
+       lidt    PCB_IDT(%rdi)
+       lldt    PCB_LDT(%rdi)
 
 #define        SDT_SYSTSS      9
 #define        SDT_SYSBSY      11
@@ -133,37 +122,44 @@ ENTRY(acpi_restorecpu)
        /* Clear "task busy" bit and reload TR. */
        movq    PCPU(TSS), %rax
        andb    $(~SDT_SYSBSY | SDT_SYSTSS), 5(%rax)
-       movw    WAKEUP_XPCB(TR), %ax
+       movw    PCB_TR(%rdi), %ax
        ltr     %ax
 
 #undef SDT_SYSTSS
 #undef SDT_SYSBSY
 
        /* Restore other callee saved registers. */
-       movq    WAKEUP_PCB(R15), %r15
-       movq    WAKEUP_PCB(R14), %r14
-       movq    WAKEUP_PCB(R13), %r13
-       movq    WAKEUP_PCB(R12), %r12
-       movq    WAKEUP_PCB(RBP), %rbp
-       movq    WAKEUP_PCB(RSP), %rsp
-       movq    WAKEUP_PCB(RBX), %rbx
+       movq    PCB_R15(%rdi), %r15
+       movq    PCB_R14(%rdi), %r14
+       movq    PCB_R13(%rdi), %r13
+       movq    PCB_R12(%rdi), %r12
+       movq    PCB_RBP(%rdi), %rbp
+       movq    PCB_RSP(%rdi), %rsp
+       movq    PCB_RBX(%rdi), %rbx
 
        /* Restore debug registers. */
-       movq    WAKEUP_PCB(DR0), %rax
+       movq    PCB_DR0(%rdi), %rax
        movq    %rax, %dr0
-       movq    WAKEUP_PCB(DR1), %rax
+       movq    PCB_DR1(%rdi), %rax
        movq    %rax, %dr1
-       movq    WAKEUP_PCB(DR2), %rax
+       movq    PCB_DR2(%rdi), %rax
        movq    %rax, %dr2
-       movq    WAKEUP_PCB(DR3), %rax
+       movq    PCB_DR3(%rdi), %rax
        movq    %rax, %dr3
-       movq    WAKEUP_PCB(DR6), %rax
+       movq    PCB_DR6(%rdi), %rax
        movq    %rax, %dr6
-       movq    WAKEUP_PCB(DR7), %rax
+       movq    PCB_DR7(%rdi), %rax
        movq    %rax, %dr7
 
+       /* Restore FPU state. */
+       fninit
+       fxrstor PCB_USERFPU(%rdi)
+
+       /* Reload CR0. */
+       movq    %rcx, %cr0
+
        /* Restore return address. */
-       movq    WAKEUP_PCB(RIP), %rax
+       movq    PCB_RIP(%rdi), %rax
        movq    %rax, (%rsp)
 
        /* Indicate the CPU is resumed. */
@@ -172,19 +168,3 @@ ENTRY(acpi_restorecpu)
 
        ret
 END(acpi_restorecpu)
-
-ENTRY(acpi_savecpu)
-       /* Fetch XPCB and save CPU context. */
-       movq    %rdi, %r10
-       call    savectx2
-       movq    %r10, %r11
-
-       /* Patch caller's return address and stack pointer. */
-       movq    (%rsp), %rax
-       movq    %rax, WAKEUP_PCB(RIP)
-       movq    %rsp, %rax
-       movq    %rax, WAKEUP_PCB(RSP)
-
-       movl    $1, %eax
-       ret
-END(acpi_savecpu)

Modified: stable/8/sys/amd64/acpica/acpi_wakecode.S
==============================================================================
--- stable/8/sys/amd64/acpica/acpi_wakecode.S   Fri Nov 19 09:26:39 2010        (r215512)
+++ stable/8/sys/amd64/acpica/acpi_wakecode.S   Fri Nov 19 09:49:14 2010        (r215513)
@@ -2,7 +2,7 @@
  * Copyright (c) 2001 Takanori Watanabe <takaw...@jp.freebsd.org>
  * Copyright (c) 2001 Mitsuru IWASAKI <iwas...@jp.freebsd.org>
  * Copyright (c) 2003 Peter Wemm
- * Copyright (c) 2008-2009 Jung-uk Kim <j...@freebsd.org>
+ * Copyright (c) 2008-2010 Jung-uk Kim <j...@freebsd.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
@@ -52,18 +52,17 @@
        .data                           /* So we can modify it */
 
        ALIGN_TEXT
-wakeup_start:
        .code16
+wakeup_start:
        /*
         * Set up segment registers for real mode, a small stack for
         * any calls we make, and clear any flags.
         */
        cli                             /* make sure no interrupts */
-       cld
        mov     %cs, %ax                /* copy %cs to %ds.  Remember these */
        mov     %ax, %ds                /* are offsets rather than selectors */
        mov     %ax, %ss
-       movw    $PAGE_SIZE - 8, %sp
+       movw    $PAGE_SIZE, %sp
        xorw    %ax, %ax
        pushw   %ax
        popfw
@@ -127,6 +126,7 @@ wakeup_sw32:
        /*
         * At this point, we are running in 32 bit legacy protected mode.
         */
+       ALIGN_TEXT
        .code32
 wakeup_32:
 
@@ -205,8 +205,8 @@ wakeup_64:
        mov     %ax, %ds
 
        /* Restore arguments and return. */
-       movq    wakeup_ctx - wakeup_start(%rbx), %rdi
-       movq    wakeup_kpml4 - wakeup_start(%rbx), %rsi
+       movq    wakeup_kpml4 - wakeup_start(%rbx), %rdi
+       movq    wakeup_ctx - wakeup_start(%rbx), %rsi
        movq    wakeup_retaddr - wakeup_start(%rbx), %rax
        jmp     *%rax
 
@@ -260,7 +260,7 @@ wakeup_kpml4:
 
 wakeup_ctx:
        .quad   0
-wakeup_xpcb:
+wakeup_pcb:
        .quad   0
 wakeup_gdt:
        .word   0

Modified: stable/8/sys/amd64/acpica/acpi_wakeup.c
==============================================================================
--- stable/8/sys/amd64/acpica/acpi_wakeup.c     Fri Nov 19 09:26:39 2010        (r215512)
+++ stable/8/sys/amd64/acpica/acpi_wakeup.c     Fri Nov 19 09:49:14 2010        (r215513)
@@ -2,7 +2,7 @@
  * Copyright (c) 2001 Takanori Watanabe <takaw...@jp.freebsd.org>
  * Copyright (c) 2001 Mitsuru IWASAKI <iwas...@jp.freebsd.org>
  * Copyright (c) 2003 Peter Wemm
- * Copyright (c) 2008-2009 Jung-uk Kim <j...@freebsd.org>
+ * Copyright (c) 2008-2010 Jung-uk Kim <j...@freebsd.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
@@ -31,13 +31,11 @@
 __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
-#include <sys/systm.h>
 #include <sys/bus.h>
 #include <sys/kernel.h>
 #include <sys/malloc.h>
 #include <sys/memrange.h>
 #include <sys/smp.h>
-#include <sys/types.h>
 
 #include <vm/vm.h>
 #include <vm/pmap.h>
@@ -47,11 +45,11 @@ __FBSDID("$FreeBSD$");
 #include <machine/pcb.h>
 #include <machine/pmap.h>
 #include <machine/specialreg.h>
-#include <machine/vmparam.h>
 
 #ifdef SMP
 #include <machine/apicreg.h>
 #include <machine/smp.h>
+#include <machine/vmparam.h>
 #endif
 
 #include <contrib/dev/acpica/include/acpi.h>
@@ -64,23 +62,18 @@ __FBSDID("$FreeBSD$");
 /* Make sure the code is less than a page and leave room for the stack. */
 CTASSERT(sizeof(wakecode) < PAGE_SIZE - 1024);
 
-#ifndef _SYS_CDEFS_H_
-#error this file needs sys/cdefs.h as a prerequisite
-#endif
-
 extern int             acpi_resume_beep;
 extern int             acpi_reset_video;
 
 #ifdef SMP
-extern struct xpcb     *stopxpcbs;
+extern struct pcb      **susppcbs;
 #else
-static struct xpcb     *stopxpcbs;
+static struct pcb      **susppcbs;
 #endif
 
-int                    acpi_restorecpu(struct xpcb *, vm_offset_t);
-int                    acpi_savecpu(struct xpcb *);
+int                    acpi_restorecpu(struct pcb *, vm_offset_t);
 
-static void            acpi_alloc_wakeup_handler(void);
+static void            *acpi_alloc_wakeup_handler(void);
 static void            acpi_stop_beep(void *);
 
 #ifdef SMP
@@ -111,10 +104,10 @@ acpi_wakeup_ap(struct acpi_softc *sc, in
        int             apic_id = cpu_apic_ids[cpu];
        int             ms;
 
-       WAKECODE_FIXUP(wakeup_xpcb, struct xpcb *, &stopxpcbs[cpu]);
-       WAKECODE_FIXUP(wakeup_gdt, uint16_t, stopxpcbs[cpu].xpcb_gdt.rd_limit);
+       WAKECODE_FIXUP(wakeup_pcb, struct pcb *, susppcbs[cpu]);
+       WAKECODE_FIXUP(wakeup_gdt, uint16_t, susppcbs[cpu]->pcb_gdt.rd_limit);
        WAKECODE_FIXUP(wakeup_gdt + 2, uint64_t,
-           stopxpcbs[cpu].xpcb_gdt.rd_base);
+           susppcbs[cpu]->pcb_gdt.rd_base);
        WAKECODE_FIXUP(wakeup_cpu, int, cpu);
 
        /* do an INIT IPI: assert RESET */
@@ -222,7 +215,6 @@ acpi_wakeup_cpus(struct acpi_softc *sc, 
 int
 acpi_sleep_machdep(struct acpi_softc *sc, int state)
 {
-       struct savefpu  *stopfpu;
 #ifdef SMP
        cpumask_t       wakeup_cpus;
 #endif
@@ -252,10 +244,7 @@ acpi_sleep_machdep(struct acpi_softc *sc
        cr3 = rcr3();
        load_cr3(KPML4phys);
 
-       stopfpu = &stopxpcbs[0].xpcb_pcb.pcb_save;
-       if (acpi_savecpu(&stopxpcbs[0])) {
-               fpugetregs(curthread, stopfpu);
-
+       if (savectx(susppcbs[0])) {
 #ifdef SMP
                if (wakeup_cpus != 0 && suspend_cpus(wakeup_cpus) == 0) {
                        device_printf(sc->acpi_dev,
@@ -268,11 +257,11 @@ acpi_sleep_machdep(struct acpi_softc *sc
                WAKECODE_FIXUP(resume_beep, uint8_t, (acpi_resume_beep != 0));
                WAKECODE_FIXUP(reset_video, uint8_t, (acpi_reset_video != 0));
 
-               WAKECODE_FIXUP(wakeup_xpcb, struct xpcb *, &stopxpcbs[0]);
+               WAKECODE_FIXUP(wakeup_pcb, struct pcb *, susppcbs[0]);
                WAKECODE_FIXUP(wakeup_gdt, uint16_t,
-                   stopxpcbs[0].xpcb_gdt.rd_limit);
+                   susppcbs[0]->pcb_gdt.rd_limit);
                WAKECODE_FIXUP(wakeup_gdt + 2, uint64_t,
-                   stopxpcbs[0].xpcb_gdt.rd_base);
+                   susppcbs[0]->pcb_gdt.rd_base);
                WAKECODE_FIXUP(wakeup_cpu, int, 0);
 
                /* Call ACPICA to enter the desired sleep state */
@@ -291,7 +280,6 @@ acpi_sleep_machdep(struct acpi_softc *sc
                for (;;)
                        ia32_pause();
        } else {
-               fpusetregs(curthread, stopfpu);
 #ifdef SMP
                if (wakeup_cpus != 0)
                        acpi_wakeup_cpus(sc, wakeup_cpus);
@@ -324,49 +312,48 @@ out:
        return (ret);
 }
 
-static vm_offset_t     acpi_wakeaddr;
-
-static void
+static void *
 acpi_alloc_wakeup_handler(void)
 {
        void            *wakeaddr;
-
-       if (!cold)
-               return;
+       int             i;
 
        /*
         * Specify the region for our wakeup code.  We want it in the low 1 MB
-        * region, excluding video memory and above (0xa0000).  We ask for
-        * it to be page-aligned, just to be safe.
+        * region, excluding real mode IVT (0-0x3ff), BDA (0x400-0x4ff), EBDA
+        * (less than 128KB, below 0xa0000, must be excluded by SMAP and DSDT),
+        * and ROM area (0xa0000 and above).  The temporary page tables must be
+        * page-aligned.
         */
-       wakeaddr = contigmalloc(4 * PAGE_SIZE, M_DEVBUF, M_NOWAIT, 0, 0x9ffff,
-           PAGE_SIZE, 0ul);
+       wakeaddr = contigmalloc(4 * PAGE_SIZE, M_DEVBUF, M_NOWAIT, 0x500,
+           0xa0000, PAGE_SIZE, 0ul);
        if (wakeaddr == NULL) {
                printf("%s: can't alloc wake memory\n", __func__);
-               return;
-       }
-       stopxpcbs = malloc(mp_ncpus * sizeof(*stopxpcbs), M_DEVBUF, M_NOWAIT);
-       if (stopxpcbs == NULL) {
-               contigfree(wakeaddr, 4 * PAGE_SIZE, M_DEVBUF);
-               printf("%s: can't alloc CPU state memory\n", __func__);
-               return;
+               return (NULL);
        }
-       acpi_wakeaddr = (vm_offset_t)wakeaddr;
-}
+       susppcbs = malloc(mp_ncpus * sizeof(*susppcbs), M_DEVBUF, M_WAITOK);
+       for (i = 0; i < mp_ncpus; i++)
+               susppcbs[i] = malloc(sizeof(**susppcbs), M_DEVBUF, M_WAITOK);
 
-SYSINIT(acpiwakeup, SI_SUB_KMEM, SI_ORDER_ANY, acpi_alloc_wakeup_handler, 0);
+       return (wakeaddr);
+}
 
 void
 acpi_install_wakeup_handler(struct acpi_softc *sc)
 {
+       static void     *wakeaddr = NULL;
        uint64_t        *pt4, *pt3, *pt2;
        int             i;
 
-       if (acpi_wakeaddr == 0ul)
+       if (wakeaddr != NULL)
+               return;
+
+       wakeaddr = acpi_alloc_wakeup_handler();
+       if (wakeaddr == NULL)
                return;
 
-       sc->acpi_wakeaddr = acpi_wakeaddr;
-       sc->acpi_wakephys = vtophys(acpi_wakeaddr);
+       sc->acpi_wakeaddr = (vm_offset_t)wakeaddr;
+       sc->acpi_wakephys = vtophys(wakeaddr);
 
        bcopy(wakecode, (void *)WAKECODE_VADDR(sc), sizeof(wakecode));
 
@@ -392,7 +379,7 @@ acpi_install_wakeup_handler(struct acpi_
        WAKECODE_FIXUP(wakeup_sfmask, uint64_t, rdmsr(MSR_SF_MASK));
 
        /* Build temporary page tables below realmode code. */
-       pt4 = (uint64_t *)acpi_wakeaddr;
+       pt4 = wakeaddr;
        pt3 = pt4 + (PAGE_SIZE) / sizeof(uint64_t);
        pt2 = pt3 + (PAGE_SIZE) / sizeof(uint64_t);
 

Modified: stable/8/sys/amd64/amd64/cpu_switch.S
==============================================================================
--- stable/8/sys/amd64/amd64/cpu_switch.S       Fri Nov 19 09:26:39 2010        (r215512)
+++ stable/8/sys/amd64/amd64/cpu_switch.S       Fri Nov 19 09:49:14 2010        (r215513)
@@ -113,7 +113,7 @@ done_store_dr:
        /* have we used fp, and need a save? */
        cmpq    %rdi,PCPU(FPCURTHREAD)
        jne     1f
-       addq    $PCB_SAVEFPU,%r8
+       movq    PCB_SAVEFPU(%r8),%r8
        clts
        fxsave  (%r8)
        smsw    %ax
@@ -302,121 +302,62 @@ END(cpu_switch)
  * Update pcb, saving current processor state.
  */
 ENTRY(savectx)
-       /* Fetch PCB. */
-       movq    %rdi,%rcx
-
-       /* Save caller's return address. */
-       movq    (%rsp),%rax
-       movq    %rax,PCB_RIP(%rcx)
-
-       movq    %cr3,%rax
-       movq    %rax,PCB_CR3(%rcx)
-
-       movq    %rbx,PCB_RBX(%rcx)
-       movq    %rsp,PCB_RSP(%rcx)
-       movq    %rbp,PCB_RBP(%rcx)
-       movq    %r12,PCB_R12(%rcx)
-       movq    %r13,PCB_R13(%rcx)
-       movq    %r14,PCB_R14(%rcx)
-       movq    %r15,PCB_R15(%rcx)
-
-       /*
-        * If fpcurthread == NULL, then the fpu h/w state is irrelevant and the
-        * state had better already be in the pcb.  This is true for forks
-        * but not for dumps (the old book-keeping with FP flags in the pcb
-        * always lost for dumps because the dump pcb has 0 flags).
-        *
-        * If fpcurthread != NULL, then we have to save the fpu h/w state to
-        * fpcurthread's pcb and copy it to the requested pcb, or save to the
-        * requested pcb and reload.  Copying is easier because we would
-        * have to handle h/w bugs for reloading.  We used to lose the
-        * parent's fpu state for forks by forgetting to reload.
-        */
-       pushfq
-       cli
-       movq    PCPU(FPCURTHREAD),%rax
-       testq   %rax,%rax
-       je      1f
-
-       movq    TD_PCB(%rax),%rdi
-       leaq    PCB_SAVEFPU(%rdi),%rdi
-       clts
-       fxsave  (%rdi)
-       smsw    %ax
-       orb     $CR0_TS,%al
-       lmsw    %ax
-
-       movq    $PCB_SAVEFPU_SIZE,%rdx  /* arg 3 */
-       leaq    PCB_SAVEFPU(%rcx),%rsi  /* arg 2 */
-       /* arg 1 (%rdi) already loaded */
-       call    bcopy
-1:
-       popfq
-
-       ret
-END(savectx)
-
-/*
- * savectx2(xpcb)
- * Update xpcb, saving current processor state.
- */
-ENTRY(savectx2)
-       /* Fetch XPCB. */
-       movq    %rdi,%r8
-
        /* Save caller's return address. */
        movq    (%rsp),%rax
-       movq    %rax,PCB_RIP(%r8)
+       movq    %rax,PCB_RIP(%rdi)
 
-       movq    %rbx,PCB_RBX(%r8)
-       movq    %rsp,PCB_RSP(%r8)
-       movq    %rbp,PCB_RBP(%r8)
-       movq    %r12,PCB_R12(%r8)
-       movq    %r13,PCB_R13(%r8)
-       movq    %r14,PCB_R14(%r8)
-       movq    %r15,PCB_R15(%r8)
+       movq    %rbx,PCB_RBX(%rdi)
+       movq    %rsp,PCB_RSP(%rdi)
+       movq    %rbp,PCB_RBP(%rdi)
+       movq    %r12,PCB_R12(%rdi)
+       movq    %r13,PCB_R13(%rdi)
+       movq    %r14,PCB_R14(%rdi)
+       movq    %r15,PCB_R15(%rdi)
 
-       movq    %cr0,%rax
-       movq    %rax,XPCB_CR0(%r8)
+       movq    %cr0,%rsi
+       movq    %rsi,PCB_CR0(%rdi)
        movq    %cr2,%rax
-       movq    %rax,XPCB_CR2(%r8)
+       movq    %rax,PCB_CR2(%rdi)
+       movq    %cr3,%rax
+       movq    %rax,PCB_CR3(%rdi)
        movq    %cr4,%rax
-       movq    %rax,XPCB_CR4(%r8)
+       movq    %rax,PCB_CR4(%rdi)
 
        movq    %dr0,%rax
-       movq    %rax,PCB_DR0(%r8)
+       movq    %rax,PCB_DR0(%rdi)
        movq    %dr1,%rax
-       movq    %rax,PCB_DR1(%r8)
+       movq    %rax,PCB_DR1(%rdi)
        movq    %dr2,%rax
-       movq    %rax,PCB_DR2(%r8)
+       movq    %rax,PCB_DR2(%rdi)
        movq    %dr3,%rax
-       movq    %rax,PCB_DR3(%r8)
+       movq    %rax,PCB_DR3(%rdi)
        movq    %dr6,%rax
-       movq    %rax,PCB_DR6(%r8)
+       movq    %rax,PCB_DR6(%rdi)
        movq    %dr7,%rax
-       movq    %rax,PCB_DR7(%r8)
-
-       sgdt    XPCB_GDT(%r8)
-       sidt    XPCB_IDT(%r8)
-       sldt    XPCB_LDT(%r8)
-       str     XPCB_TR(%r8)
+       movq    %rax,PCB_DR7(%rdi)
 
        movl    $MSR_FSBASE,%ecx
        rdmsr
-       shlq    $32,%rdx
-       leaq    (%rax,%rdx),%rax
-       movq    %rax,PCB_FSBASE(%r8)
+       movl    %eax,PCB_FSBASE(%rdi)
+       movl    %edx,PCB_FSBASE+4(%rdi)
        movl    $MSR_GSBASE,%ecx
        rdmsr
-       shlq    $32,%rdx
-       leaq    (%rax,%rdx),%rax
-       movq    %rax,PCB_GSBASE(%r8)
+       movl    %eax,PCB_GSBASE(%rdi)
+       movl    %edx,PCB_GSBASE+4(%rdi)
        movl    $MSR_KGSBASE,%ecx
        rdmsr
-       shlq    $32,%rdx
-       leaq    (%rax,%rdx),%rax
-       movq    %rax,XPCB_KGSBASE(%r8)
+       movl    %eax,PCB_KGSBASE(%rdi)
+       movl    %edx,PCB_KGSBASE+4(%rdi)
+
+       sgdt    PCB_GDT(%rdi)
+       sidt    PCB_IDT(%rdi)
+       sldt    PCB_LDT(%rdi)
+       str     PCB_TR(%rdi)
 
-       movl    $1, %eax
+       clts
+       fxsave  PCB_USERFPU(%rdi)
+       movq    %rsi,%cr0       /* The previous %cr0 is saved in %rsi. */
+
+       movl    $1,%eax
        ret
-END(savectx2)
+END(savectx)

Modified: stable/8/sys/amd64/amd64/fpu.c
==============================================================================
--- stable/8/sys/amd64/amd64/fpu.c      Fri Nov 19 09:26:39 2010        (r215512)
+++ stable/8/sys/amd64/amd64/fpu.c      Fri Nov 19 09:49:14 2010        (r215513)
@@ -65,34 +65,36 @@ __FBSDID("$FreeBSD$");
 
 #if defined(__GNUCLIKE_ASM) && !defined(lint)
 
-#define        fldcw(addr)             __asm("fldcw %0" : : "m" (*(addr)))
-#define        fnclex()                __asm("fnclex")
-#define        fninit()                __asm("fninit")
+#define        fldcw(cw)               __asm __volatile("fldcw %0" : : "m" (cw))
+#define        fnclex()                __asm __volatile("fnclex")
+#define        fninit()                __asm __volatile("fninit")
 #define        fnstcw(addr)            __asm __volatile("fnstcw %0" : "=m" (*(addr)))
-#define        fnstsw(addr)            __asm __volatile("fnstsw %0" : "=m" (*(addr)))
-#define        fxrstor(addr)           __asm("fxrstor %0" : : "m" (*(addr)))
+#define        fnstsw(addr)            __asm __volatile("fnstsw %0" : "=am" (*(addr)))
+#define        fxrstor(addr)           __asm __volatile("fxrstor %0" : : "m" (*(addr)))
 #define        fxsave(addr)            __asm __volatile("fxsave %0" : "=m" (*(addr)))
-#define        ldmxcsr(r)              __asm __volatile("ldmxcsr %0" : : "m" (r))
-#define        start_emulating()       __asm("smsw %%ax; orb %0,%%al; lmsw %%ax" \
-                                     : : "n" (CR0_TS) : "ax")
-#define        stop_emulating()        __asm("clts")
+#define        ldmxcsr(csr)            __asm __volatile("ldmxcsr %0" : : "m" (csr))
+#define        start_emulating()       __asm __volatile( \
+                                   "smsw %%ax; orb %0,%%al; lmsw %%ax" \
+                                   : : "n" (CR0_TS) : "ax")
+#define        stop_emulating()        __asm __volatile("clts")
 
 #else  /* !(__GNUCLIKE_ASM && !lint) */
 
-void   fldcw(caddr_t addr);
+void   fldcw(u_short cw);
 void   fnclex(void);
 void   fninit(void);
 void   fnstcw(caddr_t addr);
 void   fnstsw(caddr_t addr);
 void   fxsave(caddr_t addr);
 void   fxrstor(caddr_t addr);
+void   ldmxcsr(u_int csr);
 void   start_emulating(void);
 void   stop_emulating(void);
 
 #endif /* __GNUCLIKE_ASM && !lint */
 
-#define GET_FPU_CW(thread) ((thread)->td_pcb->pcb_save.sv_env.en_cw)
-#define GET_FPU_SW(thread) ((thread)->td_pcb->pcb_save.sv_env.en_sw)
+#define GET_FPU_CW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_cw)
+#define GET_FPU_SW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_sw)
 
 typedef u_char bool_t;
 
@@ -111,15 +113,18 @@ static    struct savefpu          fpu_initialstate;
 void
 fpuinit(void)
 {
-       register_t savecrit;
+       register_t saveintr;
        u_int mxcsr;
        u_short control;
 
-       savecrit = intr_disable();
+       /*
+        * It is too early for critical_enter() to work on AP.
+        */
+       saveintr = intr_disable();
        stop_emulating();
        fninit();
        control = __INITIAL_FPUCW__;
-       fldcw(&control);
+       fldcw(control);
        mxcsr = __INITIAL_MXCSR__;
        ldmxcsr(mxcsr);
        if (PCPU_GET(cpuid) == 0) {
@@ -132,7 +137,7 @@ fpuinit(void)
                bzero(fpu_initialstate.sv_xmm, sizeof(fpu_initialstate.sv_xmm));
        }
        start_emulating();
-       intr_restore(savecrit);
+       intr_restore(saveintr);
 }
 
 /*
@@ -141,16 +146,15 @@ fpuinit(void)
 void
 fpuexit(struct thread *td)
 {
-       register_t savecrit;
 
-       savecrit = intr_disable();
+       critical_enter();
        if (curthread == PCPU_GET(fpcurthread)) {
                stop_emulating();
-               fxsave(&PCPU_GET(curpcb)->pcb_save);
+               fxsave(PCPU_GET(curpcb)->pcb_save);
                start_emulating();
                PCPU_SET(fpcurthread, 0);
        }
-       intr_restore(savecrit);
+       critical_exit();
 }
 
 int
@@ -351,10 +355,9 @@ static char fpetable[128] = {
 int
 fputrap()
 {
-       register_t savecrit;
        u_short control, status;
 
-       savecrit = intr_disable();
+       critical_enter();
 
        /*
         * Interrupt handling (for another interrupt) may have pushed the
@@ -371,7 +374,7 @@ fputrap()
 
        if (PCPU_GET(fpcurthread) == curthread)
                fnclex();
-       intr_restore(savecrit);
+       critical_exit();
        return (fpetable[status & ((~control & 0x3f) | 0x40)]);
 }
 
@@ -389,12 +392,13 @@ void
 fpudna(void)
 {
        struct pcb *pcb;
-       register_t s;
 
+       critical_enter();
        if (PCPU_GET(fpcurthread) == curthread) {
                printf("fpudna: fpcurthread == curthread %d times\n",
                    ++err_count);
                stop_emulating();
+               critical_exit();
                return;
        }
        if (PCPU_GET(fpcurthread) != NULL) {
@@ -404,7 +408,6 @@ fpudna(void)
                       curthread, curthread->td_proc->p_pid);
                panic("fpudna");
        }
-       s = intr_disable();
        stop_emulating();
        /*
         * Record new context early in case frstor causes a trap.
@@ -422,23 +425,23 @@ fpudna(void)
                 */
                fxrstor(&fpu_initialstate);
                if (pcb->pcb_initial_fpucw != __INITIAL_FPUCW__)
-                       fldcw(&pcb->pcb_initial_fpucw);
+                       fldcw(pcb->pcb_initial_fpucw);
                pcb->pcb_flags |= PCB_FPUINITDONE;
+               if (PCB_USER_FPU(pcb))
+                       pcb->pcb_flags |= PCB_USERFPUINITDONE;
        } else
-               fxrstor(&pcb->pcb_save);
-       intr_restore(s);
+               fxrstor(pcb->pcb_save);
+       critical_exit();
 }
 
-/*
- * This should be called with interrupts disabled and only when the owning
- * FPU thread is non-null.
- */
 void
 fpudrop()
 {
        struct thread *td;
 
        td = PCPU_GET(fpcurthread);
+       KASSERT(td == curthread, ("fpudrop: fpcurthread != curthread"));
+       CRITICAL_ASSERT(td);
        PCPU_SET(fpcurthread, NULL);
        td->td_pcb->pcb_flags &= ~PCB_FPUINITDONE;
        start_emulating();
@@ -449,23 +452,47 @@ fpudrop()
  * It returns the FPU ownership status.
  */
 int
+fpugetuserregs(struct thread *td, struct savefpu *addr)
+{
+       struct pcb *pcb;
+
+       pcb = td->td_pcb;
+       if ((pcb->pcb_flags & PCB_USERFPUINITDONE) == 0) {
+               bcopy(&fpu_initialstate, addr, sizeof(fpu_initialstate));
+               addr->sv_env.en_cw = pcb->pcb_initial_fpucw;
+               return (_MC_FPOWNED_NONE);
+       }
+       critical_enter();
+       if (td == PCPU_GET(fpcurthread) && PCB_USER_FPU(pcb)) {
+               fxsave(addr);
+               critical_exit();
+               return (_MC_FPOWNED_FPU);
+       } else {
+               critical_exit();
+               bcopy(&pcb->pcb_user_save, addr, sizeof(*addr));
+               return (_MC_FPOWNED_PCB);
+       }
+}
+
+int
 fpugetregs(struct thread *td, struct savefpu *addr)
 {
-       register_t s;
+       struct pcb *pcb;
 
-       if ((td->td_pcb->pcb_flags & PCB_FPUINITDONE) == 0) {
+       pcb = td->td_pcb;
+       if ((pcb->pcb_flags & PCB_FPUINITDONE) == 0) {
                bcopy(&fpu_initialstate, addr, sizeof(fpu_initialstate));
-               addr->sv_env.en_cw = td->td_pcb->pcb_initial_fpucw;
+               addr->sv_env.en_cw = pcb->pcb_initial_fpucw;
                return (_MC_FPOWNED_NONE);
        }
-       s = intr_disable();
+       critical_enter();
        if (td == PCPU_GET(fpcurthread)) {
                fxsave(addr);
-               intr_restore(s);
+               critical_exit();
                return (_MC_FPOWNED_FPU);
        } else {
-               intr_restore(s);
-               bcopy(&td->td_pcb->pcb_save, addr, sizeof(*addr));
+               critical_exit();
+               bcopy(pcb->pcb_save, addr, sizeof(*addr));
                return (_MC_FPOWNED_PCB);
        }
 }
@@ -474,19 +501,42 @@ fpugetregs(struct thread *td, struct sav
  * Set the state of the FPU.
  */
 void
+fpusetuserregs(struct thread *td, struct savefpu *addr)
+{
+       struct pcb *pcb;
+
+       pcb = td->td_pcb;
+       critical_enter();
+       if (td == PCPU_GET(fpcurthread) && PCB_USER_FPU(pcb)) {
+               fxrstor(addr);
+               critical_exit();
+               pcb->pcb_flags |= PCB_FPUINITDONE | PCB_USERFPUINITDONE;
+       } else {
+               critical_exit();
+               bcopy(addr, &td->td_pcb->pcb_user_save, sizeof(*addr));
+               if (PCB_USER_FPU(pcb))
+                       pcb->pcb_flags |= PCB_FPUINITDONE;
+               pcb->pcb_flags |= PCB_USERFPUINITDONE;
+       }
+}
+
+void
 fpusetregs(struct thread *td, struct savefpu *addr)
 {
-       register_t s;
+       struct pcb *pcb;
 
-       s = intr_disable();
+       pcb = td->td_pcb;
+       critical_enter();
        if (td == PCPU_GET(fpcurthread)) {
                fxrstor(addr);
-               intr_restore(s);
+               critical_exit();
        } else {
-               intr_restore(s);
-               bcopy(addr, &td->td_pcb->pcb_save, sizeof(*addr));
+               critical_exit();
+               bcopy(addr, td->td_pcb->pcb_save, sizeof(*addr));
        }
-       curthread->td_pcb->pcb_flags |= PCB_FPUINITDONE;
+       if (PCB_USER_FPU(pcb))
+               pcb->pcb_flags |= PCB_USERFPUINITDONE;
+       pcb->pcb_flags |= PCB_FPUINITDONE;
 }
 
 /*
@@ -575,3 +625,73 @@ static devclass_t fpupnp_devclass;
 
 DRIVER_MODULE(fpupnp, acpi, fpupnp_driver, fpupnp_devclass, 0, 0);
 #endif /* DEV_ISA */
+
+int
+fpu_kern_enter(struct thread *td, struct fpu_kern_ctx *ctx, u_int flags)
+{
+       struct pcb *pcb;
+
+       pcb = td->td_pcb;
+       KASSERT(!PCB_USER_FPU(pcb) || pcb->pcb_save == &pcb->pcb_user_save,
+           ("mangled pcb_save"));
+       ctx->flags = 0;
+       if ((pcb->pcb_flags & PCB_FPUINITDONE) != 0)
+               ctx->flags |= FPU_KERN_CTX_FPUINITDONE;
+       fpuexit(td);
+       ctx->prev = pcb->pcb_save;
+       pcb->pcb_save = &ctx->hwstate;
+       pcb->pcb_flags |= PCB_KERNFPU;
+       pcb->pcb_flags &= ~PCB_FPUINITDONE;
+       return (0);
+}
+
+int
+fpu_kern_leave(struct thread *td, struct fpu_kern_ctx *ctx)
+{
+       struct pcb *pcb;
+
+       pcb = td->td_pcb;
+       critical_enter();
+       if (curthread == PCPU_GET(fpcurthread))
+               fpudrop();
+       critical_exit();
+       pcb->pcb_save = ctx->prev;
+       if (pcb->pcb_save == &pcb->pcb_user_save) {
+               if ((pcb->pcb_flags & PCB_USERFPUINITDONE) != 0)
+                       pcb->pcb_flags |= PCB_FPUINITDONE;
+               else
+                       pcb->pcb_flags &= ~PCB_FPUINITDONE;
+               pcb->pcb_flags &= ~PCB_KERNFPU;
+       } else {
+               if ((ctx->flags & FPU_KERN_CTX_FPUINITDONE) != 0)
+                       pcb->pcb_flags |= PCB_FPUINITDONE;
+               else
+                       pcb->pcb_flags &= ~PCB_FPUINITDONE;
+               KASSERT(!PCB_USER_FPU(pcb), ("unpaired fpu_kern_leave"));
+       }
+       return (0);
+}
+
+int
+fpu_kern_thread(u_int flags)
+{
+       struct pcb *pcb;
+
+       pcb = PCPU_GET(curpcb);
+       KASSERT((curthread->td_pflags & TDP_KTHREAD) != 0,
+           ("Only kthread may use fpu_kern_thread"));
+       KASSERT(pcb->pcb_save == &pcb->pcb_user_save, ("mangled pcb_save"));
+       KASSERT(PCB_USER_FPU(pcb), ("recursive call"));
+
+       pcb->pcb_flags |= PCB_KERNFPU;
+       return (0);

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
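
The save-area bookkeeping done by fpu_kern_enter() and fpu_kern_leave()
in the hunk above can be followed more easily in a small userland model.
The sketch below is illustrative only and is not kernel code: the flag
values, the model_* function names, the 512-byte savefpu stand-in and
the PCB_USER_FPU() definition are assumptions, since the headers that
define them fall in the truncated part of the diff. The hardware steps
(fpuexit(), fpudrop(), critical sections) are omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; the real ones are in headers not shown above. */
#define PCB_FPUINITDONE          0x01u
#define PCB_USERFPUINITDONE      0x02u
#define PCB_KERNFPU              0x04u
#define FPU_KERN_CTX_FPUINITDONE 0x01u

struct savefpu { unsigned char bytes[512]; };   /* stand-in for an FXSAVE area */

struct fpu_kern_ctx {
	struct savefpu hwstate;         /* save area used while in kernel FPU mode */
	struct savefpu *prev;           /* save area that was active before enter */
	unsigned flags;
};

struct pcb {
	struct savefpu pcb_user_save;   /* user-mode FPU state */
	struct savefpu *pcb_save;       /* currently active save area */
	unsigned pcb_flags;
};

/* Assumed definition: user state is active unless PCB_KERNFPU is set. */
#define PCB_USER_FPU(pcb) (((pcb)->pcb_flags & PCB_KERNFPU) == 0)

void
model_pcb_init(struct pcb *pcb)
{
	pcb->pcb_save = &pcb->pcb_user_save;
	pcb->pcb_flags = 0;
}

/* Mirrors the pointer and flag updates of fpu_kern_enter(). */
void
model_fpu_kern_enter(struct pcb *pcb, struct fpu_kern_ctx *ctx)
{
	assert(!PCB_USER_FPU(pcb) || pcb->pcb_save == &pcb->pcb_user_save);
	ctx->flags = 0;
	if ((pcb->pcb_flags & PCB_FPUINITDONE) != 0)
		ctx->flags |= FPU_KERN_CTX_FPUINITDONE;
	/* The kernel calls fpuexit(td) here to flush the hardware state. */
	ctx->prev = pcb->pcb_save;
	pcb->pcb_save = &ctx->hwstate;
	pcb->pcb_flags |= PCB_KERNFPU;
	pcb->pcb_flags &= ~PCB_FPUINITDONE;
}

/* Mirrors the pointer and flag updates of fpu_kern_leave(). */
void
model_fpu_kern_leave(struct pcb *pcb, struct fpu_kern_ctx *ctx)
{
	/* The kernel drops FPU ownership here inside a critical section. */
	pcb->pcb_save = ctx->prev;
	if (pcb->pcb_save == &pcb->pcb_user_save) {
		/* Outermost leave: restore the user view of the init flag. */
		if ((pcb->pcb_flags & PCB_USERFPUINITDONE) != 0)
			pcb->pcb_flags |= PCB_FPUINITDONE;
		else
			pcb->pcb_flags &= ~PCB_FPUINITDONE;
		pcb->pcb_flags &= ~PCB_KERNFPU;
	} else {
		/* Nested leave: restore the flag saved by the matching enter. */
		if ((ctx->flags & FPU_KERN_CTX_FPUINITDONE) != 0)
			pcb->pcb_flags |= PCB_FPUINITDONE;
		else
			pcb->pcb_flags &= ~PCB_FPUINITDONE;
		assert(!PCB_USER_FPU(pcb));
	}
}
```

In this model, enter/leave pairs nest like a stack: each enter pushes
the current pcb_save into ctx->prev, and only the outermost leave clears
PCB_KERNFPU and points pcb_save back at pcb_user_save.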