The patch titled
update x86_64-mm-xen-use-iret-directly-where-possible
has been added to the -mm tree. Its filename is
update-x86_64-mm-xen-use-iret-directly-where-possible.patch
*** Remember to use Documentation/SubmitChecklist when testing your code ***
See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this
------------------------------------------------------
Subject: update x86_64-mm-xen-use-iret-directly-where-possible
From: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
There's only a minor code change from the version you've got, but the
comments are more accurate.
Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---
arch/i386/xen/xen-asm.S | 56 +++++++++++++++++++++++++-------------
1 files changed, 37 insertions(+), 19 deletions(-)
diff -puN arch/i386/xen/xen-asm.S~update-x86_64-mm-xen-use-iret-directly-where-possible arch/i386/xen/xen-asm.S
--- a/arch/i386/xen/xen-asm.S~update-x86_64-mm-xen-use-iret-directly-where-possible
+++ a/arch/i386/xen/xen-asm.S
@@ -108,14 +108,28 @@ ENDPATCH(xen_restore_fl_direct)
4: cs
esp-> 0: eip
- This attempts to make sure that any pending events are dealt with
- on return to usermode, but there is a small window in which an event
- can happen just before entering usermode. This has three effects:
- - There can be interrupt recursion on the stack, which is
- unbounded in theory (but very unlikely in practice)
- - New softirq events can be queued up, but they won't get
- processed until the cpu next enters and leaves the kernel.
- - Signals likewise.
+ This attempts to make sure that any pending events are dealt
+ with on return to usermode, but there is a small window in
+ which an event can happen just before entering usermode. If
+ the nested interrupt ends up setting one of the TIF_WORK_MASK
+ pending work flags, it will not be tested again before
+ returning to usermode. This means that a process can end up
+ with pending work, which will be unprocessed until the process
+ enters and leaves the kernel again, which could be an
+ unbounded amount of time. This means that a pending signal or
+ reschedule event could be indefinitely delayed.
+
+ The fix is to notice a nested interrupt in the critical
+ window, and if one occurs, then fold the nested interrupt into
+ the current interrupt stack frame, and re-process it
+ iteratively rather than recursively. This means that it will
+ exit via the normal path, and all pending work will be dealt
+ with appropriately.
+
+ Because the nested interrupt handler needs to deal with the
+ current stack state in whatever form it's in, we keep things
+ simple by only using a single register which is pushed/popped
+ on the stack.
Non-direct iret could be done in the same way, but it would
require an annoying amount of code duplication. We'll assume
@@ -127,9 +141,6 @@ ENTRY(xen_iret_direct)
testl $(X86_EFLAGS_VM | XEN_EFLAGS_NMI), 8(%esp)
jnz hyper_iret
- /* check IF state we're restoring */
- testb $X86_EFLAGS_IF>>8, 8+1(%esp)
-
push %eax
ESP_OFFSET=4 # bytes pushed onto stack
@@ -144,6 +155,9 @@ ENTRY(xen_iret_direct)
movl $per_cpu__xen_vcpu_info, %eax
#endif
+ /* check IF state we're restoring */
+ testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)
+
/* Maybe enable events. Once this happens we could get a
recursive event, so the critical region starts immediately
afterwards. However, if that happens we don't end up
@@ -187,7 +201,7 @@ hyper_iret:
The stack format at this point is:
----------------
- ss :
+ ss : (ss/esp may be present if we came from usermode)
esp :
eflags } outer exception info
cs }
@@ -219,17 +233,21 @@ hyper_iret:
The only caveat is that if the outer eax hasn't been
restored yet (ie, it's still on stack), we need to insert
its value into the SAVE_ALL state before going on, since
- its usermode state which we eventually need to restore.
+ it's usermode state which we eventually need to restore.
*/
ENTRY(xen_iret_crit_fixup)
/* offsets +4 for return address */
- /* Paranoia: make sure we're really coming from userspace.
- Once could imagine a case where userspace jumps into
- the critical range address, but just before the CPU
- delivers a GP, it decides to deliver an interrupt
- instead. Unlikely? Definitely. Easy to avoid?
- Yes. (Some virtual environments get this wrong.) */
+ /*
+ Paranoia: Make sure we're really coming from userspace.
+ One could imagine a case where userspace jumps into the
+ critical range address, but just before the CPU delivers a GP,
+ it decides to deliver an interrupt instead. Unlikely?
+ Definitely. Easy to avoid? Yes. The Intel documents
+ explicitly say that the reported EIP for a bad jump is the
+ jump instruction itself, not the destination, but some virtual
+ environments get this wrong.
+ */
movl PT_CS+4(%esp), %ecx
andl $SEGMENT_RPL_MASK, %ecx
cmpl $USER_RPL, %ecx
_
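The two bit tests touched by this patch can be modelled in plain C outside the kernel. The following is only a sketch under stated assumptions, not the kernel's code: the helper names (put_le32, build_stack, saved_if_set, from_usermode) are hypothetical, the stack is modelled as a little-endian byte array holding eax, eip, cs and eflags in that order (the layout after "push %eax"), and the constants are the values the i386 headers are believed to use. It shows why the relocated IF test needs ESP_OFFSET added to its displacement, and what the RPL comparison in xen_iret_crit_fixup decides.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the i386 kernel headers of this era. */
#define X86_EFLAGS_IF    0x00000200u
#define SEGMENT_RPL_MASK 0x3u
#define USER_RPL         0x3u

/* Hypothetical helper: store a 32-bit value little-endian, as x86 does. */
static void put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}

/*
 * Hypothetical helper: model the stack right after "push %eax":
 *   esp+0: saved eax, esp+4: eip, esp+8: cs, esp+12: eflags
 */
static void build_stack(unsigned char stack[16], uint32_t cs, uint32_t eflags)
{
    put_le32(stack + 0, 0u);           /* saved eax (value irrelevant here) */
    put_le32(stack + 4, 0x08048000u);  /* eip (illustrative value) */
    put_le32(stack + 8, cs);
    put_le32(stack + 12, eflags);
}

/*
 * Models "testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)".
 * IF is bit 9 of eflags, i.e. bit 1 of the second byte, so a byte-wide
 * test of (X86_EFLAGS_IF >> 8) at displacement 8+1 (+ bytes pushed) suffices.
 */
static int saved_if_set(const unsigned char *esp, unsigned esp_offset)
{
    return (esp[8 + 1 + esp_offset] & (X86_EFLAGS_IF >> 8)) != 0;
}

/* Models "andl $SEGMENT_RPL_MASK, %ecx; cmpl $USER_RPL, %ecx". */
static int from_usermode(uint32_t cs)
{
    return (cs & SEGMENT_RPL_MASK) == USER_RPL;
}
```

With eflags = 0x246 (IF set) and an illustrative ring-1 cs of 0x61, saved_if_set(stack, 4) reports IF as set, while saved_if_set(stack, 0) would instead inspect the second byte of the saved cs and report it as clear. That is the reason the moved test carries ESP_OFFSET. from_usermode() is true only for selectors whose low two bits equal USER_RPL, such as 0x73.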
Patches currently in -mm which might be from [EMAIL PROTECTED] are
git-kbuild.patch
add-kstrndup-fix.patch
xen-build-fix.patch
fix-x86_64-mm-xen-xen-smp-guest-support.patch
more-fix-x86_64-mm-xen-xen-smp-guest-support.patch
fix-x86_64-mm-xen-add-xen-virtual-block-device-driver.patch
fix-x86_64-mm-add-common-orderly_poweroff.patch
tidy-up-usermode-helper-waiting-a-bit-fix.patch
update-x86_64-mm-xen-use-iret-directly-where-possible.patch
x86-use-elfnoteh-to-generate-vsyscall-notes-fix.patch
paravirt-helper-to-disable-all-io-space-fix-2.patch
paravirt-helper-to-disable-all-io-space-fix-3.patch
maps2-uninline-some-functions-in-the-page-walker.patch
maps2-eliminate-the-pmd_walker-struct-in-the-page-walker.patch
maps2-remove-vma-from-args-in-the-page-walker.patch
maps2-propagate-errors-from-callback-in-page-walker.patch
maps2-add-callbacks-for-each-level-to-page-walker.patch
maps2-move-the-page-walker-code-to-lib.patch
maps2-simplify-interdependence-of-proc-pid-maps-and-smaps.patch
maps2-move-clear_refs-code-to-task_mmuc.patch
maps2-regroup-task_mmu-by-interface.patch
maps2-make-proc-pid-smaps-optional-under-config_embedded.patch
maps2-make-proc-pid-clear_refs-option-under-config_embedded.patch
maps2-add-proc-pid-pagemap-interface.patch
maps2-add-proc-kpagemap-interface.patch
add-argv_split-fix.patch
add-common-orderly_poweroff-fix.patch
lguest-the-guest-code.patch