Re: [PATCH 4/6] Unsuck "x86/entry/64: Create a percpu SYSCALL entry trampoline"

2017-12-02 Thread Andy Lutomirski
On Sat, Dec 2, 2017 at 7:18 AM, Josh Poimboeuf wrote:
> On Thu, Nov 30, 2017 at 10:29:44PM -0800, Andy Lutomirski wrote:
>> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
>> index caf74a1bb3de..28f4e7553c26 100644
>> --- a/arch/x86/entry/entry_64.S
>> +++

Re: [PATCH 4/6] Unsuck "x86/entry/64: Create a percpu SYSCALL entry trampoline"

2017-12-02 Thread Josh Poimboeuf
On Thu, Nov 30, 2017 at 10:29:44PM -0800, Andy Lutomirski wrote:
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index caf74a1bb3de..28f4e7553c26 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -180,14 +180,24 @@

[PATCH 4/6] Unsuck "x86/entry/64: Create a percpu SYSCALL entry trampoline"

2017-11-30 Thread Andy Lutomirski
This fixes a huge performance regression.
Please add to the changelog:
This patch actually seems to be a small speedup. With this patch, SYSCALL touches an extra cache line and an extra virtual page, but the pipeline no longer stalls waiting for SWAPGS. It seems that, at least in a tight loop,
