Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-16 Thread Ingo Molnar
* Denys Vlasenko wrote: > > - but making it slower without really good reasons isn't good. > > The thinking here is that cleaning up entry.S is a good reason. > > We won't do anything which would slow it down by, say, 5%, > but one cycle may be considered acceptable loss. Ok, so I've applied

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-16 Thread Ingo Molnar
* Denys Vlasenko vda.li...@googlemail.com wrote: - but making it slower without really good reasons isn't good. The thinking here is that cleaning up entry.S is a good reason. We won't do anything which would slow it down by, say, 5%, but one cycle may be considered acceptable loss.

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Denys Vlasenko
On Tue, Mar 10, 2015 at 3:02 PM, Andy Lutomirski wrote: > On Tue, Mar 10, 2015 at 7:00 AM, Denys Vlasenko > wrote: >> On Tue, Mar 10, 2015 at 2:26 PM, Andy Lutomirski wrote: >>> usersp is IMO tolerable. The nasty thing is the FIXUP_TOP_OF_STACK / >>> RESTORE_TOP_OF_STACK garbage, and this

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Andy Lutomirski
On Tue, Mar 10, 2015 at 7:00 AM, Denys Vlasenko wrote: > On Tue, Mar 10, 2015 at 2:26 PM, Andy Lutomirski wrote: >> usersp is IMO tolerable. The nasty thing is the FIXUP_TOP_OF_STACK / >> RESTORE_TOP_OF_STACK garbage, and this patch is the main step toward >> killing that off completely. I've

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Denys Vlasenko
On Tue, Mar 10, 2015 at 2:26 PM, Andy Lutomirski wrote: > usersp is IMO tolerable. The nasty thing is the FIXUP_TOP_OF_STACK / > RESTORE_TOP_OF_STACK garbage, and this patch is the main step toward > killing that off completely. I've still never convinced myself that > there aren't

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Denys Vlasenko
On Tue, Mar 10, 2015 at 2:21 PM, Ingo Molnar wrote: >> Since this patch does add two extra MOVs, >> I did benchmark these patches. They add exactly one cycle >> to system call code path on my Sandy Bridge CPU. > > Hm, but that's the wrong direction, we should try to make it faster, > and to clean

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Ingo Molnar
> * Denys Vlasenko wrote: > > > > So there are now +2 instructions (5 instead of 3) in the > > > system_call path, but there are -2 instructions in the SYSRETQ > > > path, > > > > Unfortunately, no. [...] > > So I assumed that it was an equivalent transformation, given that > none of the

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Andy Lutomirski
On Tue, Mar 10, 2015 at 6:21 AM, Ingo Molnar wrote: > > * Denys Vlasenko wrote: > >> > So there are now +2 instructions (5 instead of 3) in the >> > system_call path, but there are -2 instructions in the SYSRETQ >> > path, >> >> Unfortunately, no. [...] > > So I assumed that it was an equivalent

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Ingo Molnar
* Andy Lutomirski wrote: > > Since this patch does add two extra MOVs, > > I did benchmark these patches. They add exactly one cycle > > to system call code path on my Sandy Bridge CPU. > > Personally, I'm willing to pay that cycle. It could be a bigger > savings on context switch, and the

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Ingo Molnar
* Denys Vlasenko wrote: > > So there are now +2 instructions (5 instead of 3) in the > > system_call path, but there are -2 instructions in the SYSRETQ > > path, > > Unfortunately, no. [...] So I assumed that it was an equivalent transformation, given that none of the changelogs spelled

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Andy Lutomirski
On Tue, Mar 10, 2015 at 6:18 AM, Denys Vlasenko wrote: > On 03/10/2015 01:51 PM, Ingo Molnar wrote: >> >> * Denys Vlasenko wrote: >> >>> PER_CPU(old_rsp) usage is simplified - now it is used only >>> as temp storage, and userspace stack pointer is immediately stored >>> in pt_regs->sp on syscall

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Denys Vlasenko
On 03/10/2015 01:51 PM, Ingo Molnar wrote: > > * Denys Vlasenko wrote: > >> PER_CPU(old_rsp) usage is simplified - now it is used only >> as temp storage, and userspace stack pointer is immediately stored >> in pt_regs->sp on syscall entry, instead of being used later, >> on syscall exit. >> >>

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Andy Lutomirski
On Tue, Mar 10, 2015 at 5:51 AM, Ingo Molnar wrote: > > * Denys Vlasenko wrote: > >> PER_CPU(old_rsp) usage is simplified - now it is used only >> as temp storage, and userspace stack pointer is immediately stored >> in pt_regs->sp on syscall entry, instead of being used later, >> on syscall

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Ingo Molnar
* Denys Vlasenko wrote: > PER_CPU(old_rsp) usage is simplified - now it is used only > as temp storage, and userspace stack pointer is immediately stored > in pt_regs->sp on syscall entry, instead of being used later, > on syscall exit. > > Instead of PER_CPU(old_rsp) and task->thread.usersp,

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Andy Lutomirski
On Tue, Mar 10, 2015 at 5:51 AM, Ingo Molnar mi...@kernel.org wrote: * Denys Vlasenko dvlas...@redhat.com wrote: PER_CPU(old_rsp) usage is simplified - now it is used only as temp storage, and userspace stack pointer is immediately stored in pt_regs->sp on syscall entry, instead of being used

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Andy Lutomirski
On Tue, Mar 10, 2015 at 7:00 AM, Denys Vlasenko vda.li...@googlemail.com wrote: On Tue, Mar 10, 2015 at 2:26 PM, Andy Lutomirski l...@amacapital.net wrote: usersp is IMO tolerable. The nasty thing is the FIXUP_TOP_OF_STACK / RESTORE_TOP_OF_STACK garbage, and this patch is the main step toward

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Denys Vlasenko
On Tue, Mar 10, 2015 at 2:21 PM, Ingo Molnar mi...@kernel.org wrote: Since this patch does add two extra MOVs, I did benchmark these patches. They add exactly one cycle to system call code path on my Sandy Bridge CPU. Hm, but that's the wrong direction, we should try to make it faster, and

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Denys Vlasenko
On Tue, Mar 10, 2015 at 2:26 PM, Andy Lutomirski l...@amacapital.net wrote: usersp is IMO tolerable. The nasty thing is the FIXUP_TOP_OF_STACK / RESTORE_TOP_OF_STACK garbage, and this patch is the main step toward killing that off completely. I've still never convinced myself that there

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Denys Vlasenko
On 03/10/2015 01:51 PM, Ingo Molnar wrote: * Denys Vlasenko dvlas...@redhat.com wrote: PER_CPU(old_rsp) usage is simplified - now it is used only as temp storage, and userspace stack pointer is immediately stored in pt_regs->sp on syscall entry, instead of being used later, on syscall

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Andy Lutomirski
On Tue, Mar 10, 2015 at 6:21 AM, Ingo Molnar mi...@kernel.org wrote: * Denys Vlasenko dvlas...@redhat.com wrote: So there are now +2 instructions (5 instead of 3) in the system_call path, but there are -2 instructions in the SYSRETQ path, Unfortunately, no. [...] So I assumed that it

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Andy Lutomirski
On Tue, Mar 10, 2015 at 6:18 AM, Denys Vlasenko dvlas...@redhat.com wrote: On 03/10/2015 01:51 PM, Ingo Molnar wrote: * Denys Vlasenko dvlas...@redhat.com wrote: PER_CPU(old_rsp) usage is simplified - now it is used only as temp storage, and userspace stack pointer is immediately stored in

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Ingo Molnar
* Denys Vlasenko dvlas...@redhat.com wrote: So there are now +2 instructions (5 instead of 3) in the system_call path, but there are -2 instructions in the SYSRETQ path, Unfortunately, no. [...] So I assumed that it was an equivalent transformation, given that none of the changelogs

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Ingo Molnar
* Denys Vlasenko dvlas...@redhat.com wrote: PER_CPU(old_rsp) usage is simplified - now it is used only as temp storage, and userspace stack pointer is immediately stored in pt_regs->sp on syscall entry, instead of being used later, on syscall exit. Instead of PER_CPU(old_rsp) and

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Ingo Molnar
* Andy Lutomirski l...@amacapital.net wrote: Since this patch does add two extra MOVs, I did benchmark these patches. They add exactly one cycle to system call code path on my Sandy Bridge CPU. Personally, I'm willing to pay that cycle. It could be a bigger savings on context switch,

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Ingo Molnar
* Denys Vlasenko dvlas...@redhat.com wrote: So there are now +2 instructions (5 instead of 3) in the system_call path, but there are -2 instructions in the SYSRETQ path, Unfortunately, no. [...] So I assumed that it was an equivalent transformation, given that none of the

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-10 Thread Denys Vlasenko
On Tue, Mar 10, 2015 at 3:02 PM, Andy Lutomirski l...@amacapital.net wrote: On Tue, Mar 10, 2015 at 7:00 AM, Denys Vlasenko vda.li...@googlemail.com wrote: On Tue, Mar 10, 2015 at 2:26 PM, Andy Lutomirski l...@amacapital.net wrote: usersp is IMO tolerable. The nasty thing is the

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-09 Thread Andy Lutomirski
On Mon, Mar 9, 2015 at 1:32 PM, Denys Vlasenko wrote: > On Mon, Mar 9, 2015 at 9:11 PM, Andy Lutomirski wrote: >>> @@ -253,11 +247,13 @@ GLOBAL(system_call_after_swapgs) >>> */ >>> ENABLE_INTERRUPTS(CLBR_NONE) >>> ALLOC_PT_GPREGS_ON_STACK 8 /* +8: space for

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-09 Thread Denys Vlasenko
On Mon, Mar 9, 2015 at 9:11 PM, Andy Lutomirski wrote: >> @@ -253,11 +247,13 @@ GLOBAL(system_call_after_swapgs) >> */ >> ENABLE_INTERRUPTS(CLBR_NONE) >> ALLOC_PT_GPREGS_ON_STACK 8 /* +8: space for orig_ax */ >> + movq %rcx,RIP(%rsp) >> + movq

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-09 Thread Andy Lutomirski
On Mon, Mar 9, 2015 at 11:39 AM, Denys Vlasenko wrote: > PER_CPU(old_rsp) usage is simplified - now it is used only > as temp storage, and userspace stack pointer is immediately stored > in pt_regs->sp on syscall entry, instead of being used later, > on syscall exit. > > Instead of

[PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-09 Thread Denys Vlasenko
PER_CPU(old_rsp) usage is simplified - now it is used only as temp storage, and userspace stack pointer is immediately stored in pt_regs->sp on syscall entry, instead of being used later, on syscall exit. Instead of PER_CPU(old_rsp) and task->thread.usersp, C code uses pt_regs->sp now.

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-09 Thread Andy Lutomirski
On Mon, Mar 9, 2015 at 1:32 PM, Denys Vlasenko vda.li...@googlemail.com wrote: On Mon, Mar 9, 2015 at 9:11 PM, Andy Lutomirski l...@amacapital.net wrote: @@ -253,11 +247,13 @@ GLOBAL(system_call_after_swapgs) */ ENABLE_INTERRUPTS(CLBR_NONE) ALLOC_PT_GPREGS_ON_STACK 8

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-09 Thread Andy Lutomirski
On Mon, Mar 9, 2015 at 11:39 AM, Denys Vlasenko dvlas...@redhat.com wrote: PER_CPU(old_rsp) usage is simplified - now it is used only as temp storage, and userspace stack pointer is immediately stored in pt_regs->sp on syscall entry, instead of being used later, on syscall exit. Instead of

Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-09 Thread Denys Vlasenko
On Mon, Mar 9, 2015 at 9:11 PM, Andy Lutomirski l...@amacapital.net wrote: @@ -253,11 +247,13 @@ GLOBAL(system_call_after_swapgs) */ ENABLE_INTERRUPTS(CLBR_NONE) ALLOC_PT_GPREGS_ON_STACK 8 /* +8: space for orig_ax */ + movq %rcx,RIP(%rsp) +

[PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath

2015-03-09 Thread Denys Vlasenko
PER_CPU(old_rsp) usage is simplified - now it is used only as temp storage, and userspace stack pointer is immediately stored in pt_regs->sp on syscall entry, instead of being used later, on syscall exit. Instead of PER_CPU(old_rsp) and task->thread.usersp, C code uses pt_regs->sp now.