Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-12-01 Thread Jeremy Fitzhardinge
On Nov 29, 2007, at 2:44 PM, Ingo Molnar wrote:
> * Andi Kleen <[EMAIL PROTECTED]> wrote:
>> For i386 iirc Jeremy/Zach did the benchmarking and they settled on %fs
>> because it was faster for something (originally it was %gs too)
> yep. IIRC, some CPUs only optimize %fs because that's what Windows

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Chuck Ebbert
On 11/29/2007 05:21 PM, Roland McGrath wrote:
>>>     case offsetof(struct user32, regs.gs):
>>>         *val = child->thread.gsindex;
>>> +       if (child == current)
>>> +           asm("movl %%gs,%0" : "=r" (*val));
>> Won't this return the kernel's GS instead of the user's?
>
> [...]
> But this is

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Linus Torvalds
On Thu, 29 Nov 2007, Roland McGrath wrote:
>
> Um, really? This is x86-64 code. AIUI those values don't have any effect
> at all in 64-bit mode (as the kernel is). I haven't found any code in
> entry_64.S or ia32entry.S that touches them. __switch_to uses direct
> access to the segment registers

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Roland McGrath
> But the ones that do the same thing for fs/es/ds are *not*. Those three
> registers are kernel mode registers (ds/es are the regular kernel data
> segment, fs is the per-cpu data segment), and restored on return to user
> space from the stack.

Um, really? This is x86-64 code. AIUI those values

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Roland McGrath
> >     case offsetof(struct user32, regs.gs):
> >         *val = child->thread.gsindex;
> > +       if (child == current)
> > +           asm("movl %%gs,%0" : "=r" (*val));
>
> Won't this return the kernel's GS instead of the user's?
[...]
> But this is x86_64, where swapgs is done on kernel

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Linus Torvalds
On Thu, 29 Nov 2007, Andi Kleen wrote:
>
> How would you catch (common) the case of them having different bases in the
> GDT TLS entries? At some point the selector has to be reloaded, otherwise
> it won't be picked up by the CPU.

You're right. I somehow thought we were using the LDT for TLS

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Andi Kleen
> HOWEVER. That is actually not the right (well, "complete") conditional,
> since it's only one sub-case of the thing that matters. The right
> conditional is probably
>
>     /*
>      * Restore %gs if needed (which is common).
>      * We can avoid it if they are identical, and
>      * point

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread H. Peter Anvin
Ingo Molnar wrote:
> * Andi Kleen <[EMAIL PROTECTED]> wrote:
>> For i386 iirc Jeremy/Zach did the benchmarking and they settled on %fs
>> because it was faster for something (originally it was %gs too)
> yep. IIRC, some CPUs only optimize %fs because that's what Windows uses
> and leaves Linux with %gs

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Linus Torvalds
On Thu, 29 Nov 2007, Andi Kleen wrote:
>
> For i386 iirc Jeremy/Zach did the benchmarking and they settled
> on %fs because it was faster for something (originally it was %gs too)

Hmm. Context switching ends up having to switch the segment that we do
*not* use for the kernel, and the context

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Ingo Molnar
* Andi Kleen <[EMAIL PROTECTED]> wrote:

> For i386 iirc Jeremy/Zach did the benchmarking and they settled on %fs
> because it was faster for something (originally it was %gs too)

yep. IIRC, some CPUs only optimize %fs because that's what Windows uses
and leaves Linux with %gs out in the cold.

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Andi Kleen
On Thu, Nov 29, 2007 at 11:16:55AM -0800, H. Peter Anvin wrote:
> Andi, do you happen to remember the details on this?

x86-64 has to use GS because there is no SWAPFS

We decided to make it opposite on user space back then, but not based
on benchmarks (there were only simulators back then)

Oh yes

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread H. Peter Anvin
Andi, do you happen to remember the details on this?

    -hpa

Linus Torvalds wrote:
> On Thu, 29 Nov 2007, H. Peter Anvin wrote:
>> Linus Torvalds wrote:
>>>> It is advantageous for user space to use the register the kernel
>>>> typically won't, in order to speed up system call entry/exit.
>>> but I'm

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Linus Torvalds
On Thu, 29 Nov 2007, H. Peter Anvin wrote:
> Linus Torvalds wrote:
> >
> > > It is advantageous for user space to use the register the kernel typically
> > > won't, in order to speed up system call entry/exit.
> >
> > but I'm not seeing the reason for that one. Care to comment more? (Yes,
> >

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread H. Peter Anvin
Linus Torvalds wrote:
> However, you also say:
>
>> It is advantageous for user space to use the register the kernel
>> typically won't, in order to speed up system call entry/exit.
>
> but I'm not seeing the reason for that one. Care to comment more? (Yes,
> there is often a latency from segment reload

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Linus Torvalds
On Thu, 29 Nov 2007, H. Peter Anvin wrote:
>
> The kernel uses %fs in 32-bit mode and %gs in 64-bit mode.

Yeah, thanks for reminding me about this particular insanity.

We should just make the kernel always use %gs for the percpu data. On
32-bit x86 there really is no reason to use %fs over

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread H. Peter Anvin
Chuck Ebbert wrote:
> On 11/29/2007 01:09 PM, Linus Torvalds wrote:
>>>>     case offsetof(struct user32, regs.gs):
>>>>         *val = child->thread.gsindex;
>>>> +       if (child == current)
>>>> +           asm("movl %%gs,%0" : "=r" (*val));
>>> Won't this return the kernel's GS instead

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Chuck Ebbert
On 11/29/2007 01:09 PM, Linus Torvalds wrote:
>>>     case offsetof(struct user32, regs.gs):
>>>         *val = child->thread.gsindex;
>>> +       if (child == current)
>>> +           asm("movl %%gs,%0" : "=r" (*val));
>> Won't this return the kernel's GS instead of the user's?
>
> No, %gs is

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread H. Peter Anvin
Linus Torvalds wrote:
> But this one is correct:
>
>>>     case offsetof(struct user32, regs.gs):
>>>         *val = child->thread.gsindex;
>>> +       if (child == current)
>>> +           asm("movl %%gs,%0" : "=r" (*val));
>> Won't this return the kernel's GS instead of the user's?

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Linus Torvalds
Chuck seems to have caught a bug, although the wrong one:

On Thu, 29 Nov 2007, Chuck Ebbert wrote:
>
> On 11/28/2007 07:42 PM, Roland McGrath wrote:
> > --- a/arch/x86/ia32/ptrace32.c
> > +++ b/arch/x86/ia32/ptrace32.c
> > ...
> > +       if (child == current)
> > +

Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-29 Thread Chuck Ebbert
On 11/28/2007 07:42 PM, Roland McGrath wrote:
> --- a/arch/x86/ia32/ptrace32.c
> +++ b/arch/x86/ia32/ptrace32.c
> @@ -48,19 +48,27 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 val)
>               if (val && (val & 3) != 3)
>                       return -EIO;
>
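
The context lines of the quoted putreg32 hunk reject a segment selector unless it is null or its low two bits (the requested privilege level) equal 3. A minimal user-space sketch of that test follows. The helper name `check_user_selector` and the sample selector values are hypothetical, chosen only for illustration; this is not the kernel's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function. It mirrors the check in
 * the quoted hunk: a selector is accepted when it is null (0) or when
 * its low two bits, the requested privilege level, are both set (3).
 */
static int check_user_selector(uint32_t val)
{
	if (val && (val & 3) != 3)
		return -EIO;	/* non-null selector with RPL below 3 */
	return 0;
}

/* Sample values are illustrative numbers only. */
static void check_user_selector_examples(void)
{
	assert(check_user_selector(0) == 0);		/* null selector */
	assert(check_user_selector(0x23) == 0);		/* RPL 3 */
	assert(check_user_selector(0x10) == -EIO);	/* RPL 0 */
}
```

Calling `check_user_selector_examples()` from a test program exercises the three cases: null, RPL 3, and RPL 0.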

[PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

2007-11-28 Thread Roland McGrath
This generalizes the getreg32 and putreg32 functions so they can be used
on the current task, as well as on a task stopped in TASK_TRACED and
switched off. This lays the groundwork to share this code for all kinds of
user-mode machine state access, not just ptrace.

Signed-off-by: Roland McGrath
