Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support
On Wed, Aug 2, 2017 at 9:56 AM, Kees Cook wrote:
> On Wed, Aug 2, 2017 at 9:42 AM, Thomas Garnier wrote:
>> I noticed that the problem is not only that gs:0x40 is not
>> accessible. The compiler will also default to the fs register if
>> mcmodel=kernel is not set.
>>
>> In the next patch set, I am going to add support for
>> -mstack-protector-guard=global so a global variable can be used
>> instead of the segment register. This is similar to the ARM/ARM64 approach.
>
> While this is probably understood, I have to point out that this would
> be a major regression for the stack protection on x86.

I agree, the optimal solution will be using updated gcc/clang.

>
>> Following this patch, I will work with gcc and llvm to add
>> -mstack-protector-reg= support similar to PowerPC.
>> This way we can have gs used even without mcmodel=kernel. Once that's
>> an option, I can set up the GDT as described in the previous email
>> (similar to RFG).
>
> It would be much nicer if we could teach gcc about the percpu area
> instead. This would let us solve the global stack protector problem on
> the other architectures:
> http://www.openwall.com/lists/kernel-hardening/2017/06/27/6

Yes, while I am looking at gcc I will take a look at the other
architectures to see if I can help there too.

>
> -Kees
>
> --
> Kees Cook
> Pixel Security

--
Thomas

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support
On Wed, Aug 2, 2017 at 9:42 AM, Thomas Garnier wrote:
> I noticed that the problem is not only that gs:0x40 is not
> accessible. The compiler will also default to the fs register if
> mcmodel=kernel is not set.
>
> In the next patch set, I am going to add support for
> -mstack-protector-guard=global so a global variable can be used
> instead of the segment register. This is similar to the ARM/ARM64 approach.

While this is probably understood, I have to point out that this would
be a major regression for the stack protection on x86.

> Following this patch, I will work with gcc and llvm to add
> -mstack-protector-reg= support similar to PowerPC.
> This way we can have gs used even without mcmodel=kernel. Once that's
> an option, I can set up the GDT as described in the previous email
> (similar to RFG).

It would be much nicer if we could teach gcc about the percpu area
instead. This would let us solve the global stack protector problem on
the other architectures:
http://www.openwall.com/lists/kernel-hardening/2017/06/27/6

-Kees

--
Kees Cook
Pixel Security
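One way to read the regression concern above is that a global guard is a single value shared by every task, whereas the current x86 scheme refreshes the per-CPU canary slot from the incoming task on each context switch (the patch's `__switch_to_asm` hunk copies `TASK_stack_canary` into the per-CPU area). The following is a minimal user-space model of that difference, not kernel code; every name ending in `_sketch` is hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical task descriptor: each task carries its own canary value. */
struct task_sketch {
	unsigned long stack_canary;
};

/* Global-guard scheme (what a global-variable guard would read):
 * one value, identical for every task on every CPU. */
static unsigned long global_guard_sketch = 0x5eed5eedUL;

/* Per-CPU scheme: the slot a %gs-relative canary load would read. */
static unsigned long percpu_canary_slot_sketch;

/* Model of the context-switch step: install the next task's canary. */
static void switch_to_sketch(const struct task_sketch *next)
{
	percpu_canary_slot_sketch = next->stack_canary;
}

/* Value a function prologue would pick up under each scheme. */
static unsigned long guard_seen_percpu(void)
{
	return percpu_canary_slot_sketch;
}

static unsigned long guard_seen_global(void)
{
	return global_guard_sketch;
}
```

After switching between two tasks with different canaries, `guard_seen_percpu()` changes while `guard_seen_global()` stays the same, so under the global scheme a canary value learned from one task would apply to all of them.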
Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support
On Thu, Jul 20, 2017 at 7:26 AM, Thomas Garnier wrote:
> On Wed, Jul 19, 2017 at 4:33 PM, H. Peter Anvin wrote:
>> On 07/19/17 11:26, Thomas Garnier wrote:
>>> On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst wrote:
>>>> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier wrote:
>>>>> Percpu uses a clever design where the .percpu ELF section has a virtual
>>>>> address of zero and the relocation code avoids relocating specific
>>>>> symbols. It makes the code simple and easily adaptable with or without
>>>>> SMP support.
>>>>>
>>>>> This design is incompatible with PIE because generated code always tries to
>>>>> access the zero virtual address relative to the default mapping address.
>>>>> It becomes impossible when KASLR is configured to go below -2G. This
>>>>> patch solves this problem by removing the zero mapping and adapting the GS
>>>>> base to be relative to the expected address. These changes are done only
>>>>> when PIE is enabled. The original implementation is kept as-is
>>>>> by default.
>>>>
>>>> The reason the per-cpu section is zero-based on x86-64 is to
>>>> work around GCC hardcoding the stack protector canary at %gs:40. So
>>>> this patch is incompatible with CONFIG_STACK_PROTECTOR.
>>>
>>> Ok, that makes sense. I don't want this feature to not work with
>>> CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
>>> entry for gs so gs:40 points to the correct memory address and
>>> gs:[rip+XX] works correctly through the MSR.
>>
>> What are you talking about? A GDT entry and the MSR do the same thing,
>> except that a GDT entry is limited to an offset of 0-0x (which
>> doesn't work for us, obviously.)
>>
>
> A GDT entry would allow gs:0x40 to be valid while all gs:[rip+XX]
> addresses use the MSR.
>
> I haven't tested it, but that was used in the RFG mitigation [1]. The fs
> segment register was used for both thread storage and shadow stack.
>
> [1] http://xlab.tencent.com/en/2016/11/02/return-flow-guard/
>

Small update on that.

I noticed that the problem is not only that gs:0x40 is not accessible.
The compiler will also default to the fs register if mcmodel=kernel is
not set.

In the next patch set, I am going to add support for
-mstack-protector-guard=global so a global variable can be used instead
of the segment register. This is similar to the ARM/ARM64 approach.

Following this patch, I will work with gcc and llvm to add
-mstack-protector-reg= support similar to PowerPC. This way we can have
gs used even without mcmodel=kernel. Once that's an option, I can set up
the GDT as described in the previous email (similar to RFG).

Let me know what you think about this approach.

>>> Given the separate
>>> discussion on mcmodel, I am first going to check if we can move from
>>> PIE to PIC with a mcmodel=small or medium that would remove the percpu
>>> change requirement. I tried before without success, but I now understand
>>> percpu and other components better, so maybe I can make it work.
>>
>>>> This is silly. The right thing for PIE is to be explicitly absolute,
>>>> without (%rip). The use of (%rip) memory references for percpu is just
>>>> an optimization.
>>>
>>> I agree that it is odd but that's how the compiler generates code. I
>>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>>> on other threads.
>>
>> Why should the way compiler generates code affect the way we do things
>> in assembly?
>>
>> That being said, the compiler now has support for generating this kind
>> of code explicitly via the __seg_gs pointer modifier. That should let
>> us drop the __percpu_prefix and just use variables directly. I suspect
>> we want to declare percpu variables as "volatile __seg_gs" to account
>> for the possibility of CPU switches.
>>
>> Older compilers won't be able to work with this, of course, but I think
>> that it is acceptable for those older compilers to not be able to
>> support PIE.
>>
>> -hpa
>>
>
>
>
> --
> Thomas

--
Thomas
Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support
On Wed, Jul 19, 2017 at 4:33 PM, H. Peter Anvin wrote:
> On 07/19/17 11:26, Thomas Garnier wrote:
>> On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst wrote:
>>> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier wrote:
>>>> Percpu uses a clever design where the .percpu ELF section has a virtual
>>>> address of zero and the relocation code avoids relocating specific
>>>> symbols. It makes the code simple and easily adaptable with or without
>>>> SMP support.
>>>>
>>>> This design is incompatible with PIE because generated code always tries to
>>>> access the zero virtual address relative to the default mapping address.
>>>> It becomes impossible when KASLR is configured to go below -2G. This
>>>> patch solves this problem by removing the zero mapping and adapting the GS
>>>> base to be relative to the expected address. These changes are done only
>>>> when PIE is enabled. The original implementation is kept as-is
>>>> by default.
>>>
>>> The reason the per-cpu section is zero-based on x86-64 is to
>>> work around GCC hardcoding the stack protector canary at %gs:40. So
>>> this patch is incompatible with CONFIG_STACK_PROTECTOR.
>>
>> Ok, that makes sense. I don't want this feature to not work with
>> CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
>> entry for gs so gs:40 points to the correct memory address and
>> gs:[rip+XX] works correctly through the MSR.
>
> What are you talking about? A GDT entry and the MSR do the same thing,
> except that a GDT entry is limited to an offset of 0-0x (which
> doesn't work for us, obviously.)
>

A GDT entry would allow gs:0x40 to be valid while all gs:[rip+XX]
addresses use the MSR.

I haven't tested it, but that was used in the RFG mitigation [1]. The fs
segment register was used for both thread storage and shadow stack.

[1] http://xlab.tencent.com/en/2016/11/02/return-flow-guard/

>> Given the separate
>> discussion on mcmodel, I am first going to check if we can move from
>> PIE to PIC with a mcmodel=small or medium that would remove the percpu
>> change requirement. I tried before without success, but I now understand
>> percpu and other components better, so maybe I can make it work.
>
>>> This is silly. The right thing for PIE is to be explicitly absolute,
>>> without (%rip). The use of (%rip) memory references for percpu is just
>>> an optimization.
>>
>> I agree that it is odd but that's how the compiler generates code. I
>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>> on other threads.
>
> Why should the way compiler generates code affect the way we do things
> in assembly?
>
> That being said, the compiler now has support for generating this kind
> of code explicitly via the __seg_gs pointer modifier. That should let
> us drop the __percpu_prefix and just use variables directly. I suspect
> we want to declare percpu variables as "volatile __seg_gs" to account
> for the possibility of CPU switches.
>
> Older compilers won't be able to work with this, of course, but I think
> that it is acceptable for those older compilers to not be able to
> support PIE.
>
> -hpa
>

--
Thomas
Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support
On 07/19/17 19:21, H. Peter Anvin wrote:
> On 07/19/17 16:33, H. Peter Anvin wrote:
>>>
>>> I agree that it is odd but that's how the compiler generates code. I
>>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>>> on other threads.
>>
>> Why should the way compiler generates code affect the way we do things
>> in assembly?
>>
>> That being said, the compiler now has support for generating this kind
>> of code explicitly via the __seg_gs pointer modifier. That should let
>> us drop the __percpu_prefix and just use variables directly. I suspect
>> we want to declare percpu variables as "volatile __seg_gs" to account
>> for the possibility of CPU switches.
>>
>> Older compilers won't be able to work with this, of course, but I think
>> that it is acceptable for those older compilers to not be able to
>> support PIE.
>>
>
> Grump. It turns out that the compiler doesn't do the right thing for
> symbols marked with the __seg_[fg]s markers. __thread does the right
> thing, but __thread a) has %fs: hard-coded, still, and b) I believe can
> still cache %seg:0 arbitrarily long.

I filed this bug report for gcc:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81490

It might still be possible to work around this by playing really ugly
games with __thread, but I haven't yet figured out how best to do that.

-hpa
Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support
On 07/19/17 16:33, H. Peter Anvin wrote:
>>
>> I agree that it is odd but that's how the compiler generates code. I
>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>> on other threads.
>
> Why should the way compiler generates code affect the way we do things
> in assembly?
>
> That being said, the compiler now has support for generating this kind
> of code explicitly via the __seg_gs pointer modifier. That should let
> us drop the __percpu_prefix and just use variables directly. I suspect
> we want to declare percpu variables as "volatile __seg_gs" to account
> for the possibility of CPU switches.
>
> Older compilers won't be able to work with this, of course, but I think
> that it is acceptable for those older compilers to not be able to
> support PIE.
>

Grump. It turns out that the compiler doesn't do the right thing for
symbols marked with the __seg_[fg]s markers. __thread does the right
thing, but __thread a) has %fs: hard-coded, still, and b) I believe can
still cache %seg:0 arbitrarily long.

-hpa
Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support
On 07/19/17 11:26, Thomas Garnier wrote:
> On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst wrote:
>> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier wrote:
>>> Percpu uses a clever design where the .percpu ELF section has a virtual
>>> address of zero and the relocation code avoids relocating specific
>>> symbols. It makes the code simple and easily adaptable with or without
>>> SMP support.
>>>
>>> This design is incompatible with PIE because generated code always tries to
>>> access the zero virtual address relative to the default mapping address.
>>> It becomes impossible when KASLR is configured to go below -2G. This
>>> patch solves this problem by removing the zero mapping and adapting the GS
>>> base to be relative to the expected address. These changes are done only
>>> when PIE is enabled. The original implementation is kept as-is
>>> by default.
>>
>> The reason the per-cpu section is zero-based on x86-64 is to
>> work around GCC hardcoding the stack protector canary at %gs:40. So
>> this patch is incompatible with CONFIG_STACK_PROTECTOR.
>
> Ok, that makes sense. I don't want this feature to not work with
> CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
> entry for gs so gs:40 points to the correct memory address and
> gs:[rip+XX] works correctly through the MSR.

What are you talking about? A GDT entry and the MSR do the same thing,
except that a GDT entry is limited to an offset of 0-0x (which
doesn't work for us, obviously.)

> Given the separate
> discussion on mcmodel, I am first going to check if we can move from
> PIE to PIC with a mcmodel=small or medium that would remove the percpu
> change requirement. I tried before without success, but I now understand
> percpu and other components better, so maybe I can make it work.

>> This is silly. The right thing for PIE is to be explicitly absolute,
>> without (%rip). The use of (%rip) memory references for percpu is just
>> an optimization.
>
> I agree that it is odd but that's how the compiler generates code. I
> will re-explore PIC options with mcmodel=small or medium, as mentioned
> on other threads.

Why should the way compiler generates code affect the way we do things
in assembly?

That being said, the compiler now has support for generating this kind
of code explicitly via the __seg_gs pointer modifier. That should let
us drop the __percpu_prefix and just use variables directly. I suspect
we want to declare percpu variables as "volatile __seg_gs" to account
for the possibility of CPU switches.

Older compilers won't be able to work with this, of course, but I think
that it is acceptable for those older compilers to not be able to
support PIE.

-hpa
Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support
On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst wrote:
> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier wrote:
>> Percpu uses a clever design where the .percpu ELF section has a virtual
>> address of zero and the relocation code avoids relocating specific
>> symbols. It makes the code simple and easily adaptable with or without
>> SMP support.
>>
>> This design is incompatible with PIE because generated code always tries to
>> access the zero virtual address relative to the default mapping address.
>> It becomes impossible when KASLR is configured to go below -2G. This
>> patch solves this problem by removing the zero mapping and adapting the GS
>> base to be relative to the expected address. These changes are done only
>> when PIE is enabled. The original implementation is kept as-is
>> by default.
>
> The reason the per-cpu section is zero-based on x86-64 is to
> work around GCC hardcoding the stack protector canary at %gs:40. So
> this patch is incompatible with CONFIG_STACK_PROTECTOR.

Ok, that makes sense. I don't want this feature to not work with
CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
entry for gs so gs:40 points to the correct memory address and
gs:[rip+XX] works correctly through the MSR.

Given the separate discussion on mcmodel, I am first going to check if
we can move from PIE to PIC with a mcmodel=small or medium that would
remove the percpu change requirement. I tried before without success,
but I now understand percpu and other components better, so maybe I can
make it work.

Thanks a lot for the feedback.

>
> --
> Brian Gerst

--
Thomas
Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support
On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier wrote:
> Percpu uses a clever design where the .percpu ELF section has a virtual
> address of zero and the relocation code avoids relocating specific
> symbols. It makes the code simple and easily adaptable with or without
> SMP support.
>
> This design is incompatible with PIE because generated code always tries to
> access the zero virtual address relative to the default mapping address.
> It becomes impossible when KASLR is configured to go below -2G. This
> patch solves this problem by removing the zero mapping and adapting the GS
> base to be relative to the expected address. These changes are done only
> when PIE is enabled. The original implementation is kept as-is
> by default.

The reason the per-cpu section is zero-based on x86-64 is to
work around GCC hardcoding the stack protector canary at %gs:40. So
this patch is incompatible with CONFIG_STACK_PROTECTOR.

--
Brian Gerst
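The incompatibility can be made concrete with a little address arithmetic. The sketch below is a simplified model, not kernel code: it assumes that in the zero-based layout a per-CPU symbol's value is its offset within the area (so the GS base is the area itself), and that in the PIE layout symbols carry real addresses starting at some `per_cpu_start`, with the GS base biased to compensate. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset GCC hardcodes for the x86-64 stack protector canary (%gs:40). */
#define CANARY_OFFSET 40u

/* Hypothetical model of the start of one CPU's per-CPU area. */
struct percpu_area_sketch {
	char          gs_base[CANARY_OFFSET]; /* bytes 0..39 */
	unsigned long stack_canary;           /* must land at byte 40 */
};

_Static_assert(offsetof(struct percpu_area_sketch, stack_canary) == CANARY_OFFSET,
	       "canary must sit at offset 40");

/* Zero-based layout: symbols are plain offsets, GS base is the area. */
static uintptr_t gs_base_zero_based(uintptr_t area)
{
	return area;
}

/* PIE-style layout (assumed): symbols are real addresses starting at
 * per_cpu_start, so the GS base is biased by that amount. */
static uintptr_t gs_base_pie(uintptr_t area, uintptr_t per_cpu_start)
{
	return area - per_cpu_start;
}

/* Address reached by a %gs-relative access with the given displacement. */
static uintptr_t gs_access(uintptr_t gs_base, uintptr_t displacement)
{
	return gs_base + displacement;
}
```

With the zero-based base, displacement 40 lands on the canary. With the biased base, a symbol-relative access (`per_cpu_start + 40`) still lands on it, but the compiler's hardcoded displacement of 40 lands `per_cpu_start` bytes too low, which is the breakage described above.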
[Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support
Percpu uses a clever design where the .percpu ELF section has a virtual
address of zero and the relocation code avoids relocating specific
symbols. It makes the code simple and easily adaptable with or without
SMP support.

This design is incompatible with PIE because generated code always tries to
access the zero virtual address relative to the default mapping address.
It becomes impossible when KASLR is configured to go below -2G. This
patch solves this problem by removing the zero mapping and adapting the GS
base to be relative to the expected address. These changes are done only
when PIE is enabled. The original implementation is kept as-is
by default.

The assembly and PER_CPU macros are changed to use relative references
when PIE is enabled.

The KALLSYMS_ABSOLUTE_PERCPU configuration is disabled with PIE given
percpu symbols are not absolute in this case.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier
---
 arch/x86/entry/entry_64.S      |  4 ++--
 arch/x86/include/asm/percpu.h  | 25 +++--
 arch/x86/kernel/cpu/common.c   |  4 +++-
 arch/x86/kernel/head_64.S      |  4
 arch/x86/kernel/setup_percpu.c |  2 +-
 arch/x86/kernel/vmlinux.lds.S  | 13 +++--
 arch/x86/lib/cmpxchg16b_emu.S  |  8
 arch/x86/xen/xen-asm.S         | 12 ++--
 init/Kconfig                   |  2 +-
 9 files changed, 51 insertions(+), 23 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 691c4755269b..be198c0a2a8c 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -388,7 +388,7 @@ ENTRY(__switch_to_asm)

 #ifdef CONFIG_CC_STACKPROTECTOR
 	movq	TASK_stack_canary(%rsi), %rbx
-	movq	%rbx, PER_CPU_VAR(irq_stack_union)+stack_canary_offset
+	movq	%rbx, PER_CPU_VAR(irq_stack_union + stack_canary_offset)
 #endif

 	/* restore callee-saved registers */
@@ -739,7 +739,7 @@ apicinterrupt IRQ_WORK_VECTOR irq_work_interrupt smp_irq_work_interrupt
 /*
  * Exception entry points.
  */
-#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
+#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss + (TSS_ist + ((x) - 1) * 8))

 .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
 ENTRY(\sym)
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 9fa03604b2b3..862eb771f0e5 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -4,9 +4,11 @@
 #ifdef CONFIG_X86_64
 #define __percpu_seg		gs
 #define __percpu_mov_op		movq
+#define __percpu_rel		(%rip)
 #else
 #define __percpu_seg		fs
 #define __percpu_mov_op		movl
+#define __percpu_rel
 #endif

 #ifdef __ASSEMBLY__
@@ -27,10 +29,14 @@
 #define PER_CPU(var, reg) \
 	__percpu_mov_op %__percpu_seg:this_cpu_off, reg; \
 	lea var(reg), reg
-#define PER_CPU_VAR(var)	%__percpu_seg:var
+/* Compatible with Position Independent Code */
+#define PER_CPU_VAR(var)	%__percpu_seg:(var)##__percpu_rel
+/* Rare absolute reference */
+#define PER_CPU_VAR_ABS(var)	%__percpu_seg:var
 #else /* ! SMP */
 #define PER_CPU(var, reg)	__percpu_mov_op $var, reg
-#define PER_CPU_VAR(var)	var
+#define PER_CPU_VAR(var)	(var)##__percpu_rel
+#define PER_CPU_VAR_ABS(var)	var
 #endif	/* SMP */

 #ifdef CONFIG_X86_64_SMP
@@ -208,27 +214,34 @@ do { \
 	pfo_ret__; \
 })

+/* Position Independent code uses relative addresses only */
+#ifdef CONFIG_X86_PIE
+#define __percpu_stable_arg __percpu_arg(a1)
+#else
+#define __percpu_stable_arg __percpu_arg(P1)
+#endif
+
 #define percpu_stable_op(op, var) \
 ({ \
 	typeof(var) pfo_ret__; \
 	switch (sizeof(var)) { \
 	case 1: \
-		asm(op "b "__percpu_arg(P1)",%0" \
+		asm(op "b "__percpu_stable_arg ",%0" \
 		    : "=q" (pfo_ret__) \
 		    : "p" (&(var))); \
 		break; \
 	case 2: \
-		asm(op "w "__percpu_arg(P1)",%0" \
+		asm(op "w "__percpu_stable_arg ",%0" \
 		    : "=r" (pfo_ret__) \
 		    : "p" (&(var))); \
 		break; \
 	case 4: \
-		asm(op "l "__percpu_arg(P1)",%0" \
+