Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-08-02 Thread Thomas Garnier
On Wed, Aug 2, 2017 at 9:56 AM, Kees Cook  wrote:
> On Wed, Aug 2, 2017 at 9:42 AM, Thomas Garnier  wrote:
>> I noticed that gs:40 not being accessible is not the only problem.
>> The compiler will also default to the fs register if
>> mcmodel=kernel is not set.
>>
>> On the next patch set, I am going to add support for
>> -mstack-protector-guard=global so a global variable can be used
>> instead of the segment register. This is similar to the approach used
>> on ARM/ARM64.
>
> While this is probably understood, I have to point out that this would
> be a major regression for the stack protection on x86.

I agree, the optimal solution will be using updated gcc/clang.

>
>> Following this patch, I will work with gcc and llvm to add
>> -mstack-protector-reg= support similar to PowerPC.
>> This way we can have gs used even without mcmodel=kernel. Once that's
>> an option, I can setup the GDT as described in the previous email
>> (similar to RFG).
>
> It would be much nicer if we could teach gcc about the percpu area
> instead. This would let us solve the global stack protector problem on
> the other architectures:
> http://www.openwall.com/lists/kernel-hardening/2017/06/27/6

Yes, while I am looking at gcc I will take a look at the other
architectures to see if I can help there too.

>
> -Kees
>
> --
> Kees Cook
> Pixel Security



-- 
Thomas

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-08-02 Thread Kees Cook
On Wed, Aug 2, 2017 at 9:42 AM, Thomas Garnier  wrote:
> I noticed that gs:40 not being accessible is not the only problem.
> The compiler will also default to the fs register if
> mcmodel=kernel is not set.
>
> On the next patch set, I am going to add support for
> -mstack-protector-guard=global so a global variable can be used
> instead of the segment register. This is similar to the approach used
> on ARM/ARM64.

While this is probably understood, I have to point out that this would
be a major regression for the stack protection on x86.

> Following this patch, I will work with gcc and llvm to add
> -mstack-protector-reg= support similar to PowerPC.
> This way we can have gs used even without mcmodel=kernel. Once that's
> an option, I can setup the GDT as described in the previous email
> (similar to RFG).

It would be much nicer if we could teach gcc about the percpu area
instead. This would let us solve the global stack protector problem on
the other architectures:
http://www.openwall.com/lists/kernel-hardening/2017/06/27/6

-Kees

-- 
Kees Cook
Pixel Security



Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-08-02 Thread Thomas Garnier
On Thu, Jul 20, 2017 at 7:26 AM, Thomas Garnier  wrote:
> On Wed, Jul 19, 2017 at 4:33 PM, H. Peter Anvin  wrote:
>> On 07/19/17 11:26, Thomas Garnier wrote:
>>> On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst  wrote:
>>>> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier wrote:
>>>>> Percpu uses a clever design where the .percpu ELF section has a virtual
>>>>> address of zero and the relocation code avoids relocating specific
>>>>> symbols. It makes the code simple and easily adaptable with or without
>>>>> SMP support.
>>>>>
>>>>> This design is incompatible with PIE because generated code always tries to
>>>>> access the zero virtual address relative to the default mapping address.
>>>>> It becomes impossible when KASLR is configured to go below -2G. This
>>>>> patch solves this problem by removing the zero mapping and adapting the GS
>>>>> base to be relative to the expected address. These changes are done only
>>>>> when PIE is enabled. The original implementation is kept as-is
>>>>> by default.
>>>>
>>>> The reason the per-cpu section is zero-based on x86-64 is to
>>>> workaround GCC hardcoding the stack protector canary at %gs:40.  So
>>>> this patch is incompatible with CONFIG_STACK_PROTECTOR.
>>>
>>> Ok, that makes sense. I don't want this feature to not work with
>>> CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
>>> entry for gs so gs:40 points to the correct memory address and
>>> gs:[rip+XX] works correctly through the MSR.
>>
>> What are you talking about?  A GDT entry and the MSR do the same thing,
>> except that a GDT entry is limited to an offset of 0-0x (which
>> doesn't work for us, obviously.)
>>
>
> A GDT entry would allow gs:40 to be valid while all gs:[rip+XX]
> addresses use the MSR.
>
> I didn't test it, but that was used in the RFG mitigation [1]. The fs
> segment register was used for both thread storage and shadow stack.
>
> [1] http://xlab.tencent.com/en/2016/11/02/return-flow-guard/
>

Small update on that.

I noticed that gs:40 not being accessible is not the only problem.
The compiler will also default to the fs register if
mcmodel=kernel is not set.

On the next patch set, I am going to add support for
-mstack-protector-guard=global so a global variable can be used
instead of the segment register. This is similar to the approach used
on ARM/ARM64.

Following this patch, I will work with gcc and llvm to add
-mstack-protector-reg= support similar to PowerPC.
This way we can have gs used even without mcmodel=kernel. Once that's
an option, I can setup the GDT as described in the previous email
(similar to RFG).

Let me know what you think about this approach.
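
For readers unfamiliar with the global guard mode: with
-mstack-protector-guard=global the compiler compares the on-stack canary
against a global variable (__stack_chk_guard in GCC) instead of a
segment-relative slot, so every task shares one value. The sketch below
emulates that prologue/epilogue by hand in plain C. The names
demo_stack_guard and demo_copy_checked are hypothetical and only serve the
illustration; this is not the code the compiler or the kernel uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the global guard variable (assumption). */
static uintptr_t demo_stack_guard = (uintptr_t)0x5a5a1234a5a5c3c3ULL;

/*
 * Copies at most 16 bytes into a local buffer and returns 1 if the saved
 * canary still matches the global guard afterwards, 0 otherwise.
 */
static int demo_copy_checked(const char *src, size_t len)
{
	/* "Prologue": snapshot the global guard into the stack frame. */
	volatile uintptr_t canary = demo_stack_guard;
	char buf[16];

	if (len > sizeof(buf))
		len = sizeof(buf);	/* clamp so the copy stays in bounds */
	memcpy(buf, src, len);

	/* "Epilogue": a mismatch is where __stack_chk_fail() would be called. */
	return canary == demo_stack_guard;
}
```

Calling demo_copy_checked("abc", 3) returns 1 because the clamped copy never
reaches the saved canary.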

>>> Given the separate
>>> discussion on mcmodel, I am first going to check if we can move from
>>> PIE to PIC with a mcmodel=small or medium that would remove the percpu
>>> change requirement. I tried before without success, but I understand
>>> percpu and other components better now, so maybe I can make it work.
>>
>>>> This is silly.  The right thing for PIE is to be explicitly absolute,
>>>> without (%rip).  The use of (%rip) memory references for percpu is just
>>>> an optimization.
>>>
>>> I agree that it is odd but that's how the compiler generates code. I
>>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>>> on other threads.
>>
>> Why should the way the compiler generates code affect the way we do things
>> in assembly?
>>
>> That being said, the compiler now has support for generating this kind
>> of code explicitly via the __seg_gs pointer modifier.  That should let
>> us drop the __percpu_prefix and just use variables directly.  I suspect
>> we want to declare percpu variables as "volatile __seg_gs" to account
>> for the possibility of CPU switches.
>>
>> Older compilers won't be able to work with this, of course, but I think
>> that it is acceptable for those older compilers to not be able to
>> support PIE.
>>
>> -hpa
>>
>
>
>
> --
> Thomas



-- 
Thomas



Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-07-20 Thread Thomas Garnier
On Wed, Jul 19, 2017 at 4:33 PM, H. Peter Anvin  wrote:
> On 07/19/17 11:26, Thomas Garnier wrote:
>> On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst  wrote:
>>> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier  wrote:
>>>> Percpu uses a clever design where the .percpu ELF section has a virtual
>>>> address of zero and the relocation code avoids relocating specific
>>>> symbols. It makes the code simple and easily adaptable with or without
>>>> SMP support.
>>>>
>>>> This design is incompatible with PIE because generated code always tries to
>>>> access the zero virtual address relative to the default mapping address.
>>>> It becomes impossible when KASLR is configured to go below -2G. This
>>>> patch solves this problem by removing the zero mapping and adapting the GS
>>>> base to be relative to the expected address. These changes are done only
>>>> when PIE is enabled. The original implementation is kept as-is
>>>> by default.
>>>
>>> The reason the per-cpu section is zero-based on x86-64 is to
>>> workaround GCC hardcoding the stack protector canary at %gs:40.  So
>>> this patch is incompatible with CONFIG_STACK_PROTECTOR.
>>
>> Ok, that makes sense. I don't want this feature to not work with
>> CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
>> entry for gs so gs:40 points to the correct memory address and
>> gs:[rip+XX] works correctly through the MSR.
>
> What are you talking about?  A GDT entry and the MSR do the same thing,
> except that a GDT entry is limited to an offset of 0-0x (which
> doesn't work for us, obviously.)
>

A GDT entry would allow gs:40 to be valid while all gs:[rip+XX]
addresses use the MSR.

I didn't test it, but that was used in the RFG mitigation [1]. The fs
segment register was used for both thread storage and shadow stack.

[1] http://xlab.tencent.com/en/2016/11/02/return-flow-guard/

>> Given the separate
>> discussion on mcmodel, I am first going to check if we can move from
>> PIE to PIC with a mcmodel=small or medium that would remove the percpu
>> change requirement. I tried before without success, but I understand
>> percpu and other components better now, so maybe I can make it work.
>
>>> This is silly.  The right thing for PIE is to be explicitly absolute,
>>> without (%rip).  The use of (%rip) memory references for percpu is just
>>> an optimization.
>>
>> I agree that it is odd but that's how the compiler generates code. I
>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>> on other threads.
>
> Why should the way the compiler generates code affect the way we do things
> in assembly?
>
> That being said, the compiler now has support for generating this kind
> of code explicitly via the __seg_gs pointer modifier.  That should let
> us drop the __percpu_prefix and just use variables directly.  I suspect
> we want to declare percpu variables as "volatile __seg_gs" to account
> for the possibility of CPU switches.
>
> Older compilers won't be able to work with this, of course, but I think
> that it is acceptable for those older compilers to not be able to
> support PIE.
>
> -hpa
>



-- 
Thomas



Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-07-19 Thread H. Peter Anvin
On 07/19/17 19:21, H. Peter Anvin wrote:
> On 07/19/17 16:33, H. Peter Anvin wrote:
>>>
>>> I agree that it is odd but that's how the compiler generates code. I
>>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>>> on other threads.
>>
>> Why should the way the compiler generates code affect the way we do things
>> in assembly?
>>
>> That being said, the compiler now has support for generating this kind
>> of code explicitly via the __seg_gs pointer modifier.  That should let
>> us drop the __percpu_prefix and just use variables directly.  I suspect
>> we want to declare percpu variables as "volatile __seg_gs" to account
>> for the possibility of CPU switches.
>>
>> Older compilers won't be able to work with this, of course, but I think
>> that it is acceptable for those older compilers to not be able to
>> support PIE.
>>
> 
> Grump.  It turns out that the compiler doesn't do the right thing for
> symbols marked with the __seg_[fg]s markers.  __thread does the right
> thing, but __thread a) has %fs: hard-coded, still, and b) I believe can
> still cache %seg:0 arbitrarily long.

I filed this bug report for gcc:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81490

It might still be possible to work around this by playing really ugly
games with __thread, but I haven't yet figured out how best to do that.

-hpa



Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-07-19 Thread H. Peter Anvin
On 07/19/17 16:33, H. Peter Anvin wrote:
>>
>> I agree that it is odd but that's how the compiler generates code. I
>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>> on other threads.
> 
> Why should the way the compiler generates code affect the way we do things
> in assembly?
> 
> That being said, the compiler now has support for generating this kind
> of code explicitly via the __seg_gs pointer modifier.  That should let
> us drop the __percpu_prefix and just use variables directly.  I suspect
> we want to declare percpu variables as "volatile __seg_gs" to account
> for the possibility of CPU switches.
> 
> Older compilers won't be able to work with this, of course, but I think
> that it is acceptable for those older compilers to not be able to
> support PIE.
> 

Grump.  It turns out that the compiler doesn't do the right thing for
symbols marked with the __seg_[fg]s markers.  __thread does the right
thing, but __thread a) has %fs: hard-coded, still, and b) I believe can
still cache %seg:0 arbitrarily long.

-hpa




Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-07-19 Thread H. Peter Anvin
On 07/19/17 11:26, Thomas Garnier wrote:
> On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst  wrote:
>> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier  wrote:
>>> Percpu uses a clever design where the .percpu ELF section has a virtual
>>> address of zero and the relocation code avoids relocating specific
>>> symbols. It makes the code simple and easily adaptable with or without
>>> SMP support.
>>>
>>> This design is incompatible with PIE because generated code always tries to
>>> access the zero virtual address relative to the default mapping address.
>>> It becomes impossible when KASLR is configured to go below -2G. This
>>> patch solves this problem by removing the zero mapping and adapting the GS
>>> base to be relative to the expected address. These changes are done only
>>> when PIE is enabled. The original implementation is kept as-is
>>> by default.
>>
>> The reason the per-cpu section is zero-based on x86-64 is to
>> workaround GCC hardcoding the stack protector canary at %gs:40.  So
>> this patch is incompatible with CONFIG_STACK_PROTECTOR.
> 
> Ok, that makes sense. I don't want this feature to not work with
> CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
> entry for gs so gs:40 points to the correct memory address and
> gs:[rip+XX] works correctly through the MSR.

What are you talking about?  A GDT entry and the MSR do the same thing,
except that a GDT entry is limited to an offset of 0-0x (which
doesn't work for us, obviously.)
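
To make the size limit concrete: a legacy segment descriptor stores its
base in 32 bits split across three fields, whereas the GS base MSR holds a
full 64-bit address. The sketch below packs and unpacks a descriptor in
plain C. The helper names (demo_gdt_entry, demo_gdt_base,
demo_fits_gdt_base) are hypothetical and the sample kernel address is only
an illustrative assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Pack base/limit/access/flags into a 64-bit legacy segment descriptor. */
static uint64_t demo_gdt_entry(uint32_t base, uint32_t limit,
			       uint8_t access, uint8_t flags)
{
	uint64_t d = 0;

	d |= (uint64_t)(limit & 0xffff);		/* limit 15:0  */
	d |= (uint64_t)(base & 0xffffff) << 16;		/* base 23:0   */
	d |= (uint64_t)access << 40;			/* access byte */
	d |= (uint64_t)((limit >> 16) & 0xf) << 48;	/* limit 19:16 */
	d |= (uint64_t)(flags & 0xf) << 52;		/* flags       */
	d |= (uint64_t)((base >> 24) & 0xff) << 56;	/* base 31:24  */
	return d;
}

/* Recover the 32-bit base from a packed descriptor. */
static uint32_t demo_gdt_base(uint64_t d)
{
	uint32_t low = (uint32_t)((d >> 16) & 0xffffff);
	uint32_t high = (uint32_t)((d >> 56) & 0xff);

	return low | (high << 24);
}

/* A descriptor base can only express addresses that fit in 32 bits. */
static int demo_fits_gdt_base(uint64_t addr)
{
	return addr <= 0xffffffffULL;
}
```

A kernel-space address such as 0xffffffff81000000 fails demo_fits_gdt_base(),
which is the restriction being pointed out here.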

> Given the separate
> discussion on mcmodel, I am first going to check if we can move from
> PIE to PIC with a mcmodel=small or medium that would remove the percpu
> change requirement. I tried before without success, but I understand
> percpu and other components better now, so maybe I can make it work.

>> This is silly.  The right thing for PIE is to be explicitly absolute,
>> without (%rip).  The use of (%rip) memory references for percpu is just
>> an optimization.
> 
> I agree that it is odd but that's how the compiler generates code. I
> will re-explore PIC options with mcmodel=small or medium, as mentioned
> on other threads.

Why should the way the compiler generates code affect the way we do things
in assembly?

That being said, the compiler now has support for generating this kind
of code explicitly via the __seg_gs pointer modifier.  That should let
us drop the __percpu_prefix and just use variables directly.  I suspect
we want to declare percpu variables as "volatile __seg_gs" to account
for the possibility of CPU switches.

Older compilers won't be able to work with this, of course, but I think
that it is acceptable for those older compilers to not be able to
support PIE.

-hpa




Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-07-19 Thread Thomas Garnier
On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst  wrote:
> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier  wrote:
>> Percpu uses a clever design where the .percpu ELF section has a virtual
>> address of zero and the relocation code avoids relocating specific
>> symbols. It makes the code simple and easily adaptable with or without
>> SMP support.
>>
>> This design is incompatible with PIE because generated code always tries to
>> access the zero virtual address relative to the default mapping address.
>> It becomes impossible when KASLR is configured to go below -2G. This
>> patch solves this problem by removing the zero mapping and adapting the GS
>> base to be relative to the expected address. These changes are done only
>> when PIE is enabled. The original implementation is kept as-is
>> by default.
>
> The reason the per-cpu section is zero-based on x86-64 is to
> workaround GCC hardcoding the stack protector canary at %gs:40.  So
> this patch is incompatible with CONFIG_STACK_PROTECTOR.

Ok, that makes sense. I don't want this feature to not work with
CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
entry for gs so gs:40 points to the correct memory address and
gs:[rip+XX] works correctly through the MSR. Given the separate
discussion on mcmodel, I am first going to check if we can move from
PIE to PIC with a mcmodel=small or medium that would remove the percpu
change requirement. I tried before without success, but I understand
percpu and other components better now, so maybe I can make it work.

Thanks a lot for the feedback.

>
> --
> Brian Gerst



-- 
Thomas



Re: [Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-07-18 Thread Brian Gerst
On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier  wrote:
> Percpu uses a clever design where the .percpu ELF section has a virtual
> address of zero and the relocation code avoids relocating specific
> symbols. It makes the code simple and easily adaptable with or without
> SMP support.
>
> This design is incompatible with PIE because generated code always tries to
> access the zero virtual address relative to the default mapping address.
> It becomes impossible when KASLR is configured to go below -2G. This
> patch solves this problem by removing the zero mapping and adapting the GS
> base to be relative to the expected address. These changes are done only
> when PIE is enabled. The original implementation is kept as-is
> by default.

The reason the per-cpu section is zero-based on x86-64 is to
workaround GCC hardcoding the stack protector canary at %gs:40.  So
this patch is incompatible with CONFIG_STACK_PROTECTOR.
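
As a rough illustration of the constraint: because the compiler emits
%gs:40 for the canary, the object placed at the start of the per-cpu area
must keep the canary exactly 40 bytes in. The sketch below is modeled on
the kernel's irq_stack_union of that era, but the demo_ names and the
stack size are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_IRQ_STACK_SIZE 16384	/* assumed size, for illustration */

/* First object in the per-cpu area, so it sits at %gs:0. */
union demo_irq_stack_union {
	char irq_stack[DEMO_IRQ_STACK_SIZE];
	struct {
		char gs_base[40];		/* padding up to the canary */
		unsigned long stack_canary;	/* what %gs:40 reads */
	};
};

/* Build-time check that the canary lands where the compiler expects it. */
_Static_assert(offsetof(union demo_irq_stack_union, stack_canary) == 40,
	       "stack canary must sit at offset 40");

/* Returns the canary offset relative to the start of the per-cpu area. */
static size_t demo_canary_offset(void)
{
	return offsetof(union demo_irq_stack_union, stack_canary);
}
```

If the per-cpu section is no longer zero-based, %gs:40 no longer lands on
this field, which is the incompatibility described above.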

--
Brian Gerst



[Xen-devel] [RFC 16/22] x86/percpu: Adapt percpu for PIE support

2017-07-18 Thread Thomas Garnier
Percpu uses a clever design where the .percpu ELF section has a virtual
address of zero and the relocation code avoids relocating specific
symbols. It makes the code simple and easily adaptable with or without
SMP support.

This design is incompatible with PIE because generated code always tries to
access the zero virtual address relative to the default mapping address.
It becomes impossible when KASLR is configured to go below -2G. This
patch solves this problem by removing the zero mapping and adapting the GS
base to be relative to the expected address. These changes are done only
when PIE is enabled. The original implementation is kept as-is
by default.
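
The address arithmetic behind the two layouts can be sketched as follows.
This is an illustrative model, not the patch's code: the function names and
the sample addresses are assumptions. In the zero-based layout a percpu
symbol's value is its offset within the section and the GS base is the
CPU's area; in the relocated layout the symbol carries its link address, so
the GS base is biased by the section start to compensate.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of a %gs-relative access: GS base plus symbol value. */
static uint64_t demo_effective_addr(uint64_t gs_base, uint64_t sym)
{
	return gs_base + sym;
}

/* Zero-based layout: the GS base is simply the CPU's per-cpu area. */
static uint64_t demo_gs_base_zero_based(uint64_t cpu_area)
{
	return cpu_area;
}

/*
 * Relocated layout: symbols carry their link address, so the GS base is
 * biased by the section start (unsigned wraparound is intended).
 */
static uint64_t demo_gs_base_relocated(uint64_t cpu_area,
				       uint64_t section_start)
{
	return cpu_area - section_start;
}
```

With a per-cpu area at 0xffff888000100000, a section start of
0xffffffff82000000 and a variable 0x40 bytes into the section, both layouts
resolve to 0xffff888000100040.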

The assembly and PER_CPU macros are changed to use relative references
when PIE is enabled.

The KALLSYMS_ABSOLUTE_PERCPU configuration is disabled with PIE given
percpu symbols are not absolute in this case.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.
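
For context on the -2G limit: with mcmodel=kernel, absolute references are
encoded as sign-extended 32-bit values, so only the lowest 2G and the
topmost 2G of the 64-bit address space are reachable that way. A small
sketch of that check, with a hypothetical helper name and sample addresses
chosen only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if addr can be encoded as a sign-extended 32-bit value,
 * that is, if it lies in the lowest 2G or the topmost 2G of the space.
 */
static int demo_fits_simm32(uint64_t addr)
{
	return addr <= 0x7fffffffULL || addr >= 0xffffffff80000000ULL;
}
```

An address like 0xffffffff81000000 passes the check, while anything below
0xffffffff80000000 in kernel space does not, hence the need for relative
references.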

Signed-off-by: Thomas Garnier 
---
 arch/x86/entry/entry_64.S      |  4 ++--
 arch/x86/include/asm/percpu.h  | 25 +++--
 arch/x86/kernel/cpu/common.c   |  4 +++-
 arch/x86/kernel/head_64.S      |  4
 arch/x86/kernel/setup_percpu.c |  2 +-
 arch/x86/kernel/vmlinux.lds.S  | 13 +++--
 arch/x86/lib/cmpxchg16b_emu.S  |  8
 arch/x86/xen/xen-asm.S         | 12 ++--
 init/Kconfig                   |  2 +-
 9 files changed, 51 insertions(+), 23 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 691c4755269b..be198c0a2a8c 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -388,7 +388,7 @@ ENTRY(__switch_to_asm)
 
 #ifdef CONFIG_CC_STACKPROTECTOR
 	movq	TASK_stack_canary(%rsi), %rbx
-	movq	%rbx, PER_CPU_VAR(irq_stack_union)+stack_canary_offset
+	movq	%rbx, PER_CPU_VAR(irq_stack_union + stack_canary_offset)
 #endif
 
 	/* restore callee-saved registers */
@@ -739,7 +739,7 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work_interrupt		smp_irq_work_interrupt
 /*
  * Exception entry points.
  */
-#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
+#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss + (TSS_ist + ((x) - 1) * 8))
 
 .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
 ENTRY(\sym)
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 9fa03604b2b3..862eb771f0e5 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -4,9 +4,11 @@
 #ifdef CONFIG_X86_64
 #define __percpu_seg		gs
 #define __percpu_mov_op		movq
+#define __percpu_rel		(%rip)
 #else
 #define __percpu_seg		fs
 #define __percpu_mov_op		movl
+#define __percpu_rel
 #endif
 
 #ifdef __ASSEMBLY__
@@ -27,10 +29,14 @@
 #define PER_CPU(var, reg)						\
 	__percpu_mov_op %__percpu_seg:this_cpu_off, reg;		\
 	lea var(reg), reg
-#define PER_CPU_VAR(var)   %__percpu_seg:var
+/* Compatible with Position Independent Code */
+#define PER_CPU_VAR(var)   %__percpu_seg:(var)##__percpu_rel
+/* Rare absolute reference */
+#define PER_CPU_VAR_ABS(var)   %__percpu_seg:var
 #else /* ! SMP */
 #define PER_CPU(var, reg)  __percpu_mov_op $var, reg
-#define PER_CPU_VAR(var)   var
+#define PER_CPU_VAR(var)   (var)##__percpu_rel
+#define PER_CPU_VAR_ABS(var)   var
 #endif /* SMP */
 
 #ifdef CONFIG_X86_64_SMP
@@ -208,27 +214,34 @@ do {									\
 	pfo_ret__;					\
 })
 
+/* Position Independent code uses relative addresses only */
+#ifdef CONFIG_X86_PIE
+#define __percpu_stable_arg __percpu_arg(a1)
+#else
+#define __percpu_stable_arg __percpu_arg(P1)
+#endif
+
 #define percpu_stable_op(op, var)			\
 ({							\
 	typeof(var) pfo_ret__;				\
 	switch (sizeof(var)) {				\
 	case 1:						\
-		asm(op "b "__percpu_arg(P1)",%0"	\
+		asm(op "b "__percpu_stable_arg ",%0"	\
 		    : "=q" (pfo_ret__)			\
 		    : "p" (&(var)));			\
 		break;					\
 	case 2:						\
-		asm(op "w "__percpu_arg(P1)",%0"	\
+		asm(op "w "__percpu_stable_arg ",%0"	\
 		    : "=r" (pfo_ret__)			\
 		    : "p" (&(var)));			\
 		break;					\
 	case 4:						\
-		asm(op "l "__percpu_arg(P1)",%0"	\
+