Re: [Xen-devel] x86: PIE support and option to extend KASLR randomization

2017-08-16 Thread Daniel Micay
> How are these assumptions hardcoded by GCC? Most of the instructions
> should be relocatable straight away, as most call/jump/branch
> instructions are RIP-relative.
>
> I.e. is there no GCC code generation mode where code can be placed
> anywhere in the canonical address space, yet call and jump distance is
> within 31 bits so that the generated code is fast?

That's what PIE is meant to do. However, unless lazy binding is
disabled (-fno-plt) and symbol interposition is disabled (-Bsymbolic),
PIE is going to add needless overhead.

arm64 is using -pie -shared -Bsymbolic in arch/arm64/Makefile for their
CONFIG_RELOCATABLE option. See 08cc55b2afd97a654f71b3bebf8bb0ec89fdc498.
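
For reference, the 31-bit reach mentioned in the quoted question comes
from the signed 32-bit displacement of an x86-64 near call/jmp, which is
measured from the end of the instruction. A minimal Python sketch of
that range check follows. The helper name and the example addresses are
made up for illustration and are not taken from GCC or the kernel.

```python
# Illustrative only: fits_rel32 is a hypothetical helper, not a GCC or kernel API.
REL32_MIN = -(1 << 31)
REL32_MAX = (1 << 31) - 1


def fits_rel32(insn_addr, target_addr, insn_len=5):
    """Return True if target_addr is reachable with a rel32 displacement.

    The displacement is relative to the address of the next instruction;
    5 bytes is the length of `call rel32` / `jmp rel32` (opcode + 4-byte disp).
    """
    disp = target_addr - (insn_addr + insn_len)
    return REL32_MIN <= disp <= REL32_MAX


# Example addresses are assumptions chosen for illustration.
base = 0xFFFF_FFFF_8000_0000
print(fits_rel32(base, base + 0x10_0000))   # target 1 MiB above the call site
print(fits_rel32(base, base - (3 << 30)))   # target 3 GiB below the call site
```

The first target is within reach of a direct call; the second is 3 GiB
away and is not, so it would need an indirect call or a thunk.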

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] x86: PIE support and option to extend KASLR randomization

2017-08-15 Thread Daniel Micay
On 15 August 2017 at 10:20, Thomas Garnier wrote:
> On Tue, Aug 15, 2017 at 12:56 AM, Ingo Molnar wrote:
>>
>> * Thomas Garnier wrote:
>>
>>> > Do these changes get us closer to being able to build the kernel as truly
>>> > position independent, i.e. to place it anywhere in the valid x86-64
>>> > address space? Or any other advantages?
>>>
>>> Yes, PIE allows us to put the kernel anywhere in memory. It will allow us to
>>> have a full randomized address space where position and order of sections
>>> are completely random. There is still some work to get there but being able
>>> to build a PIE kernel is a significant step.
>>
>> So I _really_ dislike the whole PIE approach, because of the huge slowdown:
>>
>> +config RANDOMIZE_BASE_LARGE
>> +   bool "Increase the randomization range of the kernel image"
>> +   depends on X86_64 && RANDOMIZE_BASE
>> +   select X86_PIE
>> +   select X86_MODULE_PLTS if MODULES
>> +   default n
>> +   ---help---
>> + Build the kernel as a Position Independent Executable (PIE) and
>> + increase the available randomization range from 1GB to 3GB.
>> +
>> + This option impacts performance on kernel CPU intensive workloads up
>> + to 10% due to PIE generated code. Impact on user-mode processes and
>> + typical usage would be significantly less (0.50% when you build the
>> + kernel).
>> +
>> + The kernel and modules will generate slightly more assembly (1 to 2%
>> + increase on the .text sections). The vmlinux binary will be
>> + significantly smaller due to less relocations.
>>
>> To put 10% kernel overhead into perspective: enabling this option wipes out
>> about 5-10 years worth of painstaking optimizations we've done to keep the
>> kernel fast ... (!!)
>
> Note that 10% is the high-bound of a CPU intensive workload.

The cost can be reduced by using -fno-plt these days but some work
might be required to make that work with the kernel.

Where does that 10% estimate in the kernel config docs come from? I'd
be surprised if it really cost that much on x86_64. That's a realistic
cost for i386 with modern GCC (it used to be worse) but I'd expect
x86_64 to be closer to 2% even for CPU intensive workloads. It should
be very close to zero with -fno-plt.
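
To relate a kernel-only overhead figure to a whole-workload figure, here
is a back-of-the-envelope model (an assumption for illustration, not a
measurement): if only kernel-mode code is slowed down, the overall
slowdown is roughly the kernel-time fraction times the kernel overhead.
The function name and the sample inputs are hypothetical.

```python
def overall_slowdown(kernel_fraction, kernel_overhead):
    """Estimated whole-workload slowdown under a simple linear model.

    Assumes only the kernel-mode share of run time is affected and that
    user-mode code runs at unchanged speed.
    """
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be between 0 and 1")
    return kernel_fraction * kernel_overhead


# Sample inputs are illustrative, not measured.
for frac in (0.05, 0.5, 1.0):
    for overhead in (0.02, 0.10):
        print(f"kernel share {frac:.0%}, kernel overhead {overhead:.0%}: "
              f"overall {overall_slowdown(frac, overhead):.2%}")
```

Under this model a 10% kernel overhead gives about 0.5% overall when 5%
of the run time is spent in the kernel. That is the same order as the
0.50% kernel-build figure in the help text, although the help text does
not say how that number was obtained.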
