Re: [PATCH v3 3/3] powerpc: Book3S 64-bit "heavyweight" KASAN support

2019-12-18 Thread Christophe Leroy




On 12/18/2019 04:32 AM, Daniel Axtens wrote:

Daniel Axtens  writes:


Hi Christophe,

I'm working through your feedback, thank you. Regarding this one:


--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -2081,7 +2081,14 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
 		/*
 		 * See if this is an exception frame.
 		 * We look for the "regshere" marker in the current frame.
+		 *
+		 * KASAN may complain about this. If it is an exception frame,
+		 * we won't have unpoisoned the stack in asm when we set the
+		 * exception marker. If it's not an exception frame, who knows
+		 * how things are laid out - the shadow could be in any state
+		 * at all. Just disable KASAN reporting for now.
 		 */
+		kasan_disable_current();
 		if (validate_sp(sp, tsk, STACK_INT_FRAME_SIZE)
 		    && stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
 			struct pt_regs *regs = (struct pt_regs *)
@@ -2091,6 +2098,7 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
 			       regs->trap, (void *)regs->nip, (void *)lr);
 			firstframe = 1;
 		}
+		kasan_enable_current();


If this is really a concern for all targets including PPC32, should it
be a separate patch with a Fixes: tag so it can be applied to stable as well?


I've managed to repro this by commenting out the kasan_disable/enable
lines, and just booting in qemu without a disk attached:

sudo qemu-system-ppc64 -accel kvm -m 2G -M pseries -cpu power9  -kernel 
./vmlinux  -nographic -chardev stdio,id=charserial0,mux=on -device 
spapr-vty,chardev=charserial0,reg=0x3000  -mon 
chardev=charserial0,mode=readline -nodefaults -smp 2

...

[0.210740] Kernel panic - not syncing: VFS: Unable to mount root fs on 
unknown-block(0,0)
[0.210789] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
5.5.0-rc1-next-20191213-16824-g469a24fbdb34 #12
[0.210844] Call Trace:
[0.210866] [c0006a4839b0] [c1f74f48] dump_stack+0xfc/0x154 
(unreliable)
[0.210915] [c0006a483a00] [c025411c] panic+0x258/0x59c
[0.210958] [c0006a483aa0] [c24870b0] 
mount_block_root+0x648/0x7ac
[0.211005] 
==
[0.211054] BUG: KASAN: stack-out-of-bounds in show_stack+0x438/0x580
[0.211095] Read of size 8 at addr c0006a483b00 by task swapper/0/1
[0.211134]
[0.211152] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
5.5.0-rc1-next-20191213-16824-g469a24fbdb34 #12
[0.211207] Call Trace:
[0.211225] [c0006a483680] [c1f74f48] dump_stack+0xfc/0x154 
(unreliable)
[0.211274] [c0006a4836d0] [c08f877c] 
print_address_description.isra.10+0x7c/0x470
[0.211330] [c0006a483760] [c08f8e7c] __kasan_report+0x1bc/0x244
[0.211380] [c0006a483830] [c08f6eb8] kasan_report+0x18/0x30
[0.211422] [c0006a483850] [c08fa5d4] 
__asan_report_load8_noabort+0x24/0x40
[0.211471] [c0006a483870] [c003d448] show_stack+0x438/0x580
[0.211512] [c0006a4839b0] [c1f74f48] dump_stack+0xfc/0x154
[0.211553] [c0006a483a00] [c025411c] panic+0x258/0x59c
[0.211595] [c0006a483aa0] [c24870b0] 
mount_block_root+0x648/0x7ac
[0.211644] [c0006a483be0] [c2487784] 
prepare_namespace+0x1ec/0x240
[0.211694] [c0006a483c60] [c248669c] 
kernel_init_freeable+0x7f4/0x870
[0.211745] [c0006a483da0] [c0011f30] kernel_init+0x3c/0x15c
[0.211787] [c0006a483e20] [c000bebc] 
ret_from_kernel_thread+0x5c/0x80
[0.211834]
[0.211851] Allocated by task 0:
[0.211878]  save_stack+0x2c/0xe0
[0.211904]  __kasan_kmalloc.isra.16+0x11c/0x150
[0.211937]  kmem_cache_alloc_node+0x114/0x3b0
[0.211971]  copy_process+0x5b8/0x6410
[0.211996]  _do_fork+0x130/0xbf0
[0.212022]  kernel_thread+0xdc/0x130
[0.212047]  rest_init+0x44/0x184
[0.212072]  start_kernel+0x77c/0x7dc
[0.212098]  start_here_common+0x1c/0x20
[0.212122]
[0.212139] Freed by task 0:
[0.212163] (stack is not available)
[0.212187]
[0.212205] The buggy address belongs to the object at c0006a48
[0.212205]  which belongs to the cache thread_stack of size 16384
[0.212285] The buggy address is located 15104 bytes inside of
[0.212285]  16384-byte region [c0006a48, c0006a484000)
[0.212356] The buggy address belongs to the page:
[0.212391] page:c00c001a9200 refcount:1 mapcount:0 
mapping:c0006a019e00 index:0x0 compound_mapcount: 0
[0.212455] raw: 00710200 5deadbeef100 5deadbeef122 
c0006a019e00
[0.212504] raw:  00100010 0001 
0

Re: [PATCH v3 3/3] powerpc: Book3S 64-bit "heavyweight" KASAN support

2019-12-17 Thread Daniel Axtens


>>[For those not immersed in ppc64, in real mode, the top nibble or 2 bits
>>(depending on radix/hash mmu) of the address is ignored. The linear
>>mapping is placed at 0xc000000000000000. This means that a pointer to
>>part of the linear mapping will work both in real mode, where it will be
>>interpreted as a physical address of the form 0x000..., and out of real
>>mode, where it will go via the linear mapping.]
>>
>
> How does hash or radix mmu mode affect how many bits are ignored in real mode?

Bah, you're picking on details that I picked up from random
conversations in the office rather than from reading the spec! :P

The ISA suggests that the real address space is limited to at most 60
bits. ISAv3, Book III s5.7:

| * Host real address space size is 2^m bytes, m <= 60;
|   see Note 1.
| * Guest real address space size is 2^m bytes, m <= 60;
|   see Notes 1 and 2.
...
| Notes:
| 1. The value of m is implementation-dependent (subject
|    to the maximum given above). When used to
|    address storage or to represent a guest real
|    address, the high-order 60-m bits of the “60-bit”
|    real address must be zeros.
| 2. The hypervisor may assign a guest real address
|    space size for each partition that uses Radix Tree
|    translation. Accesses to guest real storage outside
|    this range but still mappable by the second
|    level Radix Tree will cause an HISI or HDSI.
|    Accesses to storage outside the mappable range
|    will have boundedly undefined results.

However, it doesn't follow from that passage that the top 4 bits are
always ignored when translations are off ('real mode'): see for example
the discussion of the HRMOR in s 5.7.3 and s 5.7.3.1. 

I think I got the 'top 2 bits on radix' thing from the discussion of
'quadrants' in arch/powerpc/include/asm/book3s/64/radix.h, which in turn
is discussed in s 5.7.5.1. Table 20 in particular is really helpful for
understanding it. But it's not especially relevant to what I'm actually
doing here.

I think to fully understand all of what's going on I would need to spend
some serious time with the entirety of s5.7, because there are a lot of
quirks about how storage works! But I think for our purposes it suffices
to say:

  The kernel installs a linear mapping at effective address
  c000... onward. This is a one-to-one mapping with physical memory from
  0000... onward. Because of how memory accesses work on powerpc 64-bit
  Book3S, a kernel pointer in the linear map accesses the same memory
  both with translations on (accessing as an 'effective address'), and
  with translations off (accessing as a 'real address'). This works in
  both guests and the hypervisor. For more details, see s5.7 of Book III
  of version 3 of the ISA, in particular the Storage Control Overview,
  s5.7.3, and s5.7.5 - noting that this KASAN implementation currently
  only supports Radix.

Thanks for your attention to detail!

Regards,
Daniel





Re: [PATCH v3 3/3] powerpc: Book3S 64-bit "heavyweight" KASAN support

2019-12-17 Thread Daniel Axtens
Daniel Axtens  writes:

> Hi Christophe,
>
> I'm working through your feedback, thank you. Regarding this one:
>
>>> --- a/arch/powerpc/kernel/process.c
>>> +++ b/arch/powerpc/kernel/process.c
>>> @@ -2081,7 +2081,14 @@ void show_stack(struct task_struct *tsk, unsigned 
>>> long *stack)
>>> /*
>>>  * See if this is an exception frame.
>>>  * We look for the "regshere" marker in the current frame.
>>> +*
>>> +* KASAN may complain about this. If it is an exception frame,
>>> +* we won't have unpoisoned the stack in asm when we set the
>>> +* exception marker. If it's not an exception frame, who knows
>>> +* how things are laid out - the shadow could be in any state
>>> +* at all. Just disable KASAN reporting for now.
>>>  */
>>> +   kasan_disable_current();
>>> if (validate_sp(sp, tsk, STACK_INT_FRAME_SIZE)
>>> && stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
>>> struct pt_regs *regs = (struct pt_regs *)
>>> @@ -2091,6 +2098,7 @@ void show_stack(struct task_struct *tsk, unsigned 
>>> long *stack)
>>>regs->trap, (void *)regs->nip, (void *)lr);
>>> firstframe = 1;
>>> }
>>> +   kasan_enable_current();
>>
>> If this is really a concern for all targets including PPC32, should it
>> be a separate patch with a Fixes: tag so it can be applied to stable as well?
>
> I've managed to repro this by commenting out the kasan_disable/enable
> lines, and just booting in qemu without a disk attached:
>
> sudo qemu-system-ppc64 -accel kvm -m 2G -M pseries -cpu power9  -kernel 
> ./vmlinux  -nographic -chardev stdio,id=charserial0,mux=on -device 
> spapr-vty,chardev=charserial0,reg=0x3000  -mon 
> chardev=charserial0,mode=readline -nodefaults -smp 2 
>
> ...
>
> [0.210740] Kernel panic - not syncing: VFS: Unable to mount root fs on 
> unknown-block(0,0)
> [0.210789] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
> 5.5.0-rc1-next-20191213-16824-g469a24fbdb34 #12
> [0.210844] Call Trace:
> [0.210866] [c0006a4839b0] [c1f74f48] dump_stack+0xfc/0x154 
> (unreliable)
> [0.210915] [c0006a483a00] [c025411c] panic+0x258/0x59c
> [0.210958] [c0006a483aa0] [c24870b0] 
> mount_block_root+0x648/0x7ac
> [0.211005] 
> ==
> [0.211054] BUG: KASAN: stack-out-of-bounds in show_stack+0x438/0x580
> [0.211095] Read of size 8 at addr c0006a483b00 by task swapper/0/1
> [0.211134] 
> [0.211152] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
> 5.5.0-rc1-next-20191213-16824-g469a24fbdb34 #12
> [0.211207] Call Trace:
> [0.211225] [c0006a483680] [c1f74f48] dump_stack+0xfc/0x154 
> (unreliable)
> [0.211274] [c0006a4836d0] [c08f877c] 
> print_address_description.isra.10+0x7c/0x470
> [0.211330] [c0006a483760] [c08f8e7c] 
> __kasan_report+0x1bc/0x244
> [0.211380] [c0006a483830] [c08f6eb8] kasan_report+0x18/0x30
> [0.211422] [c0006a483850] [c08fa5d4] 
> __asan_report_load8_noabort+0x24/0x40
> [0.211471] [c0006a483870] [c003d448] show_stack+0x438/0x580
> [0.211512] [c0006a4839b0] [c1f74f48] dump_stack+0xfc/0x154
> [0.211553] [c0006a483a00] [c025411c] panic+0x258/0x59c
> [0.211595] [c0006a483aa0] [c24870b0] 
> mount_block_root+0x648/0x7ac
> [0.211644] [c0006a483be0] [c2487784] 
> prepare_namespace+0x1ec/0x240
> [0.211694] [c0006a483c60] [c248669c] 
> kernel_init_freeable+0x7f4/0x870
> [0.211745] [c0006a483da0] [c0011f30] kernel_init+0x3c/0x15c
> [0.211787] [c0006a483e20] [c000bebc] 
> ret_from_kernel_thread+0x5c/0x80
> [0.211834] 
> [0.211851] Allocated by task 0:
> [0.211878]  save_stack+0x2c/0xe0
> [0.211904]  __kasan_kmalloc.isra.16+0x11c/0x150
> [0.211937]  kmem_cache_alloc_node+0x114/0x3b0
> [0.211971]  copy_process+0x5b8/0x6410
> [0.211996]  _do_fork+0x130/0xbf0
> [0.212022]  kernel_thread+0xdc/0x130
> [0.212047]  rest_init+0x44/0x184
> [0.212072]  start_kernel+0x77c/0x7dc
> [0.212098]  start_here_common+0x1c/0x20
> [0.212122] 
> [0.212139] Freed by task 0:
> [0.212163] (stack is not available)
> [0.212187] 
> [0.212205] The buggy address belongs to the object at c0006a48
> [0.212205]  which belongs to the cache thread_stack of size 16384
> [0.212285] The buggy address is located 15104 bytes inside of
> [0.212285]  16384-byte region [c0006a48, c0006a484000)
> [0.212356] The buggy address belongs to the page:
> [0.212391] page:c00c001a9200 refcount:1 mapcount:0 
> mapping:c0006a019e00 index:0x0 compound_mapcount: 0
> [0.212455] raw: 00710

Re: [PATCH v3 3/3] powerpc: Book3S 64-bit "heavyweight" KASAN support

2019-12-17 Thread Daniel Axtens
Hi Christophe,

I'm working through your feedback, thank you. Regarding this one:

>> --- a/arch/powerpc/kernel/process.c
>> +++ b/arch/powerpc/kernel/process.c
>> @@ -2081,7 +2081,14 @@ void show_stack(struct task_struct *tsk, unsigned 
>> long *stack)
>>  /*
>>   * See if this is an exception frame.
>>   * We look for the "regshere" marker in the current frame.
>> + *
>> + * KASAN may complain about this. If it is an exception frame,
>> + * we won't have unpoisoned the stack in asm when we set the
>> + * exception marker. If it's not an exception frame, who knows
>> + * how things are laid out - the shadow could be in any state
>> + * at all. Just disable KASAN reporting for now.
>>   */
>> +kasan_disable_current();
>>  if (validate_sp(sp, tsk, STACK_INT_FRAME_SIZE)
>>  && stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
>>  struct pt_regs *regs = (struct pt_regs *)
>> @@ -2091,6 +2098,7 @@ void show_stack(struct task_struct *tsk, unsigned long 
>> *stack)
>> regs->trap, (void *)regs->nip, (void *)lr);
>>  firstframe = 1;
>>  }
>> +kasan_enable_current();
>
> If this is really a concern for all targets including PPC32, should it
> be a separate patch with a Fixes: tag so it can be applied to stable as well?

I've managed to repro this by commenting out the kasan_disable/enable
lines, and just booting in qemu without a disk attached:

sudo qemu-system-ppc64 -accel kvm -m 2G -M pseries -cpu power9  -kernel 
./vmlinux  -nographic -chardev stdio,id=charserial0,mux=on -device 
spapr-vty,chardev=charserial0,reg=0x3000  -mon 
chardev=charserial0,mode=readline -nodefaults -smp 2 

...

[0.210740] Kernel panic - not syncing: VFS: Unable to mount root fs on 
unknown-block(0,0)
[0.210789] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
5.5.0-rc1-next-20191213-16824-g469a24fbdb34 #12
[0.210844] Call Trace:
[0.210866] [c0006a4839b0] [c1f74f48] dump_stack+0xfc/0x154 
(unreliable)
[0.210915] [c0006a483a00] [c025411c] panic+0x258/0x59c
[0.210958] [c0006a483aa0] [c24870b0] 
mount_block_root+0x648/0x7ac
[0.211005] 
==
[0.211054] BUG: KASAN: stack-out-of-bounds in show_stack+0x438/0x580
[0.211095] Read of size 8 at addr c0006a483b00 by task swapper/0/1
[0.211134] 
[0.211152] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
5.5.0-rc1-next-20191213-16824-g469a24fbdb34 #12
[0.211207] Call Trace:
[0.211225] [c0006a483680] [c1f74f48] dump_stack+0xfc/0x154 
(unreliable)
[0.211274] [c0006a4836d0] [c08f877c] 
print_address_description.isra.10+0x7c/0x470
[0.211330] [c0006a483760] [c08f8e7c] __kasan_report+0x1bc/0x244
[0.211380] [c0006a483830] [c08f6eb8] kasan_report+0x18/0x30
[0.211422] [c0006a483850] [c08fa5d4] 
__asan_report_load8_noabort+0x24/0x40
[0.211471] [c0006a483870] [c003d448] show_stack+0x438/0x580
[0.211512] [c0006a4839b0] [c1f74f48] dump_stack+0xfc/0x154
[0.211553] [c0006a483a00] [c025411c] panic+0x258/0x59c
[0.211595] [c0006a483aa0] [c24870b0] 
mount_block_root+0x648/0x7ac
[0.211644] [c0006a483be0] [c2487784] 
prepare_namespace+0x1ec/0x240
[0.211694] [c0006a483c60] [c248669c] 
kernel_init_freeable+0x7f4/0x870
[0.211745] [c0006a483da0] [c0011f30] kernel_init+0x3c/0x15c
[0.211787] [c0006a483e20] [c000bebc] 
ret_from_kernel_thread+0x5c/0x80
[0.211834] 
[0.211851] Allocated by task 0:
[0.211878]  save_stack+0x2c/0xe0
[0.211904]  __kasan_kmalloc.isra.16+0x11c/0x150
[0.211937]  kmem_cache_alloc_node+0x114/0x3b0
[0.211971]  copy_process+0x5b8/0x6410
[0.211996]  _do_fork+0x130/0xbf0
[0.212022]  kernel_thread+0xdc/0x130
[0.212047]  rest_init+0x44/0x184
[0.212072]  start_kernel+0x77c/0x7dc
[0.212098]  start_here_common+0x1c/0x20
[0.212122] 
[0.212139] Freed by task 0:
[0.212163] (stack is not available)
[0.212187] 
[0.212205] The buggy address belongs to the object at c0006a48
[0.212205]  which belongs to the cache thread_stack of size 16384
[0.212285] The buggy address is located 15104 bytes inside of
[0.212285]  16384-byte region [c0006a48, c0006a484000)
[0.212356] The buggy address belongs to the page:
[0.212391] page:c00c001a9200 refcount:1 mapcount:0 
mapping:c0006a019e00 index:0x0 compound_mapcount: 0
[0.212455] raw: 00710200 5deadbeef100 5deadbeef122 
c0006a019e00
[0.212504] raw:  00100010 0001 

[0.212551] page dumped because: k

Re: [PATCH v3 3/3] powerpc: Book3S 64-bit "heavyweight" KASAN support

2019-12-13 Thread Christophe Leroy




On 12/12/2019 at 16:16, Daniel Axtens wrote:

KASAN support on Book3S is a bit tricky to get right:

  - It would be good to support inline instrumentation so as to be able to
catch stack issues that cannot be caught with outline mode.

  - Inline instrumentation requires a fixed offset.

  - Book3S runs code in real mode after booting. Most notably a lot of KVM
runs in real mode, and it would be good to be able to instrument it.

  - Because code runs in real mode after boot, the offset has to point to
valid memory both in and out of real mode.

[For those not immersed in ppc64, in real mode, the top nibble or 2 bits
(depending on radix/hash mmu) of the address is ignored. The linear
mapping is placed at 0xc000000000000000. This means that a pointer to
part of the linear mapping will work both in real mode, where it will be
interpreted as a physical address of the form 0x000..., and out of real
mode, where it will go via the linear mapping.]

One approach is just to give up on inline instrumentation. This way all
checks can be delayed until after everything is set up correctly, and the
address-to-shadow calculations can be overridden. However, the features and
speed boost provided by inline instrumentation are worth trying to do
better.

If _at compile time_ it is known how much contiguous physical memory a
system has, the top 1/8th of the first block of physical memory can be set
aside for the shadow. This is a big hammer and comes with 3 big
consequences:

  - there's no nice way to handle physically discontiguous memory, so only
the first physical memory block can be used.

  - kernels will simply fail to boot on machines with less memory than
specified when compiling.

  - kernels running on machines with more memory than specified when
compiling will simply ignore the extra memory.

Implement and document KASAN this way. The current implementation is Radix
only.

Despite the limitations, it can still find bugs,
e.g. http://patchwork.ozlabs.org/patch/1103775/

At the moment, this physical memory limit must be set _even for outline
mode_. This may be changed in a later series - a different implementation
could be added for outline mode that dynamically allocates shadow at a
fixed offset. For example, see https://patchwork.ozlabs.org/patch/795211/

Suggested-by: Michael Ellerman 
Cc: Balbir Singh  # ppc64 out-of-line radix version
Cc: Christophe Leroy  # ppc32 version
Signed-off-by: Daniel Axtens 

---
Changes since v2:

  - Address feedback from Christophe around cleanups and docs.
  - Address feedback from Balbir: at this point I don't have a good solution
    for the issues you identify around the limitations of the inline
    implementation but I think that it's worth trying to get the stack
    instrumentation support. I'm happy to have an alternative and more
    flexible outline mode - I had envisioned this would be called
    'lightweight' mode as it imposes fewer restrictions. I've linked to
    your implementation. I think it's best to add it in a follow-up series.
  - Made the default PHYS_MEM_SIZE_FOR_KASAN value 1024MB. I think most
    people have guests with at least that much memory in the Radix 64s case
    so it's a much saner default - it means that if you just turn on KASAN
    without reading the docs you're much more likely to have a bootable
    kernel, which you will never have if the value is set to zero! I'm
    happy to bikeshed the value if we want.

Changes since v1:
  - Landed kasan vmalloc support upstream
  - Lots of feedback from Christophe.

Changes since the rfc:

  - Boots real and virtual hardware, kvm works.

  - disabled reporting when we're checking the stack for exception
frames. The behaviour isn't wrong, just incompatible with KASAN.

  - Documentation!

  - Dropped old module stuff in favour of KASAN_VMALLOC.

The bugs with ftrace and kuap were due to kernel bloat pushing
prom_init calls to be done via the plt. Because we did not have
a relocatable kernel, and they are done very early, this caused
everything to explode. Compile with CONFIG_RELOCATABLE!
---
  Documentation/dev-tools/kasan.rst |   8 +-
  Documentation/powerpc/kasan.txt   | 112 +-
  arch/powerpc/Kconfig  |   3 +
  arch/powerpc/Kconfig.debug|  21 
  arch/powerpc/Makefile |  11 ++
  arch/powerpc/include/asm/book3s/64/hash.h |   4 +
  arch/powerpc/include/asm/book3s/64/pgtable.h  |   7 ++
  arch/powerpc/include/asm/book3s/64/radix.h|   5 +
  arch/powerpc/include/asm/kasan.h  |  21 +++-
  arch/powerpc/kernel/process.c |   8 ++
  arch/powerpc/kernel/prom.c|  64 +-
  arch/powerpc/mm/kasan/Makefile|   3 +-
  .../mm/kasan/{kasan_init_32.c => init_32.c}   |   0
  arch/powerpc/mm/kasan/init_book3s_64.c|  72 +++
  14 files changed, 330 insertions(+),

Re: [PATCH v3 3/3] powerpc: Book3S 64-bit "heavyweight" KASAN support

2019-12-12 Thread Jordan Niethe
On Fri, Dec 13, 2019 at 2:19 AM Daniel Axtens  wrote:
>
> KASAN support on Book3S is a bit tricky to get right:
>
>  - It would be good to support inline instrumentation so as to be able to
>catch stack issues that cannot be caught with outline mode.
>
>  - Inline instrumentation requires a fixed offset.
>
>  - Book3S runs code in real mode after booting. Most notably a lot of KVM
>runs in real mode, and it would be good to be able to instrument it.
>
>  - Because code runs in real mode after boot, the offset has to point to
>valid memory both in and out of real mode.
>
>[For those not immersed in ppc64, in real mode, the top nibble or 2 bits
>(depending on radix/hash mmu) of the address is ignored. The linear
>mapping is placed at 0xc000000000000000. This means that a pointer to
>part of the linear mapping will work both in real mode, where it will be
>interpreted as a physical address of the form 0x000..., and out of real
>mode, where it will go via the linear mapping.]
>

How does hash or radix mmu mode affect how many bits are ignored in real mode?

> One approach is just to give up on inline instrumentation. This way all
> checks can be delayed until after everything is set up correctly, and the
> address-to-shadow calculations can be overridden. However, the features and
> speed boost provided by inline instrumentation are worth trying to do
> better.
>
> If _at compile time_ it is known how much contiguous physical memory a
> system has, the top 1/8th of the first block of physical memory can be set
> aside for the shadow. This is a big hammer and comes with 3 big
> consequences:
>
>  - there's no nice way to handle physically discontiguous memory, so only
>the first physical memory block can be used.
>
>  - kernels will simply fail to boot on machines with less memory than
>specified when compiling.
>
>  - kernels running on machines with more memory than specified when
>compiling will simply ignore the extra memory.
>
> Implement and document KASAN this way. The current implementation is Radix
> only.
>
> Despite the limitations, it can still find bugs,
> e.g. http://patchwork.ozlabs.org/patch/1103775/
>
> At the moment, this physical memory limit must be set _even for outline
> mode_. This may be changed in a later series - a different implementation
> could be added for outline mode that dynamically allocates shadow at a
> fixed offset. For example, see https://patchwork.ozlabs.org/patch/795211/
>
> Suggested-by: Michael Ellerman 
> Cc: Balbir Singh  # ppc64 out-of-line radix version
> Cc: Christophe Leroy  # ppc32 version
> Signed-off-by: Daniel Axtens 
>
> ---
> Changes since v2:
>
>  - Address feedback from Christophe around cleanups and docs.
>  - Address feedback from Balbir: at this point I don't have a good solution
>    for the issues you identify around the limitations of the inline
>    implementation but I think that it's worth trying to get the stack
>    instrumentation support. I'm happy to have an alternative and more
>    flexible outline mode - I had envisioned this would be called
>    'lightweight' mode as it imposes fewer restrictions. I've linked to
>    your implementation. I think it's best to add it in a follow-up series.
>  - Made the default PHYS_MEM_SIZE_FOR_KASAN value 1024MB. I think most
>    people have guests with at least that much memory in the Radix 64s case
>    so it's a much saner default - it means that if you just turn on KASAN
>    without reading the docs you're much more likely to have a bootable
>    kernel, which you will never have if the value is set to zero! I'm
>    happy to bikeshed the value if we want.
>
> Changes since v1:
>  - Landed kasan vmalloc support upstream
>  - Lots of feedback from Christophe.
>
> Changes since the rfc:
>
>  - Boots real and virtual hardware, kvm works.
>
>  - disabled reporting when we're checking the stack for exception
>frames. The behaviour isn't wrong, just incompatible with KASAN.
>
>  - Documentation!
>
>  - Dropped old module stuff in favour of KASAN_VMALLOC.
>
> The bugs with ftrace and kuap were due to kernel bloat pushing
> prom_init calls to be done via the plt. Because we did not have
> a relocatable kernel, and they are done very early, this caused
> everything to explode. Compile with CONFIG_RELOCATABLE!
> ---
>  Documentation/dev-tools/kasan.rst |   8 +-
>  Documentation/powerpc/kasan.txt   | 112 +-
>  arch/powerpc/Kconfig  |   3 +
>  arch/powerpc/Kconfig.debug|  21 
>  arch/powerpc/Makefile |  11 ++
>  arch/powerpc/include/asm/book3s/64/hash.h |   4 +
>  arch/powerpc/include/asm/book3s/64/pgtable.h  |   7 ++
>  arch/powerpc/include/asm/book3s/64/radix.h|   5 +
>  arch/powerpc/include/asm/kasan.h  |  21 +++-
>  arch/powerpc/kernel/process.c |   8 ++
>  arch/powerpc/kernel/prom.c 

[PATCH v3 3/3] powerpc: Book3S 64-bit "heavyweight" KASAN support

2019-12-12 Thread Daniel Axtens
KASAN support on Book3S is a bit tricky to get right:

 - It would be good to support inline instrumentation so as to be able to
   catch stack issues that cannot be caught with outline mode.

 - Inline instrumentation requires a fixed offset.

 - Book3S runs code in real mode after booting. Most notably a lot of KVM
   runs in real mode, and it would be good to be able to instrument it.

 - Because code runs in real mode after boot, the offset has to point to
   valid memory both in and out of real mode.

   [For those not immersed in ppc64, in real mode, the top nibble or 2 bits
   (depending on radix/hash mmu) of the address is ignored. The linear
   mapping is placed at 0xc000000000000000. This means that a pointer to
   part of the linear mapping will work both in real mode, where it will be
   interpreted as a physical address of the form 0x000..., and out of real
   mode, where it will go via the linear mapping.]

One approach is just to give up on inline instrumentation. This way all
checks can be delayed until after everything is set up correctly, and the
address-to-shadow calculations can be overridden. However, the features and
speed boost provided by inline instrumentation are worth trying to do
better.

If _at compile time_ it is known how much contiguous physical memory a
system has, the top 1/8th of the first block of physical memory can be set
aside for the shadow. This is a big hammer and comes with 3 big
consequences:

 - there's no nice way to handle physically discontiguous memory, so only
   the first physical memory block can be used.

 - kernels will simply fail to boot on machines with less memory than
   specified when compiling.

 - kernels running on machines with more memory than specified when
   compiling will simply ignore the extra memory.

Implement and document KASAN this way. The current implementation is Radix
only.

Despite the limitations, it can still find bugs,
e.g. http://patchwork.ozlabs.org/patch/1103775/

At the moment, this physical memory limit must be set _even for outline
mode_. This may be changed in a later series - a different implementation
could be added for outline mode that dynamically allocates shadow at a
fixed offset. For example, see https://patchwork.ozlabs.org/patch/795211/

Suggested-by: Michael Ellerman 
Cc: Balbir Singh  # ppc64 out-of-line radix version
Cc: Christophe Leroy  # ppc32 version
Signed-off-by: Daniel Axtens 

---
Changes since v2:

 - Address feedback from Christophe around cleanups and docs.
 - Address feedback from Balbir: at this point I don't have a good solution
   for the issues you identify around the limitations of the inline
   implementation but I think that it's worth trying to get the stack
   instrumentation support. I'm happy to have an alternative and more
   flexible outline mode - I had envisioned this would be called
   'lightweight' mode as it imposes fewer restrictions. I've linked to
   your implementation. I think it's best to add it in a follow-up series.
 - Made the default PHYS_MEM_SIZE_FOR_KASAN value 1024MB. I think most
   people have guests with at least that much memory in the Radix 64s case
   so it's a much saner default - it means that if you just turn on KASAN
   without reading the docs you're much more likely to have a bootable
   kernel, which you will never have if the value is set to zero! I'm
   happy to bikeshed the value if we want.

Changes since v1:
 - Landed kasan vmalloc support upstream
 - Lots of feedback from Christophe.

Changes since the rfc:

 - Boots real and virtual hardware, kvm works.

 - disabled reporting when we're checking the stack for exception
   frames. The behaviour isn't wrong, just incompatible with KASAN.

 - Documentation!

 - Dropped old module stuff in favour of KASAN_VMALLOC.

The bugs with ftrace and kuap were due to kernel bloat pushing
prom_init calls to be done via the plt. Because we did not have
a relocatable kernel, and they are done very early, this caused
everything to explode. Compile with CONFIG_RELOCATABLE!
---
 Documentation/dev-tools/kasan.rst |   8 +-
 Documentation/powerpc/kasan.txt   | 112 +-
 arch/powerpc/Kconfig  |   3 +
 arch/powerpc/Kconfig.debug|  21 
 arch/powerpc/Makefile |  11 ++
 arch/powerpc/include/asm/book3s/64/hash.h |   4 +
 arch/powerpc/include/asm/book3s/64/pgtable.h  |   7 ++
 arch/powerpc/include/asm/book3s/64/radix.h|   5 +
 arch/powerpc/include/asm/kasan.h  |  21 +++-
 arch/powerpc/kernel/process.c |   8 ++
 arch/powerpc/kernel/prom.c|  64 +-
 arch/powerpc/mm/kasan/Makefile|   3 +-
 .../mm/kasan/{kasan_init_32.c => init_32.c}   |   0
 arch/powerpc/mm/kasan/init_book3s_64.c|  72 +++
 14 files changed, 330 insertions(+), 9 deletions(-)
 rename arch/powerpc/mm/kasan/{kasan_init_32.c => init_32.c} (100%)
 create mode 100644 ar