Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-19 Thread Christophe Leroy




Le 19/04/2021 à 16:00, Steven Price a écrit :

On 19/04/2021 14:14, Christophe Leroy wrote:



Le 16/04/2021 à 12:51, Steven Price a écrit :

On 16/04/2021 11:38, Christophe Leroy wrote:



Le 16/04/2021 à 11:28, Steven Price a écrit :

On 15/04/2021 18:18, Christophe Leroy wrote:

To be honest I don't fully understand why powerpc requires the page_size - it appears to be 
using it purely to find "holes" in the calls to note_page(), but I haven't worked out why such 
holes would occur.


I was indeed introduced for KASAN. We have a first commit 
https://github.com/torvalds/linux/commit/cabe8138 which uses page size to detect whether it is a 
KASAN like stuff.


Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a fix. I can't remember what the 
problem was exactly, something around the use of hugepages for kernel memory, came as part of 
the series 
https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.le...@csgroup.eu/ 





Ah, that's useful context. So it looks like powerpc took a different route to reducing the KASAN 
output to x86.


Given the generic ptdump code has handling for KASAN already it should be possible to drop that 
from the powerpc arch code, which I think means we don't actually need to provide page size to 
notepage(). Hopefully that means more code to delete ;)




Looking at how the generic ptdump code handles KASAN, I'm a bit sceptic.

IIUC, it is checking that kasan_early_shadow_pte is in the same page as the pgtable referred by 
the PMD entry. But what happens if that PMD entry is referring another pgtable which is inside the 
same page as kasan_early_shadow_pte ?


Shouldn't the test be

 if (pmd_page_vaddr(val) == lm_alias(kasan_early_shadow_pte))
 return note_kasan_page_table(walk, addr);


Now I come to look at this code again, I think you're right. On arm64 this doesn't cause a problem - 
page tables are page sized and page aligned, so there couldn't be any non-KASAN pgtables sharing the 
page. Obviously that's not necessarily true of other architectures.


Feel free to add a patch to your series ;)



Ok.

I'll leave that outside of the series, it is not a show stopper because early shadow page 
directories are all tagged __bss_page_aligned so we can't have two of them in the same page and it 
is really unlikely that we'll have any other statically defined page directory in the same pages either.


And for the special case of powerpc 8xx which is the only one for which we have both KASAN and 
HUGEPD at the time being, there are only two levels of page directories so no issue.


Christophe


Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-19 Thread Steven Price

On 19/04/2021 14:14, Christophe Leroy wrote:



Le 16/04/2021 à 12:51, Steven Price a écrit :

On 16/04/2021 11:38, Christophe Leroy wrote:



Le 16/04/2021 à 11:28, Steven Price a écrit :

On 15/04/2021 18:18, Christophe Leroy wrote:

To be honest I don't fully understand why powerpc requires the 
page_size - it appears to be using it purely to find "holes" in the 
calls to note_page(), but I haven't worked out why such holes would 
occur.


I was indeed introduced for KASAN. We have a first commit 
https://github.com/torvalds/linux/commit/cabe8138 which uses page 
size to detect whether it is a KASAN like stuff.


Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a 
fix. I can't remember what the problem was exactly, something around 
the use of hugepages for kernel memory, came as part of the series 
https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.le...@csgroup.eu/ 




Ah, that's useful context. So it looks like powerpc took a different 
route to reducing the KASAN output to x86.


Given the generic ptdump code has handling for KASAN already it should 
be possible to drop that from the powerpc arch code, which I think 
means we don't actually need to provide page size to notepage(). 
Hopefully that means more code to delete ;)




Looking at how the generic ptdump code handles KASAN, I'm a bit sceptic.

IIUC, it is checking that kasan_early_shadow_pte is in the same page as 
the pgtable referred by the PMD entry. But what happens if that PMD 
entry is referring another pgtable which is inside the same page as 
kasan_early_shadow_pte ?


Shouldn't the test be

 if (pmd_page_vaddr(val) == lm_alias(kasan_early_shadow_pte))
     return note_kasan_page_table(walk, addr);


Now I come to look at this code again, I think you're right. On arm64 
this doesn't cause a problem - page tables are page sized and page 
aligned, so there couldn't be any non-KASAN pgtables sharing the page. 
Obviously that's not necessarily true of other architectures.


Feel free to add a patch to your series ;)

Steve


Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-19 Thread Christophe Leroy




Le 16/04/2021 à 12:51, Steven Price a écrit :

On 16/04/2021 11:38, Christophe Leroy wrote:



Le 16/04/2021 à 11:28, Steven Price a écrit :

On 15/04/2021 18:18, Christophe Leroy wrote:

To be honest I don't fully understand why powerpc requires the page_size - it appears to be using 
it purely to find "holes" in the calls to note_page(), but I haven't worked out why such holes 
would occur.


I was indeed introduced for KASAN. We have a first commit 
https://github.com/torvalds/linux/commit/cabe8138 which uses page size to detect whether it is a 
KASAN like stuff.


Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a fix. I can't remember what the 
problem was exactly, something around the use of hugepages for kernel memory, came as part of the 
series 
https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.le...@csgroup.eu/ 



Ah, that's useful context. So it looks like powerpc took a different route to reducing the KASAN 
output to x86.


Given the generic ptdump code has handling for KASAN already it should be possible to drop that from 
the powerpc arch code, which I think means we don't actually need to provide page size to 
notepage(). Hopefully that means more code to delete ;)




Looking at how the generic ptdump code handles KASAN, I'm a bit sceptic.

IIUC, it is checking that kasan_early_shadow_pte is in the same page as the pgtable referred by the 
PMD entry. But what happens if that PMD entry is referring another pgtable which is inside the same 
page as kasan_early_shadow_pte ?


Shouldn't the test be

if (pmd_page_vaddr(val) == lm_alias(kasan_early_shadow_pte))
return note_kasan_page_table(walk, addr);


Christophe


Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-16 Thread Steven Price

On 16/04/2021 16:15, Christophe Leroy wrote:



Le 16/04/2021 à 17:04, Christophe Leroy a écrit :



Le 16/04/2021 à 16:40, Christophe Leroy a écrit :



Le 16/04/2021 à 15:00, Steven Price a écrit :

On 16/04/2021 12:08, Christophe Leroy wrote:



Le 16/04/2021 à 12:51, Steven Price a écrit :

On 16/04/2021 11:38, Christophe Leroy wrote:



Le 16/04/2021 à 11:28, Steven Price a écrit :
To be honest I don't fully understand why powerpc requires the 
page_size - it appears to be using it purely to find "holes" in 
the calls to note_page(), but I haven't worked out why such 
holes would occur.


I was indeed introduced for KASAN. We have a first commit 
https://github.com/torvalds/linux/commit/cabe8138 which uses page 
size to detect whether it is a KASAN like stuff.


Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a 
fix. I can't remember what the problem was exactly, something 
around the use of hugepages for kernel memory, came as part of 
the series 
https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.le...@csgroup.eu/ 








Ah, that's useful context. So it looks like powerpc took a 
different route to reducing the KASAN output to x86.


Given the generic ptdump code has handling for KASAN already it 
should be possible to drop that from the powerpc arch code, which 
I think means we don't actually need to provide page size to 
notepage(). Hopefully that means more code to delete ;)




Yes ... and no.

It looks like the generic ptdump handles the case when several 
pgdir entries points to the same kasan_early_shadow_pte. But it 
doesn't take into account the powerpc case where we have regular 
page tables where several (if not all) PTEs are pointing to the 
kasan_early_shadow_page .


I'm not sure I follow quite how powerpc is different here. But could 
you have a similar check for PTEs against kasan_early_shadow_pte as 
the other levels already have?


I'm just worried that page_size isn't well defined in this interface 
and it's going to cause problems in the future.




I'm trying. I reverted the two commits b00ff6d8c and cabe8138.

At the moment, I don't get exactly what I expect: For linear memory I 
get one line for each 8M page whereas before reverting the patches I 
got one 16M line and one 112M line.


And for KASAN shadow area I get two lines for the 2x 8M pages 
shadowing linear mem then I get one 4M line for each PGDIR entry 
pointing to kasan_early_shadow_pte.


0xf800-0xf87f 0x0700 8M   huge    rw   
present
0xf880-0xf8ff 0x0780 8M   huge    rw   
present
0xf900-0xf93f 0x0143 4M   r
present

...
0xfec0-0xfeff 0x0143 4M   r
present


Any idea ?




I think the different with other architectures is here:

 } else if (flag != st->current_flags || level != st->level ||
    addr >= st->marker[1].start_address ||
    pa != st->last_pa + PAGE_SIZE) {


In addition to the checks everyone do, powerpc also checks "pa != 
st->last_pa + PAGE_SIZE".
And it is definitely for that test that page_size argument add been 
added.


By replacing that test by (pa - st->start_pa != addr - 
st->start_address) it works again. So we definitely don't need the real 
page size.


Yes that should work. Thanks for figuring it out!





I see that other architectures except RISCV don't dump the physical 
address. But even RISCV doesn't include that check.


Yes not having the physical address certainly simplifies things - 
although I can see why that can be handy to see. The disadvantage is 
that user space or vmalloc()'d memory will produce a lot of output 
because the physical addresses are unlikely to be contiguous. And for 
most uses you don't need the information.


That physical address dump was added by commit aaa229529244 
("powerpc/mm: Add physical address to Linux page table dump") 
[https://github.com/torvalds/linux/commit/aaa2295]


How do other architectures deal with the problem described by the 
commit log of that patch ?


AFAIK other architectures are "broken" in this regard. In practice I 
don't think it often causes an issue though.


Steve


Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-16 Thread Christophe Leroy




Le 16/04/2021 à 17:04, Christophe Leroy a écrit :



Le 16/04/2021 à 16:40, Christophe Leroy a écrit :



Le 16/04/2021 à 15:00, Steven Price a écrit :

On 16/04/2021 12:08, Christophe Leroy wrote:



Le 16/04/2021 à 12:51, Steven Price a écrit :

On 16/04/2021 11:38, Christophe Leroy wrote:



Le 16/04/2021 à 11:28, Steven Price a écrit :
To be honest I don't fully understand why powerpc requires the page_size - it appears to be 
using it purely to find "holes" in the calls to note_page(), but I haven't worked out why 
such holes would occur.


I was indeed introduced for KASAN. We have a first commit 
https://github.com/torvalds/linux/commit/cabe8138 which uses page size to detect whether it is 
a KASAN like stuff.


Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a fix. I can't remember what 
the problem was exactly, something around the use of hugepages for kernel memory, came as part 
of the series 
https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.le...@csgroup.eu/ 







Ah, that's useful context. So it looks like powerpc took a different route to reducing the 
KASAN output to x86.


Given the generic ptdump code has handling for KASAN already it should be possible to drop that 
from the powerpc arch code, which I think means we don't actually need to provide page size to 
notepage(). Hopefully that means more code to delete ;)




Yes ... and no.

It looks like the generic ptdump handles the case when several pgdir entries points to the same 
kasan_early_shadow_pte. But it doesn't take into account the powerpc case where we have regular 
page tables where several (if not all) PTEs are pointing to the kasan_early_shadow_page .


I'm not sure I follow quite how powerpc is different here. But could you have a similar check for 
PTEs against kasan_early_shadow_pte as the other levels already have?


I'm just worried that page_size isn't well defined in this interface and it's going to cause 
problems in the future.




I'm trying. I reverted the two commits b00ff6d8c and cabe8138.

At the moment, I don't get exactly what I expect: For linear memory I get one line for each 8M 
page whereas before reverting the patches I got one 16M line and one 112M line.


And for KASAN shadow area I get two lines for the 2x 8M pages shadowing linear mem then I get one 
4M line for each PGDIR entry pointing to kasan_early_shadow_pte.


0xf800-0xf87f 0x0700 8M   huge    rw   present
0xf880-0xf8ff 0x0780 8M   huge    rw   present
0xf900-0xf93f 0x0143 4M   r    present

...

0xfec0-0xfeff 0x0143 4M   r    present

Any idea ?




I think the different with other architectures is here:

 } else if (flag != st->current_flags || level != st->level ||
    addr >= st->marker[1].start_address ||
    pa != st->last_pa + PAGE_SIZE) {


In addition to the checks everyone do, powerpc also checks "pa != st->last_pa + 
PAGE_SIZE".
And it is definitely for that test that page_size argument add been added.


By replacing that test by (pa - st->start_pa != addr - st->start_address) it works again. So we 
definitely don't need the real page size.





I see that other architectures except RISCV don't dump the physical address. But even RISCV doesn't 
include that check.


That physical address dump was added by commit aaa229529244 ("powerpc/mm: Add physical address to 
Linux page table dump") [https://github.com/torvalds/linux/commit/aaa2295]


How do other architectures deal with the problem described by the commit log of 
that patch ?

Christophe


Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-16 Thread Christophe Leroy




Le 16/04/2021 à 16:40, Christophe Leroy a écrit :



Le 16/04/2021 à 15:00, Steven Price a écrit :

On 16/04/2021 12:08, Christophe Leroy wrote:



Le 16/04/2021 à 12:51, Steven Price a écrit :

On 16/04/2021 11:38, Christophe Leroy wrote:



Le 16/04/2021 à 11:28, Steven Price a écrit :
To be honest I don't fully understand why powerpc requires the page_size - it appears to be 
using it purely to find "holes" in the calls to note_page(), but I haven't worked out why such 
holes would occur.


I was indeed introduced for KASAN. We have a first commit 
https://github.com/torvalds/linux/commit/cabe8138 which uses page size to detect whether it is 
a KASAN like stuff.


Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a fix. I can't remember what 
the problem was exactly, something around the use of hugepages for kernel memory, came as part 
of the series 
https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.le...@csgroup.eu/ 






Ah, that's useful context. So it looks like powerpc took a different route to reducing the KASAN 
output to x86.


Given the generic ptdump code has handling for KASAN already it should be possible to drop that 
from the powerpc arch code, which I think means we don't actually need to provide page size to 
notepage(). Hopefully that means more code to delete ;)




Yes ... and no.

It looks like the generic ptdump handles the case when several pgdir entries points to the same 
kasan_early_shadow_pte. But it doesn't take into account the powerpc case where we have regular 
page tables where several (if not all) PTEs are pointing to the kasan_early_shadow_page .


I'm not sure I follow quite how powerpc is different here. But could you have a similar check for 
PTEs against kasan_early_shadow_pte as the other levels already have?


I'm just worried that page_size isn't well defined in this interface and it's going to cause 
problems in the future.




I'm trying. I reverted the two commits b00ff6d8c and cabe8138.

At the moment, I don't get exactly what I expect: For linear memory I get one line for each 8M page 
whereas before reverting the patches I got one 16M line and one 112M line.


And for KASAN shadow area I get two lines for the 2x 8M pages shadowing linear mem then I get one 4M 
line for each PGDIR entry pointing to kasan_early_shadow_pte.


0xf800-0xf87f 0x0700 8M   huge    rw   present
0xf880-0xf8ff 0x0780 8M   huge    rw   present
0xf900-0xf93f 0x0143 4M   r    present

...

0xfec0-0xfeff 0x0143 4M   r    present

Any idea ?




I think the different with other architectures is here:

} else if (flag != st->current_flags || level != st->level ||
   addr >= st->marker[1].start_address ||
   pa != st->last_pa + PAGE_SIZE) {


In addition to the checks everyone do, powerpc also checks "pa != st->last_pa + 
PAGE_SIZE".
And it is definitely for that test that page_size argument add been added.

I see that other architectures except RISCV don't dump the physical address. But even RISCV doesn't 
include that check.


That physical address dump was added by commit aaa229529244 ("powerpc/mm: Add physical address to 
Linux page table dump") [https://github.com/torvalds/linux/commit/aaa2295]


How do other architectures deal with the problem described by the commit log of 
that patch ?

Christophe


Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-16 Thread Christophe Leroy




Le 16/04/2021 à 15:00, Steven Price a écrit :

On 16/04/2021 12:08, Christophe Leroy wrote:



Le 16/04/2021 à 12:51, Steven Price a écrit :

On 16/04/2021 11:38, Christophe Leroy wrote:



Le 16/04/2021 à 11:28, Steven Price a écrit :

On 15/04/2021 18:18, Christophe Leroy wrote:

In order to support large pages on powerpc, notepage()
needs to know the page size of the page.

Add a page_size argument to notepage().

Signed-off-by: Christophe Leroy 
---
  arch/arm64/mm/ptdump.c |  2 +-
  arch/riscv/mm/ptdump.c |  2 +-
  arch/s390/mm/dump_pagetables.c |  3 ++-
  arch/x86/mm/dump_pagetables.c  |  2 +-
  include/linux/ptdump.h |  2 +-
  mm/ptdump.c    | 16 
  6 files changed, 14 insertions(+), 13 deletions(-)


[...]

diff --git a/mm/ptdump.c b/mm/ptdump.c
index da751448d0e4..61cd16afb1c8 100644
--- a/mm/ptdump.c
+++ b/mm/ptdump.c
@@ -17,7 +17,7 @@ static inline int note_kasan_page_table(struct mm_walk *walk,
  {
  struct ptdump_state *st = walk->private;
-    st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]));
+    st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]), PAGE_SIZE);


I'm not completely sure what the page_size is going to be used for, but note that KASAN 
presents an interesting case here. We short-cut by detecting it's a KASAN region at a high 
level (PGD/P4D/PUD/PMD) and instead of walking the tree down just call note_page() *once* but 
with level==4 because we know KASAN sets up the page table like that.


However the one call actually covers a much larger region - so while PAGE_SIZE matches the 
level it doesn't match the region covered. AFAICT this will lead to odd results if you enable 
KASAN on powerpc.


Hum  I successfully tested it with KASAN, I now realise that I tested it with 
CONFIG_KASAN_VMALLOC selected. In this situation, since 
https://github.com/torvalds/linux/commit/af3d0a686 we don't have any common shadow page table 
anymore.


I'll test again without CONFIG_KASAN_VMALLOC.



To be honest I don't fully understand why powerpc requires the page_size - it appears to be 
using it purely to find "holes" in the calls to note_page(), but I haven't worked out why such 
holes would occur.


I was indeed introduced for KASAN. We have a first commit 
https://github.com/torvalds/linux/commit/cabe8138 which uses page size to detect whether it is a 
KASAN like stuff.


Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a fix. I can't remember what the 
problem was exactly, something around the use of hugepages for kernel memory, came as part of 
the series 
https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.le...@csgroup.eu/ 





Ah, that's useful context. So it looks like powerpc took a different route to reducing the KASAN 
output to x86.


Given the generic ptdump code has handling for KASAN already it should be possible to drop that 
from the powerpc arch code, which I think means we don't actually need to provide page size to 
notepage(). Hopefully that means more code to delete ;)




Yes ... and no.

It looks like the generic ptdump handles the case when several pgdir entries points to the same 
kasan_early_shadow_pte. But it doesn't take into account the powerpc case where we have regular 
page tables where several (if not all) PTEs are pointing to the kasan_early_shadow_page .


I'm not sure I follow quite how powerpc is different here. But could you have a similar check for 
PTEs against kasan_early_shadow_pte as the other levels already have?


I'm just worried that page_size isn't well defined in this interface and it's going to cause 
problems in the future.




I'm trying. I reverted the two commits b00ff6d8c and cabe8138.

At the moment, I don't get exactly what I expect: For linear memory I get one line for each 8M page 
whereas before reverting the patches I got one 16M line and one 112M line.


And for KASAN shadow area I get two lines for the 2x 8M pages shadowing linear mem then I get one 4M 
line for each PGDIR entry pointing to kasan_early_shadow_pte.


0xf800-0xf87f 0x0700 8M   hugerw   present
0xf880-0xf8ff 0x0780 8M   hugerw   present
0xf900-0xf93f 0x0143 4M   rpresent
0xf940-0xf97f 0x0143 4M   rpresent
0xf980-0xf9bf 0x0143 4M   rpresent
0xf9c0-0xf9ff 0x0143 4M   rpresent
0xfa00-0xfa3f 0x0143 4M   rpresent
0xfa40-0xfa7f 0x0143 4M   rpresent
0xfa80-0xfabf 0x0143 4M   rpresent
0xfac0-0xfaff 0x0143 4M   rpresent
0xfb00-0xfb3f 0x0143 4M   rpresent
0xfb40-0xfb7f 0x0143 4M   r   

Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-16 Thread Steven Price

On 16/04/2021 12:08, Christophe Leroy wrote:



Le 16/04/2021 à 12:51, Steven Price a écrit :

On 16/04/2021 11:38, Christophe Leroy wrote:



Le 16/04/2021 à 11:28, Steven Price a écrit :

On 15/04/2021 18:18, Christophe Leroy wrote:

In order to support large pages on powerpc, notepage()
needs to know the page size of the page.

Add a page_size argument to notepage().

Signed-off-by: Christophe Leroy 
---
  arch/arm64/mm/ptdump.c |  2 +-
  arch/riscv/mm/ptdump.c |  2 +-
  arch/s390/mm/dump_pagetables.c |  3 ++-
  arch/x86/mm/dump_pagetables.c  |  2 +-
  include/linux/ptdump.h |  2 +-
  mm/ptdump.c    | 16 
  6 files changed, 14 insertions(+), 13 deletions(-)


[...]

diff --git a/mm/ptdump.c b/mm/ptdump.c
index da751448d0e4..61cd16afb1c8 100644
--- a/mm/ptdump.c
+++ b/mm/ptdump.c
@@ -17,7 +17,7 @@ static inline int note_kasan_page_table(struct 
mm_walk *walk,

  {
  struct ptdump_state *st = walk->private;
-    st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]));
+    st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]), 
PAGE_SIZE);


I'm not completely sure what the page_size is going to be used for, 
but note that KASAN presents an interesting case here. We short-cut 
by detecting it's a KASAN region at a high level (PGD/P4D/PUD/PMD) 
and instead of walking the tree down just call note_page() *once* 
but with level==4 because we know KASAN sets up the page table like 
that.


However the one call actually covers a much larger region - so while 
PAGE_SIZE matches the level it doesn't match the region covered. 
AFAICT this will lead to odd results if you enable KASAN on powerpc.


Hum  I successfully tested it with KASAN, I now realise that I 
tested it with CONFIG_KASAN_VMALLOC selected. In this situation, 
since https://github.com/torvalds/linux/commit/af3d0a686 we don't 
have any common shadow page table anymore.


I'll test again without CONFIG_KASAN_VMALLOC.



To be honest I don't fully understand why powerpc requires the 
page_size - it appears to be using it purely to find "holes" in the 
calls to note_page(), but I haven't worked out why such holes would 
occur.


I was indeed introduced for KASAN. We have a first commit 
https://github.com/torvalds/linux/commit/cabe8138 which uses page 
size to detect whether it is a KASAN like stuff.


Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a 
fix. I can't remember what the problem was exactly, something around 
the use of hugepages for kernel memory, came as part of the series 
https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.le...@csgroup.eu/ 




Ah, that's useful context. So it looks like powerpc took a different 
route to reducing the KASAN output to x86.


Given the generic ptdump code has handling for KASAN already it should 
be possible to drop that from the powerpc arch code, which I think 
means we don't actually need to provide page size to notepage(). 
Hopefully that means more code to delete ;)




Yes ... and no.

It looks like the generic ptdump handles the case when several pgdir 
entries points to the same kasan_early_shadow_pte. But it doesn't take 
into account the powerpc case where we have regular page tables where 
several (if not all) PTEs are pointing to the kasan_early_shadow_page .


I'm not sure I follow quite how powerpc is different here. But could you 
have a similar check for PTEs against kasan_early_shadow_pte as the 
other levels already have?


I'm just worried that page_size isn't well defined in this interface and 
it's going to cause problems in the future.


Steve


Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-16 Thread Christophe Leroy




Le 16/04/2021 à 12:51, Steven Price a écrit :

On 16/04/2021 11:38, Christophe Leroy wrote:



Le 16/04/2021 à 11:28, Steven Price a écrit :

On 15/04/2021 18:18, Christophe Leroy wrote:

In order to support large pages on powerpc, notepage()
needs to know the page size of the page.

Add a page_size argument to notepage().

Signed-off-by: Christophe Leroy 
---
  arch/arm64/mm/ptdump.c |  2 +-
  arch/riscv/mm/ptdump.c |  2 +-
  arch/s390/mm/dump_pagetables.c |  3 ++-
  arch/x86/mm/dump_pagetables.c  |  2 +-
  include/linux/ptdump.h |  2 +-
  mm/ptdump.c    | 16 
  6 files changed, 14 insertions(+), 13 deletions(-)


[...]

diff --git a/mm/ptdump.c b/mm/ptdump.c
index da751448d0e4..61cd16afb1c8 100644
--- a/mm/ptdump.c
+++ b/mm/ptdump.c
@@ -17,7 +17,7 @@ static inline int note_kasan_page_table(struct mm_walk *walk,
  {
  struct ptdump_state *st = walk->private;
-    st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]));
+    st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]), PAGE_SIZE);


I'm not completely sure what the page_size is going to be used for, but note that KASAN presents 
an interesting case here. We short-cut by detecting it's a KASAN region at a high level 
(PGD/P4D/PUD/PMD) and instead of walking the tree down just call note_page() *once* but with 
level==4 because we know KASAN sets up the page table like that.


However the one call actually covers a much larger region - so while PAGE_SIZE matches the level 
it doesn't match the region covered. AFAICT this will lead to odd results if you enable KASAN on 
powerpc.


Hum  I successfully tested it with KASAN, I now realise that I tested it with 
CONFIG_KASAN_VMALLOC selected. In this situation, since 
https://github.com/torvalds/linux/commit/af3d0a686 we don't have any common shadow page table 
anymore.


I'll test again without CONFIG_KASAN_VMALLOC.



To be honest I don't fully understand why powerpc requires the page_size - it appears to be using 
it purely to find "holes" in the calls to note_page(), but I haven't worked out why such holes 
would occur.


I was indeed introduced for KASAN. We have a first commit 
https://github.com/torvalds/linux/commit/cabe8138 which uses page size to detect whether it is a 
KASAN like stuff.


Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a fix. I can't remember what the 
problem was exactly, something around the use of hugepages for kernel memory, came as part of the 
series 
https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.le...@csgroup.eu/ 



Ah, that's useful context. So it looks like powerpc took a different route to reducing the KASAN 
output to x86.


Given the generic ptdump code has handling for KASAN already it should be possible to drop that from 
the powerpc arch code, which I think means we don't actually need to provide page size to 
notepage(). Hopefully that means more code to delete ;)




Yes ... and no.

It looks like the generic ptdump handles the case when several pgdir entries points to the same 
kasan_early_shadow_pte. But it doesn't take into account the powerpc case where we have regular page 
tables where several (if not all) PTEs are pointing to the kasan_early_shadow_page .


Christophe


Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-16 Thread Steven Price

On 16/04/2021 11:38, Christophe Leroy wrote:



Le 16/04/2021 à 11:28, Steven Price a écrit :

On 15/04/2021 18:18, Christophe Leroy wrote:

In order to support large pages on powerpc, notepage()
needs to know the page size of the page.

Add a page_size argument to notepage().

Signed-off-by: Christophe Leroy 
---
  arch/arm64/mm/ptdump.c |  2 +-
  arch/riscv/mm/ptdump.c |  2 +-
  arch/s390/mm/dump_pagetables.c |  3 ++-
  arch/x86/mm/dump_pagetables.c  |  2 +-
  include/linux/ptdump.h |  2 +-
  mm/ptdump.c    | 16 
  6 files changed, 14 insertions(+), 13 deletions(-)


[...]

diff --git a/mm/ptdump.c b/mm/ptdump.c
index da751448d0e4..61cd16afb1c8 100644
--- a/mm/ptdump.c
+++ b/mm/ptdump.c
@@ -17,7 +17,7 @@ static inline int note_kasan_page_table(struct 
mm_walk *walk,

  {
  struct ptdump_state *st = walk->private;
-    st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]));
+    st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]), 
PAGE_SIZE);


I'm not completely sure what the page_size is going to be used for, 
but note that KASAN presents an interesting case here. We short-cut by 
detecting it's a KASAN region at a high level (PGD/P4D/PUD/PMD) and 
instead of walking the tree down just call note_page() *once* but with 
level==4 because we know KASAN sets up the page table like that.


However the one call actually covers a much larger region - so while 
PAGE_SIZE matches the level it doesn't match the region covered. 
AFAICT this will lead to odd results if you enable KASAN on powerpc.


Hum  I successfully tested it with KASAN, I now realise that I 
tested it with CONFIG_KASAN_VMALLOC selected. In this situation, since 
https://github.com/torvalds/linux/commit/af3d0a686 we don't have any 
common shadow page table anymore.


I'll test again without CONFIG_KASAN_VMALLOC.



To be honest I don't fully understand why powerpc requires the 
page_size - it appears to be using it purely to find "holes" in the 
calls to note_page(), but I haven't worked out why such holes would 
occur.


I was indeed introduced for KASAN. We have a first commit 
https://github.com/torvalds/linux/commit/cabe8138 which uses page size 
to detect whether it is a KASAN like stuff.


Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a fix. I 
can't remember what the problem was exactly, something around the use of 
hugepages for kernel memory, came as part of the series 
https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.le...@csgroup.eu/ 


Ah, that's useful context. So it looks like powerpc took a different 
route to reducing the KASAN output to x86.


Given the generic ptdump code has handling for KASAN already it should 
be possible to drop that from the powerpc arch code, which I think means 
we don't actually need to provide page size to notepage(). Hopefully 
that means more code to delete ;)


Steve


Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-16 Thread Christophe Leroy




Le 16/04/2021 à 11:28, Steven Price a écrit :

On 15/04/2021 18:18, Christophe Leroy wrote:

In order to support large pages on powerpc, notepage()
needs to know the page size of the page.

Add a page_size argument to notepage().

Signed-off-by: Christophe Leroy 
---
  arch/arm64/mm/ptdump.c |  2 +-
  arch/riscv/mm/ptdump.c |  2 +-
  arch/s390/mm/dump_pagetables.c |  3 ++-
  arch/x86/mm/dump_pagetables.c  |  2 +-
  include/linux/ptdump.h |  2 +-
  mm/ptdump.c    | 16 
  6 files changed, 14 insertions(+), 13 deletions(-)


[...]

diff --git a/mm/ptdump.c b/mm/ptdump.c
index da751448d0e4..61cd16afb1c8 100644
--- a/mm/ptdump.c
+++ b/mm/ptdump.c
@@ -17,7 +17,7 @@ static inline int note_kasan_page_table(struct mm_walk *walk,
  {
  struct ptdump_state *st = walk->private;
-    st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]));
+    st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]), PAGE_SIZE);


I'm not completely sure what the page_size is going to be used for, but note that KASAN presents an 
interesting case here. We short-cut by detecting it's a KASAN region at a high level 
(PGD/P4D/PUD/PMD) and instead of walking the tree down just call note_page() *once* but with 
level==4 because we know KASAN sets up the page table like that.


However the one call actually covers a much larger region - so while PAGE_SIZE matches the level it 
doesn't match the region covered. AFAICT this will lead to odd results if you enable KASAN on powerpc.


Hum  I successfully tested it with KASAN, I now realise that I tested it with 
CONFIG_KASAN_VMALLOC selected. In this situation, since 
https://github.com/torvalds/linux/commit/af3d0a686 we don't have any common shadow page table anymore.


I'll test again without CONFIG_KASAN_VMALLOC.



To be honest I don't fully understand why powerpc requires the page_size - it appears to be using it 
purely to find "holes" in the calls to note_page(), but I haven't worked out why such holes would 
occur.


I was indeed introduced for KASAN. We have a first commit 
https://github.com/torvalds/linux/commit/cabe8138 which uses page size to detect whether it is a 
KASAN like stuff.


Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a fix. I can't remember what the 
problem was exactly, something around the use of hugepages for kernel memory, came as part of the 
series 
https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.le...@csgroup.eu/



Christophe


Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-16 Thread Steven Price

On 15/04/2021 18:18, Christophe Leroy wrote:

In order to support large pages on powerpc, notepage()
needs to know the page size of the page.

Add a page_size argument to notepage().

Signed-off-by: Christophe Leroy 
---
  arch/arm64/mm/ptdump.c |  2 +-
  arch/riscv/mm/ptdump.c |  2 +-
  arch/s390/mm/dump_pagetables.c |  3 ++-
  arch/x86/mm/dump_pagetables.c  |  2 +-
  include/linux/ptdump.h |  2 +-
  mm/ptdump.c| 16 
  6 files changed, 14 insertions(+), 13 deletions(-)


[...]

diff --git a/mm/ptdump.c b/mm/ptdump.c
index da751448d0e4..61cd16afb1c8 100644
--- a/mm/ptdump.c
+++ b/mm/ptdump.c
@@ -17,7 +17,7 @@ static inline int note_kasan_page_table(struct mm_walk *walk,
  {
struct ptdump_state *st = walk->private;
  
-	st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]));

+   st->note_page(st, addr, 4, pte_val(kasan_early_shadow_pte[0]), 
PAGE_SIZE);


I'm not completely sure what the page_size is going to be used for, but 
note that KASAN presents an interesting case here. We short-cut by 
detecting it's a KASAN region at a high level (PGD/P4D/PUD/PMD) and 
instead of walking the tree down just call note_page() *once* but with 
level==4 because we know KASAN sets up the page table like that.


However the one call actually covers a much larger region - so while 
PAGE_SIZE matches the level it doesn't match the region covered. AFAICT 
this will lead to odd results if you enable KASAN on powerpc.


To be honest I don't fully understand why powerpc requires the page_size 
- it appears to be using it purely to find "holes" in the calls to 
note_page(), but I haven't worked out why such holes would occur.


Steve

  
  	walk->action = ACTION_CONTINUE;
  
@@ -41,7 +41,7 @@ static int ptdump_pgd_entry(pgd_t *pgd, unsigned long addr,

st->effective_prot(st, 0, pgd_val(val));
  
  	if (pgd_leaf(val))

-   st->note_page(st, addr, 0, pgd_val(val));
+   st->note_page(st, addr, 0, pgd_val(val), PGDIR_SIZE);
  
  	return 0;

  }
@@ -62,7 +62,7 @@ static int ptdump_p4d_entry(p4d_t *p4d, unsigned long addr,
st->effective_prot(st, 1, p4d_val(val));
  
  	if (p4d_leaf(val))

-   st->note_page(st, addr, 1, p4d_val(val));
+   st->note_page(st, addr, 1, p4d_val(val), P4D_SIZE);
  
  	return 0;

  }
@@ -83,7 +83,7 @@ static int ptdump_pud_entry(pud_t *pud, unsigned long addr,
st->effective_prot(st, 2, pud_val(val));
  
  	if (pud_leaf(val))

-   st->note_page(st, addr, 2, pud_val(val));
+   st->note_page(st, addr, 2, pud_val(val), PUD_SIZE);
  
  	return 0;

  }
@@ -102,7 +102,7 @@ static int ptdump_pmd_entry(pmd_t *pmd, unsigned long addr,
if (st->effective_prot)
st->effective_prot(st, 3, pmd_val(val));
if (pmd_leaf(val))
-   st->note_page(st, addr, 3, pmd_val(val));
+   st->note_page(st, addr, 3, pmd_val(val), PMD_SIZE);
  
  	return 0;

  }
@@ -116,7 +116,7 @@ static int ptdump_pte_entry(pte_t *pte, unsigned long addr,
if (st->effective_prot)
st->effective_prot(st, 4, pte_val(val));
  
-	st->note_page(st, addr, 4, pte_val(val));

+   st->note_page(st, addr, 4, pte_val(val), PAGE_SIZE);
  
  	return 0;

  }
@@ -126,7 +126,7 @@ static int ptdump_hole(unsigned long addr, unsigned long 
next,
  {
struct ptdump_state *st = walk->private;
  
-	st->note_page(st, addr, depth, 0);

+   st->note_page(st, addr, depth, 0, 0);
  
  	return 0;

  }
@@ -153,5 +153,5 @@ void ptdump_walk_pgd(struct ptdump_state *st, struct 
mm_struct *mm, pgd_t *pgd)
mmap_read_unlock(mm);
  
  	/* Flush out the last page */

-   st->note_page(st, 0, -1, 0);
+   st->note_page(st, 0, -1, 0, 0);
  }





Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-15 Thread Christophe Leroy




Le 16/04/2021 à 01:12, Daniel Axtens a écrit :

Hi Christophe,


  static void note_page(struct ptdump_state *pt_st, unsigned long addr, int 
level,
- u64 val)
+ u64 val, unsigned long page_size)


Compilers can warn about unused parameters at -Wextra level.  However,
reading scripts/Makefile.extrawarn it looks like the warning is
explicitly _disabled_ in the kernel at W=1 and not reenabled at W=2 or
W=3. So I guess this is fine...


There are a lot lot lot functions having unused parameters in the kernel , especially the ones that 
are re-implemented by each architecture.





@@ -126,7 +126,7 @@ static int ptdump_hole(unsigned long addr, unsigned long 
next,
  {
struct ptdump_state *st = walk->private;
  
-	st->note_page(st, addr, depth, 0);

+   st->note_page(st, addr, depth, 0, 0);


I know it doesn't matter at this point, but I'm not really thrilled by
the idea of passing 0 as the size here. Doesn't the hole have a known
page size?


The hole has a size for sure, I don't think we can call it a page size:

On powerpc 8xx, we have 4 page sizes: 8M, 512k, 16k and 4k.
A page table will cover 4M areas and will contain pages of size 512k, 16k and 
4k.
A PGD table contains either entries which points to a page table (covering 4M), or two identical 
consecutive entries pointing to the same hugepd which contains a single PTE for an 8M page.


So, if a PGD entry is empty, the hole is 4M, it corresponds to none of the page sizes the 
architecture supports.



But looking at what is done with that size, it can make sense to pass it to notepage() anyway. Let's 
do that.




  
  	return 0;

  }
@@ -153,5 +153,5 @@ void ptdump_walk_pgd(struct ptdump_state *st, struct 
mm_struct *mm, pgd_t *pgd)
mmap_read_unlock(mm);
  
  	/* Flush out the last page */

-   st->note_page(st, 0, -1, 0);
+   st->note_page(st, 0, -1, 0, 0);


I'm more OK with the idea of passing 0 as the size when the depth is -1
(don't know): if we don't know the depth we conceptually can't know the
page size.

Regards,
Daniel



Re: [PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

2021-04-15 Thread Daniel Axtens
Hi Christophe,

>  static void note_page(struct ptdump_state *pt_st, unsigned long addr, int 
> level,
> -   u64 val)
> +   u64 val, unsigned long page_size)

Compilers can warn about unused parameters at -Wextra level.  However,
reading scripts/Makefile.extrawarn it looks like the warning is
explicitly _disabled_ in the kernel at W=1 and not reenabled at W=2 or
W=3. So I guess this is fine...

> @@ -126,7 +126,7 @@ static int ptdump_hole(unsigned long addr, unsigned long 
> next,
>  {
>   struct ptdump_state *st = walk->private;
>  
> - st->note_page(st, addr, depth, 0);
> + st->note_page(st, addr, depth, 0, 0);

I know it doesn't matter at this point, but I'm not really thrilled by
the idea of passing 0 as the size here. Doesn't the hole have a known
page size?

>  
>   return 0;
>  }
> @@ -153,5 +153,5 @@ void ptdump_walk_pgd(struct ptdump_state *st, struct 
> mm_struct *mm, pgd_t *pgd)
>   mmap_read_unlock(mm);
>  
>   /* Flush out the last page */
> - st->note_page(st, 0, -1, 0);
> + st->note_page(st, 0, -1, 0, 0);

I'm more OK with the idea of passing 0 as the size when the depth is -1
(don't know): if we don't know the depth we conceptually can't know the
page size.

Regards,
Daniel