Re: [PATCH V15] mm/debug: Add tests validating architecture page table helpers

2020-03-09 Thread Anshuman Khandual



On 03/07/2020 12:35 PM, Christophe Leroy wrote:
> 
> 
> On 07/03/2020 at 01:56, Anshuman Khandual wrote:
>>
>>
>> On 03/07/2020 06:04 AM, Qian Cai wrote:
>>>
>>>
>>>> On Mar 6, 2020, at 7:03 PM, Anshuman Khandual wrote:
>>>>
>>>> Hmm, set_pte_at() function is not preferred here for these tests. The idea
>>>> is to avoid or at least minimize TLB/cache flushes triggered from these sort
>>>> of 'static' tests. set_pte_at() is platform provided and could/might trigger
>>>> these flushes or some other platform specific synchronization stuff. Just
>>>
>>> Why is that important for this debugging option?
>>
>> The primary reason is to avoid TLB/cache flush instructions on the system
>> during these tests that only involve transforming different page table
>> level entries through helpers. Unless really necessary, why should it
>> emit any TLB/cache flush instructions?
> 
> What's the problem with those flushes?
> 
>>
>>>
>>>> wondering is there specific reason with respect to the soft lock up problem
>>>> making it necessary to use set_pte_at() rather than a simple WRITE_ONCE() ?
>>>
>>> Looking at the s390 version of set_pte_at(), it has this comment,
>>> vmaddr);
>>>
>>> /*
>>>   * Certain architectures need to do special things when PTEs
>>>   * within a page table are directly modified.  Thus, the following
>>>   * hook is made available.
>>>   */
>>>
>>> I can only guess that powerpc could be the same here.
>>
>> This comment is present in multiple platforms while defining set_pte_at().
>> Is 'barrier()' alone not good enough here? Otherwise, what exactly does
>> set_pte_at() do, compared to WRITE_ONCE(), that avoids the soft lockup? Just
>> trying to understand.
>>
> 
> 
> Argh! I didn't realise that you were writing directly into the page tables.
> When it works, that's only by chance I guess.
> 
> To properly set the page table entries, set_pte_at() has to be used:
> - On powerpc 8xx, with 16k pages, the page table entry must be copied four 
> times. set_pte_at() does it, WRITE_ONCE() doesn't.
> - On powerpc book3s/32 (hash MMU), the flag _PAGE_HASHPTE must be preserved 
> among writes. set_pte_at() preserves it, WRITE_ONCE() doesn't.
> 
> set_pte_at() also does a few other mandatory things, like calling pte_mkpte()
> 
> So, the WRITE_ONCE() must definitely become a set_pte_at()

Sure, will do. These writes are part of the clear tests, which populate a given
entry with a non-zero value before clearing it and checking it with pxx_none().
In that context, WRITE_ONCE() seemed sufficient. But pte_clear() might be
closely tied to a proper page table entry update, so a preceding set_pte_at()
will be better.

There are still more WRITE_ONCE() calls for the other page table levels in
these clear tests. set_pmd_at() and set_pud_at() are defined only on platforms
that support (and enable) THP and PUD based THP respectively. Hence they cannot
be used for the clear tests, because the remaining helpers pmd_clear(),
pud_clear(), p4d_clear() and pgd_clear() still need to be validated with or
without THP support and enablement. We should just leave all the other
WRITE_ONCE() instances unchanged. Please correct me if I am missing something
here.

> 
> Christophe
> 


Re: [PATCH V15] mm/debug: Add tests validating architecture page table helpers

2020-03-06 Thread Christophe Leroy




On 07/03/2020 at 01:56, Anshuman Khandual wrote:
>
> On 03/07/2020 06:04 AM, Qian Cai wrote:
>>
>>> On Mar 6, 2020, at 7:03 PM, Anshuman Khandual wrote:
>>>
>>> Hmm, set_pte_at() function is not preferred here for these tests. The idea
>>> is to avoid or at least minimize TLB/cache flushes triggered from these sort
>>> of 'static' tests. set_pte_at() is platform provided and could/might trigger
>>> these flushes or some other platform specific synchronization stuff. Just
>>
>> Why is that important for this debugging option?
>
> The primary reason is to avoid TLB/cache flush instructions on the system
> during these tests that only involve transforming different page table
> level entries through helpers. Unless really necessary, why should it
> emit any TLB/cache flush instructions?

What's the problem with those flushes?

>>> wondering is there specific reason with respect to the soft lock up problem
>>> making it necessary to use set_pte_at() rather than a simple WRITE_ONCE() ?
>>
>> Looking at the s390 version of set_pte_at(), it has this comment,
>> vmaddr);
>>
>> /*
>>  * Certain architectures need to do special things when PTEs
>>  * within a page table are directly modified.  Thus, the following
>>  * hook is made available.
>>  */
>>
>> I can only guess that powerpc could be the same here.
>
> This comment is present in multiple platforms while defining set_pte_at().
> Is 'barrier()' alone not good enough here? Otherwise, what exactly does
> set_pte_at() do, compared to WRITE_ONCE(), that avoids the soft lockup? Just
> trying to understand.

Argh! I didn't realise that you were writing directly into the page
tables. When it works, that's only by chance I guess.

To properly set the page table entries, set_pte_at() has to be used:
- On powerpc 8xx, with 16k pages, the page table entry must be copied
  four times. set_pte_at() does it, WRITE_ONCE() doesn't.
- On powerpc book3s/32 (hash MMU), the flag _PAGE_HASHPTE must be
  preserved among writes. set_pte_at() preserves it, WRITE_ONCE() doesn't.

set_pte_at() also does a few other mandatory things, like calling
pte_mkpte().

So, the WRITE_ONCE() must definitely become a set_pte_at().

Christophe
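
For illustration, the two powerpc points in this message can be mimicked with a
small userspace toy model. This is only a sketch under stated assumptions, not
kernel code: toy_pte_t, struct toy_pte_slot, TOY_PAGE_HASHPTE and the toy_*
helpers are hypothetical names invented for this example, and the model folds
the 8xx and book3s/32 behaviours into one type for brevity even though they
belong to different platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: none of these names are the real powerpc definitions. */
typedef uint32_t toy_pte_t;

#define TOY_PTE_COPIES   4            /* entry replicated four times (8xx, 16k pages) */
#define TOY_PAGE_HASHPTE 0x00000002u  /* hypothetical stand-in for _PAGE_HASHPTE */

/* One logical PTE slot as seen by this toy model. */
struct toy_pte_slot {
	toy_pte_t copy[TOY_PTE_COPIES];
};

/* Raw store in the spirit of WRITE_ONCE(*ptep, pte): only the first word changes. */
static void toy_raw_write(struct toy_pte_slot *slot, toy_pte_t val)
{
	*(volatile toy_pte_t *)&slot->copy[0] = val;
}

/* Helper in the spirit of set_pte_at(): keeps the hash flag, fills every copy. */
static void toy_set_pte(struct toy_pte_slot *slot, toy_pte_t val)
{
	toy_pte_t keep = slot->copy[0] & TOY_PAGE_HASHPTE;

	for (int i = 0; i < TOY_PTE_COPIES; i++)
		slot->copy[i] = (val & ~TOY_PAGE_HASHPTE) | keep;
}

/* Returns 1 when all copies agree with the first one, 0 otherwise. */
static int toy_slot_consistent(const struct toy_pte_slot *slot)
{
	for (int i = 1; i < TOY_PTE_COPIES; i++)
		if (slot->copy[i] != slot->copy[0])
			return 0;
	return 1;
}
```

In this model a raw store leaves the replicated copies out of sync and drops
the flag, whereas the helper keeps both invariants. What the real set_pte_at()
does on a given platform has to be checked in that platform's headers.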


Re: [PATCH V15] mm/debug: Add tests validating architecture page table helpers

2020-03-06 Thread Qian Cai



> On Mar 6, 2020, at 7:56 PM, Anshuman Khandual wrote:
> 
> 
> 
> On 03/07/2020 06:04 AM, Qian Cai wrote:
>> 
>> 
>>> On Mar 6, 2020, at 7:03 PM, Anshuman Khandual wrote:
>>> 
>>> Hmm, set_pte_at() function is not preferred here for these tests. The idea
>>> is to avoid or at least minimize TLB/cache flushes triggered from these sort
>>> of 'static' tests. set_pte_at() is platform provided and could/might trigger
>>> these flushes or some other platform specific synchronization stuff. Just
>> 
>> Why is that important for this debugging option?
> 
> The primary reason is to avoid TLB/cache flush instructions on the system
> during these tests that only involve transforming different page table
> level entries through helpers. Unless really necessary, why should it
> emit any TLB/cache flush instructions?
> 
>> 
>>> wondering is there specific reason with respect to the soft lock up problem
>>> making it necessary to use set_pte_at() rather than a simple WRITE_ONCE() ?
>> 
>> Looking at the s390 version of set_pte_at(), it has this comment,
>> vmaddr);
>> 
>> /*
>> * Certain architectures need to do special things when PTEs
>> * within a page table are directly modified.  Thus, the following
>> * hook is made available.
>> */
>> 
>> I can only guess that powerpc could be the same here.
> 
> This comment is present in multiple platforms while defining set_pte_at().
> Is 'barrier()' alone not good enough here? Else what exactly set_pte_at()

No, barrier() is not enough.

> does as compared to WRITE_ONCE() that avoids the soft lock up, just trying
> to understand.

I could certainly spend hours figuring out exactly which parts of set_pte_at()
are necessary for pte_clear() not to get stuck, then propose a solution and
possibly need to retest on multiple arches. I am not sure that is a good use of
my time just to save a few TLB/cache flushes on a debug kernel.

Re: [PATCH V15] mm/debug: Add tests validating architecture page table helpers

2020-03-06 Thread Anshuman Khandual



On 03/07/2020 06:04 AM, Qian Cai wrote:
> 
> 
>> On Mar 6, 2020, at 7:03 PM, Anshuman Khandual wrote:
>>
>> Hmm, set_pte_at() function is not preferred here for these tests. The idea
>> is to avoid or at least minimize TLB/cache flushes triggered from these sort
>> of 'static' tests. set_pte_at() is platform provided and could/might trigger
>> these flushes or some other platform specific synchronization stuff. Just
> 
> Why is that important for this debugging option?

The primary reason is to avoid TLB/cache flush instructions on the system
during these tests, which only involve transforming different page table
level entries through helpers. Unless really necessary, why should it
emit any TLB/cache flush instructions?

> 
>> wondering is there specific reason with respect to the soft lock up problem
>> making it necessary to use set_pte_at() rather than a simple WRITE_ONCE() ?
> 
> Looking at the s390 version of set_pte_at(), it has this comment,
> vmaddr);
> 
> /*
>  * Certain architectures need to do special things when PTEs
>  * within a page table are directly modified.  Thus, the following
>  * hook is made available.
>  */
> 
> I can only guess that powerpc could be the same here.

This comment is present in multiple platforms while defining set_pte_at().
Is 'barrier()' alone not good enough here? Otherwise, what exactly does
set_pte_at() do, compared to WRITE_ONCE(), that avoids the soft lockup? Just
trying to understand.


Re: [PATCH V15] mm/debug: Add tests validating architecture page table helpers

2020-03-06 Thread Qian Cai



> On Mar 6, 2020, at 7:03 PM, Anshuman Khandual wrote:
> 
> Hmm, set_pte_at() function is not preferred here for these tests. The idea
> is to avoid or at least minimize TLB/cache flushes triggered from these sort
> of 'static' tests. set_pte_at() is platform provided and could/might trigger
> these flushes or some other platform specific synchronization stuff. Just

Why is that important for this debugging option?

> wondering is there specific reason with respect to the soft lock up problem
> making it necessary to use set_pte_at() rather than a simple WRITE_ONCE() ?

Looking at the s390 version of set_pte_at(), it has this comment,
vmaddr);

/*
 * Certain architectures need to do special things when PTEs
 * within a page table are directly modified.  Thus, the following
 * hook is made available.
 */

I can only guess that powerpc could be the same here.

Re: [PATCH V15] mm/debug: Add tests validating architecture page table helpers

2020-03-06 Thread Anshuman Khandual



On 03/07/2020 02:14 AM, Qian Cai wrote:
> On Fri, 2020-03-06 at 05:27 +0530, Anshuman Khandual wrote:
>> This adds tests which will validate architecture page table helpers and
>> other accessors in their compliance with expected generic MM semantics.
>> This will help various architectures in validating changes to existing
>> page table helpers or addition of new ones.
>>
>> This test covers basic page table entry transformations including but not
>> limited to old, young, dirty, clean, write, write protect etc at various
>> level along with populating intermediate entries with next page table page
>> and validating them.
>>
>> Test page table pages are allocated from system memory with required size
>> and alignments. The mapped pfns at page table levels are derived from a
>> real pfn representing a valid kernel text symbol. This test gets called
>> inside kernel_init() right after async_synchronize_full().
>>
>> This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected. Any
>> architecture that is willing to subscribe to this test will need to select
>> ARCH_HAS_DEBUG_VM_PGTABLE. For now this is limited to arc, arm64, x86, s390
>> and ppc32 platforms where the test is known to build and run successfully.
>> Going forward, other architectures too can subscribe to the test after fixing
>> any build or runtime problems with their page table helpers. Meanwhile for
>> better platform coverage, the test can also be enabled with CONFIG_EXPERT
>> even without ARCH_HAS_DEBUG_VM_PGTABLE.
>>
>> Folks interested in making sure that a given platform's page table helpers
>> conform to expected generic MM semantics should enable the above config
>> which will just trigger this test during boot. Any non conformity here will
>> be reported as a warning which would need to be fixed. This test will help
>> catch any changes to the agreed upon semantics expected from generic MM and
>> enable platforms to accommodate it thereafter.
> 
> OK, I get this working on powerpc hash MMU as well, so this?
> 
> diff --git a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt
> index 64d0f9b15c49..c527d05c0459 100644
> --- a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt
> +++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt
> @@ -22,8 +22,7 @@
>      |       nios2: | TODO |
>      |    openrisc: | TODO |
>      |      parisc: | TODO |
> -    |  powerpc/32: |  ok  |
> -    |  powerpc/64: | TODO |
> +    |     powerpc: |  ok  |
>      |       riscv: | TODO |
>      |        s390: |  ok  |
>      |          sh: | TODO |
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 2e7eee523ba1..176930f40e07 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -116,7 +116,7 @@ config PPC
>   #
>   select ARCH_32BIT_OFF_T if PPC32
>   select ARCH_HAS_DEBUG_VIRTUAL
> - select ARCH_HAS_DEBUG_VM_PGTABLE if PPC32
> + select ARCH_HAS_DEBUG_VM_PGTABLE
>   select ARCH_HAS_DEVMEM_IS_ALLOWED
>   select ARCH_HAS_ELF_RANDOMIZE
>   select ARCH_HAS_FORTIFY_SOURCE
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 96a91bda3a85..98990a515268 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -256,7 +256,8 @@ static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
>   pte_t pte = READ_ONCE(*ptep);
>  
>   pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> - WRITE_ONCE(*ptep, pte);
> + set_pte_at(mm, vaddr, ptep, pte);

Hmm, set_pte_at() is not preferred here for these tests. The idea is to avoid,
or at least minimize, TLB/cache flushes triggered from these sort of 'static'
tests. set_pte_at() is platform provided and could trigger these flushes or
some other platform specific synchronization. Just wondering, is there a
specific reason with respect to the soft lockup problem that makes it necessary
to use set_pte_at() rather than a simple WRITE_ONCE()?

> + barrier();
>   pte_clear(mm, vaddr, ptep);
>   pte = READ_ONCE(*ptep);
>   WARN_ON(!pte_none(pte));
> 


Re: [PATCH V15] mm/debug: Add tests validating architecture page table helpers

2020-03-06 Thread Qian Cai
On Fri, 2020-03-06 at 05:27 +0530, Anshuman Khandual wrote:
> This adds tests which will validate architecture page table helpers and
> other accessors in their compliance with expected generic MM semantics.
> This will help various architectures in validating changes to existing
> page table helpers or addition of new ones.
> 
> This test covers basic page table entry transformations including but not
> limited to old, young, dirty, clean, write, write protect etc at various
> level along with populating intermediate entries with next page table page
> and validating them.
> 
> Test page table pages are allocated from system memory with required size
> and alignments. The mapped pfns at page table levels are derived from a
> real pfn representing a valid kernel text symbol. This test gets called
> inside kernel_init() right after async_synchronize_full().
> 
> This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected. Any
> architecture that is willing to subscribe to this test will need to select
> ARCH_HAS_DEBUG_VM_PGTABLE. For now this is limited to arc, arm64, x86, s390
> and ppc32 platforms where the test is known to build and run successfully.
> Going forward, other architectures too can subscribe to the test after fixing
> any build or runtime problems with their page table helpers. Meanwhile for
> better platform coverage, the test can also be enabled with CONFIG_EXPERT
> even without ARCH_HAS_DEBUG_VM_PGTABLE.
> 
> Folks interested in making sure that a given platform's page table helpers
> conform to expected generic MM semantics should enable the above config
> which will just trigger this test during boot. Any non conformity here will
> be reported as a warning which would need to be fixed. This test will help
> catch any changes to the agreed upon semantics expected from generic MM and
> enable platforms to accommodate it thereafter.

OK, I get this working on powerpc hash MMU as well, so this?

diff --git a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt
index 64d0f9b15c49..c527d05c0459 100644
--- a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt
+++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt
@@ -22,8 +22,7 @@
     |       nios2: | TODO |
     |    openrisc: | TODO |
     |      parisc: | TODO |
-    |  powerpc/32: |  ok  |
-    |  powerpc/64: | TODO |
+    |     powerpc: |  ok  |
     |       riscv: | TODO |
     |        s390: |  ok  |
     |          sh: | TODO |
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 2e7eee523ba1..176930f40e07 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -116,7 +116,7 @@ config PPC
    #
    select ARCH_32BIT_OFF_T if PPC32
    select ARCH_HAS_DEBUG_VIRTUAL
-   select ARCH_HAS_DEBUG_VM_PGTABLE if PPC32
+   select ARCH_HAS_DEBUG_VM_PGTABLE
    select ARCH_HAS_DEVMEM_IS_ALLOWED
    select ARCH_HAS_ELF_RANDOMIZE
    select ARCH_HAS_FORTIFY_SOURCE
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 96a91bda3a85..98990a515268 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -256,7 +256,8 @@ static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
    pte_t pte = READ_ONCE(*ptep);
 
    pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
-   WRITE_ONCE(*ptep, pte);
+   set_pte_at(mm, vaddr, ptep, pte);
+   barrier();
    pte_clear(mm, vaddr, ptep);
    pte = READ_ONCE(*ptep);
    WARN_ON(!pte_none(pte));
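
The shape of the clear test touched by this hunk (populate the entry with a
non-zero value, clear it, then check that it reads back as none) can be
sketched in userspace. Everything prefixed toy_ is a hypothetical stand-in
invented for this example and not the kernel API; the real helpers also take
mm and address arguments and may do platform specific work.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for pte_t; the real type is architecture defined. */
typedef struct { uint64_t val; } toy_pte_t;

/* Illustrative non-zero pattern, playing the role of RANDOM_ORVALUE. */
#define TOY_RANDOM_ORVALUE 0xfffffffffffffff0ull

static int toy_pte_none(toy_pte_t pte)
{
	return pte.val == 0;
}

/* Plays the role of set_pte_at(): here it is just a plain store. */
static void toy_set_pte(toy_pte_t *ptep, toy_pte_t pte)
{
	ptep->val = pte.val;
}

/* Plays the role of pte_clear(): resets the entry to the "none" encoding. */
static void toy_pte_clear(toy_pte_t *ptep)
{
	ptep->val = 0;
}

/* Same order of steps as the patched test; returns 1 on pass, 0 on failure. */
static int toy_pte_clear_test(toy_pte_t *ptep)
{
	toy_pte_t pte = *ptep;

	pte.val |= TOY_RANDOM_ORVALUE;
	toy_set_pte(ptep, pte);
	if (toy_pte_none(*ptep))
		return 0;	/* entry must be populated before the clear */
	toy_pte_clear(ptep);
	return toy_pte_none(*ptep);
}
```

The sketch only captures the order of operations; it says nothing about the
soft lockup seen on powerpc, which depends on what the platform's own
set_pte_at() and pte_clear() do.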


[PATCH V15] mm/debug: Add tests validating architecture page table helpers

2020-03-05 Thread Anshuman Khandual
This adds tests which will validate architecture page table helpers and
other accessors in their compliance with expected generic MM semantics.
This will help various architectures in validating changes to existing
page table helpers or addition of new ones.

This test covers basic page table entry transformations including but not
limited to old, young, dirty, clean, write, write protect etc. at various
levels, along with populating intermediate entries with the next page table
page and validating them.

Test page table pages are allocated from system memory with required size
and alignments. The mapped pfns at page table levels are derived from a
real pfn representing a valid kernel text symbol. This test gets called
inside kernel_init() right after async_synchronize_full().

This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected. Any
architecture that is willing to subscribe to this test will need to select
ARCH_HAS_DEBUG_VM_PGTABLE. For now this is limited to arc, arm64, x86, s390
and ppc32 platforms where the test is known to build and run successfully.
Going forward, other architectures too can subscribe to the test after fixing
any build or runtime problems with their page table helpers. Meanwhile, for
better platform coverage, the test can also be enabled with CONFIG_EXPERT
even without ARCH_HAS_DEBUG_VM_PGTABLE.

Folks interested in making sure that a given platform's page table helpers
conform to expected generic MM semantics should enable the above config,
which will just trigger this test during boot. Any non-conformity here will
be reported as a warning which would need to be fixed. This test will help
catch any changes to the agreed upon semantics expected from generic MM and
enable platforms to accommodate it thereafter.

Cc: Andrew Morton 
Cc: Mike Rapoport 
Cc: Vineet Gupta 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Kirill A. Shutemov 
Cc: Paul Walmsley 
Cc: Palmer Dabbelt 
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: linux-ri...@lists.infradead.org
Cc: x...@kernel.org
Cc: linux-a...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org

Suggested-by: Catalin Marinas 
Reviewed-by: Ingo Molnar 
Tested-by: Gerald Schaefer  # s390
Tested-by: Christophe Leroy # ppc32
Signed-off-by: Qian Cai 
Signed-off-by: Andrew Morton 
Signed-off-by: Christophe Leroy 
Signed-off-by: Anshuman Khandual 
---
This adds a test validation for architecture exported page table helpers.
Patch adds basic transformation tests at various levels of the page table.

This test was originally suggested by Catalin during arm64 THP migration
RFC discussion earlier. Going forward it can include more specific tests
with respect to various generic MM functions like THP, HugeTLB etc and
platform specific tests.

https://lore.kernel.org/linux-mm/20190628102003.ga56...@arrakis.emea.arm.com/

Needs to be applied on linux V5.6-rc4

Changes in V15:

- Replaced __pa() with __pa_symbol() (https://patchwork.kernel.org/patch/11407715/)
- Replaced pte_alloc_map() with pte_alloc_map_lock() per Qian
- Replaced pte_unmap() with pte_unmap_unlock() per Qian
- Added address to pte_clear_tests() and passed it down till pte_clear() per Qian

Changes in V14: (https://patchwork.kernel.org/project/linux-mm/list/?series=242305)

- Disabled DEBUG_VM_PGTABLE for IA64 and ARM (32 Bit) per Andrew and Christophe
- Updated DEBUG_VM_PGTABLE documentation wrt EXPERT and disabled platforms
- Updated RANDOM_[OR|NZ]VALUE open encodings with GENMASK() per Catalin
- Updated s390 constraint bits from 12 to 4 (S390_MASK_BITS) per Gerald
- Updated in-code documentation for RANDOM_ORVALUE per Gerald
- Updated pxx_basic_tests() to use invert functions first per Catalin
- Dropped ARCH_HAS_4LEVEL_HACK check from pud_basic_tests()
- Replaced __ARCH_HAS_[4|5]LEVEL_HACK with __PAGETABLE_[PUD|P4D]_FOLDED per Catalin
- Trimmed the CC list on the commit message per Catalin
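
As an illustration of the GENMASK() and S390_MASK_BITS items in this list, the
sketch below re-creates the idea in userspace. It is an assumption about the
shape of the definitions, not a copy of the patch: TOY_GENMASK,
TOY_BITS_PER_LONG, TOY_LOW_MASK_BITS and TOY_ORVALUE are hypothetical names,
and the only facts taken from the changelog are that GENMASK() replaces open
coded constants and that the low 4 bits are left clear.

```c
#include <assert.h>
#include <limits.h>

/* Width of unsigned long on the build host (assumption: stands in for BITS_PER_LONG). */
#define TOY_BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Userspace re-creation of the GENMASK(h, l) idea: bits l..h set, inclusive. */
#define TOY_GENMASK(h, l) \
	(((~0UL) << (l)) & (~0UL >> (TOY_BITS_PER_LONG - 1 - (h))))

/* The changelog's "4": number of low bits that must stay clear. */
#define TOY_LOW_MASK_BITS 4

/* Hypothetical OR pattern: every bit set except the low TOY_LOW_MASK_BITS. */
#define TOY_ORVALUE TOY_GENMASK(TOY_BITS_PER_LONG - 1, TOY_LOW_MASK_BITS)

/* Small accessor so the value can be inspected from ordinary code. */
static unsigned long toy_orvalue(void)
{
	return TOY_ORVALUE;
}
```

Expressing the pattern as a bit range makes the reserved low bits explicit,
which is easier to review than a long hexadecimal literal.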

Changes in V13: (https://patchwork.kernel.org/project/linux-mm/list/?series=237125)

- Subscribed s390 platform and updated debug-vm-pgtable/arch-support.txt per Gerald
- Dropped keyword 'extern' from debug_vm_pgtable() declaration per Christophe
- Moved debug_vm_pgtable() declarations to  per Christophe
- Moved debug_vm_pgtable() call site into kernel_init() per Christophe
- Changed CONFIG_DEBUG_VM_PGTABLE rules per Christophe
- Updated commit to include new supported platforms and changed config selection

Changes in V12: (https://patchwork.kernel.org/project/linux-mm/list/?series=233905)

- Replaced __mmdrop() with mmdrop()
- Enable ARCH_HAS_DEBUG_VM_PGTABLE on X86 for non CONFIG_X86_PAE platforms as the
  test procedure interfere with pre-allocated PMDs attached to the PGD