Re: [PATCH v8 4/9] phy: fsl: Add Lynx 10G SerDes driver

2022-12-06 Thread Stephen Boyd
Quoting Sean Anderson (2022-11-01 16:27:21)
> On 11/1/22 16:10, Stephen Boyd wrote:
> >> 
> >> Oh, I remember why I did this. I need the reference clock for 
> >> clk_hw_round_rate,
> >> which is AFAICT the only correct way to implement round_rate.
> >> 
> > 
> > Is the reference clk the parent of the clk implementing
> > clk_ops::round_rate()?
> 
> Yes. We may be able to produce a given output with multiple reference
> rates. However, the clock API provides no mechanism to say "Don't ask
> for the parent clock to be rate X, you just tried it and the parent
> clock can't support it." So instead, we loop over the possible reference
> rates and pick the first one which the parent says it can round to.
> 

Sorry, I'm lost. Why can't you loop over possible reference rates in
determine_rate/round_rate clk op here?
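For illustration, a determine_rate op of that shape might look like this
minimal sketch (candidate rates and all names here are hypothetical, not
the Lynx driver's actual code):

#include <linux/clk-provider.h>

/* Hypothetical reference rates the PLL could accept. */
static const unsigned long ref_rates[] = { 100000000, 125000000, 156250000 };

static int example_determine_rate(struct clk_hw *hw,
				  struct clk_rate_request *req)
{
	struct clk_hw *parent = clk_hw_get_parent(hw);
	int i;

	for (i = 0; i < ARRAY_SIZE(ref_rates); i++) {
		/* Ask the parent whether it can actually provide this rate */
		if (clk_hw_round_rate(parent, ref_rates[i]) != ref_rates[i])
			continue;

		req->best_parent_hw = parent;
		req->best_parent_rate = ref_rates[i];
		/* req->rate would then be derived from the chosen reference */
		return 0;
	}

	return -EINVAL;
}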


Re: [PATCH] powerpc/ftrace: fix syscall tracing on PPC64_ELF_ABI_V1

2022-12-06 Thread Michael Ellerman
Mathieu Desnoyers  writes:
> On 2022-12-05 17:50, Michael Ellerman wrote:
>> Michael Jeanson  writes:
>>> On 2022-12-05 15:11, Michael Jeanson wrote:
>>> Michael Jeanson  writes:
 In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
 PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
 changing from their dot prefixed variant to the non-prefixed ones.

 Since ftrace prefixes a dot to the syscall names when matching them to
 build its syscall event list, this resulted in no syscall events being
 available.

 Remove the PPC64_ELF_ABI_V1 specific version of
 arch_syscall_match_sym_name to have the same behavior across all 
 powerpc
 variants.
>>>
>>> This doesn't seem to work for me.
>>>
>>> Even with it applied I still don't see anything in
>>> /sys/kernel/debug/tracing/events/syscalls
>>>
>>> Did we break it in some other way recently?
>>>
>>> cheers
>>>
>>> I did some further testing; my config also enabled KALLSYMS_ALL, and when
>>> I remove it there are indeed no syscall events.
>> 
>> Aha, OK that explains it I guess.
>> 
>> I was using ppc64_guest_defconfig which has ABI_V1 and FTRACE_SYSCALLS,
>> but does not have KALLSYMS_ALL. So I guess there's some other bug
>> lurking in there.
>
> I don't have the setup handy to validate it, but I suspect it is caused 
> by the way scripts/kallsyms.c:symbol_valid() checks whether a symbol 
> entry needs to be integrated into the assembler output when 
> --all-symbols is not specified. It only keeps symbols whose addresses 
> are in the text range. On PPC64_ELF_ABI_V1, this means only the 
> dot-prefixed symbols will be kept (those point to the function entry), 
> leaving out the non-dot-prefixed symbols (those point to the function 
> descriptors).
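
For illustration, the relevant check has roughly this shape (a simplified
sketch, not the exact scripts/kallsyms.c code):

static int symbol_valid(const struct sym_entry *s)
{
	/* Without --all-symbols, keep only symbols whose address lies in
	 * a known text range; .opd function descriptors fail this test. */
	if (!all_symbols &&
	    symbol_in_range(s, text_ranges, ARRAY_SIZE(text_ranges)) == 0)
		return 0;

	return 1;
}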

OK. So I guess it never worked without KALLSYMS_ALL.

It seems like most distros enable KALLSYMS_ALL, so I guess that's why
we've never noticed.

> So I see two possible solutions there: either we ensure that 
> FTRACE_SYSCALLS selects KALLSYMS_ALL on PPC64_ELF_ABI_V1, or we modify 
> scripts/kallsyms.c:symbol_valid() to also include function descriptor 
> symbols. This would mean accepting symbols pointing into the .opd ELF 
> section.

My only worry is that will cause some other breakage, because .opd
symbols are not really "text" in the normal sense, ie. you can't execute
them directly.

On the other hand the help for KALLSYMS_ALL says:

  "Normally kallsyms only contains the symbols of functions"

But without .opd included that's not really true. In practice it
probably doesn't really matter, because eg. backtraces will point to dot
symbols which can be resolved.

> IMHO the second option would be better because it does not increase the 
> kernel image size as much as KALLSYMS_ALL.

Yes I agree.

Even if that did break something, any breakage would be limited to
arches which use function descriptors, which are now all rare.

Relatedly we have a patch in next to optionally use ABIv2 for 64-bit big
endian builds.

cheers


[powerpc:next] BUILD SUCCESS 5ddcc03a07ae1ab5062f89a946d9495f1fd8eaa4

2022-12-06 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
next
branch HEAD: 5ddcc03a07ae1ab5062f89a946d9495f1fd8eaa4  powerpc/cpuidle: Set 
CPUIDLE_FLAG_POLLING for snooze state

elapsed time: 728m

configs tested: 60
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
um i386_defconfig
um   x86_64_defconfig
powerpc   allnoconfig
arc defconfig
s390 allmodconfig
alpha   defconfig
x86_64 rhel-8.3-kunit
x86_64   rhel-8.3-kvm
s390defconfig
x86_64   rhel-8.3-syz
sh   allmodconfig
s390 allyesconfig
i386defconfig
powerpc  allmodconfig
mips allyesconfig
arm defconfig
x86_64  rhel-8.3-rust
x86_64rhel-8.3-kselftests
x86_64  defconfig
x86_64  rhel-8.3-func
arm  randconfig-r046-20221206
i386  randconfig-a014
arc  randconfig-r043-20221206
i386  randconfig-a012
i386  randconfig-a016
arm64allyesconfig
arm  allyesconfig
x86_64randconfig-a013
i386  randconfig-a001
x86_64randconfig-a011
x86_64randconfig-a004
i386  randconfig-a003
i386 allyesconfig
x86_64randconfig-a002
x86_64   rhel-8.3
x86_64randconfig-a015
x86_64randconfig-a006
x86_64   allyesconfig
i386  randconfig-a005
m68k allyesconfig
m68k allmodconfig
ia64 allmodconfig
arc  allyesconfig
alphaallyesconfig

clang tested configs:
hexagon  randconfig-r041-20221206
i386  randconfig-a013
hexagon  randconfig-r045-20221206
s390 randconfig-r044-20221206
riscvrandconfig-r042-20221206
i386  randconfig-a011
i386  randconfig-a015
i386  randconfig-a002
x86_64randconfig-a012
x86_64randconfig-a001
i386  randconfig-a004
x86_64randconfig-a003
x86_64randconfig-a014
x86_64randconfig-a016
x86_64randconfig-a005
i386  randconfig-a006

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp


Re: [PATCH v5 2/5] powerpc: mm: Implement p{m,u,4}d_leaf on all platforms

2022-12-06 Thread Rohan McLure
Great job spotting this. Somehow I lost these throughout the revisions. Thanks.

> On 7 Dec 2022, at 9:24 am, Michael Ellerman  wrote:
> 
> Rohan McLure  writes:
>> The check that a higher-level entry in multi-level pages contains a page
>> translation entry (pte) is performed by p{m,u,4}d_leaf stubs, which may
>> be specialised for each choice of mmu. In a prior commit, we replaced
>> uses of the catch-all stubs p{m,u,4}d_is_leaf with p{m,u,4}d_leaf.
>> 
>> Replace the catch-all stub definitions for p{m,u,4}d_is_leaf with
>> definitions for p{m,u,4}d_leaf. A future patch will assume that
>> p{m,u,4}d_leaf is defined on all platforms.
>> 
>> In particular, implement pud_leaf for Book3E-64, pmd_leaf for all Book3E
>> and Book3S-64 platforms, with a catch-all definition for p4d_leaf.
>> 
>> Signed-off-by: Rohan McLure 
>> ---
>> v5: Split patch that replaces p{m,u,4}d_is_leaf into two patches, first
>> replacing callsites and afterward providing generic definition.
>> Remove ifndef-defines implementing p{m,u}d_leaf in favour of
>> implementing stubs in headers belonging to the particular platforms
>> needing them.
>> ---
>> arch/powerpc/include/asm/book3s/32/pgtable.h |  4 
>> arch/powerpc/include/asm/book3s/64/pgtable.h |  8 ++-
>> arch/powerpc/include/asm/nohash/64/pgtable.h |  5 +
>> arch/powerpc/include/asm/nohash/pgtable.h|  5 +
>> arch/powerpc/include/asm/pgtable.h   | 22 ++--
>> 5 files changed, 18 insertions(+), 26 deletions(-)
> 
> I needed the delta below to prevent the generic versions being defined
> and overriding our versions.
> 
> cheers
> 
> diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
> b/arch/powerpc/include/asm/book3s/32/pgtable.h
> index 44703c8c590c..117135be8cc2 100644
> --- a/arch/powerpc/include/asm/book3s/32/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
> @@ -244,6 +244,7 @@ static inline void pmd_clear(pmd_t *pmdp)
> *pmdp = __pmd(0);
> }
> 
> +#define pmd_leaf pmd_leaf
> static inline bool pmd_leaf(pmd_t pmd)
> {
> return false;
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
> b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 436632d04304..f00aa2d203c2 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -1438,11 +1438,13 @@ static inline bool is_pte_rw_upgrade(unsigned long 
> old_val, unsigned long new_va
> /*
>  * Like pmd_huge() and pmd_large(), but works regardless of config options
>  */
> +#define pmd_leaf pmd_leaf
> static inline bool pmd_leaf(pmd_t pmd)
> {
> return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
> }
> 
> +#define pud_leaf pud_leaf
> static inline bool pud_leaf(pud_t pud)
> {
> return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
> diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
> b/arch/powerpc/include/asm/nohash/64/pgtable.h
> index 2488da8f0deb..d88b22c753d3 100644
> --- a/arch/powerpc/include/asm/nohash/64/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
> @@ -147,6 +147,7 @@ static inline void pud_clear(pud_t *pudp)
> *pudp = __pud(0);
> }
> 
> +#define pud_leaf pud_leaf
> static inline bool pud_leaf(pud_t pud)
> {
> return false;
> diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
> b/arch/powerpc/include/asm/nohash/pgtable.h
> index 487804f5b1d1..dfae1dbb9c3b 100644
> --- a/arch/powerpc/include/asm/nohash/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/pgtable.h
> @@ -60,6 +60,7 @@ static inline bool pte_hw_valid(pte_t pte)
> return pte_val(pte) & _PAGE_PRESENT;
> }
> 
> +#define pmd_leaf pmd_leaf
> static inline bool pmd_leaf(pmd_t pmd)
> {
> return false;
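
For context, the reason those #define markers are needed: the generic header
only provides a fallback when the name is not already defined as a macro,
roughly like this (simplified sketch of the pattern in
include/linux/pgtable.h):

/* Without "#define pmd_leaf pmd_leaf" in the arch header, this generic
 * fallback wins and the arch's inline function is never used. */
#ifndef pmd_leaf
#define pmd_leaf(x)	0
#endif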




Re: [PATCH] powerpc/ftrace: fix syscall tracing on PPC64_ELF_ABI_V1

2022-12-06 Thread Christophe Leroy


On 06/12/2022 at 15:38, Mathieu Desnoyers wrote:
> On 2022-12-05 17:50, Michael Ellerman wrote:
>> Michael Jeanson  writes:
>>> On 2022-12-05 15:11, Michael Jeanson wrote:
>>> Michael Jeanson  writes:
 In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
 PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
 changing from their dot prefixed variant to the non-prefixed ones.

 Since ftrace prefixes a dot to the syscall names when matching 
 them to
 build its syscall event list, this resulted in no syscall events 
 being
 available.

 Remove the PPC64_ELF_ABI_V1 specific version of
 arch_syscall_match_sym_name to have the same behavior across all 
 powerpc
 variants.
>>>
>>> This doesn't seem to work for me.
>>>
>>> Even with it applied I still don't see anything in
>>> /sys/kernel/debug/tracing/events/syscalls
>>>
>>> Did we break it in some other way recently?
>>>
>>> cheers
>>>
>>> I did some further testing; my config also enabled KALLSYMS_ALL, and
>>> when I remove it there are indeed no syscall events.
>>
>> Aha, OK that explains it I guess.
>>
>> I was using ppc64_guest_defconfig which has ABI_V1 and FTRACE_SYSCALLS,
>> but does not have KALLSYMS_ALL. So I guess there's some other bug
>> lurking in there.
> 
> I don't have the setup handy to validate it, but I suspect it is caused 
> by the way scripts/kallsyms.c:symbol_valid() checks whether a symbol 
> entry needs to be integrated into the assembler output when 
> --all-symbols is not specified. It only keeps symbols whose addresses 
> are in the text range. On PPC64_ELF_ABI_V1, this means only the 
> dot-prefixed symbols will be kept (those point to the function entry), 
> leaving out the non-dot-prefixed symbols (those point to the function 
> descriptors).
> 
> So I see two possible solutions there: either we ensure that 
> FTRACE_SYSCALLS selects KALLSYMS_ALL on PPC64_ELF_ABI_V1, or we modify 
> scripts/kallsyms.c:symbol_valid() to also include function descriptor 
> symbols. This would mean accepting symbols pointing into the .opd ELF 
> section.
> 
> IMHO the second option would be better because it does not increase the 
> kernel image size as much as KALLSYMS_ALL.
> 

Yes, seems to be the best solution.

Maybe the only thing to do is to add a new range to text_ranges, 
something like (untested):

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 03fa07ad45d9..decf31c497f5 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -64,6 +64,7 @@ static unsigned long long relative_base;
  static struct addr_range text_ranges[] = {
{ "_stext", "_etext" },
{ "_sinittext", "_einittext" },
+   { "__start_opd", "__end_opd" },
  };
  #define text_range_text (&text_ranges[0])
  #define text_range_inittext (&text_ranges[1])

---
Christophe


Re: [PATCH v5 5/5] powerpc: mm: support page table check

2022-12-06 Thread Michael Ellerman
Rohan McLure  writes:
> On creation and clearing of a page table mapping, instrument such calls
> by invoking page_table_check_pte_set and page_table_check_pte_clear
> respectively. These calls serve as a sanity check against illegal
> mappings.
>
> Enable ARCH_SUPPORTS_PAGE_TABLE_CHECK for all ppc64, and 32-bit
> platforms implementing Book3S.
>
> Change pud_pfn to be a runtime bug rather than a build bug as it is
> consumed by page_table_check_pud_{clear,set} which are not called.
>
> See also:
>
> riscv support in commit 3fee229a8eb9 ("riscv/mm: enable
> ARCH_SUPPORTS_PAGE_TABLE_CHECK")
> arm64 in commit 42b2547137f5 ("arm64/mm: enable
> ARCH_SUPPORTS_PAGE_TABLE_CHECK")
> x86_64 in commit d283d422c6c4 ("x86: mm: add x86_64 support for page table
> check")
>
> Reviewed-by: Russell Currey 
> Reviewed-by: Christophe Leroy 
> Signed-off-by: Rohan McLure 

This blows up for me when checking is enabled. This is a qemu pseries
KVM guest on a P9 host, booting Fedora 34. I haven't dug into what is
wrong yet.

cheers


[0.600480][   T63] [ cut here ]
[0.600546][   T63] kernel BUG at mm/page_table_check.c:115!
[0.600596][   T63] Oops: Exception in kernel mode, sig: 5 [#1]
[0.600645][   T63] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[0.600703][   T63] Modules linked in:
[0.600736][   T63] CPU: 0 PID: 63 Comm: systemd-bless-b Not tainted 
6.1.0-rc2-00178-gf0c0e10f5162 #65
[0.600803][   T63] Hardware name: IBM pSeries (emulated by qemu) POWER9 
(raw) 0x4e1202 0xf05 of:SLOF,git-5b4c5a hv:linux,kvm pSeries
[0.600885][   T63] NIP:  c04a7d58 LR: c04a7d74 CTR: 

[0.600942][   T63] REGS: c67635e0 TRAP: 0700   Not tainted  
(6.1.0-rc2-00178-gf0c0e10f5162)
[0.601008][   T63] MSR:  80029033   CR: 
44424420  XER: 0003
[0.601088][   T63] CFAR: c04a7d4c IRQMASK: 0
[0.601088][   T63] GPR00: c04a7d74 c6763880 
c1332500 c3a00408
[0.601088][   T63] GPR04: 0001 0001 
8603402f00c0 
[0.601088][   T63] GPR08: 05e0 0001 
c2702500 2000
[0.601088][   T63] GPR12: 7fffbc81 c2a2 
4000 4000
[0.601088][   T63] GPR16: 4000  
 4000
[0.601088][   T63] GPR20: ff7fefbf  
c3555a00 0001
[0.601088][   T63] GPR24: c28dda60 c00c00020ac0 
c82a1100 c00c000bd000
[0.601088][   T63] GPR28: 0001 c26fcf60 
0001 c3a00400
[0.601614][   T63] NIP [c04a7d58] 
page_table_check_set.part.0+0xc8/0x170
[0.601675][   T63] LR [c04a7d74] 
page_table_check_set.part.0+0xe4/0x170
[0.601734][   T63] Call Trace:
[0.601764][   T63] [c6763880] [c67638c0] 0xc67638c0 
(unreliable)
[0.601825][   T63] [c67638c0] [c0087c28] 
set_pte_at+0x68/0x210
[0.601884][   T63] [c6763910] [c0483fd8] 
__split_huge_pmd+0x7f8/0x11c0
[0.601947][   T63] [c6763a20] [c0485908] 
vma_adjust_trans_huge+0x158/0x2d0
[0.602006][   T63] [c6763a70] [c040b5dc] 
__vma_adjust+0x13c/0xbe0
[0.602067][   T63] [c6763b80] [c040d708] 
__split_vma+0x158/0x270
[0.602128][   T63] [c6763bd0] [c040d938] 
do_mas_align_munmap.constprop.0+0x118/0x610
[0.602196][   T63] [c6763cd0] [c0416228] 
sys_mremap+0x3c8/0x850
[0.602255][   T63] [c6763e10] [c002fab8] 
system_call_exception+0x128/0x330
[0.602314][   T63] [c6763e50] [c000d05c] 
system_call_vectored_common+0x15c/0x2ec
[0.602384][   T63] --- interrupt: 3000 at 0x7fffbd94f86c
[0.602425][   T63] NIP:  7fffbd94f86c LR:  CTR: 

[0.602481][   T63] REGS: c6763e80 TRAP: 3000   Not tainted  
(6.1.0-rc2-00178-gf0c0e10f5162)
[0.602546][   T63] MSR:  8280f033 
  CR: 44004422  XER: 
[0.602635][   T63] IRQMASK: 0
[0.602635][   T63] GPR00: 00a3 72c4e670 
7fffbda37000 7fffbc40
[0.602635][   T63] GPR04: 0041 0001 
0001 
[0.602635][   T63] GPR08: 7fffbc41  
 
[0.602635][   T63] GPR12:  7fffbc937d50 
 
[0.602635][   T63] GPR16:   
 
[0.602635][   T63] GPR20: 046b 003f 
72c4e7e8 0040
[0.602635][   T63] GPR24:   
0480 0041
[0.602635][   T63] GPR28: 7fffbc40  
0001 

Re: [PATCH v5 2/5] powerpc: mm: Implement p{m,u,4}d_leaf on all platforms

2022-12-06 Thread Michael Ellerman
Rohan McLure  writes:
> The check that a higher-level entry in multi-level pages contains a page
> translation entry (pte) is performed by p{m,u,4}d_leaf stubs, which may
> be specialised for each choice of mmu. In a prior commit, we replaced
> uses of the catch-all stubs p{m,u,4}d_is_leaf with p{m,u,4}d_leaf.
>
> Replace the catch-all stub definitions for p{m,u,4}d_is_leaf with
> definitions for p{m,u,4}d_leaf. A future patch will assume that
> p{m,u,4}d_leaf is defined on all platforms.
>
> In particular, implement pud_leaf for Book3E-64, pmd_leaf for all Book3E
> and Book3S-64 platforms, with a catch-all definition for p4d_leaf.
>
> Signed-off-by: Rohan McLure 
> ---
> v5: Split patch that replaces p{m,u,4}d_is_leaf into two patches, first
> replacing callsites and afterward providing generic definition.
> Remove ifndef-defines implementing p{m,u}d_leaf in favour of
> implementing stubs in headers belonging to the particular platforms
> needing them.
> ---
>  arch/powerpc/include/asm/book3s/32/pgtable.h |  4 
>  arch/powerpc/include/asm/book3s/64/pgtable.h |  8 ++-
>  arch/powerpc/include/asm/nohash/64/pgtable.h |  5 +
>  arch/powerpc/include/asm/nohash/pgtable.h|  5 +
>  arch/powerpc/include/asm/pgtable.h   | 22 ++--
>  5 files changed, 18 insertions(+), 26 deletions(-)

I needed the delta below to prevent the generic versions being defined
and overriding our versions.

cheers

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 44703c8c590c..117135be8cc2 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -244,6 +244,7 @@ static inline void pmd_clear(pmd_t *pmdp)
*pmdp = __pmd(0);
 }
 
+#define pmd_leaf pmd_leaf
 static inline bool pmd_leaf(pmd_t pmd)
 {
return false;
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 436632d04304..f00aa2d203c2 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1438,11 +1438,13 @@ static inline bool is_pte_rw_upgrade(unsigned long 
old_val, unsigned long new_va
 /*
  * Like pmd_huge() and pmd_large(), but works regardless of config options
  */
+#define pmd_leaf pmd_leaf
 static inline bool pmd_leaf(pmd_t pmd)
 {
return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
 }
 
+#define pud_leaf pud_leaf
 static inline bool pud_leaf(pud_t pud)
 {
return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 2488da8f0deb..d88b22c753d3 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -147,6 +147,7 @@ static inline void pud_clear(pud_t *pudp)
*pudp = __pud(0);
 }
 
+#define pud_leaf pud_leaf
 static inline bool pud_leaf(pud_t pud)
 {
return false;
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 487804f5b1d1..dfae1dbb9c3b 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -60,6 +60,7 @@ static inline bool pte_hw_valid(pte_t pte)
return pte_val(pte) & _PAGE_PRESENT;
 }
 
+#define pmd_leaf pmd_leaf
 static inline bool pmd_leaf(pmd_t pmd)
 {
return false;



Re: [PATCH v3 4/9] scsi: lpfc: Change to use pci_aer_clear_uncorrect_error_status()

2022-12-06 Thread Bjorn Helgaas
[moved James, Dick, LPFC supporters to "to"]

On Wed, Sep 28, 2022 at 06:59:41PM +0800, Zhuo Chen wrote:
> lpfc_aer_cleanup_state() requires clearing both fatal and non-fatal
> uncorrectable error status.

I don't know what the point of lpfc_aer_cleanup_state() is.  AER
errors should be handled and cleared by the PCI core, not by
individual drivers.  Only lpfc, liquidio, and sky2 touch
PCI_ERR_UNCOR_STATUS.

But lpfc_aer_cleanup_state() is visible in the
"lpfc_aer_state_cleanup" sysfs file, so removing it would break any
userspace that uses it.

If we can rely on the PCI core to clean up AER errors itself
(admittedly, that might be a big "if"), maybe lpfc_aer_cleanup_state()
could just become a no-op?

Any comment from the LPFC folks?

Ideally, I would rather not export pci_aer_clear_nonfatal_status() or
pci_aer_clear_uncorrect_error_status() outside the PCI core at all.

> But using pci_aer_clear_nonfatal_status()
> will only clear non-fatal error status. To clear both fatal and
> non-fatal error status, use pci_aer_clear_uncorrect_error_status().
> 
> Signed-off-by: Zhuo Chen 
> ---
>  drivers/scsi/lpfc/lpfc_attr.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
> index 09cf2cd0ae60..d835cc0ba153 100644
> --- a/drivers/scsi/lpfc/lpfc_attr.c
> +++ b/drivers/scsi/lpfc/lpfc_attr.c
> @@ -4689,7 +4689,7 @@ static DEVICE_ATTR_RW(lpfc_aer_support);
>   * Description:
>   * If the @buf contains 1 and the device currently has the AER support
>   * enabled, then invokes the kernel AER helper routine
> - * pci_aer_clear_nonfatal_status() to clean up the uncorrectable
> + * pci_aer_clear_uncorrect_error_status() to clean up the uncorrectable
>   * error status register.
>   *
>   * Notes:
> @@ -4715,7 +4715,7 @@ lpfc_aer_cleanup_state(struct device *dev, struct 
> device_attribute *attr,
>   return -EINVAL;
>  
>   if (phba->hba_flag & HBA_AER_ENABLED)
> - rc = pci_aer_clear_nonfatal_status(phba->pcidev);
> + rc = pci_aer_clear_uncorrect_error_status(phba->pcidev);
>  
>   if (rc == 0)
>   return strlen(buf);
> -- 
> 2.30.1 (Apple Git-130)
> 
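
For context, the semantic difference between the two helpers boils down to
the severity mask; a simplified sketch, not the exact drivers/pci/pcie/aer.c
code:

#include <linux/pci.h>

/* Sketch of pci_aer_clear_nonfatal_status(): clear only the bits whose
 * severity is configured as non-fatal. */
static void sketch_clear_nonfatal(struct pci_dev *dev, int aer)
{
	u32 status, sev;

	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, &status);
	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_SEVER, &sev);
	status &= ~sev;		/* drop bits marked fatal */
	if (status)
		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
}

/* Sketch of pci_aer_clear_uncorrect_error_status(): clear fatal and
 * non-fatal uncorrectable errors alike. */
static void sketch_clear_uncorrect(struct pci_dev *dev, int aer)
{
	u32 status;

	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, &status);
	pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
}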


[PATCH mm-unstable RFC 00/26] mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs

2022-12-06 Thread David Hildenbrand
This is the follow-up on [1]:
[PATCH v2 0/8] mm: COW fixes part 3: reliable GUP R/W FOLL_GET of
anonymous pages

After we implemented __HAVE_ARCH_PTE_SWP_EXCLUSIVE on most prominent
enterprise architectures, implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all
remaining architectures that support swap PTEs.

This makes sure that exclusive anonymous pages will stay exclusive, even
after they were swapped out -- for example, making GUP R/W FOLL_GET of
anonymous pages reliable. Details can be found in [1].

This primarily fixes remaining known O_DIRECT memory corruptions that can
happen on concurrent swapout, whereby we can lose DMA reads to a page
(modifying the user page by writing to it).

To verify, there are two test cases (requiring swap space, obviously):
(1) The O_DIRECT+swapout test case [2] from Andrea. This test case tries
triggering a race condition.
(2) My vmsplice() test case [3] that tries to detect if the exclusive
marker was lost during swapout, not relying on a race condition.


For example, on 32bit x86 (with and without PAE), my test case fails
without these patches:
$ ./test_swp_exclusive
FAIL: page was replaced during COW
But succeeds with these patches:
$ ./test_swp_exclusive 
PASS: page was not replaced during COW


Why implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE for all architectures, even
the ones where swap support might be in a questionable state? This is the
first step towards removing "readable_exclusive" migration entries, and
instead using pte_swp_exclusive() also with (readable) migration entries
instead (as suggested by Peter). The only missing piece for that is
supporting pmd_swp_exclusive() on relevant architectures with THP
migration support.

As all relevant architectures now implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE,
we can drop __HAVE_ARCH_PTE_SWP_EXCLUSIVE in the last patch.


RFC because some of the swap PTE layouts are really tricky and I really
need some feedback related to deciphering these layouts and "using yet
unused PTE bits in swap PTEs". I tried cross-compiling all relevant setups
(phew, I might only miss some power/nohash variants), but only tested on
x86 so far.
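
To make the exclusive-marker idea concrete, here is a tiny userspace model
of a 32-bit swap PTE with an "E" bit squeezed between type and offset (all
bit positions are hypothetical, purely illustrative):

#include <assert.h>
#include <stdint.h>

#define SWP_TYPE_BITS    5
#define SWP_TYPE_SHIFT   1	/* type in bits 1-5 (hypothetical) */
#define SWP_OFFSET_SHIFT 11	/* offset in bits 11+ (hypothetical) */
#define SWP_EXCLUSIVE    (1u << 7)	/* the "E" bit (hypothetical) */

static uint32_t swp_entry(uint32_t type, uint32_t offset)
{
	return ((type & ((1u << SWP_TYPE_BITS) - 1)) << SWP_TYPE_SHIFT) |
	       (offset << SWP_OFFSET_SHIFT);
}

int main(void)
{
	uint32_t pte = swp_entry(3, 0x1234) | SWP_EXCLUSIVE;

	assert(pte & SWP_EXCLUSIVE);			/* marker survives */
	assert(((pte >> SWP_TYPE_SHIFT) & 0x1f) == 3);	/* type intact */
	assert((pte >> SWP_OFFSET_SHIFT) == 0x1234);	/* offset intact */
	return 0;
}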

CCing arch maintainers only on this cover letter and on the respective
patch(es).


[1] https://lkml.kernel.org/r/20220329164329.208407-1-da...@redhat.com
[2] 
https://gitlab.com/aarcange/kernel-testcases-for-v5.11/-/blob/main/page_count_do_wp_page-swap.c
[3] 
https://gitlab.com/davidhildenbrand/scratchspace/-/blob/main/test_swp_exclusive.c

David Hildenbrand (26):
  mm/debug_vm_pgtable: more pte_swp_exclusive() sanity checks
  alpha/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  arc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  csky/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  hexagon/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  ia64/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  loongarch/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  m68k/mm: remove dummy __swp definitions for nommu
  m68k/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  microblaze/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  mips/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  nios2/mm: refactor swap PTE layout
  nios2/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  openrisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s
  powerpc/nohash/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  riscv/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  sh/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit
  sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 64bit
  um/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  x86/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also on 32bit
  xtensa/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVE

 arch/alpha/include/asm/pgtable.h  | 40 -
 arch/arc/include/asm/pgtable-bits-arcv2.h | 26 +-
 arch/arm/include/asm/pgtable-2level.h |  3 +
 arch/arm/include/asm/pgtable-3level.h |  3 +
 arch/arm/include/asm/pgtable.h| 34 ++--
 arch/arm64/include/asm/pgtable.h  |  1 -
 arch/csky/abiv1/inc/abi/pgtable-bits.h| 13 ++-
 arch/csky/abiv2/inc/abi/pgtable-bits.h| 19 ++--
 arch/csky/include/asm/pgtable.h   | 17 
 arch/hexagon/include/asm/pgtable.h| 36 ++--
 arch/ia64/include/asm/pgtable.h   | 31 ++-
 arch/loongarch/include/asm/pgtable-bits.h |  4 +
 arch/loongarch/include/asm/pgtable.h  | 38 +++-
 arch/m68k/include/asm/mcf_pgtable.h   | 35 +++-
 arch/m68k/include/asm/motorola_pgtable.h  | 37 +++-
 arch/m68k/include/asm/pgtable_no.h|  6 --
 arch/m68k/include/asm/sun3_pgtable.h  | 38 +++-
 arch/microblaze/include/asm/pgtable.h | 44 +++---
 

Re: [PATCH v3 8/9] PCI/ERR: Clear fatal error status when pci_channel_io_frozen

2022-12-06 Thread Bjorn Helgaas
Hi Zhuo,

On Wed, Sep 28, 2022 at 06:59:45PM +0800, Zhuo Chen wrote:
> When state is pci_channel_io_frozen in pcie_do_recovery(), the
> severity is fatal and fatal error status should be cleared.
> So add pci_aer_clear_fatal_status().
> 
> Signed-off-by: Zhuo Chen 
> ---
>  drivers/pci/pcie/err.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index f80b21244ef1..b46f1d36c090 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -241,7 +241,10 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
>   pci_walk_bridge(bridge, report_resume, &status);
>  
>   pcie_clear_device_status(dev);
> - pci_aer_clear_nonfatal_status(dev);
> + if (state == pci_channel_io_frozen)
> + pci_aer_clear_fatal_status(dev);
> + else
> + pci_aer_clear_nonfatal_status(dev);

I'm confused.  It seems like we certainly need to clear fatal errors
after they occur *somewhere*, and if we don't, surely this would be a
very obvious issue.  But you didn't mention this being a bug fix, so I
assume it's more of a cleanup.

If it *is* a bug fix, please say that and give a hint about what the
bug looks like, e.g., what sort of messages a user might see.

If it's not a bug fix, I don't understand how AER fatal errors get
cleared today.  The PCI_ERR_UNCOR_STATUS bits are sticky, so they're
not cleared by a reset.  In the current tree, these are the only
places I see that clear AER fatal errors:

  pci_init_capabilities
pci_aer_init # once at device enumeration
  pci_aer_clear_status
pci_aer_raw_clear_status
  pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status)

  aer_probe
aer_enable_rootport  # once at Root Port enumeration
  pci_write_config_dword(pdev, aer + PCI_ERR_UNCOR_STATUS, reg32)

  dpc_process_error  # after DPC triggered
pci_aer_clear_fatal_status
  pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status)

  edr_handle_event   # after EDR event
pci_aer_raw_clear_status
  pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status)

  pci_restore_state  # after reset or PM sleep/resume
pci_aer_clear_status
  pci_aer_raw_clear_status
pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status)

The only one that could clear errors after an AER error (not DPC or
EDR), would be the pci_restore_state() in the reset path.  If the
current code relies on that, I'd say that's a pretty non-obvious
dependency.

>   pci_info(bridge, "device recovery successful\n");
>   return status;
> -- 
> 2.30.1 (Apple Git-130)
> 


Re: [PATCH v3 3/9] NTB: Remove pci_aer_clear_nonfatal_status() call

2022-12-06 Thread Serge Semin
Hi Bjorn

On Tue, Dec 06, 2022 at 12:09:56PM -0600, Bjorn Helgaas wrote:
> On Wed, Sep 28, 2022 at 02:03:55PM +0300, Serge Semin wrote:
> > On Wed, Sep 28, 2022 at 06:59:40PM +0800, Zhuo Chen wrote:
> > > There is no need to clear error status during init code, so remove it.
> > 
> > Why do you think there isn't? Justify in more details.
> 
> Thanks for taking a look, Sergey!  I agree we should leave it or add
> the rationale here.
> 
> > > Signed-off-by: Zhuo Chen 
> > > ---
> > >  drivers/ntb/hw/idt/ntb_hw_idt.c | 2 --
> > >  1 file changed, 2 deletions(-)
> > > 
> > > diff --git a/drivers/ntb/hw/idt/ntb_hw_idt.c 
> > > b/drivers/ntb/hw/idt/ntb_hw_idt.c
> > > index 0ed6f809ff2e..fed03217289d 100644
> > > --- a/drivers/ntb/hw/idt/ntb_hw_idt.c
> > > +++ b/drivers/ntb/hw/idt/ntb_hw_idt.c
> > > @@ -2657,8 +2657,6 @@ static int idt_init_pci(struct idt_ntb_dev *ndev)
> > >   ret = pci_enable_pcie_error_reporting(pdev);
> > >   if (ret != 0)
> > >   dev_warn(&pdev->dev, "PCIe AER capability disabled\n");
> > > - else /* Cleanup nonfatal error status before getting to init */
> > > - pci_aer_clear_nonfatal_status(pdev);
> 
> I do think drivers should not need to clear errors; I think the PCI
> core should be responsible for that.
> 
> And I think the core *does* do that in this path:
> 
>   pci_init_capabilities
> pci_aer_init
>   pci_aer_clear_status
> pci_aer_raw_clear_status
>   pci_write_config_dword(pdev, aer + PCI_ERR_COR_STATUS)
>   pci_write_config_dword(pdev, aer + PCI_ERR_UNCOR_STATUS)
> 
> pci_aer_clear_nonfatal_status() clears only non-fatal uncorrectable
> errors, while pci_aer_init() clears all correctable and all
> uncorrectable errors, so the PCI core is already doing more than
> idt_init_pci() does.
> 
> So I think this change is good because it removes some work from the
> driver, but let me know if you think otherwise.

It's hard to remember all the details now, but IIRC back when this
driver was developed the "Unsupported Request" flag was left uncleared
on our platform even after probe completed. Most likely an erroneous
TLP was generated by some action performed at the device probe stage.
The forced cleanup of the AER status solved that problem. On the other
hand, the problem of having the UnsupReq+ flag set was solved some time
after the driver was merged into the kernel (it was caused by a
vendor-specific behavior of the IDT PCIe switch placed on the path
between a RP and the PCIe NTB). So since the original reason for
calling pci_aer_clear_nonfatal_status() here was platform-specific and
has since been fixed, and the AER flags cleanup is done by the core, I
have no reason to be against the patch. It would be good to add your
clarification to the commit message though.

Reviewed-by: Serge Semin 

-Serge(y)

> 
> > >  
> > >   /* First enable the PCI device */
> > >   ret = pcim_enable_device(pdev);
> > > -- 
> > > 2.30.1 (Apple Git-130)
> > > 


Re: [PATCH 05/11] ARM: dts: socfpga: Fix pca9548 i2c-mux node name

2022-12-06 Thread Dinh Nguyen




On 12/2/22 10:49, Geert Uytterhoeven wrote:

"make dtbs_check":

 arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dtb: i2cswitch@70: 
$nodename:0: 'i2cswitch@70' does not match '^(i2c-?)?mux'
From schema: 
Documentation/devicetree/bindings/i2c/i2c-mux-pca954x.yaml
 arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dtb: i2cswitch@70: 
Unevaluated properties are not allowed ('#address-cells', '#size-cells', 
'i2c@0', 'i2c@1', 'i2c@2', 'i2c@3', 'i2c@4', 'i2c@5', 'i2c@6', 'i2c@7' were 
unexpected)
 From schema: Documentation/devicetree/bindings/i2c/i2c-mux-pca954x.yaml

Fix this by renaming the PCA9548 node to "i2c-mux", to match the I2C bus
multiplexer/switch DT bindings and the Generic Names Recommendation in
the Devicetree Specification.

Signed-off-by: Geert Uytterhoeven 
---
  arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts 
b/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
index f24f17c2f5ee6bc4..e0630b0eed036d35 100644
--- a/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
+++ b/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
@@ -141,7 +141,7 @@ at24@50 {
reg = <0x50>;
};
  
-	i2cswitch@70 {
+	i2c-mux@70 {
compatible = "nxp,pca9548";
#address-cells = <1>;
#size-cells = <0>;


Applied!

Thanks,
Dinh


Re: [PATCH v3 3/9] NTB: Remove pci_aer_clear_nonfatal_status() call

2022-12-06 Thread Bjorn Helgaas
On Wed, Sep 28, 2022 at 02:03:55PM +0300, Serge Semin wrote:
> On Wed, Sep 28, 2022 at 06:59:40PM +0800, Zhuo Chen wrote:
> > There is no need to clear error status during init code, so remove it.
> 
> Why do you think there isn't? Justify in more details.

Thanks for taking a look, Sergey!  I agree we should leave it or add
the rationale here.

> > Signed-off-by: Zhuo Chen 
> > ---
> >  drivers/ntb/hw/idt/ntb_hw_idt.c | 2 --
> >  1 file changed, 2 deletions(-)
> > 
> > diff --git a/drivers/ntb/hw/idt/ntb_hw_idt.c 
> > b/drivers/ntb/hw/idt/ntb_hw_idt.c
> > index 0ed6f809ff2e..fed03217289d 100644
> > --- a/drivers/ntb/hw/idt/ntb_hw_idt.c
> > +++ b/drivers/ntb/hw/idt/ntb_hw_idt.c
> > @@ -2657,8 +2657,6 @@ static int idt_init_pci(struct idt_ntb_dev *ndev)
> > ret = pci_enable_pcie_error_reporting(pdev);
> > if (ret != 0)
> > dev_warn(&pdev->dev, "PCIe AER capability disabled\n");
> > -   else /* Cleanup nonfatal error status before getting to init */
> > -   pci_aer_clear_nonfatal_status(pdev);

I do think drivers should not need to clear errors; I think the PCI
core should be responsible for that.

And I think the core *does* do that in this path:

  pci_init_capabilities
pci_aer_init
  pci_aer_clear_status
pci_aer_raw_clear_status
  pci_write_config_dword(pdev, aer + PCI_ERR_COR_STATUS)
  pci_write_config_dword(pdev, aer + PCI_ERR_UNCOR_STATUS)

pci_aer_clear_nonfatal_status() clears only non-fatal uncorrectable
errors, while pci_aer_init() clears all correctable and all
uncorrectable errors, so the PCI core is already doing more than
idt_init_pci() does.

So I think this change is good because it removes some work from the
driver, but let me know if you think otherwise.

> >  
> > /* First enable the PCI device */
> > ret = pcim_enable_device(pdev);
> > -- 
> > 2.30.1 (Apple Git-130)
> > 


[linux-next:master] BUILD REGRESSION 5d562c48a21eeb029a8fd3f18e1b31fd83660474

2022-12-06 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: 5d562c48a21eeb029a8fd3f18e1b31fd83660474  Add linux-next specific 
files for 20221206

Error/Warning reports:

https://lore.kernel.org/oe-kbuild-all/202211231857.0dmueoa1-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211242120.mzzvguln-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211290656.vhedfthu-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211301840.y7rrob13-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202212020520.0okmino3-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202212032205.iehbbyyp-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202212060700.njmecjxs-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202212061249.u0basqzk-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202212061341.gnalcbx6-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202212061455.6ge7y0jg-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202212061633.u9qhpe62-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202212061758.tlpqnuof-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202212061918.w5iupcya-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202212062250.tr0othcz-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

Error: failed to load BTF from vmlinux: No such file or directory
arch/loongarch/kernel/asm-offsets.c:262:6: warning: no previous prototype for 
'output_pbe_defines' [-Wmissing-prototypes]
arch/loongarch/power/hibernate.c:14:6: warning: no previous prototype for 
'save_processor_state' [-Wmissing-prototypes]
arch/loongarch/power/hibernate.c:26:6: warning: no previous prototype for 
'restore_processor_state' [-Wmissing-prototypes]
arch/loongarch/power/hibernate.c:38:5: warning: no previous prototype for 
'pfn_is_nosave' [-Wmissing-prototypes]
arch/loongarch/power/hibernate.c:48:5: warning: no previous prototype for 
'swsusp_arch_suspend' [-Wmissing-prototypes]
arch/loongarch/power/hibernate.c:56:5: warning: no previous prototype for 
'swsusp_arch_resume' [-Wmissing-prototypes]
arch/powerpc/kernel/kvm_emul.o: warning: objtool: kvm_template_end(): can't 
find starting instruction
arch/powerpc/kernel/optprobes_head.o: warning: objtool: 
optprobe_template_end(): can't find starting instruction
arch/riscv/kernel/crash_core.c:12:57: warning: format specifies type 'unsigned 
long' but the argument has type 'int' [-Wformat]
arch/riscv/kernel/crash_core.c:14:57: error: use of undeclared identifier 
'VMEMMAP_START'
arch/riscv/kernel/crash_core.c:15:55: error: use of undeclared identifier 
'VMEMMAP_END'; did you mean 'MEMREMAP_ENC'?
arch/riscv/kernel/crash_core.c:17:57: error: use of undeclared identifier 
'MODULES_VADDR'
arch/riscv/kernel/crash_core.c:18:55: error: use of undeclared identifier 
'MODULES_END'
arch/riscv/kernel/crash_core.c:8:20: error: use of undeclared identifier 
'VA_BITS'
clang-16: error: no such file or directory: 'liburandom_read.so'
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn201/irq_service_dcn201.c:40:20: 
warning: no previous prototype for 'to_dal_irq_source_dcn201' 
[-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c:353:5: warning: no previous 
prototype for 'amdgpu_mcbp_scan' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c:373:5: warning: no previous 
prototype for 'amdgpu_mcbp_trigger_preempt' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c:451:1: warning: no previous 
prototype for 'gf100_fifo_nonstall_block' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c:451:1: warning: no previous 
prototype for function 'gf100_fifo_nonstall_block' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/runl.c:34:1: warning: no previous 
prototype for 'nvkm_engn_cgrp_get' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/runl.c:34:1: warning: no previous 
prototype for function 'nvkm_engn_cgrp_get' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c:210:1: warning: no previous 
prototype for 'tu102_gr_load' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c:210:1: warning: no previous 
prototype for function 'tu102_gr_load' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/nvfw/acr.c:49:1: warning: no previous prototype 
for 'wpr_generic_header_dump' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/nvfw/acr.c:49:1: warning: no previous prototype 
for function 'wpr_generic_header_dump' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c:221:21: warning: variable 'loc' 
set but not used [-Wunused-but-set-variable]
drivers/media/i2c/tc358746.c:816:13: warning: 'm_best' is used uninitialized 
[-Wuninitialized]
drivers/media/i2c/tc358746.c:817:13: warning: 'p_best' is used uninitialized 
[-Wuninitialized]
drivers/media/platform/renesas/rzg2l-cru/rzg2l-csi2.c:445:7: warning: variable 
'ret' is used uninitialized whenever 'if' condition is true

[PATCH mm-unstable RFC 26/26] mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Supported by all architectures that support swp PTEs, so let's drop it.

Signed-off-by: David Hildenbrand 
---
 arch/alpha/include/asm/pgtable.h |  1 -
 arch/arc/include/asm/pgtable-bits-arcv2.h|  1 -
 arch/arm/include/asm/pgtable.h   |  1 -
 arch/arm64/include/asm/pgtable.h |  1 -
 arch/csky/include/asm/pgtable.h  |  1 -
 arch/hexagon/include/asm/pgtable.h   |  1 -
 arch/ia64/include/asm/pgtable.h  |  1 -
 arch/loongarch/include/asm/pgtable.h |  1 -
 arch/m68k/include/asm/mcf_pgtable.h  |  1 -
 arch/m68k/include/asm/motorola_pgtable.h |  1 -
 arch/m68k/include/asm/sun3_pgtable.h |  1 -
 arch/microblaze/include/asm/pgtable.h|  1 -
 arch/mips/include/asm/pgtable.h  |  1 -
 arch/nios2/include/asm/pgtable.h |  1 -
 arch/openrisc/include/asm/pgtable.h  |  1 -
 arch/parisc/include/asm/pgtable.h|  1 -
 arch/powerpc/include/asm/book3s/32/pgtable.h |  1 -
 arch/powerpc/include/asm/book3s/64/pgtable.h |  1 -
 arch/powerpc/include/asm/nohash/pgtable.h|  1 -
 arch/riscv/include/asm/pgtable.h |  1 -
 arch/s390/include/asm/pgtable.h  |  1 -
 arch/sh/include/asm/pgtable_32.h |  1 -
 arch/sparc/include/asm/pgtable_32.h  |  1 -
 arch/sparc/include/asm/pgtable_64.h  |  1 -
 arch/um/include/asm/pgtable.h|  1 -
 arch/x86/include/asm/pgtable.h   |  1 -
 arch/xtensa/include/asm/pgtable.h|  1 -
 include/linux/pgtable.h  | 29 
 mm/debug_vm_pgtable.c|  2 --
 mm/memory.c  |  4 ---
 mm/rmap.c| 11 
 31 files changed, 73 deletions(-)

diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h
index 970abf511b13..ba43cb841d19 100644
--- a/arch/alpha/include/asm/pgtable.h
+++ b/arch/alpha/include/asm/pgtable.h
@@ -328,7 +328,6 @@ extern inline pte_t mk_swap_pte(unsigned long type, 
unsigned long offset)
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
-#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
 static inline int pte_swp_exclusive(pte_t pte)
 {
return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
diff --git a/arch/arc/include/asm/pgtable-bits-arcv2.h 
b/arch/arc/include/asm/pgtable-bits-arcv2.h
index 611f412713b9..6e9f8ca6d6a1 100644
--- a/arch/arc/include/asm/pgtable-bits-arcv2.h
+++ b/arch/arc/include/asm/pgtable-bits-arcv2.h
@@ -132,7 +132,6 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned 
long address,
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
-#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
 static inline int pte_swp_exclusive(pte_t pte)
 {
return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 5e0446a9c667..d6dec218a1fe 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -296,7 +296,6 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(swp)__pte((swp).val)
 
-#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
 static inline int pte_swp_exclusive(pte_t pte)
 {
return pte_isset(pte, L_PTE_SWP_EXCLUSIVE);
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 4873c1d6e7d0..58e44aed2000 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -417,7 +417,6 @@ static inline pgprot_t mk_pmd_sect_prot(pgprot_t prot)
return __pgprot((pgprot_val(prot) & ~PMD_TABLE_BIT) | PMD_TYPE_SECT);
 }
 
-#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
 static inline pte_t pte_swp_mkexclusive(pte_t pte)
 {
return set_pte_bit(pte, __pgprot(PTE_SWP_EXCLUSIVE));
diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h
index 574c97b9ecca..d4042495febc 100644
--- a/arch/csky/include/asm/pgtable.h
+++ b/arch/csky/include/asm/pgtable.h
@@ -200,7 +200,6 @@ static inline pte_t pte_mkyoung(pte_t pte)
return pte;
 }
 
-#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
 static inline int pte_swp_exclusive(pte_t pte)
 {
return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
diff --git a/arch/hexagon/include/asm/pgtable.h 
b/arch/hexagon/include/asm/pgtable.h
index 7eb008e477c8..59393613d086 100644
--- a/arch/hexagon/include/asm/pgtable.h
+++ b/arch/hexagon/include/asm/pgtable.h
@@ -397,7 +397,6 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
(((type & 0x1f) << 1) | \
 ((offset & 0x38) << 10) | ((offset & 0x7) << 7)) })
 
-#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
 static inline int pte_swp_exclusive(pte_t pte)
 {
return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;

[PATCH mm-unstable RFC 24/26] x86/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also on 32bit

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE just like we already do on
x86-64. After deciphering the PTE layout it becomes clear that there are
still unused bits for 2-level and 3-level page tables that we should be
able to use. Reusing a bit avoids stealing one bit from the swap offset.

While at it, mask the type in __swp_entry(); use some helper definitions
to make the macros easier to grasp.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: "H. Peter Anvin" 
Signed-off-by: David Hildenbrand 
---
 arch/x86/include/asm/pgtable-2level.h | 26 +-
 arch/x86/include/asm/pgtable-3level.h | 26 +++---
 arch/x86/include/asm/pgtable.h|  2 --
 3 files changed, 44 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/pgtable-2level.h 
b/arch/x86/include/asm/pgtable-2level.h
index 60d0f9015317..e9482a11ac52 100644
--- a/arch/x86/include/asm/pgtable-2level.h
+++ b/arch/x86/include/asm/pgtable-2level.h
@@ -80,21 +80,37 @@ static inline unsigned long pte_bitop(unsigned long value, 
unsigned int rightshi
return ((value >> rightshift) & mask) << leftshift;
 }
 
-/* Encode and de-code a swap entry */
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <----------------- offset ------------------> 0 E <- type --> 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ */
 #define SWP_TYPE_BITS 5
+#define _SWP_TYPE_MASK ((1U << SWP_TYPE_BITS) - 1)
+#define _SWP_TYPE_SHIFT (_PAGE_BIT_PRESENT + 1)
 #define SWP_OFFSET_SHIFT (_PAGE_BIT_PROTNONE + 1)
 
-#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS)
+#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > 5)
 
-#define __swp_type(x)  (((x).val >> (_PAGE_BIT_PRESENT + 1)) \
-& ((1U << SWP_TYPE_BITS) - 1))
+#define __swp_type(x)  (((x).val >> _SWP_TYPE_SHIFT) \
+& _SWP_TYPE_MASK)
 #define __swp_offset(x)((x).val >> SWP_OFFSET_SHIFT)
 #define __swp_entry(type, offset)  ((swp_entry_t) { \
-((type) << (_PAGE_BIT_PRESENT + 1)) \
+(((type) & _SWP_TYPE_MASK) << 
_SWP_TYPE_SHIFT) \
 | ((offset) << SWP_OFFSET_SHIFT) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { (pte).pte_low 
})
 #define __swp_entry_to_pte(x)  ((pte_t) { .pte = (x).val })
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE	_PAGE_PSE
+
 /* No inverted PFNs on 2 level page tables */
 
 static inline u64 protnone_mask(u64 val)
diff --git a/arch/x86/include/asm/pgtable-3level.h 
b/arch/x86/include/asm/pgtable-3level.h
index 28421a887209..2b87f965dd86 100644
--- a/arch/x86/include/asm/pgtable-3level.h
+++ b/arch/x86/include/asm/pgtable-3level.h
@@ -248,8 +248,24 @@ static inline pud_t native_pudp_get_and_clear(pud_t *pudp)
 #define native_pudp_get_and_clear(xp) native_local_pudp_get_and_clear(xp)
 #endif
 
-/* Encode and de-code a swap entry */
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+  * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   < type -> <---------------------- offset ----------------------
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   --------------------------------------------> 0 E 0 0 0 0 0 0 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ */
 #define SWP_TYPE_BITS  5
+#define _SWP_TYPE_MASK ((1U << SWP_TYPE_BITS) - 1)
 
 #define SWP_OFFSET_FIRST_BIT   (_PAGE_BIT_PROTNONE + 1)
 
@@ -257,9 +273,10 @@ static inline pud_t native_pudp_get_and_clear(pud_t *pudp)
 #define SWP_OFFSET_SHIFT   (SWP_OFFSET_FIRST_BIT + SWP_TYPE_BITS)
 
 #define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS)
-#define __swp_type(x)  (((x).val) & ((1UL << SWP_TYPE_BITS) - 
1))
+#define __swp_type(x)  (((x).val) & _SWP_TYPE_MASK)
 #define __swp_offset(x)((x).val >> SWP_TYPE_BITS)
-#define __swp_entry(type, offset)  ((swp_entry_t){(type) | (offset) << 
SWP_TYPE_BITS})
+#define __swp_entry(type, offset)  ((swp_entry_t){((type) & 
_SWP_TYPE_MASK) \
+   | (offset) << SWP_TYPE_BITS})
 
 /*
  * Normally, __swp_entry() converts from arch-independent swp_entry_t to
@@ -287,6 +304,9 @@ static 

[PATCH mm-unstable RFC 25/26] xtensa/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 1. This
bit should be safe to use for our usecase.

Most importantly, we can still distinguish swap PTEs from PAGE_NONE PTEs
(see pte_present()) and don't use one of the two reserved attribute
masks (1101 and 1111). Attribute masks 1100 and 1110 now identify swap PTEs.

While at it, remove SWP_TYPE_BITS (not really helpful as it's not used in
the actual swap macros) and mask the type in __swp_entry().

Cc: Chris Zankel 
Cc: Max Filippov 
Signed-off-by: David Hildenbrand 
---
 arch/xtensa/include/asm/pgtable.h | 32 ++-
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/arch/xtensa/include/asm/pgtable.h 
b/arch/xtensa/include/asm/pgtable.h
index 5b5484d707b2..1025e2dc292b 100644
--- a/arch/xtensa/include/asm/pgtable.h
+++ b/arch/xtensa/include/asm/pgtable.h
@@ -96,7 +96,7 @@
  * +- - - - - - - - - - - - - - - - - - - - -+
  *   (PAGE_NONE)|PPN| 0 | 00 | ADW | 01 | 11 | 11 |
  * +-+
- *   swap  | index |   type   | 01 | 11 | 00 |
+ *   swap  | index |   type   | 01 | 11 | e0 |
  * +-+
  *
  * For T1050 hardware and earlier the layout differs for present and 
(PAGE_NONE)
@@ -112,6 +112,7 @@
  *   RI ring (0=privileged, 1=user, 2 and 3 are unused)
  *   CAcache attribute: 00 bypass, 01 writeback, 10 
writethrough
  * (11 is invalid and used to mark pages that are not present)
+ *   e exclusive marker in swap PTEs
  *   w page is writable (hw)
  *   x page is executable (hw)
  *   index  swap offset / PAGE_SIZE (bit 11-31: 21 bits -> 8 GB)
@@ -158,6 +159,9 @@
#define _PAGE_DIRTY	(1<<7)	/* software: page dirty */
 #define _PAGE_ACCESSED (1<<8)  /* software: page accessed (read) */
 
+/* We borrow bit 1 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE	(1<<1)
+
 #ifdef CONFIG_MMU
 
#define _PAGE_CHG_MASK	(PAGE_MASK | _PAGE_ACCESSED | _PAGE_DIRTY)
@@ -343,19 +347,37 @@ ptep_set_wrprotect(struct mm_struct *mm, unsigned long 
addr, pte_t *ptep)
 }
 
 /*
- * Encode and decode a swap and file entry.
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
  */
-#define SWP_TYPE_BITS  5
-#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS)
+#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > 5)
 
 #define __swp_type(entry)  (((entry).val >> 6) & 0x1f)
 #define __swp_offset(entry)((entry).val >> 11)
 #define __swp_entry(type,offs) \
-   ((swp_entry_t){((type) << 6) | ((offs) << 11) | \
+   ((swp_entry_t){(((type) & 0x1f) << 6) | ((offs) << 11) | \
 _PAGE_CA_INVALID | _PAGE_USER})
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 #endif /*  !defined (__ASSEMBLY__) */
 
 
-- 
2.38.1



[PATCH mm-unstable RFC 23/26] um/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 10, which is
yet unused for swap PTEs.

The pte_mkuptodate() is a bit weird in __pte_to_swp_entry() for a swap PTE
... but it only messes with bit 1 and 2 and there is a comment in
set_pte(), so leave these bits alone.

While at it, mask the type in __swp_entry().

Cc: Richard Weinberger 
Cc: Anton Ivanov 
Cc: Johannes Berg 
Signed-off-by: David Hildenbrand 
---
 arch/um/include/asm/pgtable.h | 37 +--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/um/include/asm/pgtable.h b/arch/um/include/asm/pgtable.h
index 4e3052f2671a..cedc5fd451ce 100644
--- a/arch/um/include/asm/pgtable.h
+++ b/arch/um/include/asm/pgtable.h
@@ -21,6 +21,9 @@
 #define _PAGE_PROTNONE 0x010   /* if the user mapped it with PROT_NONE;
   pte_present gives true */
 
+/* We borrow bit 10 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE	0x400
+
 #ifdef CONFIG_3_LEVEL_PGTABLES
 #include 
 #else
@@ -288,16 +291,46 @@ extern pte_t *virt_to_pte(struct mm_struct *mm, unsigned 
long addr);
 
 #define update_mmu_cache(vma,address,ptep) do {} while (0)
 
-/* Encode and de-code a swap entry */
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <--------------- offset ----------------> E < type -> 0 0 0 1 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   _PAGE_NEWPAGE (bit 1) is always set to 1 in set_pte().
+ */
 #define __swp_type(x)  (((x).val >> 5) & 0x1f)
 #define __swp_offset(x)((x).val >> 11)
 
 #define __swp_entry(type, offset) \
-   ((swp_entry_t) { ((type) << 5) | ((offset) << 11) })
+   ((swp_entry_t) { (((type) & 0x1f) << 5) | ((offset) << 11) })
 #define __pte_to_swp_entry(pte) \
((swp_entry_t) { pte_val(pte_mkuptodate(pte)) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_get_bits(pte, _PAGE_SWP_EXCLUSIVE);
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_set_bits(pte, _PAGE_SWP_EXCLUSIVE);
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_clear_bits(pte, _PAGE_SWP_EXCLUSIVE);
+   return pte;
+}
+
 /* Clear a kernel PTE and flush it from the TLB */
 #define kpte_clear_flush(ptep, vaddr)  \
 do {   \
-- 
2.38.1



[PATCH mm-unstable RFC 22/26] sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 64bit

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit was effectively unused.

While at it, mask the type in __swp_entry().

Cc: "David S. Miller" 
Signed-off-by: David Hildenbrand 
---
 arch/sparc/include/asm/pgtable_64.h | 38 ++---
 1 file changed, 35 insertions(+), 3 deletions(-)

diff --git a/arch/sparc/include/asm/pgtable_64.h 
b/arch/sparc/include/asm/pgtable_64.h
index 3bc9736bddb1..614fdedbb145 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -187,6 +187,9 @@ bool kern_addr_valid(unsigned long addr);
 #define _PAGE_SZHUGE_4U_PAGE_SZ4MB_4U
 #define _PAGE_SZHUGE_4V_PAGE_SZ4MB_4V
 
+/* We borrow bit 20 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE  _AC(0x0000000000100000,UL)
+
 #ifndef __ASSEMBLY__
 
 pte_t mk_pte_io(unsigned long, pgprot_t, int, unsigned long);
@@ -961,18 +964,47 @@ void pgtable_trans_huge_deposit(struct mm_struct *mm, 
pmd_t *pmdp,
 pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
 #endif
 
-/* Encode and de-code a swap entry */
-#define __swp_type(entry)  (((entry).val >> PAGE_SHIFT) & 0xffUL)
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   <--- offset ---
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   > E <-- type ---> <--- zeroes >
+ */
+#define __swp_type(entry)  (((entry).val >> PAGE_SHIFT) & 0x7fUL)
 #define __swp_offset(entry)((entry).val >> (PAGE_SHIFT + 8UL))
 #define __swp_entry(type, offset)  \
( (swp_entry_t) \
  { \
-   (((long)(type) << PAGE_SHIFT) | \
+   long)(type) & 0x7fUL) << PAGE_SHIFT) | \
  ((long)(offset) << (PAGE_SHIFT + 8UL))) \
  } )
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) | _PAGE_SWP_EXCLUSIVE);
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) & ~_PAGE_SWP_EXCLUSIVE);
+}
+
 int page_in_phys_avail(unsigned long paddr);
 
 /*
-- 
2.38.1
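
[Editor's sketch: the point of the new 0x7f mask is that a type value with
bit 7 set would otherwise spill into the exclusive marker. A user-space
sketch, assuming PAGE_SHIFT = 13 (8 KiB pages) as on sparc64; not kernel
code.]

#include <assert.h>

#define PAGE_SHIFT 13
#define E_BIT      (1UL << 20)  /* _PAGE_SWP_EXCLUSIVE */

static unsigned long swp_entry(unsigned long type, unsigned long off)
{
        return ((type & 0x7fUL) << PAGE_SHIFT) | (off << (PAGE_SHIFT + 8));
}

int main(void)
{
        /* Unmasked, type 0x80 << 13 would land exactly on bit 20 (E). */
        assert(!(swp_entry(0x80, 0) & E_BIT));
        assert((swp_entry(5, 7) >> (PAGE_SHIFT + 8)) == 7);
        return 0;
}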



[PATCH mm-unstable RFC 21/26] sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by reusing the SRMMU_DIRTY
bit as that seems to be safe to reuse inside a swap PTE. This avoids
having to steal one bit from the swap offset.

While at it, relocate the swap PTE layout documentation and use the same
style now used for most other archs. Note that the old documentation was
wrong: we use 20 bits for the offset, and there are 8 reserved low bits,
not 7 as the old ascii art claimed.

Cc: "David S. Miller" 
Signed-off-by: David Hildenbrand 
---
 arch/sparc/include/asm/pgtable_32.h | 27 ++-
 arch/sparc/include/asm/pgtsrmmu.h   | 14 +++---
 2 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/arch/sparc/include/asm/pgtable_32.h 
b/arch/sparc/include/asm/pgtable_32.h
index 5acc05b572e6..abf7a2601209 100644
--- a/arch/sparc/include/asm/pgtable_32.h
+++ b/arch/sparc/include/asm/pgtable_32.h
@@ -323,7 +323,16 @@ void srmmu_mapiorange(unsigned int bus, unsigned long xpa,
   unsigned long xva, unsigned int len);
 void srmmu_unmapiorange(unsigned long virt_addr, unsigned int len);
 
-/* Encode and de-code a swap entry */
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <-- offset ---> < type -> E 0 0 0 0 0 0
+ */
 static inline unsigned long __swp_type(swp_entry_t entry)
 {
return (entry.val >> SRMMU_SWP_TYPE_SHIFT) & SRMMU_SWP_TYPE_MASK;
@@ -344,6 +353,22 @@ static inline swp_entry_t __swp_entry(unsigned long type, 
unsigned long offset)
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & SRMMU_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) | SRMMU_SWP_EXCLUSIVE);
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) & ~SRMMU_SWP_EXCLUSIVE);
+}
+
 static inline unsigned long
 __get_phys (unsigned long addr)
 {
diff --git a/arch/sparc/include/asm/pgtsrmmu.h 
b/arch/sparc/include/asm/pgtsrmmu.h
index 6067925972d9..18e68d43f036 100644
--- a/arch/sparc/include/asm/pgtsrmmu.h
+++ b/arch/sparc/include/asm/pgtsrmmu.h
@@ -53,21 +53,13 @@
 
 #define SRMMU_CHG_MASK(0xff00 | SRMMU_REF | SRMMU_DIRTY)
 
-/* SRMMU swap entry encoding
- *
- * We use 5 bits for the type and 19 for the offset.  This gives us
- * 32 swapfiles of 4GB each.  Encoding looks like:
- *
- * ooot
- * fedcba9876543210fedcba9876543210
- *
- * The bottom 7 bits are reserved for protection and status bits, especially
- * PRESENT.
- */
+/* SRMMU swap entry encoding */
 #define SRMMU_SWP_TYPE_MASK0x1f
 #define SRMMU_SWP_TYPE_SHIFT   7
 #define SRMMU_SWP_OFF_MASK 0xfffff
 #define SRMMU_SWP_OFF_SHIFT(SRMMU_SWP_TYPE_SHIFT + 5)
+/* We borrow bit 6 to store the exclusive marker in swap PTEs. */
+#define SRMMU_SWP_EXCLUSIVESRMMU_DIRTY
 
 /* Some day I will implement true fine grained access bits for
  * user pages because the SRMMU gives us the capabilities to
-- 
2.38.1
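
[Editor's sketch: a short user-space check of the SRMMU layout above,
assuming SRMMU_DIRTY is 0x40 (bit 6) as the new comment states; not
kernel code. Because the marker sits below the type field, toggling it
can never corrupt type or offset.]

#include <assert.h>

#define TYPE_MASK  0x1fUL
#define TYPE_SHIFT 7
#define OFF_MASK   0xfffffUL
#define OFF_SHIFT  (TYPE_SHIFT + 5)
#define E_BIT      0x40UL       /* SRMMU_SWP_EXCLUSIVE == SRMMU_DIRTY */

int main(void)
{
        unsigned long v = ((3UL & TYPE_MASK) << TYPE_SHIFT) |
                          ((99UL & OFF_MASK) << OFF_SHIFT);

        v |= E_BIT;             /* pte_swp_mkexclusive() analogue */
        assert(((v >> TYPE_SHIFT) & TYPE_MASK) == 3);
        assert(((v >> OFF_SHIFT) & OFF_MASK) == 99);
        return 0;
}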



[PATCH mm-unstable RFC 20/26] sh/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 6 in the PTE,
reducing the swap type in the !CONFIG_X2TLB case to 5 bits. Generic MM
currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the
stolen bit is effectively unused.

Interestingly, the swap type in the !CONFIG_X2TLB case could currently
overlap with the _PAGE_PRESENT bit, because there is a sneaky shift by 1 in
__pte_to_swp_entry() and __swp_entry_to_pte(). Bit 0-7 in the architecture
specific swap PTE would get shifted to bit 1-8 in the PTE. As generic MM
uses 5 bits only, this didn't matter so far.

While at it, mask the type in __swp_entry().

Cc: Yoshinori Sato 
Cc: Rich Felker 
Signed-off-by: David Hildenbrand 
---
 arch/sh/include/asm/pgtable_32.h | 54 +---
 1 file changed, 42 insertions(+), 12 deletions(-)

diff --git a/arch/sh/include/asm/pgtable_32.h b/arch/sh/include/asm/pgtable_32.h
index d0240decacca..090940aadbcc 100644
--- a/arch/sh/include/asm/pgtable_32.h
+++ b/arch/sh/include/asm/pgtable_32.h
@@ -423,40 +423,70 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 #endif
 
 /*
- * Encode and de-code a swap entry
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
  *
  * Constraints:
  * _PAGE_PRESENT at bit 8
  * _PAGE_PROTNONE at bit 9
  *
- * For the normal case, we encode the swap type into bits 0:7 and the
- * swap offset into bits 10:30. For the 64-bit PTE case, we keep the
- * preserved bits in the low 32-bits and use the upper 32 as the swap
- * offset (along with a 5-bit type), following the same approach as x86
- * PAE. This keeps the logic quite simple.
+ * For the normal case, we encode the swap type and offset into the swap PTE
+ * such that bits 8 and 9 stay zero. For the 64-bit PTE case, we use the
+ * upper 32 for the swap offset and swap type, following the same approach as
+ * x86 PAE. This keeps the logic quite simple.
  *
  * As is evident by the Alpha code, if we ever get a 64-bit unsigned
  * long (swp_entry_t) to match up with the 64-bit PTEs, this all becomes
  * much cleaner..
- *
- * NOTE: We should set ZEROs at the position of _PAGE_PRESENT
- *   and _PAGE_PROTNONE bits
  */
+
 #ifdef CONFIG_X2TLB
+/*
+ * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   <- offset --> < type ->
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <--- zeroes > E 0 0 0 0 0 0
+ */
 #define __swp_type(x)  ((x).val & 0x1f)
 #define __swp_offset(x)((x).val >> 5)
-#define __swp_entry(type, offset)  ((swp_entry_t){ (type) | (offset) << 5})
+#define __swp_entry(type, offset)  ((swp_entry_t){ ((type) & 0x1f) | 
(offset) << 5})
 #define __pte_to_swp_entry(pte)((swp_entry_t){ (pte).pte_high 
})
 #define __swp_entry_to_pte(x)  ((pte_t){ 0, (x).val })
 
 #else
-#define __swp_type(x)  ((x).val & 0xff)
+/*
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <--- offset > 0 0 0 0 E < type -> 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ */
+#define __swp_type(x)  ((x).val & 0x1f)
 #define __swp_offset(x)((x).val >> 10)
-#define __swp_entry(type, offset)  ((swp_entry_t){(type) | (offset) <<10})
+#define __swp_entry(type, offset)  ((swp_entry_t){((type) & 0x1f) | 
(offset) <<10})
 
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) 
>> 1 })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val << 1 })
 #endif
 
+/* In both cases, we borrow bit 6 to store the exclusive marker in swap PTEs. 
*/
+#define _PAGE_SWP_EXCLUSIVE_PAGE_USER
+
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte.pte_low & _PAGE_SWP_EXCLUSIVE;
+}
+
+PTE_BIT_FUNC(low, swp_mkexclusive, |= _PAGE_SWP_EXCLUSIVE);
+PTE_BIT_FUNC(low, swp_clear_exclusive, &= ~_PAGE_SWP_EXCLUSIVE);
+
 #endif /* __ASSEMBLY__ */
 #endif /* __ASM_SH_PGTABLE_32_H */
-- 
2.38.1
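
[Editor's sketch: the sneaky shift-by-1 mentioned above can be illustrated
in user space. The sketch assumes _PAGE_USER is 0x040 (bit 6) and
_PAGE_PRESENT is 0x100 (bit 8); it mirrors the !CONFIG_X2TLB macros and
is not kernel code.]

#include <assert.h>

#define _PAGE_USER    0x040UL   /* reused as _PAGE_SWP_EXCLUSIVE */
#define _PAGE_PRESENT 0x100UL

static unsigned long swp_entry(unsigned long type, unsigned long off)
{
        return (type & 0x1f) | (off << 10);
}

int main(void)
{
        /* __swp_entry_to_pte() shifts left by one ... */
        unsigned long pte = swp_entry(9, 1234) << 1;

        pte |= _PAGE_USER;                 /* mark exclusive, PTE bit 6 */
        assert(!(pte & _PAGE_PRESENT));    /* present stays clear */
        /* ... and __pte_to_swp_entry() shifts back. */
        assert(((pte >> 1) & 0x1f) == 9);
        assert((pte >> 1 >> 10) == 1234);
        return 0;
}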



[PATCH mm-unstable RFC 19/26] riscv/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the offset. This reduces the maximum swap space per file: on 32bit
to 16 GiB (was 32 GiB).

Note that this bit does not conflict with swap PMDs and could also be used
in swap PMD context later.

While at it, mask the type in __swp_entry().

Cc: Paul Walmsley 
Cc: Palmer Dabbelt 
Cc: Albert Ou 
Signed-off-by: David Hildenbrand 
---
 arch/riscv/include/asm/pgtable-bits.h |  3 +++
 arch/riscv/include/asm/pgtable.h  | 29 ++-
 2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/arch/riscv/include/asm/pgtable-bits.h 
b/arch/riscv/include/asm/pgtable-bits.h
index b9e13a8fe2b7..f896708e8331 100644
--- a/arch/riscv/include/asm/pgtable-bits.h
+++ b/arch/riscv/include/asm/pgtable-bits.h
@@ -27,6 +27,9 @@
  */
 #define _PAGE_PROT_NONE _PAGE_GLOBAL
 
+/* Used for swap PTEs only. */
+#define _PAGE_SWP_EXCLUSIVE _PAGE_ACCESSED
+
 #define _PAGE_PFN_SHIFT 10
 
 /*
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 7ee3ac315c7c..9730f9fed197 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -721,16 +721,18 @@ static inline pmd_t pmdp_establish(struct vm_area_struct 
*vma,
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 /*
- * Encode and decode a swap entry
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
  *
  * Format of swap PTE:
  * bit0:   _PAGE_PRESENT (zero)
  * bit   1 to 3:   _PAGE_LEAF (zero)
  * bit5:   _PAGE_PROT_NONE (zero)
- * bits  6 to 10:  swap type
- * bits 10 to XLEN-1:  swap offset
+ * bit6:   exclusive marker
+ * bits  7 to 11:  swap type
+ * bits 11 to XLEN-1:  swap offset
  */
-#define __SWP_TYPE_SHIFT   6
+#define __SWP_TYPE_SHIFT   7
 #define __SWP_TYPE_BITS5
 #define __SWP_TYPE_MASK((1UL << __SWP_TYPE_BITS) - 1)
 #define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT)
@@ -741,11 +743,28 @@ static inline pmd_t pmdp_establish(struct vm_area_struct 
*vma,
 #define __swp_type(x)  (((x).val >> __SWP_TYPE_SHIFT) & __SWP_TYPE_MASK)
 #define __swp_offset(x)((x).val >> __SWP_OFFSET_SHIFT)
 #define __swp_entry(type, offset) ((swp_entry_t) \
-   { ((type) << __SWP_TYPE_SHIFT) | ((offset) << __SWP_OFFSET_SHIFT) })
+   { (((type) & __SWP_TYPE_MASK) << __SWP_TYPE_SHIFT) | \
+ ((offset) << __SWP_OFFSET_SHIFT) })
 
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) | _PAGE_SWP_EXCLUSIVE);
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) & ~_PAGE_SWP_EXCLUSIVE);
+}
+
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 #define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val(pmd) })
 #define __swp_entry_to_pmd(swp) __pmd((swp).val)
-- 
2.38.1
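
[Editor's sketch: a user-space check of the shifted layout; constants
mirror the patch, with _PAGE_ACCESSED assumed to be bit 6. Moving the
type field from bit 6 to bit 7 is exactly what frees that bit for the
marker. Not kernel code.]

#include <assert.h>

#define TYPE_SHIFT 7
#define TYPE_MASK  ((1UL << 5) - 1)
#define OFF_SHIFT  (TYPE_SHIFT + 5)
#define E_BIT      (1UL << 6)   /* _PAGE_SWP_EXCLUSIVE (_PAGE_ACCESSED) */

int main(void)
{
        unsigned long e = ((31UL & TYPE_MASK) << TYPE_SHIFT) |
                          (4096UL << OFF_SHIFT);

        assert(!(e & E_BIT));   /* even the max type stays clear of E */
        e |= E_BIT;
        assert(((e >> TYPE_SHIFT) & TYPE_MASK) == 31);
        assert((e >> OFF_SHIFT) == 4096);
        return 0;
}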



[PATCH mm-unstable RFC 18/26] powerpc/nohash/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit and 64bit.

On 64bit, let's use MSB 56 (LSB 7), located right next to the page type.
On 32bit, let's use LSB 2 to avoid stealing one bit from the swap offset.

There seems to be no real reason why these bits cannot be used for swap
PTEs. The important part is that _PAGE_PRESENT and _PAGE_HASHPTE remain
0.

While at it, mask the type in __swp_entry() and remove
_PAGE_BIT_SWAP_TYPE from pte-e500.h: while it was used in 64bit code it was
ignored in 32bit code.

Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Signed-off-by: David Hildenbrand 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h  | 22 +
 arch/powerpc/include/asm/nohash/32/pte-40x.h  |  6 ++---
 arch/powerpc/include/asm/nohash/32/pte-44x.h  | 18 --
 arch/powerpc/include/asm/nohash/32/pte-85xx.h |  4 ++--
 arch/powerpc/include/asm/nohash/64/pgtable.h  | 24 ---
 arch/powerpc/include/asm/nohash/pgtable.h | 16 +
 arch/powerpc/include/asm/nohash/pte-e500.h|  1 -
 7 files changed, 63 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 0d40b33184eb..1bb3698e6628 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -354,18 +354,30 @@ static inline int pte_young(pte_t pte)
 #endif
 
 #define pmd_page(pmd)  pfn_to_page(pmd_pfn(pmd))
+
 /*
- * Encode and decode a swap entry.
- * Note that the bits we use in a PTE for representing a swap entry
- * must not include the _PAGE_PRESENT bit.
- *   -- paulus
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs (32bit PTEs):
+ *
+ * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
+ *   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ *   <-- offset ---> < type -> E 0 0
+ *
+ * E is the exclusive marker that is not stored in swap entries.
+ *
+ * For 64bit PTEs, the offset is extended by 32bit.
  */
 #define __swp_type(entry)  ((entry).val & 0x1f)
 #define __swp_offset(entry)((entry).val >> 5)
-#define __swp_entry(type, offset)  ((swp_entry_t) { (type) | ((offset) << 
5) })
+#define __swp_entry(type, offset)  ((swp_entry_t) { ((type) & 0x1f) | 
((offset) << 5) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) 
>> 3 })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val << 3 })
 
+/* We borrow LSB 2 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE0x04
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_POWERPC_NOHASH_32_PGTABLE_H */
diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h 
b/arch/powerpc/include/asm/nohash/32/pte-40x.h
index 2d3153cfc0d7..6fe46e754556 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-40x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h
@@ -27,9 +27,9 @@
  *   of the 16 available.  Bit 24-26 of the TLB are cleared in the TLB
  *   miss handler.  Bit 27 is PAGE_USER, thus selecting the correct
  *   zone.
- * - PRESENT *must* be in the bottom two bits because swap cache
- *   entries use the top 30 bits.  Because 40x doesn't support SMP
- *   anyway, M is irrelevant so we borrow it for PAGE_PRESENT.  Bit 30
+ * - PRESENT *must* be in the bottom two bits because swap PTEs
+ *   use the top 30 bits.  Because 40x doesn't support SMP anyway, M is
+ *   irrelevant so we borrow it for PAGE_PRESENT.  Bit 30
  *   is cleared in the TLB miss handler before the TLB entry is loaded.
  * - All other bits of the PTE are loaded into TLBLO without
  *   modification, leaving us only the bits 20, 21, 24, 25, 26, 30 for
diff --git a/arch/powerpc/include/asm/nohash/32/pte-44x.h 
b/arch/powerpc/include/asm/nohash/32/pte-44x.h
index 78bc304f750e..b7ed13cee137 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-44x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-44x.h
@@ -56,20 +56,10 @@
  * above bits.  Note that the bit values are CPU specific, not architecture
  * specific.
  *
- * The kernel PTE entry holds an arch-dependent swp_entry structure under
- * certain situations. In other words, in such situations some portion of
- * the PTE bits are used as a swp_entry. In the PPC implementation, the
- * 3-24th LSB are shared with swp_entry, however the 0-2nd three LSB still
- * hold protection values. That means the three protection bits are
- * reserved for both PTE and SWAP entry at the most significant three
- * LSBs.
- *
- * There are three protection bits available for SWAP entry:
- * _PAGE_PRESENT
- * _PAGE_HASHPTE (if HW has)
- *
- * So those three bits have to be inside of 0-2nd LSB of PTE.
- *
+ * The kernel PTE entry can be an ordinary PTE mapping a page or a special swap
+ * PTE. In case of a swap PTE, LSB 2-24 are used to store information 

[PATCH mm-unstable RFC 17/26] powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s

2022-12-06 Thread David Hildenbrand
We already implemented support for 64bit book3s in commit bff9beaa2e80
("powerpc/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE for book3s")

Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also in 32bit by reusing yet
unused LSB 2 / MSB 29. There seems to be no real reason why that bit cannot
be used, and reusing it avoids having to steal one bit from the swap
offset.

While at it, mask the type in __swp_entry().

Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Signed-off-by: David Hildenbrand 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 38 +---
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 75823f39e042..8107835b38c1 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -42,6 +42,9 @@
 #define _PMD_PRESENT_MASK (PAGE_MASK)
 #define _PMD_BAD   (~PAGE_MASK)
 
+/* We borrow the _PAGE_USER bit to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE_PAGE_USER
+
 /* And here we include common definitions */
 
 #define _PAGE_KERNEL_RO0
@@ -363,17 +366,42 @@ static inline void __ptep_set_access_flags(struct 
vm_area_struct *vma,
 #define pmd_page(pmd)  pfn_to_page(pmd_pfn(pmd))
 
 /*
- * Encode and decode a swap entry.
- * Note that the bits we use in a PTE for representing a swap entry
- * must not include the _PAGE_PRESENT bit or the _PAGE_HASHPTE bit (if used).
- *   -- paulus
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs (32bit PTEs):
+ *
+ * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
+ *   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ *   <-------------- offset --------------> < type -> E H P
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   _PAGE_PRESENT (P) and __PAGE_HASHPTE (H) must be 0.
+ *
+ * For 64bit PTEs, the offset is extended by 32bit.
  */
 #define __swp_type(entry)  ((entry).val & 0x1f)
 #define __swp_offset(entry)((entry).val >> 5)
-#define __swp_entry(type, offset)  ((swp_entry_t) { (type) | ((offset) << 
5) })
+#define __swp_entry(type, offset)  ((swp_entry_t) { ((type) & 0x1f) | 
((offset) << 5) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) 
>> 3 })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val << 3 })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) | _PAGE_SWP_EXCLUSIVE);
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   return __pte(pte_val(pte) & ~_PAGE_SWP_EXCLUSIVE);
+}
+
 /* Generic accessors to PTE bits */
 static inline int pte_write(pte_t pte) { return !!(pte_val(pte) & 
_PAGE_RW);}
 static inline int pte_read(pte_t pte)  { return 1; }
-- 
2.38.1
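
[Editor's sketch: a user-space illustration of the conversion above. The
PTE-to-entry shift by 3 discards exactly the three low bits, which is
where _PAGE_PRESENT (0x001), _PAGE_HASHPTE (0x002) and the reused
_PAGE_USER (0x004) live; the constants are assumptions mirroring
book3s/32, and this is not kernel code.]

#include <assert.h>

#define _PAGE_PRESENT 0x001UL
#define _PAGE_HASHPTE 0x002UL
#define _PAGE_USER    0x004UL   /* reused as _PAGE_SWP_EXCLUSIVE */

int main(void)
{
        unsigned long entry = (12UL & 0x1f) | (777UL << 5);
        unsigned long pte = entry << 3;        /* __swp_entry_to_pte() */

        pte |= _PAGE_USER;                     /* pte_swp_mkexclusive() */
        assert(!(pte & (_PAGE_PRESENT | _PAGE_HASHPTE)));
        assert(((pte >> 3) & 0x1f) == 12);     /* type intact */
        assert((pte >> 3 >> 5) == 777);        /* offset intact */
        return 0;
}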



[PATCH mm-unstable RFC 16/26] parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using the yet-unused
_PAGE_ACCESSED location in the swap PTE. Looking at pte_present()
and pte_none() checks, there seems to be no actual reason why we cannot
use it: we only have to make sure we're not using _PAGE_PRESENT.

Reusing this bit avoids having to steal one bit from the swap offset.

Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
Signed-off-by: David Hildenbrand 
---
 arch/parisc/include/asm/pgtable.h | 41 ---
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/arch/parisc/include/asm/pgtable.h 
b/arch/parisc/include/asm/pgtable.h
index bd09a44cfb2d..75115c8bf888 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -218,6 +218,9 @@ extern void __update_cache(pte_t pte);
 #define _PAGE_KERNEL_RWX   (_PAGE_KERNEL_EXEC | _PAGE_WRITE)
 #define _PAGE_KERNEL   (_PAGE_KERNEL_RO | _PAGE_WRITE)
 
+/* We borrow bit 23 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE_PAGE_ACCESSED
+
 /* The pgd/pmd contains a ptr (in phys addr space); since all pgds/pmds
  * are page-aligned, we don't care about the PAGE_OFFSET bits, except
  * for a few meta-information bits, so we shift the address to be
@@ -394,17 +397,49 @@ extern void paging_init (void);
 
 #define update_mmu_cache(vms,addr,ptep) __update_cache(*ptep)
 
-/* Encode and de-code a swap entry */
-
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs (32bit):
+ *
+ * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
+ *   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ *   < offset -> P E  < type ->
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   _PAGE_PRESENT (P) must be 0.
+ *
+ *   For the 64bit version, the offset is extended by 32bit.
+ */
 #define __swp_type(x) ((x).val & 0x1f)
 #define __swp_offset(x)   ( (((x).val >> 6) &  0x7) | \
  (((x).val >> 8) & ~0x7) )
-#define __swp_entry(type, offset) ((swp_entry_t) { (type) | \
+#define __swp_entry(type, offset) ((swp_entry_t) { \
+   ((type) & 0x1f) | \
((offset &  0x7) << 6) | \
((offset & ~0x7) << 8) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, 
unsigned long addr, pte_t *ptep)
 {
pte_t pte;
-- 
2.38.1
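
[Editor's sketch: the split offset makes this encoding easy to get wrong,
so here is a user-space round-trip check of the macros shown above
(offset bits 0-2 at entry bits 6-8, the remaining bits from bit 11).
A sketch only; not kernel code.]

#include <assert.h>

static unsigned long mk(unsigned long type, unsigned long off)
{
        return (type & 0x1f) | ((off & 0x7) << 6) | ((off & ~0x7UL) << 8);
}

static unsigned long off_of(unsigned long v)
{
        return ((v >> 6) & 0x7) | ((v >> 8) & ~0x7UL);
}

int main(void)
{
        for (unsigned long o = 0; o < 4096; o++) {
                assert(off_of(mk(31, o)) == o);
                assert((mk(31, o) & 0x1f) == 31);
        }
        return 0;
}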



[PATCH mm-unstable RFC 15/26] openrisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused.

While at it, mask the type in __swp_entry().

Cc: Stefan Kristiansson 
Cc: Stafford Horne 
Signed-off-by: David Hildenbrand 
---
 arch/openrisc/include/asm/pgtable.h | 41 +
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/arch/openrisc/include/asm/pgtable.h 
b/arch/openrisc/include/asm/pgtable.h
index 6477c17b3062..903b32d662ab 100644
--- a/arch/openrisc/include/asm/pgtable.h
+++ b/arch/openrisc/include/asm/pgtable.h
@@ -154,6 +154,9 @@ extern void paging_init(void);
 #define _KERNPG_TABLE \
(_PAGE_BASE | _PAGE_SRE | _PAGE_SWE | _PAGE_ACCESSED | _PAGE_DIRTY)
 
+/* We borrow bit 11 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE_PAGE_U_SHARED
+
 #define PAGE_NONE   __pgprot(_PAGE_ALL)
 #define PAGE_READONLY   __pgprot(_PAGE_ALL | _PAGE_URE | _PAGE_SRE)
 #define PAGE_READONLY_X __pgprot(_PAGE_ALL | _PAGE_URE | _PAGE_SRE | 
_PAGE_EXEC)
@@ -385,16 +388,44 @@ static inline void update_mmu_cache(struct vm_area_struct 
*vma,
 
 /* __PHX__ FIXME, SWAP, this probably doesn't work */
 
-/* Encode and de-code a swap entry (must be !pte_none(e) && !pte_present(e)) */
-/* Since the PAGE_PRESENT bit is bit 4, we can use the bits above */
-
-#define __swp_type(x)  (((x).val >> 5) & 0x7f)
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <-- offset ---> E <- type --> 0 0 0 0 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   The zero'ed bits include _PAGE_PRESENT.
+ */
+#define __swp_type(x)  (((x).val >> 5) & 0x3f)
 #define __swp_offset(x)((x).val >> 12)
 #define __swp_entry(type, offset) \
-   ((swp_entry_t) { ((type) << 5) | ((offset) << 12) })
+   ((swp_entry_t) { (((type) & 0x3f) << 5) | ((offset) << 12) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 typedef pte_t *pte_addr_t;
 
 #endif /* __ASSEMBLY__ */
-- 
2.38.1



[PATCH mm-unstable RFC 14/26] nios2/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using the yet-unused bit
31.

Cc: Thomas Bogendoerfer 
Signed-off-by: David Hildenbrand 
---
 arch/nios2/include/asm/pgtable-bits.h |  3 +++
 arch/nios2/include/asm/pgtable.h  | 22 +-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/nios2/include/asm/pgtable-bits.h 
b/arch/nios2/include/asm/pgtable-bits.h
index bfddff383e89..724f9b08b1d1 100644
--- a/arch/nios2/include/asm/pgtable-bits.h
+++ b/arch/nios2/include/asm/pgtable-bits.h
@@ -31,4 +31,7 @@
 #define _PAGE_ACCESSED (1<<26) /* page referenced */
 #define _PAGE_DIRTY(1<<27) /* dirty page */
 
+/* We borrow bit 31 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE(1<<31)
+
 #endif /* _ASM_NIOS2_PGTABLE_BITS_H */
diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h
index d1e5c9eb4643..05999da01731 100644
--- a/arch/nios2/include/asm/pgtable.h
+++ b/arch/nios2/include/asm/pgtable.h
@@ -239,7 +239,9 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
  *
  *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
  *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
- *   0 < type -> 0 0 0 0 0 0 <-- offset --->
+ *   E < type -> 0 0 0 0 0 0 <-- offset --->
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
  *
  * Note that the offset field is always non-zero if the swap type is 0, thus
  * !pte_none() is always true.
@@ -251,6 +253,24 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 #define __swp_entry_to_pte(swp)((pte_t) { (swp).val })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 extern void __init paging_init(void);
 extern void __init mmu_init(void);
 
-- 
2.38.1



[PATCH mm-unstable RFC 13/26] nios2/mm: refactor swap PTE layout

2022-12-06 Thread David Hildenbrand
nios2 disables swap for a good reason: it doesn't even provide
sufficient type bits as required by core MM. However, swap entries are
nowadays also used for other purposes (migration entries,
PTE markers, HWPoison, ...), and accidental use could be problematic.

Let's properly use 5 bits for the swap type and document the layout.
Bits 26--31 should get ignored by hardware completely, so they can be
used.

Cc: Dinh Nguyen 
Signed-off-by: David Hildenbrand 
---
 arch/nios2/include/asm/pgtable.h | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h
index ab793bc517f5..d1e5c9eb4643 100644
--- a/arch/nios2/include/asm/pgtable.h
+++ b/arch/nios2/include/asm/pgtable.h
@@ -232,19 +232,21 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
__FILE__, __LINE__, pgd_val(e))
 
 /*
- * Encode and decode a swap entry (must be !pte_none(pte) && !pte_present(pte):
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
  *
- * 31 30 29 28 27 26 25 24 23 22 21 20 19 18 ...  1  0
- *  0  0  0  0 type.  0  0  0  0  0  0 offset.
+ * Format of swap PTEs:
  *
- * This gives us up to 2**2 = 4 swap files and 2**20 * 4K = 4G per swap file.
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   0 < type -> 0 0 0 0 0 0 <-- offset --->
  *
- * Note that the offset field is always non-zero, thus !pte_none(pte) is always
- * true.
+ * Note that the offset field is always non-zero if the swap type is 0, thus
+ * !pte_none() is always true.
  */
-#define __swp_type(swp)(((swp).val >> 26) & 0x3)
+#define __swp_type(swp)(((swp).val >> 26) & 0x1f)
 #define __swp_offset(swp)  ((swp).val & 0xfffff)
-#define __swp_entry(type, off) ((swp_entry_t) { (((type) & 0x3) << 26) \
+#define __swp_entry(type, off) ((swp_entry_t) { (((type) & 0x1f) << 26) \
 | ((off) & 0xfffff) })
 #define __swp_entry_to_pte(swp)((pte_t) { (swp).val })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
-- 
2.38.1
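
[Editor's sketch: a user-space illustration of why two type bits were not
enough once swap entries started carrying migration/marker types. The
constants mirror the old and new macros above; not kernel code.]

#include <assert.h>

#define OLD_ENTRY(t, o) ((((t) & 0x03UL) << 26) | ((o) & 0xfffffUL))
#define NEW_ENTRY(t, o) ((((t) & 0x1fUL) << 26) | ((o) & 0xfffffUL))
#define NEW_TYPE(v)     (((v) >> 26) & 0x1f)

int main(void)
{
        /* With a 2-bit field, type 4 silently aliased type 0 ... */
        assert(OLD_ENTRY(4, 1) == OLD_ENTRY(0, 1));
        /* ... the 5-bit field keeps all 32 possible types apart. */
        assert(NEW_TYPE(NEW_ENTRY(4, 1)) == 4);
        return 0;
}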



[PATCH mm-unstable RFC 12/26] mips/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE.

On 64bit, steal one bit from the type. Generic MM currently only uses 5
bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively
unused.

On 32bit we're able to locate unused bits. As the PTE layout for 32 bit
is very confusing, document it a bit better.

While at it, mask the type in __swp_entry()/mk_swap_pte().

Cc: Thomas Bogendoerfer 
Signed-off-by: David Hildenbrand 
---
 arch/mips/include/asm/pgtable-32.h | 86 ++
 arch/mips/include/asm/pgtable-64.h | 23 ++--
 arch/mips/include/asm/pgtable.h| 36 +
 3 files changed, 130 insertions(+), 15 deletions(-)

diff --git a/arch/mips/include/asm/pgtable-32.h 
b/arch/mips/include/asm/pgtable-32.h
index b40a0e69fccc..c2a3b899480c 100644
--- a/arch/mips/include/asm/pgtable-32.h
+++ b/arch/mips/include/asm/pgtable-32.h
@@ -191,42 +191,103 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t 
prot)
 
 #define pte_page(x)pfn_to_page(pte_pfn(x))
 
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ */
 #if defined(CONFIG_CPU_R3K_TLB)
 
-/* Swap entries must have VALID bit cleared. */
+/*
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <--- offset > < type -> V G E 0 0 0 0 0 0 P
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   _PAGE_PRESENT (P), _PAGE_VALID (V) and _PAGE_GLOBAL (G) have to remain
+ *   unused.
+ */
 #define __swp_type(x)  (((x).val >> 10) & 0x1f)
 #define __swp_offset(x)((x).val >> 15)
-#define __swp_entry(type,offset)   ((swp_entry_t) { ((type) << 10) | 
((offset) << 15) })
+#define __swp_entry(type,offset)   ((swp_entry_t) { (((type) & 0x1f) << 
10) | ((offset) << 15) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE(1 << 7)
+
 #else
 
 #if defined(CONFIG_XPA)
 
-/* Swap entries must have VALID and GLOBAL bits cleared. */
+/*
+ * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   0 0 0 0 0 0 E P <-- zeroes --->
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <- offset --> < type -> V G 0 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   _PAGE_PRESENT (P), _PAGE_VALID (V) and _PAGE_GLOBAL (G) have to remain
+ *   unused.
+ */
 #define __swp_type(x)  (((x).val >> 4) & 0x1f)
 #define __swp_offset(x) ((x).val >> 9)
-#define __swp_entry(type,offset)   ((swp_entry_t)  { ((type) << 4) | 
((offset) << 9) })
+#define __swp_entry(type,offset)   ((swp_entry_t)  { (((type) & 0x1f) << 
4) | ((offset) << 9) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { (pte).pte_high 
})
 #define __swp_entry_to_pte(x)  ((pte_t) { 0, (x).val })
 
+/*
+ * We borrow bit 57 (bit 25 in the low PTE) to store the exclusive marker in
+ * swap PTEs.
+ */
+#define _PAGE_SWP_EXCLUSIVE(1 << 25)
+
 #elif defined(CONFIG_PHYS_ADDR_T_64BIT) && defined(CONFIG_CPU_MIPS32)
 
-/* Swap entries must have VALID and GLOBAL bits cleared. */
+/*
+ * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   <-- zeroes ---> E P 0 0 0 0 0 0
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <--- offset > < type -> V G
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   _PAGE_PRESENT (P), _PAGE_VALID (V) and _PAGE_GLOBAL (G) have to remain
+ *   unused.
+ */
 #define __swp_type(x)  (((x).val >> 2) & 0x1f)
 #define __swp_offset(x) ((x).val >> 7)
-#define __swp_entry(type, offset)  ((swp_entry_t)  { ((type) << 2) | 
((offset) << 7) })
+#define __swp_entry(type, offset)  ((swp_entry_t)  { (((type) & 0x1f) << 
2) | ((offset) << 7) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { (pte).pte_high 
})
 #define __swp_entry_to_pte(x)  ((pte_t) { 0, (x).val })
 
+/*
+ * We borrow bit 39 (bit 7 in the low PTE) to store the exclusive marker in 
swap
+ * PTEs.
+ */
+#define _PAGE_SWP_EXCLUSIVE(1 << 7)
+
 #else
 /*
- * Constraints:
- *  _PAGE_PRESENT at bit 0
- *  _PAGE_MODIFIED at bit 4
- *  

[PATCH mm-unstable RFC 11/26] microblaze/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused.

The shift by 2 when converting between PTE and arch-specific swap entry
makes the swap PTE layout a little bit harder to decipher.

While at it, drop the comment from paulus---copy-and-paste leftover
from powerpc where we actually have _PAGE_HASHPTE---and mask the type in
__swp_entry() as well.

Cc: Michal Simek 
Signed-off-by: David Hildenbrand 
---
 arch/m68k/include/asm/mcf_pgtable.h   |  4 +--
 arch/microblaze/include/asm/pgtable.h | 45 +--
 2 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/arch/m68k/include/asm/mcf_pgtable.h 
b/arch/m68k/include/asm/mcf_pgtable.h
index 3f8f4d0e66dd..e573d7b649f7 100644
--- a/arch/m68k/include/asm/mcf_pgtable.h
+++ b/arch/m68k/include/asm/mcf_pgtable.h
@@ -46,8 +46,8 @@
 #define _CACHEMASK040  (~0x060)
 #define _PAGE_GLOBAL0400x400   /* 68040 global bit, used for 
kva descs */
 
-/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
-#define _PAGE_SWP_EXCLUSIVE0x080
+/* We borrow bit 24 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVECF_PAGE_NOCACHE
 
 /*
  * Externally used page protection values.
diff --git a/arch/microblaze/include/asm/pgtable.h 
b/arch/microblaze/include/asm/pgtable.h
index 42f5988e998b..7e3de54bf426 100644
--- a/arch/microblaze/include/asm/pgtable.h
+++ b/arch/microblaze/include/asm/pgtable.h
@@ -131,10 +131,10 @@ extern pte_t *va_to_pte(unsigned long address);
  * of the 16 available.  Bit 24-26 of the TLB are cleared in the TLB
  * miss handler.  Bit 27 is PAGE_USER, thus selecting the correct
  * zone.
- * - PRESENT *must* be in the bottom two bits because swap cache
- * entries use the top 30 bits.  Because 4xx doesn't support SMP
- * anyway, M is irrelevant so we borrow it for PAGE_PRESENT.  Bit 30
- * is cleared in the TLB miss handler before the TLB entry is loaded.
+ * - PRESENT *must* be in the bottom two bits because swap PTEs use the top
+ * 30 bits.  Because 4xx doesn't support SMP anyway, M is irrelevant so we
+ * borrow it for PAGE_PRESENT.  Bit 30 is cleared in the TLB miss handler
+ * before the TLB entry is loaded.
  * - All other bits of the PTE are loaded into TLBLO without
  *  * modification, leaving us only the bits 20, 21, 24, 25, 26, 30 for
  * software PTE bits.  We actually use bits 21, 24, 25, and
@@ -155,6 +155,9 @@ extern pte_t *va_to_pte(unsigned long address);
 #define _PAGE_ACCESSED 0x400   /* software: R: page referenced */
 #define _PMD_PRESENT   PAGE_MASK
 
+/* We borrow bit 24 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE_PAGE_DIRTY
+
 /*
  * Some bits are unused...
  */
@@ -393,18 +396,40 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
 
 /*
- * Encode and decode a swap entry.
- * Note that the bits we use in a PTE for representing a swap entry
- * must not include the _PAGE_PRESENT bit, or the _PAGE_HASHPTE bit
- * (if used).  -- paulus
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
+ *   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ *   <-- offset ---> E < type -> 0 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
  */
-#define __swp_type(entry)  ((entry).val & 0x3f)
+#define __swp_type(entry)  ((entry).val & 0x1f)
 #define __swp_offset(entry)((entry).val >> 6)
 #define __swp_entry(type, offset) \
-   ((swp_entry_t) { (type) | ((offset) << 6) })
+   ((swp_entry_t) { ((type) & 0x1f) | ((offset) << 6) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) >> 2 })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val << 2 })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 extern unsigned long iopa(unsigned long addr);
 
 /* Values for nocacheflag and cmode */
-- 
2.38.1
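
[Editor's sketch: a user-space illustration of the shift-by-2 conversion
the message calls hard to decipher. _PAGE_DIRTY is assumed to be 0x080
(MSB-numbered bit 24, i.e. PTE bit 7), which lands at entry bit 5,
neatly between type (bits 0-4) and offset (bits 6 and up). Not kernel
code.]

#include <assert.h>

#define _PAGE_SWP_EXCLUSIVE 0x080UL     /* _PAGE_DIRTY, assumed */

int main(void)
{
        unsigned long entry = (21UL & 0x1f) | (4242UL << 6);
        unsigned long pte = entry << 2;        /* __swp_entry_to_pte() */

        pte |= _PAGE_SWP_EXCLUSIVE;
        assert(((pte >> 2) & 0x1f) == 21);     /* type intact */
        assert((pte >> 2 >> 6) == 4242);       /* offset intact */
        return 0;
}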



[PATCH mm-unstable RFC 10/26] m68k/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused.

While at it, make sure for sun3 that the valid bit never gets set by
properly masking it off and mask the type in __swp_entry().

Cc: Geert Uytterhoeven 
Cc: Greg Ungerer 
Signed-off-by: David Hildenbrand 
---
 arch/m68k/include/asm/mcf_pgtable.h  | 36 --
 arch/m68k/include/asm/motorola_pgtable.h | 38 +--
 arch/m68k/include/asm/sun3_pgtable.h | 39 ++--
 3 files changed, 104 insertions(+), 9 deletions(-)

diff --git a/arch/m68k/include/asm/mcf_pgtable.h 
b/arch/m68k/include/asm/mcf_pgtable.h
index b619b22823f8..3f8f4d0e66dd 100644
--- a/arch/m68k/include/asm/mcf_pgtable.h
+++ b/arch/m68k/include/asm/mcf_pgtable.h
@@ -46,6 +46,9 @@
 #define _CACHEMASK040  (~0x060)
 #define _PAGE_GLOBAL0400x400   /* 68040 global bit, used for 
kva descs */
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE0x080
+
 /*
  * Externally used page protection values.
  */
@@ -254,15 +257,42 @@ static inline pte_t pte_mkcache(pte_t pte)
 extern pgd_t kernel_pg_dir[PTRS_PER_PGD];
 
 /*
- * Encode and de-code a swap entry (must be !pte_none(e) && !pte_present(e))
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <-- offset -> 0 0 0 E <-- type --->
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
  */
-#define __swp_type(x)  ((x).val & 0xFF)
+#define __swp_type(x)  ((x).val & 0x7f)
 #define __swp_offset(x)((x).val >> 11)
-#define __swp_entry(typ, off)  ((swp_entry_t) { (typ) | \
+#define __swp_entry(typ, off)  ((swp_entry_t) { ((typ) & 0x7f) | \
(off << 11) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  (__pte((x).val))
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 #define pmd_pfn(pmd)   (pmd_val(pmd) >> PAGE_SHIFT)
 #define pmd_page(pmd)  (pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
 
diff --git a/arch/m68k/include/asm/motorola_pgtable.h 
b/arch/m68k/include/asm/motorola_pgtable.h
index 7ac3d64c6b33..02896027c781 100644
--- a/arch/m68k/include/asm/motorola_pgtable.h
+++ b/arch/m68k/include/asm/motorola_pgtable.h
@@ -41,6 +41,9 @@
 
 #define _PAGE_PROTNONE 0x004
 
+/* We borrow bit 11 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE0x800
+
 #ifndef __ASSEMBLY__
 
 /* This is the cache mode to be used for pages containing page descriptors for
@@ -169,12 +172,41 @@ static inline pte_t pte_mkcache(pte_t pte)
 #define swapper_pg_dir kernel_pg_dir
 extern pgd_t kernel_pg_dir[128];
 
-/* Encode and de-code a swap entry (must be !pte_none(e) && !pte_present(e)) */
-#define __swp_type(x)  (((x).val >> 4) & 0xff)
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <- offset > E <-- type ---> 0 0 0 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ */
+#define __swp_type(x)  (((x).val >> 4) & 0x7f)
 #define __swp_offset(x)((x).val >> 12)
-#define __swp_entry(type, offset) ((swp_entry_t) { ((type) << 4) | ((offset) 
<< 12) })
+#define __swp_entry(type, offset) ((swp_entry_t) { (((type) & 0x7f) << 4) | 
((offset) << 12) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _MOTOROLA_PGTABLE_H */
diff --git a/arch/m68k/include/asm/sun3_pgtable.h 
b/arch/m68k/include/asm/sun3_pgtable.h
index 

[PATCH mm-unstable RFC 09/26] m68k/mm: remove dummy __swp definitions for nommu

2022-12-06 Thread David Hildenbrand
The definitions are not required, let's remove them.

Cc: Geert Uytterhoeven 
Cc: Greg Ungerer 
Signed-off-by: David Hildenbrand 
---
 arch/m68k/include/asm/pgtable_no.h | 6 --
 1 file changed, 6 deletions(-)

diff --git a/arch/m68k/include/asm/pgtable_no.h 
b/arch/m68k/include/asm/pgtable_no.h
index fed58da3a6b6..fc044df52b96 100644
--- a/arch/m68k/include/asm/pgtable_no.h
+++ b/arch/m68k/include/asm/pgtable_no.h
@@ -31,12 +31,6 @@
 extern void paging_init(void);
 #define swapper_pg_dir ((pgd_t *) 0)
 
-#define __swp_type(x)  (0)
-#define __swp_offset(x)(0)
-#define __swp_entry(typ,off)   ((swp_entry_t) { ((typ) | ((off) << 7)) })
-#define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
-#define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
-
 /*
  * ZERO_PAGE is a global shared page that is always zero: used
  * for zero-mapped memory areas etc..
-- 
2.38.1



[PATCH mm-unstable RFC 08/26] loongarch/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused.

While at it, also mask the type in mk_swap_pte().

Note that this bit does not conflict with swap PMDs and could also be used
in swap PMD context later.

Cc: Huacai Chen 
Cc: WANG Xuerui 
Signed-off-by: David Hildenbrand 
---
 arch/loongarch/include/asm/pgtable-bits.h |  4 +++
 arch/loongarch/include/asm/pgtable.h  | 39 ---
 2 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/arch/loongarch/include/asm/pgtable-bits.h 
b/arch/loongarch/include/asm/pgtable-bits.h
index 3d1e0a69975a..8b98d22a145b 100644
--- a/arch/loongarch/include/asm/pgtable-bits.h
+++ b/arch/loongarch/include/asm/pgtable-bits.h
@@ -20,6 +20,7 @@
 #define_PAGE_SPECIAL_SHIFT 11
 #define_PAGE_HGLOBAL_SHIFT 12 /* HGlobal is a PMD bit */
 #define_PAGE_PFN_SHIFT 12
+#define_PAGE_SWP_EXCLUSIVE_SHIFT 23
 #define_PAGE_PFN_END_SHIFT 48
 #define_PAGE_NO_READ_SHIFT 61
 #define_PAGE_NO_EXEC_SHIFT 62
@@ -33,6 +34,9 @@
 #define _PAGE_PROTNONE (_ULCAST_(1) << _PAGE_PROTNONE_SHIFT)
 #define _PAGE_SPECIAL  (_ULCAST_(1) << _PAGE_SPECIAL_SHIFT)
 
+/* We borrow bit 23 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE(_ULCAST_(1) << _PAGE_SWP_EXCLUSIVE_SHIFT)
+
 /* Used by TLB hardware (placed in EntryLo*) */
 #define _PAGE_VALID(_ULCAST_(1) << _PAGE_VALID_SHIFT)
 #define _PAGE_DIRTY(_ULCAST_(1) << _PAGE_DIRTY_SHIFT)
diff --git a/arch/loongarch/include/asm/pgtable.h 
b/arch/loongarch/include/asm/pgtable.h
index 022ec6be3602..70d037c957a4 100644
--- a/arch/loongarch/include/asm/pgtable.h
+++ b/arch/loongarch/include/asm/pgtable.h
@@ -249,13 +249,26 @@ extern void pud_init(void *addr);
 extern void pmd_init(void *addr);
 
 /*
- * Non-present pages:  high 40 bits are offset, next 8 bits type,
- * low 16 bits zero.
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   <--- offset ---
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   --> E <--- type ---> <-- zeroes -->
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   The zero'ed bits include _PAGE_PRESENT and _PAGE_PROTNONE.
  */
 static inline pte_t mk_swap_pte(unsigned long type, unsigned long offset)
-{ pte_t pte; pte_val(pte) = (type << 16) | (offset << 24); return pte; }
+{ pte_t pte; pte_val(pte) = ((type & 0x7f) << 16) | (offset << 24); return 
pte; }
 
-#define __swp_type(x)  (((x).val >> 16) & 0xff)
+#define __swp_type(x)  (((x).val >> 16) & 0x7f)
 #define __swp_offset(x)((x).val >> 24)
 #define __swp_entry(type, offset) ((swp_entry_t) { pte_val(mk_swap_pte((type), 
(offset))) })
 #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) })
@@ -263,6 +276,24 @@ static inline pte_t mk_swap_pte(unsigned long type, 
unsigned long offset)
 #define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val(pmd) })
 #define __swp_entry_to_pmd(x)  ((pmd_t) { (x).val | _PAGE_HUGE })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 extern void paging_init(void);
 
 #define pte_none(pte)  (!(pte_val(pte) & ~_PAGE_GLOBAL))
-- 
2.38.1
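
[Editor's sketch: a user-space check of mk_swap_pte() as patched above;
the 0x7f mask guarantees the 7-bit type (bits 16-22) can never reach the
marker at bit 23. Constants mirror the patch; not kernel code.]

#include <assert.h>

#define E_BIT (1UL << 23)       /* _PAGE_SWP_EXCLUSIVE */

static unsigned long mk_swap_pte(unsigned long type, unsigned long off)
{
        return ((type & 0x7f) << 16) | (off << 24);
}

int main(void)
{
        unsigned long v = mk_swap_pte(0x7f, 1);

        assert(!(v & E_BIT));   /* the largest type stops at bit 22 */
        v |= E_BIT;
        assert(((v >> 16) & 0x7f) == 0x7f);
        assert((v >> 24) == 1);
        return 0;
}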



[PATCH mm-unstable RFC 07/26] ia64/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused.

While at it, also mask the type in __swp_entry().

Signed-off-by: David Hildenbrand 
---
 arch/ia64/include/asm/pgtable.h | 32 +---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/arch/ia64/include/asm/pgtable.h b/arch/ia64/include/asm/pgtable.h
index 01517a5e6778..d666eb229d4b 100644
--- a/arch/ia64/include/asm/pgtable.h
+++ b/arch/ia64/include/asm/pgtable.h
@@ -58,6 +58,9 @@
 #define _PAGE_ED   (__IA64_UL(1) << 52)/* exception deferral */
 #define _PAGE_PROTNONE (__IA64_UL(1) << 63)
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE(1 << 7)
+
 #define _PFN_MASK  _PAGE_PPN_MASK
 /* Mask of bits which may be changed by pte_modify(); the odd bits are there 
for _PAGE_PROTNONE */
 #define _PAGE_CHG_MASK (_PAGE_P | _PAGE_PROTNONE | _PAGE_PL_MASK | 
_PAGE_AR_MASK | _PAGE_ED)
@@ -399,6 +402,9 @@ extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
 extern void paging_init (void);
 
 /*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
  * Note: The macros below rely on the fact that MAX_SWAPFILES_SHIFT <= number 
of
  *  bits in the swap-type field of the swap pte.  It would be nice to
  *  enforce that, but we can't easily include  here.
@@ -406,16 +412,36 @@ extern void paging_init (void);
  *
  * Format of swap pte:
  * bit   0   : present bit (must be zero)
- * bits  1- 7: swap-type
+ * bits  1- 6: swap type
+ * bit   7   : exclusive marker
  * bits  8-62: swap offset
  * bit  63   : _PAGE_PROTNONE bit
  */
-#define __swp_type(entry)  (((entry).val >> 1) & 0x7f)
+#define __swp_type(entry)  (((entry).val >> 1) & 0x3f)
 #define __swp_offset(entry)(((entry).val << 1) >> 9)
-#define __swp_entry(type,offset)   ((swp_entry_t) { ((type) << 1) | 
((long) (offset) << 8) })
+#define __swp_entry(type,offset)   ((swp_entry_t) { ((type & 0x3f) << 1) | 
\
+((long) (offset) << 8) 
})
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 /*
  * ZERO_PAGE is a global shared page that is always zero: used
  * for zero-mapped memory areas etc..
-- 
2.38.1
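
[Editor's sketch: a user-space illustration of why the type mask shrank
from 0x7f to 0x3f. With the type field starting at bit 1, a 7-bit type
would reach exactly the stolen bit 7; the new mask makes that impossible.
Not kernel code.]

#include <assert.h>

#define E_BIT (1UL << 7)        /* _PAGE_SWP_EXCLUSIVE */

static unsigned long swp_entry(unsigned long type, unsigned long off)
{
        return ((type & 0x3f) << 1) | (off << 8);
}

int main(void)
{
        assert((0x40UL << 1) == E_BIT);          /* the collision ... */
        assert(!(swp_entry(0x40, 123) & E_BIT)); /* ... the mask avoids */
        assert((swp_entry(0x40, 123) >> 8) == 123);
        return 0;
}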



[PATCH mm-unstable RFC 06/26] hexagon/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the
offset. This reduces the maximum swap space per file to 16 GiB (was 32
GiB).

While at it, mask the type in __swp_entry().

Cc: Brian Cain 
Signed-off-by: David Hildenbrand 
---
 arch/hexagon/include/asm/pgtable.h | 37 +-
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/arch/hexagon/include/asm/pgtable.h 
b/arch/hexagon/include/asm/pgtable.h
index f7048c18b6f9..7eb008e477c8 100644
--- a/arch/hexagon/include/asm/pgtable.h
+++ b/arch/hexagon/include/asm/pgtable.h
@@ -61,6 +61,9 @@ extern unsigned long empty_zero_page;
  * So we'll put up with a bit of inefficiency for now...
  */
 
+/* We borrow bit 6 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE(1<<6)
+
 /*
  * Top "FOURTH" level (pgd), which for the Hexagon VM is really
  * only the second from the bottom, pgd and pud both being collapsed.
@@ -359,9 +362,12 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 #define ZERO_PAGE(vaddr) (virt_to_page(_zero_page))
 
 /*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
  * Swap/file PTE definitions.  If _PAGE_PRESENT is zero, the rest of the PTE is
  * interpreted as swap information.  The remaining free bits are interpreted as
- * swap type/offset tuple.  Rather than have the TLB fill handler test
+ * listed below.  Rather than have the TLB fill handler test
  * _PAGE_PRESENT, we're going to reserve the permissions bits and set them to
  * all zeros for swap entries, which speeds up the miss handler at the cost of
  * 3 bits of offset.  That trade-off can be revisited if necessary, but Hexagon
@@ -371,9 +377,10 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
  * Format of swap PTE:
  * bit 0:  Present (zero)
  * bits1-5:swap type (arch independent layer uses 5 bits max)
- * bits6-9:bits 3:0 of offset
+ * bit 6:  exclusive marker
+ * bits7-9:bits 2:0 of offset
  * bits10-12:  effectively _PAGE_PROTNONE (all zero)
- * bits13-31:  bits 22:4 of swap offset
+ * bits13-31:  bits 21:3 of swap offset
  *
  * The split offset makes some of the following macros a little gnarly,
  * but there's plenty of precedent for this sort of thing.
@@ -383,11 +390,29 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 #define __swp_type(swp_pte)(((swp_pte).val >> 1) & 0x1f)
 
 #define __swp_offset(swp_pte) \
-   ((((swp_pte).val >> 6) & 0xf) | (((swp_pte).val >> 9) & 0x7ffff0))
+   ((((swp_pte).val >> 7) & 0x7) | (((swp_pte).val >> 10) & 0x3ffff8))
 
 #define __swp_entry(type, offset) \
((swp_entry_t)  { \
-   ((type << 1) | \
-((offset & 0x7ffff0) << 9) | ((offset & 0xf) << 6)) })
+   (((type & 0x1f) << 1) | \
+((offset & 0x3ffff8) << 10) | ((offset & 0x7) << 7)) })
+
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
 
 #endif
-- 
2.38.1
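
[Editor's sketch: a user-space round-trip check of the regrouped offset
(low three bits at PTE bits 7-9, the rest from bit 13), using the same
masks as the patch (0x7 and 0x3ffff8). A sketch only; not kernel code.]

#include <assert.h>

static unsigned long mk(unsigned long type, unsigned long off)
{
        return ((type & 0x1f) << 1) |
               ((off & 0x7) << 7) | ((off & 0x3ffff8UL) << 10);
}

static unsigned long off_of(unsigned long v)
{
        return ((v >> 7) & 0x7) | ((v >> 10) & 0x3ffff8UL);
}

int main(void)
{
        /* E lives at bit 6, between type (bits 1-5) and offset[2:0]. */
        unsigned long v = mk(17, 0x1234) | (1UL << 6);

        assert(off_of(v) == 0x1234);
        assert(((v >> 1) & 0x1f) == 17);
        return 0;
}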



[PATCH mm-unstable RFC 05/26] csky/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the
offset. This reduces the maximum swap space per file to 16 GiB (was 32
GiB).

We might actually be able to reuse one of the other software bits
(_PAGE_READ / _PAGE_WRITE) instead, because we only have to keep
pte_present(), pte_none() and HW happy. For now, let's keep it simple
because there might be something non-obvious.

Cc: Guo Ren 
Signed-off-by: David Hildenbrand 
---
 arch/csky/abiv1/inc/abi/pgtable-bits.h | 13 +
 arch/csky/abiv2/inc/abi/pgtable-bits.h | 19 ---
 arch/csky/include/asm/pgtable.h| 18 ++
 3 files changed, 39 insertions(+), 11 deletions(-)

diff --git a/arch/csky/abiv1/inc/abi/pgtable-bits.h 
b/arch/csky/abiv1/inc/abi/pgtable-bits.h
index 752c8b3f9194..ae7a2f76dd42 100644
--- a/arch/csky/abiv1/inc/abi/pgtable-bits.h
+++ b/arch/csky/abiv1/inc/abi/pgtable-bits.h
@@ -10,6 +10,9 @@
 #define _PAGE_ACCESSED (1<<3)
 #define _PAGE_MODIFIED (1<<4)
 
+/* We borrow bit 9 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE	(1<<9)
+
 /* implemented in hardware */
 #define _PAGE_GLOBAL   (1<<6)
 #define _PAGE_VALID(1<<7)
@@ -26,7 +29,8 @@
 #define _PAGE_PROT_NONE	_PAGE_READ
 
 /*
- * Encode and decode a swap entry
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
  *
  * Format of swap PTE:
 * bit  0:  _PAGE_PRESENT (zero)
@@ -35,15 +39,16 @@
 * bit  6:  _PAGE_GLOBAL (zero)
 * bit  7:  _PAGE_VALID (zero)
 * bit  8:  swap type[4]
- * bit 9 - 31:  swap offset
+ * bit  9:  exclusive marker
+ * bit 10 - 31:  swap offset
  */
 #define __swp_type(x)  ((((x).val >> 2) & 0xf) | \
(((x).val >> 4) & 0x10))
-#define __swp_offset(x)((x).val >> 9)
+#define __swp_offset(x)((x).val >> 10)
 #define __swp_entry(type, offset)  ((swp_entry_t) { \
((type & 0xf) << 2) | \
((type & 0x10) << 4) | \
-   ((offset) << 9)})
+   ((offset) << 10)})
 
 #define HAVE_ARCH_UNMAPPED_AREA
 
diff --git a/arch/csky/abiv2/inc/abi/pgtable-bits.h 
b/arch/csky/abiv2/inc/abi/pgtable-bits.h
index 7e7f389f546f..526152bd2156 100644
--- a/arch/csky/abiv2/inc/abi/pgtable-bits.h
+++ b/arch/csky/abiv2/inc/abi/pgtable-bits.h
@@ -10,6 +10,9 @@
 #define _PAGE_PRESENT  (1<<10)
 #define _PAGE_MODIFIED (1<<11)
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE	(1<<7)
+
 /* implemented in hardware */
 #define _PAGE_GLOBAL   (1<<0)
 #define _PAGE_VALID(1<<1)
@@ -26,23 +29,25 @@
 #define _PAGE_PROT_NONE	_PAGE_WRITE
 
 /*
- * Encode and decode a swap entry
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
  *
  * Format of swap PTE:
 * bit  0:  _PAGE_GLOBAL (zero)
 * bit  1:  _PAGE_VALID (zero)
 * bit  2 - 6:  swap type
- * bit  7 - 8:  swap offset[0 - 1]
+ * bit  7:  exclusive marker
+ * bit  8:  swap offset[0]
 * bit  9:  _PAGE_WRITE (zero)
 * bit 10:  _PAGE_PRESENT (zero)
- * bit 11 - 31:  swap offset[2 - 22]
+ * bit 11 - 31:  swap offset[1 - 21]
  */
 #define __swp_type(x)  (((x).val >> 2) & 0x1f)
-#define __swp_offset(x)  ((((x).val >> 7) & 0x3) | \
-   (((x).val >> 9) & 0x7ffffc))
+#define __swp_offset(x)  ((((x).val >> 8) & 0x1) | \
+   (((x).val >> 10) & 0x3ffffe))
 #define __swp_entry(type, offset)  ((swp_entry_t) { \
    ((type & 0x1f) << 2) | \
-   ((offset & 0x3) << 7) | \
-   ((offset & 0x7ffffc) << 9)})
+   ((offset & 0x1) << 8) | \
+   ((offset & 0x3ffffe) << 10)})
 
 #endif /* __ASM_CSKY_PGTABLE_BITS_H */
diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h
index 77bc6caff2d2..574c97b9ecca 100644
--- a/arch/csky/include/asm/pgtable.h
+++ b/arch/csky/include/asm/pgtable.h
@@ -200,6 +200,24 @@ static inline pte_t pte_mkyoung(pte_t pte)
return pte;
 }
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}

[PATCH mm-unstable RFC 04/26] arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the
offset. This reduces the maximum swap space per file to 64 GiB (was 128
GiB).

While at it, drop PTE_TYPE_FAULT from __swp_entry_to_pte(); it is
defined to be 0 and is rather confusing, because we should be dealing
with "Linux PTEs", not "hardware PTEs". Also, properly mask the type in
__swp_entry().
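
As a side note, the effect of the added mask can be modeled in plain
userspace C. The bit positions below mirror this patch, but the program
itself is only an illustration, not kernel code:

#include <assert.h>
#include <stdint.h>

#define TYPE_SHIFT	2
#define TYPE_MASK	0x1f	/* 5 type bits */
#define OFF_SHIFT	8	/* type shift + type bits + E bit */

int main(void)
{
	uint32_t bad_type = 0x43, off = 2;	/* type overflowing 5 bits */

	/* masked: the excess type bits are dropped, the offset survives */
	uint32_t masked = ((bad_type & TYPE_MASK) << TYPE_SHIFT) | (off << OFF_SHIFT);
	assert((masked >> OFF_SHIFT) == off);

	/* unmasked: bit 6 of the type bleeds into the offset field */
	uint32_t unmasked = (bad_type << TYPE_SHIFT) | (off << OFF_SHIFT);
	assert((unmasked >> OFF_SHIFT) != off);
	return 0;
}

Generic MM never hands an architecture a type wider than
MAX_SWAPFILES_SHIFT (5) bits, so the mask is defensive rather than
functional.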

Cc: Russell King 
Signed-off-by: David Hildenbrand 
---
 arch/arm/include/asm/pgtable-2level.h |  3 +++
 arch/arm/include/asm/pgtable-3level.h |  3 +++
 arch/arm/include/asm/pgtable.h| 35 +--
 3 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-2level.h 
b/arch/arm/include/asm/pgtable-2level.h
index 92abd4cd8ca2..ce543cd9380c 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -126,6 +126,9 @@
 #define L_PTE_SHARED   (_AT(pteval_t, 1) << 10)/* shared(v6), 
coherent(xsc3) */
 #define L_PTE_NONE (_AT(pteval_t, 1) << 11)
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define L_PTE_SWP_EXCLUSIVE	L_PTE_RDONLY
+
 /*
  * These are the memory types, defined to be compatible with
  * pre-ARMv6 CPUs cacheable and bufferable bits: n/a,n/a,C,B
diff --git a/arch/arm/include/asm/pgtable-3level.h 
b/arch/arm/include/asm/pgtable-3level.h
index eabe72ff7381..106049791500 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -76,6 +76,9 @@
 #define L_PTE_NONE (_AT(pteval_t, 1) << 57)/* PROT_NONE */
 #define L_PTE_RDONLY   (_AT(pteval_t, 1) << 58)/* READ ONLY */
 
+/* We borrow bit 7 to store the exclusive marker in swap PTEs. */
+#define L_PTE_SWP_EXCLUSIVE	(_AT(pteval_t, 1) << 7)
+
 #define L_PMD_SECT_VALID   (_AT(pmdval_t, 1) << 0)
 #define L_PMD_SECT_DIRTY   (_AT(pmdval_t, 1) << 55)
 #define L_PMD_SECT_NONE(_AT(pmdval_t, 1) << 57)
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 00954ab1a039..5e0446a9c667 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -269,27 +269,48 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t 
newprot)
 }
 
 /*
- * Encode and decode a swap entry.  Swap entries are stored in the Linux
- * page tables as follows:
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
  *
  *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
  *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
- *   <-------------------- offset --------------------> < type -> 0 0
+ *   <------------------- offset -------------------> E < type -> 0 0
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
  *
- * This gives us up to 31 swap files and 128GB per swap file.  Note that
+ * This gives us up to 31 swap files and 64GB per swap file.  Note that
  * the offset field is always non-zero.
  */
 #define __SWP_TYPE_SHIFT   2
 #define __SWP_TYPE_BITS5
 #define __SWP_TYPE_MASK((1 << __SWP_TYPE_BITS) - 1)
-#define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT)
+#define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT + 1)
 
 #define __swp_type(x)  (((x).val >> __SWP_TYPE_SHIFT) & __SWP_TYPE_MASK)
 #define __swp_offset(x)((x).val >> __SWP_OFFSET_SHIFT)
-#define __swp_entry(type,offset) ((swp_entry_t) { ((type) << __SWP_TYPE_SHIFT) | ((offset) << __SWP_OFFSET_SHIFT) })
+#define __swp_entry(type,offset) ((swp_entry_t) { (((type) & __SWP_TYPE_MASK) << __SWP_TYPE_SHIFT) | \
+   ((offset) << __SWP_OFFSET_SHIFT) })
 
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
-#define __swp_entry_to_pte(swp)__pte((swp).val | PTE_TYPE_FAULT)
+#define __swp_entry_to_pte(swp)__pte((swp).val)
+
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_isset(pte, L_PTE_SWP_EXCLUSIVE);
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   return set_pte_bit(pte, __pgprot(L_PTE_SWP_EXCLUSIVE));
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   return clear_pte_bit(pte, __pgprot(L_PTE_SWP_EXCLUSIVE));
+}
 
 /*
  * It is an error for the kernel to have more swap files than we can
-- 
2.38.1



[PATCH mm-unstable RFC 03/26] arc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 5, which is as
yet unused. The only important part seems to be not to use _PAGE_PRESENT
(bit 9).

Cc: Vineet Gupta 
Signed-off-by: David Hildenbrand 
---
 arch/arc/include/asm/pgtable-bits-arcv2.h | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/arch/arc/include/asm/pgtable-bits-arcv2.h 
b/arch/arc/include/asm/pgtable-bits-arcv2.h
index 515e82db519f..611f412713b9 100644
--- a/arch/arc/include/asm/pgtable-bits-arcv2.h
+++ b/arch/arc/include/asm/pgtable-bits-arcv2.h
@@ -26,6 +26,9 @@
 #define _PAGE_GLOBAL   (1 << 8)  /* ASID agnostic (H) */
 #define _PAGE_PRESENT  (1 << 9)  /* PTE/TLB Valid (H) */
 
+/* We borrow bit 5 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE	_PAGE_DIRTY
+
 #ifdef CONFIG_ARC_MMU_V4
 #define _PAGE_HW_SZ(1 << 10)  /* Normal/super (H) */
 #else
@@ -106,9 +109,18 @@ static inline void set_pte_at(struct mm_struct *mm, 
unsigned long addr,
 void update_mmu_cache(struct vm_area_struct *vma, unsigned long address,
  pte_t *ptep);
 
-/* Encode swap {type,off} tuple into PTE
- * We reserve 13 bits for 5-bit @type, keeping bits 12-5 zero, ensuring that
- * PAGE_PRESENT is zero in a PTE holding swap "identifier"
+/*
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <-------------- offset --------------> <--- zero ---> E < type ->
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
+ *   The zero'ed bits include _PAGE_PRESENT.
  */
 #define __swp_entry(type, off) ((swp_entry_t) \
{ ((type) & 0x1f) | ((off) << 13) })
@@ -120,6 +132,15 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned 
long address,
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+PTE_BIT_FUNC(swp_mkexclusive, |= (_PAGE_SWP_EXCLUSIVE));
+PTE_BIT_FUNC(swp_clear_exclusive, &= ~(_PAGE_SWP_EXCLUSIVE));
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #include 
 #endif
-- 
2.38.1



[PATCH mm-unstable RFC 02/26] alpha/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-12-06 Thread David Hildenbrand
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit
from the type. Generic MM currently only uses 5 bits for the type
(MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused.
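
A quick userspace check of that claim (constants taken from this patch;
the program is only a sketch, not kernel code):

#include <assert.h>
#include <stdint.h>

#define MAX_SWAPFILES_SHIFT	5
#define PAGE_SWP_EXCLUSIVE	(1ULL << 39)	/* _PAGE_SWP_EXCLUSIVE */

int main(void)
{
	/* the largest swap type generic MM will ever encode */
	uint64_t max_type = (1 << MAX_SWAPFILES_SHIFT) - 1;	/* 31 */

	/* placed at bit 32, it never reaches the stolen bit 39 */
	assert((((max_type & 0x7f) << 32) & PAGE_SWP_EXCLUSIVE) == 0);
	return 0;
}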

While at it, mask the type in mk_swap_pte() as well.

Note that 32bit alpha has 64bit PTEs but only 32bit swap entries. So the
lower 32bit are zero in a swap PTE and we could have taken a bit in
there as well.

Cc: Richard Henderson 
Cc: Ivan Kokshaysky 
Cc: Matt Turner 
Signed-off-by: David Hildenbrand 
---
 arch/alpha/include/asm/pgtable.h | 41 
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h
index 9e45f6735d5d..970abf511b13 100644
--- a/arch/alpha/include/asm/pgtable.h
+++ b/arch/alpha/include/asm/pgtable.h
@@ -74,6 +74,9 @@ struct vm_area_struct;
 #define _PAGE_DIRTY0x2
 #define _PAGE_ACCESSED 0x4
 
+/* We borrow bit 39 to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE	0x8000000000UL
+
 /*
  * NOTE! The "accessed" bit isn't necessarily exact:  it can be kept exactly
  * by software (use the KRE/URE/KWE/UWE bits appropriately), but I'll fake it.
@@ -301,18 +304,48 @@ extern inline void update_mmu_cache(struct vm_area_struct 
* vma,
 }
 
 /*
- * Non-present pages:  high 24 bits are offset, next 8 bits type,
- * low 32 bits zero.
+ * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
+ * are !pte_none() && !pte_present().
+ *
+ * Format of swap PTEs:
+ *
+ *   6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3
+ *   3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2
+ *   <------------------- offset -------------------> E <--- type --->
+ *
+ *   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+ *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *   <--------------------------- zeroes --------------------------->
+ *
+ *   E is the exclusive marker that is not stored in swap entries.
  */
 extern inline pte_t mk_swap_pte(unsigned long type, unsigned long offset)
-{ pte_t pte; pte_val(pte) = (type << 32) | (offset << 40); return pte; }
+{ pte_t pte; pte_val(pte) = ((type & 0x7f) << 32) | (offset << 40); return pte; }
 
-#define __swp_type(x)  (((x).val >> 32) & 0xff)
+#define __swp_type(x)  (((x).val >> 32) & 0x7f)
 #define __swp_offset(x)((x).val >> 40)
 #define __swp_entry(type, off) ((swp_entry_t) { pte_val(mk_swap_pte((type), (off))) })
 #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)  ((pte_t) { (x).val })
 
+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+   pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+   pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
+   return pte;
+}
+
 #define pte_ERROR(e) \
printk("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e))
 #define pmd_ERROR(e) \
-- 
2.38.1



[PATCH mm-unstable RFC 01/26] mm/debug_vm_pgtable: more pte_swp_exclusive() sanity checks

2022-12-06 Thread David Hildenbrand
We want to implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures.
Let's extend our sanity checks, especially testing that our PTE bit
does not affect:
* is_swap_pte() -> pte_present() and pte_none()
* the swap entry + type
* pte_swp_soft_dirty()

In particular, pfn_pte() is dodgy when the swap PTE layout differs
heavily from ordinary PTEs. Let's properly construct a swap PTE from
a swap type+offset instead.
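
The core invariant the extended test enforces can be stated compactly:
setting or clearing the exclusive marker must round-trip without
disturbing the encoded type and offset. A userspace model with made-up
bit positions (not any particular architecture's layout):

#include <assert.h>
#include <stdint.h>

#define EXCL_BIT	(1ull << 6)	/* hypothetical marker bit */

static uint64_t mk_swap(uint64_t type, uint64_t off)
{
	return ((type & 0x1f) << 1) | (off << 7);	/* bit 0 = present = 0 */
}

int main(void)
{
	uint64_t pte = mk_swap(31, 0x123456);
	uint64_t marked = pte | EXCL_BIT;

	/* type and offset are unchanged by the marker... */
	assert(((marked >> 1) & 0x1f) == 31);
	assert((marked >> 7) == 0x123456);
	/* ...and clearing the marker restores the original PTE */
	assert((marked & ~EXCL_BIT) == pte);
	return 0;
}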

Signed-off-by: David Hildenbrand 
---
 mm/debug_vm_pgtable.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index c631ade3f1d2..0506622016d9 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -807,13 +807,34 @@ static void __init pmd_swap_soft_dirty_tests(struct 
pgtable_debug_args *args) {
 static void __init pte_swap_exclusive_tests(struct pgtable_debug_args *args)
 {
 #ifdef __HAVE_ARCH_PTE_SWP_EXCLUSIVE
-   pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot);
+   unsigned long max_swapfile_size = generic_max_swapfile_size();
+   swp_entry_t entry, entry2;
+   pte_t pte;
 
pr_debug("Validating PTE swap exclusive\n");
+
+   /* Create a swp entry with all possible bits set */
+   entry = swp_entry((1 << MAX_SWAPFILES_SHIFT) - 1,
+ max_swapfile_size - 1);
+
+   pte = swp_entry_to_pte(entry);
+   WARN_ON(pte_swp_exclusive(pte));
+   WARN_ON(!is_swap_pte(pte));
+   entry2 = pte_to_swp_entry(pte);
+   WARN_ON(memcmp(&entry, &entry2, sizeof(entry)));
+
pte = pte_swp_mkexclusive(pte);
WARN_ON(!pte_swp_exclusive(pte));
+   WARN_ON(!is_swap_pte(pte));
+   WARN_ON(pte_swp_soft_dirty(pte));
+   entry2 = pte_to_swp_entry(pte);
+   WARN_ON(memcmp(&entry, &entry2, sizeof(entry)));
+
pte = pte_swp_clear_exclusive(pte);
WARN_ON(pte_swp_exclusive(pte));
+   WARN_ON(!is_swap_pte(pte));
+   entry2 = pte_to_swp_entry(pte);
+   WARN_ON(memcmp(&entry, &entry2, sizeof(entry)));
 #endif /* __HAVE_ARCH_PTE_SWP_EXCLUSIVE */
 }
 
-- 
2.38.1



Re: [PATCH] powerpc/ftrace: fix syscall tracing on PPC64_ELF_ABI_V1

2022-12-06 Thread Mathieu Desnoyers

On 2022-12-05 17:50, Michael Ellerman wrote:

Michael Jeanson  writes:

On 2022-12-05 15:11, Michael Jeanson wrote:

Michael Jeanson  writes:

In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
changing from their dot prefixed variant to the non-prefixed ones.

Since ftrace prefixes a dot to the syscall names when matching them to
build its syscall event list, this resulted in no syscall events being
available.

Remove the PPC64_ELF_ABI_V1 specific version of
arch_syscall_match_sym_name to have the same behavior across all powerpc
variants.


This doesn't seem to work for me.

Even with it applied I still don't see anything in
/sys/kernel/debug/tracing/events/syscalls

Did we break it in some other way recently?

cheers


I did some further testing: my config also enabled KALLSYMS_ALL; when I remove
it, there are indeed no syscall events.


Aha, OK that explains it I guess.

I was using ppc64_guest_defconfig which has ABI_V1 and FTRACE_SYSCALLS,
but does not have KALLSYMS_ALL. So I guess there's some other bug
lurking in there.


I don't have the setup handy to validate it, but I suspect it is caused 
by the way scripts/kallsyms.c:symbol_valid() checks whether a symbol 
entry needs to be integrated into the assembler output when 
--all-symbols is not specified. It only keeps symbols whose addresses 
are in the text range. On PPC64_ELF_ABI_V1, this means only the 
dot-prefixed symbols will be kept (those point to the function entry), 
leaving out the non-dot-prefixed symbols (those point to the function 
descriptors).
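
For readers not steeped in the ABI: on PPC64 ELFv1 a symbol like "foo" 
resolves to a function descriptor in the .opd data section, while ".foo" 
points at the code itself in .text. Roughly (a sketch; field names are 
illustrative, not the ABI's exact ones):

struct func_descriptor {
	unsigned long entry;	/* address of the first instruction (".foo") */
	unsigned long toc;	/* TOC base the function expects in r2 */
	unsigned long env;	/* environment pointer, unused by C */
};

Since the descriptor's address lies in .opd rather than .text, the 
text-range check in symbol_valid() rejects it.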


So I see two possible solutions there: either we ensure that 
FTRACE_SYSCALLS selects KALLSYMS_ALL on PPC64_ELF_ABI_V1, or we modify 
scripts/kallsyms.c:symbol_valid() to also include function descriptor 
symbols. This would mean accepting symbols pointing into the .opd ELF 
section.


IMHO the second option would be better because it does not increase the 
kernel image size as much as KALLSYMS_ALL.


Thoughts ?

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com



Re: [PATCH 0/6] Make remove() of any bus based driver void returned

2022-12-06 Thread Dawei Li
On Mon, Dec 05, 2022 at 05:00:52PM +0100, Greg KH wrote:
> On Mon, Dec 05, 2022 at 11:36:38PM +0800, Dawei Li wrote:
> > For bus-based driver, device removal is implemented as:
> > device_remove() => bus->remove() => driver->remove()
> > 
> > Driver core needs no feedback from bus driver about the result of
> > remove callback. In which case, commit fc7a6209d571 ("bus: Make
> > remove callback return void") forces bus_type::remove be void-returned.
> > 
> > Now we have the situation that both 1st & 2nd part of calling chain
> > are void returned, so it does not make much sense for the last one
> > (driver->remove) to return non-void to its caller.
> > 
> > So the basic idea behind this patchset is making remove() callback of
> > any bus-based driver to be void returned.
> > 
> > This patchset includes changes for drivers below:
> > 1. hyperv
> > 2. macio
> > 3. apr
> > 4. xen
> > 5. ac97
> > 6. soundbus

Hi Greg:
Thanks for the review.
> 
> Then that should be 6 different patchsets going to 6 different
> subsystems.  No need to make this seems like a unified set of patches at
> all.
Right, will fix all the issues and resend this as 6 independent patch
sets.

Thanks
  Dawei

> 
> > Q: Why not platform drivers?
> > A: Too many of them.(maybe 4K+)
> 
> That will have to be done eventually, right?
> 
> thanks,
> 
> greg k-h


Re: [PATCH 1/6] hyperv: Make remove callback of hyperv driver void returned

2022-12-06 Thread Dawei Li
On Tue, Dec 06, 2022 at 11:37:26AM +, Wei Liu wrote:
> On Mon, Dec 05, 2022 at 11:36:39PM +0800, Dawei Li wrote:
> > Since commit fc7a6209d571 ("bus: Make remove callback return
> > void") forces bus_type::remove be void-returned, it doesn't
> > make much sense for any bus based driver implementing remove
> > callback to return non-void to its caller.
> > 
> > This change is for hyperv bus based drivers.
> > 
> > Signed-off-by: Dawei Li 
> [...]

Hi Wei:
Thanks for the review.

> > -static int netvsc_remove(struct hv_device *dev)
> > +static void netvsc_remove(struct hv_device *dev)
> >  {
> > struct net_device_context *ndev_ctx;
> > struct net_device *vf_netdev, *net;
> > @@ -2603,7 +2603,6 @@ static int netvsc_remove(struct hv_device *dev)
> > net = hv_get_drvdata(dev);
> > if (net == NULL) {
> > dev_err(&dev->device, "No net device to remove\n");
> > -   return 0;
> 
> This is wrong. You are introducing a NULL pointer dereference.
Nice catch, will fix it.

> 
> > }
> >  
> > ndev_ctx = netdev_priv(net);
> > @@ -2637,7 +2636,6 @@ static int netvsc_remove(struct hv_device *dev)
> >  
> > free_percpu(ndev_ctx->vf_stats);
> > free_netdev(net);
> > -   return 0;
> >  }
> >  
> >  static int netvsc_suspend(struct hv_device *dev)
> > diff --git a/drivers/pci/controller/pci-hyperv.c 
> > b/drivers/pci/controller/pci-hyperv.c
> > index ba64284eaf9f..3a09de70d6ea 100644
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -3756,7 +3756,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, 
> > bool keep_devs)
> >   *
> >   * Return: 0 on success, -errno on failure
> >   */
> 
> This comment is no longer needed in the new world.
> 
> But, are you sure you're modifying the correct piece of code?
> 
> hv_pci_remove is not a hook in the base bus type. It is used in struct
> hv_driver.
> 
> The same comment applies to all other modifications.
Sorry about the confusion.
In short, the point of this series is to make the remove() callback of the
_driver_ void-returned.

For bus-based driver, device removal is implemented as:
1 device_remove() =>
  2 bus->remove() =>
3 driver->remove()

1 is void. 
For 2, commit fc7a6209d571 ("bus: Make remove callback return void") forces
bus_type::remove to be void-returned, which applies to hv_bus (vmbus) too.
So it doesn't make sense for 3 (driver->remove) to return non-void; in this
case that is hv_driver.
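
Concretely, the per-driver conversion is mechanical: drop the return
type and the trailing "return 0;". Sketched on a made-up hv_driver-based
driver (the foo_* names are hypothetical):

/* was: static int foo_remove(struct hv_device *dev) { ...; return 0; } */
static void foo_remove(struct hv_device *dev)
{
	foo_teardown(hv_get_drvdata(dev));	/* hypothetical helper */
}

static struct hv_driver foo_drv = {
	.name	= "foo",
	.remove	= foo_remove,	/* matches the now-void remove hook */
};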

Thanks,
  Dawei

> 
> Thanks,
> Wei.


Re: [PATCH 1/6] hyperv: Make remove callback of hyperv driver void returned

2022-12-06 Thread Wei Liu
On Mon, Dec 05, 2022 at 11:36:39PM +0800, Dawei Li wrote:
> Since commit fc7a6209d571 ("bus: Make remove callback return
> void") forces bus_type::remove be void-returned, it doesn't
> make much sense for any bus based driver implementing remove
> callback to return non-void to its caller.
> 
> This change is for hyperv bus based drivers.
> 
> Signed-off-by: Dawei Li 
[...]
> -static int netvsc_remove(struct hv_device *dev)
> +static void netvsc_remove(struct hv_device *dev)
>  {
>   struct net_device_context *ndev_ctx;
>   struct net_device *vf_netdev, *net;
> @@ -2603,7 +2603,6 @@ static int netvsc_remove(struct hv_device *dev)
>   net = hv_get_drvdata(dev);
>   if (net == NULL) {
>   dev_err(&dev->device, "No net device to remove\n");
> - return 0;

This is wrong. You are introducing a NULL pointer dereference.

>   }
>  
>   ndev_ctx = netdev_priv(net);
> @@ -2637,7 +2636,6 @@ static int netvsc_remove(struct hv_device *dev)
>  
>   free_percpu(ndev_ctx->vf_stats);
>   free_netdev(net);
> - return 0;
>  }
>  
>  static int netvsc_suspend(struct hv_device *dev)
> diff --git a/drivers/pci/controller/pci-hyperv.c 
> b/drivers/pci/controller/pci-hyperv.c
> index ba64284eaf9f..3a09de70d6ea 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -3756,7 +3756,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool 
> keep_devs)
>   *
>   * Return: 0 on success, -errno on failure
>   */

This comment is no longer needed in the new world.

But, are you sure you're modifying the correct piece of code?

hv_pci_remove is not a hook in the base bus type. It is used in struct
hv_driver.

The same comment applies to all other modifications.

Thanks,
Wei.


Re: linux-next: build warnings after merge of the powerpc-objtool tree

2022-12-06 Thread Naveen N. Rao

Sathvika Vasireddy wrote:


On 29/11/22 20:58, Christophe Leroy wrote:


On 29/11/2022 at 16:13, Sathvika Vasireddy wrote:

Hi all,

On 25/11/22 09:00, Stephen Rothwell wrote:

Hi all,

After merging the powerpc-objtool tree, today's linux-next build (powerpc
pseries_le_defconfig) produced these warnings:

arch/powerpc/kernel/head_64.o: warning: objtool: end_first_256B():
can't find starting instruction
arch/powerpc/kernel/optprobes_head.o: warning: objtool:
optprobe_template_end(): can't find starting instruction

I have no idea what started this (they may have been there yesterday).

I was able to recreate the above mentioned warnings with
pseries_le_defconfig and powernv_defconfig. The regression report also
mentions a warning
(https://lore.kernel.org/oe-kbuild-all/202211282102.qur7hhrw-...@intel.com/) 
seen with arch/powerpc/kernel/kvm_emul.S assembly file.

   [1] arch/powerpc/kernel/optprobes_head.o: warning: objtool:
optprobe_template_end(): can't find starting instruction
   [2] arch/powerpc/kernel/kvm_emul.o: warning: objtool:
kvm_template_end(): can't find starting instruction
   [3] arch/powerpc/kernel/head_64.o: warning: objtool: end_first_256B():
can't find starting instruction

The warnings [1] and [2] go away after adding a 'nop' instruction. The
below diff fixes it for me:

You have to add NOPs just because those labels are at the end of the
files. That's a bit odd.
I think either we are missing some kind of flagging for the symbols, or
objtool has a bug. In both cases, I'm not sure adding an artificial
'nop' is the solution. At least there should be a big hammer warning
explaining why.


The problem looks to be that commit dbcdbdfdf137b4 ("objtool: Rework 
instruction -> symbol mapping"), which was referenced by Sathvika below, 
changes how STT_NOTYPE symbols are handled. In the files throwing that 
warning, there are labels either at the very end of the file, or at the 
end of a section with no subsequent instruction. Before that commit, we 
didn't expect an instruction for STT_NOTYPE symbols.




I don't see these warnings with powerpc/topic/objtool branch. However, 
they are seen with linux-next master branch.
Commit dbcdbdfdf137b49144204571f1a5e5dc01b8aaad ("objtool: Rework
instruction -> symbol mapping") in linux-next is resulting in objtool
"can't find starting instruction" warnings on powerpc.


Reverting this particular hunk (pasted below) resolves it, and we don't
see the problem anymore.


@@ -427,7 +427,10 @@ static int decode_instructions(struct objtool_file *file)

 	}

 	list_for_each_entry(func, &sec->symbol_list, list) {
-		if (func->type != STT_FUNC || func->alias != func)
+		if (func->type != STT_NOTYPE && func->type != STT_FUNC)
+			continue;
+
+		if (func->return_thunk || func->alias != func)
 			continue;

 		if (!find_insn(file, sec, func->offset)) {


We are currently bailing out if find_insn() there fails. Should we 
instead just continue by not setting insn->sym?


@@ -430,11 +430,8 @@ static int decode_instructions(struct objtool_file *file)
 		if (func->return_thunk || func->alias != func)
 			continue;

-		if (!find_insn(file, sec, func->offset)) {
-			WARN("%s(): can't find starting instruction",
-			     func->name);
-			return -1;
-		}
+		if (!find_insn(file, sec, func->offset))
+			continue;

 		sym_for_each_insn(file, func, insn) {
 			insn->sym = func;



- Naveen



[PATCH AUTOSEL 5.10 02/10] ASoC: fsl_micfil: explicitly clear CHnF flags

2022-12-06 Thread Sasha Levin
From: Shengjiu Wang 

[ Upstream commit b776c4a4618ec1b5219d494c423dc142f23c4e8f ]

There may be a failure when starting 1-channel recording after
8-channel recording. The reason is that the CHnF
flags are not successfully cleared by software reset.

This issue is triggered by the change that clears the
software reset bit.

CHnF flags are write-1-to-clear bits. Clear them with a
forced write.
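
The choice of regmap_write_bits() is the point here: unlike
regmap_update_bits(), which may skip the bus access when the cached
value already matches, regmap_write_bits() always performs the write,
which is exactly what a write-1-to-clear status register needs. A
minimal sketch (the wrapper function is hypothetical):

static int micfil_clear_chnf(struct regmap *map)
{
	/*
	 * regmap_update_bits(map, REG_MICFIL_STAT, 0xFF, 0xFF) could be
	 * elided as a no-op when the cache already holds 0xFF;
	 * regmap_write_bits() always reaches the hardware, so the
	 * write-1-to-clear CHnF flags really get cleared.
	 */
	return regmap_write_bits(map, REG_MICFIL_STAT, 0xFF, 0xFF);
}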

Signed-off-by: Shengjiu Wang 
Link: 
https://lore.kernel.org/r/1651925654-32060-2-git-send-email-shengjiu.w...@nxp.com
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_micfil.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c
index ead4bfa13561..6c794605e33c 100644
--- a/sound/soc/fsl/fsl_micfil.c
+++ b/sound/soc/fsl/fsl_micfil.c
@@ -201,6 +201,14 @@ static int fsl_micfil_reset(struct device *dev)
if (ret)
return ret;
 
+   /*
+* Set SRES should clear CHnF flags, But even add delay here
+* the CHnF may not be cleared sometimes, so clear CHnF explicitly.
+*/
+   ret = regmap_write_bits(micfil->regmap, REG_MICFIL_STAT, 0xFF, 0xFF);
+   if (ret)
+   return ret;
+
return 0;
 }
 
-- 
2.35.1



[PATCH AUTOSEL 5.10 01/10] ASoC: fsl_micfil: explicitly clear software reset bit

2022-12-06 Thread Sasha Levin
From: Shengjiu Wang 

[ Upstream commit 292709b9cf3ba470af94b62c9bb60284cc581b79 ]

SRES is a self-clearing bit, but REG_MICFIL_CTRL1 is defined as a
non-volatile register, so the bit remains set in the regmap cache
after it is written; then, on every update of REG_MICFIL_CTRL1, a
software reset happens. To avoid this, clear it explicitly.
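
An alternative (not taken here) would be to declare the register
volatile so regmap never caches it, at the cost of a bus read on every
access. Roughly, in the driver's regmap config (a sketch, not the
driver's actual table):

static bool micfil_volatile_reg(struct device *dev, unsigned int reg)
{
	return reg == REG_MICFIL_CTRL1;	/* never cache this register */
}

static const struct regmap_config micfil_regmap_config = {
	.reg_bits	= 32,
	.val_bits	= 32,
	.volatile_reg	= micfil_volatile_reg,
	/* other fields omitted */
};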

Signed-off-by: Shengjiu Wang 
Link: 
https://lore.kernel.org/r/1651925654-32060-1-git-send-email-shengjiu.w...@nxp.com
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_micfil.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c
index efc5daf53bba..ead4bfa13561 100644
--- a/sound/soc/fsl/fsl_micfil.c
+++ b/sound/soc/fsl/fsl_micfil.c
@@ -190,6 +190,17 @@ static int fsl_micfil_reset(struct device *dev)
return ret;
}
 
+   /*
+* SRES is self-cleared bit, but REG_MICFIL_CTRL1 is defined
+* as non-volatile register, so SRES still remain in regmap
+* cache after set, that every update of REG_MICFIL_CTRL1,
+* software reset happens. so clear it explicitly.
+*/
+   ret = regmap_clear_bits(micfil->regmap, REG_MICFIL_CTRL1,
+   MICFIL_CTRL1_SRES);
+   if (ret)
+   return ret;
+
return 0;
 }
 
-- 
2.35.1



[PATCH AUTOSEL 5.15 02/12] ASoC: fsl_micfil: explicitly clear CHnF flags

2022-12-06 Thread Sasha Levin
From: Shengjiu Wang 

[ Upstream commit b776c4a4618ec1b5219d494c423dc142f23c4e8f ]

There may be a failure when starting 1-channel recording after
8-channel recording. The reason is that the CHnF
flags are not successfully cleared by software reset.

This issue is triggered by the change that clears the
software reset bit.

CHnF flags are write-1-to-clear bits. Clear them with a
forced write.

Signed-off-by: Shengjiu Wang 
Link: 
https://lore.kernel.org/r/1651925654-32060-2-git-send-email-shengjiu.w...@nxp.com
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_micfil.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c
index cb84d95c3aac..d1cd104f8584 100644
--- a/sound/soc/fsl/fsl_micfil.c
+++ b/sound/soc/fsl/fsl_micfil.c
@@ -202,6 +202,14 @@ static int fsl_micfil_reset(struct device *dev)
if (ret)
return ret;
 
+   /*
+* Set SRES should clear CHnF flags, But even add delay here
+* the CHnF may not be cleared sometimes, so clear CHnF explicitly.
+*/
+   ret = regmap_write_bits(micfil->regmap, REG_MICFIL_STAT, 0xFF, 0xFF);
+   if (ret)
+   return ret;
+
return 0;
 }
 
-- 
2.35.1



[PATCH AUTOSEL 5.15 01/12] ASoC: fsl_micfil: explicitly clear software reset bit

2022-12-06 Thread Sasha Levin
From: Shengjiu Wang 

[ Upstream commit 292709b9cf3ba470af94b62c9bb60284cc581b79 ]

SRES is a self-clearing bit, but REG_MICFIL_CTRL1 is defined as a
non-volatile register, so the bit remains set in the regmap cache
after it is written; then, on every update of REG_MICFIL_CTRL1, a
software reset happens. To avoid this, clear it explicitly.

Signed-off-by: Shengjiu Wang 
Link: 
https://lore.kernel.org/r/1651925654-32060-1-git-send-email-shengjiu.w...@nxp.com
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_micfil.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c
index 9f90989ac59a..cb84d95c3aac 100644
--- a/sound/soc/fsl/fsl_micfil.c
+++ b/sound/soc/fsl/fsl_micfil.c
@@ -191,6 +191,17 @@ static int fsl_micfil_reset(struct device *dev)
return ret;
}
 
+   /*
+* SRES is self-cleared bit, but REG_MICFIL_CTRL1 is defined
+* as non-volatile register, so SRES still remain in regmap
+* cache after set, that every update of REG_MICFIL_CTRL1,
+* software reset happens. so clear it explicitly.
+*/
+   ret = regmap_clear_bits(micfil->regmap, REG_MICFIL_CTRL1,
+   MICFIL_CTRL1_SRES);
+   if (ret)
+   return ret;
+
return 0;
 }
 
-- 
2.35.1



[PATCH AUTOSEL 6.0 02/13] ASoC: fsl_micfil: explicitly clear CHnF flags

2022-12-06 Thread Sasha Levin
From: Shengjiu Wang 

[ Upstream commit b776c4a4618ec1b5219d494c423dc142f23c4e8f ]

There may be a failure when starting 1-channel recording after
8-channel recording. The reason is that the CHnF
flags are not successfully cleared by software reset.

This issue is triggered by the change that clears the
software reset bit.

CHnF flags are write-1-to-clear bits. Clear them with a
forced write.

Signed-off-by: Shengjiu Wang 
Link: 
https://lore.kernel.org/r/1651925654-32060-2-git-send-email-shengjiu.w...@nxp.com
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_micfil.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c
index 8aa6871e0d42..4b86ef82fd93 100644
--- a/sound/soc/fsl/fsl_micfil.c
+++ b/sound/soc/fsl/fsl_micfil.c
@@ -205,6 +205,14 @@ static int fsl_micfil_reset(struct device *dev)
if (ret)
return ret;
 
+   /*
+* Set SRES should clear CHnF flags, But even add delay here
+* the CHnF may not be cleared sometimes, so clear CHnF explicitly.
+*/
+   ret = regmap_write_bits(micfil->regmap, REG_MICFIL_STAT, 0xFF, 0xFF);
+   if (ret)
+   return ret;
+
return 0;
 }
 
-- 
2.35.1



[PATCH AUTOSEL 6.0 01/13] ASoC: fsl_micfil: explicitly clear software reset bit

2022-12-06 Thread Sasha Levin
From: Shengjiu Wang 

[ Upstream commit 292709b9cf3ba470af94b62c9bb60284cc581b79 ]

SRES is a self-clearing bit, but REG_MICFIL_CTRL1 is defined as a
non-volatile register, so the bit remains set in the regmap cache
after it is written; then, on every update of REG_MICFIL_CTRL1, a
software reset happens. To avoid this, clear it explicitly.

Signed-off-by: Shengjiu Wang 
Link: 
https://lore.kernel.org/r/1651925654-32060-1-git-send-email-shengjiu.w...@nxp.com
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_micfil.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c
index 79ef4e269bc9..8aa6871e0d42 100644
--- a/sound/soc/fsl/fsl_micfil.c
+++ b/sound/soc/fsl/fsl_micfil.c
@@ -194,6 +194,17 @@ static int fsl_micfil_reset(struct device *dev)
if (ret)
return ret;
 
+   /*
+* SRES is self-cleared bit, but REG_MICFIL_CTRL1 is defined
+* as non-volatile register, so SRES still remain in regmap
+* cache after set, that every update of REG_MICFIL_CTRL1,
+* software reset happens. so clear it explicitly.
+*/
+   ret = regmap_clear_bits(micfil->regmap, REG_MICFIL_CTRL1,
+   MICFIL_CTRL1_SRES);
+   if (ret)
+   return ret;
+
return 0;
 }
 
-- 
2.35.1