Re: [PATCH v8 4/9] phy: fsl: Add Lynx 10G SerDes driver
Quoting Sean Anderson (2022-11-01 16:27:21) > On 11/1/22 16:10, Stephen Boyd wrote: > >> > >> Oh, I remember why I did this. I need the reference clock for > >> clk_hw_round_rate, > >> which is AFAICT the only correct way to implement round_rate. > >> > > > > Is the reference clk the parent of the clk implementing > > clk_ops::round_rate()? > > Yes. We may be able to produce a given output with multiple reference > rates. However, the clock API provides no mechanism to say "Don't ask > for the parent clock to be rate X, you just tried it and the parent > clock can't support it." So instead, we loop over the possible reference > rates and pick the first one which the parent says it can round to. > Sorry, I'm lost. Why can't you loop over possible reference rates in determine_rate/round_rate clk op here?
Re: [PATCH] powerpc/ftrace: fix syscall tracing on PPC64_ELF_ABI_V1
Mathieu Desnoyers writes: > On 2022-12-05 17:50, Michael Ellerman wrote: >> Michael Jeanson writes: >>> On 2022-12-05 15:11, Michael Jeanson wrote: >>> Michael Jeanson writes: In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table changing from their dot prefixed variant to the non-prefixed ones. Since ftrace prefixes a dot to the syscall names when matching them to build its syscall event list, this resulted in no syscall events being available. Remove the PPC64_ELF_ABI_V1 specific version of arch_syscall_match_sym_name to have the same behavior across all powerpc variants. >>> >>> This doesn't seem to work for me. >>> >>> Event with it applied I still don't see anything in >>> /sys/kernel/debug/tracing/events/syscalls >>> >>> Did we break it in some other way recently? >>> >>> cheers >>> >>> I did some further testing, my config also enabled KALLSYMS_ALL, when I >>> remove >>> it there is indeed no syscall events. >> >> Aha, OK that explains it I guess. >> >> I was using ppc64_guest_defconfig which has ABI_V1 and FTRACE_SYSCALLS, >> but does not have KALLSYMS_ALL. So I guess there's some other bug >> lurking in there. > > I don't have the setup handy to validate it, but I suspect it is caused > by the way scripts/kallsyms.c:symbol_valid() checks whether a symbol > entry needs to be integrated into the assembler output when > --all-symbols is not specified. It only keeps symbols which addresses > are in the text range. On PPC64_ELF_ABI_V1, this means only the > dot-prefixed symbols will be kept (those point to the function begin), > leaving out the non-dot-prefixed symbols (those point to the function > descriptors). OK. So I guess it never worked without KALLSYMS_ALL. It seems like most distros enable KALLSYMS_ALL, so I guess that's why we've never noticed. > So I see two possible solutions there: either we ensure that > FTRACE_SYSCALLS selects KALLSYMS_ALL on PPC64_ELF_ABI_V1, or we modify > scripts/kallsyms.c:symbol_valid() to also include function descriptor > symbols. This would mean accepting symbols pointing into the .opd ELF > section. My only worry is that will cause some other breakage, because .opd symbols are not really "text" in the normal sense, ie. you can't execute them directly. On the other hand the help for KALLSYMS_ALL says: "Normally kallsyms only contains the symbols of functions" But without .opd included that's not really true. In practice it probably doesn't really matter, because eg. backtraces will point to dot symbols which can be resolved. > IMHO the second option would be better because it does not increase the > kernel image size as much as KALLSYMS_ALL. Yes I agree. Even if that did break something, any breakage would be limited to arches which uses function descriptors, which are now all rare. Relatedly we have a patch in next to optionally use ABIv2 for 64-bit big endian builds. cheers
[powerpc:next] BUILD SUCCESS 5ddcc03a07ae1ab5062f89a946d9495f1fd8eaa4
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next branch HEAD: 5ddcc03a07ae1ab5062f89a946d9495f1fd8eaa4 powerpc/cpuidle: Set CPUIDLE_FLAG_POLLING for snooze state elapsed time: 728m configs tested: 60 configs skipped: 2 The following configs have been built successfully. More configs may be tested in the coming days. gcc tested configs: um i386_defconfig um x86_64_defconfig powerpc allnoconfig arc defconfig s390 allmodconfig alpha defconfig x86_64 rhel-8.3-kunit x86_64 rhel-8.3-kvm s390defconfig x86_64 rhel-8.3-syz sh allmodconfig s390 allyesconfig i386defconfig powerpc allmodconfig mips allyesconfig arm defconfig x86_64 rhel-8.3-rust x86_64rhel-8.3-kselftests x86_64 defconfig x86_64 rhel-8.3-func arm randconfig-r046-20221206 i386 randconfig-a014 arc randconfig-r043-20221206 i386 randconfig-a012 i386 randconfig-a016 arm64allyesconfig arm allyesconfig x86_64randconfig-a013 i386 randconfig-a001 x86_64randconfig-a011 x86_64randconfig-a004 i386 randconfig-a003 i386 allyesconfig x86_64randconfig-a002 x86_64 rhel-8.3 x86_64randconfig-a015 x86_64randconfig-a006 x86_64 allyesconfig i386 randconfig-a005 m68k allyesconfig m68k allmodconfig ia64 allmodconfig arc allyesconfig alphaallyesconfig clang tested configs: hexagon randconfig-r041-20221206 i386 randconfig-a013 hexagon randconfig-r045-20221206 s390 randconfig-r044-20221206 riscvrandconfig-r042-20221206 i386 randconfig-a011 i386 randconfig-a015 i386 randconfig-a002 x86_64randconfig-a012 x86_64randconfig-a001 i386 randconfig-a004 x86_64randconfig-a003 x86_64randconfig-a014 x86_64randconfig-a016 x86_64randconfig-a005 i386 randconfig-a006 -- 0-DAY CI Kernel Test Service https://01.org/lkp
Re: [PATCH v5 2/5] powerpc: mm: Implement p{m,u,4}d_leaf on all platforms
Great job spotting this. Somehow lost these throughout the revisions. Thanks. > On 7 Dec 2022, at 9:24 am, Michael Ellerman wrote: > > Rohan McLure writes: >> The check that a higher-level entry in multi-level pages contains a page >> translation entry (pte) is performed by p{m,u,4}d_leaf stubs, which may >> be specialised for each choice of mmu. In a prior commit, we replace >> uses to the catch-all stubs, p{m,u,4}d_is_leaf with p{m,u,4}d_leaf. >> >> Replace the catch-all stub definitions for p{m,u,4}d_is_leaf with >> definitions for p{m,u,4}d_leaf. A future patch will assume that >> p{m,u,4}d_leaf is defined on all platforms. >> >> In particular, implement pud_leaf for Book3E-64, pmd_leaf for all Book3E >> and Book3S-64 platforms, with a catch-all definition for p4d_leaf. >> >> Signed-off-by: Rohan McLure >> --- >> v5: Split patch that replaces p{m,u,4}d_is_leaf into two patches, first >> replacing callsites and afterward providing generic definition. >> Remove ifndef-defines implementing p{m,u}d_leaf in favour of >> implementing stubs in headers belonging to the particular platforms >> needing them. >> --- >> arch/powerpc/include/asm/book3s/32/pgtable.h | 4 >> arch/powerpc/include/asm/book3s/64/pgtable.h | 8 ++- >> arch/powerpc/include/asm/nohash/64/pgtable.h | 5 + >> arch/powerpc/include/asm/nohash/pgtable.h| 5 + >> arch/powerpc/include/asm/pgtable.h | 22 ++-- >> 5 files changed, 18 insertions(+), 26 deletions(-) > > I needed the delta below to prevent the generic versions being defined > and overriding our versions. > > cheers > > diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h > b/arch/powerpc/include/asm/book3s/32/pgtable.h > index 44703c8c590c..117135be8cc2 100644 > --- a/arch/powerpc/include/asm/book3s/32/pgtable.h > +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h > @@ -244,6 +244,7 @@ static inline void pmd_clear(pmd_t *pmdp) > *pmdp = __pmd(0); > } > > +#define pmd_leaf pmd_leaf > static inline bool pmd_leaf(pmd_t pmd) > { > return false; > diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h > b/arch/powerpc/include/asm/book3s/64/pgtable.h > index 436632d04304..f00aa2d203c2 100644 > --- a/arch/powerpc/include/asm/book3s/64/pgtable.h > +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h > @@ -1438,11 +1438,13 @@ static inline bool is_pte_rw_upgrade(unsigned long > old_val, unsigned long new_va > /* > * Like pmd_huge() and pmd_large(), but works regardless of config options > */ > +#define pmd_leaf pmd_leaf > static inline bool pmd_leaf(pmd_t pmd) > { > return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE)); > } > > +#define pud_leaf pud_leaf > static inline bool pud_leaf(pud_t pud) > { > return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE)); > diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h > b/arch/powerpc/include/asm/nohash/64/pgtable.h > index 2488da8f0deb..d88b22c753d3 100644 > --- a/arch/powerpc/include/asm/nohash/64/pgtable.h > +++ b/arch/powerpc/include/asm/nohash/64/pgtable.h > @@ -147,6 +147,7 @@ static inline void pud_clear(pud_t *pudp) > *pudp = __pud(0); > } > > +#define pud_leaf pud_leaf > static inline bool pud_leaf(pud_t pud) > { > return false; > diff --git a/arch/powerpc/include/asm/nohash/pgtable.h > b/arch/powerpc/include/asm/nohash/pgtable.h > index 487804f5b1d1..dfae1dbb9c3b 100644 > --- a/arch/powerpc/include/asm/nohash/pgtable.h > +++ b/arch/powerpc/include/asm/nohash/pgtable.h > @@ -60,6 +60,7 @@ static inline bool pte_hw_valid(pte_t pte) > return pte_val(pte) & _PAGE_PRESENT; > } > > +#define pmd_leaf pmd_leaf > static inline bool pmd_leaf(pmd_t pmd) > { > return false;
Re: [PATCH] powerpc/ftrace: fix syscall tracing on PPC64_ELF_ABI_V1
Le 06/12/2022 à 15:38, Mathieu Desnoyers a écrit : > On 2022-12-05 17:50, Michael Ellerman wrote: >> Michael Jeanson writes: >>> On 2022-12-05 15:11, Michael Jeanson wrote: >>> Michael Jeanson writes: In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table changing from their dot prefixed variant to the non-prefixed ones. Since ftrace prefixes a dot to the syscall names when matching them to build its syscall event list, this resulted in no syscall events being available. Remove the PPC64_ELF_ABI_V1 specific version of arch_syscall_match_sym_name to have the same behavior across all powerpc variants. >>> >>> This doesn't seem to work for me. >>> >>> Event with it applied I still don't see anything in >>> /sys/kernel/debug/tracing/events/syscalls >>> >>> Did we break it in some other way recently? >>> >>> cheers >>> >>> I did some further testing, my config also enabled KALLSYMS_ALL, when >>> I remove >>> it there is indeed no syscall events. >> >> Aha, OK that explains it I guess. >> >> I was using ppc64_guest_defconfig which has ABI_V1 and FTRACE_SYSCALLS, >> but does not have KALLSYMS_ALL. So I guess there's some other bug >> lurking in there. > > I don't have the setup handy to validate it, but I suspect it is caused > by the way scripts/kallsyms.c:symbol_valid() checks whether a symbol > entry needs to be integrated into the assembler output when > --all-symbols is not specified. It only keeps symbols which addresses > are in the text range. On PPC64_ELF_ABI_V1, this means only the > dot-prefixed symbols will be kept (those point to the function begin), > leaving out the non-dot-prefixed symbols (those point to the function > descriptors). > > So I see two possible solutions there: either we ensure that > FTRACE_SYSCALLS selects KALLSYMS_ALL on PPC64_ELF_ABI_V1, or we modify > scripts/kallsyms.c:symbol_valid() to also include function descriptor > symbols. This would mean accepting symbols pointing into the .opd ELF > section. > > IMHO the second option would be better because it does not increase the > kernel image size as much as KALLSYMS_ALL. > Yes, seems to be the best solution. Maybe the only thing to do is to add a new range to text_ranges, something like (untested): diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c index 03fa07ad45d9..decf31c497f5 100644 --- a/scripts/kallsyms.c +++ b/scripts/kallsyms.c @@ -64,6 +64,7 @@ static unsigned long long relative_base; static struct addr_range text_ranges[] = { { "_stext", "_etext" }, { "_sinittext", "_einittext" }, + { "__start_opd", "__end_opd" }, }; #define text_range_text (_ranges[0]) #define text_range_inittext (_ranges[1]) --- Christophe
Re: [PATCH v5 5/5] powerpc: mm: support page table check
Rohan McLure writes: > On creation and clearing of a page table mapping, instrument such calls > by invoking page_table_check_pte_set and page_table_check_pte_clear > respectively. These calls serve as a sanity check against illegal > mappings. > > Enable ARCH_SUPPORTS_PAGE_TABLE_CHECK for all ppc64, and 32-bit > platforms implementing Book3S. > > Change pud_pfn to be a runtime bug rather than a build bug as it is > consumed by page_table_check_pud_{clear,set} which are not called. > > See also: > > riscv support in commit 3fee229a8eb9 ("riscv/mm: enable > ARCH_SUPPORTS_PAGE_TABLE_CHECK") > arm64 in commit 42b2547137f5 ("arm64/mm: enable > ARCH_SUPPORTS_PAGE_TABLE_CHECK") > x86_64 in commit d283d422c6c4 ("x86: mm: add x86_64 support for page table > check") > > Reviewed-by: Russell Currey > Reviewed-by: Christophe Leroy > Signed-off-by: Rohan McLure This blows up for me when checking is enabled. This is a qemu pseries KVM guest on a P9 host, booting Fedora 34. I haven't dug into what is wrong yet. cheers [0.600480][ T63] [ cut here ] [0.600546][ T63] kernel BUG at mm/page_table_check.c:115! [0.600596][ T63] Oops: Exception in kernel mode, sig: 5 [#1] [0.600645][ T63] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries [0.600703][ T63] Modules linked in: [0.600736][ T63] CPU: 0 PID: 63 Comm: systemd-bless-b Not tainted 6.1.0-rc2-00178-gf0c0e10f5162 #65 [0.600803][ T63] Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf05 of:SLOF,git-5b4c5a hv:linux,kvm pSeries [0.600885][ T63] NIP: c04a7d58 LR: c04a7d74 CTR: [0.600942][ T63] REGS: c67635e0 TRAP: 0700 Not tainted (6.1.0-rc2-00178-gf0c0e10f5162) [0.601008][ T63] MSR: 80029033 CR: 44424420 XER: 0003 [0.601088][ T63] CFAR: c04a7d4c IRQMASK: 0 [0.601088][ T63] GPR00: c04a7d74 c6763880 c1332500 c3a00408 [0.601088][ T63] GPR04: 0001 0001 8603402f00c0 [0.601088][ T63] GPR08: 05e0 0001 c2702500 2000 [0.601088][ T63] GPR12: 7fffbc81 c2a2 4000 4000 [0.601088][ T63] GPR16: 4000 4000 [0.601088][ T63] GPR20: ff7fefbf c3555a00 0001 [0.601088][ T63] GPR24: c28dda60 c00c00020ac0 c82a1100 c00c000bd000 [0.601088][ T63] GPR28: 0001 c26fcf60 0001 c3a00400 [0.601614][ T63] NIP [c04a7d58] page_table_check_set.part.0+0xc8/0x170 [0.601675][ T63] LR [c04a7d74] page_table_check_set.part.0+0xe4/0x170 [0.601734][ T63] Call Trace: [0.601764][ T63] [c6763880] [c67638c0] 0xc67638c0 (unreliable) [0.601825][ T63] [c67638c0] [c0087c28] set_pte_at+0x68/0x210 [0.601884][ T63] [c6763910] [c0483fd8] __split_huge_pmd+0x7f8/0x11c0 [0.601947][ T63] [c6763a20] [c0485908] vma_adjust_trans_huge+0x158/0x2d0 [0.602006][ T63] [c6763a70] [c040b5dc] __vma_adjust+0x13c/0xbe0 [0.602067][ T63] [c6763b80] [c040d708] __split_vma+0x158/0x270 [0.602128][ T63] [c6763bd0] [c040d938] do_mas_align_munmap.constprop.0+0x118/0x610 [0.602196][ T63] [c6763cd0] [c0416228] sys_mremap+0x3c8/0x850 [0.602255][ T63] [c6763e10] [c002fab8] system_call_exception+0x128/0x330 [0.602314][ T63] [c6763e50] [c000d05c] system_call_vectored_common+0x15c/0x2ec [0.602384][ T63] --- interrupt: 3000 at 0x7fffbd94f86c [0.602425][ T63] NIP: 7fffbd94f86c LR: CTR: [0.602481][ T63] REGS: c6763e80 TRAP: 3000 Not tainted (6.1.0-rc2-00178-gf0c0e10f5162) [0.602546][ T63] MSR: 8280f033 CR: 44004422 XER: [0.602635][ T63] IRQMASK: 0 [0.602635][ T63] GPR00: 00a3 72c4e670 7fffbda37000 7fffbc40 [0.602635][ T63] GPR04: 0041 0001 0001 [0.602635][ T63] GPR08: 7fffbc41 [0.602635][ T63] GPR12: 7fffbc937d50 [0.602635][ T63] GPR16: [0.602635][ T63] GPR20: 046b 003f 72c4e7e8 0040 [0.602635][ T63] GPR24: 0480 0041 [0.602635][ T63] GPR28: 7fffbc40 0001
Re: [PATCH v5 2/5] powerpc: mm: Implement p{m,u,4}d_leaf on all platforms
Rohan McLure writes: > The check that a higher-level entry in multi-level pages contains a page > translation entry (pte) is performed by p{m,u,4}d_leaf stubs, which may > be specialised for each choice of mmu. In a prior commit, we replace > uses to the catch-all stubs, p{m,u,4}d_is_leaf with p{m,u,4}d_leaf. > > Replace the catch-all stub definitions for p{m,u,4}d_is_leaf with > definitions for p{m,u,4}d_leaf. A future patch will assume that > p{m,u,4}d_leaf is defined on all platforms. > > In particular, implement pud_leaf for Book3E-64, pmd_leaf for all Book3E > and Book3S-64 platforms, with a catch-all definition for p4d_leaf. > > Signed-off-by: Rohan McLure > --- > v5: Split patch that replaces p{m,u,4}d_is_leaf into two patches, first > replacing callsites and afterward providing generic definition. > Remove ifndef-defines implementing p{m,u}d_leaf in favour of > implementing stubs in headers belonging to the particular platforms > needing them. > --- > arch/powerpc/include/asm/book3s/32/pgtable.h | 4 > arch/powerpc/include/asm/book3s/64/pgtable.h | 8 ++- > arch/powerpc/include/asm/nohash/64/pgtable.h | 5 + > arch/powerpc/include/asm/nohash/pgtable.h| 5 + > arch/powerpc/include/asm/pgtable.h | 22 ++-- > 5 files changed, 18 insertions(+), 26 deletions(-) I needed the delta below to prevent the generic versions being defined and overriding our versions. cheers diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index 44703c8c590c..117135be8cc2 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -244,6 +244,7 @@ static inline void pmd_clear(pmd_t *pmdp) *pmdp = __pmd(0); } +#define pmd_leaf pmd_leaf static inline bool pmd_leaf(pmd_t pmd) { return false; diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h index 436632d04304..f00aa2d203c2 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -1438,11 +1438,13 @@ static inline bool is_pte_rw_upgrade(unsigned long old_val, unsigned long new_va /* * Like pmd_huge() and pmd_large(), but works regardless of config options */ +#define pmd_leaf pmd_leaf static inline bool pmd_leaf(pmd_t pmd) { return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE)); } +#define pud_leaf pud_leaf static inline bool pud_leaf(pud_t pud) { return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE)); diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h index 2488da8f0deb..d88b22c753d3 100644 --- a/arch/powerpc/include/asm/nohash/64/pgtable.h +++ b/arch/powerpc/include/asm/nohash/64/pgtable.h @@ -147,6 +147,7 @@ static inline void pud_clear(pud_t *pudp) *pudp = __pud(0); } +#define pud_leaf pud_leaf static inline bool pud_leaf(pud_t pud) { return false; diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h index 487804f5b1d1..dfae1dbb9c3b 100644 --- a/arch/powerpc/include/asm/nohash/pgtable.h +++ b/arch/powerpc/include/asm/nohash/pgtable.h @@ -60,6 +60,7 @@ static inline bool pte_hw_valid(pte_t pte) return pte_val(pte) & _PAGE_PRESENT; } +#define pmd_leaf pmd_leaf static inline bool pmd_leaf(pmd_t pmd) { return false;
Re: [PATCH v3 4/9] scsi: lpfc: Change to use pci_aer_clear_uncorrect_error_status()
[moved James, Dick, LPFC supporters to "to"] On Wed, Sep 28, 2022 at 06:59:41PM +0800, Zhuo Chen wrote: > lpfc_aer_cleanup_state() requires clearing both fatal and non-fatal > uncorrectable error status. I don't know what the point of lpfc_aer_cleanup_state() is. AER errors should be handled and cleared by the PCI core, not by individual drivers. Only lpfc, liquidio, and sky2 touch PCI_ERR_UNCOR_STATUS. But lpfc_aer_cleanup_state() is visible in the "lpfc_aer_state_cleanup" sysfs file, so removing it would break any userspace that uses it. If we can rely on the PCI core to clean up AER errors itself (admittedly, that might be a big "if"), maybe lpfc_aer_cleanup_state() could just become a no-op? Any comment from the LPFC folks? Ideally, I would rather not export pci_aer_clear_nonfatal_status() or pci_aer_clear_uncorrect_error_status() outside the PCI core at all. > But using pci_aer_clear_nonfatal_status() > will only clear non-fatal error status. To clear both fatal and > non-fatal error status, use pci_aer_clear_uncorrect_error_status(). > > Signed-off-by: Zhuo Chen > --- > drivers/scsi/lpfc/lpfc_attr.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c > index 09cf2cd0ae60..d835cc0ba153 100644 > --- a/drivers/scsi/lpfc/lpfc_attr.c > +++ b/drivers/scsi/lpfc/lpfc_attr.c > @@ -4689,7 +4689,7 @@ static DEVICE_ATTR_RW(lpfc_aer_support); > * Description: > * If the @buf contains 1 and the device currently has the AER support > * enabled, then invokes the kernel AER helper routine > - * pci_aer_clear_nonfatal_status() to clean up the uncorrectable > + * pci_aer_clear_uncorrect_error_status() to clean up the uncorrectable > * error status register. > * > * Notes: > @@ -4715,7 +4715,7 @@ lpfc_aer_cleanup_state(struct device *dev, struct > device_attribute *attr, > return -EINVAL; > > if (phba->hba_flag & HBA_AER_ENABLED) > - rc = pci_aer_clear_nonfatal_status(phba->pcidev); > + rc = pci_aer_clear_uncorrect_error_status(phba->pcidev); > > if (rc == 0) > return strlen(buf); > -- > 2.30.1 (Apple Git-130) >
[PATCH mm-unstable RFC 00/26] mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs
This is the follow-up on [1]: [PATCH v2 0/8] mm: COW fixes part 3: reliable GUP R/W FOLL_GET of anonymous pages After we implemented __HAVE_ARCH_PTE_SWP_EXCLUSIVE on most prominent enterprise architectures, implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all remaining architectures that support swap PTEs. This makes sure that exclusive anonymous pages will stay exclusive, even after they were swapped out -- for example, making GUP R/W FOLL_GET of anonymous pages reliable. Details can be found in [1]. This primarily fixes remaining known O_DIRECT memory corruptions that can happen on concurrent swapout, whereby we can lose DMA reads to a page (modifying the user page by writing to it). To verify, there are two test cases (requiring swap space, obviously): (1) The O_DIRECT+swapout test case [2] from Andrea. This test case tries triggering a race condition. (2) My vmsplice() test case [3] that tries to detect if the exclusive marker was lost during swapout, not relying on a race condition. For example, on 32bit x86 (with and without PAE), my test case fails without these patches: $ ./test_swp_exclusive FAIL: page was replaced during COW But succeeds with these patches: $ ./test_swp_exclusive PASS: page was not replaced during COW Why implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE for all architectures, even the ones where swap support might be in a questionable state? This is the first step towards removing "readable_exclusive" migration entries, and instead using pte_swp_exclusive() also with (readable) migration entries instead (as suggested by Peter). The only missing piece for that is supporting pmd_swp_exclusive() on relevant architectures with THP migration support. As all relevant architectures now implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE,, we can drop __HAVE_ARCH_PTE_SWP_EXCLUSIVE in the last patch. RFC because some of the swap PTE layouts are really tricky and I really need some feedback related to deciphering these layouts and "using yet unused PTE bits in swap PTEs". I tried cross-compiling all relevant setups (phew, I might only miss some power/nohash variants), but only tested on x86 so far. CCing arch maintainers only on this cover letter and on the respective patch(es). [1] https://lkml.kernel.org/r/20220329164329.208407-1-da...@redhat.com [2] https://gitlab.com/aarcange/kernel-testcases-for-v5.11/-/blob/main/page_count_do_wp_page-swap.c [3] https://gitlab.com/davidhildenbrand/scratchspace/-/blob/main/test_swp_exclusive.c David Hildenbrand (26): mm/debug_vm_pgtable: more pte_swp_exclusive() sanity checks alpha/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE arc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE csky/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE hexagon/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE ia64/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE loongarch/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE m68k/mm: remove dummy __swp definitions for nommu m68k/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE microblaze/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE mips/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE nios2/mm: refactor swap PTE layout nios2/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE openrisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s powerpc/nohash/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE riscv/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE sh/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 64bit um/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE x86/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also on 32bit xtensa/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVE arch/alpha/include/asm/pgtable.h | 40 - arch/arc/include/asm/pgtable-bits-arcv2.h | 26 +- arch/arm/include/asm/pgtable-2level.h | 3 + arch/arm/include/asm/pgtable-3level.h | 3 + arch/arm/include/asm/pgtable.h| 34 ++-- arch/arm64/include/asm/pgtable.h | 1 - arch/csky/abiv1/inc/abi/pgtable-bits.h| 13 ++- arch/csky/abiv2/inc/abi/pgtable-bits.h| 19 ++-- arch/csky/include/asm/pgtable.h | 17 arch/hexagon/include/asm/pgtable.h| 36 ++-- arch/ia64/include/asm/pgtable.h | 31 ++- arch/loongarch/include/asm/pgtable-bits.h | 4 + arch/loongarch/include/asm/pgtable.h | 38 +++- arch/m68k/include/asm/mcf_pgtable.h | 35 +++- arch/m68k/include/asm/motorola_pgtable.h | 37 +++- arch/m68k/include/asm/pgtable_no.h| 6 -- arch/m68k/include/asm/sun3_pgtable.h | 38 +++- arch/microblaze/include/asm/pgtable.h | 44 +++---
Re: [PATCH v3 8/9] PCI/ERR: Clear fatal error status when pci_channel_io_frozen
Hi Zhuo, On Wed, Sep 28, 2022 at 06:59:45PM +0800, Zhuo Chen wrote: > When state is pci_channel_io_frozen in pcie_do_recovery(), the > severity is fatal and fatal error status should be cleared. > So add pci_aer_clear_fatal_status(). > > Signed-off-by: Zhuo Chen > --- > drivers/pci/pcie/err.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c > index f80b21244ef1..b46f1d36c090 100644 > --- a/drivers/pci/pcie/err.c > +++ b/drivers/pci/pcie/err.c > @@ -241,7 +241,10 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, > pci_walk_bridge(bridge, report_resume, ); > > pcie_clear_device_status(dev); > - pci_aer_clear_nonfatal_status(dev); > + if (state == pci_channel_io_frozen) > + pci_aer_clear_fatal_status(dev); > + else > + pci_aer_clear_nonfatal_status(dev); I'm confused. It seems like we certainly need to clear fatal errors after they occur *somewhere*, and if we don't, surely this would be a very obvious issue. But you didn't mention this being a bug fix, so I assume it's more of a cleanup. If it *is* a bug fix, please say that and give a hint about what the bug looks like, e.g., what sort of messages a user might see. If it's not a bug fix, I don't understand how AER fatal errors get cleared today. The PCI_ERR_UNCOR_STATUS bits are sticky, so they're not cleared by a reset. In the current tree, these are the only places I see that clear AER fatal errors: pci_init_capabilities pci_aer_init # once at device enumeration pci_aer_clear_status pci_aer_raw_clear_status pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status) aer_probe aer_enable_rootport # once at Root Port enumeration pci_write_config_dword(pdev, aer + PCI_ERR_UNCOR_STATUS, reg32) dpc_process_error # after DPC triggered pci_aer_clear_fatal_status pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status) edr_handle_event # after EDR event pci_aer_raw_clear_status pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status) pci_restore_state # after reset or PM sleep/resume pci_aer_clear_status pci_aer_raw_clear_status pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status) The only one that could clear errors after an AER error (not DPC or EDR), would be the pci_restore_state() in the reset path. If the current code relies on that, I'd say that's a pretty non-obvious dependency. > pci_info(bridge, "device recovery successful\n"); > return status; > -- > 2.30.1 (Apple Git-130) >
Re: [PATCH v3 3/9] NTB: Remove pci_aer_clear_nonfatal_status() call
Hi Bjorn On Tue, Dec 06, 2022 at 12:09:56PM -0600, Bjorn Helgaas wrote: > On Wed, Sep 28, 2022 at 02:03:55PM +0300, Serge Semin wrote: > > On Wed, Sep 28, 2022 at 06:59:40PM +0800, Zhuo Chen wrote: > > > There is no need to clear error status during init code, so remove it. > > > > Why do you think there isn't? Justify in more details. > > Thanks for taking a look, Sergey! I agree we should leave it or add > the rationale here. > > > > Signed-off-by: Zhuo Chen > > > --- > > > drivers/ntb/hw/idt/ntb_hw_idt.c | 2 -- > > > 1 file changed, 2 deletions(-) > > > > > > diff --git a/drivers/ntb/hw/idt/ntb_hw_idt.c > > > b/drivers/ntb/hw/idt/ntb_hw_idt.c > > > index 0ed6f809ff2e..fed03217289d 100644 > > > --- a/drivers/ntb/hw/idt/ntb_hw_idt.c > > > +++ b/drivers/ntb/hw/idt/ntb_hw_idt.c > > > @@ -2657,8 +2657,6 @@ static int idt_init_pci(struct idt_ntb_dev *ndev) > > > ret = pci_enable_pcie_error_reporting(pdev); > > > if (ret != 0) > > > dev_warn(>dev, "PCIe AER capability disabled\n"); > > > - else /* Cleanup nonfatal error status before getting to init */ > > > - pci_aer_clear_nonfatal_status(pdev); > > I do think drivers should not need to clear errors; I think the PCI > core should be responsible for that. > > And I think the core *does* do that in this path: > > pci_init_capabilities > pci_aer_init > pci_aer_clear_status > pci_aer_raw_clear_status > pci_write_config_dword(pdev, aer + PCI_ERR_COR_STATUS) > pci_write_config_dword(pdev, aer + PCI_ERR_UNCOR_STATUS) > > pci_aer_clear_nonfatal_status() clears only non-fatal uncorrectable > errors, while pci_aer_init() clears all correctable and all > uncorrectable errors, so the PCI core is already doing more than > idt_init_pci() does. > > So I think this change is good because it removes some work from the > driver, but let me know if you think otherwise. It's hard to remember now all the details but IIRC back when this driver was developed the "Unsupported Request" flag was left uncleared on our platform even after the probe completion. Most likely an erroneous TLP was generated by some action performed on the device probe stage. The forced cleanup of the AER status solved that problem. On the other hand the problem of having the UnsupReq+ flag set was solved some time after the driver was merged in into the kernel (it was caused by a vendor-specific behavior of the IDT PCIe switch placed on the path between a RP and PCIe NTB). So since the original reason of having the pci_aer_clear_nonfatal_status() method called here was platform specific and fixed now anyway, and the AER flags cleanup is done by the core, then I have no reason to be against the patch. It would be good to add your clarification to the commit message though. Reviewed-by: Serge Semin -Serge(y) > > > > > > > /* First enable the PCI device */ > > > ret = pcim_enable_device(pdev); > > > -- > > > 2.30.1 (Apple Git-130) > > >
Re: [PATCH 05/11] ARM: dts: socfpga: Fix pca9548 i2c-mux node name
On 12/2/22 10:49, Geert Uytterhoeven wrote: "make dtbs_check": arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dtb: i2cswitch@70: $nodename:0: 'i2cswitch@70' does not match '^(i2c-?)?mux' From schema: Documentation/devicetree/bindings/i2c/i2c-mux-pca954x.yaml arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dtb: i2cswitch@70: Unevaluated properties are not allowed ('#address-cells', '#size-cells', 'i2c@0', 'i2c@1', 'i2c@2', 'i2c@3', 'i2c@4', 'i2c@5', 'i2c@6', 'i2c@7' were unexpected) From schema: Documentation/devicetree/bindings/i2c/i2c-mux-pca954x.yaml Fix this by renaming the PCA9548 node to "i2c-mux", to match the I2C bus multiplexer/switch DT bindings and the Generic Names Recommendation in the Devicetree Specification. Signed-off-by: Geert Uytterhoeven --- arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts b/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts index f24f17c2f5ee6bc4..e0630b0eed036d35 100644 --- a/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts +++ b/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts @@ -141,7 +141,7 @@ at24@50 { reg = <0x50>; }; - i2cswitch@70 { + i2c-mux@70 { compatible = "nxp,pca9548"; #address-cells = <1>; #size-cells = <0>; Applied! Thanks, Dinh
Re: [PATCH v3 3/9] NTB: Remove pci_aer_clear_nonfatal_status() call
On Wed, Sep 28, 2022 at 02:03:55PM +0300, Serge Semin wrote: > On Wed, Sep 28, 2022 at 06:59:40PM +0800, Zhuo Chen wrote: > > There is no need to clear error status during init code, so remove it. > > Why do you think there isn't? Justify in more details. Thanks for taking a look, Sergey! I agree we should leave it or add the rationale here. > > Signed-off-by: Zhuo Chen > > --- > > drivers/ntb/hw/idt/ntb_hw_idt.c | 2 -- > > 1 file changed, 2 deletions(-) > > > > diff --git a/drivers/ntb/hw/idt/ntb_hw_idt.c > > b/drivers/ntb/hw/idt/ntb_hw_idt.c > > index 0ed6f809ff2e..fed03217289d 100644 > > --- a/drivers/ntb/hw/idt/ntb_hw_idt.c > > +++ b/drivers/ntb/hw/idt/ntb_hw_idt.c > > @@ -2657,8 +2657,6 @@ static int idt_init_pci(struct idt_ntb_dev *ndev) > > ret = pci_enable_pcie_error_reporting(pdev); > > if (ret != 0) > > dev_warn(>dev, "PCIe AER capability disabled\n"); > > - else /* Cleanup nonfatal error status before getting to init */ > > - pci_aer_clear_nonfatal_status(pdev); I do think drivers should not need to clear errors; I think the PCI core should be responsible for that. And I think the core *does* do that in this path: pci_init_capabilities pci_aer_init pci_aer_clear_status pci_aer_raw_clear_status pci_write_config_dword(pdev, aer + PCI_ERR_COR_STATUS) pci_write_config_dword(pdev, aer + PCI_ERR_UNCOR_STATUS) pci_aer_clear_nonfatal_status() clears only non-fatal uncorrectable errors, while pci_aer_init() clears all correctable and all uncorrectable errors, so the PCI core is already doing more than idt_init_pci() does. So I think this change is good because it removes some work from the driver, but let me know if you think otherwise. > > > > /* First enable the PCI device */ > > ret = pcim_enable_device(pdev); > > -- > > 2.30.1 (Apple Git-130) > >
[linux-next:master] BUILD REGRESSION 5d562c48a21eeb029a8fd3f18e1b31fd83660474
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master branch HEAD: 5d562c48a21eeb029a8fd3f18e1b31fd83660474 Add linux-next specific files for 20221206 Error/Warning reports: https://lore.kernel.org/oe-kbuild-all/202211231857.0dmueoa1-...@intel.com https://lore.kernel.org/oe-kbuild-all/202211242120.mzzvguln-...@intel.com https://lore.kernel.org/oe-kbuild-all/202211290656.vhedfthu-...@intel.com https://lore.kernel.org/oe-kbuild-all/202211301840.y7rrob13-...@intel.com https://lore.kernel.org/oe-kbuild-all/202212020520.0okmino3-...@intel.com https://lore.kernel.org/oe-kbuild-all/202212032205.iehbbyyp-...@intel.com https://lore.kernel.org/oe-kbuild-all/202212060700.njmecjxs-...@intel.com https://lore.kernel.org/oe-kbuild-all/202212061249.u0basqzk-...@intel.com https://lore.kernel.org/oe-kbuild-all/202212061341.gnalcbx6-...@intel.com https://lore.kernel.org/oe-kbuild-all/202212061455.6ge7y0jg-...@intel.com https://lore.kernel.org/oe-kbuild-all/202212061633.u9qhpe62-...@intel.com https://lore.kernel.org/oe-kbuild-all/202212061758.tlpqnuof-...@intel.com https://lore.kernel.org/oe-kbuild-all/202212061918.w5iupcya-...@intel.com https://lore.kernel.org/oe-kbuild-all/202212062250.tr0othcz-...@intel.com Error/Warning: (recently discovered and may have been fixed) Error: failed to load BTF from vmlinux: No such file or directory arch/loongarch/kernel/asm-offsets.c:262:6: warning: no previous prototype for 'output_pbe_defines' [-Wmissing-prototypes] arch/loongarch/power/hibernate.c:14:6: warning: no previous prototype for 'save_processor_state' [-Wmissing-prototypes] arch/loongarch/power/hibernate.c:26:6: warning: no previous prototype for 'restore_processor_state' [-Wmissing-prototypes] arch/loongarch/power/hibernate.c:38:5: warning: no previous prototype for 'pfn_is_nosave' [-Wmissing-prototypes] arch/loongarch/power/hibernate.c:48:5: warning: no previous prototype for 'swsusp_arch_suspend' [-Wmissing-prototypes] arch/loongarch/power/hibernate.c:56:5: warning: no previous prototype for 'swsusp_arch_resume' [-Wmissing-prototypes] arch/powerpc/kernel/kvm_emul.o: warning: objtool: kvm_template_end(): can't find starting instruction arch/powerpc/kernel/optprobes_head.o: warning: objtool: optprobe_template_end(): can't find starting instruction arch/riscv/kernel/crash_core.c:12:57: warning: format specifies type 'unsigned long' but the argument has type 'int' [-Wformat] arch/riscv/kernel/crash_core.c:14:57: error: use of undeclared identifier 'VMEMMAP_START' arch/riscv/kernel/crash_core.c:15:55: error: use of undeclared identifier 'VMEMMAP_END'; did you mean 'MEMREMAP_ENC'? arch/riscv/kernel/crash_core.c:17:57: error: use of undeclared identifier 'MODULES_VADDR' arch/riscv/kernel/crash_core.c:18:55: error: use of undeclared identifier 'MODULES_END' arch/riscv/kernel/crash_core.c:8:20: error: use of undeclared identifier 'VA_BITS' clang-16: error: no such file or directory: 'liburandom_read.so' drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn201/irq_service_dcn201.c:40:20: warning: no previous prototype for 'to_dal_irq_source_dcn201' [-Wmissing-prototypes] drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c:353:5: warning: no previous prototype for 'amdgpu_mcbp_scan' [-Wmissing-prototypes] drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c:373:5: warning: no previous prototype for 'amdgpu_mcbp_trigger_preempt' [-Wmissing-prototypes] drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c:451:1: warning: no previous prototype for 'gf100_fifo_nonstall_block' [-Wmissing-prototypes] drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c:451:1: warning: no previous prototype for function 'gf100_fifo_nonstall_block' [-Wmissing-prototypes] drivers/gpu/drm/nouveau/nvkm/engine/fifo/runl.c:34:1: warning: no previous prototype for 'nvkm_engn_cgrp_get' [-Wmissing-prototypes] drivers/gpu/drm/nouveau/nvkm/engine/fifo/runl.c:34:1: warning: no previous prototype for function 'nvkm_engn_cgrp_get' [-Wmissing-prototypes] drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c:210:1: warning: no previous prototype for 'tu102_gr_load' [-Wmissing-prototypes] drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c:210:1: warning: no previous prototype for function 'tu102_gr_load' [-Wmissing-prototypes] drivers/gpu/drm/nouveau/nvkm/nvfw/acr.c:49:1: warning: no previous prototype for 'wpr_generic_header_dump' [-Wmissing-prototypes] drivers/gpu/drm/nouveau/nvkm/nvfw/acr.c:49:1: warning: no previous prototype for function 'wpr_generic_header_dump' [-Wmissing-prototypes] drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c:221:21: warning: variable 'loc' set but not used [-Wunused-but-set-variable] drivers/media/i2c/tc358746.c:816:13: warning: 'm_best' is used uninitialized [-Wuninitialized] drivers/media/i2c/tc358746.c:817:13: warning: 'p_best' is used uninitialized [-Wuninitialized] drivers/media/platform/renesas/rzg2l-cru/rzg2l-csi2.c:445:7: warning: variable 'ret' is used uninitialized whenever 'if' condition is true
[PATCH mm-unstable RFC 26/26] mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Supported by all architectures that support swp PTEs, so let's drop it. Signed-off-by: David Hildenbrand --- arch/alpha/include/asm/pgtable.h | 1 - arch/arc/include/asm/pgtable-bits-arcv2.h| 1 - arch/arm/include/asm/pgtable.h | 1 - arch/arm64/include/asm/pgtable.h | 1 - arch/csky/include/asm/pgtable.h | 1 - arch/hexagon/include/asm/pgtable.h | 1 - arch/ia64/include/asm/pgtable.h | 1 - arch/loongarch/include/asm/pgtable.h | 1 - arch/m68k/include/asm/mcf_pgtable.h | 1 - arch/m68k/include/asm/motorola_pgtable.h | 1 - arch/m68k/include/asm/sun3_pgtable.h | 1 - arch/microblaze/include/asm/pgtable.h| 1 - arch/mips/include/asm/pgtable.h | 1 - arch/nios2/include/asm/pgtable.h | 1 - arch/openrisc/include/asm/pgtable.h | 1 - arch/parisc/include/asm/pgtable.h| 1 - arch/powerpc/include/asm/book3s/32/pgtable.h | 1 - arch/powerpc/include/asm/book3s/64/pgtable.h | 1 - arch/powerpc/include/asm/nohash/pgtable.h| 1 - arch/riscv/include/asm/pgtable.h | 1 - arch/s390/include/asm/pgtable.h | 1 - arch/sh/include/asm/pgtable_32.h | 1 - arch/sparc/include/asm/pgtable_32.h | 1 - arch/sparc/include/asm/pgtable_64.h | 1 - arch/um/include/asm/pgtable.h| 1 - arch/x86/include/asm/pgtable.h | 1 - arch/xtensa/include/asm/pgtable.h| 1 - include/linux/pgtable.h | 29 mm/debug_vm_pgtable.c| 2 -- mm/memory.c | 4 --- mm/rmap.c| 11 31 files changed, 73 deletions(-) diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h index 970abf511b13..ba43cb841d19 100644 --- a/arch/alpha/include/asm/pgtable.h +++ b/arch/alpha/include/asm/pgtable.h @@ -328,7 +328,6 @@ extern inline pte_t mk_swap_pte(unsigned long type, unsigned long offset) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) -#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE static inline int pte_swp_exclusive(pte_t pte) { return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; diff --git a/arch/arc/include/asm/pgtable-bits-arcv2.h b/arch/arc/include/asm/pgtable-bits-arcv2.h index 611f412713b9..6e9f8ca6d6a1 100644 --- a/arch/arc/include/asm/pgtable-bits-arcv2.h +++ b/arch/arc/include/asm/pgtable-bits-arcv2.h @@ -132,7 +132,6 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) -#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE static inline int pte_swp_exclusive(pte_t pte) { return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h index 5e0446a9c667..d6dec218a1fe 100644 --- a/arch/arm/include/asm/pgtable.h +++ b/arch/arm/include/asm/pgtable.h @@ -296,7 +296,6 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(swp)__pte((swp).val) -#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE static inline int pte_swp_exclusive(pte_t pte) { return pte_isset(pte, L_PTE_SWP_EXCLUSIVE); diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 4873c1d6e7d0..58e44aed2000 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -417,7 +417,6 @@ static inline pgprot_t mk_pmd_sect_prot(pgprot_t prot) return __pgprot((pgprot_val(prot) & ~PMD_TABLE_BIT) | PMD_TYPE_SECT); } -#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE static inline pte_t pte_swp_mkexclusive(pte_t pte) { return set_pte_bit(pte, __pgprot(PTE_SWP_EXCLUSIVE)); diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h index 574c97b9ecca..d4042495febc 100644 --- a/arch/csky/include/asm/pgtable.h +++ b/arch/csky/include/asm/pgtable.h @@ -200,7 +200,6 @@ static inline pte_t pte_mkyoung(pte_t pte) return pte; } -#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE static inline int pte_swp_exclusive(pte_t pte) { return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h index 7eb008e477c8..59393613d086 100644 --- a/arch/hexagon/include/asm/pgtable.h +++ b/arch/hexagon/include/asm/pgtable.h @@ -397,7 +397,6 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) (((type & 0x1f) << 1) | \ ((offset & 0x38) << 10) | ((offset & 0x7) << 7)) }) -#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE static inline int pte_swp_exclusive(pte_t pte) { return pte_val(pte)
[PATCH mm-unstable RFC 24/26] x86/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also on 32bit
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE just like we already do on x86-64. After deciphering the PTE layout it becomes clear that there are still unused bits for 2-level and 3-level page tables that we should be able to use. Reusing a bit avoids stealing one bit from the swap offset. While at it, mask the type in __swp_entry(); use some helper definitions to make the macros easier to grasp. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: "H. Peter Anvin" Signed-off-by: David Hildenbrand --- arch/x86/include/asm/pgtable-2level.h | 26 +- arch/x86/include/asm/pgtable-3level.h | 26 +++--- arch/x86/include/asm/pgtable.h| 2 -- 3 files changed, 44 insertions(+), 10 deletions(-) diff --git a/arch/x86/include/asm/pgtable-2level.h b/arch/x86/include/asm/pgtable-2level.h index 60d0f9015317..e9482a11ac52 100644 --- a/arch/x86/include/asm/pgtable-2level.h +++ b/arch/x86/include/asm/pgtable-2level.h @@ -80,21 +80,37 @@ static inline unsigned long pte_bitop(unsigned long value, unsigned int rightshi return ((value >> rightshift) & mask) << leftshift; } -/* Encode and de-code a swap entry */ +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <- offset --> 0 E <- type --> 0 + * + * E is the exclusive marker that is not stored in swap entries. + */ #define SWP_TYPE_BITS 5 +#define _SWP_TYPE_MASK ((1U << SWP_TYPE_BITS) - 1) +#define _SWP_TYPE_SHIFT (_PAGE_BIT_PRESENT + 1) #define SWP_OFFSET_SHIFT (_PAGE_BIT_PROTNONE + 1) -#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS) +#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > 5) -#define __swp_type(x) (((x).val >> (_PAGE_BIT_PRESENT + 1)) \ -& ((1U << SWP_TYPE_BITS) - 1)) +#define __swp_type(x) (((x).val >> _SWP_TYPE_SHIFT) \ +& _SWP_TYPE_MASK) #define __swp_offset(x)((x).val >> SWP_OFFSET_SHIFT) #define __swp_entry(type, offset) ((swp_entry_t) { \ -((type) << (_PAGE_BIT_PRESENT + 1)) \ +(((type) & _SWP_TYPE_MASK) << _SWP_TYPE_SHIFT) \ | ((offset) << SWP_OFFSET_SHIFT) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { (pte).pte_low }) #define __swp_entry_to_pte(x) ((pte_t) { .pte = (x).val }) +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_PSE + /* No inverted PFNs on 2 level page tables */ static inline u64 protnone_mask(u64 val) diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h index 28421a887209..2b87f965dd86 100644 --- a/arch/x86/include/asm/pgtable-3level.h +++ b/arch/x86/include/asm/pgtable-3level.h @@ -248,8 +248,24 @@ static inline pud_t native_pudp_get_and_clear(pud_t *pudp) #define native_pudp_get_and_clear(xp) native_local_pudp_get_and_clear(xp) #endif -/* Encode and de-code a swap entry */ +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * < type -> <-- offset -- + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * > 0 E 0 0 0 0 0 0 0 + * + * E is the exclusive marker that is not stored in swap entries. + */ #define SWP_TYPE_BITS 5 +#define _SWP_TYPE_MASK ((1U << SWP_TYPE_BITS) - 1) #define SWP_OFFSET_FIRST_BIT (_PAGE_BIT_PROTNONE + 1) @@ -257,9 +273,10 @@ static inline pud_t native_pudp_get_and_clear(pud_t *pudp) #define SWP_OFFSET_SHIFT (SWP_OFFSET_FIRST_BIT + SWP_TYPE_BITS) #define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS) -#define __swp_type(x) (((x).val) & ((1UL << SWP_TYPE_BITS) - 1)) +#define __swp_type(x) (((x).val) & _SWP_TYPE_MASK) #define __swp_offset(x)((x).val >> SWP_TYPE_BITS) -#define __swp_entry(type, offset) ((swp_entry_t){(type) | (offset) << SWP_TYPE_BITS}) +#define __swp_entry(type, offset) ((swp_entry_t){((type) & _SWP_TYPE_MASK) \ + | (offset) << SWP_TYPE_BITS}) /* * Normally, __swp_entry() converts from arch-independent swp_entry_t to @@ -287,6 +304,9 @@ static
[PATCH mm-unstable RFC 25/26] xtensa/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 1. This bit should be safe to use for our usecase. Most importantly, we can still distinguish swap PTEs from PAGE_NONE PTEs (see pte_present()) and don't use one of the two reserved attribute masks (1101 and ). Attribute mask 1100 and 1110 now identify swap PTEs. While at it, remove SWP_TYPE_BITS (not really helpful as it's not used in the actual swap macros) and mask the type in __swp_entry(). Cc: Chris Zankel Cc: Max Filippov Signed-off-by: David Hildenbrand --- arch/xtensa/include/asm/pgtable.h | 32 ++- 1 file changed, 27 insertions(+), 5 deletions(-) diff --git a/arch/xtensa/include/asm/pgtable.h b/arch/xtensa/include/asm/pgtable.h index 5b5484d707b2..1025e2dc292b 100644 --- a/arch/xtensa/include/asm/pgtable.h +++ b/arch/xtensa/include/asm/pgtable.h @@ -96,7 +96,7 @@ * +- - - - - - - - - - - - - - - - - - - - -+ * (PAGE_NONE)|PPN| 0 | 00 | ADW | 01 | 11 | 11 | * +-+ - * swap | index | type | 01 | 11 | 00 | + * swap | index | type | 01 | 11 | e0 | * +-+ * * For T1050 hardware and earlier the layout differs for present and (PAGE_NONE) @@ -112,6 +112,7 @@ * RI ring (0=privileged, 1=user, 2 and 3 are unused) * CAcache attribute: 00 bypass, 01 writeback, 10 writethrough * (11 is invalid and used to mark pages that are not present) + * e exclusive marker in swap PTEs * w page is writable (hw) * x page is executable (hw) * index swap offset / PAGE_SIZE (bit 11-31: 21 bits -> 8 GB) @@ -158,6 +159,9 @@ #define _PAGE_DIRTY(1<<7) /* software: page dirty */ #define _PAGE_ACCESSED (1<<8) /* software: page accessed (read) */ +/* We borrow bit 1 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1<<1) + #ifdef CONFIG_MMU #define _PAGE_CHG_MASK(PAGE_MASK | _PAGE_ACCESSED | _PAGE_DIRTY) @@ -343,19 +347,37 @@ ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) } /* - * Encode and decode a swap and file entry. + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). */ -#define SWP_TYPE_BITS 5 -#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS) +#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > 5) #define __swp_type(entry) (((entry).val >> 6) & 0x1f) #define __swp_offset(entry)((entry).val >> 11) #define __swp_entry(type,offs) \ - ((swp_entry_t){((type) << 6) | ((offs) << 11) | \ + ((swp_entry_t){(((type) & 0x1f) << 6) | ((offs) << 11) | \ _PAGE_CA_INVALID | _PAGE_USER}) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + #endif /* !defined (__ASSEMBLY__) */ -- 2.38.1
[PATCH mm-unstable RFC 23/26] um/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 10, which is yet unused for swap PTEs. The pte_mkuptodate() is a bit weird in __pte_to_swp_entry() for a swap PTE ... but it only messes with bit 1 and 2 and there is a comment in set_pte(), so leave these bits alone. While at it, mask the type in __swp_entry(). Cc: Richard Weinberger Cc: Anton Ivanov Cc: Johannes Berg Signed-off-by: David Hildenbrand --- arch/um/include/asm/pgtable.h | 37 +-- 1 file changed, 35 insertions(+), 2 deletions(-) diff --git a/arch/um/include/asm/pgtable.h b/arch/um/include/asm/pgtable.h index 4e3052f2671a..cedc5fd451ce 100644 --- a/arch/um/include/asm/pgtable.h +++ b/arch/um/include/asm/pgtable.h @@ -21,6 +21,9 @@ #define _PAGE_PROTNONE 0x010 /* if the user mapped it with PROT_NONE; pte_present gives true */ +/* We borrow bit 10 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE0x400 + #ifdef CONFIG_3_LEVEL_PGTABLES #include #else @@ -288,16 +291,46 @@ extern pte_t *virt_to_pte(struct mm_struct *mm, unsigned long addr); #define update_mmu_cache(vma,address,ptep) do {} while (0) -/* Encode and de-code a swap entry */ +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <--- offset > E < type -> 0 0 0 1 0 + * + * E is the exclusive marker that is not stored in swap entries. + * _PAGE_NEWPAGE (bit 1) is always set to 1 in set_pte(). + */ #define __swp_type(x) (((x).val >> 5) & 0x1f) #define __swp_offset(x)((x).val >> 11) #define __swp_entry(type, offset) \ - ((swp_entry_t) { ((type) << 5) | ((offset) << 11) }) + ((swp_entry_t) { (((type) & 0x1f) << 5) | ((offset) << 11) }) #define __pte_to_swp_entry(pte) \ ((swp_entry_t) { pte_val(pte_mkuptodate(pte)) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_get_bits(pte, _PAGE_SWP_EXCLUSIVE); +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_set_bits(pte, _PAGE_SWP_EXCLUSIVE); + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_clear_bits(pte, _PAGE_SWP_EXCLUSIVE); + return pte; +} + /* Clear a kernel PTE and flush it from the TLB */ #define kpte_clear_flush(ptep, vaddr) \ do { \ -- 2.38.1
[PATCH mm-unstable RFC 22/26] sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 64bit
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit was effectively unused. While at it, mask the type in __swp_entry(). Cc: "David S. Miller" Signed-off-by: David Hildenbrand --- arch/sparc/include/asm/pgtable_64.h | 38 ++--- 1 file changed, 35 insertions(+), 3 deletions(-) diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h index 3bc9736bddb1..614fdedbb145 100644 --- a/arch/sparc/include/asm/pgtable_64.h +++ b/arch/sparc/include/asm/pgtable_64.h @@ -187,6 +187,9 @@ bool kern_addr_valid(unsigned long addr); #define _PAGE_SZHUGE_4U_PAGE_SZ4MB_4U #define _PAGE_SZHUGE_4V_PAGE_SZ4MB_4V +/* We borrow bit 20 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_AC(0x0010,UL) + #ifndef __ASSEMBLY__ pte_t mk_pte_io(unsigned long, pgprot_t, int, unsigned long); @@ -961,18 +964,47 @@ void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp, pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp); #endif -/* Encode and de-code a swap entry */ -#define __swp_type(entry) (((entry).val >> PAGE_SHIFT) & 0xffUL) +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * <--- offset --- + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * > E <-- type ---> <--- zeroes > + */ +#define __swp_type(entry) (((entry).val >> PAGE_SHIFT) & 0x7fUL) #define __swp_offset(entry)((entry).val >> (PAGE_SHIFT + 8UL)) #define __swp_entry(type, offset) \ ( (swp_entry_t) \ { \ - (((long)(type) << PAGE_SHIFT) | \ + long)(type) & 0x7fUL) << PAGE_SHIFT) | \ ((long)(offset) << (PAGE_SHIFT + 8UL))) \ } ) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_EXCLUSIVE); +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return __pte(pte_val(pte) & ~_PAGE_SWP_EXCLUSIVE); +} + int page_in_phys_avail(unsigned long paddr); /* -- 2.38.1
[PATCH mm-unstable RFC 21/26] sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by reusing the SRMMU_DIRTY bit as that seems to be safe to reuse inside a swap PTE. This avoids having to steal one bit from the swap offset. While at it, relocate the swap PTE layout documentation and use the same style now used for most other archs. Note that the old documentation was wrong: we use 20 bit for the offset and the reserved bits were 8 instead of 7 bits in the ascii art. Cc: "David S. Miller" Signed-off-by: David Hildenbrand --- arch/sparc/include/asm/pgtable_32.h | 27 ++- arch/sparc/include/asm/pgtsrmmu.h | 14 +++--- 2 files changed, 29 insertions(+), 12 deletions(-) diff --git a/arch/sparc/include/asm/pgtable_32.h b/arch/sparc/include/asm/pgtable_32.h index 5acc05b572e6..abf7a2601209 100644 --- a/arch/sparc/include/asm/pgtable_32.h +++ b/arch/sparc/include/asm/pgtable_32.h @@ -323,7 +323,16 @@ void srmmu_mapiorange(unsigned int bus, unsigned long xpa, unsigned long xva, unsigned int len); void srmmu_unmapiorange(unsigned long virt_addr, unsigned int len); -/* Encode and de-code a swap entry */ +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <-- offset ---> < type -> E 0 0 0 0 0 0 + */ static inline unsigned long __swp_type(swp_entry_t entry) { return (entry.val >> SRMMU_SWP_TYPE_SHIFT) & SRMMU_SWP_TYPE_MASK; @@ -344,6 +353,22 @@ static inline swp_entry_t __swp_entry(unsigned long type, unsigned long offset) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & SRMMU_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return __pte(pte_val(pte) | SRMMU_SWP_EXCLUSIVE); +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return __pte(pte_val(pte) & ~SRMMU_SWP_EXCLUSIVE); +} + static inline unsigned long __get_phys (unsigned long addr) { diff --git a/arch/sparc/include/asm/pgtsrmmu.h b/arch/sparc/include/asm/pgtsrmmu.h index 6067925972d9..18e68d43f036 100644 --- a/arch/sparc/include/asm/pgtsrmmu.h +++ b/arch/sparc/include/asm/pgtsrmmu.h @@ -53,21 +53,13 @@ #define SRMMU_CHG_MASK(0xff00 | SRMMU_REF | SRMMU_DIRTY) -/* SRMMU swap entry encoding - * - * We use 5 bits for the type and 19 for the offset. This gives us - * 32 swapfiles of 4GB each. Encoding looks like: - * - * ooot - * fedcba9876543210fedcba9876543210 - * - * The bottom 7 bits are reserved for protection and status bits, especially - * PRESENT. - */ +/* SRMMU swap entry encoding */ #define SRMMU_SWP_TYPE_MASK0x1f #define SRMMU_SWP_TYPE_SHIFT 7 #define SRMMU_SWP_OFF_MASK 0xf #define SRMMU_SWP_OFF_SHIFT(SRMMU_SWP_TYPE_SHIFT + 5) +/* We borrow bit 6 to store the exclusive marker in swap PTEs. */ +#define SRMMU_SWP_EXCLUSIVESRMMU_DIRTY /* Some day I will implement true fine grained access bits for * user pages because the SRMMU gives us the capabilities to -- 2.38.1
[PATCH mm-unstable RFC 20/26] sh/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 6 in the PTE, reducing the swap type in the !CONFIG_X2TLB case to 5 bits. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. Interrestingly, the swap type in the !CONFIG_X2TLB case could currently overlap with the _PAGE_PRESENT bit, because there is a sneaky shift by 1 in __pte_to_swp_entry() and __swp_entry_to_pte(). Bit 0-7 in the architecture specific swap PTE would get shifted to bit 1-8 in the PTE. As generic MM uses 5 bits only, this didn't matter so far. While at it, mask the type in __swp_entry(). Cc: Yoshinori Sato Cc: Rich Felker Signed-off-by: David Hildenbrand --- arch/sh/include/asm/pgtable_32.h | 54 +--- 1 file changed, 42 insertions(+), 12 deletions(-) diff --git a/arch/sh/include/asm/pgtable_32.h b/arch/sh/include/asm/pgtable_32.h index d0240decacca..090940aadbcc 100644 --- a/arch/sh/include/asm/pgtable_32.h +++ b/arch/sh/include/asm/pgtable_32.h @@ -423,40 +423,70 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) #endif /* - * Encode and de-code a swap entry + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). * * Constraints: * _PAGE_PRESENT at bit 8 * _PAGE_PROTNONE at bit 9 * - * For the normal case, we encode the swap type into bits 0:7 and the - * swap offset into bits 10:30. For the 64-bit PTE case, we keep the - * preserved bits in the low 32-bits and use the upper 32 as the swap - * offset (along with a 5-bit type), following the same approach as x86 - * PAE. This keeps the logic quite simple. + * For the normal case, we encode the swap type and offset into the swap PTE + * such that bits 8 and 9 stay zero. For the 64-bit PTE case, we use the + * upper 32 for the swap offset and swap type, following the same approach as + * x86 PAE. This keeps the logic quite simple. * * As is evident by the Alpha code, if we ever get a 64-bit unsigned * long (swp_entry_t) to match up with the 64-bit PTEs, this all becomes * much cleaner.. - * - * NOTE: We should set ZEROs at the position of _PAGE_PRESENT - * and _PAGE_PROTNONE bits */ + #ifdef CONFIG_X2TLB +/* + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * <- offset --> < type -> + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <--- zeroes > E 0 0 0 0 0 0 + */ #define __swp_type(x) ((x).val & 0x1f) #define __swp_offset(x)((x).val >> 5) -#define __swp_entry(type, offset) ((swp_entry_t){ (type) | (offset) << 5}) +#define __swp_entry(type, offset) ((swp_entry_t){ ((type) & 0x1f) | (offset) << 5}) #define __pte_to_swp_entry(pte)((swp_entry_t){ (pte).pte_high }) #define __swp_entry_to_pte(x) ((pte_t){ 0, (x).val }) #else -#define __swp_type(x) ((x).val & 0xff) +/* + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <--- offset > 0 0 0 0 E < type -> 0 + * + * E is the exclusive marker that is not stored in swap entries. + */ +#define __swp_type(x) ((x).val & 0x1f) #define __swp_offset(x)((x).val >> 10) -#define __swp_entry(type, offset) ((swp_entry_t){(type) | (offset) <<10}) +#define __swp_entry(type, offset) ((swp_entry_t){((type) & 0x1f) | (offset) <<10}) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) >> 1 }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val << 1 }) #endif +/* In both cases, we borrow bit 6 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_USER + +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte.pte_low & _PAGE_SWP_EXCLUSIVE; +} + +PTE_BIT_FUNC(low, swp_mkexclusive, |= _PAGE_SWP_EXCLUSIVE); +PTE_BIT_FUNC(low, swp_clear_exclusive, &= ~_PAGE_SWP_EXCLUSIVE); + #endif /* __ASSEMBLY__ */ #endif /* __ASM_SH_PGTABLE_32_H */ -- 2.38.1
[PATCH mm-unstable RFC 19/26] riscv/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the offset. This reduces the maximum swap space per file: on 32bit to 16 GiB (was 32 GiB). Note that this bit does not conflict with swap PMDs and could also be used in swap PMD context later. While at it, mask the type in __swp_entry(). Cc: Paul Walmsley Cc: Palmer Dabbelt Cc: Albert Ou Signed-off-by: David Hildenbrand --- arch/riscv/include/asm/pgtable-bits.h | 3 +++ arch/riscv/include/asm/pgtable.h | 29 ++- 2 files changed, 27 insertions(+), 5 deletions(-) diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h index b9e13a8fe2b7..f896708e8331 100644 --- a/arch/riscv/include/asm/pgtable-bits.h +++ b/arch/riscv/include/asm/pgtable-bits.h @@ -27,6 +27,9 @@ */ #define _PAGE_PROT_NONE _PAGE_GLOBAL +/* Used for swap PTEs only. */ +#define _PAGE_SWP_EXCLUSIVE _PAGE_ACCESSED + #define _PAGE_PFN_SHIFT 10 /* diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 7ee3ac315c7c..9730f9fed197 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -721,16 +721,18 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma, #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ /* - * Encode and decode a swap entry + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). * * Format of swap PTE: * bit0: _PAGE_PRESENT (zero) * bit 1 to 3: _PAGE_LEAF (zero) * bit5: _PAGE_PROT_NONE (zero) - * bits 6 to 10: swap type - * bits 10 to XLEN-1: swap offset + * bit6: exclusive marker + * bits 7 to 11: swap type + * bits 11 to XLEN-1: swap offset */ -#define __SWP_TYPE_SHIFT 6 +#define __SWP_TYPE_SHIFT 7 #define __SWP_TYPE_BITS5 #define __SWP_TYPE_MASK((1UL << __SWP_TYPE_BITS) - 1) #define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT) @@ -741,11 +743,28 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma, #define __swp_type(x) (((x).val >> __SWP_TYPE_SHIFT) & __SWP_TYPE_MASK) #define __swp_offset(x)((x).val >> __SWP_OFFSET_SHIFT) #define __swp_entry(type, offset) ((swp_entry_t) \ - { ((type) << __SWP_TYPE_SHIFT) | ((offset) << __SWP_OFFSET_SHIFT) }) + { (((type) & __SWP_TYPE_MASK) << __SWP_TYPE_SHIFT) | \ + ((offset) << __SWP_OFFSET_SHIFT) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_EXCLUSIVE); +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return __pte(pte_val(pte) & ~_PAGE_SWP_EXCLUSIVE); +} + #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION #define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val(pmd) }) #define __swp_entry_to_pmd(swp) __pmd((swp).val) -- 2.38.1
[PATCH mm-unstable RFC 18/26] powerpc/nohash/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit and 64bit. On 64bit, let's use MSB 56 (LSB 7), located right next to the page type. On 32bit, let's use LSB 2 to avoid stealing one bit from the swap offset. There seems to be no real reason why these bits cannot be used for swap PTEs. The important part is that _PAGE_PRESENT and _PAGE_HASHPTE remain 0. While at it, mask the type in __swp_entry() and remove _PAGE_BIT_SWAP_TYPE from pte-e500.h: while it was used in 64bit code it was ignored in 32bit code. Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Signed-off-by: David Hildenbrand --- arch/powerpc/include/asm/nohash/32/pgtable.h | 22 + arch/powerpc/include/asm/nohash/32/pte-40x.h | 6 ++--- arch/powerpc/include/asm/nohash/32/pte-44x.h | 18 -- arch/powerpc/include/asm/nohash/32/pte-85xx.h | 4 ++-- arch/powerpc/include/asm/nohash/64/pgtable.h | 24 --- arch/powerpc/include/asm/nohash/pgtable.h | 16 + arch/powerpc/include/asm/nohash/pte-e500.h| 1 - 7 files changed, 63 insertions(+), 28 deletions(-) diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h index 0d40b33184eb..1bb3698e6628 100644 --- a/arch/powerpc/include/asm/nohash/32/pgtable.h +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h @@ -354,18 +354,30 @@ static inline int pte_young(pte_t pte) #endif #define pmd_page(pmd) pfn_to_page(pmd_pfn(pmd)) + /* - * Encode and decode a swap entry. - * Note that the bits we use in a PTE for representing a swap entry - * must not include the _PAGE_PRESENT bit. - * -- paulus + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs (32bit PTEs): + * + * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 + * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + * <-- offset ---> < type -> E 0 0 + * + * E is the exclusive marker that is not stored in swap entries. + * + * For 64bit PTEs, the offset is extended by 32bit. */ #define __swp_type(entry) ((entry).val & 0x1f) #define __swp_offset(entry)((entry).val >> 5) -#define __swp_entry(type, offset) ((swp_entry_t) { (type) | ((offset) << 5) }) +#define __swp_entry(type, offset) ((swp_entry_t) { ((type) & 0x1f) | ((offset) << 5) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) >> 3 }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val << 3 }) +/* We borrow LSB 2 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE0x04 + #endif /* !__ASSEMBLY__ */ #endif /* __ASM_POWERPC_NOHASH_32_PGTABLE_H */ diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h b/arch/powerpc/include/asm/nohash/32/pte-40x.h index 2d3153cfc0d7..6fe46e754556 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-40x.h +++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h @@ -27,9 +27,9 @@ * of the 16 available. Bit 24-26 of the TLB are cleared in the TLB * miss handler. Bit 27 is PAGE_USER, thus selecting the correct * zone. - * - PRESENT *must* be in the bottom two bits because swap cache - * entries use the top 30 bits. Because 40x doesn't support SMP - * anyway, M is irrelevant so we borrow it for PAGE_PRESENT. Bit 30 + * - PRESENT *must* be in the bottom two bits because swap PTEs + * use the top 30 bits. Because 40x doesn't support SMP anyway, M is + * irrelevant so we borrow it for PAGE_PRESENT. Bit 30 * is cleared in the TLB miss handler before the TLB entry is loaded. * - All other bits of the PTE are loaded into TLBLO without * modification, leaving us only the bits 20, 21, 24, 25, 26, 30 for diff --git a/arch/powerpc/include/asm/nohash/32/pte-44x.h b/arch/powerpc/include/asm/nohash/32/pte-44x.h index 78bc304f750e..b7ed13cee137 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-44x.h +++ b/arch/powerpc/include/asm/nohash/32/pte-44x.h @@ -56,20 +56,10 @@ * above bits. Note that the bit values are CPU specific, not architecture * specific. * - * The kernel PTE entry holds an arch-dependent swp_entry structure under - * certain situations. In other words, in such situations some portion of - * the PTE bits are used as a swp_entry. In the PPC implementation, the - * 3-24th LSB are shared with swp_entry, however the 0-2nd three LSB still - * hold protection values. That means the three protection bits are - * reserved for both PTE and SWAP entry at the most significant three - * LSBs. - * - * There are three protection bits available for SWAP entry: - * _PAGE_PRESENT - * _PAGE_HASHPTE (if HW has) - * - * So those three bits have to be inside of 0-2nd LSB of PTE. - * + * The kernel PTE entry can be an ordinary PTE mapping a page or a special swap + * PTE. In case of a swap PTE, LSB 2-24 are used to store information
[PATCH mm-unstable RFC 17/26] powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s
We already implemented support for 64bit book3s in commit bff9beaa2e80 ("powerpc/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE for book3s") Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also in 32bit by reusing yet unused LSB 2 / MSB 29. There seems to be no real reason why that bit cannot be used, and reusing it avoids having to steal one bit from the swap offset. While at it, mask the type in __swp_entry(). Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Signed-off-by: David Hildenbrand --- arch/powerpc/include/asm/book3s/32/pgtable.h | 38 +--- 1 file changed, 33 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index 75823f39e042..8107835b38c1 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -42,6 +42,9 @@ #define _PMD_PRESENT_MASK (PAGE_MASK) #define _PMD_BAD (~PAGE_MASK) +/* We borrow the _PAGE_USER bit to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_USER + /* And here we include common definitions */ #define _PAGE_KERNEL_RO0 @@ -363,17 +366,42 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, #define pmd_page(pmd) pfn_to_page(pmd_pfn(pmd)) /* - * Encode and decode a swap entry. - * Note that the bits we use in a PTE for representing a swap entry - * must not include the _PAGE_PRESENT bit or the _PAGE_HASHPTE bit (if used). - * -- paulus + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs (32bit PTEs): + * + * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 + * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + * E H P <- type --> <- offset --> + * + * E is the exclusive marker that is not stored in swap entries. + * _PAGE_PRESENT (P) and __PAGE_HASHPTE (H) must be 0. + * + * For 64bit PTEs, the offset is extended by 32bit. */ #define __swp_type(entry) ((entry).val & 0x1f) #define __swp_offset(entry)((entry).val >> 5) -#define __swp_entry(type, offset) ((swp_entry_t) { (type) | ((offset) << 5) }) +#define __swp_entry(type, offset) ((swp_entry_t) { ((type) & 0x1f) | ((offset) << 5) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) >> 3 }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val << 3 }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_EXCLUSIVE); +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return __pte(pte_val(pte) & ~_PAGE_SWP_EXCLUSIVE); +} + /* Generic accessors to PTE bits */ static inline int pte_write(pte_t pte) { return !!(pte_val(pte) & _PAGE_RW);} static inline int pte_read(pte_t pte) { return 1; } -- 2.38.1
[PATCH mm-unstable RFC 16/26] parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using the yet-unused _PAGE_ACCESSED location in the swap PTE. Looking at pte_present() and pte_none() checks, there seems to be no actual reason why we cannot use it: we only have to make sure we're not using _PAGE_PRESENT. Reusing this bit avoids having to steal one bit from the swap offset. Cc: "James E.J. Bottomley" Cc: Helge Deller Signed-off-by: David Hildenbrand --- arch/parisc/include/asm/pgtable.h | 41 --- 1 file changed, 38 insertions(+), 3 deletions(-) diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h index bd09a44cfb2d..75115c8bf888 100644 --- a/arch/parisc/include/asm/pgtable.h +++ b/arch/parisc/include/asm/pgtable.h @@ -218,6 +218,9 @@ extern void __update_cache(pte_t pte); #define _PAGE_KERNEL_RWX (_PAGE_KERNEL_EXEC | _PAGE_WRITE) #define _PAGE_KERNEL (_PAGE_KERNEL_RO | _PAGE_WRITE) +/* We borrow bit 23 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_ACCESSED + /* The pgd/pmd contains a ptr (in phys addr space); since all pgds/pmds * are page-aligned, we don't care about the PAGE_OFFSET bits, except * for a few meta-information bits, so we shift the address to be @@ -394,17 +397,49 @@ extern void paging_init (void); #define update_mmu_cache(vms,addr,ptep) __update_cache(*ptep) -/* Encode and de-code a swap entry */ - +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs (32bit): + * + * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 + * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + * < offset -> P E < type -> + * + * E is the exclusive marker that is not stored in swap entries. + * _PAGE_PRESENT (P) must be 0. + * + * For the 64bit version, the offset is extended by 32bit. + */ #define __swp_type(x) ((x).val & 0x1f) #define __swp_offset(x) ( (((x).val >> 6) & 0x7) | \ (((x).val >> 8) & ~0x7) ) -#define __swp_entry(type, offset) ((swp_entry_t) { (type) | \ +#define __swp_entry(type, offset) ((swp_entry_t) { \ + ((type) & 0x1f) | \ ((offset & 0x7) << 6) | \ ((offset & ~0x7) << 8) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep) { pte_t pte; -- 2.38.1
[PATCH mm-unstable RFC 15/26] openrisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. While at it, mask the type in __swp_entry(). Cc: Stefan Kristiansson Cc: Stafford Horne Signed-off-by: David Hildenbrand --- arch/openrisc/include/asm/pgtable.h | 41 + 1 file changed, 36 insertions(+), 5 deletions(-) diff --git a/arch/openrisc/include/asm/pgtable.h b/arch/openrisc/include/asm/pgtable.h index 6477c17b3062..903b32d662ab 100644 --- a/arch/openrisc/include/asm/pgtable.h +++ b/arch/openrisc/include/asm/pgtable.h @@ -154,6 +154,9 @@ extern void paging_init(void); #define _KERNPG_TABLE \ (_PAGE_BASE | _PAGE_SRE | _PAGE_SWE | _PAGE_ACCESSED | _PAGE_DIRTY) +/* We borrow bit 11 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_U_SHARED + #define PAGE_NONE __pgprot(_PAGE_ALL) #define PAGE_READONLY __pgprot(_PAGE_ALL | _PAGE_URE | _PAGE_SRE) #define PAGE_READONLY_X __pgprot(_PAGE_ALL | _PAGE_URE | _PAGE_SRE | _PAGE_EXEC) @@ -385,16 +388,44 @@ static inline void update_mmu_cache(struct vm_area_struct *vma, /* __PHX__ FIXME, SWAP, this probably doesn't work */ -/* Encode and de-code a swap entry (must be !pte_none(e) && !pte_present(e)) */ -/* Since the PAGE_PRESENT bit is bit 4, we can use the bits above */ - -#define __swp_type(x) (((x).val >> 5) & 0x7f) +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <-- offset ---> E <- type --> 0 0 0 0 0 + * + * E is the exclusive marker that is not stored in swap entries. + * The zero'ed bits include _PAGE_PRESENT. + */ +#define __swp_type(x) (((x).val >> 5) & 0x3f) #define __swp_offset(x)((x).val >> 12) #define __swp_entry(type, offset) \ - ((swp_entry_t) { ((type) << 5) | ((offset) << 12) }) + ((swp_entry_t) { (((type) & 0x3f) << 5) | ((offset) << 12) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + typedef pte_t *pte_addr_t; #endif /* __ASSEMBLY__ */ -- 2.38.1
[PATCH mm-unstable RFC 14/26] nios2/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using the yet-unused bit 31. Cc: Thomas Bogendoerfer Signed-off-by: David Hildenbrand --- arch/nios2/include/asm/pgtable-bits.h | 3 +++ arch/nios2/include/asm/pgtable.h | 22 +- 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/arch/nios2/include/asm/pgtable-bits.h b/arch/nios2/include/asm/pgtable-bits.h index bfddff383e89..724f9b08b1d1 100644 --- a/arch/nios2/include/asm/pgtable-bits.h +++ b/arch/nios2/include/asm/pgtable-bits.h @@ -31,4 +31,7 @@ #define _PAGE_ACCESSED (1<<26) /* page referenced */ #define _PAGE_DIRTY(1<<27) /* dirty page */ +/* We borrow bit 31 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1<<31) + #endif /* _ASM_NIOS2_PGTABLE_BITS_H */ diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h index d1e5c9eb4643..05999da01731 100644 --- a/arch/nios2/include/asm/pgtable.h +++ b/arch/nios2/include/asm/pgtable.h @@ -239,7 +239,9 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) * * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 - * 0 < type -> 0 0 0 0 0 0 <-- offset ---> + * E < type -> 0 0 0 0 0 0 <-- offset ---> + * + * E is the exclusive marker that is not stored in swap entries. * * Note that the offset field is always non-zero if the swap type is 0, thus * !pte_none() is always true. @@ -251,6 +253,24 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) #define __swp_entry_to_pte(swp)((pte_t) { (swp).val }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + extern void __init paging_init(void); extern void __init mmu_init(void); -- 2.38.1
[PATCH mm-unstable RFC 13/26] nios2/mm: refactor swap PTE layout
nios2 disables swap for a good reason: it doesn't even provide sufficient type bits as required by core MM. However, swap entries are nowadays also used for other purposes (migration entries, PTE markers, HWPoison, ...), and accidential use could be problematic. Let's properly use 5 bits for the swap type and document the layout. Bits 26--31 should get ignored by hardware completely, so they can be used. Cc: Dinh Nguyen Signed-off-by: David Hildenbrand --- arch/nios2/include/asm/pgtable.h | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h index ab793bc517f5..d1e5c9eb4643 100644 --- a/arch/nios2/include/asm/pgtable.h +++ b/arch/nios2/include/asm/pgtable.h @@ -232,19 +232,21 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) __FILE__, __LINE__, pgd_val(e)) /* - * Encode and decode a swap entry (must be !pte_none(pte) && !pte_present(pte): + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). * - * 31 30 29 28 27 26 25 24 23 22 21 20 19 18 ... 1 0 - * 0 0 0 0 type. 0 0 0 0 0 0 offset. + * Format of swap PTEs: * - * This gives us up to 2**2 = 4 swap files and 2**20 * 4K = 4G per swap file. + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * 0 < type -> 0 0 0 0 0 0 <-- offset ---> * - * Note that the offset field is always non-zero, thus !pte_none(pte) is always - * true. + * Note that the offset field is always non-zero if the swap type is 0, thus + * !pte_none() is always true. */ -#define __swp_type(swp)(((swp).val >> 26) & 0x3) +#define __swp_type(swp)(((swp).val >> 26) & 0x1f) #define __swp_offset(swp) ((swp).val & 0xf) -#define __swp_entry(type, off) ((swp_entry_t) { (((type) & 0x3) << 26) \ +#define __swp_entry(type, off) ((swp_entry_t) { (((type) & 0x1f) << 26) \ | ((off) & 0xf) }) #define __swp_entry_to_pte(swp)((pte_t) { (swp).val }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) -- 2.38.1
[PATCH mm-unstable RFC 12/26] mips/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE. On 64bit, steal one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. On 32bit we're able to locate unused bits. As the PTE layout for 32 bit is very confusing, document it a bit better. While at it, mask the type in __swp_entry()/mk_swap_pte(). Cc: Thomas Bogendoerfer Signed-off-by: David Hildenbrand --- arch/mips/include/asm/pgtable-32.h | 86 ++ arch/mips/include/asm/pgtable-64.h | 23 ++-- arch/mips/include/asm/pgtable.h| 36 + 3 files changed, 130 insertions(+), 15 deletions(-) diff --git a/arch/mips/include/asm/pgtable-32.h b/arch/mips/include/asm/pgtable-32.h index b40a0e69fccc..c2a3b899480c 100644 --- a/arch/mips/include/asm/pgtable-32.h +++ b/arch/mips/include/asm/pgtable-32.h @@ -191,42 +191,103 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t prot) #define pte_page(x)pfn_to_page(pte_pfn(x)) +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + */ #if defined(CONFIG_CPU_R3K_TLB) -/* Swap entries must have VALID bit cleared. */ +/* + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <--- offset > < type -> V G E 0 0 0 0 0 0 P + * + * E is the exclusive marker that is not stored in swap entries. + * _PAGE_PRESENT (P), _PAGE_VALID (V) and_PAGE_GLOBAL (G) have to remain + * unused. + */ #define __swp_type(x) (((x).val >> 10) & 0x1f) #define __swp_offset(x)((x).val >> 15) -#define __swp_entry(type,offset) ((swp_entry_t) { ((type) << 10) | ((offset) << 15) }) +#define __swp_entry(type,offset) ((swp_entry_t) { (((type) & 0x1f) << 10) | ((offset) << 15) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1 << 7) + #else #if defined(CONFIG_XPA) -/* Swap entries must have VALID and GLOBAL bits cleared. */ +/* + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * 0 0 0 0 0 0 E P <-- zeroes ---> + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <- offset --> < type -> V G 0 0 + * + * E is the exclusive marker that is not stored in swap entries. + * _PAGE_PRESENT (P), _PAGE_VALID (V) and_PAGE_GLOBAL (G) have to remain + * unused. + */ #define __swp_type(x) (((x).val >> 4) & 0x1f) #define __swp_offset(x) ((x).val >> 9) -#define __swp_entry(type,offset) ((swp_entry_t) { ((type) << 4) | ((offset) << 9) }) +#define __swp_entry(type,offset) ((swp_entry_t) { (((type) & 0x1f) << 4) | ((offset) << 9) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { (pte).pte_high }) #define __swp_entry_to_pte(x) ((pte_t) { 0, (x).val }) +/* + * We borrow bit 57 (bit 25 in the low PTE) to store the exclusive marker in + * swap PTEs. + */ +#define _PAGE_SWP_EXCLUSIVE(1 << 25) + #elif defined(CONFIG_PHYS_ADDR_T_64BIT) && defined(CONFIG_CPU_MIPS32) -/* Swap entries must have VALID and GLOBAL bits cleared. */ +/* + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * <-- zeroes ---> E P 0 0 0 0 0 0 + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <--- offset > < type -> V G + * + * E is the exclusive marker that is not stored in swap entries. + * _PAGE_PRESENT (P), _PAGE_VALID (V) and_PAGE_GLOBAL (G) have to remain + * unused. + */ #define __swp_type(x) (((x).val >> 2) & 0x1f) #define __swp_offset(x) ((x).val >> 7) -#define __swp_entry(type, offset) ((swp_entry_t) { ((type) << 2) | ((offset) << 7) }) +#define __swp_entry(type, offset) ((swp_entry_t) { (((type) & 0x1f) << 2) | ((offset) << 7) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { (pte).pte_high }) #define __swp_entry_to_pte(x) ((pte_t) { 0, (x).val }) +/* + * We borrow bit 39 (bit 7 in the low PTE) to store the exclusive marker in swap + * PTEs. + */ +#define _PAGE_SWP_EXCLUSIVE(1 << 7) + #else /* - * Constraints: - * _PAGE_PRESENT at bit 0 - * _PAGE_MODIFIED at bit 4 - *
[PATCH mm-unstable RFC 11/26] microblaze/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. The shift by 2 when converting between PTE and arch-specific swap entry makes the swap PTE layout a little bit harder to decipher. While at it, drop the comment from paulus---copy-and-paste leftover from powerpc where we actually have _PAGE_HASHPTE---and mask the type in __swp_entry_to_pte() as well. Cc: Michal Simek Signed-off-by: David Hildenbrand --- arch/m68k/include/asm/mcf_pgtable.h | 4 +-- arch/microblaze/include/asm/pgtable.h | 45 +-- 2 files changed, 37 insertions(+), 12 deletions(-) diff --git a/arch/m68k/include/asm/mcf_pgtable.h b/arch/m68k/include/asm/mcf_pgtable.h index 3f8f4d0e66dd..e573d7b649f7 100644 --- a/arch/m68k/include/asm/mcf_pgtable.h +++ b/arch/m68k/include/asm/mcf_pgtable.h @@ -46,8 +46,8 @@ #define _CACHEMASK040 (~0x060) #define _PAGE_GLOBAL0400x400 /* 68040 global bit, used for kva descs */ -/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ -#define _PAGE_SWP_EXCLUSIVE0x080 +/* We borrow bit 24 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVECF_PAGE_NOCACHE /* * Externally used page protection values. diff --git a/arch/microblaze/include/asm/pgtable.h b/arch/microblaze/include/asm/pgtable.h index 42f5988e998b..7e3de54bf426 100644 --- a/arch/microblaze/include/asm/pgtable.h +++ b/arch/microblaze/include/asm/pgtable.h @@ -131,10 +131,10 @@ extern pte_t *va_to_pte(unsigned long address); * of the 16 available. Bit 24-26 of the TLB are cleared in the TLB * miss handler. Bit 27 is PAGE_USER, thus selecting the correct * zone. - * - PRESENT *must* be in the bottom two bits because swap cache - * entries use the top 30 bits. Because 4xx doesn't support SMP - * anyway, M is irrelevant so we borrow it for PAGE_PRESENT. Bit 30 - * is cleared in the TLB miss handler before the TLB entry is loaded. + * - PRESENT *must* be in the bottom two bits because swap PTEs use the top + * 30 bits. Because 4xx doesn't support SMP anyway, M is irrelevant so we + * borrow it for PAGE_PRESENT. Bit 30 is cleared in the TLB miss handler + * before the TLB entry is loaded. * - All other bits of the PTE are loaded into TLBLO without * * modification, leaving us only the bits 20, 21, 24, 25, 26, 30 for * software PTE bits. We actually use bits 21, 24, 25, and @@ -155,6 +155,9 @@ extern pte_t *va_to_pte(unsigned long address); #define _PAGE_ACCESSED 0x400 /* software: R: page referenced */ #define _PMD_PRESENT PAGE_MASK +/* We borrow bit 24 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_DIRTY + /* * Some bits are unused... */ @@ -393,18 +396,40 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) extern pgd_t swapper_pg_dir[PTRS_PER_PGD]; /* - * Encode and decode a swap entry. - * Note that the bits we use in a PTE for representing a swap entry - * must not include the _PAGE_PRESENT bit, or the _PAGE_HASHPTE bit - * (if used). -- paulus + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 + * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + * <-- offset ---> E < type -> 0 0 + * + * E is the exclusive marker that is not stored in swap entries. */ -#define __swp_type(entry) ((entry).val & 0x3f) +#define __swp_type(entry) ((entry).val & 0x1f) #define __swp_offset(entry)((entry).val >> 6) #define __swp_entry(type, offset) \ - ((swp_entry_t) { (type) | ((offset) << 6) }) + ((swp_entry_t) { ((type) & 0x1f) | ((offset) << 6) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) >> 2 }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val << 2 }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + extern unsigned long iopa(unsigned long addr); /* Values for nocacheflag and cmode */ -- 2.38.1
[PATCH mm-unstable RFC 10/26] m68k/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. While at it, make sure for sun3 that the valid bit never gets set by properly masking it off and mask the type in __swp_entry(). Cc: Geert Uytterhoeven Cc: Greg Ungerer Signed-off-by: David Hildenbrand --- arch/m68k/include/asm/mcf_pgtable.h | 36 -- arch/m68k/include/asm/motorola_pgtable.h | 38 +-- arch/m68k/include/asm/sun3_pgtable.h | 39 ++-- 3 files changed, 104 insertions(+), 9 deletions(-) diff --git a/arch/m68k/include/asm/mcf_pgtable.h b/arch/m68k/include/asm/mcf_pgtable.h index b619b22823f8..3f8f4d0e66dd 100644 --- a/arch/m68k/include/asm/mcf_pgtable.h +++ b/arch/m68k/include/asm/mcf_pgtable.h @@ -46,6 +46,9 @@ #define _CACHEMASK040 (~0x060) #define _PAGE_GLOBAL0400x400 /* 68040 global bit, used for kva descs */ +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE0x080 + /* * Externally used page protection values. */ @@ -254,15 +257,42 @@ static inline pte_t pte_mkcache(pte_t pte) extern pgd_t kernel_pg_dir[PTRS_PER_PGD]; /* - * Encode and de-code a swap entry (must be !pte_none(e) && !pte_present(e)) + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <-- offset -> 0 0 0 E <-- type ---> + * + * E is the exclusive marker that is not stored in swap entries. */ -#define __swp_type(x) ((x).val & 0xFF) +#define __swp_type(x) ((x).val & 0x7f) #define __swp_offset(x)((x).val >> 11) -#define __swp_entry(typ, off) ((swp_entry_t) { (typ) | \ +#define __swp_entry(typ, off) ((swp_entry_t) { ((typ) & 0x7f) | \ (off << 11) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) (__pte((x).val)) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + #define pmd_pfn(pmd) (pmd_val(pmd) >> PAGE_SHIFT) #define pmd_page(pmd) (pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT)) diff --git a/arch/m68k/include/asm/motorola_pgtable.h b/arch/m68k/include/asm/motorola_pgtable.h index 7ac3d64c6b33..02896027c781 100644 --- a/arch/m68k/include/asm/motorola_pgtable.h +++ b/arch/m68k/include/asm/motorola_pgtable.h @@ -41,6 +41,9 @@ #define _PAGE_PROTNONE 0x004 +/* We borrow bit 11 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE0x800 + #ifndef __ASSEMBLY__ /* This is the cache mode to be used for pages containing page descriptors for @@ -169,12 +172,41 @@ static inline pte_t pte_mkcache(pte_t pte) #define swapper_pg_dir kernel_pg_dir extern pgd_t kernel_pg_dir[128]; -/* Encode and de-code a swap entry (must be !pte_none(e) && !pte_present(e)) */ -#define __swp_type(x) (((x).val >> 4) & 0xff) +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <- offset > E <-- type ---> 0 0 0 0 + * + * E is the exclusive marker that is not stored in swap entries. + */ +#define __swp_type(x) (((x).val >> 4) & 0x7f) #define __swp_offset(x)((x).val >> 12) -#define __swp_entry(type, offset) ((swp_entry_t) { ((type) << 4) | ((offset) << 12) }) +#define __swp_entry(type, offset) ((swp_entry_t) { (((type) & 0x7f) << 4) | ((offset) << 12) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + #endif /* !__ASSEMBLY__ */ #endif /* _MOTOROLA_PGTABLE_H */ diff --git a/arch/m68k/include/asm/sun3_pgtable.h b/arch/m68k/include/asm/sun3_pgtable.h index
[PATCH mm-unstable RFC 09/26] m68k/mm: remove dummy __swp definitions for nommu
The definitions are not required, let's remove them. Cc: Geert Uytterhoeven Cc: Greg Ungerer Signed-off-by: David Hildenbrand --- arch/m68k/include/asm/pgtable_no.h | 6 -- 1 file changed, 6 deletions(-) diff --git a/arch/m68k/include/asm/pgtable_no.h b/arch/m68k/include/asm/pgtable_no.h index fed58da3a6b6..fc044df52b96 100644 --- a/arch/m68k/include/asm/pgtable_no.h +++ b/arch/m68k/include/asm/pgtable_no.h @@ -31,12 +31,6 @@ extern void paging_init(void); #define swapper_pg_dir ((pgd_t *) 0) -#define __swp_type(x) (0) -#define __swp_offset(x)(0) -#define __swp_entry(typ,off) ((swp_entry_t) { ((typ) | ((off) << 7)) }) -#define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) -#define __swp_entry_to_pte(x) ((pte_t) { (x).val }) - /* * ZERO_PAGE is a global shared page that is always zero: used * for zero-mapped memory areas etc.. -- 2.38.1
[PATCH mm-unstable RFC 08/26] loongarch/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. While at it, also mask the type in mk_swap_pte(). Note that this bit does not conflict with swap PMDs and could also be used in swap PMD context later. Cc: Huacai Chen Cc: WANG Xuerui Signed-off-by: David Hildenbrand --- arch/loongarch/include/asm/pgtable-bits.h | 4 +++ arch/loongarch/include/asm/pgtable.h | 39 --- 2 files changed, 39 insertions(+), 4 deletions(-) diff --git a/arch/loongarch/include/asm/pgtable-bits.h b/arch/loongarch/include/asm/pgtable-bits.h index 3d1e0a69975a..8b98d22a145b 100644 --- a/arch/loongarch/include/asm/pgtable-bits.h +++ b/arch/loongarch/include/asm/pgtable-bits.h @@ -20,6 +20,7 @@ #define_PAGE_SPECIAL_SHIFT 11 #define_PAGE_HGLOBAL_SHIFT 12 /* HGlobal is a PMD bit */ #define_PAGE_PFN_SHIFT 12 +#define_PAGE_SWP_EXCLUSIVE_SHIFT 23 #define_PAGE_PFN_END_SHIFT 48 #define_PAGE_NO_READ_SHIFT 61 #define_PAGE_NO_EXEC_SHIFT 62 @@ -33,6 +34,9 @@ #define _PAGE_PROTNONE (_ULCAST_(1) << _PAGE_PROTNONE_SHIFT) #define _PAGE_SPECIAL (_ULCAST_(1) << _PAGE_SPECIAL_SHIFT) +/* We borrow bit 23 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(_ULCAST_(1) << _PAGE_SWP_EXCLUSIVE_SHIFT) + /* Used by TLB hardware (placed in EntryLo*) */ #define _PAGE_VALID(_ULCAST_(1) << _PAGE_VALID_SHIFT) #define _PAGE_DIRTY(_ULCAST_(1) << _PAGE_DIRTY_SHIFT) diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h index 022ec6be3602..70d037c957a4 100644 --- a/arch/loongarch/include/asm/pgtable.h +++ b/arch/loongarch/include/asm/pgtable.h @@ -249,13 +249,26 @@ extern void pud_init(void *addr); extern void pmd_init(void *addr); /* - * Non-present pages: high 40 bits are offset, next 8 bits type, - * low 16 bits zero. + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * <--- offset --- + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * --> E <--- type ---> <-- zeroes --> + * + * E is the exclusive marker that is not stored in swap entries. + * The zero'ed bits include _PAGE_PRESENT and _PAGE_PROTNONE. */ static inline pte_t mk_swap_pte(unsigned long type, unsigned long offset) -{ pte_t pte; pte_val(pte) = (type << 16) | (offset << 24); return pte; } +{ pte_t pte; pte_val(pte) = ((type & 0x7f) << 16) | (offset << 24); return pte; } -#define __swp_type(x) (((x).val >> 16) & 0xff) +#define __swp_type(x) (((x).val >> 16) & 0x7f) #define __swp_offset(x)((x).val >> 24) #define __swp_entry(type, offset) ((swp_entry_t) { pte_val(mk_swap_pte((type), (offset))) }) #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) }) @@ -263,6 +276,24 @@ static inline pte_t mk_swap_pte(unsigned long type, unsigned long offset) #define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val(pmd) }) #define __swp_entry_to_pmd(x) ((pmd_t) { (x).val | _PAGE_HUGE }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + extern void paging_init(void); #define pte_none(pte) (!(pte_val(pte) & ~_PAGE_GLOBAL)) -- 2.38.1
[PATCH mm-unstable RFC 07/26] ia64/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. While at it, also mask the type in __swp_entry(). Signed-off-by: David Hildenbrand --- arch/ia64/include/asm/pgtable.h | 32 +--- 1 file changed, 29 insertions(+), 3 deletions(-) diff --git a/arch/ia64/include/asm/pgtable.h b/arch/ia64/include/asm/pgtable.h index 01517a5e6778..d666eb229d4b 100644 --- a/arch/ia64/include/asm/pgtable.h +++ b/arch/ia64/include/asm/pgtable.h @@ -58,6 +58,9 @@ #define _PAGE_ED (__IA64_UL(1) << 52)/* exception deferral */ #define _PAGE_PROTNONE (__IA64_UL(1) << 63) +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1 << 7) + #define _PFN_MASK _PAGE_PPN_MASK /* Mask of bits which may be changed by pte_modify(); the odd bits are there for _PAGE_PROTNONE */ #define _PAGE_CHG_MASK (_PAGE_P | _PAGE_PROTNONE | _PAGE_PL_MASK | _PAGE_AR_MASK | _PAGE_ED) @@ -399,6 +402,9 @@ extern pgd_t swapper_pg_dir[PTRS_PER_PGD]; extern void paging_init (void); /* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * * Note: The macros below rely on the fact that MAX_SWAPFILES_SHIFT <= number of * bits in the swap-type field of the swap pte. It would be nice to * enforce that, but we can't easily include here. @@ -406,16 +412,36 @@ extern void paging_init (void); * * Format of swap pte: * bit 0 : present bit (must be zero) - * bits 1- 7: swap-type + * bits 1- 6: swap type + * bit 7 : exclusive marker * bits 8-62: swap offset * bit 63 : _PAGE_PROTNONE bit */ -#define __swp_type(entry) (((entry).val >> 1) & 0x7f) +#define __swp_type(entry) (((entry).val >> 1) & 0x3f) #define __swp_offset(entry)(((entry).val << 1) >> 9) -#define __swp_entry(type,offset) ((swp_entry_t) { ((type) << 1) | ((long) (offset) << 8) }) +#define __swp_entry(type,offset) ((swp_entry_t) { ((type & 0x3f) << 1) | \ +((long) (offset) << 8) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + /* * ZERO_PAGE is a global shared page that is always zero: used * for zero-mapped memory areas etc.. -- 2.38.1
[PATCH mm-unstable RFC 06/26] hexagon/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the offset. This reduces the maximum swap space per file to 16 GiB (was 32 GiB). While at it, mask the type in __swp_entry(). Cc: Brian Cain Signed-off-by: David Hildenbrand --- arch/hexagon/include/asm/pgtable.h | 37 +- 1 file changed, 31 insertions(+), 6 deletions(-) diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h index f7048c18b6f9..7eb008e477c8 100644 --- a/arch/hexagon/include/asm/pgtable.h +++ b/arch/hexagon/include/asm/pgtable.h @@ -61,6 +61,9 @@ extern unsigned long empty_zero_page; * So we'll put up with a bit of inefficiency for now... */ +/* We borrow bit 6 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1<<6) + /* * Top "FOURTH" level (pgd), which for the Hexagon VM is really * only the second from the bottom, pgd and pud both being collapsed. @@ -359,9 +362,12 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) #define ZERO_PAGE(vaddr) (virt_to_page(_zero_page)) /* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * * Swap/file PTE definitions. If _PAGE_PRESENT is zero, the rest of the PTE is * interpreted as swap information. The remaining free bits are interpreted as - * swap type/offset tuple. Rather than have the TLB fill handler test + * listed below. Rather than have the TLB fill handler test * _PAGE_PRESENT, we're going to reserve the permissions bits and set them to * all zeros for swap entries, which speeds up the miss handler at the cost of * 3 bits of offset. That trade-off can be revisited if necessary, but Hexagon @@ -371,9 +377,10 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) * Format of swap PTE: * bit 0: Present (zero) * bits1-5:swap type (arch independent layer uses 5 bits max) - * bits6-9:bits 3:0 of offset + * bit 6: exclusive marker + * bits7-9:bits 2:0 of offset * bits10-12: effectively _PAGE_PROTNONE (all zero) - * bits13-31: bits 22:4 of swap offset + * bits13-31: bits 21:3 of swap offset * * The split offset makes some of the following macros a little gnarly, * but there's plenty of precedent for this sort of thing. @@ -383,11 +390,29 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) #define __swp_type(swp_pte)(((swp_pte).val >> 1) & 0x1f) #define __swp_offset(swp_pte) \ - swp_pte).val >> 6) & 0xf) | (((swp_pte).val >> 9) & 0x70)) + swp_pte).val >> 7) & 0x7) | (((swp_pte).val >> 10) & 0x38)) #define __swp_entry(type, offset) \ ((swp_entry_t) { \ - ((type << 1) | \ -((offset & 0x70) << 9) | ((offset & 0xf) << 6)) }) + (((type & 0x1f) << 1) | \ +((offset & 0x38) << 10) | ((offset & 0x7) << 7)) }) + +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} #endif -- 2.38.1
[PATCH mm-unstable RFC 05/26] csky/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the offset. This reduces the maximum swap space per file to 16 GiB (was 32 GiB). We might actually be able to reuse one of the other software bits (_PAGE_READ / PAGE_WRITE) instead, because we only have to keep pte_present(), pte_none() and HW happy. For now, let's keep it simple because there might be something non-obvious. Cc: Guo Ren Signed-off-by: David Hildenbrand --- arch/csky/abiv1/inc/abi/pgtable-bits.h | 13 + arch/csky/abiv2/inc/abi/pgtable-bits.h | 19 --- arch/csky/include/asm/pgtable.h| 18 ++ 3 files changed, 39 insertions(+), 11 deletions(-) diff --git a/arch/csky/abiv1/inc/abi/pgtable-bits.h b/arch/csky/abiv1/inc/abi/pgtable-bits.h index 752c8b3f9194..ae7a2f76dd42 100644 --- a/arch/csky/abiv1/inc/abi/pgtable-bits.h +++ b/arch/csky/abiv1/inc/abi/pgtable-bits.h @@ -10,6 +10,9 @@ #define _PAGE_ACCESSED (1<<3) #define _PAGE_MODIFIED (1<<4) +/* We borrow bit 9 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1<<9) + /* implemented in hardware */ #define _PAGE_GLOBAL (1<<6) #define _PAGE_VALID(1<<7) @@ -26,7 +29,8 @@ #define _PAGE_PROT_NONE_PAGE_READ /* - * Encode and decode a swap entry + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). * * Format of swap PTE: * bit 0:_PAGE_PRESENT (zero) @@ -35,15 +39,16 @@ * bit 6:_PAGE_GLOBAL (zero) * bit 7:_PAGE_VALID (zero) * bit 8:swap type[4] - * bit 9 - 31:swap offset + * bit 9:exclusive marker + * bit10 - 31:swap offset */ #define __swp_type(x) x).val >> 2) & 0xf) | \ (((x).val >> 4) & 0x10)) -#define __swp_offset(x)((x).val >> 9) +#define __swp_offset(x)((x).val >> 10) #define __swp_entry(type, offset) ((swp_entry_t) { \ ((type & 0xf) << 2) | \ ((type & 0x10) << 4) | \ - ((offset) << 9)}) + ((offset) << 10)}) #define HAVE_ARCH_UNMAPPED_AREA diff --git a/arch/csky/abiv2/inc/abi/pgtable-bits.h b/arch/csky/abiv2/inc/abi/pgtable-bits.h index 7e7f389f546f..526152bd2156 100644 --- a/arch/csky/abiv2/inc/abi/pgtable-bits.h +++ b/arch/csky/abiv2/inc/abi/pgtable-bits.h @@ -10,6 +10,9 @@ #define _PAGE_PRESENT (1<<10) #define _PAGE_MODIFIED (1<<11) +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE(1<<7) + /* implemented in hardware */ #define _PAGE_GLOBAL (1<<0) #define _PAGE_VALID(1<<1) @@ -26,23 +29,25 @@ #define _PAGE_PROT_NONE_PAGE_WRITE /* - * Encode and decode a swap entry + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). * * Format of swap PTE: * bit 0:_PAGE_GLOBAL (zero) * bit 1:_PAGE_VALID (zero) * bit 2 - 6:swap type - * bit 7 - 8:swap offset[0 - 1] + * bit 7:exclusive marker + * bit 8:swap offset[0] * bit 9:_PAGE_WRITE (zero) * bit 10:_PAGE_PRESENT (zero) - * bit11 - 31:swap offset[2 - 22] + * bit11 - 31:swap offset[1 - 21] */ #define __swp_type(x) (((x).val >> 2) & 0x1f) -#define __swp_offset(x)x).val >> 7) & 0x3) | \ - (((x).val >> 9) & 0x7c)) +#define __swp_offset(x)x).val >> 8) & 0x1) | \ + (((x).val >> 10) & 0x3e)) #define __swp_entry(type, offset) ((swp_entry_t) { \ ((type & 0x1f) << 2) | \ - ((offset & 0x3) << 7) | \ - ((offset & 0x7c) << 9)}) + ((offset & 0x1) << 8) | \ + ((offset & 0x3e) << 10)}) #endif /* __ASM_CSKY_PGTABLE_BITS_H */ diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h index 77bc6caff2d2..574c97b9ecca 100644 --- a/arch/csky/include/asm/pgtable.h +++ b/arch/csky/include/asm/pgtable.h @@ -200,6 +200,24 @@ static inline pte_t pte_mkyoung(pte_t pte) return pte; } +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return
[PATCH mm-unstable RFC 04/26] arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the offset. This reduces the maximum swap space per file to 64 GiB (was 128 GiB). While at it drop the PTE_TYPE_FAULT from __swp_entry_to_pte() which is defined to be 0 and is rather confusing because we should be dealing with "Linux PTEs" not "hardware PTEs". Also, properly mask the type in __swp_entry(). Cc: Russell King Signed-off-by: David Hildenbrand --- arch/arm/include/asm/pgtable-2level.h | 3 +++ arch/arm/include/asm/pgtable-3level.h | 3 +++ arch/arm/include/asm/pgtable.h| 35 +-- 3 files changed, 34 insertions(+), 7 deletions(-) diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h index 92abd4cd8ca2..ce543cd9380c 100644 --- a/arch/arm/include/asm/pgtable-2level.h +++ b/arch/arm/include/asm/pgtable-2level.h @@ -126,6 +126,9 @@ #define L_PTE_SHARED (_AT(pteval_t, 1) << 10)/* shared(v6), coherent(xsc3) */ #define L_PTE_NONE (_AT(pteval_t, 1) << 11) +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define L_PTE_SWP_EXCLUSIVEL_PTE_RDONLY + /* * These are the memory types, defined to be compatible with * pre-ARMv6 CPUs cacheable and bufferable bits: n/a,n/a,C,B diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h index eabe72ff7381..106049791500 100644 --- a/arch/arm/include/asm/pgtable-3level.h +++ b/arch/arm/include/asm/pgtable-3level.h @@ -76,6 +76,9 @@ #define L_PTE_NONE (_AT(pteval_t, 1) << 57)/* PROT_NONE */ #define L_PTE_RDONLY (_AT(pteval_t, 1) << 58)/* READ ONLY */ +/* We borrow bit 7 to store the exclusive marker in swap PTEs. */ +#define L_PTE_SWP_EXCLUSIVE(_AT(pteval_t, 1) << 7) + #define L_PMD_SECT_VALID (_AT(pmdval_t, 1) << 0) #define L_PMD_SECT_DIRTY (_AT(pmdval_t, 1) << 55) #define L_PMD_SECT_NONE(_AT(pmdval_t, 1) << 57) diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h index 00954ab1a039..5e0446a9c667 100644 --- a/arch/arm/include/asm/pgtable.h +++ b/arch/arm/include/asm/pgtable.h @@ -269,27 +269,48 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) } /* - * Encode and decode a swap entry. Swap entries are stored in the Linux - * page tables as follows: + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: * * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 - * <--- offset > < type -> 0 0 + * <--- offset --> E < type -> 0 0 + * + * E is the exclusive marker that is not stored in swap entries. * - * This gives us up to 31 swap files and 128GB per swap file. Note that + * This gives us up to 31 swap files and 64GB per swap file. Note that * the offset field is always non-zero. */ #define __SWP_TYPE_SHIFT 2 #define __SWP_TYPE_BITS5 #define __SWP_TYPE_MASK((1 << __SWP_TYPE_BITS) - 1) -#define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT) +#define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT + 1) #define __swp_type(x) (((x).val >> __SWP_TYPE_SHIFT) & __SWP_TYPE_MASK) #define __swp_offset(x)((x).val >> __SWP_OFFSET_SHIFT) -#define __swp_entry(type,offset) ((swp_entry_t) { ((type) << __SWP_TYPE_SHIFT) | ((offset) << __SWP_OFFSET_SHIFT) }) +#define __swp_entry(type,offset) ((swp_entry_t) { (((type) & __SWP_TYPE_BITS) << __SWP_TYPE_SHIFT) | \ +((offset) << __SWP_OFFSET_SHIFT) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) -#define __swp_entry_to_pte(swp)__pte((swp).val | PTE_TYPE_FAULT) +#define __swp_entry_to_pte(swp)__pte((swp).val) + +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_isset(pte, L_PTE_SWP_EXCLUSIVE); +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return set_pte_bit(pte, __pgprot(L_PTE_SWP_EXCLUSIVE)); +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return clear_pte_bit(pte, __pgprot(L_PTE_SWP_EXCLUSIVE)); +} /* * It is an error for the kernel to have more swap files than we can -- 2.38.1
[PATCH mm-unstable RFC 03/26] arc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 5, which is yet unused. The only important parts seems to be to not use _PAGE_PRESENT (bit 9). Cc: Vineet Gupta Signed-off-by: David Hildenbrand --- arch/arc/include/asm/pgtable-bits-arcv2.h | 27 --- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/arch/arc/include/asm/pgtable-bits-arcv2.h b/arch/arc/include/asm/pgtable-bits-arcv2.h index 515e82db519f..611f412713b9 100644 --- a/arch/arc/include/asm/pgtable-bits-arcv2.h +++ b/arch/arc/include/asm/pgtable-bits-arcv2.h @@ -26,6 +26,9 @@ #define _PAGE_GLOBAL (1 << 8) /* ASID agnostic (H) */ #define _PAGE_PRESENT (1 << 9) /* PTE/TLB Valid (H) */ +/* We borrow bit 5 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE_PAGE_DIRTY + #ifdef CONFIG_ARC_MMU_V4 #define _PAGE_HW_SZ(1 << 10) /* Normal/super (H) */ #else @@ -106,9 +109,18 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr, void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *ptep); -/* Encode swap {type,off} tuple into PTE - * We reserve 13 bits for 5-bit @type, keeping bits 12-5 zero, ensuring that - * PAGE_PRESENT is zero in a PTE holding swap "identifier" +/* + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <-- offset -> <--- zero --> E < type -> + * + * E is the exclusive marker that is not stored in swap entries. + * The zero'ed bits include _PAGE_PRESENT. */ #define __swp_entry(type, off) ((swp_entry_t) \ { ((type) & 0x1f) | ((off) << 13) }) @@ -120,6 +132,15 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +PTE_BIT_FUNC(swp_mkexclusive, |= (_PAGE_SWP_EXCLUSIVE)); +PTE_BIT_FUNC(swp_clear_exclusive, &= ~(_PAGE_SWP_EXCLUSIVE)); + #ifdef CONFIG_TRANSPARENT_HUGEPAGE #include #endif -- 2.38.1
[PATCH mm-unstable RFC 02/26] alpha/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. While at it, mask the type in mk_swap_pte() as well. Note that 32bit alpha has 64bit PTEs but only 32bit swap entries. So the lower 32bit are zero in a swap PTE and we could have taken a bit in there as well. Cc: Richard Henderson Cc: Ivan Kokshaysky Cc: Matt Turner Signed-off-by: David Hildenbrand --- arch/alpha/include/asm/pgtable.h | 41 1 file changed, 37 insertions(+), 4 deletions(-) diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h index 9e45f6735d5d..970abf511b13 100644 --- a/arch/alpha/include/asm/pgtable.h +++ b/arch/alpha/include/asm/pgtable.h @@ -74,6 +74,9 @@ struct vm_area_struct; #define _PAGE_DIRTY0x2 #define _PAGE_ACCESSED 0x4 +/* We borrow bit 39 to store the exclusive marker in swap PTEs. */ +#define _PAGE_SWP_EXCLUSIVE0x80UL + /* * NOTE! The "accessed" bit isn't necessarily exact: it can be kept exactly * by software (use the KRE/URE/KWE/UWE bits appropriately), but I'll fake it. @@ -301,18 +304,48 @@ extern inline void update_mmu_cache(struct vm_area_struct * vma, } /* - * Non-present pages: high 24 bits are offset, next 8 bits type, - * low 32 bits zero. + * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that + * are !pte_none() && !pte_present(). + * + * Format of swap PTEs: + * + * 6 6 6 6 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 + * 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 + * <--- offset --> E <--- type --> + * + * 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * <--- zeroes --> + * + * E is the exclusive marker that is not stored in swap entries. */ extern inline pte_t mk_swap_pte(unsigned long type, unsigned long offset) -{ pte_t pte; pte_val(pte) = (type << 32) | (offset << 40); return pte; } +{ pte_t pte; pte_val(pte) = ((type & 0x7f) << 32) | (offset << 40); return pte; } -#define __swp_type(x) (((x).val >> 32) & 0xff) +#define __swp_type(x) (((x).val >> 32) & 0x7f) #define __swp_offset(x)((x).val >> 40) #define __swp_entry(type, off) ((swp_entry_t) { pte_val(mk_swap_pte((type), (off))) }) #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val(pte) }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + pte_val(pte) |= _PAGE_SWP_EXCLUSIVE; + return pte; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE; + return pte; +} + #define pte_ERROR(e) \ printk("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e)) #define pmd_ERROR(e) \ -- 2.38.1
[PATCH mm-unstable RFC 01/26] mm/debug_vm_pgtable: more pte_swp_exclusive() sanity checks
We want to implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures. Let's extend our sanity checks, especially testing that our PTE bit does not affect: * is_swap_pte() -> pte_present() and pte_none() * the swap entry + type * pte_swp_soft_dirty() Especially, the pfn_pte() is dodgy when the swap PTE layout differs heavily from ordinary PTEs. Let's properly construct a swap PTE from swap type+offset. Signed-off-by: David Hildenbrand --- mm/debug_vm_pgtable.c | 23 ++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index c631ade3f1d2..0506622016d9 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -807,13 +807,34 @@ static void __init pmd_swap_soft_dirty_tests(struct pgtable_debug_args *args) { static void __init pte_swap_exclusive_tests(struct pgtable_debug_args *args) { #ifdef __HAVE_ARCH_PTE_SWP_EXCLUSIVE - pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); + unsigned long max_swapfile_size = generic_max_swapfile_size(); + swp_entry_t entry, entry2; + pte_t pte; pr_debug("Validating PTE swap exclusive\n"); + + /* Create a swp entry with all possible bits set */ + entry = swp_entry((1 << MAX_SWAPFILES_SHIFT) - 1, + max_swapfile_size - 1); + + pte = swp_entry_to_pte(entry); + WARN_ON(pte_swp_exclusive(pte)); + WARN_ON(!is_swap_pte(pte)); + entry2 = pte_to_swp_entry(pte); + WARN_ON(memcmp(, , sizeof(entry))); + pte = pte_swp_mkexclusive(pte); WARN_ON(!pte_swp_exclusive(pte)); + WARN_ON(!is_swap_pte(pte)); + WARN_ON(pte_swp_soft_dirty(pte)); + entry2 = pte_to_swp_entry(pte); + WARN_ON(memcmp(, , sizeof(entry))); + pte = pte_swp_clear_exclusive(pte); WARN_ON(pte_swp_exclusive(pte)); + WARN_ON(!is_swap_pte(pte)); + entry2 = pte_to_swp_entry(pte); + WARN_ON(memcmp(, , sizeof(entry))); #endif /* __HAVE_ARCH_PTE_SWP_EXCLUSIVE */ } -- 2.38.1
Re: [PATCH] powerpc/ftrace: fix syscall tracing on PPC64_ELF_ABI_V1
On 2022-12-05 17:50, Michael Ellerman wrote: Michael Jeanson writes: On 2022-12-05 15:11, Michael Jeanson wrote: Michael Jeanson writes: In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table changing from their dot prefixed variant to the non-prefixed ones. Since ftrace prefixes a dot to the syscall names when matching them to build its syscall event list, this resulted in no syscall events being available. Remove the PPC64_ELF_ABI_V1 specific version of arch_syscall_match_sym_name to have the same behavior across all powerpc variants. This doesn't seem to work for me. Event with it applied I still don't see anything in /sys/kernel/debug/tracing/events/syscalls Did we break it in some other way recently? cheers I did some further testing, my config also enabled KALLSYMS_ALL, when I remove it there is indeed no syscall events. Aha, OK that explains it I guess. I was using ppc64_guest_defconfig which has ABI_V1 and FTRACE_SYSCALLS, but does not have KALLSYMS_ALL. So I guess there's some other bug lurking in there. I don't have the setup handy to validate it, but I suspect it is caused by the way scripts/kallsyms.c:symbol_valid() checks whether a symbol entry needs to be integrated into the assembler output when --all-symbols is not specified. It only keeps symbols which addresses are in the text range. On PPC64_ELF_ABI_V1, this means only the dot-prefixed symbols will be kept (those point to the function begin), leaving out the non-dot-prefixed symbols (those point to the function descriptors). So I see two possible solutions there: either we ensure that FTRACE_SYSCALLS selects KALLSYMS_ALL on PPC64_ELF_ABI_V1, or we modify scripts/kallsyms.c:symbol_valid() to also include function descriptor symbols. This would mean accepting symbols pointing into the .opd ELF section. IMHO the second option would be better because it does not increase the kernel image size as much as KALLSYMS_ALL. Thoughts ? Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. https://www.efficios.com
Re: [PATCH 0/6] Make remove() of any bus based driver void returned
On Mon, Dec 05, 2022 at 05:00:52PM +0100, Greg KH wrote: > On Mon, Dec 05, 2022 at 11:36:38PM +0800, Dawei Li wrote: > > For bus-based driver, device removal is implemented as: > > device_remove() => bus->remove() => driver->remove() > > > > Driver core needs no feedback from bus driver about the result of > > remove callback. In which case, commit fc7a6209d571 ("bus: Make > > remove callback return void") forces bus_type::remove be void-returned. > > > > Now we have the situation that both 1st & 2nd part of calling chain > > are void returned, so it does not make much sense for the last one > > (driver->remove) to return non-void to its caller. > > > > So the basic idea behind this patchset is making remove() callback of > > any bus-based driver to be void returned. > > > > This patchset includes changes for drivers below: > > 1. hyperv > > 2. macio > > 3. apr > > 4. xen > > 5. ac87 > > 6. soundbus Hi Greg: Thanks for the reviewing. > > Then that should be 6 different patchsets going to 6 different > subsystems. No need to make this seems like a unified set of patches at > all. Right, will fix all the issues for this patchset and resend them in 6 independent patches. Thanks Dawei > > > Q: Why not platform drivers? > > A: Too many of them.(maybe 4K+) > > That will have to be done eventually, right? > > thanks, > > greg k-h
Re: [PATCH 1/6] hyperv: Make remove callback of hyperv driver void returned
On Tue, Dec 06, 2022 at 11:37:26AM +, Wei Liu wrote: > On Mon, Dec 05, 2022 at 11:36:39PM +0800, Dawei Li wrote: > > Since commit fc7a6209d571 ("bus: Make remove callback return > > void") forces bus_type::remove be void-returned, it doesn't > > make much sense for any bus based driver implementing remove > > callbalk to return non-void to its caller. > > > > This change is for hyperv bus based drivers. > > > > Signed-off-by: Dawei Li > [...] Hi Wei: Thanks for the review. > > -static int netvsc_remove(struct hv_device *dev) > > +static void netvsc_remove(struct hv_device *dev) > > { > > struct net_device_context *ndev_ctx; > > struct net_device *vf_netdev, *net; > > @@ -2603,7 +2603,6 @@ static int netvsc_remove(struct hv_device *dev) > > net = hv_get_drvdata(dev); > > if (net == NULL) { > > dev_err(>device, "No net device to remove\n"); > > - return 0; > > This is wrong. You are introducing a NULL pointer dereference. Nice catch, will fix it. > > > } > > > > ndev_ctx = netdev_priv(net); > > @@ -2637,7 +2636,6 @@ static int netvsc_remove(struct hv_device *dev) > > > > free_percpu(ndev_ctx->vf_stats); > > free_netdev(net); > > - return 0; > > } > > > > static int netvsc_suspend(struct hv_device *dev) > > diff --git a/drivers/pci/controller/pci-hyperv.c > > b/drivers/pci/controller/pci-hyperv.c > > index ba64284eaf9f..3a09de70d6ea 100644 > > --- a/drivers/pci/controller/pci-hyperv.c > > +++ b/drivers/pci/controller/pci-hyperv.c > > @@ -3756,7 +3756,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, > > bool keep_devs) > > * > > * Return: 0 on success, -errno on failure > > */ > > This comment is no longer needed in the new world. > > But, are you sure you're modifying the correct piece of code? > > hv_pci_remove is not a hook in the base bus type. It is used in struct > hv_driver. > > The same comment applies to all other modifications. Sorry about the confusion. In short, the point of this commit is making remove() of _driver_ void returned. For bus-based driver, device removal is implemented as: 1 device_remove() => 2 bus->remove() => 3 driver->remove() 1 is void. For 2, commit fc7a6209d571 ("bus: Make remove callback return void") forces bus_type::remove be void-returned, which applies for hv_bus(vmbus) too. So it doesn't make sense for 3(driver->remove) to return non-void, in this case it's hv_driver. Thanks, Dawei > > Thanks, > Wei.
Re: [PATCH 1/6] hyperv: Make remove callback of hyperv driver void returned
On Mon, Dec 05, 2022 at 11:36:39PM +0800, Dawei Li wrote: > Since commit fc7a6209d571 ("bus: Make remove callback return > void") forces bus_type::remove be void-returned, it doesn't > make much sense for any bus based driver implementing remove > callbalk to return non-void to its caller. > > This change is for hyperv bus based drivers. > > Signed-off-by: Dawei Li [...] > -static int netvsc_remove(struct hv_device *dev) > +static void netvsc_remove(struct hv_device *dev) > { > struct net_device_context *ndev_ctx; > struct net_device *vf_netdev, *net; > @@ -2603,7 +2603,6 @@ static int netvsc_remove(struct hv_device *dev) > net = hv_get_drvdata(dev); > if (net == NULL) { > dev_err(>device, "No net device to remove\n"); > - return 0; This is wrong. You are introducing a NULL pointer dereference. > } > > ndev_ctx = netdev_priv(net); > @@ -2637,7 +2636,6 @@ static int netvsc_remove(struct hv_device *dev) > > free_percpu(ndev_ctx->vf_stats); > free_netdev(net); > - return 0; > } > > static int netvsc_suspend(struct hv_device *dev) > diff --git a/drivers/pci/controller/pci-hyperv.c > b/drivers/pci/controller/pci-hyperv.c > index ba64284eaf9f..3a09de70d6ea 100644 > --- a/drivers/pci/controller/pci-hyperv.c > +++ b/drivers/pci/controller/pci-hyperv.c > @@ -3756,7 +3756,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool > keep_devs) > * > * Return: 0 on success, -errno on failure > */ This comment is no longer needed in the new world. But, are you sure you're modifying the correct piece of code? hv_pci_remove is not a hook in the base bus type. It is used in struct hv_driver. The same comment applies to all other modifications. Thanks, Wei.
Re: linux-next: build warnings after merge of the powerpc-objtool tree
Sathvika Vasireddy wrote: On 29/11/22 20:58, Christophe Leroy wrote: Le 29/11/2022 à 16:13, Sathvika Vasireddy a écrit : Hi all, On 25/11/22 09:00, Stephen Rothwell wrote: Hi all, After merging the powerpc-objtool tree, today's linux-next build (powerpc pseries_le_defconfig) produced these warnings: arch/powerpc/kernel/head_64.o: warning: objtool: end_first_256B(): can't find starting instruction arch/powerpc/kernel/optprobes_head.o: warning: objtool: optprobe_template_end(): can't find starting instruction I have no idea what started this (they may have been there yesterday). I was able to recreate the above mentioned warnings with pseries_le_defconfig and powernv_defconfig. The regression report also mentions a warning (https://lore.kernel.org/oe-kbuild-all/202211282102.qur7hhrw-...@intel.com/) seen with arch/powerpc/kernel/kvm_emul.S assembly file. [1] arch/powerpc/kernel/optprobes_head.o: warning: objtool: optprobe_template_end(): can't find starting instruction [2] arch/powerpc/kernel/kvm_emul.o: warning: objtool: kvm_template_end(): can't find starting instruction [3] arch/powerpc/kernel/head_64.o: warning: objtool: end_first_256B(): can't find starting instruction The warnings [1] and [2] go away after adding 'nop' instruction. Below diff fixes it for me: You have to add NOPs just because those labels are at the end of the files. That's a bit odd. I think either we are missing some kind of flagging for the symbols, or objtool has a bug. In both cases, I'm not sure adding an artificial 'nop' is the solution. At least there should be a big hammer warning explaining why. The problem looks to be that commit dbcdbdfdf137b4 ("objtool: Rework instruction -> symbol mapping"), which was referenced by Sathvika below, changes how STT_NOTYPE symbols are handled. In the files throwing that warning, there are labels either at the very end of the file, or at the end of a section with no subsequent instruction. Before that commit, we didn't used to expect an instruction for STT_NOTYPE symbols. I don't see these warnings with powerpc/topic/objtool branch. However, they are seen with linux-next master branch. Commit dbcdbdfdf137b49144204571f1a5e5dc01b8aaad objtool: Rework instruction -> symbol mapping in linux-next is resulting in objtool can't find starting instruction warnings on powerpc. Reverting this particular hunk (pasted below), resolves it and we don't see the problem anymore. @@ -427,7 +427,10 @@ static int decode_instructions(struct objtool_file *file) } list_for_each_entry(func, >symbol_list, list) { - if (func->type != STT_FUNC || func->alias != func) + if (func->type != STT_NOTYPE && func->type != STT_FUNC) + continue; + + if (func->return_thunk || func->alias != func) continue; if (!find_insn(file, sec, func->offset)) { We are currently bailing out if find_insn() there fails. Should we instead just continue by not setting insn->sym? @@ -430,11 +430,8 @@ static int decode_instructions(struct objtool_file *file) if (func->return_thunk || func->alias != func) continue; - if (!find_insn(file, sec, func->offset)) { - WARN("%s(): can't find starting instruction", -func->name); - return -1; - } + if (!find_insn(file, sec, func->offset)) + continue; sym_for_each_insn(file, func, insn) { insn->sym = func; - Naveen
[PATCH AUTOSEL 5.10 02/10] ASoC: fsl_micfil: explicitly clear CHnF flags
From: Shengjiu Wang [ Upstream commit b776c4a4618ec1b5219d494c423dc142f23c4e8f ] There may be failure when start 1 channel recording after 8 channels recording. The reason is that the CHnF flags are not cleared successfully by software reset. This issue is triggerred by the change of clearing software reset bit. CHnF flags are write 1 clear bits. Clear them by force write. Signed-off-by: Shengjiu Wang Link: https://lore.kernel.org/r/1651925654-32060-2-git-send-email-shengjiu.w...@nxp.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/fsl/fsl_micfil.c | 8 1 file changed, 8 insertions(+) diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c index ead4bfa13561..6c794605e33c 100644 --- a/sound/soc/fsl/fsl_micfil.c +++ b/sound/soc/fsl/fsl_micfil.c @@ -201,6 +201,14 @@ static int fsl_micfil_reset(struct device *dev) if (ret) return ret; + /* +* Set SRES should clear CHnF flags, But even add delay here +* the CHnF may not be cleared sometimes, so clear CHnF explicitly. +*/ + ret = regmap_write_bits(micfil->regmap, REG_MICFIL_STAT, 0xFF, 0xFF); + if (ret) + return ret; + return 0; } -- 2.35.1
[PATCH AUTOSEL 5.10 01/10] ASoC: fsl_micfil: explicitly clear software reset bit
From: Shengjiu Wang [ Upstream commit 292709b9cf3ba470af94b62c9bb60284cc581b79 ] SRES is self-cleared bit, but REG_MICFIL_CTRL1 is defined as non volatile register, it still remain in regmap cache after set, then every update of REG_MICFIL_CTRL1, software reset happens. to avoid this, clear it explicitly. Signed-off-by: Shengjiu Wang Link: https://lore.kernel.org/r/1651925654-32060-1-git-send-email-shengjiu.w...@nxp.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/fsl/fsl_micfil.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c index efc5daf53bba..ead4bfa13561 100644 --- a/sound/soc/fsl/fsl_micfil.c +++ b/sound/soc/fsl/fsl_micfil.c @@ -190,6 +190,17 @@ static int fsl_micfil_reset(struct device *dev) return ret; } + /* +* SRES is self-cleared bit, but REG_MICFIL_CTRL1 is defined +* as non-volatile register, so SRES still remain in regmap +* cache after set, that every update of REG_MICFIL_CTRL1, +* software reset happens. so clear it explicitly. +*/ + ret = regmap_clear_bits(micfil->regmap, REG_MICFIL_CTRL1, + MICFIL_CTRL1_SRES); + if (ret) + return ret; + return 0; } -- 2.35.1
[PATCH AUTOSEL 5.15 02/12] ASoC: fsl_micfil: explicitly clear CHnF flags
From: Shengjiu Wang [ Upstream commit b776c4a4618ec1b5219d494c423dc142f23c4e8f ] There may be failure when start 1 channel recording after 8 channels recording. The reason is that the CHnF flags are not cleared successfully by software reset. This issue is triggerred by the change of clearing software reset bit. CHnF flags are write 1 clear bits. Clear them by force write. Signed-off-by: Shengjiu Wang Link: https://lore.kernel.org/r/1651925654-32060-2-git-send-email-shengjiu.w...@nxp.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/fsl/fsl_micfil.c | 8 1 file changed, 8 insertions(+) diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c index cb84d95c3aac..d1cd104f8584 100644 --- a/sound/soc/fsl/fsl_micfil.c +++ b/sound/soc/fsl/fsl_micfil.c @@ -202,6 +202,14 @@ static int fsl_micfil_reset(struct device *dev) if (ret) return ret; + /* +* Set SRES should clear CHnF flags, But even add delay here +* the CHnF may not be cleared sometimes, so clear CHnF explicitly. +*/ + ret = regmap_write_bits(micfil->regmap, REG_MICFIL_STAT, 0xFF, 0xFF); + if (ret) + return ret; + return 0; } -- 2.35.1
[PATCH AUTOSEL 5.15 01/12] ASoC: fsl_micfil: explicitly clear software reset bit
From: Shengjiu Wang [ Upstream commit 292709b9cf3ba470af94b62c9bb60284cc581b79 ] SRES is self-cleared bit, but REG_MICFIL_CTRL1 is defined as non volatile register, it still remain in regmap cache after set, then every update of REG_MICFIL_CTRL1, software reset happens. to avoid this, clear it explicitly. Signed-off-by: Shengjiu Wang Link: https://lore.kernel.org/r/1651925654-32060-1-git-send-email-shengjiu.w...@nxp.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/fsl/fsl_micfil.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c index 9f90989ac59a..cb84d95c3aac 100644 --- a/sound/soc/fsl/fsl_micfil.c +++ b/sound/soc/fsl/fsl_micfil.c @@ -191,6 +191,17 @@ static int fsl_micfil_reset(struct device *dev) return ret; } + /* +* SRES is self-cleared bit, but REG_MICFIL_CTRL1 is defined +* as non-volatile register, so SRES still remain in regmap +* cache after set, that every update of REG_MICFIL_CTRL1, +* software reset happens. so clear it explicitly. +*/ + ret = regmap_clear_bits(micfil->regmap, REG_MICFIL_CTRL1, + MICFIL_CTRL1_SRES); + if (ret) + return ret; + return 0; } -- 2.35.1
[PATCH AUTOSEL 6.0 02/13] ASoC: fsl_micfil: explicitly clear CHnF flags
From: Shengjiu Wang [ Upstream commit b776c4a4618ec1b5219d494c423dc142f23c4e8f ] There may be failure when start 1 channel recording after 8 channels recording. The reason is that the CHnF flags are not cleared successfully by software reset. This issue is triggerred by the change of clearing software reset bit. CHnF flags are write 1 clear bits. Clear them by force write. Signed-off-by: Shengjiu Wang Link: https://lore.kernel.org/r/1651925654-32060-2-git-send-email-shengjiu.w...@nxp.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/fsl/fsl_micfil.c | 8 1 file changed, 8 insertions(+) diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c index 8aa6871e0d42..4b86ef82fd93 100644 --- a/sound/soc/fsl/fsl_micfil.c +++ b/sound/soc/fsl/fsl_micfil.c @@ -205,6 +205,14 @@ static int fsl_micfil_reset(struct device *dev) if (ret) return ret; + /* +* Set SRES should clear CHnF flags, But even add delay here +* the CHnF may not be cleared sometimes, so clear CHnF explicitly. +*/ + ret = regmap_write_bits(micfil->regmap, REG_MICFIL_STAT, 0xFF, 0xFF); + if (ret) + return ret; + return 0; } -- 2.35.1
[PATCH AUTOSEL 6.0 01/13] ASoC: fsl_micfil: explicitly clear software reset bit
From: Shengjiu Wang [ Upstream commit 292709b9cf3ba470af94b62c9bb60284cc581b79 ] SRES is self-cleared bit, but REG_MICFIL_CTRL1 is defined as non volatile register, it still remain in regmap cache after set, then every update of REG_MICFIL_CTRL1, software reset happens. to avoid this, clear it explicitly. Signed-off-by: Shengjiu Wang Link: https://lore.kernel.org/r/1651925654-32060-1-git-send-email-shengjiu.w...@nxp.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/fsl/fsl_micfil.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c index 79ef4e269bc9..8aa6871e0d42 100644 --- a/sound/soc/fsl/fsl_micfil.c +++ b/sound/soc/fsl/fsl_micfil.c @@ -194,6 +194,17 @@ static int fsl_micfil_reset(struct device *dev) if (ret) return ret; + /* +* SRES is self-cleared bit, but REG_MICFIL_CTRL1 is defined +* as non-volatile register, so SRES still remain in regmap +* cache after set, that every update of REG_MICFIL_CTRL1, +* software reset happens. so clear it explicitly. +*/ + ret = regmap_clear_bits(micfil->regmap, REG_MICFIL_CTRL1, + MICFIL_CTRL1_SRES); + if (ret) + return ret; + return 0; } -- 2.35.1