Re: [PATCH v2 00/18] clean up asm/uaccess.h, kill set_fs for good
On Thu, Feb 17, 2022 at 8:20 AM Christophe Leroy wrote: > Le 16/02/2022 à 14:13, Arnd Bergmann a écrit : > > > > Christoph Hellwig and a few others spent a huge effort on removing > > set_fs() from most of the important architectures, but about half the > > other architectures were never completed even though most of them don't > > actually use set_fs() at all. > > > > I did a patch for microblaze at some point, which turned out to be fairly > > generic, and now ported it to most other architectures, using new generic > > implementations of access_ok() and __{get,put}_kernel_nocheck(). > > > > Three architectures (sparc64, ia64, and sh) needed some extra work, > > which I also completed. > > > > The final series contains extra cleanup changes that touch all > > architectures. Please review and test these, so we can merge them > > for v5.18. > > As a further cleanup, have you thought about making a generic version of > clear_user() ? On almost all architectures, clear_user() does an > access_ok() then calls __clear_user() or similar. This already exists in include/asm-generic/uaccess.h, but that file is currently not as easy to use as it should be. I've previously looked into what it would take to get more architectures to use common code in that file, but I currently have no plans to work on that. > Maybe also the same with put_user() and get_user() ? After all it is > just access_ok() followed by __put_user() or __get_user() ? It seems > more tricky though, as some architectures seems to have less trivial > stuff there. Same here: architectures can already provide a __put_user_fn() and __get_user_fn(), to get the generic versions of the interface, but few architectures use that. You can actually get all the interfaces by just providing raw_copy_from_user() and raw_copy_to_user(), but the get_user/put_user versions you get from that are fairly inefficient. > I also see all architectures have a prototype for strncpy_from_user() > and strnlen_user(). Could be a common prototype instead when we have > GENERIC_STRNCPY_FROM_USER / GENERIC_STRNLEN_USER > > And we have also > user_access_begin()/user_read_access_begin()/user_write_access_begin() > which call access_ok() then do the real work. Could be made generic with > call to some arch specific __user_access_begin() and friends after the > access_ok() and eventually the might_fault(). In my opinion, the biggest win would be to move the type-agnostic part of get_user/put_user into completely generic code, this is what architectures get wrong the most, see patch 02/18 in this series for instance. What I'd like to see is that architectures only provide fixed-length versions of unsafe_get_user()/unsafe_put_user(), with the type-agnostic versions (get_user(), __get_user(), unsafe_get_user() and their put versions) all defined once in include/linux/uaccess.h based on those. I tried implementing this in the past, but unfortunately the resulting object code from my generalized implementation was worse than what we have today, so I did not continue that work. Arnd
Re: [PATCH v2 00/18] clean up asm/uaccess.h, kill set_fs for good
Le 16/02/2022 à 14:13, Arnd Bergmann a écrit : > From: Arnd Bergmann > > Christoph Hellwig and a few others spent a huge effort on removing > set_fs() from most of the important architectures, but about half the > other architectures were never completed even though most of them don't > actually use set_fs() at all. > > I did a patch for microblaze at some point, which turned out to be fairly > generic, and now ported it to most other architectures, using new generic > implementations of access_ok() and __{get,put}_kernel_nocheck(). > > Three architectures (sparc64, ia64, and sh) needed some extra work, > which I also completed. > > The final series contains extra cleanup changes that touch all > architectures. Please review and test these, so we can merge them > for v5.18. As a further cleanup, have you thought about making a generic version of clear_user() ? On almost all architectures, clear_user() does an access_ok() then calls __clear_user() or similar. Maybe also the same with put_user() and get_user() ? After all it is just access_ok() followed by __put_user() or __get_user() ? It seems more tricky though, as some architectures seems to have less trivial stuff there. I also see all architectures have a prototype for strncpy_from_user() and strnlen_user(). Could be a common prototype instead when we have GENERIC_STRNCPY_FROM_USER / GENERIC_STRNLEN_USER And we have also user_access_begin()/user_read_access_begin()/user_write_access_begin() which call access_ok() then do the real work. Could be made generic with call to some arch specific __user_access_begin() and friends after the access_ok() and eventually the might_fault(). Christophe
Re: No Linux logs when doing `ppc64_cpu --smt=off/8`
On Thu, 17 Feb 2022 at 01:07, Michael Ellerman wrote: > > Michal Suchánek writes: > > On Mon, Feb 14, 2022 at 01:33:24PM +0100, Paul Menzel wrote: > >> Yes, simple `nproc` suffice, but I was more thinking about, that the Linux > >> log is often used for debugging and the changes of amount of processing > >> units might be good to have. `ppc64_cpu --smt=off` or `=8` seems to block > >> for quite some time, and each thread/processing unit seems to powered > >> down/on sequentially, so it takes quite some time and it blocks. So 140 > >> messages would indeed be quite noise. No idea how `ppc64_cpu` works, and if > >> it could log a message at the beginning and end. > > > > Yes, it enables/disables threads one by one. AFAICT the kernel cannot know > > that > > ppc64_cpu will enable/disable more threads later, it can either log each > > or none. Rate limiting would not show the whole picture so it's not > > great solution either. > > Right, ppc64_cpu just uses the sysfs online files, so it's doing them one at a > time. The kernel has no knowledge that ppc64_cpu is turning all > secondaries on/off so there's no easy way for the kernel to do a summary > message. > > An easy solution would be for ppc64_cpu to log something via syslog(3). Another option would be to implement this sysfs knob for powerpc: /sys/devices/system/cpu/smt/control https://github.com/torvalds/linux/commit/05736e4ac13c08a4a9b1ef2de26dd31a32cbee57 It currently returns "notimplemented" on power, but it's designed (for x86) to turn SMT on and off. With this implemented we could offline all CPUs at once before returning to userspace. It "just" needs someone with some time to work on it. ...which I spent the afternoon attempting to do: https://lore.kernel.org/linuxppc-dev/20220217070419.351683-1-j...@jms.id.au/ It does what it says on the tin, but there's a few rough edges that need working out. It won't solve the problem you had of wanting a message in the kernel logs (I notice that the x86 cpu hotplug code does print something; I wonder how that goes on a large box). If you or anyone else wants to run with the idea then free free to use my patch as a starting point. Cheers, Joel
[RFC PATCH] powerpc: Implement hotplug smt control
x86 added a control for turning SMT on and off in commit 05736e4ac13c ("cpu/hotplug: Provide knobs to control SMT"). Implement this for powerpc as an alternative to the currently method of iterating through /sys/devices/system/cpu/cpuN/online for every CPU. # ppc64_cpu --info Core 0:0*1*2*3*4*5*6*7* Core 1:8*9* 10* 11* 12* 13* 14* 15* # grep . /sys/devices/system/cpu/smt/* /sys/devices/system/cpu/smt/active:1 /sys/devices/system/cpu/smt/control:on # echo off > /sys/devices/system/cpu/smt/control # ppc64_cpu --info Core 0:0*1 2 3 4 5 6 7 Core 1:8*9101112131415 # grep . /sys/devices/system/cpu/smt/* /sys/devices/system/cpu/smt/active:0 /sys/devices/system/cpu/smt/control:off Signed-off-by: Joel Stanley --- This is a RFC as there are bugs: - Booting with nosmt results in a WARN for every sibling thread smp: Bringing up secondary CPUs ... CPU UP failed (-125) CPU 1 state (null) (151) [ cut here ] WARNING: CPU: 0 PID: 1 at kernel/cpu.c:711 _cpu_up+0x304/0x310 Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc4-00055-g2db720040f59-dirty #8 ... NIP [c0136ac4] _cpu_up+0x304/0x310 LR [c0136ac4] _cpu_up+0x304/0x310 Call Trace: [c8503ab0] [c0136ac4] _cpu_up+0x304/0x310 (unreliable) [c8503b60] [c0136be4] cpu_up+0x114/0x1b0 [c8503c00] [c0137538] bringup_nonboot_cpus+0xb8/0x110 [c8503c60] [c2031838] smp_init+0x48/0xd4 [c8503cc0] [c20048f4] kernel_init_freeable+0x20c/0x3dc [c8503da0] [c00127a4] kernel_init+0x34/0x1a0 [c8503e10] [c000cd64] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 1d270028 7cca482a 48089689 6000 7ec4b378 7f03c378 4bffd45d 7ec6b378 7f05c378 7f84e378 3860 4bffe919 <0b03> eac10060 4bfffe64 3c4c0271 ---[ end trace ]--- and then you can get a BUG at runtime if you fiddle with the online state: # ppc64_cpu --smt=on # ppc64_cpu --info Core 0:0*1*2*3*4*5*6*7* Core 1:8*9* 10* 11* 12* 13* 14* 15* # ppc64_cpu --smt=off [ 95.643467][ T203] [ cut here ] [ 95.643556][ T203] kernel BUG at kernel/irq_work.c:235! [ 95.643633][ T203] Oops: Exception in kernel mode, sig: 5 [#1] - Using the smt control to turn off SMT means you cannot online those CPUs with /sys/devices/system/cpu/cpuN/online (returns EPERM) # ppc64_cpu --info Core 0:0*1*2*3*4*5*6*7* Core 1:8*9* 10* 11* 12* 13* 14* 15* # echo off > /sys/devices/system/cpu/smt/control # ppc64_cpu --info Core 0:0*1 2 3 4 5 6 7 Core 1:8*9101112131415 # grep . /sys/devices/system/cpu/smt/* /sys/devices/system/cpu/smt/active:0 /sys/devices/system/cpu/smt/control:off # ppc64_cpu --smt=on One or more cpus could not be on/offlined # strace ppc64_cpu --smt=on openat(AT_FDCWD, "/sys/devices/system/cpu/cpu0/online", O_RDONLY) = 3 newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=65536, ...}, AT_EMPTY_PATH) = 0 read(3, "1\n", 8192)= 2 close(3)= 0 openat(AT_FDCWD, "/sys/devices/system/cpu/cpu0/online", O_WRONLY) = 3 write(3, "1", 1)= 1 close(3)= 0 openat(AT_FDCWD, "/sys/devices/system/cpu/cpu1/online", O_WRONLY) = 3 write(3, "1", 1)= -1 EPERM (Operation not permitted) close(3)= 0 access("/sys/devices/system/cpu/cpu8/online", F_OK) = 0 openat(AT_FDCWD, "/sys/devices/system/cpu/cpu8/online", O_RDONLY) = 3 newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=65536, ...}, AT_EMPTY_PATH) = 0 read(3, "1\n", 8192)= 2 close(3)= 0 openat(AT_FDCWD, "/sys/devices/system/cpu/cpu8/online", O_WRONLY) = 3 write(3, "1", 1)= 1 close(3)= 0 openat(AT_FDCWD, "/sys/devices/system/cpu/cpu9/online", O_WRONLY) = 3 write(3, "1", 1)= -1 EPERM (Operation not permitted) close(3)= 0 write(2, "One or more cpus could not be on"..., 42One or more cpus could not be on/offlined - Setting smt off with the online interface doesn't update the status of the smt control sysfs file (but does update the active file): # ppc64_cpu --smt=off # grep . /sys/devices/system/cpu/smt/* /sys/devices/system/cpu/smt/active:0 /sys/devices/system/cpu/smt/control:on
[powerpc:fixes-test] BUILD SUCCESS fe663df7825811358531dc2e8a52d9eaa5e3515e
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes-test branch HEAD: fe663df7825811358531dc2e8a52d9eaa5e3515e powerpc/lib/sstep: fix 'ptesync' build error elapsed time: 738m configs tested: 128 configs skipped: 109 The following configs have been built successfully. More configs may be tested in the coming days. gcc tested configs: arm defconfig arm64allyesconfig arm64 defconfig arm allyesconfig arm allmodconfig i386 randconfig-c001 powerpc mpc8540_ads_defconfig m68k m5208evb_defconfig sparc sparc32_defconfig mipsgpr_defconfig m68k atari_defconfig mips ip32_defconfig sh ecovec24_defconfig m68km5307c3_defconfig h8300h8300h-sim_defconfig m68kstmark2_defconfig shsh7785lcr_defconfig arm jornada720_defconfig powerpc motionpro_defconfig sparcallyesconfig sh espt_defconfig mips tb0226_defconfig armkeystone_defconfig arm multi_v4t_defconfig nds32alldefconfig mips bmips_be_defconfig xtensa iss_defconfig mips ci20_defconfig armmulti_v7_defconfig powerpcmpc7448_hpc2_defconfig powerpc makalu_defconfig mips gcw0_defconfig sh sdk7786_defconfig sh sh7710voipgw_defconfig powerpc mpc85xx_cds_defconfig powerpcadder875_defconfig powerpc mpc837x_rdb_defconfig s390 allyesconfig sh kfr2r09_defconfig xtensa alldefconfig openrisc simple_smp_defconfig sh se7750_defconfig powerpc mpc83xx_defconfig sh sh7785lcr_32bit_defconfig powerpc mpc834x_itx_defconfig arm randconfig-c002-20220216 ia64 allmodconfig ia64defconfig ia64 allyesconfig m68k allmodconfig m68kdefconfig m68k allyesconfig nios2 defconfig arc allyesconfig nds32 allnoconfig nds32 defconfig nios2allyesconfig cskydefconfig alpha defconfig alphaallyesconfig xtensa allyesconfig h8300allyesconfig arc defconfig sh allmodconfig parisc defconfig s390 allmodconfig parisc64defconfig parisc allyesconfig s390defconfig i386 allyesconfig sparc defconfig i386defconfig i386 debian-10.3-kselftests i386 debian-10.3 mips allyesconfig mips allmodconfig powerpc allyesconfig powerpc allmodconfig powerpc allnoconfig x86_64randconfig-a006 x86_64randconfig-a004 x86_64randconfig-a002 i386 randconfig-a012 i386 randconfig-a014 i386 randconfig-a016 riscvnommu_k210_defconfig riscvallyesconfig riscvnommu_virt_defconfig riscv allnoconfig riscv defconfig riscv rv32_defconfig riscvallmodconfig x86_64rhel-8.3-kselftests um x86_64_defconfig um i386_defconfig x86_64 allyesconfig x86_64 defconfig x86_64 rhel-8.3 x86_64 rhel-8.3-func x86_64 kexec clang tested configs
[powerpc:next-test] BUILD SUCCESS bbbca72352bb9484bc057c91a408332b35ee8f4c
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test branch HEAD: bbbca72352bb9484bc057c91a408332b35ee8f4c powerpc/papr_scm: Implement initial support for injecting smart errors elapsed time: 738m configs tested: 129 configs skipped: 3 The following configs have been built successfully. More configs may be tested in the coming days. gcc tested configs: arm defconfig arm64allyesconfig arm64 defconfig arm allyesconfig arm allmodconfig i386 randconfig-c001 powerpc mpc8540_ads_defconfig m68k m5208evb_defconfig sparc sparc32_defconfig mipsgpr_defconfig m68km5307c3_defconfig h8300h8300h-sim_defconfig m68kstmark2_defconfig shsh7785lcr_defconfig arm jornada720_defconfig m68k hp300_defconfig arm exynos_defconfig openriscor1ksim_defconfig powerpc motionpro_defconfig sparcallyesconfig sh espt_defconfig mips tb0226_defconfig armkeystone_defconfig arm multi_v4t_defconfig nds32alldefconfig mips bmips_be_defconfig xtensa iss_defconfig mips ci20_defconfig armmulti_v7_defconfig powerpcmpc7448_hpc2_defconfig powerpc makalu_defconfig mips gcw0_defconfig sh sdk7786_defconfig powerpc mpc85xx_cds_defconfig sh sh7710voipgw_defconfig powerpcadder875_defconfig powerpc mpc837x_rdb_defconfig s390 allyesconfig sh kfr2r09_defconfig xtensa alldefconfig openrisc simple_smp_defconfig sh se7750_defconfig powerpc mpc83xx_defconfig sh sh7785lcr_32bit_defconfig powerpc mpc834x_itx_defconfig arm randconfig-c002-20220216 ia64 allmodconfig ia64defconfig ia64 allyesconfig m68k allmodconfig m68kdefconfig m68k allyesconfig nios2 defconfig arc allyesconfig nds32 allnoconfig nds32 defconfig nios2allyesconfig cskydefconfig alpha defconfig alphaallyesconfig xtensa allyesconfig h8300allyesconfig arc defconfig sh allmodconfig parisc defconfig s390 allmodconfig parisc64defconfig parisc allyesconfig s390defconfig i386 allyesconfig sparc defconfig i386defconfig i386 debian-10.3-kselftests i386 debian-10.3 mips allyesconfig mips allmodconfig powerpc allyesconfig powerpc allmodconfig powerpc allnoconfig x86_64randconfig-a006 x86_64randconfig-a004 x86_64randconfig-a002 i386 randconfig-a012 i386 randconfig-a014 i386 randconfig-a016 riscvrandconfig-r042-20220216 arc randconfig-r043-20220216 s390 randconfig-r044-20220216 riscvnommu_k210_defconfig riscvallyesconfig riscvnommu_virt_defconfig riscv allnoconfig riscv defconfig riscv rv32_defconfig riscvallmodconfig x86_64rhel-8.3-kselftests um x86_64_defconfig um i386_defconfig x86_64 allyesconfig x86_64 defconfig
Re: No Linux logs when doing `ppc64_cpu --smt=off/8`
Michal Suchánek writes: > On Mon, Feb 14, 2022 at 01:33:24PM +0100, Paul Menzel wrote: >> Am 14.02.22 um 10:43 schrieb Michal Suchánek: >> > On Mon, Feb 14, 2022 at 07:08:07AM +0100, Paul Menzel wrote: >> > > On the POWER8 server IBM S822LC running `ppc64_cpu --smt=off` or >> > > `ppc64_cpu >> > > --smt=8`, Linux 5.17-rc4 does not log anything. I would have expected a >> > > message about the change in number of processing units. >> > >> > IIRC it was considered too noisy for systems with many CPUs and the >> > message was dropped. You can always check the resulting state with >> > ppc64_cpu or examining sysfs. One of the messages was removed because it was potentially buggy: ed8029d7b472 ("powerpc/pseries: Stop calling printk in rtas_stop_self()") We may have removed some other messages, but my grepping skills can't find anything relevant at the moment. But in general yes, it used to be far too verbose on large systems. >> Yes, simple `nproc` suffice, but I was more thinking about, that the Linux >> log is often used for debugging and the changes of amount of processing >> units might be good to have. `ppc64_cpu --smt=off` or `=8` seems to block >> for quite some time, and each thread/processing unit seems to powered >> down/on sequentially, so it takes quite some time and it blocks. So 140 >> messages would indeed be quite noise. No idea how `ppc64_cpu` works, and if >> it could log a message at the beginning and end. > > Yes, it enables/disables threads one by one. AFAICT the kernel cannot know > that > ppc64_cpu will enable/disable more threads later, it can either log each > or none. Rate limiting would not show the whole picture so it's not > great solution either. Right, ppc64_cpu just uses the sysfs online files, so it's doing them one at a time. The kernel has no knowledge that ppc64_cpu is turning all secondaries on/off so there's no easy way for the kernel to do a summary message. An easy solution would be for ppc64_cpu to log something via syslog(3). cheers
Re: [PATCH v2 18/18] uaccess: drop maining CONFIG_SET_FS users
On Wed, Feb 16, 2022 at 7:44 PM Sam Ravnborg wrote: > > Hi Arnd, > > Fix spelling in $subject... done > sparc/Kconfig b/arch/sparc/Kconfig > > index 9f6f9bce5292..9276f321b3e3 100644 > > --- a/arch/sparc/Kconfig > > +++ b/arch/sparc/Kconfig > > @@ -46,7 +46,6 @@ config SPARC > > select LOCKDEP_SMALL if LOCKDEP > > select NEED_DMA_MAP_STATE > > select NEED_SG_DMA_LENGTH > > - select SET_FS > > select TRACE_IRQFLAGS_SUPPORT > > > > config SPARC32 > > @@ -101,6 +100,7 @@ config SPARC64 > > select HAVE_SETUP_PER_CPU_AREA > > select NEED_PER_CPU_EMBED_FIRST_CHUNK > > select NEED_PER_CPU_PAGE_FIRST_CHUNK > > + select SET_FS > This looks wrong - looks like some merge went wrong here. Fixed now. > > Other than the above the sparc32 changes looks fine, and with the Kconf > stuff fixed: > Acked-by: Sam Ravnborg # for sparc32 changes Thanks! Arnd
Re: [PATCH v2 15/18] sparc64: remove CONFIG_SET_FS support
On Wed, Feb 16, 2022 at 7:41 PM Sam Ravnborg wrote: > On Wed, Feb 16, 2022 at 07:34:59PM +0100, Sam Ravnborg wrote: > > > > I think you somehow missed the Kconfig change, and also the related > > sparc32 change which continue to have set_fs() after this patch. Right, thanks for pointing out the issue. > I now notice the sparc32 bits are in the last patch. > To avoid breaking bisect-ability on sparc64 I think you need to merge > the sparc32 changes with this patch, unless the sparc64 changes can > coexist with CONFIG_SET_FS continue to be set. I originally had them in the reverse order and broke bisectability during my rebase. The end result is still fine, but now I need to move the 'select SET_FS' from CONFIG_SPARC to CONFIG_SPARC32 in this patch and then remove it again from there in the last step. I've done that in my local copy now. Arnd
Re: [next-20220215] WARNING at fs/iomap/buffered-io.c:75 with xfstests
[adding fsdevel to this party, since iomap is core code, not just XFS...] On Wed, Feb 16, 2022 at 12:55:02PM +0530, Sachin Sant wrote: > While running xfstests on IBM Power10 logical partition (LPAR) booted > with 5.17.0-rc4-next-20220215 following warning was seen: > > [ 2547.384210] xfs filesystem being mounted at /mnt/scratch supports > timestamps until 2038 (0x7fff) > [ 2547.389021] XFS (dm-0): metadata I/O error in "xfs_imap_to_bp+0x64/0x98 > [xfs]" at daddr 0x2bc2c0 len 32 error 5 > [ 2547.389020] XFS (dm-0): metadata I/O error in "xfs_imap_to_bp+0x64/0x98 > [xfs]" at daddr 0x15cede0 len 32 error 5 > [ 2547.389026] XFS (dm-0): metadata I/O error in > "xfs_btree_read_buf_block.constprop.29+0xd0/0x110 [xfs]" at daddr 0xc60 len 8 > error 5 > [ 2547.389032] XFS (dm-0): metadata I/O error in > "xfs_btree_read_buf_block.constprop.29+0xd0/0x110 [xfs]" at daddr 0x3bffa30 > len 8 error 5 > [ 2547.389120] XFS (dm-0): log I/O error -5 > [ 2547.389135] XFS (dm-0): Metadata I/O Error (0x1) detected at > xfs_trans_read_buf_map+0x31c/0x368 [xfs] (fs/xfs/xfs_trans_buf.c:296). > Shutting down filesystem. Hmm, the filesystem went down... > [ 2547.389195] XFS (dm-0): Please unmount the filesystem and rectify the > problem(s) > [ 2547.662818] [ cut here ] > [ 2547.662832] WARNING: CPU: 24 PID: 2463718 at fs/iomap/buffered-io.c:75 > iomap_page_release+0x1b0/0x1e0 ...and I think this is complaining about this debugging test in iomap_page_release: WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) != folio_test_uptodate(folio)); which checks that we're not releasing or invalidating a page that's partially uptodate on a blocksize < pagesize filesystem (or so I gather from "POWER10 LPAR" (64k pages?) and "XFS" (4k blocks?))... > [ 2547.662840] Modules linked in: dm_thin_pool dm_persistent_data > dm_bio_prison dm_snapshot dm_bufio loop dm_flakey xfs libcrc32c dm_mod rfkill > bonding sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio sch_fq_codel > ext4 mbcache jbd2 sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp fuse > [last unloaded: scsi_debug] > [ 2547.662866] CPU: 24 PID: 2463718 Comm: umount Not tainted > 5.17.0-rc4-next-20220215 #1 > [ 2547.662871] NIP: c0565b80 LR: c0565aa0 CTR: > c0565e40 > [ 2547.662874] REGS: cf0035b0 TRAP: 0700 Not tainted > (5.17.0-rc4-next-20220215) > [ 2547.662877] MSR: 8282b033 CR: > 44000228 XER: 2004 > [ 2547.662885] CFAR: c0565ac8 IRQMASK: 0 > [ 2547.662885] GPR00: c037dd3c cf003850 c2a03300 > 0001 > [ 2547.662885] GPR04: 0010 0001 c000df6e7bc0 > 0001 > [ 2547.662885] GPR08: fffe 0010 > 0228 > [ 2547.662885] GPR12: c0565e40 c00abfcfe680 > > [ 2547.662885] GPR16: > > [ 2547.662885] GPR20: > fffe > [ 2547.662885] GPR24: cf0038e0 > c000243db278 > [ 2547.662885] GPR28: 0012 c000df6e7ae0 0010 > c00c0240c100 > [ 2547.662920] NIP [c0565b80] iomap_page_release+0x1b0/0x1e0 > [ 2547.662924] LR [c0565aa0] iomap_page_release+0xd0/0x1e0 > [ 2547.662927] Call Trace: > [ 2547.662928] [cf003850] [cf003890] 0xcf003890 > (unreliable) > [ 2547.662932] [cf003890] [c037dd3c] > truncate_cleanup_folio+0x7c/0x140 > [ 2547.662937] [cf0038c0] [c037f068] > truncate_inode_pages_range+0x148/0x5e0 > [ 2547.662942] [cf003a40] [c04c2058] evict+0x248/0x270 > [ 2547.662946] [cf003a80] [c04c20fc] dispose_list+0x7c/0xb0 > [ 2547.662951] [cf003ac0] [c04c2314] evict_inodes+0x1e4/0x300 > [ 2547.662955] [cf003b60] [c04922d0] > generic_shutdown_super+0x70/0x1e0 ...but in this particular case, we're dumping pages (and inodes) as part of unmounting the filesystem, in which case we've already flushed dirty data to disk, because sync_filesystem gets called before evict_inodes here... > [ 2547.662959] [cf003bd0] [c0492518] > kill_block_super+0x38/0x90 > [ 2547.662964] [cf003c00] [c04926e8] > deactivate_locked_super+0x78/0xf0 > [ 2547.662968] [cf003c30] [c04cde9c] cleanup_mnt+0xfc/0x1c0 > [ 2547.662972] [cf003c80] [c0189448] task_work_run+0xf8/0x170 > [ 2547.662976] [cf003cd0] [c0021b94] > do_notify_resume+0x434/0x480 > [ 2547.662981] [cf003d80] [c00338b0] > interrupt_exit_user_prepare_main+0x1a0/0x260 > [ 2547.662985] [cf003de0] [c0033d74] > syscall_exit_prepare+0x74/0x150 > [ 2547.662989] [cf003e10] [c000c658] >
Re: [PATCH v2 15/18] sparc64: remove CONFIG_SET_FS support
Hi Arnd. On Wed, Feb 16, 2022 at 02:13:29PM +0100, Arnd Bergmann wrote: > From: Arnd Bergmann > > sparc64 uses address space identifiers to differentiate between kernel > and user space, using ASI_P for kernel threads but ASI_AIUS for normal > user space, with the option of changing between them. > > As nothing really changes the ASI any more, just hardcode ASI_AIUS > everywhere. Kernel threads are not allowed to access __user pointers > anyway. > > Signed-off-by: Arnd Bergmann > --- > arch/sparc/include/asm/processor_64.h | 4 > arch/sparc/include/asm/switch_to_64.h | 4 +--- > arch/sparc/include/asm/thread_info_64.h | 4 +--- > arch/sparc/include/asm/uaccess_64.h | 20 +--- > arch/sparc/kernel/process_64.c | 12 > arch/sparc/kernel/traps_64.c| 2 -- > arch/sparc/lib/NGmemcpy.S | 3 +-- > arch/sparc/mm/init_64.c | 7 --- > 8 files changed, 8 insertions(+), 48 deletions(-) I think you somehow missed the Kconfig change, and also the related sparc32 change which continue to have set_fs() after this patch. I did not manage to review the patch - as I am too unfamiliar with the code paths and the set_fs() removal changes. Sam
Re: [PATCH 08/14] arm64: simplify access_ok()
Le 15/02/2022 à 10:12, Arnd Bergmann a écrit : > On Tue, Feb 15, 2022 at 9:17 AM Ard Biesheuvel wrote: >> On Mon, 14 Feb 2022 at 17:37, Arnd Bergmann wrote: >>> From: Arnd Bergmann >>> >> >> With set_fs() out of the picture, wouldn't it be sufficient to check >> that bit #55 is clear? (the bit that selects between TTBR0 and TTBR1) >> That would also remove the need to strip the tag from the address. >> >> Something like >> >> asm goto("tbnz %0, #55, %2 \n" >> "tbnz %1, #55, %2 \n" >> :: "r"(addr), "r"(addr + size - 1) :: notok); >> return 1; >> notok: >> return 0; >> >> with an additional sanity check on the size which the compiler could >> eliminate for compile-time constant values. > > That should work, but I don't see it as a clear enough advantage to > have a custom implementation. For the constant-size case, it probably > isn't better than a compiler-scheduled comparison against a > constant limit, but it does hurt maintainability when the next person > wants to change the behavior of access_ok() globally. > > If we want to get into micro-optimizing uaccess, I think a better target > would be a CONFIG_CC_HAS_ASM_GOTO_OUTPUT version > of __get_user()/__put_user as we have on x86 and powerpc. > There is also the user block accesses with user_access_begin()/user_access_end() together with unsafe_put_user() and unsafe_get_user() which allowed us to optimise user accesses on powerpc, especially in the signal code. Christophe
Re: [PATCH v2 18/18] uaccess: drop maining CONFIG_SET_FS users
Hi Arnd, Fix spelling in $subject... sparc/Kconfig b/arch/sparc/Kconfig > index 9f6f9bce5292..9276f321b3e3 100644 > --- a/arch/sparc/Kconfig > +++ b/arch/sparc/Kconfig > @@ -46,7 +46,6 @@ config SPARC > select LOCKDEP_SMALL if LOCKDEP > select NEED_DMA_MAP_STATE > select NEED_SG_DMA_LENGTH > - select SET_FS > select TRACE_IRQFLAGS_SUPPORT > > config SPARC32 > @@ -101,6 +100,7 @@ config SPARC64 > select HAVE_SETUP_PER_CPU_AREA > select NEED_PER_CPU_EMBED_FIRST_CHUNK > select NEED_PER_CPU_PAGE_FIRST_CHUNK > + select SET_FS This looks wrong - looks like some merge went wrong here. > > config ARCH_PROC_KCORE_TEXT > def_bool y > diff --git a/arch/sparc/include/asm/processor_32.h > b/arch/sparc/include/asm/processor_32.h > index 647bf0ac7beb..b26c35336b51 100644 > --- a/arch/sparc/include/asm/processor_32.h > +++ b/arch/sparc/include/asm/processor_32.h > @@ -32,10 +32,6 @@ struct fpq { > }; > #endif > > -typedef struct { > - int seg; > -} mm_segment_t; > - > /* The Sparc processor specific thread struct. */ > struct thread_struct { > struct pt_regs *kregs; > @@ -50,11 +46,9 @@ struct thread_struct { > unsigned long fsr; > unsigned long fpqdepth; > struct fpq fpqueue[16]; > - mm_segment_t current_ds; > }; > > #define INIT_THREAD { \ > - .current_ds = KERNEL_DS, \ > .kregs = (struct pt_regs *)(init_stack+THREAD_SIZE)-1 \ > } > > diff --git a/arch/sparc/include/asm/uaccess_32.h > b/arch/sparc/include/asm/uaccess_32.h > index 367747116260..9fd6c53644b6 100644 > --- a/arch/sparc/include/asm/uaccess_32.h > +++ b/arch/sparc/include/asm/uaccess_32.h > @@ -12,19 +12,6 @@ > #include > > #include > - > -/* Sparc is not segmented, however we need to be able to fool access_ok() > - * when doing system calls from kernel mode legitimately. > - * > - * "For historical reasons, these macros are grossly misnamed." -Linus > - */ > - > -#define KERNEL_DS ((mm_segment_t) { 0 }) > -#define USER_DS ((mm_segment_t) { -1 }) > - > -#define get_fs() (current->thread.current_ds) > -#define set_fs(val) ((current->thread.current_ds) = (val)) > - > #include > > /* Uh, these should become the main single-value transfer routines.. > diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c > index 2dc0bf9fe62e..88c0c14aaff0 100644 > --- a/arch/sparc/kernel/process_32.c > +++ b/arch/sparc/kernel/process_32.c > @@ -300,7 +300,6 @@ int copy_thread(unsigned long clone_flags, unsigned long > sp, unsigned long arg, > extern int nwindows; > unsigned long psr; > memset(new_stack, 0, STACKFRAME_SZ + TRACEREG_SZ); > - p->thread.current_ds = KERNEL_DS; > ti->kpc = (((unsigned long) ret_from_kernel_thread) - 0x8); > childregs->u_regs[UREG_G1] = sp; /* function */ > childregs->u_regs[UREG_G2] = arg; > @@ -311,7 +310,6 @@ int copy_thread(unsigned long clone_flags, unsigned long > sp, unsigned long arg, > } > memcpy(new_stack, (char *)regs - STACKFRAME_SZ, STACKFRAME_SZ + > TRACEREG_SZ); > childregs->u_regs[UREG_FP] = sp; > - p->thread.current_ds = USER_DS; > ti->kpc = (((unsigned long) ret_from_fork) - 0x8); > ti->kpsr = current->thread.fork_kpsr | PSR_PIL; > ti->kwim = current->thread.fork_kwim; Other than the above the sparc32 changes looks fine, and with the Kconf stuff fixed: Acked-by: Sam Ravnborg # for sparc32 changes
Re: [PATCH v2 15/18] sparc64: remove CONFIG_SET_FS support
Hi Arnd, On Wed, Feb 16, 2022 at 07:34:59PM +0100, Sam Ravnborg wrote: > Hi Arnd. > > On Wed, Feb 16, 2022 at 02:13:29PM +0100, Arnd Bergmann wrote: > > From: Arnd Bergmann > > > > sparc64 uses address space identifiers to differentiate between kernel > > and user space, using ASI_P for kernel threads but ASI_AIUS for normal > > user space, with the option of changing between them. > > > > As nothing really changes the ASI any more, just hardcode ASI_AIUS > > everywhere. Kernel threads are not allowed to access __user pointers > > anyway. > > > > Signed-off-by: Arnd Bergmann > > --- > > arch/sparc/include/asm/processor_64.h | 4 > > arch/sparc/include/asm/switch_to_64.h | 4 +--- > > arch/sparc/include/asm/thread_info_64.h | 4 +--- > > arch/sparc/include/asm/uaccess_64.h | 20 +--- > > arch/sparc/kernel/process_64.c | 12 > > arch/sparc/kernel/traps_64.c| 2 -- > > arch/sparc/lib/NGmemcpy.S | 3 +-- > > arch/sparc/mm/init_64.c | 7 --- > > 8 files changed, 8 insertions(+), 48 deletions(-) > > I think you somehow missed the Kconfig change, and also the related > sparc32 change which continue to have set_fs() after this patch. I now notice the sparc32 bits are in the last patch. To avoid breaking bisect-ability on sparc64 I think you need to merge the sparc32 changes with this patch, unless the sparc64 changes can coexist with CONFIG_SET_FS continue to be set. Sam
Re: [next-20220215] WARNING at fs/iomap/buffered-io.c:75 with xfstests
On Wed, Feb 16, 2022 at 09:35:04AM -0800, Darrick J. Wong wrote: > On Wed, Feb 16, 2022 at 12:55:02PM +0530, Sachin Sant wrote: > > [ 2547.662818] [ cut here ] > > [ 2547.662832] WARNING: CPU: 24 PID: 2463718 at fs/iomap/buffered-io.c:75 > > iomap_page_release+0x1b0/0x1e0 > > ...and I think this is complaining about this debugging test in > iomap_page_release: > > WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) != > folio_test_uptodate(folio)); > > which checks that we're not releasing or invalidating a page that's > partially uptodate on a blocksize < pagesize filesystem (or so I gather > from "POWER10 LPAR" (64k pages?) and "XFS" (4k blocks?))... This happens _all_ _the_ _time_ in my testing. If your block size is less than page size, you can expect it. What it's supposed to be testing is that we remembered to set the uptodate flag once all blocks in this page are uptodate. What's tripping the check is iomap_writepage_map()'s stupid clearing of the uptodate flag on the folio: if (unlikely(error)) { ... if (!count) { folio_clear_uptodate(folio); folio_unlock(folio); goto done; What particularly annoys me about this is that the folio _was_ uptodate, and it was dirty, so it has newer data in it than is on disk, but we're going to re-read the folio from disk and throw away that user data. > Given that in this case we've already cleared SB_ACTIVE from the > superblock s_flags, I wonder if we could amend that code to read: > > if (inode->i_sb->s_flags & SB_ACTIVE) > WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) != > folio_test_uptodate(folio)); > > Since we don't really care about memory pages that aren't fully up to > date if we're in the midst of freeing all the incore filesystem state. > > Thoughts? Seems like papering over a bad design decision to me.
Re: [PATCH v4 2/6] Partially revert "KVM: Pass kvm_init()'s opaque param to additional arch funcs"
On Wed, 16 Feb 2022 11:15:17 +0800 Chao Gao wrote: > This partially reverts commit b99040853738 ("KVM: Pass kvm_init()'s opaque > param to additional arch funcs") remove opaque from > kvm_arch_check_processor_compat because no one uses this opaque now. > Address conflicts for ARM (due to file movement) and manually handle RISC-V > which comes after the commit. > > And changes about kvm_arch_hardware_setup() in original commit are still > needed so they are not reverted. > > Signed-off-by: Chao Gao > Reviewed-by: Sean Christopherson for KVM s390: Acked-by: Claudio Imbrenda > --- > arch/arm64/kvm/arm.c | 2 +- > arch/mips/kvm/mips.c | 2 +- > arch/powerpc/kvm/powerpc.c | 2 +- > arch/riscv/kvm/main.c | 2 +- > arch/s390/kvm/kvm-s390.c | 2 +- > arch/x86/kvm/x86.c | 2 +- > include/linux/kvm_host.h | 2 +- > virt/kvm/kvm_main.c| 16 +++- > 8 files changed, 10 insertions(+), 20 deletions(-) > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > index ecc5958e27fe..0165cf3aac3a 100644 > --- a/arch/arm64/kvm/arm.c > +++ b/arch/arm64/kvm/arm.c > @@ -73,7 +73,7 @@ int kvm_arch_hardware_setup(void *opaque) > return 0; > } > > -int kvm_arch_check_processor_compat(void *opaque) > +int kvm_arch_check_processor_compat(void) > { > return 0; > } > diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c > index a25e0b73ee70..092d09fb6a7e 100644 > --- a/arch/mips/kvm/mips.c > +++ b/arch/mips/kvm/mips.c > @@ -140,7 +140,7 @@ int kvm_arch_hardware_setup(void *opaque) > return 0; > } > > -int kvm_arch_check_processor_compat(void *opaque) > +int kvm_arch_check_processor_compat(void) > { > return 0; > } > diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c > index 2ad0ccd202d5..30c817f3fa0c 100644 > --- a/arch/powerpc/kvm/powerpc.c > +++ b/arch/powerpc/kvm/powerpc.c > @@ -423,7 +423,7 @@ int kvm_arch_hardware_setup(void *opaque) > return 0; > } > > -int kvm_arch_check_processor_compat(void *opaque) > +int kvm_arch_check_processor_compat(void) > { > return kvmppc_core_check_processor_compat(); > } > diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c > index 2e5ca43c8c49..992877e78393 100644 > --- a/arch/riscv/kvm/main.c > +++ b/arch/riscv/kvm/main.c > @@ -20,7 +20,7 @@ long kvm_arch_dev_ioctl(struct file *filp, > return -EINVAL; > } > > -int kvm_arch_check_processor_compat(void *opaque) > +int kvm_arch_check_processor_compat(void) > { > return 0; > } > diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c > index 577f1ead6a51..0053b81c6b02 100644 > --- a/arch/s390/kvm/kvm-s390.c > +++ b/arch/s390/kvm/kvm-s390.c > @@ -252,7 +252,7 @@ int kvm_arch_hardware_enable(void) > return 0; > } > > -int kvm_arch_check_processor_compat(void *opaque) > +int kvm_arch_check_processor_compat(void) > { > return 0; > } > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 9b484ed61f37..ffb88f0b7265 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -11509,7 +11509,7 @@ void kvm_arch_hardware_unsetup(void) > static_call(kvm_x86_hardware_unsetup)(); > } > > -int kvm_arch_check_processor_compat(void *opaque) > +int kvm_arch_check_processor_compat(void) > { > struct cpuinfo_x86 *c = _data(smp_processor_id()); > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index f11039944c08..2ad78e729bf7 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -1413,7 +1413,7 @@ int kvm_arch_hardware_enable(void); > void kvm_arch_hardware_disable(void); > int kvm_arch_hardware_setup(void *opaque); > void kvm_arch_hardware_unsetup(void); > -int kvm_arch_check_processor_compat(void *opaque); > +int kvm_arch_check_processor_compat(void); > int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu); > bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu); > int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu); > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 83c57bcc6eb6..ee47d33d69e1 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -5643,22 +5643,14 @@ void kvm_unregister_perf_callbacks(void) > } > #endif > > -struct kvm_cpu_compat_check { > - void *opaque; > - int *ret; > -}; > - > -static void check_processor_compat(void *data) > +static void check_processor_compat(void *rtn) > { > - struct kvm_cpu_compat_check *c = data; > - > - *c->ret = kvm_arch_check_processor_compat(c->opaque); > + *(int *)rtn = kvm_arch_check_processor_compat(); > } > > int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align, > struct module *module) > { > - struct kvm_cpu_compat_check c; > int r; > int cpu; > > @@ -5686,10 +5678,8 @@ int kvm_init(void *opaque, unsigned vcpu_size, > unsigned vcpu_align, > if (r < 0) > goto out_free_1; > > - c.ret = > - c.opaque = opaque; >
Re: [PATCH v4 00/13] Fix LKDTM for PPC64/IA64/PARISC v4
On Wed, Feb 16, 2022 at 11:22:33PM +1100, Michael Ellerman wrote: > Kees Cook writes: > > On Tue, Feb 15, 2022 at 01:40:55PM +0100, Christophe Leroy wrote: > >> PPC64/IA64/PARISC have function descriptors. LKDTM doesn't work > >> on those three architectures because LKDTM messes up function > >> descriptors with functions. > >> > >> This series does some cleanup in the three architectures and > >> refactors function descriptors so that it can then easily use it > >> in a generic way in LKDTM. > > > > Thanks for doing this! It looks good to me. :) > > How should we merge this series, it's a bit all over the map. > > I could put it in a topic branch? That's fine by me -- I had assumed it'd go via the ppc tree. But if you'd rather I take it as a topic branch I can do that too. -- Kees Cook
RE: [PATCH v2 02/18] uaccess: fix nios2 and microblaze get_user_8()
From: Arnd Bergmann > Sent: 16 February 2022 13:13 > > These two architectures implement 8-byte get_user() through > a memcpy() into a four-byte variable, which won't fit. > > Use a temporary 64-bit variable instead here, and use a double > cast the way that risc-v and openrisc do to avoid compile-time > warnings. > ... > case 4: \ > - __get_user_asm("lw", (ptr), __gu_val, __gu_err);\ > + __get_user_asm("lw", (ptr), x, __gu_err); \ > break; \ > - case 8: \ > - __gu_err = __copy_from_user(&__gu_val, ptr, 8); \ > - if (__gu_err) \ > - __gu_err = -EFAULT; \ > + case 8: { \ > + __u64 __x = 0; \ > + __gu_err = raw_copy_from_user(&__x, ptr, 8) ? \ > + -EFAULT : 0;\ > + (x) = (typeof(x))(typeof((x) - (x)))__x;\ > break; \ Wouldn't it be better to just fetch two 32bit values: Something like (for LE - nios2 is definitely LE: __u32 val_lo, val_hi; __get_user_asm("lw", (ptr), val_lo, __gu_err); __get_user_asm("lw", (ptr) + 4, val_hi, __gu_err); x = val_lo | val_hi << 32; break; David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Re: [PATCH v4 00/13] Fix LKDTM for PPC64/IA64/PARISC v4
On 2/15/22 17:07, Kees Cook wrote: > On Tue, Feb 15, 2022 at 01:40:55PM +0100, Christophe Leroy wrote: >> PPC64/IA64/PARISC have function descriptors. LKDTM doesn't work >> on those three architectures because LKDTM messes up function >> descriptors with functions. >> >> This series does some cleanup in the three architectures and >> refactors function descriptors so that it can then easily use it >> in a generic way in LKDTM. > > Thanks for doing this! It looks good to me. :) I endorse that. Thank you, Christophe! Acked-by: Helge Deller Helge > -Kees > >> >> Changes in v4: >> - Added patch 1 which Fixes 'sparse' for powerpc64le after wrong report on >> previous series, refer >> https://github.com/ruscur/linux-ci/actions/runs/1351427671 >> - Exported dereference_function_descriptor() to modules >> - Addressed other received comments >> - Rebased on latest powerpc/next (5a72345e6a78120368fcc841b570331b6c5a50da) >> >> Changes in v3: >> - Addressed received comments >> - Swapped some of the powerpc patches to keep func_descr_t renamed as struct >> func_desc and remove 'struct ppc64_opd_entry' >> - Changed HAVE_FUNCTION_DESCRIPTORS macro to a config item >> CONFIG_HAVE_FUNCTION_DESCRIPTORS >> - Dropped patch 11 ("Fix lkdtm_EXEC_RODATA()") >> >> Changes in v2: >> - Addressed received comments >> - Moved dereference_[kernel]_function_descriptor() out of line >> - Added patches to remove func_descr_t and func_desc_t in powerpc >> - Using func_desc_t instead of funct_descr_t >> - Renamed HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR to HAVE_FUNCTION_DESCRIPTORS >> - Added a new lkdtm test to check protection of function descriptors >> >> Christophe Leroy (13): >> powerpc: Fix 'sparse' checking on PPC64le >> powerpc: Move and rename func_descr_t >> powerpc: Use 'struct func_desc' instead of 'struct ppc64_opd_entry' >> powerpc: Remove 'struct ppc64_opd_entry' >> powerpc: Prepare func_desc_t for refactorisation >> ia64: Rename 'ip' to 'addr' in 'struct fdesc' >> asm-generic: Define CONFIG_HAVE_FUNCTION_DESCRIPTORS >> asm-generic: Define 'func_desc_t' to commonly describe function >> descriptors >> asm-generic: Refactor dereference_[kernel]_function_descriptor() >> lkdtm: Force do_nothing() out of line >> lkdtm: Really write into kernel text in WRITE_KERN >> lkdtm: Fix execute_[user]_location() >> lkdtm: Add a test for function descriptors protection >> >> arch/Kconfig | 3 + >> arch/ia64/Kconfig| 1 + >> arch/ia64/include/asm/elf.h | 2 +- >> arch/ia64/include/asm/sections.h | 24 +--- >> arch/ia64/kernel/module.c| 6 +- >> arch/parisc/Kconfig | 1 + >> arch/parisc/include/asm/sections.h | 16 ++ >> arch/parisc/kernel/process.c | 21 --- >> arch/powerpc/Kconfig | 1 + >> arch/powerpc/Makefile| 2 +- >> arch/powerpc/include/asm/code-patching.h | 2 +- >> arch/powerpc/include/asm/elf.h | 6 ++ >> arch/powerpc/include/asm/sections.h | 29 ++ >> arch/powerpc/include/asm/types.h | 6 -- >> arch/powerpc/include/uapi/asm/elf.h | 8 --- >> arch/powerpc/kernel/module_64.c | 42 ++ >> arch/powerpc/kernel/ptrace/ptrace.c | 6 ++ >> arch/powerpc/kernel/signal_64.c | 8 +-- >> drivers/misc/lkdtm/core.c| 1 + >> drivers/misc/lkdtm/lkdtm.h | 1 + >> drivers/misc/lkdtm/perms.c | 71 +++- >> include/asm-generic/sections.h | 15 - >> include/linux/kallsyms.h | 2 +- >> kernel/extable.c | 24 +++- >> tools/testing/selftests/lkdtm/tests.txt | 1 + >> 25 files changed, 155 insertions(+), 144 deletions(-) >> >> -- >> 2.34.1 >> >
Re: [PATCH v2 09/13] powerpc/ftrace: Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS
Heiko Carstens writes: > So, the in both variants s390 provides nearly identical data. The only > difference is that for FL_SAVE_REGS the program status word mask is > missing; therefore it is not possible to figure out the condition code > or if interrupts were enabled/disabled. > > Vasily, Sven, I think we have two options here: > > - don't provide sane psw mask contents at all and say (again) that > ptregs contents are identical > > - provide (finally) a full psw mask contents using epsw, and indicate > validity with a flags bit in pt_regs > > I would vote for the second option, even though epsw is slow. But this > is about the third or fourth time this came up in different > contexts. So I'd guess we should go for the slow but complete > solution. Opinions? Given that this only affects ftrace_regs_caller, i would also vote for the second option.
[PATCH v2 18/18] uaccess: drop maining CONFIG_SET_FS users
From: Arnd Bergmann There are no remaining callers of set_fs(), so CONFIG_SET_FS can be removed globally, along with the thread_info field and any references to it. This turns access_ok() into a cheaper check against TASK_SIZE_MAX. With CONFIG_SET_FS gone, so drop all remaining references to set_fs()/get_fs(), mm_segment_t and uaccess_kernel(). Signed-off-by: Arnd Bergmann --- arch/Kconfig | 3 - arch/alpha/Kconfig| 1 - arch/alpha/include/asm/processor.h| 4 -- arch/alpha/include/asm/thread_info.h | 2 - arch/alpha/include/asm/uaccess.h | 19 -- arch/arc/Kconfig | 1 - arch/arc/include/asm/segment.h| 20 --- arch/arc/include/asm/thread_info.h| 3 - arch/arc/include/asm/uaccess.h| 1 - arch/arm/lib/uaccess_with_memcpy.c| 10 arch/csky/Kconfig | 1 - arch/csky/include/asm/processor.h | 2 - arch/csky/include/asm/segment.h | 10 arch/csky/include/asm/thread_info.h | 2 - arch/csky/include/asm/uaccess.h | 3 - arch/csky/kernel/asm-offsets.c| 1 - arch/h8300/Kconfig| 1 - arch/h8300/include/asm/processor.h| 1 - arch/h8300/include/asm/segment.h | 40 - arch/h8300/include/asm/thread_info.h | 3 - arch/h8300/kernel/entry.S | 1 - arch/h8300/kernel/head_ram.S | 1 - arch/h8300/mm/init.c | 6 -- arch/h8300/mm/memory.c| 1 - arch/hexagon/Kconfig | 1 - arch/hexagon/include/asm/thread_info.h| 6 -- arch/hexagon/kernel/process.c | 1 - arch/microblaze/Kconfig | 1 - arch/microblaze/include/asm/thread_info.h | 6 -- arch/microblaze/include/asm/uaccess.h | 24 arch/microblaze/kernel/asm-offsets.c | 1 - arch/microblaze/kernel/process.c | 1 - arch/nds32/Kconfig| 1 - arch/nds32/include/asm/thread_info.h | 4 -- arch/nds32/include/asm/uaccess.h | 15 + arch/nds32/kernel/process.c | 5 +- arch/nds32/mm/alignment.c | 3 - arch/nios2/Kconfig| 1 - arch/nios2/include/asm/thread_info.h | 9 --- arch/nios2/include/asm/uaccess.h | 12 arch/openrisc/Kconfig | 1 - arch/openrisc/include/asm/thread_info.h | 7 --- arch/openrisc/include/asm/uaccess.h | 23 arch/parisc/include/asm/futex.h | 6 -- arch/parisc/lib/memcpy.c | 2 +- arch/sparc/Kconfig| 2 +- arch/sparc/include/asm/processor_32.h | 6 -- arch/sparc/include/asm/uaccess_32.h | 13 - arch/sparc/kernel/process_32.c| 2 - arch/xtensa/Kconfig | 1 - arch/xtensa/include/asm/asm-uaccess.h | 71 --- arch/xtensa/include/asm/processor.h | 7 --- arch/xtensa/include/asm/thread_info.h | 3 - arch/xtensa/include/asm/uaccess.h | 16 - arch/xtensa/kernel/asm-offsets.c | 3 - drivers/hid/uhid.c| 2 +- drivers/scsi/sg.c | 5 -- fs/exec.c | 6 -- include/asm-generic/access_ok.h | 10 +--- include/asm-generic/uaccess.h | 25 +--- include/linux/syscalls.h | 4 -- include/linux/uaccess.h | 33 --- include/rdma/ib.h | 2 +- kernel/events/callchain.c | 4 -- kernel/events/core.c | 3 - kernel/exit.c | 14 - kernel/kthread.c | 5 -- kernel/stacktrace.c | 3 - kernel/trace/bpf_trace.c | 4 -- mm/maccess.c | 11 mm/memory.c | 8 --- net/bpfilter/bpfilter_kern.c | 2 +- 72 files changed, 10 insertions(+), 522 deletions(-) delete mode 100644 arch/arc/include/asm/segment.h delete mode 100644 arch/csky/include/asm/segment.h delete mode 100644 arch/h8300/include/asm/segment.h diff --git a/arch/Kconfig b/arch/Kconfig index fa5db36bda67..99349547afed 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -24,9 +24,6 @@ config KEXEC_ELF config HAVE_IMA_KEXEC bool -config SET_FS - bool - config HOTPLUG_SMT bool diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig index 4e87783c90ad..eee8b5b0a58b 100644 --- a/arch/alpha/Kconfig +++ b/arch/alpha/Kconfig @@ -35,7 +35,6 @@ config ALPHA select OLD_SIGSUSPEND select CPU_NO_EFFICIENT_FFS if !ALPHA_EV67 select MMU_GATHER_NO_RANGE - select SET_FS select SPARSEMEM_EXTREME if SPARSEMEM select ZONE_DMA help diff --git
[PATCH v2 17/18] ia64: remove CONFIG_SET_FS support
From: Arnd Bergmann ia64 only uses set_fs() in one file to handle unaligned access for both user space and kernel instructions. Rewrite this to explicitly pass around a flag about which one it is and drop the feature from the architecture. Signed-off-by: Arnd Bergmann --- arch/ia64/Kconfig | 1 - arch/ia64/include/asm/processor.h | 4 -- arch/ia64/include/asm/thread_info.h | 2 - arch/ia64/include/asm/uaccess.h | 21 +++--- arch/ia64/kernel/unaligned.c| 60 +++-- 5 files changed, 45 insertions(+), 43 deletions(-) diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index a7e01573abd8..6b6a35b3d959 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -61,7 +61,6 @@ config IA64 select NEED_SG_DMA_LENGTH select NUMA if !FLATMEM select PCI_MSI_ARCH_FALLBACKS if PCI_MSI - select SET_FS select ZONE_DMA32 default y help diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h index 45365c2ef598..7cbce290f4e5 100644 --- a/arch/ia64/include/asm/processor.h +++ b/arch/ia64/include/asm/processor.h @@ -243,10 +243,6 @@ DECLARE_PER_CPU(struct cpuinfo_ia64, ia64_cpu_info); extern void print_cpu_info (struct cpuinfo_ia64 *); -typedef struct { - unsigned long seg; -} mm_segment_t; - #define SET_UNALIGN_CTL(task,value) \ ({ \ (task)->thread.flags = (((task)->thread.flags & ~IA64_THREAD_UAC_MASK) \ diff --git a/arch/ia64/include/asm/thread_info.h b/arch/ia64/include/asm/thread_info.h index 51d20cb37706..ef83493e6778 100644 --- a/arch/ia64/include/asm/thread_info.h +++ b/arch/ia64/include/asm/thread_info.h @@ -27,7 +27,6 @@ struct thread_info { __u32 cpu; /* current CPU */ __u32 last_cpu; /* Last CPU thread ran on */ __u32 status; /* Thread synchronous flags */ - mm_segment_t addr_limit;/* user-level address space limit */ int preempt_count; /* 0=premptable, <0=BUG; will also serve as bh-counter */ #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE __u64 utime; @@ -48,7 +47,6 @@ struct thread_info { .task = , \ .flags = 0,\ .cpu= 0,\ - .addr_limit = KERNEL_DS,\ .preempt_count = INIT_PREEMPT_COUNT, \ } diff --git a/arch/ia64/include/asm/uaccess.h b/arch/ia64/include/asm/uaccess.h index e242a3cc1330..60adadeb3e9e 100644 --- a/arch/ia64/include/asm/uaccess.h +++ b/arch/ia64/include/asm/uaccess.h @@ -42,26 +42,17 @@ #include /* - * For historical reasons, the following macros are grossly misnamed: - */ -#define KERNEL_DS ((mm_segment_t) { ~0UL }) /* cf. access_ok() */ -#define USER_DS((mm_segment_t) { TASK_SIZE-1 })/* cf. access_ok() */ - -#define get_fs() (current_thread_info()->addr_limit) -#define set_fs(x) (current_thread_info()->addr_limit = (x)) - -/* - * When accessing user memory, we need to make sure the entire area really is in - * user-level space. In order to do this efficiently, we make sure that the page at - * address TASK_SIZE is never valid. We also need to make sure that the address doesn't + * When accessing user memory, we need to make sure the entire area really is + * in user-level space. We also need to make sure that the address doesn't * point inside the virtually mapped linear page table. */ static inline int __access_ok(const void __user *p, unsigned long size) { + unsigned long limit = TASK_SIZE; unsigned long addr = (unsigned long)p; - unsigned long seg = get_fs().seg; - return likely(addr <= seg) && -(seg == KERNEL_DS.seg || likely(REGION_OFFSET(addr) < RGN_MAP_LIMIT)); + + return likely((size <= limit) && (addr <= (limit - size)) && +likely(REGION_OFFSET(addr) < RGN_MAP_LIMIT)); } #define __access_ok __access_ok #include diff --git a/arch/ia64/kernel/unaligned.c b/arch/ia64/kernel/unaligned.c index 6c1a8951dfbb..0acb5a0cd7ab 100644 --- a/arch/ia64/kernel/unaligned.c +++ b/arch/ia64/kernel/unaligned.c @@ -749,9 +749,25 @@ emulate_load_updates (update_t type, load_store_t ld, struct pt_regs *regs, unsi } } +static int emulate_store(unsigned long ifa, void *val, int len, bool kernel_mode) +{ + if (kernel_mode) + return copy_to_kernel_nofault((void *)ifa, val, len); + + return copy_to_user((void __user *)ifa, val, len); +} + +static int emulate_load(void *val, unsigned long ifa, int len, bool kernel_mode) +{ + if (kernel_mode) + return copy_from_kernel_nofault(val, (void *)ifa, len); + + return copy_from_user(val, (void
[PATCH v2 16/18] sh: remove CONFIG_SET_FS support
From: Arnd Bergmann sh uses set_fs/get_fs only in one file, to handle address errors in both user and kernel memory. It already has an abstraction to differentiate between I/O and memory, so adding a third class for kernel memory fits into the same scheme and lets us kill off CONFIG_SET_FS. Signed-off-by: Arnd Bergmann --- arch/sh/Kconfig | 1 - arch/sh/include/asm/processor.h | 1 - arch/sh/include/asm/segment.h | 33 --- arch/sh/include/asm/thread_info.h | 2 -- arch/sh/include/asm/uaccess.h | 4 arch/sh/kernel/io_trapped.c | 9 ++--- arch/sh/kernel/process_32.c | 2 -- arch/sh/kernel/traps_32.c | 30 +--- 8 files changed, 21 insertions(+), 61 deletions(-) delete mode 100644 arch/sh/include/asm/segment.h diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig index 2474a04ceac4..f676e92b7d5b 100644 --- a/arch/sh/Kconfig +++ b/arch/sh/Kconfig @@ -65,7 +65,6 @@ config SUPERH select PERF_EVENTS select PERF_USE_VMALLOC select RTC_LIB - select SET_FS select SPARSE_IRQ select TRACE_IRQFLAGS_SUPPORT help diff --git a/arch/sh/include/asm/processor.h b/arch/sh/include/asm/processor.h index 3820d698846e..85a6c1c3c16e 100644 --- a/arch/sh/include/asm/processor.h +++ b/arch/sh/include/asm/processor.h @@ -3,7 +3,6 @@ #define __ASM_SH_PROCESSOR_H #include -#include #include #ifndef __ASSEMBLY__ diff --git a/arch/sh/include/asm/segment.h b/arch/sh/include/asm/segment.h deleted file mode 100644 index 02e54a3335d6.. --- a/arch/sh/include/asm/segment.h +++ /dev/null @@ -1,33 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef __ASM_SH_SEGMENT_H -#define __ASM_SH_SEGMENT_H - -#ifndef __ASSEMBLY__ - -typedef struct { - unsigned long seg; -} mm_segment_t; - -#define MAKE_MM_SEG(s) ((mm_segment_t) { (s) }) - -/* - * The fs value determines whether argument validity checking should be - * performed or not. If get_fs() == USER_DS, checking is performed, with - * get_fs() == KERNEL_DS, checking is bypassed. - * - * For historical reasons, these macros are grossly misnamed. - */ -#define KERNEL_DS MAKE_MM_SEG(0xUL) -#ifdef CONFIG_MMU -#define USER_DSMAKE_MM_SEG(PAGE_OFFSET) -#else -#define USER_DSKERNEL_DS -#endif - -#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg) - -#define get_fs() (current_thread_info()->addr_limit) -#define set_fs(x) (current_thread_info()->addr_limit = (x)) - -#endif /* __ASSEMBLY__ */ -#endif /* __ASM_SH_SEGMENT_H */ diff --git a/arch/sh/include/asm/thread_info.h b/arch/sh/include/asm/thread_info.h index 598d0184ffea..b119b859a0a3 100644 --- a/arch/sh/include/asm/thread_info.h +++ b/arch/sh/include/asm/thread_info.h @@ -30,7 +30,6 @@ struct thread_info { __u32 status; /* thread synchronous flags */ __u32 cpu; int preempt_count; /* 0 => preemptable, <0 => BUG */ - mm_segment_taddr_limit; /* thread address space */ unsigned long previous_sp;/* sp of previous stack in case of nested IRQ stacks */ __u8supervisor_stack[0]; @@ -58,7 +57,6 @@ struct thread_info { .status = 0,\ .cpu= 0,\ .preempt_count = INIT_PREEMPT_COUNT, \ - .addr_limit = KERNEL_DS,\ } /* how to get the current stack pointer from C */ diff --git a/arch/sh/include/asm/uaccess.h b/arch/sh/include/asm/uaccess.h index ccd219d74851..a79609eb14be 100644 --- a/arch/sh/include/asm/uaccess.h +++ b/arch/sh/include/asm/uaccess.h @@ -2,11 +2,7 @@ #ifndef __ASM_SH_UACCESS_H #define __ASM_SH_UACCESS_H -#include #include - -#define user_addr_max()(current_thread_info()->addr_limit.seg) - #include /* diff --git a/arch/sh/kernel/io_trapped.c b/arch/sh/kernel/io_trapped.c index 004ad0130b10..e803b14ef12e 100644 --- a/arch/sh/kernel/io_trapped.c +++ b/arch/sh/kernel/io_trapped.c @@ -270,7 +270,6 @@ static struct mem_access trapped_io_access = { int handle_trapped_io(struct pt_regs *regs, unsigned long address) { - mm_segment_t oldfs; insn_size_t instruction; int tmp; @@ -281,16 +280,12 @@ int handle_trapped_io(struct pt_regs *regs, unsigned long address) WARN_ON(user_mode(regs)); - oldfs = get_fs(); - set_fs(KERNEL_DS); - if (copy_from_user(, (void *)(regs->pc), - sizeof(instruction))) { - set_fs(oldfs); + if (copy_from_kernel_nofault(, (void *)(regs->pc), +sizeof(instruction))) { return 0; } tmp = handle_unaligned_access(instruction, regs, _io_access,
[PATCH v2 15/18] sparc64: remove CONFIG_SET_FS support
From: Arnd Bergmann sparc64 uses address space identifiers to differentiate between kernel and user space, using ASI_P for kernel threads but ASI_AIUS for normal user space, with the option of changing between them. As nothing really changes the ASI any more, just hardcode ASI_AIUS everywhere. Kernel threads are not allowed to access __user pointers anyway. Signed-off-by: Arnd Bergmann --- arch/sparc/include/asm/processor_64.h | 4 arch/sparc/include/asm/switch_to_64.h | 4 +--- arch/sparc/include/asm/thread_info_64.h | 4 +--- arch/sparc/include/asm/uaccess_64.h | 20 +--- arch/sparc/kernel/process_64.c | 12 arch/sparc/kernel/traps_64.c| 2 -- arch/sparc/lib/NGmemcpy.S | 3 +-- arch/sparc/mm/init_64.c | 7 --- 8 files changed, 8 insertions(+), 48 deletions(-) diff --git a/arch/sparc/include/asm/processor_64.h b/arch/sparc/include/asm/processor_64.h index ae851e8fce4c..89850dff6b03 100644 --- a/arch/sparc/include/asm/processor_64.h +++ b/arch/sparc/include/asm/processor_64.h @@ -47,10 +47,6 @@ #ifndef __ASSEMBLY__ -typedef struct { - unsigned char seg; -} mm_segment_t; - /* The Sparc processor specific thread struct. */ /* XXX This should die, everything can go into thread_info now. */ struct thread_struct { diff --git a/arch/sparc/include/asm/switch_to_64.h b/arch/sparc/include/asm/switch_to_64.h index b1d4e2e3210f..14f3c49bfdbc 100644 --- a/arch/sparc/include/asm/switch_to_64.h +++ b/arch/sparc/include/asm/switch_to_64.h @@ -20,10 +20,8 @@ do { \ */ #define switch_to(prev, next, last)\ do { save_and_clear_fpu(); \ - /* If you are tempted to conditionalize the following */\ - /* so that ASI is only written if it changes, think again. */ \ __asm__ __volatile__("wr %%g0, %0, %%asi" \ - : : "r" (task_thread_info(next)->current_ds));\ + : : "r" (ASI_AIUS));\ trap_block[current_thread_info()->cpu].thread = \ task_thread_info(next); \ __asm__ __volatile__( \ diff --git a/arch/sparc/include/asm/thread_info_64.h b/arch/sparc/include/asm/thread_info_64.h index 8047a9caab2f..1a44372e2bc0 100644 --- a/arch/sparc/include/asm/thread_info_64.h +++ b/arch/sparc/include/asm/thread_info_64.h @@ -46,7 +46,7 @@ struct thread_info { struct pt_regs *kregs; int preempt_count; /* 0 => preemptable, <0 => BUG */ __u8new_child; - __u8current_ds; + __u8__pad; __u16 cpu; unsigned long *utraps; @@ -81,7 +81,6 @@ struct thread_info { #define TI_KREGS 0x0028 #define TI_PRE_COUNT 0x0030 #define TI_NEW_CHILD 0x0034 -#define TI_CURRENT_DS 0x0035 #define TI_CPU 0x0036 #define TI_UTRAPS 0x0038 #define TI_REG_WINDOW 0x0040 @@ -116,7 +115,6 @@ struct thread_info { #define INIT_THREAD_INFO(tsk) \ { \ .task = , \ - .current_ds = ASI_P, \ .preempt_count = INIT_PREEMPT_COUNT, \ .kregs = (struct pt_regs *)(init_stack+THREAD_SIZE)-1 \ } diff --git a/arch/sparc/include/asm/uaccess_64.h b/arch/sparc/include/asm/uaccess_64.h index 59b9a545df23..94266a5c5b04 100644 --- a/arch/sparc/include/asm/uaccess_64.h +++ b/arch/sparc/include/asm/uaccess_64.h @@ -12,33 +12,15 @@ #include #include +#include /* * Sparc64 is segmented, though more like the M68K than the I386. * We use the secondary ASI to address user memory, which references a * completely different VM map, thus there is zero chance of the user * doing something queer and tricking us into poking kernel memory. - * - * What is left here is basically what is needed for the other parts of - * the kernel that expect to be able to manipulate, erum, "segments". - * Or perhaps more properly, permissions. - * - * "For historical reasons, these macros are grossly misnamed." -Linus */ -#define KERNEL_DS ((mm_segment_t) { ASI_P }) -#define USER_DS ((mm_segment_t) { ASI_AIUS }) /* har har har */ - -#define get_fs() ((mm_segment_t){(current_thread_info()->current_ds)}) - -#include - -#define set_fs(val) \ -do { \ - current_thread_info()->current_ds = (val).seg; \ - __asm__ __volatile__ ("wr %%g0, %0,
[PATCH v2 14/18] lib/test_lockup: fix kernel pointer check for separate address spaces
From: Arnd Bergmann test_kernel_ptr() uses access_ok() to figure out if a given address points to user space instead of kernel space. However on architectures that set CONFIG_ALTERNATE_USER_ADDRESS_SPACE, a pointer can be valid for both, and the check always fails because access_ok() returns true. Make the check for user space pointers conditional on the type of address space layout. Signed-off-by: Arnd Bergmann --- lib/test_lockup.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/lib/test_lockup.c b/lib/test_lockup.c index 6a0f329a794a..c3fd87d6c2dd 100644 --- a/lib/test_lockup.c +++ b/lib/test_lockup.c @@ -417,9 +417,14 @@ static bool test_kernel_ptr(unsigned long addr, int size) return false; /* should be at least readable kernel address */ - if (access_ok((void __user *)ptr, 1) || - access_ok((void __user *)ptr + size - 1, 1) || - get_kernel_nofault(buf, ptr) || + if (!IS_ENABLED(CONFIG_ALTERNATE_USER_ADDRESS_SPACE) && + (access_ok((void __user *)ptr, 1) || +access_ok((void __user *)ptr + size - 1, 1))) { + pr_err("user space ptr invalid in kernel: %#lx\n", addr); + return true; + } + + if (get_kernel_nofault(buf, ptr) || get_kernel_nofault(buf, ptr + size - 1)) { pr_err("invalid kernel ptr: %#lx\n", addr); return true; -- 2.29.2
[PATCH v2 13/18] uaccess: generalize access_ok()
From: Arnd Bergmann There are many different ways that access_ok() is defined across architectures, but in the end, they all just compare against the user_addr_max() value or they accept anything. Provide one definition that works for most architectures, checking against TASK_SIZE_MAX for user processes or skipping the check inside of uaccess_kernel() sections. For architectures without CONFIG_SET_FS(), this should be the fastest check, as it comes down to a single comparison of a pointer against a compile-time constant, while the architecture specific versions tend to do something more complex for historic reasons or get something wrong. Type checking for __user annotations is handled inconsistently across architectures, but this is easily simplified as well by using an inline function that takes a 'const void __user *' argument. A handful of callers need an extra __user annotation for this. Some architectures had trick to use 33-bit or 65-bit arithmetic on the addresses to calculate the overflow, however this simpler version uses fewer registers, which means it can produce better object code in the end despite needing a second (statically predicted) branch. Reviewed-by: Christoph Hellwig Acked-by: Mark Rutland [arm64, asm-generic] Signed-off-by: Arnd Bergmann --- arch/Kconfig | 7 arch/alpha/include/asm/uaccess.h | 34 +++ arch/arc/include/asm/uaccess.h| 29 - arch/arm/include/asm/uaccess.h| 20 + arch/arm64/include/asm/uaccess.h | 11 ++--- arch/csky/include/asm/uaccess.h | 8 arch/hexagon/include/asm/uaccess.h| 25 arch/ia64/include/asm/uaccess.h | 5 +-- arch/m68k/Kconfig.cpu | 1 + arch/m68k/include/asm/uaccess.h | 19 + arch/microblaze/include/asm/uaccess.h | 8 +--- arch/mips/include/asm/uaccess.h | 29 + arch/nds32/include/asm/uaccess.h | 7 +--- arch/nios2/include/asm/uaccess.h | 11 + arch/openrisc/include/asm/uaccess.h | 19 + arch/parisc/Kconfig | 1 + arch/parisc/include/asm/uaccess.h | 12 ++ arch/powerpc/include/asm/uaccess.h| 11 + arch/riscv/include/asm/uaccess.h | 31 +- arch/s390/Kconfig | 1 + arch/s390/include/asm/uaccess.h | 14 +-- arch/sh/include/asm/uaccess.h | 22 +- arch/sparc/Kconfig| 1 + arch/sparc/include/asm/uaccess.h | 3 -- arch/sparc/include/asm/uaccess_32.h | 18 ++-- arch/sparc/include/asm/uaccess_64.h | 12 +- arch/um/include/asm/uaccess.h | 5 ++- arch/x86/include/asm/uaccess.h| 14 +-- arch/xtensa/include/asm/uaccess.h | 10 + include/asm-generic/access_ok.h | 59 +++ include/asm-generic/uaccess.h | 21 +- include/linux/uaccess.h | 7 32 files changed, 109 insertions(+), 366 deletions(-) create mode 100644 include/asm-generic/access_ok.h diff --git a/arch/Kconfig b/arch/Kconfig index 678a80713b21..fa5db36bda67 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -898,6 +898,13 @@ config HAVE_SOFTIRQ_ON_OWN_STACK Architecture provides a function to run __do_softirq() on a separate stack. +config ALTERNATE_USER_ADDRESS_SPACE + bool + help + Architectures set this when the CPU uses separate address + spaces for kernel and user space pointers. In this case, the + access_ok() check on a __user pointer is skipped. + config PGTABLE_LEVELS int default 2 diff --git a/arch/alpha/include/asm/uaccess.h b/arch/alpha/include/asm/uaccess.h index 1b6f25efa247..82c5743fc9cd 100644 --- a/arch/alpha/include/asm/uaccess.h +++ b/arch/alpha/include/asm/uaccess.h @@ -20,28 +20,7 @@ #define get_fs() (current_thread_info()->addr_limit) #define set_fs(x) (current_thread_info()->addr_limit = (x)) -#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg) - -/* - * Is a address valid? This does a straightforward calculation rather - * than tests. - * - * Address valid if: - * - "addr" doesn't have any high-bits set - * - AND "size" doesn't have any high-bits set - * - AND "addr+size-(size != 0)" doesn't have any high-bits set - * - OR we are in kernel mode. - */ -#define __access_ok(addr, size) ({ \ - unsigned long __ao_a = (addr), __ao_b = (size); \ - unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b;\ - (get_fs().seg & (__ao_a | __ao_b | __ao_end)) == 0; }) - -#define access_ok(addr, size) \ -({ \ - __chk_user_ptr(addr); \ - __access_ok(((unsigned long)(addr)), (size)); \ -}) +#include /* * These are the main single-value transfer routines. They automatically @@ -105,7 +84,7 @@ extern void
[PATCH v2 12/18] uaccess: fix type mismatch warnings from access_ok()
From: Arnd Bergmann On some architectures, access_ok() does not do any argument type checking, so replacing the definition with a generic one causes a few warnings for harmless issues that were never caught before. Fix the ones that I found either through my own test builds or that were reported by the 0-day bot. Reported-by: kernel test robot Signed-off-by: Arnd Bergmann --- arch/arc/kernel/process.c | 2 +- arch/arm/kernel/swp_emulate.c | 2 +- arch/arm/kernel/traps.c | 2 +- arch/csky/kernel/signal.c | 2 +- arch/mips/sibyte/common/sb_tbprof.c | 6 +++--- arch/nios2/kernel/signal.c | 20 +++- arch/powerpc/lib/sstep.c| 4 ++-- arch/riscv/kernel/perf_callchain.c | 2 +- arch/sparc/kernel/signal_32.c | 2 +- lib/test_lockup.c | 4 ++-- 10 files changed, 24 insertions(+), 22 deletions(-) diff --git a/arch/arc/kernel/process.c b/arch/arc/kernel/process.c index 8e90052f6f05..5f7f5aab361f 100644 --- a/arch/arc/kernel/process.c +++ b/arch/arc/kernel/process.c @@ -43,7 +43,7 @@ SYSCALL_DEFINE0(arc_gettls) return task_thread_info(current)->thr_ptr; } -SYSCALL_DEFINE3(arc_usr_cmpxchg, int *, uaddr, int, expected, int, new) +SYSCALL_DEFINE3(arc_usr_cmpxchg, int __user *, uaddr, int, expected, int, new) { struct pt_regs *regs = current_pt_regs(); u32 uval; diff --git a/arch/arm/kernel/swp_emulate.c b/arch/arm/kernel/swp_emulate.c index 6166ba38bf99..b74bfcf94fb1 100644 --- a/arch/arm/kernel/swp_emulate.c +++ b/arch/arm/kernel/swp_emulate.c @@ -195,7 +195,7 @@ static int swp_handler(struct pt_regs *regs, unsigned int instr) destreg, EXTRACT_REG_NUM(instr, RT2_OFFSET), data); /* Check access in reasonable access range for both SWP and SWPB */ - if (!access_ok((address & ~3), 4)) { + if (!access_ok((void __user *)(address & ~3), 4)) { pr_debug("SWP{B} emulation: access to %p not allowed!\n", (void *)address); res = -EFAULT; diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index da04ed85855a..26c8c8276297 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -576,7 +576,7 @@ do_cache_op(unsigned long start, unsigned long end, int flags) if (end < start || flags) return -EINVAL; - if (!access_ok(start, end - start)) + if (!access_ok((void __user *)start, end - start)) return -EFAULT; return __do_cache_op(start, end); diff --git a/arch/csky/kernel/signal.c b/arch/csky/kernel/signal.c index c7b763d2f526..8867ddf3e6c7 100644 --- a/arch/csky/kernel/signal.c +++ b/arch/csky/kernel/signal.c @@ -136,7 +136,7 @@ static inline void __user *get_sigframe(struct ksignal *ksig, static int setup_rt_frame(struct ksignal *ksig, sigset_t *set, struct pt_regs *regs) { - struct rt_sigframe *frame; + struct rt_sigframe __user *frame; int err = 0; frame = get_sigframe(ksig, regs, sizeof(*frame)); diff --git a/arch/mips/sibyte/common/sb_tbprof.c b/arch/mips/sibyte/common/sb_tbprof.c index f80d7a710333..01d00b87d0f5 100644 --- a/arch/mips/sibyte/common/sb_tbprof.c +++ b/arch/mips/sibyte/common/sb_tbprof.c @@ -437,13 +437,13 @@ static int sbprof_tb_release(struct inode *inode, struct file *filp) return 0; } -static ssize_t sbprof_tb_read(struct file *filp, char *buf, +static ssize_t sbprof_tb_read(struct file *filp, char __user *buf, size_t size, loff_t *offp) { int cur_sample, sample_off, cur_count, sample_left; char *src; int count = 0; - char *dest= buf; + char __user *dest = buf; long cur_off = *offp; if (!access_ok(buf, size)) @@ -512,7 +512,7 @@ static long sbprof_tb_ioctl(struct file *filp, if (err) break; - err = put_user(TB_FULL, (int *) arg); + err = put_user(TB_FULL, (int __user *) arg); break; } diff --git a/arch/nios2/kernel/signal.c b/arch/nios2/kernel/signal.c index 2009ae2d3c3b..386e46443b60 100644 --- a/arch/nios2/kernel/signal.c +++ b/arch/nios2/kernel/signal.c @@ -36,10 +36,10 @@ struct rt_sigframe { static inline int rt_restore_ucontext(struct pt_regs *regs, struct switch_stack *sw, - struct ucontext *uc, int *pr2) + struct ucontext __user *uc, int *pr2) { int temp; - unsigned long *gregs = uc->uc_mcontext.gregs; + unsigned long __user *gregs = uc->uc_mcontext.gregs; int err; /* Always make any pending restarted system calls return -EINTR */ @@ -102,10 +102,11 @@ asmlinkage int do_rt_sigreturn(struct switch_stack *sw) { struct pt_regs *regs = (struct pt_regs *)(sw + 1); /* Verify, can we
[PATCH v2 11/18] arm64: simplify access_ok()
From: Arnd Bergmann arm64 has an inline asm implementation of access_ok() that is derived from the 32-bit arm version and optimized for the case that both the limit and the size are variable. With set_fs() gone, the limit is always constant, and the size usually is as well, so just using the default implementation reduces the check into a comparison against a constant that can be scheduled by the compiler. On a defconfig build, this saves over 28KB of .text. Acked-by: Robin Murphy Acked-by: Mark Rutland Signed-off-by: Arnd Bergmann --- arch/arm64/include/asm/uaccess.h | 34 ++-- 1 file changed, 10 insertions(+), 24 deletions(-) diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h index 2e20879fe3cf..199c553b740a 100644 --- a/arch/arm64/include/asm/uaccess.h +++ b/arch/arm64/include/asm/uaccess.h @@ -26,6 +26,14 @@ #include #include +static inline int __access_ok(const void __user *ptr, unsigned long size) +{ + unsigned long limit = TASK_SIZE_MAX; + unsigned long addr = (unsigned long)ptr; + + return (size <= limit) && (addr <= (limit - size)); +} + /* * Test whether a block of memory is a valid user space address. * Returns 1 if the range is valid, 0 otherwise. @@ -33,10 +41,8 @@ * This is equivalent to the following test: * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX */ -static inline unsigned long __range_ok(const void __user *addr, unsigned long size) +static inline int access_ok(const void __user *addr, unsigned long size) { - unsigned long ret, limit = TASK_SIZE_MAX - 1; - /* * Asynchronous I/O running in a kernel thread does not have the * TIF_TAGGED_ADDR flag of the process owning the mm, so always untag @@ -46,29 +52,9 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR))) addr = untagged_addr(addr); - __chk_user_ptr(addr); - asm volatile( - // A + B <= C + 1 for all A,B,C, in four easy steps: - // 1: X = A + B; X' = X % 2^64 - " adds%0, %3, %2\n" - // 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4 - " csel%1, xzr, %1, hi\n" - // 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X' - //to compensate for the carry flag being set in step 4. For - //X > 2^64, X' merely has to remain nonzero, which it does. - " csinv %0, %0, xzr, cc\n" - // 4: For X < 2^64, this gives us X' - C - 1 <= 0, where the -1 - //comes from the carry in being clear. Otherwise, we are - //testing X' - C == 0, subject to the previous adjustments. - " sbcsxzr, %0, %1\n" - " cset%0, ls\n" - : "=" (ret), "+r" (limit) : "Ir" (size), "0" (addr) : "cc"); - - return ret; + return likely(__access_ok(addr, size)); } -#define access_ok(addr, size) __range_ok(addr, size) - /* * User access enabling/disabling. */ -- 2.29.2
[PATCH v2 10/18] m68k: fix access_ok for coldfire
From: Arnd Bergmann While most m68k platforms use separate address spaces for user and kernel space, at least coldfire does not, and the other ones have a TASK_SIZE that is less than the entire 4GB address range. Using the default implementation of __access_ok() stops coldfire user space from trivially accessing kernel memory. Signed-off-by: Arnd Bergmann --- arch/m68k/include/asm/uaccess.h | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/m68k/include/asm/uaccess.h b/arch/m68k/include/asm/uaccess.h index 79617c0b2f91..8eb625e75452 100644 --- a/arch/m68k/include/asm/uaccess.h +++ b/arch/m68k/include/asm/uaccess.h @@ -12,14 +12,21 @@ #include /* We let the MMU do all checking */ -static inline int access_ok(const void __user *addr, +static inline int access_ok(const void __user *ptr, unsigned long size) { + unsigned long limit = TASK_SIZE; + unsigned long addr = (unsigned long)ptr; + /* * XXX: for !CONFIG_CPU_HAS_ADDRESS_SPACES this really needs to check * for TASK_SIZE! +* Removing this helper is probably sufficient. */ - return 1; + if (IS_ENABLED(CONFIG_CPU_HAS_ADDRESS_SPACES)) + return 1; + + return (size <= limit) && (addr <= (limit - size)); } /* -- 2.29.2
[PATCH v2 09/18] mips: use simpler access_ok()
From: Arnd Bergmann Before unifying the mips version of __access_ok() with the generic code, this converts it to the same algorithm. This is a change in behavior on mips64, as now address in the user segment, the lower 2^62 bytes, is taken to be valid, relying on a page fault for addresses that are within that segment but not valid on that CPU. The new version should be the most effecient way to do this, but it gets rid of the special handling for size=0 that most other architectures ignore as well. Signed-off-by: Arnd Bergmann --- arch/mips/include/asm/uaccess.h | 22 -- 1 file changed, 4 insertions(+), 18 deletions(-) diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h index db9a8e002b62..d7c89dc3426c 100644 --- a/arch/mips/include/asm/uaccess.h +++ b/arch/mips/include/asm/uaccess.h @@ -19,6 +19,7 @@ #ifdef CONFIG_32BIT #define __UA_LIMIT 0x8000UL +#define TASK_SIZE_MAX __UA_LIMIT #define __UA_ADDR ".word" #define __UA_LA"la" @@ -33,6 +34,7 @@ extern u64 __ua_limit; #define __UA_LIMIT __ua_limit +#define TASK_SIZE_MAX XKSSEG #define __UA_ADDR ".dword" #define __UA_LA"dla" @@ -42,22 +44,6 @@ extern u64 __ua_limit; #endif /* CONFIG_64BIT */ -/* - * Is a address valid? This does a straightforward calculation rather - * than tests. - * - * Address valid if: - * - "addr" doesn't have any high-bits set - * - AND "size" doesn't have any high-bits set - * - AND "addr+size" doesn't have any high-bits set - * - OR we are in kernel mode. - * - * __ua_size() is a trick to avoid runtime checking of positive constant - * sizes; for those we already know at compile time that the size is ok. - */ -#define __ua_size(size) \ - ((__builtin_constant_p(size) && (signed long) (size) > 0) ? 0 : (size)) - /* * access_ok: - Checks if a user space pointer is valid * @addr: User space pointer to start of block to check @@ -79,9 +65,9 @@ extern u64 __ua_limit; static inline int __access_ok(const void __user *p, unsigned long size) { unsigned long addr = (unsigned long)p; - unsigned long end = addr + size - !!size; + unsigned long limit = TASK_SIZE_MAX; - return (__UA_LIMIT & (addr | end | __ua_size(size))) == 0; + return (size <= limit) && (addr <= (limit - size)); } #define access_ok(addr, size) \ -- 2.29.2
[PATCH v2 08/18] uaccess: add generic __{get,put}_kernel_nofault
From: Arnd Bergmann Nine architectures are still missing __{get,put}_kernel_nofault: alpha, ia64, microblaze, nds32, nios2, openrisc, sh, sparc32, xtensa. Add a generic version that lets everything use the normal copy_{from,to}_kernel_nofault() code based on these, removing the last use of get_fs()/set_fs() from architecture-independent code. Reviewed-by: Christoph Hellwig Signed-off-by: Arnd Bergmann --- arch/arm/include/asm/uaccess.h | 2 - arch/arm64/include/asm/uaccess.h| 2 - arch/m68k/include/asm/uaccess.h | 2 - arch/mips/include/asm/uaccess.h | 2 - arch/parisc/include/asm/uaccess.h | 1 - arch/powerpc/include/asm/uaccess.h | 2 - arch/riscv/include/asm/uaccess.h| 2 - arch/s390/include/asm/uaccess.h | 2 - arch/sparc/include/asm/uaccess_64.h | 2 - arch/um/include/asm/uaccess.h | 2 - arch/x86/include/asm/uaccess.h | 2 - include/asm-generic/uaccess.h | 2 - include/linux/uaccess.h | 19 + mm/maccess.c| 108 14 files changed, 19 insertions(+), 131 deletions(-) diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h index 32dbfd81f42a..d20d78c34b94 100644 --- a/arch/arm/include/asm/uaccess.h +++ b/arch/arm/include/asm/uaccess.h @@ -476,8 +476,6 @@ do { \ : "r" (x), "i" (-EFAULT)\ : "cc") -#define HAVE_GET_KERNEL_NOFAULT - #define __get_kernel_nofault(dst, src, type, err_label) \ do { \ const type *__pk_ptr = (src); \ diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h index 3a5ff5e20586..2e20879fe3cf 100644 --- a/arch/arm64/include/asm/uaccess.h +++ b/arch/arm64/include/asm/uaccess.h @@ -26,8 +26,6 @@ #include #include -#define HAVE_GET_KERNEL_NOFAULT - /* * Test whether a block of memory is a valid user space address. * Returns 1 if the range is valid, 0 otherwise. diff --git a/arch/m68k/include/asm/uaccess.h b/arch/m68k/include/asm/uaccess.h index ba670523885c..79617c0b2f91 100644 --- a/arch/m68k/include/asm/uaccess.h +++ b/arch/m68k/include/asm/uaccess.h @@ -390,8 +390,6 @@ raw_copy_to_user(void __user *to, const void *from, unsigned long n) #define INLINE_COPY_FROM_USER #define INLINE_COPY_TO_USER -#define HAVE_GET_KERNEL_NOFAULT - #define __get_kernel_nofault(dst, src, type, err_label) \ do { \ type *__gk_dst = (type *)(dst); \ diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h index f8f74f9f5883..db9a8e002b62 100644 --- a/arch/mips/include/asm/uaccess.h +++ b/arch/mips/include/asm/uaccess.h @@ -296,8 +296,6 @@ struct __large_struct { unsigned long buf[100]; }; (val) = __gu_tmp.t; \ } -#define HAVE_GET_KERNEL_NOFAULT - #define __get_kernel_nofault(dst, src, type, err_label) \ do { \ int __gu_err; \ diff --git a/arch/parisc/include/asm/uaccess.h b/arch/parisc/include/asm/uaccess.h index ebf8a845b017..0925bbd6db67 100644 --- a/arch/parisc/include/asm/uaccess.h +++ b/arch/parisc/include/asm/uaccess.h @@ -95,7 +95,6 @@ struct exception_table_entry { (val) = (__force __typeof__(*(ptr))) __gu_val; \ } -#define HAVE_GET_KERNEL_NOFAULT #define __get_kernel_nofault(dst, src, type, err_label)\ { \ type __z; \ diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h index 63316100080c..a0032c2e7550 100644 --- a/arch/powerpc/include/asm/uaccess.h +++ b/arch/powerpc/include/asm/uaccess.h @@ -467,8 +467,6 @@ do { \ unsafe_put_user(*(u8*)(_src + _i), (u8 __user *)(_dst + _i), e); \ } while (0) -#define HAVE_GET_KERNEL_NOFAULT - #define __get_kernel_nofault(dst, src, type, err_label) \ __get_user_size_goto(*((type *)(dst)), \ (__force type __user *)(src), sizeof(type), err_label) diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h index c701a5e57a2b..4407b9e48d2c 100644 --- a/arch/riscv/include/asm/uaccess.h +++ b/arch/riscv/include/asm/uaccess.h @@ -346,8 +346,6 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n) __clear_user(to, n) : n; } -#define HAVE_GET_KERNEL_NOFAULT - #define __get_kernel_nofault(dst,
[PATCH v2 07/18] nios2: drop access_ok() check from __put_user()
From: Arnd Bergmann Unlike other architectures, the nios2 version of __put_user() has an extra check for access_ok(), preventing it from being used to implement __put_kernel_nofault(). Split up put_user() along the same lines as __get_user()/get_user() Signed-off-by: Arnd Bergmann --- arch/nios2/include/asm/uaccess.h | 56 +++- 1 file changed, 33 insertions(+), 23 deletions(-) diff --git a/arch/nios2/include/asm/uaccess.h b/arch/nios2/include/asm/uaccess.h index ca9285a915ef..a5cbe07cf0da 100644 --- a/arch/nios2/include/asm/uaccess.h +++ b/arch/nios2/include/asm/uaccess.h @@ -167,34 +167,44 @@ do { \ : "r" (val), "r" (ptr), "i" (-EFAULT)); \ } -#define put_user(x, ptr) \ +#define __put_user_common(__pu_val, __pu_ptr) \ ({ \ long __pu_err = -EFAULT;\ - __typeof__(*(ptr)) __user *__pu_ptr = (ptr);\ - __typeof__(*(ptr)) __pu_val = (__typeof(*ptr))(x); \ - if (access_ok(__pu_ptr, sizeof(*__pu_ptr))) { \ - switch (sizeof(*__pu_ptr)) {\ - case 1: \ - __put_user_asm(__pu_val, "stb", __pu_ptr, __pu_err); \ - break; \ - case 2: \ - __put_user_asm(__pu_val, "sth", __pu_ptr, __pu_err); \ - break; \ - case 4: \ - __put_user_asm(__pu_val, "stw", __pu_ptr, __pu_err); \ - break; \ - default:\ - /* XXX: This looks wrong... */ \ - __pu_err = 0; \ - if (copy_to_user(__pu_ptr, &(__pu_val), \ - sizeof(*__pu_ptr))) \ - __pu_err = -EFAULT; \ - break; \ - } \ + switch (sizeof(*__pu_ptr)) {\ + case 1: \ + __put_user_asm(__pu_val, "stb", __pu_ptr, __pu_err);\ + break; \ + case 2: \ + __put_user_asm(__pu_val, "sth", __pu_ptr, __pu_err);\ + break; \ + case 4: \ + __put_user_asm(__pu_val, "stw", __pu_ptr, __pu_err);\ + break; \ + default:\ + /* XXX: This looks wrong... */ \ + __pu_err = 0; \ + if (__copy_to_user(__pu_ptr, &(__pu_val), \ + sizeof(*__pu_ptr))) \ + __pu_err = -EFAULT; \ + break; \ } \ __pu_err; \ }) -#define __put_user(x, ptr) put_user(x, ptr) +#define __put_user(x, ptr) \ +({ \ + __auto_type __pu_ptr = (ptr); \ + typeof(*__pu_ptr) __pu_val = (typeof(*__pu_ptr))(x);\ + __put_user_common(__pu_val, __pu_ptr); \ +}) + +#define put_user(x, ptr) \ +({ \ + __auto_type __pu_ptr = (ptr); \ + typeof(*__pu_ptr) __pu_val = (typeof(*__pu_ptr))(x);\ + access_ok(__pu_ptr, sizeof(*__pu_ptr)) ?\ + __put_user_common(__pu_val, __pu_ptr) : \ + -EFAULT;\ +})
[PATCH v2 06/18] x86: use more conventional access_ok() definition
From: Arnd Bergmann The way that access_ok() is defined on x86 is slightly different from most other architectures, and a bit more complex. The generic version tends to result in the best output on all architectures, as it results in single comparison against a constant limit for calls with a known size. There are a few callers of __range_not_ok(), all of which use TASK_SIZE as the limit rather than TASK_SIZE_MAX, but I could not see any reason for picking this. Changing these to call __access_ok() instead uses the default limit, but keeps the behavior otherwise. x86 is the only architecture with a WARN_ON_IN_IRQ() checking access_ok(), but it's probably best to leave that in place. Signed-off-by: Arnd Bergmann --- arch/x86/include/asm/uaccess.h | 25 +++-- 1 file changed, 3 insertions(+), 22 deletions(-) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 79c4869ccdd6..a59ba2578e64 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -16,33 +16,14 @@ * Test whether a block of memory is a valid user space address. * Returns 0 if the range is valid, nonzero otherwise. */ -static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size) +static inline bool __access_ok(void __user *ptr, unsigned long size) { unsigned long limit = TASK_SIZE_MAX; + unsigned long addr = ptr; - /* -* If we have used "sizeof()" for the size, -* we know it won't overflow the limit (but -* it might overflow the 'addr', so it's -* important to subtract the size from the -* limit, not add it to the address). -*/ - if (__builtin_constant_p(size)) - return unlikely(addr > limit - size); - - /* Arbitrary sizes? Be careful about overflow */ - addr += size; - if (unlikely(addr < size)) - return true; - return unlikely(addr > limit); + return (size <= limit) && (addr <= (limit - size)); } -#define __access_ok(addr, size) \ -({ \ - __chk_user_ptr(addr); \ - !__chk_range_not_ok((unsigned long __force)(addr), size); \ -}) - #ifdef CONFIG_DEBUG_ATOMIC_SLEEP static inline bool pagefault_disabled(void); # define WARN_ON_IN_IRQ() \ -- 2.29.2
[PATCH v2 05/18] x86: remove __range_not_ok()
From: Arnd Bergmann The __range_not_ok() helper is an x86 (and sparc64) specific interface that does roughly the same thing as __access_ok(), but with different calling conventions. Change this to use the normal interface in order for consistency as we clean up all access_ok() implementations. This changes the limit from TASK_SIZE to TASK_SIZE_MAX, which Al points out is the right thing do do here anyway. The callers have to use __access_ok() instead of the normal access_ok() though, because on x86 that contains a WARN_ON_IN_IRQ() check that cannot be used inside of NMI context while tracing. Suggested-by: Al Viro Suggested-by: Christoph Hellwig Link: https://lore.kernel.org/lkml/ygsukcxgr7r4n...@zeniv-ca.linux.org.uk/ Signed-off-by: Arnd Bergmann --- arch/x86/events/core.c | 2 +- arch/x86/include/asm/uaccess.h | 10 ++ arch/x86/kernel/dumpstack.c| 2 +- arch/x86/kernel/stacktrace.c | 2 +- arch/x86/lib/usercopy.c| 2 +- 5 files changed, 10 insertions(+), 8 deletions(-) diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index e686c5e0537b..eef816fc216d 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -2794,7 +2794,7 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re static inline int valid_user_frame(const void __user *fp, unsigned long size) { - return (__range_not_ok(fp, size, TASK_SIZE) == 0); + return __access_ok(fp, size); } static unsigned long get_segment_base(unsigned int segment) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index ac96f9b2d64b..79c4869ccdd6 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -16,8 +16,10 @@ * Test whether a block of memory is a valid user space address. * Returns 0 if the range is valid, nonzero otherwise. */ -static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, unsigned long limit) +static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size) { + unsigned long limit = TASK_SIZE_MAX; + /* * If we have used "sizeof()" for the size, * we know it won't overflow the limit (but @@ -35,10 +37,10 @@ static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, un return unlikely(addr > limit); } -#define __range_not_ok(addr, size, limit) \ +#define __access_ok(addr, size) \ ({ \ __chk_user_ptr(addr); \ - __chk_range_not_ok((unsigned long __force)(addr), size, limit); \ + !__chk_range_not_ok((unsigned long __force)(addr), size); \ }) #ifdef CONFIG_DEBUG_ATOMIC_SLEEP @@ -69,7 +71,7 @@ static inline bool pagefault_disabled(void); #define access_ok(addr, size) \ ({ \ WARN_ON_IN_IRQ(); \ - likely(!__range_not_ok(addr, size, TASK_SIZE_MAX)); \ + likely(__access_ok(addr, size));\ }) extern int __get_user_1(void); diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c index 53de044e5654..da534fb7b5c6 100644 --- a/arch/x86/kernel/dumpstack.c +++ b/arch/x86/kernel/dumpstack.c @@ -85,7 +85,7 @@ static int copy_code(struct pt_regs *regs, u8 *buf, unsigned long src, * Make sure userspace isn't trying to trick us into dumping kernel * memory by pointing the userspace instruction pointer at it. */ - if (__chk_range_not_ok(src, nbytes, TASK_SIZE_MAX)) + if (!__access_ok((void __user *)src, nbytes)) return -EINVAL; /* diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c index 15b058eefc4e..ee117fcf46ed 100644 --- a/arch/x86/kernel/stacktrace.c +++ b/arch/x86/kernel/stacktrace.c @@ -90,7 +90,7 @@ copy_stack_frame(const struct stack_frame_user __user *fp, { int ret; - if (__range_not_ok(fp, sizeof(*frame), TASK_SIZE)) + if (!__access_ok(fp, sizeof(*frame))) return 0; ret = 1; diff --git a/arch/x86/lib/usercopy.c b/arch/x86/lib/usercopy.c index c3e8a62ca561..ad0139d25401 100644 --- a/arch/x86/lib/usercopy.c +++ b/arch/x86/lib/usercopy.c @@ -32,7 +32,7 @@ copy_from_user_nmi(void *to, const void __user *from, unsigned long n) { unsigned long ret; - if (__range_not_ok(from, n, TASK_SIZE)) + if (!__access_ok(from, n)) return n; if (!nmi_uaccess_okay()) -- 2.29.2
[PATCH v2 04/18] sparc64: add __{get,put}_kernel_nocheck()
From: Arnd Bergmann sparc64 is one of the architectures that uses separate address spaces for kernel and user addresses, so __get_kernel_nofault() can not just call into the normal __get_user() without the access_ok() check. Instead duplicate __get_user() and __put_user() into their in-kernel versions, with minor changes for the calling conventions and leaving out the address space modifier on the assembler instruction. This could surely be written more elegantly, but duplicating it gets the job done. Signed-off-by: Arnd Bergmann --- arch/sparc/include/asm/uaccess_64.h | 78 + 1 file changed, 78 insertions(+) diff --git a/arch/sparc/include/asm/uaccess_64.h b/arch/sparc/include/asm/uaccess_64.h index 30eb4c6414d1..b283798315b1 100644 --- a/arch/sparc/include/asm/uaccess_64.h +++ b/arch/sparc/include/asm/uaccess_64.h @@ -100,6 +100,42 @@ void __retl_efault(void); struct __large_struct { unsigned long buf[100]; }; #define __m(x) ((struct __large_struct *)(x)) +#define __put_kernel_nofault(dst, src, type, label)\ +do { \ + type *addr = (type __force *)(dst); \ + type data = *(type *)src; \ + register int __pu_ret; \ + switch (sizeof(type)) { \ + case 1: __put_kernel_asm(data, b, addr, __pu_ret); break; \ + case 2: __put_kernel_asm(data, h, addr, __pu_ret); break; \ + case 4: __put_kernel_asm(data, w, addr, __pu_ret); break; \ + case 8: __put_kernel_asm(data, x, addr, __pu_ret); break; \ + default: __pu_ret = __put_user_bad(); break;\ + } \ + if (__pu_ret) \ + goto label; \ +} while (0) + +#define __put_kernel_asm(x, size, addr, ret) \ +__asm__ __volatile__( \ + "/* Put kernel asm, inline. */\n" \ + "1:\t" "st"#size " %1, [%2]\n\t" \ + "clr%0\n" \ + "2:\n\n\t" \ + ".section .fixup,#alloc,#execinstr\n\t" \ + ".align 4\n"\ + "3:\n\t"\ + "sethi %%hi(2b), %0\n\t" \ + "jmpl %0 + %%lo(2b), %%g0\n\t"\ + " mov %3, %0\n\n\t" \ + ".previous\n\t" \ + ".section __ex_table,\"a\"\n\t" \ + ".align 4\n\t" \ + ".word 1b, 3b\n\t" \ + ".previous\n\n\t" \ + : "=r" (ret) : "r" (x), "r" (__m(addr)), \ +"i" (-EFAULT)) + #define __put_user_nocheck(data, addr, size) ({\ register int __pu_ret; \ switch (size) { \ @@ -134,6 +170,48 @@ __asm__ __volatile__( \ int __put_user_bad(void); +#define __get_kernel_nofault(dst, src, type, label) \ +do {\ + type *addr = (type __force *)(src); \ + register int __gu_ret; \ + register unsigned long __gu_val; \ + switch (sizeof(type)) { \ + case 1: __get_kernel_asm(__gu_val, ub, addr, __gu_ret); break; \ + case 2: __get_kernel_asm(__gu_val, uh, addr, __gu_ret); break; \ + case 4: __get_kernel_asm(__gu_val, uw, addr, __gu_ret); break; \ + case 8: __get_kernel_asm(__gu_val, x, addr, __gu_ret); break; \ + default: \ + __gu_val = 0;\ + __gu_ret = __get_user_bad(); \ + break; \ + }\ + if (__gu_ret)
[PATCH v2 03/18] nds32: fix access_ok() checks in get/put_user
From: Arnd Bergmann The get_user()/put_user() functions are meant to check for access_ok(), while the __get_user()/__put_user() functions don't. This broke in 4.19 for nds32, when it gained an extraneous check in __get_user(), but lost the check it needs in __put_user(). Fixes: 487913ab18c2 ("nds32: Extract the checking and getting pointer to a macro") Cc: sta...@vger.kernel.org @ v4.19+ Signed-off-by: Arnd Bergmann --- arch/nds32/include/asm/uaccess.h | 22 +- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/arch/nds32/include/asm/uaccess.h b/arch/nds32/include/asm/uaccess.h index d4cbf069dc22..37a40981deb3 100644 --- a/arch/nds32/include/asm/uaccess.h +++ b/arch/nds32/include/asm/uaccess.h @@ -70,9 +70,7 @@ static inline void set_fs(mm_segment_t fs) * versions are void (ie, don't return a value as such). */ -#define get_user __get_user \ - -#define __get_user(x, ptr) \ +#define get_user(x, ptr) \ ({ \ long __gu_err = 0; \ __get_user_check((x), (ptr), __gu_err); \ @@ -85,6 +83,14 @@ static inline void set_fs(mm_segment_t fs) (void)0;\ }) +#define __get_user(x, ptr) \ +({ \ + long __gu_err = 0; \ + const __typeof__(*(ptr)) __user *__p = (ptr); \ + __get_user_err((x), __p, (__gu_err)); \ + __gu_err; \ +}) + #define __get_user_check(x, ptr, err) \ ({ \ const __typeof__(*(ptr)) __user *__p = (ptr); \ @@ -165,12 +171,18 @@ do { \ : "r"(addr), "i"(-EFAULT) \ : "cc") -#define put_user __put_user \ +#define put_user(x, ptr) \ +({ \ + long __pu_err = 0; \ + __put_user_check((x), (ptr), __pu_err); \ + __pu_err; \ +}) #define __put_user(x, ptr) \ ({ \ long __pu_err = 0; \ - __put_user_err((x), (ptr), __pu_err); \ + __typeof__(*(ptr)) __user *__p = (ptr); \ + __put_user_err((x), __p, __pu_err); \ __pu_err; \ }) -- 2.29.2
[PATCH v2 02/18] uaccess: fix nios2 and microblaze get_user_8()
From: Arnd Bergmann These two architectures implement 8-byte get_user() through a memcpy() into a four-byte variable, which won't fit. Use a temporary 64-bit variable instead here, and use a double cast the way that risc-v and openrisc do to avoid compile-time warnings. Fixes: 6a090e97972d ("arch/microblaze: support get_user() of size 8 bytes") Fixes: 5ccc6af5e88e ("nios2: Memory management") Signed-off-by: Arnd Bergmann --- arch/microblaze/include/asm/uaccess.h | 18 +- arch/nios2/include/asm/uaccess.h | 26 -- 2 files changed, 25 insertions(+), 19 deletions(-) diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h index 5b6e0e7788f4..3fe96979d2c6 100644 --- a/arch/microblaze/include/asm/uaccess.h +++ b/arch/microblaze/include/asm/uaccess.h @@ -130,27 +130,27 @@ extern long __user_bad(void); #define __get_user(x, ptr) \ ({ \ - unsigned long __gu_val = 0; \ long __gu_err; \ switch (sizeof(*(ptr))) { \ case 1: \ - __get_user_asm("lbu", (ptr), __gu_val, __gu_err); \ + __get_user_asm("lbu", (ptr), x, __gu_err); \ break; \ case 2: \ - __get_user_asm("lhu", (ptr), __gu_val, __gu_err); \ + __get_user_asm("lhu", (ptr), x, __gu_err); \ break; \ case 4: \ - __get_user_asm("lw", (ptr), __gu_val, __gu_err);\ + __get_user_asm("lw", (ptr), x, __gu_err); \ break; \ - case 8: \ - __gu_err = __copy_from_user(&__gu_val, ptr, 8); \ - if (__gu_err) \ - __gu_err = -EFAULT; \ + case 8: { \ + __u64 __x = 0; \ + __gu_err = raw_copy_from_user(&__x, ptr, 8) ? \ + -EFAULT : 0;\ + (x) = (typeof(x))(typeof((x) - (x)))__x;\ break; \ + } \ default:\ /* __gu_val = 0; __gu_err = -EINVAL;*/ __gu_err = __user_bad();\ } \ - x = (__force __typeof__(*(ptr))) __gu_val; \ __gu_err; \ }) diff --git a/arch/nios2/include/asm/uaccess.h b/arch/nios2/include/asm/uaccess.h index ba9340e96fd4..ca9285a915ef 100644 --- a/arch/nios2/include/asm/uaccess.h +++ b/arch/nios2/include/asm/uaccess.h @@ -88,6 +88,7 @@ extern __must_check long strnlen_user(const char __user *s, long n); /* Optimized macros */ #define __get_user_asm(val, insn, addr, err) \ { \ + unsigned long __gu_val; \ __asm__ __volatile__( \ " movi%0, %3\n" \ "1: " insn " %1, 0(%2)\n" \ @@ -96,14 +97,20 @@ extern __must_check long strnlen_user(const char __user *s, long n); " .section __ex_table,\"a\"\n"\ " .word 1b, 2b\n" \ " .previous" \ - : "=" (err), "=r" (val) \ + : "=" (err), "=r" (__gu_val) \ : "r" (addr), "i" (-EFAULT)); \ + val = (__force __typeof__(*(addr)))__gu_val;\ } -#define __get_user_unknown(val, size, ptr, err) do { \ +extern void __get_user_unknown(void); + +#define __get_user_8(val, ptr, err) do { \ + u64 __val = 0; \ err = 0;
[PATCH v2 01/18] uaccess: fix integer overflow on access_ok()
From: Arnd Bergmann Three architectures check the end of a user access against the address limit without taking a possible overflow into account. Passing a negative length or another overflow in here returns success when it should not. Use the most common correct implementation here, which optimizes for a constant 'size' argument, and turns the common case into a single comparison. Cc: sta...@vger.kernel.org Fixes: da551281947c ("csky: User access") Fixes: f663b60f5215 ("microblaze: Fix uaccess_ok macro") Fixes: 7567746e1c0d ("Hexagon: Add user access functions") Reported-by: David Laight Reviewed-by: Christoph Hellwig Signed-off-by: Arnd Bergmann --- arch/csky/include/asm/uaccess.h | 7 +++ arch/hexagon/include/asm/uaccess.h| 18 +- arch/microblaze/include/asm/uaccess.h | 19 --- 3 files changed, 16 insertions(+), 28 deletions(-) diff --git a/arch/csky/include/asm/uaccess.h b/arch/csky/include/asm/uaccess.h index c40f06ee8d3e..ac5a54f57d40 100644 --- a/arch/csky/include/asm/uaccess.h +++ b/arch/csky/include/asm/uaccess.h @@ -3,14 +3,13 @@ #ifndef __ASM_CSKY_UACCESS_H #define __ASM_CSKY_UACCESS_H -#define user_addr_max() \ - (uaccess_kernel() ? KERNEL_DS.seg : get_fs().seg) +#define user_addr_max() (current_thread_info()->addr_limit.seg) static inline int __access_ok(unsigned long addr, unsigned long size) { - unsigned long limit = current_thread_info()->addr_limit.seg; + unsigned long limit = user_addr_max(); - return ((addr < limit) && ((addr + size) < limit)); + return (size <= limit) && (addr <= (limit - size)); } #define __access_ok __access_ok diff --git a/arch/hexagon/include/asm/uaccess.h b/arch/hexagon/include/asm/uaccess.h index ef5bfef8d490..719ba3f3c45c 100644 --- a/arch/hexagon/include/asm/uaccess.h +++ b/arch/hexagon/include/asm/uaccess.h @@ -25,17 +25,17 @@ * Returns true (nonzero) if the memory block *may* be valid, false (zero) * if it is definitely invalid. * - * User address space in Hexagon, like x86, goes to 0xbfff, so the - * simple MSB-based tests used by MIPS won't work. Some further - * optimization is probably possible here, but for now, keep it - * reasonably simple and not *too* slow. After all, we've got the - * MMU for backup. */ +#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg) +#define user_addr_max() (uaccess_kernel() ? ~0UL : TASK_SIZE) -#define __access_ok(addr, size) \ - ((get_fs().seg == KERNEL_DS.seg) || \ - (((unsigned long)addr < get_fs().seg) && \ - (unsigned long)size < (get_fs().seg - (unsigned long)addr))) +static inline int __access_ok(unsigned long addr, unsigned long size) +{ + unsigned long limit = TASK_SIZE; + + return (size <= limit) && (addr <= (limit - size)); +} +#define __access_ok __access_ok /* * When a kernel-mode page fault is taken, the faulting instruction diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h index d2a8ef9f8978..5b6e0e7788f4 100644 --- a/arch/microblaze/include/asm/uaccess.h +++ b/arch/microblaze/include/asm/uaccess.h @@ -39,24 +39,13 @@ # define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg) -static inline int access_ok(const void __user *addr, unsigned long size) +static inline int __access_ok(unsigned long addr, unsigned long size) { - if (!size) - goto ok; + unsigned long limit = user_addr_max(); - if ((get_fs().seg < ((unsigned long)addr)) || - (get_fs().seg < ((unsigned long)addr + size - 1))) { - pr_devel("ACCESS fail at 0x%08x (size 0x%x), seg 0x%08x\n", - (__force u32)addr, (u32)size, - (u32)get_fs().seg); - return 0; - } -ok: - pr_devel("ACCESS OK at 0x%08x (size 0x%x), seg 0x%08x\n", - (__force u32)addr, (u32)size, - (u32)get_fs().seg); - return 1; + return (size <= limit) && (addr <= (limit - size)); } +#define access_ok(addr, size) __access_ok((unsigned long)addr, size) # define __FIXUP_SECTION ".section .fixup,\"ax\"\n" # define __EX_TABLE_SECTION".section __ex_table,\"a\"\n" -- 2.29.2
[PATCH v2 00/18] clean up asm/uaccess.h, kill set_fs for good
From: Arnd Bergmann Christoph Hellwig and a few others spent a huge effort on removing set_fs() from most of the important architectures, but about half the other architectures were never completed even though most of them don't actually use set_fs() at all. I did a patch for microblaze at some point, which turned out to be fairly generic, and now ported it to most other architectures, using new generic implementations of access_ok() and __{get,put}_kernel_nocheck(). Three architectures (sparc64, ia64, and sh) needed some extra work, which I also completed. The final series contains extra cleanup changes that touch all architectures. Please review and test these, so we can merge them for v5.18. The series is available at https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/log/?h=set_fs-2 for testing. Changes in v2: - add fixes for more nios2 and microblaze bugs found in the process - do final cleanup in a single patch - fix sparc64 regression - introduce CONFIG_ALTERNATE_USER_ADDRESS_SPACE - fix access_ok() in lib/test_lockup.c Arnd Bergmann (18): uaccess: fix integer overflow on access_ok() uaccess: fix nios2 and microblaze get_user_8() nds32: fix access_ok() checks in get/put_user sparc64: add __{get,put}_kernel_nocheck() x86: remove __range_not_ok() x86: use more conventional access_ok() definition nios2: drop access_ok() check from __put_user() uaccess: add generic __{get,put}_kernel_nofault mips: use simpler access_ok() m68k: fix access_ok for coldfire arm64: simplify access_ok() uaccess: fix type mismatch warnings from access_ok() uaccess: generalize access_ok() lib/test_lockup: fix kernel pointer check for separate address spaces sparc64: remove CONFIG_SET_FS support sh: remove CONFIG_SET_FS support ia64: remove CONFIG_SET_FS support uaccess: drop maining CONFIG_SET_FS users arch/Kconfig | 10 +- arch/alpha/Kconfig| 1 - arch/alpha/include/asm/processor.h| 4 - arch/alpha/include/asm/thread_info.h | 2 - arch/alpha/include/asm/uaccess.h | 53 +- arch/arc/Kconfig | 1 - arch/arc/include/asm/segment.h| 20 arch/arc/include/asm/thread_info.h| 3 - arch/arc/include/asm/uaccess.h| 30 -- arch/arc/kernel/process.c | 2 +- arch/arm/include/asm/uaccess.h| 22 +--- arch/arm/kernel/swp_emulate.c | 2 +- arch/arm/kernel/traps.c | 2 +- arch/arm/lib/uaccess_with_memcpy.c| 10 -- arch/arm64/include/asm/uaccess.h | 29 +- arch/csky/Kconfig | 1 - arch/csky/include/asm/processor.h | 2 - arch/csky/include/asm/segment.h | 10 -- arch/csky/include/asm/thread_info.h | 2 - arch/csky/include/asm/uaccess.h | 12 --- arch/csky/kernel/asm-offsets.c| 1 - arch/csky/kernel/signal.c | 2 +- arch/h8300/Kconfig| 1 - arch/h8300/include/asm/processor.h| 1 - arch/h8300/include/asm/segment.h | 40 arch/h8300/include/asm/thread_info.h | 3 - arch/h8300/kernel/entry.S | 1 - arch/h8300/kernel/head_ram.S | 1 - arch/h8300/mm/init.c | 6 -- arch/h8300/mm/memory.c| 1 - arch/hexagon/Kconfig | 1 - arch/hexagon/include/asm/thread_info.h| 6 -- arch/hexagon/include/asm/uaccess.h| 25 - arch/hexagon/kernel/process.c | 1 - arch/ia64/Kconfig | 1 - arch/ia64/include/asm/processor.h | 4 - arch/ia64/include/asm/thread_info.h | 2 - arch/ia64/include/asm/uaccess.h | 26 ++--- arch/ia64/kernel/unaligned.c | 60 +++ arch/m68k/Kconfig.cpu | 1 + arch/m68k/include/asm/uaccess.h | 14 +-- arch/microblaze/Kconfig | 1 - arch/microblaze/include/asm/thread_info.h | 6 -- arch/microblaze/include/asm/uaccess.h | 61 ++- arch/microblaze/kernel/asm-offsets.c | 1 - arch/microblaze/kernel/process.c | 1 - arch/mips/include/asm/uaccess.h | 49 + arch/mips/sibyte/common/sb_tbprof.c | 6 +- arch/nds32/Kconfig| 1 - arch/nds32/include/asm/thread_info.h | 4 - arch/nds32/include/asm/uaccess.h | 40 arch/nds32/kernel/process.c | 5 +- arch/nds32/mm/alignment.c | 3 - arch/nios2/Kconfig| 1 - arch/nios2/include/asm/thread_info.h | 9 -- arch/nios2/include/asm/uaccess.h | 105 +-- arch/nios2/kernel/signal.c| 20 ++-- arch/openrisc/Kconfig | 1 - arch/openrisc/include/asm/thread_info.h | 7
Re: [PATCH] powerpc: Fix STACKTRACE=n build
On Sat, 12 Feb 2022 22:13:49 +1100, Michael Ellerman wrote: > Our skiroot_defconfig doesn't enable FTRACE, and so doesn't get > STACKTRACE enabled either. That leads to a build failure since commit > 1614b2b11fab ("arch: Make ARCH_STACKWALK independent of STACKTRACE") > made stacktrace.c build even when STACKTRACE=n. > > arch/powerpc/kernel/stacktrace.c: In function ‘handle_backtrace_ipi’: > arch/powerpc/kernel/stacktrace.c:171:2: error: implicit declaration of > function ‘nmi_cpu_backtrace’ > 171 | nmi_cpu_backtrace(regs); > | ^ > arch/powerpc/kernel/stacktrace.c: In function > ‘arch_trigger_cpumask_backtrace’: > arch/powerpc/kernel/stacktrace.c:226:2: error: implicit declaration of > function ‘nmi_trigger_cpumask_backtrace’ > 226 | nmi_trigger_cpumask_backtrace(mask, exclude_self, > raise_backtrace_ipi); > | ^ > > [...] Applied to powerpc/next. [1/1] powerpc: Fix STACKTRACE=n build https://git.kernel.org/powerpc/c/5a72345e6a78120368fcc841b570331b6c5a50da cheers
Re: [PATCH] powerpc/64: Move paca allocation later in boot
On Tue, 25 Jan 2022 00:05:44 +1100, Michael Ellerman wrote: > Mahesh & Sourabh identified two problems[1][2] with ppc64_bolted_size() > and paca allocation. > > The first is that on a Radix capable machine but with "disable_radix" on > the command line, there is a window during early boot where > early_radix_enabled() is true, even though it will later become false. > > [...] Applied to powerpc/next. [1/1] powerpc/64: Move paca allocation later in boot https://git.kernel.org/powerpc/c/2e7f1e2b30b5b8aa5de6547407c68670fd227ad8 cheers
Re: [PATCH 11/14] sparc64: remove CONFIG_SET_FS support
On Tue, Feb 15, 2022 at 1:48 AM Al Viro wrote: > > On Mon, Feb 14, 2022 at 05:34:49PM +0100, Arnd Bergmann wrote: > > > -/* > > - * Sparc64 is segmented, though more like the M68K than the I386. > > - * We use the secondary ASI to address user memory, which references a > > - * completely different VM map, thus there is zero chance of the user > > - * doing something queer and tricking us into poking kernel memory. > > Actually, this part of comment probably ought to stay - it is relevant > for understanding what's going on (e.g. why is access_ok() always true, etc.) Ok, I've put it back now. Arnd
Re: [PATCH 11/14] sparc64: remove CONFIG_SET_FS support
On Mon, Feb 14, 2022 at 6:06 PM Christoph Hellwig wrote: > > > void prom_world(int enter) > > { > > - if (!enter) > > - set_fs(get_fs()); > > - > > __asm__ __volatile__("flushw"); > > } > > The enter argument is now unused. Right, good point. I'll add a comment, but I think I will leave that as this seems too hard to change the callers in assembly code for this. If any sparc64 developer wants to clean that up, I'm happy to integrate the cleanup patch in my series. Arnd
Re: [PATCH v2 09/13] powerpc/ftrace: Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS
On Tue, Feb 15, 2022 at 09:55:52PM +0530, Naveen N. Rao wrote: > > > > > > > I think this is wrong. We need to differentiate > > > > > > > between ftrace_caller() and ftrace_regs_caller() > > > > > > > here, and only return pt_regs if coming in through > > > > > > > ftrace_regs_caller() (i.e., FL_SAVE_REGS is set). > > > > > > > > > > > > Not sure I follow you. > > > > > > > > > > > > This is based on 5740a7c71ab6 ("s390/ftrace: add > > > > > > HAVE_DYNAMIC_FTRACE_WITH_ARGS support") > > > > > > > > > > > > It's all the point of HAVE_DYNAMIC_FTRACE_WITH_ARGS, > > > > > > have the regs also with ftrace_caller(). > > > > > > > > > > > > Sure you only have the params, but that's the same on > > > > > > s390, so what did I miss ? > > Steven has explained the rationale for this in his other response: > https://lore.kernel.org/all/20220215093849.556d5...@gandalf.local.home/ Thanks for this pointer, this clarifies a couple of things! > > > > It looks like s390 is special since it apparently saves all > > > > registers even for ftrace_caller: > > > > https://lore.kernel.org/all/YbipdU5X4HNDWIni@osiris/ > > > > > > It is not what I understand from their code, see > > > https://elixir.bootlin.com/linux/v5.17-rc3/source/arch/s390/kernel/mcount.S#L37 > > > > > > > > > They have a common macro called with argument 'allregs' which is set > > > to 0 for ftrace_caller() and 1 for ftrace_regs_caller(). > > > When allregs == 1, the macro seems to save more. > > > > > > But ok, I can do like x86, but I need a trick to know whether > > > FL_SAVE_REGS is set or not, like they do with fregs->regs.cs > > > Any idea what the condition can be for powerpc ? > > We'll need to explicitly zero-out something in pt_regs in ftrace_caller(). > We can probably use regs->msr since we don't expect it to be zero when saved > from ftrace_regs_caller(). > > > > Finally, it looks like this change is done via commit 894979689d3a > > ("s390/ftrace: provide separate ftrace_caller/ftrace_regs_caller > > implementations") four hours the same day after the implementation of > > arch_ftrace_get_regs() > > > > They may have forgotten to change arch_ftrace_get_regs() which was added > > in commit 5740a7c71ab6 ("s390/ftrace: add HAVE_DYNAMIC_FTRACE_WITH_ARGS > > support") with the assumption that ftrace_caller and ftrace_regs_caller > > where identical. > > Indeed, good find! Thank you for bringing this up! So, the in both variants s390 provides nearly identical data. The only difference is that for FL_SAVE_REGS the program status word mask is missing; therefore it is not possible to figure out the condition code or if interrupts were enabled/disabled. Vasily, Sven, I think we have two options here: - don't provide sane psw mask contents at all and say (again) that ptregs contents are identical - provide (finally) a full psw mask contents using epsw, and indicate validity with a flags bit in pt_regs I would vote for the second option, even though epsw is slow. But this is about the third or fourth time this came up in different contexts. So I'd guess we should go for the slow but complete solution. Opinions?
Re: [PATCH v5 0/5] KVM: PPC: MMIO fixes
On Tue, 25 Jan 2022 18:56:50 -0300, Fabiano Rosas wrote: > Changes from v4: > > -patch 4: switched to kvm_debug_ratelimited. > > -patch 5: kept the Program interrupt for BookE. > > v4: > https://lore.kernel.org/r/20220121222626.972495-1-faro...@linux.ibm.com > > [...] Applied to powerpc/topic/ppc-kvm. [1/5] KVM: PPC: Book3S HV: Stop returning internal values to userspace https://git.kernel.org/powerpc/c/36d014d37d59065087e51b8381e37993f1ca99bc [2/5] KVM: PPC: Fix vmx/vsx mixup in mmio emulation https://git.kernel.org/powerpc/c/b99234b918c6e36b9aa0a5b2981e86b6bd11f8e2 [3/5] KVM: PPC: mmio: Reject instructions that access more than mmio.data size https://git.kernel.org/powerpc/c/3f831504482ab0d0d53d1966987959d1485345cc [4/5] KVM: PPC: mmio: Return to guest after emulation failure https://git.kernel.org/powerpc/c/349fbfe9b918e6dea00734f07c0fbaf4c2e2df5e [5/5] KVM: PPC: Book3s: mmio: Deliver DSI after emulation failure https://git.kernel.org/powerpc/c/c1c8a66367a35aabbad9bd629b105b9fb49f2c1f cheers
Re: powerpc: Set crashkernel offset to mid of RMA region
On Fri, 4 Feb 2022 14:26:01 +0530, Sourabh Jain wrote: > On large config LPARs (having 192 and more cores), Linux fails to boot > due to insufficient memory in the first memblock. It is due to the > memory reservation for the crash kernel which starts at 128MB offset of > the first memblock. This memory reservation for the crash kernel doesn't > leave enough space in the first memblock to accommodate other essential > system resources. > > [...] Applied to powerpc/next. [1/1] powerpc: Set crashkernel offset to mid of RMA region https://git.kernel.org/powerpc/c/7c5ed82b800d8615cdda00729e7b62e5899f0b13 cheers
Re: [PATCHv2] selftests/powerpc/copyloops: Add memmove_64 test
On Mon, 13 Sep 2021 11:47:20 +0530, Ritesh Harjani wrote: > While debugging an issue, we wanted to check whether the arch specific > kernel memmove implementation is correct. > This selftest could help test that. > > Applied to powerpc/next. [1/1] selftests/powerpc/copyloops: Add memmove_64 test https://git.kernel.org/powerpc/c/2504e5b9827f7fc76ed0e4593adc852ac7a19851 cheers
Re: [PATCH] powerpc/pseries: make pseries_devicetree_update() static
On Mon, 7 Feb 2022 16:12:47 -0600, Nathan Lynch wrote: > pseries_devicetree_update() has only one call site, in the same file in > which it is defined. Make it static. > > Applied to powerpc/next. [1/1] powerpc/pseries: make pseries_devicetree_update() static https://git.kernel.org/powerpc/c/92e6dc257bd5771cf8db662e4d371fdb58fcbf43 cheers
Re: [RFC PATCH v1 00/11] powerpc/machdep: Remove dust and convert to static calls
On Fri, 3 Sep 2021 11:18:34 + (UTC), Christophe Leroy wrote: > The purpose of this series is to convert machine dependent > functions in structure ppc_md into static calls. > > First part of the series remove some dust in and around machdep.h > > Then some helpers are defined to abstract the access to ppc_md. structure. > > [...] Patches 3, 5, 7 and 8 applied to powerpc/next. [03/11] powerpc/machdep: Remove CONFIG_PPC_HAS_FEATURE_CALLS https://git.kernel.org/powerpc/c/d6a6c725a20467f52a41270bdaad9565c66f3b7a [05/11] powerpc/machdep: Move sys_ctrler_t definition into pmac.h https://git.kernel.org/powerpc/c/e6d03ac156db84422519aa8628efc210d24bf889 [07/11] powerpc/mpc86xx_hpcn: Remove obsolete statement https://git.kernel.org/powerpc/c/fae65a9ac8fd2221dbf034019fa18d72b2b0c8e9 [08/11] powerpc/corenet: Change criteria to set MPIC_ENABLE_COREINT https://git.kernel.org/powerpc/c/66ada2907864cafa4578b92926cb8bc0a4bc8c9c cheers
Re: [PATCH v4 1/5] powerpc/vdso: augment VDSO32 functions to support 64 bits build
On Fri, 21 Jan 2022 16:30:21 +, Christophe Leroy wrote: > VDSO64 cacheflush.S datapage.S gettimeofday.S and vgettimeofday.c > are very similar to their VDSO32 counterpart. > > VDSO32 counterpart is already more complete than the VDSO64 version > as it supports both PPC32 vdso and 32 bits VDSO for PPC64. > > Use compat macros wherever necessary in PPC32 files > so that they can also be used to build VDSO64. > > [...] Applied to powerpc/next. [1/5] powerpc/vdso: augment VDSO32 functions to support 64 bits build https://git.kernel.org/powerpc/c/f061fb03ee611c5657010ee4fa2a3fa64dfe3bd0 [2/5] powerpc/vdso: Rework VDSO32 makefile to add a prefix to object files https://git.kernel.org/powerpc/c/d88378d8d2c776154c6b606f2a423a81d7795f6f [3/5] powerpc/vdso: Merge vdso64 and vdso32 into a single directory https://git.kernel.org/powerpc/c/fd1feade75fb1a9275c39d76c5ccdbbbe6b37aa3 [4/5] powerpc/vdso: Remove cvdso_call_time macro https://git.kernel.org/powerpc/c/9b97bea90072a075363a200dd7b54ad4a24e9491 [5/5] powerpc/vdso: Move cvdso_call macro into gettimeofday.S https://git.kernel.org/powerpc/c/692b21d78046851e75dc25bba773189c670b49c2 cheers
Re: [PATCH v3 1/2] powerpc/set_memory: Avoid spinlock recursion in change_page_attr()
On Fri, 24 Dec 2021 11:07:33 +, Christophe Leroy wrote: > Commit 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines") > included a spin_lock() to change_page_attr() in order to > safely perform the three step operations. But then > commit 9f7853d7609d ("powerpc/mm: Fix set_memory_*() against > concurrent accesses") modify it to use pte_update() and do > the operation safely against concurrent access. > > [...] Applied to powerpc/next. [1/2] powerpc/set_memory: Avoid spinlock recursion in change_page_attr() https://git.kernel.org/powerpc/c/a4c182ecf33584b9b2d1aa9dad073014a504c01f [2/2] powerpc: Add set_memory_{p/np}() and remove set_memory_attr() https://git.kernel.org/powerpc/c/f222ab83df92acf72691a2021e1f0d99880dcdf1 cheers
Re: [PATCH v2 00/13] Implement livepatch on PPC32 and more
On Mon, 20 Dec 2021 16:37:58 +, Christophe Leroy wrote: > This series implements livepatch on PPC32 and implements > CONFIG_DYNAMIC_FTRACE_WITH_ARGS to simplify ftracing. > > v2: > - Fix problem with strict modules RWX > - Convert powerpc to CONFIG_DYNAMIC_FTRACE_WITH_ARGS > - Convert function graph tracing to C > - Refactor PPC32 versus PPC64 > > [...] Patches 1 and 3-13 applied to powerpc/next. [01/13] livepatch: Fix build failure on 32 bits processors https://git.kernel.org/powerpc/c/2f293651eca3eacaeb56747dede31edace7329d2 [03/13] powerpc/module_32: Fix livepatching for RO modules https://git.kernel.org/powerpc/c/0c850965d6909d39fd69d6a3602bb62b48cad417 [04/13] powerpc/ftrace: Add support for livepatch to PPC32 https://git.kernel.org/powerpc/c/a4520b25276500f1abcfc55d24f1251b7b08eff6 [05/13] powerpc/ftrace: Don't save again LR in ftrace_regs_caller() on PPC32 https://git.kernel.org/powerpc/c/7875bc9b07cde868784195e215f4deaa0fa928a2 [06/13] powerpc/ftrace: Simplify PPC32's return_to_handler() https://git.kernel.org/powerpc/c/7bdb478c1d15cfd3a92db6331cb2d3dd3a8b9436 [07/13] powerpc/ftrace: Prepare PPC32's ftrace_caller() for CONFIG_DYNAMIC_FTRACE_WITH_ARGS https://git.kernel.org/powerpc/c/d95bf254be5f74c1e4c8f7cb64e2e21b9cc91717 [08/13] powerpc/ftrace: Prepare PPC64's ftrace_caller() for CONFIG_DYNAMIC_FTRACE_WITH_ARGS https://git.kernel.org/powerpc/c/c75388a8ceffbf1bf72c61afe66a72e58aa20c74 [09/13] powerpc/ftrace: Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS https://git.kernel.org/powerpc/c/40b035efe288f42bbf4483236cde652584ccb64e [10/13] powerpc/ftrace: Refactor ftrace_{en/dis}able_ftrace_graph_caller https://git.kernel.org/powerpc/c/0c81ed5ed43863d313cf253b0ebada6ea2f17676 [11/13] powerpc/ftrace: directly call of function graph tracer by ftrace caller https://git.kernel.org/powerpc/c/830213786c498b0c488fedd2abc15a7ce442b42f [12/13] powerpc/ftrace: Prepare ftrace_64_mprofile.S for reuse by PPC32 https://git.kernel.org/powerpc/c/41315494beed087011f256b4f1439bb3d8236904 [13/13] powerpc/ftrace: Remove ftrace_32.S https://git.kernel.org/powerpc/c/4ee83a2cfbc46c13f2a08fe6d48dbcede53cdbf8 cheers
Re: [PATCH] powerpc: Use the newly added is_tsk_32bit_task() macro
On Fri, 21 Jan 2022 07:58:47 +, Christophe Leroy wrote: > Two places deserve using the macro is_tsk_32bit_task() added by > commit 252745240ba0 ("powerpc/audit: Fix syscall_get_arch()") > > Applied to powerpc/next. [1/1] powerpc: Use the newly added is_tsk_32bit_task() macro https://git.kernel.org/powerpc/c/9d44d1bd93b9a881f407b3202dc13fbd85fb5f1a cheers
Re: [PATCH] powerpc/bpf: Always reallocate BPF_REG_5, BPF_REG_AX and TMP_REG when possible
On Mon, 10 Jan 2022 12:29:42 +, Christophe Leroy wrote: > BPF_REG_5, BPF_REG_AX and TMP_REG are mapped on non volatile registers > because there are not enough volatile registers, but they don't need > to be preserved on function calls. > > So when some volatile registers become available, those registers can > always be reallocated regardless of whether SEEN_FUNC is set or not. > > [...] Applied to powerpc/next. [1/1] powerpc/bpf: Always reallocate BPF_REG_5, BPF_REG_AX and TMP_REG when possible https://git.kernel.org/powerpc/c/a8936569a07bf27cc9cfc2a39a1e5ea91273b2d4 cheers
Re: [PATCH] powerpc/32s: Enable STRICT_MODULE_RWX for the 603 core
On Mon, 17 Jan 2022 10:06:39 +, Christophe Leroy wrote: > The book3s/32 MMU doesn't support per page execution protection and > doesn't support RO protection for kernel pages. > > However, on the 603 which implements software loaded TLBs, execution > protection is honored by the TLB Miss handler which doesn't load > Instruction TLB for non executable pages. And RO protection is > honored by clearing the C bit for RO pages, leading to DSI. > > [...] Applied to powerpc/next. [1/1] powerpc/32s: Enable STRICT_MODULE_RWX for the 603 core https://git.kernel.org/powerpc/c/0670010f3b10aeaad0dfdf0dad0bcd020fc70eb5 cheers
Re: [PATCH 1/3] powerpc/lib/sstep: Use l1_dcache_bytes() instead of opencoding
On Fri, 21 Jan 2022 08:06:27 +, Christophe Leroy wrote: > Don't opencode dcache size retrieval based on whether that's ppc32 or ppc64. > > Use l1_dcache_bytes() > > Applied to powerpc/next. [1/3] powerpc/lib/sstep: Use l1_dcache_bytes() instead of opencoding https://git.kernel.org/powerpc/c/67484e0de9c93b4a9187bb49f45dfdaa8dc03c0b [2/3] powerpc/lib/sstep: Remove unneeded #ifdef __powerpc64__ https://git.kernel.org/powerpc/c/7c3bba91999075f4cfcab0542e4eb74d2d63554b [3/3] powerpc/lib/sstep: use truncate_if_32bit() https://git.kernel.org/powerpc/c/6836f099039e6c72fb548bf527345aa4345c3308 cheers
Re: [PATCH 1/2] powerpc/32: Remove remaining .stabs annotations
On Tue, 30 Nov 2021 13:04:49 +0100, Christophe Leroy wrote: > STABS debug format has been superseded long time ago by DWARF. > > Remove the few remaining .stabs annotations from old 32 bits code. > > Applied to powerpc/next. [1/2] powerpc/32: Remove remaining .stabs annotations https://git.kernel.org/powerpc/c/12318163737cd8808d13faa6e2393774191a6182 [2/2] powerpc/32: Remove _ENTRY() macro https://git.kernel.org/powerpc/c/27e21e8f128a56d3462f0fe2fd3a59c02cc002b1 cheers
Re: [PATCH v2] powerpc/mm: Update default hugetlb size early
On Fri, 11 Feb 2022 12:22:15 +0530, Aneesh Kumar K.V wrote: > commit: d9c234005227 ("Do not depend on MAX_ORDER when grouping pages by > mobility") > introduced pageblock_order which will be used to group pages better. > The kernel now groups pages based on the value of HPAGE_SHIFT. Hence > HPAGE_SHIFT > should be set before we call set_pageblock_order. > > set_pageblock_order happens early in the boot and default hugetlb page size > should be initialized before that to compute the right pageblock_order value. > > [...] Applied to powerpc/next. [1/1] powerpc/mm: Update default hugetlb size early https://git.kernel.org/powerpc/c/2354ad252b66695be02f4acd18e37bf6264f0464 cheers
Re: [PATCH v4 00/13] Fix LKDTM for PPC64/IA64/PARISC v4
Hi! On 2/15/22 13:40, Christophe Leroy wrote: > PPC64/IA64/PARISC have function descriptors. LKDTM doesn't work > on those three architectures because LKDTM messes up function > descriptors with functions. > > This series does some cleanup in the three architectures and > refactors function descriptors so that it can then easily use it > in a generic way in LKDTM. I'll test the series on ia64 later this week. I have an Itanium box at home for testing kernel patches. Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaub...@debian.org `. `' Freie Universitaet Berlin - glaub...@physik.fu-berlin.de `-GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Re: [PATCH v4 00/13] Fix LKDTM for PPC64/IA64/PARISC v4
Kees Cook writes: > On Tue, Feb 15, 2022 at 01:40:55PM +0100, Christophe Leroy wrote: >> PPC64/IA64/PARISC have function descriptors. LKDTM doesn't work >> on those three architectures because LKDTM messes up function >> descriptors with functions. >> >> This series does some cleanup in the three architectures and >> refactors function descriptors so that it can then easily use it >> in a generic way in LKDTM. > > Thanks for doing this! It looks good to me. :) How should we merge this series, it's a bit all over the map. I could put it in a topic branch? cheers
Re: [next-20220215] WARNING at fs/iomap/buffered-io.c:75 with xfstests
Hi Sachin, On Wed, 16 Feb 2022 15:17:14 +0530 Sachin Sant wrote: > > >> While running xfstests on IBM Power10 logical partition (LPAR) booted > >> with 5.17.0-rc4-next-20220215 following warning was seen: > >> > >> The warning is seen when test tries to unmount the file system. This > >> problem is seen > >> while running generic/475 sub test. Have attached captured messages during > >> the test > >> run of generic/475. > >> > >> xfstest is a recent add to upstream regression bucket. I don’t have any > >> previous data > >> to attempt a git bisect. > > > > If you have time, could you test v5.17-rc4-2-gd567f5db412e (the commit > > in Linus' tree that next-20220215 is based on) and if that OK, then a > > bisect from that to 5.17.0-rc4-next-20220215 may be helpful. > > Unfortunately I cannot recreate the problem consistently. I tried same test > run with both > mainline as well as linux-next20220215. In both attempts I wasn’t able to > recreate it. No worries, thanks anyway. -- Cheers, Stephen Rothwell pgpuSuQ1aqCCF.pgp Description: OpenPGP digital signature
Re: [PATCH v1 4/4] powerpc/ftrace: Style cleanup in ftrace_mprofile.S
Christophe Leroy wrote: Add some line breaks to better match the file's style, add some space after comma and fix a couple of misplaced blanks. Suggested-by: Naveen N. Rao Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/trace/ftrace_mprofile.S | 12 1 file changed, 8 insertions(+), 4 deletions(-) Thanks. For the series: Reviewed-by: Naveen N. Rao - Naveen
Re: [PATCH v2 1/2] crypto: vmx: Turn CRYPTO_DEV_VMX_ENCRYPT into tristate
Hi, side notes about the subject (following private notes from Nicolai): I dared to use "crypto: vmx: " in subject, next version I'll use "crypto: vmx - " as it's used in crypto. Also continue the subject line into the rest of the commit message isn't probably wanted. Kind regards, Petr
Re: [PATCH v2 1/2] crypto: vmx: Turn CRYPTO_DEV_VMX_ENCRYPT into tristate
Hi Nicolai, thanks for all your comments. > Hi Petr, > Petr Vorel writes: > > and remove CRYPTO_DEV_VMX, which looked redundant when only > > CRYPTO_DEV_VMX_ENCRYPT used it. Also it forces CRYPTO_GHASH to be > > builtin even CRYPTO_DEV_VMX_ENCRYPT was configured as module. > I'm confused by the description. CRYPTO_DEV_VMX_ENCRYPT has been a > tristate since ever? And thus, with CRYPTO_DEV_VMX_ENCRYPT=m, > CRYPTO_GHASH=m would be possible as far as vmx is concerned? I'm sorry, the description is wrong. I'm not kconfig expert and I verify it again, but I run into it when testing this with defconfig: $ make defconfig && scripts/config --enable CONFIG_KVM_GUEST --disable CRYPTO_MANAGER_DISABLE_TESTS --enable CONFIG_MODULE_SIG It somehow inherits CRYPTO_DEV_VMX=y. But I'll verify it again. > What this patch really does is to merge CRYPTO_DEV_VMX into > CRYPTO_DEV_VMX_ENCRYPT AFAICS. +1 This is a proper description. > These two seem indeed redundant to me, but for consistency with the > other crypto drivers (e.g. bcm, ccree, ...), I'd rather keep > CRYPTO_DEV_VMX and merge CRYPTO_DEV_VMX_ENCRYPT into it. I'm not sure myself, because some other modules also have Kconfig. $ ls -1 drivers/crypto/*/Kconfig drivers/crypto/allwinner/Kconfig drivers/crypto/amlogic/Kconfig drivers/crypto/caam/Kconfig drivers/crypto/ccp/Kconfig drivers/crypto/hisilicon/Kconfig drivers/crypto/chelsio/Kconfig drivers/crypto/keembay/Kconfig drivers/crypto/marvell/Kconfig drivers/crypto/nx/Kconfig drivers/crypto/qat/Kconfig drivers/crypto/stm32/Kconfig drivers/crypto/ux500/Kconfig drivers/crypto/virtio/Kconfig drivers/crypto/vmx/Kconfig Sure, some of them have many config options in Kconfig, but at least drivers/crypto/chelsio/Kconfig and drivers/crypto/virtio/Kconfig configure just the module. Given it's just these two, I should probably merge CRYPTO_DEV_VMX_ENCRYPT into CRYPTO_DEV_VMX as you suggest (also defconfig changes would not be needed). @Herbert, Nayna, Paulo, Breno: any preference of these. > > Update powerpc defconfigs and description in MAINTAINERS. > The change to MAINTAINERS is completely unrelated? If anything, it had > to come with a separate patch then. You're right, I was hesitating myself. > > Signed-off-by: Petr Vorel > > --- > > new in v2 > > This might be a bit aggressive, but IMHO CRYPTO_DEV_VMX only complicated > > things for nothing. > I agree on the redundancy, but as said, CRYPTO_DEV_VMX_ENCRYPT should > probably the one to get dropped in favor of CRYPTO_DEV_VMX. > > But if you do *not* agree with removing it, I just add > > select to drivers/crypto/vmx/Kconfig (which forces dependencies to be > > always modules.) > > If it's ok for you to remove, please also check whether the description > > is ok. get_maintainer.pl script has size limitation: > > $ ./scripts/get_maintainer.pl drivers/crypto/vmx/Kconfig > > ... > > linux-cry...@vger.kernel.org (open list:IBM Power VMX Cryptographic > > Acceleration Instru...) > > maybe the name should be shorter. > > Kind regards, > > Petr > > MAINTAINERS| 2 +- > > arch/powerpc/configs/powernv_defconfig | 2 +- > > arch/powerpc/configs/ppc64_defconfig | 2 +- > > arch/powerpc/configs/pseries_defconfig | 2 +- > > drivers/crypto/Kconfig | 6 -- > > drivers/crypto/vmx/Kconfig | 4 ++-- > > 6 files changed, 6 insertions(+), 12 deletions(-) > If you were to drop CONFIG_CRYPTO_DEV_VMX (like it's implemented in this > patch), then something had to be done about > obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/ > in drivers/crypto/Makefile as well. +1 (I obviously forget to amend). Kind regards, Petr ... > > +++ b/drivers/crypto/Kconfig > > @@ -761,12 +761,6 @@ config CRYPTO_DEV_QCOM_RNG > > To compile this driver as a module, choose M here. The > > module will be called qcom-rng. If unsure, say N. > > -config CRYPTO_DEV_VMX > > - bool "Support for VMX cryptographic acceleration instructions" > > - depends on PPC64 && VSX > > - help > > - Support for VMX cryptographic acceleration instructions. > > - > As said, I'd keep this one (while moving the GHASH dependency here) ... > > source "drivers/crypto/vmx/Kconfig" > ... and drop this one. > Thanks, > Nicolai
Re: [next-20220215] WARNING at fs/iomap/buffered-io.c:75 with xfstests
>> While running xfstests on IBM Power10 logical partition (LPAR) booted >> with 5.17.0-rc4-next-20220215 following warning was seen: >> >> The warning is seen when test tries to unmount the file system. This problem >> is seen >> while running generic/475 sub test. Have attached captured messages during >> the test >> run of generic/475. >> >> xfstest is a recent add to upstream regression bucket. I don’t have any >> previous data >> to attempt a git bisect. > > If you have time, could you test v5.17-rc4-2-gd567f5db412e (the commit > in Linus' tree that next-20220215 is based on) and if that OK, then a > bisect from that to 5.17.0-rc4-next-20220215 may be helpful. Unfortunately I cannot recreate the problem consistently. I tried same test run with both mainline as well as linux-next20220215. In both attempts I wasn’t able to recreate it. Thanks -Sachin
Re: [PATCH v2 1/2] crypto: vmx: Turn CRYPTO_DEV_VMX_ENCRYPT into tristate
Hi Petr, Petr Vorel writes: > and remove CRYPTO_DEV_VMX, which looked redundant when only > CRYPTO_DEV_VMX_ENCRYPT used it. Also it forces CRYPTO_GHASH to be > builtin even CRYPTO_DEV_VMX_ENCRYPT was configured as module. I'm confused by the description. CRYPTO_DEV_VMX_ENCRYPT has been a tristate since ever? And thus, with CRYPTO_DEV_VMX_ENCRYPT=m, CRYPTO_GHASH=m would be possible as far as vmx is concerned? What this patch really does is to merge CRYPTO_DEV_VMX into CRYPTO_DEV_VMX_ENCRYPT AFAICS. These two seem indeed redundant to me, but for consistency with the other crypto drivers (e.g. bcm, ccree, ...), I'd rather keep CRYPTO_DEV_VMX and merge CRYPTO_DEV_VMX_ENCRYPT into it. > Update powerpc defconfigs and description in MAINTAINERS. The change to MAINTAINERS is completely unrelated? If anything, it had to come with a separate patch then. > > Signed-off-by: Petr Vorel > --- > new in v2 > > This might be a bit aggressive, but IMHO CRYPTO_DEV_VMX only complicated > things for nothing. I agree on the redundancy, but as said, CRYPTO_DEV_VMX_ENCRYPT should probably the one to get dropped in favor of CRYPTO_DEV_VMX. > But if you do *not* agree with removing it, I just add > select to drivers/crypto/vmx/Kconfig (which forces dependencies to be > always modules.) > > If it's ok for you to remove, please also check whether the description > is ok. get_maintainer.pl script has size limitation: > > $ ./scripts/get_maintainer.pl drivers/crypto/vmx/Kconfig > ... > linux-cry...@vger.kernel.org (open list:IBM Power VMX Cryptographic > Acceleration Instru...) > > maybe the name should be shorter. > > Kind regards, > Petr > > MAINTAINERS| 2 +- > arch/powerpc/configs/powernv_defconfig | 2 +- > arch/powerpc/configs/ppc64_defconfig | 2 +- > arch/powerpc/configs/pseries_defconfig | 2 +- > drivers/crypto/Kconfig | 6 -- > drivers/crypto/vmx/Kconfig | 4 ++-- > 6 files changed, 6 insertions(+), 12 deletions(-) If you were to drop CONFIG_CRYPTO_DEV_VMX (like it's implemented in this patch), then something had to be done about obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/ in drivers/crypto/Makefile as well. > > diff --git a/MAINTAINERS b/MAINTAINERS > index ea3e6c914384..80e562579180 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -9207,7 +9207,7 @@ L: target-de...@vger.kernel.org > S: Supported > F: drivers/scsi/ibmvscsi_tgt/ > > -IBM Power VMX Cryptographic instructions > +IBM Power VMX Cryptographic Acceleration Instructions Driver > M: Breno Leitão > M: Nayna Jain > M: Paulo Flabiano Smorigo > diff --git a/arch/powerpc/configs/powernv_defconfig > b/arch/powerpc/configs/powernv_defconfig > index 49f49c263935..4b250d05dcdf 100644 > --- a/arch/powerpc/configs/powernv_defconfig > +++ b/arch/powerpc/configs/powernv_defconfig > @@ -337,7 +337,7 @@ CONFIG_CRYPTO_TEA=m > CONFIG_CRYPTO_TWOFISH=m > CONFIG_CRYPTO_LZO=m > CONFIG_CRYPTO_DEV_NX=y > -CONFIG_CRYPTO_DEV_VMX=y > +CONFIG_CRYPTO_DEV_VMX_ENCRYPT=m > CONFIG_VIRTUALIZATION=y > CONFIG_KVM_BOOK3S_64=m > CONFIG_KVM_BOOK3S_64_HV=m > diff --git a/arch/powerpc/configs/ppc64_defconfig > b/arch/powerpc/configs/ppc64_defconfig > index c8b0e80d613b..ebd33b94debb 100644 > --- a/arch/powerpc/configs/ppc64_defconfig > +++ b/arch/powerpc/configs/ppc64_defconfig > @@ -355,7 +355,7 @@ CONFIG_CRYPTO_TWOFISH=m > CONFIG_CRYPTO_LZO=m > CONFIG_CRYPTO_DEV_NX=y > CONFIG_CRYPTO_DEV_NX_ENCRYPT=m > -CONFIG_CRYPTO_DEV_VMX=y > +CONFIG_CRYPTO_DEV_VMX_ENCRYPT=m > CONFIG_PRINTK_TIME=y > CONFIG_PRINTK_CALLER=y > CONFIG_MAGIC_SYSRQ=y > diff --git a/arch/powerpc/configs/pseries_defconfig > b/arch/powerpc/configs/pseries_defconfig > index b571d084c148..304673817ef1 100644 > --- a/arch/powerpc/configs/pseries_defconfig > +++ b/arch/powerpc/configs/pseries_defconfig > @@ -315,7 +315,7 @@ CONFIG_CRYPTO_TWOFISH=m > CONFIG_CRYPTO_LZO=m > CONFIG_CRYPTO_DEV_NX=y > CONFIG_CRYPTO_DEV_NX_ENCRYPT=m > -CONFIG_CRYPTO_DEV_VMX=y > +CONFIG_CRYPTO_DEV_VMX_ENCRYPT=m > CONFIG_VIRTUALIZATION=y > CONFIG_KVM_BOOK3S_64=m > CONFIG_KVM_BOOK3S_64_HV=m > diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig > index 4f705674f94f..956f956607a5 100644 > --- a/drivers/crypto/Kconfig > +++ b/drivers/crypto/Kconfig > @@ -761,12 +761,6 @@ config CRYPTO_DEV_QCOM_RNG > To compile this driver as a module, choose M here. The > module will be called qcom-rng. If unsure, say N. > > -config CRYPTO_DEV_VMX > - bool "Support for VMX cryptographic acceleration instructions" > - depends on PPC64 && VSX > - help > - Support for VMX cryptographic acceleration instructions. > - As said, I'd keep this one (while moving the GHASH dependency here) ... > source "drivers/crypto/vmx/Kconfig" ... and drop this one. Thanks, Nicolai > > config CRYPTO_DEV_IMGTEC_HASH > diff --git a/drivers/crypto/vmx/Kconfig b/drivers/crypto/vmx/Kconfig > index c85fab7ef0bd..1a3808b719f3 100644 > ---