(This RFC is mainly to get feedback on the user interface. Tests and documentation will be added in the non-RFC follow-ups. This builds but is otherwise untested.)

In the "Add static DEXCR support" series[1] the kernel was made to
initialise the DEXCR to a static value on all CPUs when they online.

This series allows the DEXCR value to be changed at runtime with
per-thread granularity. It provides a prctl interface to set and query
this configuration. It also provides a system wide sysctl override for
the SBHE aspect. SBHE specifically has effects that can bleed over to
other CPUs (temporarily, after changing it), and certain tracing tools
may require it to be set globally across all threads.

Some notes on the patches/changes from the original RFC:

1. We don't need all the aspects to use feature bits, but the aspect
   information is in the device tree and this is the simplest mechanism
   to access it. Adding some kind of callback support to the feature
   detector would also work. The dexcr_supported variable introduced in
   patch 4 is a half-hearted example of how the callbacks could just
   update that variable, and no more CPU features would be necessary.

2. The thread used to track 'default' as a separate state (way back in
   the original RFC, before the split into static/dynamic). This RFC
   simplifies it away, as it is only useful if there is a system wide
   default that can be changed. The current system wide default is
   decided at compile time, so we just initialise the thread config to
   that. If the 'default' state were added in future though, would that
   be a userspace ABI concern? I guess it could just return a 'default'
   flag as well as the current 'on' and 'off' flags to indicate what the
   default is.

3. The prctl controls are reduced to what I expect to be most useful.
   Default state is removed as above, and so is 'force' (where the
   aspect would no longer be configurable). 'inherit' remains as a way
   to control the DEXCR of child process trees that may not be aware of
   it.

4. The prctl set interface is privileged. The concern is a
   non-privileged process disabling NPHIE (HASHCHK enabler) and then
   invoking a setuid binary which doesn't set NPHIE itself. It seems
   that kind of information about the exec target is not available in
   arch specific code.

5. A lot of the synchronization of the sysctl interface is removed.
   Apparently the kernfs system that manages these files enforces
   exclusive access to a given sysctl file. Additionally, the
   proc_dointvec_minmax() function was made to store the result with
   WRITE_ONCE(), so we can assume a regular atomic store of an aligned
   word.

6. The ROP protection enforcement is refactored a bit. The idea is to
   allow baking into the kernel at compile time that NPHIE cannot be
   changed by a thread. This seems to make the system more secure on
   paper; I'm not sure how useful it is in practice.

7. The prctl interface tries to stay separate from the DEXCR structure.
   This makes it a little contorted (having to convert the prctl value
   to an aspect), but I think it makes the interface more robust against
   future changes to the DEXCR. E.g., if all 32 aspect bits were
   exhausted and a second DEXCR added, the current interface could
   still handle that.
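
To make notes 4 and 7 and the SBHE override a bit more concrete, here is a rough userspace-compilable sketch of the shape being described. It is not the code from the patches: every ex_/EX_ name, the control values, the mask table, and the "-1 means no override" sysctl convention are assumptions made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical prctl-side aspect IDs: plain indices, not DEXCR bit positions. */
enum ex_aspect { EX_SBHE, EX_IBRTPD, EX_SRAPD, EX_NPHIE, EX_ASPECT_COUNT };

/* Hypothetical control values for the set/get calls. */
#define EX_CTRL_ON  1
#define EX_CTRL_OFF 2

/* Illustrative masks only; the real bit layout belongs to the arch code. */
static const uint32_t ex_mask[EX_ASPECT_COUNT] = {
    [EX_SBHE]   = 0x80000000u,
    [EX_IBRTPD] = 0x10000000u,
    [EX_SRAPD]  = 0x08000000u,
    [EX_NPHIE]  = 0x04000000u,
};

/* Convert a prctl aspect ID to a mask; 0 means "unknown aspect". */
static uint32_t ex_aspect_to_mask(unsigned long which)
{
    return which < EX_ASPECT_COUNT ? ex_mask[which] : 0;
}

/* Set path: rejects unknown aspects, then unprivileged callers (note 4). */
static int ex_set(uint32_t *thread_dexcr, unsigned long which,
                  unsigned long ctrl, bool privileged)
{
    uint32_t mask = ex_aspect_to_mask(which);

    if (!mask)
        return -EINVAL;
    if (!privileged)
        return -EPERM;

    switch (ctrl) {
    case EX_CTRL_ON:
        *thread_dexcr |= mask;
        return 0;
    case EX_CTRL_OFF:
        *thread_dexcr &= ~mask;
        return 0;
    default:
        return -EINVAL;
    }
}

/* Query path: reports 'on' or 'off' for a single aspect. */
static int ex_get(uint32_t thread_dexcr, unsigned long which)
{
    uint32_t mask = ex_aspect_to_mask(which);

    if (!mask)
        return -EINVAL;
    return (thread_dexcr & mask) ? EX_CTRL_ON : EX_CTRL_OFF;
}

/*
 * Combine the per-thread value with a system wide SBHE override.
 * Assumed convention: -1 = no override, 0 = force clear, 1 = force set.
 */
static uint32_t ex_effective(uint32_t thread_dexcr, int sbhe_override)
{
    assert(sbhe_override >= -1 && sbhe_override <= 1);

    if (sbhe_override == 0)
        return thread_dexcr & ~ex_mask[EX_SBHE];
    if (sbhe_override == 1)
        return thread_dexcr | ex_mask[EX_SBHE];
    return thread_dexcr;
}
```

Because callers only ever pass aspect IDs, the mask table (or a second table for a hypothetical second DEXCR) can change without touching the values userspace sees.
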

[1]: https://patchwork.ozlabs.org/project/linuxppc-dev/cover/20230616034846.311705-1-bg...@linux.ibm.com/

Benjamin Gray (6):
  powerpc/dexcr: Make all aspects CPU features
  powerpc/dexcr: Add thread specific DEXCR configuration
  prctl: Define PowerPC DEXCR interface
  powerpc/dexcr: Add prctl implementation
  powerpc/dexcr: Add sysctl entry for SBHE system override
  powerpc/dexcr: Add enforced userspace ROP protection config

 arch/powerpc/Kconfig                 |   5 +
 arch/powerpc/include/asm/cputable.h  |   6 +-
 arch/powerpc/include/asm/processor.h |  22 +++
 arch/powerpc/kernel/Makefile         |   1 +
 arch/powerpc/kernel/dexcr.c          | 218 +++++++++++++++++++++++++++
 arch/powerpc/kernel/process.c        |  24 +++
 arch/powerpc/kernel/prom.c           |   3 +
 include/uapi/linux/prctl.h           |  13 ++
 kernel/sys.c                         |  16 ++
 9 files changed, 307 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/kernel/dexcr.c

--
2.41.0