Re: [PATCH v2 2/2] certs: Add support for using elliptic curve keys for signing modules
+++ Stefan Berger [08/04/21 11:24 -0400]: Add support for using elliptic curve keys for signing modules. It uses a NIST P384 (secp384r1) key if the user chooses an elliptic curve key and will have ECDSA support built into the kernel. Note: A developer choosing an ECDSA key for signing modules should still delete the signing key (rm certs/signing_key.*) when building an older version of a kernel that only supports RSA keys. Unless kbuild automati- cally detects and generates a new kernel module key, ECDSA-signed kernel modules will fail signature verification. Signed-off-by: Stefan Berger --- v2: - check for ECDSA key by id-ecPublicKey from output line 'Public Key Algorithm: id-ecPublicKey'. --- certs/Kconfig | 25 + certs/Makefile| 9 + crypto/asymmetric_keys/pkcs7_parser.c | 4 3 files changed, 38 insertions(+) diff --git a/certs/Kconfig b/certs/Kconfig index 48675ad319db..6f8337874ae0 100644 --- a/certs/Kconfig +++ b/certs/Kconfig @@ -15,6 +15,31 @@ config MODULE_SIG_KEY then the kernel will automatically generate the private key and certificate as described in Documentation/admin-guide/module-signing.rst +choice + prompt "Type of module signing key to be generated" + default MODULE_SIG_KEY_TYPE_RSA + help +The type of module signing key type to generate. This option +does not apply if a #PKCS11 URI is used. + +config MODULE_SIG_KEY_TYPE_RSA + bool "RSA" + depends on MODULE_SIG || IMA_APPRAISE_MODSIG + help +Use an RSA key for module signing. + +config MODULE_SIG_KEY_TYPE_ECDSA + bool "ECDSA" + select CRYPTO_ECDSA + depends on MODULE_SIG || IMA_APPRAISE_MODSIG + help +Use an elliptic curve key (NIST P384) for module signing. + +Note: Remove all ECDSA signing keys, e.g. certs/signing_key.pem, +when falling back to building Linux 5.11 and older kernels. + +endchoice + config SYSTEM_TRUSTED_KEYRING bool "Provide system-wide ring of trusted keys" depends on KEYS diff --git a/certs/Makefile b/certs/Makefile index f64bc89ccbf1..c2fabc288550 100644 --- a/certs/Makefile +++ b/certs/Makefile @@ -62,7 +62,15 @@ ifeq ($(CONFIG_MODULE_SIG_KEY),"certs/signing_key.pem") X509TEXT=$(shell openssl x509 -in $(CONFIG_MODULE_SIG_KEY) -text) +# Support user changing key type +ifdef CONFIG_MODULE_SIG_KEY_TYPE_ECDSA +keytype_openssl = -newkey ec -pkeyopt ec_paramgen_curve:secp384r1 +$(if $(findstring id-ecPublicKey,$(X509TEXT)),,$(shell rm -f $(CONFIG_MODULE_SIG_KEY))) +endif + +ifdef CONFIG_MODULE_SIG_KEY_TYPE_RSA $(if $(findstring rsaEncryption,$(X509TEXT)),,$(shell rm -f $(CONFIG_MODULE_SIG_KEY))) +endif $(obj)/signing_key.pem: $(obj)/x509.genkey @$(kecho) "###" @@ -77,6 +85,7 @@ $(obj)/signing_key.pem: $(obj)/x509.genkey -batch -x509 -config $(obj)/x509.genkey \ -outform PEM -out $(obj)/signing_key.pem \ -keyout $(obj)/signing_key.pem \ + $(keytype_openssl) \ $($(quiet)redirect_openssl) @$(kecho) "###" @$(kecho) "### Key pair generated." diff --git a/crypto/asymmetric_keys/pkcs7_parser.c b/crypto/asymmetric_keys/pkcs7_parser.c index 967329e0a07b..2546ec6a0505 100644 --- a/crypto/asymmetric_keys/pkcs7_parser.c +++ b/crypto/asymmetric_keys/pkcs7_parser.c @@ -269,6 +269,10 @@ int pkcs7_sig_note_pkey_algo(void *context, size_t hdrlen, ctx->sinfo->sig->pkey_algo = "rsa"; ctx->sinfo->sig->encoding = "pkcs1"; break; + case OID_id_ecdsa_with_sha256: + ctx->sinfo->sig->pkey_algo = "ecdsa"; + ctx->sinfo->sig->encoding = "x962"; + break; Hi Stefan, Does CONFIG_MODULE_SIG_KEY_TYPE_ECDSA have a dependency on MODULE_SIG_SHA256? By default, MODULE_SIG_SHA1 is selected when CONFIG_MODULE_SIG is enabled. I was doing some quick testing and found that when I enabled MODULE_SIG_KEY_TYPE_ECDSA I get a "Unsupported pkey algo: 5" error on module load, which goes away after fixing my config and selecting MODULE_SIG_SHA256. Thanks, Jessica
Re: [PATCH v4 05/13] module: Add printk formats to add module build ID to stacktraces
+++ Stephen Boyd [17/04/21 18:52 -0700]: [snip] Sounds good. It means that some more ifdefs are probably required vs. making the array size be 0 when the config is disabled but that isn't a big problem for me. I'm reworking the code now and will test and then send v5 shortly. Thanks! Great, thanks a lot Stephen! Jessica
Re: [PATCH v4 05/13] module: Add printk formats to add module build ID to stacktraces
+++ Stephen Boyd [09/04/21 18:52 -0700]: Let's make kernel stacktraces easier to identify by including the build ID[1] of a module if the stacktrace is printing a symbol from a module. This makes it simpler for developers to locate a kernel module's full debuginfo for a particular stacktrace. Combined with scripts/decode_stracktrace.sh, a developer can download the matching debuginfo from a debuginfod[2] server and find the exact file and line number for the functions plus offsets in a stacktrace that match the module. This is especially useful for pstore crash debugging where the kernel crashes are recorded in something like console-ramoops and the recovery kernel/modules are different or the debuginfo doesn't exist on the device due to space concerns (the debuginfo can be too large for space limited devices). Originally, I put this on the %pS format, but that was quickly rejected given that %pS is used in other places such as ftrace where build IDs aren't meaningful. There was some discussions on the list to put every module build ID into the "Modules linked in:" section of the stacktrace message but that quickly becomes very hard to read once you have more than three or four modules linked in. It also provides too much information when we don't expect each module to be traversed in a stacktrace. Having the build ID for modules that aren't important just makes things messy. Splitting it to multiple lines for each module quickly explodes the number of lines printed in an oops too, possibly wrapping the warning off the console. And finally, trying to stash away each module used in a callstack to provide the ID of each symbol printed is cumbersome and would require changes to each architecture to stash away modules and return their build IDs once unwinding has completed. Instead, we opt for the simpler approach of introducing new printk formats '%pS[R]b' for "pointer symbolic backtrace with module build ID" and '%pBb' for "pointer backtrace with module build ID" and then updating the few places in the architecture layer where the stacktrace is printed to use this new format. Example: WARNING: CPU: 3 PID: 3373 at drivers/misc/lkdtm/bugs.c:83 lkdtm_WARNING+0x28/0x30 [lkdtm] Modules linked in: lkdtm rfcomm algif_hash algif_skcipher af_alg xt_cgroup uinput xt_MASQUERADE hci_uart CPU: 3 PID: 3373 Comm: bash Not tainted 5.11 #12 a8c0d47f7051f3e6670ceaea724af66a39c6cec8 Hardware name: Google Lazor (rev3+) with KB Backlight (DT) pstate: 0049 (nzcv daif +PAN -UAO -TCO BTYPE=--) pc : lkdtm_WARNING+0x28/0x30 [lkdtm] lr : lkdtm_do_action+0x24/0x40 [lkdtm] sp : ffc013febca0 x29: ffc013febca0 x28: ff88d9438040 x27: x26: x25: x24: ffdd0e9772c0 x23: 0020 x22: ffdd0e975366 x21: ffdd0e9772e0 x20: ffc013febde0 x19: 0008 x18: x17: x16: 0037 x15: ffdd102ab174 x14: 0003 x13: 0004 x12: x11: x10: x9 : 0001 x8 : ffdd0e979000 x7 : x6 : ffdd10ff6b54 x5 : x4 : x3 : ffc013feb938 x2 : ff89fef05a70 x1 : ff89feef5788 x0 : ffdd0e9772e0 Call trace: lkdtm_WARNING+0x28/0x30 [lkdtm 6c2215028606bda50de823490723dc4bc5bf46f9] direct_entry+0x16c/0x1b4 [lkdtm 6c2215028606bda50de823490723dc4bc5bf46f9] full_proxy_write+0x74/0xa4 vfs_write+0xec/0x2e8 ksys_write+0x84/0xf0 __arm64_sys_write+0x24/0x30 el0_svc_common+0xf4/0x1c0 do_el0_svc_compat+0x28/0x3c el0_svc_compat+0x10/0x1c el0_sync_compat_handler+0xa8/0xcc el0_sync_compat+0x178/0x180 ---[ end trace f89bc7f5417cbcc6 ]--- Cc: Jiri Olsa Cc: Alexei Starovoitov Cc: Jessica Yu Cc: Evan Green Cc: Hsin-Yi Wang Cc: Petr Mladek Cc: Steven Rostedt Cc: Sergey Senozhatsky Cc: Andy Shevchenko Cc: Rasmus Villemoes Cc: Cc: Matthew Wilcox Link: https://fedoraproject.org/wiki/Releases/FeatureBuildId [1] Link: https://sourceware.org/elfutils/Debuginfod.html [2] Signed-off-by: Stephen Boyd --- Documentation/core-api/printk-formats.rst | 11 +++ include/linux/kallsyms.h | 20 - include/linux/module.h| 6 +- kernel/kallsyms.c | 95 ++- kernel/module.c | 24 +- lib/vsprintf.c| 8 +- 6 files changed, 139 insertions(+), 25 deletions(-) diff --git a/Documentation/core-api/printk-formats.rst b/Documentation/core-api/printk-formats.rst index 160e710d992f..5f60533f2a56 100644 --- a/Documentation/core-api/printk-formats.rst +++ b/Documentation/core-api/printk-formats.rst @@ -114,6 +114,17 @@ used when printing stack backtraces. The specifier takes into consideration the effect of compiler optimisations which may occur when tail-calls are used and marked with
Re: Re: [PATCH] kernel/module: Use BUG_ON instead of if condition followed by BUG.
+++ 周传高 [13/04/21 15:21 +0800]: +++ zhouchuangao [30/03/21 05:07 -0700]: It can be optimized at compile time. Signed-off-by: zhouchuangao Hi, Could you please provide a more descriptive changelog? I.e., Is this a fix for a cocinelle warning? What are the optimization(s)? Thanks, First, This patch comes from cocinelle warning. Second, #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0) BUG_ON uses unlikely in if(). Through disassembly, we can see that brk #0x800 is compiled to the end of the function. As you can see below: .. ff8008660bec: d65f03c0ret ff8008660bf0: d421brk #0x800 Usually, the condition in if () is not satisfied. For the multi-stage pipeline, we do not need to perform fetch decode and excute operation on brk instruction. In my opinion, this can improve the efficiency of the multi-stage pipeline. Thanks. Could you please modify your commit/changelog message to include this information (including the coccinelle warning) and resend the patch?
Re: [PATCH] kernel/module: Use BUG_ON instead of if condition followed by BUG.
+++ zhouchuangao [30/03/21 05:07 -0700]: It can be optimized at compile time. Signed-off-by: zhouchuangao Hi, Could you please provide a more descriptive changelog? I.e., Is this a fix for a cocinelle warning? What are the optimization(s)? Thanks, Jessica --- kernel/module.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index 3047935..f46fc4f 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -1014,8 +1014,8 @@ void __symbol_put(const char *symbol) }; preempt_disable(); - if (!find_symbol(&fsa)) - BUG(); + BUG_ON(!find_symbol(&fsa)); + module_put(fsa.owner); preempt_enable(); } -- 2.7.4
Re: [PATCH v3 04/12] module: Add printk format to add module build ID to stacktraces
+++ Stephen Boyd [30/03/21 20:05 -0700]: [snipped] diff --git a/kernel/module.c b/kernel/module.c index 30479355ab85..6f5bc1b046a5 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -2770,6 +2771,20 @@ static void add_kallsyms(struct module *mod, const struct load_info *info) } mod->core_kallsyms.num_symtab = ndst; } + +static void init_build_id(struct module *mod, const struct load_info *info) +{ + const Elf_Shdr *sechdr; + unsigned int i; + + for (i = 0; i < info->hdr->e_shnum; i++) { + sechdr = &info->sechdrs[i]; + if (!sect_empty(sechdr) && sechdr->sh_type == SHT_NOTE && + !build_id_parse_buf((void *)sechdr->sh_addr, mod->build_id, + sechdr->sh_size)) + break; + } +} Why not just look for the .note.gnu.build-id section instead of trying to parse each note section? Doesn't it always contain the build id? At least the ld man page seems to suggest this section name should be consistent. Jessica
Re: [PATCH v3 02/12] buildid: Stash away kernels build ID on init
+++ Stephen Boyd [30/03/21 20:05 -0700]: [snipped] diff --git a/lib/buildid.c b/lib/buildid.c index 010ab0674cb9..b939bbc59233 100644 --- a/lib/buildid.c +++ b/lib/buildid.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 #include +#include #include #include #include @@ -171,3 +172,19 @@ int build_id_parse_buf(const void *buf, unsigned char *build_id, u32 buf_size) { return parse_build_id_buf(build_id, NULL, buf, buf_size); } + +unsigned char vmlinux_build_id[BUILD_ID_SIZE_MAX] __ro_after_init; + +/** + * init_vmlinux_build_id - Get the running kernel's build ID + * + * Return: Running kernel's build ID + */ Hm, init_vmlinux_build_id() doesn't return anything, so this comment is not accurate - maybe "Get the running kernel's build ID and store it in vmlinux_build_id"? +void __init init_vmlinux_build_id(void) +{ + extern const void __start_notes __weak; + extern const void __stop_notes __weak; + unsigned int size = &__stop_notes - &__start_notes; + + build_id_parse_buf(&__start_notes, vmlinux_build_id, size); +}
Re: [PATCH v3 04/17] module: ensure __cfi_check alignment
+++ Sami Tolvanen [23/03/21 13:39 -0700]: CONFIG_CFI_CLANG_SHADOW assumes the __cfi_check() function is page aligned and at the beginning of the .text section. While Clang would normally align the function correctly, it fails to do so for modules with no executable code. This change ensures the correct __cfi_check() location and alignment. It also discards the .eh_frame section, which Clang can generate with certain sanitizers, such as CFI. Link: https://bugs.llvm.org/show_bug.cgi?id=46293 Signed-off-by: Sami Tolvanen Acked-by: Jessica Yu Thanks!
[PATCH] module: treat exit sections the same as init sections when !CONFIG_MODULE_UNLOAD
Dynamic code patching (alternatives, jump_label and static_call) can have sites in __exit code, even if __exit is never executed. Therefore __exit must be present at runtime, at least for as long as __init code is. Additionally, for jump_label and static_call, the __exit sites must also identify as within_module_init(), such that the infrastructure is aware to never touch them after module init -- alternatives are only ran once at init and hence don't have this particular constraint. By making __exit identify as __init for !MODULE_UNLOAD, the above is satisfied. So the section ordering should look like the following when !CONFIG_MODULE_UNLOAD, with the .exit sections moved to the init region of the module. Core section allocation order: .text .rodata __ksymtab_gpl __ksymtab_strings .note.* sections .bss .data .gnu.linkonce.this_module Init section allocation order: .init.text .exit.text .symtab .strtab [jeyu: thanks to Peter Zijlstra for most of the changelog] Link: https://lore.kernel.org/lkml/YFiuphGw0RKehWsQ@gunter/ Signed-off-by: Jessica Yu --- Do you want to take this patch with the other static_call patches? Or should I take this through modules-next? kernel/module.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index 30479355ab85..173a09175511 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2802,7 +2802,11 @@ void * __weak module_alloc(unsigned long size) bool __weak module_init_section(const char *name) { +#ifndef CONFIG_MODULE_UNLOAD + return strstarts(name, ".init") || module_exit_section(name); +#else return strstarts(name, ".init"); +#endif } bool __weak module_exit_section(const char *name) @@ -3116,11 +3120,6 @@ static int rewrite_section_headers(struct load_info *info, int flags) */ shdr->sh_addr = (size_t)info->hdr + shdr->sh_offset; -#ifndef CONFIG_MODULE_UNLOAD - /* Don't load .exit sections */ - if (module_exit_section(info->secstrings+shdr->sh_name)) - shdr->sh_flags &= ~(unsigned long)SHF_ALLOC; -#endif } /* Track but don't keep modinfo and version sections. */ -- 2.30.1
Re: [PATCH 3/3] static_call: Fix static_call_update() sanity check
+++ Peter Zijlstra [22/03/21 17:54 +0100]: On Mon, Mar 22, 2021 at 03:50:14PM +0100, Jessica Yu wrote: It should be doable. If you want the exit sections to be treated the same as module init, the following patch should stuff any exit sections into the module init "region" (completely untested). Hence it should be freed together with the init sections and it would identify as init through within_module_init(). Let me know if this works for you. That does indeed seem to DTRT from a quick scan of module.c. Very nice tidy patch. I was afraid it'd be much worse. Assuming it actually works; for your Changelog: "Dynamic code patching (alternatives, jump_label and static_call) can have sites in __exit code, even it __exit is never executed. Therefore __exit must be present at runtime, at least for as long as __init code is. Additionally, for jump_label and static_call, the __exit sites must also identify as within_module_init(), such that the infrastructure is aware to never touch them after module init -- alternatives are only ran once at init and hence don't have this particular constraint. By making __exit identify as __init for UNLOAD_MODULE, the above is satisfied." Thanks a lot for the changelog :-) I'll turn this into a formal patch after some testing tomorrow. Jessica
Re: [PATCH 3/3] static_call: Fix static_call_update() sanity check
+++ Steven Rostedt [19/03/21 16:57 -0400]: On Fri, 19 Mar 2021 20:34:24 +0100 Peter Zijlstra wrote: On Fri, Mar 19, 2021 at 02:00:05PM -0400, Steven Rostedt wrote: > Would making __exit code the same as init code work? That is, load it just > like module init code is loaded, and free it when the init code is freed As stated, yes. But it must then also identify as init through within_module_init(). I think that's doable. Since the usecases for that appear to be mostly about "think code may no longer exist after it is used". Thus, having exit code act just like init code when UNLOAD is not set, appears appropriate. Jessica, please correct me if I'm wrong. It should be doable. If you want the exit sections to be treated the same as module init, the following patch should stuff any exit sections into the module init "region" (completely untested). Hence it should be freed together with the init sections and it would identify as init through within_module_init(). Let me know if this works for you. --- diff --git a/kernel/module.c b/kernel/module.c index 30479355ab85..1c3396a9dd8b 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2802,7 +2802,11 @@ void * __weak module_alloc(unsigned long size) bool __weak module_init_section(const char *name) { - return strstarts(name, ".init"); +#ifndef CONFIG_UNLOAD_MODULE + return strstarts(name, ".init") || module_exit_section(name); +#else + return strstarts(name, ".init") +#endif } bool __weak module_exit_section(const char *name) @@ -3116,11 +3120,6 @@ static int rewrite_section_headers(struct load_info *info, int flags) */ shdr->sh_addr = (size_t)info->hdr + shdr->sh_offset; -#ifndef CONFIG_MODULE_UNLOAD - /* Don't load .exit sections */ - if (module_exit_section(info->secstrings+shdr->sh_name)) - shdr->sh_flags &= ~(unsigned long)SHF_ALLOC; -#endif } /* Track but don't keep modinfo and version sections. */
Re: [PATCH 3/3] static_call: Fix static_call_update() sanity check
+++ Steven Rostedt [19/03/21 14:00 -0400]: On Fri, 19 Mar 2021 13:57:38 +0100 Peter Zijlstra wrote: Jessica, can you explain how !MODULE_UNLOAD is supposed to work? Alternatives, jump_labels and static_call all can have relocations into __exit code. Not loading it at all would be BAD. According to the description: " Without this option you will not be able to unload any modules (note that some modules may not be unloadable anyway), which makes your kernel smaller, faster and simpler. If unsure, say Y." Seems there's no reason to load the "exit" portion, as that's what makes it "smaller". Exactly. If you disable MODULE_UNLOAD, then you don't intend to ever unload any modules, and so you'll never end up calling the module's cleanup/exit function. That code would basically be never used, so that's why it's not loaded in the first place.
Re: [PATCH v2 2/2] modules: add CONFIG_MODPROBE_PATH
+++ Rasmus Villemoes [09/03/21 22:17 +0100]: Allow the developer to specifiy the initial value of the modprobe_path[] string. This can be used to set it to the empty string initially, thus effectively disabling request_module() during early boot until userspace writes a new value via the /proc/sys/kernel/modprobe interface. [1] When building a custom kernel (often for an embedded target), it's normal to build everything into the kernel that is needed for booting, and indeed the initramfs often contains no modules at all, so every such request_module() done before userspace init has mounted the real rootfs is a waste of time. This is particularly useful when combined with the previous patch, which allowed the initramfs to be unpacked asynchronously - for that to work, it had to make any usermodehelper call wait for the unpacking to finish before attempting to invoke the userspace helper. By eliminating all such (known-to-be-futile) calls of usermodehelper, the initramfs unpacking and the {device,late}_initcalls can proceed in parallel for much longer. For a relatively slow ppc board I'm working on, the two patches combined lead to 0.2s faster boot - but more importantly, the fact that the initramfs unpacking proceeds completely in the background while devices get probed means I get to handle the gpio watchdog in time without getting reset. [1] __request_module() already has an early -ENOENT return when modprobe_path is the empty string. Signed-off-by: Rasmus Villemoes Acked-by: Jessica Yu Thanks!
Re: [GIT PULL] Modules updates for v5.12
+++ Christoph Hellwig [24/02/21 08:52 +0100]: On Tue, Feb 23, 2021 at 12:07:39PM -0800, Linus Torvalds wrote: On Tue, Feb 23, 2021 at 12:03 PM Linus Torvalds wrote: > > This is unacceptably slow. If that symbol trimming takes 30% of the > whole kernel build time, it needs to be fixed or removed. I think I'm going to mark TRIM_UNUSED_KSYMS as "depends on BROKEN". There's no way I can accept that horrible overhead, and the rationale for that config option is questionable at best, I think it is pretty useful for embedded setups. BROKEN seems pretty strong for something that absolutely works as intendended. I guess to make you (and possibly others) not grumpy we just need to ensure it doesn't get pulled in by allmodconfig. So maybe just invert the symbol, default the KEEP_UNUSED_SYMBOL, and add a message to the helptext explaining the slowdown? Hm, something like this maybe? (untested) --- From 08bc08229fc3801b1a580a07ce7ff3e806b3fe90 Mon Sep 17 00:00:00 2001 From: Jessica Yu Date: Wed, 24 Feb 2021 14:54:09 +0100 Subject: [PATCH] Kconfig: invert TRIM_UNUSED_SYMBOLS option and rename it to KEEP_UNUSED_SYMBOLS Removing CONFIG_UNUSED_SYMBOLS (commit 367948220fce "module: remove EXPORT_UNUSED_SYMBOL*") unhid the CONFIG_TRIM_UNUSED_SYMBOLS option and therefore it now gets automatically enabled with allyesconfig. To prevent allyesconfig from enabling TRIM_UNUSED_SYMBOLS (which is known to slow build times), invert the config option and name it KEEP_UNUSED_SYMBOLS, which does the same thing as TRIM_UNUSED_SYMBOLS=n. That way, allyesconfig will keep the previous behavior of not trimming symbols. Signed-off-by: Jessica Yu --- Makefile| 4 ++-- arch/mips/configs/generic_defconfig | 2 +- include/asm-generic/export.h| 2 +- include/linux/export.h | 2 +- init/Kconfig| 21 +++-- kernel/livepatch/Kconfig| 2 +- scripts/Makefile.build | 2 +- 7 files changed, 18 insertions(+), 17 deletions(-) diff --git a/Makefile b/Makefile index b18dbc634690..23f50521e97f 100644 --- a/Makefile +++ b/Makefile @@ -1164,7 +1164,7 @@ vmlinux-deps := $(KBUILD_LDS) $(KBUILD_VMLINUX_OBJS) $(KBUILD_VMLINUX_LIBS) # Recurse until adjust_autoksyms.sh is satisfied PHONY += autoksyms_recursive -ifdef CONFIG_TRIM_UNUSED_KSYMS +ifndef CONFIG_KEEP_UNUSED_KSYMS # For the kernel to actually contain only the needed exported symbols, # we have to build modules as well to determine what those symbols are. # (this can be evaluated only once include/config/auto.conf has been included) @@ -1175,7 +1175,7 @@ autoksyms_recursive: descend modules.order "$(MAKE) -f $(srctree)/Makefile vmlinux" endif -autoksyms_h := $(if $(CONFIG_TRIM_UNUSED_KSYMS), include/generated/autoksyms.h) +autoksyms_h := $(if $(CONFIG_KEEP_UNUSED_KSYMS),,include/generated/autoksyms.h) quiet_cmd_autoksyms_h = GEN $@ cmd_autoksyms_h = mkdir -p $(dir $@); \ diff --git a/arch/mips/configs/generic_defconfig b/arch/mips/configs/generic_defconfig index 714169e411cf..d46da28ea032 100644 --- a/arch/mips/configs/generic_defconfig +++ b/arch/mips/configs/generic_defconfig @@ -29,7 +29,7 @@ CONFIG_MIPS_O32_FP64_SUPPORT=y CONFIG_JUMP_LABEL=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y -CONFIG_TRIM_UNUSED_KSYMS=y +CONFIG_KEEP_UNUSED_KSYMS=n CONFIG_NET=y CONFIG_PACKET=y CONFIG_UNIX=y diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h index 07a36a874dca..06d401464195 100644 --- a/include/asm-generic/export.h +++ b/include/asm-generic/export.h @@ -57,7 +57,7 @@ __kstrtab_\name: #endif .endm -#if defined(CONFIG_TRIM_UNUSED_KSYMS) +#if !defined(CONFIG_KEEP_UNUSED_KSYMS) #include #include diff --git a/include/linux/export.h b/include/linux/export.h index 6271a5d9c988..449f7d15e580 100644 --- a/include/linux/export.h +++ b/include/linux/export.h @@ -118,7 +118,7 @@ struct kernel_symbol { */ #define __EXPORT_SYMBOL(sym, sec, ns) -#elif defined(CONFIG_TRIM_UNUSED_KSYMS) +#elif !defined(CONFIG_KEEP_UNUSED_KSYMS) #include diff --git a/init/Kconfig b/init/Kconfig index ba8bd5256980..db5d00bfc239 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2272,29 +2272,30 @@ config MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS If unsure, say N. -config TRIM_UNUSED_KSYMS - bool "Trim unused exported kernel symbols" - depends on BROKEN +config KEEP_UNUSED_KSYMS + bool "Keep unused exported kernel symbols" + default y help The kernel and some modules make many symbols available for other modules to use via EXPORT_SYMBOL() and variants. Depending on the set of modules being selected in your kernel configuration, many of those exported symbols might never be used. - This option allows for unused exported symbols to be dropped from - the build. In turn
Re: [GIT PULL] Modules updates for v5.12
+++ Linus Torvalds [23/02/21 12:03 -0800]: On Tue, Feb 23, 2021 at 12:01 PM Christoph Hellwig wrote: Does your build now enable TRIM_UNUSED_KSYMS but previously didn't by chance? Crossed emails. This is plain "make allmodconfig", so yes, now it will enable TRIM_UNUSED_KSYMS. This is unacceptably slow. If that symbol trimming takes 30% of the whole kernel build time, it needs to be fixed or removed. [ Adding Masahiro to CC ] It looks like CONFIG_TRIM_UNUSED_KSYMS had been hiding behind CONFIG_UNUSED_SYMBOLS all this time, and once the EXPORT_UNUSED_SYMBOL stuff was removed, it exposed that option to be selected by allyesconfig. That option had previously caused build issues on powerpc on linux-next, so I had temporarily marked that as BROKEN on powerpc until Masahiro's fix landed in linux-next. I was not aware of the additional build slowdown issue :/ In any case, Christoph's suggestion to invert the option sounds reasonable, since the mips defconfig selects it, it does not seem totally unused.
[GIT PULL] Modules updates for v5.12
Hi Linus, Please pull below to receive modules updates for the v5.12 merge window. A summary can be found in the signed tag. Thank you, Jessica --- The following changes since commit 19c329f6808995b142b3966301f217c831e7cf31: Linux 5.11-rc4 (2021-01-17 16:37:05 -0800) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux.git tags/modules-for-v5.12 for you to fetch changes up to 1e80d9cb579ed7edd121753eeccce82ff82521b4: module: potential uninitialized return in module_kallsyms_on_each_symbol() (2021-02-10 16:57:04 +0100) Modules updates for v5.12 Summary of modules changes for the 5.12 merge window: - Retire EXPORT_UNUSED_SYMBOL() and EXPORT_SYMBOL_GPL_FUTURE(). These export types were introduced between 2006 - 2008. All the of the unused symbols have been long removed and gpl future symbols were converted to gpl quite a long time ago, and I don't believe these export types have been used ever since. So, I think it should be safe to retire those export types now. (Christoph Hellwig) - Refactor and clean up some aged code cruft in the module loader (Christoph Hellwig) - Build {,module_}kallsyms_on_each_symbol only when livepatching is enabled, as it is the only caller (Christoph Hellwig) - Unexport find_module() and module_mutex and fix the last module callers to not rely on these anymore. Make module_mutex internal to the module loader. (Christoph Hellwig) - Harden ELF checks on module load and validate ELF structures before checking the module signature (Frank van der Linden) - Fix undefined symbol warning for clang (Fangrui Song) - Fix smatch warning (Dan Carpenter) Signed-off-by: Jessica Yu Christoph Hellwig (13): powerpc/powernv: remove get_cxl_module drm: remove drm_fb_helper_modinit module: unexport find_module and module_mutex module: use RCU to synchronize find_module kallsyms: refactor {,module_}kallsyms_on_each_symbol kallsyms: only build {,module_}kallsyms_on_each_symbol when required module: mark module_mutex static module: remove each_symbol_in_section module: merge each_symbol_section into find_symbol module: pass struct find_symbol_args to find_symbol module: move struct symsearch to module.c module: remove EXPORT_SYMBOL_GPL_FUTURE module: remove EXPORT_UNUSED_SYMBOL* Dan Carpenter (1): module: potential uninitialized return in module_kallsyms_on_each_symbol() Fangrui Song (1): module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols Frank van der Linden (1): module: harden ELF info handling arch/arm/configs/bcm2835_defconfig | 1 - arch/arm/configs/mxs_defconfig | 1 - arch/mips/configs/nlm_xlp_defconfig | 1 - arch/mips/configs/nlm_xlr_defconfig | 1 - arch/parisc/configs/generic-32bit_defconfig | 1 - arch/parisc/configs/generic-64bit_defconfig | 1 - arch/powerpc/configs/ppc6xx_defconfig | 1 - arch/powerpc/platforms/powernv/pci-cxl.c| 22 -- arch/s390/configs/debug_defconfig | 1 - arch/s390/configs/defconfig | 1 - arch/sh/configs/edosk7760_defconfig | 1 - arch/sh/configs/sdk7780_defconfig | 1 - arch/x86/configs/i386_defconfig | 1 - arch/x86/configs/x86_64_defconfig | 1 - arch/x86/tools/relocs.c | 4 +- drivers/gpu/drm/drm_crtc_helper_internal.h | 10 - drivers/gpu/drm/drm_fb_helper.c | 21 -- drivers/gpu/drm/drm_kms_helper_common.c | 25 +- include/asm-generic/vmlinux.lds.h | 42 --- include/linux/export.h | 9 - include/linux/kallsyms.h| 17 +- include/linux/module.h | 48 +-- init/Kconfig| 17 - kernel/kallsyms.c | 8 +- kernel/livepatch/core.c | 7 +- kernel/module.c | 481 ++-- kernel/module_signature.c | 2 +- kernel/module_signing.c | 2 +- kernel/trace/trace_kprobe.c | 4 +- lib/bug.c | 3 - scripts/checkpatch.pl | 6 +- scripts/mod/modpost.c | 50 +-- scripts/mod/modpost.h | 3 - scripts/module.lds.S| 6 - tools/include/linux/export.h| 3 - 35 files changed, 287 insertions(+), 516 deletions(-)
Re: [PATCH] kbuild: fix CONFIG_TRIM_UNUSED_KSYMS build for ppc64
+++ Masahiro Yamada [11/02/21 15:14 +0900]: Stephen Rothwell reported a build error on ppc64 when CONFIG_TRIM_UNUSED_KSYMS is enabled. Jessica Yu pointed out the cause of the error with the reference to the ppc64 elf ABI: "Symbol names with a dot (.) prefix are reserved for holding entry point addresses. The value of a symbol named ".FN", if it exists, is the entry point of the function "FN". As it turned out, CONFIG_TRIM_UNUSED_KSYMS has never worked for ppc64, which has been unnoticed until recently because this option depends on !UNUSED_SYMBOLS hence is disabled by all{mod,yes}config. (Then, it was uncovered by another patch removing UNUSED_SYMBOLS.) Removing the dot prefix in scripts/gen_autoksyms.sh fixes the issue. Please note it must be done before 'sort -u', because modules have both ._mcount and _mcount undefined when CONFIG_FUNCTION_TRACER=y. Link: https://lore.kernel.org/lkml/20210209210843.3af66...@canb.auug.org.au/ Reported-by: Stephen Rothwell Signed-off-by: Masahiro Yamada Thanks a lot for the quick fix. This fixes the ppc64 build issue on my end: Tested-by: Jessica Yu Do you plan to take this through the kbuild tree? If so, please let me know when you've applied it, then I can undo the temporary workaround I currently have in modules-next. Thank you! Jessica
Re: [PATCH] module: potential uninitialized return in module_kallsyms_on_each_symbol()
+++ Dan Carpenter [10/02/21 13:57 +0300]: Smatch complains that: kernel/module.c:4472 module_kallsyms_on_each_symbol() error: uninitialized symbol 'ret'. This warning looks like it could be correct if the &modules list is empty. Fixes: 013c1667cf78 ("kallsyms: refactor {,module_}kallsyms_on_each_symbol") Signed-off-by: Dan Carpenter Applied, thanks! Jessica
Re: linux-next: build failure after merge of the modules tree
+++ Stephen Rothwell [10/02/21 23:21 +1100]: Hi Jessica, On Wed, 10 Feb 2021 09:06:48 +0100 Jessica Yu wrote: Sorry, by "feature" I meant CONFIG_TRIM_UNUSED_KSYMS. This config option was introduced around v4.7. If simply enabling it produces these compilation errors I was wondering if it ever built properly on powerpc. Ah, of course. So for a quick fix, you could revert just the changes to lib/Kconfig and all the defconfigs. That way all the UNUSED_SYMBOLS infrastructure is still removed, but TRIM_UNUSED_KSYMS remains (un)set whenever it used to be (un)set and that could then be cleaned up in a followup patch set per architecture when we know it works. Hi Stephen, What's maybe simpler is marking this as BROKEN on PPC64 as a temporary fix for linux-next until Masahiro's patch (the real fix) is posted. I think the real problem is that TRIM_UNUSED_KSYMS was pretty much hiding behind UNUSED_SYMBOLS=y until it finally got removed. Anyway the main point is that TRIM_UNUSED_KSYMS=y shouldn't be breaking any builds. I've pushed this change and hope that won't break linux-next now. --- From 1fa67f8391acb88a54e7defe6b73f8f171450f4a Mon Sep 17 00:00:00 2001 From: Jessica Yu Date: Wed, 10 Feb 2021 16:59:53 +0100 Subject: [PATCH] module: mark TRIM_UNUSED_KSYMS as BROKEN on powerpc Commit 367948220fce, which removed CONFIG_UNUSED_SYMBOLS, unraveled the TRIM_UNUSED_KSYMS config option to be generally available without depending on CONFIG_UNUSED_SYMBOLS=y. With CONFIG_UNUSED_SYMBOLS gone, this means allyesconfig will now select this option without being "behind" CONFIG_UNUSED_SYMBOLS. Unfortunately, this revealed some underlying build issues on powerpc. When CONFIG_TRIM_UNUSED_KSYMS is enabled on powerpc, the build would fail like this: In file included from include/linux/export.h:123, from include/linux/linkage.h:7, from arch/powerpc/include/asm/unistd.h:18, from :2: ./include/generated/autoksyms.h:5:9: warning: missing whitespace after the macro name 5 | #define __KSYM_.HT_update_self_and_peer_setting 1 | ^~~ ./include/generated/autoksyms.h:6:9: warning: missing whitespace after the macro name 6 | #define __KSYM_.RemovePeerTS 1 | ^~~ ./include/generated/autoksyms.h:6: warning: "__KSYM_" redefined 6 | #define __KSYM_.RemovePeerTS 1 | ./include/generated/autoksyms.h:5: note: this is the location of the previous definition This is due to the addition of a dot "." prefix to some symbols that is specific to powerpc, which confuses the gen_autoksyms script. This is a temporary workaround for linux-next until gen_autoksyms and modpost gets fixed. Link: http://lore.kernel.org/r/20210209210843.3af66...@canb.auug.org.au Signed-off-by: Jessica Yu --- init/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/init/Kconfig b/init/Kconfig index 11b803b45c19..e4504a04b601 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2264,6 +2264,7 @@ config MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS config TRIM_UNUSED_KSYMS bool "Trim unused exported kernel symbols" + depends on !PPC64 || BROKEN help The kernel and some modules make many symbols available for other modules to use via EXPORT_SYMBOL() and variants. Depending
Re: linux-next: build failure after merge of the modules tree
+++ Stephen Rothwell [10/02/21 08:50 +1100]: Hi Jessica, On Tue, 9 Feb 2021 16:16:20 +0100 Jessica Yu wrote: Hmm, these errors don't look like it's related to that particular commit. I was I found this commit by bisection and then tested by reverting it. Before this commit, CONFIG_TRIM_UNUSED_KSYMS would not be set in the allyesconfig build because CONFIG_UNUSED_SYMBOLS was set. After this commit, CONFIG_TRIM_UNUSED_KSYMS will be set in the allyesconfig build. Ah, that makes sense then. I would get the error on powerpc whenever CONFIG_TRIM_UNUSED_KSYMS was enabled. able to reproduce these weird autoksym errors even without any modules-next patches applied, and on a clean v5.11-rc7 tree. To reproduce it, CONFIG_TRIM_UNUSED_KSYMS needs to be enabled. I guess that's why we run into these errors with allyesconfig. I used a gcc-7 ppc64le cross compiler and got the same compiler warnings. It seems to not compile on powerpc properly because it looks like some symbols have an extra dot "." prefix, for example in kthread.o: 168: 031824 NOTYPE GLOBAL DEFAULT6 kthread_create_worker 169: 1d90 104 FUNCGLOBAL DEFAULT1 .kthread_create_worker 170: 033024 NOTYPE GLOBAL DEFAULT6 kthread_create_worker_on_cpu 171: 1e0088 FUNCGLOBAL DEFAULT1 .kthread_create_worker_on_cpu 172: 034824 NOTYPE GLOBAL DEFAULT6 kthread_queue_work 173: 1e60 228 FUNCGLOBAL DEFAULT1 .kthread_queue_work So I suppose this dot prefix is specific to powerpc. From the ppc64 elf abi docs: Symbol names with a dot (.) prefix are reserved for holding entry point addresses. The value of a symbol named ".FN", if it exists, is the entry point of the function "FN". I guess the presence of the extra dot symbols is confusing scripts/gen_autoksyms.sh, so we get the dot symbols in autoksyms.h, which the preprocessor doesn't like. I am wondering how this was never caught until now and also now curious if this feature was ever functional on powerpc.. Which feature? Sorry, by "feature" I meant CONFIG_TRIM_UNUSED_KSYMS. This config option was introduced around v4.7. If simply enabling it produces these compilation errors I was wondering if it ever built properly on powerpc. Thanks, Jessica
Re: linux-next: build failure after merge of the modules tree
+++ Stephen Rothwell [09/02/21 21:08 +1100]: Hi all, After merging the modules tree, today's linux-next build (powerpc allyesconfig) failed like this: In file included from include/linux/export.h:123, from include/linux/linkage.h:7, from arch/powerpc/include/asm/unistd.h:18, from :2: ./include/generated/autoksyms.h:5:9: warning: missing whitespace after the macro name 5 | #define __KSYM_.HT_update_self_and_peer_setting 1 | ^~~ ./include/generated/autoksyms.h:6:9: warning: missing whitespace after the macro name 6 | #define __KSYM_.RemovePeerTS 1 | ^~~ ./include/generated/autoksyms.h:6: warning: "__KSYM_" redefined 6 | #define __KSYM_.RemovePeerTS 1 | ./include/generated/autoksyms.h:5: note: this is the location of the previous definition and on and on :-( Caused by commit 367948220fce ("module: remove EXPORT_UNUSED_SYMBOL*") I have reverted that commit for today. [ Adding Michael and Masahiro to CC ] Hi Stephen, Hmm, these errors don't look like it's related to that particular commit. I was able to reproduce these weird autoksym errors even without any modules-next patches applied, and on a clean v5.11-rc7 tree. To reproduce it, CONFIG_TRIM_UNUSED_KSYMS needs to be enabled. I guess that's why we run into these errors with allyesconfig. I used a gcc-7 ppc64le cross compiler and got the same compiler warnings. It seems to not compile on powerpc properly because it looks like some symbols have an extra dot "." prefix, for example in kthread.o: 168: 031824 NOTYPE GLOBAL DEFAULT6 kthread_create_worker 169: 1d90 104 FUNCGLOBAL DEFAULT1 .kthread_create_worker 170: 033024 NOTYPE GLOBAL DEFAULT6 kthread_create_worker_on_cpu 171: 1e0088 FUNCGLOBAL DEFAULT1 .kthread_create_worker_on_cpu 172: 034824 NOTYPE GLOBAL DEFAULT6 kthread_queue_work 173: 1e60 228 FUNCGLOBAL DEFAULT1 .kthread_queue_work So I suppose this dot prefix is specific to powerpc. From the ppc64 elf abi docs: Symbol names with a dot (.) prefix are reserved for holding entry point addresses. The value of a symbol named ".FN", if it exists, is the entry point of the function "FN". I guess the presence of the extra dot symbols is confusing scripts/gen_autoksyms.sh, so we get the dot symbols in autoksyms.h, which the preprocessor doesn't like. I am wondering how this was never caught until now and also now curious if this feature was ever functional on powerpc.. Thanks, Jessica
Re: module loader dead code removal and cleanups v3
+++ Christoph Hellwig [02/02/21 13:13 +0100]: Hi all, this series removes support for long term unused export types and cleans up various loose ends in the module loader. Changes since v2: - clean up klp_find_object_symbol a bit - remove the now unused module_assert_mutex helper Changes since v1: - move struct symsearch to module.c - rework drm to not call find_module at all - allow RCU-sched locking for find_module - keep find_module as a public API instead of module_loaded - update a few comments and commit logs Diffstat: Queued on modules-next (along with the updated patch 10). Thanks everyone, Jessica
Re: [PATCH v6] modules: introduce the MODULE_SCMVERSION config
+++ Will McVicker [21/01/21 21:36 +]: Config MODULE_SCMVERSION introduces a new module attribute -- `scmversion` -- which can be used to identify a given module's SCM version. This is very useful for developers that update their kernel independently from their kernel modules or vice-versa since the SCM version provided by UTS_RELEASE (`uname -r`) will now differ from the module's vermagic attribute. For example, we have a CI setup that tests new kernel changes on the hikey960 and db845c devices without updating their kernel modules. When these tests fail, we need to be able to identify the exact device configuration the test was using. By including MODULE_SCMVERSION, we can identify the exact kernel and modules' SCM versions for debugging the failures. Additionally, by exposing the SCM version via the sysfs node /sys/module/MODULENAME/scmversion, one can also verify the SCM versions of the modules loaded from the initramfs. Currently, modinfo can only retrieve module attributes from the module's ko on disk and not from the actual module that is loaded in RAM. You can retrieve the SCM version in two ways, 1) By using modinfo: > modinfo -F scmversion MODULENAME 2) By module sysfs node: > cat /sys/module/MODULENAME/scmversion Hi Will, First off, thanks for being patient and being responsive throughout the patch review process. Personally, I am rather neutral towards this feature. We already provide CONFIG_MODULE_SRCVERSION to provide a checksum of a module's source files and I think the SCMVERSION is a nicer alternative. I can see how an optional scmversion field might be helpful information for distro developers in testing environments and in situations where it is possible for the kernel and modules to be updated/packaged separately (for instance, the kernel selftest modules under lib/ are in-tree modules that are provided as a separate kernel module package in SLE). Generally, out of principle, I do not want to merge a patch that's been NAK'd repeatedly; even if I take the patch it'd likely be contested all the way up to the merge window. So this boils down to whether Christoph (and maybe Greg) are fine with this being a debug option that's not enabled by default. Thanks, Jessica
Re: module loader dead code removal and cleanups v3
+++ Christoph Hellwig [02/02/21 13:13 +0100]: Hi all, this series removes support for long term unused export types and cleans up various loose ends in the module loader. Changes since v2: - clean up klp_find_object_symbol a bit - remove the now unused module_assert_mutex helper Changes since v1: - move struct symsearch to module.c - rework drm to not call find_module at all - allow RCU-sched locking for find_module - keep find_module as a public API instead of module_loaded - update a few comments and commit logs Thanks Christoph for cleaning up all that aged cruft, and thanks everyone for the reviews. I was curious about EXPORT_SYMBOL_GPL_FUTURE and EXPORT_UNUSED_SYMBOL variants, and found that most of that stuff was introduced between 2006 - 2008. All the of the unused symbols were removed and gpl future symbols were converted to gpl quite a long time ago, and I don't believe these export types have been used ever since. So I think it's safe to retire those export types now. The patchset looks good so far. After Miroslav's comments are addressed, I'll wait an extra day or two in case there are more comments before queueing them onto modules-next. I can take the first two patches as well provided the acks are there (I think patch 2 is missing Daniel Vetter's ack from v1 of the series, but I'll add that back in). Thanks, Jessica
Re: [PATCH 04/13] module: use RCU to synchronize find_module
+++ Miroslav Benes [29/01/21 16:29 +0100]: On Thu, 28 Jan 2021, Christoph Hellwig wrote: Allow for a RCU-sched critical section around find_module, following the lower level find_module_all helper, and switch the two callers outside of module.c to use such a RCU-sched critical section instead of module_mutex. That's a nice idea. @@ -57,7 +58,7 @@ static void klp_find_object_module(struct klp_object *obj) if (!klp_is_module(obj)) return; - mutex_lock(&module_mutex); + rcu_read_lock_sched(); /* * We do not want to block removal of patched modules and therefore * we do not take a reference here. The patches are removed by @@ -74,7 +75,7 @@ static void klp_find_object_module(struct klp_object *obj) if (mod && mod->klp_alive) RCU always baffles me a bit, so I'll ask. Don't we need rcu_dereference_sched() here? "mod" comes from a RCU-protected list, so I wonder. Same here :-) I had to double check the RCU documentation. For our modules list case I believe the rcu list API should take care of that for us. Worth noting is this snippet from Documentation/RCU/whatisRCU.txt: rcu_dereference() is typically used indirectly, via the _rcu list-manipulation primitives, such as list_for_each_entry_rcu()
Re: [PATCH 13/13] module: remove EXPORY_UNUSED_SYMBOL*
+++ Christoph Hellwig [21/01/21 08:49 +0100]: EXPORT_UNUSED_SYMBOL* is not actually used anywhere. Remove the unused functionality as we generally just remove unused code anyway. Signed-off-by: Christoph Hellwig --- arch/arm/configs/bcm2835_defconfig | 1 - arch/arm/configs/mxs_defconfig | 1 - arch/mips/configs/nlm_xlp_defconfig | 1 - arch/mips/configs/nlm_xlr_defconfig | 1 - arch/parisc/configs/generic-32bit_defconfig | 1 - arch/parisc/configs/generic-64bit_defconfig | 1 - arch/powerpc/configs/ppc6xx_defconfig | 1 - arch/s390/configs/debug_defconfig | 1 - arch/s390/configs/defconfig | 1 - arch/sh/configs/edosk7760_defconfig | 1 - arch/sh/configs/sdk7780_defconfig | 1 - arch/x86/configs/i386_defconfig | 1 - arch/x86/configs/x86_64_defconfig | 1 - arch/x86/tools/relocs.c | 4 +- include/asm-generic/vmlinux.lds.h | 28 - include/linux/export.h | 8 --- include/linux/module.h | 13 init/Kconfig| 17 - kernel/module.c | 69 ++--- scripts/checkpatch.pl | 6 +- scripts/mod/modpost.c | 39 +--- scripts/mod/modpost.h | 2 - scripts/module.lds.S| 4 -- tools/include/linux/export.h| 2 - 24 files changed, 13 insertions(+), 192 deletions(-) diff --git a/arch/arm/configs/bcm2835_defconfig b/arch/arm/configs/bcm2835_defconfig index 44ff9cd88d8161..d6c6c2e031c43a 100644 --- a/arch/arm/configs/bcm2835_defconfig +++ b/arch/arm/configs/bcm2835_defconfig @@ -177,7 +177,6 @@ CONFIG_BOOT_PRINTK_DELAY=y CONFIG_DYNAMIC_DEBUG=y CONFIG_DEBUG_INFO=y # CONFIG_ENABLE_MUST_CHECK is not set -CONFIG_UNUSED_SYMBOLS=y CONFIG_DEBUG_MEMORY_INIT=y CONFIG_LOCKUP_DETECTOR=y CONFIG_SCHED_TRACER=y diff --git a/arch/arm/configs/mxs_defconfig b/arch/arm/configs/mxs_defconfig index a9c6f32a9b1c9d..ca32446b187f5d 100644 --- a/arch/arm/configs/mxs_defconfig +++ b/arch/arm/configs/mxs_defconfig @@ -164,7 +164,6 @@ CONFIG_FONTS=y CONFIG_PRINTK_TIME=y CONFIG_DEBUG_INFO=y CONFIG_FRAME_WARN=2048 -CONFIG_UNUSED_SYMBOLS=y CONFIG_MAGIC_SYSRQ=y CONFIG_DEBUG_KERNEL=y CONFIG_SOFTLOCKUP_DETECTOR=y diff --git a/arch/mips/configs/nlm_xlp_defconfig b/arch/mips/configs/nlm_xlp_defconfig index 72a211d2d556fd..32c29061172325 100644 --- a/arch/mips/configs/nlm_xlp_defconfig +++ b/arch/mips/configs/nlm_xlp_defconfig @@ -549,7 +549,6 @@ CONFIG_PRINTK_TIME=y CONFIG_DEBUG_INFO=y # CONFIG_ENABLE_MUST_CHECK is not set CONFIG_FRAME_WARN=1024 -CONFIG_UNUSED_SYMBOLS=y CONFIG_DEBUG_MEMORY_INIT=y CONFIG_DETECT_HUNG_TASK=y CONFIG_SCHEDSTATS=y diff --git a/arch/mips/configs/nlm_xlr_defconfig b/arch/mips/configs/nlm_xlr_defconfig index 4ecb157e56d427..bf9b9244929ecd 100644 --- a/arch/mips/configs/nlm_xlr_defconfig +++ b/arch/mips/configs/nlm_xlr_defconfig @@ -500,7 +500,6 @@ CONFIG_CRC7=m CONFIG_PRINTK_TIME=y CONFIG_DEBUG_INFO=y # CONFIG_ENABLE_MUST_CHECK is not set -CONFIG_UNUSED_SYMBOLS=y CONFIG_DEBUG_MEMORY_INIT=y CONFIG_DETECT_HUNG_TASK=y CONFIG_SCHEDSTATS=y diff --git a/arch/parisc/configs/generic-32bit_defconfig b/arch/parisc/configs/generic-32bit_defconfig index 3cbcfad5f7249d..7611d48c599e01 100644 --- a/arch/parisc/configs/generic-32bit_defconfig +++ b/arch/parisc/configs/generic-32bit_defconfig @@ -22,7 +22,6 @@ CONFIG_PCI_LBA=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_MODULE_FORCE_UNLOAD=y -CONFIG_UNUSED_SYMBOLS=y # CONFIG_BLK_DEV_BSG is not set # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set CONFIG_BINFMT_MISC=m diff --git a/arch/parisc/configs/generic-64bit_defconfig b/arch/parisc/configs/generic-64bit_defconfig index 8f81fcbf04c413..53054b81461a10 100644 --- a/arch/parisc/configs/generic-64bit_defconfig +++ b/arch/parisc/configs/generic-64bit_defconfig @@ -31,7 +31,6 @@ CONFIG_MODULE_FORCE_LOAD=y CONFIG_MODULE_UNLOAD=y CONFIG_MODULE_FORCE_UNLOAD=y CONFIG_MODVERSIONS=y -CONFIG_UNUSED_SYMBOLS=y CONFIG_BLK_DEV_INTEGRITY=y CONFIG_BINFMT_MISC=m # CONFIG_COMPACTION is not set diff --git a/arch/powerpc/configs/ppc6xx_defconfig b/arch/powerpc/configs/ppc6xx_defconfig index ef09f3cce1fa85..34c3859040f9f7 100644 --- a/arch/powerpc/configs/ppc6xx_defconfig +++ b/arch/powerpc/configs/ppc6xx_defconfig @@ -1072,7 +1072,6 @@ CONFIG_NLS_ISO8859_15=m CONFIG_NLS_KOI8_R=m CONFIG_NLS_KOI8_U=m CONFIG_DEBUG_INFO=y -CONFIG_UNUSED_SYMBOLS=y CONFIG_HEADERS_INSTALL=y CONFIG_MAGIC_SYSRQ=y CONFIG_DEBUG_KERNEL=y diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig index c4f6ff98a612cd..58e54d17e3154b 100644 --- a/arch/s390/configs/debug_defconfig +++ b/arch/s390/configs/debug_defconfig @@ -71,7 +71,6 @@ CONFIG_MODULE_FORCE_UNLOAD=y CONFIG_MODVERSIONS=y CONFIG_MODULE_SRCVERSION_ALL=y CONFIG_MODULE_SIG_SHA256=y -CONFIG_UNUSED_SYMBOLS=y CONFIG_BLK_DEV_INTEGRITY
Re: [PATCH 04/13] livepatch: move klp_find_object_module to module.c
+++ Christoph Hellwig [21/01/21 08:49 +0100]: To uncouple the livepatch code from module loader internals move a slightly refactored version of klp_find_object_module to module.c This allows to mark find_module static and removes one of the last users of module_mutex outside of module.c. Signed-off-by: Christoph Hellwig --- include/linux/module.h | 3 +-- kernel/livepatch/core.c | 39 +-- kernel/module.c | 17 - 3 files changed, 30 insertions(+), 29 deletions(-) diff --git a/include/linux/module.h b/include/linux/module.h index b4654f8a408134..8588482bde4116 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -586,8 +586,7 @@ static inline bool within_module(unsigned long addr, const struct module *mod) return within_module_init(addr, mod) || within_module_core(addr, mod); } -/* Search for module by name: must hold module_mutex. */ -struct module *find_module(const char *name); +struct module *find_klp_module(const char *name); /* Check if a module is loaded. */ bool module_loaded(const char *name); diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c index a7f625dc24add3..878759baadd81c 100644 --- a/kernel/livepatch/core.c +++ b/kernel/livepatch/core.c @@ -49,30 +49,6 @@ static bool klp_is_module(struct klp_object *obj) return obj->name; } -/* sets obj->mod if object is not vmlinux and module is found */ -static void klp_find_object_module(struct klp_object *obj) -{ - struct module *mod; - - mutex_lock(&module_mutex); - /* -* We do not want to block removal of patched modules and therefore -* we do not take a reference here. The patches are removed by -* klp_module_going() instead. -*/ - mod = find_module(obj->name); - /* -* Do not mess work of klp_module_coming() and klp_module_going(). -* Note that the patch might still be needed before klp_module_going() -* is called. Module functions can be called even in the GOING state -* until mod->exit() finishes. This is especially important for -* patches that modify semantic of the functions. -*/ - if (mod && mod->klp_alive) - obj->mod = mod; - mutex_unlock(&module_mutex); -} Hmm, I am not a huge fan of moving more livepatch code into module.c, I wonder if we can keep them separate. Why not have module_is_loaded() kill two birds with one stone? That is, just have it return a module pointer to signify that the module is loaded, NULL if not. Then we don't need an extra find_klp_module() function just to call find_module() and return a pointer, as module_is_loaded() can just do that for us. As for the mod->klp_alive check, I believe this function (klp_find_object_module()) is called with klp_mutex held, and mod->klp_alive is only modified under klp_mutex. Also, if klp_alive is true, the module is at least COMING and cannot be GOING until it acquires the klp_mutex again in klp_module_going(). So does that hunk really need to be under module_mutex? It has been a long time since I've looked at livepatch code so it would be great if someone could double check. Jessica
Re: [PATCH v3] module: harden ELF info handling
+++ Frank van der Linden [14/01/21 22:21 +]: 5fdc7db644 ("module: setup load info before module_sig_check()") moved the ELF setup, so that it was done before the signature check. This made the module name available to signature error messages. However, the checks for ELF correctness in setup_load_info are not sufficient to prevent bad memory references due to corrupted offset fields, indices, etc. So, there's a regression in behavior here: a corrupt and unsigned (or badly signed) module, which might previously have been rejected immediately, can now cause an oops/crash. Harden ELF handling for module loading by doing the following: - Move the signature check back up so that it comes before ELF initialization. It's best to do the signature check to see if we can trust the module, before using the ELF structures inside it. This also makes checks against info->len more accurate again, as this field will be reduced by the length of the signature in mod_check_sig(). The module name is now once again not available for error messages during the signature check, but that seems like a fair tradeoff. - Check if sections have offset / size fields that at least don't exceed the length of the module. - Check if sections have section name offsets that don't fall outside the section name table. - Add a few other sanity checks against invalid section indices, etc. This is not an exhaustive consistency check, but the idea is to at least get through the signature and blacklist checks without crashing because of corrupted ELF info, and to error out gracefully for most issues that would have caused problems later on. Fixes: 5fdc7db644 ("module: setup load info before module_sig_check()") Signed-off-by: Frank van der Linden Queued on modules-next. Thanks Frank! Jessica
Re: [PATCH v3] module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
+++ Fangrui Song [15/01/21 11:52 -0800]: clang-12 -fno-pic (since https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6) can emit `call __stack_chk_fail@PLT` instead of `call __stack_chk_fail` on x86. The two forms should have identical behaviors on x86-64 but the former causes GNU as<2.37 to produce an unreferenced undefined symbol _GLOBAL_OFFSET_TABLE_. (On x86-32, there is an R_386_PC32 vs R_386_PLT32 difference but the linker behavior is identical as far as Linux kernel is concerned.) Simply ignore _GLOBAL_OFFSET_TABLE_ for now, like what scripts/mod/modpost.c:ignore_undef_symbol does. This also fixes the problem for gcc/clang -fpie and -fpic, which may emit `call foo@PLT` for external function calls on x86. Note: ld -z defs and dynamic loaders do not error for unreferenced undefined symbols so the module loader is reading too much. If we ever need to ignore more symbols, the code should be refactored to ignore unreferenced symbols. Reported-by: Marco Elver Link: https://github.com/ClangBuiltLinux/linux/issues/1250 Signed-off-by: Fangrui Song Reviewed-by: Nick Desaulniers Tested-by: Marco Elver Cc: --- Changes in v2: * Fix Marco's email address * Add a function ignore_undef_symbol similar to scripts/mod/modpost.c:ignore_undef_symbol --- Changes in v3: * Fix the style of a multi-line comment. * Use static bool ignore_undef_symbol. Patch has been queued up on modules-next: https://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux.git/commit/?h=modules-next&id=ebfac7b778fac8b0e8e92ec91d0b055f046b4604 Thanks! Jessica
Re: [PATCH v2] module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
+++ Marco Elver [15/01/21 08:03 +0100]: On Thu, 14 Jan 2021 at 22:54, Fangrui Song wrote: clang-12 -fno-pic (since https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6) can emit `call __stack_chk_fail@PLT` instead of `call __stack_chk_fail` on x86. The two forms should have identical behaviors on x86-64 but the former causes GNU as<2.37 to produce an unreferenced undefined symbol _GLOBAL_OFFSET_TABLE_. (On x86-32, there is an R_386_PC32 vs R_386_PLT32 difference but the linker behavior is identical as far as Linux kernel is concerned.) Simply ignore _GLOBAL_OFFSET_TABLE_ for now, like what scripts/mod/modpost.c:ignore_undef_symbol does. This also fixes the problem for gcc/clang -fpie and -fpic, which may emit `call foo@PLT` for external function calls on x86. Note: ld -z defs and dynamic loaders do not error for unreferenced undefined symbols so the module loader is reading too much. If we ever need to ignore more symbols, the code should be refactored to ignore unreferenced symbols. Reported-by: Marco Elver Link: https://github.com/ClangBuiltLinux/linux/issues/1250 Signed-off-by: Fangrui Song Tested-by: Marco Elver Thank you for the patch! --- kernel/module.c | 20 ++-- 1 file changed, 18 insertions(+), 2 deletions(-) --- Changes in v2: * Fix Marco's email address * Add a function ignore_undef_symbol similar to scripts/mod/modpost.c:ignore_undef_symbol diff --git a/kernel/module.c b/kernel/module.c index 4bf30e4b3eaa..278f5129bde2 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2348,6 +2348,20 @@ static int verify_exported_symbols(struct module *mod) return 0; } +static int ignore_undef_symbol(Elf_Half emachine, const char *name) Why not 'bool' return-type? +{ + /* On x86, PIC code and Clang non-PIC code may have call foo@PLT. GNU as Not sure if checkpatch.pl warns about this, but this multi-line comment does not follow the normal kernel-style (see elsewhere in file): /* * ... */ +1 to Marco's comments. Otherwise, patch looks good to me. Thanks Fangrui! Jessica
Re: [PATCH v2] module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
+++ Nick Desaulniers [14/01/21 14:01 -0800]: On Thu, Jan 14, 2021 at 1:54 PM 'Fangrui Song' via Clang Built Linux wrote: clang-12 -fno-pic (since https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6) can emit `call __stack_chk_fail@PLT` instead of `call __stack_chk_fail` on x86. The two forms should have identical behaviors on x86-64 but the former causes GNU as<2.37 to produce an unreferenced undefined symbol _GLOBAL_OFFSET_TABLE_. (On x86-32, there is an R_386_PC32 vs R_386_PLT32 difference but the linker behavior is identical as far as Linux kernel is concerned.) Simply ignore _GLOBAL_OFFSET_TABLE_ for now, like what scripts/mod/modpost.c:ignore_undef_symbol does. This also fixes the problem for gcc/clang -fpie and -fpic, which may emit `call foo@PLT` for external function calls on x86. Note: ld -z defs and dynamic loaders do not error for unreferenced undefined symbols so the module loader is reading too much. If we ever need to ignore more symbols, the code should be refactored to ignore unreferenced symbols. Reported-by: Marco Elver Link: https://github.com/ClangBuiltLinux/linux/issues/1250 Signed-off-by: Fangrui Song Thanks for the patch. Reviewed-by: Nick Desaulniers Jessica, would you mind adding when applying: Cc: as I suspect we might want this fixed in stable tree's branches, too. It might of interest to add: Link: https://sourceware.org/bugzilla/show_bug.cgi?id=27178 too. Sure, will do! Thanks, Jessica
Re: [PATCH] module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
+++ Fāng-ruì Sòng [14/01/21 08:57 -0800]: On Thu, Jan 14, 2021 at 6:06 AM Jessica Yu wrote: +++ Fangrui Song [13/01/21 21:48 -0800]: >clang-12 -fno-pic (since >https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6) >can emit `call __stack_chk_fail@PLT` instead of `call __stack_chk_fail` >on x86. The two forms should have identical behaviors on x86-64 but the >former causes GNU as<2.37 to produce an unreferenced undefined symbol >_GLOBAL_OFFSET_TABLE_. > >(On x86-32, there is an R_386_PC32 vs R_386_PLT32 difference but the >linker behavior is identical as far as Linux kernel is concerned.) > >Simply ignore _GLOBAL_OFFSET_TABLE_ for now, like what >scripts/mod/modpost.c:ignore_undef_symbol does. This also fixes the >problem for gcc/clang -fpie and -fpic, which may emit `call foo@PLT` for >external function calls on x86. > >Note: ld -z defs and dynamic loaders do not error for unreferenced >undefined symbols so the module loader is reading too much. If we ever >need to ignore more symbols, the code should be refactored to ignore >unreferenced symbols. > >Reported-by: Marco Elver >Link: https://github.com/ClangBuiltLinux/linux/issues/1250 >Signed-off-by: Fangrui Song >--- > kernel/module.c | 10 -- > 1 file changed, 8 insertions(+), 2 deletions(-) > >diff --git a/kernel/module.c b/kernel/module.c >index 4bf30e4b3eaa..2e2deea99289 100644 >--- a/kernel/module.c >+++ b/kernel/module.c >@@ -2395,8 +2395,14 @@ static int simplify_symbols(struct module *mod, const struct load_info *info) > break; > } > >- /* Ok if weak. */ >- if (!ksym && ELF_ST_BIND(sym[i].st_info) == STB_WEAK) >+ /* Ok if weak. Also allow _GLOBAL_OFFSET_TABLE_: >+ * GNU as before 2.37 produces an unreferenced _GLOBAL_OFFSET_TABLE_ >+ * for call foo@PLT on x86-64. If the code ever needs to ignore >+ * more symbols, refactor the code to only warn if referenced by >+ * a relocation. >+ */ >+ if (!ksym && (ELF_ST_BIND(sym[i].st_info) == STB_WEAK || >+!strcmp(name, "_GLOBAL_OFFSET_TABLE_"))) > break; Hi Fangrui, Thanks for the patch. I am puzzled why we don't already mirror modpost here, that particular line of code in modpost to ignore _GLOBAL_OFFSET_TABLE_ has been there long before my time. Let's properly mirror modpost then, and create a similar helper function ignore_undef_symbol() (and stick the _GLOBAL_OFFSET_TABLE_ check in there) to account for future cases like this. Thanks, Jessica Hi Jessica, I guess __this_module in scripts/mod/modpost.c:ignore_undef_symbol is not a problem. For PPC64 _restgpr0_* and _savegpr0_*, I am not sure ignoring the undefined functions in kernel/module.c is right. (I know they can be produced by gcc -Os in some cases (https://reviews.llvm.org/D79977), but I want to learn whether that is a real issue before adding them.) If we ever need to ignore more symbols, the code should be refactored to not warn for unreferenced undefined symbols as my description says. Hi Fangrui, Sorry for the confusion, I did not mean to exactly mirror ignore_undef_symbol() in modpost. The conditions are different there and not all of them would not apply to kernel/module.c. Like __this_module, as you say, is not a problem as this will be resolved once all the module *.o are linked in the final .ko. So when it reaches the module loader it would no longer be UNDEF. I assume that it is a similar situation for the PPC64 symbols. What I meant that we could probably make this patch look nicer by moving the hardcoded check for "_GLOBAL_OFFSET_TABLE_" to a helper function ignore_undef_symbol(), similar to how modpost does it, without adding any additional conditions for now. And yes, if we ever need to extend this and refactor to not warn for unreferenced undefined symbols, we should refactor to generalize this case, but for the scope of this patch I think the suggested change is sufficient for now. Thank you, Jessica
Re: [PATCH] module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
+++ Fangrui Song [13/01/21 21:48 -0800]: clang-12 -fno-pic (since https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6) can emit `call __stack_chk_fail@PLT` instead of `call __stack_chk_fail` on x86. The two forms should have identical behaviors on x86-64 but the former causes GNU as<2.37 to produce an unreferenced undefined symbol _GLOBAL_OFFSET_TABLE_. (On x86-32, there is an R_386_PC32 vs R_386_PLT32 difference but the linker behavior is identical as far as Linux kernel is concerned.) Simply ignore _GLOBAL_OFFSET_TABLE_ for now, like what scripts/mod/modpost.c:ignore_undef_symbol does. This also fixes the problem for gcc/clang -fpie and -fpic, which may emit `call foo@PLT` for external function calls on x86. Note: ld -z defs and dynamic loaders do not error for unreferenced undefined symbols so the module loader is reading too much. If we ever need to ignore more symbols, the code should be refactored to ignore unreferenced symbols. Reported-by: Marco Elver Link: https://github.com/ClangBuiltLinux/linux/issues/1250 Signed-off-by: Fangrui Song --- kernel/module.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index 4bf30e4b3eaa..2e2deea99289 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2395,8 +2395,14 @@ static int simplify_symbols(struct module *mod, const struct load_info *info) break; } - /* Ok if weak. */ - if (!ksym && ELF_ST_BIND(sym[i].st_info) == STB_WEAK) + /* Ok if weak. Also allow _GLOBAL_OFFSET_TABLE_: +* GNU as before 2.37 produces an unreferenced _GLOBAL_OFFSET_TABLE_ +* for call foo@PLT on x86-64. If the code ever needs to ignore +* more symbols, refactor the code to only warn if referenced by +* a relocation. +*/ + if (!ksym && (ELF_ST_BIND(sym[i].st_info) == STB_WEAK || + !strcmp(name, "_GLOBAL_OFFSET_TABLE_"))) break; Hi Fangrui, Thanks for the patch. I am puzzled why we don't already mirror modpost here, that particular line of code in modpost to ignore _GLOBAL_OFFSET_TABLE_ has been there long before my time. Let's properly mirror modpost then, and create a similar helper function ignore_undef_symbol() (and stick the _GLOBAL_OFFSET_TABLE_ check in there) to account for future cases like this. Thanks, Jessica
Re: [PATCH v2] module: harden ELF info handling
+++ Frank van der Linden [07/01/21 19:30 +]: 5fdc7db644 ("module: setup load info before module_sig_check()") moved the ELF setup, so that it was done before the signature check. This made the module name available to signature error messages. However, the checks for ELF correctness in setup_load_info are not sufficient to prevent bad memory references due to corrupted offset fields, indices, etc. So, there's a regression in behavior here: a corrupt and unsigned (or badly signed) module, which might previously have been rejected immediately, can now cause an oops/crash. Harden ELF handling for module loading by doing the following: - Move the signature check back up so that it comes before ELF initialization. It's best to do the signature check to see if we can trust the module, before using the ELF structures inside it. This also makes checks against info->len more accurate again, as this field will be reduced by the length of the signature in mod_check_sig(). The module name is now once again not available for error messages during the signature check, but that seems like a fair tradeoff. - Check if sections have offset / size fields that at least don't exceed the length of the module. - Check if sections have section name offsets that don't fall outside the section name table. - Add a few other sanity checks against invalid section indices, etc. This is not an exhaustive consistency check, but the idea is to at least get through the signature and blacklist checks without crashing because of corrupted ELF info, and to error out gracefully for most issues that would have caused problems later on. Fixes: 5fdc7db644 ("module: setup load info before module_sig_check()") Signed-off-by: Frank van der Linden --- kernel/module.c | 143 -- kernel/module_signature.c | 2 +- kernel/module_signing.c | 2 +- 3 files changed, 126 insertions(+), 21 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index 4bf30e4b3eaa..34fc6c85eb65 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2964,7 +2964,7 @@ static int module_sig_check(struct load_info *info, int flags) } if (is_module_sig_enforced()) { - pr_notice("%s: loading of %s is rejected\n", info->name, reason); + pr_notice("Loading of %s is rejected\n", reason); return -EKEYREJECTED; } @@ -2977,9 +2977,33 @@ static int module_sig_check(struct load_info *info, int flags) } #endif /* !CONFIG_MODULE_SIG */ -/* Sanity checks against invalid binaries, wrong arch, weird elf version. */ -static int elf_header_check(struct load_info *info) +static int validate_section_offset(struct load_info *info, Elf_Shdr *shdr) { + unsigned long secend; + + /* +* Check for both overflow and offset/size being +* too large. +*/ + secend = shdr->sh_offset + shdr->sh_size; + if (secend < shdr->sh_offset || secend > info->len) + return -ENOEXEC; + + return 0; +} + +/* + * Sanity checks against invalid binaries, wrong arch, weird elf version. + * + * Also do basic validity checks against section offsets and sizes, the + * section name string table, and the indices used for it (sh_name). + */ +static int elf_validity_check(struct load_info *info) +{ + unsigned int i; + Elf_Shdr *shdr, *strhdr; + int err; + if (info->len < sizeof(*(info->hdr))) return -ENOEXEC; @@ -2989,11 +3013,78 @@ static int elf_header_check(struct load_info *info) || info->hdr->e_shentsize != sizeof(Elf_Shdr)) return -ENOEXEC; + /* +* e_shnum is 16 bits, and sizeof(Elf_Shdr) is +* known and small. So e_shnum * sizeof(Elf_Shdr) +* will not overflow unsigned long on any platform. +*/ if (info->hdr->e_shoff >= info->len || (info->hdr->e_shnum * sizeof(Elf_Shdr) > info->len - info->hdr->e_shoff)) return -ENOEXEC; + info->sechdrs = (void *)info->hdr + info->hdr->e_shoff; + + /* +* Verify if the section name table index is valid. +*/ + if (info->hdr->e_shstrndx == SHN_UNDEF + || info->hdr->e_shstrndx >= info->hdr->e_shnum) + return -ENOEXEC; + + strhdr = &info->sechdrs[info->hdr->e_shstrndx]; + err = validate_section_offset(info, strhdr); + if (err < 0) + return err; + + /* +* The section name table must be NUL-terminated, as required +* by the spec. This makes strcmp and pr_* calls that access +* strings in the section safe. +*/ + info->secstrings = (void *)info->hdr + strhdr->sh_offset; + if (info->secstrings[strhdr->sh_size - 1] != '\0') + return -ENOEXEC; + + /* +* The code assumes that section 0 has a length of zero and +* an addr of zero, so check for it. +
Re: [PATCH v5] modules: introduce the MODULE_SCMVERSION config
+++ Will McVicker [08/01/21 00:30 +]: Config MODULE_SCMVERSION introduces a new module attribute -- `scmversion` -- which can be used to identify a given module's SCM version. This is very useful for developers that update their kernel independently from their kernel modules or vice-versa since the SCM version provided by UTS_RELEASE (`uname -r`) will now differ from the module's vermagic attribute. For example, we have a CI setup that tests new kernel changes on the hikey960 and db845c devices without updating their kernel modules. When these tests fail, we need to be able to identify the exact device configuration the test was using. By including MODULE_SCMVERSION, we can identify the exact kernel and modules' SCM versions for debugging the failures. Additionally, by exposing the SCM version via the sysfs node /sys/module/MODULENAME/scmversion, one can also verify the SCM versions of the modules loaded from the initramfs. Currently, modinfo can only retrieve module attributes from the module's ko on disk and not from the actual module that is loaded in RAM. You can retrieve the SCM version in two ways, 1) By using modinfo: > modinfo -F scmversion MODULENAME 2) By module sysfs node: > cat /sys/module/MODULENAME/scmversion Signed-off-by: Will McVicker Hi Will, Thanks for v5, I'm fine with this patch now that we've made it a configurable developer/debug option to supply an scmversion field for in-tree modules (although, this currently works for out-of-tree modules too since we based module_srcpath on KBUILD_EXTMOD in the external module case). I basically see this option as an alternative to CONFIG_MODULE_SRCVERSION_ALL to aid distro developers debug issues when the kernel and in-tree modules are updated separately. In any case, since there was pushback in our earlier discussions, I'd like to ask if there are any remaining objections to this patch. Masahiro, are you fine with the Makefile and modpost changes? Thanks, Jessica --- Documentation/ABI/stable/sysfs-module | 18 ++ include/linux/module.h| 1 + init/Kconfig | 14 ++ kernel/module.c | 2 ++ scripts/Makefile.modpost | 22 ++ scripts/mod/modpost.c | 24 +++- 6 files changed, 80 insertions(+), 1 deletion(-) diff --git a/Documentation/ABI/stable/sysfs-module b/Documentation/ABI/stable/sysfs-module index 6272ae5fb366..a75d137e79f4 100644 --- a/Documentation/ABI/stable/sysfs-module +++ b/Documentation/ABI/stable/sysfs-module @@ -32,3 +32,21 @@ Description: Note: If the module is built into the kernel, or if the CONFIG_MODULE_UNLOAD kernel configuration value is not enabled, this file will not be present. + +What: /sys/module/MODULENAME/scmversion +Date: November 2020 +KernelVersion: 5.12 +Contact: Will McVicker +Description: This read-only file will appear if modpost was supplied with an + SCM version for the module. It can be enabled with the config + MODULE_SCMVERSION. The SCM version is retrieved by + scripts/setlocalversion, which means that the presence of this + file depends on CONFIG_LOCALVERSION_AUTO=y. When read, the SCM + version that the module was compiled with is returned. The SCM + version is returned in the following format:: + + === + Git:g[a-f0-9]\+(-dirty)\? + Mercurial: hg[a-f0-9]\+(-dirty)\? + Subversion: svn[0-9]\+ + === diff --git a/include/linux/module.h b/include/linux/module.h index 7a0bcb5b1ffc..3b1612193cf9 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -372,6 +372,7 @@ struct module { struct module_attribute *modinfo_attrs; const char *version; const char *srcversion; + const char *scmversion; struct kobject *holders_dir; /* Exported symbols */ diff --git a/init/Kconfig b/init/Kconfig index b77c60f8b963..3d9dac3c4e8f 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2131,6 +2131,20 @@ config MODULE_SRCVERSION_ALL the version). With this option, such a "srcversion" field will be created for all modules. If unsure, say N. +config MODULE_SCMVERSION + bool "SCM version for modules" + depends on LOCALVERSION_AUTO + help + This enables the module attribute "scmversion" which can be used + by developers to identify the SCM version of a given module, e.g. + git sha1 or hg sha1. The SCM version can be queried by modinfo or + via the sysfs node: /sys/modules/MODULENAME/scmversion. This is + useful when the kernel or kernel modules are updated separately + since that causes the vermagic of the kernel and the module to + differ. + + If unsure,
Re: Linux Kernel module notification bug
+++ Adam Zabrocki [12/01/21 01:15 +0100]: Hello, On Mon, Jan 11, 2021 at 03:20:48PM +0100, Jessica Yu wrote: +++ Adam Zabrocki [10/01/21 18:54 +0100]: > Hello, > > I believe that the following commit does introduce a gentle "functionality > bug": > > "module: delay kobject uevent until after module init call": > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/kernel/module.c?id=38dc717e97153e46375ee21797aa54777e5498f3 > > The official Linux Kernel API for the kernel module activities notification has > been divided based on the readiness 'stage' for such module. We have the > following stages: > >MODULE_STATE_LIVE, /* Normal state. */ >MODULE_STATE_COMING,/* Full formed, running module_init. */ >MODULE_STATE_GOING, /* Going away. */ >MODULE_STATE_UNFORMED, /* Still setting it up. */ > > LIVE means that the kernel module is correctly running and all initialization > work has been already done. Otherwise, we have event 'COMING' or 'UNFORMED'. > > In the described commit, creation of the KOBJECT has been moved after invoking > a notficiation of the newly formed kernel module state (LIVE). That's somehow > inconsistent from my understanding of the kernel modules notifiers logic. > Creation of the new objects (like KOBJ) should be done before notification of > the stage LIVE is invoked. I'm confused. We're not creating any kobjects here. That is all done in mod_sysfs_setup(), which is called while the module is still COMING. What that commit does is delay telling userspace about the module (specifically, systemd/udev) until the module is basically ready. Systemd was basically receiving the uevent too early, before the module has initialized, hence we decided to delay the uevent [1]. Sorry for the confusion on my side. I was referring to the internal state of the KOBJ itself which is being actively modified when uevent is sent. During the module creation KOBJ_ADD modifies 'kobj->state_add_uevent_sent'. Until recent commit, kernel didn't modify KOBJ after sending LIVE notification. > This commit breaks some of the projects that rely on the LIVE notification to > monitor when the newly loaded module is ready. Hm, could you please explain specifically what is the issue you're seeing? What projects is it breaking? I'm specifically referencing these projects which are tracking kernel modules for integrity purpose (e.g. anti-rootkit tools) like Linux Kernel Runtime Guard. It is possible to modify these tools to adopt and take into account modification of KOBJ after LIVE state notification. However, from my understanding of the kernel modules notifiers logic, KOBJ should be fully formed at this stage. Ah I see, thanks for the clarification. I was under the impression that kobj->state_add_uevent_sent is an internal field interesting only to lib/kobject.c, and did not know tools like you mention are implicitly tracking that. In any case, I think your suggested change doesn't affect the systemd/udev issue we were trying to fix, as long as the uevent is sent after module init(). Would you mind sending a patch? (with a Fixes: tag and including the explanations you've provided in this thread in the changelog) Thanks! Jessica
Re: Linux Kernel module notification bug
+++ Adam Zabrocki [10/01/21 18:54 +0100]: Hello, I believe that the following commit does introduce a gentle "functionality bug": "module: delay kobject uevent until after module init call": https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/kernel/module.c?id=38dc717e97153e46375ee21797aa54777e5498f3 The official Linux Kernel API for the kernel module activities notification has been divided based on the readiness 'stage' for such module. We have the following stages: MODULE_STATE_LIVE, /* Normal state. */ MODULE_STATE_COMING,/* Full formed, running module_init. */ MODULE_STATE_GOING, /* Going away. */ MODULE_STATE_UNFORMED, /* Still setting it up. */ LIVE means that the kernel module is correctly running and all initialization work has been already done. Otherwise, we have event 'COMING' or 'UNFORMED'. In the described commit, creation of the KOBJECT has been moved after invoking a notficiation of the newly formed kernel module state (LIVE). That's somehow inconsistent from my understanding of the kernel modules notifiers logic. Creation of the new objects (like KOBJ) should be done before notification of the stage LIVE is invoked. I'm confused. We're not creating any kobjects here. That is all done in mod_sysfs_setup(), which is called while the module is still COMING. What that commit does is delay telling userspace about the module (specifically, systemd/udev) until the module is basically ready. Systemd was basically receiving the uevent too early, before the module has initialized, hence we decided to delay the uevent [1]. This commit breaks some of the projects that rely on the LIVE notification to monitor when the newly loaded module is ready. Hm, could you please explain specifically what is the issue you're seeing? What projects is it breaking? Thanks, Jessica [1] https://github.com/systemd/systemd/issues/17586
Re: [PATCH v4] modules: introduce the MODULE_SCMVERSION config
+++ Will McVicker [06/01/21 10:44 -0800]: Thanks for the vacation notice Jessica! I'm just letting you know I'm back as well and am happy to respond to any concerns regarding v4 when you get all caught up. I hope you had a relaxing holiday :) Hi Will - thanks, same to you! On Fri, Dec 18, 2020 at 4:01 AM Jessica Yu wrote: +++ Will McVicker [16/12/20 22:08 +]: >Config MODULE_SCMVERSION introduces a new module attribute -- >`scmversion` -- which can be used to identify a given module's SCM >version. This is very useful for developers that update their kernel >independently from their kernel modules or vice-versa since the SCM >version provided by UTS_RELEASE (`uname -r`) will now differ from the >module's vermagic attribute. > >For example, we have a CI setup that tests new kernel changes on the >hikey960 and db845c devices without updating their kernel modules. When >these tests fail, we need to be able to identify the exact device >configuration the test was using. By including MODULE_SCMVERSION, we can >identify the exact kernel and modules' SCM versions for debugging the >failures. > >Additionally, by exposing the SCM version via the sysfs node >/sys/module/MODULENAME/scmversion, one can also verify the SCM versions >of the modules loaded from the initramfs. Currently, modinfo can only >retrieve module attributes from the module's ko on disk and not from the >actual module that is loaded in RAM. > >You can retrieve the SCM version in two ways, > >1) By using modinfo: >> modinfo -F scmversion MODULENAME >2) By module sysfs node: >> cat /sys/module/MODULENAME/scmversion > >Signed-off-by: Will McVicker >--- [ added back diff ] >Changelog since v3: >- Dropped [PATCH v2 1/2] scripts/setlocalversion: allow running in a subdir > > Documentation/ABI/stable/sysfs-module | 18 ++ > include/linux/module.h| 1 + > init/Kconfig | 12 > kernel/module.c | 2 ++ > scripts/Makefile.modpost | 22 ++ > scripts/mod/modpost.c | 24 +++- > 6 files changed, 78 insertions(+), 1 deletion(-) > >diff --git a/Documentation/ABI/stable/sysfs-module b/Documentation/ABI/stable/sysfs-module >index 6272ae5fb366..2ba731767737 100644 >--- a/Documentation/ABI/stable/sysfs-module >+++ b/Documentation/ABI/stable/sysfs-module >@@ -32,3 +32,21 @@ Description: >Note: If the module is built into the kernel, or if the >CONFIG_MODULE_UNLOAD kernel configuration value is not enabled, >this file will not be present. >+ >+What: /sys/module/MODULENAME/scmversion >+Date: November 2020 >+KernelVersion: 5.11 I guess we'll have to bump KernelVersion now (sorry about the timing!) >+Contact: Will McVicker >+Description: This read-only file will appear if modpost was supplied with an >+ SCM version for the module. It can be enabled with the config >+ MODULE_SCMVERSION. The SCM version is retrieved by >+ scripts/setlocalversion, which means that the presence of this >+ file depends on CONFIG_LOCALVERSION_AUTO=y or LOCALVERSION=. I think the "or LOCALVERSION=" part is inaccurate, right? We need just LOCALVERSION_AUTO for the full scm string for this to work. >+ When read, the SCM version that the module was compiled with is >+ returned. The SCM version is returned in the following format:: >+ >+ === >+ Git:g[a-f0-9]\+(-dirty)\? >+ Mercurial: hg[a-f0-9]\+(-dirty)\? >+ Subversion: svn[0-9]\+ >+ === >diff --git a/include/linux/module.h b/include/linux/module.h >index c4e7a887f469..6bd710308863 100644 >--- a/include/linux/module.h >+++ b/include/linux/module.h >@@ -372,6 +372,7 @@ struct module { >struct module_attribute *modinfo_attrs; >const char *version; >const char *srcversion; >+ const char *scmversion; >struct kobject *holders_dir; > >/* Exported symbols */ >diff --git a/init/Kconfig b/init/Kconfig >index b77c60f8b963..d9ae12f16ba2 100644 >--- a/init/Kconfig >+++ b/init/Kconfig >@@ -2131,6 +2131,18 @@ config MODULE_SRCVERSION_ALL > the version). With this option, such a "srcversion" field > will be created for all modules. If unsure, say N. > >+config MODULE_SCMVERSION >+ bool "SCM version for modules" >+ depends on LOCALVERSION_AUTO >+ help >+ This enables the module attribute "scmversion" which can be used >+ by developers to identify the SCM version of a given modu
Re: [PATCH] module: harden ELF info handling
Hi Frank, Sorry for the delay. I've just gotten back from vacation :-) +++ Frank van der Linden [21/12/20 23:49 +]: 5fdc7db644 ("module: setup load info before module_sig_check()") moved the ELF setup, so that it was done before the signature check. This made the module name available to signature error messages. However, the checks for ELF correctness in setup_load_info are not sufficient to prevent bad memory references due to corrupted offset fields, indices, etc. So, there's a regression in behavior here: a corrupt and unsigned (or badly signed) module, which might previously have been rejected immediately, can now cause an oops/crash. Harden ELF handling for module loading by doing the following: - Move the signature check back up so that it comes before ELF initialization. It's best to do the signature check to see if we can trust the module, before using the ELF structures inside it. This also makes checks against info->len more accurate again, as this field will be reduced by the length of the signature in mod_check_sig(). The module name is now once again not available for error messages during the signature check, but that seems like a fair tradeoff. I vaguely remember that I had made the module name available in response to a one-off request, IIRC someone had wanted the module name logged to be able to figure out which module(s) had failed signature verification. But I do agree with your line of reasoning, that we should probably not access internal module structures until we have verified that we can trust the module. It is a chicken and egg problem unfortunately. Although, it is probably worth it to trade ease of debugging for a more hardened approach. - Check if sections have offset / size fields that at least don't exceed the length of the module. - Check if sections have section name offsets that don't fall outside the section name table. - Add a few other sanity checks against invalid section indices, etc. This is not an exhaustive consistency check, but the idea is to at least get through the signature and blacklist checks without crashing because of corrupted ELF info, and to error out gracefully for most issues that would have caused problems later on. Fixes: 5fdc7db644 ("module: setup load info before module_sig_check()") Signed-off-by: Frank van der Linden --- kernel/module.c | 143 +- kernel/module_signature.c | 2 +- kernel/module_signing.c | 2 +- 3 files changed, 126 insertions(+), 21 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index 4bf30e4b3eaa..ef7681a22a1a 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2964,7 +2964,7 @@ static int module_sig_check(struct load_info *info, int flags) } if (is_module_sig_enforced()) { - pr_notice("%s: loading of %s is rejected\n", info->name, reason); + pr_notice("loading of %s is rejected\n", reason); Small nit: Let's start with a capital letter perhaps? Just to be consistent with the other log messages that don't start with a prefix. Same goes for the other pr_err()s below. return -EKEYREJECTED; } @@ -2977,9 +2977,33 @@ static int module_sig_check(struct load_info *info, int flags) } #endif /* !CONFIG_MODULE_SIG */ -/* Sanity checks against invalid binaries, wrong arch, weird elf version. */ -static int elf_header_check(struct load_info *info) +static int validate_section_offset(struct load_info *info, Elf_Shdr *shdr) { + unsigned long secend; + + /* +* Check for both overflow and offset/size being +* too large. +*/ + secend = shdr->sh_offset + shdr->sh_size; + if (secend < shdr->sh_offset || secend >= info->len) Should this not be secend > info->len? + return -ENOEXEC; + + return 0; +} + +/* + * Sanity checks against invalid binaries, wrong arch, weird elf version. + * + * Also do basic validity checks against section offsets and sizes, the + * section name string table, and the indices used for it (sh_name). + */ +static int elf_validity_check(struct load_info *info) +{ + unsigned int i; + Elf_Shdr *shdr, *strhdr; + int err; + if (info->len < sizeof(*(info->hdr))) return -ENOEXEC; @@ -2989,11 +3013,78 @@ static int elf_header_check(struct load_info *info) || info->hdr->e_shentsize != sizeof(Elf_Shdr)) return -ENOEXEC; + /* +* e_shnum is 16 bits, and sizeof(Elf_Shdr) is +* known and small. So e_shnum * sizeof(Elf_Shdr) +* will not overflow unsigned long on any platform. +*/ if (info->hdr->e_shoff >= info->len || (info->hdr->e_shnum * sizeof(Elf_Shdr) > info->len - info->hdr->e_shoff)) return -ENOEXEC; + info->sechdrs = (void *)info->hdr + info->hdr->e_shoff; + + /* +* Verify if the section name table index is va
Re: [PATCH v4] modules: introduce the MODULE_SCMVERSION config
+++ Will McVicker [16/12/20 22:08 +]: Config MODULE_SCMVERSION introduces a new module attribute -- `scmversion` -- which can be used to identify a given module's SCM version. This is very useful for developers that update their kernel independently from their kernel modules or vice-versa since the SCM version provided by UTS_RELEASE (`uname -r`) will now differ from the module's vermagic attribute. For example, we have a CI setup that tests new kernel changes on the hikey960 and db845c devices without updating their kernel modules. When these tests fail, we need to be able to identify the exact device configuration the test was using. By including MODULE_SCMVERSION, we can identify the exact kernel and modules' SCM versions for debugging the failures. Additionally, by exposing the SCM version via the sysfs node /sys/module/MODULENAME/scmversion, one can also verify the SCM versions of the modules loaded from the initramfs. Currently, modinfo can only retrieve module attributes from the module's ko on disk and not from the actual module that is loaded in RAM. You can retrieve the SCM version in two ways, 1) By using modinfo: > modinfo -F scmversion MODULENAME 2) By module sysfs node: > cat /sys/module/MODULENAME/scmversion Signed-off-by: Will McVicker Hi Will, thanks for v4. I'm on vacation for the next two weeks and with the current merge window almost over, I hope it'd be OK with you if we revisit this early January. Just wanted to give you a heads up. Thanks and happy holidays! Jessica
[GIT PULL] Modules updates for v5.11
Hi Linus, Please pull below to receive modules updates for the v5.11 merge window. A summary can be found in the signed tag. Thank you, Jessica --- The following changes since commit 3650b228f83adda7e5ee532e2b90429c03f7b9ec: Linux 5.10-rc1 (2020-10-25 15:14:11 -0700) are available in the Git repository at: https://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux.git/ tags/modules-for-v5.11 for you to fetch changes up to 38dc717e97153e46375ee21797aa54777e5498f3: module: delay kobject uevent until after module init call (2020-12-09 09:42:47 +0100) Modules updates for v5.11 Summary of modules changes for the 5.11 merge window: - Fix a race condition between systemd/udev and the module loader. The module loader was sending a uevent before the module was fully initialized (i.e., before its init function has been called). This means udev can start processing the module uevent before the module has finished initializing, and some udev rules expect that the module has initialized already upon receiving the uevent. This resulted in some systemd mount units failing if udev processes the event faster than the module can finish init. This is fixed by delaying the uevent until after the module has called its init routine. - Make the linker array sections for kernel params and module version attributes more robust by switching to use the alignment of the type in question. Namely, linker section arrays will be constructed using the alignment required by the struct (using __alignof__()) as opposed to a specific value such as sizeof(void *) or sizeof(long). This is less likely to cause breakages should the size of the type ever change (from Johan Hovold) - Fix module state inconsistency by setting it back to GOING when a module fails to load and is on its way out (from Miroslav Benes) - Some comment and code cleanups (from Sergey Shtylyov) Signed-off-by: Jessica Yu Jessica Yu (1): module: delay kobject uevent until after module init call Johan Hovold (7): module: drop version-attribute alignment module: simplify version-attribute handling params: drop redundant "unused" attributes params: use type alignment for kernel parameters params: clean up module-param macros init: use type alignment for kernel parameters module: drop semicolon from version macro Miroslav Benes (1): module: set MODULE_STATE_GOING state when a module fails to load Sergey Shtylyov (6): module: merge repetitive strings in module_sig_check() module: avoid *goto*s in module_sig_check() module: only handle errors with the *switch* statement in module_sig_check() module: fix up 'kernel-doc' comments module: add more 'kernel-doc' comments module: fix comment style include/linux/init.h| 2 +- include/linux/module.h | 28 +++ include/linux/moduleparam.h | 12 +-- kernel/module.c | 200 ++-- kernel/params.c | 10 +-- 5 files changed, 142 insertions(+), 110 deletions(-)
Re: static_branch_enable() does not work from a __init function?
+++ Peter Zijlstra [16/12/20 14:23 +0100]: On Wed, Dec 16, 2020 at 02:10:16PM +0100, Jessica Yu wrote: +++ Peter Zijlstra [16/12/20 13:47 +0100]: > Only because we're having .init=false, incorrectly. See the other email. Ah yeah, you're right. I also misread the intention of the if conditional :/ If we're currently running an init function it's fine, but after that it will be unsafe. Exactly, seeing how it'll end up being freed and such ;-) Btw, your patch seems to work for me, using the test module provided by Dexuan. Ah, excellent. I couldn't be bothered to figure out how to build a module ;-) I'll add your Tested-by and go queue it for /urgent I suppose. That's fine by me :-) Tested-by: Jessica Yu
Re: static_branch_enable() does not work from a __init function?
+++ Peter Zijlstra [16/12/20 13:47 +0100]: On Wed, Dec 16, 2020 at 12:55:25PM +0100, Jessica Yu wrote: +++ Peter Zijlstra [16/12/20 10:26 +0100]: [snip] > > PS, I originally found: in arch/x86/kvm/vmx/vmx.c: vmx_init(), it looks > > like the line "static_branch_enable(&enable_evmcs);" does not take effect > > in a v5.4-based kernel, but does take effect in the v5.10 kernel in the > > same x86-64 virtual machine on Hyper-V, so I made the above test module > > to test static_branch_enable(), and found that static_branch_enable() in > > the test module does not work with both v5.10 and my v5.4 kernel, if the > > __init marker is used. Because the jump label code currently does not allow you to update if the entry resides in an init section. By marking the module init section __init you place it in the .init.text section. jump_label_add_module() detects this (by calling within_module_init()) and marks the entry by calling jump_entry_set_init(). Then you have the following sequence of calls (roughly): static_branch_enable static_key_enable static_key_enable_cpuslocked jump_label_update jump_label_can_update jump_entry_is_init returns true, so bail out Judging from the comment in jump_label_can_update(), this seems to be intentional behavior: static bool jump_label_can_update(struct jump_entry *entry, bool init) { /* * Cannot update code that was in an init text area. */ if (!init && jump_entry_is_init(entry)) return false; Only because we're having .init=false, incorrectly. See the other email. Ah yeah, you're right. I also misread the intention of the if conditional :/ If we're currently running an init function it's fine, but after that it will be unsafe. Btw, your patch seems to work for me, using the test module provided by Dexuan.
Re: static_branch_enable() does not work from a __init function?
+++ Peter Zijlstra [16/12/20 10:26 +0100]: On Wed, Dec 16, 2020 at 03:54:29AM +, Dexuan Cui wrote: PS, I originally found: in arch/x86/kvm/vmx/vmx.c: vmx_init(), it looks like the line "static_branch_enable(&enable_evmcs);" does not take effect in a v5.4-based kernel, but does take effect in the v5.10 kernel in the same x86-64 virtual machine on Hyper-V, so I made the above test module to test static_branch_enable(), and found that static_branch_enable() in the test module does not work with both v5.10 and my v5.4 kernel, if the __init marker is used. By the way, it probably works now because there was a workaround merged in v5.10, that mentions this very issue: commit 064eedf2c50f692088e1418c553084bf9c1432f8 Author: Vitaly Kuznetsov Date: Wed Oct 14 16:33:46 2020 +0200 KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again It was noticed that evmcs_sanitize_exec_ctrls() is not being executed nowadays despite the code checking 'enable_evmcs' static key looking correct. Turns out, static key magic doesn't work in '__init' section (and it is unclear when things changed) but setup_vmcs_config() is called only once per CPU so we don't really need it to. Switch to checking 'enlightened_vmcs' instead, it is supposed to be in sync with 'enable_evmcs'. Opportunistically make evmcs_sanitize_exec_ctrls '__init' and drop unneeded extra newline from it. Reported-by: Yang Weijiang Signed-off-by: Vitaly Kuznetsov Message-Id: <20201014143346.2430936-1-vkuzn...@redhat.com> Signed-off-by: Paolo Bonzini
Re: static_branch_enable() does not work from a __init function?
+++ Peter Zijlstra [16/12/20 10:26 +0100]: [snip] PS, I originally found: in arch/x86/kvm/vmx/vmx.c: vmx_init(), it looks like the line "static_branch_enable(&enable_evmcs);" does not take effect in a v5.4-based kernel, but does take effect in the v5.10 kernel in the same x86-64 virtual machine on Hyper-V, so I made the above test module to test static_branch_enable(), and found that static_branch_enable() in the test module does not work with both v5.10 and my v5.4 kernel, if the __init marker is used. Because the jump label code currently does not allow you to update if the entry resides in an init section. By marking the module init section __init you place it in the .init.text section. jump_label_add_module() detects this (by calling within_module_init()) and marks the entry by calling jump_entry_set_init(). Then you have the following sequence of calls (roughly): static_branch_enable static_key_enable static_key_enable_cpuslocked jump_label_update jump_label_can_update jump_entry_is_init returns true, so bail out Judging from the comment in jump_label_can_update(), this seems to be intentional behavior: static bool jump_label_can_update(struct jump_entry *entry, bool init) { /* * Cannot update code that was in an init text area. */ if (!init && jump_entry_is_init(entry)) return false; By removing the __init marker you're bypassing the within_module_init() check and that's why it works.
Re: [PATCH v3 1/2] scripts/setlocalversion: allow running in a subdir
Hi Will, +++ Will McVicker [08/12/20 20:05 +]: Getting the scmversion using scripts/setlocalversion currently only works when run at the root of a git or mecurial project. This was introduced in commit 8558f59edf93 ("setlocalversion: Ignote SCMs above the linux source tree") so that if one is building within a subdir of a git tree that isn't the kernel git project, then the vermagic wouldn't include that git sha1. However, the proper solution to that is to just set this config in your defconfig: # CONFIG_LOCALVERSION_AUTO is not set which is already the default in many defconfigs: $ grep -r "CONFIG_LOCALVERSION_AUTO is not set" arch/* | wc -l 89 So let's bring back this functionality so that we can use scripts/setlocalversion to capture the SCM version of external modules that reside within subdirectories of an SCM project. Hm, this seems to essentially be a revert of commit 8558f59edf93. AFAICT from light testing it also reintroduces the issue it was originally trying to fix, no? From the reporter: Dan McGee writes: > Note that when in git, you get the appended "+" sign. If > LOCALVERSION_AUTO is set, you will get something like > "eee-gb01b08c-dirty" (whereas the copy of the tree in /tmp still > returns "eee"). It doesn't matter whether the working tree is dirty or > clean. > > Is there a way to disable this? I'm building from a clean tarball that > just happens to be unpacked inside a git repository. One would think > setting LOCALVERSION_AUTO to false would do it, but no such luck... Correct me if I'm wrong, but what I'm understanding is that the original reporter was having trouble with setlocalversion appending unwanted strings ("+" or "gXXX-dirty" etc) when building from a clean tarball that happens to live inside a git repo. Even if LOCALVERSION_AUTO is disabled it still appends the "+" string if the SCM above the linux source tree is not at an annotated tag. Since setlocalversion is getting confused by the presence of a different scm that commit fixed this by confining the checks to the root of the (possibly git managed) kernel source tree. Masahiro can probably better comment since he maintains scripts/*. In any case, this patch isn't of interest to in-tree modules, since we can generate the scmversion perfectly fine without it, so I doubt it's going to get any support here. Would you be fine with dropping the first patch or would that pose issues? Signed-off-by: Will McVicker --- scripts/setlocalversion | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/scripts/setlocalversion b/scripts/setlocalversion index bb709eda96cd..cd42009e675b 100755 --- a/scripts/setlocalversion +++ b/scripts/setlocalversion @@ -44,8 +44,7 @@ scm_version() fi # Check for git and a git repo. - if test -z "$(git rev-parse --show-cdup 2>/dev/null)" && - head=$(git rev-parse --verify HEAD 2>/dev/null); then + if head=$(git rev-parse --verify HEAD 2>/dev/null); then # If we are at a tagged commit (like "v2.6.30-rc6"), we ignore # it, because this version is defined in the top level Makefile. @@ -102,7 +101,7 @@ scm_version() fi # Check for mercurial and a mercurial repo. - if test -d .hg && hgid=$(hg id 2>/dev/null); then + if hgid=$(hg id 2>/dev/null); then # Do we have an tagged version? If so, latesttagdistance == 1 if [ "$(hg log -r . --template '{latesttagdistance}')" = "1" ]; then id=$(hg log -r . --template '{latesttag}') -- 2.29.2.576.ga3fc446d84-goog
Re: [PATCH RFC 1/1] module: delay kobject uevent until after module init call
+++ Jessica Yu [03/12/20 14:51 +0100]: Apparently there has been a longstanding race between udev/systemd and the module loader. Currently, the module loader sends a uevent right after sysfs initialization, but before the module calls its init function. However, some udev rules expect that the module has initialized already upon receiving the uevent. This race has been triggered recently (see link in references) in some systemd mount unit files. For instance, the configfs module creates the /sys/kernel/config mount point in its init function, however the module loader issues the uevent before this happens. sys-kernel-config.mount expects to be able to mount /sys/kernel/config upon receipt of the module loading uevent, but if the configfs module has not called its init function yet, then this directory will not exist and the mount unit fails. A similar situation exists for sys-fs-fuse-connections.mount, as the fuse sysfs mount point is created during the fuse module's init function. If udev is faster than module initialization then the mount unit would fail in a similar fashion. To fix this race, delay the module KOBJ_ADD uevent until after the module has finished calling its init routine. References: https://github.com/systemd/systemd/issues/17586 Signed-off-by: Jessica Yu Thanks all, this has been applied to modules-next to try to get as much -next time as possible before the upcoming merge window. Jessica
Re: [PATCH v2 2/2] modules: add scmversion field
+++ Will McVicker [25/11/20 01:05 +]: Add the modinfo field `scmversion` to include the SCM version of kernel modules, e.g. git sha1. This allows one to identify the exact source code version of a given kernel module. You can retrieve it in two ways, 1) By using modinfo > modinfo -F scmversion 2) By module sysfs node > cat /sys/module//scmversion Signed-off-by: Will McVicker Hi Will, sorry for the delay. First, what will help is to include a justification and a detailed explanation of an example use-case (for in-tree modules) in the changelog/commit message, which is scattered over thread replies but seems to be omitted here. For example, your explanation here [1] was pretty helpful and should be included in the changelog. Please describe in the commit message how this new sysfs entry will be used and why it is helpful to have this information available for in-tree modules. Second, this feature seems to be a debugging aid for distro developers analogous to (or an alternative to) CONFIG_MODULE_SRCVERSION_ALL. As opposed to including it by default (implicitly depending on localversion config options), perhaps this feature should be packaged in a config option i.e. CONFIG_MODULE_SCMVERSION with a depends on CONFIG_LOCALVERSION_AUTO. That way the dependency is explicitly specified and the option could be enabled for distros that want this. [1] https://lore.kernel.org/lkml/20201123221338.ga2726...@google.com/ Thanks, Jessica --- Documentation/ABI/stable/sysfs-module | 17 + include/linux/module.h| 1 + kernel/module.c | 2 ++ scripts/Makefile.modpost | 20 scripts/mod/modpost.c | 24 +++- 5 files changed, 63 insertions(+), 1 deletion(-) diff --git a/Documentation/ABI/stable/sysfs-module b/Documentation/ABI/stable/sysfs-module index 6272ae5fb366..46c99ec927ab 100644 --- a/Documentation/ABI/stable/sysfs-module +++ b/Documentation/ABI/stable/sysfs-module @@ -32,3 +32,20 @@ Description: Note: If the module is built into the kernel, or if the CONFIG_MODULE_UNLOAD kernel configuration value is not enabled, this file will not be present. + +What: /sys/module/MODULENAME/scmversion +Date: November 2020 +KernelVersion: 5.10 +Contact: Will McVicker +Description: This read-only file will appear if modpost was supplied with an + SCM version for the module. The SCM version is retrieved by + scripts/setlocalversion, which means that the presence of this + file depends on CONFIG_LOCALVERSION_AUTO=y or LOCALVERSION=. + When read, the SCM version that the module was compiled with is + returned. The SCM version is returned in the following format:: + + === + Git:g[a-f0-9]\+(-dirty)\? + Mercurial: hg[a-f0-9]\+(-dirty)\? + Subversion: svn[0-9]\+ + === diff --git a/include/linux/module.h b/include/linux/module.h index 6264617bab4d..63137ca5147b 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -372,6 +372,7 @@ struct module { struct module_attribute *modinfo_attrs; const char *version; const char *srcversion; + const char *scmversion; struct kobject *holders_dir; /* Exported symbols */ diff --git a/kernel/module.c b/kernel/module.c index a4fa44a652a7..a203dab4a03b 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -807,6 +807,7 @@ static struct module_attribute modinfo_##field = { \ MODINFO_ATTR(version); MODINFO_ATTR(srcversion); +MODINFO_ATTR(scmversion); static char last_unloaded_module[MODULE_NAME_LEN+1]; @@ -1269,6 +1270,7 @@ static struct module_attribute *modinfo_attrs[] = { &module_uevent, &modinfo_version, &modinfo_srcversion, + &modinfo_scmversion, &modinfo_initstate, &modinfo_coresize, &modinfo_initsize, diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost index f54b6ac37ac2..fb4ddf2bf794 100644 --- a/scripts/Makefile.modpost +++ b/scripts/Makefile.modpost @@ -66,6 +66,7 @@ ifeq ($(KBUILD_EXTMOD),) input-symdump := vmlinux.symvers output-symdump := Module.symvers +module_srcpath := $(srctree) else @@ -77,6 +78,17 @@ src := $(obj) include $(if $(wildcard $(KBUILD_EXTMOD)/Kbuild), \ $(KBUILD_EXTMOD)/Kbuild, $(KBUILD_EXTMOD)/Makefile) +# Get the external module's source path. KBUILD_EXTMOD could either be an +# absolute path or relative path from $(srctree). This makes sure that we +# aren't using a relative path from a separate working directory (O= or +# KBUILD_OUTPUT) since that may not be the actual module's SCM project path. So +# check the path relative to $(srctree) first. +ifneq ($(realpath $(srctree)/$(KBUILD_EXTMOD) 2>/dev/null),) + module_srcpath := $(srctree)/
Re: [PATCH] module: drop semicolon from version macro
+++ Johan Hovold [07/12/20 10:13 +0100]: Drop the trailing semicolon from the MODULE_VERSION() macro definition which was left when removing the array-of-pointer indirection. Signed-off-by: Johan Hovold Applied, thanks! Jessica --- include/linux/module.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/module.h b/include/linux/module.h index ebe2641d7b0b..b63db970fd26 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -279,7 +279,7 @@ extern typeof(name) __mod_##type##__##name##_device_table \ }, \ .module_name= KBUILD_MODNAME, \ .version= _version, \ - }; + } #endif /* Optional firmware file (or files) needed by the module -- 2.26.2
[PATCH RFC 1/1] module: delay kobject uevent until after module init call
Apparently there has been a longstanding race between udev/systemd and the module loader. Currently, the module loader sends a uevent right after sysfs initialization, but before the module calls its init function. However, some udev rules expect that the module has initialized already upon receiving the uevent. This race has been triggered recently (see link in references) in some systemd mount unit files. For instance, the configfs module creates the /sys/kernel/config mount point in its init function, however the module loader issues the uevent before this happens. sys-kernel-config.mount expects to be able to mount /sys/kernel/config upon receipt of the module loading uevent, but if the configfs module has not called its init function yet, then this directory will not exist and the mount unit fails. A similar situation exists for sys-fs-fuse-connections.mount, as the fuse sysfs mount point is created during the fuse module's init function. If udev is faster than module initialization then the mount unit would fail in a similar fashion. To fix this race, delay the module KOBJ_ADD uevent until after the module has finished calling its init routine. References: https://github.com/systemd/systemd/issues/17586 Signed-off-by: Jessica Yu --- kernel/module.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index a40ec708f8f2..e1dd0df57244 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -1897,7 +1897,6 @@ static int mod_sysfs_init(struct module *mod) if (err) mod_kobject_put(mod); - /* delay uevent until full sysfs population */ out: return err; } @@ -1934,7 +1933,6 @@ static int mod_sysfs_setup(struct module *mod, add_sect_attrs(mod, info); add_notes_attrs(mod, info); - kobject_uevent(&mod->mkobj.kobj, KOBJ_ADD); return 0; out_unreg_modinfo_attrs: @@ -3656,6 +3654,9 @@ static noinline int do_init_module(struct module *mod) blocking_notifier_call_chain(&module_notify_list, MODULE_STATE_LIVE, mod); + /* Delay uevent until module has finished its init routine */ + kobject_uevent(&mod->mkobj.kobj, KOBJ_ADD); + /* * We need to finish all async code before the module init sequence * is done. This has potential to deadlock. For example, a newly -- 2.16.4
[PATCH RFC 0/1] Delay module uevent until after initialization
Hi, This patch addresses a race condition between udev and the module loader that was recently described here: https://github.com/systemd/systemd/issues/17586 Currently, the module loader issues a KOBJ_ADD uevent before it calls a module's initialization function. Some mount units expect that the module has initialized already upon receiving the uevent. For instance, the configfs module creates the /sys/kernel/config mount point during its init function, but the module loader issues the uevent before the init function is called. If udev processes the uevent before the module loader calls the init function, then the mount unit will fail, since /sys/kernel/config will not exist yet. Nicolas Morey-Chaisemartin provided a simple test script to trigger the race condition: while true; do umount configfs rmmod configfs sleep 1 modprobe configfs ls -alFd /sys/kernel/config sleep 1 systemctl status sys-kernel-config.mount | tail -n 1 done When the mount fails due to the race condition, you would see a message like: systemd[1]: Condition check resulted in Kernel Configuration File System being skipped. I ran the script for about 30 minutes on my own machine and managed to trigger the failure condition 4 times. With the patch applied, I was not able to trigger the failed condition anymore after running the script for the same amount of time. Nicolas also reported similar test results after testing a kernel with the patch applied. This is sent first as an RFC to both LKML and systemd mailing lists since the uevent call has been like this in the module loader for more than a decade (since v2.6), I would be cautious as to not break anything that actually relies on the current behavior for whatever reason. More testing would be greatly appreciated. Thanks, Jessica Jessica Yu (1): module: delay kobject uevent until after module init call kernel/module.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
Re: [PATCH 0/8] linker-section array fix and clean ups
+++ Johan Hovold [27/11/20 10:59 +0100]: On Wed, Nov 25, 2020 at 03:51:20PM +0100, Jessica Yu wrote: I've queued up patches 3, 4, 6, 7, 8 for testing before pushing them out to modules-next. Thanks, Jessica. Perhaps you can consider taking also the one for setup parameters (patch 5/8) through your tree since its related to the module-parameter one. Johan Sure, done. Thanks, Jessica
Re: [PATCH 0/8] linker-section array fix and clean ups
+++ Johan Hovold [23/11/20 11:39 +0100]: On Fri, Nov 13, 2020 at 03:18:36PM +0100, Johan Hovold wrote: On Wed, Nov 11, 2020 at 04:47:16PM +0100, Jessica Yu wrote: > Thanks for providing the links and references. Your explanation and > this reply from Jakub [1] clarified things for me. I was not aware of > the distinction gcc made between aligned attributes on types vs. on > variables. So from what I understand now, gcc suppresses the > optimization when the alignment is specified in the variable > declaration, but not necessarily when the aligned attribute is just on > the type. > > Even though it's been in use for a long time, I think it would be > really helpful if this gcc quirk was explained just a bit more in the > patch changelogs, especially since this is undocumented behavior. > I found the explanation in [1] (as well as in your cover letter) to be > sufficient. Maybe something like "GCC suppresses any optimizations > increasing alignment when the alignment is specified in the variable > declaration, as opposed to just on the type definition. Therefore, > explicitly specify type alignment when declaring entries to prevent > gcc from increasing alignment." Sure, I can try to expand the commit messages a bit. I've amended the commit messages of the relevant patches to make it more clear that the optimisation can be suppressed by specifying alignment when declaring variables, but without making additional claims about the type attribute. I hope the result is acceptable to you. Perhaps you can include a lore link to the patches when applying so that this thread can be found easily if needed. Hi Johan, Good idea, I've included a link to this thread for each patch. I've queued up patches 3, 4, 6, 7, 8 for testing before pushing them out to modules-next. Thanks! Jessica
Re: Ubuntu mainline kernel builds now failing not able to find module.lds file
+++ Salvatore Bonaccorso [25/11/20 07:01 +0100]: Hi Steve, On Fri, Oct 30, 2020 at 12:43:24AM -0500, Steve French wrote: I typically build cifs.ko for testing using the latest Ubuntu mainline build - but building a module in the 5.10-rc1 kernel - while booted to the 5.10-rc1 ubuntu mainlinekerel - e.g. "make C=1 -C /usr/src/linux-headers-`uname -r` M=`pwd` modules CF=-D__CHECK_ENDIAN__" which has worked for years - no longer works. make: Entering directory '/usr/src/linux-headers-5.10.0-051000rc1-generic' make[2]: *** No rule to make target 'scripts/module.lds', needed by '/home/smfrench/cifs-2.6/fs/cifs/cifs.ko'. Stop. make[1]: *** [scripts/Makefile.modpost:117: __modpost] Error 2 make: *** [Makefile:1703: modules] Error 2 make: Leaving directory '/usr/src/linux-headers-5.10.0-051000rc1-generic' I don't see a file in scripts/module.lds in /usr/src/linux-headers-5.10.0-051000rc1-generic/scripts directory copying from scripts/module.lds in the 5.10-rc1 git tree to /usr/src/linux-headers-5.10.0-051000rc1-generic/scripts fixed the problem but was wondering if this is just a packaging problem with Ubuntu (missing a file in the kernel headers package for their mainline daily builds?) There is 596b0474d3d9 ("kbuild: preprocess module linker script") in v5.10-rc1 causing this. So likely the packaging will need some adjustment to cope with that change? Yeah, likely it's a distro packaging issue. We had to account for scripts/module.lds recently on openSUSE for example: https://github.com/openSUSE/kernel-source/commit/fe37c160c33dc09edff1781810aa098a2c316e20 Jessica
Re: [PATCH v1 0/2] Add support to capture external module's SCM version
+++ William Mcvicker [23/11/20 14:13 -0800]: On Mon, Nov 23, 2020 at 09:02:57AM +, Christoph Hellwig wrote: On Sat, Nov 21, 2020 at 01:16:49AM +, Will McVicker wrote: > These two patches add module support to capture an external module's SCM > version as a MODULE_INFO() attribute. This allows users to identity the SCM > version of a given kernel module by using the modinfo tool or on the device > via sysfs: As this obviously is of no use for in-tree modules it falls under the we don't add code to support things that are not in tree rule and has no business in the kernel. Hi Christoph, Ah sorry, I didn't intend this to come across as only for external modules. That just seemed like the easiest way to explain how the scmversion attribute can be different from the vermagic. We mainly need this for in-tree kernel modules since that's where most our drivers are. Let me re-phrase this with that in mind. Basically, I like to look at this as an improved version of the existing srcversion module attribute since it allows you to easily identify the module version with a quick SCM version string check instead of doing a full checksum on the module source. For example, we have a setup to test kernel changes on the hikey and db845c devices without updating the kernel modules. Without this scmversion module attribute, you can't identify the original module version using `uname -r`. And for kernel modules in the initramfs, you can't even use modinfo to get the module vermagic. With this patch, you are able to get the SCM version for *all* kernel modules (on disk and in the initramfs) via the sysfs node: /sys/module//scmversion. This also works the other way around when developers update their kernel modules to fix some bug (like a security vulnerability) but don't need to update the full kernel. Hi Will, If this were also intended for in-tree kernel modules, then why do intree modules only get the UTS_RELEASE string in their scmversion field, which basically already exists in the vermagic? Or do you plan to change that? Jessica
Re: [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs
+++ Andrii Nakryiko [11/11/20 12:11 -0800]: On Wed, Nov 11, 2020 at 2:13 AM Jessica Yu wrote: +++ Andrii Nakryiko [09/11/20 17:19 -0800]: [snipped] >diff --git a/kernel/module.c b/kernel/module.c >index a4fa44a652a7..f2996b02ab2e 100644 >--- a/kernel/module.c >+++ b/kernel/module.c >@@ -380,6 +380,35 @@ static void *section_objs(const struct load_info *info, > return (void *)info->sechdrs[sec].sh_addr; > } > >+/* Find a module section: 0 means not found. Ignores SHF_ALLOC flag. */ >+static unsigned int find_any_sec(const struct load_info *info, const char *name) >+{ >+ unsigned int i; >+ >+ for (i = 1; i < info->hdr->e_shnum; i++) { >+ Elf_Shdr *shdr = &info->sechdrs[i]; >+ if (strcmp(info->secstrings + shdr->sh_name, name) == 0) >+ return i; >+ } >+ return 0; >+} >+ >+/* >+ * Find a module section, or NULL. Fill in number of "objects" in section. >+ * Ignores SHF_ALLOC flag. >+ */ >+static __maybe_unused void *any_section_objs(const struct load_info *info, >+ const char *name, >+ size_t object_size, >+ unsigned int *num) >+{ >+ unsigned int sec = find_any_sec(info, name); >+ >+ /* Section 0 has sh_addr 0 and sh_size 0. */ >+ *num = info->sechdrs[sec].sh_size / object_size; >+ return (void *)info->sechdrs[sec].sh_addr; >+} >+ Hm, I see this patchset has already been applied to bpf-next, but I guess that doesn't preclude any follow-up patches :-) Of course! I am not a huge fan of the code duplication here, and also the fact that they're only called in one place. any_section_objs() and find_any_sec() are pretty much identical to section_objs() and find_sec(), other than the fact the former drops the SHF_ALLOC check. Right, but the alternative was to add a new flag to existing section_objs() and find_sec() functions, which would cause much more code churn for no good reason (besides saving some trivial code duplication). And those true/false flags are harder to read in code anyways. That's true, all fair points. I thought there was the possibility to avoid the code duplication if .BTF were also set to SHF_ALLOC, but I see for reasons you explained below it is more trouble than it's worth. Moreover, since it appears that the ".BTF" section is not marked SHF_ALLOC, I think this will leave mod->btf_data as a dangling pointer after the module is done loading and the module's load_info has been deallocated, since SHF_ALLOC sections are not allocated nor copied to the module's final location in memory. I can make sure that we also reset the btf_data pointer back to NULL, if that's a big concern. It's not a terribly huge concern, since mod->btf_data is only accessed in the btf coming notifier at the moment, but it's probably best to at least not advertise it as a valid pointer anymore after the module is done loading. We do some pointer and section size cleanup at the end of do_init_module() for sections that are deallocated at the end of module load (starting where init_layout.base is reset to NULL), we could just tack on mod->btf_data = NULL there as well. Why not simply mark the ".BTF" section in the module SHF_ALLOC? We already do some sh_flags rewriting in rewrite_section_headers(). Then the module loader knows to keep the section in memory and you can use section_objs(). And since the .BTF section stays in module memory, that might save you the memcpy() to btf->data in btf_parse_module() (unless that is still needed for some reason). Wasn't aware about rewrite_section_headers() manipulations. Are you suggesting to just add SHF_ALLOC there for the .BTF section from the kernel side? I guess that would work, but won't avoid memory copy (so actually would waste kernel memory, if I understand correctly). The reason being that the module's BTF is registered as an independently ref-counted BTF object, which could be held past the kernel module being unloaded. So I can't directly reference module's .BTF data anyways. Ah OK, I was not aware that the section could be held past the module being unloaded. Then yeah, it would be a memory waste to keep them in memory if they are being memcpy'd anyway. Thanks for clarifying! Jessica
Re: [PATCH 0/8] linker-section array fix and clean ups
+++ Johan Hovold [06/11/20 17:45 +0100]: On Fri, Nov 06, 2020 at 05:03:45PM +0100, Jessica Yu wrote: +++ Johan Hovold [03/11/20 18:57 +0100]: >We rely on the linker to create arrays for a number of things including >kernel parameters and device-tree-match entries. > >The stride of these linker-section arrays obviously needs to match the >expectations of the code accessing them or bad things will happen. > >One thing to watch out for is that gcc is known to increase the >alignment of larger objects with static extent as an optimisation (on >x86), but this can be suppressed by using the aligned attribute when >declaring entries. > >We've been relying on this behaviour for 16 years for kernel parameters >(and other structures) and it indeed hasn't changed since the >introduction of the aligned attribute in gcc 3.1 (see align_variable() >in [1]). > >Occasionally this gcc optimisation do cause problems which have instead >been worked around in various creative ways including using indirection >through an array of pointers. This was originally done for tracepoints >[2] after a number of failed attempts to create properly aligned arrays, >and the approach was later reused for module-version attributes [3] and >earlycon entries. >[2] https://lore.kernel.org/lkml/20110126222622.GA10794@Krystal/ So unfortunately, I am not familiar enough with the semantics of gcc's aligned attribute. AFAICT from the patch you linked in [2], the original purpose of the pointer indirection workaround was to avoid relying on (potentially inconsistent) compiler-specific behavior with respect to the aligned attribute. The main concern was potential up-alignment being done by gcc (or the linker) despite the desired alignment being specified. Indeed, the gcc documentation also states that the aligned attribute only specifies the *minimum* alignment, although there's no guarantee that up-alignment wouldn't occur. So I guess my question is, is there some implicit guarantee that specifying alignment by type via __alignof__ that's supposed to prevent gcc from up-aligning? Or are we just assuming that gcc won't increase the alignment? The gcc docs don't seem to clarify this unfortunately. It's simply specifying alignment when declaring the variable that prevents this optimisation. The relevant code is in the function align_variable() in [1] where DATA_ALIGNMENT() is never called in case an alignment has been specified (!DECL_USER_ALIGN(decl)). There's no mention in the documentation of this that I'm aware of, but this is the way the aligned attribute has worked since its introduction judging from the commit history. As mentioned above, we've been relying on this for kernel parameters and other structures since 2003-2004 so if it ever were to change we'd find out soon enough. It's about to be used for scheduler classes as well. [2] Johan [1] https://github.com/gcc-mirror/gcc/blob/master/gcc/varasm.c [2] https://lore.kernel.org/r/160396870486.397.377616182428528939.tip-bot2@tip-bot2 Thanks for providing the links and references. Your explanation and this reply from Jakub [1] clarified things for me. I was not aware of the distinction gcc made between aligned attributes on types vs. on variables. So from what I understand now, gcc suppresses the optimization when the alignment is specified in the variable declaration, but not necessarily when the aligned attribute is just on the type. Even though it's been in use for a long time, I think it would be really helpful if this gcc quirk was explained just a bit more in the patch changelogs, especially since this is undocumented behavior. I found the explanation in [1] (as well as in your cover letter) to be sufficient. Maybe something like "GCC suppresses any optimizations increasing alignment when the alignment is specified in the variable declaration, as opposed to just on the type definition. Therefore, explicitly specify type alignment when declaring entries to prevent gcc from increasing alignment." In any case, I can take the module and moduleparam.h patches through my tree, but I will wait a few days in case there are any objections. Thanks, Jessica [1] https://lore.kernel.org/lkml/20201021131806.GA2176@tucnak/
Re: [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs
+++ Andrii Nakryiko [09/11/20 17:19 -0800]: [snipped] diff --git a/kernel/module.c b/kernel/module.c index a4fa44a652a7..f2996b02ab2e 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -380,6 +380,35 @@ static void *section_objs(const struct load_info *info, return (void *)info->sechdrs[sec].sh_addr; } +/* Find a module section: 0 means not found. Ignores SHF_ALLOC flag. */ +static unsigned int find_any_sec(const struct load_info *info, const char *name) +{ + unsigned int i; + + for (i = 1; i < info->hdr->e_shnum; i++) { + Elf_Shdr *shdr = &info->sechdrs[i]; + if (strcmp(info->secstrings + shdr->sh_name, name) == 0) + return i; + } + return 0; +} + +/* + * Find a module section, or NULL. Fill in number of "objects" in section. + * Ignores SHF_ALLOC flag. + */ +static __maybe_unused void *any_section_objs(const struct load_info *info, +const char *name, +size_t object_size, +unsigned int *num) +{ + unsigned int sec = find_any_sec(info, name); + + /* Section 0 has sh_addr 0 and sh_size 0. */ + *num = info->sechdrs[sec].sh_size / object_size; + return (void *)info->sechdrs[sec].sh_addr; +} + Hm, I see this patchset has already been applied to bpf-next, but I guess that doesn't preclude any follow-up patches :-) I am not a huge fan of the code duplication here, and also the fact that they're only called in one place. any_section_objs() and find_any_sec() are pretty much identical to section_objs() and find_sec(), other than the fact the former drops the SHF_ALLOC check. Moreover, since it appears that the ".BTF" section is not marked SHF_ALLOC, I think this will leave mod->btf_data as a dangling pointer after the module is done loading and the module's load_info has been deallocated, since SHF_ALLOC sections are not allocated nor copied to the module's final location in memory. Why not simply mark the ".BTF" section in the module SHF_ALLOC? We already do some sh_flags rewriting in rewrite_section_headers(). Then the module loader knows to keep the section in memory and you can use section_objs(). And since the .BTF section stays in module memory, that might save you the memcpy() to btf->data in btf_parse_module() (unless that is still needed for some reason). Thanks, Jessica /* Provided by the linker */ extern const struct kernel_symbol __start___ksymtab[]; extern const struct kernel_symbol __stop___ksymtab[]; @@ -3250,6 +3279,9 @@ static int find_module_sections(struct module *mod, struct load_info *info) sizeof(*mod->bpf_raw_events), &mod->num_bpf_raw_events); #endif +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES + mod->btf_data = any_section_objs(info, ".BTF", 1, &mod->btf_data_size); +#endif #ifdef CONFIG_JUMP_LABEL mod->jump_entries = section_objs(info, "__jump_table", sizeof(*mod->jump_entries), -- 2.24.1
Re: [PATCH 0/2] module: add/fix 'kernel-doc' comments
+++ Sergey Shtylyov [04/11/20 23:33 +0300]: Here are 2 patches against the 'modules-next' branch of Jessica Yu's 'linux.git' repo. I'm cleaning up the 'kernel-doc' function comments... [1/2] module: fix up 'kernel-doc' comments [2/2] module: add more 'kernel-doc' comments Applied, thanks! Jessica
Re: [PATCH v2] module: fix comment style
+++ Sergey Shtylyov [07/11/20 23:20 +0300]: Many comments in this module do not comply with the preferred multi-line comment style as reported by 'scripts/checkpatch.pl': WARNING: Block comments use * on subsequent lines WARNING: Block comments use a trailing */ on a separate line Fix those comments, along with (unreported for some reason?) the starts of the multi-line comments not being /* on their own line... Signed-off-by: Sergey Shtylyov --- This patch is against the 'modules-next' branch of Jessica Yu's 'linux.git' repo plus my 2 patches posted this week. I'm not sure such patches are welcome, please let me know if they're not... Changes in version 2: - fixed up the comment before find_symbol(). Applied, thanks for the comment cleanups! Jessica
Re: [PATCH 0/8] linker-section array fix and clean ups
+++ Johan Hovold [03/11/20 18:57 +0100]: We rely on the linker to create arrays for a number of things including kernel parameters and device-tree-match entries. The stride of these linker-section arrays obviously needs to match the expectations of the code accessing them or bad things will happen. One thing to watch out for is that gcc is known to increase the alignment of larger objects with static extent as an optimisation (on x86), but this can be suppressed by using the aligned attribute when declaring entries. We've been relying on this behaviour for 16 years for kernel parameters (and other structures) and it indeed hasn't changed since the introduction of the aligned attribute in gcc 3.1 (see align_variable() in [1]). Occasionally this gcc optimisation do cause problems which have instead been worked around in various creative ways including using indirection through an array of pointers. This was originally done for tracepoints [2] after a number of failed attempts to create properly aligned arrays, and the approach was later reused for module-version attributes [3] and earlycon entries. [2] https://lore.kernel.org/lkml/20110126222622.GA10794@Krystal/ Hi Johan, So unfortunately, I am not familiar enough with the semantics of gcc's aligned attribute. AFAICT from the patch you linked in [2], the original purpose of the pointer indirection workaround was to avoid relying on (potentially inconsistent) compiler-specific behavior with respect to the aligned attribute. The main concern was potential up-alignment being done by gcc (or the linker) despite the desired alignment being specified. Indeed, the gcc documentation also states that the aligned attribute only specifies the *minimum* alignment, although there's no guarantee that up-alignment wouldn't occur. So I guess my question is, is there some implicit guarantee that specifying alignment by type via __alignof__ that's supposed to prevent gcc from up-aligning? Or are we just assuming that gcc won't increase the alignment? The gcc docs don't seem to clarify this unfortunately. Thanks, Jessica
Re: [PATCH v2 0/3] module: refactor module_sig_check()
+++ Sergey Shtylyov [31/10/20 23:05 +0300]: Here are 3 patches against the 'modules-next' branch of Jessica Yu's 'linux.git' repo. I'm doing some refactoring in module_sig_check()... [1/3] module: merge repetitive strings in module_sig_check() [2/3] module: avoid *goto*s in module_sig_check() [3/3] module: only handle errors with the *switch* statement in module_sig_check() Applied to modules-next. Thanks Sergey for the cleanups! Jessica
Re: [PATCH] module: set MODULE_STATE_GOING state when a module fails to load
+++ Miroslav Benes [29/10/20 13:31 +0100]: On Wed, 28 Oct 2020, Jessica Yu wrote: +++ Miroslav Benes [27/10/20 15:03 +0100]: >If a module fails to load due to an error in prepare_coming_module(), >the following error handling in load_module() runs with >MODULE_STATE_COMING in module's state. Fix it by correctly setting >MODULE_STATE_GOING under "bug_cleanup" label. > >Signed-off-by: Miroslav Benes >--- > kernel/module.c | 1 + > 1 file changed, 1 insertion(+) > >diff --git a/kernel/module.c b/kernel/module.c >index a4fa44a652a7..b34235082394 100644 >--- a/kernel/module.c >+++ b/kernel/module.c >@@ -3991,6 +3991,7 @@ static int load_module(struct load_info *info, const >char __user *uargs, > MODULE_STATE_GOING, mod); >klp_module_going(mod); > bug_cleanup: >+ mod->state = MODULE_STATE_GOING; > /* module_bug_cleanup needs module_mutex protection */ > mutex_lock(&module_mutex); > module_bug_cleanup(mod); Thanks for the fix! Hmm, I am wondering if we also need to set the module to GOING if it happens to fail while it is still UNFORMED. Currently, when a module is UNFORMED and encounters an error during load_module(), it stays UNFORMED until it finally goes away. That sounds fine, but try_module_get() technically permits you to get a module while it's UNFORMED (but not if it's GOING). Theoretically someone could increase the refcount of an unformed module that has encountered an error condition and is in the process of going away. Right. This shouldn't happen if we properly set the module to GOING whenever it encounters an error during load_module(). That's correct. But - I cannot think of a scenario where someone could call try_module_get() on an unformed module, since find_module() etc. do not return unformed modules, so they shouldn't be visible outside of the module loader. So in practice, I think we're probably safe here.. Hopefully yes. I haven't found anything that would contradict it. I think it is even safer to leave UNFORMED there. free_module() explicitly sets UNFORMED state too while going through the similar process. That's true, and agreed. ftrace_release_mod() is the only inconsistency there. It is called with UNFORMED in load_module() if going through ddebug_cleanup label directly, and with GOING in both do_init_module() before free_module() is called and delete_module syscall. But it probably does not care. Yes, I guess that inconsistency with ftrace_release_mod() has been there long before this patch :-/ A quick look through ftrace.c doesn't suggest to me there are any direct dependencies on module state there (other than that comment above ftrace_module_init()). In practice there hasn't been any issues.. I have applied this to modules-next. Thanks! Jessica
Re: [PATCH] module: set MODULE_STATE_GOING state when a module fails to load
+++ Miroslav Benes [27/10/20 15:03 +0100]: If a module fails to load due to an error in prepare_coming_module(), the following error handling in load_module() runs with MODULE_STATE_COMING in module's state. Fix it by correctly setting MODULE_STATE_GOING under "bug_cleanup" label. Signed-off-by: Miroslav Benes --- kernel/module.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/module.c b/kernel/module.c index a4fa44a652a7..b34235082394 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -3991,6 +3991,7 @@ static int load_module(struct load_info *info, const char __user *uargs, MODULE_STATE_GOING, mod); klp_module_going(mod); bug_cleanup: + mod->state = MODULE_STATE_GOING; /* module_bug_cleanup needs module_mutex protection */ mutex_lock(&module_mutex); module_bug_cleanup(mod); Thanks for the fix! Hmm, I am wondering if we also need to set the module to GOING if it happens to fail while it is still UNFORMED. Currently, when a module is UNFORMED and encounters an error during load_module(), it stays UNFORMED until it finally goes away. That sounds fine, but try_module_get() technically permits you to get a module while it's UNFORMED (but not if it's GOING). Theoretically someone could increase the refcount of an unformed module that has encountered an error condition and is in the process of going away. This shouldn't happen if we properly set the module to GOING whenever it encounters an error during load_module(). But - I cannot think of a scenario where someone could call try_module_get() on an unformed module, since find_module() etc. do not return unformed modules, so they shouldn't be visible outside of the module loader. So in practice, I think we're probably safe here..
Re: [PATCH] module: use hidden visibility for weak symbol references
+++ Will Deacon [28/10/20 13:24 +]: On Wed, Oct 28, 2020 at 01:27:01PM +0100, Ard Biesheuvel wrote: On Wed, 28 Oct 2020 at 11:00, Will Deacon wrote: > On Tue, Oct 27, 2020 at 04:11:32PM +0100, Ard Biesheuvel wrote: > > Geert reports that commit be2881824ae9eb92 ("arm64/build: Assert for > > unwanted sections") results in build errors on arm64 for configurations > > that have CONFIG_MODULES disabled. > > > > The commit in question added ASSERT()s to the arm64 linker script to > > ensure that linker generated sections such as .got, .plt etc are empty, > > but as it turns out, there are corner cases where the linker does emit > > content into those sections. More specifically, weak references to > > function symbols (which can remain unsatisfied, and can therefore not > > be emitted as relative references) will be emitted as GOT and PLT > > entries when linking the kernel in PIE mode (which is the case when > > CONFIG_RELOCATABLE is enabled, which is on by default). > > > > What happens is that code such as > > > > struct device *(*fn)(struct device *dev); > > struct device *iommu_device; > > > > fn = symbol_get(mdev_get_iommu_device); > > if (fn) { > > iommu_device = fn(dev); > > > > essentially gets converted into the following when CONFIG_MODULES is off: > > > > struct device *iommu_device; > > > > if (&mdev_get_iommu_device) { > > iommu_device = mdev_get_iommu_device(dev); > > > > where mdev_get_iommu_device is emitted as a weak symbol reference into > > the object file. The first reference is decorated with an ordinary > > ABS64 data relocation (which yields 0x0 if the reference remains > > unsatisfied). However, the indirect call is turned into a direct call > > covered by a R_AARCH64_CALL26 relocation, which is converted into a > > call via a PLT entry taking the target address from the associated > > GOT entry. > > > > Given that such GOT and PLT entries are unnecessary for fully linked > > binaries such as the kernel, let's give these weak symbol references > > hidden visibility, so that the linker knows that the weak reference > > via R_AARCH64_CALL26 can simply remain unsatisfied. > > > > Cc: Jessica Yu > > Cc: Kees Cook > > Cc: Geert Uytterhoeven > > Cc: Nick Desaulniers > > Signed-off-by: Ard Biesheuvel > > --- > > include/linux/module.h | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > Cheers. I gave this a spin, but I unfortunately still see the following > linker warning with allnoconfig: > > aarch64-linux-gnu-ld: warning: orphan section `.igot.plt' from `arch/arm64/kernel/head.o' being placed in section `.igot.plt' > > which looks unrelated to symbol_get(), but maybe it's worth knocking these > things on the head (no pun intended) at the same time? > Yeah, that is just one of those spurious sections that turns up empty anyway. The head.o is a red herring, it is simply the first file appearing in the link. This should fix it diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S index 6567d80dd15f..48b222f1c700 100644 --- a/arch/arm64/kernel/vmlinux.lds.S +++ b/arch/arm64/kernel/vmlinux.lds.S @@ -278,7 +278,7 @@ SECTIONS * explicitly check instead of blindly discarding. */ .plt : { - *(.plt) *(.plt.*) *(.iplt) *(.igot) + *(.plt) *(.plt.*) *(.iplt) *(.igot .igot.plt) } ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!") Cheers, that fixes the extra warning for me. If you could send a proper patch, I'm happy to queue as an arm64 fix! (I'm assuming the former is going via Jessica, but I can also take that with her Ack). Hi! Yes, please feel free to take this patch along with the other fix: Acked-by: Jessica Yu Thanks, Jessica
Re: [PATCH 0/2] module: some refactoring in module_sig_check()
+++ Sergey Shtylyov [13/10/20 23:32 +0300]: Here are 2 patches against the 'modules-next' branch of Jessica Yu's 'linux.git' repo. I'm doing some little refactoring in module_sig_check()... [1/2] module: merge repetitive strings in module_sig_check() [2/2] module: unindent comments in module_sig_check() Hi Sergey, I'm fine with these patches, but are you still planning on sending a v2 based on Joe Perches' suggestions? Thanks! Jessica
[GIT PULL] Modules updates for v5.10
Hi Linus, Please pull below to receive modules updates for the v5.10 merge window. Details can be found in the signed tag. Thank you, Jessica -- The following changes since commit f75aef392f869018f78cfedf3c320a6b3fcfda6b: Linux 5.9-rc3 (2020-08-30 16:01:54 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux.git tags/modules-for-v5.10 for you to fetch changes up to fdf09ab887829cd1b671e45d9549f8ec1ffda0fa: module: statically initialize init section freeing data (2020-10-12 18:27:00 +0200) Modules updates for v5.10 Summary of modules changes for the 5.10 merge window: - Code cleanups. More informative error messages and statically initialize init_free_wq to avoid a workqueue warning. Signed-off-by: Jessica Yu Daniel Jordan (1): module: statically initialize init section freeing data Qu Wenruo (1): module: Add more error message for failed kernel module loading kernel/module.c | 23 +++ 1 file changed, 11 insertions(+), 12 deletions(-)
Re: [PATCH] module: statically initialize init section freeing data
+++ Daniel Jordan [08/10/20 13:32 -0400]: Corentin hit the following workqueue warning when running with CRYPTO_MANAGER_EXTRA_TESTS: WARNING: CPU: 2 PID: 147 at kernel/workqueue.c:1473 __queue_work+0x3b8/0x3d0 Modules linked in: ghash_generic CPU: 2 PID: 147 Comm: modprobe Not tainted 5.6.0-rc1-next-20200214-00068-g166c9264f0b1-dirty #545 Hardware name: Pine H64 model A (DT) pc : __queue_work+0x3b8/0x3d0 Call trace: __queue_work+0x3b8/0x3d0 queue_work_on+0x6c/0x90 do_init_module+0x188/0x1f0 load_module+0x1d00/0x22b0 I wasn't able to reproduce on x86 or rpi 3b+. This is WARN_ON(!list_empty(&work->entry)) from __queue_work(), and it happens because the init_free_wq work item isn't initialized in time for a crypto test that requests the gcm module. Some crypto tests were recently moved earlier in boot as explained in commit c4741b230597 ("crypto: run initcalls for generic implementations earlier"), which went into mainline less than two weeks before the Fixes commit. Avoid the warning by statically initializing init_free_wq and the corresponding llist. Link: https://lore.kernel.org/lkml/20200217204803.GA13479@Red/ Fixes: 1a7b7d922081 ("modules: Use vmalloc special flag") Reported-by: Corentin Labbe Tested-by: Corentin Labbe Tested-on: sun50i-h6-pine-h64 Tested-on: imx8mn-ddr4-evk Tested-on: sun50i-a64-bananapi-m64 Signed-off-by: Daniel Jordan Applied, thanks! Jessica
Re: [PATCH v2] kbuild: preprocess module linker script
+++ Masahiro Yamada [08/09/20 13:27 +0900]: There was a request to preprocess the module linker script like we do for the vmlinux one. (https://lkml.org/lkml/2020/8/21/512) The difference between vmlinux.lds and module.lds is that the latter is needed for external module builds, thus must be cleaned up by 'make mrproper' instead of 'make clean'. Also, it must be created by 'make modules_prepare'. You cannot put it in arch/$(SRCARCH)/kernel/, which is cleaned up by 'make clean'. I moved arch/$(SRCARCH)/kernel/module.lds to arch/$(SRCARCH)/include/asm/module.lds.h, which is included from scripts/module.lds.S. scripts/module.lds is fine because 'make clean' keeps all the build artifacts under scripts/. You can add arch-specific sections in . Signed-off-by: Masahiro Yamada Tested-by: Jessica Yu Acked-by: Will Deacon Acked-by: Jessica Yu Thanks for working on this!
Re: [PATCH v2] module: Harden STRICT_MODULE_RWX
+++ Masahiro Yamada [31/08/20 19:42 +0900]: [snipped for brevity] Sorry for the delay. Please try the attached patch. Hi Masahiro, Thank you for the patch. Sorry for the delay, I just wanted to report back after briefly testing your patch. It works great, at the moment I've only tested with arm64. I made the following change to arch/arm64/include/asm/module.lds.h: diff --git a/arch/arm64/include/asm/module.lds.h b/arch/arm64/include/asm/module.lds.h index 691f15af788e..d8e786e5fcdb 100644 --- a/arch/arm64/include/asm/module.lds.h +++ b/arch/arm64/include/asm/module.lds.h @@ -2,6 +2,8 @@ SECTIONS { .plt (NOLOAD) : { BYTE(0) } .init.plt (NOLOAD) : { BYTE(0) } +#ifdef CONFIG_DYNAMIC_FTRACE .text.ftrace_trampoline (NOLOAD) : { BYTE(0) } +#endif } #endif Since originally we wanted to include .text.ftrace_trampoline only conditionally. The resulting scripts/module.lds looks correct with CONFIG_DYNAMIC_FTRACE=y: SECTIONS { /DISCARD/ : { *(.discard) *(.discard.*) } __ksymtab 0 : { *(SORT(___ksymtab+*)) } __ksymtab_gpl 0 : { *(SORT(___ksymtab_gpl+*)) } __ksymtab_unused 0 : { *(SORT(___ksymtab_unused+*)) } __ksymtab_unused_gpl 0 : { *(SORT(___ksymtab_unused_gpl+*)) } __ksymtab_gpl_future 0 : { *(SORT(___ksymtab_gpl_future+*)) } __kcrctab 0 : { *(SORT(___kcrctab+*)) } __kcrctab_gpl 0 : { *(SORT(___kcrctab_gpl+*)) } __kcrctab_unused 0 : { *(SORT(___kcrctab_unused+*)) } __kcrctab_unused_gpl 0 : { *(SORT(___kcrctab_unused_gpl+*)) } __kcrctab_gpl_future 0 : { *(SORT(___kcrctab_gpl_future+*)) } .init_array 0 : ALIGN(8) { *(SORT(.init_array.*)) *(.init_array) } __jump_table 0 : ALIGN(8) { KEEP(*(__jump_table)) } } SECTIONS { .plt (NOLOAD) : { BYTE(0) } .init.plt (NOLOAD) : { BYTE(0) } .text.ftrace_trampoline (NOLOAD) : { BYTE(0) } } And with CONFIG_DYNAMIC_FTRACE=n as well: SECTIONS { .plt (NOLOAD) : { BYTE(0) } .init.plt (NOLOAD) : { BYTE(0) } } I will test on more arches in the next days but in the meantime you could add my Tested-by: Jessica Yu Thank you for working on this!
Re: [PATCH] module: Add more error message for failed kernel module loading
+++ Qu Wenruo [02/09/20 14:46 +0800]: When kernel module loading failed, user space only get one of the following error messages: - ENOEXEC This is the most confusing one. From corrupted ELF header to bad WRITE|EXEC flags check introduced by in module_enforce_rwx_sections() all returns this error number. - EPERM This is for blacklisted modules. But mod doesn't do extra explain on this error either. - ENOMEM The only error which needs no explain. This means, if a user got "Exec format error" from modprobe, it provides no meaningful way for the user to debug, and will take extra time communicating to get extra info. So this patch will add extra error messages for -ENOEXEC and -EPERM errors, allowing user to do better debugging and reporting. Signed-off-by: Qu Wenruo Reviewed-by: Lucas De Marchi Applied, thanks. Jessica
[PATCH] arm64/module: set trampoline section flags regardless of CONFIG_DYNAMIC_FTRACE
In the arm64 module linker script, the section .text.ftrace_trampoline is specified unconditionally regardless of whether CONFIG_DYNAMIC_FTRACE is enabled (this is simply due to the limitation that module linker scripts are not preprocessed like the vmlinux one). Normally, for .plt and .text.ftrace_trampoline, the section flags present in the module binary wouldn't matter since module_frob_arch_sections() would assign them manually anyway. However, the arm64 module loader only sets the section flags for .text.ftrace_trampoline when CONFIG_DYNAMIC_FTRACE=y. That's only become problematic recently due to a recent change in binutils-2.35, where the .text.ftrace_trampoline section (along with the .plt section) is now marked writable and executable (WAX). We no longer allow writable and executable sections to be loaded due to commit 5c3a7db0c7ec ("module: Harden STRICT_MODULE_RWX"), so this is causing all modules linked with binutils-2.35 to be rejected under arm64. Drop the IS_ENABLED(CONFIG_DYNAMIC_FTRACE) check in module_frob_arch_sections() so that the section flags for .text.ftrace_trampoline get properly set to SHF_EXECINSTR|SHF_ALLOC, without SHF_WRITE. Link: http://lore.kernel.org/r/20200831094651.GA16385@linux-8ccs Acked-by: Will Deacon Signed-off-by: Jessica Yu --- arch/arm64/kernel/module-plts.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c index 0ce3a28e3347..2e224435c024 100644 --- a/arch/arm64/kernel/module-plts.c +++ b/arch/arm64/kernel/module-plts.c @@ -305,8 +305,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs, mod->arch.core.plt_shndx = i; else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt")) mod->arch.init.plt_shndx = i; - else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) && -!strcmp(secstrings + sechdrs[i].sh_name, + else if (!strcmp(secstrings + sechdrs[i].sh_name, ".text.ftrace_trampoline")) tramp = sechdrs + i; else if (sechdrs[i].sh_type == SHT_SYMTAB) -- 2.16.4
Re: [PATCH] module: Add more error message for failed kernel module loading
+++ Qu Wenruo [31/08/20 16:37 +0800]: When kernel module loading failed, user space only get one of the following error messages: - -ENOEXEC This is the most confusing one. From corrupted ELF header to bad WRITE|EXEC flags check introduced by in module_enforce_rwx_sections() all returns this error number. - -EPERM This is for blacklisted modules. But mod doesn't do extra explain on this error either. - -ENOMEM The only error which needs no explain. This means, if a user got "Exec format error" from modprobe, it provides no meaningful way for the user to debug, and will take extra time communicating to get extra info. So this patch will add extra error messages for -ENOEXEC and -EPERM errors, allowing user to do better debugging and reporting. Signed-off-by: Qu Wenruo Thanks for your patch, agreed that there should be more descriptive error messages to help debug module loading issues. --- kernel/module.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index 8fa2600bde6a..204bf29437b8 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2068,8 +2068,12 @@ static int module_enforce_rwx_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs, int i; for (i = 0; i < hdr->e_shnum; i++) { - if ((sechdrs[i].sh_flags & shf_wx) == shf_wx) + if ((sechdrs[i].sh_flags & shf_wx) == shf_wx) { + pr_err( + "Module %s section %d has invalid WRITE|EXEC flags\n", + mod->name, i); I think it's OK to put pr_err and the format string on the same line. IMO the line break doesn't add much readability value in this case. Also, we have access to secstrings in this function. We can print out the section name with secstrings + sechdrs[i].sh_name in addition to the section number, I think that would be helpful. And can we reformat the message to start with the module name, similar to other pr_err() sites? i.e., pr_err("%s: section %s (index %d) has invalid WRITE|EXEC flags", mod->name, secstrings + sechdrs[i].sh_name, i) The rest looks fine to me. Thanks! Jessica
Re: [PATCH v2] module: Harden STRICT_MODULE_RWX
+++ Ard Biesheuvel [31/08/20 16:25 +0300]: On Mon, 31 Aug 2020 at 13:43, Masahiro Yamada wrote: On Mon, Aug 31, 2020 at 6:47 PM Jessica Yu wrote: > > +++ Will Deacon [21/08/20 13:30 +0100]: > [snipped] > >> > > > So module_enforce_rwx_sections() is already called after > >> > > > module_frob_arch_sections() - which really baffled me at first, since > >> > > > sh_type and sh_flags should have been set already in > >> > > > module_frob_arch_sections(). > >> > > > > >> > > > I added some debug prints to see which section the module code was > >> > > > tripping on, and it was .text.ftrace_trampoline. See this snippet from > >> > > > arm64's module_frob_arch_sections(): > >> > > > > >> > > > else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) && > >> > > > !strcmp(secstrings + sechdrs[i].sh_name, > >> > > > ".text.ftrace_trampoline")) > >> > > > tramp = sechdrs + i; > >> > > > > >> > > > Since Mauro's config doesn't have CONFIG_DYNAMIC_FTRACE enabled, tramp > >> > > > is never set here and the if (tramp) check at the end of the function > >> > > > fails, so its section flags are never set, so they remain WAX and fail > >> > > > the rwx check. > >> > > > >> > > Right. Our module.lds does not go through the preprocessor, so we > >> > > cannot add the #ifdef check there currently. So we should either drop > >> > > the IS_ENABLED() check here, or simply rename the section, dropping > >> > > the .text prefix (which doesn't seem to have any significance outside > >> > > this context) > >> > > > >> > > I'll leave it to Will to make the final call here. > >> > > >> > Why don't we just preprocess the linker script, like we do for the main > >> > kernel? > >> > > >> > >> That should work as well, I just haven't checked how straight-forward > >> it is to change that. > > > >Ok, if it's _not_ straightforward, then let's just drop the IS_ENABLED() > >altogether. > > Unfortunately I've been getting more reports about this issue, so let's just > get the aforementioned workaround merged first. Does the following look OK? > > diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c > index 0ce3a28e3347..2e224435c024 100644 > --- a/arch/arm64/kernel/module-plts.c > +++ b/arch/arm64/kernel/module-plts.c > @@ -305,8 +305,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs, > mod->arch.core.plt_shndx = i; > else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt")) > mod->arch.init.plt_shndx = i; > - else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) && > -!strcmp(secstrings + sechdrs[i].sh_name, > + else if (!strcmp(secstrings + sechdrs[i].sh_name, > ".text.ftrace_trampoline")) > tramp = sechdrs + i; > else if (sechdrs[i].sh_type == SHT_SYMTAB) > > If so I'll turn it into a formal patch and we can get that merged in the next -rc. > > Thanks, > > Jessica Sorry for the delay. Please try the attached patch. Thanks Masahiro, Yes, thanks Masahiro for looking into this! And no worries about the delay. I will be able to test the patch tomorrow. I'll leave it to the maintainers to make the final call, but this does look rather intrusive to me, especially for an -rc. Renaming scripts/module-common.lds to scripts/module.lds means that the distros may have to update their scripts to generate the linux-headers packages etc, so if we do this at all, we'd better do it between releases. Yes, agreed - I was suggesting dropping the IS_ENABLED() check for the next -rc so that the bug reports about this module loading issue stop cropping up, and the "proper" fix of supporting module.lds.S -> .lds would be suitable for the next release instead. Jessica
Re: [PATCH v2] module: Harden STRICT_MODULE_RWX
+++ Will Deacon [21/08/20 13:30 +0100]: [snipped] > > > So module_enforce_rwx_sections() is already called after > > > module_frob_arch_sections() - which really baffled me at first, since > > > sh_type and sh_flags should have been set already in > > > module_frob_arch_sections(). > > > > > > I added some debug prints to see which section the module code was > > > tripping on, and it was .text.ftrace_trampoline. See this snippet from > > > arm64's module_frob_arch_sections(): > > > > > > else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) && > > > !strcmp(secstrings + sechdrs[i].sh_name, > > > ".text.ftrace_trampoline")) > > > tramp = sechdrs + i; > > > > > > Since Mauro's config doesn't have CONFIG_DYNAMIC_FTRACE enabled, tramp > > > is never set here and the if (tramp) check at the end of the function > > > fails, so its section flags are never set, so they remain WAX and fail > > > the rwx check. > > > > Right. Our module.lds does not go through the preprocessor, so we > > cannot add the #ifdef check there currently. So we should either drop > > the IS_ENABLED() check here, or simply rename the section, dropping > > the .text prefix (which doesn't seem to have any significance outside > > this context) > > > > I'll leave it to Will to make the final call here. > > Why don't we just preprocess the linker script, like we do for the main > kernel? > That should work as well, I just haven't checked how straight-forward it is to change that. Ok, if it's _not_ straightforward, then let's just drop the IS_ENABLED() altogether. Unfortunately I've been getting more reports about this issue, so let's just get the aforementioned workaround merged first. Does the following look OK? diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c index 0ce3a28e3347..2e224435c024 100644 --- a/arch/arm64/kernel/module-plts.c +++ b/arch/arm64/kernel/module-plts.c @@ -305,8 +305,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs, mod->arch.core.plt_shndx = i; else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt")) mod->arch.init.plt_shndx = i; - else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) && -!strcmp(secstrings + sechdrs[i].sh_name, + else if (!strcmp(secstrings + sechdrs[i].sh_name, ".text.ftrace_trampoline")) tramp = sechdrs + i; else if (sechdrs[i].sh_type == SHT_SYMTAB) If so I'll turn it into a formal patch and we can get that merged in the next -rc. Thanks, Jessica
Re: [PATCH v2] module: Harden STRICT_MODULE_RWX
+++ Ard Biesheuvel [22/08/20 15:47 +0200]: (+ Masahiro) On Fri, 21 Aug 2020 at 14:30, Will Deacon wrote: On Fri, Aug 21, 2020 at 02:27:05PM +0200, Ard Biesheuvel wrote: > On Fri, 21 Aug 2020 at 14:20, Will Deacon wrote: > > > > On Thu, Aug 13, 2020 at 03:07:13PM +0200, Ard Biesheuvel wrote: > > > On Thu, 13 Aug 2020 at 15:04, Jessica Yu wrote: > > > > > > > > +++ Ard Biesheuvel [13/08/20 10:36 +0200]: > > > > >On Wed, 12 Aug 2020 at 22:00, Peter Zijlstra wrote: > > > > >> > > > > >> On Wed, Aug 12, 2020 at 06:37:57PM +0200, Ard Biesheuvel wrote: > > > > >> > I know there is little we can do at this point, apart from ignoring > > > > >> > the permissions - perhaps we should just defer the w^x check until > > > > >> > after calling module_frob_arch_sections()? > > > > >> > > > > >> My earlier suggestion was to ignore it for 0-sized sections. > > > > > > > > > >Only they are 1 byte sections in this case. > > > > > > > > > >We override the sh_type and sh_flags explicitly for these sections at > > > > >module load time, so deferring the check seems like a reasonable > > > > >alternative to me. > > > > > > > > So module_enforce_rwx_sections() is already called after > > > > module_frob_arch_sections() - which really baffled me at first, since > > > > sh_type and sh_flags should have been set already in > > > > module_frob_arch_sections(). > > > > > > > > I added some debug prints to see which section the module code was > > > > tripping on, and it was .text.ftrace_trampoline. See this snippet from > > > > arm64's module_frob_arch_sections(): > > > > > > > > else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) && > > > > !strcmp(secstrings + sechdrs[i].sh_name, > > > > ".text.ftrace_trampoline")) > > > > tramp = sechdrs + i; > > > > > > > > Since Mauro's config doesn't have CONFIG_DYNAMIC_FTRACE enabled, tramp > > > > is never set here and the if (tramp) check at the end of the function > > > > fails, so its section flags are never set, so they remain WAX and fail > > > > the rwx check. > > > > > > Right. Our module.lds does not go through the preprocessor, so we > > > cannot add the #ifdef check there currently. So we should either drop > > > the IS_ENABLED() check here, or simply rename the section, dropping > > > the .text prefix (which doesn't seem to have any significance outside > > > this context) > > > > > > I'll leave it to Will to make the final call here. > > > > Why don't we just preprocess the linker script, like we do for the main > > kernel? > > > > That should work as well, I just haven't checked how straight-forward > it is to change that. Ok, if it's _not_ straightforward, then let's just drop the IS_ENABLED() altogether. I played around with this for a while, but failed to get Kbuild to instantiate $(objtree)/arch/arm64/kernel/module.lds based on $(srctree)/arch/arm64/kernel/module.lds.S and the cpp_lds_S rule. Perhaps Masahiro has any suggestions here? Otherwise, let's just drop the IS_ENABLED() check for now. I also tinkered around a bit and was able to generate $(objtree)/arch/arm64/kernel/module.lds based on $(srctree)/arch/arm64/kernel/module.lds.S only if I specified the former as the make target directly. Correct me if I'm wrong, but I guess this might be because the single build targets would utilize scripts/Makefile.build (where the cpp_lds_S rule is defined) while the module-related Makefiles don't seem to support .lds.S -> .lds in general.. Masahiro, how easy would it be to extend .lds.S -> .lds support to module linker scripts as well? Thanks, Jessica
[GIT PULL] Modules updates for v5.9
Hi Linus, Please pull below to receive modules updates for the v5.9 merge window. The most important change would be Christoph Hellwig's patch implementing proprietary taint inheritance, in an effort to discourage the creation of GPL "shim" modules that interface between GPL symbols and proprietary symbols. A recent example cited from the discussion can be found here: https://lore.kernel.org/netdev/6376ca34-bc6f-45de-9ffd-7e32664c7...@fb.com/T/#md514322fdfa212afe9f1d3eb4e5f7eaefece36eb Thank you, Jessica -- The following changes since commit b3a9e3b9622ae10064826dccb4f7a52bd88c7407: Linux 5.8-rc1 (2020-06-14 12:45:04 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux.git tags/modules-for-v5.9 for you to fetch changes up to 262e6ae7081df304fc625cf368d5c2cbba2bb991: modules: inherit TAINT_PROPRIETARY_MODULE (2020-08-05 10:31:28 +0200) Modules updates for v5.9 Summary of modules changes for the 5.9 merge window: - Have modules that use symbols from proprietary modules inherit the TAINT_PROPRIETARY_MODULE taint, in an effort to prevent GPL shim modules that are used to circumvent _GPL exports. These are modules that claim to be GPL licensed while also using symbols from proprietary modules. Such modules will be rejected while non-GPL modules will inherit the proprietary taint. - Module export space cleanup. Unexport symbols that are unused outside of module.c or otherwise used in only built-in code. Signed-off-by: Jessica Yu Christoph Hellwig (8): modules: mark ref_module static modules: mark find_symbol static modules: mark each_symbol_section static modules: unexport __module_text_address modules: unexport __module_address modules: rename the licence field in struct symsearch to license modules: return licensing information from find_symbol modules: inherit TAINT_PROPRIETARY_MODULE Randy Dunlap (1): modules: linux/moduleparam.h: drop duplicated word in a comment include/linux/module.h | 26 +++- include/linux/moduleparam.h | 2 +- kernel/module.c | 60 - 3 files changed, 47 insertions(+), 41 deletions(-)
Re: [PATCH v2] module: Harden STRICT_MODULE_RWX
+++ Ard Biesheuvel [13/08/20 10:36 +0200]: On Wed, 12 Aug 2020 at 22:00, Peter Zijlstra wrote: On Wed, Aug 12, 2020 at 06:37:57PM +0200, Ard Biesheuvel wrote: > I know there is little we can do at this point, apart from ignoring > the permissions - perhaps we should just defer the w^x check until > after calling module_frob_arch_sections()? My earlier suggestion was to ignore it for 0-sized sections. Only they are 1 byte sections in this case. We override the sh_type and sh_flags explicitly for these sections at module load time, so deferring the check seems like a reasonable alternative to me. So module_enforce_rwx_sections() is already called after module_frob_arch_sections() - which really baffled me at first, since sh_type and sh_flags should have been set already in module_frob_arch_sections(). I added some debug prints to see which section the module code was tripping on, and it was .text.ftrace_trampoline. See this snippet from arm64's module_frob_arch_sections(): else if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) && !strcmp(secstrings + sechdrs[i].sh_name, ".text.ftrace_trampoline")) tramp = sechdrs + i; Since Mauro's config doesn't have CONFIG_DYNAMIC_FTRACE enabled, tramp is never set here and the if (tramp) check at the end of the function fails, so its section flags are never set, so they remain WAX and fail the rwx check.
Re: [PATCH v2] module: Harden STRICT_MODULE_RWX
+++ Szabolcs Nagy [12/08/20 15:15 +0100]: The 08/12/2020 13:56, Will Deacon wrote: On Wed, Aug 12, 2020 at 12:40:05PM +0200, pet...@infradead.org wrote: > On Wed, Aug 12, 2020 at 10:56:56AM +0200, Ard Biesheuvel wrote: > > The module .lds has BYTE(0) in the section contents to prevent the > > linker from pruning them entirely. The (NOLOAD) is there to ensure > > that this byte does not end up in the .ko, which is more a matter of > > principle than anything else, so we can happily drop that if it helps. > > > > However, this should only affect the PROGBITS vs NOBITS designation, > > and so I am not sure whether it makes a difference. > > > > Depending on where the w^x check occurs, we might simply override the > > permissions of these sections, and strip the writable permission if it > > is set in the PLT handling init code, which manipulates the metadata > > of all these 3 sections before the module space is vmalloc'ed. > > What's curious is that this seems the result of some recent binutils > change. Every build with binutils-2.34 (or older) does not seem to > generate these as WAX, but has the much more sensible WA. > > I suppose we can change the kernel check and 'allow' W^X for 0 sized > sections, but I think we should still figure out why binutils-2.35 is > now generating WAX sections all of a sudden, it might come bite us > elsewhere. Agreed, I think it's important to figure out what's going on here before we try to bodge around it. Adding Szabolcs, in case he has any ideas. To save him reading the whole thread, here's a summary: AArch64 kernel modules built with binutils 2.35 end up with a couple of ELF sections marked as SHF_WRITE | SHF_ALLOC | SHF_EXECINSTR: [ 5] .plt PROGBITS 0388 01d000 08 00 WAX 0 0 1 [ 6] .init.plt NOBITS 0390 01d008 08 00 WA 0 0 1 [ 7] .text.ftrace_trampoline PROGBITS 0398 01d008 08 00 WAX 0 0 1 This results in the module being rejected by our loader, because we don't permit writable, executable mappings. Our linker script for these entries uses NOLOAD, so it's odd to see PROGBITS appearing in the readelf output above (and older binutils emits NOBITS sections). Anyway, here's the linker script: SECTIONS { .plt (NOLOAD) : { BYTE(0) } .init.plt (NOLOAD) : { BYTE(0) } .text.ftrace_trampoline (NOLOAD) : { BYTE(0) } } It appears that the name of the section influences the behaviour, as Jessica observed [1] that sections named .text.* end up with PROGBITS, whereas random naming such as ".test" ends up with NOBITS, as before. We've looked at the changelog between binutils 2.34 and 2.35, but nothing stands out. Any clues? Is this intentional binutils behaviour? for me it bisects to https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=8c803a2dd7d3d742a3d0071914f557ef465afe71 i will have to investigate further what's going on. Thanks for the hint. I'm almost certain it's due to this excerpt from the changelog: "we now init sh_type and sh_flags for all known ABI sections in _bfd_elf_new_section_hook." Indeed, .plt and .text.* are listed as special sections in bfd/elf.c. The former requires an exact match and the latter only has to match the prefix ".text." Since the code considers ".plt" and ".text.ftrace_trampoline" special sections, their sh_type and sh_flags are now set by default. Now I guess the question is whether this can be overriden by a linker script..
Re: [PATCH v2] module: Harden STRICT_MODULE_RWX
+++ pet...@infradead.org [12/08/20 12:40 +0200]: On Wed, Aug 12, 2020 at 10:56:56AM +0200, Ard Biesheuvel wrote: The module .lds has BYTE(0) in the section contents to prevent the linker from pruning them entirely. The (NOLOAD) is there to ensure that this byte does not end up in the .ko, which is more a matter of principle than anything else, so we can happily drop that if it helps. However, this should only affect the PROGBITS vs NOBITS designation, and so I am not sure whether it makes a difference. Depending on where the w^x check occurs, we might simply override the permissions of these sections, and strip the writable permission if it is set in the PLT handling init code, which manipulates the metadata of all these 3 sections before the module space is vmalloc'ed. What's curious is that this seems the result of some recent binutils change. Every build with binutils-2.34 (or older) does not seem to generate these as WAX, but has the much more sensible WA. I suppose we can change the kernel check and 'allow' W^X for 0 sized sections, but I think we should still figure out why binutils-2.35 is now generating WAX sections all of a sudden, it might come bite us elsewhere. I have just tested with binutils-2.35 and am observing the same behavior. Both .plt and .text.ftrace_trampoline end up with SHT_PROGBITS and are marked 'WAX'. With binutils-2.34 they keep the NOBITS designation. I had thought NOLOAD implies NOBITS, but that doesn't seem to be the case anymore? I tinkered with module.lds a bit and noticed that the name of the section seems to matters. So this: SECTIONS { .plt (NOLOAD) : { BYTE(0) } .init.plt (NOLOAD) : { BYTE(0) } .text.ftrace_trampoline (NOLOAD) : { BYTE(0) } + .test (NOLOAD) : { BYTE(0) } + .text.test (NOLOAD) : { BYTE(0) } + .plt.test (NOLOAD) : { BYTE(0) } } Yielded the following: [22] .plt PROGBITS0340 000e40 01 00 WAX 0 0 1 [23] .init.plt NOBITS 0341 000e41 01 00 WA 0 0 1 [24] .text.ftrace_trampoline PROGBITS0342 000e41 01 00 WAX 0 0 1 [25] .test NOBITS 0343 000e42 01 00 WA 0 0 1 [26] .text.testPROGBITS0344 000e42 01 00 WAX 0 0 1 [27] .plt.test NOBITS 0345 000e43 01 00 WA 0 0 1 So ".plt" and any section starting with ".text" were automatically designated as SHT_PROGBITS and given the executable flag. It appears the NOLOAD directive has been ignored or overridden in the case of these sections. I am not sure if this is a bug in binutils or if this behavior is intentional. I tried to grok the changelog between 2.34 and 2.35 but could not find anything glaringly obvious that would cause this change. CC'ing the binutils mailing list, hopefully that's the right place to ask. Jessica
Re: [PATCH v2] module: Harden STRICT_MODULE_RWX
+++ Mauro Carvalho Chehab [11/08/20 17:27 +0200]: Em Tue, 11 Aug 2020 16:55:24 +0200 pet...@infradead.org escreveu: On Tue, Aug 11, 2020 at 04:34:27PM +0200, Mauro Carvalho Chehab wrote: > [33] .plt PROGBITS 0340 00035c80 >0001 WAX 0 0 1 > [34] .init.plt NOBITS 0341 00035c81 >0001 WA 0 0 1 > [35] .text.ftrace[...] PROGBITS 0342 00035c81 >0001 WAX 0 0 1 .plt and .text.ftrace_tramplines are buggered. arch/arm64/kernel/module.lds even marks then as NOLOAD. Hmm... Shouldn't the code at module_enforce_rwx_sections() or at load_module() ignore such sections instead of just rejecting probing all modules? I mean, if the existing toolchain is not capable of excluding those sections, either the STRICT_MODULE_RWX hardening should be disabled, if a broken toolchain is detected or some runtime code should handle such sections on a different way. Hi Mauro, thanks for providing the readelf output. The sections marked 'WAX' are indeed the reason why the module loader is rejecting them. Interesting, my cross-compiled modules do not have the executable flag - [38] .plt NOBITS 0340 00038fc0 0001 WA 0 0 1 [39] .init.plt NOBITS 0341 00038fc0 0001 WA 0 0 1 [40] .text.ftrace_tram NOBITS 0342 00038fc0 0001 WA 0 0 1 ld version: GNU ld (GNU Binutils) 2.34 Copyright (C) 2020 Free Software Foundation, Inc. This program is free software; you may redistribute it under the terms of the GNU General Public License version 3 or (at your option) a later version. And gcc: aarch64-linux-gcc (GCC) 9.3.0 Copyright (C) 2019 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. I'm a bit confused about what NOLOAD actually implies in this context. From the ld documentation - "The `(NOLOAD)' directive will mark a section to not be loaded at run time." But these sections are marked SHF_ALLOC and are referenced to in the module plt code. Or does it just tell the linker to not initialize it? Adding Ard to CC, I'm sure he'd know more about the section flag specifics. Jessica
Re: [PATCH v2] module: Harden STRICT_MODULE_RWX
+++ Jessica Yu [10/08/20 11:25 +0200]: +++ Mauro Carvalho Chehab [08/08/20 10:12 +0200]: [snip] Right now, what happens is: # modprobe wlcore modprobe: ERROR: could not insert 'wlcore': Exec format error This seems to be failing for all modules, as doesn't show anything probed. Btw, IMO, it would be useful to have some pr_debug() infra in order to explain why insmod is failing, or to have more error codes used there, as nothing was printed at dmesg. That makes harder to debug issues there. I ended losing a lot of time yesterday rebuilding the Kernel and checking the FS, before deciding to add some printks inside the Kernel ;-) In order for modprobe to start working again, I had to apply this dirty hack: diff --git a/kernel/module.c b/kernel/module.c index 910a57640818..10d590dc48ad 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2051,11 +2051,12 @@ static int module_enforce_rwx_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs, const unsigned long shf_wx = SHF_WRITE|SHF_EXECINSTR; int i; +#if 0 for (i = 0; i < hdr->e_shnum; i++) { if ((sechdrs[i].sh_flags & shf_wx) == shf_wx) return -ENOEXEC; } - +#endif return 0; } [ I somehow munged the To: header in the last mail. Sorry about that, it's fixed now. ] All this hunk does it reject loading modules that have any sections that have both the writable and executable flags. You're saying it's happening for all modules on your setup - I am curious as to which sections have both these flags - what does readelf -S tell you? Hmm, I was not able to reproduce this with a cross-compiled kernel using the attached config (gcc 9.3.0 with vanilla v5.8 kernel). I am curious if the failing sections are also SHF_ALLOC - in that case, the code is doing what it is intended to do, which is rejecting loading any modules with writable and executable sections. If the problematic sections are *not* SHF_ALLOC, then we might be able to work around that by ignoring non-SHF_ALLOC sections as they are not copied to the final module location anyway. Jessica
Re: [PATCH v2] module: Harden STRICT_MODULE_RWX
+++ Mauro Carvalho Chehab [08/08/20 10:12 +0200]: [snip] Right now, what happens is: # modprobe wlcore modprobe: ERROR: could not insert 'wlcore': Exec format error This seems to be failing for all modules, as doesn't show anything probed. Btw, IMO, it would be useful to have some pr_debug() infra in order to explain why insmod is failing, or to have more error codes used there, as nothing was printed at dmesg. That makes harder to debug issues there. I ended losing a lot of time yesterday rebuilding the Kernel and checking the FS, before deciding to add some printks inside the Kernel ;-) In order for modprobe to start working again, I had to apply this dirty hack: diff --git a/kernel/module.c b/kernel/module.c index 910a57640818..10d590dc48ad 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2051,11 +2051,12 @@ static int module_enforce_rwx_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs, const unsigned long shf_wx = SHF_WRITE|SHF_EXECINSTR; int i; +#if 0 for (i = 0; i < hdr->e_shnum; i++) { if ((sechdrs[i].sh_flags & shf_wx) == shf_wx) return -ENOEXEC; } - +#endif return 0; } All this hunk does it reject loading modules that have any sections that have both the writable and executable flags. You're saying it's happening for all modules on your setup - I am curious as to which sections have both these flags - what does readelf -S tell you? Thanks, Jessica
Re: [PATCH 1/2] module: Correctly truncate sysfs sections output
+++ Kees Cook [06/08/20 23:35 -0700]: The only-root-readable /sys/module/$module/sections/$section files did not truncate their output to the available buffer size. While most paths into the kernfs read handlers end up using PAGE_SIZE buffers, it's possible to get there through other paths (e.g. splice, sendfile). Actually limit the output to the "count" passed into the read function, and report it back correctly. *sigh* Reported-by: kernel test robot Link: https://lore.kernel.org/lkml/20200805002015.GE23458@shao2-debian Fixes: ed66f991bb19 ("module: Refactor section attr into bin attribute") Cc: sta...@vger.kernel.org Cc: Jessica Yu Signed-off-by: Kees Cook Oof, thanks for fixing this! Acked-by: Jessica Yu --- kernel/module.c | 22 +++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index aa183c9ac0a2..08c46084d8cc 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -1520,18 +1520,34 @@ struct module_sect_attrs { struct module_sect_attr attrs[]; }; +#define MODULE_SECT_READ_SIZE (3 /* "0x", "\n" */ + (BITS_PER_LONG / 4)) static ssize_t module_sect_read(struct file *file, struct kobject *kobj, struct bin_attribute *battr, char *buf, loff_t pos, size_t count) { struct module_sect_attr *sattr = container_of(battr, struct module_sect_attr, battr); + char bounce[MODULE_SECT_READ_SIZE + 1]; + size_t wrote; if (pos != 0) return -EINVAL; - return sprintf(buf, "0x%px\n", - kallsyms_show_value(file->f_cred) ? (void *)sattr->address : NULL); + /* +* Since we're a binary read handler, we must account for the +* trailing NUL byte that sprintf will write: if "buf" is +* too small to hold the NUL, or the NUL is exactly the last +* byte, the read will look like it got truncated by one byte. +* Since there is no way to ask sprintf nicely to not write +* the NUL, we have to use a bounce buffer. +*/ + wrote = scnprintf(bounce, sizeof(bounce), "0x%px\n", +kallsyms_show_value(file->f_cred) + ? (void *)sattr->address : NULL); + count = min(count, wrote); + memcpy(buf, bounce, count); + + return count; } static void free_sect_attrs(struct module_sect_attrs *sect_attrs) @@ -1580,7 +1596,7 @@ static void add_sect_attrs(struct module *mod, const struct load_info *info) goto out; sect_attrs->nsections++; sattr->battr.read = module_sect_read; - sattr->battr.size = 3 /* "0x", "\n" */ + (BITS_PER_LONG / 4); + sattr->battr.size = MODULE_SECT_READ_SIZE; sattr->battr.attr.mode = 0400; *(gattr++) = &(sattr++)->battr; } -- 2.25.1
Re: [PATCH v4 11/17] module: Call security_kernel_post_load_data()
+++ Kees Cook [29/07/20 10:58 -0700]: Now that there is an API for checking loaded contents for modules loaded without a file, call into the LSM hooks. Cc: Jessica Yu Signed-off-by: Kees Cook Acked-by: Jessica Yu
Re: [PATCH v4 00/10] Function Granular KASLR
+++ Joe Lawrence [03/08/20 14:17 -0400]: On 8/3/20 1:45 PM, Kees Cook wrote: On Mon, Aug 03, 2020 at 02:39:32PM +0300, Evgenii Shatokhin wrote: There are at least 2 places where high-order memory allocations might happen during module loading. Such allocations may fail if memory is fragmented, while physically contiguous memory areas are not really needed there. I suggest to switch to kvmalloc/kvfree there. Thanks Evgenii for pointing out the potential memory allocation issues that may arise with very large modules when memory is fragmented. I was curious as to which modules on my machine would be considered large, and there seems to be quite a handful...(x86_64 with v5.8-rc6 with a relatively standard distro config and FG KASLR patches on top): ./amdgpu/sections 7277 ./i915/sections 4267 ./nouveau/sections 3772 ./xfs/sections 2395 ./btrfs/sections 1966 ./mac80211/sections 1588 ./kvm/sections 1468 ./cfg80211/sections 1194 ./drm/sections 1012 ./bluetooth/sections 843 ./iwlmvm/sections 664 ./usbcore/sections 524 ./videodev/sections 436 So, I agree with the suggestion that we could switch to kvmalloc() to try to mitigate potential allocation problems when memory is fragmented. While this does seem to be the right solution for the extant problem, I do want to take a moment and ask if the function sections need to be exposed at all? What tools use this information, and do they just want to see the bounds of the code region? (i.e. the start/end of all the .text* sections) Perhaps .text.* could be excluded from the sysfs section list? [[cc += FChE, see [0] for Evgenii's full mail ]] It looks like debugging tools like systemtap [1], gdb [2] and its add-symbol-file cmd, etc. peek at the /sys/module//section/ info. But yeah, it would be preferable if we didn't export a long sysfs representation if nobody actually needs it. Thanks Joe for looking into this. Hmm, AFAICT for gdb it's not a hard dependency per se - for add-symbol-file I was under the impression that we are responsible for obtaining the relevant section addresses ourselves through /sys/module/ (the most oft cited method) and then feeding those to add-symbol-file. It would definitely be more difficult to find out the section addresses without the /sys/module/ section entries. In any case, it sounds like systemtap has a hard dependency on /sys/module/*/sections anyway. Regarding /proc/kallsyms, I think it is probably possible to expose section symbols and their addresses via /proc/kallsyms rather than through sysfs (it would then live in the module's vmalloc'ed memory) but I'm not sure how helpful that would actually be, especially since existing tools depend on the sysfs interface being there. [0] https://lore.kernel.org/lkml/e9c4d88b-86db-47e9-4299-3fac45a7e...@virtuozzo.com/ [1] https://fossies.org/linux/systemtap/staprun/staprun.c [2] https://www.oreilly.com/library/view/linux-device-drivers/0596005903/ch04.html#linuxdrive3-CHP-4-SECT-6.1 -- Joe
Re: [PATCH 8/8] modules: inherit TAINT_PROPRIETARY_MODULE
+++ Christoph Hellwig [31/07/20 11:00 +0200]: On Fri, Jul 31, 2020 at 10:51:30AM +0200, Jessica Yu wrote: + if (mod->using_gplonly_symbols) { + pr_info("%s: module using GPL-only symbols uses symbols from proprietary module %s.\n", + mod->name, owner->name); pr_err() maybe? + return false; + } + + if (!test_bit(TAINT_PROPRIETARY_MODULE, &mod->taints)) { + pr_info("%s: module uses symbols from proprietary module %s, inheriting taint.\n", + mod->name, owner->name); and pr_warn()? But otherwise this looks much better. Ok with me. Can you just fix it up, or do you want a full resend? I can fix it up, no need to resend. Thanks!
Re: [PATCH 8/8] modules: inherit TAINT_PROPRIETARY_MODULE
+++ Christoph Hellwig [30/07/20 18:29 +0200]: On Thu, Jul 30, 2020 at 04:12:32PM +0200, Jessica Yu wrote: + if (owner && test_bit(TAINT_PROPRIETARY_MODULE, &owner->taints)) { + if (mod->using_gplonly_symbols) { + sym = NULL; + goto getname; + } + add_taint_module(mod, TAINT_PROPRIETARY_MODULE, +LOCKDEP_NOW_UNRELIABLE); + } Sorry that I didn't think of this yesterday, but I'm wondering if we should print a warning before add_taint_module(). Maybe something along the lines of, "%s: module uses symbols from proprietary module %s, inheriting taint.", with %s being mod->name, owner->name. We can check mod->taints for TAINT_PROPRIETARY_MODULE and print the warning once. Additionally, maybe it's a good idea to print an error before goto getname (e.g., "%s: module using GPL-only symbols uses symbols from proprietary module %s."), so one would know why the module load failed, right now this manifests itself as an unknown symbol error. Otherwise, this patchset looks good to me and I agree with it in principle. Thanks Christoph! What about this version? It also factors the code out into a new helper, and replaces the add_taint_module with a simple set_bit, as the system-wide tain must have been set before by definition: Yep, this version looks much better. See below for nits. --- From 25e928b6b691911717d30b3449e56fca3e13dba9 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Tue, 28 Jul 2020 23:33:33 +0200 Subject: modules: inherit TAINT_PROPRIETARY_MODULE If a TAINT_PROPRIETARY_MODULE exports symbol, inherit the taint flag for all modules importing these symbols, and don't allow loading symbols from TAINT_PROPRIETARY_MODULE modules if the module previously imported gplonly symbols. Add a anti-circumvention devices so people don't accidentally get themselves into trouble this way. Comment from Greg: "Ah, the proven-to-be-illegal "GPL Condom" defense :)" Signed-off-by: Christoph Hellwig Reviewed-by: Greg Kroah-Hartman --- include/linux/module.h | 1 + kernel/module.c| 26 ++ 2 files changed, 27 insertions(+) diff --git a/include/linux/module.h b/include/linux/module.h index 30b0f5fcdb3c37..e30ed5fa33a738 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -389,6 +389,7 @@ struct module { unsigned int num_gpl_syms; const struct kernel_symbol *gpl_syms; const s32 *gpl_crcs; + bool using_gplonly_symbols; #ifdef CONFIG_UNUSED_SYMBOLS /* unused exported symbols. */ diff --git a/kernel/module.c b/kernel/module.c index afb2bfdd5134b3..81d5facce28c14 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -1431,6 +1431,24 @@ static int verify_namespace_is_imported(const struct load_info *info, return 0; } +static bool inherit_taint(struct module *mod, struct module *owner) +{ + if (!owner || !test_bit(TAINT_PROPRIETARY_MODULE, &owner->taints)) + return true; + + if (mod->using_gplonly_symbols) { + pr_info("%s: module using GPL-only symbols uses symbols from proprietary module %s.\n", + mod->name, owner->name); pr_err() maybe? + return false; + } + + if (!test_bit(TAINT_PROPRIETARY_MODULE, &mod->taints)) { + pr_info("%s: module uses symbols from proprietary module %s, inheriting taint.\n", + mod->name, owner->name); and pr_warn()? But otherwise this looks much better. Thanks, Jessica
Re: [PATCH 8/8] modules: inherit TAINT_PROPRIETARY_MODULE
+++ Christoph Hellwig [30/07/20 08:10 +0200]: If a TAINT_PROPRIETARY_MODULE exports symbol, inherit the taint flag for all modules importing these symbols, and don't allow loading symbols from TAINT_PROPRIETARY_MODULE modules if the module previously imported gplonly symbols. Add a anti-circumvention devices so people don't accidentally get themselves into trouble this way. Comment from Greg: Ah, the proven-to-be-illegal "GPL Condom" defense :) Signed-off-by: Christoph Hellwig Reviewed-by: Greg Kroah-Hartman --- include/linux/module.h | 1 + kernel/module.c| 12 2 files changed, 13 insertions(+) diff --git a/include/linux/module.h b/include/linux/module.h index 30b0f5fcdb3c37..e30ed5fa33a738 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -389,6 +389,7 @@ struct module { unsigned int num_gpl_syms; const struct kernel_symbol *gpl_syms; const s32 *gpl_crcs; + bool using_gplonly_symbols; #ifdef CONFIG_UNUSED_SYMBOLS /* unused exported symbols. */ diff --git a/kernel/module.c b/kernel/module.c index afb2bfdd5134b3..04f993863ae417 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -1456,6 +1456,18 @@ static const struct kernel_symbol *resolve_symbol(struct module *mod, if (!sym) goto unlock; + if (license == GPL_ONLY) + mod->using_gplonly_symbols = true; + + if (owner && test_bit(TAINT_PROPRIETARY_MODULE, &owner->taints)) { + if (mod->using_gplonly_symbols) { + sym = NULL; + goto getname; + } + add_taint_module(mod, TAINT_PROPRIETARY_MODULE, +LOCKDEP_NOW_UNRELIABLE); + } Sorry that I didn't think of this yesterday, but I'm wondering if we should print a warning before add_taint_module(). Maybe something along the lines of, "%s: module uses symbols from proprietary module %s, inheriting taint.", with %s being mod->name, owner->name. We can check mod->taints for TAINT_PROPRIETARY_MODULE and print the warning once. Additionally, maybe it's a good idea to print an error before goto getname (e.g., "%s: module using GPL-only symbols uses symbols from proprietary module %s."), so one would know why the module load failed, right now this manifests itself as an unknown symbol error. Otherwise, this patchset looks good to me and I agree with it in principle. Thanks Christoph!
Re: [PATCH 6/7] modules: return licensing information from find_symbol
+++ Christoph Hellwig [29/07/20 08:27 +0200]: Report the GPLONLY status through a new argument. Signed-off-by: Christoph Hellwig --- include/linux/module.h | 2 +- kernel/module.c| 16 +++- 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/include/linux/module.h b/include/linux/module.h index b79219eed83c56..c9bc3412ae4465 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -582,7 +582,7 @@ struct module *find_module(const char *name); struct symsearch { const struct kernel_symbol *start, *stop; const s32 *crcs; - enum { + enum mod_licence { NOT_GPL_ONLY, GPL_ONLY, WILL_BE_GPL_ONLY, diff --git a/kernel/module.c b/kernel/module.c index 54e853c7212f72..a907bc57d343f9 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -495,6 +495,7 @@ struct find_symbol_arg { struct module *owner; const s32 *crc; const struct kernel_symbol *sym; + enum mod_licence licence; Just a small nit. Most of module.c uses license rather than licence - could we unify the spelling to remain consistent? Sigh, American vs. British English.. :)
Re: [PATCH 2/7] modules: mark find_symbol static
+++ Christoph Hellwig [29/07/20 08:27 +0200]: find_symbol is only used in module.c. Signed-off-by: Christoph Hellwig CCing the livepatching ML, as this may or may not impact its users. AFAIK, the out-of-tree kpatch module had used find_symbol() in the past, I am not sure what its current status is. I suspect all of its functionality has been migrated to upstream livepatch already. --- include/linux/module.h | 11 --- kernel/module.c| 3 +-- 2 files changed, 1 insertion(+), 13 deletions(-) diff --git a/include/linux/module.h b/include/linux/module.h index f1fdbeef2153a8..90bdc362be3681 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -590,17 +590,6 @@ struct symsearch { bool unused; }; -/* - * Search for an exported symbol by name. - * - * Must be called with module_mutex held or preemption disabled. - */ -const struct kernel_symbol *find_symbol(const char *name, - struct module **owner, - const s32 **crc, - bool gplok, - bool warn); - /* * Walk the exported symbol table * diff --git a/kernel/module.c b/kernel/module.c index 17d64dae756c80..84da96a6d8241c 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -585,7 +585,7 @@ static bool find_exported_symbol_in_section(const struct symsearch *syms, /* Find an exported symbol and return it, along with, (optional) crc and * (optional) module which owns it. Needs preempt disabled or module_mutex. */ -const struct kernel_symbol *find_symbol(const char *name, +static const struct kernel_symbol *find_symbol(const char *name, struct module **owner, const s32 **crc, bool gplok, @@ -608,7 +608,6 @@ const struct kernel_symbol *find_symbol(const char *name, pr_debug("Failed to find symbol %s\n", name); return NULL; } -EXPORT_SYMBOL_GPL(find_symbol); /* * Search for module by name: must hold module_mutex (or preempt disabled -- 2.27.0
Re: [PATCH v4 10/10] module: Reorder functions
+++ Kristen Carlson Accardi [17/07/20 10:00 -0700]: Introduce a new config option to allow modules to be re-ordered by function. This option can be enabled independently of the kernel text KASLR or FG_KASLR settings so that it can be used by architectures that do not support either of these features. This option will be selected by default if CONFIG_FG_KASLR is selected. If a module has functions split out into separate text sections (i.e. compiled with the -ffunction-sections flag), reorder the functions to provide some code diversification to modules. Signed-off-by: Kristen Carlson Accardi Reviewed-by: Kees Cook Acked-by: Ard Biesheuvel Tested-by: Ard Biesheuvel Reviewed-by: Tony Luck Tested-by: Tony Luck Hi Kristen! I've boot tested this on x86, (un)loaded some modules, and checked their resulting section addresses as a quick sanity test. Feel free to add my: Acked-by: Jessica Yu Tested-by: Jessica Yu Thank you! Jessica
Re: [PATCH] modules: linux/moduleparam.h: drop duplicated word in a comment
+++ Randy Dunlap [17/07/20 16:36 -0700]: From: Randy Dunlap Drop the doubled word "the" in a comment. Signed-off-by: Randy Dunlap Cc: Jessica Yu --- include/linux/moduleparam.h |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- linux-next-20200714.orig/include/linux/moduleparam.h +++ linux-next-20200714/include/linux/moduleparam.h @@ -108,7 +108,7 @@ struct kparam_array * ".") the kernel commandline parameter. Note that - is changed to _, so * the user can use "foo-bar=1" even for variable "foo_bar". * - * @perm is 0 if the the variable is not to appear in sysfs, or 0444 + * @perm is 0 if the variable is not to appear in sysfs, or 0444 * for world-readable, 0644 for root-writable, etc. Note that if it * is writable, you may need to use kernel_param_lock() around * accesses (esp. charp, which can be kfreed when it changes). Applied, thanks! Jessica
Re: [PATCH 3/5] module: Do not expose section addresses to non-CAP_SYSLOG
+++ Kees Cook [02/07/20 16:26 -0700]: The printing of section addresses in /sys/module/*/sections/* was not using the correct credentials to evaluate visibility. Before: # cat /sys/module/*/sections/.*text 0xc0458000 ... # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text" 0xc0458000 ... After: # cat /sys/module/*/sections/*.text 0xc0458000 ... # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text" 0x ... Additionally replaces the existing (safe) /proc/modules check with file->f_cred for consistency. Cc: sta...@vger.kernel.org Reported-by: Dominik Czarnota Fixes: be71eda5383f ("module: Fix display of wrong module .text address") Signed-off-by: Kees Cook Tested-by: Jessica Yu Acked-by: Jessica Yu
Re: [PATCH 2/5] module: Refactor section attr into bin attribute
+++ Kees Cook [02/07/20 16:26 -0700]: In order to gain access to the open file's f_cred for kallsym visibility permission checks, refactor the module section attributes to use the bin_attribute instead of attribute interface. Additionally removes the redundant "name" struct member. Cc: sta...@vger.kernel.org Signed-off-by: Kees Cook Hi Kees, This looks good to me: Tested-by: Jessica Yu Acked-by: Jessica Yu Thanks!
Re: [PATCH] kernel/module: add name size info to pr_debug() calls
+++ Jim Cromie [11/06/20 08:20 -0600]: when booted with arg: module.dyndbg=+p dmesg gets volumes of info about loaded modules. This adds module & symbol names, and sizes where pertinent. Now I can know which module's info Im looking at. Hi, Could you please fix the changelog formatting according to Documentation/process/submitting-patches.rst? More specifically, complete sentences, line wrapped at 75 columns, and a Signed-off-by: line at the end. There are plenty of examples if you look through the lkml mailing list. --- kernel/module.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index e8a198588f26..d871d9cee9eb 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2294,8 +2294,8 @@ static int simplify_symbols(struct module *mod, const struct load_info *info) case SHN_ABS: /* Don't need to do anything */ - pr_debug("Absolute symbol: 0x%08lx\n", - (long)sym[i].st_value); + pr_debug("Absolute symbol: 0x%08lx %s\n", +(long)sym[i].st_value, name); I would prefer "Absolute symbol %s:" rather than putting the symbol name at the end. break; case SHN_LIVEPATCH: @@ -2409,7 +2409,7 @@ static void layout_sections(struct module *mod, struct load_info *info) for (i = 0; i < info->hdr->e_shnum; i++) info->sechdrs[i].sh_entsize = ~0UL; - pr_debug("Core section allocation order:\n"); + pr_debug("Core section allocation order for: %s\n", mod->name); I would slightly prefer "Core section allocation order for %s:", but it's a matter of taste. for (m = 0; m < ARRAY_SIZE(masks); ++m) { for (i = 0; i < info->hdr->e_shnum; ++i) { Elf_Shdr *s = &info->sechdrs[i]; @@ -2442,7 +2442,7 @@ static void layout_sections(struct module *mod, struct load_info *info) } } - pr_debug("Init section allocation order:\n"); + pr_debug("Init section allocation order for: %s\n", mod->name); Same here. for (m = 0; m < ARRAY_SIZE(masks); ++m) { for (i = 0; i < info->hdr->e_shnum; ++i) { Elf_Shdr *s = &info->sechdrs[i]; @@ -3276,7 +3276,7 @@ static int move_module(struct module *mod, struct load_info *info) mod->init_layout.base = NULL; /* Transfer each section which specifies SHF_ALLOC */ - pr_debug("final section addresses:\n"); + pr_debug("final section addresses for: %s\n", mod->name); Let's capitalize the "f" in "final section addresses" to be consistent with the other two above. for (i = 0; i < info->hdr->e_shnum; i++) { void *dest; Elf_Shdr *shdr = &info->sechdrs[i]; @@ -3294,8 +3294,8 @@ static int move_module(struct module *mod, struct load_info *info) memcpy(dest, (void *)shdr->sh_addr, shdr->sh_size); /* Update sh_addr to point to copy in image. */ shdr->sh_addr = (unsigned long)dest; - pr_debug("\t0x%lx %s\n", -(long)shdr->sh_addr, info->secstrings + shdr->sh_name); + pr_debug("\t0x%lx 0x%.8lx %s\n", (long)shdr->sh_addr, +(long)shdr->sh_size, info->secstrings + shdr->sh_name); Any reason why you added sh_size here? Is it really needed? You can usually deduce how much space a section is taking up in a module by looking at the final section address printout. Plus the elf section size is not entirely accurate here as each module section is aligned to page boundaries when the module loader allocates memory for each section. I would prefer to just leave it out. Thanks, Jessica
Re: [GIT PULL] Modules updates for v5.8
+++ Linus Torvalds [05/06/20 13:38 -0700]: On Fri, Jun 5, 2020 at 2:34 AM Jessica Yu wrote: Please pull below to receive modules updates for the v5.8 merge window. Done. Considering the confusion this merge window with the dependencies of trees with each other, can you verify that what I got matches what you expect? I reviewed the result of the merge and it matches what I expected. Thank you!
Re: [PATCH v4 11/11] module: Make module_enable_ro() static again
+++ Guenter Roeck [05/06/20 06:24 -0700]: On Wed, Apr 29, 2020 at 10:24:53AM -0500, Josh Poimboeuf wrote: Now that module_enable_ro() has no more external users, make it static again. Suggested-by: Jessica Yu Signed-off-by: Josh Poimboeuf Acked-by: Miroslav Benes Apparently this patch made it into the upstream kernel on its own, not caring about its dependencies. Results are impressive. Build results: total: 155 pass: 101 fail: 54 Qemu test results: total: 431 pass: 197 fail: 234 That means bisects will be all but impossible until this is fixed. Was that really necessary ? Sigh, I am really sorry about this. We made a mistake in handling inter-tree dependencies between livepatching and modules-next, unfortunately :-( Merging the modules-next pull request next should resolve the module_enable_ro() not defined for !ARCH_HAS_STRICT_MODULE_RWX build issue. The failure was hidden in linux-next since both trees were always merged together. Again, it doesn't excuse us from build testing our separate trees more rigorously.
Re: unable to compile after "module: Make module_enable_ro() static again"
+++ Anatoly Pugachev [05/06/20 13:21 +0300]: Hello! i'm unable to compile kernel on debian sid/unstable (tested on sparc64 and ppc64) after commit e6eff4376e2897c2e14b70d87bf7284cdb093830 There are some module changes that still need to be merged that should fix this issue. Please wait until after the merge window or at least until after the following pull request gets merged into mainline: https://lore.kernel.org/r/20200605093354.ga23...@linux-8ccs.fritz.box Thanks, and apologies for the inconvenience, Jessica
Re: [PATCH] kernel/modules: fix build without ARCH_HAS_STRICT_MODULE_RWX
+++ Miroslav Benes [05/06/20 09:50 +0200]: Hi, On Thu, 4 Jun 2020, Max Filippov wrote: On configurations with CONFIG_ARCH_HAS_STRICT_MODULE_RWX disabled kernel build fails with the following message: kernel/module.c:3593:2: error: implicit declaration of function ‘module_enable_ro’; Add empty module_enable_ro definition to fix the build. Fixes: e6eff4376e28 ("module: Make module_enable_ro() static again") Signed-off-by: Max Filippov I think the problem should disappear once Jessica sends her pull request for modules tree. Yep, the pull request was sent today. So this should get resolved in the next day or two. Thanks, Jessica