Re: [PATCH v5] docs/zh_CN: add translations in zh_CN/dev-tools/gcov
Reviewed-by: Fangrui Song Inlined some suggestions. On 2021-04-14, Alex Shi wrote: Reviewed-by: Alex Shi On 2021/4/14 下午9:21, Wu XiangCheng wrote: From: Bernard Zhao Add new zh translations * zh_CN/dev-tools/gcov.rst * zh_CN/dev-tools/index.rst and link them to zh_CN/index.rst Signed-off-by: Bernard Zhao Reviewed-by: Wu XiangCheng Signed-off-by: Wu XiangCheng --- base: linux-next commit 269dd42f4776 ("docs/zh_CN: add riscv to zh_CN index") Changes since V4: * modified some words under Alex Shi's advices Changes since V3: * update to newest linux-next * fix `` * fix tags * fix list indent Changes since V2: * fix some inaccurate translation Changes since V1: * add index.rst in dev-tools and link to to zh_CN/index.rst * fix some inaccurate translation .../translations/zh_CN/dev-tools/gcov.rst | 265 ++ .../translations/zh_CN/dev-tools/index.rst| 35 +++ Documentation/translations/zh_CN/index.rst| 1 + 3 files changed, 301 insertions(+) create mode 100644 Documentation/translations/zh_CN/dev-tools/gcov.rst create mode 100644 Documentation/translations/zh_CN/dev-tools/index.rst diff --git a/Documentation/translations/zh_CN/dev-tools/gcov.rst b/Documentation/translations/zh_CN/dev-tools/gcov.rst new file mode 100644 index ..7515b488bc4e --- /dev/null +++ b/Documentation/translations/zh_CN/dev-tools/gcov.rst @@ -0,0 +1,265 @@ +.. 
include:: ../disclaimer-zh_CN.rst + +:Original: Documentation/dev-tools/gcov.rst +:Translator: 赵军奎 Bernard Zhao + +在Linux内核里使用gcov做代码覆盖率检查 += + +gcov是linux中已经集成的一个分析模块,该模块在内核中对GCC的代码覆盖率统 instrumentation 一般译作 插桩,而非 分析。 +计提供了支持。 +linux内核运行时的代码覆盖率数据会以gcov兼容的格式存储在debug-fs中,可 专有名词 Linux 应大写。 +以通过gcov的 ``-o`` 选项(如下示例)获得指定文件的代码运行覆盖率统计数据 +(需要跳转到内核编译路径下并且要有root权限):: + +# cd /tmp/linux-out +# gcov -o /sys/kernel/debug/gcov/tmp/linux-out/kernel spinlock.c + +这将在当前目录中创建带有执行计数注释的源代码文件。 +在获得这些统计文件后,可以使用图形化的 gcov_ 前端工具(比如 lcov_ ),来实现 +自动化处理linux内核的覆盖率运行数据,同时生成易于阅读的HTML格式文件。 + +可能的用途: + +* 调试(用来判断每一行的代码是否已经运行过) +* 测试改进(如何修改测试代码,尽可能地覆盖到没有运行过的代码) +* 内核配置优化(对于某一个选项配置,如果关联的代码从来没有运行过,是 + 否还需要这个配置) minimizing: 优化 -> 最小化/简化 +.. _gcov: https://gcc.gnu.org/onlinedocs/gcc/Gcov.html +.. _lcov: http://ltp.sourceforge.net/coverage/lcov.php + + +准备 + + +内核打开如下配置:: + +CONFIG_DEBUG_FS=y +CONFIG_GCOV_KERNEL=y + +获取整个内核的覆盖率数据,还需要打开:: + +CONFIG_GCOV_PROFILE_ALL=y + +需要注意的是,整个内核开启覆盖率统计会造成内核镜像文件尺寸的增大, +同时内核运行的也会变慢一些。 s/的// +另外,并不是所有的架构都支持整个内核开启覆盖率统计。 + +代码运行覆盖率数据只在debugfs挂载完成后才可以访问:: + +mount -t debugfs none /sys/kernel/debug + + +定制化 +-- + +如果要单独针对某一个路径或者文件进行代码覆盖率统计,可以在内核相应路 +径的Makefile中增加如下的配置: + +- 单独统计单个文件(例如main.o):: + +GCOV_PROFILE_main.o := y + +- 单独统计某一个路径:: + +GCOV_PROFILE := y + +如果要在整个内核的覆盖率统计(开启CONFIG_GCOV_PROFILE_ALL)中单独排除 +某一个文件或者路径,可以使用如下的方法:: + +GCOV_PROFILE_main.o := n + +和:: + +GCOV_PROFILE := n + +此机制仅支持链接到内核镜像或编译为内核模块的文件。 + + +相关文件 + + +gcov功能需要在debugfs中创建如下文件: + +``/sys/kernel/debug/gcov`` +gcov相关功能的根路径 + +``/sys/kernel/debug/gcov/reset`` +全局复位文件:向该文件写入数据后会将所有的gcov统计数据清0 + +``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcda`` +gcov工具可以识别的覆盖率统计数据文件,向该文件写入数据后 + 会将本文件的gcov统计数据清0 + +``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcno`` +gcov工具需要的软连接文件(指向编译时生成的信息统计文件),这个文件是 +在gcc编译时如果配置了选项 ``-ftest-coverage`` 时生成的。 + + +针对模块的统计 +-- + +内核中的模块会动态的加载和卸载,模块卸载时对应的数据会被清除掉。 +gcov提供了一种机制,通过保留相关数据的副本来收集这部分卸载模块的覆盖率数据。 +模块卸载后这些备份数据在debugfs中会继续存在。 +一旦这个模块重新加载,模块关联的运行统计会被初始化成debugfs中备份的数据。 + 
+可以通过对内核参数gcov_persist的修改来停用gcov对模块的备份机制:: + +gcov_persist = 0 + +在运行时,用户还可以通过写入模块的数据文件或者写入gcov复位文件来丢弃已卸 +载模块的数据。 + + +编译机和测试机分离 +-- + +gcov的内核分析架构支持内核的编译和运行是在同一台机器上,也可以编译和运 分析 -> 插桩 +行是在不同的机器上。 +如果内核编译和运行是不同的机器,那么需要额外的准备工作,这取决于gcov工具 +是在哪里使用的: + +.. _gcov-test_zh: + +a) 若gcov运行在测试机上 + +测试机上面gcov工具的版本必须要跟内核编译机器使用的gcc版本相兼容, +同时下面的文件要从编译机拷贝到测试机上: + +从源代码中: + - 所有的C文件和头文件 + +从编译目录中: + - 所有的C文件和头文件 + - 所有的.gcda文件和.gcno文件 + - 所有目录的链接 + +特别需要注意,测试机器上面的目录结构跟编译机器上面的目录机构必须 +完全一致。 +如果文件是软链接,需要替换成真正的目录文件(这是由make的当前工作 +目录变量CURDIR引起的)。 + +.. _gcov-build_zh: + +b) 若gcov运行在编译机上 + +测试用例运行结束后,如下的文件需要从测试机中拷贝到编译机上: + +从sysfs中的gcov目录中: + - 所有的.gcda文件 + - 所有的.gcno文件软链接 + +这些文件可以拷贝到编译机的任意目录下,gcov使用-o选项指定拷贝的 +目录。 + +比如一个是示例的目录结构如下:: + + /tmp/linux:内核源码目录 + /tmp/out: 内核编译文件路径(make O=指定) + /tmp/coverage: 从测试机器上面拷贝的数据文件路径 + + [user@build] cd /tmp/out + [user@build] gcov -o /tmp/coverage/tmp/out/init main.c + + +关于编译器的注意事项 + + +GCC和LLVM gcov工具不一定兼容。 +如果编译器是GCC,使用 gcov_ 来处理.gcno和.gcda文件,如果是Clang编译器, +则使用 llvm-cov_ 。 + +.. _gcov: https://gcc.gnu.org/onlinedocs/gcc/Gcov.html +.. _llvm-cov: https://llvm.org/docs/CommandGuide/llvm-cov.html + +GCC和Clang gcov之间的版本差异由Kconfig处理的。 +kconfig会根据编译工具链的检查自动选择合适的gcov格式。 + +问题定位 + + +可能出现的问题1 +编译到
Re: [PATCH 2/2] gcov: re-drop support for clang-10
On 2021-04-07, Nick Desaulniers wrote: LLVM changed the expected function signatures for llvm_gcda_emit_function() in the clang-11 release. Drop the older implementations and require folks to upgrade their compiler if they're interested in GCOV support. Signed-off-by: Nick Desaulniers --- kernel/gcov/clang.c | 40 1 file changed, 40 deletions(-) diff --git a/kernel/gcov/clang.c b/kernel/gcov/clang.c index 1747204541bf..78c4dc751080 100644 --- a/kernel/gcov/clang.c +++ b/kernel/gcov/clang.c @@ -69,9 +69,6 @@ struct gcov_fn_info { u32 ident; u32 checksum; -#if CONFIG_CLANG_VERSION < 11 - u8 use_extra_checksum; -#endif u32 cfg_checksum; u32 num_counters; @@ -113,23 +110,6 @@ void llvm_gcda_start_file(const char *orig_filename, u32 version, u32 checksum) } EXPORT_SYMBOL(llvm_gcda_start_file); -#if CONFIG_CLANG_VERSION < 11 -void llvm_gcda_emit_function(u32 ident, u32 func_checksum, - u8 use_extra_checksum, u32 cfg_checksum) -{ - struct gcov_fn_info *info = kzalloc(sizeof(*info), GFP_KERNEL); - - if (!info) - return; - - INIT_LIST_HEAD(>head); - info->ident = ident; - info->checksum = func_checksum; - info->use_extra_checksum = use_extra_checksum; - info->cfg_checksum = cfg_checksum; - list_add_tail(>head, _info->functions); -} -#else void llvm_gcda_emit_function(u32 ident, u32 func_checksum, u32 cfg_checksum) { struct gcov_fn_info *info = kzalloc(sizeof(*info), GFP_KERNEL); @@ -143,7 +123,6 @@ void llvm_gcda_emit_function(u32 ident, u32 func_checksum, u32 cfg_checksum) info->cfg_checksum = cfg_checksum; list_add_tail(>head, _info->functions); } -#endif EXPORT_SYMBOL(llvm_gcda_emit_function); void llvm_gcda_emit_arcs(u32 num_counters, u64 *counters) @@ -274,16 +253,8 @@ int gcov_info_is_compatible(struct gcov_info *info1, struct gcov_info *info2) !list_is_last(_ptr2->head, >functions)) { if (fn_ptr1->checksum != fn_ptr2->checksum) return false; -#if CONFIG_CLANG_VERSION < 11 - if (fn_ptr1->use_extra_checksum != fn_ptr2->use_extra_checksum) - return false; - if 
(fn_ptr1->use_extra_checksum && - fn_ptr1->cfg_checksum != fn_ptr2->cfg_checksum) - return false; -#else if (fn_ptr1->cfg_checksum != fn_ptr2->cfg_checksum) return false; -#endif fn_ptr1 = list_next_entry(fn_ptr1, head); fn_ptr2 = list_next_entry(fn_ptr2, head); } @@ -403,21 +374,10 @@ size_t convert_to_gcda(char *buffer, struct gcov_info *info) u32 i; pos += store_gcov_u32(buffer, pos, GCOV_TAG_FUNCTION); -#if CONFIG_CLANG_VERSION < 11 - pos += store_gcov_u32(buffer, pos, - fi_ptr->use_extra_checksum ? 3 : 2); -#else pos += store_gcov_u32(buffer, pos, 3); -#endif pos += store_gcov_u32(buffer, pos, fi_ptr->ident); pos += store_gcov_u32(buffer, pos, fi_ptr->checksum); -#if CONFIG_CLANG_VERSION < 11 - if (fi_ptr->use_extra_checksum) - pos += store_gcov_u32(buffer, pos, fi_ptr->cfg_checksum); -#else pos += store_gcov_u32(buffer, pos, fi_ptr->cfg_checksum); -#endif - pos += store_gcov_u32(buffer, pos, GCOV_TAG_COUNTER_BASE); pos += store_gcov_u32(buffer, pos, fi_ptr->num_counters * 2); for (i = 0; i < fi_ptr->num_counters; i++) -- 2.31.1.295.g9ea45b61b8-goog Looks good for both. Thanks! Reviewed-by: Fangrui Song
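For readers following along, here is a rough sketch of the function record that convert_to_gcda() emits once the clang < 11 branches are gone: the tag, a fixed length of 3 words, then ident, checksum and cfg_checksum. It is Python purely for illustration; the tag value 0x01000000 and the helper name pack_function_record are assumptions of this sketch, not something the patch states.

```python
import struct

# Assumed value of GCOV_TAG_FUNCTION; the patch only uses the symbolic name.
GCOV_TAG_FUNCTION = 0x01000000


def pack_function_record(ident, checksum, cfg_checksum):
    """Pack one function record as five native-endian u32 words.

    Mirrors the order of the store_gcov_u32() calls that remain after
    the patch: tag, length (always 3), ident, checksum, cfg_checksum.
    """
    words = (GCOV_TAG_FUNCTION, 3, ident, checksum, cfg_checksum)
    return struct.pack("=5I", *words)


record = pack_function_record(ident=1, checksum=0xDEADBEEF, cfg_checksum=0x1234)
print(len(record))  # number of bytes in five 32-bit words
```

With the use_extra_checksum field removed, the length word no longer varies between 2 and 3, which is why the record size is fixed here.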
Re: [PATCH] riscv: Use $(LD) instead of $(CC) to link vDSO
On 2021-03-25, Nathan Chancellor wrote: Currently, the VDSO is being linked through $(CC). This does not match how the rest of the kernel links objects, which is through the $(LD) variable. When linking with clang, there are a couple of warnings about flags that will not be used during the link: clang-12: warning: argument unused during compilation: '-no-pie' [-Wunused-command-line-argument] clang-12: warning: argument unused during compilation: '-pg' [-Wunused-command-line-argument] '-no-pie' was added in commit 85602bea297f ("RISC-V: build vdso-dummy.o with -no-pie") to override '-pie' getting added to the ld command from distribution versions of GCC that enable PIE by default. It is technically no longer needed after commit c2c81bb2f691 ("RISC-V: Fix the VDSO symbol generaton for binutils-2.35+"), which removed vdso-dummy.o in favor of generating vdso-syms.S from vdso.so with $(NM) but this also resolves the issue in case it ever comes back due to having full control over the $(LD) command. '-pg' is for function tracing, it is not used during linking as clang states. Looks good. -pg affects the link action: it changes crt1.o to gcrt1.o. Since the Makefile uses -nostdlib, crt1.o is suppressed, so -pg is entirely unneeded. (-nostdlib implies -nostartfiles so the previous usage has a redundant option.) These flags could be removed/filtered to fix the warnings but it is easier to just match the rest of the kernel and use $(LD) directly for linking. See commits fe00e50b2db8 ("ARM: 8858/1: vdso: use $(LD) instead of $(CC) to link VDSO") 691efbedc60d ("arm64: vdso: use $(LD) instead of $(CC) to link VDSO") 2ff906994b6c ("MIPS: VDSO: Use $(LD) instead of $(CC) to link VDSO") 2b2a25845d53 ("s390/vdso: Use $(LD) instead of $(CC) to link vDSO") for more information. The flags are converted to linker flags and '--eh-frame-hdr' is added to match what is added by GCC implicitly, which can be seen by adding '-v' to GCC's invocation. 
Another minor change which may be shipped together: --hash-style=both can be --hash-style=gnu. We don't need sysv .hash . The glibc/musl support for .gnu.hash has been there for years. .gnu.hash is often smaller than .hash . Reviewed-by: Fangrui Song Additionally, since this area is being modified, use the $(OBJCOPY) variable instead of an open coded $(CROSS_COMPILE)objcopy so that the user's choice of objcopy binary is respected. Link: https://github.com/ClangBuiltLinux/linux/issues/803 Link: https://github.com/ClangBuiltLinux/linux/issues/970 Signed-off-by: Nathan Chancellor --- arch/riscv/kernel/vdso/Makefile | 12 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/arch/riscv/kernel/vdso/Makefile b/arch/riscv/kernel/vdso/Makefile index 71a315e73cbe..ca2b40dfd24b 100644 --- a/arch/riscv/kernel/vdso/Makefile +++ b/arch/riscv/kernel/vdso/Makefile @@ -41,11 +41,10 @@ KASAN_SANITIZE := n $(obj)/vdso.o: $(obj)/vdso.so # link rule for the .so file, .lds has to be first -SYSCFLAGS_vdso.so.dbg = $(c_flags) $(obj)/vdso.so.dbg: $(src)/vdso.lds $(obj-vdso) FORCE $(call if_changed,vdsold) -SYSCFLAGS_vdso.so.dbg = -shared -s -Wl,-soname=linux-vdso.so.1 \ - -Wl,--build-id=sha1 -Wl,--hash-style=both +LDFLAGS_vdso.so.dbg = -shared -s -soname=linux-vdso.so.1 \ + --build-id=sha1 --hash-style=both --eh-frame-hdr # We also create a special relocatable object that should mirror the symbol # table and layout of the linked DSO. With ld --just-symbols we can then @@ -60,13 +59,10 @@ $(obj)/%.so: $(obj)/%.so.dbg FORCE # actual build commands # The DSO images are built using a special linker script -# Add -lgcc so rv32 gets static muldi3 and lshrdi3 definitions. # Make sure only to export the intended __vdso_xxx symbol offsets. 
quiet_cmd_vdsold = VDSOLD $@ - cmd_vdsold = $(CC) $(KBUILD_CFLAGS) $(call cc-option, -no-pie) -nostdlib -nostartfiles $(SYSCFLAGS_$(@F)) \ - -Wl,-T,$(filter-out FORCE,$^) -o $@.tmp && \ - $(CROSS_COMPILE)objcopy \ - $(patsubst %, -G __vdso_%, $(vdso-syms)) $@.tmp $@ && \ + cmd_vdsold = $(LD) $(ld_flags) -T $(filter-out FORCE,$^) -o $@.tmp && \ + $(OBJCOPY) $(patsubst %, -G __vdso_%, $(vdso-syms)) $@.tmp $@ && \ rm $@.tmp # Extracts symbol offsets from the VDSO, converting them into an assembly file -- 2.31.0 -- You received this message because you are subscribed to the Google Groups "Clang Built Linux" group. To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20210325215156.1986901-1-nathan%40kernel.org.
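The flag conversion described in the commit message above (compiler-driver flags turned into direct linker flags) can be sketched roughly as follows. This is only an illustration in Python: the helper name cc_flags_to_ld_flags and the driver_only set are made up for the sketch, and the kernel Makefile does not perform this translation at build time; the patch simply rewrites the flags by hand.

```python
def cc_flags_to_ld_flags(cc_flags):
    """Translate compiler-driver link flags into direct linker flags.

    "-Wl,a,b" passes "a" and "b" to the linker, so the prefix is dropped
    and the comma-separated parts become separate arguments. Flags in
    driver_only are only meaningful to the compiler driver and are dropped.
    """
    driver_only = {"-no-pie", "-pg", "-nostdlib", "-nostartfiles"}
    ld_flags = []
    for flag in cc_flags:
        if flag.startswith("-Wl,"):
            ld_flags.extend(flag[len("-Wl,"):].split(","))
        elif flag not in driver_only:
            ld_flags.append(flag)
    return ld_flags


old = ["-no-pie", "-nostdlib", "-nostartfiles", "-shared", "-s",
       "-Wl,-soname=linux-vdso.so.1", "-Wl,--build-id=sha1",
       "-Wl,--hash-style=both", "-Wl,-T,vdso.lds"]
print(" ".join(cc_flags_to_ld_flags(old)))
```

Note that the sketch does not add --eh-frame-hdr; the patch adds that flag explicitly because GCC passes it to the linker implicitly.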
Re: [PATCH] riscv: Use $(LD) instead of $(CC) to link vDSO
On 2021-03-26, Nathan Chancellor wrote: On Sat, Mar 27, 2021 at 12:05:34AM +0800, kernel test robot wrote: Hi Nathan, I love your patch! Yet something to improve: [auto build test ERROR on linus/master] [also build test ERROR on v5.12-rc4 next-20210326] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Nathan-Chancellor/riscv-Use-LD-instead-of-CC-to-link-vDSO/20210326-055421 base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 002322402dafd846c424ffa9240a937f49b48c42 config: riscv-randconfig-r032-20210326 (attached as .config) compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project f490a5969bd52c8a48586f134ff8f02ccbb295b3) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install riscv cross compiling tool for clang build # apt-get install binutils-riscv64-linux-gnu # https://github.com/0day-ci/linux/commit/dfdcaf93f40f0d15ffc3f25128442c1688e612d6 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Nathan-Chancellor/riscv-Use-LD-instead-of-CC-to-link-vDSO/20210326-055421 git checkout dfdcaf93f40f0d15ffc3f25128442c1688e612d6 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=riscv For the record, I tried to use this script to reproduce but it has a couple of bugs: 1. It does not download the right version of clang. This report says that it is clang-13 but the one that the script downloaded is clang-12. 2. It does not download it to the right location. The script expects ~/0day/clang-latest but it is downloaded to ~/0day/clang it seems. I symlinked it to get around it. 
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot

All errors (new ones prefixed by >>):

>> riscv64-linux-gnu-objcopy: 'arch/riscv/kernel/vdso/vdso.so.dbg': No such file

This error only occurs because of errors before it that are not shown due to a denylist:

ld.lld: error: arch/riscv/kernel/vdso/rt_sigreturn.o:(.text+0x0): relocation R_RISCV_ALIGN requires unimplemented linker relaxation; recompile with -mno-relax
ld.lld: error: arch/riscv/kernel/vdso/getcpu.o:(.text+0x0): relocation R_RISCV_ALIGN requires unimplemented linker relaxation; recompile with -mno-relax
ld.lld: error: arch/riscv/kernel/vdso/flush_icache.o:(.text+0x0): relocation R_RISCV_ALIGN requires unimplemented linker relaxation; recompile with -mno-relax

My patch only adds another occurrence of this error because we move from $(CC)'s default linker (in clang's case, ld.bfd) to $(LD), which in the case of 0day appears to be ld.lld. ld.lld should not be used with RISC-V in its current form due to errors of this nature, which happen without my patch as well:

https://github.com/ClangBuiltLinux/linux/issues/1020

Linker relaxation in ld.lld for RISC-V is an ongoing debate/process. Please give RISC-V the current treatment as s390 with ld.lld for the time being to get meaningful reports. We will reach out once that issue has been resolved.

TL;DR: Patch exposes existing issue with LD=ld.lld that would have happened without it in different areas, the report can be ignored.

Yes, lkp frequently reports this error. It can be suppressed by using -mno-relax... if ld.lld is picked.

Hmm. This motivated me to file https://github.com/riscv/riscv-elf-psabi-doc/issues/183 R_RISCV_ALIGN friendly to linkers not supporting relaxation (riscv_relax_delete_bytes).

Cheers!
Nathan
Re: [PATCH 3/3] riscv: Select HAVE_DYNAMIC_FTRACE when -fpatchable-function-entry is available
On 2021-03-25, Nathan Chancellor wrote:

clang prior to 13.0.0 does not support -fpatchable-function-entry for RISC-V.

clang: error: unsupported option '-fpatchable-function-entry=8' for target 'riscv64-unknown-linux-gnu'

To avoid this error, only select HAVE_DYNAMIC_FTRACE when this option is available.

If clang accepts -fpatchable-function-entry=8 for a target without an "unsupported option" error, the backend supports the feature on that target.

Reviewed-by: Fangrui Song

Fixes: afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")
Link: https://github.com/ClangBuiltLinux/linux/issues/1268
Reported-by: kernel test robot
Signed-off-by: Nathan Chancellor
---
 arch/riscv/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 87d7b52f278f..ba1d07640b66 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -227,7 +227,7 @@ config ARCH_RV64I
 	bool "RV64I"
 	select 64BIT
 	select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 && GCC_VERSION >= 5
-	select HAVE_DYNAMIC_FTRACE if MMU
+	select HAVE_DYNAMIC_FTRACE if MMU && $(cc-option,-fpatchable-function-entry=8)
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
 	select HAVE_FTRACE_MCOUNT_RECORD
 	select HAVE_FUNCTION_GRAPH_TRACER
--
2.31.0
Re: [PATCH 1/3] scripts/recordmcount.pl: Fix RISC-V regex for clang
On 2021-03-25, Nathan Chancellor wrote:

Clang can generate R_RISCV_CALL_PLT relocations to _mcount:

$ llvm-objdump -dr build/riscv/init/main.o | rg mcount
000e: R_RISCV_CALL_PLT _mcount
004e: R_RISCV_CALL_PLT _mcount

After this, the __start_mcount_loc section is properly generated and function tracing still works.

R_RISCV_CALL_PLT can replace R_RISCV_CALL in all use cases. R_RISCV_CALL can/may be deprecated:
https://github.com/ClangBuiltLinux/linux/issues/1331#issuecomment-802468296

Reviewed-by: Fangrui Song

Cc: sta...@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1331
Signed-off-by: Nathan Chancellor
---
 scripts/recordmcount.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
index 867860ea57da..a36df04cfa09 100755
--- a/scripts/recordmcount.pl
+++ b/scripts/recordmcount.pl
@@ -392,7 +392,7 @@ if ($arch eq "x86_64") {
     $mcount_regex = "^\\s*([0-9a-fA-F]+):.*\\s_mcount\$";
 } elsif ($arch eq "riscv") {
     $function_regex = "^([0-9a-fA-F]+)\\s+<([^.0-9][0-9a-zA-Z_\\.]+)>:";
-    $mcount_regex = "^\\s*([0-9a-fA-F]+):\\sR_RISCV_CALL\\s_mcount\$";
+    $mcount_regex = "^\\s*([0-9a-fA-F]+):\\sR_RISCV_CALL(_PLT)?\\s_mcount\$";
     $type = ".quad";
     $alignment = 2;
 } elsif ($arch eq "nds32") {
--
2.31.0
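To see what the one-character-class change in the regex buys, the patched pattern can be tried outside of Perl. The snippet below is a small sketch in Python using the same pattern; the helper name mcount_offset and the third sample line are made up for the illustration, and the first two sample lines follow the shortened llvm-objdump output quoted in the commit message.

```python
import re

# Python spelling of the patched Perl pattern from scripts/recordmcount.pl.
MCOUNT_RE = re.compile(r"^\s*([0-9a-fA-F]+):\sR_RISCV_CALL(_PLT)?\s_mcount$")


def mcount_offset(line):
    """Return the hex offset of an _mcount relocation line, else None."""
    match = MCOUNT_RE.match(line)
    return match.group(1) if match else None


print(mcount_offset("000e: R_RISCV_CALL_PLT _mcount"))
print(mcount_offset("004e: R_RISCV_CALL _mcount"))
print(mcount_offset("0010: R_RISCV_CALL_PLT other"))
```

The optional (_PLT)? group keeps the GCC-style R_RISCV_CALL lines matching while also accepting what clang emits, so one pattern serves both toolchains.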
Re: [PATCH] gcov: fix clang-11+ support
On 2021-03-12, Nick Desaulniers wrote:

On Fri, Mar 12, 2021 at 12:25 PM 'Fangrui Song' via Clang Built Linux wrote:

function_name can be unconditionally deleted. It is not used by llvm-cov gcov. You'll need to delete a few assignments to gcov_info_free but you can then unify the gcov_fn_info_dup and gcov_info_free implementations.

LG. On big-endian systems, clang < 11 emitted .gcno/.gcda files do not work with llvm-cov gcov < 11. To fix it and make .gcno/.gcda work with gcc gcov I chose to break compatibility (and make all the breaking changes like deleting some CC1 options) in a short window. At that time I was not aware that there is the kernel implementation. Later on I was CCed on a few https://github.com/ClangBuiltLinux/linux/ gcov issues but I forgot to mention the interface change.

These are all good suggestions. Since in v2 I'll drop support for clang < 11, I will skip additional patches to disable GCOV when using older clang for BE, and the function_name cleanup.

Only llvm_gcda_start_file & llvm_gcda_emit_function need version dispatch. In that case (since there will just be two #if in the file) we don't even need depends on CC_IS_GCC || CLANG_VERSION >= 11.

Now in clang 11 onward, clang --coverage defaults to the gcov 4.8 compatible format. You can specify the CC1 option (internal option, subject to change) -coverage-version to make it compatible with other versions' gcov.

-Xclang -coverage-version='407*' => 4.7
-Xclang -coverage-version='704*' => 7.4
-Xclang -coverage-version='B02*' => 10.2 (('B'-'A')*10 = 10)

How come LLVM doesn't default to 10.2 format, if it can optionally produce it? We might be able to reuse more code in the kernel between the two implementations, though I expect the symbols the runtime is expected to provide will still differ. Seeing the `B` in `B02*` is also curious.

Thanks for the review, will include your tag in v2.

4.8 has the widest range of compiler support. gcov 4.8~7.* use the same format.

clang instrumentation does not support the column field (useless in my opinion) introduced in gcov 9, so it just writes zeros.
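Since the `B` in `B02*` came up, here is a small sketch of how the three -coverage-version examples from this thread decode. The rule is inferred only from those three examples, the function name decode_gcov_version is made up, and treating the fourth character as always `*` is an assumption of the sketch, so take it as an illustration rather than a description of what LLVM or gcov actually implement.

```python
def decode_gcov_version(tag):
    """Decode a 4-character gcov version tag into (major, minor).

    Inferred from the three examples in the thread: a leading digit is
    the major version followed by a two-digit minor; a leading letter
    contributes (letter - 'A') * 10 to the major, the next digit
    completes the major, and the third character is the minor.
    """
    if len(tag) != 4 or tag[3] != "*":
        raise ValueError(f"unexpected version tag: {tag!r}")
    if tag[0].isdigit():
        return int(tag[0]), int(tag[1:3])
    major = (ord(tag[0]) - ord("A")) * 10 + int(tag[1])
    return major, int(tag[2])


for tag in ("407*", "704*", "B02*"):
    print(tag, decode_gcov_version(tag))
```

Read this way, the letter is just a way to fit a two-digit major version into the same four bytes.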
Re: [PATCH] gcov: fix clang-11+ support
On 2021-03-12, Nick Desaulniers wrote: LLVM changed the expected function signatures for llvm_gcda_start_file() and llvm_gcda_emit_function() in the clang-11 release. Users of clang-11 or newer may have noticed their kernels failing to boot due to a panic when enabling CONFIG_GCOV_KERNEL=y +CONFIG_GCOV_PROFILE_ALL=y. Fix up the function signatures so calling these functions doesn't panic the kernel. When we drop clang-10 support from the kernel, we should carefully update the original implementations to try to preserve git blame, deleting these implementations. Link: https://reviews.llvm.org/rGcdd683b516d147925212724b09ec6fb792a40041 Link: https://reviews.llvm.org/rG13a633b438b6500ecad9e4f936ebadf3411d0f44 Cc: Fangrui Song Reported-by: Prasad Sodagudi Signed-off-by: Nick Desaulniers --- kernel/gcov/clang.c | 69 + 1 file changed, 69 insertions(+) diff --git a/kernel/gcov/clang.c b/kernel/gcov/clang.c index c94b820a1b62..20e6760ec05d 100644 --- a/kernel/gcov/clang.c +++ b/kernel/gcov/clang.c @@ -75,7 +75,9 @@ struct gcov_fn_info { u32 num_counters; u64 *counters; +#if __clang_major__ < 11 const char *function_name; +#endif function_name can be unconditionally deleted. It is not used by llvm-cov gcov. You'll need to delete a few assignments to gcov_info_free but you can then unify the gcov_fn_info_dup and gcov_info_free implementations. 
}; static struct gcov_info *current_info; @@ -105,6 +107,7 @@ void llvm_gcov_init(llvm_gcov_callback writeout, llvm_gcov_callback flush) } EXPORT_SYMBOL(llvm_gcov_init); +#if __clang_major__ < 11 void llvm_gcda_start_file(const char *orig_filename, const char version[4], u32 checksum) { @@ -113,7 +116,17 @@ void llvm_gcda_start_file(const char *orig_filename, const char version[4], current_info->checksum = checksum; } EXPORT_SYMBOL(llvm_gcda_start_file); +#else +void llvm_gcda_start_file(const char *orig_filename, u32 version, u32 checksum) +{ + current_info->filename = orig_filename; + current_info->version = version; + current_info->checksum = checksum; +} +EXPORT_SYMBOL(llvm_gcda_start_file); +#endif LG. On big-endian systems, clang < 11 emitted .gcno/.gcda files do not work with llvm-cov gcov < 11. To fix it and make .gcno/.gcda work with gcc gcov I chose to break compatibility (and make all the breaking changes like deleting some CC1 options) in a short window. At that time I was not aware that there is the kernel implementation. Later on I was CCed on a few https://github.com/ClangBuiltLinux/linux/ gcov issues but I forgot to mention the interface change. Now in clang 11 onward, clang --coverage defaults to the gcov 4.8 compatible format. You can specify the CC1 option (internal option, subject to change) -coverage-version to make it compatible with other versions' gcov. 
-Xclang -coverage-version='407*' => 4.7 -Xclang -coverage-version='704*' => 7.4 -Xclang -coverage-version='B02*' => 10.2 (('B'-'A')*10 = 10) Reviewed-by: Fangrui Song +#if __clang_major__ < 11 void llvm_gcda_emit_function(u32 ident, const char *function_name, u32 func_checksum, u8 use_extra_checksum, u32 cfg_checksum) { @@ -133,6 +146,24 @@ void llvm_gcda_emit_function(u32 ident, const char *function_name, list_add_tail(>head, _info->functions); } EXPORT_SYMBOL(llvm_gcda_emit_function); +#else +void llvm_gcda_emit_function(u32 ident, u32 func_checksum, + u8 use_extra_checksum, u32 cfg_checksum) +{ + struct gcov_fn_info *info = kzalloc(sizeof(*info), GFP_KERNEL); + + if (!info) + return; + + INIT_LIST_HEAD(>head); + info->ident = ident; + info->checksum = func_checksum; + info->use_extra_checksum = use_extra_checksum; + info->cfg_checksum = cfg_checksum; + list_add_tail(>head, _info->functions); +} +EXPORT_SYMBOL(llvm_gcda_emit_function); +#endif void llvm_gcda_emit_arcs(u32 num_counters, u64 *counters) { @@ -295,6 +326,7 @@ void gcov_info_add(struct gcov_info *dst, struct gcov_info *src) } } +#if __clang_major__ < 11 static struct gcov_fn_info *gcov_fn_info_dup(struct gcov_fn_info *fn) { size_t cv_size; /* counter values size */ @@ -322,6 +354,28 @@ static struct gcov_fn_info *gcov_fn_info_dup(struct gcov_fn_info *fn) kfree(fn_dup); return NULL; } +#else +static struct gcov_fn_info *gcov_fn_info_dup(struct gcov_fn_info *fn) +{ + size_t cv_size; /* counter values size */ + struct gcov_fn_info *fn_dup = kmemdup(fn, sizeof(*fn), + GFP_KERNEL); + if (!fn_dup) + return NULL; + INIT_LIST_HEAD(_dup->head); + + cv_size = fn->num_counters * sizeof(fn->counters[0]); + fn_dup->counters = vmalloc(cv_size); + if (!fn_dup->counters) { + kfree(fn_dup); +
Re: [PATCH] [RFC] arm64: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
On 2021-03-10, Nicolas Pitre wrote:

On Mon, 1 Mar 2021, Nicholas Piggin wrote:

Excerpts from Arnd Bergmann's message of February 27, 2021 7:49 pm:

> Unlike what Nick expected in his submission, I now think the annotations
> will be needed for LTO just like they are for --gc-sections.

Yeah I wasn't sure exactly what LTO looks like or how it would work. I thought perhaps LTO might be able to find dead code with circular / back references, we could put references from the code back to these tables or something so they would be kept without KEEP. I don't know, I was handwaving!

I managed to get powerpc (and IIRC x86?) working with gc sections with those KEEP annotations, but effectiveness of course is far worse than what Nicolas was able to achieve with all his techniques and tricks.

But yes unless there is some other mechanism to handle these tables, then KEEP probably has to stay. I suggest this wants a very explicit and systematic way to handle it (maybe with some toolchain support) rather than trying to just remove things case by case and see what breaks.

I don't know if Nicolas is still working on his shrinking patches recently but he probably knows more than anyone about this stuff.

Looks like not much has changed since last time I played with this stuff.

There is a way to omit the KEEP() on tables, but something must create a dependency from the code being pointed to by those tables to the table entries themselves. I did write my findings in the following article (just skip over the introductory blurb):

https://lwn.net/Articles/741494/

Hey, this article taught me R_*_NONE which motivated me to add various R_*_NONE support to LLVM 9! Over the weekend I noticed that with binutils>=2.26, one can use .reloc ., BFD_RELOC_NONE, target (https://sourceware.org/bugzilla/show_bug.cgi?id=27530 ). I implemented it for many targets in LLVM, but that will require 13.0.0.

Once that dependency is there, then the KEEP() may go and garbage-collecting a function will also garbage-collect the table entry automatically.

OTOH this trickery is not needed with LTO as garbage collection happens at the source code optimization level. The KEEP() may remain in that case as unneeded table entries will simply not be created in the first place.

For ThinLTO, --gc-sections is still very useful. I have more notes in https://maskray.me/blog/2021-02-28-linker-garbage-collection#link-time-optimization .
Re: [PATCH] [RFC] arm64: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
On 2021-03-10, Arnd Bergmann wrote: On Wed, Mar 10, 2021 at 9:50 PM Masahiro Yamada wrote: On Mon, Mar 1, 2021 at 10:11 AM Nicholas Piggin wrote: > Excerpts from Arnd Bergmann's message of February 27, 2021 7:49 pm: masahiro@oscar:~/ref/linux$ echo 'void this_func_is_unused(void) {}' >> kernel/cpu.c masahiro@oscar:~/ref/linux$ export CROSS_COMPILE=/home/masahiro/tools/powerpc-10.1.0/bin/powerpc-linux- masahiro@oscar:~/ref/linux$ make ARCH=powerpc defconfig masahiro@oscar:~/ref/linux$ ./scripts/config -e EXPERT masahiro@oscar:~/ref/linux$ ./scripts/config -e LD_DEAD_CODE_DATA_ELIMINATION masahiro@oscar:~/ref/linux$ ~/tools/powerpc-10.1.0/bin/powerpc-linux-nm -n vmlinux | grep this_func c0170560 T .this_func_is_unused c1d8d560 D this_func_is_unused masahiro@oscar:~/ref/linux$ grep DEAD_CODE_ .config CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION=y CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y If I remember correctly, LD_DEAD_CODE_DATA_ELIMINATION dropped unused functions when I tried it last time. --gc-sections drops unused sections. If the unused function is part of a larger section which is retained due to other symbols (-fno-function-sections), the unused section will be retained as well. I also tried arm64 with a HAVE_LD_DEAD_CODE_DATA_ELIMINATION hack. The result was the same. Am I missing something? It's possible that it only works in combination with CLANG_LTO now because something broke. I definitely saw a reduction in kernel size when both options are enabled, but did not try a simple test case like you did. Maybe some other reference gets created that prevents the function from being garbage-collected unless that other option is removed as well? Arnd I believe with LLVM regular LTO, --gc-sections has very little benefit on compiler generated sections. It is still useful for assembly generated sections (but most such sections are probably needed): * Target specific optimizations can drop references on constants (e.g. 
`memcpy(..., &constant, sizeof(constant));`)
* Due to phase ordering issues some definitions are not discarded by the optimizer.

For ThinLTO there are more compiler generated sections discarded by `--gc-sections`:

* ThinLTO can cause a definition to be imported to other modules. The original definition may be unneeded after imports.
* The definition may survive after intra-module optimization. After imports, a round of (inter-module) IR optimizations after `computeDeadSymbolsWithConstProp` may make the definition unneeded.
* Symbol resolution is conservative.

Regarding symbol resolution, symbol resolution happens before LTO and LTO happens before --gc-sections. The symbol resolution process may be conservative: it may communicate to LTO that some symbols are referenced by regular object files while in the GC stage the references turn out to not exist because of discarded sections with more precise GC roots.

(I've added the above points to my https://maskray.me/blog/2021-02-28-linker-garbage-collection#link-time-optimization )
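As a toy model of the section-granularity point made earlier in this thread (an unused function survives if it shares a section with something that is kept), the mark phase of --gc-sections can be sketched as a plain reachability walk. The section names and the gc_sections helper below are hypothetical and chosen only for the illustration; real linkers also handle KEEP(), relocations to symbols, section groups and more.

```python
def gc_sections(sections, roots):
    """Return the set of sections kept by a mark phase from the GC roots.

    `sections` maps a section name to the names of sections it references.
    """
    kept = set()
    worklist = list(roots)
    while worklist:
        name = worklist.pop()
        if name in kept:
            continue
        kept.add(name)
        worklist.extend(sections[name])
    return kept


# With -ffunction-sections every function has its own section.
split = {".text.start": [".text.used"], ".text.used": [], ".text.unused": []}
# Without it, both functions share .text, so the unused one rides along.
merged = {".text.start": [".text"], ".text": []}

print(sorted(gc_sections(split, [".text.start"])))
print(sorted(gc_sections(merged, [".text.start"])))
```

In the split layout .text.unused is never reached and can be dropped; in the merged layout there is no separate section to drop, which matches the this_func_is_unused observation in the quoted test.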
Re: [PATCH v2 1/2] Makefile: Remove '--gcc-toolchain' flag
On 2021-03-09, Nathan Chancellor wrote: This flag was originally added to allow clang to find the GNU cross tools in commit 785f11aa595b ("kbuild: Add better clang cross build support"). This flag was not enough to find the tools at times so '--prefix' was added to the list in commit ef8c4ed9db80 ("kbuild: allow to use GCC toolchain not in Clang search path") and improved upon in commit ca9b31f6bb9c ("Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang cross compilation"). Now that '--prefix' specifies a full path and prefix, '--gcc-toolchain' serves no purpose because the kernel builds with '-nostdinc' and '-nostdlib'. This has been verified with self compiled LLVM 10.0.1 and LLVM 13.0.0 as well as a distribution version of LLVM 11.1.0 without binutils in the LLVM toolchain locations. Link: https://reviews.llvm.org/D97902 Signed-off-by: Nathan Chancellor The wording looks good. Reviewed-by: Fangrui Song
[PATCH] Replace __toc_start + 0x8000 with .TOC.
TOC relocations are like GOT relocations on other architectures. However, unlike other architectures, GNU ld's ppc64 port defines .TOC. relative to the .got output section instead of the linker synthesized .got input section. LLD defines .TOC. as the .got input section plus 0x8000. When CONFIG_PPC_OF_BOOT_TRAMPOLINE=y, arch/powerpc/kernel/prom_init.o is built, and LLD computed .TOC. can be different from __toc_start defined by the linker script.

Simplify kernel_toc_addr with asm label .TOC. so that we can get rid of __toc_start.

With this change, powernv_defconfig with CONFIG_PPC_OF_BOOT_TRAMPOLINE=y is bootable with LLD. There is still an untriaged issue with Alexey's configuration.

Link: https://github.com/ClangBuiltLinux/linux/issues/1318
Reported-by: Alexey Kardashevskiy
Signed-off-by: Fangrui Song
---
 arch/powerpc/boot/crt0.S            |  2 +-
 arch/powerpc/boot/zImage.lds.S      |  1 -
 arch/powerpc/include/asm/sections.h | 10 ++--------
 arch/powerpc/kernel/head_64.S       |  2 +-
 arch/powerpc/kernel/vmlinux.lds.S   |  1 -
 5 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/boot/crt0.S b/arch/powerpc/boot/crt0.S
index 1d83966f5ef6..e45907fe468f 100644
--- a/arch/powerpc/boot/crt0.S
+++ b/arch/powerpc/boot/crt0.S
@@ -28,7 +28,7 @@ p_etext: .8byte _etext
 p_bss_start: .8byte __bss_start
 p_end: .8byte _end
 
-p_toc: .8byte __toc_start + 0x8000 - p_base
+p_toc: .8byte .TOC. - p_base
 p_dyn: .8byte __dynamic_start - p_base
 p_rela: .8byte __rela_dyn_start - p_base
 p_prom: .8byte 0
diff --git a/arch/powerpc/boot/zImage.lds.S b/arch/powerpc/boot/zImage.lds.S
index d6f072865627..32cf7816292f 100644
--- a/arch/powerpc/boot/zImage.lds.S
+++ b/arch/powerpc/boot/zImage.lds.S
@@ -39,7 +39,6 @@ SECTIONS
   . = ALIGN(256);
   .got :
   {
-    __toc_start = .;
     *(.got)
     *(.toc)
   }
diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index 324d7b298ec3..bd22ca0b5eca 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -48,14 +48,8 @@ static inline int in_kernel_text(unsigned long addr)
 
 static inline unsigned long kernel_toc_addr(void)
 {
-	/* Defined by the linker, see vmlinux.lds.S */
-	extern unsigned long __toc_start;
-
-	/*
-	 * The TOC register (r2) points 32kB into the TOC, so that 64kB of
-	 * the TOC can be addressed using a single machine instruction.
-	 */
-	return (unsigned long)(&__toc_start) + 0x8000UL;
+	extern unsigned long toc asm(".TOC.");
+	return (unsigned long)(&toc);
 }
 
 static inline int overlaps_interrupt_vector_text(unsigned long start,
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index ece7f97bafff..9542d03b2efe 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -899,7 +899,7 @@ _GLOBAL(relative_toc)
 	blr
 
 .balign 8
-p_toc: .8byte __toc_start + 0x8000 - 0b
+p_toc: .8byte .TOC. - 0b
 
 /*
  * This is where the main kernel code starts.
diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 72fa3c00229a..c28f4e5bae3f 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -328,7 +328,6 @@ SECTIONS
 
 	. = ALIGN(256);
 	.got : AT(ADDR(.got) - LOAD_OFFSET) {
-		__toc_start = .;
 #ifndef CONFIG_RELOCATABLE
 		__prom_init_toc_start = .;
 		arch/powerpc/kernel/prom_init.o*(.toc .got)
--
2.30.1.766.gb4fecdf3b7-goog
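The 0x8000 bias that the patch folds into .TOC. can be checked with a little arithmetic. The sketch below follows the comment removed from sections.h (r2 points 32kB into the TOC so that 64kB is addressable); the addresses are made up for illustration and alignment requirements of individual load instructions are ignored.

```python
TOC_BIAS = 0x8000  # .TOC. is placed 32 kB past the start of the .got section

def toc_pointer(got_start):
    """Value loaded into r2, given the start address of .got."""
    return got_start + TOC_BIAS

def reachable(got_start, addr):
    """True if addr is within a signed 16-bit displacement of the TOC pointer."""
    disp = addr - toc_pointer(got_start)
    return -0x8000 <= disp <= 0x7FFF

got = 0x1000_0000  # hypothetical address of the .got section

print(hex(toc_pointer(got)))
print(reachable(got, got), reachable(got, got + 0xFFFF), reachable(got, got + 0x10000))
```

Without the bias, only the upper half of the signed displacement range would land inside the table; with it, the full 64kB starting at `got` is covered.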
Re: [PATCH 1/2] Makefile: Remove '--gcc-toolchain' flag
On 2021-03-03, Masahiro Yamada wrote:
> Hi.
>
> On Wed, Mar 3, 2021 at 6:44 AM Fangrui Song wrote:
>> Reviewed-by: Fangrui Song
>>
>> Thanks for the clean-up! --gcc-toolchain= is an obscure way of searching for GCC installation prefixes (--prefix). The logic is complex and different for different distributions/architectures. If we specify --prefix= (-B) explicitly, --gcc-toolchain is not needed.
>
> I tested this, and it worked for me too. Before applying this patch, could you please help me understand the logic?
>
> I checked the manual (https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-b-dir)
>
> -B<dir>, --prefix <arg>, --prefix=<arg>
>     Add <dir> to search path for binaries and object files used implicitly
>
> --gcc-toolchain=<arg>, -gcc-toolchain <arg>
>     Use the gcc toolchain at the given directory
>
> Hmm, this description is too concise to understand how it works...
>
> I use Ubuntu 20.10. I use the distro's default clang located in /usr/bin/clang. I place my aarch64 linaro toolchain in /home/masahiro/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu-gcc, which is not in my PATH environment.
>
> From some experiments of mine,
>
>   clang --target=aarch64-linux-gnu -no-integrated-as \
>     --prefix=/home/masahiro/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu- ...
>
> works almost equivalently to
>
>   PATH=/home/masahiro/tools/aarch64-linaro-7.5/bin:$PATH \
>     clang --target=aarch64-linux-gnu -no-integrated-as ...
>
> Then, clang will pick up aarch64-linux-gnu-as found in the search path. Is this correct?
>
> On the other hand, I could not understand what the purpose of --gcc-toolchain= is. Even if I add --gcc-toolchain=/home/masahiro/tools/aarch64-linaro-7.5, it does not make any difference, and it is completely useless.
>
> I read the comment from stephenhines: https://github.com/ClangBuiltLinux/linux/issues/78
> How could --gcc-toolchain be used in a useful way?
--gcc-toolchain was introduced in https://reviews.llvm.org/rG1af7c219c7113a35415651127f05cdf056b63110 to provide a flexible alternative to autoconf configure-time --with-gcc-toolchain (now cmake variable GCC_INSTALL_PREFIX).

I agree the option is confusing, the documentation is poor, and probably very few people understand what it does. I apologize that my previous reply was not particularly correct. The more correct answer is below:

A --prefix option can specify either of

1) A directory (for something like /a/b/lib/gcc/arm-linux-androideabi, this should be /a/b, the parent directory of 'lib')
2) A path fragment like /usr/bin/aarch64-linux-gnu-

The directory values of the --prefix list and --gcc-toolchain are used to detect GCC installation directories. The directory is used to fetch include directories, system library directories and binutils directories (as, objcopy). (See below: the Linux kernel only needs the binutils executables, so the include/library logic is really useless to us.)

The logic is around https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Gnu.cpp#L1910

    Prefixes = --prefix/-B list (only the directory subset is effective)
    StringRef GCCToolchainDir = --gcc-toolchain= or CMake variable GCC_INSTALL_PREFIX
    if (GCCToolchainDir != "") {
      Prefixes.push_back(std::string(GCCToolchainDir));
    } else {
      if (!D.SysRoot.empty()) {
        Prefixes.push_back(D.SysRoot);
        // Add D.SysRoot+"/usr" to Prefixes. Some distributions add more directories.
        AddDefaultGCCPrefixes(TargetTriple, Prefixes, D.SysRoot);
      }

      // D.InstalledDir is the directory of the clang executable, e.g. /usr/bin
      Prefixes.push_back(D.InstalledDir + "/..");

      if (D.SysRoot.empty())
        AddDefaultGCCPrefixes(TargetTriple, Prefixes, D.SysRoot);
    }

    // Gentoo / ChromeOS specific logic.
    // I think this block is misplaced.
    if (GCCToolchainDir == "" || GCCToolchainDir == D.SysRoot + "/usr") {
      ...
    }

    // Loop over the various components which exist and select the best GCC
    // installation available. GCC installs are ranked by version number.
    Version = GCCVersion::Parse("0.0.0");
    for (const std::string &Prefix : Prefixes) {
      auto &VFS = D.getVFS();
      if (!VFS.exists(Prefix))
        continue;

      // CandidateLibDirs is a subset of {/lib64, /lib32, /lib}.
      for (StringRef Suffix : CandidateLibDirs) {
        const std::string LibDir = Prefix + Suffix.str();
        if (!VFS.exists(LibDir))
          continue;
        bool GCCDirExists = VFS.exists(LibDir + "/gcc");
        bool GCCCrossDirExists = VFS.exists(LibDir + "/gcc-cross");

        // Precise match. Detect $Prefix/lib/$--target
        ScanLibDirForGCCTriple(TargetTriple, Args, LibDir, TargetTriple.str(),
                               false, GCCDirExists, GCCCrossDirExists);
        // Usually empty.
        for (StringRef Candidate : ExtraTripleAliases) // Try these first.
          ScanLibDirForGCCTriple(TargetTriple, Args, LibDir, Candidate,
                                 false, GCCDirExists, GCCCrossDirExists);
        //
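To make the ordering of candidate prefixes easier to follow, here is a small model of the first half of the pseudo-code above. It is a simplification under stated assumptions: `default_prefixes` stands in for what AddDefaultGCCPrefixes contributes (just "/usr" here, although some distributions add more), and the Gentoo/ChromeOS block and the later lib-dir scan are left out.

```python
def gcc_prefixes(prefix_dirs, gcc_toolchain="", sysroot="",
                 installed_dir="/usr/bin", default_prefixes=("/usr",)):
    """Return the ordered list of directories searched for a GCC installation.

    prefix_dirs is the directory subset of the --prefix/-B values.
    """
    prefixes = list(prefix_dirs)
    if gcc_toolchain:
        # An explicit --gcc-toolchain= suppresses every default location.
        prefixes.append(gcc_toolchain)
    else:
        if sysroot:
            prefixes.append(sysroot)
            prefixes.extend(sysroot + p for p in default_prefixes)
        # The parent of the directory holding the clang executable.
        prefixes.append(installed_dir + "/..")
        if not sysroot:
            prefixes.extend(default_prefixes)
    return prefixes

print(gcc_prefixes(["/a/b"], gcc_toolchain="/opt/cross"))
print(gcc_prefixes([]))
print(gcc_prefixes([], sysroot="/sysroot"))
```

The model shows why a --prefix path fragment such as /usr/bin/aarch64-linux-gnu- plays no part in this detection: only directories enter the list.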
Re: [PATCH 2/2] Makefile: Only specify '--prefix=' when building with clang + GNU as
On 2021-03-02, Nathan Chancellor wrote: When building with LLVM_IAS=1, there is no point to specifying '--prefix=' because that flag is only used to find the cross assembler, which is clang itself when building with LLVM_IAS=1. All of the other tools are invoked directly from PATH or a full path specified via the command line, which does not depend on the value of '--prefix='. Sharing commands to reproduce issues becomes a little bit easier without a '--prefix=' value because that '--prefix=' value is specific to a user's machine due to it being an absolute path. Signed-off-by: Nathan Chancellor Reviewed-by: Fangrui Song clang can spawn GNU as (if -f?no-integrated-as is specified) and GNU objcopy (-f?no-integrated-as and -gsplit-dwarf and -g[123]). With LLVM_IAS=1, these cases are ruled out.
Re: [PATCH 1/2] Makefile: Remove '--gcc-toolchain' flag
Reviewed-by: Fangrui Song

Thanks for the clean-up! --gcc-toolchain= is an obscure way of searching for GCC installation prefixes (--prefix). The logic is complex and different for different distributions/architectures. If we specify --prefix= (-B) explicitly, --gcc-toolchain is not needed.

On 2021-03-02, Nathan Chancellor wrote:
> This is not necessary anymore now that we specify '--prefix=', which tells clang exactly where to find the GNU cross tools. This has been verified with self compiled LLVM 10.0.1 and LLVM 13.0.0 as well as a distribution version of LLVM 11.1.0 without binutils in the LLVM toolchain locations.
>
> Signed-off-by: Nathan Chancellor
> ---
>  Makefile | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index f9b54da2fca0..c20f0ad8be73 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -568,10 +568,6 @@ ifneq ($(CROSS_COMPILE),)
>  CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%))
>  GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit))
>  CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(notdir $(CROSS_COMPILE))
> -GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..)
> -endif
> -ifneq ($(GCC_TOOLCHAIN),)
> -CLANG_FLAGS += --gcc-toolchain=$(GCC_TOOLCHAIN)
>  endif
>  ifneq ($(LLVM_IAS),1)
>  CLANG_FLAGS += -no-integrated-as
>
> base-commit: 7a7fd0de4a9804299793e564a555a49c1fc924cb
> --
> 2.31.0.rc0.75.gec125d1bc1
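For readers less used to GNU make text functions, the two flags that remain after this patch can be re-derived as follows. This is an illustrative sketch only: the function name is made up, and `elfedit_path` stands in for the output of `which $(CROSS_COMPILE)elfedit`.

```python
import posixpath

def clang_cross_flags(cross_compile, elfedit_path):
    """Mimic the --target and --prefix values computed in the top-level Makefile."""
    # $(CROSS_COMPILE:%-=%) drops one trailing '-'; $(notdir ...) drops the directory.
    stem = cross_compile[:-1] if cross_compile.endswith("-") else cross_compile
    triple = posixpath.basename(stem)
    # $(dir ...) keeps the trailing slash.
    toolchain_dir = posixpath.dirname(elfedit_path) + "/"
    prefix = toolchain_dir + posixpath.basename(cross_compile)
    return ["--target=" + triple, "--prefix=" + prefix]

flags = clang_cross_flags(
    "aarch64-linux-gnu-",
    "/home/masahiro/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu-elfedit",
)
print(flags)
```

The resulting --prefix is a path fragment ending in the tool prefix, so clang can find `aarch64-linux-gnu-as` by appending the tool name, which is why a separate --gcc-toolchain value adds nothing for the kernel build.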
Re: [PATCH v8] pgo: add clang's Profile Guided Optimization infrastructure
On 2021-02-28, Fangrui Song wrote:

Reviewed-by: Fangrui Song

Some minor items below:

On 2021-02-26, 'Bill Wendling' via Clang Built Linux wrote:

From: Sami Tolvanen

Enable the use of clang's Profile-Guided Optimization[1]. To generate a profile, the kernel is instrumented with PGO counters, a representative workload is run, and the raw profile data is collected from /sys/kernel/debug/pgo/profraw.

The raw profile data must be processed by clang's "llvm-profdata" tool before it can be used during recompilation:

  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw

Multiple raw profiles may be merged during this step.

The data can now be used by the compiler:

  $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...

This initial submission is restricted to x86, as that's the platform we know works. This restriction can be lifted once other platforms have been verified to work with PGO.

Note that this method of profiling the kernel is clang-native, unlike the clang support in kernel/gcov.

[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization

Signed-off-by: Sami Tolvanen
Co-developed-by: Bill Wendling
Signed-off-by: Bill Wendling
---
v8: - Rebased on top-of-tree.
v7: - Fix minor build failure reported by Sedat.
v6: - Add better documentation about the locking scheme and other things.
    - Rename macros to better match the same macros in LLVM's source code.
v5: - Correct padding calculation, discovered by Nathan Chancellor.
v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our own popcount implementation, based on Nick Desaulniers's comment.
v3: - Added change log section based on Sedat Dilek's comments.
v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's testing.
    - Corrected documentation, re PGO flags when using LTO, based on Fangrui Song's comments.
---
 Documentation/dev-tools/index.rst     |   1 +
 Documentation/dev-tools/pgo.rst       | 127 +
 MAINTAINERS                           |   9 +
 Makefile                              |   3 +
 arch/Kconfig                          |   1 +
 arch/x86/Kconfig                      |   1 +
 arch/x86/boot/Makefile                |   1 +
 arch/x86/boot/compressed/Makefile     |   1 +
 arch/x86/crypto/Makefile              |   4 +
 arch/x86/entry/vdso/Makefile          |   1 +
 arch/x86/kernel/vmlinux.lds.S         |   2 +
 arch/x86/platform/efi/Makefile        |   1 +
 arch/x86/purgatory/Makefile           |   1 +
 arch/x86/realmode/rm/Makefile         |   1 +
 arch/x86/um/vdso/Makefile             |   1 +
 drivers/firmware/efi/libstub/Makefile |   1 +
 include/asm-generic/vmlinux.lds.h     |  44 +++
 kernel/Makefile                       |   1 +
 kernel/pgo/Kconfig                    |  35 +++
 kernel/pgo/Makefile                   |   5 +
 kernel/pgo/fs.c                       | 389 ++
 kernel/pgo/instrument.c               | 189 +
 kernel/pgo/pgo.h                      | 203 ++
 scripts/Makefile.lib                  |  10 +
 24 files changed, 1032 insertions(+)
 create mode 100644 Documentation/dev-tools/pgo.rst
 create mode 100644 kernel/pgo/Kconfig
 create mode 100644 kernel/pgo/Makefile
 create mode 100644 kernel/pgo/fs.c
 create mode 100644 kernel/pgo/instrument.c
 create mode 100644 kernel/pgo/pgo.h

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index f7809c7b1ba9..8d6418e85806 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -26,6 +26,7 @@ whole; patches welcome!
    kgdb
    kselftest
    kunit/index
+   pgo
 
 .. only:: subproject and html
 
diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
new file mode 100644
index ..b7f11d8405b7
--- /dev/null
+++ b/Documentation/dev-tools/pgo.rst
@@ -0,0 +1,127 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+Using PGO with the Linux kernel
+===============================
+
+Clang's profiling kernel support (PGO_) enables profiling of the Linux kernel
+when building with Clang. The profiling data is exported via the ``pgo``
+debugfs directory.
+
+.. _PGO: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
+
+
+Preparation
+===========
+
+Configure the kernel with:
+
+.. code-block:: make
+
+   CONFIG_DEBUG_FS=y
+   CONFIG_PGO_CLANG=y
+
+Note that kernels compiled with profiling flags will be significantly larger
+and run slower.
+
+Profiling data will only become accessible once debugfs has been mounted:
+
+.. code-block:: sh
+
+   mount -t debugfs none /sys/kernel/debug
+
+
+Customization
+=============
+
+You can enable or disable profiling for individual file and directories by
+adding a line similar to the following to the respective kernel Makefile:
+
+- For a single file (e.g. main.o)
+
+  .. code-block:: make
+
+     PGO_PROFILE_main.o := y
Re: [PATCH RFC] x86: remove toolchain check for X32 ABI capability
On 2021-02-28, Masahiro Yamada wrote:
> This commit reverts 0bf6276392e9 ("x32: Warn and disable rather than error if binutils too old").
>
> The help text in arch/x86/Kconfig says enabling the X32 ABI support needs binutils 2.22 or later. This is met because the minimal binutils version is 2.23 according to Documentation/process/changes.rst.
>
> I would not say I am familiar with toolchain configuration, but I checked the configure.tgt code in binutils. The elf32_x86_64 emulation mode seems to be included when it is configured for the x86_64-*-linux-* target.
>
> I also tried lld and llvm-objcopy, and succeeded in building x32 VDSO.
>
> I removed the compile-time check in arch/x86/Makefile, in the hope of elf32_x86_64 being always supported.
>
> With this, CONFIG_X86_X32 and CONFIG_X86_X32_ABI will be equivalent. Rename the former to the latter.

Hi Masahiro, the cleanup looks nice!

As for LLVM toolchain support, I don't know of any user using the LLVM binary utilities or LLD. The support in the binary utilities should be minimal anyway (EM_X86_64, ELFCLASS32, ELFDATA2LSB are mostly all the tool needs to know for many utilities), so many of them should just work.

For llvm-objcopy, I know of two issues related to `$(OBJCOPY) -O elf32-x86-64` (actually `objcopy -I elf64-x86-64 -O elf32-x86-64`). Such an operation tries to convert an ELFCLASS64 object file to an ELFCLASS32 object file. It is not very clear what GNU objcopy does. llvm-objcopy is dumb and does not do fancy CLASS conversion.

* {gcc,clang} -gz{,=zlib} produced object files. The Elf{32,64}_Chdr headers are different. It seems that GNU objcopy can convert the headers (https://github.com/ClangBuiltLinux/linux/issues/514). llvm-objcopy cannot do it.
* It seems that GNU objcopy can convert .note.gnu.property (https://github.com/ClangBuiltLinux/linux/issues/1141#issuecomment-678798228). llvm-objcopy cannot do it.

On the linker side, I know TLS relaxations and IBT need special care and I believe LLD does not handle them correctly. Thankfully the kernel does not use thread-local storage so this is not an issue.

So perhaps for most configurations it is already working. Since you've tested it, that is good news to me:)
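As a rough illustration of how little a tool needs in order to recognise an x32 object, the sketch below checks the three ELF header fields named above. The constants are the standard ELF values; `fake_header` is a made-up helper that fabricates just enough of a header for the demonstration and is not a real object file.

```python
import struct

ELFCLASS32, ELFCLASS64 = 1, 2
ELFDATA2LSB = 1
EM_386, EM_X86_64 = 3, 62

def is_x32_object(header):
    """True for a little-endian ELFCLASS32 file whose machine is EM_X86_64."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return False
    ei_class, ei_data = header[4], header[5]
    if ei_data != ELFDATA2LSB:
        return False
    # e_machine is the 16-bit field at offset 18, right after e_type.
    (e_machine,) = struct.unpack_from("<H", header, 18)
    return ei_class == ELFCLASS32 and e_machine == EM_X86_64

def fake_header(ei_class, e_machine):
    """Build the first 20 bytes of an ELF header (hypothetical test input)."""
    ident = b"\x7fELF" + bytes([ei_class, ELFDATA2LSB, 1]) + bytes(9)
    return ident + struct.pack("<HH", 1, e_machine)  # e_type = ET_REL

print(is_x32_object(fake_header(ELFCLASS32, EM_X86_64)))
print(is_x32_object(fake_header(ELFCLASS64, EM_X86_64)))
```

Utilities that only read these fields are indifferent to x32; the objcopy cases listed above are harder because they have to rewrite class-dependent structures rather than just recognise them.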
Re: [PATCH] arm64: vmlinux.lds.S: keep .entry.tramp.text section
On 2021-02-26, Kees Cook wrote:
> On Fri, Feb 26, 2021 at 03:03:39PM +0100, Arnd Bergmann wrote:
>> From: Arnd Bergmann
>>
>> When building with CONFIG_LD_DEAD_CODE_DATA_ELIMINATION, I sometimes see an assertion
>>
>>   ld.lld: error: Entry trampoline text too big
>
> Heh, "too big" seems a weird report for having it discarded. :)
>
> Any idea on this Fangrui?
>
> ( I see this is https://github.com/ClangBuiltLinux/linux/issues/1311 )

This diagnostic is from an ASSERT in arch/arm64/kernel/vmlinux.lds

  ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) == (1 << 16), "Entry trampoline text too big")

In our case (aarch64-linux-gnu-ld or LLD, --gc-sections), all the input sections with this name are discarded, so the output section is either absent (GNU ld) or empty (LLD).

KEEP makes the sections GC roots, and it is appropriate to use here.

However, I worry that many other KEEP keywords in vmlinux.lds are unnecessary: https://lore.kernel.org/linux-arm-kernel/20210226211323.arkvjnr4hifxa...@google.com/

git log -S KEEP -- include/asm-generic/vmlinux.lds.h => there is quite a bit of unjustified usage. Sure, adding KEEP (GC root) is easy and works around problems, but it is not good for CONFIG_LD_DEAD_CODE_DATA_ELIMINATION.

Reviewed-by: Fangrui Song

>> This happens when any reference to the trampoline is discarded at link time. Marking the section as KEEP() avoids the assertion, but I have not figured out whether this is the correct solution for the underlying problem.
>>
>> Signed-off-by: Arnd Bergmann
>
> As a work-around, it seems fine to me.
>
> Reviewed-by: Kees Cook
>
> -Kees
>
>> ---
>>  arch/arm64/kernel/vmlinux.lds.S | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
>> index 926cdb597a45..c5ee9d5842db 100644
>> --- a/arch/arm64/kernel/vmlinux.lds.S
>> +++ b/arch/arm64/kernel/vmlinux.lds.S
>> @@ -96,7 +96,7 @@ jiffies = jiffies_64;
>>  #define TRAMP_TEXT \
>>  	. = ALIGN(PAGE_SIZE); \
>>  	__entry_tramp_text_start = .; \
>> -	*(.entry.tramp.text) \
>> +	KEEP(*(.entry.tramp.text)) \
>>  	. = ALIGN(PAGE_SIZE); \
>>  	__entry_tramp_text_end = .;
>>  #else
>> --
>> 2.29.2
>
> --
> Kees Cook
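Why an empty section trips an assertion worded "too big" can be seen by replaying the TRAMP_TEXT layout by hand. The sketch assumes the 64 kB page size implied by the `(1 << 16)` in the quoted ASSERT; the starting address and the 0x800-byte section size are made-up values.

```python
PAGE_SIZE = 1 << 16  # matches the (1 << 16) in the quoted ASSERT

def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)

def tramp_text_span(dot, input_section_sizes):
    """Mimic TRAMP_TEXT: align, place the input sections, align again."""
    start = align_up(dot, PAGE_SIZE)            # __entry_tramp_text_start
    end = start + sum(input_section_sizes)
    end = align_up(end, PAGE_SIZE)              # __entry_tramp_text_end
    return end - start

def assertion_holds(span):
    # The ASSERT tests for equality, so "too small" fails it as well.
    return span == PAGE_SIZE

kept = tramp_text_span(0x1234, [0x800])   # trampoline text retained
dropped = tramp_text_span(0x1234, [])     # every input section discarded
print(hex(kept), assertion_holds(kept))
print(hex(dropped), assertion_holds(dropped))
```

With the input sections discarded, start and end coincide and the span is zero, so the equality check fails and the linker prints the fixed message even though nothing is oversized.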
Re: [PATCH] [RFC] arm64: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
On 2021-02-25, Arnd Bergmann wrote:
> From: Arnd Bergmann
>
> When looking at kernel size optimizations, I found that arm64 does not currently support HAVE_LD_DEAD_CODE_DATA_ELIMINATION, which enables the --gc-sections flag to the linker.
>
> I see that for a defconfig build with llvm, there are some notable improvements from enabling this, in particular when combined with the recently added CONFIG_LTO_CLANG_THIN and CONFIG_TRIM_UNUSED_KSYMS:
>
>       text      data     bss       dec      hex  filename
>   16570322  10998617  506468  28075407  1ac658f  defconfig/vmlinux
>   16318793  10569913  506468  27395174  1a20466  trim_defconfig/vmlinux
>   16281234  10984848  504291  27770373  1a7be05  gc_defconfig/vmlinux
>   16029705  10556880  504355  27090940  19d5ffc  gc+trim_defconfig/vmlinux
>   17040142  11102945  504196  28647283  1b51f73  thinlto_defconfig/vmlinux
>   16788613  10663201  504196  27956010  1aa932a  thinlto+trim_defconfig/vmlinux
>   16347062  11043384  502499  27892945  1a99cd1  gc+thinlto_defconfig/vmlinux
>   15759453  10532792  502395  26794640  198da90  gc+thinlto+trim_defconfig/vmlinux
>
> I needed a small change to the linker script to get clean randconfig builds, but I have not done any meaningful boot testing on it to see if it works. If there are no regressions, I wonder whether this should be automatically done for LTO builds, given that it improves both kernel size and compile speed.
>
> Link: https://lore.kernel.org/lkml/cak8p3a05vz9hskrzvtxtn+1nf9e+gqebjwtj6n23nfm+elh...@mail.gmail.com/
> Signed-off-by: Arnd Bergmann

For folks who are interested in --gc-sections on metadata sections, I want to make you aware of the implications of __start_/__stop_ symbols and C identifier name sections. You can see https://github.com/ClangBuiltLinux/linux/issues/1307 for a summary. (Its linked blog article has some examples.)

In the kernel linker scripts, most C identifier name sections begin with a double underscore __. Some are surrounded by `KEEP(...)`, some are not.

* A `KEEP` keyword has GC root semantics and makes ld --gc-sections ineffective.
* Without `KEEP`, __start_/__stop_ references from a live input section can unnecessarily retain all the associated C identifier name input sections. The new ld.lld option `-z start-stop-gc` can defeat this rule.

As an example, a __start___jump_table reference from a live section causes all `__jump_table` input sections to be retained, even if you change `KEEP(__jump_table)` to `(__jump_table)`. (If you change the symbol name from `__start_${section}` to something else (e.g. `__start${section}`), the rule will not apply.)

There is a lot of KEEP usage. Perhaps some can be dropped to facilitate ld --gc-sections.
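The __start_/__stop_ retention rule described above can be modelled in a few lines. This is a toy model rather than linker code: the object and section names are invented, KEEP is represented simply by listing a section among the roots, and the `start_stop_gc` flag imitates the effect of `-z start-stop-gc`.

```python
def live_sections(inputs, roots, start_stop_gc=False):
    """Mark live input sections.

    inputs maps an input section to a dict with:
      "output":     name of the output section it is placed in
      "refs":       input sections it references directly
      "start_stop": output sections it references via __start_X / __stop_X
    """
    live, work = set(), list(roots)
    while work:
        name = work.pop()
        if name in live:
            continue
        live.add(name)
        sec = inputs[name]
        work.extend(sec["refs"])
        if not start_stop_gc:
            # Default rule: a __start_X/__stop_X reference from a live section
            # retains every input section named X.
            for out in sec["start_stop"]:
                work.extend(n for n, s in inputs.items() if s["output"] == out)
    return live

inputs = {
    "a.o:.text":        {"output": ".text", "refs": set(), "start_stop": {"__jump_table"}},
    "a.o:__jump_table": {"output": "__jump_table", "refs": set(), "start_stop": set()},
    "b.o:__jump_table": {"output": "__jump_table", "refs": set(), "start_stop": set()},
}

default_live = live_sections(inputs, roots=["a.o:.text"])
gc_live = live_sections(inputs, roots=["a.o:.text"], start_stop_gc=True)
print(sorted(default_live))
print(sorted(gc_live))
```

Under the default rule both `__jump_table` input sections survive even though nothing references them directly, which is the behaviour that removing `KEEP` alone does not change.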
Re: [PATCH v7 1/2] Kbuild: make DWARF version a choice
On 2021-02-04, Nick Desaulniers wrote: On Thu, Feb 4, 2021 at 12:28 PM Mark Wielaard wrote: On Thu, 2021-02-04 at 12:04 -0800, Nick Desaulniers wrote: > On Thu, Feb 4, 2021 at 11:56 AM Mark Wielaard wrote: > > I agree with Jakub. Now that GCC has defaulted to DWARF5 all the > > tools > > have adopted to the new default version. And DWARF5 has been out > > for > > "all of the tools" ? I believe so yes, we did a mass-rebuild of all of Fedora a few weeks back with a GCC11 pre-release and did find some tools which weren't ready, but as far as I know all have been fixed now. I did try to Congrats, I'm sure that was _a lot_ of work. Our toolchain folks have been pouring a lot of effort over getting our internal code all moved over, and it doesn't look like it's been easy from my perspective. coordinate with the Suse and Debian packagers too, so all the major distros should have all the necessary updates once switching to GCC11. That's great for users of the next Fedora release who can and will upgrade, but I wouldn't assume kernel developers can, or will (or are even using those distros). Most recently, there was discussion on the list about upgrading the minimally required version of GCC for building the kernel to GCC 5.1; we still had developers complain about abandoning GCC 4.9. And Guenter shared with me a list of architectures he tests with where he cannot upgrade the version of GCC in order to keep building them. https://github.com/groeck/linux-build-test/blob/master/bin/stable-build-arch.sh (I hope someone sent bug reports for those) My intent is very much to allow for users of toolchains that have not switched the implicit default (such as all of the supported versions of GCC that have been released ie. up to GCC 10.2, and Clang; so all toolchains the kernel still supports) to enjoy the size saving of DWARF v5, and find what other tooling needs to be updated. > > more than 4 years already. 
It isn't unreasonable to assume that people > > using GCC11 will also be using the rest of the toolchain that has moved > > on. Which DWARF consumers are you concerned about not being ready for > > GCC defaulting to DWARF5 once GCC11 is released? > > Folks who don't have top of tree pahole or binutils are the two that > come to mind. I believe pahole just saw a 1.20 release. I am sure it will be widely available once GCC11 is released (which will still be 1 or 2 months) and people are actually using it. Or do you expect distros/people are going to upgrade to GCC11 without updating their other toolchain tools? Does no one test linux kernel builds with top of tree GCC built from source? Or does that require "updating their other toolchain tools?" If I build ToT GCC from source, do I need to do the same for binutils-gdb in order to build the kernel? Pretty sure I don't. https://bugzilla.redhat.com/show_bug.cgi?id=1922707 and https://bugzilla.redhat.com/show_bug.cgi?id=1922698 look like user reports to me, but hopefully some CI system reported earlier that builds of the Linux kernel with GCC 11 pre release would produce the warnings from those bug report. Otherwise it looks like evidence that users "are going to upgrade to GCC11 without updating their other toolchain tools." In the case of pahole, they could not, because fixes were not yet written. "Just upgrade" doesn't work if there's no fix yet upstream. (pahole is reported fixed for that specific issue, FWIW). BTW. GCC11 doesn't need top of tree binutils, it will detect the binutils capabilities (bugs) and adjust its DWARF output based on it. Yes, I saw https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=6923255e35a3d54f2083ad0f67edebb3f1b86506 and https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=1aeb7d7d67d167297ca2f4a97ef20f68e7546b4c. It's nice that GCC can tightly couple to a version of binutils. 
Clang without its integrated assembler can make no such assumptions about which assembler the user will prefer to use instead at runtime, and without binutils 2.35.1 being widely available as we all would like, leads to issues shipping DWARF v5 by default. > I don't have specifics on out of tree consumers, but > some Aarch64 extensions which had some changes to DWARF for ARMv8.3 > PAC support broke some debuggers. It would be really helpful if you could provide some specifics. I did fix some consumers to handle the PAC operands in CFI last year, but I don't believe that had anything to do with the default DWARF version, just with dealing with DW_CFA_AARCH64_negate_ra_state. Yep, that's the one. I suspect that the same out of tree consumers of DWARF that did not see that coming will similarly be stumped when presented with DWARF v5, but it's hypothetical, so not much of an argument I'll admit. I just wouldn't bet that an upgrade to DWARF v5 will be painless at this point in time, as evidenced by how much blood has been poured into finding what tools out there were broken and needed to be fixed. I also recognize we can't drag our heels
Re: [PATCH v7 1/2] Kbuild: make DWARF version a choice
On 2021-02-04, Masahiro Yamada wrote:
On Sat, Jan 30, 2021 at 10:52 AM Nathan Chancellor wrote:
On Fri, Jan 29, 2021 at 04:44:00PM -0800, Nick Desaulniers wrote:
> Modifies CONFIG_DEBUG_INFO_DWARF4 to be a member of a choice which is
> the default. Does so in a way that's forward compatible with existing
> configs, and makes adding future versions more straightforward.
>
> GCC since ~4.8 has defaulted to this DWARF version implicitly.
>
> Suggested-by: Arvind Sankar
> Suggested-by: Fangrui Song
> Suggested-by: Nathan Chancellor
> Suggested-by: Masahiro Yamada
> Signed-off-by: Nick Desaulniers

One comment below:

Reviewed-by: Nathan Chancellor

> ---
>  Makefile          |  5 ++---
>  lib/Kconfig.debug | 16 +++-
>  2 files changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 95ab9856f357..d2b4980807e0 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -830,9 +830,8 @@ ifneq ($(LLVM_IAS),1)
>  KBUILD_AFLAGS += -Wa,-gdwarf-2

It is probably worth a comment somewhere that assembly files will still
have DWARF v2.

I agree. Noting the reason would be helpful. Could you summarize Jakub's
comment in short?

https://patchwork.kernel.org/project/linux-kbuild/patch/20201022012106.1875129-1-ndesaulni...@google.com/#23727667

One more question. Can we remove the -g option as follows?

 ifdef CONFIG_DEBUG_INFO_SPLIT
 DEBUG_CFLAGS += -gsplit-dwarf
-else
-DEBUG_CFLAGS += -g
 endif

With GCC 11/Clang 12, -gsplit-dwarf no longer implies -g2
(https://reviews.llvm.org/D80391). It may be worth checking whether
-gsplit-dwarf is used without an option that enables debug info.

In the current mainline code, -g is the only debug option if
CONFIG_DEBUG_INFO_DWARF4 is disabled. The GCC manual says:

https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Debugging-Options.html#Debugging-Options

  -g
    Produce debugging information in the operating system’s native
    format (stabs, COFF, XCOFF, or DWARF). GDB can work with this
    debugging information.

Of course, we expect the -g option will produce the debug info in the
DWARF format. With this patch set applied, it is very explicit: not only
the format type, but also the version. The compiler will be given either
-gdwarf-4 or -gdwarf-5, making the -g option redundant, I think.

-gdwarf-N does imply -g2, but personally I'd not suggest removing -g if
it already exists. The non-orthogonality is the reason Clang has
-fdebug-default-version (https://reviews.llvm.org/D69822).

>  endif
>
> -ifdef CONFIG_DEBUG_INFO_DWARF4
> -DEBUG_CFLAGS += -gdwarf-4
> -endif
> +dwarf-version-$(CONFIG_DEBUG_INFO_DWARF4) := 4
> +DEBUG_CFLAGS += -gdwarf-$(dwarf-version-y)
>
>  ifdef CONFIG_DEBUG_INFO_REDUCED
>  DEBUG_CFLAGS += $(call cc-option, -femit-struct-debug-baseonly) \
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index e906ea906cb7..94c1a7ed6306 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -256,13 +256,19 @@ config DEBUG_INFO_SPLIT
>  to know about the .dwo files and include them.
>  Incompatible with older versions of ccache.
>
> +choice
> + prompt "DWARF version"
> + help
> +   Which version of DWARF debug info to emit.
> +
>  config DEBUG_INFO_DWARF4
> - bool "Generate dwarf4 debuginfo"
> + bool "Generate DWARF Version 4 debuginfo"
>   help
> -   Generate dwarf4 debug info. This requires recent versions
> -   of gcc and gdb. It makes the debug information larger.
> -   But it significantly improves the success of resolving
> -   variables in gdb on optimized code.
> +   Generate DWARF v4 debug info. This requires gcc 4.5+ and gdb 7.0+.
> +   It makes the debug information larger, but it significantly
> +   improves the success of resolving variables in gdb on optimized code.
> +
> +endchoice # "DWARF version"
>
>  config DEBUG_INFO_BTF
>   bool "Generate BTF typeinfo"
> --
> 2.30.0.365.g02bc693789-goog
>

--
Best Regards
Masahiro Yamada
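As an illustration only (not part of the patch): the
`dwarf-version-$(CONFIG_...) := N` idiom in the quoted Makefile hunk works
because make expands each CONFIG symbol to `y` or to nothing, so only the
enabled choice member ends up assigning `dwarf-version-y`. Below is a
rough Python model of that selection. The `config` dict standing in for
.config and the function name `debug_cflags` are made up for this sketch,
and the DWARF5 entry assumes the follow-up patch in the series is applied.

```python
def debug_cflags(config):
    """Model the Kbuild idiom `dwarf-version-$(CONFIG_X) := N`.

    `config` is a hypothetical dict mapping CONFIG symbols to "y";
    disabled symbols are simply absent, like an unset make variable.
    """
    variables = {}
    for symbol, version in (("CONFIG_DEBUG_INFO_DWARF4", 4),
                            ("CONFIG_DEBUG_INFO_DWARF5", 5)):
        # $(CONFIG_X) expands to "y" when enabled and to "" otherwise,
        # so the assignment lands in "dwarf-version-y" or "dwarf-version-".
        suffix = config.get(symbol, "")
        variables["dwarf-version-" + suffix] = version

    chosen = variables.get("dwarf-version-y")
    if chosen is None:
        return []
    # The thread notes that -gdwarf-N already implies -g2, which is why
    # a separate -g was argued to be redundant.
    return ["-gdwarf-%d" % chosen]
```

Because the options are members of a Kconfig choice, at most one symbol is
`y` in practice, so the "last assignment wins" behaviour never matters.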
Re: [PATCH v6 2/2] Kbuild: implement support for DWARF v5
On 2021-01-29, Nick Desaulniers wrote:

DWARF v5 is the latest standard of the DWARF debug info format.

Feature detection of DWARF5 is onerous, especially given that we've
removed $(AS), so we must query $(CC) for DWARF5 assembler directive
support.

The DWARF version of a binary can be validated with:
$ llvm-dwarfdump vmlinux | head -n 4 | grep version
or
$ readelf --debug-dump=info vmlinux 2>/dev/null | grep Version

DWARF5 wins significantly in terms of size when mixed with compression
(CONFIG_DEBUG_INFO_COMPRESSED).

363M    vmlinux.clang12.dwarf5.compressed
434M    vmlinux.clang12.dwarf4.compressed
439M    vmlinux.clang12.dwarf2.compressed
457M    vmlinux.clang12.dwarf5
536M    vmlinux.clang12.dwarf4
548M    vmlinux.clang12.dwarf2

515M    vmlinux.gcc10.2.dwarf5.compressed
599M    vmlinux.gcc10.2.dwarf4.compressed
624M    vmlinux.gcc10.2.dwarf2.compressed
630M    vmlinux.gcc10.2.dwarf5
765M    vmlinux.gcc10.2.dwarf4
809M    vmlinux.gcc10.2.dwarf2

Though the quality of debug info is harder to quantify; size is not a
proxy for quality.

Jakub notes:
  All [GCC] 5.1 - 6.x did was start accepting -gdwarf-5 as experimental
  option that enabled some small DWARF subset (initially only a few
  DW_LANG_* codes newly added to DWARF5 drafts). Only GCC 7 (released
  after DWARF 5 has been finalized) started emitting DWARF5 section
  headers and got most of the DWARF5 changes in...

Version check GCC so that we don't need to worry about the difference in
command line args between GNU readelf and llvm-readelf/llvm-dwarfdump to
validate the DWARF Version in the assembler feature detection script.

GNU `as` only recently gained support for specifying -gdwarf-5, so when
compiling with Clang but without Clang's integrated assembler
(LLVM_IAS=1 is not set), explicitly add -Wa,-gdwarf-5 to DEBUG_CFLAGS.

Disabled for now if CONFIG_DEBUG_INFO_BTF is set; pahole doesn't yet
recognize the new additions to the DWARF debug info. Thanks to Sedat for
the report.
Link: http://www.dwarfstd.org/doc/DWARF5.pdf
Reported-by: Sedat Dilek
Suggested-by: Arvind Sankar
Suggested-by: Caroline Tice
Suggested-by: Fangrui Song
Suggested-by: Jakub Jelinek
Suggested-by: Masahiro Yamada
Suggested-by: Nathan Chancellor
Signed-off-by: Nick Desaulniers
---
 Makefile                          | 12
 include/asm-generic/vmlinux.lds.h |  6 +-
 lib/Kconfig.debug                 | 18 ++
 scripts/test_dwarf5_support.sh    |  8
 4 files changed, 43 insertions(+), 1 deletion(-)
 create mode 100755 scripts/test_dwarf5_support.sh

diff --git a/Makefile b/Makefile
index 20141cd9319e..bed8b3b180b8 100644
--- a/Makefile
+++ b/Makefile
@@ -832,8 +832,20 @@ endif

 dwarf-version-$(CONFIG_DEBUG_INFO_DWARF2) := 2
 dwarf-version-$(CONFIG_DEBUG_INFO_DWARF4) := 4
+dwarf-version-$(CONFIG_DEBUG_INFO_DWARF5) := 5
 DEBUG_CFLAGS += -gdwarf-$(dwarf-version-y)

+# If using clang without the integrated assembler, we need to explicitly tell
+# GAS that we will be feeding it DWARF v5 assembler directives. Kconfig should
+# detect whether the version of GAS supports DWARF v5.
+ifdef CONFIG_CC_IS_CLANG
+ifneq ($(LLVM_IAS),1)
+ifeq ($(dwarf-version-y),5)
+DEBUG_CFLAGS += -Wa,-gdwarf-5
+endif
+endif
+endif
+
 ifdef CONFIG_DEBUG_INFO_REDUCED
 DEBUG_CFLAGS += $(call cc-option, -femit-struct-debug-baseonly) \
    $(call cc-option,-fno-var-tracking)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 34b7e0d2346c..f8d5455cd87f 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -843,7 +843,11 @@
   .debug_types 0 : { *(.debug_types) } \
   /* DWARF 5 */ \
   .debug_macro 0 : { *(.debug_macro) } \
-  .debug_addr 0 : { *(.debug_addr) }
+  .debug_addr 0 : { *(.debug_addr) } \
+  .debug_line_str 0 : { *(.debug_line_str) } \
+  .debug_loclists 0 : { *(.debug_loclists) } \
+  .debug_rnglists 0 : { *(.debug_rnglists) } \
+  .debug_str_offsets 0 : { *(.debug_str_offsets) }

 /* Stabs debugging sections. */
 #define STABS_DEBUG \
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 1850728b23e6..09146b1af20d 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -273,6 +273,24 @@ config DEBUG_INFO_DWARF4
    It makes the debug information larger, but it significantly
    improves the success of resolving variables in gdb on optimized code.

+config DEBUG_INFO_DWARF5
+ bool "Generate DWARF Version 5 debuginfo"
+ depends on GCC_VERSION >= 5 || CC_IS_CLANG
+ depends on CC_IS_GCC || $(success,$(srctree)/scripts/test_dwarf5_support.sh $(CC) $(CLANG_FLAGS))
+ depends on !DEBUG_INFO_
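To put the size table from the commit message into relative terms, here
is a small worked calculation. The numbers are copied from the table
above; the dictionary layout and the function names are only for this
sketch and do not come from the patch.

```python
# vmlinux sizes in MB, copied from the commit message.
# Key: (compiler, compressed?) -> {DWARF version: size}
SIZES_MB = {
    ("clang12", False): {2: 548, 4: 536, 5: 457},
    ("clang12", True): {2: 439, 4: 434, 5: 363},
    ("gcc10.2", False): {2: 809, 4: 765, 5: 630},
    ("gcc10.2", True): {2: 624, 4: 599, 5: 515},
}


def saving_percent(old_mb, new_mb):
    """Relative size reduction when going from old_mb to new_mb."""
    return round(100.0 * (old_mb - new_mb) / old_mb, 1)


def dwarf5_savings():
    """Percentage saved by DWARF v5 over DWARF v4 for each build."""
    return {key: saving_percent(sizes[4], sizes[5])
            for key, sizes in SIZES_MB.items()}


if __name__ == "__main__":
    for (compiler, compressed), pct in sorted(dwarf5_savings().items()):
        label = "compressed" if compressed else "uncompressed"
        print("%s %s: %.1f%% smaller than DWARF v4" % (compiler, label, pct))
```

As the message itself says, size is not a proxy for debug info quality,
so these percentages only describe the on-disk footprint.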
Re: [PATCH v6 2/2] Kbuild: implement support for DWARF v5
On 2021-01-29, Nick Desaulniers wrote:

DWARF v5 is the latest standard of the DWARF debug info format.

Feature detection of DWARF5 is onerous, especially given that we've
removed $(AS), so we must query $(CC) for DWARF5 assembler directive
support.

The DWARF version of a binary can be validated with:
$ llvm-dwarfdump vmlinux | head -n 4 | grep version
or
$ readelf --debug-dump=info vmlinux 2>/dev/null | grep Version

DWARF5 wins significantly in terms of size when mixed with compression
(CONFIG_DEBUG_INFO_COMPRESSED).

363M    vmlinux.clang12.dwarf5.compressed
434M    vmlinux.clang12.dwarf4.compressed
439M    vmlinux.clang12.dwarf2.compressed
457M    vmlinux.clang12.dwarf5
536M    vmlinux.clang12.dwarf4
548M    vmlinux.clang12.dwarf2

515M    vmlinux.gcc10.2.dwarf5.compressed
599M    vmlinux.gcc10.2.dwarf4.compressed
624M    vmlinux.gcc10.2.dwarf2.compressed
630M    vmlinux.gcc10.2.dwarf5
765M    vmlinux.gcc10.2.dwarf4
809M    vmlinux.gcc10.2.dwarf2

Though the quality of debug info is harder to quantify; size is not a
proxy for quality.

Jakub notes:
  All [GCC] 5.1 - 6.x did was start accepting -gdwarf-5 as experimental
  option that enabled some small DWARF subset (initially only a few
  DW_LANG_* codes newly added to DWARF5 drafts). Only GCC 7 (released
  after DWARF 5 has been finalized) started emitting DWARF5 section
  headers and got most of the DWARF5 changes in...

Version check GCC so that we don't need to worry about the difference in
command line args between GNU readelf and llvm-readelf/llvm-dwarfdump to
validate the DWARF Version in the assembler feature detection script.

GNU `as` only recently gained support for specifying -gdwarf-5, so when
compiling with Clang but without Clang's integrated assembler
(LLVM_IAS=1 is not set), explicitly add -Wa,-gdwarf-5 to DEBUG_CFLAGS.

Disabled for now if CONFIG_DEBUG_INFO_BTF is set; pahole doesn't yet
recognize the new additions to the DWARF debug info. Thanks to Sedat for
the report.
Link: http://www.dwarfstd.org/doc/DWARF5.pdf
Reported-by: Sedat Dilek
Suggested-by: Arvind Sankar
Suggested-by: Caroline Tice
Suggested-by: Fangrui Song
Suggested-by: Jakub Jelinek
Suggested-by: Masahiro Yamada
Suggested-by: Nathan Chancellor
Signed-off-by: Nick Desaulniers
---
 Makefile                          | 12
 include/asm-generic/vmlinux.lds.h |  6 +-
 lib/Kconfig.debug                 | 18 ++
 scripts/test_dwarf5_support.sh    |  8
 4 files changed, 43 insertions(+), 1 deletion(-)
 create mode 100755 scripts/test_dwarf5_support.sh

diff --git a/Makefile b/Makefile
index 20141cd9319e..bed8b3b180b8 100644
--- a/Makefile
+++ b/Makefile
@@ -832,8 +832,20 @@ endif

 dwarf-version-$(CONFIG_DEBUG_INFO_DWARF2) := 2
 dwarf-version-$(CONFIG_DEBUG_INFO_DWARF4) := 4
+dwarf-version-$(CONFIG_DEBUG_INFO_DWARF5) := 5
 DEBUG_CFLAGS += -gdwarf-$(dwarf-version-y)

+# If using clang without the integrated assembler, we need to explicitly tell
+# GAS that we will be feeding it DWARF v5 assembler directives. Kconfig should
+# detect whether the version of GAS supports DWARF v5.
+ifdef CONFIG_CC_IS_CLANG
+ifneq ($(LLVM_IAS),1)
+ifeq ($(dwarf-version-y),5)
+DEBUG_CFLAGS += -Wa,-gdwarf-5
+endif
+endif
+endif
+
 ifdef CONFIG_DEBUG_INFO_REDUCED
 DEBUG_CFLAGS += $(call cc-option, -femit-struct-debug-baseonly) \
    $(call cc-option,-fno-var-tracking)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 34b7e0d2346c..f8d5455cd87f 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -843,7 +843,11 @@
   .debug_types 0 : { *(.debug_types) } \
   /* DWARF 5 */ \
   .debug_macro 0 : { *(.debug_macro) } \
-  .debug_addr 0 : { *(.debug_addr) }
+  .debug_addr 0 : { *(.debug_addr) } \
+  .debug_line_str 0 : { *(.debug_line_str) } \
+  .debug_loclists 0 : { *(.debug_loclists) } \
+  .debug_rnglists 0 : { *(.debug_rnglists) } \
+  .debug_str_offsets 0 : { *(.debug_str_offsets) }

Add .debug_names for -gdwarf-5 -gpubnames. The internal linker script of
GNU ld 2.36 will have it:
https://sourceware.org/pipermail/binutils/2021-January/115064.html
(Compilers don't generate .debug_sup; I added it to GNU ld just for
future-proofing.)

 /* Stabs debugging sections. */
 #define STABS_DEBUG \
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 1850728b23e6..09146b1af20d 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -273,6 +273,24 @@ config DEBUG_INFO_DWARF4
    It makes the debug information larger, but it significantly
    improves the success of resolving variables in gdb on optimized code.

+config DEBUG_INFO_DWARF5
+ b
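For readers wondering what the llvm-dwarfdump/readelf checks in the
commit message are actually reading: every unit in .debug_info starts
with a length field followed by a 2-byte version. A minimal sketch of
pulling that version out of raw section bytes follows. It assumes the
uncompressed .debug_info contents have already been extracted by some
other means and only looks at the first unit; the function name is
invented for this example and this is not how the kernel's
test_dwarf5_support.sh works.

```python
import struct


def dwarf_unit_version(debug_info, little_endian=True):
    """Return (version, is_64bit) for the first unit in a .debug_info blob.

    `debug_info` is assumed to be the raw, uncompressed section contents.
    """
    prefix = "<" if little_endian else ">"
    (initial_length,) = struct.unpack_from(prefix + "I", debug_info, 0)
    if initial_length == 0xFFFFFFFF:
        # 64-bit DWARF: 4-byte escape value, then an 8-byte unit_length,
        # so the version field starts at offset 12.
        offset, is_64bit = 12, True
    else:
        # 32-bit DWARF: the 4-byte unit_length is followed by the version.
        offset, is_64bit = 4, False
    (version,) = struct.unpack_from(prefix + "H", debug_info, offset)
    return version, is_64bit
```

With CONFIG_DEBUG_INFO_COMPRESSED the section would have to be
decompressed first, which this sketch deliberately leaves out.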
Re: [PATCH] vmlinux.lds.h: Define SANITIZER_DISCARDS with CONFIG_GCOV_KERNEL=y
On 2021-01-29, Nick Desaulniers wrote:
On Fri, Jan 29, 2021 at 12:11 PM Nathan Chancellor wrote:

clang produces .eh_frame sections when CONFIG_GCOV_KERNEL is enabled,
even when -fno-asynchronous-unwind-tables is in KBUILD_CFLAGS:

$ make CC=clang vmlinux
...
ld: warning: orphan section `.eh_frame' from `init/main.o' being placed in section `.eh_frame'
ld: warning: orphan section `.eh_frame' from `init/version.o' being placed in section `.eh_frame'
ld: warning: orphan section `.eh_frame' from `init/do_mounts.o' being placed in section `.eh_frame'
ld: warning: orphan section `.eh_frame' from `init/do_mounts_initrd.o' being placed in section `.eh_frame'
ld: warning: orphan section `.eh_frame' from `init/initramfs.o' being placed in section `.eh_frame'
ld: warning: orphan section `.eh_frame' from `init/calibrate.o' being placed in section `.eh_frame'
ld: warning: orphan section `.eh_frame' from `init/init_task.o' being placed in section `.eh_frame'
...

$ rg "GCOV_KERNEL|GCOV_PROFILE_ALL" .config
CONFIG_GCOV_KERNEL=y
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
CONFIG_GCOV_PROFILE_ALL=y

This was already handled for a couple of other options in commit
d812db78288d ("vmlinux.lds.h: Avoid KASAN and KCSAN's unwanted
sections") and there is an open LLVM bug for this issue. Take advantage
of that section for this config as well so that there are no more orphan
warnings.

Link: https://bugs.llvm.org/show_bug.cgi?id=46478
Link: https://github.com/ClangBuiltLinux/linux/issues/1069
Reported-by: kernel test robot
Signed-off-by: Nathan Chancellor

Reviewed-by: Nick Desaulniers

I suspect we're going to need to add module level attributes in LLVM IR
for these options, then check those when synthesizing new function
definitions within LLVM. At least we'll be able to point to this file
and say "hey, this is a general problem in LLVM, and here are 3 specific
cases now where it's a problem." Not a large problem, but would help us
save some bytes in the final object. LLVM is not producing data in this
section for all code, just these synthesized routines.

Maybe. There is also a long list of security features which may impose
additional requirements. Adding a module flag metadata for each such
feature will be a long battle. For .eh_frame, I think it is
important/generic enough, and useful enough to other applications, that
it deserves special handling (and I can look into it). For .init_array,
I am not too sure.

---
 include/asm-generic/vmlinux.lds.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index b2b3d81b1535..f753fd449436 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -988,12 +988,13 @@
 #endif

 /*
- * Clang's -fsanitize=kernel-address and -fsanitize=thread produce
+ * Clang's -fsanitize=kernel-address, -fsanitize=thread,
+ * and -fprofile-arcs -ftest-coverage produce unwanted
  * unwanted sections (.eh_frame and .init_array.*), but
  * CONFIG_CONSTRUCTORS wants to keep any .init_array.* sections.
  * https://bugs.llvm.org/show_bug.cgi?id=46478
  */
-#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KCSAN)
+#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KCSAN) || defined(CONFIG_GCOV_KERNEL)
 # ifdef CONFIG_CONSTRUCTORS
 # define SANITIZER_DISCARDS \
  *(.eh_frame)

base-commit: bec4c2968fce2f44ce62d05288a633cd99a722eb
--
2.30.0

Drop -ftest-coverage from the comment: -ftest-coverage just produces
.gcno files and does not affect code generation.

Reviewed-by: Fangrui Song
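When checking whether a change like this actually silences the linker, it
can help to tally the orphan-section warnings before and after. The
helper below is only a convenience sketch: the regular expression is
written against the warning format shown in the log above, and the
function name is made up.

```python
import re
from collections import Counter

# Matches lines such as:
#   ld: warning: orphan section `.eh_frame' from `init/main.o' being placed ...
ORPHAN_RE = re.compile(
    r"warning: orphan section `(?P<section>[^']+)' from `(?P<obj>[^']+)'"
)


def count_orphans(ld_output):
    """Count orphan-section warnings per section name in captured ld output."""
    counts = Counter()
    for line in ld_output.splitlines():
        match = ORPHAN_RE.search(line)
        if match:
            counts[match.group("section")] += 1
    return counts
```

Feeding it the captured build output before and after applying the patch
should show the `.eh_frame` count dropping, assuming the linker's message
format matches the one quoted in this thread.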
[tip: x86/build] x86/build: Treat R_386_PLT32 relocation as R_386_PC32
The following commit has been merged into the x86/build branch of tip:

Commit-ID:     bb73d07148c405c293e576b40af37737faf23a6a
Gitweb:        https://git.kernel.org/tip/bb73d07148c405c293e576b40af37737faf23a6a
Author:        Fangrui Song
AuthorDate:    Wed, 27 Jan 2021 12:56:00 -08:00
Committer:     Borislav Petkov
CommitterDate: Thu, 28 Jan 2021 12:24:06 +01:00

x86/build: Treat R_386_PLT32 relocation as R_386_PC32

This is similar to commit

  b21ebf2fb4cd ("x86: Treat R_X86_64_PLT32 as R_X86_64_PC32")

but for i386. As far as the kernel is concerned, R_386_PLT32 can be
treated the same as R_386_PC32.

R_386_PLT32/R_X86_64_PLT32 are PC-relative relocation types which
can only be used by branches. If the referenced symbol is defined
externally, a PLT will be used.

R_386_PC32/R_X86_64_PC32 are PC-relative relocation types which can be
used by address taking operations and branches. If the referenced symbol
is defined externally, a copy relocation/canonical PLT entry will be
created in the executable.

On x86-64, there is no PIC vs non-PIC PLT distinction and an
R_X86_64_PLT32 relocation is produced for both `call/jmp foo` and
`call/jmp foo@PLT` with newer (2018) GNU as/LLVM integrated assembler.
This avoids canonical PLT entries (st_shndx=0, st_value!=0).

On i386, there are 2 types of PLTs, PIC and non-PIC. Currently,
the GCC/GNU as convention is to use R_386_PC32 for non-PIC PLT and
R_386_PLT32 for PIC PLT. Copy relocations/canonical PLT entries
are possible ABI issues but GCC/GNU as will likely keep the status
quo because (1) the ABI is legacy (2) the change will drop a GNU
ld diagnostic for non-default visibility ifunc in shared objects.

clang-12 -fno-pic (since [1]) can emit R_386_PLT32 for compiler
generated function declarations, because preventing canonical PLT
entries is weighed over the rare ifunc diagnostic.

Further info for the more interested:

  https://github.com/ClangBuiltLinux/linux/issues/1210
  https://sourceware.org/bugzilla/show_bug.cgi?id=27169
  https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6 [1]

 [ bp: Massage commit message. ]

Reported-by: Arnd Bergmann
Signed-off-by: Fangrui Song
Signed-off-by: Borislav Petkov
Reviewed-by: Nick Desaulniers
Reviewed-by: Nathan Chancellor
Tested-by: Nick Desaulniers
Tested-by: Nathan Chancellor
Tested-by: Sedat Dilek
Link: https://lkml.kernel.org/r/20210127205600.1227437-1-mask...@google.com
---
 arch/x86/kernel/module.c |  1 +
 arch/x86/tools/relocs.c  | 12
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index 34b153c..5e9a34b 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -114,6 +114,7 @@ int apply_relocate(Elf32_Shdr *sechdrs,
    *location += sym->st_value;
    break;
   case R_386_PC32:
+  case R_386_PLT32:
    /* Add the value, subtract its position */
    *location += sym->st_value - (uint32_t)location;
    break;
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index ce7188c..1c3a196 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -867,9 +867,11 @@ static int do_reloc32(struct section *sec, Elf_Rel *rel, Elf_Sym *sym,
  case R_386_PC32:
  case R_386_PC16:
  case R_386_PC8:
+ case R_386_PLT32:
   /*
-   * NONE can be ignored and PC relative relocations don't
-   * need to be adjusted.
+   * NONE can be ignored and PC relative relocations don't need
+   * to be adjusted. Because sym must be defined, R_386_PLT32 can
+   * be treated the same way as R_386_PC32.
    */
   break;

@@ -910,9 +912,11 @@ static int do_reloc_real(struct section *sec, Elf_Rel *rel, Elf_Sym *sym,
  case R_386_PC32:
  case R_386_PC16:
  case R_386_PC8:
+ case R_386_PLT32:
   /*
-   * NONE can be ignored and PC relative relocations don't need
-   * need to be adjusted.
+   * NONE can be ignored and PC relative relocations don't need
+   * to be adjusted. Because sym must be defined, R_386_PLT32 can
+   * be treated the same way as R_386_PC32.
    */
   break;
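To make the module.c hunk concrete, here is a small Python model of the
arithmetic in apply_relocate(): i386 uses REL relocations, so the addend
is whatever is already stored at the patched location, and both
R_386_PC32 and R_386_PLT32 end up computing "symbol + addend - place"
modulo 2^32. The relocation type numbers are the standard i386 ELF
values; the Python function itself is purely illustrative and is not
kernel code.

```python
MASK32 = 0xFFFFFFFF

# Relocation type numbers from the i386 ELF psABI.
R_386_32 = 1
R_386_PC32 = 2
R_386_PLT32 = 4


def apply_relocate(r_type, addend, sym_value, location):
    """Return the new 32-bit word to store at `location`.

    `addend` is the word currently stored there (REL-style relocation),
    `sym_value` the resolved symbol address, `location` the address of
    the word being patched.
    """
    if r_type == R_386_32:
        # Absolute: *location += sym->st_value
        return (addend + sym_value) & MASK32
    if r_type in (R_386_PC32, R_386_PLT32):
        # PC-relative: *location += sym->st_value - (uint32_t)location.
        # Inside the kernel the symbol is always defined, so no PLT is
        # involved and PLT32 can share this branch with PC32.
        return (addend + sym_value - location) & MASK32
    raise ValueError("unsupported relocation type %d" % r_type)
```

For a `call` whose 4-byte operand sits at 0x1001 with the usual -4 addend
and a target at 0x2000, both types yield 0xFFB, which is the distance
from the next instruction (0x1005) to the target.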
[PATCH v4] x86: Treat R_386_PLT32 as R_386_PC32
This is similar to commit b21ebf2fb4cd ("x86: Treat R_X86_64_PLT32 as
R_X86_64_PC32"), but for i386. As far as Linux kernel is concerned,
R_386_PLT32 can be treated the same as R_386_PC32.

R_386_PLT32/R_X86_64_PLT32 are PC-relative relocation types which can
only be used by branches. If the referenced symbol is defined
externally, a PLT will be used.
R_386_PC32/R_X86_64_PC32 are PC-relative relocation types which can be
used by address taking operations and branches. If the referenced symbol
is defined externally, a copy relocation/canonical PLT entry will be
created in the executable.

On x86-64, there is no PIC vs non-PIC PLT distinction and an
R_X86_64_PLT32 relocation is produced for both `call/jmp foo` and
`call/jmp foo@PLT` with newer (2018) GNU as/LLVM integrated assembler.
This avoids canonical PLT entries (st_shndx=0, st_value!=0).

On i386, there are 2 types of PLTs, PIC and non-PIC. Currently the
GCC/GNU as convention is to use R_386_PC32 for non-PIC PLT and
R_386_PLT32 for PIC PLT. Copy relocations/canonical PLT entries are
possible ABI issues but GCC/GNU as will likely keep the status quo
because (1) the ABI is legacy (2) the change will drop a GNU ld
diagnostic for non-default visibility ifunc in shared objects.
https://sourceware.org/bugzilla/show_bug.cgi?id=27169

clang-12 -fno-pic (since
https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6)
can emit R_386_PLT32 for compiler generated function declarations,
because preventing canonical PLT entries is weighed over the rare ifunc
diagnostic.

Link: https://github.com/ClangBuiltLinux/linux/issues/1210
Reported-by: Arnd Bergmann
Signed-off-by: Fangrui Song
Reviewed-by: Nick Desaulniers
Reviewed-by: Nathan Chancellor
Tested-by: Nick Desaulniers
Tested-by: Nathan Chancellor

---
Change in v2:
* Improve commit message
---
Change in v3:
* Change the GCC link to the more relevant GNU as link.
* Fix the relevant llvm-project commit.
---
Change in v4:
* Improve comments and commit message
---
 arch/x86/kernel/module.c |  1 +
 arch/x86/tools/relocs.c  | 12
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index 34b153cbd4ac..5e9a34b5bd74 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -114,6 +114,7 @@ int apply_relocate(Elf32_Shdr *sechdrs,
    *location += sym->st_value;
    break;
   case R_386_PC32:
+  case R_386_PLT32:
    /* Add the value, subtract its position */
    *location += sym->st_value - (uint32_t)location;
    break;
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index ce7188cbdae5..1c3a1962cade 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -867,9 +867,11 @@ static int do_reloc32(struct section *sec, Elf_Rel *rel, Elf_Sym *sym,
  case R_386_PC32:
  case R_386_PC16:
  case R_386_PC8:
+ case R_386_PLT32:
   /*
-   * NONE can be ignored and PC relative relocations don't
-   * need to be adjusted.
+   * NONE can be ignored and PC relative relocations don't need
+   * to be adjusted. Because sym must be defined, R_386_PLT32 can
+   * be treated the same way as R_386_PC32.
    */
   break;

@@ -910,9 +912,11 @@ static int do_reloc_real(struct section *sec, Elf_Rel *rel, Elf_Sym *sym,
  case R_386_PC32:
  case R_386_PC16:
  case R_386_PC8:
+ case R_386_PLT32:
   /*
-   * NONE can be ignored and PC relative relocations don't
-   * need to be adjusted.
+   * NONE can be ignored and PC relative relocations don't need
+   * to be adjusted. Because sym must be defined, R_386_PLT32 can
+   * be treated the same way as R_386_PC32.
    */
   break;

--
2.30.0.280.ga3ce27912f-goog
Re: [PATCH v3] x86: Treat R_386_PLT32 as R_386_PC32
On 2021-01-25, Borislav Petkov wrote:

It's a good thing I have a toolchain guy who can explain to me what you
guys are doing because you need to start writing those commit messages
for !toolchain developers.

How about the following message? I'll answer your questions in line as
well. Explaining everything in the message will be quite long... If
someone is interested, I have put every possibly related matter in
https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected

  This is similar to commit b21ebf2fb4cd ("x86: Treat R_X86_64_PLT32 as
  R_X86_64_PC32"), but for i386. As far as Linux kernel is concerned,
  R_386_PLT32 can be treated the same as R_386_PC32.

  R_386_PLT32/R_X86_64_PLT32 are PC-relative relocation types which can
  only be used by branches. If the referenced symbol is defined
  externally, a PLT will be used.
  R_386_PC32/R_X86_64_PC32 are PC-relative relocation types which can be
  used by address taking operations and branches. If the referenced
  symbol is defined externally, a copy relocation/canonical PLT entry
  will be created in the executable.

  On x86-64, there is no PIC vs non-PIC PLT distinction and an
  R_X86_64_PLT32 relocation is produced for both `call/jmp foo` and
  `call/jmp foo@PLT` with newer (2018) GNU as/LLVM integrated assembler.
  This avoids copy relocations/canonical PLT entries.

  On i386, there are 2 types of PLTs, PIC and non-PIC. Currently the
  GCC/GNU as convention is to use R_386_PC32 for non-PIC PLT and
  R_386_PLT32 for PIC PLT. Copy relocations/canonical PLT entries are
  possible ABI issues but GCC/GNU as will likely keep the status quo
  because (1) the ABI is legacy (2) the change will drop a GNU ld
  diagnostic for non-default visibility ifunc in shared objects.
  https://sourceware.org/bugzilla/show_bug.cgi?id=27169

  clang-12 -fno-pic (since
  https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6)
  can emit R_386_PLT32 for compiler generated function declarations,
  because preventing canonical PLT entries is weighed over the rare
  ifunc diagnostic.

  Link: https://github.com/ClangBuiltLinux/linux/issues/1210
  Reported-by: Arnd Bergmann
  Signed-off-by: Fangrui Song
  Reviewed-by: Nick Desaulniers
  Reviewed-by: Nathan Chancellor
  Tested-by: Nick Desaulniers
  Tested-by: Nathan Chancellor

On Thu, Jan 14, 2021 at 02:48:19PM -0800, Fangrui Song wrote:

This is similar to commit b21ebf2fb4cd ("x86: Treat R_X86_64_PLT32 as
R_X86_64_PC32"), but for i386. As far as Linux kernel is concerned,
R_386_PLT32 can be treated the same as R_386_PC32.

R_386_PC32/R_X86_64_PC32 are PC-relative relocation types with the
requirement that the symbol address is significant.
R_386_PLT32/R_X86_64_PLT32 are PC-relative relocation types without the
address significance requirement.

I was told what "significant" means in that context and while it is
clear to you, I'm pretty sure it is not clear to kernel developers who
haven't looked at toolchains in depth. So please elaborate.

Expanded "significant" to more words. See above.

On x86-64, there is no PIC vs non-PIC PLT distinction and an
R_X86_64_PLT32 relocation is produced for both `call/jmp foo` and
`call/jmp foo@PLT` with newer (2018) GNU as/LLVM integrated assembler.

Also, please explain in short why LLVM is generating R_X86_64_PLT32
relocs now? I.e., is it the same reason as why binutils does that?
I.e., mentioning the big picture of things would help as to why you're
doing this.

It has been explained. The LLVM change was in 2018, roughly the same
time when GNU as emitted R_X86_64_PLT32. I think it does not need
extended explanation because of the separate canonical PLT entries
paragraph.

On i386, there are 2 types of PLTs, PIC and non-PIC. Currently the
convention is to use R_386_PC32 for non-PIC PLT and R_386_PLT32 for PIC
PLT.

Convention in general or convention for LLVM?

Changed to "GCC/GNU as convention".

clang-12 -fno-pic (since
https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6)
can emit R_386_PLT32 for compiler generated function declarations as
well to avoid a canonical PLT entry (st_shndx=0, st_value!=0) if the
symbol turns out to be defined externally. GCC/GNU as will likely keep
using R_386_PC32 because (1) the ABI is legacy (2) the change will drop
a GNU ld non-default visibility ifunc for shared objects.
https://sourceware.org/bugzilla/show_bug.cgi?id=27169

Not sure how useful this paragraph is for kernel developers...

Reorganized it a bit...

Link: https://github.com/ClangBuiltLinux/linux/issues/1210
Reported-by: Arnd Bergmann
Signed-off-by: Fangrui Song
Reviewed-by: Nick Desaulniers
Reviewed-by: Nathan Chancellor
Tested-by: Nick Desaulniers
Tested-by: Nathan Chancellor
---
Change in v2:
* Improve commit message
---
Change in v3:
* Change the GCC link to the more relevant GNU as link.
* Fix the relevant llvm-project commit id.
---
 arch/x86/kernel/module.c | 1 +
 a
Re: [PATCH v4 00/10] Function Granular KASLR
On 2020-08-28, Josh Poimboeuf wrote:
On Fri, Aug 28, 2020 at 12:21:13PM +0200, Miroslav Benes wrote:

> Hi there! I was trying to find a super easy way to address this, so I
> thought the best thing would be if there were a compiler or linker
> switch to just eliminate any duplicate symbols at compile time for
> vmlinux. I filed this question on the binutils bugzilla looking to see
> if there were existing flags that might do this, but H.J. Lu went ahead
> and created a new one "-z unique", that seems to do what we would need
> it to do.
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=26391
>
> When I use this option, it renames any duplicate symbols with an
> extension - for example duplicatefunc.1 or duplicatefunc.2. You could
> either match on the full unique name of the specific binary you are
> trying to patch, or you match the base name and use the extension to
> determine original position. Do you think this solution would work?

Yes, I think so (thanks, Joe, for testing!). It looks cleaner to me than the options above, but it may just be a matter of taste. Anyway, I'd go with full name matching, because -z unique-symbol would allow us to remove sympos altogether, which is appealing.

> If so, I can modify livepatch to refuse to patch on duplicated symbols if
> CONFIG_FG_KASLR and when this option is merged into the tool chain I
> can add it to KBUILD_LDFLAGS when CONFIG_FG_KASLR and livepatching
> should work in all cases.

Ok. Josh, Petr, would this work for you too?

Sounds good to me. Kristen, thanks for finding a solution!

(I am not subscribed. I came here via https://sourceware.org/bugzilla/show_bug.cgi?id=26391 (ld -z unique-symbol))

This works great after randomization because it always receives the current address at runtime rather than relying on any kind of buildtime address. The issue is with the live-patching code's algorithm for resolving duplicate symbol names. If they request a symbol by name from the kernel and there are 3 symbols with the same name, they use the symbol's position in the built binary image to select the correct symbol.

Suppose a.o, b.o and c.o each define a local symbol 'foo'. By position, do you mean that

* the live-patching code uses something like (findall("foo")[0], findall("foo")[1], findall("foo")[2]) ?
* shuffling a.o/b.o/c.o will make the returned triple different

Local symbols are not required to be unique. Instead of patching the toolchain, have you thought about making the live-patching code smarter? (Depending on the number of duplicates, such a linker option can increase the link time/binary size considerably AND I don't know in what other cases such an option will be useful.)

For the following example, https://sourceware.org/bugzilla/show_bug.cgi?id=26822

  # RUN: split-file %s %t
  # RUN: gcc -c %t/a.s -o %t/a.o
  # RUN: gcc -c %t/b.s -o %t/b.o
  # RUN: gcc -c %t/c.s -o %t/c.o
  # RUN: ld-new %t/a.o %t/b.o %t/c.o -z unique-symbol -o %t.exe

  #--- a.s
  a:
  a.1:
  a.2:
    nop
  #--- b.s
  a:
    nop
  #--- c.s
  a:
    nop

readelf -Ws output:

  Symbol table '.symtab' contains 13 entries:
     Num:    Value          Size Type    Bind   Vis      Ndx Name
       0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
       1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS a.o
       2: 0000000000401000     0 NOTYPE  LOCAL  DEFAULT    1 a
       3: 0000000000401000     0 NOTYPE  LOCAL  DEFAULT    1 a.1
       4: 0000000000401000     0 NOTYPE  LOCAL  DEFAULT    1 a.2
       5: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS b.o
       6: 0000000000401001     0 NOTYPE  LOCAL  DEFAULT    1 a.1
       7: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS c.o
       8: 0000000000401002     0 NOTYPE  LOCAL  DEFAULT    1 a.2
       9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND _start
      10: 0000000000402000     0 NOTYPE  GLOBAL DEFAULT    1 __bss_start
      11: 0000000000402000     0 NOTYPE  GLOBAL DEFAULT    1 _edata
      12: 0000000000402000     0 NOTYPE  GLOBAL DEFAULT    1 _end

Note that you have STT_FILE SHN_ABS symbols. If the compiler does not produce them, they will be synthesized by GNU ld.

https://sourceware.org/bugzilla/show_bug.cgi?id=26822

  ld.bfd copies non-STT_SECTION local symbols from input object files. If an object file does not have STT_FILE symbols (no .file directive) but has non-STT_SECTION local symbols, ld.bfd synthesizes a STT_FILE symbol.

The filenames are usually base names, so "a.o" and "a.o" in two directories will be indistinguishable. The live-patching code can possibly work around this by not changing the relative order of the two "a.o".
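To make the "by position" question concrete, here is a minimal userspace sketch of position-based lookup among duplicate names. The struct sym_entry layout and the find_symbol_by_pos() helper are hypothetical names for illustration only; this is not the kernel's kallsyms or livepatch code, and the 0-based position is an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified symbol-table entry (not the kernel's layout). */
struct sym_entry {
	const char *name;
	unsigned long addr;
};

/*
 * Return the address of the pos-th (0-based) symbol called `name`,
 * scanning the table in link order, or 0 if there is no such occurrence.
 */
unsigned long find_symbol_by_pos(const struct sym_entry *tab, size_t n,
				 const char *name, size_t pos)
{
	size_t seen = 0;

	for (size_t i = 0; i < n; i++) {
		if (strcmp(tab[i].name, name) != 0)
			continue;
		if (seen++ == pos)
			return tab[i].addr;
	}
	return 0;
}
```

With such a scheme the result depends on the order of the entries, so reordering the objects that define 'foo' changes which address a given position selects. Matching on a unique full name would not have that dependency.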
Re: [PATCH bpf-next v2] samples/bpf: Update README.rst and Makefile for manually compiling LLVM and clang
On 2021-01-19, Tiezhu Yang wrote: The current llvm/clang build procedure in samples/bpf/README.rst is out of date. See below that the links are not accessible any more. $ git clone http://llvm.org/git/llvm.git Cloning into 'llvm'... fatal: unable to access 'http://llvm.org/git/llvm.git/': Maximum (20) redirects followed $ git clone --depth 1 http://llvm.org/git/clang.git Cloning into 'clang'... fatal: unable to access 'http://llvm.org/git/clang.git/': Maximum (20) redirects followed The llvm community has adopted new ways to build the compiler. There are different ways to build llvm/clang, the Clang Getting Started page [1] has one way. As Yonghong said, it is better to just copy the build procedure in Documentation/bpf/bpf_devel_QA.rst to keep consistent. I verified the procedure and it is proved to be feasible, so we should update README.rst to reflect the reality. At the same time, update the related comment in Makefile. [1] https://clang.llvm.org/get_started.html Signed-off-by: Tiezhu Yang Acked-by: Yonghong Song --- v2: Update the commit message suggested by Yonghong, thank you very much. 
samples/bpf/Makefile | 2 +- samples/bpf/README.rst | 17 ++--- 2 files changed, 11 insertions(+), 8 deletions(-) diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile index 26fc96c..d061446 100644 --- a/samples/bpf/Makefile +++ b/samples/bpf/Makefile @@ -208,7 +208,7 @@ TPROGLDLIBS_xdpsock += -pthread -lcap TPROGLDLIBS_xsk_fwd += -pthread # Allows pointing LLC/CLANG to a LLVM backend with bpf support, redefine on cmdline: -# make M=samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang +# make M=samples/bpf LLC=~/git/llvm-project/llvm/build/bin/llc CLANG=~/git/llvm-project/llvm/build/bin/clang LLC ?= llc CLANG ?= clang OPT ?= opt diff --git a/samples/bpf/README.rst b/samples/bpf/README.rst index dd34b2d..d1be438 100644 --- a/samples/bpf/README.rst +++ b/samples/bpf/README.rst @@ -65,17 +65,20 @@ To generate a smaller llc binary one can use:: Quick sniplet for manually compiling LLVM and clang (build dependencies are cmake and gcc-c++):: - $ git clone http://llvm.org/git/llvm.git - $ cd llvm/tools - $ git clone --depth 1 http://llvm.org/git/clang.git - $ cd ..; mkdir build; cd build - $ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86" - $ make -j $(getconf _NPROCESSORS_ONLN) + $ git clone https://github.com/llvm/llvm-project.git + $ mkdir -p llvm-project/llvm/build/install llvm-project/llvm/build/install is not used. + $ cd llvm-project/llvm/build + $ cmake .. -G "Ninja" -DLLVM_TARGETS_TO_BUILD="BPF;X86" \ +-DLLVM_ENABLE_PROJECTS="clang"\ +-DBUILD_SHARED_LIBS=OFF \ -DBUILD_SHARED_LIBS=OFF is the default. It can be omitted. +-DCMAKE_BUILD_TYPE=Release\ +-DLLVM_BUILD_RUNTIME=OFF -DLLVM_BUILD_RUNTIME=OFF can be omitted if none of compiler-rt/libc++/libc++abi is built. 
+ $ ninja It is also possible to point make to the newly compiled 'llc' or 'clang' command via redefining LLC or CLANG on the make command line:: - make M=samples/bpf LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang + make M=samples/bpf LLC=~/git/llvm-project/llvm/build/bin/llc CLANG=~/git/llvm-project/llvm/build/bin/clang Cross compiling samples --- -- 2.1.0
Re: [PATCH v5 2/3] Kbuild: make DWARF version a choice
On 2021-01-15, Sedat Dilek wrote: On Fri, Jan 15, 2021 at 10:06 PM Nick Desaulniers wrote: Modifies CONFIG_DEBUG_INFO_DWARF4 to be a member of a choice. Adds an explicit CONFIG_DEBUG_INFO_DWARF2, which is the default. Does so in a way that's forward compatible with existing configs, and makes adding future versions more straightforward. Suggested-by: Arvind Sankar Suggested-by: Fangrui Song Suggested-by: Masahiro Yamada Signed-off-by: Nick Desaulniers --- Makefile | 13 ++--- lib/Kconfig.debug | 21 - 2 files changed, 22 insertions(+), 12 deletions(-) diff --git a/Makefile b/Makefile index d49c3f39ceb4..4eb3bf7ee974 100644 --- a/Makefile +++ b/Makefile @@ -826,13 +826,12 @@ else DEBUG_CFLAGS += -g endif -ifneq ($(LLVM_IAS),1) -KBUILD_AFLAGS += -Wa,-gdwarf-2 -endif - -ifdef CONFIG_DEBUG_INFO_DWARF4 -DEBUG_CFLAGS += -gdwarf-4 -endif +dwarf-version-$(CONFIG_DEBUG_INFO_DWARF2) := 2 +dwarf-version-$(CONFIG_DEBUG_INFO_DWARF4) := 4 +DEBUG_CFLAGS += -gdwarf-$(dwarf-version-y) +# Binutils 2.35+ required for -gdwarf-4+ support. +dwarf-aflag:= $(call as-option,-Wa$(comma)-gdwarf-$(dwarf-version-y)) +KBUILD_AFLAGS += $(dwarf-aflag) ifdef CONFIG_DEBUG_INFO_REDUCED DEBUG_CFLAGS += $(call cc-option, -femit-struct-debug-baseonly) \ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index dd7d8d35b2a5..e80770fac4f0 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -256,13 +256,24 @@ config DEBUG_INFO_SPLIT to know about the .dwo files and include them. Incompatible with older versions of ccache. +choice + prompt "DWARF version" Here you use "DWARF version" so keep this for v2 and v4. + help + Which version of DWARF debug info to emit. + +config DEBUG_INFO_DWARF2 + bool "Generate DWARF Version 2 debuginfo" s/DWARF Version/DWARF version + help + Generate DWARF v2 debug info. + config DEBUG_INFO_DWARF4 - bool "Generate dwarf4 debuginfo" + bool "Generate DWARF Version 4 debuginfo" Same here: s/DWARF Version/DWARF version DWARF Version 2 is fine and preferable. 
I have checked DWARF Version 2/3/4/5 specifications. "DWARF Version 2" is the official way that version is referred to... - Sedat - help - Generate dwarf4 debug info. This requires recent versions - of gcc and gdb. It makes the debug information larger. - But it significantly improves the success of resolving - variables in gdb on optimized code. + Generate DWARF v4 debug info. This requires gcc 4.5+ and gdb 7.0+. + It makes the debug information larger, but it significantly + improves the success of resolving variables in gdb on optimized code. + +endchoice # "DWARF version" config DEBUG_INFO_BTF bool "Generate BTF typeinfo" -- 2.30.0.284.gd98b1dd5eaa7-goog
Re: [PATCH v5 1/3] Remove $(cc-option,-gdwarf-4) dependency from CONFIG_DEBUG_INFO_DWARF4
On 2021-01-15, Nick Desaulniers wrote: From: Masahiro Yamada The -gdwarf-4 flag is supported by GCC 4.5+, and also by Clang. You can see it at https://godbolt.org/z/6ed1oW For gcc 4.5.3 pane,line 37:.value 0x4 For clang 10.0.1 pane, line 117: .short 4 Given Documentation/process/changes.rst stating GCC 4.9 is the minimal version, this cc-option is unneeded. Note CONFIG_DEBUG_INFO_DWARF4 controls the DWARF version only for C files. As you can see in the top Makefile, -gdwarf-4 is only passed to CFLAGS. ifdef CONFIG_DEBUG_INFO_DWARF4 DEBUG_CFLAGS+= -gdwarf-4 endif This flag is used when compiling *.c files. On the other hand, the assembler is always given -gdwarf-2. KBUILD_AFLAGS += -Wa,-gdwarf-2 Hence, the debug info that comes from *.S files is always DWARF v2. This is simply because GAS supported only -gdwarf-2 for a long time. Recently, GAS gained the support for --dwarf-[3|4|5] options. [1] The gas commit description has a typo. The supported options are -gdwarf-[345] or --gdwarf-[345]. -gdwarf2 and --gdwarf2 are kept for compatibility. Looks good otherwise. And, also we have Clang integrated assembler. So, the debug info for *.S files might be improved if we want. In my understanding, the current code is intentional, not a bug. [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=31bf18645d98b4d3d7357353be840e320649a67d Reviewed-by: Nick Desaulniers Reviewed-by: Nathan Chancellor Signed-off-by: Masahiro Yamada --- lib/Kconfig.debug | 1 - 1 file changed, 1 deletion(-) diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 78361f0abe3a..dd7d8d35b2a5 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -258,7 +258,6 @@ config DEBUG_INFO_SPLIT config DEBUG_INFO_DWARF4 bool "Generate dwarf4 debuginfo" - depends on $(cc-option,-gdwarf-4) help Generate dwarf4 debug info. This requires recent versions of gcc and gdb. It makes the debug information larger. -- 2.30.0.284.gd98b1dd5eaa7-goog
Re: [PATCH] mips: vdso: fix DWARF2 warning
On 2021-01-15, Anders Roxell wrote:
On Fri, 15 Jan 2021 at 20:28, Nathan Chancellor wrote:
On Fri, Jan 15, 2021 at 08:13:30PM +0100, Anders Roxell wrote:

> When building mips tinyconfig the following warning shows up
>
> make --silent --keep-going --jobs=8 O=/home/anders/src/kernel/next/out/builddir ARCH=mips CROSS_COMPILE=mips-linux-gnu- HOSTCC=clang CC=clang
> /srv/src/kernel/next/arch/mips/vdso/elf.S:14:1: warning: DWARF2 only supports one section per compilation unit
> .pushsection .note.Linux, "a",@note ; .balign 4 ; .long 2f - 1f ; .long 4484f - 3f ; .long 0 ; 1:.asciz "Linux" ; 2:.balign 4 ; 3:
> ^
> /srv/src/kernel/next/arch/mips/vdso/elf.S:34:2: warning: DWARF2 only supports one section per compilation unit
> .section .mips_abiflags, "a"
> ^
>
> Rework so the mips vdso Makefile adds flag '-no-integrated-as' unless
> LLVM_IAS is defined.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/1256
> Cc: sta...@vger.kernel.org # v4.19+
> Suggested-by: Nick Desaulniers
> Signed-off-by: Anders Roxell

I believe this is the better solution: https://lore.kernel.org/r/20210115192622.3828545-1-natechancel...@gmail.com/

Yes, I agree. Cheers, Anders

http://lore.kernel.org/r/20201202010850.jibrjpyu6xgkf...@google.com

Personally I'd drop DWARF v2 as an option.
[PATCH v3] module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
clang-12 -fno-pic (since https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6) can emit `call __stack_chk_fail@PLT` instead of `call __stack_chk_fail` on x86. The two forms should have identical behaviors on x86-64 but the former causes GNU as<2.37 to produce an unreferenced undefined symbol _GLOBAL_OFFSET_TABLE_. (On x86-32, there is an R_386_PC32 vs R_386_PLT32 difference but the linker behavior is identical as far as Linux kernel is concerned.) Simply ignore _GLOBAL_OFFSET_TABLE_ for now, like what scripts/mod/modpost.c:ignore_undef_symbol does. This also fixes the problem for gcc/clang -fpie and -fpic, which may emit `call foo@PLT` for external function calls on x86. Note: ld -z defs and dynamic loaders do not error for unreferenced undefined symbols so the module loader is reading too much. If we ever need to ignore more symbols, the code should be refactored to ignore unreferenced symbols. Reported-by: Marco Elver Link: https://github.com/ClangBuiltLinux/linux/issues/1250 Signed-off-by: Fangrui Song Reviewed-by: Nick Desaulniers Tested-by: Marco Elver Cc: --- Changes in v2: * Fix Marco's email address * Add a function ignore_undef_symbol similar to scripts/mod/modpost.c:ignore_undef_symbol --- Changes in v3: * Fix the style of a multi-line comment. * Use static bool ignore_undef_symbol. --- kernel/module.c | 21 +++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index 4bf30e4b3eaa..805c49d1b86d 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2348,6 +2348,21 @@ static int verify_exported_symbols(struct module *mod) return 0; } +static bool ignore_undef_symbol(Elf_Half emachine, const char *name) +{ + /* +* On x86, PIC code and Clang non-PIC code may have call foo@PLT. GNU as +* before 2.37 produces an unreferenced _GLOBAL_OFFSET_TABLE_ on x86-64. +* i386 has a similar problem but may not deserve a fix. 
+* +* If we ever have to ignore many symbols, consider refactoring the code to +* only warn if referenced by a relocation. +*/ + if (emachine == EM_386 || emachine == EM_X86_64) + return !strcmp(name, "_GLOBAL_OFFSET_TABLE_"); + return false; +} + /* Change all symbols so that st_value encodes the pointer directly. */ static int simplify_symbols(struct module *mod, const struct load_info *info) { @@ -2395,8 +2410,10 @@ static int simplify_symbols(struct module *mod, const struct load_info *info) break; } - /* Ok if weak. */ - if (!ksym && ELF_ST_BIND(sym[i].st_info) == STB_WEAK) + /* Ok if weak or ignored. */ + if (!ksym && + (ELF_ST_BIND(sym[i].st_info) == STB_WEAK || +ignore_undef_symbol(info->hdr->e_machine, name))) break; ret = PTR_ERR(ksym) ?: -ENOENT; -- 2.30.0.296.g2bfb1c46d8-goog
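For readers who want to try the check outside the kernel, here is a minimal userspace sketch of the same decision. The sketch_ names are hypothetical, the e_machine values are copied from the ELF specification rather than taken from kernel headers, and the is_weak flag stands in for the ELF_ST_BIND() test in the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* e_machine values from the ELF specification; <elf.h> defines the same numbers. */
enum { SKETCH_EM_386 = 3, SKETCH_EM_X86_64 = 62, SKETCH_EM_AARCH64 = 183 };

/* Only x86 targets get the _GLOBAL_OFFSET_TABLE_ exemption, as in the patch. */
bool sketch_ignore_undef_symbol(uint16_t emachine, const char *name)
{
	if (emachine == SKETCH_EM_386 || emachine == SKETCH_EM_X86_64)
		return strcmp(name, "_GLOBAL_OFFSET_TABLE_") == 0;
	return false;
}

/* Mirrors the patched condition: an unresolved symbol is tolerated if weak or ignored. */
bool sketch_unresolved_ok(uint16_t emachine, const char *name, bool is_weak)
{
	return is_weak || sketch_ignore_undef_symbol(emachine, name);
}
```

Keeping the machine check inside the helper means other architectures still get the warning for an undefined _GLOBAL_OFFSET_TABLE_, which matches the intent stated in the comment of the patch.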
[PATCH v3] x86: Treat R_386_PLT32 as R_386_PC32
This is similar to commit b21ebf2fb4cd ("x86: Treat R_X86_64_PLT32 as R_X86_64_PC32"), but for i386. As far as Linux kernel is concerned, R_386_PLT32 can be treated the same as R_386_PC32. R_386_PC32/R_X86_64_PC32 are PC-relative relocation types with the requirement that the symbol address is significant. R_386_PLT32/R_X86_64_PLT32 are PC-relative relocation types without the address significance requirement. On x86-64, there is no PIC vs non-PIC PLT distinction and an R_X86_64_PLT32 relocation is produced for both `call/jmp foo` and `call/jmp foo@PLT` with newer (2018) GNU as/LLVM integrated assembler. On i386, there are 2 types of PLTs, PIC and non-PIC. Currently the convention is to use R_386_PC32 for non-PIC PLT and R_386_PLT32 for PIC PLT. clang-12 -fno-pic (since https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6) can emit R_386_PLT32 for compiler generated function declarations as well to avoid a canonical PLT entry (st_shndx=0, st_value!=0) if the symbol turns out to be defined externally. GCC/GNU as will likely keep using R_386_PC32 because (1) the ABI is legacy (2) the change will drop a GNU ld non-default visibility ifunc for shared objects. https://sourceware.org/bugzilla/show_bug.cgi?id=27169 Link: https://github.com/ClangBuiltLinux/linux/issues/1210 Reported-by: Arnd Bergmann Signed-off-by: Fangrui Song Reviewed-by: Nick Desaulniers Reviewed-by: Nathan Chancellor Tested-by: Nick Desaulniers Tested-by: Nathan Chancellor --- Change in v2: * Improve commit message --- Change in v3: * Change the GCC link to the more relevant GNU as link. * Fix the relevant llvm-project commit id. 
--- arch/x86/kernel/module.c | 1 + arch/x86/tools/relocs.c | 2 ++ 2 files changed, 3 insertions(+) diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c index 34b153cbd4ac..5e9a34b5bd74 100644 --- a/arch/x86/kernel/module.c +++ b/arch/x86/kernel/module.c @@ -114,6 +114,7 @@ int apply_relocate(Elf32_Shdr *sechdrs, *location += sym->st_value; break; case R_386_PC32: + case R_386_PLT32: /* Add the value, subtract its position */ *location += sym->st_value - (uint32_t)location; break; diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c index ce7188cbdae5..717e48ca28b6 100644 --- a/arch/x86/tools/relocs.c +++ b/arch/x86/tools/relocs.c @@ -867,6 +867,7 @@ static int do_reloc32(struct section *sec, Elf_Rel *rel, Elf_Sym *sym, case R_386_PC32: case R_386_PC16: case R_386_PC8: + case R_386_PLT32: /* * NONE can be ignored and PC relative relocations don't * need to be adjusted. @@ -910,6 +911,7 @@ static int do_reloc_real(struct section *sec, Elf_Rel *rel, Elf_Sym *sym, case R_386_PC32: case R_386_PC16: case R_386_PC8: + case R_386_PLT32: /* * NONE can be ignored and PC relative relocations don't * need to be adjusted. -- 2.30.0.296.g2bfb1c46d8-goog
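The arithmetic behind "treat R_386_PLT32 as R_386_PC32" can be sketched in userspace as follows. This is an illustration under assumptions, not the kernel's apply_relocate(): the sketch_ names are hypothetical, the relocation type numbers are the i386 psABI values, and addresses are passed as plain integers instead of pointers.

```c
#include <stdint.h>

/* Relocation type numbers from the i386 psABI. */
enum { SKETCH_R_386_32 = 1, SKETCH_R_386_PC32 = 2, SKETCH_R_386_PLT32 = 4 };

/*
 * Apply one REL-style i386 relocation to a 32-bit field.
 *   field: current contents, i.e. the implicit addend A
 *   sym:   symbol value S
 *   place: address P of the field being relocated
 * Returns 0 on success, -1 for types this sketch does not handle.
 */
int sketch_apply_rel32(uint32_t type, uint32_t *field, uint32_t sym, uint32_t place)
{
	switch (type) {
	case SKETCH_R_386_32:
		*field += sym;          /* S + A */
		return 0;
	case SKETCH_R_386_PC32:
	case SKETCH_R_386_PLT32:        /* no PLT in the kernel: same as PC32 */
		*field += sym - place;  /* S + A - P */
		return 0;
	default:
		return -1;
	}
}
```

For a call at address 0x1000 the 4-byte field sits at 0x1001 and holds the addend -4, so a target at 0x2000 yields the displacement 0xffb under either relocation type.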
[PATCH v2] module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
clang-12 -fno-pic (since https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6) can emit `call __stack_chk_fail@PLT` instead of `call __stack_chk_fail` on x86. The two forms should have identical behaviors on x86-64 but the former causes GNU as<2.37 to produce an unreferenced undefined symbol _GLOBAL_OFFSET_TABLE_. (On x86-32, there is an R_386_PC32 vs R_386_PLT32 difference but the linker behavior is identical as far as Linux kernel is concerned.) Simply ignore _GLOBAL_OFFSET_TABLE_ for now, like what scripts/mod/modpost.c:ignore_undef_symbol does. This also fixes the problem for gcc/clang -fpie and -fpic, which may emit `call foo@PLT` for external function calls on x86. Note: ld -z defs and dynamic loaders do not error for unreferenced undefined symbols so the module loader is reading too much. If we ever need to ignore more symbols, the code should be refactored to ignore unreferenced symbols. Reported-by: Marco Elver Link: https://github.com/ClangBuiltLinux/linux/issues/1250 Signed-off-by: Fangrui Song --- kernel/module.c | 20 ++-- 1 file changed, 18 insertions(+), 2 deletions(-) --- Changes in v2: * Fix Marco's email address * Add a function ignore_undef_symbol similar to scripts/mod/modpost.c:ignore_undef_symbol diff --git a/kernel/module.c b/kernel/module.c index 4bf30e4b3eaa..278f5129bde2 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2348,6 +2348,20 @@ static int verify_exported_symbols(struct module *mod) return 0; } +static int ignore_undef_symbol(Elf_Half emachine, const char *name) +{ + /* On x86, PIC code and Clang non-PIC code may have call foo@PLT. GNU as +* before 2.37 produces an unreferenced _GLOBAL_OFFSET_TABLE_ on x86-64. +* i386 has a similar problem but may not deserve a fix. +* +* If we ever have to ignore many symbols, consider refactoring the code to +* only warn if referenced by a relocation. 
+*/ + if (emachine == EM_386 || emachine == EM_X86_64) + return !strcmp(name, "_GLOBAL_OFFSET_TABLE_"); + return 0; +} + /* Change all symbols so that st_value encodes the pointer directly. */ static int simplify_symbols(struct module *mod, const struct load_info *info) { @@ -2395,8 +2409,10 @@ static int simplify_symbols(struct module *mod, const struct load_info *info) break; } - /* Ok if weak. */ - if (!ksym && ELF_ST_BIND(sym[i].st_info) == STB_WEAK) + /* Ok if weak or ignored. */ + if (!ksym && + (ELF_ST_BIND(sym[i].st_info) == STB_WEAK || +ignore_undef_symbol(info->hdr->e_machine, name))) break; ret = PTR_ERR(ksym) ?: -ENOENT; -- 2.30.0.296.g2bfb1c46d8-goog
[PATCH] module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
clang-12 -fno-pic (since https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de008232da3f1d6) can emit `call __stack_chk_fail@PLT` instead of `call __stack_chk_fail` on x86. The two forms should have identical behaviors on x86-64 but the former causes GNU as<2.37 to produce an unreferenced undefined symbol _GLOBAL_OFFSET_TABLE_. (On x86-32, there is an R_386_PC32 vs R_386_PLT32 difference but the linker behavior is identical as far as Linux kernel is concerned.) Simply ignore _GLOBAL_OFFSET_TABLE_ for now, like what scripts/mod/modpost.c:ignore_undef_symbol does. This also fixes the problem for gcc/clang -fpie and -fpic, which may emit `call foo@PLT` for external function calls on x86. Note: ld -z defs and dynamic loaders do not error for unreferenced undefined symbols so the module loader is reading too much. If we ever need to ignore more symbols, the code should be refactored to ignore unreferenced symbols. Reported-by: Marco Elver Link: https://github.com/ClangBuiltLinux/linux/issues/1250 Signed-off-by: Fangrui Song --- kernel/module.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index 4bf30e4b3eaa..2e2deea99289 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2395,8 +2395,14 @@ static int simplify_symbols(struct module *mod, const struct load_info *info) break; } - /* Ok if weak. */ - if (!ksym && ELF_ST_BIND(sym[i].st_info) == STB_WEAK) + /* Ok if weak. Also allow _GLOBAL_OFFSET_TABLE_: +* GNU as before 2.37 produces an unreferenced _GLOBAL_OFFSET_TABLE_ +* for call foo@PLT on x86-64. If the code ever needs to ignore +* more symbols, refactor the code to only warn if referenced by +* a relocation. +*/ + if (!ksym && (ELF_ST_BIND(sym[i].st_info) == STB_WEAK || + !strcmp(name, "_GLOBAL_OFFSET_TABLE_"))) break; ret = PTR_ERR(ksym) ?: -ENOENT; -- 2.30.0.284.gd98b1dd5eaa7-goog
Re: [PATCH v3] x86/entry: emit a symbol for register restoring thunk
On 2021-01-11, Nick Desaulniers wrote:

Arnd found a randconfig that produces the warning:

  arch/x86/entry/thunk_64.o: warning: objtool: missing symbol for insn at offset 0x3e

when building with LLVM_IAS=1 (use Clang's integrated assembler). Josh notes:

  With the LLVM assembler stripping the .text section symbol, objtool has no way to reference this code when it generates ORC unwinder entries, because this code is outside of any ELF function.

Fangrui notes that this optimization is helpful for reducing image size when compiling with -ffunction-sections and -fdata-sections. I have observed on the order of tens of thousands of symbols for the kernel images built with those flags. A patch has been authored against GNU binutils to match this behavior, with a new flag --generate-unused-section-symbols=[yes|no].

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d1bcae833b32f1408485ce69f844dcd7ded093a8 has been committed. The patch should be included in binutils 2.37. The maintainers welcome the idea, but fixing all the arch-specific tests is tricky. H.J. fixed the x86 tests and enabled this for x86. When binutils 2.37 comes out, some other architectures may follow as well.

We can omit the .L prefix on a label to emit an entry into the symbol table for the label, with STB_LOCAL binding. This enables objtool to generate proper unwind info here with LLVM_IAS=1.

Josh, I think objtool orc generate needs to synthesize STT_SECTION symbols even if they do not exist in object files. rg 'SYM_CODE.*\.L' reveals a few other .S files which may have similar problems.
Cc: Fangrui Song Link: https://github.com/ClangBuiltLinux/linux/issues/1209 Link: https://reviews.llvm.org/D93783 Link: https://sourceware.org/binutils/docs/as/Symbol-Names.html Link: https://sourceware.org/pipermail/binutils/2020-December/114671.html Reported-by: Arnd Bergmann Suggested-by: Josh Poimboeuf Signed-off-by: Nick Desaulniers --- Changes v2 -> v3: * rework to use STB_LOCAL rather than STB_GLOBAL by dropping .L prefix, as per Josh. * rename oneline to drop STB_GLOBAL in commit message. * add link to GAS docs on .L prefix. * drop Josh's ack since patch changed. Changes v1 -> v2: * Pick up Josh's Ack. * Add commit message info about -ffunction-sections/-fdata-sections, and link to binutils patch. arch/x86/entry/thunk_64.S | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S index ccd32877a3c4..c9a9fbf1655f 100644 --- a/arch/x86/entry/thunk_64.S +++ b/arch/x86/entry/thunk_64.S @@ -31,7 +31,7 @@ SYM_FUNC_START_NOALIGN(\name) .endif call \func - jmp .L_restore + jmp __thunk_restore SYM_FUNC_END(\name) _ASM_NOKPROBE(\name) .endm @@ -44,7 +44,7 @@ SYM_FUNC_END(\name) #endif #ifdef CONFIG_PREEMPTION -SYM_CODE_START_LOCAL_NOALIGN(.L_restore) +SYM_CODE_START_LOCAL_NOALIGN(__thunk_restore) popq %r11 popq %r10 popq %r9 @@ -56,6 +56,6 @@ SYM_CODE_START_LOCAL_NOALIGN(.L_restore) popq %rdi popq %rbp ret - _ASM_NOKPROBE(.L_restore) -SYM_CODE_END(.L_restore) + _ASM_NOKPROBE(__thunk_restore) +SYM_CODE_END(__thunk_restore) #endif -- 2.30.0.284.gd98b1dd5eaa7-goog
Re: [PATCH] pgo: add clang's Profile Guided Optimization infrastructure
On 2021-01-11, Bill Wendling wrote:
On Mon, Jan 11, 2021 at 12:12 PM Fangrui Song wrote:
On 2021-01-11, 'Bill Wendling' via Clang Built Linux wrote:

>From: Sami Tolvanen
>
>Enable the use of clang's Profile-Guided Optimization[1]. To generate a
>profile, the kernel is instrumented with PGO counters, a representative
>workload is run, and the raw profile data is collected from
>/sys/kernel/debug/pgo/profraw.
>
>The raw profile data must be processed by clang's "llvm-profdata" tool before
>it can be used during recompilation:
>
> $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
>
>Multiple raw profiles may be merged during this step.
>
>The data can be used either by the compiler if LTO isn't enabled:
>
>... -fprofile-use=vmlinux.profdata ...
>
>or by LLD if LTO is enabled:
>
>... -lto-cs-profile-file=vmlinux.profdata ...

This LLD option does not exist. LLD does have some `--lto-*` options but the `-lto-*` form is not supported (it clashes with -l) https://reviews.llvm.org/D79371

That's strange. I've been using that option for years now. :-) Is this a recent change?

The more frequently used options (specified by the clang driver) are -plugin-opt=... (options implemented by LLVMgold.so). `-lto-*` is rare.

(There is an earlier -fprofile-instr-generate which does instrumentation in Clang, but the option does not have broad usage. It is used more for code coverage, not for optimization. Notably, it does not even implement the Kirchhoff's current law optimization)

Right. I've been told outside of this email that -fprofile-generate is the preferred flag to use.

-fprofile-use= is used by both regular PGO and context-sensitive PGO (CSPGO). clang -flto=thin -fprofile-use= passes -plugin-opt=cs-profile-path= to the linker. For regular PGO, this option is effectively a no-op (confirmed with CSPGO main developer). So I think the "or by LLD if LTO is enabled:" part should be removed.

But what if you specify the linking step explicitly? Linux doesn't call "clang" when linking, but "ld.lld".

Regular PGO+LTO does not need -plugin-opt=cs-profile-path=; CSPGO+LTO needs it. Because -fprofile-use= may be used by both, the Clang driver adds it. CSPGO is not relevant in this patch, so the linker option does not need to be mentioned.
Re: [PATCH] pgo: add clang's Profile Guided Optimization infrastructure
On 2021-01-11, 'Bill Wendling' via Clang Built Linux wrote: From: Sami Tolvanen Enable the use of clang's Profile-Guided Optimization[1]. To generate a profile, the kernel is instrumented with PGO counters, a representative workload is run, and the raw profile data is collected from /sys/kernel/debug/pgo/profraw. The raw profile data must be processed by clang's "llvm-profdata" tool before it can be used during recompilation: $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw Multiple raw profiles may be merged during this step. The data can be used either by the compiler if LTO isn't enabled: ... -fprofile-use=vmlinux.profdata ... or by LLD if LTO is enabled: ... -lto-cs-profile-file=vmlinux.profdata ... This LLD option does not exist. LLD does have some `--lto-*` options but the `-lto-*` form is not supported (it clashes with -l) https://reviews.llvm.org/D79371 (There is an earlier -fprofile-instr-generate which does instrumentation in Clang, but the option does not have broad usage. It is used more for code coverage, not for optimization. Noticeably, it does not even implement the Kirchhoff's current law optimization) -fprofile-use= is used by both regular PGO and context-sensitive PGO (CSPGO). clang -flto=thin -fprofile-use= passes -plugin-opt=cs-profile-path= to the linker. For regular PGO, this option is effectively a no-op (confirmed with CSPGO main developer). So I think the "or by LLD if LTO is enabled:" part should be removed. This initial submission is restricted to x86, as that's the platform we know works. This restriction can be lifted once other platforms have been verified to work with PGO. Note that this method of profiling the kernel is clang-native and isn't compatible with clang's gcov support in kernel/gcov. 
[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization Signed-off-by: Sami Tolvanen Co-developed-by: Bill Wendling Signed-off-by: Bill Wendling --- Documentation/dev-tools/index.rst | 1 + Documentation/dev-tools/pgo.rst | 127 + MAINTAINERS | 9 + Makefile | 3 + arch/Kconfig | 1 + arch/arm/boot/bootp/Makefile | 1 + arch/arm/boot/compressed/Makefile | 1 + arch/arm/vdso/Makefile| 3 +- arch/arm64/kernel/vdso/Makefile | 3 +- arch/arm64/kvm/hyp/nvhe/Makefile | 1 + arch/mips/boot/compressed/Makefile| 1 + arch/mips/vdso/Makefile | 1 + arch/nds32/kernel/vdso/Makefile | 4 +- arch/parisc/boot/compressed/Makefile | 1 + arch/powerpc/kernel/Makefile | 6 +- arch/powerpc/kernel/trace/Makefile| 3 +- arch/powerpc/kernel/vdso32/Makefile | 1 + arch/powerpc/kernel/vdso64/Makefile | 1 + arch/powerpc/kexec/Makefile | 3 +- arch/powerpc/xmon/Makefile| 1 + arch/riscv/kernel/vdso/Makefile | 3 +- arch/s390/boot/Makefile | 1 + arch/s390/boot/compressed/Makefile| 1 + arch/s390/kernel/Makefile | 1 + arch/s390/kernel/vdso64/Makefile | 3 +- arch/s390/purgatory/Makefile | 1 + arch/sh/boot/compressed/Makefile | 1 + arch/sh/mm/Makefile | 1 + arch/sparc/vdso/Makefile | 1 + arch/x86/Kconfig | 1 + arch/x86/boot/Makefile| 1 + arch/x86/boot/compressed/Makefile | 1 + arch/x86/entry/vdso/Makefile | 1 + arch/x86/kernel/vmlinux.lds.S | 2 + arch/x86/platform/efi/Makefile| 1 + arch/x86/purgatory/Makefile | 1 + arch/x86/realmode/rm/Makefile | 1 + arch/x86/um/vdso/Makefile | 1 + drivers/firmware/efi/libstub/Makefile | 1 + drivers/s390/char/Makefile| 1 + include/asm-generic/vmlinux.lds.h | 44 +++ kernel/Makefile | 1 + kernel/pgo/Kconfig| 34 +++ kernel/pgo/Makefile | 5 + kernel/pgo/fs.c | 382 ++ kernel/pgo/instrument.c | 147 ++ kernel/pgo/pgo.h | 206 ++ scripts/Makefile.lib | 10 + 48 files changed, 1017 insertions(+), 9 deletions(-) create mode 100644 Documentation/dev-tools/pgo.rst create mode 100644 kernel/pgo/Kconfig create mode 100644 kernel/pgo/Makefile create mode 100644 kernel/pgo/fs.c 
create mode 100644 kernel/pgo/instrument.c create mode 100644 kernel/pgo/pgo.h diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst index f7809c7b1ba9e..8d6418e858062 100644 --- a/Documentation/dev-tools/index.rst +++ b/Documentation/dev-tools/index.rst @@ -26,6 +26,7 @@ whole; patches welcome! kgdb kselftest kunit/index + pgo .. only:: subproject and html diff --git a/Documentation/dev-tools/pgo.rst
[PATCH v2] x86: Treat R_386_PLT32 as R_386_PC32
This is similar to commit b21ebf2fb4cd ("x86: Treat R_X86_64_PLT32 as R_X86_64_PC32"), but for i386. As far as Linux kernel is concerned, R_386_PLT32 can be treated the same as R_386_PC32. R_386_PC32/R_X86_64_PC32 are PC-relative relocation types with the requirement that the symbol address is significant. R_386_PLT32/R_X86_64_PLT32 are PC-relative relocation types without the address significance requirement. On x86-64, there is no PIC vs non-PIC PLT distinction and an R_X86_64_PLT32 relocation is produced for both `call/jmp foo` and `call/jmp foo@PLT` with newer (2018) GNU as/LLVM integrated assembler. On i386, there are 2 types of PLTs, PIC and non-PIC. Currently the convention is to use R_386_PC32 for non-PIC PLT and R_386_PLT32 for PIC PLT, but R_386_PLT32 is arguably preferable for -fno-pic code as well: this can avoid a "canonical PLT entry" (st_shndx=0, st_value!=0) if the symbol turns out to be defined externally. clang-12 -fno-pic (since https://github.com/llvm/llvm-project/commit/961f31d8ad14c66829991522d73e14b5a96ff6d4) can emit R_386_PLT32 for compiler produced symbols (if we drop -ffreestanding for CONFIG_X86_32, library call optimization can produce newer declarations) and future GCC -fno-pic may emit R_386_PLT32 as well if an option like -fno-direct-access-external-data is adopted to avoid canonical PLT entry/copy relocations. 
Link: https://github.com/ClangBuiltLinux/linux/issues/1210 Link: https://github.com/llvm/llvm-project/commit/961f31d8ad14c66829991522d73e14b5a96ff6d4 Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 Reported-by: Arnd Bergmann Signed-off-by: Fangrui Song Reviewed-by: Nick Desaulniers Reviewed-by: Nathan Chancellor Tested-by: Nick Desaulniers Tested-by: Nathan Chancellor --- Change in v2: * Improve commit message --- arch/x86/kernel/module.c | 1 + arch/x86/tools/relocs.c | 2 ++ 2 files changed, 3 insertions(+) diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c index 34b153cbd4ac..5e9a34b5bd74 100644 --- a/arch/x86/kernel/module.c +++ b/arch/x86/kernel/module.c @@ -114,6 +114,7 @@ int apply_relocate(Elf32_Shdr *sechdrs, *location += sym->st_value; break; case R_386_PC32: + case R_386_PLT32: /* Add the value, subtract its position */ *location += sym->st_value - (uint32_t)location; break; diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c index ce7188cbdae5..717e48ca28b6 100644 --- a/arch/x86/tools/relocs.c +++ b/arch/x86/tools/relocs.c @@ -867,6 +867,7 @@ static int do_reloc32(struct section *sec, Elf_Rel *rel, Elf_Sym *sym, case R_386_PC32: case R_386_PC16: case R_386_PC8: + case R_386_PLT32: /* * NONE can be ignored and PC relative relocations don't * need to be adjusted. @@ -910,6 +911,7 @@ static int do_reloc_real(struct section *sec, Elf_Rel *rel, Elf_Sym *sym, case R_386_PC32: case R_386_PC16: case R_386_PC8: + case R_386_PLT32: /* * NONE can be ignored and PC relative relocations don't * need to be adjusted. -- 2.29.2.729.g45daf8777d-goog
Re: [PATCH 4/4] x86: don't build CONFIG_X86_32 as -ffreestanding
On 2020-08-17, Nick Desaulniers wrote: -ffreestanding typically inhibits "libcall optimizations" where calls to certain library functions can be replaced by the compiler in certain cases to calls to other library functions that may be more efficient. This can be problematic for embedded targets that don't provide full libc implementations. -ffreestanding inhibits all such optimizations, which is the safe choice, but generally we want the optimizations that are performed. The Linux kernel does implement a fair amount of libc routines. Instead of -ffreestanding (which makes more sense in smaller images like kexec's purgatory image), prefer -fno-builtin-* flags to disable the compiler from emitting calls to functions which may not be defined. If you see a linkage failure due to a missing symbol that's typically defined in a libc, and not explicitly called from the source code, then the compiler may have done such a transform. You can either implement such a function (ie. in lib/string.c) or disable the transform outright via -fno-builtin-* flag (where * is the name of the library routine, ie. -fno-builtin-bcmp). i386_defconfig build+boot tested with GCC and Clang. Removes a pretty old TODO from the codebase. Fixes: 6edfba1b33c7 ("x86_64: Don't define string functions to builtin") Suggested-by: Arvind Sankar Signed-off-by: Nick Desaulniers Reviewed-by: Kees Cook --- arch/x86/Makefile | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/x86/Makefile b/arch/x86/Makefile index 4346ffb2e39f..2383a96cf4fd 100644 --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -80,9 +80,6 @@ ifeq ($(CONFIG_X86_32),y) # CPU-specific tuning. Anything which can be shared with UML should go here. 
 include arch/x86/Makefile_32.cpu
 KBUILD_CFLAGS += $(cflags-y)
-
-# temporary until string.h is fixed
-KBUILD_CFLAGS += -ffreestanding
 else
 BITS := 64
 UTS_MACHINE := x86_64
--
2.28.0.220.ged08abb693-goog

Reviewed-by: Fangrui Song

However, dropping -ffreestanding lets the compiler produce new declarations, which makes https://lore.kernel.org/lkml/20210107001739.1321725-1-mask...@google.com/ "x86: Treat R_386_PLT32 as R_386_PC32" a prerequisite for building with trunk Clang: https://github.com/ClangBuiltLinux/linux/issues/1210

More than four months have passed since this patch was posted, and it seems that something else has regressed the non -ffreestanding build in the meantime. Maybe another -fno-builtin-* is needed somewhere.
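To make the "libcall optimization" in this thread concrete, here is a small hedged C sketch. The function name buffers_equal is invented for illustration. The claim that a compiler may rewrite this particular call is an assumption about optimizer behaviour (Clang is known to be able to turn an equality-only memcmp() into bcmp()), not something this snippet can demonstrate by itself.

```c
#include <stddef.h>
#include <string.h>

/*
 * Only equality with zero is tested, so a hosted (non -ffreestanding)
 * build is allowed to replace this memcmp() call with a different
 * library routine.  Clang, for example, may emit a call to bcmp() here.
 * Such compiler-produced references are what the thread calls
 * "compiler produced declarations".  Building with -fno-builtin-bcmp
 * (or -ffreestanding) is expected to suppress the transform.
 */
static int buffers_equal(const void *a, const void *b, size_t n)
{
	return memcmp(a, b, n) == 0;
}
```

If the kernel did not provide bcmp(), a build of code like this could fail at link time with an undefined reference that never appears in the source, which is the symptom described in the commit message.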
[PATCH] x86: Treat R_386_PLT32 as R_386_PC32
This is similar to commit b21ebf2fb4cde1618915a97cc773e287ff49173e "x86: Treat R_X86_64_PLT32 as R_X86_64_PC32", but for i386. As far as Linux kernel is concerned, R_386_PLT32 can be treated the same as R_386_PC32. R_386_PC32/R_X86_64_PC32 are PC-relative relocation types with the requirement that the symbol address is significant. R_386_PLT32/R_X86_64_PLT32 are PC-relative relocation types without the address significance requirement. On x86-64, there is no PIC vs non-PIC PLT distinction and an R_X86_64_PLT32 relocation is produced for both `call/jmp foo` and `call/jmp foo@PLT` with newer (2018) GNU as/LLVM integrated assembler. On i386, there are 2 types of PLTs, PIC and non-PIC. Currently the convention is to use R_386_PC32 for non-PIC PLT and R_386_PLT32 for PIC PLT, but R_386_PLT32 is arguably preferable for -fno-pic code as well: this can avoid a "canonical PLT entry" (st_shndx=0, st_value!=0) if the symbol turns out to be defined externally. Latest Clang (since 961f31d8ad14c66829991522d73e14b5a96ff6d4) can use R_386_PLT32 for compiler produced symbols (if we drop -ffreestanding for CONFIG_X86_32, library call optimization can produce newer declarations) and future GCC may use R_386_PLT32 as well if the maintainers agree to adopt an option like -fdirect-access-external-data to avoid "canonical PLT entry"/copy relocations https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 Link: https://github.com/ClangBuiltLinux/linux/issues/1210 Reported-by: Arnd Bergmann Signed-off-by: Fangrui Song --- arch/x86/kernel/module.c | 1 + arch/x86/tools/relocs.c | 2 ++ 2 files changed, 3 insertions(+) diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c index 34b153cbd4ac..5e9a34b5bd74 100644 --- a/arch/x86/kernel/module.c +++ b/arch/x86/kernel/module.c @@ -114,6 +114,7 @@ int apply_relocate(Elf32_Shdr *sechdrs, *location += sym->st_value; break; case R_386_PC32: + case R_386_PLT32: /* Add the value, subtract its position */ *location += sym->st_value - 
(uint32_t)location; break; diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c index ce7188cbdae5..717e48ca28b6 100644 --- a/arch/x86/tools/relocs.c +++ b/arch/x86/tools/relocs.c @@ -867,6 +867,7 @@ static int do_reloc32(struct section *sec, Elf_Rel *rel, Elf_Sym *sym, case R_386_PC32: case R_386_PC16: case R_386_PC8: + case R_386_PLT32: /* * NONE can be ignored and PC relative relocations don't * need to be adjusted. @@ -910,6 +911,7 @@ static int do_reloc_real(struct section *sec, Elf_Rel *rel, Elf_Sym *sym, case R_386_PC32: case R_386_PC16: case R_386_PC8: + case R_386_PLT32: /* * NONE can be ignored and PC relative relocations don't * need to be adjusted. -- 2.29.2.729.g45daf8777d-goog
Re: building csky with CC=clang
On 2020-12-22, 'Nick Desaulniers' via Clang Built Linux wrote:
> Hello! I was playing with some of LLVM's experimental backends (m68k) and saw there was a CSKY backend. I rebuilt LLVM to support CSKY, but I ran into trouble building the kernel before even getting to the compiler invocation:
>
> $ ARCH=csky CROSS_COMPILE=csky-linux-gnu- make CC=clang -j71 defconfig
> ...
> scripts/Kconfig.include:40: linker 'csky-linux-gnu-ld' not found
>
> My distro doesn't package binutils-csky-linux-gnu, is there documentation on how to build the kernel targeting CSKY, starting with building GNU binutils configured with CSKY emulation?

Note also that llvm/lib/Target/CSKY has not been fully upstreamed yet. It is a work in progress: https://lists.llvm.org/pipermail/llvm-dev/2020-August/144481.html

I would not expect clang csky to work currently. (The latest committed LLVM patch is https://reviews.llvm.org/D93372 Normally, committing an important piece of a large patch series like this should wait a bit longer after someone in the community has accepted it: https://llvm.org/docs/CodeReview.html#can-code-be-reviewed-after-it-is-committed )

I do want to raise the recent LLVM M68k target. Its patches ([M68k] (Patch */8)) are very well organized, and the main proposer shares updates on llvm-dev regularly. There is a lot in that process that the C-SKY target can learn from.
Re: [PATCH v8 00/16] Add support for Clang LTO
On 2020-12-08, 'Sami Tolvanen' via Clang Built Linux wrote: On Tue, Dec 8, 2020 at 4:15 AM Arnd Bergmann wrote: On Tue, Dec 1, 2020 at 10:37 PM 'Sami Tolvanen' via Clang Built Linux wrote: > > This patch series adds support for building the kernel with Clang's > Link Time Optimization (LTO). In addition to performance, the primary > motivation for LTO is to allow Clang's Control-Flow Integrity (CFI) > to be used in the kernel. Google has shipped millions of Pixel > devices running three major kernel versions with LTO+CFI since 2018. > > Most of the patches are build system changes for handling LLVM > bitcode, which Clang produces with LTO instead of ELF object files, > postponing ELF processing until a later stage, and ensuring initcall > ordering. > > Note that arm64 support depends on Will's memory ordering patches > [1]. I will post x86_64 patches separately after we have fixed the > remaining objtool warnings [2][3]. > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=for-next/lto > [2] https://lore.kernel.org/lkml/20201120040424.a3wctajzft4ufoiw@treble/ > [3] https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=objtool-vmlinux > > You can also pull this series from > > https://github.com/samitolvanen/linux.git lto-v8 I've tried pull this into my randconfig test tree to give it a spin. Great, thank you for testing this! So far I have not managed to get a working build out of it, the main problem so far being that it is really slow to build because the link stage only uses one CPU. These are the other issues I've seen so far: ld.lld ThinLTO uses the number of (physical cores enabled by affinity) by default. You may want to limit your testing only to ThinLTO at first, because full LTO is going to be extremely slow with larger configs, especially when building arm64 kernels. - one build seems to take even longer to link. 
It's currently at 35GB RAM usage and 40 minutes into the final link, but I'm worried it might not complete before it runs out of memory. I only have 128GB installed, and google-chrome uses another 30GB of that, and I'm also doing some other builds in parallel. Is there a minimum recommended amount of memory for doing LTO builds? When building arm64 defconfig, the maximum memory usage I measured with ThinLTO was 3.5 GB, and with full LTO 20.3 GB. I haven't measured larger configurations, but I believe LLD can easily consume 3-4x that much with full LTO allyesconfig. - One build failed with ld.lld -EL -maarch64elf -mllvm -import-instr-limit=5 -r -o vmlinux.o -T .tmp_initcalls.lds --whole-archive arch/arm64/kernel/head.o init/built-in.a usr/built-in.a arch/arm64/built-in.a kernel/built-in.a certs/built-in.a mm/built-in.a fs/built-in.a ipc/built-in.a security/built-in.a crypto/built-in.a block/built-in.a arch/arm64/lib/built-in.a lib/built-in.a drivers/built-in.a sound/built-in.a net/built-in.a virt/built-in.a --no-whole-archive --start-group arch/arm64/lib/lib.a lib/lib.a ./drivers/firmware/efi/libstub/lib.a --end-group "ld.lld: error: arch/arm64/kernel/head.o: invalid symbol index" after about 30 minutes That's interesting. Did you use LLVM_IAS=1? May be worth checking which relocation or (SHT_GROUP section's sh_info) in arch/arm64/kernel/head.o is incorrect. - CONFIG_CPU_BIG_ENDIAN doesn't seem to work with lld, and LTO doesn't work with ld.bfd. I've added a CPU_LITTLE_ENDIAN dependency to ARCH_SUPPORTS_LTO_CLANG{,THIN} Ah, good point. I'll fix this in v9. Full/Thin LTO should work with GNU ld and gold with LLVMgold.so built from llvm-project (https://llvm.org/docs/GoldPlugin.html ). You'll need to make sure that LLVMgold.so is newer than clang. (Newer clang may introduce bitcode attributes which are unrecognizable by older LLVMgold.so/ld.lld) [...] Not sure if these are all known issues. 
If there is one you'd like me try take a closer look at for finding which config options break it, I can try No, none of these are known issues. I would be happy to take a closer look if you can share configs that reproduce these. Sami
[PATCH v2] firmware_loader: Align .builtin_fw to 8
arm64 references the start address of .builtin_fw (__start_builtin_fw) with a pair of R_AARCH64_ADR_PREL_PG_HI21/R_AARCH64_LDST64_ABS_LO12_NC relocations. The compiler is allowed to emit the R_AARCH64_LDST64_ABS_LO12_NC relocation because struct builtin_fw in include/linux/firmware.h is 8-byte aligned. The R_AARCH64_LDST64_ABS_LO12_NC relocation requires the address to be a multiple of 8, which may not be the case if .builtin_fw is empty. Unconditionally align .builtin_fw to fix the linker error. 32-bit architectures could use ALIGN(4) but that would add unnecessary complexity, so just use ALIGN(8). Fixes: 5658c76 ("firmware: allow firmware files to be built into kernel image") Link: https://github.com/ClangBuiltLinux/linux/issues/1204 Reported-by: kernel test robot Signed-off-by: Fangrui Song Acked-by: Arnd Bergmann --- Change in v2: * Use output section alignment instead of inappropriate ALIGN_FUNCTION() --- include/asm-generic/vmlinux.lds.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index b2b3d81b1535..b97c628ad91f 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -459,7 +459,7 @@ } \ \ /* Built-in firmware blobs */ \ - .builtin_fw: AT(ADDR(.builtin_fw) - LOAD_OFFSET) { \ + .builtin_fw : AT(ADDR(.builtin_fw) - LOAD_OFFSET) ALIGN(8) {\ __start_builtin_fw = .; \ KEEP(*(.builtin_fw))\ __end_builtin_fw = .; \ -- 2.29.2.576.ga3fc446d84-goog
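The alignment requirement described in this commit message can be checked with a few lines of arithmetic. The sketch below is illustrative only: align_up, ldst64_lo12_ok and ldst64_lo12_imm are made-up helper names, and the addresses in the usage note are arbitrary. It relies on the fact that R_AARCH64_LDST64_ABS_LO12_NC encodes bits [11:3] of the target address, so the low three bits have to be zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two),
 * which is what ALIGN(8) does to the start of an output section. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/* A 64-bit load/store LO12 relocation can only represent addresses
 * whose low three bits are zero. */
static bool ldst64_lo12_ok(uint64_t addr)
{
	return (addr & 7) == 0;
}

/* The 9-bit scaled immediate a linker would encode: bits [11:3]. */
static uint32_t ldst64_lo12_imm(uint64_t addr)
{
	return (uint32_t)((addr & 0xfff) >> 3);
}
```

An empty .builtin_fw that happens to start at, say, 0x1004 fails ldst64_lo12_ok(), while the same section after align_up(0x1004, 8) starts at 0x1008 and encodes cleanly. That is the linker error the ALIGN(8) in the patch avoids.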
Re: [PATCH] firmware_loader: Align .builtin_fw to 8
On 2020-12-03, Nick Desaulniers wrote:
> On Thu, Dec 3, 2020 at 9:05 AM Fangrui Song wrote:
> > arm64 references the start address of .builtin_fw (__start_builtin_fw) with a pair of R_AARCH64_ADR_PREL_PG_HI21/R_AARCH64_LDST64_ABS_LO12_NC relocations. The compiler is allowed to emit the R_AARCH64_LDST64_ABS_LO12_NC relocation because struct builtin_fw in include/linux/firmware.h is 8-byte aligned.
> >
> > The R_AARCH64_LDST64_ABS_LO12_NC relocation requires the address to be a multiple of 8, which may not be the case if .builtin_fw is empty. Unconditionally align .builtin_fw to fix the linker error.
> >
> > Fixes: 5658c76 ("firmware: allow firmware files to be built into kernel image")
> > Link: https://github.com/ClangBuiltLinux/linux/issues/1204
> > Reported-by: kernel test robot
> > Signed-off-by: Fangrui Song
> > ---
> > include/asm-generic/vmlinux.lds.h | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > index b2b3d81b1535..3cd4bd1193ab 100644
> > --- a/include/asm-generic/vmlinux.lds.h
> > +++ b/include/asm-generic/vmlinux.lds.h
> > @@ -459,6 +459,7 @@
> > } \
> > \
> > /* Built-in firmware blobs */ \
> > + ALIGN_FUNCTION(); \
>
> Thanks for the patch! I'm going to repeat my question from the above link (https://github.com/ClangBuiltLinux/linux/issues/1204#issuecomment-737610582) just in case it's not naive:
>
> The ALIGN_FUNCTION() C preprocessor macro seems to be used to realign code, while STRUCT_ALIGN() seems to be used to realign data. It looks to me like only data is put into .builtin_fw. If these relocations require an alignment of 8, then multiples of 8 should also be fine (STRUCT_ALIGN is 32 for all toolchain versions except gcc 4.9, where it is 64; both are multiples of 8 though). It looks like only structs are placed in .builtin_fw; ie. data. In that case, I worry that using ALIGN_FUNCTION/8 might actually be under-aligning data in this section.
Regarding STRUCT_ALIGN (32 for GCC>4.9) in include/asm-generic/vmlinux.lds.h, it is probably not suitable for .builtin_fw:

* Its comment is a bit unclear. It should probably mention that the 32-byte overalignment applies only to global structure variables that are at least 32 bytes large. But this is just my observation. Adding a GCC maintainer to comment on this.
* Even if GCC does overalign defined global struct variables, it is unlikely that GCC will leverage this property for the undefined `extern struct builtin_fw __start_builtin_fw[]` (drivers/base/firmware_loader/main.c).

To make .builtin_fw aligned, I agree that ALIGN_FUNCTION() is probably a misuse. Maybe I should just use `. = ALIGN(8)`, if the kernel linker script prefers `. = ALIGN(8)` to an output section alignment (https://sourceware.org/binutils/docs/ld/Output-Section-Description.html#Output-Section-Description https://lld.llvm.org/ELF/linker_script.html#output-section-alignment)

> Though, in https://github.com/ClangBuiltLinux/linux/issues/1204#issuecomment-737625134 you comment:
>
> In GNU ld, the empty .builtin_fw is removed
>
> So that's a difference in behavior between ld.bfd and ld.lld, which is fine, but it makes me wonder whether we should instead or additionally be discarding this section explicitly via linker script when CONFIG_FW_LOADER is not set?

Short answer: No, we should not discard .builtin_fw

.builtin_fw: AT(ADDR(.builtin_fw) - LOAD_OFFSET) { __start_builtin_fw = .; ... }

In LLD, either a section reference (`ADDR(.builtin_fw)`) or a non-PROVIDE symbol assignment (`__start_builtin_fw = .`) makes the section non-discardable. It can be argued that discarding an output section that contains a symbol assignment (as GNU ld does) is strange, because the symbol (st_shndx) will then be defined relative to an arbitrary unrelated section. Retaining the section can avoid some other issues.
.builtin_fw: AT(ADDR(.builtin_fw) - LOAD_OFFSET) { \ __start_builtin_fw = .; \ KEEP(*(.builtin_fw))\ -- 2.29.2.576.ga3fc446d84-goog -- Thanks, ~Nick Desaulniers
[PATCH] firmware_loader: Align .builtin_fw to 8
arm64 references the start address of .builtin_fw (__start_builtin_fw) with a pair of R_AARCH64_ADR_PREL_PG_HI21/R_AARCH64_LDST64_ABS_LO12_NC relocations. The compiler is allowed to emit the R_AARCH64_LDST64_ABS_LO12_NC relocation because struct builtin_fw in include/linux/firmware.h is 8-byte aligned. The R_AARCH64_LDST64_ABS_LO12_NC relocation requires the address to be a multiple of 8, which may not be the case if .builtin_fw is empty. Unconditionally align .builtin_fw to fix the linker error. Fixes: 5658c76 ("firmware: allow firmware files to be built into kernel image") Link: https://github.com/ClangBuiltLinux/linux/issues/1204 Reported-by: kernel test robot Signed-off-by: Fangrui Song --- include/asm-generic/vmlinux.lds.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index b2b3d81b1535..3cd4bd1193ab 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -459,6 +459,7 @@ } \ \ /* Built-in firmware blobs */ \ + ALIGN_FUNCTION(); \ .builtin_fw: AT(ADDR(.builtin_fw) - LOAD_OFFSET) { \ __start_builtin_fw = .; \ KEEP(*(.builtin_fw))\ -- 2.29.2.576.ga3fc446d84-goog
Re: [PATCH v2 2/4] Kbuild: do not emit debug info for assembly with LLVM_IAS=1
On 2020-11-04, Nathan Chancellor wrote:
> On Tue, Nov 03, 2020 at 04:53:41PM -0800, Nick Desaulniers wrote:
> > Clang's integrated assembler produces the warning for assembly files:
> >
> > warning: DWARF2 only supports one section per compilation unit
> >
> > If -Wa,-gdwarf-* is unspecified, then debug info is not emitted. This
>
> Is this something that should be called out somewhere? If I understand this correctly, LLVM_IAS=1 + CONFIG_DEBUG_INFO=y won't work? Maybe this should be handled in Kconfig?
>
> > will be re-enabled for new DWARF versions in a follow up patch.
> >
> > Enables defconfig+CONFIG_DEBUG_INFO to build cleanly with LLVM=1 LLVM_IAS=1 for x86_64 and arm64.
> >
> > Cc:
> > Link: https://github.com/ClangBuiltLinux/linux/issues/716
> > Reported-by: Nathan Chancellor
> > Suggested-by: Dmitry Golovin
>
> If you happen to respin, Dmitry deserves a Reported-by tag too :)
>
> > Suggested-by: Sedat Dilek
> > Signed-off-by: Nick Desaulniers
>
> Regardless of the other two comments, this is fine as is as a fix for stable to unblock Android + CrOS since we have been running something similar to it in CI:
>
> Reviewed-by: Nathan Chancellor
>
> > ---
> > Makefile | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/Makefile b/Makefile
> > index f353886dbf44..75b1a3dcbf30 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -826,7 +826,9 @@ else
> > DEBUG_CFLAGS += -g
> > endif
> >
> > +ifndef LLVM_IAS
>
> Nit: this should probably match the existing LLVM_IAS check
>
> ifneq ($(LLVM_IAS),1)
>
> > KBUILD_AFLAGS += -Wa,-gdwarf-2
> > +endif
> >
> > ifdef CONFIG_DEBUG_INFO_DWARF4
> > DEBUG_CFLAGS += -gdwarf-4
> > --
> > 2.29.1.341.ge80a0c044ae-goog

The root cause is that DWARF v2 has no DW_AT_ranges, so it cannot represent non-contiguous address ranges. It seems that GNU as -gdwarf-3 emits DW_AT_ranges as well and emits an entry for a non-executable section. In any case, the option is of very low value, at least for LLVM.

Reviewed-by: Fangrui Song
[tip: x86/urgent] x86/lib: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: 4d6ffa27b8e5116c0abb318790fd01d4e12d75e6 Gitweb: https://git.kernel.org/tip/4d6ffa27b8e5116c0abb318790fd01d4e12d75e6 Author:Fangrui Song AuthorDate:Mon, 02 Nov 2020 17:23:58 -08:00 Committer: Borislav Petkov CommitterDate: Wed, 04 Nov 2020 12:30:20 +01:00 x86/lib: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S Commit 393f203f5fd5 ("x86_64: kasan: add interceptors for memset/memmove/memcpy functions") added .weak directives to arch/x86/lib/mem*_64.S instead of changing the existing ENTRY macros to WEAK. This can lead to the assembly snippet .weak memcpy ... .globl memcpy which will produce a STB_WEAK memcpy with GNU as but STB_GLOBAL memcpy with LLVM's integrated assembler before LLVM 12. LLVM 12 (since https://reviews.llvm.org/D90108) will error on such an overridden symbol binding. Commit ef1e03152cb0 ("x86/asm: Make some functions local") changed ENTRY in arch/x86/lib/memcpy_64.S to SYM_FUNC_START_LOCAL, which was ineffective due to the preceding .weak directive. Use the appropriate SYM_FUNC_START_WEAK instead. Fixes: 393f203f5fd5 ("x86_64: kasan: add interceptors for memset/memmove/memcpy functions") Fixes: ef1e03152cb0 ("x86/asm: Make some functions local") Reported-by: Sami Tolvanen Signed-off-by: Fangrui Song Signed-off-by: Borislav Petkov Reviewed-by: Nick Desaulniers Tested-by: Nathan Chancellor Tested-by: Nick Desaulniers Cc: Link: https://lkml.kernel.org/r/20201103012358.168682-1-mask...@google.com --- arch/x86/lib/memcpy_64.S | 4 +--- arch/x86/lib/memmove_64.S | 4 +--- arch/x86/lib/memset_64.S | 4 +--- 3 files changed, 3 insertions(+), 9 deletions(-) diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S index 037faac..1e299ac 100644 --- a/arch/x86/lib/memcpy_64.S +++ b/arch/x86/lib/memcpy_64.S @@ -16,8 +16,6 @@ * to a jmp to memcpy_erms which does the REP; MOVSB mem copy. */ -.weak memcpy - /* * memcpy - Copy a memory block. 
* @@ -30,7 +28,7 @@ * rax original destination */ SYM_FUNC_START_ALIAS(__memcpy) -SYM_FUNC_START_LOCAL(memcpy) +SYM_FUNC_START_WEAK(memcpy) ALTERNATIVE_2 "jmp memcpy_orig", "", X86_FEATURE_REP_GOOD, \ "jmp memcpy_erms", X86_FEATURE_ERMS diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S index 7ff00ea..41902fe 100644 --- a/arch/x86/lib/memmove_64.S +++ b/arch/x86/lib/memmove_64.S @@ -24,9 +24,7 @@ * Output: * rax: dest */ -.weak memmove - -SYM_FUNC_START_ALIAS(memmove) +SYM_FUNC_START_WEAK(memmove) SYM_FUNC_START(__memmove) mov %rdi, %rax diff --git a/arch/x86/lib/memset_64.S b/arch/x86/lib/memset_64.S index 9ff15ee..0bfd26e 100644 --- a/arch/x86/lib/memset_64.S +++ b/arch/x86/lib/memset_64.S @@ -6,8 +6,6 @@ #include #include -.weak memset - /* * ISO C memset - set a memory block to a byte value. This function uses fast * string to get better performance than the original function. The code is @@ -19,7 +17,7 @@ * * rax original destination */ -SYM_FUNC_START_ALIAS(memset) +SYM_FUNC_START_WEAK(memset) SYM_FUNC_START(__memset) /* * Some CPUs support enhanced REP MOVSB/STOSB feature. It is recommended
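The patch above is about symbol binding in assembly, but the same distinction exists in C. As a rough analogue (an illustration under stated assumptions, not part of the kernel change), the GCC/Clang `weak` attribute asks for the STB_WEAK binding that SYM_FUNC_START_WEAK requests in assembly. copy_bytes is a made-up name chosen so that the sketch does not collide with the C library's memcpy.

```c
#include <stddef.h>

/*
 * GCC/Clang extension: on ELF targets this definition is emitted with
 * STB_WEAK binding, so another object file may provide a strong
 * (STB_GLOBAL) copy_bytes that overrides it at link time.  That is the
 * relationship the KASAN interceptors rely on for memcpy, memmove and
 * memset in arch/x86/lib/mem*_64.S.
 */
__attribute__((weak)) void *copy_bytes(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dest;
}
```

The assembly bug fixed by the patch has no direct C equivalent: it came from requesting two bindings for one symbol (`.weak memcpy` followed by `.globl memcpy`), which assemblers resolved differently until LLVM 12 started rejecting it.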
[PATCH] perf bench: Update arch/x86/lib/mem{cpy,set}_64.S
In memset_64.S, the macros expand to `.weak MEMSET ... .globl MEMSET` which will produce a STB_WEAK MEMSET with GNU as but STB_GLOBAL MEMSET with LLVM's integrated assembler before LLVM 12. LLVM 12 (since https://reviews.llvm.org/D90108) will error on such an overridden symbol binding. memcpy_64.S is similar. Port http://lore.kernel.org/r/20201103012358.168682-1-mask...@google.com ("x86_64: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S") to fix the issue. Additionally, port SYM_L_WEAK and SYM_FUNC_START_WEAK from include/linux/linkage.h to tools/perf/util/include/linux/linkage.h Fixes: 7d7d1bf1d1da ("perf bench: Copy kernel files needed to build mem{cpy,set} x86_64 benchmarks") Link: https://lore.kernel.org/r/20201103012358.168682-1-mask...@google.com Signed-off-by: Fangrui Song --- tools/arch/x86/lib/memcpy_64.S | 4 +--- tools/arch/x86/lib/memset_64.S | 4 +--- tools/perf/util/include/linux/linkage.h | 7 +++ 3 files changed, 9 insertions(+), 6 deletions(-) diff --git a/tools/arch/x86/lib/memcpy_64.S b/tools/arch/x86/lib/memcpy_64.S index 0b5b8ae56bd9..9428f251df0f 100644 --- a/tools/arch/x86/lib/memcpy_64.S +++ b/tools/arch/x86/lib/memcpy_64.S @@ -16,8 +16,6 @@ * to a jmp to memcpy_erms which does the REP; MOVSB mem copy. */ -.weak memcpy - /* * memcpy - Copy a memory block. * @@ -30,7 +28,7 @@ * rax original destination */ SYM_FUNC_START_ALIAS(__memcpy) -SYM_FUNC_START_LOCAL(memcpy) +SYM_FUNC_START_WEAK(memcpy) ALTERNATIVE_2 "jmp memcpy_orig", "", X86_FEATURE_REP_GOOD, \ "jmp memcpy_erms", X86_FEATURE_ERMS diff --git a/tools/arch/x86/lib/memset_64.S b/tools/arch/x86/lib/memset_64.S index fd5d25a474b7..1f9b11f9244d 100644 --- a/tools/arch/x86/lib/memset_64.S +++ b/tools/arch/x86/lib/memset_64.S @@ -5,8 +5,6 @@ #include #include -.weak memset - /* * ISO C memset - set a memory block to a byte value. This function uses fast * string to get better performance than the original function. 
The code is @@ -18,7 +16,7 @@ * * rax original destination */ -SYM_FUNC_START_ALIAS(memset) +SYM_FUNC_START_WEAK(memset) SYM_FUNC_START(__memset) /* * Some CPUs support enhanced REP MOVSB/STOSB feature. It is recommended diff --git a/tools/perf/util/include/linux/linkage.h b/tools/perf/util/include/linux/linkage.h index b8a5159361b4..0e493bf3151b 100644 --- a/tools/perf/util/include/linux/linkage.h +++ b/tools/perf/util/include/linux/linkage.h @@ -25,6 +25,7 @@ /* SYM_L_* -- linkage of symbols */ #define SYM_L_GLOBAL(name) .globl name +#define SYM_L_WEAK(name) .weak name #define SYM_L_LOCAL(name) /* nothing */ #define ALIGN __ALIGN @@ -78,6 +79,12 @@ SYM_START(name, SYM_L_LOCAL, SYM_A_ALIGN) #endif +/* SYM_FUNC_START_WEAK -- use for weak functions */ +#ifndef SYM_FUNC_START_WEAK +#define SYM_FUNC_START_WEAK(name) \ + SYM_START(name, SYM_L_WEAK, SYM_A_ALIGN) +#endif + /* SYM_FUNC_END_ALIAS -- the end of LOCAL_ALIASed or ALIASed function */ #ifndef SYM_FUNC_END_ALIAS #define SYM_FUNC_END_ALIAS(name) \ -- 2.29.1.341.ge80a0c044ae-goog
Re: [PATCH] module: use hidden visibility for weak symbol references
One nit about ".got" in the message:

Reviewed-by: Fangrui Song

On 2020-10-27, Nick Desaulniers wrote:
> + Fangrui
>
> On Tue, Oct 27, 2020 at 8:11 AM Ard Biesheuvel wrote:
> > Geert reports that commit be2881824ae9eb92 ("arm64/build: Assert for unwanted sections") results in build errors on arm64 for configurations that have CONFIG_MODULES disabled.
> >
> > The commit in question added ASSERT()s to the arm64 linker script to ensure that linker generated sections such as .got, .plt etc are empty,

.got -> .got.plt

be2881824ae9eb92 does not ASSERT on .got (it can). Strangely *(.got) is placed in .text in arch/arm64/kernel/vmlinux.lds.S I think that line can be removed. On x86, aarch64 and many other archs, the start of .got.plt is the GOT base. .got is not needed (ppc/arm/riscv use .got instead of .got.plt as the GOT base anchor).

> > but as it turns out, there are corner cases where the linker does emit content into those sections. More specifically, weak references to function symbols (which can remain unsatisfied, and can therefore not be emitted as relative references) will be emitted as GOT and PLT entries when linking the kernel in PIE mode (which is the case when CONFIG_RELOCATABLE is enabled, which is on by default).

Confirmed.

> > What happens is that code such as
> >
> > struct device *(*fn)(struct device *dev);
> > struct device *iommu_device;
> >
> > fn = symbol_get(mdev_get_iommu_device);
> > if (fn) {
> > iommu_device = fn(dev);
> >
> > essentially gets converted into the following when CONFIG_MODULES is off:
> >
> > struct device *iommu_device;
> >
> > if (&mdev_get_iommu_device) {
> > iommu_device = mdev_get_iommu_device(dev);
> >
> > where mdev_get_iommu_device is emitted as a weak symbol reference into the object file. The first reference is decorated with an ordinary ABS64 data relocation (which yields 0x0 if the reference remains unsatisfied). However, the indirect call is turned into a direct call covered by a R_AARCH64_CALL26 relocation, which is converted into a call via a PLT entry taking the target address from the associated GOT entry.
Yes, the R_AARCH64_CALL26 relocation referencing an undefined weak symbol causes one .plt entry and one .got.plt entry. Given that such GOT and PLT entries are unnecessary for fully linked binaries such as the kernel, let's give these weak symbol references hidden visibility, so that the linker knows that the weak reference via R_AARCH64_CALL26 can simply remain unsatisfied. Cc: Jessica Yu Cc: Kees Cook Cc: Geert Uytterhoeven Cc: Nick Desaulniers Signed-off-by: Ard Biesheuvel --- include/linux/module.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/module.h b/include/linux/module.h index 7ccdf87f376f..6264617bab4d 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -740,7 +740,7 @@ static inline bool within_module(unsigned long addr, const struct module *mod) } /* Get/put a kernel symbol (calls should be symmetric) */ -#define symbol_get(x) ({ extern typeof(x) x __attribute__((weak)); &(x); }) +#define symbol_get(x) ({ extern typeof(x) x __attribute__((weak,visibility("hidden"))); &(x); }) #define symbol_put(x) do { } while (0) #define symbol_put_addr(x) do { } while (0) -- 2.17.1 -- Thanks, ~Nick Desaulniers
Re: [PATCH] Kbuild: implement support for DWARF5
On 2020-10-21, 'Nick Desaulniers' via Clang Built Linux wrote:
> DWARF5 is the latest standard of the DWARF debug info format.
>
> Feature detection of DWARF5 is onerous, especially given that we've removed $(AS), so we must query $(CC) for DWARF5 assembler directive support. Further -gdwarf-X where X is an unsupported value doesn't produce an error in $(CC). GNU `as` only recently gained support for specifying -gdwarf-5.
>
> The DWARF version of a binary can be validated with:

To be more correct: this is just the version number of the .debug_info section. Other sections can use different version numbers. (For example, GNU as still does not support version 5 .debug_line.)

> $ llvm-dwarfdump vmlinux | head -n 5 | grep version
> or
> $ readelf --debug-dump=info vmlinux 2>/dev/null | grep Version
>
> DWARF5 wins significantly in terms of size when mixed with compression (CONFIG_DEBUG_INFO_COMPRESSED).
>
> 363M vmlinux.clang12.dwarf5.compressed
> 434M vmlinux.clang12.dwarf4.compressed
> 439M vmlinux.clang12.dwarf2.compressed
> 457M vmlinux.clang12.dwarf5
> 536M vmlinux.clang12.dwarf4
> 548M vmlinux.clang12.dwarf2
>
> Make CONFIG_DEBUG_INFO_DWARF4 part of a Kconfig choice to preserve forward compatibility.
>
> Link: http://www.dwarfstd.org/doc/DWARF5.pdf
> Signed-off-by: Nick Desaulniers
> ---
> RFC because this patch is super half baked, but I'm looking for feedback.
>
> I would logically split this into a series of patches;
> 1. disable -Wa,-gdwarf-2 for LLVM_IAS=1, see also
> https://github.com/ClangBuiltLinux/linux/issues/716
> https://github.com/ClangBuiltLinux/continuous-integration/blob/master/patches/llvm-all/linux-next/arm64/silence-dwarf2-warnings.patch
> that way we can backport for improved LLVM_IAS support.
> 2. move CONFIG_DEBUG_INFO_DWARF4 to choice.
> 3. implement the rest on top.
>
> I'm pretty sure GNU `as` only recently gained the ability to specify -gdwarf-4 without erroring in binutils 2.35, so that part likely needs to be fixed.
Makefile | 19 --- include/asm-generic/vmlinux.lds.h | 6 +- lib/Kconfig.debug | 29 + scripts/test_dwarf5_support.sh| 4 4 files changed, 50 insertions(+), 8 deletions(-) create mode 100755 scripts/test_dwarf5_support.sh diff --git a/Makefile b/Makefile index e71979882e4f..0862df5b1a24 100644 --- a/Makefile +++ b/Makefile @@ -828,10 +828,23 @@ else DEBUG_CFLAGS+= -g endif -KBUILD_AFLAGS += -Wa,-gdwarf-2 - +DWARF_VERSION=2 ifdef CONFIG_DEBUG_INFO_DWARF4 -DEBUG_CFLAGS += -gdwarf-4 +DWARF_VERSION=4 +endif +ifdef CONFIG_DEBUG_INFO_DWARF5 +DWARF_VERSION=5 +endif +DEBUG_CFLAGS += -gdwarf-$(DWARF_VERSION) + +ifneq ($(DWARF_VERSION)$(LLVM_IAS),21) +KBUILD_AFLAGS += -Wa,-gdwarf-$(DWARF_VERSION) +endif + +ifdef CONFIG_CC_IS_CLANG +ifneq ($(LLVM_IAS),1) +KBUILD_CFLAGS += -Wa,-gdwarf-$(DWARF_VERSION) +endif endif ifdef CONFIG_DEBUG_INFO_REDUCED diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index cd1bf600..0382808ef9fe 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -828,7 +828,11 @@ .debug_types0 : { *(.debug_types) } \ /* DWARF 5 */ \ .debug_macro0 : { *(.debug_macro) } \ - .debug_addr 0 : { *(.debug_addr) } + .debug_addr 0 : { *(.debug_addr) } \ + .debug_line_str 0 : { *(.debug_line_str) } \ + .debug_loclists 0 : { *(.debug_loclists) } \ + .debug_rnglists 0 : { *(.debug_rnglists) } \ + .debug_str_offsets 0 : { *(.debug_str_offsets) } Consider adding .debug_names for the accelerator table. It is the DWARF v5 version of .debug_pub{names,types} (which are mentioned a few lines above). /* Stabs debugging sections. */ #define STABS_DEBUG \ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 537cf3c2937d..6b01f0e2dad8 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -256,14 +256,35 @@ config DEBUG_INFO_SPLIT to know about the .dwo files and include them. Incompatible with older versions of ccache. 
+choice +prompt "DWARF version" + depends on DEBUG_INFO + default DEBUG_INFO_DWARF2 + help + Which version of DWARF debug info to emit. + +config DEBUG_INFO_DWARF2 + bool "Generate dwarf2 debuginfo" + help + Generate dwarf2 debug info. In documentation, a more official way to refer to the format is: DWARF v2. (While "DWARF5" and "DWARF v5" are acceptable, the latter is preferred) Ditto below. config DEBUG_INFO_DWARF4 bool "Generate dwarf4 debuginfo" depends on $(cc-option,-gdwarf-4) help - Generate dwarf4 debug info. This requires recent versions - of gcc and gdb. It makes the
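Illustration (not part of the original message): the review comment above notes that the tools only report the version stored in the .debug_info unit header. A minimal C sketch of where that number lives, assuming a little-endian buffer holding the start of .debug_info; the helper name `dwarf_unit_version` is hypothetical and does not come from the patch or from any library:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the version field of the first unit header
 * in a raw .debug_info buffer, or -1 if the buffer is too short.
 * Assumes little-endian data.
 */
static int dwarf_unit_version(const uint8_t *buf, size_t len)
{
	size_t off = 4;	/* 32-bit DWARF: 4-byte unit_length comes first */

	if (len < 4)
		return -1;
	/* 64-bit DWARF: 0xffffffff escape followed by an 8-byte length */
	if (buf[0] == 0xff && buf[1] == 0xff && buf[2] == 0xff && buf[3] == 0xff)
		off = 12;
	if (len < off + 2)
		return -1;
	/* The 2-byte version follows the length field */
	return buf[off] | (buf[off + 1] << 8);
}
```

Other sections such as .debug_line carry their own header and version field, so a check like this says nothing about them, which is the point of the comment.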
Re: [tip:x86/seves] BUILD SUCCESS WITH WARNING e6eb15c9ba3165698488ae5c34920eea20eaa38e
On 2020-09-16, 'Marco Elver' via Clang Built Linux wrote: On Wed, 16 Sep 2020 at 20:22, 'Nick Desaulniers' via kasan-dev wrote: On Wed, Sep 16, 2020 at 1:46 AM Marco Elver wrote: > > On Wed, 16 Sep 2020 at 10:30, wrote: > > On Tue, Sep 15, 2020 at 08:09:16PM +0200, Marco Elver wrote: > > > On Tue, 15 Sep 2020 at 19:40, Nick Desaulniers wrote: > > > > On Tue, Sep 15, 2020 at 10:21 AM Borislav Petkov wrote: > > > > > > > init/calibrate.o: warning: objtool: asan.module_ctor()+0xc: call without frame pointer save/setup > > > > > init/calibrate.o: warning: objtool: asan.module_dtor()+0xc: call without frame pointer save/setup > > > > > init/version.o: warning: objtool: asan.module_ctor()+0xc: call without frame pointer save/setup > > > > > init/version.o: warning: objtool: asan.module_dtor()+0xc: call without frame pointer save/setup > > > > > certs/system_keyring.o: warning: objtool: asan.module_ctor()+0xc: call without frame pointer save/setup > > > > > certs/system_keyring.o: warning: objtool: asan.module_dtor()+0xc: call without frame pointer save/setup > > > > > > This one also appears with Clang 11. This is new I think because we > > > started emitting ASAN ctors for globals redzone initialization. > > > > > > I think we really do not care about precise stack frames in these > > > compiler-generated functions. So, would it be reasonable to make > > > objtool ignore all *san.module_ctor and *san.module_dtor functions (we > > > have them for ASAN, TSAN, MSAN)? > > > > The thing is, if objtool cannot follow, it cannot generate ORC data and > > our unwinder cannot unwind through the instrumentation, and that is a > > fail. > > > > Or am I missing something here? > > They aren't about the actual instrumentation. The warnings are about > module_ctor/module_dtor functions which are compiler-generated, and > these are only called on initialization/destruction (dtors only for > modules I guess). > > E.g. 
for KASAN it's the calls to __asan_register_globals that are > called from asan.module_ctor. For KCSAN the tsan.module_ctor is > effectively a noop (because __tsan_init() is a noop), so it really > doesn't matter much. > > Is my assumption correct that the only effect would be if something > called by them fails, we just don't see the full stack trace? I think > we can live with that, there are only few central places that deal > with ctors/dtors (do_ctors(), ...?). > > The "real" fix would be to teach the compilers about "frame pointer > save/setup" for generated functions, but I don't think that's > realistic. So this has come up before, specifically in the context of gcov: https://github.com/ClangBuiltLinux/linux/issues/955. I looked into this a bit, and IIRC, the issue was that compiler generated functions aren't very good about keeping track of whether they should or should not emit framepointer setup/teardown prolog/epilogs. In LLVM's IR, -fno-omit-frame-pointer gets attached to every function as a function level attribute. https://godbolt.org/z/fcn9c6 ("frame-pointer"="all"). There were some recent LLVM patches for BTI (arm64) that made some BTI related command line flags module level attributes, which I thought was interesting; I was wondering last night if -fno-omit-frame-pointer and maybe even the level of stack protector should be? I guess LTO would complicate things; not sure it would be good to merge modules with different attributes; I'm not sure how that's handled today in LLVM. Basically, when the compiler is synthesizing a new function definition, it should check whether a frame pointer should be emitted or not. We could do that today by maybe scanning all other function definitions for the presence of "frame-pointer"="all" fn attr, breaking early if we find one, and emitting the frame pointer setup in that case. 
Though I guess it's "frame-pointer"="none" otherwise, so maybe checking any other fn def would be fine; I don't see any C fn attr's that allow you to keep frame pointers or not. What's tricky is that the front end flag was resolved much earlier than where this code gets generated, so it would need to look for traces that the flag ever existed, which sounds brittle on paper to me. Thanks for the summary -- yeah, that was my suspicion, that some attribute was being lost somewhere. And I think if we generalize this, and don't just try to attach "frame-pointer" attr to the function, we probably also solve the BTI issue that Mark still pointed out with these module_ctor/dtors. I was trying to see if there was a generic way to attach all the common attributes to the function generated here: https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Utils/ModuleUtils.cpp#L122 -- but we probably can't attach all attributes, and need to remove a bunch of them again like the sanitizers (or alternatively just select the ones we need). But, I'm still digging for the function that attaches all the common attributes... Thanks, -- Marco Speaking of gcov, do
Re: [PATCH v2 2/7] Revert "kbuild: disable clang's default use of -fmerge-all-constants"
On 2020-08-31, Nathan Chancellor wrote:
> On Mon, Aug 31, 2020 at 05:23:21PM -0700, Nick Desaulniers wrote:
>> This reverts commit 87e0d4f0f37fb0c8c4aeeac46fff5e957738df79.
>>
>> This was fixed in clang-6; the minimum supported version of clang in the
>> kernel is clang-10 (10.0.1).
>>
>> Link: https://reviews.llvm.org/rL329300.
>> Link: https://github.com/ClangBuiltLinux/linux/issues/9
>> Suggested-by: Nathan Chancellor
>> Signed-off-by: Nick Desaulniers
>
> Reviewed-by: Nathan Chancellor

How about expanding "This was fixed in clang-6" to be
-fno-merge-all-constants has been the default since clang-6?

(Both gcc|clang -fmerge-all-constants can cause an assertion failure for the
example on https://bugs.llvm.org/show_bug.cgi?id=18538 )

Reviewed-by: Fangrui Song

>> ---
>> Makefile | 9 -
>> 1 file changed, 9 deletions(-)
>>
>> diff --git a/Makefile b/Makefile
>> index 37739ee53f27..144ac6a073ff 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -932,15 +932,6 @@ KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized)
>>  # disable invalid "can't wrap" optimizations for signed / pointers
>>  KBUILD_CFLAGS += $(call cc-option,-fno-strict-overflow)
>> -# clang sets -fmerge-all-constants by default as optimization, but this
>> -# is non-conforming behavior for C and in fact breaks the kernel, so we
>> -# need to disable it here generally.
>> -KBUILD_CFLAGS += $(call cc-option,-fno-merge-all-constants)
>> -
>> -# for gcc -fno-merge-all-constants disables everything, but it is fine
>> -# to have actual conforming behavior enabled.
>> -KBUILD_CFLAGS += $(call cc-option,-fmerge-constants)
>> -
>>  # Make sure -fstack-check isn't enabled (like gentoo apparently did)
>>  KBUILD_CFLAGS += $(call cc-option,-fno-stack-check,)
>> --
>> 2.28.0.402.g5ffc5be6b7-goog
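Illustration (not part of the original message): the "non-conforming behavior" mentioned in the removed Makefile comment can be sketched in a few lines of C. C requires distinct objects to have distinct addresses, while -fmerge-all-constants allows the compiler to fold identical constant objects into one. The names below are hypothetical and the snippet is not taken from the kernel or from the linked bug report:

```c
/* Two distinct objects that happen to hold identical contents. */
static const int table_a[] = { 1, 2, 3 };
static const int table_b[] = { 1, 2, 3 };

/*
 * Returns 1 when the two arrays live at different addresses, which is
 * what the C standard requires. With -fmerge-all-constants the compiler
 * is allowed to merge them, in which case this may return 0.
 */
int constants_are_distinct(void)
{
	return table_a != table_b;
}
```

Code that uses object addresses as identities (for example, comparing a pointer against a sentinel table) is the kind of code that can be affected when the flag is on.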
Re: [PATCH v5 23/36] arm/build: Explicitly keep .ARM.attributes sections
On 2020-08-03, 'Nick Desaulniers' via Clang Built Linux wrote:
> On Fri, Jul 31, 2020 at 4:18 PM Kees Cook wrote:
>> In preparation for adding --orphan-handling=warn, explicitly keep the
>> .ARM.attributes section by expanding the existing ELF_DETAILS macro into
>> ARM_DETAILS.
>>
>> Suggested-by: Nick Desaulniers
>> Link: https://lore.kernel.org/lkml/cakwvodk-racgq5pxsogs6vtifbtrk5fmkmnolxrqmaovv0n...@mail.gmail.com/
>> Signed-off-by: Kees Cook
>> ---
>> arch/arm/include/asm/vmlinux.lds.h | 4
>> arch/arm/kernel/vmlinux-xip.lds.S | 2 +-
>> arch/arm/kernel/vmlinux.lds.S | 2 +-
>> 3 files changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/vmlinux.lds.h b/arch/arm/include/asm/vmlinux.lds.h
>> index a08f4301b718..c4af5182ab48 100644
>> --- a/arch/arm/include/asm/vmlinux.lds.h
>> +++ b/arch/arm/include/asm/vmlinux.lds.h
>> @@ -52,6 +52,10 @@
>>  ARM_MMU_DISCARD(*(__ex_table)) \
>>  COMMON_DISCARDS
>> +#define ARM_DETAILS \
>> + ELF_DETAILS \
>> + .ARM.attributes 0 : { *(.ARM.attributes) }
>
> I had to look up what the `0` meant:
> https://sourceware.org/binutils/docs/ld/Output-Section-Attributes.html#Output-Section-Attributes
> mentions it's an "address" and
> https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_chapter/ld_3.html#SEC21
> mentions it as "start" (an address).
>
> Unless we need those, can we drop them? (Sorry for the resulting churn
> that would cause). I think the NO_LOAD stuff makes more sense, but I'm
> curious if the kernel checks for that.

NOLOAD means SHT_NOBITS (usually SHF_ALLOC). .ARM.attributes is a
non-SHF_ALLOC section. An explicit 0 (the output section address) is good:
GNU ld's internal linker scripts (ld --verbose output) use 0 for such
non-SHF_ALLOC sections. Without the 0, the section may get a non-zero
address, which is not wrong but does not look good.

See https://reviews.llvm.org/D85867 for details.

Reviewed-by: Fangrui Song

>> +
>>  #define ARM_STUBS_TEXT \
>>  *(.gnu.warning) \
>>  *(.glue_7) \
>> diff --git a/arch/arm/kernel/vmlinux-xip.lds.S b/arch/arm/kernel/vmlinux-xip.lds.S
>> index 904c31fa20ed..57fcbf55f913 100644
>> --- a/arch/arm/kernel/vmlinux-xip.lds.S
>> +++ b/arch/arm/kernel/vmlinux-xip.lds.S
>> @@ -150,7 +150,7 @@ SECTIONS
>>  _end = .;
>>  STABS_DEBUG
>> - ELF_DETAILS
>> + ARM_DETAILS
>>  }
>>  /*
>> diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
>> index bb950c896a67..1d3d3b599635 100644
>> --- a/arch/arm/kernel/vmlinux.lds.S
>> +++ b/arch/arm/kernel/vmlinux.lds.S
>> @@ -149,7 +149,7 @@ SECTIONS
>>  _end = .;
>>  STABS_DEBUG
>> - ELF_DETAILS
>> + ARM_DETAILS
>>  }
>>  #ifdef CONFIG_STRICT_KERNEL_RWX
>> --
>> 2.25.1
Re: [PATCH v2] lib/string.c: implement stpcpy
On 2020-08-15, 'Nick Desaulniers' via Clang Built Linux wrote: On Sat, Aug 15, 2020 at 2:31 PM Joe Perches wrote: On Sat, 2020-08-15 at 14:28 -0700, Nick Desaulniers wrote: > On Sat, Aug 15, 2020 at 2:24 PM Joe Perches wrote: > > On Sat, 2020-08-15 at 13:47 -0700, Nick Desaulniers wrote: > > > On Sat, Aug 15, 2020 at 9:34 AM Kees Cook wrote: > > > > On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote: > > > > > LLVM implemented a recent "libcall optimization" that lowers calls to > > > > > `sprintf(dest, "%s", str)` where the return value is used to > > > > > `stpcpy(dest, str) - dest`. This generally avoids the machinery involved > > > > > in parsing format strings. Calling `sprintf` with overlapping arguments > > > > > was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior. > > > > > > > > > > `stpcpy` is just like `strcpy` except it returns the pointer to the new > > > > > tail of `dest`. This allows you to chain multiple calls to `stpcpy` in > > > > > one statement. > > > > > > > > O_O What? > > > > > > > > No; this is a _terrible_ API: there is no bounds checking, there are no > > > > buffer sizes. Anything using the example sprintf() pattern is _already_ > > > > wrong and must be removed from the kernel. (Yes, I realize that the > > > > kernel is *filled* with this bad assumption that "I'll never write more > > > > than PAGE_SIZE bytes to this buffer", but that's both theoretically > > > > wrong ("640k is enough for anybody") and has been known to be wrong in > > > > practice too (e.g. when suddenly your writing routine is reachable by > > > > splice(2) and you may not have a PAGE_SIZE buffer). > > > > > > > > But we cannot _add_ another dangerous string API. We're already in a > > > > terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This > > > > needs to be addressed up by removing the unbounded sprintf() uses. 
(And > > > > to do so without introducing bugs related to using snprintf() when > > > > scnprintf() is expected[4].) > > > > > > Well, everything (-next, mainline, stable) is broken right now (with > > > ToT Clang) without providing this symbol. I'm not going to go clean > > > the entire kernel's use of sprintf to get our CI back to being green. > > > > Maybe this should get place in compiler-clang.h so it isn't > > generic and public. > > https://bugs.llvm.org/show_bug.cgi?id=47162#c7 and > https://bugs.llvm.org/show_bug.cgi?id=47144 > Seem to imply that Clang is not the only compiler that can lower a > sequence of libcalls to stpcpy. Do we want to wait until we have a > fire drill w/ GCC to move such an implementation from > include/linux/compiler-clang.h back in to lib/string.c? My guess is yes, wait until gcc, if ever, needs it. The suggestion to use static inline doesn't even make sense. The compiler is lowering calls to other library routines; `stpcpy` isn't being explicitly called. Even if it was, not sure we want it being inlined. No symbol definition will be emitted; problem not solved. And I refuse to add any more code using `extern inline`. Putting the definition in lib/string.c is the most straightforward and avoids revisiting this issue in the future for other toolchains. I'll limit access by removing the declaration, and adding a comment to avoid its use. But if you're going to use a gnu target triple without using -ffreestanding because you *want* libcall optimizations, then you have to provide symbols for all possible library routines! Adding a definition without a declaration for stpcpy looks good. Clang LTO will work. (If the kernel does not want to provide these routines, is http://git.kernel.org/linus/6edfba1b33c701108717f4e036320fc39abe1912 probably wrong? (why remove -ffreestanding from the main Makefile) )
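Illustration (not part of the original thread): the lowering discussed above rewrites `sprintf(dest, "%s", str)`, when the return value is used, into `stpcpy(dest, str) - dest`. A minimal C sketch of that equivalence follows. The function is named `my_stpcpy` to avoid clashing with the libc symbol, and all three helper names are hypothetical; this is not the implementation from the patch. As the thread points out, nothing here checks buffer sizes.

```c
#include <stdio.h>

/*
 * Illustrative stand-in for stpcpy. Like strcpy it performs no bounds
 * checking; unlike strcpy it returns a pointer to the terminating NUL
 * that was written into dest.
 */
static char *my_stpcpy(char *dest, const char *src)
{
	while ((*dest = *src) != '\0') {
		dest++;
		src++;
	}
	return dest;
}

/* Length as reported by the original sprintf() call. */
static int copy_len_sprintf(char *dest, const char *src)
{
	return sprintf(dest, "%s", src);
}

/* The same length, computed the way the lowered call computes it. */
static int copy_len_stpcpy(char *dest, const char *src)
{
	return (int)(my_stpcpy(dest, src) - dest);
}
```

Because the return value points at the new tail, calls can be chained, e.g. `my_stpcpy(my_stpcpy(buf, "foo"), "bar")` leaves "foobar" in `buf`, provided `buf` is large enough.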
Re: [PATCH v3] Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang cross compilation
On 2020-07-22, Masahiro Yamada wrote: On Wed, Jul 22, 2020 at 9:14 AM Fangrui Song wrote: On 2020-07-22, Masahiro Yamada wrote: >On Wed, Jul 22, 2020 at 2:31 AM 'Fangrui Song' via Clang Built Linux > wrote: >> >> When CROSS_COMPILE is set (e.g. aarch64-linux-gnu-), if >> $(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu-elfedit, >> GCC_TOOLCHAIN_DIR will be set to /usr/bin/. --prefix= will be set to >> /usr/bin/ and Clang as of 11 will search for both >> $(prefix)aarch64-linux-gnu-$needle and $(prefix)$needle. >> >> GCC searchs for $(prefix)aarch64-linux-gnu/$version/$needle, >> $(prefix)aarch64-linux-gnu/$needle and $(prefix)$needle. In practice, >> $(prefix)aarch64-linux-gnu/$needle rarely contains executables. >> >> To better model how GCC's -B/--prefix takes in effect in practice, newer >> Clang (since >> https://github.com/llvm/llvm-project/commit/3452a0d8c17f7166f479706b293caf6ac76ffd90) >> only searches for $(prefix)$needle. Currently it will find /usr/bin/as >> instead of /usr/bin/aarch64-linux-gnu-as. >> >> Set --prefix= to $(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE) >> (/usr/bin/aarch64-linux-gnu-) so that newer Clang can find the >> appropriate cross compiling GNU as (when -no-integrated-as is in >> effect). >> >> Cc: sta...@vger.kernel.org >> Reported-by: Nathan Chancellor >> Signed-off-by: Fangrui Song >> Reviewed-by: Nathan Chancellor >> Tested-by: Nathan Chancellor >> Tested-by: Nick Desaulniers >> Link: https://github.com/ClangBuiltLinux/linux/issues/1099 >> --- >> Changes in v2: >> * Updated description to add tags and the llvm-project commit link. >> * Fixed a typo. 
>> >> Changes in v3: >> * Add Cc: sta...@vger.kernel.org >> --- >> Makefile | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/Makefile b/Makefile >> index 0b5f8538bde5..3ac83e375b61 100644 >> --- a/Makefile >> +++ b/Makefile >> @@ -567,7 +567,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),) >> ifneq ($(CROSS_COMPILE),) >> CLANG_FLAGS+= --target=$(notdir $(CROSS_COMPILE:%-=%)) >> GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit)) >> -CLANG_FLAGS+= --prefix=$(GCC_TOOLCHAIN_DIR) >> +CLANG_FLAGS+= --prefix=$(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE) > > > >CROSS_COMPILE may contain the directory path >to the cross toolchains. > > >For example, I use aarch64-linux-gnu-* >installed in >/home/masahiro/tools/aarch64-linaro-7.5/bin > > > >Basically, there are two ways to use it. > >[1] >PATH=$PATH:/home/masahiro/tools/aarch64-linaro-7.5/bin >CROSS_COMPILE=aarch64-linux-gnu- > > >[2] >Without setting PATH, >CROSS_COMPILE=~/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu- > > > >I usually do [2] (and so does intel's 0day bot). > > > >This patch works for the use-case [1] >but if I do [2], --prefix is set to a strange path: > >--prefix=/home/masahiro/tools/aarch64-linaro-7.5/bin//home/masahiro/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu- Thanks. I did not know the use-case [2]. This explains why there is a `$(notdir ...)` in `CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%))` > > >Interestingly, the build is still successful. >Presumably Clang searches for more paths >when $(prefix)$needle is not found ? The priority order is: -B(--prefix), COMPILER_PATH, detected gcc-cross paths (In GCC, -B paths get prepended to the COMPILER_PATH list. Clang<=11 incorrectly adds -B to the COMPILER_PATH list. 
I have fixed it for 12.0.0) If -B fails, the detected gcc-cross paths may still be able to find /usr/bin/aarch64-linux-gnu- For example, on my machine (a variant of Debian testing), Clang finds $(realpath /usr/lib/gcc-cross/aarch64-linux-gnu/9/../../../../aarch64-linux-gnu/bin/as), which is /usr/bin/aarch64-linux-gnu-as > >I applied your patch and added -v option >to see which assembler was internally invoked: > > "/home/masahiro/tools/aarch64-linaro-7.5/lib/gcc/aarch64-linux-gnu/7.5.0/../../../../aarch64-linux-gnu/bin/as" >-EL -I ./arch/arm64/include -I ./arch/arm64/include/generated -I >./include -I ./arch/arm64/include/uapi -I >./arch/arm64/include/generated/uapi -I ./include/uapi -I >./include/generated/uapi -o kernel/smp.o /tmp/smp-2ec2c7.s > > >Ok, it looks like Clang found an alternative path >to the correct 'as'. > > > > >But, to keep the original behavior for both [1] and [2], >how about this? > >CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(notdir $(CROSS_COMPILE)) > > > >Then, I can get this: > > "/home/masah
Re: [PATCH v3] Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang cross compilation
On 2020-07-22, Masahiro Yamada wrote: On Wed, Jul 22, 2020 at 2:31 AM 'Fangrui Song' via Clang Built Linux wrote: When CROSS_COMPILE is set (e.g. aarch64-linux-gnu-), if $(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu-elfedit, GCC_TOOLCHAIN_DIR will be set to /usr/bin/. --prefix= will be set to /usr/bin/ and Clang as of 11 will search for both $(prefix)aarch64-linux-gnu-$needle and $(prefix)$needle. GCC searchs for $(prefix)aarch64-linux-gnu/$version/$needle, $(prefix)aarch64-linux-gnu/$needle and $(prefix)$needle. In practice, $(prefix)aarch64-linux-gnu/$needle rarely contains executables. To better model how GCC's -B/--prefix takes in effect in practice, newer Clang (since https://github.com/llvm/llvm-project/commit/3452a0d8c17f7166f479706b293caf6ac76ffd90) only searches for $(prefix)$needle. Currently it will find /usr/bin/as instead of /usr/bin/aarch64-linux-gnu-as. Set --prefix= to $(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE) (/usr/bin/aarch64-linux-gnu-) so that newer Clang can find the appropriate cross compiling GNU as (when -no-integrated-as is in effect). Cc: sta...@vger.kernel.org Reported-by: Nathan Chancellor Signed-off-by: Fangrui Song Reviewed-by: Nathan Chancellor Tested-by: Nathan Chancellor Tested-by: Nick Desaulniers Link: https://github.com/ClangBuiltLinux/linux/issues/1099 --- Changes in v2: * Updated description to add tags and the llvm-project commit link. * Fixed a typo. 
Changes in v3: * Add Cc: sta...@vger.kernel.org --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 0b5f8538bde5..3ac83e375b61 100644 --- a/Makefile +++ b/Makefile @@ -567,7 +567,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),) ifneq ($(CROSS_COMPILE),) CLANG_FLAGS+= --target=$(notdir $(CROSS_COMPILE:%-=%)) GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit)) -CLANG_FLAGS+= --prefix=$(GCC_TOOLCHAIN_DIR) +CLANG_FLAGS+= --prefix=$(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE) CROSS_COMPILE may contain the directory path to the cross toolchains. For example, I use aarch64-linux-gnu-* installed in /home/masahiro/tools/aarch64-linaro-7.5/bin Basically, there are two ways to use it. [1] PATH=$PATH:/home/masahiro/tools/aarch64-linaro-7.5/bin CROSS_COMPILE=aarch64-linux-gnu- [2] Without setting PATH, CROSS_COMPILE=~/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu- I usually do [2] (and so does intel's 0day bot). This patch works for the use-case [1] but if I do [2], --prefix is set to a strange path: --prefix=/home/masahiro/tools/aarch64-linaro-7.5/bin//home/masahiro/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu- Thanks. I did not know the use-case [2]. This explains why there is a `$(notdir ...)` in `CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%))` Interestingly, the build is still successful. Presumably Clang searches for more paths when $(prefix)$needle is not found ? The priority order is: -B(--prefix), COMPILER_PATH, detected gcc-cross paths (In GCC, -B paths get prepended to the COMPILER_PATH list. Clang<=11 incorrectly adds -B to the COMPILER_PATH list. 
I have fixed it for 12.0.0) If -B fails, the detected gcc-cross paths may still be able to find /usr/bin/aarch64-linux-gnu- For example, on my machine (a variant of Debian testing), Clang finds $(realpath /usr/lib/gcc-cross/aarch64-linux-gnu/9/../../../../aarch64-linux-gnu/bin/as), which is /usr/bin/aarch64-linux-gnu-as I applied your patch and added -v option to see which assembler was internally invoked: "/home/masahiro/tools/aarch64-linaro-7.5/lib/gcc/aarch64-linux-gnu/7.5.0/../../../../aarch64-linux-gnu/bin/as" -EL -I ./arch/arm64/include -I ./arch/arm64/include/generated -I ./include -I ./arch/arm64/include/uapi -I ./arch/arm64/include/generated/uapi -I ./include/uapi -I ./include/generated/uapi -o kernel/smp.o /tmp/smp-2ec2c7.s Ok, it looks like Clang found an alternative path to the correct 'as'. But, to keep the original behavior for both [1] and [2], how about this? CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(notdir $(CROSS_COMPILE)) Then, I can get this: "/home/masahiro/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu-as" -EL -I ./arch/arm64/include -I ./arch/arm64/include/generated -I ./include -I ./arch/arm64/include/uapi -I ./arch/arm64/include/generated/uapi -I ./include/uapi -I ./include/generated/uapi -o kernel/smp.o /tmp/smp-16d76f.s This looks good. Agreed that `--prefix=$(GCC_TOOLCHAIN_DIR)$(notdir $(CROSS_COMPILE))` should work for both [1] and [2]. Shall I send a v4? Or you are kind enough that you'll just add your Signed-off-by: tag and fix that for me? :) GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..) endif ifneq ($(GCC_TOOLCHAIN),) -- 2.28.0.rc0.105.gf9edc3c819-goog -- You received this message because you are subscribed to the Google Groups "Clang Built Linux" group. To unsubscribe from this group
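Illustration (not part of the original thread): why `$(notdir $(CROSS_COMPILE))` yields the same --prefix for use cases [1] and [2] can be modeled outside Make. The C sketch below mimics Make's `$(notdir ...)`; `notdir` and `make_prefix` are hypothetical helpers, and the toolchain directory is passed in directly rather than discovered with `which`, so this only models the string handling, not the kernel build.

```c
#include <stdio.h>
#include <string.h>

/* Mimics Make's $(notdir x): everything after the last '/'. */
static const char *notdir(const char *path)
{
	const char *slash = strrchr(path, '/');

	return slash ? slash + 1 : path;
}

/*
 * Builds the value proposed for --prefix=:
 *   $(GCC_TOOLCHAIN_DIR)$(notdir $(CROSS_COMPILE))
 * toolchain_dir stands in for $(dir $(shell which $(CROSS_COMPILE)elfedit))
 * and is expected to end with '/'. Returns 0 on success, -1 on truncation.
 */
static int make_prefix(char *out, size_t outlen,
		       const char *toolchain_dir, const char *cross_compile)
{
	int n = snprintf(out, outlen, "%s%s", toolchain_dir,
			 notdir(cross_compile));

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With the directory /home/masahiro/tools/aarch64-linaro-7.5/bin/ from the thread, passing either `aarch64-linux-gnu-` (case [1]) or the full path ending in `aarch64-linux-gnu-` (case [2], with `~` already expanded) gives the same prefix, whereas concatenating the unmodified CROSS_COMPILE duplicates the directory in case [2].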
[PATCH v3] Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang cross compilation
When CROSS_COMPILE is set (e.g. aarch64-linux-gnu-), if
$(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu-elfedit,
GCC_TOOLCHAIN_DIR will be set to /usr/bin/. --prefix= will be set to
/usr/bin/ and Clang as of 11 will search for both
$(prefix)aarch64-linux-gnu-$needle and $(prefix)$needle.

GCC searches for $(prefix)aarch64-linux-gnu/$version/$needle,
$(prefix)aarch64-linux-gnu/$needle and $(prefix)$needle. In practice,
$(prefix)aarch64-linux-gnu/$needle rarely contains executables.

To better model how GCC's -B/--prefix takes effect in practice, newer
Clang (since
https://github.com/llvm/llvm-project/commit/3452a0d8c17f7166f479706b293caf6ac76ffd90)
only searches for $(prefix)$needle. Currently it will find /usr/bin/as
instead of /usr/bin/aarch64-linux-gnu-as.

Set --prefix= to $(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
(/usr/bin/aarch64-linux-gnu-) so that newer Clang can find the
appropriate cross compiling GNU as (when -no-integrated-as is in
effect).

Cc: sta...@vger.kernel.org
Reported-by: Nathan Chancellor
Signed-off-by: Fangrui Song
Reviewed-by: Nathan Chancellor
Tested-by: Nathan Chancellor
Tested-by: Nick Desaulniers
Link: https://github.com/ClangBuiltLinux/linux/issues/1099
---
Changes in v2:
* Updated description to add tags and the llvm-project commit link.
* Fixed a typo.

Changes in v3:
* Add Cc: sta...@vger.kernel.org
---
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 0b5f8538bde5..3ac83e375b61 100644
--- a/Makefile
+++ b/Makefile
@@ -567,7 +567,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
 ifneq ($(CROSS_COMPILE),)
 CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%))
 GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit))
-CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)
+CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
 GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..)
 endif
 ifneq ($(GCC_TOOLCHAIN),)
--
2.28.0.rc0.105.gf9edc3c819-goog
[PATCH v2] Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang cross compilation
When CROSS_COMPILE is set (e.g. aarch64-linux-gnu-), if
$(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu-elfedit,
GCC_TOOLCHAIN_DIR will be set to /usr/bin/. --prefix= will be set to
/usr/bin/ and Clang as of 11 will search for both
$(prefix)aarch64-linux-gnu-$needle and $(prefix)$needle.

GCC searches for $(prefix)aarch64-linux-gnu/$version/$needle,
$(prefix)aarch64-linux-gnu/$needle and $(prefix)$needle. In practice,
$(prefix)aarch64-linux-gnu/$needle rarely contains executables.

To better model how GCC's -B/--prefix takes effect in practice, newer
Clang (since
https://github.com/llvm/llvm-project/commit/3452a0d8c17f7166f479706b293caf6ac76ffd90)
only searches for $(prefix)$needle. Currently it will find /usr/bin/as
instead of /usr/bin/aarch64-linux-gnu-as.

Set --prefix= to $(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
(/usr/bin/aarch64-linux-gnu-) so that newer Clang can find the
appropriate cross compiling GNU as (when -no-integrated-as is in
effect).

Reported-by: Nathan Chancellor
Signed-off-by: Fangrui Song
Reviewed-by: Nathan Chancellor
Tested-by: Nathan Chancellor
Tested-by: Nick Desaulniers
Link: https://github.com/ClangBuiltLinux/linux/issues/1099
---
Changes in v2:
* Updated description to add tags and the llvm-project commit link.
* Fixed a typo.
---
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 0b5f8538bde5..3ac83e375b61 100644
--- a/Makefile
+++ b/Makefile
@@ -567,7 +567,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
 ifneq ($(CROSS_COMPILE),)
 CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%))
 GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit))
-CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)
+CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
 GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..)
 endif
 ifneq ($(GCC_TOOLCHAIN),)
--
2.28.0.rc0.105.gf9edc3c819-goog
Re: [PATCH] Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang cross compilation
On 2020-07-20, Nick Desaulniers wrote: On Mon, Jul 20, 2020 at 11:16 AM Nathan Chancellor wrote: On Mon, Jul 20, 2020 at 11:12:22AM -0700, Fangrui Song wrote: > When CROSS_COMPILE is set (e.g. aarch64-linux-gnu-), if > $(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu-, > GCC_TOOLCHAIN_DIR will be set to /usr/bin/. --prefix= will be set to > /usr/bin/ and Clang as of 11 will search for both > $(prefix)aarch64-linux-gnu-$needle and $(prefix)$needle. > > GCC searchs for $(prefix)aarch64-linux-gnu/$version/$needle, > $(prefix)aarch64-linux-gnu/$needle and $(prefix)$needle. In practice, > $(prefix)aarch64-linux-gnu/$needle rarely contains executables. > > To better model how GCC's -B/--prefix takes in effect in practice, newer > Clang only searches for $(prefix)$needle and for example it will find "newer Clang" requires the reader to recall that "Clang as of 11" was the previous frame of reference. I think it would be clearer to: 1. call of clang-12 as having a difference in behavior. 2. explicitly link to the commit, ie: Link: https://github.com/llvm/llvm-project/commit/3452a0d8c17f7166f479706b293caf6ac76ffd90 > /usr/bin/as instead of /usr/bin/aarch64-linux-gnu-as. That's a common source of pain (for example, when cross compiling without having the proper cross binutils installed, it's common to get spooky errors about unsupported configs or host binutils not recognizing flags specific to cross building). /usr/bin/as: unrecognized option '-EL' being the most common. So in that case, I'm actually very happy with the llvm change if it solves that particularly common pain point. > > Set --prefix= to $(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE) > (/usr/bin/aarch64-linux-gnu-) so that newer Clang can find the > appropriate cross compiling GNU as (when -no-integrated-as is in > effect). 
> > Signed-off-by: Nathan Chancellor > Signed-off-by: Fangrui Song > Link: https://github.com/ClangBuiltLinux/linux/issues/1099 Sorry that I did not pay attention before but this needs Cc: sta...@vger.kernel.org Agreed. This change to llvm will blow up all of our CI jobs that cross compile if not backported to stable. Thanks. I did not know this. in the body so that it gets automatically backported into all of our stable branches. I am not sure if Masahiro is okay with adding that after the fact or if he will want a v2. I am fine with having my signed-off-by on the patch but I did not really do much :) I am fine with having that downgraded to Not a big deal, but there's only really two cases I can think of where it's appropriate to attach someone else's "SOB" to a patch: 1. It's their patch that you're resending on their behalf, possibly as part of a larger series. 2. You're a maintainer, and...well I guess that's also case 1 above. Reported-by is more appropriate, and you can include the tags collected from this thread. Please ping me internally for help sending a v2, if needed. Nathan's draft patch on https://github.com/ClangBuiltLinux/linux/issues/1099 actually works. I removed the slash. The words are my own. Since Nathan explicitly requested a downgrade of his tag, I'll do that for V2. I'll do that anyway because I need to fix a typo in the description: $(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu- $(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu-elfedit Reviewed-by: Nathan Chancellor Tested-by: Nathan Chancellor I tested with this llvm pre- and post- https://github.com/llvm/llvm-project/commit/3452a0d8c17f7166f479706b293caf6ac76ffd90. I also tested downstream Android kernel builds with 3452a0d8c17f7166f479706b293caf6ac76ffd90. Builds that don't make use of CROSS_COMPILE (native host targets) are obviously unaffected. 
We might see this issue pop up a few more times internally if the patch isn't picked up by stable, or if those downstream kernel trees don't rebase on stable kernel trees as often as they refresh their toolchain. Tested-by: Nick Desaulniers Thanks for offering proofreading service! I'm working on the description... if people find it odd. Thanks for sending this along! Cheers, Nathan > --- > Makefile | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/Makefile b/Makefile > index 0b5f8538bde5..3ac83e375b61 100644 > --- a/Makefile > +++ b/Makefile > @@ -567,7 +567,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),) > ifneq ($(CROSS_COMPILE),) > CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%)) > GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit)) > -CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR) > +CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE) > GCC_TOOLCHAIN:= $(realpath $(GCC_TOOLCHAIN_DIR)/..) > endif > ifneq ($(GCC_TOOLCHAIN),) > -- > 2.28.0.rc0.105.gf9edc3c819-goog > -- -- Thanks, ~Nick Desaulniers
[PATCH] Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang cross compilation
When CROSS_COMPILE is set (e.g. aarch64-linux-gnu-), if $(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu-elfedit, GCC_TOOLCHAIN_DIR will be set to /usr/bin/. --prefix= will be set to /usr/bin/ and Clang as of 11 will search for both $(prefix)aarch64-linux-gnu-$needle and $(prefix)$needle.

GCC searches for $(prefix)aarch64-linux-gnu/$version/$needle, $(prefix)aarch64-linux-gnu/$needle and $(prefix)$needle. In practice, $(prefix)aarch64-linux-gnu/$needle rarely contains executables.

To better model how GCC's -B/--prefix takes effect in practice, newer Clang only searches for $(prefix)$needle and for example it will find /usr/bin/as instead of /usr/bin/aarch64-linux-gnu-as.

Set --prefix= to $(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE) (/usr/bin/aarch64-linux-gnu-) so that newer Clang can find the appropriate cross compiling GNU as (when -no-integrated-as is in effect).

Signed-off-by: Nathan Chancellor
Signed-off-by: Fangrui Song
Link: https://github.com/ClangBuiltLinux/linux/issues/1099
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 0b5f8538bde5..3ac83e375b61 100644
--- a/Makefile
+++ b/Makefile
@@ -567,7 +567,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
 ifneq ($(CROSS_COMPILE),)
 CLANG_FLAGS	+= --target=$(notdir $(CROSS_COMPILE:%-=%))
 GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit))
-CLANG_FLAGS	+= --prefix=$(GCC_TOOLCHAIN_DIR)
+CLANG_FLAGS	+= --prefix=$(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
 GCC_TOOLCHAIN	:= $(realpath $(GCC_TOOLCHAIN_DIR)/..)
 endif
 ifneq ($(GCC_TOOLCHAIN),)
--
2.28.0.rc0.105.gf9edc3c819-goog
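
The lookup difference described in this patch can be illustrated with a small model. This is only a sketch under stated assumptions: the function names (candidate_paths, resolve) are hypothetical, the probe order for older Clang is assumed rather than taken from the driver source, and the set of installed files is made up for the example. It is not Clang's actual driver code.

```python
def candidate_paths(prefix, triple, needle, newer_clang):
    """List the paths probed for `needle` under --prefix=<prefix>.

    Older Clang (as of 11, per the patch description) tries both
    $(prefix)$(triple)-$needle and $(prefix)$needle; the order used
    here is an assumption. Newer Clang tries only $(prefix)$needle.
    """
    if newer_clang:
        return [prefix + needle]
    return [prefix + triple + "-" + needle, prefix + needle]


def resolve(prefix, triple, needle, newer_clang, installed):
    """Return the first candidate that exists in `installed`, else None."""
    for path in candidate_paths(prefix, triple, needle, newer_clang):
        if path in installed:
            return path
    return None


# Hypothetical host with both a native and a cross assembler installed.
INSTALLED = {"/usr/bin/as", "/usr/bin/aarch64-linux-gnu-as"}
TRIPLE = "aarch64-linux-gnu"

OLD_PREFIX = "/usr/bin/"                    # --prefix=$(GCC_TOOLCHAIN_DIR)
NEW_PREFIX = "/usr/bin/aarch64-linux-gnu-"  # ...$(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)

if __name__ == "__main__":
    for prefix in (OLD_PREFIX, NEW_PREFIX):
        for newer in (False, True):
            print(prefix, "newer" if newer else "older",
                  resolve(prefix, TRIPLE, "as", newer, INSTALLED))
```

In this model the old prefix makes newer Clang pick the native /usr/bin/as, while the new prefix resolves to the cross assembler for both older and newer Clang, which is the behaviour the one-line Makefile change aims for.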
Re: Plumbers session on GNU+LLVM collab?
On 2020-07-09, 'Nick Desaulniers' via Clang Built Linux wrote:

Hi Segher, Rasmus, and Ramana,

I am working on finalizing a proposal for an LLVM microconference at plumbers, which is focusing on a lot of issues we currently face on the LLVM side. I'd really like to host a session with more GNU toolchain developers to discuss collaboration more.

I was curious; are either of you planning on attending plumbers this year? If so, would such a session be interesting enough for you to attend?

Looks like a good idea. I am interested. Perhaps Tom Stellard, Jeremy Bennett, Nathan Sidwell and Iain Sandoe have some ideas. They have a talk about GCC/LLVM collaboration https://gcc.gnu.org/wiki/cauldron2019#cauldron2019talks.GCC_LLVM_Collaboration_BoF

I was curious too, who else we should explicitly invite? I ran a quick set analysis on who's contributed to both kernel and , and the list was much much bigger than I was expecting.

https://gist.github.com/nickdesaulniers/5330eea6f46dea93e7766bb03311d474

89 contributors to both linux and llvm
283 linux+gcc
159 linux+binutils

(No one to all four yet...also, not super scientific, since I'm using name+email for the set, and emails change. Point being I don't want to explicitly invite hundreds of people)

Might be worth sending an email to g...@gcc.gnu.org as well. This month's archive: https://sourceware.org/pipermail/gcc/2020-July/
Re: [PATCH v3 7/7] x86/boot: Check that there are no runtime relocations
* Ard Biesheuvel On Tue, 30 Jun 2020 at 01:34, Fangrui Song wrote: > > On 2020-06-29, Ard Biesheuvel wrote: > >On Mon, 29 Jun 2020 at 19:37, Fangrui Song wrote: > >> > >> On 2020-06-29, Arvind Sankar wrote: > >> >On Mon, Jun 29, 2020 at 09:20:31AM -0700, Kees Cook wrote: > >> >> On Mon, Jun 29, 2020 at 06:11:59PM +0200, Ard Biesheuvel wrote: > >> >> > On Mon, 29 Jun 2020 at 18:09, Kees Cook wrote: > >> >> > > > >> >> > > On Mon, Jun 29, 2020 at 10:09:28AM -0400, Arvind Sankar wrote: > >> >> > > > Add a linker script check that there are no runtime relocations, and > >> >> > > > remove the old one that tries to check via looking for specially-named > >> >> > > > sections in the object files. > >> >> > > > > >> >> > > > Drop the tests for -fPIE compiler option and -pie linker option, as they > >> >> > > > are available in all supported gcc and binutils versions (as well as > >> >> > > > clang and lld). > >> >> > > > > >> >> > > > Signed-off-by: Arvind Sankar > >> >> > > > Reviewed-by: Ard Biesheuvel > >> >> > > > Reviewed-by: Fangrui Song > >> >> > > > --- > >> >> > > > arch/x86/boot/compressed/Makefile | 28 +++--- > >> >> > > > arch/x86/boot/compressed/vmlinux.lds.S | 8 > >> >> > > > 2 files changed, 11 insertions(+), 25 deletions(-) > >> >> > > > >> >> > > Reviewed-by: Kees Cook > >> >> > > > >> >> > > question below ... > >> >> > > > >> >> > > > diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S > >> >> > > > index a4a4a59a2628..a78510046eec 100644 > >> >> > > > --- a/arch/x86/boot/compressed/vmlinux.lds.S > >> >> > > > +++ b/arch/x86/boot/compressed/vmlinux.lds.S > >> >> > > > @@ -42,6 +42,12 @@ SECTIONS > >> >> > > > *(.rodata.*) > >> >> > > > _erodata = . 
; > >> >> > > > } > >> >> > > > + .rel.dyn : { > >> >> > > > + *(.rel.*) > >> >> > > > + } > >> >> > > > + .rela.dyn : { > >> >> > > > + *(.rela.*) > >> >> > > > + } > >> >> > > > .got : { > >> >> > > > *(.got) > >> >> > > > } > >> >> > > > >> >> > > Should these be marked (INFO) as well? > >> >> > > > >> >> > > >> >> > Given that sections marked as (INFO) will still be emitted into the > >> >> > ELF image, it does not really make a difference to do this for zero > >> >> > sized sections. > >> >> > >> >> Oh, I misunderstood -- I though they were _not_ emitted; I see now what > >> >> you said was not allocated. So, disk space used for the .got.plt case, > >> >> but not memory space used. Sorry for the confusion! > >> >> > >> >> -Kees > >> > >> About output section type (INFO): > >> https://sourceware.org/binutils/docs/ld/Output-Section-Type.html#Output-Section-Type > >> says "These type names are supported for backward compatibility, and are > >> rarely used." > >> > >> If all input section don't have the SHF_ALLOC flag, the output section > >> will not have this flag as well. This type is not useful... > >> > >> If .got and .got.plt were used, they should be considered dynamic > >> relocations which should be part of the loadable image. So they should > >> have the SHF_ALLOC flag. (INFO) will not be applicable anyway. > >> > > > >I don't care deeply either way, but Kees indicated that he would like > >to get rid of the 24 bytes of .got.plt magic entries that we have no > >need for. > > > >In fact, a lot of this mangling is caused by the fact that the linker > >is creating a relocatable binary, and assumes that it is a hosted > >binary that is loaded by a dynamic loader. It would actually be much
Re: [PATCH v3 7/7] x86/boot: Check that there are no runtime relocations
On 2020-06-29, Ard Biesheuvel wrote: On Mon, 29 Jun 2020 at 19:37, Fangrui Song wrote: On 2020-06-29, Arvind Sankar wrote: >On Mon, Jun 29, 2020 at 09:20:31AM -0700, Kees Cook wrote: >> On Mon, Jun 29, 2020 at 06:11:59PM +0200, Ard Biesheuvel wrote: >> > On Mon, 29 Jun 2020 at 18:09, Kees Cook wrote: >> > > >> > > On Mon, Jun 29, 2020 at 10:09:28AM -0400, Arvind Sankar wrote: >> > > > Add a linker script check that there are no runtime relocations, and >> > > > remove the old one that tries to check via looking for specially-named >> > > > sections in the object files. >> > > > >> > > > Drop the tests for -fPIE compiler option and -pie linker option, as they >> > > > are available in all supported gcc and binutils versions (as well as >> > > > clang and lld). >> > > > >> > > > Signed-off-by: Arvind Sankar >> > > > Reviewed-by: Ard Biesheuvel >> > > > Reviewed-by: Fangrui Song >> > > > --- >> > > > arch/x86/boot/compressed/Makefile | 28 +++--- >> > > > arch/x86/boot/compressed/vmlinux.lds.S | 8 >> > > > 2 files changed, 11 insertions(+), 25 deletions(-) >> > > >> > > Reviewed-by: Kees Cook >> > > >> > > question below ... >> > > >> > > > diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S >> > > > index a4a4a59a2628..a78510046eec 100644 >> > > > --- a/arch/x86/boot/compressed/vmlinux.lds.S >> > > > +++ b/arch/x86/boot/compressed/vmlinux.lds.S >> > > > @@ -42,6 +42,12 @@ SECTIONS >> > > > *(.rodata.*) >> > > > _erodata = . ; >> > > > } >> > > > + .rel.dyn : { >> > > > + *(.rel.*) >> > > > + } >> > > > + .rela.dyn : { >> > > > + *(.rela.*) >> > > > + } >> > > > .got : { >> > > > *(.got) >> > > > } >> > > >> > > Should these be marked (INFO) as well? >> > > >> > >> > Given that sections marked as (INFO) will still be emitted into the >> > ELF image, it does not really make a difference to do this for zero >> > sized sections. 
>>
>> Oh, I misunderstood -- I thought they were _not_ emitted; I see now what
>> you said was not allocated. So, disk space used for the .got.plt case,
>> but not memory space used. Sorry for the confusion!
>>
>> -Kees

About output section type (INFO): https://sourceware.org/binutils/docs/ld/Output-Section-Type.html#Output-Section-Type says "These type names are supported for backward compatibility, and are rarely used."

If none of the input sections has the SHF_ALLOC flag, the output section will not have this flag either. This type is not useful...

If .got and .got.plt were used, they should be considered dynamic relocations which should be part of the loadable image. So they should have the SHF_ALLOC flag. (INFO) will not be applicable anyway.

I don't care deeply either way, but Kees indicated that he would like to get rid of the 24 bytes of .got.plt magic entries that we have no need for.

In fact, a lot of this mangling is caused by the fact that the linker is creating a relocatable binary, and assumes that it is a hosted binary that is loaded by a dynamic loader. It would actually be much better if the compiler and linker would take -ffreestanding into account, and suppress GOT entries, PLTs, dynamic program headers for shared libraries altogether.

Linkers (GNU ld and LLD) don't create .got or .got.plt just because the linker command line has -pie or -shared. They create .got or .got.plt if there are specific needs.

For .got.plt, if there is (1) any .plt/.iplt entry, (2) any .got.plt based relocation (e.g. R_X86_64_GOTPC32 on x86-64), or (3) if _GLOBAL_OFFSET_TABLE_ is referenced, .got.plt will be created (both GNU ld and LLD) with usually 3 entries (for ld.so purposes).

If (1) is not satisfied, the created .got.plt merely serves as an anchor for things that want to reference it (the distance from GOT base to some point). The linker will still reserve 3 words but the words are likely not needed.
I don't think there is a specific need for another option to teach the linker (GNU ld or LLD) that this is a kernel link. For -ffreestanding builds, cc -static (ld -no-pie)/-static-pie (ld -pie) already work quite well.
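
The three conditions under which .got.plt is created, as listed in this reply, can be written down as a small decision function. This is a minimal sketch, assuming the conditions are exactly as stated in the message; the LinkState type and gotplt_layout helper are hypothetical names for illustration and do not correspond to any linker API.

```python
from dataclasses import dataclass

RESERVED_ENTRIES = 3  # "usually 3 entries (for ld.so purposes)"


@dataclass
class LinkState:
    has_plt_entries: bool        # (1) any .plt/.iplt entry
    has_gotplt_relocs: bool      # (2) any .got.plt based relocation, e.g. R_X86_64_GOTPC32
    references_got_symbol: bool  # (3) _GLOBAL_OFFSET_TABLE_ is referenced


def gotplt_layout(state, word_size=8):
    """Describe the .got.plt header a linker would emit, or None if none is needed."""
    needed = (state.has_plt_entries
              or state.has_gotplt_relocs
              or state.references_got_symbol)
    if not needed:
        return None
    return {
        "reserved_entries": RESERVED_ENTRIES,
        "reserved_bytes": RESERVED_ENTRIES * word_size,
        # Without PLT entries the section only serves as an address anchor.
        "anchor_only": not state.has_plt_entries,
    }


if __name__ == "__main__":
    print(gotplt_layout(LinkState(False, False, False)))
    print(gotplt_layout(LinkState(False, True, False)))
```

With an 8-byte word size the reserved header comes to 24 bytes, which matches the "24 bytes of .got.plt magic entries" mentioned in the quoted text.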
Re: [PATCH v3 7/7] x86/boot: Check that there are no runtime relocations
On 2020-06-29, Arvind Sankar wrote:
On Mon, Jun 29, 2020 at 09:20:31AM -0700, Kees Cook wrote:
On Mon, Jun 29, 2020 at 06:11:59PM +0200, Ard Biesheuvel wrote:
> On Mon, 29 Jun 2020 at 18:09, Kees Cook wrote:
> >
> > On Mon, Jun 29, 2020 at 10:09:28AM -0400, Arvind Sankar wrote:
> > > Add a linker script check that there are no runtime relocations, and
> > > remove the old one that tries to check via looking for specially-named
> > > sections in the object files.
> > >
> > > Drop the tests for -fPIE compiler option and -pie linker option, as they
> > > are available in all supported gcc and binutils versions (as well as
> > > clang and lld).
> > >
> > > Signed-off-by: Arvind Sankar
> > > Reviewed-by: Ard Biesheuvel
> > > Reviewed-by: Fangrui Song
> > > ---
> > > arch/x86/boot/compressed/Makefile | 28 +++---
> > > arch/x86/boot/compressed/vmlinux.lds.S | 8
> > > 2 files changed, 11 insertions(+), 25 deletions(-)
> >
> > Reviewed-by: Kees Cook
> >
> > question below ...
> >
> > > diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
> > > index a4a4a59a2628..a78510046eec 100644
> > > --- a/arch/x86/boot/compressed/vmlinux.lds.S
> > > +++ b/arch/x86/boot/compressed/vmlinux.lds.S
> > > @@ -42,6 +42,12 @@ SECTIONS
> > > *(.rodata.*)
> > > _erodata = . ;
> > > }
> > > + .rel.dyn : {
> > > + *(.rel.*)
> > > + }
> > > + .rela.dyn : {
> > > + *(.rela.*)
> > > + }
> > > .got : {
> > > *(.got)
> > > }
> >
> > Should these be marked (INFO) as well?
> >
>
> Given that sections marked as (INFO) will still be emitted into the
> ELF image, it does not really make a difference to do this for zero
> sized sections.

Oh, I misunderstood -- I thought they were _not_ emitted; I see now what you said was not allocated. So, disk space used for the .got.plt case, but not memory space used. Sorry for the confusion!

-Kees

About output section type (INFO): https://sourceware.org/binutils/docs/ld/Output-Section-Type.html#Output-Section-Type says "These type names are supported for backward compatibility, and are rarely used."

If none of the input sections has the SHF_ALLOC flag, the output section will not have this flag either. This type is not useful...

If .got and .got.plt were used, they should be considered dynamic relocations which should be part of the loadable image. So they should have the SHF_ALLOC flag. (INFO) will not be applicable anyway.

SHT_REL[A] may be allocable or not. Usually .rel[a].dyn and .rel[a].plt are linker created allocable sections. (INFO) does not make sense for them.

In the case of the REL[A] and .got sections, they are actually already not emitted at all into the ELF file now that they are zero size. For .got.plt, it is only emitted for 32-bit (with the 3 reserved entries), the 64-bit linker seems to get rid of it.
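
The SHF_ALLOC point made in this reply can be sketched in a few lines. The flag values below are the standard ELF section flag constants; the output_is_allocated helper is a hypothetical name, and the rule it encodes (an output section is allocated if any contributing input section is, unless the script marks it (INFO)) is a simplification of what real linkers do.

```python
# Standard ELF section flag bits.
SHF_WRITE = 0x1
SHF_ALLOC = 0x2
SHF_EXECINSTR = 0x4


def output_is_allocated(input_flags, marked_info=False):
    """Return True if the output section would carry SHF_ALLOC.

    `input_flags` is a list of sh_flags values for the input sections
    placed into one output section. `marked_info` models the (INFO)
    output section type, which marks the section as not allocatable.
    """
    if marked_info:
        return False
    return any(flags & SHF_ALLOC for flags in input_flags)


if __name__ == "__main__":
    # Non-allocated inputs only (e.g. .comment): (INFO) changes nothing.
    print(output_is_allocated([0, 0]), output_is_allocated([0, 0], marked_info=True))
    # An allocated input such as .got: (INFO) would strip it from the image.
    print(output_is_allocated([SHF_ALLOC | SHF_WRITE]))
```

In this simplified model (INFO) only matters when at least one input section is allocated, and for sections such as .got that must be part of the loadable image it would be the wrong thing to request, which is the argument being made in the thread.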
Re: [PATCH v3 2/9] vmlinux.lds.h: Add .symtab, .strtab, and .shstrtab to STABS_DEBUG
On 2020-06-24, Arvind Sankar wrote:
On Wed, Jun 24, 2020 at 09:16:43AM -0700, Fangrui Song wrote:
On 2020-06-24, Arvind Sankar wrote:
>On Tue, Jun 23, 2020 at 06:49:33PM -0700, Kees Cook wrote:
>> When linking vmlinux with LLD, the synthetic sections .symtab, .strtab,
>> and .shstrtab are listed as orphaned. Add them to the STABS_DEBUG section
>> so there will be no warnings when --orphan-handling=warn is used more
>> widely. (They are added above comment as it is the more common
>
>Nit 1: is "after .comment" better than "above comment"? It's above in the
>sense of higher file offset, but it's below in readelf output.

I mean this order:)

.comment
.symtab
.shstrtab
.strtab

This is the case in the absence of a linker script if at least one object file has .comment (mostly for GCC/clang version information) or the linker is LLD which adds a .comment

>Nit 2: These aren't actually debugging sections, no? Is it better to add
>a new macro for it, and is there any plan to stop LLD from warning about
>them?

https://reviews.llvm.org/D75149 "[ELF] --orphan-handling=: don't warn/error for unused synthesized sections" described that .symtab .shstrtab .strtab are different in GNU ld. Since many other GNU ld synthesized sections (.rela.dyn .plt ...) can be renamed or dropped via output section descriptions, I don't understand why the 3 sections can't be customized.

So IIUC, lld will now warn about .rela.dyn etc only if they're non-empty?

HEAD and future 11.0.0 will not warn about unused synthesized sections like .rela.dyn. For most synthesized sections, empty = unused.

I created a feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=26168
(If this is supported, it is a consistent behavior to warn for orphan .symtab/.strtab/.shstrtab. There may be 50% chance that the maintainer decides that "LLD diverges". I would disagree: there are no fundamental problems with .symtab/.strtab/.shstrtab which make them special in output section descriptions or orphan handling.)

.shstrtab is a little special in that it can't be discarded if the ELF file contains any sections at all. But yeah, there's no reason they can't be renamed or placed in a custom location in the file.

https://sourceware.org/pipermail/binutils/2020-March/000179.html proposes -z nosectionheader. With this option, I believe .shstrtab is not needed. /DISCARD/ : { *(.shstrtab) } should achieve a similar effect.
Re: [PATCH v3 2/9] vmlinux.lds.h: Add .symtab, .strtab, and .shstrtab to STABS_DEBUG
On 2020-06-24, Arvind Sankar wrote:
On Tue, Jun 23, 2020 at 06:49:33PM -0700, Kees Cook wrote:

When linking vmlinux with LLD, the synthetic sections .symtab, .strtab, and .shstrtab are listed as orphaned. Add them to the STABS_DEBUG section so there will be no warnings when --orphan-handling=warn is used more widely. (They are added above comment as it is the more common

Nit 1: is "after .comment" better than "above comment"? It's above in the sense of higher file offset, but it's below in readelf output.

I mean this order:)

.comment
.symtab
.shstrtab
.strtab

This is the case in the absence of a linker script if at least one object file has .comment (mostly for GCC/clang version information) or the linker is LLD which adds a .comment

Nit 2: These aren't actually debugging sections, no? Is it better to add a new macro for it, and is there any plan to stop LLD from warning about them?

https://reviews.llvm.org/D75149 "[ELF] --orphan-handling=: don't warn/error for unused synthesized sections" described that .symtab .shstrtab .strtab are different in GNU ld. Since many other GNU ld synthesized sections (.rela.dyn .plt ...) can be renamed or dropped via output section descriptions, I don't understand why the 3 sections can't be customized.

I created a feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=26168
(If this is supported, it is a consistent behavior to warn for orphan .symtab/.strtab/.shstrtab. There may be 50% chance that the maintainer decides that "LLD diverges". I would disagree: there are no fundamental problems with .symtab/.strtab/.shstrtab which make them special in output section descriptions or orphan handling.)

order[1].)

ld.lld: warning: :(.symtab) is being placed in '.symtab'
ld.lld: warning: :(.shstrtab) is being placed in '.shstrtab'
ld.lld: warning: :(.strtab) is being placed in '.strtab'

[1] https://lore.kernel.org/lkml/2020064928.o2a7jkq33guxf...@google.com/

Reported-by: Fangrui Song
Reviewed-by: Fangrui Song
Signed-off-by: Kees Cook
---
 include/asm-generic/vmlinux.lds.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 1248a206be8d..8e71757f485b 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -792,7 +792,10 @@
 		.stab.exclstr 0 : { *(.stab.exclstr) }			\
 		.stab.index 0 : { *(.stab.index) }			\
 		.stab.indexstr 0 : { *(.stab.indexstr) }		\
-		.comment 0 : { *(.comment) }
+		.comment 0 : { *(.comment) }				\
+		.symtab 0 : { *(.symtab) }				\
+		.strtab 0 : { *(.strtab) }				\
+		.shstrtab 0 : { *(.shstrtab) }
 #ifdef CONFIG_GENERIC_BUG
 #define BUG_TABLE							\
--
2.25.1
Re: [PATCH v3 3/9] efi/libstub: Remove .note.gnu.property
On 2020-06-23, Kees Cook wrote:

In preparation for adding --orphan-handling=warn to more architectures, make sure unwanted sections don't end up appearing under the .init section prefix that libstub adds to itself during objcopy.

Signed-off-by: Kees Cook
---
 drivers/firmware/efi/libstub/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 75daaf20374e..9d2d2e784bca 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -66,6 +66,9 @@ lib-$(CONFIG_X86) += x86-stub.o
 CFLAGS_arm32-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET)
 CFLAGS_arm64-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET)
+# Remove unwanted sections first.
+STUBCOPY_FLAGS-y += --remove-section=.note.gnu.property
+
 #
 # For x86, bootloaders like systemd-boot or grub-efi do not zero-initialize the
 # .bss section, so the .bss section of the EFI stub needs to be included in the

arch/arm64/Kconfig enables ARM64_PTR_AUTH by default. When the config is on

ifeq ($(CONFIG_ARM64_BTI_KERNEL),y)
branch-prot-flags-$(CONFIG_CC_HAS_BRANCH_PROT_PAC_RET_BTI) := -mbranch-protection=pac-ret+leaf+bti
else
branch-prot-flags-$(CONFIG_CC_HAS_BRANCH_PROT_PAC_RET) := -mbranch-protection=pac-ret+leaf
endif

This option creates .note.gnu.property:

% readelf -n drivers/firmware/efi/libstub/efi-stub.o

Displaying notes found in: .note.gnu.property
  Owner    Data size    Description
  GNU      0x0010       NT_GNU_PROPERTY_TYPE_0
      Properties: AArch64 feature: PAC

If .note.gnu.property is not desired in drivers/firmware/efi/libstub, specifying -mbranch-protection=none can override -mbranch-protection=pac-ret+leaf.
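
To make the readelf output above more concrete, here is a minimal sketch of what such a note contains and how the "AArch64 feature: PAC" line can be derived from it. The constants are the standard values for NT_GNU_PROPERTY_TYPE_0 and the AArch64 feature property; build_note and parse_aarch64_features are hypothetical helper names, and the note is constructed in memory (little-endian, ELF64 layout) rather than read from a real object file.

```python
import struct

NT_GNU_PROPERTY_TYPE_0 = 5
GNU_PROPERTY_AARCH64_FEATURE_1_AND = 0xC0000000
FEATURE_1_BTI = 1 << 0
FEATURE_1_PAC = 1 << 1


def build_note(features):
    """Build a .note.gnu.property payload carrying one AArch64 feature word."""
    # Property: type, data size, 4 bytes of data, 4 bytes of padding (8-byte aligned).
    desc = struct.pack("<IIII", GNU_PROPERTY_AARCH64_FEATURE_1_AND, 4, features, 0)
    name = b"GNU\0"
    header = struct.pack("<III", len(name), len(desc), NT_GNU_PROPERTY_TYPE_0)
    return header + name + desc


def parse_aarch64_features(note):
    """Return the AArch64 feature names found in a GNU property note."""
    namesz, descsz, ntype = struct.unpack_from("<III", note, 0)
    name = note[12:12 + namesz]
    if name != b"GNU\0" or ntype != NT_GNU_PROPERTY_TYPE_0:
        return []
    offset = 12 + ((namesz + 3) & ~3)
    end = offset + descsz
    found = []
    while offset + 8 <= end:
        pr_type, pr_datasz = struct.unpack_from("<II", note, offset)
        offset += 8
        if pr_type == GNU_PROPERTY_AARCH64_FEATURE_1_AND and pr_datasz == 4:
            (bits,) = struct.unpack_from("<I", note, offset)
            if bits & FEATURE_1_BTI:
                found.append("BTI")
            if bits & FEATURE_1_PAC:
                found.append("PAC")
        # Each property's data is padded to an 8-byte boundary on ELF64.
        offset += (pr_datasz + 7) & ~7
    return found


if __name__ == "__main__":
    note = build_note(FEATURE_1_PAC)
    print(hex(struct.unpack_from("<III", note, 0)[1]), parse_aarch64_features(note))
```

A note built for pac-ret+leaf carries only the PAC bit, while pac-ret+leaf+bti would set both bits; in either case the descriptor is 16 bytes, consistent with the data size column in the readelf output.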
Re: [PATCH v2 1/3] vmlinux.lds.h: Add .gnu.version* to DISCARDS
On 2020-06-22, Kees Cook wrote:
On Mon, Jun 22, 2020 at 03:00:43PM -0700, Fangrui Song wrote:
On 2020-06-22, Kees Cook wrote:
> For vmlinux linking, no architecture uses the .gnu.version* section,
> so remove it via the common DISCARDS macro in preparation for adding
> --orphan-handling=warn more widely.
>
> Signed-off-by: Kees Cook
> ---
> include/asm-generic/vmlinux.lds.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index db600ef218d7..6fbe9ed10cdb 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -934,6 +934,7 @@
>*(.discard) \
>*(.discard.*) \
>*(.modinfo) \
> + *(.gnu.version*)\
>}
>
> /**
> --
> 2.25.1

I wonder what led to .gnu.version{,_d,_r} sections in the kernel.

This looks like a bug in bfd.ld? There are no versioned symbols in any of the input files (and no output section either!)

The link command is:

$ ld -m elf_x86_64 --no-ld-generated-unwind-info -z noreloc-overflow -pie \
--no-dynamic-linker --orphan-handling=warn -T \
arch/x86/boot/compressed/vmlinux.lds \
arch/x86/boot/compressed/kernel_info.o \
arch/x86/boot/compressed/head_64.o arch/x86/boot/compressed/misc.o \
arch/x86/boot/compressed/string.o arch/x86/boot/compressed/cmdline.o \
arch/x86/boot/compressed/error.o arch/x86/boot/compressed/piggy.o \
arch/x86/boot/compressed/cpuflags.o \
arch/x86/boot/compressed/early_serial_console.o \
arch/x86/boot/compressed/kaslr.o arch/x86/boot/compressed/kaslr_64.o \
arch/x86/boot/compressed/mem_encrypt.o \
arch/x86/boot/compressed/pgtable_64.o arch/x86/boot/compressed/acpi.o \
-o arch/x86/boot/compressed/vmlinux

None of the inputs have the section:

$ for i in arch/x86/boot/compressed/kernel_info.o \
arch/x86/boot/compressed/head_64.o arch/x86/boot/compressed/misc.o \
arch/x86/boot/compressed/string.o arch/x86/boot/compressed/cmdline.o \
arch/x86/boot/compressed/error.o arch/x86/boot/compressed/piggy.o \
arch/x86/boot/compressed/cpuflags.o \
arch/x86/boot/compressed/early_serial_console.o \
arch/x86/boot/compressed/kaslr.o arch/x86/boot/compressed/kaslr_64.o \
arch/x86/boot/compressed/mem_encrypt.o \
arch/x86/boot/compressed/pgtable_64.o arch/x86/boot/compressed/acpi.o \
; do echo -n $i": "; readelf -Vs $i | grep 'version'; done
arch/x86/boot/compressed/kernel_info.o: No version information found in this file.
arch/x86/boot/compressed/head_64.o: No version information found in this file.
arch/x86/boot/compressed/misc.o: No version information found in this file.
arch/x86/boot/compressed/string.o: No version information found in this file.
arch/x86/boot/compressed/cmdline.o: No version information found in this file.
arch/x86/boot/compressed/error.o: No version information found in this file.
arch/x86/boot/compressed/piggy.o: No version information found in this file.
arch/x86/boot/compressed/cpuflags.o: No version information found in this file.
arch/x86/boot/compressed/early_serial_console.o: No version information found in this file.
arch/x86/boot/compressed/kaslr.o: No version information found in this file.
arch/x86/boot/compressed/kaslr_64.o: No version information found in this file.
arch/x86/boot/compressed/mem_encrypt.o: No version information found in this file.
arch/x86/boot/compressed/pgtable_64.o: No version information found in this file.
arch/x86/boot/compressed/acpi.o: No version information found in this file.

And it's not in the output:

$ readelf -Vs arch/x86/boot/compressed/vmlinux | grep version
No version information found in this file.

So... for the kernel we need to silence it right now.

Re-link with -M (or -Map file) to check where .gnu.version{,_d,_r} input sections come from?

If it is a bug, we should probably figure out which version of binutils has fixed the bug.
Re: [PATCH v2 3/3] x86/boot: Warn on orphan section placement
On 2020-06-22, Kees Cook wrote:
On Mon, Jun 22, 2020 at 03:06:28PM -0700, Fangrui Song wrote:

LLD may report warnings for 3 synthetic sections if they are orphans:

ld.lld: warning: :(.symtab) is being placed in '.symtab'
ld.lld: warning: :(.shstrtab) is being placed in '.shstrtab'
ld.lld: warning: :(.strtab) is being placed in '.strtab'

Are they described?

Perhaps:

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index db600ef218d7..57e9c142e401 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -792,6 +792,9 @@
 		.stab.exclstr 0 : { *(.stab.exclstr) }			\
 		.stab.index 0 : { *(.stab.index) }			\
 		.stab.indexstr 0 : { *(.stab.indexstr) }		\
+		.symtab 0 : { *(.symtab) }				\
+		.strtab 0 : { *(.strtab) }				\
+		.shstrtab 0 : { *(.shstrtab) }				\
 		.comment 0 : { *(.comment) }
 #ifdef CONFIG_GENERIC_BUG

This LGTM.

Nit: .comment before .symtab is a more common order.

Reviewed-by: Fangrui Song
Re: [PATCH v2 3/3] x86/boot: Warn on orphan section placement
On 2020-06-22, Kees Cook wrote:

We don't want to depend on the linker's orphan section placement heuristics as these can vary between linkers, and may change between versions. All sections need to be explicitly named in the linker script.

Add the common debugging sections. Discard the unused note, rel, plt, dyn, and hash sections that are not needed in the compressed vmlinux. Disable .eh_frame generation in the linker and enable orphan section warnings.

Signed-off-by: Kees Cook
---
 arch/x86/boot/compressed/Makefile | 3 ++-
 arch/x86/boot/compressed/vmlinux.lds.S | 11 +++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 7619742f91c9..646720a05f89 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -48,6 +48,7 @@ GCOV_PROFILE := n
 UBSAN_SANITIZE :=n
 KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
+KBUILD_LDFLAGS += $(call ld-option,--no-ld-generated-unwind-info)
 # Compressed kernel should be built as PIE since it may be loaded at any
 # address by the bootloader.
 ifeq ($(CONFIG_X86_32),y)
@@ -59,7 +60,7 @@ else
 KBUILD_LDFLAGS += $(shell $(LD) --help 2>&1 | grep -q "\-z noreloc-overflow" \
 	&& echo "-z noreloc-overflow -pie --no-dynamic-linker")
 endif
-LDFLAGS_vmlinux := -T
+LDFLAGS_vmlinux := --orphan-handling=warn -T
 hostprogs := mkpiggy
 HOST_EXTRACFLAGS += -I$(srctree)/tools/include
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 8f1025d1f681..6fe3ecdfd685 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -75,5 +75,16 @@ SECTIONS
 	. = ALIGN(PAGE_SIZE);	/* keep ZO size page aligned */
 	_end = .;
+	STABS_DEBUG
+	DWARF_DEBUG
+	DISCARDS
+	/DISCARD/ : {
+		*(.note.*)
+		*(.rela.*) *(.rela_*)
+		*(.rel.*) *(.rel_*)
+		*(.plt) *(.plt.*)
+		*(.dyn*)
+		*(.hash) *(.gnu.hash)
+	}
 }
--
2.25.1

LLD may report warnings for 3 synthetic sections if they are orphans:

ld.lld: warning: :(.symtab) is being placed in '.symtab'
ld.lld: warning: :(.shstrtab) is being placed in '.shstrtab'
ld.lld: warning: :(.strtab) is being placed in '.strtab'

Are they described?
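
The idea behind --orphan-handling=warn can be sketched as a simple pattern check: any section that no pattern in the linker script matches is an orphan. The following is a rough model only. It uses Python's fnmatch as an approximation of linker-script glob matching, the find_orphans helper is a hypothetical name, and the section list is a made-up example rather than the real contents of the compressed vmlinux.

```python
from fnmatch import fnmatchcase

# Patterns taken from the /DISCARD/ block in the patch above.
DISCARD_PATTERNS = [
    ".note.*", ".rela.*", ".rela_*", ".rel.*", ".rel_*",
    ".plt", ".plt.*", ".dyn*", ".hash", ".gnu.hash",
]


def find_orphans(section_names, described_patterns):
    """Return the sections that no linker-script pattern describes."""
    return [
        name for name in section_names
        if not any(fnmatchcase(name, pattern) for pattern in described_patterns)
    ]


if __name__ == "__main__":
    # Hypothetical set of sections present at link time.
    sections = [".note.gnu.property", ".rela.dyn", ".hash",
                ".symtab", ".shstrtab", ".strtab"]
    print(find_orphans(sections, DISCARD_PATTERNS))
    # Describing the three synthetic sections silences the warnings.
    print(find_orphans(sections, DISCARD_PATTERNS + [".symtab", ".strtab", ".shstrtab"]))
```

In this model the three synthetic sections are the only orphans left, which mirrors the three ld.lld warnings quoted above; adding explicit descriptions for them leaves nothing for the linker to place heuristically.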
Re: [PATCH v2 1/3] vmlinux.lds.h: Add .gnu.version* to DISCARDS
On 2020-06-22, Kees Cook wrote:

For vmlinux linking, no architecture uses the .gnu.version* section, so remove it via the common DISCARDS macro in preparation for adding --orphan-handling=warn more widely.

Signed-off-by: Kees Cook
---
 include/asm-generic/vmlinux.lds.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index db600ef218d7..6fbe9ed10cdb 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -934,6 +934,7 @@
 	*(.discard)							\
 	*(.discard.*)							\
 	*(.modinfo)							\
+	*(.gnu.version*)						\
 	}
 /**
--
2.25.1

I wonder what led to .gnu.version{,_d,_r} sections in the kernel.

tools/lib/bpf/libbpf_internal.h uses the `.symver` directive and -Wl,--version-script, which may lead to .gnu.version{,_d}, but this only applies to the userspace libbpf.so.

libperf.so has a similar -Wl,--version-script.

Linking vmlinux does not appear to use any symbol versioning.
Re: [kbuild-all] Re: gcc-5: error: -gz is not supported in this configuration
But if that gcc was originally _configured_ with a version of binutils that doesn't support -gz=zlib,

I agree with this theory :)

On 2020-06-10, Arvind Sankar wrote:
On Tue, Jun 09, 2020 at 11:23:31PM -0400, Arvind Sankar wrote:
On Tue, Jun 09, 2020 at 11:12:25PM -0400, Arvind Sankar wrote:
> The output of gcc-5 -dumpspecs may also be useful.
>
> The exact Kconfig check should have been
>gcc-5 -Werror -gz=zlib -S -x c /dev/null -o /dev/null
>
> I can't see how that would succeed if the a.c test didn't but maybe just
> in case?

Oh wait, -S instead of -c. Which means it runs neither the assembler nor the linker, so gcc won't error out. But if that gcc was originally _configured_ with a version of binutils that doesn't support -gz=zlib, it will give an error on -c regardless of whether the runtime binutils would actually support it or not.

I think the below might be better than passing the option via -Wa, since gcc will translate -gz=zlib into the right assembler option anyway, and it will also generate an error if the compiler driver was misconfigured and won't support the option even if the rest of the toolchain does, fixing the config dependency. Unless this doesn't work with Clang?

Clang>=6 supports -gz=zlib

Alternatively (or even in addition), we should redefine cc-option to use -c, it uses -S in the Kconfig version, apparently for speed, but -c in the Kbuild version.

Unifying cc-option in scripts/Kbuild.include & scripts/Kconfig.include sounds good.

diff --git a/Makefile b/Makefile
index 839f9fee22cb..cb29e56f227a 100644
--- a/Makefile
+++ b/Makefile
@@ -842,7 +842,7 @@ endif
 ifdef CONFIG_DEBUG_INFO_COMPRESSED
 DEBUG_CFLAGS	+= -gz=zlib
-KBUILD_AFLAGS	+= -Wa,--compress-debug-sections=zlib
+KBUILD_AFLAGS	+= -gz=zlib
 KBUILD_LDFLAGS	+= --compress-debug-sections=zlib
 endif
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index cb98741601bd..94ce36be470c 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -229,7 +229,7 @@ config DEBUG_INFO_COMPRESSED
 	bool "Compressed debugging information"
 	depends on DEBUG_INFO
 	depends on $(cc-option,-gz=zlib)
-	depends on $(as-option,-Wa$(comma)--compress-debug-sections=zlib)
+	depends on $(as-option,-gz=zlib)
 	depends on $(ld-option,--compress-debug-sections=zlib)
 	help
 	  Compress the debug information using zlib. Requires GCC 5.0+ or Clang

This patch looks good.

(clang cc1as only supports (hardcodes) a limited number of -Wa, options: it parses the options by itself, rather than delegating to GNU as like GCC. If there is a compiler driver option, that is usually preferable.)
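
The -S versus -c distinction discussed in this message can be made concrete with a small probe helper. This is a sketch under stated assumptions: the command line for the -S case follows the Kconfig check quoted above, the -c variant simply swaps that one flag (the real Kbuild macro differs in detail), and cc_option_argv and cc_option are hypothetical helper names, not part of Kbuild.

```python
import os
import shutil
import subprocess
import tempfile


def cc_option_argv(cc, flag, stop_after, output="/dev/null"):
    """Build the probe command line.

    stop_after="-S" stops after compilation proper (no assembler run);
    stop_after="-c" also runs the assembler, so a driver that rejects
    the flag because of how it was configured is caught.
    """
    if stop_after not in ("-S", "-c"):
        raise ValueError("stop_after must be '-S' or '-c'")
    return [cc, "-Werror", flag, stop_after, "-x", "c", "/dev/null", "-o", output]


def cc_option(cc, flag, stop_after):
    """Run the probe; return True/False, or None if `cc` is not installed."""
    if shutil.which(cc) is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        argv = cc_option_argv(cc, flag, stop_after, os.path.join(tmp, "probe.out"))
        result = subprocess.run(argv, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
    return result.returncode == 0


if __name__ == "__main__":
    for mode in ("-S", "-c"):
        print(mode, cc_option("gcc", "-gz=zlib", mode))
```

On a toolchain like the one in the report, the expectation from the thread is that the -S probe succeeds while the -c probe fails, which is why the Kconfig dependency passed but the build did not.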
Re: [kbuild-all] Re: gcc-5: error: -gz is not supported in this configuration
On 2020-06-10, Rong Chen wrote: On 6/10/20 1:49 AM, Fangrui Song wrote: On 2020-06-09, Nick Desaulniers wrote: On Tue, Jun 9, 2020 at 6:12 AM kernel test robot wrote: tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: abfbb29297c27e3f101f348dc9e467b0fe70f919 commit: 10e68b02c861ccf2b3adb59d3f0c10dc6b5e3ace Makefile: support compressed debug info date: 12 days ago config: x86_64-randconfig-r032-20200609 (attached as .config) compiler: gcc-5 (Ubuntu 5.5.0-12ubuntu1) 5.5.0 20171010 reproduce (this is a W=1 build): git checkout 10e68b02c861ccf2b3adb59d3f0c10dc6b5e3ace # save the attached .config to linux build tree make W=1 ARCH=x86_64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>, old ones prefixed by <<): gcc-5: error: -gz is not supported in this configuration Hmm...I wonder if the feature detection is incomplete? I suspect it's possible to not depend on zlib. make[2]: *** [scripts/Makefile.build:277: scripts/mod/empty.o] Error 1 make[2]: Target '__build' not remade because of errors. make[1]: *** [Makefile:1169: prepare0] Error 2 make[1]: Target 'prepare' not remade because of errors. make: *** [Makefile:185: __sub-make] Error 2 The output of gcc-5 -v --version on that machine may help. The convoluted gcc_cv_ld_compress_de logic in gcc/configure.ac may be related, but I can't find any mistake that our CONFIG_DEBUG_INFO_COMPRESSED conditions may make. Hi Fangrui, Here is the output: $gcc-5 -v --version Using built-in specs. COLLECT_GCC=gcc-5 COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper gcc-5 (Ubuntu 5.5.0-12ubuntu1) 5.5.0 20171010 Copyright (C) 2015 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.5.0-12ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 5.5.0 20171010 (Ubuntu 5.5.0-12ubuntu1) COLLECT_GCC_OPTIONS='-v' '--version' '-mtune=generic' '-march=x86-64' /usr/lib/gcc/x86_64-linux-gnu/5/cc1 -quiet -v -imultiarch x86_64-linux-gnu help-dummy -quiet -dumpbase help-dummy -mtune=generic -march=x86-64 -auxbase help-dummy -version --version -fstack-protector-strong -Wformat -Wformat-security -o /tmp/ccqnZumV.s GNU C11 (Ubuntu 5.5.0-12ubuntu1) version 5.5.0 20171010 (x86_64-linux-gnu) compiled by GNU C version 5.5.0 20171010, GMP version 6.1.2, MPFR version 4.0.1, MPC version 1.1.0 warning: MPFR header version 4.0.1 differs from library version 4.0.2. GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 COLLECT_GCC_OPTIONS='-v' '--version' '-mtune=generic' '-march=x86-64' as -v --64 --version -o /tmp/ccRPgs9J.o /tmp/ccqnZumV.s GNU assembler version 2.34 (x86_64-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.34 GNU assembler (GNU Binutils for Ubuntu) 2.34 Copyright (C) 2020 Free Software Foundation, Inc. 
This program is free software; you may redistribute it under the terms of the GNU General Public License version 3 or later. This program has absolutely no warranty. This assembler was configured for a target of `x86_64-linux-gnu'. COMPILER_PATH=/usr/lib/gcc/x86_64-linux-gnu/5/:/usr/lib/gcc/x86_64-linux-gnu/5/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/5/:/usr/lib/gcc/x86_64-linux-gnu/ LIBRARY_PATH=/usr/lib/gcc/x86_64-linux-gnu/5/:/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/5/../../../../lib/:/lib/x86_64-linux-gnu/:/lib/../lib/:/usr/lib/x86_64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/5/../../../:/lib/:/usr/lib/ COLLECT_GCC_OPTIONS='-v' '--version' '-mtune=generic' '-march=x86-64' /usr/lib/gcc/x86_64-linux-gnu/5/collect2 -plugin /usr/lib/gcc/x86_64-linux-gnu/5/liblto_plugin.so -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper -plugin-opt=-fresolution=/tmp/ccJLhs3y.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc
Re: gcc-5: error: -gz is not supported in this configuration
On 2020-06-09, Nick Desaulniers wrote: On Tue, Jun 9, 2020 at 6:12 AM kernel test robot wrote: tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: abfbb29297c27e3f101f348dc9e467b0fe70f919 commit: 10e68b02c861ccf2b3adb59d3f0c10dc6b5e3ace Makefile: support compressed debug info date: 12 days ago config: x86_64-randconfig-r032-20200609 (attached as .config) compiler: gcc-5 (Ubuntu 5.5.0-12ubuntu1) 5.5.0 20171010 reproduce (this is a W=1 build): git checkout 10e68b02c861ccf2b3adb59d3f0c10dc6b5e3ace # save the attached .config to linux build tree make W=1 ARCH=x86_64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>, old ones prefixed by <<): >> gcc-5: error: -gz is not supported in this configuration Hmm...I wonder if the feature detection is incomplete? I suspect it's possible to not depend on zlib. make[2]: *** [scripts/Makefile.build:277: scripts/mod/empty.o] Error 1 make[2]: Target '__build' not remade because of errors. make[1]: *** [Makefile:1169: prepare0] Error 2 make[1]: Target 'prepare' not remade because of errors. make: *** [Makefile:185: __sub-make] Error 2 The output of gcc-5 -v --version on that machine may help. The convoluted gcc_cv_ld_compress_de logic in gcc/configure.ac may be related, but I can't find any mistake that our CONFIG_DEBUG_INFO_COMPRESSED conditions may make.
Re: [PATCH] arm64: disable -fsanitize=shadow-call-stack for big-endian
On 2020-05-27, Arnd Bergmann wrote:
On Wed, May 27, 2020 at 7:28 PM 'Nick Desaulniers' via Clang Built Linux wrote:
On Wed, May 27, 2020 at 8:24 AM Mark Rutland wrote:
>
> On Wed, May 27, 2020 at 03:39:46PM +0200, Arnd Bergmann wrote:
> > clang-11 and earlier do not support -fsanitize=shadow-call-stack
> > in combination with -mbig-endian, but the Kconfig check does not
> > pass the endianness flag, so building a big-endian kernel with
> > this fails at build time:
> >
> > clang: error: unsupported option '-fsanitize=shadow-call-stack' for target 'aarch64_be-unknown-linux'
> >
> > Change the Kconfig check to let Kconfig figure this out earlier
> > and prevent the broken configuration. I assume this is a bug
> > in clang that needs to be fixed, but we also have to work
> > around existing releases.
> >
> > Fixes: 5287569a790d ("arm64: Implement Shadow Call Stack")
> > Link: https://bugs.llvm.org/show_bug.cgi?id=46076
> > Signed-off-by: Arnd Bergmann
>
> I suspect this is similar to the patchable-function-entry issue, and
> this is an oversight that we'd rather fix toolchain side.
>
> Nick, Fangrui, thoughts?

Exactly, Fangrui already has a fix: https://reviews.llvm.org/D80647. Thanks Fangrui!

Ok, great! I had opened the bug first so I could reference it in the commit changelog; it seems the fix came faster than I managed to send out the kernel workaround.

Do we still want the kernel workaround anyway to make it work with older clang versions, or do we expect to fall back to not using the integrated assembler for the moment?

Arnd

We can condition it on `CLANG_VERSION >= 11`, assuming Tom (CCed) is happy to cherry-pick the two commits from https://bugs.llvm.org/show_bug.cgi?id=46076 to clang 10.0.1 (and there is still time).
Re: [PATCH] arm64: fix clang integrated assembler build
On 2020-05-27, 'Nick Desaulniers' via Clang Built Linux wrote: On Wed, May 27, 2020 at 7:14 AM Arnd Bergmann wrote: clang and gas seem to interpret the symbols in memmove.S and memset.S differently, such that clang does not make them 'weak' as expected, which leads to a linker error, with both ld.bfd and ld.lld: ld.lld: error: duplicate symbol: memmove >>> defined at common.c >>>kasan/common.o:(memmove) in archive mm/built-in.a >>> defined at memmove.o:(__memmove) in archive arch/arm64/lib/lib.a ld.lld: error: duplicate symbol: memset >>> defined at common.c >>>kasan/common.o:(memset) in archive mm/built-in.a >>> defined at memset.o:(__memset) in archive arch/arm64/lib/lib.a Copy the exact way these are written in memcpy_64.S, which does not have the same problem. I don't know why this makes a difference, and it would be good to have someone with a better understanding of assembler internals review it. It might be either a bug in the kernel or a bug in the assembler, no idea which one. My patch makes it work with all versions of clang and gcc, which is probably helpful even if it's a workaround for a clang bug. + Bill, Fangrui, Jian I think we saw this bug or a very similar bug internally around the ordering of .weak to .global. 
This may be another instance of
https://sourceware.org/pipermail/binutils/2020-March/000299.html
https://lore.kernel.org/linuxppc-dev/20200325164257.170229-1-mask...@google.com/

I haven't checked but there may be both a .globl directive and a .weak directive

Cc: sta...@vger.kernel.org
Signed-off-by: Arnd Bergmann
---
---
 arch/arm64/lib/memcpy.S  | 3 +--
 arch/arm64/lib/memmove.S | 3 +--
 arch/arm64/lib/memset.S  | 3 +--
 3 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/lib/memcpy.S b/arch/arm64/lib/memcpy.S
index e0bf83d556f2..dc8d2a216a6e 100644
--- a/arch/arm64/lib/memcpy.S
+++ b/arch/arm64/lib/memcpy.S
@@ -56,9 +56,8 @@
 	stp \reg1, \reg2, [\ptr], \val
 	.endm
 
-	.weak memcpy
 SYM_FUNC_START_ALIAS(__memcpy)
-SYM_FUNC_START_PI(memcpy)
+SYM_FUNC_START_WEAK_PI(memcpy)
 #include "copy_template.S"
 	ret
 SYM_FUNC_END_PI(memcpy)
diff --git a/arch/arm64/lib/memmove.S b/arch/arm64/lib/memmove.S
index 02cda2e33bde..1035dce4bdaf 100644
--- a/arch/arm64/lib/memmove.S
+++ b/arch/arm64/lib/memmove.S
@@ -45,9 +45,8 @@ C_h	.req	x12
 D_l	.req	x13
 D_h	.req	x14
 
-	.weak memmove
 SYM_FUNC_START_ALIAS(__memmove)
-SYM_FUNC_START_PI(memmove)
+SYM_FUNC_START_WEAK_PI(memmove)
 	cmp	dstin, src
 	b.lo	__memcpy
 	add	tmp1, src, count
diff --git a/arch/arm64/lib/memset.S b/arch/arm64/lib/memset.S
index 77c3c7ba0084..a9c1c9a01ea9 100644
--- a/arch/arm64/lib/memset.S
+++ b/arch/arm64/lib/memset.S
@@ -42,9 +42,8 @@ dst	.req	x8
 tmp3w	.req	w9
 tmp3	.req	x9
 
-	.weak memset
 SYM_FUNC_START_ALIAS(__memset)
-SYM_FUNC_START_PI(memset)
+SYM_FUNC_START_WEAK_PI(memset)
 	mov	dst, dstin	/* Preserve return value. */
 	and	A_lw, val, #255
 	orr	A_lw, A_lw, A_lw, lsl #8
--
2.26.2

--
Thanks,
~Nick Desaulniers
Re: [PATCH v2 4/4] x86/boot: Check that there are no runtime relocations
On 2020-05-26, Arvind Sankar wrote:
On Tue, May 26, 2020 at 08:11:56AM +0200, Ard Biesheuvel wrote:
On Tue, 26 May 2020 at 00:59, Arvind Sankar wrote:
> # Compressed kernel should be built as PIE since it may be loaded at any
> # address by the bootloader.
> -KBUILD_LDFLAGS += $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
> +KBUILD_LDFLAGS += -pie $(call ld-option, --no-dynamic-linker)

Do we still need -pie linking with these changes applied?

I think it's currently not strictly necessary -- e.g. the 64-bit kernel doesn't get linked as PIE right now with LLD or old binutils. However, it is safer to do so to ensure that the result remains PIC with future versions of the linker. There are linker optimizations that can convert certain PIC instructions when PIE is disabled. While I think they currently all focus on eliminating indirection through the GOT (and thus wouldn't be applicable any more),

There are 3 forms described by x86-64 psABI B.2 "Optimize GOTPCRELX Relocations":

(1) movq foo@GOTPCREL(%rip), %reg -> leaq foo(%rip), %reg
(2) call *foo@GOTPCREL(%rip) -> nop; call foo
(3) jmp *foo@GOTPCREL(%rip) -> jmp foo; nop

ld.bfd and gold perform (1) even for R_X86_64_GOTPCREL. LLD requires R_X86_64_[REX_]GOTPCRELX.

it's easy to imagine that they could get extended to, e.g., convert
  leaq foo(%rip), %rax
to
  movl $foo, %eax
with some nop padding, etc.

Not with NOP padding, but probably with instruction prefixes. It is unclear whether the rewriting would be beneficial. Rewriting instructions definitely requires a dedicated relocation type like R_X86_64_[REX_]GOTPCRELX.

Also, the relocation check that's being added here would only work with PIE linking.
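To make form (1) above concrete, here is a minimal Python sketch of the byte-level rewrite, assuming the usual encoding of a REX.W prefix, opcode 0x8B for the load, a RIP-relative ModRM byte and a 4-byte displacement. The function name `relax_mov_to_lea` is made up for this illustration. A real linker additionally has to verify that the symbol is non-preemptible and in range, and it retargets the relocation from the GOT slot to the symbol itself; none of that is modeled here.

```python
MOV_LOAD_OPCODE = 0x8B  # mov r64, r/m64 (AT&T: movq mem, %reg)
LEA_OPCODE = 0x8D       # lea r64, m


def relax_mov_to_lea(insn: bytes) -> bytes:
    """Turn `movq foo@GOTPCREL(%rip), %reg` into `leaq foo(%rip), %reg`.

    Only the opcode byte changes. The 4-byte displacement is left as is,
    because filling it in is the job of the relocation.
    """
    if len(insn) != 7:
        raise ValueError("expected REX + opcode + ModRM + disp32")
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if rex & 0xF8 != 0x48:          # REX prefix with the W bit set
        raise ValueError("not a 64-bit operand size instruction")
    if opcode != MOV_LOAD_OPCODE:
        raise ValueError("not a mov load")
    if modrm & 0xC7 != 0x05:        # mod=00, rm=101 means RIP-relative
        raise ValueError("not RIP-relative addressing")
    return bytes([rex, LEA_OPCODE, modrm]) + insn[3:]
```

For example, the bytes `48 8b 05 00 00 00 00` (a load into %rax through the GOT) become `48 8d 05 00 00 00 00`. Forms (2) and (3) change the instruction length by one byte, which is why they need the padding shown in the list.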
Re: [PATCH 0/4] x86/boot: Remove runtime relocations from compressed kernel
On 2020-05-24, Arvind Sankar wrote: The compressed kernel currently contains bogus runtime relocations in the startup code in head_{32,64}.S, which are generated by the linker, but must not actually be processed at runtime. This generates warnings when linking with the BFD linker, and errors with LLD, which defaults to erroring on runtime relocations in read-only sections. It also requires the -z noreloc-overflow hack for the 64-bit kernel, which prevents us from linking it as -pie on an older BFD linker (<= 2.26) or on LLD, because the locations that are to be apparently relocated are only 32-bits in size and so cannot normally have R_X86_64_RELATIVE relocations. This series aims to get rid of these relocations. It is based on efi/next (efi-changes-for-v5.8), where the latest patches touch the head code to eliminate the global offset table. The first patch is an independent fix for LLD, to avoid an orphan section in arch/x86/boot/setup.elf [0]. The second patch gets rid of almost all the relocations. It uses standard PIC addressing technique for 32-bit, i.e. loading a register with the address of _GLOBAL_OFFSET_TABLE_ and then using GOTOFF references to access variables. For 64-bit, there is 32-bit code that cannot use RIP-relative addressing, and also cannot use the 32-bit method, since GOTOFF references are 64-bit only. This is instead handled using a macro to replace a reference like gdt with (gdt-startup_32) instead. The assembler will generate a PC32 relocation entry, with addend set to (.-startup_32), and these will be replaced with constants at link time. This works as long as all the code using such references lives in the same section as startup_32, i.e. in .head.text. The third patch addresses a remaining issue with the BFD linker, which insists on generating runtime relocations for absolute symbols. 
We use z_input_len and z_output_len, defined in the generated piggy.S file, as symbols whose absolute "addresses" are actually the size of the compressed payload and the size of the decompressed kernel image respectively. LLD does not generate relocations for these two symbols, but the BFD linker does. To get around this, piggy.S is extended to also define two u32 variables (in .rodata) with the lengths, and the head code is modified to use those instead of the symbol addresses. An alternative way to handle z_input_len/z_output_len would be to just include piggy.S in head_{32,64}.S instead of as a separate object file, since the GNU assembler doesn't generate relocations for symbols set to constants. The last patch adds a check in the linker script to ensure that no runtime relocations get reintroduced. Since the GOT has been eliminated as well, the compressed kernel has no runtime relocations whatsoever any more. [0] https://lore.kernel.org/lkml/20200521152459.558081-1-nived...@alum.mit.edu/ Arvind Sankar (4): x86/boot: Add .text.startup to setup.ld x86/boot: Remove runtime relocations from .head.text code x86/boot: Remove runtime relocations from head_{32,64}.S x86/boot: Check that there are no runtime relocations arch/x86/boot/compressed/Makefile | 36 +- arch/x86/boot/compressed/head_32.S | 59 +++ arch/x86/boot/compressed/head_64.S | 99 +++--- arch/x86/boot/compressed/mkpiggy.c | 6 ++ arch/x86/boot/compressed/vmlinux.lds.S | 11 +++ arch/x86/boot/setup.ld | 2 +- 6 files changed, 109 insertions(+), 104 deletions(-) -- 2.26.2 All 4 commits look good. Reviewed-by: Fangrui Song
Re: [PATCH 4/4] x86/boot: Check that there are no runtime relocations
On 2020-05-25, Ard Biesheuvel wrote: On Sun, 24 May 2020 at 23:28, Arvind Sankar wrote: Add a linker script check that there are no runtime relocations, and remove the old one that tries to check via looking for specially-named sections in the object files. Drop the tests for -fPIE compiler option and -pie linker option, as they are available in all supported gcc and binutils versions (as well as clang and lld). Signed-off-by: Arvind Sankar --- arch/x86/boot/compressed/Makefile | 28 +++--- arch/x86/boot/compressed/vmlinux.lds.S | 11 ++ 2 files changed, 14 insertions(+), 25 deletions(-) diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index d3e882e855ee..679a2b383bfe 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -27,7 +27,7 @@ targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma \ vmlinux.bin.xz vmlinux.bin.lzo vmlinux.bin.lz4 KBUILD_CFLAGS := -m$(BITS) -O2 -KBUILD_CFLAGS += -fno-strict-aliasing $(call cc-option, -fPIE, -fPIC) +KBUILD_CFLAGS += -fno-strict-aliasing -fPIE KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING cflags-$(CONFIG_X86_32) := -march=i386 cflags-$(CONFIG_X86_64) := -mcmodel=small @@ -49,7 +49,7 @@ UBSAN_SANITIZE :=n KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE) # Compressed kernel should be built as PIE since it may be loaded at any # address by the bootloader. -KBUILD_LDFLAGS += $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker) +KBUILD_LDFLAGS += -pie $(call ld-option, --no-dynamic-linker) LDFLAGS_vmlinux := -T hostprogs := mkpiggy @@ -84,30 +84,8 @@ vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o vmlinux-objs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o -# The compressed kernel is built with -fPIC/-fPIE so that a boot loader -# can place it anywhere in memory and it will still run. 
However, since -# it is executed as-is without any ELF relocation processing performed -# (and has already had all relocation sections stripped from the binary), -# none of the code can use data relocations (e.g. static assignments of -# pointer values), since they will be meaningless at runtime. This check -# will refuse to link the vmlinux if any of these relocations are found. -quiet_cmd_check_data_rel = DATAREL $@ -define cmd_check_data_rel - for obj in $(filter %.o,$^); do \ - $(READELF) -S $$obj | grep -qF .rel.local && { \ - echo "error: $$obj has data relocations!" >&2; \ - exit 1; \ - } || true; \ - done -endef - -# We need to run two commands under "if_changed", so merge them into a -# single invocation. -quiet_cmd_check-and-link-vmlinux = LD $@ - cmd_check-and-link-vmlinux = $(cmd_check_data_rel); $(cmd_ld) - $(obj)/vmlinux: $(vmlinux-objs-y) FORCE - $(call if_changed,check-and-link-vmlinux) + $(call if_changed,ld) OBJCOPYFLAGS_vmlinux.bin := -R .comment -S $(obj)/vmlinux.bin: vmlinux FORCE diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S index d826ab38a8f9..0ac14feacb24 100644 --- a/arch/x86/boot/compressed/vmlinux.lds.S +++ b/arch/x86/boot/compressed/vmlinux.lds.S @@ -11,9 +11,15 @@ OUTPUT_FORMAT(CONFIG_OUTPUT_FORMAT) #ifdef CONFIG_X86_64 OUTPUT_ARCH(i386:x86-64) ENTRY(startup_64) + +#define REL .rela + #else OUTPUT_ARCH(i386) ENTRY(startup_32) + +#define REL .rel + #endif SECTIONS @@ -42,6 +48,9 @@ SECTIONS *(.rodata.*) _erodata = . ; } + REL.dyn : { + *(REL.*) + } Do we really need the macro here? Could we just do The output section name does not matter: it will be discarded by the linker. .rel.dyn : { *(.rel.* .rela.*) } If for some reasons there is at least one SHT_REL and at least one SHT_RELA, LLD will error "section type mismatch for .rel.dyn", while the intended diagnostic is the assert below. 
(or even

  .rel.dyn : { *(.rel.*) }
  .rela.dyn : { *(.rela.*) }

if the output section name matters, and always assert that both are empty)?

  .rel.dyn : { *(.rel.*) }
  .rela.dyn : { *(.rela.*) }

looks good to me.

FWIW I intend to add -z rel and -z rela to LLD: https://reviews.llvm.org/D80496#inline-738804 (binutils thread https://sourceware.org/pipermail/binutils/2020-May/111244.html). In case someone builds the x86-64 kernel with -z rel, your suggested input section description will work out of the box...

 	.got : {
 		*(.got)
 	}
@@ -83,3 +92,5 @@ ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT en
 #else
 ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
 #endif
+
+ASSERT(SIZEOF(REL.dyn) == 0, "Unexpected runtime relocations detected!")
--
2.26.2
Re: [PATCH 2/4] x86/boot: Remove runtime relocations from .head.text code
On 2020-05-24, Arvind Sankar wrote:
On Sun, May 24, 2020 at 03:53:59PM -0700, Fangrui Song wrote:
On 2020-05-24, Arvind Sankar wrote:
>The assembly code in head_{32,64}.S, while meant to be
>position-independent, generates run-time relocations because it uses
>instructions such as
>	leal	gdt(%edx), %eax
>which make the assembler and linker think that the code is using %edx as
>an index into gdt, and hence gdt needs to be relocated to its run-time
>address.
>
>With the BFD linker, this generates a warning during the build:
> LD arch/x86/boot/compressed/vmlinux
>ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
>ld: warning: creating a DT_TEXTREL in object

Interesting. How does the build generate a warning by default? Do you use Gentoo Linux which appears to ship a --warn-shared-textrel enabled-by-default patch? (https://bugs.gentoo.org/700488)

Ah, yes I am using gentoo. I didn't realize that was a distro modification.

>+
>+/*
>+ * This macro gives the link address of X. It's the same as X, since startup_32
>+ * has link address 0, but defining it this way tells the assembler/linker that
>+ * we want the link address, and not the run-time address of X. This prevents
>+ * the linker from creating a run-time relocation entry for this reference.
>+ * The macro should be used as a displacement with a base register containing
>+ * the run-time address of startup_32 [i.e. la(X)(%reg)], or as an
>+ * immediate [$ la(X)].
>+ *
>+ * This macro can only be used from within the .head.text section, since the
>+ * expression requires startup_32 to be in the same section as the code being
>+ * assembled.
>+ */
>+#define la(X) ((X) - startup_32)
>+

IIRC, %ebp contains the address of startup_32. la(X) references X relative to startup_32. The fixup (in GNU as and clang integrated assembler's term) is a constant which is resolved by the assembler. There is no R_386_32 or R_386_PC32 for the linker to resolve.

This is incorrect (or maybe I'm not understanding you correctly). X is a symbol whose final location relative to startup_32 is in most cases not known to the assembler (there are a couple of cases where X is a label within .head.text which do get completely resolved by the assembler).

For example, taking the instruction loading the gdt address, this is what we get from the assembler:

  lea la(gdt)(%ebp), %eax

becomes in the object file:

  11: 8d 85 00 00 00 00    lea 0x0(%ebp),%eax
      13: R_X86_64_PC32 .data+0x23

or a cleaner example using a global symbol:

  subl la(image_offset)(%ebp), %ebx

becomes

  41: 2b 9d 00 00 00 00    sub 0x0(%ebp),%ebx
      43: R_X86_64_PC32 image_offset+0x43

So in general you get PC32 relocations, with the addend being set by the assembler to .-startup_32, modulo the adjustment for where within the instruction the displacement needs to be. These relocations are resolved by the static linker to produce constants in the final executable.

You are right. I am not familiar with the code and only looked at 1b. I just preprocessed head_64.S and verified that many target symbols are in .data and .pgtable.

The assembler converts an expression `foo - symbol_defined_in_same_section` to `foo - . + offset`, which can be encoded as an R_X86_64_PC32 (or resolves the fixup if it is a constant, e.g. `1b - startup_32`).

Not very sure stating that "since startup_32 has link address 0" is very appropriate here (probably because I didn't see the term "link address" before). If my understanding above is correct, I think you can just reword the comment to express that X is referenced relative to startup_32, which is stored in %ebp.

Yeah, the more standard term is virtual address/VMA, but that sounds confusing to me with PIE code since the _actual_ virtual address at which this code is going to run isn't 0, that's just the address assumed for linking.

I can reword it to avoid referencing "link address" but then it's not obvious why the macro is named "la" :) I'm open to suggestions on a better name, I could use offset but that's a bit long-winded. I could just use vma() if nobody else finds it confusing.

Thanks.

With your approach, the important property is that "the distance between startup_32 and the target symbol is a constant", not that "startup_32 has link address 0". `ra`, `rva`, `rvma` or any other term which expresses "relative" should work. Hope someone can come up with a good suggestion :)
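The PC32 arithmetic discussed above can be worked through numerically. The sketch below assumes the standard PC32 calculation S + A - P, truncated to 32 bits. The function names and the 0x1000 offset chosen for `image_offset` are hypothetical; only the place offset 0x43 comes from the disassembly quoted in this message.

```python
def resolve_pc32(sym_addr: int, addend: int, place: int) -> int:
    """Value the static linker writes for a PC32 relocation: S + A - P."""
    return (sym_addr + addend - place) & 0xFFFFFFFF


def la_value(link_base: int, startup_32_off: int, sym_off: int, place_off: int) -> int:
    """Constant that ends up in the displacement field of la(sym)(%ebp).

    All offsets are relative to the start of the image. `link_base` is
    whatever address the linker assumed for the image.
    """
    startup_32 = link_base + startup_32_off
    sym = link_base + sym_off
    place = link_base + place_off
    # The assembler records the addend as "." - startup_32, which it can
    # compute because both live in the same section.
    addend = place - startup_32
    return resolve_pc32(sym, addend, place)


# Displacement at offset 0x43, startup_32 at offset 0, symbol at a
# made-up offset of 0x1000.
value_at_zero = la_value(0x0, 0x0, 0x1000, 0x43)
value_at_1mib = la_value(0x100000, 0x0, 0x1000, 0x43)
```

Both calls give the same constant, the distance from startup_32 to the symbol, because the place P cancels against the addend and the link base cancels between S and startup_32. That is the "distance is a constant" property mentioned at the end of the message, and it is why no run-time relocation is needed.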
Re: [PATCH 1/4] x86/boot: Add .text.startup to setup.ld
On 2020-05-24, Arvind Sankar wrote: On Sun, May 24, 2020 at 03:13:26PM -0700, Fangrui Song wrote: On 2020-05-24, Arvind Sankar wrote: >gcc puts the main function into .text.startup when compiled with -Os (or >-O2). This results in arch/x86/boot/main.c having a .text.startup >section which is currently not included explicitly in the linker script >setup.ld in the same directory. > >The BFD linker places this orphan section immediately after .text, so >this still works. However, LLD git, since [1], is choosing to place it >immediately after the .bstext section instead (this is the first code >section). This plays havoc with the section layout that setup.elf >requires to create the setup header, for eg on 64-bit: > >LD arch/x86/boot/setup.elf > ld.lld: error: section .text.startup file range overlaps with .header > >>> .text.startup range is [0x200040, 0x2001FE] > >>> .header range is [0x2001EF, 0x20026B] > > ld.lld: error: section .header file range overlaps with .bsdata > >>> .header range is [0x2001EF, 0x20026B] > >>> .bsdata range is [0x2001FF, 0x200398] > > ld.lld: error: section .bsdata file range overlaps with .entrytext > >>> .bsdata range is [0x2001FF, 0x200398] > >>> .entrytext range is [0x20026C, 0x2002D3] > > ld.lld: error: section .text.startup virtual address range overlaps > with .header > >>> .text.startup range is [0x40, 0x1FE] > >>> .header range is [0x1EF, 0x26B] > > ld.lld: error: section .header virtual address range overlaps with > .bsdata > >>> .header range is [0x1EF, 0x26B] > >>> .bsdata range is [0x1FF, 0x398] > > ld.lld: error: section .bsdata virtual address range overlaps with > .entrytext > >>> .bsdata range is [0x1FF, 0x398] > >>> .entrytext range is [0x26C, 0x2D3] > > ld.lld: error: section .text.startup load address range overlaps with > .header > >>> .text.startup range is [0x40, 0x1FE] > >>> .header range is [0x1EF, 0x26B] > > ld.lld: error: section .header load address range overlaps with > .bsdata > >>> .header range is [0x1EF, 
0x26B] > >>> .bsdata range is [0x1FF, 0x398] > > ld.lld: error: section .bsdata load address range overlaps with > .entrytext > >>> .bsdata range is [0x1FF, 0x398] > >>> .entrytext range is [0x26C, 0x2D3] > >Explicitly pull .text.startup into the .text output section to avoid >this. > >[1] https://reviews.llvm.org/D75225 > >Signed-off-by: Arvind Sankar >Reviewed-by: Fangrui Song >--- > arch/x86/boot/setup.ld | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > >diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld >index 24c95522f231..ed60abcdb089 100644 >--- a/arch/x86/boot/setup.ld >+++ b/arch/x86/boot/setup.ld >@@ -20,7 +20,7 @@ SECTIONS >.initdata : { *(.initdata) } >__end_init = .; > >- .text : { *(.text) } >+ .text : { *(.text.startup) *(.text) } >.text32 : { *(.text32) } > >. = ALIGN(16); >-- >2.26.2 Should .text.startup* be used instead? If -ffunction-sections is used, // a.c int main() {} gcc -O2 a.c # .text.startup gcc -Os a.c # .text.startup gcc -O2 -ffunction-sections a.c # .text.startup.main gcc -Os -ffunction-sections a.c # .text.startup.main It's probably unlikely we'll use function-sections on the setup code, but *(.text.*) might be more future-proof, since gcc/clang might grow the ability to stick code into .text.hot or .text.unlikely etc automatically. *(.text.*) looks good to me. When you send PATCH v2, feel free to add (There is indeed no guarantee that future clang FDO will not place it .text.unknown. :) Reviewed-by: Fangrui Song - In case anyone wants to CC a GCC dev for the citation that main compiles to `.text.startup` in -Os or -O2 mode, I have a small request that `.text.startup.` probably makes more sense. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95095 I made an llvm change recently https://reviews.llvm.org/D79600
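The difference between the candidate input section patterns can be checked with ordinary glob matching. This is a sketch that assumes Python's `fnmatch` rules are close enough to linker script wildcards for these simple patterns; the list of section names is made up for illustration (`.text.hot` and `.text.unlikely` stand in for sections a future compiler might emit), and `claimed` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Illustrative input section names, not taken from a real object file.
SECTIONS = [
    ".text",
    ".text.startup",
    ".text.startup.main",
    ".text.hot",
    ".text.unlikely",
    ".text32",
]


def claimed(patterns, sections=SECTIONS):
    """Return the section names matched by at least one pattern."""
    return [s for s in sections if any(fnmatchcase(s, p) for p in patterns)]


original = claimed([".text"])
patch_v1 = claimed([".text.startup", ".text"])
with_star = claimed([".text.startup*", ".text"])
future_proof = claimed([".text.*", ".text"])
```

With these inputs, `patch_v1` picks up `.text.startup` but leaves `.text.startup.main` as an orphan, `with_star` covers the -ffunction-sections case, and `future_proof` also covers `.text.hot` and `.text.unlikely` while still leaving `.text32` to its own output section. The sketch does not model the linker rule that an input section is taken by the first description that matches it.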
Re: [PATCH 4/4] x86/boot: Check that there are no runtime relocations
On 2020-05-24, Arvind Sankar wrote: Add a linker script check that there are no runtime relocations, and remove the old one that tries to check via looking for specially-named sections in the object files. Drop the tests for -fPIE compiler option and -pie linker option, as they are available in all supported gcc and binutils versions (as well as clang and lld). Signed-off-by: Arvind Sankar --- arch/x86/boot/compressed/Makefile | 28 +++--- arch/x86/boot/compressed/vmlinux.lds.S | 11 ++ 2 files changed, 14 insertions(+), 25 deletions(-) diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index d3e882e855ee..679a2b383bfe 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -27,7 +27,7 @@ targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma \ vmlinux.bin.xz vmlinux.bin.lzo vmlinux.bin.lz4 KBUILD_CFLAGS := -m$(BITS) -O2 -KBUILD_CFLAGS += -fno-strict-aliasing $(call cc-option, -fPIE, -fPIC) +KBUILD_CFLAGS += -fno-strict-aliasing -fPIE KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING cflags-$(CONFIG_X86_32) := -march=i386 cflags-$(CONFIG_X86_64) := -mcmodel=small @@ -49,7 +49,7 @@ UBSAN_SANITIZE :=n KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE) # Compressed kernel should be built as PIE since it may be loaded at any # address by the bootloader. -KBUILD_LDFLAGS += $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker) +KBUILD_LDFLAGS += -pie $(call ld-option, --no-dynamic-linker) LDFLAGS_vmlinux := -T hostprogs := mkpiggy @@ -84,30 +84,8 @@ vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o vmlinux-objs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o -# The compressed kernel is built with -fPIC/-fPIE so that a boot loader -# can place it anywhere in memory and it will still run. 
However, since -# it is executed as-is without any ELF relocation processing performed -# (and has already had all relocation sections stripped from the binary), -# none of the code can use data relocations (e.g. static assignments of -# pointer values), since they will be meaningless at runtime. This check -# will refuse to link the vmlinux if any of these relocations are found. -quiet_cmd_check_data_rel = DATAREL $@ -define cmd_check_data_rel - for obj in $(filter %.o,$^); do \ - $(READELF) -S $$obj | grep -qF .rel.local && { \ - echo "error: $$obj has data relocations!" >&2; \ - exit 1; \ - } || true; \ - done -endef - -# We need to run two commands under "if_changed", so merge them into a -# single invocation. -quiet_cmd_check-and-link-vmlinux = LD $@ - cmd_check-and-link-vmlinux = $(cmd_check_data_rel); $(cmd_ld) - $(obj)/vmlinux: $(vmlinux-objs-y) FORCE - $(call if_changed,check-and-link-vmlinux) + $(call if_changed,ld) OBJCOPYFLAGS_vmlinux.bin := -R .comment -S $(obj)/vmlinux.bin: vmlinux FORCE diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S index d826ab38a8f9..0ac14feacb24 100644 --- a/arch/x86/boot/compressed/vmlinux.lds.S +++ b/arch/x86/boot/compressed/vmlinux.lds.S @@ -11,9 +11,15 @@ OUTPUT_FORMAT(CONFIG_OUTPUT_FORMAT) #ifdef CONFIG_X86_64 OUTPUT_ARCH(i386:x86-64) ENTRY(startup_64) + +#define REL .rela + #else OUTPUT_ARCH(i386) ENTRY(startup_32) + +#define REL .rel + #endif SECTIONS @@ -42,6 +48,9 @@ SECTIONS *(.rodata.*) _erodata = . ; } + REL.dyn : { + *(REL.*) + } .got : { *(.got) } @@ -83,3 +92,5 @@ ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT en #else ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!") #endif + +ASSERT(SIZEOF(REL.dyn) == 0, "Unexpected runtime relocations detected!") -- 2.26.2 `grep -qF .rel.local` from 98f78525371b55ccd1c480207ce10296c72fa340 may be incorrect.. 
None of these synthesized dynamic relocation sections is called *.rel.local* ... (it probably wanted to name .rel.data.rel.ro or .rel.data) Reviewed-by: Fangrui Song
Re: [PATCH 3/4] x86/boot: Remove runtime relocations from head_{32,64}.S
% cat a.s
pushl $z_input_len
% cat b.s
.globl z_input_len
z_input_len = 0xb612
% gcc -m32 -c a.s b.s
% ld.bfd -m elf_i386 -pie a.o b.o  # has an incorrect R_386_RELATIVE before binutils 2.35

Reviewed-by: Fangrui Song
Re: [PATCH 2/4] x86/boot: Remove runtime relocations from .head.text code
On 2020-05-24, Arvind Sankar wrote:
> The assembly code in head_{32,64}.S, while meant to be
> position-independent, generates run-time relocations because it uses
> instructions such as
> 	leal	gdt(%edx), %eax
> which make the assembler and linker think that the code is using %edx as
> an index into gdt, and hence gdt needs to be relocated to its run-time
> address.
>
> With the BFD linker, this generates a warning during the build:
>   LD      arch/x86/boot/compressed/vmlinux
> ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
> ld: warning: creating a DT_TEXTREL in object

Interesting. How does the build generate a warning by default?
Do you use Gentoo Linux which appears to ship a --warn-shared-textrel
enabled-by-default patch? (https://bugs.gentoo.org/700488)

% cat a.s
leal gdt(%edx), %eax
% as --32 a.s -o a.o
% ld.bfd -m elf_i386 -shared a.o -z notext  # DT_TEXTREL is set. R_386_32
% ld.bfd -m elf_i386 -shared a.o  # on-demand text relocations. DT_TEXTREL is set. R_386_32
% ld.bfd -m elf_i386 -shared a.o --warn-shared-textrel
ld.bfd: a.o: warning: relocation against `gdt' in read-only section `.text'
ld.bfd: warning: creating a DT_TEXTREL in a shared object
% ld.bfd -m elf_i386 -shared a.o -z text
ld.bfd: a.o: warning: relocation against `gdt' in read-only section `.text'
ld.bfd: read-only segment has dynamic relocations
## The above is an error. Output is suppressed

lld has only two modes: -z text (default) and -z notext. There is no
on-demand state. By default it will error.

There are feature requests to make -z text default, at least for some
architectures. I just found https://sourceware.org/bugzilla/show_bug.cgi?id=20824

> With lld, Dmitry Golovin reports that this results in a link-time error
> with default options (i.e. unless -z notext is explicitly passed):
>   LD      arch/x86/boot/compressed/vmlinux
> ld.lld: error: can't create dynamic relocation R_386_32 against local
> symbol in readonly segment; recompile object files with -fPIC or pass
> '-Wl,-z,notext' to allow text relocations in the output
>
> Start fixing this by removing relocations from .head.text:
> - On 32-bit, use a base register that holds the address of the GOT and
>   reference symbol addresses using @GOTOFF, i.e.
> 	leal	gdt@GOTOFF(%edx), %eax

Looks good to me.

> - On 64-bit, most of the code can (and already does) use %rip-relative
>   addressing, however the .code32 bits can't, and the 64-bit code also
>   needs to reference symbol addresses as they will be after moving the
>   compressed kernel to the end of the decompression buffer.
>   For these cases, reference the symbols as an offset to startup_32 to
>   avoid creating relocations, i.e.
> 	leal	(gdt-startup_32)(%bp), %eax
>   This only works in .head.text as the subtraction cannot be represented
>   as a PC-relative relocation unless startup_32 is in the same section
>   as the code. Move efi32_pe_entry into .head.text so that it can use
>   the same method to avoid relocations.

I have a nit about the startup_32 comment. See below.

> Signed-off-by: Arvind Sankar
> ---
>  arch/x86/boot/compressed/head_32.S | 40 +++--
>  arch/x86/boot/compressed/head_64.S | 95 ++
>  2 files changed, 77 insertions(+), 58 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
> index dfa4131c65df..66657bb99aae 100644
> --- a/arch/x86/boot/compressed/head_32.S
> +++ b/arch/x86/boot/compressed/head_32.S
> @@ -73,10 +73,10 @@ SYM_FUNC_START(startup_32)
>  	leal	(BP_scratch+4)(%esi), %esp
>  	call	1f
>  1:	popl	%edx
> -	subl	$1b, %edx
> +	addl	$_GLOBAL_OFFSET_TABLE_+(.-1b), %edx
>
>  	/* Load new GDT */
> -	leal	gdt(%edx), %eax
> +	leal	gdt@GOTOFF(%edx), %eax
>  	movl	%eax, 2(%eax)
>  	lgdt	(%eax)
>
> @@ -89,14 +89,16 @@ SYM_FUNC_START(startup_32)
>  	movl	%eax, %ss
>
>  /*
> - * %edx contains the address we are loaded at by the boot loader and %ebx
> - * contains the address where we should move the kernel image temporarily
> - * for safe in-place decompression. %ebp contains the address that the kernel
> - * will be decompressed to.
> + * %edx contains the address we are loaded at by the boot loader (plus the
> + * offset to the GOT). The below code calculates %ebx to be the address where
> + * we should move the kernel image temporarily for safe in-place decompression
> + * (again, plus the offset to the GOT).
> + *
> + * %ebp is calculated to be the address that the kernel will be decompressed to.
>   */
>
>  #ifdef CONFIG_RELOCATABLE
> -	movl	%edx, %ebx
> +	leal	startup_32@GOTOFF(%edx), %ebx
>
>  #ifdef CONFIG_EFI_STUB
>  /*
> @@ -107,7 +109,7 @@ SYM_FUNC_START(startup_32)
>   *	image_offset = startup_32 - image_base
>   * Otherwise image_offset will be zero and has no effect on the calculations.
>   */
> -	subl	image_offset(%edx), %ebx
> +	subl	image_offset@GOTOFF(%edx), %ebx
>  #endif
>
>  	movl	BP_kernel_alignment(%esi), %eax
> @@ -124,10
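The linker behaviours compared in this reply can be summarised as a small decision table. The Python sketch below encodes only what the transcript above states; the function name `textrel_outcome` and its string results are made up for illustration, and the `--warn-shared-textrel` case is deliberately left out because it changes diagnostics rather than the outcome.

```python
def textrel_outcome(linker, z_option=None):
    """Outcome of linking -shared when read-only text needs an absolute relocation.

    linker:   "ld.bfd" or "ld.lld"
    z_option: "text", "notext", or None when no -z text/notext is passed
    Returns "error" or "allowed" (allowed means DT_TEXTREL ends up set).
    """
    if z_option not in (None, "text", "notext"):
        raise ValueError(f"unsupported -z option: {z_option}")
    if linker == "ld.lld":
        # Two modes only: -z text is the default and rejects the link.
        return "allowed" if z_option == "notext" else "error"
    if linker == "ld.bfd":
        # Without -z text, GNU ld creates text relocations on demand.
        return "error" if z_option == "text" else "allowed"
    raise ValueError(f"unknown linker: {linker}")


for linker in ("ld.bfd", "ld.lld"):
    for z_option in (None, "text", "notext"):
        print(linker, z_option, textrel_outcome(linker, z_option))
```

The only cell where the two linkers differ is the default (no `-z` option), which is exactly the case the kernel build hit.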
Re: [PATCH 1/4] x86/boot: Add .text.startup to setup.ld
On 2020-05-24, Arvind Sankar wrote:
> gcc puts the main function into .text.startup when compiled with -Os (or
> -O2). This results in arch/x86/boot/main.c having a .text.startup
> section which is currently not included explicitly in the linker script
> setup.ld in the same directory.
>
> The BFD linker places this orphan section immediately after .text, so
> this still works. However, LLD git, since [1], is choosing to place it
> immediately after the .bstext section instead (this is the first code
> section). This plays havoc with the section layout that setup.elf
> requires to create the setup header, e.g. on 64-bit:
>
>   LD      arch/x86/boot/setup.elf
> ld.lld: error: section .text.startup file range overlaps with .header
> >>> .text.startup range is [0x200040, 0x2001FE]
> >>> .header range is [0x2001EF, 0x20026B]
>
> ld.lld: error: section .header file range overlaps with .bsdata
> >>> .header range is [0x2001EF, 0x20026B]
> >>> .bsdata range is [0x2001FF, 0x200398]
>
> ld.lld: error: section .bsdata file range overlaps with .entrytext
> >>> .bsdata range is [0x2001FF, 0x200398]
> >>> .entrytext range is [0x20026C, 0x2002D3]
>
> ld.lld: error: section .text.startup virtual address range overlaps with .header
> >>> .text.startup range is [0x40, 0x1FE]
> >>> .header range is [0x1EF, 0x26B]
>
> ld.lld: error: section .header virtual address range overlaps with .bsdata
> >>> .header range is [0x1EF, 0x26B]
> >>> .bsdata range is [0x1FF, 0x398]
>
> ld.lld: error: section .bsdata virtual address range overlaps with .entrytext
> >>> .bsdata range is [0x1FF, 0x398]
> >>> .entrytext range is [0x26C, 0x2D3]
>
> ld.lld: error: section .text.startup load address range overlaps with .header
> >>> .text.startup range is [0x40, 0x1FE]
> >>> .header range is [0x1EF, 0x26B]
>
> ld.lld: error: section .header load address range overlaps with .bsdata
> >>> .header range is [0x1EF, 0x26B]
> >>> .bsdata range is [0x1FF, 0x398]
>
> ld.lld: error: section .bsdata load address range overlaps with .entrytext
> >>> .bsdata range is [0x1FF, 0x398]
> >>> .entrytext range is [0x26C, 0x2D3]
>
> Explicitly pull .text.startup into the .text output section to avoid
> this.
>
> [1] https://reviews.llvm.org/D75225
>
> Signed-off-by: Arvind Sankar

Reviewed-by: Fangrui Song

> ---
>  arch/x86/boot/setup.ld | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld
> index 24c95522f231..ed60abcdb089 100644
> --- a/arch/x86/boot/setup.ld
> +++ b/arch/x86/boot/setup.ld
> @@ -20,7 +20,7 @@ SECTIONS
>  	.initdata	: { *(.initdata) }
>  	__end_init = .;
>
> -	.text		: { *(.text) }
> +	.text		: { *(.text.startup) *(.text) }
>  	.text32		: { *(.text32) }
>
>  	. = ALIGN(16);
> --
> 2.26.2

Should .text.startup* be used instead? If -ffunction-sections is used,

// a.c
int main() {}

gcc -O2 a.c  # .text.startup
gcc -Os a.c  # .text.startup
gcc -O2 -ffunction-sections a.c  # .text.startup.main
gcc -Os -ffunction-sections a.c  # .text.startup.main

In case anyone wants to CC a GCC dev for the citation that main compiles
to `.text.startup` in -Os or -O2 mode, I have a small request that
`.text.startup.` probably makes more sense.
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95095

I made an llvm change recently https://reviews.llvm.org/D79600
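The question about `.text.startup*` comes down to how input-section patterns match section names. As a rough illustration, the sketch below uses Python's `fnmatch.fnmatchcase` as a stand-in for linker-script globbing; this is an approximation chosen for the example, not the linker's actual matcher, and the helper name `matched_by` is invented here.

```python
from fnmatch import fnmatchcase


def matched_by(section, patterns):
    """Return True if an input section name matches any of the glob patterns."""
    return any(fnmatchcase(section, pattern) for pattern in patterns)


patch_patterns = [".text.startup", ".text"]     # as written in the patch
wider_patterns = [".text.startup*", ".text"]    # the variant asked about

for name in (".text.startup", ".text.startup.main"):
    print(name, matched_by(name, patch_patterns), matched_by(name, wider_patterns))
```

Under this approximation, `.text.startup.main` (the name produced with -ffunction-sections) is picked up only by the wildcard variant and would otherwise remain an orphan section.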
[tip: x86/build] x86/boot: Discard .discard.unreachable for arch/x86/boot/compressed/vmlinux
The following commit has been merged into the x86/build branch of tip:

Commit-ID:     d6ee6529436a15a0541aff6e1697989ee7dc2c44
Gitweb:        https://git.kernel.org/tip/d6ee6529436a15a0541aff6e1697989ee7dc2c44
Author:        Fangrui Song
AuthorDate:    Wed, 20 May 2020 11:20:10 -07:00
Committer:     Borislav Petkov
CommitterDate: Fri, 22 May 2020 12:42:07 +02:00

x86/boot: Discard .discard.unreachable for arch/x86/boot/compressed/vmlinux

With commit

  ce5e3f909fc0 ("efi/printf: Add 64-bit and 8-bit integer support")

arch/x86/boot/compressed/vmlinux may have an undesired
.discard.unreachable section coming from
drivers/firmware/efi/libstub/vsprintf.stub.o. That section gets
generated from unreachable() annotations when CONFIG_STACK_VALIDATION is
enabled.

.discard.unreachable contains an R_X86_64_PC32 relocation which will be
warned about by LLD: a non-SHF_ALLOC section (.discard.unreachable) is
not part of the memory image, thus conceptually the distance between a
non-SHF_ALLOC and a SHF_ALLOC is not a constant which can be resolved at
link time:

  % ld.lld -m elf_x86_64 -T arch/x86/boot/compressed/vmlinux.lds ... -o arch/x86/boot/compressed/vmlinux
  ld.lld: warning: vsprintf.c:(.discard.unreachable+0x0): has non-ABS relocation R_X86_64_PC32 against symbol ''

Reuse the DISCARDS macro which includes .discard.* to drop
.discard.unreachable.

 [ bp: Massage and complete the commit message. ]

Reported-by: kbuild test robot
Signed-off-by: Fangrui Song
Signed-off-by: Borislav Petkov
Reviewed-by: Kees Cook
Tested-by: Arvind Sankar
Tested-by: Sedat Dilek
Link: https://lkml.kernel.org/r/20200520182010.242489-1-mask...@google.com
---
 arch/x86/boot/compressed/vmlinux.lds.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 508cfa6..1031af1 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -73,4 +73,6 @@ SECTIONS
 #endif
 	. = ALIGN(PAGE_SIZE);	/* keep ZO size page aligned */
 	_end = .;
+
+	DISCARDS
 }
Re: [PATCH 0/1] x86/boot: lld fix
On 2020-05-20, Arvind Sankar wrote:
> On Wed, May 20, 2020 at 06:56:53PM -0400, Arvind Sankar wrote:
> > arch/x86/boot/setup.elf currently has an orphan section .text.startup,
> > and lld git as of ebf14d9b6d8b is breaking on 64-bit due to what seems
> > to be a change in behavior on orphan section placement (details in patch
> > commit message).
> >
> > I'm not sure if this was an intentional change in lld, but it seems like
> > a good idea to explicitly include .text.startup anyway.
> >
> > Arvind Sankar (1):
> >   x86/boot: Add .text.startup to setup.ld
> >
> >  arch/x86/boot/setup.ld | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > --
> > 2.26.2

I found your PATCH 1/1 on https://lkml.org/lkml/2020/5/20/1491

-	.text		: { *(.text) }
+	.text		: { *(.text.startup) *(.text) }

The LLD behavior change was introduced in
https://reviews.llvm.org/D75225 (will be included in 11.0.0).
It was intended to match GNU ld.

But yes, orphan section placement is still different in the two linkers.

Placing .text.startup before .text seems good. In GNU ld's internal
linker script (ld --verbose), .text.startup is placed before .text.

Reviewed-by: Fangrui Song
Re: [PATCH] x86/boot: allow a relocatable kernel to be linked with lld
On 2020-05-16, Dmitry Golovin wrote:
> 15.05.2020, 21:50, "Borislav Petkov" :
> > I need more info here about which segment is read-only?
> >
> > Is this something LLD does by default or what's happening?
>
> Probably should have quoted the original error message:
>
>   ld.lld: error: can't create dynamic relocation R_386_32 against
>   symbol: _bss in readonly segment; recompile object files with -fPIC
>   or pass '-Wl,-z,notext' to allow text relocations in the output

Do we know where these R_386_32 relocations come from?

When linking in -shared mode, the linker assumes the image is a shared
object and has an undetermined image base at runtime. An absolute
relocation needs a text relocation (a relocation against a readonly
segment).

When neither -z notext nor -z text is specified, GNU ld is in an
indefinite state where it will enable text relocations (DT_TEXTREL
DF_TEXTREL) on demand. It is not considered a good practice for
userspace applications to do this.

Of course the kernel is different... I know little about the kernel,
but if there is a way to make the sections containing R_386_32
relocations writable (SHF_WRITE), that will be a better solution to me.
In LLD, -z notext is like making every section SHF_WRITE.

> > IOW, don't be afraid to be more verbose in the commit message. :)
>
> Tried both BFD and LLD for linking to understand the difference more
> and rewrite the commit message, and came to the conclusion that the
> patch is wrong. I will submit v2 when I figure out the correct solution.
>
> Regards,
> Dmitry
Re: [PATCH] Makefile: support compressed debug info
On 2020-05-12, Nick Desaulniers wrote:
> On Mon, May 11, 2020 at 10:54 PM Masahiro Yamada wrote:
> >
> > On Mon, May 4, 2020 at 5:13 AM Nick Desaulniers wrote:
> > >
> > > As debug information gets larger and larger, it helps significantly save
> > > the size of vmlinux images to compress the information in the debug
> > > information sections. Note: this debug info is typically split off from
> > > the final compressed kernel image, which is why vmlinux is what's used
> > > in conjunction with GDB. Minimizing the debug info size should have no
> > > impact on boot times, or final compressed kernel image size.
> > >
> >
> > Nick,
> >
> > I am OK with this patch.
> >
> > Fangrui provided the minimal requirement for
> > --compress-debug-sections=zlib
> >
> > Is it worth recording in the help text?
> > Do you want to send v2?
>
> Yes I'd like to record that information. I can also record Sedat's
> Tested-by tag. Thank you for testing Sedat.
>
> I don't know what "linux-image-dbg file" are, or why they would be
> bigger. The size of the debug info is the primary concern with this
> config. It sounds like however that file is created might be
> problematic.
>
> Fangrui, I wasn't able to easily find what version of binutils first
> added support. Can you please teach me how to fish?

I actually downloaded https://ftp.gnu.org/gnu/binutils/ archives and
located the sources... I think an easier way is:

% cd binutils-gdb
% git show binutils-2_26:./gas/as.c | grep compress-debug-sections
  --compress-debug-sections[={none|zlib|zlib-gnu|zlib-gabi}]\n\
...

GNU as 2.25 only supports --compress-debug-sections, which means
"zlib-gnu" in newer versions.

Similarly, for GNU ld:

% git show binutils-2_26:./ld/lexsup.c | grep compress-debug-sections
  --compress-debug-sections=[none|zlib|zlib-gnu|zlib-gabi]\n\

(I have spent a lot of time investigating GNU ld's behavior :)

> Another question I had for Fangrui is, if the linker can compress these
> sections, shouldn't we just have the linker do it, not the compiler
> and assembler? IIUC the debug info can contain relocations, so the
> linker would have to decompress these, perform relocations, then
> recompress these? I guess having the compiler and assembler compress
> the debug info as well would minimize the size of the .o files on disk.

The linker will decompress debug info unconditionally, because input
.debug_info sections need to be concatenated to form the output
.debug_info. Whether the output .debug_info is compressed is controlled
by the linker option --compress-debug-sections=zlib, which is not
affected by the compression state of object files.

Both GNU as and GNU ld name the option --compress-debug-sections=zlib.
In a compiler driver context, an unfamiliar user may find
-Wa,--compress-debug-sections=zlib -Wl,--compress-debug-sections=zlib
confusing:/

> Otherwise I should add this flag to the assembler invocation, too, in
> v2. Thoughts?

Compressing object files along with the linked output should be fine. It
can save disk space. (It'd be great if you paste the comparison with and
w/o object files compressed.)
Feel free to add:

Reviewed-by: Fangrui Song

> I have a patch series that enables dwarf5 support in the kernel that
> I'm working up to. I wanted to send this first. Both roughly reduce
> the debug info size by 20% each, though I haven't measured them
> together, yet. Requires ToT binutils because there have been many
> fixes from reports of mine recently.

This will be awesome! I also heard that enabling DWARF v5 for our object
files can easily make debug info size smaller by 20%. Glad that the
kernel can benefit from it as well:)
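To make the "decompress inputs, concatenate, then optionally compress the output" flow concrete, here is a small Python sketch. It assumes the 64-bit little-endian zlib-gabi layout (an `Elf64_Chdr` header followed by a zlib stream); the function names are invented for this example, and a real linker also applies relocations and other processing that is omitted here.

```python
import struct
import zlib

ELFCOMPRESS_ZLIB = 1           # ch_type value for zlib in the ELF gABI
CHDR_FORMAT = "<IIQQ"          # ch_type, ch_reserved, ch_size, ch_addralign
CHDR_SIZE = struct.calcsize(CHDR_FORMAT)


def compress_gabi(data, addralign=1):
    """Build a zlib-gabi section payload: Elf64_Chdr followed by a zlib stream."""
    header = struct.pack(CHDR_FORMAT, ELFCOMPRESS_ZLIB, 0, len(data), addralign)
    return header + zlib.compress(data)


def decompress_gabi(payload):
    """Recover the original section bytes, checking the header fields."""
    ch_type, _reserved, ch_size, _align = struct.unpack_from(CHDR_FORMAT, payload)
    if ch_type != ELFCOMPRESS_ZLIB:
        raise ValueError("not a zlib-compressed section")
    data = zlib.decompress(payload[CHDR_SIZE:])
    if len(data) != ch_size:
        raise ValueError("uncompressed size does not match ch_size")
    return data


def link_debug_info(inputs, compress_output):
    """Concatenate input sections given as (payload, is_compressed) pairs.

    Whether the result is compressed depends only on compress_output,
    not on the compression state of the inputs.
    """
    raw = b"".join(
        decompress_gabi(payload) if is_compressed else payload
        for payload, is_compressed in inputs
    )
    return compress_gabi(raw) if compress_output else raw


a, b = b"abc" * 100, b"def" * 100
mixed = [(compress_gabi(a), True), (b, False)]
print(decompress_gabi(link_debug_info(mixed, True)) == a + b)
print(link_debug_info(mixed, False) == a + b)
```

Compressing the object files therefore only affects their size on disk; the linked output is governed by the linker's own option.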
Re: [PATCH] arm64: disable patchable function entry on big-endian clang builds
On 2020-05-06, Nick Desaulniers wrote:
> On Wed, May 6, 2020 at 8:46 AM 'Fangrui Song' via Clang Built Linux wrote:
> > Created https://reviews.llvm.org/D79495 to allow the function attribute
> > 'patchable_function_entry' on aarch64_be.
> > I think -fpatchable-function-entry= just works.
> >
> > Note, LLD does not support aarch64_be
> > (https://github.com/ClangBuiltLinux/linux/issues/380).
>
> I've approved the patch. Thanks for the quick fix. Looks like we
> backported -fpatchable-function-entry= to the clang-10 release, so we
> should cherry pick your fix to the release-10 branch for the clang
> 10.1 release.
>
> I'd rather have this fixed on the toolchain side.

+1. Cherry picked to release/10.x
https://github.com/llvm/llvm-project/commit/98f9f73f6d2367aa8001c4d16de9d3b347febb08
I did not use anything endianness-sensitive in the original
implementation, so hopefully this will not run into issues.

The scheduled rc1 of LLVM 10.0.1 will happen on May 18, 2020
(https://lists.llvm.org/pipermail/llvm-dev/2020-April/141128.html).
We should be quick if we want to test it on qemu or real hardware.
Re: [PATCH] arm64: disable patchable function entry on big-endian clang builds
On 2020-05-06, Nathan Chancellor wrote:
> On Wed, May 06, 2020 at 12:22:58PM +0200, Arnd Bergmann wrote:
> > On Wed, May 6, 2020 at 5:45 AM Nathan Chancellor wrote:
> > > On Tue, May 05, 2020 at 07:42:43PM +0200, Torsten Duwe wrote:
> > > > On Tue, 5 May 2020 15:25:56 +0100 Mark Rutland wrote:
> > > > > On Tue, May 05, 2020 at 04:12:36PM +0200, Arnd Bergmann wrote:
> > > > > This practically rules out a BE distro kernel with things like PAC,
> > > > > which isn't ideal.
> > > > To be fair, are there big endian AArch64 distros?
> > >
> > > https://wiki.debian.org/Arm64Port: "There is also a big-endian version
> > > of the architecture/ABI: aarch64_be-linux-gnu but we're not supporting
> > > that in Debian (so there is no corresponding Debian architecture name)
> > > and hopefully will never have to. Nevertheless you might want to check
> > > for this by way of completeness in upstream code."
> > >
> > > OpenSUSE and Fedora don't appear to have support for big endian.
> >
> > I don't think any of the binary distros ship big endian ARM64. There are
> > a couple of users that tend to build everything from source using Yocto
> > or similar embedded distros, but as far as I can tell this is getting less
> > common over time as applications get ported to be compatible with
> > big-endian, or get phased out and replaced by code running on regular
> > little-endian systems.
> >
> > The users we see today are likely in telco, military or aerospace
> > settings (While earth is mostly little-endian these days, space is
> > definitely big-endian) that got ported from big-endian hardware, but
> > often with a high degree of customization and long service life.
>
> Ah yes, that makes sense, thanks for the information and background.
> Helps orient myself for the future.
>
> > My policy for Arm specific kernel code submissions is generally that
> > it should be written so it can work on either big-endian or little-endian
> > using the available abstractions (just like any architecture independent
> > code), but I don't normally expect it to be tested on big endian.
> >
> > There are some important examples of code that just doesn't work
> > on big-endian because it's far too hard, e.g. the UEFI runtime services.
> > That is also ok, if anyone really needs it, they can do the work.
> >
> > > > I suggest to get a quote from clang folks first about their schedule and
> > > > regarded importance of patchable-function-entries on BE, and leave it as
> > > > is: broken on certain clang configurations. It's not the kernel's fault.
> > >
> > > We can file an upstream PR (https://bugs.llvm.org) to talk about this
> > > (although I've CC'd Fangrui) but you would rather the kernel fail to
> > > work properly than prevent the user from being able to select that
> > > option? Why even have the "select" or "depends on" keyword then?

Created https://reviews.llvm.org/D79495 to allow the function attribute
'patchable_function_entry' on aarch64_be.
I think -fpatchable-function-entry= just works.

Note, LLD does not support aarch64_be
(https://github.com/ClangBuiltLinux/linux/issues/380).

> > I definitely want all randconfig kernels to build without warnings,
> > and I agree with you that making it just fail at build time is not
> > a good solution.
> >
> > > That said, I do think we should hold off on this patch until we hear
> > > from the LLVM developers.
> >
> > +1
> >
> > Arnd
>
> Glad we are on the same page.
>
> Cheers,
> Nathan
Re: [PATCH] Makefile: support compressed debug info
On 2020-05-04, Sedat Dilek wrote:
> On Mon, May 4, 2020 at 5:13 AM Nick Desaulniers wrote:
> >
> > As debug information gets larger and larger, it helps significantly save
> > the size of vmlinux images to compress the information in the debug
> > information sections. Note: this debug info is typically split off from
> > the final compressed kernel image, which is why vmlinux is what's used
> > in conjunction with GDB. Minimizing the debug info size should have no
> > impact on boot times, or final compressed kernel image size.
> >
> > All of the debug sections will have a `C` flag set.
> > $ readelf -S
> >
> > $ bloaty vmlinux.gcc75.compressed.dwarf4 -- \
> >     vmlinux.gcc75.uncompressed.dwarf4
> >
> >     FILE SIZE        VM SIZE
> >  --------------  --------------
> >   +0.0%     +18  [ = ]       0    [Unmapped]
> >  -73.3%  -114Ki  [ = ]       0    .debug_aranges
> >  -76.2% -2.01Mi  [ = ]       0    .debug_frame
> >  -73.6% -2.89Mi  [ = ]       0    .debug_str
> >  -80.7% -4.66Mi  [ = ]       0    .debug_abbrev
> >  -82.9% -4.88Mi  [ = ]       0    .debug_ranges
> >  -70.5% -9.04Mi  [ = ]       0    .debug_line
> >  -79.3% -10.9Mi  [ = ]       0    .debug_loc
> >  -39.5% -88.6Mi  [ = ]       0    .debug_info
> >  -18.2%  -123Mi  [ = ]       0    TOTAL
> >
> > $ bloaty vmlinux.clang11.compressed.dwarf4 -- \
> >     vmlinux.clang11.uncompressed.dwarf4
> >
> >     FILE SIZE        VM SIZE
> >  --------------  --------------
> >   +0.0%     +23  [ = ]       0    [Unmapped]
> >  -65.6%    -871  [ = ]       0    .debug_aranges
> >  -77.4% -1.84Mi  [ = ]       0    .debug_frame
> >  -82.9% -2.33Mi  [ = ]       0    .debug_abbrev
> >  -73.1% -2.43Mi  [ = ]       0    .debug_str
> >  -84.8% -3.07Mi  [ = ]       0    .debug_ranges
> >  -65.9% -8.62Mi  [ = ]       0    .debug_line
> >  -86.2% -40.0Mi  [ = ]       0    .debug_loc
> >  -42.0% -64.1Mi  [ = ]       0    .debug_info
> >  -22.1%  -122Mi  [ = ]       0    TOTAL
>
> Hi Nick,
>
> thanks for the patch.
>
> I have slightly modified it to adapt to Linux v5.7-rc4 (what was your base?).
>
> Which linker did you use and has it an impact if you switch from
> ld.bfd to ld.lld?

lld has supported the linker option --compress-debug-sections=zlib since
about 5.0.0 (https://reviews.llvm.org/D31941)

> I tried a first normal run and in a 2nd one with
> CONFIG_DEBUG_INFO_COMPRESSED=y both with clang-10 and ld.lld-10.
>
> My numbers (sizes in MiB):
>
> [ diffconfig ]
>
> $ scripts/diffconfig /boot/config-5.7.0-rc4-1-amd64-clang /boot/config-5.7.0-rc4-2-amd64-clang
>  BUILD_SALT "5.7.0-rc4-1-amd64-clang" -> "5.7.0-rc4-2-amd64-clang"
> +DEBUG_INFO_COMPRESSED y
>
> [ compiler and linker ]
>
> $ clang-10 -v
> ClangBuiltLinux clang version 10.0.1 (https://github.com/llvm/llvm-project 92d5c1be9ee93850c0a8903f05f36a23ee835dc2)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix
> InstalledDir: /home/dileks/src/llvm-toolchain/install/bin
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/10
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
> Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/10
> Candidate multilib: .;@m64
> Candidate multilib: 32;@m32
> Candidate multilib: x32;@mx32
> Selected multilib: .;@m64
>
> $ ld.lld-10 -v
> LLD 10.0.1 (https://github.com/llvm/llvm-project 92d5c1be9ee93850c0a8903f05f36a23ee835dc2) (compatible with GNU linkers)
>
> [ sizes vmlinux ]
>
> $ du -m 5.7.0-rc4-*/vmlinux*
> 409     5.7.0-rc4-1-amd64-clang/vmlinux
> 7       5.7.0-rc4-1-amd64-clang/vmlinux.compressed
> 404     5.7.0-rc4-1-amd64-clang/vmlinux.o
> 324     5.7.0-rc4-2-amd64-clang/vmlinux
> 7       5.7.0-rc4-2-amd64-clang/vmlinux.compressed
> 299     5.7.0-rc4-2-amd64-clang/vmlinux.o
>
> [ readelf (.debug_info as example) ]
>
> $ readelf -S vmlinux.o
>   [33] .debug_info PROGBITS 01d6a5e8 06be1ee6 0 0 1
>
> $ readelf -S vmlinux.o
>   [33] .debug_info PROGBITS 01749f18 02ef04d2 C 0 0 1   <--- XXX: "C (compressed)" Flag
>
> Key to Flags:
>   W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
>   L (link order), O (extra OS processing required), G (group), T (TLS),
>   C (compressed), x (unknown), o (OS specific), E (exclude),
>   l (large), p (processor specific)
>
> [ sizes linux-image debian packages ]
>
> $ du -m 5.7.0-rc4-*/linux-image*.deb
> 47      5.7.0-rc4-1-amd64-clang/linux-image-5.7.0-rc4-1-amd64-clang_5.7.0~rc4-1~bullseye+dileks1_amd64.deb
> 424     5.7.0-rc4-1-amd64-clang/linux-image-5.7.0-rc4-1-amd64-clang-dbg_5.7.0~rc4-1~bullseye+dileks1_amd64.deb
> 47      5.7.0-rc4-2-amd64-clang/linux-image-5.7.0-rc4-2-amd64-clang_5.7.0~rc4-2~bullseye+dileks1_amd64.deb
> 771     5.7.0-rc4-2-amd64-clang/linux-image-5.7.0-rc4-2-amd64-clang-dbg_5.7.0~rc4-2~bullseye+dileks1_amd64.deb
>
> [ sizes linux-git dir (compilation finished) ]
>
> 5.7.0-rc4-1-amd64-clang: 17963 /home/dileks/src/linux-kernel/linux
> 5.7.0-rc4-2-amd64-clang: 14328 /home/dileks/src/linux-kernel/linux
>
> [ xz compressed linux-image-dbg packages ]
>
> $ file
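The `du -m` figures quoted above can be turned into relative changes with a few lines of arithmetic. The Python sketch below uses only the MiB values from this message (build "-1" without and build "-2" with CONFIG_DEBUG_INFO_COMPRESSED=y); the helper name `percent_change` is introduced just for this example.

```python
def percent_change(before, after):
    """Relative size change in percent; negative means the file got smaller."""
    return (after - before) / before * 100


# (uncompressed build, compressed build) sizes in MiB, as reported by du -m.
sizes_mib = {
    "vmlinux": (409, 324),
    "vmlinux.o": (404, 299),
    "linux-image-dbg .deb": (424, 771),
}

for name, (before, after) in sizes_mib.items():
    print(f"{name}: {percent_change(before, after):+.1f}%")
```

The positive change for the -dbg package is the "bigger" result questioned elsewhere in the thread; the numbers alone do not say why, only that the already-compressed package grew.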
Re: [PATCH v4 4/5] MIPS: VDSO: Use $(LD) instead of $(CC) to link VDSO
On 2020-04-28, Nathan Chancellor wrote:
> Currently, the VDSO is being linked through $(CC). This does not match
> how the rest of the kernel links objects, which is through the $(LD)
> variable.
>
> When clang is built in a default configuration, it first attempts to use
> the target triple's default linker then the system's default linker,
> unless told otherwise through -fuse-ld=... We do not use -fuse-ld=
> because it can be brittle and we have support for invoking $(LD)
> directly. See commit fe00e50b2db8c ("ARM: 8858/1: vdso: use $(LD)
> instead of $(CC) to link VDSO") and commit 691efbedc60d2 ("arm64: vdso:
> use $(LD) instead of $(CC) to link VDSO") for examples of doing this in
> the VDSO.
>
> Do the same thing here. Replace the custom linking logic with $(cmd_ld)
> and ldflags-y so that $(LD) is respected. We need to explicitly add two
> flags to the linker that were implicitly passed by the compiler:
> -G 0 (which comes from ccflags-vdso) and --eh-frame-hdr.
>
> Before this patch (generated by adding '-v' to VDSO_LDFLAGS):
>
> /libexec/gcc/mips64-linux/9.3.0/collect2 \
> -plugin /libexec/gcc/mips64-linux/9.3.0/liblto_plugin.so \
> -plugin-opt=/libexec/gcc/mips64-linux/9.3.0/lto-wrapper \
> -plugin-opt=-fresolution=/tmp/ccGEi5Ka.res \
> --eh-frame-hdr \
> -G 0 \
> -EB \
> -mips64r2 \
> -shared \
> -melf64btsmip \
> -o arch/mips/vdso/vdso.so.dbg.raw \
> -L/lib/gcc/mips64-linux/9.3.0/64 \
> -L/lib/gcc/mips64-linux/9.3.0 \
> -L/lib/gcc/mips64-linux/9.3.0/../../../../mips64-linux/lib \
> -Bsymbolic \
> --no-undefined \
> -soname=linux-vdso.so.1 \
> -EB \
> --hash-style=sysv \
> --build-id \
> -T arch/mips/vdso/vdso.lds \
> arch/mips/vdso/elf.o \
> arch/mips/vdso/vgettimeofday.o \
> arch/mips/vdso/sigreturn.o
>
> After this patch:
>
> /bin/mips64-linux-ld \
> -m elf64btsmip \
> -Bsymbolic \
> --no-undefined \
> -soname=linux-vdso.so.1 \
> -EB \
> -nostdlib \
> -shared \
> -G 0 \
> --eh-frame-hdr \
> --hash-style=sysv \
> --build-id \
> -T arch/mips/vdso/vdso.lds \
> arch/mips/vdso/elf.o \
> arch/mips/vdso/vgettimeofday.o arch/mips/vdso/sigreturn.o \
> -o arch/mips/vdso/vdso.so.dbg.raw
>
> Note that we leave behind -mips64r2. Turns out that ld ignores it (see
> get_emulation in ld/ldmain.c). This is true of current trunk and 2.23,
> which is the minimum supported version for the kernel:
>
> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=ld/ldmain.c;hb=aa4209e7b679afd74a3860ce25659e71cc4847d5#l593
> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=ld/ldmain.c;hb=a55e30b51bc6227d8d41f707654d0a5620978dcf#l641
>
> Before this patch, LD=ld.lld did nothing:
>
> $ llvm-readelf -p.comment arch/mips/vdso/vdso.so.dbg | sed 's/(.*//'
> String dump of section '.comment':
> [     0] ClangBuiltLinux clang version 11.0.0
>
> After this patch, it does:
>
> $ llvm-readelf -p.comment arch/mips/vdso/vdso.so.dbg | sed 's/(.*//'
> String dump of section '.comment':
> [     0] Linker: LLD 11.0.0
> [    62] ClangBuiltLinux clang version 11.0.0
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/785
> Signed-off-by: Nathan Chancellor
> ---
>
> v3 -> v4:
>
> * Improve commit message to show that ld command is effectively the
>   same as the one generated by GCC.
>
> * Add '-G 0' and '--eh-frame-hdr' because they were added by GCC.

My understanding is that we start to use more
-fno-asynchronous-unwind-tables to eliminate .eh_frame in object files.
Without .eh_frame, LD --eh-frame-hdr is really not useful.

Sigh... -G 0. This is an option ignored by LLD. GCC devs probably should
have used the long option --gpsize rather than take the short option -G.
Even better, -z gpsize= or similar if this option is specific to ELF.

> v2 -> v3:
>
> * New patch.
>
>  arch/mips/vdso/Makefile | 13 -
>  1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile
> index 92b53d1df42c3..2e64c7600eead 100644
> --- a/arch/mips/vdso/Makefile
> +++ b/arch/mips/vdso/Makefile
> @@ -60,10 +60,9 @@ ifdef CONFIG_MIPS_DISABLE_VDSO
>  endif
>
>  # VDSO linker flags.
> -VDSO_LDFLAGS := \
> -	-Wl,-Bsymbolic -Wl,--no-undefined -Wl,-soname=linux-vdso.so.1 \
> -	$(addprefix -Wl$(comma),$(filter -E%,$(KBUILD_CFLAGS))) \
> -	-nostdlib -shared -Wl,--hash-style=sysv -Wl,--build-id
> +ldflags-y := -Bsymbolic --no-undefined -soname=linux-vdso.so.1 \
> +	$(filter -E%,$(KBUILD_CFLAGS)) -nostdlib -shared \
> +	-G 0 --eh-frame-hdr --hash-style=sysv --build-id -T
>
>  CFLAGS_REMOVE_vdso.o = -pg
>
> @@ -82,11 +81,7 @@ quiet_cmd_vdso_mips_check = VDSOCHK $@
>  #
>
>  quiet_cmd_vdsold_and_vdso_check = LD      $@
> -      cmd_vdsold_and_vdso_check = $(cmd_vdsold); $(cmd_vdso_check); $(cmd_vdso_mips_check)
> -
> -quiet_cmd_vdsold = VDSO    $@
> -      cmd_vdsold = $(CC) $(c_flags) $(VDSO_LDFLAGS) \
> -                   -Wl,-T $(filter %.lds,$^) $(filter %.o,$^) -o $@
> +      cmd_vdsold_and_vdso_check = $(cmd_ld); $(cmd_vdso_check); $(cmd_vdso_mips_check)
>
>  quiet_cmd_vdsoas_o_S = AS      $@
>        cmd_vdsoas_o_S = $(CC) $(a_flags) -c -o $@ $<
> --
> 2.26.2
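The Makefile change above is largely a mechanical translation from compiler-driver flags (`-Wl,...`) to flags passed to the linker directly. The Python sketch below illustrates just that translation step, with a helper name (`driver_to_linker_flags`) invented for this example. It does not add the flags the compiler driver passed implicitly (`-G 0`, `--eh-frame-hdr`), which the patch had to add by hand, and it keeps non-`-Wl,` flags as they are without checking that the linker accepts them.

```python
def driver_to_linker_flags(flags):
    """Translate compiler-driver flags into flags for a direct linker call.

    `-Wl,a,b` hands `a` and `b` to the linker; everything else is kept as is.
    """
    out = []
    for flag in flags:
        if flag.startswith("-Wl,"):
            out.extend(part for part in flag[len("-Wl,"):].split(",") if part)
        else:
            out.append(flag)
    return out


# The literal flags from the removed VDSO_LDFLAGS (the $(filter ...) part is omitted).
old_flags = [
    "-Wl,-Bsymbolic", "-Wl,--no-undefined", "-Wl,-soname=linux-vdso.so.1",
    "-nostdlib", "-shared", "-Wl,--hash-style=sysv", "-Wl,--build-id",
]
print(" ".join(driver_to_linker_flags(old_flags)))
```

Comparing the result with the new `ldflags-y` makes the two added flags and the trailing `-T` easy to spot.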