Re: [PATCH 1/1] livepatch: Rename KLP_* to KLP_TRANSITION_*
On Tue, May 07, 2024 at 01:01:11PM +0800, zhangwar...@gmail.com wrote:
> From: Wardenjohn
>
> The original KLP_* macros describe the state of the klp transition.
> Rename the KLP_* macros to KLP_TRANSITION_* to fix their confusing
> description of the klp transition state.
>
> Signed-off-by: Wardenjohn

Acked-by: Josh Poimboeuf

--
Josh
Re: [PATCH 0/1] *** Replace KLP_* to KLP_TRANSITION_* ***
On Tue, May 07, 2024 at 10:56:09AM +0800, zhang warden wrote:
> > On May 7, 2024, at 10:41, Josh Poimboeuf wrote:
> >
> > On Tue, May 07, 2024 at 10:21:40AM +0800, zhang warden wrote:
> > > > > transition state. With these macros renamed, comments are not
> > > > > necessary at this point.
> > >
> > > Sorry for my carelessness, the comment still remains in the code.
> > > However, the comment does no harm here. Maybe it can be kept.
> >
> > The comments aren't actually correct.
> >
> > For example, KLP_TRANSITION_UNPATCHED doesn't always mean "transitioning
> > to unpatched state". Sometimes it means "transitioning *from* unpatched
> > state".
>
> OK, I got it. I will remove the comments later. After all, the comments
> are not necessary at this point now that we have renamed the macros.

Yeah, removing them altogether might be best, as the meaning of these can
vary slightly depending on the operation (patching vs unpatching), and
also depending on where it's stored (task->patch_state vs
klp_target_state).

--
Josh
Re: [PATCH 0/1] *** Replace KLP_* to KLP_TRANSITION_* ***
On Tue, May 07, 2024 at 10:21:40AM +0800, zhang warden wrote:
> > > transition state. With these macros renamed, comments are not
> > > necessary at this point.
>
> Sorry for my carelessness, the comment still remains in the code.
> However, the comment does no harm here. Maybe it can be kept.

The comments aren't actually correct.

For example, KLP_TRANSITION_UNPATCHED doesn't always mean "transitioning
to unpatched state". Sometimes it means "transitioning *from* unpatched
state".

--
Josh
Re: [PATCH] livepatch.h: Add comment to klp transition state
On Mon, Apr 29, 2024 at 03:26:28PM +0800, zhangwar...@gmail.com wrote:
> From: Wardenjohn
>
> livepatch.h uses KLP_UNDEFINED / KLP_UNPATCHED / KLP_PATCHED for the klp
> transition state. When livepatch is ready but idle, using KLP_UNDEFINED
> seems very confusing. In order not to introduce potential risks to the
> kernel, just add comments to these states.
> ---
>  include/linux/livepatch.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 9b9b38e89563..b6a214f2f8e3 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -18,9 +18,9 @@
>  #if IS_ENABLED(CONFIG_LIVEPATCH)
>
>  /* task patch states */
> -#define KLP_UNDEFINED   -1
> -#define KLP_UNPATCHED    0
> -#define KLP_PATCHED      1
> +#define KLP_UNDEFINED   -1      /* idle, no transition in progress */
> +#define KLP_UNPATCHED    0      /* transitioning to unpatched state */
> +#define KLP_PATCHED      1      /* transitioning to patched state */

Instead of the comments, how about we just rename them to

  KLP_TRANSITION_IDLE
  KLP_TRANSITION_UNPATCHED
  KLP_TRANSITION_PATCHED

which shouldn't break userspace AFAIK.

--
Josh
Re: [RFC PATCH 0/4] perf: Correlating user process data to samples
On Sat, Apr 13, 2024 at 08:48:57AM -0400, Steven Rostedt wrote:
> On Sat, 13 Apr 2024 12:53:38 +0200
> Peter Zijlstra wrote:
> > On Fri, Apr 12, 2024 at 09:37:24AM -0700, Beau Belgrave wrote:
> > > > Anyway, since we typically run stuff from NMI context, accessing user
> > > > data is 'interesting'. As such I would really like to make this work
> > > > depend on the call-graph rework that pushes all the user access bits
> > > > into return-to-user.
> > >
> > > Cool, I assume that's the SFRAME work? Are there pointers to work I
> > > could look at and think about what a rebase looks like? Or do you have
> > > someone in mind I should work with for this?
> >
> > I've been offline for a little while and still need to catch up with
> > things myself.
> >
> > Josh was working on that when I dropped off IIRC, I'm not entirely sure
> > where things are at currently (and there is no way I can ever hope to
> > process the backlog).
> >
> > Anybody know where we are with that?
>
> It's still very much on my RADAR, but with layoffs and such, my
> priorities have unfortunately changed. I'm hoping to start helping out
> in the near future though (in a month or two).
>
> Josh was working on it, but I think he got pulled off onto other
> priorities too :-p

Yeah, this is still a priority for me and I hope to get back to it over
the next few weeks (crosses fingers).

--
Josh
Re: [POC 0/7] livepatch: Make livepatch states, callbacks, and shadow variables work together
On Fri, Nov 10, 2023 at 06:04:21PM +0100, Petr Mladek wrote:
> This POC is a material for the discussion "Simplify Livepatch Callbacks,
> Shadow Variables, and States handling" at LPC 2023, see
>   https://lpc.events/event/17/contributions/1541/
>
> It obsoletes the patchset adding the garbage collection of shadow
> variables. This new solution is based on ideas from Nicolai Stange.
> And it should also be in sync with Josh's ideas mentioned in the thread
> about the garbage collection, see
>   https://lore.kernel.org/r/20230204235910.4j4ame5ntqogqi7m@treble

Nice! I like how it brings the "features" together and makes them easy
to use. This looks like a vast improvement.

Was there a reason to change the naming? I'm thinking setup / enable /
disable / release is less precise than pre_patch / post_patch /
pre_unpatch / post_unpatch.

Also, I'm thinking "replaced" instead of "obsolete" would be more
consistent with the existing terminology. For example, in
__klp_enable_patch():

	ret = klp_setup_states(patch);
	if (ret)
		goto err;

	if (patch->replace)
		klp_disable_obsolete_states(patch);

it's not immediately clear why "disable obsolete" would be the "replace"
counterpart to "setup". Similarly, in klp_complete_transition():

	if (klp_transition_patch->replace && klp_target_state == KLP_PATCHED) {
		klp_unpatch_replaced_patches(klp_transition_patch);
		klp_discard_nops(klp_transition_patch);
		klp_release_obsolete_states(klp_transition_patch);
	}

it's a little jarring to have "unpatch replaced" followed by "release
obsolete".

So instead of:

	int klp_setup_states(struct klp_patch *patch);
	void klp_enable_states(struct klp_patch *patch);
	void klp_disable_states(struct klp_patch *patch);
	void klp_release_states(struct klp_patch *patch);

	void klp_enable_obsolete_states(struct klp_patch *patch);
	void klp_disable_obsolete_states(struct klp_patch *patch);
	void klp_release_obsolete_states(struct klp_patch *patch);

how about something like:

	int klp_states_pre_patch(void);
	void klp_states_post_patch(void);
	void klp_states_pre_unpatch(void);
	void klp_states_post_unpatch(void);

	void klp_states_post_patch_replaced(void);
	void klp_states_pre_unpatch_replaced(void);
	void klp_states_post_unpatch_replaced(void);

?

(note that passing klp_transition_patch might be optional since state.c
already has visibility to it anyway)

--
Josh
Re: [RFC PATCH 07/86] Revert "livepatch,sched: Add livepatch task switching to cond_resched()"
On Thu, Nov 09, 2023 at 02:50:48PM -0800, Ankur Arora wrote:
> > > I guess I'm not fully understanding what the cond rescheds are for. But
> > > would an IPI to all CPUs setting NEED_RESCHED, fix it?
>
> Yeah. We could just temporarily toggle to full preemption, when
> NEED_RESCHED_LAZY is always upgraded to NEED_RESCHED which will
> then send IPIs.
>
> > If all livepatch arches had the ORC unwinder, yes.
> >
> > The problem is that frame pointer (and similar) unwinders can't reliably
> > unwind past an interrupt frame.
>
> Ah, I wonder if we could just disable the preempt_schedule_irq() path
> temporarily? Hooking into schedule() alongside something like this:
>
> @@ -379,7 +379,7 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
>
>  void irqentry_exit_cond_resched(void)
>  {
> -	if (!preempt_count()) {
> +	if (klp_cond_resched_disable() && !preempt_count()) {
>
> The problem would be tasks that don't go through any preemptible
> sections.

Let me back up a bit and explain what klp is trying to do.

When a livepatch is applied, klp needs to unwind all the tasks,
preferably within a reasonable amount of time.

We can't unwind task A from task B while task A is running, since task A
could be changing the stack during the unwind. So task A needs to be
blocked or asleep. The only exception to that is if the unwind happens
in the context of task A itself.

The problem we were seeing was CPU-bound kthreads (e.g., vhost_worker)
not getting patched within a reasonable amount of time. We fixed it by
hooking the klp unwind into cond_resched() so it can unwind from the
task itself.

It only worked because we had a non-preempted hook (because non-ORC
unwinders can't unwind reliably through preemption) which called klp to
unwind from the context of the task.

Without something to hook into, we have a problem. We could of course
hook into schedule(), but if the kthread never calls schedule() from a
non-preempted context then it still doesn't help.

--
Josh
Re: [RFC PATCH 07/86] Revert "livepatch,sched: Add livepatch task switching to cond_resched()"
On Thu, Nov 09, 2023 at 12:31:47PM -0500, Steven Rostedt wrote:
> On Thu, 9 Nov 2023 09:26:37 -0800
> Josh Poimboeuf wrote:
> > On Tue, Nov 07, 2023 at 06:16:09PM -0500, Steven Rostedt wrote:
> > > On Tue, 7 Nov 2023 13:56:53 -0800
> > > Ankur Arora wrote:
> > > > This reverts commit e3ff7c609f39671d1aaff4fb4a8594e14f3e03f8.
> > > >
> > > > Note that removing this commit reintroduces "live patches failing to
> > > > complete within a reasonable amount of time due to CPU-bound kthreads."
> > > >
> > > > Unfortunately this fix depends quite critically on PREEMPT_DYNAMIC and
> > > > existence of cond_resched() so this will need an alternate fix.
> >
> > We definitely don't want to introduce a regression, something will need
> > to be figured out before removing cond_resched().
> >
> > We could hook into preempt_schedule_irq(), but that wouldn't work for
> > non-ORC.
> >
> > Another option would be to hook into schedule(). Then livepatch could
> > set TIF_NEED_RESCHED on remaining unpatched tasks. But again if they go
> > through the preemption path then we have the same problem for non-ORC.
> >
> > Worst case we'll need to sprinkle cond_livepatch() everywhere :-/
>
> I guess I'm not fully understanding what the cond rescheds are for. But
> would an IPI to all CPUs setting NEED_RESCHED, fix it?

If all livepatch arches had the ORC unwinder, yes.

The problem is that frame pointer (and similar) unwinders can't reliably
unwind past an interrupt frame.

--
Josh
Re: [RFC PATCH 07/86] Revert "livepatch,sched: Add livepatch task switching to cond_resched()"
On Tue, Nov 07, 2023 at 06:16:09PM -0500, Steven Rostedt wrote:
> On Tue, 7 Nov 2023 13:56:53 -0800
> Ankur Arora wrote:
> > This reverts commit e3ff7c609f39671d1aaff4fb4a8594e14f3e03f8.
> >
> > Note that removing this commit reintroduces "live patches failing to
> > complete within a reasonable amount of time due to CPU-bound kthreads."
> >
> > Unfortunately this fix depends quite critically on PREEMPT_DYNAMIC and
> > existence of cond_resched() so this will need an alternate fix.

We definitely don't want to introduce a regression, something will need
to be figured out before removing cond_resched().

We could hook into preempt_schedule_irq(), but that wouldn't work for
non-ORC.

Another option would be to hook into schedule(). Then livepatch could
set TIF_NEED_RESCHED on remaining unpatched tasks. But again if they go
through the preemption path then we have the same problem for non-ORC.

Worst case we'll need to sprinkle cond_livepatch() everywhere :-/

--
Josh
Re: [PATCH 01/15] objtool: Find a destination for jumps beyond the section end
On Tue, Apr 20, 2021 at 01:25:43PM -0700, Sami Tolvanen wrote:
> On Tue, Apr 20, 2021 at 11:14 AM Josh Poimboeuf wrote:
> > On Fri, Apr 16, 2021 at 01:38:30PM -0700, Sami Tolvanen wrote:
> > > With -ffunction-sections, Clang can generate a jump beyond the end of
> > > a section when the section ends in an unreachable instruction.
> >
> > Why? Can you show an example?
>
> Here's the warning I'm seeing when building allyesconfig + CFI:
>
>   vmlinux.o: warning: objtool:
>   rockchip_spi_transfer_one.f088382d97b74759d70e27e891fe8f1c()+0x149:
>   can't find jump dest instruction at
>   .text.rockchip_spi_transfer_one.f088382d97b74759d70e27e891fe8f1c+0x7dc
>
> $ objdump -d -r -j .text.rockchip_spi_transfer_one.f088382d97b74759d70e27e891fe8f1c vmlinux.o
> :
> ...
>  149:  0f 85 8d 06 00 00    jne    7dc <.compoundliteral.4>
> ...
>  7d7:  e8 00 00 00 00       callq  7dc <.compoundliteral.4>
>        7d8: R_X86_64_PLT32  __stack_chk_fail-0x4

Instead of silencing the warning by faking the jump destination, I'd
rather improve the warning to something like

  "warning: rockchip_spi_transfer_one() falls through to the next function"

which is what we normally do in this type of situation.

It may be caused by UB, or a compiler bug, but either way we should
figure out the root cause.

--
Josh
Re: [PATCH 02/15] objtool: Add CONFIG_CFI_CLANG support
On Fri, Apr 16, 2021 at 01:38:31PM -0700, Sami Tolvanen wrote:
> +static int fix_cfi_relocs(const struct elf *elf)
> +{
> +	struct section *sec, *tmpsec;
> +	struct reloc *reloc, *tmpreloc;
> +
> +	list_for_each_entry_safe(sec, tmpsec, &elf->sections, list) {
> +		list_for_each_entry_safe(reloc, tmpreloc, &sec->reloc_list, list) {

These loops don't remove structs from the lists, so are the "safe"
iterators really needed?

> +			struct symbol *sym;
> +			struct reloc *func_reloc;
> +
> +			/*
> +			 * CONFIG_CFI_CLANG replaces function relocations to refer
> +			 * to an intermediate jump table. Undo the conversion so
> +			 * objtool can make sense of things.
> +			 */

I think this comment could be clearer if it were placed above the
function.

> +			if (!reloc->sym->sec->cfi_jt)
> +				continue;
> +
> +			if (reloc->sym->type == STT_SECTION)
> +				sym = find_func_by_offset(reloc->sym->sec,
> +							  reloc->addend);
> +			else
> +				sym = reloc->sym;
> +
> +			if (!sym || !sym->sec)
> +				continue;

This should be a fatal error, it probably means something's gone wrong
with the reading of the jump table.

> +
> +			/*
> +			 * The jump table immediately jumps to the actual function,
> +			 * so look up the relocation there.
> +			 */
> +			func_reloc = find_reloc_by_dest_range(elf, sym->sec, sym->offset, 5);

The jump instruction's reloc is always at offset 1, so this can maybe be
optimized to:

	func_reloc = find_reloc_by_dest(elf, sym->sec, sym->offset+1);

though of course that will probably break (future) arm64 objtool. Maybe
an arch-specific offset from the start of the insn would be good.

> +			if (!func_reloc || !func_reloc->sym)
> +				continue;

This should also be an error.

> +
> +			reloc->sym = func_reloc->sym;

I think we should also do 'reloc->addend = 0', because of the
STT_SECTION case:

  0025 259e000b R_X86_64_32S  .text..L.cfi.jumptable.8047 + 8

which then translates to

  0009 06320004 R_X86_64_PLT32  x2apic_prepare_cpu - 4

so the original addend of '8' is no longer valid.

Copying the '-4' wouldn't be right either, because that only applies to
a PLT32 text reloc. A '0' should be appropriate for the 32S data reloc.

This addend might not actually be read by any code, so it may not
currently be an actual bug, but better safe than sorry.

--
Josh
Re: [PATCH 01/15] objtool: Find a destination for jumps beyond the section end
On Fri, Apr 16, 2021 at 01:38:30PM -0700, Sami Tolvanen wrote:
> With -ffunction-sections, Clang can generate a jump beyond the end of
> a section when the section ends in an unreachable instruction.

Why? Can you show an example?

--
Josh
[tip: objtool/core] x86/crypto/aesni-intel_avx: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     e163be86fff3deec70f63330fc43fedf892c9aee
Gitweb:        https://git.kernel.org/tip/e163be86fff3deec70f63330fc43fedf892c9aee
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:17 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:34 -05:00

x86/crypto/aesni-intel_avx: Standardize stack alignment prologue

Use RBP instead of R14 for saving the old stack pointer before
realignment. This resembles what compilers normally do.

This enables ORC unwinding by allowing objtool to understand the stack
realignment.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/02d00a0903a0959f4787e186e2a07d271e1f63d4.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/aesni-intel_avx-x86_64.S | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index 188f184..98e3552 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -251,22 +251,20 @@ VARIABLE_OFFSET = 16*8
 .macro FUNC_SAVE
         push    %r12
         push    %r13
-        push    %r14
         push    %r15
-        mov     %rsp, %r14
-
-
+        push    %rbp
+        mov     %rsp, %rbp
         sub     $VARIABLE_OFFSET, %rsp
         and     $~63, %rsp      # align rsp to 64 bytes
 .endm

 .macro FUNC_RESTORE
-        mov     %r14, %rsp
+        mov     %rbp, %rsp
+        pop     %rbp
         pop     %r15
-        pop     %r14
         pop     %r13
         pop     %r12
 .endm
[tip: objtool/core] x86/crypto/aesni-intel_avx: Remove unused macros
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     4f08300916e882a0c34a2f325ff3fea2be2e57b3
Gitweb:        https://git.kernel.org/tip/4f08300916e882a0c34a2f325ff3fea2be2e57b3
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:15 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:33 -05:00

x86/crypto/aesni-intel_avx: Remove unused macros

These macros are no longer used; remove them.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/53f7136ea93ebdbca399959e6d2991ecb46e733e.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/aesni-intel_avx-x86_64.S | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index 2cf8e94..4fdf38e 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -212,10 +212,6 @@ HashKey_8_k = 16*21   # store XOR of HashKey^8 <<1 mod poly here (for Karatsu
 #define arg4 %rcx
 #define arg5 %r8
 #define arg6 %r9
-#define arg7 STACK_OFFSET+8*1(%r14)
-#define arg8 STACK_OFFSET+8*2(%r14)
-#define arg9 STACK_OFFSET+8*3(%r14)
-#define arg10 STACK_OFFSET+8*4(%r14)
 #define keysize 2*15*16(arg1)

 i = 0
@@ -237,9 +233,6 @@ define_reg j %j
 .noaltmacro
 .endm

-# need to push 4 registers into stack to maintain
-STACK_OFFSET = 8*4
-
 TMP1 = 16*0    # Temporary storage for AAD
 TMP2 = 16*1    # Temporary storage for AES State 2 (State 1 is stored in an XMM register)
 TMP3 = 16*2    # Temporary storage for AES State 3
@@ -256,7 +249,6 @@ VARIABLE_OFFSET = 16*8

 .macro FUNC_SAVE
-#the number of pushes must equal STACK_OFFSET
         push    %r12
         push    %r13
         push    %r14
[tip: objtool/core] x86/crypto/aesni-intel_avx: Fix register usage comments
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     ff5796b6dbea4763fdca002101e32b60aa17f8e8
Gitweb:        https://git.kernel.org/tip/ff5796b6dbea4763fdca002101e32b60aa17f8e8
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:16 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:33 -05:00

x86/crypto/aesni-intel_avx: Fix register usage comments

Fix register usage comments to match reality.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/8655d4513a0ed1eddec609165064153973010aa2.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/aesni-intel_avx-x86_64.S | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index 4fdf38e..188f184 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -286,7 +286,7 @@ VARIABLE_OFFSET = 16*8
 # combined for GCM encrypt and decrypt functions
 # clobbering all xmm registers
-# clobbering r10, r11, r12, r13, r14, r15
+# clobbering r10, r11, r12, r13, r15, rax
 .macro GCM_ENC_DEC INITIAL_BLOCKS GHASH_8_ENCRYPT_8_PARALLEL GHASH_LAST_8 GHASH_MUL ENC_DEC REP
         vmovdqu AadHash(arg2), %xmm8
         vmovdqu HashKey(arg2), %xmm13      # xmm13 = HashKey
@@ -988,7 +988,7 @@ _partial_block_done_\@:
 ## num_initial_blocks = b mod 4#
 ## encrypt the initial num_initial_blocks blocks and apply ghash on the ciphertext
 ## r10, r11, r12, rax are clobbered
-## arg1, arg3, arg4, r14 are used as a pointer only, not modified
+## arg1, arg2, arg3, arg4 are used as pointers only, not modified
 .macro INITIAL_BLOCKS_AVX REP num_initial_blocks T1 T2 T3 T4 T5 CTR XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 T6 T_key ENC_DEC
         i = (8-\num_initial_blocks)
@@ -1223,7 +1223,7 @@ _initial_blocks_done\@:
 # encrypt 8 blocks at a time
 # ghash the 8 previously encrypted ciphertext blocks
-# arg1, arg3, arg4 are used as pointers only, not modified
+# arg1, arg2, arg3, arg4 are used as pointers only, not modified
 # r11 is the data offset value
 .macro GHASH_8_ENCRYPT_8_PARALLEL_AVX REP T1 T2 T3 T4 T5 T6 CTR XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 T7 loop_idx ENC_DEC
@@ -1936,7 +1936,7 @@ SYM_FUNC_END(aesni_gcm_finalize_avx_gen2)
 ## num_initial_blocks = b mod 4#
 ## encrypt the initial num_initial_blocks blocks and apply ghash on the ciphertext
 ## r10, r11, r12, rax are clobbered
-## arg1, arg3, arg4, r14 are used as a pointer only, not modified
+## arg1, arg2, arg3, arg4 are used as pointers only, not modified
 .macro INITIAL_BLOCKS_AVX2 REP num_initial_blocks T1 T2 T3 T4 T5 CTR XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 T6 T_key ENC_DEC VER
         i = (8-\num_initial_blocks)
@@ -2178,7 +2178,7 @@ _initial_blocks_done\@:
 # encrypt 8 blocks at a time
 # ghash the 8 previously encrypted ciphertext blocks
-# arg1, arg3, arg4 are used as pointers only, not modified
+# arg1, arg2, arg3, arg4 are used as pointers only, not modified
 # r11 is the data offset value
 .macro GHASH_8_ENCRYPT_8_PARALLEL_AVX2 REP T1 T2 T3 T4 T5 T6 CTR XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 T7 loop_idx ENC_DEC
[tip: objtool/core] objtool: Support asm jump tables
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     99033461e685b48549ec77608b4bda75ddf772ce
Gitweb:        https://git.kernel.org/tip/99033461e685b48549ec77608b4bda75ddf772ce
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:14 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:32 -05:00

objtool: Support asm jump tables

Objtool detection of asm jump tables would normally just work, except
for the fact that asm retpolines use alternatives. Objtool thinks the
alternative code path (a jump to the retpoline) is a sibling call.

Don't treat alternative indirect branches as sibling calls when the
original instruction has a jump table.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/460cf4dc675d64e1124146562cabd2c05aa322e8.1614182415.git.jpoim...@redhat.com
---
 tools/objtool/check.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index a0f762a..46621e8 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -108,6 +108,18 @@ static struct instruction *prev_insn_same_sym(struct objtool_file *file,
 	for (insn = next_insn_same_sec(file, insn); insn;		\
 	     insn = next_insn_same_sec(file, insn))

+static bool is_jump_table_jump(struct instruction *insn)
+{
+	struct alt_group *alt_group = insn->alt_group;
+
+	if (insn->jump_table)
+		return true;
+
+	/* Retpoline alternative for a jump table? */
+	return alt_group && alt_group->orig_group &&
+	       alt_group->orig_group->first_insn->jump_table;
+}
+
 static bool is_sibling_call(struct instruction *insn)
 {
 	/*
@@ -120,7 +132,7 @@ static bool is_sibling_call(struct instruction *insn)

 	/* An indirect jump is either a sibling call or a jump to a table. */
 	if (insn->type == INSN_JUMP_DYNAMIC)
-		return list_empty(&insn->alts);
+		return !is_jump_table_jump(insn);

 	/* add_jump_destinations() sets insn->call_dest for sibling calls. */
 	return (is_static_jump(insn) && insn->call_dest);
[tip: objtool/core] x86/crypto/sha512-avx2: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     ec063e090bd6487097d459bb4272508b78448270
Gitweb:        https://git.kernel.org/tip/ec063e090bd6487097d459bb4272508b78448270
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:24 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:36 -05:00

x86/crypto/sha512-avx2: Standardize stack alignment prologue

Use a more standard prologue for saving the stack pointer before
realigning the stack.

This enables ORC unwinding by allowing objtool to understand the stack
realignment.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/b1a7b29fcfc65d60a3b6e77ef75f4762a5b8488d.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/sha512-avx2-asm.S | 42 ++++++++++++++++++++----------------------
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/arch/x86/crypto/sha512-avx2-asm.S b/arch/x86/crypto/sha512-avx2-asm.S
index 3a44bdc..072cb0f 100644
--- a/arch/x86/crypto/sha512-avx2-asm.S
+++ b/arch/x86/crypto/sha512-avx2-asm.S
@@ -102,17 +102,13 @@ SRND_SIZE = 1*8
 INP_SIZE = 1*8
 INPEND_SIZE = 1*8
 CTX_SIZE = 1*8
-RSPSAVE_SIZE = 1*8
-GPRSAVE_SIZE = 5*8

 frame_XFER = 0
 frame_SRND = frame_XFER + XFER_SIZE
 frame_INP = frame_SRND + SRND_SIZE
 frame_INPEND = frame_INP + INP_SIZE
 frame_CTX = frame_INPEND + INPEND_SIZE
-frame_RSPSAVE = frame_CTX + CTX_SIZE
-frame_GPRSAVE = frame_RSPSAVE + RSPSAVE_SIZE
-frame_size = frame_GPRSAVE + GPRSAVE_SIZE
+frame_size = frame_CTX + CTX_SIZE

 ## assume buffers not aligned
 #define VMOVDQ vmovdqu
@@ -570,18 +566,18 @@ frame_size = frame_GPRSAVE + GPRSAVE_SIZE
 # "blocks" is the message length in SHA512 blocks
 SYM_FUNC_START(sha512_transform_rorx)
+        # Save GPRs
+        push    %rbx
+        push    %r12
+        push    %r13
+        push    %r14
+        push    %r15
+
         # Allocate Stack Space
-        mov     %rsp, %rax
+        push    %rbp
+        mov     %rsp, %rbp
         sub     $frame_size, %rsp
         and     $~(0x20 - 1), %rsp
-        mov     %rax, frame_RSPSAVE(%rsp)
-
-        # Save GPRs
-        mov     %rbx, 8*0+frame_GPRSAVE(%rsp)
-        mov     %r12, 8*1+frame_GPRSAVE(%rsp)
-        mov     %r13, 8*2+frame_GPRSAVE(%rsp)
-        mov     %r14, 8*3+frame_GPRSAVE(%rsp)
-        mov     %r15, 8*4+frame_GPRSAVE(%rsp)

         shl     $7, NUM_BLKS    # convert to bytes
         jz      done_hash
@@ -672,15 +668,17 @@ loop2:

 done_hash:

-# Restore GPRs
-        mov     8*0+frame_GPRSAVE(%rsp), %rbx
-        mov     8*1+frame_GPRSAVE(%rsp), %r12
-        mov     8*2+frame_GPRSAVE(%rsp), %r13
-        mov     8*3+frame_GPRSAVE(%rsp), %r14
-        mov     8*4+frame_GPRSAVE(%rsp), %r15
-
         # Restore Stack Pointer
-        mov     frame_RSPSAVE(%rsp), %rsp
+        mov     %rbp, %rsp
+        pop     %rbp
+
+        # Restore GPRs
+        pop     %r15
+        pop     %r14
+        pop     %r13
+        pop     %r12
+        pop     %rbx
+
         ret
 SYM_FUNC_END(sha512_transform_rorx)
[tip: objtool/core] x86/crypto/sha_ni: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     35a0067d2c02a7c35466db5f207b7b9265de84d9
Gitweb:        https://git.kernel.org/tip/35a0067d2c02a7c35466db5f207b7b9265de84d9
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:20 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:35 -05:00

x86/crypto/sha_ni: Standardize stack alignment prologue

Use a more standard prologue for saving the stack pointer before
realigning the stack.

This enables ORC unwinding by allowing objtool to understand the stack
realignment.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/5033e1a79867dff1b18e1b4d0783c38897d3f223.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/sha1_ni_asm.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/sha1_ni_asm.S b/arch/x86/crypto/sha1_ni_asm.S
index 11efe3a..5d8415f 100644
--- a/arch/x86/crypto/sha1_ni_asm.S
+++ b/arch/x86/crypto/sha1_ni_asm.S
@@ -59,8 +59,6 @@
 #define DATA_PTR        %rsi    /* 2nd arg */
 #define NUM_BLKS        %rdx    /* 3rd arg */

-#define RSPSAVE         %rax
-
 /* gcc conversion */
 #define FRAME_SIZE      32      /* space for 2x16 bytes */

@@ -96,7 +94,8 @@
 .text
 .align 32
 SYM_FUNC_START(sha1_ni_transform)
-        mov     %rsp, RSPSAVE
+        push    %rbp
+        mov     %rsp, %rbp
         sub     $FRAME_SIZE, %rsp
         and     $~0xF, %rsp

@@ -288,7 +287,8 @@ SYM_FUNC_START(sha1_ni_transform)
         pextrd  $3, E0, 1*16(DIGEST_PTR)

 .Ldone_hash:
-        mov     RSPSAVE, %rsp
+        mov     %rbp, %rsp
+        pop     %rbp

         ret
 SYM_FUNC_END(sha1_ni_transform)
[tip: objtool/core] x86/crypto/crc32c-pcl-intel: Standardize jump table
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     2b02ed55482a1c5c310a7f53707292fcf1601e7a
Gitweb:        https://git.kernel.org/tip/2b02ed55482a1c5c310a7f53707292fcf1601e7a
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:19 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:34 -05:00

x86/crypto/crc32c-pcl-intel: Standardize jump table

Simplify the jump table code so that it resembles a compiler-generated
table.

This enables ORC unwinding by allowing objtool to follow all the
potential code paths.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/5357a039def90b8ef6b5874ef12cda008ecf18ba.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/crc32c-pcl-intel-asm_64.S | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
index 884dc76..ac1f303 100644
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -53,7 +53,7 @@
 .endm

 .macro JMPTBL_ENTRY i
-.word crc_\i - crc_array
+.quad crc_\i
 .endm

 .macro JNC_LESS_THAN j
@@ -168,10 +168,7 @@ continue_block:
         xor     crc2, crc2

         ## branch into array
-        lea     jump_table(%rip), %bufp
-        movzwq  (%bufp, %rax, 2), len
-        lea     crc_array(%rip), %bufp
-        lea     (%bufp, len, 1), %bufp
+        mov     jump_table(,%rax,8), %bufp
         JMP_NOSPEC      bufp
[tip: objtool/core] x86/crypto/camellia-aesni-avx2: Unconditionally allocate stack buffer
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     dabe5167a3cbb4bf16b20c0e5b6497513e2e3a08
Gitweb:        https://git.kernel.org/tip/dabe5167a3cbb4bf16b20c0e5b6497513e2e3a08
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:18 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:34 -05:00

x86/crypto/camellia-aesni-avx2: Unconditionally allocate stack buffer

A conditional stack allocation violates traditional unwinding
requirements when a single instruction can have differing stack layouts.

There's no benefit in allocating the stack buffer conditionally. Just do
it unconditionally.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/85ac96613ee5784b6239c18d3f68b1f3c509caa3.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index 782e971..706f708 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -990,6 +990,7 @@ SYM_FUNC_START(camellia_cbc_dec_32way)
          *      %rdx: src (32 blocks)
          */
         FRAME_BEGIN
+        subq $(16 * 32), %rsp;

         vzeroupper;

@@ -1002,7 +1003,6 @@ SYM_FUNC_START(camellia_cbc_dec_32way)
                      %ymm8, %ymm9, %ymm10, %ymm11, %ymm12, %ymm13, %ymm14,
                      %ymm15, %rdx, (key_table)(CTX, %r8, 8));

-        movq %rsp, %r10;
         cmpq %rsi, %rdx;
         je .Lcbc_dec_use_stack;

@@ -1015,7 +1015,6 @@ SYM_FUNC_START(camellia_cbc_dec_32way)
          * dst still in-use (because dst == src), so use stack for temporary
          * storage.
          */
-        subq $(16 * 32), %rsp;
         movq %rsp, %rax;

 .Lcbc_dec_continue:
@@ -1025,7 +1024,6 @@ SYM_FUNC_START(camellia_cbc_dec_32way)
         vpxor %ymm7, %ymm7, %ymm7;
         vinserti128 $1, (%rdx), %ymm7, %ymm7;
         vpxor (%rax), %ymm7, %ymm7;
-        movq %r10, %rsp;
         vpxor (0 * 32 + 16)(%rdx), %ymm6, %ymm6;
         vpxor (1 * 32 + 16)(%rdx), %ymm5, %ymm5;
         vpxor (2 * 32 + 16)(%rdx), %ymm4, %ymm4;
@@ -1047,6 +1045,7 @@ SYM_FUNC_START(camellia_cbc_dec_32way)

         vzeroupper;

+        addq $(16 * 32), %rsp;
         FRAME_END
         ret;
 SYM_FUNC_END(camellia_cbc_dec_32way)
[tip: objtool/core] x86/crypto/sha512-ssse3: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip: Commit-ID: 27d26793f2105281d9374928448142777cef6f74 Gitweb: https://git.kernel.org/tip/27d26793f2105281d9374928448142777cef6f74 Author:Josh Poimboeuf AuthorDate:Wed, 24 Feb 2021 10:29:25 -06:00 Committer: Josh Poimboeuf CommitterDate: Mon, 19 Apr 2021 12:36:37 -05:00 x86/crypto/sha512-ssse3: Standardize stack alignment prologue Use a more standard prologue for saving the stack pointer before realigning the stack. This enables ORC unwinding by allowing objtool to understand the stack realignment. Signed-off-by: Josh Poimboeuf Tested-by: Ard Biesheuvel Acked-by: Ard Biesheuvel Tested-by: Sami Tolvanen Acked-by: Peter Zijlstra (Intel) Acked-by: Herbert Xu Link: https://lore.kernel.org/r/6ecaaac9f3828fbb903513bf90c34a08380a8e35.1614182415.git.jpoim...@redhat.com --- arch/x86/crypto/sha512-ssse3-asm.S | 41 + 1 file changed, 19 insertions(+), 22 deletions(-) diff --git a/arch/x86/crypto/sha512-ssse3-asm.S b/arch/x86/crypto/sha512-ssse3-asm.S index 50812af..bd51c90 100644 --- a/arch/x86/crypto/sha512-ssse3-asm.S +++ b/arch/x86/crypto/sha512-ssse3-asm.S @@ -74,14 +74,10 @@ tmp0 = %rax W_SIZE = 80*8 WK_SIZE = 2*8 -RSPSAVE_SIZE = 1*8 -GPRSAVE_SIZE = 5*8 frame_W = 0 frame_WK = frame_W + W_SIZE -frame_RSPSAVE = frame_WK + WK_SIZE -frame_GPRSAVE = frame_RSPSAVE + RSPSAVE_SIZE -frame_size = frame_GPRSAVE + GPRSAVE_SIZE +frame_size = frame_WK + WK_SIZE # Useful QWORD "arrays" for simpler memory references # MSG, DIGEST, K_t, W_t are arrays @@ -283,18 +279,18 @@ SYM_FUNC_START(sha512_transform_ssse3) test msglen, msglen je nowork + # Save GPRs + push%rbx + push%r12 + push%r13 + push%r14 + push%r15 + # Allocate Stack Space - mov %rsp, %rax + push%rbp + mov %rsp, %rbp sub $frame_size, %rsp and $~(0x20 - 1), %rsp - mov %rax, frame_RSPSAVE(%rsp) - - # Save GPRs - mov %rbx, frame_GPRSAVE(%rsp) - mov %r12, frame_GPRSAVE +8*1(%rsp) - mov %r13, frame_GPRSAVE +8*2(%rsp) - mov %r14, frame_GPRSAVE +8*3(%rsp) - mov 
%r15, frame_GPRSAVE +8*4(%rsp) updateblock: @@ -355,15 +351,16 @@ updateblock: dec msglen jnz updateblock - # Restore GPRs - mov frame_GPRSAVE(%rsp), %rbx - mov frame_GPRSAVE +8*1(%rsp), %r12 - mov frame_GPRSAVE +8*2(%rsp), %r13 - mov frame_GPRSAVE +8*3(%rsp), %r14 - mov frame_GPRSAVE +8*4(%rsp), %r15 - # Restore Stack Pointer - mov frame_RSPSAVE(%rsp), %rsp + mov %rbp, %rsp + pop %rbp + + # Restore GPRs + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbx nowork: ret
[tip: objtool/core] x86/crypto/sha256-avx2: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip: Commit-ID: ce5846668076aa76a17ab559f0296374e3611fec Gitweb: https://git.kernel.org/tip/ce5846668076aa76a17ab559f0296374e3611fec Author:Josh Poimboeuf AuthorDate:Wed, 24 Feb 2021 10:29:22 -06:00 Committer: Josh Poimboeuf CommitterDate: Mon, 19 Apr 2021 12:36:36 -05:00 x86/crypto/sha256-avx2: Standardize stack alignment prologue Use a more standard prologue for saving the stack pointer before realigning the stack. This enables ORC unwinding by allowing objtool to understand the stack realignment. Signed-off-by: Josh Poimboeuf Tested-by: Ard Biesheuvel Acked-by: Ard Biesheuvel Tested-by: Sami Tolvanen Acked-by: Peter Zijlstra (Intel) Acked-by: Herbert Xu Link: https://lore.kernel.org/r/8048e7444c49a8137f05265262b83dc50f8fb7f3.1614182415.git.jpoim...@redhat.com --- arch/x86/crypto/sha256-avx2-asm.S | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/arch/x86/crypto/sha256-avx2-asm.S b/arch/x86/crypto/sha256-avx2-asm.S index 11ff60c..4087f74 100644 --- a/arch/x86/crypto/sha256-avx2-asm.S +++ b/arch/x86/crypto/sha256-avx2-asm.S @@ -117,15 +117,13 @@ _XMM_SAVE_SIZE= 0 _INP_END_SIZE = 8 _INP_SIZE = 8 _CTX_SIZE = 8 -_RSP_SIZE = 8 _XFER = 0 _XMM_SAVE = _XFER + _XFER_SIZE _INP_END = _XMM_SAVE + _XMM_SAVE_SIZE _INP = _INP_END + _INP_END_SIZE _CTX = _INP + _INP_SIZE -_RSP = _CTX + _CTX_SIZE -STACK_SIZE = _RSP + _RSP_SIZE +STACK_SIZE = _CTX + _CTX_SIZE # rotate_Xs # Rotate values of symbols X0...X3 @@ -533,11 +531,11 @@ SYM_FUNC_START(sha256_transform_rorx) pushq %r14 pushq %r15 - mov %rsp, %rax + push%rbp + mov %rsp, %rbp + subq$STACK_SIZE, %rsp and $-32, %rsp # align rsp to 32 byte boundary - mov %rax, _RSP(%rsp) - shl $6, NUM_BLKS# convert to bytes jz done_hash @@ -704,7 +702,8 @@ only_one_block: done_hash: - mov _RSP(%rsp), %rsp + mov %rbp, %rsp + pop %rbp popq%r15 popq%r14
[tip: objtool/core] x86/crypto/sha1_avx2: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     20114c899cafa8313534a841cab0ab1f7ab09672
Gitweb:        https://git.kernel.org/tip/20114c899cafa8313534a841cab0ab1f7ab09672
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:21 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:35 -05:00

x86/crypto/sha1_avx2: Standardize stack alignment prologue

Use a more standard prologue for saving the stack pointer before
realigning the stack. This enables ORC unwinding by allowing objtool to
understand the stack realignment.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/fdaaf8670ed1f52f55ba9a6bbac98c1afddc1af6.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/sha1_avx2_x86_64_asm.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
index 1e594d6..5eed620 100644
--- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
+++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
@@ -645,9 +645,9 @@ _loop3:
 	RESERVE_STACK  = (W_SIZE*4 + 8+24)

 	/* Align stack */
-	mov	%rsp, %rbx
+	push	%rbp
+	mov	%rsp, %rbp
 	and	$~(0x20-1), %rsp
-	push	%rbx
 	sub	$RESERVE_STACK, %rsp

 	avx2_zeroupper

@@ -665,8 +665,8 @@ _loop3:

 	avx2_zeroupper

-	add	$RESERVE_STACK, %rsp
-	pop	%rsp
+	mov	%rbp, %rsp
+	pop	%rbp

 	pop	%r15
 	pop	%r14
[tip: objtool/core] x86/crypto: Enable objtool in crypto code
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     7d3d10e0e85fb7c23a86a70f795b1eabd2bc030b
Gitweb:        https://git.kernel.org/tip/7d3d10e0e85fb7c23a86a70f795b1eabd2bc030b
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:26 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:37 -05:00

x86/crypto: Enable objtool in crypto code

Now that all the stack alignment prologues have been cleaned up in the
crypto code, enable objtool. Among other benefits, this will allow ORC
unwinding to work.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/fc2a1918c50e33e46ef0e9a5de02743f2f6e3639.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/Makefile | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
index b28e36b..d0959e7 100644
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -2,8 +2,6 @@
 #
 # x86 crypto algorithms

-OBJECT_FILES_NON_STANDARD := y
-
 obj-$(CONFIG_CRYPTO_TWOFISH_586) += twofish-i586.o
 twofish-i586-y := twofish-i586-asm_32.o twofish_glue.o
 obj-$(CONFIG_CRYPTO_TWOFISH_X86_64) += twofish-x86_64.o
[tip: objtool/core] x86/crypto/sha512-avx: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip: Commit-ID: d61684b56edf369f0a6d388088d7c9d59f1618d4 Gitweb: https://git.kernel.org/tip/d61684b56edf369f0a6d388088d7c9d59f1618d4 Author:Josh Poimboeuf AuthorDate:Wed, 24 Feb 2021 10:29:23 -06:00 Committer: Josh Poimboeuf CommitterDate: Mon, 19 Apr 2021 12:36:36 -05:00 x86/crypto/sha512-avx: Standardize stack alignment prologue Use a more standard prologue for saving the stack pointer before realigning the stack. This enables ORC unwinding by allowing objtool to understand the stack realignment. Signed-off-by: Josh Poimboeuf Tested-by: Ard Biesheuvel Acked-by: Ard Biesheuvel Tested-by: Sami Tolvanen Acked-by: Peter Zijlstra (Intel) Acked-by: Herbert Xu Link: https://lore.kernel.org/r/d36e9ea1c819d87fa89b3df3fa83e2a1ede18146.1614182415.git.jpoim...@redhat.com --- arch/x86/crypto/sha512-avx-asm.S | 41 ++- 1 file changed, 19 insertions(+), 22 deletions(-) diff --git a/arch/x86/crypto/sha512-avx-asm.S b/arch/x86/crypto/sha512-avx-asm.S index 684d58c..3d8f0fd 100644 --- a/arch/x86/crypto/sha512-avx-asm.S +++ b/arch/x86/crypto/sha512-avx-asm.S @@ -76,14 +76,10 @@ tmp0= %rax W_SIZE = 80*8 # W[t] + K[t] | W[t+1] + K[t+1] WK_SIZE = 2*8 -RSPSAVE_SIZE = 1*8 -GPRSAVE_SIZE = 5*8 frame_W = 0 frame_WK = frame_W + W_SIZE -frame_RSPSAVE = frame_WK + WK_SIZE -frame_GPRSAVE = frame_RSPSAVE + RSPSAVE_SIZE -frame_size = frame_GPRSAVE + GPRSAVE_SIZE +frame_size = frame_WK + WK_SIZE # Useful QWORD "arrays" for simpler memory references # MSG, DIGEST, K_t, W_t are arrays @@ -281,18 +277,18 @@ SYM_FUNC_START(sha512_transform_avx) test msglen, msglen je nowork + # Save GPRs + push%rbx + push%r12 + push%r13 + push%r14 + push%r15 + # Allocate Stack Space - mov %rsp, %rax + push%rbp + mov %rsp, %rbp sub $frame_size, %rsp and $~(0x20 - 1), %rsp - mov %rax, frame_RSPSAVE(%rsp) - - # Save GPRs - mov %rbx, frame_GPRSAVE(%rsp) - mov %r12, frame_GPRSAVE +8*1(%rsp) - mov %r13, frame_GPRSAVE +8*2(%rsp) - mov %r14, frame_GPRSAVE 
+8*3(%rsp) - mov %r15, frame_GPRSAVE +8*4(%rsp) updateblock: @@ -353,15 +349,16 @@ updateblock: dec msglen jnz updateblock - # Restore GPRs - mov frame_GPRSAVE(%rsp), %rbx - mov frame_GPRSAVE +8*1(%rsp), %r12 - mov frame_GPRSAVE +8*2(%rsp), %r13 - mov frame_GPRSAVE +8*3(%rsp), %r14 - mov frame_GPRSAVE +8*4(%rsp), %r15 - # Restore Stack Pointer - mov frame_RSPSAVE(%rsp), %rsp + mov %rbp, %rsp + pop %rbp + + # Restore GPRs + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbx nowork: ret
Re: [PATCH v2] X86: Makefile: Replace -pg with CC_FLAGS_FTRACE
Adding Steven Rostedt (ftrace maintainer). On Fri, Apr 16, 2021 at 01:39:28PM +0800, zhaoxiao wrote: > In preparation for x86 supporting ftrace built on other compiler > options, let's have the x86 Makefiles remove the $(CC_FLAGS_FTRACE) > flags, whatever these may be, rather than assuming '-pg'. > > There should be no functional change as a result of this patch. > > Signed-off-by: zhaoxiao > --- > v2: add the same change for the other Makefile in arch/x86 directory. > arch/x86/entry/vdso/Makefile | 8 > arch/x86/kernel/Makefile | 16 > arch/x86/kernel/cpu/Makefile | 4 ++-- > arch/x86/lib/Makefile| 2 +- > arch/x86/mm/Makefile | 4 ++-- > arch/x86/xen/Makefile| 6 +++--- > 6 files changed, 20 insertions(+), 20 deletions(-) > > diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile > index 05c4abc2fdfd..c5bd91bf9f93 100644 > --- a/arch/x86/entry/vdso/Makefile > +++ b/arch/x86/entry/vdso/Makefile > @@ -96,10 +96,10 @@ $(vobjs): KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO) > $(GCC_PLUGINS_CFLAGS) $( > # > # vDSO code runs in userspace and -pg doesn't help with profiling anyway. > # > -CFLAGS_REMOVE_vclock_gettime.o = -pg > -CFLAGS_REMOVE_vdso32/vclock_gettime.o = -pg > -CFLAGS_REMOVE_vgetcpu.o = -pg > -CFLAGS_REMOVE_vsgx.o = -pg > +CFLAGS_REMOVE_vclock_gettime.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_vdso32/vclock_gettime.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_vgetcpu.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_vsgx.o = $(CC_FLAGS_FTRACE) > > # > # X32 processes use x32 vDSO to access 64bit kernel data. 
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile > index 2ddf08351f0b..2811fc6a17ba 100644 > --- a/arch/x86/kernel/Makefile > +++ b/arch/x86/kernel/Makefile > @@ -13,14 +13,14 @@ CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE) > > ifdef CONFIG_FUNCTION_TRACER > # Do not profile debug and lowlevel utilities > -CFLAGS_REMOVE_tsc.o = -pg > -CFLAGS_REMOVE_paravirt-spinlocks.o = -pg > -CFLAGS_REMOVE_pvclock.o = -pg > -CFLAGS_REMOVE_kvmclock.o = -pg > -CFLAGS_REMOVE_ftrace.o = -pg > -CFLAGS_REMOVE_early_printk.o = -pg > -CFLAGS_REMOVE_head64.o = -pg > -CFLAGS_REMOVE_sev-es.o = -pg > +CFLAGS_REMOVE_tsc.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_paravirt-spinlocks.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_pvclock.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_kvmclock.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_ftrace.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_early_printk.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_head64.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_sev-es.o = $(CC_FLAGS_FTRACE) > endif > > KASAN_SANITIZE_head$(BITS).o := n > diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile > index 637b499450d1..4435c6de9145 100644 > --- a/arch/x86/kernel/cpu/Makefile > +++ b/arch/x86/kernel/cpu/Makefile > @@ -5,8 +5,8 @@ > > # Don't trace early stages of a secondary CPU boot > ifdef CONFIG_FUNCTION_TRACER > -CFLAGS_REMOVE_common.o = -pg > -CFLAGS_REMOVE_perf_event.o = -pg > +CFLAGS_REMOVE_common.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_perf_event.o = $(CC_FLAGS_FTRACE) > endif > > # If these files are instrumented, boot hangs during the first second. 
> diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile > index bad4dee4f0e4..0aa71b8a5bc1 100644 > --- a/arch/x86/lib/Makefile > +++ b/arch/x86/lib/Makefile > @@ -21,7 +21,7 @@ KASAN_SANITIZE_cmdline.o := n > KCSAN_SANITIZE_cmdline.o := n > > ifdef CONFIG_FUNCTION_TRACER > -CFLAGS_REMOVE_cmdline.o = -pg > +CFLAGS_REMOVE_cmdline.o = $(CC_FLAGS_FTRACE) > endif > > CFLAGS_cmdline.o := -fno-stack-protector -fno-jump-tables > diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile > index 5864219221ca..91883d5a0293 100644 > --- a/arch/x86/mm/Makefile > +++ b/arch/x86/mm/Makefile > @@ -12,8 +12,8 @@ KASAN_SANITIZE_mem_encrypt_identity.o := n > KCSAN_SANITIZE := n > > ifdef CONFIG_FUNCTION_TRACER > -CFLAGS_REMOVE_mem_encrypt.o = -pg > -CFLAGS_REMOVE_mem_encrypt_identity.o = -pg > +CFLAGS_REMOVE_mem_encrypt.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_mem_encrypt_identity.o = $(CC_FLAGS_FTRACE) > endif > > obj-y:= init.o init_$(BITS).o fault.o > ioremap.o extable.o mmap.o \ > diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile > index 40b5779fce21..179dfc124c94 100644 > --- a/arch/x86/xen/Makefile > +++ b/arch/x86/xen/Makefile > @@ -2,9 +2,9 @@ > > ifdef CONFIG_FUNCTION_TRACER > # Do not profile debug and lowlevel utilities > -CFLAGS_REMOVE_spinlock.o = -pg > -CFLAGS_REMOVE_time.o = -pg > -CFLAGS_REMOVE_irq.o = -pg > +CFLAGS_REMOVE_spinlock.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_time.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_irq.o = $(CC_FLAGS_FTRACE) > endif > > # Make sure early boot has no stackprotector > -- > 2.20.1 > > > -- Josh
Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
On Mon, Apr 12, 2021 at 05:59:33PM +0100, Mark Brown wrote: > On Fri, Apr 09, 2021 at 05:32:27PM -0500, Josh Poimboeuf wrote: > > > Hm, for that matter, even without renaming things, a comment above > > stack_trace_save_tsk_reliable() describing the meaning of "reliable" > > would be a good idea. > > Might be better to place something at the prototype for > arch_stack_walk_reliable() or cross link the two since that's where any > new architectures should be starting, or perhaps even better to extend > the document that Mark wrote further and point to that from both places. > > Some more explict pointer to live patching as the only user would > definitely be good but I think the more important thing would be writing > down any assumptions in the API that aren't already written down and > we're supposed to be relying on. Mark's document captured a lot of it > but it sounds like there's more here, and even with knowing that this > interface is only used by live patch and digging into what it does it's > not always clear what happens to work with the code right now and what's > something that's suitable to be relied on. Something like so? From: Josh Poimboeuf Subject: [PATCH] livepatch: Clarify the meaning of 'reliable' Update the comments and documentation to reflect what 'reliable' unwinding actually means, in the context of live patching. 
Suggested-by: Mark Brown Signed-off-by: Josh Poimboeuf --- .../livepatch/reliable-stacktrace.rst | 26 + arch/x86/kernel/stacktrace.c | 6 include/linux/stacktrace.h| 29 +-- kernel/stacktrace.c | 7 - 4 files changed, 53 insertions(+), 15 deletions(-) diff --git a/Documentation/livepatch/reliable-stacktrace.rst b/Documentation/livepatch/reliable-stacktrace.rst index 67459d2ca2af..e325efc7e952 100644 --- a/Documentation/livepatch/reliable-stacktrace.rst +++ b/Documentation/livepatch/reliable-stacktrace.rst @@ -72,7 +72,21 @@ The unwinding process varies across architectures, their respective procedure call standards, and kernel configurations. This section describes common details that architectures should consider. -4.1 Identifying successful termination +4.1 Only preemptible code needs reliability detection +- + +The only current user of reliable stacktracing is livepatch, which only +calls it for a) inactive tasks; or b) the current task in task context. + +Therefore, the unwinder only needs to detect the reliability of stacks +involving *preemptible* code. + +Practically speaking, reliability of stacks involving *non-preemptible* +code is a "don't-care". It may help to return a wrong reliability +result for such cases, if it results in reduced complexity, since such +cases will not happen in practice. + +4.2 Identifying successful termination -- Unwinding may terminate early for a number of reasons, including: @@ -95,7 +109,7 @@ architectures verify that a stacktrace ends at an expected location, e.g. * On a specific stack expected for a kernel entry point (e.g. if the architecture has separate task and IRQ stacks). -4.2 Identifying unwindable code +4.3 Identifying unwindable code --- Unwinding typically relies on code following specific conventions (e.g. @@ -129,7 +143,7 @@ unreliable to unwind from, e.g. * Identifying specific portions of code using bounds information. 
-4.3 Unwinding across interrupts and exceptions +4.4 Unwinding across interrupts and exceptions -- At function call boundaries the stack and other unwind state is expected to be @@ -156,7 +170,7 @@ have no such cases) should attempt to unwind across exception boundaries, as doing so can prevent unnecessarily stalling livepatch consistency checks and permits livepatch transitions to complete more quickly. -4.4 Rewriting of return addresses +4.5 Rewriting of return addresses - Some trampolines temporarily modify the return address of a function in order @@ -222,7 +236,7 @@ middle of return_to_handler and can report this as unreliable. Architectures are not required to unwind from other trampolines which modify the return address. -4.5 Obscuring of return addresses +4.6 Obscuring of return addresses - Some trampolines do not rewrite the return address in order to intercept @@ -249,7 +263,7 @@ than the link register as would usually be the case. Architectures must either ensure that unwinders either reliably unwind such cases, or report the unwinding as unreliable. -4.6 Link register unreliability +4.7 Link register unreliability --- On some other architectures, 'call' instructions place
Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
On Fri, Apr 09, 2021 at 05:32:27PM -0500, Josh Poimboeuf wrote: > On Fri, Apr 09, 2021 at 05:05:58PM -0500, Madhavan T. Venkataraman wrote: > > > FWIW, over the years we've had zero issues with encoding the frame > > > pointer on x86. After you save pt_regs, you encode the frame pointer to > > > point to it. Ideally in the same macro so it's hard to overlook. > > > > > > > I had the same opinion. In fact, in my encoding scheme, I have additional > > checks to make absolutely sure that it is a true encoding and not stack > > corruption. The chances of all of those values accidentally matching are, > > well, null. > > Right, stack corruption -- which is already exceedingly rare -- would > have to be combined with a miracle or two in order to come out of the > whole thing marked as 'reliable' :-) > > And really, we already take a similar risk today by "trusting" the frame > pointer value on the stack to a certain extent. Oh yeah, I forgot to mention some more benefits of encoding the frame pointer (or marking pt_regs in some other way): a) Stack addresses can be printed properly: '%pS' for printing regs->pc and '%pB' for printing call returns. Using '%pS' for call returns (as arm64 seems to do today) will result in printing the wrong function when you have tail calls to noreturn functions on the stack (which is actually quite common for calls to panic(), die(), etc). More details: https://lkml.kernel.org/r/20210403155948.ubbgtwmlsdyar7yp@treble b) Stack dumps to the console can dump the exception registers they find along the way. This is actually quite nice for debugging. -- Josh
Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
On Fri, Apr 09, 2021 at 05:05:58PM -0500, Madhavan T. Venkataraman wrote: > > FWIW, over the years we've had zero issues with encoding the frame > > pointer on x86. After you save pt_regs, you encode the frame pointer to > > point to it. Ideally in the same macro so it's hard to overlook. > > > > I had the same opinion. In fact, in my encoding scheme, I have additional > checks to make absolutely sure that it is a true encoding and not stack > corruption. The chances of all of those values accidentally matching are, > well, null. Right, stack corruption -- which is already exceedingly rare -- would have to be combined with a miracle or two in order to come out of the whole thing marked as 'reliable' :-) And really, we already take a similar risk today by "trusting" the frame pointer value on the stack to a certain extent. > >> I think there's a lot more code that we cannot unwind, e.g. KVM > >> exception code, or almost anything marked with SYM_CODE_END(). > > > > Just a reminder that livepatch only unwinds blocked tasks (plus the > > 'current' task which calls into livepatch). So practically speaking, it > > doesn't matter whether the 'unreliable' detection has full coverage. > > The only exceptions which really matter are those which end up calling > > schedule(), e.g. preemption or page faults. > > > > Being able to consistently detect *all* possible unreliable paths would > > be nice in theory, but it's unnecessary and may not be worth the extra > > complexity. > > > > You do have a point. I tried to think of arch_stack_walk_reliable() as > something that should be implemented independent of livepatching. But > I could not really come up with a single example of where else it would > really be useful. > > So, if we assume that the reliable stack trace is solely for the purpose > of livepatching, I agree with your earlier comments as well. One thought: if folks really view this as a problem, it might help to just rename things to reduce confusion. 
For example, instead of calling it 'reliable', we could call it something more precise, like 'klp_reliable', to indicate that it's reliable enough for live patching. Then have a comment above 'klp_reliable' and/or stack_trace_save_tsk_klp_reliable() which describes what that means. Hm, for that matter, even without renaming things, a comment above stack_trace_save_tsk_reliable() describing the meaning of "reliable" would be a good idea. -- Josh
Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
On Fri, Apr 09, 2021 at 01:09:09PM +0100, Mark Rutland wrote: > On Mon, Apr 05, 2021 at 03:43:09PM -0500, madve...@linux.microsoft.com wrote: > > From: "Madhavan T. Venkataraman" > > > > There are a number of places in kernel code where the stack trace is not > > reliable. Enhance the unwinder to check for those cases and mark the > > stack trace as unreliable. Once all of the checks are in place, the unwinder > > can provide a reliable stack trace. But before this can be used for > > livepatch, > > some other entity needs to guarantee that the frame pointers are all set up > > correctly in kernel functions. objtool is currently being worked on to > > fill that gap. > > > > Except for the return address check, all the other checks involve checking > > the return PC of every frame against certain kernel functions. To do this, > > implement some infrastructure code: > > > > - Define a special_functions[] array and populate the array with > > the special functions > > I'm not too keen on having to manually collate this within the unwinder, > as it's very painful from a maintenance perspective. Agreed. > I'd much rather we could associate this information with the > implementations of these functions, so that they're more likely to > stay in sync. > > Further, I believe all the special cases are assembly functions, and > most of those are already in special sections to begin with. I reckon > it'd be simpler and more robust to reject unwinding based on the > section. If we need to unwind across specific functions in those > sections, we could opt-in with some metadata. So e.g. we could reject > all functions in ".entry.text", special casing the EL0 entry functions > if necessary. Couldn't this also end up being somewhat fragile? Saying "certain sections are deemed unreliable" isn't necessarily obvious to somebody who doesn't already know about it, and it could be overlooked or forgotten over time. And there's no way to enforce it stays that way. 
FWIW, over the years we've had zero issues with encoding the frame pointer on x86. After you save pt_regs, you encode the frame pointer to point to it. Ideally in the same macro so it's hard to overlook. If you're concerned about debuggers getting confused by the encoding - which debuggers specifically? In my experience, if vmlinux has debuginfo, gdb and most other debuggers will use DWARF (which is already broken in asm code) and completely ignore frame pointers. > I think there's a lot more code that we cannot unwind, e.g. KVM > exception code, or almost anything marked with SYM_CODE_END(). Just a reminder that livepatch only unwinds blocked tasks (plus the 'current' task which calls into livepatch). So practically speaking, it doesn't matter whether the 'unreliable' detection has full coverage. The only exceptions which really matter are those which end up calling schedule(), e.g. preemption or page faults. Being able to consistently detect *all* possible unreliable paths would be nice in theory, but it's unnecessary and may not be worth the extra complexity. -- Josh
Re: [PATCH 0/5] Introduce support for PSF mitigation
On Thu, Apr 08, 2021 at 09:56:47AM -0500, Saripalli, RK wrote: > Josh, thank you for taking the time to review the patches. > > On 4/7/2021 5:39 PM, Josh Poimboeuf wrote: > > On Tue, Apr 06, 2021 at 10:49:59AM -0500, Ramakrishna Saripalli wrote: > >> Because PSF speculation is limited to the current program context, > >> the impact of bad PSF speculation is very similar to that of > >> Speculative Store Bypass (Spectre v4) > >> > >> Predictive Store Forwarding controls: > >> There are two hardware control bits which influence the PSF feature: > >> - MSR 48h bit 2 – Speculative Store Bypass (SSBD) > >> - MSR 48h bit 7 – Predictive Store Forwarding Disable (PSFD) > >> > >> The PSF feature is disabled if either of these bits are set. These bits > >> are controllable on a per-thread basis in an SMT system. By default, both > >> SSBD and PSFD are 0 meaning that the speculation features are enabled. > >> > >> While the SSBD bit disables PSF and speculative store bypass, PSFD only > >> disables PSF. > >> > >> PSFD may be desirable for software which is concerned with the > >> speculative behavior of PSF but desires a smaller performance impact than > >> setting SSBD. > > > > Hi Ramakrishna, > > > > Is there a realistic scenario where an application would want to disable > > PSF, but not disable SSB? > > It is possible most applications have been reviewed and scrubbed for > SSB-type attacks but PSF-type issues may not have been looked at yet. It's "possible", but is it realistic? As far as I know, SSB is impractical to scrub an application for. Do we know of any real-world cases where this option is needed? -- Josh
Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
On Thu, Apr 08, 2021 at 08:18:21AM +0200, Greg KH wrote: > On Wed, Apr 07, 2021 at 03:17:46PM -0500, Josh Poimboeuf wrote: > > On Fri, Apr 02, 2021 at 09:54:12AM +0200, Greg KH wrote: > > > On Thu, Apr 01, 2021 at 11:59:25PM +, Luis Chamberlain wrote: > > > > As for the syfs deadlock possible with drivers, this fixes it in a > > > > generic way: > > > > > > > > commit fac43d8025727a74f80a183cc5eb74ed902a5d14 > > > > Author: Luis Chamberlain > > > > Date: Sat Mar 27 14:58:15 2021 + > > > > > > > > sysfs: add optional module_owner to attribute > > > > > > > > This is needed as otherwise the owner of the attribute > > > > or group read/store might have a shared lock used on driver removal, > > > > and deadlock if we race with driver removal. > > > > > > > > Signed-off-by: Luis Chamberlain > > > > > > No, please no. Module removal is a "best effort", if the system dies > > > when it happens, that's on you. I am not willing to expend extra energy > > > and maintance of core things like sysfs for stuff like this that does > > > not matter in any system other than a developer's box. > > > > So I mentioned this on IRC, and some folks were surprised to hear that > > module unloading is unsupported and is just a development aid. > > > > Is this stance documented anywhere? > > > > If we really believe this to be true, we should make rmmod taint the > > kernel. > > My throw-away comment here seems to have gotten way more attention than > it deserved, sorry about that everyone. > > Nothing is supported for anything here, it's all "best effort" :) > > And I would love a taint for rmmod, but what is that going to help out > with? I'm having trouble following this conclusion. Sure, we give our best effort, but if "nothing is supported" then what are we even doing here? Either we aim to support module unloading, or we don't. If we do support it, "best effort" isn't a valid reason for nacking fixes. 
If we don't support it, but still want to keep it half-working for development purposes, tainting on rmmod will make it crystal clear that the system has been destabilized. -- Josh
Re: [PATCH 0/5] Introduce support for PSF mitigation
On Tue, Apr 06, 2021 at 10:49:59AM -0500, Ramakrishna Saripalli wrote: > Because PSF speculation is limited to the current program context, > the impact of bad PSF speculation is very similar to that of > Speculative Store Bypass (Spectre v4) > > Predictive Store Forwarding controls: > There are two hardware control bits which influence the PSF feature: > - MSR 48h bit 2 – Speculative Store Bypass (SSBD) > - MSR 48h bit 7 – Predictive Store Forwarding Disable (PSFD) > > The PSF feature is disabled if either of these bits are set. These bits > are controllable on a per-thread basis in an SMT system. By default, both > SSBD and PSFD are 0 meaning that the speculation features are enabled. > > While the SSBD bit disables PSF and speculative store bypass, PSFD only > disables PSF. > > PSFD may be desirable for software which is concerned with the > speculative behavior of PSF but desires a smaller performance impact than > setting SSBD. Hi Ramakrishna, Is there a realistic scenario where an application would want to disable PSF, but not disable SSB? Maybe I'm missing something, but I'd presume an application would either care about this class of attacks, or not. -- Josh
Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
On Fri, Apr 02, 2021 at 09:54:12AM +0200, Greg KH wrote: > On Thu, Apr 01, 2021 at 11:59:25PM +, Luis Chamberlain wrote: > > As for the syfs deadlock possible with drivers, this fixes it in a generic > > way: > > > > commit fac43d8025727a74f80a183cc5eb74ed902a5d14 > > Author: Luis Chamberlain > > Date: Sat Mar 27 14:58:15 2021 + > > > > sysfs: add optional module_owner to attribute > > > > This is needed as otherwise the owner of the attribute > > or group read/store might have a shared lock used on driver removal, > > and deadlock if we race with driver removal. > > > > Signed-off-by: Luis Chamberlain > > No, please no. Module removal is a "best effort", if the system dies > when it happens, that's on you. I am not willing to expend extra energy > and maintance of core things like sysfs for stuff like this that does > not matter in any system other than a developer's box. So I mentioned this on IRC, and some folks were surprised to hear that module unloading is unsupported and is just a development aid. Is this stance documented anywhere? If we really believe this to be true, we should make rmmod taint the kernel. -- Josh
Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
On Wed, Apr 07, 2021 at 04:09:44PM +0200, Peter Zijlstra wrote: > On Tue, Apr 06, 2021 at 10:54:23AM -0500, Josh Poimboeuf wrote: > > > Same for Red Hat. Unloading livepatch modules seems to work fine, but > > isn't officially supported. > > > > That said, if rmmod is just considered a development aid, and we're > > going to be ignoring bugs, we should make it official with a new > > TAINT_RMMOD. > > Another option would be to have live-patch modules leak a module > reference by default, except when some debug sysctl is set or something. > Then only those LP modules loaded while the sysctl is set to 'YOLO' can > be unloaded. The issue is broader than just live patching. My suggestion was that if we aren't going to fix bugs in kernel module unloading, then unloading modules shouldn't be supported, and should taint the kernel. -- Josh
Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
On Tue, Apr 06, 2021 at 02:00:19PM +0200, Miroslav Benes wrote: > Hi, > > > > Driver developers will simply have to open code these protections. In > > > light of what I see on LTP / fuzzing, I suspect the use case will grow > > > and we'll have to revisit this in the future. But for now, sure, we can > > > just open code the required protections everywhere to not crash on module > > > removal. > > > > LTP and fuzzing too do not remove modules. So I do not understand the > > root problem here, that's just something that does not happen on a real > > system. > > If I am not mistaken, the issue that Luis tries to solve here was indeed > found by running LTP. > > > On Sat, Apr 03, 2021 at 08:13:23AM +0200, Greg KH wrote: > > > On Fri, Apr 02, 2021 at 06:30:16PM +, Luis Chamberlain wrote: > > > > On Fri, Apr 02, 2021 at 09:54:12AM +0200, Greg KH wrote: > > > > > No, please no. Module removal is a "best effort", > > > > > > > > Not for live patching. I am not sure if I am missing any other valid > > > > use case? > > > > > > live patching removes modules? We have so many code paths that are > > > "best effort" when it comes to module unloading, trying to resolve this > > > one is a valiant try, but not realistic. > > > > Miroslav, your input / help here would be valuable. I did the > > generalization work because you said it would be worthy for you too... > > Yes, we have the option to revert and remove the existing live patch from > the system. I am not sure how (if) it is used in practice. > > At least at SUSE we do not support the option. But we are only one of the > many downstream users. So yes, there is the option. Same for Red Hat. Unloading livepatch modules seems to work fine, but isn't officially supported. That said, if rmmod is just considered a development aid, and we're going to be ignoring bugs, we should make it official with a new TAINT_RMMOD. -- Josh
Re: [RFC PATCH v1 0/4] arm64: Implement stack trace reliability checks
On Tue, Mar 30, 2021 at 02:09:51PM -0500, madve...@linux.microsoft.com wrote: > From: "Madhavan T. Venkataraman" > > There are a number of places in kernel code where the stack trace is not > reliable. Enhance the unwinder to check for those cases and mark the > stack trace as unreliable. Once all of the checks are in place, the unwinder > can be used for livepatching. This assumes all such places are known. That's a big assumption, as a) hand-written asm code may inadvertently skip frame pointer setup; b) for inline asm which calls a function, the compiler may blindly insert it into a function before the frame pointer setup. That's where objtool stack validation would come in. > Detect EL1 exception frame > == > > EL1 exceptions can happen on any instruction including instructions in > the frame pointer prolog or epilog. Depending on where exactly they happen, > they could render the stack trace unreliable. > > Add all of the EL1 exception handlers to special_functions[]. > > - el1_sync() > - el1_irq() > - el1_error() > - el1_sync_invalid() > - el1_irq_invalid() > - el1_fiq_invalid() > - el1_error_invalid() A possibly more robust alternative would be to somehow mark el1 exception frames so the unwinder can detect them more generally. For example, as described in my previous email, encode the frame pointer so the unwinder can detect el1 frames automatically. > Detect ftrace frame > === > > When FTRACE executes at the beginning of a traced function, it creates two > frames and calls the tracer function: > > - One frame for the traced function > > - One frame for the caller of the traced function > > That gives a sensible stack trace while executing in the tracer function. > When FTRACE returns to the traced function, the frames are popped and > everything is back to normal. > > However, in cases like live patch, the tracer function redirects execution > to a different function. When FTRACE returns, control will go to that target > function. 
A stack trace taken in the tracer function will not show the target > function. The target function is the real function that we want to track. > So, the stack trace is unreliable. I don't think this is a real problem. Livepatch only checks the stacks of blocked tasks (and the task calling into livepatch). So the reliability of unwinding from the livepatch tracer function itself (klp_ftrace_handler) isn't a concern since it doesn't sleep. > To detect FTRACE in a stack trace, add the following to special_functions[]: > > - ftrace_graph_call() > - ftrace_graph_caller() > > Please see the diff for a comment that explains why ftrace_graph_call() > must be checked. > > Also, the Function Graph Tracer modifies the return address of a traced > function to a return trampoline (return_to_handler()) to gather tracing > data on function return. Stack traces taken from the traced function and > functions it calls will not show the original caller of the traced function. > The unwinder handles this case by getting the original caller from FTRACE. > > However, stack traces taken from the trampoline itself and functions it calls > are unreliable as the original return address may not be available in > that context. This is because the trampoline calls FTRACE to gather trace > data as well as to obtain the actual return address and FTRACE discards the > record of the original return address along the way. Again, this shouldn't be a concern because livepatch won't be unwinding from a function_graph trampoline unless it got preempted somehow (and then the el1 frame would get detected anyway). > Add return_to_handler() to special_functions[]. > > Check for kretprobe > === > > For functions with a kretprobe set up, probe code executes on entry > to the function and replaces the return address in the stack frame with a > kretprobe trampoline. Whenever the function returns, control is > transferred to the trampoline. The trampoline eventually returns to the > original return address. 
> > A stack trace taken while executing in the function (or in functions that > get called from the function) will not show the original return address. > Similarly, a stack trace taken while executing in the trampoline itself > (and functions that get called from the trampoline) will not show the > original return address. This means that the caller of the probed function > will not show. This makes the stack trace unreliable. > > Add the kretprobe trampoline to special_functions[]. > > FYI, each task contains a task->kretprobe_instances list that can > theoretically be consulted to find the orginal return address. But I am > not entirely sure how to safely traverse that list for stack traces > not on the current process. So, I have taken the easy way out. For kretprobes, unwinding from the trampoline or kretprobe handler shouldn't be a reliability
Re: [RFC PATCH v2 1/1] arm64: Implement stack trace termination record
On Thu, Apr 01, 2021 at 10:24:04PM -0500, madve...@linux.microsoft.com wrote: > From: "Madhavan T. Venkataraman" > @@ -447,9 +464,9 @@ SYM_FUNC_START_LOCAL(__primary_switched) > #endif > bl switch_to_vhe // Prefer VHE if possible > add sp, sp, #16 > - mov x29, #0 > - mov x30, #0 > - b start_kernel > + setup_final_frame > + bl start_kernel > + nop > SYM_FUNC_END(__primary_switched) > > .pushsection ".rodata", "a" > @@ -606,14 +623,14 @@ SYM_FUNC_START_LOCAL(__secondary_switched) > cbz x2, __secondary_too_slow > msr sp_el0, x2 > scs_load x2, x3 > - mov x29, #0 > - mov x30, #0 > + setup_final_frame > > #ifdef CONFIG_ARM64_PTR_AUTH > ptrauth_keys_init_cpu x2, x3, x4, x5 > #endif > > - b secondary_start_kernel > + bl secondary_start_kernel > + nop > SYM_FUNC_END(__secondary_switched) I'm somewhat arm-ignorant, so take the following comments with a grain of salt. I don't think changing these to 'bl' is necessary, unless you wanted __primary_switched() and __secondary_switched() to show up in the stacktrace for some reason? If so, that seems like a separate patch. Also, why are nops added after the calls? My guess would be because, since these are basically tail calls to "noreturn" functions, the stack dump code would otherwise show the wrong function, i.e. whatever function happens to be after the 'bl'. We had the same issue for x86. It can be fixed by using '%pB' instead of '%pS' when printing the address in dump_backtrace_entry(). See sprint_backtrace() for more details. BTW I think the same issue exists for GCC-generated code. The following shows several such cases: objdump -dr vmlinux |awk '/bl / {bl=1;l=$0;next} bl == 1 && /^$/ {print l; print} // {bl=0}' However, looking at how arm64 unwinds through exceptions in kernel space, using '%pB' might have side effects when the exception LR (elr_el1) points to the beginning of a function. Then '%pB' would show the end of the previous function, instead of the function which was interrupted. 
So you may need to rethink how to unwind through in-kernel exceptions. Basically, when printing a stack return address, you want to use '%pB' for a call return address and '%pS' for an interrupted address. On x86, with the frame pointer unwinder, we encode the frame pointer by setting a bit in %rbp which tells the unwinder that it's a special pt_regs frame. Then instead of treating it like a normal call frame, the stack dump code prints the registers, and the return address (regs->ip) gets printed with '%pS'. > SYM_FUNC_START_LOCAL(__secondary_too_slow) > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c > index 325c83b1a24d..906baa232a89 100644 > --- a/arch/arm64/kernel/process.c > +++ b/arch/arm64/kernel/process.c > @@ -437,6 +437,11 @@ int copy_thread(unsigned long clone_flags, unsigned long > stack_start, > } > p->thread.cpu_context.pc = (unsigned long)ret_from_fork; > p->thread.cpu_context.sp = (unsigned long)childregs; > + /* > + * For the benefit of the unwinder, set up childregs->stackframe > + * as the final frame for the new task. > + */ > + p->thread.cpu_context.fp = (unsigned long)childregs->stackframe; > > ptrace_hw_copy_thread(p); > > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c > index ad20981dfda4..72f5af8c69dc 100644 > --- a/arch/arm64/kernel/stacktrace.c > +++ b/arch/arm64/kernel/stacktrace.c > @@ -44,16 +44,16 @@ int notrace unwind_frame(struct task_struct *tsk, struct > stackframe *frame) > unsigned long fp = frame->fp; > struct stack_info info; > > - /* Terminal record; nothing to unwind */ > - if (!fp) > + if (!tsk) > + tsk = current; > + > + /* Final frame; nothing to unwind */ > + if (fp == (unsigned long) task_pt_regs(tsk)->stackframe) > return -ENOENT; As far as I can tell, the regs stackframe value is initialized to zero during syscall entry, so isn't this basically just 'if (fp == 0)'? 
Shouldn't it instead be comparing with the _address_ of the stackframe field to make sure it reached the end? -- Josh
Re: [RFC PATCH -tip 3/3] x86/kprobes,orc: Unwind optprobe trampoline correctly
On Wed, Mar 31, 2021 at 02:44:56PM +0900, Masami Hiramatsu wrote: > +#ifdef CONFIG_UNWINDER_ORC > +unsigned long recover_optprobe_trampoline(unsigned long addr, unsigned long > *sp) > +{ > + unsigned long offset, entry, probe_addr; > + struct optimized_kprobe *op; > + struct orc_entry *orc; > + > + entry = find_kprobe_optinsn_slot_entry(addr); > + if (!entry) > + return addr; > + > + offset = addr - entry; > + > + /* Decode arg1 and get the optprobe */ > + op = (void *)extract_set_arg1((void *)(entry + TMPL_MOVE_IDX)); > + if (!op) > + return addr; > + > + probe_addr = (unsigned long)op->kp.addr; > + > + if (offset < TMPL_END_IDX) { > + orc = orc_find((unsigned long)optprobe_template_func + offset); > + if (!orc || orc->sp_reg != ORC_REG_SP) > + return addr; > + /* > + * Since optprobe trampoline doesn't push caller on the stack, > + * need to decrement 1 stack entry size > + */ > + *sp += orc->sp_offset - sizeof(long); > + return probe_addr; > + } else { > + return probe_addr + offset - TMPL_END_IDX; > + } > +} > +#endif Hm, I'd like to avoid intertwining kprobes and ORC like this. ORC unwinds other generated code by assuming the generated code uses a frame pointer. Could we do that here? With CONFIG_FRAME_POINTER, unwinding works because SAVE_REGS_STRING has ENCODE_FRAME_POINTER, but that's not going to work for ORC. Instead of these patches, can we 'push %rbp; mov %rsp, %rbp' at the beginning of the template and 'pop %rbp' at the end? I guess SAVE_REGS_STRING would need to be smart enough to push the original saved version of %rbp. Of course then that breaks the kretprobe_trampoline() usage, so it may need to be a separate macro. [ Or make the same change to kretprobe_trampoline(). Then the other patch set wouldn't be needed either ;-) ] Of course the downside is, when you get an interrupt during the frame pointer setup, unwinding is broken. But I think that's acceptable for generated code. 
We've lived with that limitation for all code, with CONFIG_FRAME_POINTER, for many years. Eventually we may want to have a way to register generated code (and the ORC for it). -- Josh
Re: [PATCH -tip v4 10/12] x86/kprobes: Push a fake return address at kretprobe_trampoline
On Fri, Mar 26, 2021 at 10:20:09AM -0400, Steven Rostedt wrote: > On Fri, 26 Mar 2021 21:03:49 +0900 > Masami Hiramatsu wrote: > > > I confirmed this is not related to this series, but occurs when I build > > kernels with different > > configs without cleanup. > > > > Once I build kernel with CONFIG_UNWIND_GUESS=y (for testing), and after > > that, > > I build kernel again with CONFIG_UNWIND_ORC=y (but without make clean), this > > happened. In this case, I guess ORC data might be corrupted? > > When I cleanup and rebuild, the stacktrace seems correct. > > Hmm, that should be fixed. Seems like we are missing a dependency somewhere. Thomas reported something similar: for example arch/x86/kernel/head_64.o doesn't get rebuilt when changing unwinders. https://lkml.kernel.org/r/87tuqublrb@nanos.tec.linutronix.de Masahiro, any idea how we can force a full tree rebuild when changing the unwinder? -- Josh
Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
On Fri, Mar 26, 2021 at 04:12:15PM +0100, Peter Zijlstra wrote: > @@ -61,3 +89,15 @@ SYM_FUNC_END(__x86_indirect_thunk_\reg) > #define GEN(reg) EXPORT_THUNK(reg) > #include <asm/GEN-for-each-reg.h> > > +#undef GEN > +#define GEN(reg) ALT_THUNK reg > +#include <asm/GEN-for-each-reg.h> > + > +#undef GEN > +#define GEN(reg) __EXPORT_THUNK(__x86_indirect_alt_call_ ## reg) > +#include <asm/GEN-for-each-reg.h> > + > +#undef GEN > +#define GEN(reg) __EXPORT_THUNK(__x86_indirect_alt_jmp_ ## reg) > +#include <asm/GEN-for-each-reg.h> > + Git complains about this last newline. Otherwise everything looks pretty good to me. Let me run it through the test matrix. -- Josh
Re: [PATCH -tip v4 10/12] x86/kprobes: Push a fake return address at kretprobe_trampoline
On Wed, Mar 24, 2021 at 10:40:58AM +0900, Masami Hiramatsu wrote: > On Tue, 23 Mar 2021 23:30:07 +0100 > Peter Zijlstra wrote: > > > On Mon, Mar 22, 2021 at 03:41:40PM +0900, Masami Hiramatsu wrote: > > > ".global kretprobe_trampoline\n" > > > ".type kretprobe_trampoline, @function\n" > > > "kretprobe_trampoline:\n" > > > #ifdef CONFIG_X86_64 > > > > So what happens if we get an NMI here? That is, after the RET but before > > the push? Then our IP points into the trampoline but we've not done that > > push yet. > > Not only NMI, but also interrupts can happen. There is no cli/sti here. > > Anyway, thanks for pointing! > I think in UNWIND_HINT_TYPE_REGS and UNWIND_HINT_TYPE_REGS_PARTIAL cases > ORC unwinder also has to check the state->ip and if it is > kretprobe_trampoline, > it should be recovered. > What about this? I think the REGS and REGS_PARTIAL cases can also be affected by function graph tracing. So should they use the generic unwind_recover_ret_addr() instead of unwind_recover_kretprobe()? -- Josh
Re: [PATCH -tip v3 05/11] x86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline code
On Sat, Mar 20, 2021 at 10:05:43PM +0900, Masami Hiramatsu wrote: > On Sat, 20 Mar 2021 21:16:16 +0900 > Masami Hiramatsu wrote: > > > On Fri, 19 Mar 2021 21:22:39 +0900 > > Masami Hiramatsu wrote: > > > > > From: Josh Poimboeuf > > > > > > Add UNWIND_HINT_FUNC on kretporbe_trampoline code so that ORC > > > information is generated on the kretprobe_trampoline correctly. > > > > > > > Test bot also found a new warning for this. > > > > > >> arch/x86/kernel/kprobes/core.o: warning: objtool: > > > >> kretprobe_trampoline()+0x25: call without frame pointer save/setup > > > > With CONFIG_FRAME_POINTER=y. > > > > Of course this can be fixed with additional "push %bp; mov %sp, %bp" before > > calling > > trampoline_handler. But actually we know that this function has a bit > > special > > stack frame too. > > > > Can I recover STACK_FRAME_NON_STANDARD(kretprobe_trampoline) when > > CONFIG_FRAME_POINTER=y ? > > So something like this. Does it work? > > diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c > index b31058a152b6..651f337dc880 100644 > --- a/arch/x86/kernel/kprobes/core.c > +++ b/arch/x86/kernel/kprobes/core.c > @@ -760,6 +760,10 @@ int kprobe_int3_handler(struct pt_regs *regs) > } > NOKPROBE_SYMBOL(kprobe_int3_handler); > > +#ifdef CONFIG_FRAME_POINTER > +#undef UNWIND_HINT_FUNC > +#define UNWIND_HINT_FUNC > +#endif This hunk isn't necessary. The unwind hints don't actually have an effect with frame pointers. > /* > * When a retprobed function returns, this code saves registers and > * calls trampoline_handler() runs, which calls the kretprobe's handler. > @@ -797,7 +801,14 @@ asm( > ".size kretprobe_trampoline, .-kretprobe_trampoline\n" > ); > NOKPROBE_SYMBOL(kretprobe_trampoline); > - > +#ifdef CONFIG_FRAME_POINTER > +/* > + * kretprobe_trampoline skips updating frame pointer. The frame pointer > + * saved in trampoline_handler points to the real caller function's > + * frame pointer. 
> + */ > +STACK_FRAME_NON_STANDARD(kretprobe_trampoline); > +#endif > > /* > * Called from kretprobe_trampoline Ack. -- Josh
Re: [PATCH -tip v3 05/11] x86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline code
On Sat, Mar 20, 2021 at 09:16:16PM +0900, Masami Hiramatsu wrote: > On Fri, 19 Mar 2021 21:22:39 +0900 > Masami Hiramatsu wrote: > > > From: Josh Poimboeuf > > > > Add UNWIND_HINT_FUNC on kretporbe_trampoline code so that ORC > > information is generated on the kretprobe_trampoline correctly. > > > > Test bot also found a new warning for this. > > > >> arch/x86/kernel/kprobes/core.o: warning: objtool: > > >> kretprobe_trampoline()+0x25: call without frame pointer save/setup > > With CONFIG_FRAME_POINTER=y. > > Of course this can be fixed with additional "push %bp; mov %sp, %bp" before > calling > trampoline_handler. But actually we know that this function has a bit special > stack frame too. > > Can I recover STACK_FRAME_NON_STANDARD(kretprobe_trampoline) when > CONFIG_FRAME_POINTER=y ? Yes, that's what I'd recommend. -- Josh
Re: [PATCH v2 14/14] objtool,x86: Rewrite retpoline thunk calls
On Fri, Mar 19, 2021 at 04:56:30PM +0100, Peter Zijlstra wrote: > On Fri, Mar 19, 2021 at 10:30:26AM -0500, Josh Poimboeuf wrote: > > On Fri, Mar 19, 2021 at 09:06:44AM +0100, Peter Zijlstra wrote: > > > > Also doesn't the alternative code already insert nops? > > > > > > Problem is that the {call,jmp} *%\reg thing is not fixed length. They're > > > 2 or 3 bytes depending on which register is picked. > > > > Why do they need to be fixed length? Objtool can use sym->len as the > > alternative replacement length. Then alternatives can add nops as > > needed. > > UNDEF has size 0. That is, unless these symbols exist in the translation > unit (they do not) we have no clue. > > Arguably I could parse the symbol name again and then we know the > register number and thus if we need REX etc.. but I figured we wanted to > avoid all that. Ah, makes sense now. -- Josh
Re: [PATCH v2 08/14] objtool: Add elf_create_reloc() helper
On Fri, Mar 19, 2021 at 04:24:40PM +0100, Peter Zijlstra wrote: > On Fri, Mar 19, 2021 at 10:12:59AM -0500, Josh Poimboeuf wrote: > > > > -void elf_add_reloc(struct elf *elf, struct reloc *reloc) > > > +int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long > > > offset, > > > + unsigned int type, struct symbol *sym, int addend) > > > { > > > - struct section *sec = reloc->sec; > > > + struct reloc *reloc; > > > + > > > + reloc = malloc(sizeof(*reloc)); > > > + if (!reloc) { > > > + perror("malloc"); > > > + return -1; > > > + } > > > + memset(reloc, 0, sizeof(*reloc)); > > > + > > > + reloc->sec = sec->reloc; > > > > Maybe elf_add_reloc() could create the reloc section if it doesn't > > already exist, that would help remove the multiple calls to > > elf_create_reloc_section(). > > I'll do that as yet another patch ;-) Ok! > > > + reloc->offset = offset; > > > + reloc->type = type; > > > + reloc->sym = sym; > > > + reloc->addend = addend; > > > > list_add_tail(&reloc->list, &sec->reloc_list); > > This should be sec->reloc->reloc_list? > Absolutely, there were a few other 'funnies' as well that I just fixed > whilst trying to make the whole new stack work again :-) I'm sure some of those were my fault, thanks for testing my crap :-) -- Josh
Re: [PATCH v2 14/14] objtool,x86: Rewrite retpoline thunk calls
On Fri, Mar 19, 2021 at 09:06:44AM +0100, Peter Zijlstra wrote: > > Also doesn't the alternative code already insert nops? > > Problem is that the {call,jmp} *%\reg thing is not fixed length. They're > 2 or 3 bytes depending on which register is picked. Why do they need to be fixed length? Objtool can use sym->len as the alternative replacement length. Then alternatives can add nops as needed. -- Josh
Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic
On Fri, Mar 19, 2021 at 10:22:54AM +0100, Peter Zijlstra wrote: > @@ -814,6 +814,11 @@ struct section *elf_create_reloc_section > } > } > > +static inline bool is_reloc_section(struct section *reloc) > +{ > + return reloc->base && reloc->base->reloc == reloc; > +} I believe the 2nd condition will always be true, so it can just be 'return reloc->base'. -- Josh
Re: [PATCH v2 08/14] objtool: Add elf_create_reloc() helper
On Fri, Mar 19, 2021 at 10:47:36AM +0100, Peter Zijlstra wrote: > Full patch, because diff on a diff on a diff is getting ludicrous hard > to read :-) Appreciated :-) > -void elf_add_reloc(struct elf *elf, struct reloc *reloc) > +int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset, > + unsigned int type, struct symbol *sym, int addend) > { > - struct section *sec = reloc->sec; > + struct reloc *reloc; > + > + reloc = malloc(sizeof(*reloc)); > + if (!reloc) { > + perror("malloc"); > + return -1; > + } > + memset(reloc, 0, sizeof(*reloc)); > + > + reloc->sec = sec->reloc; Maybe elf_add_reloc() could create the reloc section if it doesn't already exist, that would help remove the multiple calls to elf_create_reloc_section(). > + reloc->offset = offset; > + reloc->type = type; > + reloc->sym = sym; > + reloc->addend = addend; > > list_add_tail(&reloc->list, &sec->reloc_list); This should be sec->reloc->reloc_list? -- Josh
Re: [PATCH v2 10/14] objtool: Extract elf_symbol_add()
On Fri, Mar 19, 2021 at 10:54:05AM +0100, Peter Zijlstra wrote: > On Thu, Mar 18, 2021 at 09:14:03PM -0500, Josh Poimboeuf wrote: > > On Thu, Mar 18, 2021 at 06:11:13PM +0100, Peter Zijlstra wrote: > > > Create a common helper to add symbols. > > > > > > Signed-off-by: Peter Zijlstra (Intel) > > > --- > > > tools/objtool/elf.c | 57 > > > ++-- > > > 1 file changed, 33 insertions(+), 24 deletions(-) > > > > > > --- a/tools/objtool/elf.c > > > +++ b/tools/objtool/elf.c > > > @@ -290,12 +290,41 @@ static int read_sections(struct elf *elf > > > return 0; > > > } > > > > > > +static bool elf_symbol_add(struct elf *elf, struct symbol *sym) > > > > How about "elf_add_symbol()" for consistency with my other suggestions > > (elf_add_reloc() and elf_add_string()). And return an int. > > Changed the nane, how about void? This latest incarnation doesn't > actually have an error path. Still doesn't hurt to have one and not use > it I suppose... void would be my preference, just to avoid unnecessary error handling boilerplate in the caller. -- Josh
Re: [PATCH v2 14/14] objtool,x86: Rewrite retpoline thunk calls
On Thu, Mar 18, 2021 at 06:11:17PM +0100, Peter Zijlstra wrote: > When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an > indirect call, have objtool rewrite it to: > > ALTERNATIVE "call __x86_indirect_thunk_\reg", > "call *%reg", ALT_NOT(X86_FEATURE_RETPOLINE) > > Additionally, in order to not emit endless identical > .altinst_replacement chunks, use a global symbol for them, see > __x86_indirect_alt_*. > > This also avoids objtool from having to do code generation. > > Signed-off-by: Peter Zijlstra (Intel) This is better than I expected. Nice workaround for not generating code. > +.macro ALT_THUNK reg > + > + .align 1 > + > +SYM_FUNC_START_NOALIGN(__x86_indirect_alt_call_\reg) > + ANNOTATE_RETPOLINE_SAFE > +1: call*%\reg > +2: .skip 5-(2b-1b), 0x90 > +SYM_FUNC_END(__x86_indirect_alt_call_\reg) > + > +SYM_FUNC_START_NOALIGN(__x86_indirect_alt_jmp_\reg) > + ANNOTATE_RETPOLINE_SAFE > +1: jmp *%\reg > +2: .skip 5-(2b-1b), 0x90 > +SYM_FUNC_END(__x86_indirect_alt_jmp_\reg) This mysterious code needs a comment. Shouldn't it be in .altinstr_replacement or something? Also doesn't the alternative code already insert nops? > +int arch_rewrite_retpoline(struct objtool_file *file, > +struct instruction *insn, > +struct reloc *reloc) > +{ > + struct symbol *sym; > + char name[32] = ""; > + > + if (!strcmp(insn->sec->name, ".text.__x86.indirect_thunk")) > + return 0; > + > + sprintf(name, "__x86_indirect_alt_%s_%s", > + insn->type == INSN_JUMP_DYNAMIC ? "jmp" : "call", > + reloc->sym->name + 21); > + > + sym = find_symbol_by_name(file->elf, name); > + if (!sym) { > + sym = elf_create_undef_symbol(file->elf, name); > + if (!sym) { > + WARN("elf_create_undef_symbol"); > + return -1; > + } > + } > + > + elf_add_alternative(file->elf, insn, sym, > + ALT_NOT(X86_FEATURE_RETPOLINE), 5, 5); > + > + return 0; > +} Need to propagate the error. -- Josh
Re: [PATCH v2 12/14] objtool: Allow archs to rewrite retpolines
On Thu, Mar 18, 2021 at 06:11:15PM +0100, Peter Zijlstra wrote: > @@ -1212,6 +1225,8 @@ static int handle_group_alt(struct objto > dest_off = arch_jump_destination(insn); > if (dest_off == special_alt->new_off + special_alt->new_len) > insn->jump_dest = next_insn_same_sec(file, > last_orig_insn); > + else > + insn->jump_dest = find_insn(file, insn->sec, dest_off); > > if (!insn->jump_dest) { > WARN_FUNC("can't find alternative jump destination", So I assume this change is because of the ordering change: now this is done before add_jump_destinations(). But doesn't that mean the alternative jump modification (changing the dest to the end of the original insns) will get overwritten later? Also the new hunk seems to be an oversimplified version of add_jump_destinations(). I'm not quite convinced that it will always do the right thing for this case. > @@ -1797,11 +1812,15 @@ static int decode_sections(struct objtoo > if (ret) > return ret; > > - ret = add_jump_destinations(file); > + /* > + * Must be before add_{jump,call}_destination; for they can add > + * magic alternatives. > + */ > + ret = add_special_section_alts(file); This reordering is unfortunate. Maybe there's a better way, though I don't have any ideas, at least until I get to the most controversial patch. -- Josh
Re: [PATCH v2 05/14] objtool: Per arch retpoline naming
On Thu, Mar 18, 2021 at 06:11:08PM +0100, Peter Zijlstra wrote: > @@ -872,7 +877,7 @@ static int add_jump_destinations(struct > } else if (reloc->sym->type == STT_SECTION) { > dest_sec = reloc->sym->sec; > dest_off = arch_dest_reloc_offset(reloc->addend); > - } else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", > 21)) { > + } else if (arch_is_retpoline(reloc->sym)) { > /* >* Retpoline jumps are really dynamic jumps in >* disguise, so convert them accordingly. There's another one in add_call_destinations(). -- Josh
Re: [PATCH v2 11/14] objtool: Add elf_create_undef_symbol()
On Thu, Mar 18, 2021 at 06:11:14PM +0100, Peter Zijlstra wrote: > Allow objtool to create undefined symbols; this allows creating > relocations to symbols not currently in the symbol table. > > Signed-off-by: Peter Zijlstra (Intel) > --- > tools/objtool/elf.c | 63 > > tools/objtool/include/objtool/elf.h |1 > 2 files changed, 64 insertions(+) > > --- a/tools/objtool/elf.c > +++ b/tools/objtool/elf.c > @@ -724,6 +724,69 @@ static int elf_strtab_concat(struct elf > return len; > } > > +struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name) > +{ > + struct section *symtab; > + struct symbol *sym; > + Elf_Data *data; > + Elf_Scn *s; > + > + sym = malloc(sizeof(*sym)); > + if (!sym) { > + perror("malloc"); > + return NULL; > + } > + memset(sym, 0, sizeof(*sym)); > + > + sym->name = strdup(name); > + > + sym->sym.st_name = elf_strtab_concat(elf, sym->name, NULL); > + if (sym->sym.st_name == -1) > + return NULL; > + > + sym->sym.st_info = 0x10; /* STB_GLOBAL, STT_NOTYPE */ There's a generic macro for this: sym->sym.st_info = GELF_ST_INFO(STB_GLOBAL, STT_NOTYPE); And sym->bind and sym->type should probably get set. -- Josh
Re: [PATCH v2 10/14] objtool: Extract elf_symbol_add()
On Thu, Mar 18, 2021 at 06:11:13PM +0100, Peter Zijlstra wrote: > Create a common helper to add symbols. > > Signed-off-by: Peter Zijlstra (Intel) > --- > tools/objtool/elf.c | 57 > ++-- > 1 file changed, 33 insertions(+), 24 deletions(-) > > --- a/tools/objtool/elf.c > +++ b/tools/objtool/elf.c > @@ -290,12 +290,41 @@ static int read_sections(struct elf *elf > return 0; > } > > +static bool elf_symbol_add(struct elf *elf, struct symbol *sym) How about "elf_add_symbol()" for consistency with my other suggestions (elf_add_reloc() and elf_add_string()). And return an int. -- Josh
Re: [PATCH v2 09/14] objtool: Extract elf_strtab_concat()
On Thu, Mar 18, 2021 at 06:11:12PM +0100, Peter Zijlstra wrote: > Create a common helper to append strings to a strtab. > > Signed-off-by: Peter Zijlstra (Intel) > --- > tools/objtool/elf.c | 73 > +--- > 1 file changed, 42 insertions(+), 31 deletions(-) > > --- a/tools/objtool/elf.c > +++ b/tools/objtool/elf.c > @@ -676,13 +676,51 @@ struct elf *elf_open_read(const char *na > return NULL; > } > > +static int elf_strtab_concat(struct elf *elf, char *name, const char > *strtab_name) > +{ > + struct section *strtab = NULL; > + Elf_Data *data; > + Elf_Scn *s; > + int len; > + > + if (strtab_name) > + strtab = find_section_by_name(elf, strtab_name); > + if (!strtab) > + strtab = find_section_by_name(elf, ".strtab"); > + if (!strtab) { > + WARN("can't find %s or .strtab section", strtab_name); > + return -1; > + } This part's a bit mysterious (and it loses the Clang comment). Maybe we can leave the section finding logic in elf_create_section()? diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c index b85db6efb9d3..db9ad54894d8 100644 --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -676,19 +676,17 @@ struct elf *elf_open_read(const char *name, int flags) return NULL; } -static int elf_strtab_concat(struct elf *elf, char *name, const char *strtab_name) +static int elf_add_string(struct elf *elf, struct section *strtab, char *str) { struct section *strtab = NULL; Elf_Data *data; Elf_Scn *s; int len; - if (strtab_name) - strtab = find_section_by_name(elf, strtab_name); if (!strtab) strtab = find_section_by_name(elf, ".strtab"); if (!strtab) { - WARN("can't find %s or .strtab section", strtab_name); + WARN("can't find .strtab section"); return -1; } @@ -718,7 +716,7 @@ static int elf_strtab_concat(struct elf *elf, char *name, const char *strtab_nam struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr) { - struct section *sec; + struct section *sec, *shstrtab; size_t size = entsize * nr; Elf_Scn *s; @@ 
-777,7 +775,15 @@ struct section *elf_create_section(struct elf *elf, const char *name, sec->sh.sh_addralign = 1; sec->sh.sh_flags = SHF_ALLOC | sh_flags; - sec->sh.sh_name = elf_strtab_concat(elf, sec->name, ".shstrtab"); + /* Add section name to .shstrtab (or .strtab for Clang) */ + shstrtab = find_section_by_name(elf, ".shstrtab"); + if (!shstrtab) + shstrtab = find_section_by_name(elf, ".strtab"); + if (!shstrtab) { + WARN("can't find .shstrtab or .strtab section"); + return NULL; + } + sec->sh.sh_name = elf_add_string(elf, sec->name, shstrtab); if (sec->sh.sh_name == -1) return NULL;
Re: [PATCH v2 08/14] objtool: Add elf_create_reloc() helper
On Thu, Mar 18, 2021 at 06:11:11PM +0100, Peter Zijlstra wrote: > We have 4 instances of adding a relocation. Create a common helper > to avoid growing even more. > > Signed-off-by: Peter Zijlstra (Intel) I'm not a fan of the API -- how about squashing this in? Untested, of course. diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 6a52d56504b1..38ffa3b2595e 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -469,12 +469,11 @@ static int create_static_call_sections(struct objtool_file *file) memset(site, 0, sizeof(struct static_call_site)); /* populate reloc for 'addr' */ - if (!elf_create_reloc(file->elf, sec, idx * sizeof(struct static_call_site), - R_X86_64_PC32, - reloc_to_insn, insn, 0)) { - WARN_ELF("elf_create_reloc: static_call_site::addr"); + if (elf_add_reloc_to_insn(file->elf, sec, + idx * sizeof(struct static_call_site), + R_X86_64_PC32, + insn->sec, insn->offset)) return -1; - } /* find key symbol */ key_name = strdup(insn->call_dest->name); @@ -511,13 +510,11 @@ static int create_static_call_sections(struct objtool_file *file) free(key_name); /* populate reloc for 'key' */ - if (!elf_create_reloc(file->elf, sec, idx * sizeof(struct static_call_site) + 4, - R_X86_64_PC32, - reloc_to_sym, key_sym, - is_sibling_call(insn) * STATIC_CALL_SITE_TAIL)) { - WARN_ELF("elf_create_reloc: static_call_site::key"); + if (elf_add_reloc(file->elf, sec, + idx * sizeof(struct static_call_site) + 4, + R_X86_64_PC32, key_sym, + is_sibling_call(insn) * STATIC_CALL_SITE_TAIL)) return -1; - } idx++; } @@ -559,12 +556,11 @@ static int create_mcount_loc_sections(struct objtool_file *file) loc = (unsigned long *)sec->data->d_buf + idx; memset(loc, 0, sizeof(unsigned long)); - if (!elf_create_reloc(file->elf, sec, idx * sizeof(unsigned long), - R_X86_64_64, - reloc_to_insn, insn, 0)) { - WARN_ELF("elf_create_reloc: __mcount_loc"); + if (elf_add_reloc_to_insn(file->elf, sec, + idx * sizeof(unsigned long), + R_X86_64_64, + insn->sec, insn->offset)) 
return -1; - } idx++; } diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c index d2afe2454de4..b3bd97b2b034 100644 --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -447,76 +447,73 @@ static int read_symbols(struct elf *elf) return -1; } -static void elf_add_reloc(struct elf *elf, struct reloc *reloc) +int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset, + unsigned int type, struct symbol *sym, int addend) { - struct section *sec = reloc->sec; + struct reloc *reloc; + + reloc = malloc(sizeof(*reloc)); + if (!reloc) { + perror("malloc"); + return -1; + } + memset(reloc, 0, sizeof(*reloc)); + + reloc->sec = sec->reloc; + reloc->offset = offset; + reloc->type = type; + reloc->sym = sym; + reloc->addend = addend; list_add_tail(&reloc->list, &sec->reloc_list); elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc)); - sec->changed = true; -} - -void reloc_to_sym(struct reloc *reloc, void *dst) -{ - struct symbol *sym = dst; - reloc->sym = sym; + return 0; } -void __reloc_to_insn(struct reloc *reloc, struct section *sec, unsigned long offset) +static int insn_to_sym_addend(struct section *sec, unsigned long offset, + struct symbol **sym, int *addend) { if (sec->sym) { - reloc->sym = sec->sym; - reloc->addend = offset; - return; + *sym = sec->sym; + *addend = offset; + return 0; } /* * The Clang assembler strips section symbols, so we have to reference * the function symbol instead: */ - reloc->sym = find_symbol_containing(sec, offset); - if (!reloc->sym) { /*
Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic
On Thu, Mar 18, 2021 at 12:38:42PM -0500, Josh Poimboeuf wrote: > On Thu, Mar 18, 2021 at 06:04:25PM +0100, Peter Zijlstra wrote: > > On Thu, Mar 18, 2021 at 11:36:40AM -0500, Josh Poimboeuf wrote: > > > > I was thinking you could get a section changed without touching > > > > relocations, but while that is theoretically possible, it is exceedingly > > > > unlikely (and objtool doesn't do that). > > > > > > Hm? This is a *relocation* section, not a normal one. So by > > > definition, it only changes when its relocations change. > > > > The way I read this code: > > > > list_for_each_entry(sec, &elf->sections, list) { > > if (sec->changed) { > > + if (sec->reloc && > > + elf_rebuild_reloc_section(elf, sec->reloc)) { > > + WARN_ELF("elf_rebuild_reloc_section"); > > + return -1; > > + } > > > > is that we iterate the regular sections (which could be dirtied because > > we changed some data), and if that section has a relocation section, we > > rebuild that for good measure (even though it might not have altered > > relocations). > > > > Or am I just totally confused ? > > Ah, you're right. I'm the one that's confused. I guess I was also > confused when I wrote that hunk, but it just happens to work anyway. > > It would be cleaner to do something like > > if ((is_reloc_sec(sec) && > elf_rebuild_reloc_section(elf, sec)) { > > so we process the changed reloc section directly, instead of relying on > the (most likely) fact that the corresponding text section also changed. 
i.e., in actual code: diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c index 66c49c2e20a6..3b3d19a5e626 100644 --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -948,8 +948,7 @@ int elf_write(struct elf *elf) /* Update changed relocation sections and section headers: */ list_for_each_entry(sec, &elf->sections, list) { if (sec->changed) { - if (sec->reloc && - elf_rebuild_reloc_section(elf, sec->reloc)) { + if (sec->base && elf_rebuild_reloc_section(elf, sec)) { WARN_ELF("elf_rebuild_reloc_section"); return -1; }
Re: [PATCH] kbuild: rebuild GCC plugins when the compiler is upgraded
On Wed, Mar 10, 2021 at 10:44:46PM +0900, Masahiro Yamada wrote: > > Masahiro, > > > > Ping. Do you have a better approach for building GCC plugins in the > > external module directory? > > > I am not sure if building GCC plugins in the external module directory > is the right approach. I'm certainly open to any better ideas that would allow GCC plugins to be enabled in distro kernel builds. -- Josh
Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic
On Thu, Mar 18, 2021 at 06:04:25PM +0100, Peter Zijlstra wrote: > On Thu, Mar 18, 2021 at 11:36:40AM -0500, Josh Poimboeuf wrote: > > > I was thinking you could get a section changed without touching > > > relocations, but while that is theoretically possible, it is exceedingly > > > unlikely (and objtool doesn't do that). > > > > Hm? This is a *relocation* section, not a normal one. So by > > definition, it only changes when its relocations change. > > The way I read this code: > > list_for_each_entry(sec, &elf->sections, list) { > if (sec->changed) { > + if (sec->reloc && > + elf_rebuild_reloc_section(elf, sec->reloc)) { > + WARN_ELF("elf_rebuild_reloc_section"); > + return -1; > + } > > is that we iterate the regular sections (which could be dirtied because > we changed some data), and if that section has a relocation section, we > rebuild that for good measure (even though it might not have altered > relocations). > > Or am I just totally confused ? Ah, you're right. I'm the one that's confused. I guess I was also confused when I wrote that hunk, but it just happens to work anyway. It would be cleaner to do something like if ((is_reloc_sec(sec) && elf_rebuild_reloc_section(elf, sec)) { so we process the changed reloc section directly, instead of relying on the (most likely) fact that the corresponding text section also changed. -- Josh
Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic
On Thu, Mar 18, 2021 at 01:57:03PM +0100, Peter Zijlstra wrote: > On Wed, Mar 17, 2021 at 07:49:17PM -0500, Josh Poimboeuf wrote: > > On Wed, Mar 17, 2021 at 09:12:15AM +0100, Peter Zijlstra wrote: > > > On Tue, Mar 16, 2021 at 10:34:17PM -0500, Josh Poimboeuf wrote: > > > > On Fri, Mar 12, 2021 at 06:16:18PM +0100, Peter Zijlstra wrote: > > > > > --- a/tools/objtool/elf.c > > > > > +++ b/tools/objtool/elf.c > > > > > @@ -479,6 +479,8 @@ void elf_add_reloc(struct elf *elf, stru > > > > > > > > > > list_add_tail(&reloc->list, &sec->reloc_list); > > > > > elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc)); > > > > > + > > > > > + sec->rereloc = true; > > > > > } > > > > > > > > Can we just reuse sec->changed for this? Something like this on top > > > > (untested of course): > > > > > > I think my worry was that we'd dirty too much and slow down the write, > > > but I haven't done any actual performance measurements on this. > > > > Really? I thought my proposal was purely aesthetic, no functional > > change, but my brain is toasty this week due to other distractions so > > who knows. > > I was thinking you could get a section changed without touching > relocations, but while that is theoretically possible, it is exceedingly > unlikely (and objtool doesn't do that). Hm? This is a *relocation* section, not a normal one. So by definition, it only changes when its relocations change. -- Josh
Re: [PATCH 3/3] static_call: Fix static_call_update() sanity check
On Thu, Mar 18, 2021 at 12:31:59PM +0100, Peter Zijlstra wrote: > if (!kernel_text_address((unsigned long)site_addr)) { > - WARN_ONCE(1, "can't patch static call site at > %pS", > + /* > + * This skips patching __exit, which is part of > + * init_section_contains() but is not part of > + * kernel_text_address(). > + * > + * Skipping __exit is fine since it will never > + * be executed. > + */ > + WARN_ONCE(!static_call_is_init(site), > + "can't patch static call site at %pS", > site_addr); > continue; > } It might be good to clarify the situation for __exit in modules in the comment and/or changelog, as they both seem to be implicitly talking only about __exit in vmlinux. For CONFIG_MODULE_UNLOAD, the code ends up in the normal text area, so static_call_is_init() is false and kernel_text_address() is true. For !CONFIG_MODULE_UNLOAD, the code gets discarded during module load, so static_call_is_init() and kernel_text_address() are both false. I guess that will trigger a warning? -- Josh
Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic
On Wed, Mar 17, 2021 at 09:12:15AM +0100, Peter Zijlstra wrote: > On Tue, Mar 16, 2021 at 10:34:17PM -0500, Josh Poimboeuf wrote: > > On Fri, Mar 12, 2021 at 06:16:18PM +0100, Peter Zijlstra wrote: > > > --- a/tools/objtool/elf.c > > > +++ b/tools/objtool/elf.c > > > @@ -479,6 +479,8 @@ void elf_add_reloc(struct elf *elf, stru > > > > > > list_add_tail(&reloc->list, &sec->reloc_list); > > > elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc)); > > > + > > > + sec->rereloc = true; > > > } > > > > Can we just reuse sec->changed for this? Something like this on top > > (untested of course): > > I think my worry was that we'd dirty too much and slow down the write, > but I haven't done any actual performance measurements on this. Really? I thought my proposal was purely aesthetic, no functional change, but my brain is toasty this week due to other distractions so who knows. > Let me do a few runs and see if it matters at all. -- Josh
Re: [PATCH 6/9] objtool: Add elf_create_undef_symbol()
On Wed, Mar 17, 2021 at 03:13:43PM +0100, Peter Zijlstra wrote: > On Wed, Mar 17, 2021 at 02:52:23PM +0100, Miroslav Benes wrote: > > > > + if (!elf_symbol_add(elf, sym, SHN_XINDEX)) { > > > + WARN("elf_symbol_add"); > > > + return NULL; > > > + } > > > > SHN_XINDEX means that the extended section index is used. Above you seem > > to use it in the opposite sense too (assigning to shndx when shndx_data is > > NULL). While it makes the code easier to handle, it is a bit confusing > > (and maybe I am just confused now). Could you add a comment about that, > > please? elf_symbol_add() seems like a good place. > > Yes, that was a horrible thing to do :/ And you understood it right. > > Looking at it again, I'm not sure it is actually correct tho; shouldn't > elf_create_undef_symbol() also look at gelf_getsymshndx() of symtab ? > > What toolchain generates these extended sections and how? That is, how > do I test this crud.. SHN_XINDEX is basically a special-case extension to original ELF for supporting more than 64k sections. -- Josh
Re: [PATCH v4 3/9] x86/entry: Convert ret_from_fork to C
On Wed, Mar 17, 2021 at 11:12:42AM -0700, Andy Lutomirski wrote: > ret_from_fork is written in asm, slightly differently, for x86_32 and > x86_64. Convert it to C. > > This is a straight conversion without any particular cleverness. As a > further cleanup, the code that sets up the ret_from_fork argument registers > could be adjusted to put the arguments in the correct registers. > > This will cause the ORC unwinder to find pt_regs even for kernel threads on > x86_64. This seems harmless. > > The 32-bit comment above the now-deleted schedule_tail_wrapper was > obsolete: the encode_frame_pointer mechanism (see copy_thread()) solves the > same problem more cleanly. > > Cc: Josh Poimboeuf > Signed-off-by: Andy Lutomirski Acked-by: Josh Poimboeuf -- Josh
Re: [PATCH v4 2/9] x86/kthread,dumpstack: Set task_pt_regs->cs.RPL=3 for kernel threads
On Wed, Mar 17, 2021 at 11:12:41AM -0700, Andy Lutomirski wrote: > @@ -163,6 +163,19 @@ int copy_thread(unsigned long clone_flags, unsigned long > sp, unsigned long arg, > /* Kernel thread ? */ > if (unlikely(p->flags & PF_KTHREAD)) { > memset(childregs, 0, sizeof(struct pt_regs)); > + > + /* > + * Even though there is no real user state here, these > + * are were user regs belong, and kernel_execve() will s/were/where/ > + * overwrite them with user regs. Put an obviously > + * invalid value that nonetheless causes user_mode(regs) > + * to return true in CS. > + * > + * This also prevents the unwinder from thinking that there > + * is invalid kernel state at the top of the stack. > + */ > + childregs->cs = 3; > + > kthread_frame_init(frame, sp, arg); > return 0; > } Acked-by: Josh Poimboeuf -- Josh
Re: [PATCH] objtool,static_call: Don't emit static_call_site for .exit.text
On Wed, Mar 17, 2021 at 01:45:57PM +0100, Peter Zijlstra wrote: > arguably it simply isn't a good idea to use static_call() in __exit > code anyway, since module unload is never a performance critical path. Couldn't you make the same argument about __init functions, which are allowed to do static calls? We might consider a STATIC_CALL_SITE_EXIT flag, but I suppose we've run out of flag space. -- Josh
Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic
On Fri, Mar 12, 2021 at 06:16:18PM +0100, Peter Zijlstra wrote: > --- a/tools/objtool/elf.c > +++ b/tools/objtool/elf.c > @@ -479,6 +479,8 @@ void elf_add_reloc(struct elf *elf, stru > > list_add_tail(&reloc->list, &sec->reloc_list); > elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc)); > + > + sec->rereloc = true; > } Can we just reuse sec->changed for this? Something like this on top (untested of course): diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c index addcec88ac9f..b9cb74a54681 100644 --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -480,7 +480,7 @@ void elf_add_reloc(struct elf *elf, struct reloc *reloc) list_add_tail(&reloc->list, &sec->reloc_list); elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc)); - sec->rereloc = true; + sec->changed = true; } static int read_rel_reloc(struct section *sec, int i, struct reloc *reloc, unsigned int *symndx) @@ -882,9 +882,6 @@ int elf_rebuild_reloc_section(struct elf *elf, struct section *sec) struct reloc *reloc; int nr; - sec->changed = true; - elf->changed = true; - nr = 0; list_for_each_entry(reloc, &sec->reloc_list, list) nr++; @@ -894,8 +891,6 @@ int elf_rebuild_reloc_section(struct elf *elf, struct section *sec) case SHT_RELA: return elf_rebuild_rela_reloc_section(sec, nr); default: return -1; } - - sec->rereloc = false; } int elf_write_insn(struct elf *elf, struct section *sec, @@ -950,14 +945,15 @@ int elf_write(struct elf *elf) struct section *sec; Elf_Scn *s; - list_for_each_entry(sec, &elf->sections, list) { - if (sec->reloc && sec->reloc->rereloc) - elf_rebuild_reloc_section(elf, sec->reloc); - } - - /* Update section headers for changed sections: */ + /* Update changed relocation sections and section headers: */ list_for_each_entry(sec, &elf->sections, list) { if (sec->changed) { + if (sec->reloc && + elf_rebuild_reloc_section(elf, sec->reloc)) { + WARN_ELF("elf_rebuild_reloc_section"); + return -1; + } + s = elf_getscn(elf->elf, sec->idx); if (!s) { WARN_ELF("elf_getscn"); @@ -969,6 +965,7 @@ int elf_write(struct elf 
*elf) } sec->changed = false; + elf->changed = true; } } diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h index 9fdd4c5f9f32..e6890cc70a25 100644 --- a/tools/objtool/include/objtool/elf.h +++ b/tools/objtool/include/objtool/elf.h @@ -39,7 +39,7 @@ struct section { char *name; int idx; unsigned int len; - bool changed, text, rodata, noinstr, rereloc; + bool changed, text, rodata, noinstr; }; struct symbol {
Re: [PATCH 4/9] objtool: Fix static_call list generation
On Fri, Mar 12, 2021 at 06:16:17PM +0100, Peter Zijlstra wrote: > @@ -1701,6 +1706,9 @@ static int decode_sections(struct objtoo > if (ret) > return ret; > > + /* > + * Must be before add_{jump_call}_desetination. > + */ s/desetination/destination/ -- Josh
Re: [PATCH 2/9] objtool: Correctly handle retpoline thunk calls
On Fri, Mar 12, 2021 at 06:16:15PM +0100, Peter Zijlstra wrote: > Just like JMP handling, convert a direct CALL to a retpoline thunk > into a retpoline safe indirect CALL. > > Signed-off-by: Peter Zijlstra (Intel) > --- > tools/objtool/check.c | 12 > 1 file changed, 12 insertions(+) > > --- a/tools/objtool/check.c > +++ b/tools/objtool/check.c > @@ -953,6 +953,18 @@ static int add_call_destinations(struct > dest_off); > return -1; > } > + > + } else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", > 21)) { > + /* > + * Retpoline calls are really dynamic calls in > + * disguise, so convert them accodingly. s/accodingly/accordingly/ -- Josh
Re: [PATCH -tip v2 00/10] kprobes: Fix stacktrace with kretprobes
On Fri, Mar 12, 2021 at 03:41:44PM +0900, Masami Hiramatsu wrote: > Hello, > > Here is the 2nd version of the series to fix the stacktrace with kretprobe. > > The 1st series is here; > > https://lore.kernel.org/bpf/161495873696.346821.10161501768906432924.stgit@devnote2/ > > In this version I merged the ORC unwinder fix for kretprobe which discussed > in the > previous thread. [3/10] is updated according to the Miroslav's comment. > [4/10] is > updated for simplify the code. [5/10]-[9/10] are discussed in the previsous > tread > and are introduced to the series. > > Daniel, can you also test this again? I and Josh discussed a bit different > method and I've implemented it on this version. > > This actually changes the kretprobe behavisor a bit, now the instraction > pointer in > the pt_regs passed to kretprobe user handler is correctly set the real return > address. So user handlers can get it via instruction_pointer() API. When I add WARN_ON(1) to a test kretprobe, it doesn't unwind properly. show_trace_log_lvl() reads the entire stack in lockstep with calls to the unwinder so that it can decide which addresses get prefixed with '?'. So it expects to find an actual return address on the stack. Instead it finds %rsp. So it never syncs up with unwind_next_frame() and shows all remaining addresses as unreliable. Call Trace: __kretprobe_trampoline_handler+0xca/0x1a0 trampoline_handler+0x3d/0x50 kretprobe_trampoline+0x25/0x50 ? init_test_probes.cold+0x323/0x387 ? init_kprobes+0x144/0x14c ? init_optprobes+0x15/0x15 ? do_one_initcall+0x5b/0x300 ? lock_is_held_type+0xe8/0x140 ? kernel_init_freeable+0x174/0x2cd ? rest_init+0x233/0x233 ? kernel_init+0xa/0x11d ? ret_from_fork+0x22/0x30 How about pushing 'kretprobe_trampoline' instead of %rsp for the return address placeholder. 
That fixes the above test, and removes the need for the weird 'state->ip == sp' check: Call Trace: __kretprobe_trampoline_handler+0xca/0x150 trampoline_handler+0x3d/0x50 kretprobe_trampoline+0x29/0x50 ? init_test_probes.cold+0x323/0x387 elfcorehdr_read+0x10/0x10 init_kprobes+0x144/0x14c ? init_optprobes+0x15/0x15 do_one_initcall+0x72/0x280 kernel_init_freeable+0x174/0x2cd ? rest_init+0x122/0x122 kernel_init+0xa/0x10e ret_from_fork+0x22/0x30 Though, init_test_probes.cold() (the real return address) is still listed as unreliable. Maybe we need a call to kretprobe_find_ret_addr() in show_trace_log_lvl(), similar to the ftrace_graph_ret_addr() call there. diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c index 06f33bfebc50..70188fdd288e 100644 --- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -766,19 +766,19 @@ asm( "kretprobe_trampoline:\n" /* We don't bother saving the ss register */ #ifdef CONFIG_X86_64 - " pushq %rsp\n" + /* Push fake return address to tell the unwinder it's a kretprobe */ + " pushq $kretprobe_trampoline\n" UNWIND_HINT_FUNC " pushfq\n" SAVE_REGS_STRING " movq %rsp, %rdi\n" " call trampoline_handler\n" - /* Replace saved sp with true return address. */ + /* Replace the fake return address with the real one. */ " movq %rax, 19*8(%rsp)\n" RESTORE_REGS_STRING " popfq\n" #else " pushl %esp\n" - UNWIND_HINT_FUNC " pushfl\n" SAVE_REGS_STRING " movl %esp, %eax\n" diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 1d1b9388a1b1..1d3de84d2410 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -548,8 +548,7 @@ bool unwind_next_frame(struct unwind_state *state) * In those cases, find the correct return address from * task->kretprobe_instances list. */ - if (state->ip == sp || - is_kretprobe_trampoline(state->ip)) + if (is_kretprobe_trampoline(state->ip)) state->ip = kretprobe_find_ret_addr(state->task, &state->kr_iter);
Re: [GIT pull] locking/urgent for v5.12-rc3
On Mon, Mar 15, 2021 at 01:08:27PM +0100, Peter Zijlstra wrote: > On Mon, Mar 15, 2021 at 12:26:12PM +0100, Peter Zijlstra wrote: > > Ooooh, modules don't have this. They still have regular > > .static_call_sites sections, and *those* are unaligned. > > > > Section Headers: > > [Nr] Name Type Address Off Size ES > > Flg Lk Inf Al > > > > [16] .static_call_sites PROGBITS 008aa1 0006f0 00 > > WA 0 0 1 > > > > And that goes *BOOM*.. Let me see if I can figure out how to make > > objtool align those sections. > > The below seems to have cured it: > > [16] .static_call_sites PROGBITS 008aa8 0006f0 00 WA > 0 0 8 > > > So, anybody any opinion on if we ought to do this? I'd say yes to alignment, for the sake of consistency with vmlinux. Though instead of using objtool, it can be done in the module linker script: diff --git a/scripts/module.lds.S b/scripts/module.lds.S index 168cd27e6122..73345cbfe100 100644 --- a/scripts/module.lds.S +++ b/scripts/module.lds.S @@ -17,6 +17,7 @@ SECTIONS { .init_array 0 : ALIGN(8) { *(SORT(.init_array.*)) *(.init_array) } __jump_table 0 : ALIGN(8) { KEEP(*(__jump_table)) } + .static_call_sites 0 : ALIGN(8) { KEEP(*(.static_call_sites)) } __patchable_function_entries : { *(__patchable_function_entries) }
Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes
On Thu, Mar 11, 2021 at 10:54:38AM +0900, Masami Hiramatsu wrote: > On Wed, 10 Mar 2021 19:06:15 -0600 > Josh Poimboeuf wrote: > > > On Thu, Mar 11, 2021 at 09:20:18AM +0900, Masami Hiramatsu wrote: > > > > > bool unwind_next_frame(struct unwind_state *state) > > > > > { > > > > > unsigned long ip_p, sp, tmp, orig_ip = state->ip, prev_sp = > > > > > state->sp; > > > > > @@ -536,6 +561,18 @@ bool unwind_next_frame(struct unwind_state > > > > > *state) > > > > > > > > > > state->ip = ftrace_graph_ret_addr(state->task, > > > > > >graph_idx, > > > > > state->ip, (void > > > > > *)ip_p); > > > > > + /* > > > > > + * There are special cases when the stack unwinder is > > > > > called > > > > > + * from the kretprobe handler or the interrupt handler > > > > > which > > > > > + * occurs in the kretprobe trampoline code. In those > > > > > cases, > > > > > + * %sp is shown on the stack instead of the return > > > > > address. > > > > > + * Or, when the unwinder find the return address is > > > > > replaced > > > > > + * by kretprobe_trampoline. > > > > > + * In those cases, correct address can be found in > > > > > kretprobe. > > > > > + */ > > > > > + if (state->ip == sp || > > > > > > > > Why is the 'state->ip == sp' needed? > > > > > > As I commented above, until kretprobe_trampoline writes back the real > > > address to the stack, sp value is there (which has been pushed by the > > > 'pushq %rsp' at the entry of kretprobe_trampoline.) > > > > > > ".type kretprobe_trampoline, @function\n" > > > "kretprobe_trampoline:\n" > > > /* We don't bother saving the ss register */ > > > " pushq %rsp\n" // THIS > > > " pushfq\n" > > > > > > Thus, from inside the kretprobe handler, like ftrace, you'll see > > > the sp value instead of the real return address. > > > > I see. 
If you change is_kretprobe_trampoline_address() to include the > > entire function, like: > > > > static bool is_kretprobe_trampoline_address(unsigned long ip) > > { > > return (void *)ip >= kretprobe_trampoline && > >(void *)ip < kretprobe_trampoline_end; > > } > > > > then the unwinder won't ever read the bogus %rsp value into state->ip, > > and the 'state->ip == sp' check can be removed. > > Hmm, I couldn't get your point. Since sp is the address of stack, > it always out of text address. When unwinding from trampoline_handler(), state->ip will point to the instruction after the call: call trampoline_handler movq %rax, 19*8(%rsp) <-- state->ip points to this insn But then, the above version of is_kretprobe_trampoline_address() is true, so state->ip gets immediately replaced with the real return address: if (is_kretprobe_trampoline_address(state->ip)) state->ip = orc_kretprobe_correct_ip(state); so the unwinder skips over the kretprobe_trampoline() frame and goes straight to the frame of the real return address. Thus it never reads this bogus return value into state->ip: pushq %rsp which is why the weird 'state->ip == sp' check is no longer needed. The only "downside" is that the unwinder skips the kretprobe_trampoline() frame. (note that downside wouldn't exist in the case of UNWIND_HINT_REGS + valid regs->ip). > > > > And it would make the unwinder just work automatically when unwinding > > > > from the handler using the regs. > > > > > > > > It would also work when unwinding from the handler's stack, if we put an > > > > UNWIND_HINT_REGS after saving the regs. > > > > > > At that moment, the real return address is not identified. So we can not > > > put it. > > > > True, at the time the regs are originally saved, the real return address > > isn't available. But by the time the user handler is called, the return > > address *is* available. So if the real return address were placed in > > regs->ip befo
Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes
On Thu, Mar 11, 2021 at 09:20:18AM +0900, Masami Hiramatsu wrote: > > > bool unwind_next_frame(struct unwind_state *state) > > > { > > > unsigned long ip_p, sp, tmp, orig_ip = state->ip, prev_sp = state->sp; > > > @@ -536,6 +561,18 @@ bool unwind_next_frame(struct unwind_state *state) > > > > > > state->ip = ftrace_graph_ret_addr(state->task, > > > &state->graph_idx, > > > state->ip, (void *)ip_p); > > > + /* > > > + * There are special cases when the stack unwinder is called > > > + * from the kretprobe handler or the interrupt handler which > > > + * occurs in the kretprobe trampoline code. In those cases, > > > + * %sp is shown on the stack instead of the return address. > > > + * Or, when the unwinder find the return address is replaced > > > + * by kretprobe_trampoline. > > > + * In those cases, correct address can be found in kretprobe. > > > + */ > > > + if (state->ip == sp || > > > > Why is the 'state->ip == sp' needed? > > As I commented above, until kretprobe_trampoline writes back the real > address to the stack, sp value is there (which has been pushed by the > 'pushq %rsp' at the entry of kretprobe_trampoline.) > > ".type kretprobe_trampoline, @function\n" > "kretprobe_trampoline:\n" > /* We don't bother saving the ss register */ > " pushq %rsp\n" // THIS > " pushfq\n" > > Thus, from inside the kretprobe handler, like ftrace, you'll see > the sp value instead of the real return address. I see. If you change is_kretprobe_trampoline_address() to include the entire function, like: static bool is_kretprobe_trampoline_address(unsigned long ip) { return (void *)ip >= kretprobe_trampoline && (void *)ip < kretprobe_trampoline_end; } then the unwinder won't ever read the bogus %rsp value into state->ip, and the 'state->ip == sp' check can be removed. > > And it would make the unwinder just work automatically when unwinding > > from the handler using the regs. 
> > > > It would also work when unwinding from the handler's stack, if we put an > > UNWIND_HINT_REGS after saving the regs. > > At that moment, the real return address is not identified. So we can not > put it. True, at the time the regs are originally saved, the real return address isn't available. But by the time the user handler is called, the return address *is* available. So if the real return address were placed in regs->ip before calling the handler, the unwinder could find it there, when called from the handler. Then we wouldn't need the call to orc_kretprobe_correct_ip() in __unwind_start(). But maybe it's not possible due to the regs->ip expectations of legacy handlers? -- Josh
Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes
On Thu, Mar 11, 2021 at 12:55:09AM +0900, Masami Hiramatsu wrote: > +#ifdef CONFIG_KRETPROBES > +static unsigned long orc_kretprobe_correct_ip(struct unwind_state *state) > +{ > + return kretprobe_find_ret_addr( > + (unsigned long)kretprobe_trampoline_addr(), > + state->task, &state->kr_iter); > +} > + > +static bool is_kretprobe_trampoline_address(unsigned long ip) > +{ > + return ip == (unsigned long)kretprobe_trampoline_addr(); > +} > +#else > +static unsigned long orc_kretprobe_correct_ip(struct unwind_state *state) > +{ > + return state->ip; > +} > + > +static bool is_kretprobe_trampoline_address(unsigned long ip) > +{ > + return false; > +} > +#endif > + Can this code go in a kprobes file? I'd rather not clutter ORC with it, and maybe it would be useful for other arches or unwinders. > bool unwind_next_frame(struct unwind_state *state) > { > unsigned long ip_p, sp, tmp, orig_ip = state->ip, prev_sp = state->sp; > @@ -536,6 +561,18 @@ bool unwind_next_frame(struct unwind_state *state) > > state->ip = ftrace_graph_ret_addr(state->task, > &state->graph_idx, > state->ip, (void *)ip_p); > + /* > + * There are special cases when the stack unwinder is called > + * from the kretprobe handler or the interrupt handler which > + * occurs in the kretprobe trampoline code. In those cases, > + * %sp is shown on the stack instead of the return address. > + * Or, when the unwinder find the return address is replaced > + * by kretprobe_trampoline. > + * In those cases, correct address can be found in kretprobe. > + */ > + if (state->ip == sp || Why is the 'state->ip == sp' needed? > + is_kretprobe_trampoline_address(state->ip)) > + state->ip = orc_kretprobe_correct_ip(state); This is similar in concept to ftrace_graph_ret_addr(), right? Would it be possible to have a similar API? Like state->ip = kretprobe_ret_addr(state->task, &state->kr_iter, state->ip); and without the conditional.
> > state->sp = sp; > state->regs = NULL; > @@ -649,6 +686,12 @@ void __unwind_start(struct unwind_state *state, struct > task_struct *task, > state->full_regs = true; > state->signal = true; > > + /* > + * When the unwinder called with regs from kretprobe handler, > + * the regs->ip starts from kretprobe_trampoline address. > + */ > + if (is_kretprobe_trampoline_address(state->ip)) > + state->ip = orc_kretprobe_correct_ip(state); Shouldn't __kretprobe_trampoline_handler() just set regs->ip to 'correct_ret_addr' before passing the regs to the handler? I'd think that would be a less surprising value for regs->ip than '_trampoline'. And it would make the unwinder just work automatically when unwinding from the handler using the regs. It would also work when unwinding from the handler's stack, if we put an UNWIND_HINT_REGS after saving the regs. The only (rare) case it wouldn't work would be unwinding from an interrupt before regs->ip gets set properly. In which case we'd still need the above call to orc_kretprobe_correct_ip() or so. -- Josh
Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes
On Wed, Mar 10, 2021 at 06:57:34PM +0900, Masami Hiramatsu wrote: > > If I understand correctly, for #1 you need an unwind hint which treats > > the instruction *after* the "pushq %rsp" as the beginning of the > > function. > > Thanks for the patch. In that case, should I still change the stack > allocation? > Or can I continue to use a series of "push/pop" ? You can continue to use push/pop. Objtool is only getting confused by the unbalanced stack of the function (more pushes than pops). The unwind hint should fix that. -- Josh
Re: [PATCH] kbuild: rebuild GCC plugins when the compiler is upgraded
On Fri, Mar 05, 2021 at 08:50:59PM -0600, Josh Poimboeuf wrote: > > Is this a bad coding contest? > > > > I am not asking you to add ugly ifeq or whatever > > hacks to say "this worked for me". > > > > Please feel free to do this in the fedora kernel, > > but do not send it to upstream. > > > > Sorry, I really do not want to see hacks like this any more. Masahiro, Ping. Do you have a better approach for building GCC plugins in the external module directory? -- Josh
Re: [PATCH] objtool: Fix a memory leak bug
On Tue, Mar 09, 2021 at 04:46:16PM +0800, Jiapeng Chong wrote: > Fix the following cppcheck warnings: > > tools/objtool/check.c(1102): error: Memory leak: orig_alt_group. > > Reported-by: Abaci Robot > Signed-off-by: Jiapeng Chong Hi Jiapeng, Objtool is a short-running process which exits immediately on error, so we don't worry about memory leaks. -- Josh
Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes
On Mon, Mar 08, 2021 at 11:52:10AM +0900, Masami Hiramatsu wrote: > So at the kretprobe handler, we have 2 issues. > 1) the return address (caller_func+0x15) is not on the stack. > this can be solved by searching from current->kretprobe_instances. > 2) the stack frame size of kretprobe_trampoline is unknown > Since the stackframe is fixed, the fixed number (0x98) can be used. > > However, those solutions are only for the kretprobe handler. The stacktrace > from interrupt handler hit in the kretprobe_trampoline still doesn't work. > > So, here is my idea; > > 1) Change the trampoline code to prepare stack frame at first and save > registers on it, instead of "push". This will make ORC easy to setup > stackframe information for this code. > 2) change the return address fixup timing. Instead of using return value > of trampoline handler, before removing the real return address from > current->kretprobe_instances. > 3) Then, if orc_find() finds the ip is in the kretprobe_trampoline, it > checks the contents of the end of stackframe (at the place of regs->sp) > is same as the address of it. If it is, it can find the correct address > from current->kretprobe_instances. If not, there is a correct address. > > I need to find how the ORC info is prepared for 1), maybe UNWIND_HINT macro > may help? Hi Masami, If I understand correctly, for #1 you need an unwind hint which treats the instruction *after* the "pushq %rsp" as the beginning of the function.
I'm thinking this should work: diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h index 8e574c0afef8..8b33674288ea 100644 --- a/arch/x86/include/asm/unwind_hints.h +++ b/arch/x86/include/asm/unwind_hints.h @@ -52,6 +52,11 @@ UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=8 type=UNWIND_HINT_TYPE_FUNC .endm +#else + +#define UNWIND_HINT_FUNC \ + UNWIND_HINT(ORC_REG_SP, 8, UNWIND_HINT_TYPE_FUNC, 0) + #endif /* __ASSEMBLY__ */ #endif /* _ASM_X86_UNWIND_HINTS_H */ diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c index df776cdca327..38ff1387f95d 100644 --- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -767,6 +767,7 @@ asm( /* We don't bother saving the ss register */ #ifdef CONFIG_X86_64 " pushq %rsp\n" + UNWIND_HINT_FUNC " pushfq\n" SAVE_REGS_STRING " movq %rsp, %rdi\n" @@ -790,7 +791,6 @@ asm( ".size kretprobe_trampoline, .-kretprobe_trampoline\n" ); NOKPROBE_SYMBOL(kretprobe_trampoline); -STACK_FRAME_NON_STANDARD(kretprobe_trampoline); /*
[tip: x86/urgent] x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: e504e74cc3a2c092b05577ce3e8e013fae7d94e6 Gitweb: https://git.kernel.org/tip/e504e74cc3a2c092b05577ce3e8e013fae7d94e6 Author:Josh Poimboeuf AuthorDate:Fri, 05 Feb 2021 08:24:02 -06:00 Committer: Borislav Petkov CommitterDate: Sat, 06 Mar 2021 13:09:37 +01:00 x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2 KASAN reserves "redzone" areas between stack frames in order to detect stack overruns. A read or write to such an area triggers a KASAN "stack-out-of-bounds" BUG. Normally, the ORC unwinder stays in-bounds and doesn't access the redzone. But sometimes it can't find ORC metadata for a given instruction. This can happen for code which is missing ORC metadata, or for generated code. In such cases, the unwinder attempts to fall back to frame pointers, as a best-effort type thing. This fallback often works, but when it doesn't, the unwinder can get confused and go off into the weeds into the KASAN redzone, triggering the aforementioned KASAN BUG. But in this case, the unwinder's confusion is actually harmless and working as designed. It already has checks in place to prevent off-stack accesses, but those checks get short-circuited by the KASAN BUG. And a BUG is a lot more disruptive than a harmless unwinder warning. Disable the KASAN checks by using READ_ONCE_NOCHECK() for all stack accesses. This finishes the job started by commit 881125bfe65b ("x86/unwind: Disable KASAN checking in the ORC unwinder"), which only partially fixed the issue. 
Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Reported-by: Ivan Babrou Signed-off-by: Josh Poimboeuf Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Borislav Petkov Reviewed-by: Steven Rostedt (VMware) Tested-by: Ivan Babrou Cc: sta...@kernel.org Link: https://lkml.kernel.org/r/9583327904ebbbeda399eca9c56d6c7085ac20fe.1612534649.git.jpoim...@redhat.com --- arch/x86/kernel/unwind_orc.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 2a1d47f..1bcc14c 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -367,8 +367,8 @@ static bool deref_stack_regs(struct unwind_state *state, unsigned long addr, if (!stack_access_ok(state, addr, sizeof(struct pt_regs))) return false; - *ip = regs->ip; - *sp = regs->sp; + *ip = READ_ONCE_NOCHECK(regs->ip); + *sp = READ_ONCE_NOCHECK(regs->sp); return true; } @@ -380,8 +380,8 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr if (!stack_access_ok(state, addr, IRET_FRAME_SIZE)) return false; - *ip = regs->ip; - *sp = regs->sp; + *ip = READ_ONCE_NOCHECK(regs->ip); + *sp = READ_ONCE_NOCHECK(regs->sp); return true; } @@ -402,12 +402,12 @@ static bool get_reg(struct unwind_state *state, unsigned int reg_off, return false; if (state->full_regs) { - *val = ((unsigned long *)state->regs)[reg]; + *val = READ_ONCE_NOCHECK(((unsigned long *)state->regs)[reg]); return true; } if (state->prev_regs) { - *val = ((unsigned long *)state->prev_regs)[reg]; + *val = READ_ONCE_NOCHECK(((unsigned long *)state->prev_regs)[reg]); return true; }
[tip: x86/urgent] x86/unwind/orc: Silence warnings caused by missing ORC data
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: b59cc97674c947861783ca92b9a6e7d043adba96 Gitweb: https://git.kernel.org/tip/b59cc97674c947861783ca92b9a6e7d043adba96 Author:Josh Poimboeuf AuthorDate:Fri, 05 Feb 2021 08:24:03 -06:00 Committer: Borislav Petkov CommitterDate: Sat, 06 Mar 2021 13:09:45 +01:00 x86/unwind/orc: Silence warnings caused by missing ORC data The ORC unwinder attempts to fall back to frame pointers when ORC data is missing for a given instruction. It sets state->error, but then tries to keep going as a best-effort type of thing. That may result in further warnings if the unwinder gets lost. Until we have some way to register generated code with the unwinder, missing ORC will be expected, and occasionally going off the rails will also be expected. So don't warn about it. Signed-off-by: Josh Poimboeuf Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Borislav Petkov Tested-by: Ivan Babrou Link: https://lkml.kernel.org/r/06d02c4bbb220bd31668db579278b0352538efbb.1612534649.git.jpoim...@redhat.com --- arch/x86/kernel/unwind_orc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 1bcc14c..a120253 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -13,7 +13,7 @@ #define orc_warn_current(args...) \ ({ \ - if (state->task == current) \ + if (state->task == current && !state->error)\ orc_warn(args); \ })
[tip: x86/urgent] x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: 8bd7b3980ca62904814d536b3a2453001992a0c3 Gitweb: https://git.kernel.org/tip/8bd7b3980ca62904814d536b3a2453001992a0c3 Author:Josh Poimboeuf AuthorDate:Fri, 05 Feb 2021 08:24:02 -06:00 Committer: Borislav Petkov CommitterDate: Sat, 06 Mar 2021 11:37:00 +01:00 x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2 KASAN reserves "redzone" areas between stack frames in order to detect stack overruns. A read or write to such an area triggers a KASAN "stack-out-of-bounds" BUG. Normally, the ORC unwinder stays in-bounds and doesn't access the redzone. But sometimes it can't find ORC metadata for a given instruction. This can happen for code which is missing ORC metadata, or for generated code. In such cases, the unwinder attempts to fall back to frame pointers, as a best-effort type thing. This fallback often works, but when it doesn't, the unwinder can get confused and go off into the weeds into the KASAN redzone, triggering the aforementioned KASAN BUG. But in this case, the unwinder's confusion is actually harmless and working as designed. It already has checks in place to prevent off-stack accesses, but those checks get short-circuited by the KASAN BUG. And a BUG is a lot more disruptive than a harmless unwinder warning. Disable the KASAN checks by using READ_ONCE_NOCHECK() for all stack accesses. This finishes the job started by commit 881125bfe65b ("x86/unwind: Disable KASAN checking in the ORC unwinder"), which only partially fixed the issue. 
Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Reported-by: Ivan Babrou Signed-off-by: Josh Poimboeuf Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Steven Rostedt (VMware) Tested-by: Ivan Babrou Cc: sta...@kernel.org Link: https://lkml.kernel.org/r/9583327904ebbbeda399eca9c56d6c7085ac20fe.1612534649.git.jpoim...@redhat.com --- arch/x86/kernel/unwind_orc.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 2a1d47f..1bcc14c 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -367,8 +367,8 @@ static bool deref_stack_regs(struct unwind_state *state, unsigned long addr, if (!stack_access_ok(state, addr, sizeof(struct pt_regs))) return false; - *ip = regs->ip; - *sp = regs->sp; + *ip = READ_ONCE_NOCHECK(regs->ip); + *sp = READ_ONCE_NOCHECK(regs->sp); return true; } @@ -380,8 +380,8 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr if (!stack_access_ok(state, addr, IRET_FRAME_SIZE)) return false; - *ip = regs->ip; - *sp = regs->sp; + *ip = READ_ONCE_NOCHECK(regs->ip); + *sp = READ_ONCE_NOCHECK(regs->sp); return true; } @@ -402,12 +402,12 @@ static bool get_reg(struct unwind_state *state, unsigned int reg_off, return false; if (state->full_regs) { - *val = ((unsigned long *)state->regs)[reg]; + *val = READ_ONCE_NOCHECK(((unsigned long *)state->regs)[reg]); return true; } if (state->prev_regs) { - *val = ((unsigned long *)state->prev_regs)[reg]; + *val = READ_ONCE_NOCHECK(((unsigned long *)state->prev_regs)[reg]); return true; }
[tip: x86/urgent] x86/unwind/orc: Silence warnings caused by missing ORC data
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: d072f941c1e234f8495cc4828370b180318bf49b Gitweb: https://git.kernel.org/tip/d072f941c1e234f8495cc4828370b180318bf49b Author:Josh Poimboeuf AuthorDate:Fri, 05 Feb 2021 08:24:03 -06:00 Committer: Borislav Petkov CommitterDate: Sat, 06 Mar 2021 11:37:00 +01:00 x86/unwind/orc: Silence warnings caused by missing ORC data The ORC unwinder attempts to fall back to frame pointers when ORC data is missing for a given instruction. It sets state->error, but then tries to keep going as a best-effort type of thing. That may result in further warnings if the unwinder gets lost. Until we have some way to register generated code with the unwinder, missing ORC will be expected, and occasionally going off the rails will also be expected. So don't warn about it. Signed-off-by: Josh Poimboeuf Signed-off-by: Peter Zijlstra (Intel) Tested-by: Ivan Babrou Link: https://lkml.kernel.org/r/06d02c4bbb220bd31668db579278b0352538efbb.1612534649.git.jpoim...@redhat.com --- arch/x86/kernel/unwind_orc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 1bcc14c..a120253 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -13,7 +13,7 @@ #define orc_warn_current(args...) \ ({ \ - if (state->task == current) \ + if (state->task == current && !state->error)\ orc_warn(args); \ })
Re: [PATCH] kbuild: rebuild GCC plugins when the compiler is upgraded
On Sat, Mar 06, 2021 at 11:18:31AM +0900, Masahiro Yamada wrote: > On Sat, Mar 6, 2021 at 10:28 AM Josh Poimboeuf wrote: > > > > On Thu, Mar 04, 2021 at 08:25:00PM -0600, Josh Poimboeuf wrote: > > > On Thu, Mar 04, 2021 at 03:37:14PM -0800, Linus Torvalds wrote: > > > > On Thu, Mar 4, 2021 at 3:20 PM Kees Cook wrote: > > > > > > > > > > This seems fine to me, but I want to make sure Josh has somewhere to > > > > > actually go with this. Josh, does this get you any closer? > > > > > > No, this doesn't seem to help me at all. > > > > > > > > It sounds like the plugins need to move to another location for > > > > > packaged kernels? > > > > > > > > Well, it might be worth extending the stuff that gets installed with > > > > /lib/modules// with enough information and > > > > infrastructure to then build any external modules. > > > > > > The gcc plugins live in scripts/, which get installed by "make > > > modules_install" already. So the plugins' source and makefiles are in > > > /lib/modules//build/scripts/gcc-plugins. > > > > > > So everything needed for building the plugins is already there. We just > > > need the kernel makefiles to rebuild the plugins locally, when building > > > an external module. > > > > This seems to work with very limited testing... Based on top of > > Masahiro's recent patch: > > > > > > https://lkml.kernel.org/r/cak7lnarhotnz3gavhgnyb4n-wyuboxc10a6zurh1odghxwd...@mail.gmail.com > > Is this a bad coding contest? > > I am not asking you to add ugly ifeq or whatever > hacks to say "this worked for me". > > Please feel free to do this in the fedora kernel, > but do not send it to upstream. > > Sorry, I really do not want to see hacks like this any more. Geez, that's a bit harsh. I never claimed to be a brilliant makefile coder. Do you have any constructive feedback for improving the patch? > Remember, how badly objtool was integrated in the build system, > and you even blocked me from fixing that.
I have no idea what you're talking about, nor how it would be relevant to this patch. -- Josh
Re: [PATCH] kbuild: rebuild GCC plugins when the compiler is upgraded
On Thu, Mar 04, 2021 at 08:25:00PM -0600, Josh Poimboeuf wrote: > On Thu, Mar 04, 2021 at 03:37:14PM -0800, Linus Torvalds wrote: > > On Thu, Mar 4, 2021 at 3:20 PM Kees Cook wrote: > > > > > > This seems fine to me, but I want to make sure Josh has somewhere to > > > actually go with this. Josh, does this get you any closer? > > No, this doesn't seem to help me at all. > > > > It sounds like the plugins need to move to another location for > > > packaged kernels? > > > > Well, it might be worth extending the stuff that gets installed with > > /lib/modules// with enough information and > > infrastructure to then build any external modules. > > The gcc plugins live in scripts/, which get installed by "make > modules_install" already. So the plugins' source and makefiles are in > /lib/modules//build/scripts/gcc-plugins. > > So everything needed for building the plugins is already there. We just > need the kernel makefiles to rebuild the plugins locally, when building > an external module. This seems to work with very limited testing... Based on top of Masahiro's recent patch: https://lkml.kernel.org/r/cak7lnarhotnz3gavhgnyb4n-wyuboxc10a6zurh1odghxwd...@mail.gmail.com From: Josh Poimboeuf Subject: [PATCH] gcc-plugins: Rebuild plugins in external module directory When building external kernel modules, the build system doesn't require the GCC version to match the version used to build the original kernel. In fact, most distros release the compiler and the kernel in separate packages, with separate release cadences. So it's not uncommon for mismatches to occur. But with GCC plugins enabled, that's no longer allowed: cc1: error: incompatible gcc/plugin versions cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so That error comes from the plugin's call to plugin_default_version_check(), which strictly enforces the GCC version.
The strict check makes sense, because there's nothing to prevent the GCC plugin ABI from changing, and it often does. Since plugins are tightly tied to the compiler version, just rebuild them locally in the external module directory, and then use the local version in the external module build. Reported-by: Ondrej Mosnacek Signed-off-by: Josh Poimboeuf --- Makefile | 1 + scripts/Makefile.gcc-plugins | 2 +- scripts/gcc-plugins/Makefile | 8 ++-- 3 files changed, 8 insertions(+), 3 deletions(-) diff --git a/Makefile b/Makefile index bc208886fcce..90c6656de224 100644 --- a/Makefile +++ b/Makefile @@ -1784,6 +1784,7 @@ prepare: echo " The kernel was built by: "$(CONFIG_CC_VERSION_TEXT); \ echo " You are using: $(CC_VERSION_TEXT)"; \ fi + $(Q)$(MAKE) $(build)=scripts/gcc-plugins PHONY += help help: diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins index 952e46876329..be4303678942 100644 --- a/scripts/Makefile.gcc-plugins +++ b/scripts/Makefile.gcc-plugins @@ -48,7 +48,7 @@ export DISABLE_ARM_SSP_PER_TASK_PLUGIN # All the plugin CFLAGS are collected here in case a build target needs to # filter them out of the KBUILD_CFLAGS. -GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y)) +GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(if $(KBUILD_EXTMOD),$(KBUILD_EXTMOD),$(objtree))/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y)) # The sancov_plugin.so is included via CFLAGS_KCOV, so it is removed here. 
GCC_PLUGINS_CFLAGS := $(filter-out %/sancov_plugin.so, $(GCC_PLUGINS_CFLAGS)) export GCC_PLUGINS_CFLAGS diff --git a/scripts/gcc-plugins/Makefile b/scripts/gcc-plugins/Makefile index b5487cce69e8..9f8e2ef3ab56 100644 --- a/scripts/gcc-plugins/Makefile +++ b/scripts/gcc-plugins/Makefile @@ -1,10 +1,14 @@ # SPDX-License-Identifier: GPL-2.0 -$(obj)/randomize_layout_plugin.so: $(objtree)/$(obj)/randomize_layout_seed.h +ifneq ($(KBUILD_EXTMOD),) +override obj := $(KBUILD_EXTMOD)/$(obj) +endif + +$(obj)/randomize_layout_plugin.so: $(objtree)/$(src)/randomize_layout_seed.h quiet_cmd_create_randomize_layout_seed = GENSEED $@ cmd_create_randomize_layout_seed = \ $(CONFIG_SHELL) $(srctree)/$(src)/gen-random-seed.sh $@ $(objtree)/include/generated/randomize_layout_hash.h -$(objtree)/$(obj)/randomize_layout_seed.h: FORCE +$(objtree)/$(src)/randomize_layout_seed.h: FORCE $(call if_changed,create_randomize_layout_seed) targets += randomize_layout_seed.h randomize_layout_hash.h -- 2.29.2
Re: [PATCH] x86: kprobes: orc: Fix ORC walks in kretprobes
On Fri, Mar 05, 2021 at 11:25:15AM -0800, Daniel Xu wrote: > > BTW, is this a regression? or CONFIG_UNWINDER_ORC has this issue before? > > It seems that the above commit just changed the default unwinder. This means > > OCR stack unwinder has this bug before that commit. > > I see your point -- I suppose it depends on point of view. Viewed from > userspace, a change in kernel defaults means that one kernel worked and > the next one didn't -- all without the user doing anything. Consider it > from the POV of a typical linux user who just takes whatever the distro > gives them and doesn't compile their own kernels. > > From the kernel point of view, you're also right. ORC didn't regress, it > was always broken for this particular use case. But as a primarily > userspace developer, I would consider this a kernel regression. Either way, if the bug has always existed in the ORC unwinder, the Fixes tag needs to reference the original ORC commit: Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") That makes it possible for stable kernels to identify the source of the bug and corresponding fix. Many people used the ORC unwinder before it became the default. -- Josh
Re: [PATCH RFC] kbuild: Prevent compiler mismatch with external modules
On Sat, Mar 06, 2021 at 01:28:22AM +0900, Masahiro Yamada wrote: > > +orig_name := $(if $(CONFIG_CC_IS_GCC),GCC,CLANG) > > +orig_minor := $(shell expr $(if > > $(CONFIG_CC_IS_GCC),$(CONFIG_GCC_VERSION),$(CONFIG_CLANG_VERSION)) / 100) > > +cur_namever := $(shell $(srctree)/scripts/cc-version.sh $(CC)) > > +cur_name:= $(word 1,$(cur_namever)) > > +cur_minor := $(shell expr $(word 2,$(cur_namever)) / 100) > > These are still calculated by 'make M=... clean' or 'make M=... help'. > Using '=' assignment solves it, but the code is still ugly. > > > I attached my alternative implementation. Thanks for the attached patch, yours looks much cleaner. Looks like it warns on *any* mismatch, rather than just a major.minor mismatch. But that's ok with me. Acked-by: Josh Poimboeuf -- Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Sat, Mar 06, 2021 at 01:03:32AM +0900, Masahiro Yamada wrote: > > Ok. So it sounds like the best/easiest option is the original patch in > > this thread: when building an external module with a GCC mismatch, just > > disable the GCC plugin, with a warning (or an error for randstruct). > > Just for clarification, > I believe "the original patch" pointed to this one: > https://lore.kernel.org/lkml/efe6b039a544da8215d5e54aa7c4b6d1986fc2b0.1611607264.git.jpoim...@redhat.com/ > > This is dead. Please do not come back to this. Sorry, no. The patch may have been crap, but that doesn't make the problem I'm trying to solve any less valid. > See negative comments not only from me, but also from Greg, Peter, > Christoph. I responded to those. Summarizing my replies once again... - "External modules aren't supported" This doesn't even remotely match reality. Are you honestly using this "negative comment" as a reason to NAK the patch? - "External modules must be built with the same GCC version" As has been stated repeatedly, by Linus and others, there's no technical reason behind this claim. It ignores the realities of how distros release the kernel and compiler independently, with separate cadences. Minor variances in compiler version are ABI compatible. Also, for features which are dependent on compiler version, many of those are now enabled by kbuild. As I suggested to you previously, kbuild should warn when such features get disabled (which can happen due to a compiler/toolchain change or due to a .config copied from another system). -- Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Thu, Mar 04, 2021 at 11:12:42AM -0800, Linus Torvalds wrote: > On Thu, Mar 4, 2021 at 7:36 AM Masahiro Yamada wrote: > > > > All the kernel-space objects are rebuilt > > when the compiler is upgraded. > > I very much NAK'ed that one. Why did that go in? > > Or maybe I NAK'ed another version of it (I think the one I NAK'ed was > from Josh), and didn't realize that there were multiple ones. This thread is making me dizzy, but I think the patch you NAK'ed from me was different. It just added an error on GCC mismatch with external modules: https://lkml.kernel.org/r/fff056a7c9e6050c2d60910f70b6d99602f3bec4.1611863075.git.jpoim...@redhat.com though I think you were ok with downgrading it to a warning. -- Josh
Re: [PATCH] kbuild: rebuild GCC plugins when the compiler is upgraded
On Thu, Mar 04, 2021 at 03:37:14PM -0800, Linus Torvalds wrote: > On Thu, Mar 4, 2021 at 3:20 PM Kees Cook wrote: > > > > This seems fine to me, but I want to make sure Josh has somewhere to > > actually go with this. Josh, does this get you any closer? No, this doesn't seem to help me at all. > > It sounds like the plugins need to move to another location for > > packaged kernels? > > Well, it might be worth extending the stuff that gets installed with > /lib/modules// with enough information and > infrastructure to then build any external modules. The gcc plugins live in scripts/, which get installed by "make modules_install" already. So the plugins' source and makefiles are in /lib/modules//build/scripts/gcc-plugins. So everything needed for building the plugins is already there. We just need the kernel makefiles to rebuild the plugins locally, when building an external module. -- Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Thu, Mar 04, 2021 at 09:27:28PM +0900, Masahiro Yamada wrote: > I agree with rebuilding GCC plugins when the compiler is upgraded > for *in-tree* building. > Linus had reported it a couple of months before, > and I just submitted a very easy fix. Hm? So does that mean that a GCC version change won't trigger a tree-wide rebuild? So you're asserting that a GCC mismatch is ok for in-tree code, but not for external modules??? That seems backwards. For in-tree, why not just rebuild the entire tree? Some kernel features are dependent on compiler version or capability, so not rebuilding the tree could introduce silent breakage. For external modules, a tree-wide rebuild isn't an option so the risk is assumed by the user. I posted a patch earlier [1] which prints a warning if the compiler major/minor version changes with an external module build. [1] https://lkml.kernel.org/r/20210201211322.t2rxmvnrystc2ky7@treble > Rebuilding plugins for external modules is not easy; > plugins are placed in the read-only directory, > /usr/src/linux-headers-$(uname -r)/scripts/gcc-plugins/. > > The external modules must not (cannot) update in-tree > build artifacts. "Rebuild" means creating copies in a different > writable directory. > Doing that requires a lot of design changes. Ok. So it sounds like the best/easiest option is the original patch in this thread: when building an external module with a GCC mismatch, just disable the GCC plugin, with a warning (or an error for randstruct). -- Josh
[tip: x86/urgent] x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     38b6eb474ed2df3d159396c3d4312c8a7b2d5196
Gitweb:        https://git.kernel.org/tip/38b6eb474ed2df3d159396c3d4312c8a7b2d5196
Author:        Josh Poimboeuf
AuthorDate:    Fri, 05 Feb 2021 08:24:02 -06:00
Committer:     Peter Zijlstra
CommitterDate: Wed, 03 Mar 2021 16:56:29 +01:00

x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2

KASAN reserves "redzone" areas between stack frames in order to detect
stack overruns. A read or write to such an area triggers a KASAN
"stack-out-of-bounds" BUG.

Normally, the ORC unwinder stays in-bounds and doesn't access the
redzone. But sometimes it can't find ORC metadata for a given
instruction. This can happen for code which is missing ORC metadata, or
for generated code. In such cases, the unwinder attempts to fall back
to frame pointers, as a best-effort type thing.

This fallback often works, but when it doesn't, the unwinder can get
confused and go off into the weeds into the KASAN redzone, triggering
the aforementioned KASAN BUG.

But in this case, the unwinder's confusion is actually harmless and
working as designed. It already has checks in place to prevent
off-stack accesses, but those checks get short-circuited by the KASAN
BUG. And a BUG is a lot more disruptive than a harmless unwinder
warning.

Disable the KASAN checks by using READ_ONCE_NOCHECK() for all stack
accesses. This finishes the job started by commit 881125bfe65b
("x86/unwind: Disable KASAN checking in the ORC unwinder"), which only
partially fixed the issue.

Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder")
Reported-by: Ivan Babrou
Signed-off-by: Josh Poimboeuf
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Steven Rostedt (VMware)
Tested-by: Ivan Babrou
Cc: sta...@kernel.org
Link: https://lkml.kernel.org/r/9583327904ebbbeda399eca9c56d6c7085ac20fe.1612534649.git.jpoim...@redhat.com
---
 arch/x86/kernel/unwind_orc.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 2a1d47f..1bcc14c 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -367,8 +367,8 @@ static bool deref_stack_regs(struct unwind_state *state, unsigned long addr,
 	if (!stack_access_ok(state, addr, sizeof(struct pt_regs)))
 		return false;

-	*ip = regs->ip;
-	*sp = regs->sp;
+	*ip = READ_ONCE_NOCHECK(regs->ip);
+	*sp = READ_ONCE_NOCHECK(regs->sp);

 	return true;
 }
@@ -380,8 +380,8 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr
 	if (!stack_access_ok(state, addr, IRET_FRAME_SIZE))
 		return false;

-	*ip = regs->ip;
-	*sp = regs->sp;
+	*ip = READ_ONCE_NOCHECK(regs->ip);
+	*sp = READ_ONCE_NOCHECK(regs->sp);

 	return true;
 }
@@ -402,12 +402,12 @@ static bool get_reg(struct unwind_state *state, unsigned int reg_off,
 		return false;

 	if (state->full_regs) {
-		*val = ((unsigned long *)state->regs)[reg];
+		*val = READ_ONCE_NOCHECK(((unsigned long *)state->regs)[reg]);
 		return true;
 	}

 	if (state->prev_regs) {
-		*val = ((unsigned long *)state->prev_regs)[reg];
+		*val = READ_ONCE_NOCHECK(((unsigned long *)state->prev_regs)[reg]);
 		return true;
 	}
[tip: x86/urgent] x86/unwind/orc: Silence warnings caused by missing ORC data
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     86402dcc894951c0a363b6aee12d955ff923b35e
Gitweb:        https://git.kernel.org/tip/86402dcc894951c0a363b6aee12d955ff923b35e
Author:        Josh Poimboeuf
AuthorDate:    Fri, 05 Feb 2021 08:24:03 -06:00
Committer:     Peter Zijlstra
CommitterDate: Wed, 03 Mar 2021 16:56:30 +01:00

x86/unwind/orc: Silence warnings caused by missing ORC data

The ORC unwinder attempts to fall back to frame pointers when ORC data
is missing for a given instruction. It sets state->error, but then
tries to keep going as a best-effort type of thing. That may result in
further warnings if the unwinder gets lost.

Until we have some way to register generated code with the unwinder,
missing ORC will be expected, and occasionally going off the rails will
also be expected. So don't warn about it.

Signed-off-by: Josh Poimboeuf
Signed-off-by: Peter Zijlstra (Intel)
Tested-by: Ivan Babrou
Link: https://lkml.kernel.org/r/06d02c4bbb220bd31668db579278b0352538efbb.1612534649.git.jpoim...@redhat.com
---
 arch/x86/kernel/unwind_orc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 1bcc14c..a120253 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -13,7 +13,7 @@
 #define orc_warn_current(args...)				\
 ({								\
-	if (state->task == current)				\
+	if (state->task == current && !state->error)		\
 		orc_warn(args);					\
 })
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Wed, Mar 03, 2021 at 12:56:52PM -0800, Linus Torvalds wrote:
> On Wed, Mar 3, 2021 at 12:24 PM Josh Poimboeuf wrote:
> >
> > Your nack is for a different reason: GCC plugins are second-class
> > citizens. Fair enough...
>
> MNo, I didn't NAK it. Quite the reverser.
>
> I am ABSOLUTELY against rebuilding normal object files just because
> gcc versions change. A compiler version change makes zero difference
> for any normal object file.
>
> But the gcc plugins are different. They very much _are_ tied to a
> particular gcc version.
>
> Now, they are tied to a particular gcc version because they are
> horribly badly done, and bad technology, and I went off on a bit of a
> rant about just how bad they are, but the point is that gcc plugins
> depend on the exact gcc version in ways that normal object files do
> _not_.

Thanks, reading comprehension is hard. I realized after re-reading that
I interpreted your "plugins should depend on the kernel version"
statement too broadly.

Masahiro, any idea how I can make the GCC version a build dependency?

--
Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Wed, Mar 03, 2021 at 11:25:34AM -0800, Linus Torvalds wrote:
> On Wed, Mar 3, 2021 at 11:15 AM Josh Poimboeuf wrote:
> >
> > Adding Linus, who indicated in another thread that we shouldn't force
> > exact GCC versions because there's no technical reason to do so.
>
> I do not believe we should recompile everything just because the gcc
> version changes.
>
> But gcc _plugins_ certainly should depend on the kernel version.
>
> Very few people should be enabling the gcc plugins in the first place.
> Honestly, most of them are bad, and the people who really care about
> those things have already moved to clang which does the important
> parts natively without the need for a plugin. I'm personally waiting
> for the day when we can just say "let's remove them".

You might be sad to learn that some of the plugins are useful for
hardening of a production distro kernel, like stackleak and structleak.

> But in the meantime, making the plugins depend on the gcc version some
> way is certainly better than not doing so.

Currently, the plugins already do that. They require the GCC version to
be exact. If there's a mismatch, then it fails the OOT module build.

But that's not usable for a distro. When users build OOT modules with a
slight GCC mismatch, it breaks the build, effectively requiring the
exact same GCC version for *all* OOT builds going forward.

So there have been a few proposals to better handle GCC version
mismatches:

1) disable the plugin - this works fine for most plugins except
   randstruct

2) rebuild the plugin whenever the GCC version changes

3) fail the build, like today - effectively blocks distros from using
   plugins

--
Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Wed, Mar 03, 2021 at 02:24:12PM -0600, Josh Poimboeuf wrote:
> On Wed, Mar 03, 2021 at 11:57:33AM -0800, Linus Torvalds wrote:
> > On Wed, Mar 3, 2021 at 11:38 AM Josh Poimboeuf wrote:
> > >
> > > > But in the meantime, making the plugins depend on the gcc version some
> > > > way is certainly better than not doing so.
> > >
> > > So currently, the plugins already so that. They require the GCC version
> > > to be exact. If there's a mismatch, then it fails the OOT module build.
> >
> > That's not my experience.
> >
> > Yes, the build fails, but it fails not by _rebuilding_, but by failing
> > with an error.
>
> Um, that's what I said. It does not rebuild. It fails with an error.
>
> The *proposal* is to rebuild the plugin -- which Masahiro nacked because
> he claims GCC mismatches aren't supported for OOT builds (plugin or
> not).
>
> Your nack is for a different reason: GCC plugins are second-class
> citizens. Fair enough...

Or was it a nack? :-) Reading your message again, I may have
misinterpreted.

Put simply, should we rebuild plugins when the GCC version changes?

--
Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Wed, Mar 03, 2021 at 11:57:33AM -0800, Linus Torvalds wrote:
> On Wed, Mar 3, 2021 at 11:38 AM Josh Poimboeuf wrote:
> >
> > > But in the meantime, making the plugins depend on the gcc version some
> > > way is certainly better than not doing so.
> >
> > So currently, the plugins already so that. They require the GCC version
> > to be exact. If there's a mismatch, then it fails the OOT module build.
>
> That's not my experience.
>
> Yes, the build fails, but it fails not by _rebuilding_, but by failing
> with an error.

Um, that's what I said. It does not rebuild. It fails with an error.

The *proposal* is to rebuild the plugin -- which Masahiro nacked because
he claims GCC mismatches aren't supported for OOT builds (plugin or
not).

Your nack is for a different reason: GCC plugins are second-class
citizens. Fair enough...

--
Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Thu, Mar 04, 2021 at 03:49:35AM +0900, Masahiro Yamada wrote:
> > This problem is becoming more prevalent. We will need to fix it one way
> > or another, if we want to support distro adoption of these GCC
> > plugin-based features.
> >
> > Frank suggested a possibly better idea: always rebuild the plugins when
> > the GCC version changes.
>
> That is just another form of the previous patch,
> which was already rejected.
>
> - That is a hack just for external modules
> - Our consensus is, use the same version for the kernel and external modules
>
> I use Ubuntu, and I do not see such a problem.
> (I have never seen it on Debian either, except sid.)
>
> I see Fedora providing GCC whose version is different
> from the one used for building the kernel.
> That is a problem on the distribution side.

I don't understand. Are you suggesting that a distro should always
release GCC and kernel updates simultaneously? How is this problem
specific to Fedora? Please be specific about what Ubuntu and Debian do,
which Fedora doesn't do.

Adding Linus, who indicated in another thread that we shouldn't force
exact GCC versions because there's no technical reason to do so.

--
Josh
Re: [PATCH 0/3] objtool: OBJTOOL_ARGS and --backup
On Fri, Feb 26, 2021 at 11:57:42AM +0100, Peter Zijlstra wrote:
> Boris asked for an environment variable to have objtool preserve the
> original object file so that it becomes trivial to inspect what
> actually changed.

I might bikeshed the suffix ".o.orig" instead of ".obj".

Acked-by: Josh Poimboeuf

--
Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote:
> When building out-of-tree kernel modules, the build system doesn't
> require the GCC version to match the version used to build the original
> kernel. That's probably [1] fine.
>
> In fact, for many distros, the version of GCC used to build the latest
> kernel doesn't necessarily match the latest released GCC, so a GCC
> mismatch turns out to be pretty common. And with CONFIG_MODVERSIONS
> it's probably more common.
>
> So a lot of users have come to rely on being able to use a different
> version of GCC when building OOT modules.
>
> But with GCC plugins enabled, that's no longer allowed:
>
>   cc1: error: incompatible gcc/plugin versions
>   cc1: error: failed to initialize plugin
>   ./scripts/gcc-plugins/structleak_plugin.so
>
> That error comes from the plugin's call to
> plugin_default_version_check(), which strictly enforces the GCC version.
> The strict check makes sense, because there's nothing to prevent the GCC
> plugin ABI from changing -- and it often does.
>
> But failing the build isn't necessary. For most plugins, OOT modules
> will otherwise work just fine without the plugin instrumentation.
>
> When a GCC version mismatch is detected, print a warning and disable the
> plugin. The only exception is the RANDSTRUCT plugin which needs all
> code to see the same struct layouts. In that case print an error.

Hi Masahiro,

This problem is becoming more prevalent. We will need to fix it one way
or another, if we want to support distro adoption of these GCC
plugin-based features.

Frank suggested a possibly better idea: always rebuild the plugins when
the GCC version changes.

What do you think? Any suggestions on how to implement that? Otherwise
I can try to hack something together.

--
Josh