Re: [PATCH 1/1] livepatch: Rename KLP_* to KLP_TRANSITION_*
On Tue, May 07, 2024 at 01:01:11PM +0800, zhangwar...@gmail.com wrote:
> From: Wardenjohn
>
> The original KLP_* macros describe the state of the klp transition.
> Rename the KLP_* macros to KLP_TRANSITION_* to fix their confusing
> description of the klp transition state.
>
> Signed-off-by: Wardenjohn

Acked-by: Josh Poimboeuf

--
Josh
Re: [PATCH 0/1] *** Replace KLP_* to KLP_TRANSITION_* ***
On Tue, May 07, 2024 at 10:56:09AM +0800, zhang warden wrote:
> > On May 7, 2024, at 10:41, Josh Poimboeuf wrote:
> >
> > On Tue, May 07, 2024 at 10:21:40AM +0800, zhang warden wrote:
> > > > > transition state. With these macros renamed, comments are not
> > > > > necessary at this point.
> > >
> > > Sorry for my carelessness, the comment still remains in the code.
> > > However, the comment does no harm here. Maybe it can be kept.
> >
> > The comments aren't actually correct.
> >
> > For example, KLP_TRANSITION_UNPATCHED doesn't always mean "transitioning
> > to unpatched state". Sometimes it means "transitioning *from* unpatched
> > state".
>
> OK, I got it. I will remove the comments later. After all, the comments
> are not necessary at this point now that we have renamed the macros.

Yeah, removing them altogether might be best, as the meaning of these can
vary slightly depending on the operation (patching vs unpatching), and
also depending on where it's stored (task->patch_state vs
klp_target_state).

--
Josh
Re: [PATCH 0/1] *** Replace KLP_* to KLP_TRANSITION_* ***
On Tue, May 07, 2024 at 10:21:40AM +0800, zhang warden wrote:
> > > transition state. With these macros renamed, comments are not
> > > necessary at this point.
>
> Sorry for my carelessness, the comment still remains in the code.
> However, the comment does no harm here. Maybe it can be kept.

The comments aren't actually correct.

For example, KLP_TRANSITION_UNPATCHED doesn't always mean "transitioning
to unpatched state". Sometimes it means "transitioning *from* unpatched
state".

--
Josh
Re: [PATCH] livepatch.h: Add comment to klp transition state
On Mon, Apr 29, 2024 at 03:26:28PM +0800, zhangwar...@gmail.com wrote:
> From: Wardenjohn
>
> livepatch.h uses KLP_UNDEFINED / KLP_UNPATCHED / KLP_PATCHED for the klp
> transition state. When livepatch is ready but idle, using KLP_UNDEFINED
> seems very confusing. In order not to introduce potential risks to the
> kernel, just add comments to these states.
> ---
>  include/linux/livepatch.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 9b9b38e89563..b6a214f2f8e3 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -18,9 +18,9 @@
>  #if IS_ENABLED(CONFIG_LIVEPATCH)
>
>  /* task patch states */
> -#define KLP_UNDEFINED   -1
> -#define KLP_UNPATCHED    0
> -#define KLP_PATCHED      1
> +#define KLP_UNDEFINED   -1      /* idle, no transition in progress */
> +#define KLP_UNPATCHED    0      /* transitioning to unpatched state */
> +#define KLP_PATCHED      1      /* transitioning to patched state */

Instead of the comments, how about we just rename them to

  KLP_TRANSITION_IDLE
  KLP_TRANSITION_UNPATCHED
  KLP_TRANSITION_PATCHED

which shouldn't break userspace AFAIK.

--
Josh
Re: [RFC PATCH 0/4] perf: Correlating user process data to samples
On Sat, Apr 13, 2024 at 08:48:57AM -0400, Steven Rostedt wrote:
> On Sat, 13 Apr 2024 12:53:38 +0200
> Peter Zijlstra wrote:
> > On Fri, Apr 12, 2024 at 09:37:24AM -0700, Beau Belgrave wrote:
> > > > Anyway, since we typically run stuff from NMI context, accessing user
> > > > data is 'interesting'. As such I would really like to make this work
> > > > depend on the call-graph rework that pushes all the user access bits
> > > > into return-to-user.
> > >
> > > Cool, I assume that's the SFRAME work? Are there pointers to work I
> > > could look at and think about what a rebase looks like? Or do you have
> > > someone in mind I should work with for this?
> >
> > I've been offline for a little while and still need to catch up with
> > things myself.
> >
> > Josh was working on that when I dropped off IIRC, I'm not entirely sure
> > where things are at currently (and there is no way I can ever hope to
> > process the backlog).
> >
> > Anybody know where we are with that?
>
> It's still very much on my RADAR, but with layoffs and such, my
> priorities have unfortunately changed. I'm hoping to start helping out
> in the near future though (in a month or two).
>
> Josh was working on it, but I think he got pulled off onto other
> priorities too :-p

Yeah, this is still a priority for me and I hope to get back to it over
the next few weeks (crosses fingers).

--
Josh
Re: [POC 0/7] livepatch: Make livepatch states, callbacks, and shadow variables work together
On Fri, Nov 10, 2023 at 06:04:21PM +0100, Petr Mladek wrote:
> This POC is a material for the discussion "Simplify Livepatch Callbacks,
> Shadow Variables, and States handling" at LPC 2023, see
>   https://lpc.events/event/17/contributions/1541/
>
> It obsoletes the patchset adding the garbage collection of shadow
> variables. This new solution is based on ideas from Nicolai Stange.
> And it should also be in sync with Josh's ideas mentioned in the thread
> about the garbage collection, see
>   https://lore.kernel.org/r/20230204235910.4j4ame5ntqogqi7m@treble

Nice! I like how it brings the "features" together and makes them easy
to use. This looks like a vast improvement.

Was there a reason to change the naming? I'm thinking setup / enable /
disable / release is less precise than pre_patch / post_patch /
pre_unpatch / post_unpatch.

Also, I'm thinking "replaced" instead of "obsolete" would be more
consistent with the existing terminology. For example, in
__klp_enable_patch():

	ret = klp_setup_states(patch);
	if (ret)
		goto err;

	if (patch->replace)
		klp_disable_obsolete_states(patch);

it's not immediately clear why "disable obsolete" would be the "replace"
counterpart to "setup". Similarly, in klp_complete_transition():

	if (klp_transition_patch->replace && klp_target_state == KLP_PATCHED) {
		klp_unpatch_replaced_patches(klp_transition_patch);
		klp_discard_nops(klp_transition_patch);
		klp_release_obsolete_states(klp_transition_patch);
	}

it's a little jarring to have "unpatch replaced" followed by "release
obsolete".

So instead of:

	int klp_setup_states(struct klp_patch *patch);
	void klp_enable_states(struct klp_patch *patch);
	void klp_disable_states(struct klp_patch *patch);
	void klp_release_states(struct klp_patch *patch);

	void klp_enable_obsolete_states(struct klp_patch *patch);
	void klp_disable_obsolete_states(struct klp_patch *patch);
	void klp_release_obsolete_states(struct klp_patch *patch);

how about something like:

	int klp_states_pre_patch(void);
	void klp_states_post_patch(void);
	void klp_states_pre_unpatch(void);
	void klp_states_post_unpatch(void);

	void klp_states_post_patch_replaced(void);
	void klp_states_pre_unpatch_replaced(void);
	void klp_states_post_unpatch_replaced(void);

?

(note that passing klp_transition_patch might be optional since state.c
already has visibility to it anyway)

--
Josh
Re: [RFC PATCH 07/86] Revert "livepatch,sched: Add livepatch task switching to cond_resched()"
On Thu, Nov 09, 2023 at 02:50:48PM -0800, Ankur Arora wrote:
> > > I guess I'm not fully understanding what the cond rescheds are for. But
> > > would an IPI to all CPUs setting NEED_RESCHED, fix it?
>
> Yeah. We could just temporarily toggle to full preemption, when
> NEED_RESCHED_LAZY is always upgraded to NEED_RESCHED which will
> then send IPIs.
>
> > If all livepatch arches had the ORC unwinder, yes.
> >
> > The problem is that frame pointer (and similar) unwinders can't reliably
> > unwind past an interrupt frame.
>
> Ah, I wonder if we could just disable the preempt_schedule_irq() path
> temporarily? Hooking into schedule() alongside something like this:
>
> @@ -379,7 +379,7 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
>
>  void irqentry_exit_cond_resched(void)
>  {
> -	if (!preempt_count()) {
> +	if (klp_cond_resched_disable() && !preempt_count()) {
>
> The problem would be tasks that don't go through any preemptible
> sections.

Let me back up a bit and explain what klp is trying to do.

When a livepatch is applied, klp needs to unwind all the tasks,
preferably within a reasonable amount of time.

We can't unwind task A from task B while task A is running, since task A
could be changing the stack during the unwind. So task A needs to be
blocked or asleep. The only exception to that is if the unwind happens
in the context of task A itself.

The problem we were seeing was CPU-bound kthreads (e.g., vhost_worker)
not getting patched within a reasonable amount of time. We fixed it by
hooking the klp unwind into cond_resched() so it can unwind from the
task itself.

It only worked because we had a non-preempted hook (because non-ORC
unwinders can't unwind reliably through preemption) which called klp to
unwind from the context of the task.

Without something to hook into, we have a problem. We could of course
hook into schedule(), but if the kthread never calls schedule() from a
non-preempted context then it still doesn't help.

--
Josh
Re: [RFC PATCH 07/86] Revert "livepatch,sched: Add livepatch task switching to cond_resched()"
On Thu, Nov 09, 2023 at 12:31:47PM -0500, Steven Rostedt wrote:
> On Thu, 9 Nov 2023 09:26:37 -0800
> Josh Poimboeuf wrote:
> > On Tue, Nov 07, 2023 at 06:16:09PM -0500, Steven Rostedt wrote:
> > > On Tue, 7 Nov 2023 13:56:53 -0800
> > > Ankur Arora wrote:
> > > > This reverts commit e3ff7c609f39671d1aaff4fb4a8594e14f3e03f8.
> > > >
> > > > Note that removing this commit reintroduces "live patches failing to
> > > > complete within a reasonable amount of time due to CPU-bound kthreads."
> > > >
> > > > Unfortunately this fix depends quite critically on PREEMPT_DYNAMIC and
> > > > existence of cond_resched() so this will need an alternate fix.
> >
> > We definitely don't want to introduce a regression, something will need
> > to be figured out before removing cond_resched().
> >
> > We could hook into preempt_schedule_irq(), but that wouldn't work for
> > non-ORC.
> >
> > Another option would be to hook into schedule(). Then livepatch could
> > set TIF_NEED_RESCHED on remaining unpatched tasks. But again if they go
> > through the preemption path then we have the same problem for non-ORC.
> >
> > Worst case we'll need to sprinkle cond_livepatch() everywhere :-/
>
> I guess I'm not fully understanding what the cond rescheds are for. But
> would an IPI to all CPUs setting NEED_RESCHED, fix it?

If all livepatch arches had the ORC unwinder, yes.

The problem is that frame pointer (and similar) unwinders can't reliably
unwind past an interrupt frame.

--
Josh
Re: [RFC PATCH 07/86] Revert "livepatch,sched: Add livepatch task switching to cond_resched()"
On Tue, Nov 07, 2023 at 06:16:09PM -0500, Steven Rostedt wrote:
> On Tue, 7 Nov 2023 13:56:53 -0800
> Ankur Arora wrote:
> > This reverts commit e3ff7c609f39671d1aaff4fb4a8594e14f3e03f8.
> >
> > Note that removing this commit reintroduces "live patches failing to
> > complete within a reasonable amount of time due to CPU-bound kthreads."
> >
> > Unfortunately this fix depends quite critically on PREEMPT_DYNAMIC and
> > existence of cond_resched() so this will need an alternate fix.

We definitely don't want to introduce a regression, something will need
to be figured out before removing cond_resched().

We could hook into preempt_schedule_irq(), but that wouldn't work for
non-ORC.

Another option would be to hook into schedule(). Then livepatch could
set TIF_NEED_RESCHED on remaining unpatched tasks. But again if they go
through the preemption path then we have the same problem for non-ORC.

Worst case we'll need to sprinkle cond_livepatch() everywhere :-/

--
Josh
Re: [PATCH 01/15] objtool: Find a destination for jumps beyond the section end
On Tue, Apr 20, 2021 at 01:25:43PM -0700, Sami Tolvanen wrote:
> On Tue, Apr 20, 2021 at 11:14 AM Josh Poimboeuf wrote:
> > On Fri, Apr 16, 2021 at 01:38:30PM -0700, Sami Tolvanen wrote:
> > > With -ffunction-sections, Clang can generate a jump beyond the end of
> > > a section when the section ends in an unreachable instruction.
> >
> > Why? Can you show an example?
>
> Here's the warning I'm seeing when building allyesconfig + CFI:
>
>   vmlinux.o: warning: objtool:
>   rockchip_spi_transfer_one.f088382d97b74759d70e27e891fe8f1c()+0x149:
>   can't find jump dest instruction at
>   .text.rockchip_spi_transfer_one.f088382d97b74759d70e27e891fe8f1c+0x7dc
>
> $ objdump -d -r -j .text.rockchip_spi_transfer_one.f088382d97b74759d70e27e891fe8f1c vmlinux.o
> :
> ...
>  149:  0f 85 8d 06 00 00    jne    7dc <.compoundliteral.4>
> ...
>  7d7:  e8 00 00 00 00       callq  7dc <.compoundliteral.4>
>        7d8: R_X86_64_PLT32  __stack_chk_fail-0x4

Instead of silencing the warning by faking the jump destination, I'd
rather improve the warning to something like

  "warning: rockchip_spi_transfer_one() falls through to the next function"

which is what we normally do in this type of situation.

It may be caused by UB, or a compiler bug, but either way we should
figure out the root cause.

--
Josh
Re: [PATCH 02/15] objtool: Add CONFIG_CFI_CLANG support
On Fri, Apr 16, 2021 at 01:38:31PM -0700, Sami Tolvanen wrote:
> +static int fix_cfi_relocs(const struct elf *elf)
> +{
> +	struct section *sec, *tmpsec;
> +	struct reloc *reloc, *tmpreloc;
> +
> +	list_for_each_entry_safe(sec, tmpsec, &elf->sections, list) {
> +		list_for_each_entry_safe(reloc, tmpreloc, &sec->reloc_list, list) {

These loops don't remove structs from the lists, so are the "safe"
iterators really needed?

> +			struct symbol *sym;
> +			struct reloc *func_reloc;
> +
> +			/*
> +			 * CONFIG_CFI_CLANG replaces function relocations to refer
> +			 * to an intermediate jump table. Undo the conversion so
> +			 * objtool can make sense of things.
> +			 */

I think this comment could be clearer if it were placed above the
function.

> +			if (!reloc->sym->sec->cfi_jt)
> +				continue;
> +
> +			if (reloc->sym->type == STT_SECTION)
> +				sym = find_func_by_offset(reloc->sym->sec,
> +							  reloc->addend);
> +			else
> +				sym = reloc->sym;
> +
> +			if (!sym || !sym->sec)
> +				continue;

This should be a fatal error, it probably means something's gone wrong
with the reading of the jump table.

> +
> +			/*
> +			 * The jump table immediately jumps to the actual function,
> +			 * so look up the relocation there.
> +			 */
> +			func_reloc = find_reloc_by_dest_range(elf, sym->sec, sym->offset, 5);

The jump instruction's reloc is always at offset 1, so this can maybe be
optimized to:

	func_reloc = find_reloc_by_dest(elf, sym->sec, sym->offset+1);

though of course that will probably break (future) arm64 objtool. Maybe
an arch-specific offset from the start of the insn would be good.

> +			if (!func_reloc || !func_reloc->sym)
> +				continue;

This should also be an error.

> +
> +			reloc->sym = func_reloc->sym;

I think we should also do 'reloc->addend = 0', because of the
STT_SECTION case:

  0025 259e000b R_X86_64_32S  .text..L.cfi.jumptable.8047 + 8

which then translates to

  0009 06320004 R_X86_64_PLT32  x2apic_prepare_cpu - 4

so the original addend of '8' is no longer valid.

Copying the '-4' wouldn't be right either, because that only applies to
a PLT32 text reloc. A '0' should be appropriate for the 32S data reloc.

This addend might not actually be read by any code, so it may not
currently be an actual bug, but better safe than sorry.

--
Josh
Re: [PATCH 01/15] objtool: Find a destination for jumps beyond the section end
On Fri, Apr 16, 2021 at 01:38:30PM -0700, Sami Tolvanen wrote:
> With -ffunction-sections, Clang can generate a jump beyond the end of
> a section when the section ends in an unreachable instruction.

Why? Can you show an example?

--
Josh
[tip: objtool/core] x86/crypto/aesni-intel_avx: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     e163be86fff3deec70f63330fc43fedf892c9aee
Gitweb:        https://git.kernel.org/tip/e163be86fff3deec70f63330fc43fedf892c9aee
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:17 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:34 -05:00

x86/crypto/aesni-intel_avx: Standardize stack alignment prologue

Use RBP instead of R14 for saving the old stack pointer before
realignment. This resembles what compilers normally do.

This enables ORC unwinding by allowing objtool to understand the stack
realignment.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/02d00a0903a0959f4787e186e2a07d271e1f63d4.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/aesni-intel_avx-x86_64.S | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index 188f184..98e3552 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -251,22 +251,20 @@ VARIABLE_OFFSET = 16*8
 .macro FUNC_SAVE
         push    %r12
         push    %r13
-        push    %r14
         push    %r15
-        mov     %rsp, %r14
-
-
+        push    %rbp
+        mov     %rsp, %rbp
         sub     $VARIABLE_OFFSET, %rsp
         and     $~63, %rsp      # align rsp to 64 bytes
 .endm

 .macro FUNC_RESTORE
-        mov     %r14, %rsp
+        mov     %rbp, %rsp
+        pop     %rbp
         pop     %r15
-        pop     %r14
         pop     %r13
         pop     %r12
 .endm
[tip: objtool/core] x86/crypto/aesni-intel_avx: Remove unused macros
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     4f08300916e882a0c34a2f325ff3fea2be2e57b3
Gitweb:        https://git.kernel.org/tip/4f08300916e882a0c34a2f325ff3fea2be2e57b3
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:15 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:33 -05:00

x86/crypto/aesni-intel_avx: Remove unused macros

These macros are no longer used; remove them.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/53f7136ea93ebdbca399959e6d2991ecb46e733e.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/aesni-intel_avx-x86_64.S | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index 2cf8e94..4fdf38e 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -212,10 +212,6 @@ HashKey_8_k = 16*21   # store XOR of HashKey^8 <<1 mod poly here (for Karatsu
 #define arg4 %rcx
 #define arg5 %r8
 #define arg6 %r9
-#define arg7 STACK_OFFSET+8*1(%r14)
-#define arg8 STACK_OFFSET+8*2(%r14)
-#define arg9 STACK_OFFSET+8*3(%r14)
-#define arg10 STACK_OFFSET+8*4(%r14)
 #define keysize 2*15*16(arg1)

 i = 0
@@ -237,9 +233,6 @@ define_reg j %j
 .noaltmacro
 .endm

-# need to push 4 registers into stack to maintain
-STACK_OFFSET = 8*4
-
 TMP1 = 16*0    # Temporary storage for AAD
 TMP2 = 16*1    # Temporary storage for AES State 2 (State 1 is stored in an XMM register)
 TMP3 = 16*2    # Temporary storage for AES State 3
@@ -256,7 +249,6 @@ VARIABLE_OFFSET = 16*8

 .macro FUNC_SAVE
-#the number of pushes must equal STACK_OFFSET
         push    %r12
         push    %r13
         push    %r14
[tip: objtool/core] x86/crypto/aesni-intel_avx: Fix register usage comments
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     ff5796b6dbea4763fdca002101e32b60aa17f8e8
Gitweb:        https://git.kernel.org/tip/ff5796b6dbea4763fdca002101e32b60aa17f8e8
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:16 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:33 -05:00

x86/crypto/aesni-intel_avx: Fix register usage comments

Fix register usage comments to match reality.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/8655d4513a0ed1eddec609165064153973010aa2.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/aesni-intel_avx-x86_64.S | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index 4fdf38e..188f184 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -286,7 +286,7 @@ VARIABLE_OFFSET = 16*8
 # combined for GCM encrypt and decrypt functions
 # clobbering all xmm registers
-# clobbering r10, r11, r12, r13, r14, r15
+# clobbering r10, r11, r12, r13, r15, rax
 .macro GCM_ENC_DEC INITIAL_BLOCKS GHASH_8_ENCRYPT_8_PARALLEL GHASH_LAST_8 GHASH_MUL ENC_DEC REP
         vmovdqu AadHash(arg2), %xmm8
         vmovdqu HashKey(arg2), %xmm13      # xmm13 = HashKey
@@ -988,7 +988,7 @@ _partial_block_done_\@:
 ## num_initial_blocks = b mod 4#
 ## encrypt the initial num_initial_blocks blocks and apply ghash on the ciphertext
 ## r10, r11, r12, rax are clobbered
-## arg1, arg3, arg4, r14 are used as a pointer only, not modified
+## arg1, arg2, arg3, arg4 are used as pointers only, not modified
 .macro INITIAL_BLOCKS_AVX REP num_initial_blocks T1 T2 T3 T4 T5 CTR XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 T6 T_key ENC_DEC
         i = (8-\num_initial_blocks)
@@ -1223,7 +1223,7 @@ _initial_blocks_done\@:
 # encrypt 8 blocks at a time
 # ghash the 8 previously encrypted ciphertext blocks
-# arg1, arg3, arg4 are used as pointers only, not modified
+# arg1, arg2, arg3, arg4 are used as pointers only, not modified
 # r11 is the data offset value
 .macro GHASH_8_ENCRYPT_8_PARALLEL_AVX REP T1 T2 T3 T4 T5 T6 CTR XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 T7 loop_idx ENC_DEC
@@ -1936,7 +1936,7 @@ SYM_FUNC_END(aesni_gcm_finalize_avx_gen2)
 ## num_initial_blocks = b mod 4#
 ## encrypt the initial num_initial_blocks blocks and apply ghash on the ciphertext
 ## r10, r11, r12, rax are clobbered
-## arg1, arg3, arg4, r14 are used as a pointer only, not modified
+## arg1, arg2, arg3, arg4 are used as pointers only, not modified
 .macro INITIAL_BLOCKS_AVX2 REP num_initial_blocks T1 T2 T3 T4 T5 CTR XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 T6 T_key ENC_DEC VER
         i = (8-\num_initial_blocks)
@@ -2178,7 +2178,7 @@ _initial_blocks_done\@:
 # encrypt 8 blocks at a time
 # ghash the 8 previously encrypted ciphertext blocks
-# arg1, arg3, arg4 are used as pointers only, not modified
+# arg1, arg2, arg3, arg4 are used as pointers only, not modified
 # r11 is the data offset value
 .macro GHASH_8_ENCRYPT_8_PARALLEL_AVX2 REP T1 T2 T3 T4 T5 T6 CTR XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 T7 loop_idx ENC_DEC
[tip: objtool/core] objtool: Support asm jump tables
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     99033461e685b48549ec77608b4bda75ddf772ce
Gitweb:        https://git.kernel.org/tip/99033461e685b48549ec77608b4bda75ddf772ce
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:14 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:32 -05:00

objtool: Support asm jump tables

Objtool detection of asm jump tables would normally just work, except
for the fact that asm retpolines use alternatives. Objtool thinks the
alternative code path (a jump to the retpoline) is a sibling call.

Don't treat alternative indirect branches as sibling calls when the
original instruction has a jump table.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/460cf4dc675d64e1124146562cabd2c05aa322e8.1614182415.git.jpoim...@redhat.com
---
 tools/objtool/check.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index a0f762a..46621e8 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -108,6 +108,18 @@ static struct instruction *prev_insn_same_sym(struct objtool_file *file,
 	for (insn = next_insn_same_sec(file, insn); insn;		\
 	     insn = next_insn_same_sec(file, insn))

+static bool is_jump_table_jump(struct instruction *insn)
+{
+	struct alt_group *alt_group = insn->alt_group;
+
+	if (insn->jump_table)
+		return true;
+
+	/* Retpoline alternative for a jump table? */
+	return alt_group && alt_group->orig_group &&
+	       alt_group->orig_group->first_insn->jump_table;
+}
+
 static bool is_sibling_call(struct instruction *insn)
 {
 	/*
@@ -120,7 +132,7 @@ static bool is_sibling_call(struct instruction *insn)

 	/* An indirect jump is either a sibling call or a jump to a table. */
 	if (insn->type == INSN_JUMP_DYNAMIC)
-		return list_empty(&insn->alts);
+		return !is_jump_table_jump(insn);

 	/* add_jump_destinations() sets insn->call_dest for sibling calls. */
 	return (is_static_jump(insn) && insn->call_dest);
[tip: objtool/core] x86/crypto/sha512-avx2: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     ec063e090bd6487097d459bb4272508b78448270
Gitweb:        https://git.kernel.org/tip/ec063e090bd6487097d459bb4272508b78448270
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:24 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:36 -05:00

x86/crypto/sha512-avx2: Standardize stack alignment prologue

Use a more standard prologue for saving the stack pointer before
realigning the stack.

This enables ORC unwinding by allowing objtool to understand the stack
realignment.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/b1a7b29fcfc65d60a3b6e77ef75f4762a5b8488d.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/sha512-avx2-asm.S | 42 ++++++++++++++++++++----------------------
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/arch/x86/crypto/sha512-avx2-asm.S b/arch/x86/crypto/sha512-avx2-asm.S
index 3a44bdc..072cb0f 100644
--- a/arch/x86/crypto/sha512-avx2-asm.S
+++ b/arch/x86/crypto/sha512-avx2-asm.S
@@ -102,17 +102,13 @@ SRND_SIZE = 1*8
 INP_SIZE = 1*8
 INPEND_SIZE = 1*8
 CTX_SIZE = 1*8
-RSPSAVE_SIZE = 1*8
-GPRSAVE_SIZE = 5*8

 frame_XFER = 0
 frame_SRND = frame_XFER + XFER_SIZE
 frame_INP = frame_SRND + SRND_SIZE
 frame_INPEND = frame_INP + INP_SIZE
 frame_CTX = frame_INPEND + INPEND_SIZE
-frame_RSPSAVE = frame_CTX + CTX_SIZE
-frame_GPRSAVE = frame_RSPSAVE + RSPSAVE_SIZE
-frame_size = frame_GPRSAVE + GPRSAVE_SIZE
+frame_size = frame_CTX + CTX_SIZE

 ## assume buffers not aligned
 #define VMOVDQ vmovdqu
@@ -570,18 +566,18 @@ frame_size = frame_GPRSAVE + GPRSAVE_SIZE
 # "blocks" is the message length in SHA512 blocks
 SYM_FUNC_START(sha512_transform_rorx)
+        # Save GPRs
+        push    %rbx
+        push    %r12
+        push    %r13
+        push    %r14
+        push    %r15
+
         # Allocate Stack Space
-        mov     %rsp, %rax
+        push    %rbp
+        mov     %rsp, %rbp
         sub     $frame_size, %rsp
         and     $~(0x20 - 1), %rsp
-        mov     %rax, frame_RSPSAVE(%rsp)
-
-        # Save GPRs
-        mov     %rbx, 8*0+frame_GPRSAVE(%rsp)
-        mov     %r12, 8*1+frame_GPRSAVE(%rsp)
-        mov     %r13, 8*2+frame_GPRSAVE(%rsp)
-        mov     %r14, 8*3+frame_GPRSAVE(%rsp)
-        mov     %r15, 8*4+frame_GPRSAVE(%rsp)

         shl     $7, NUM_BLKS    # convert to bytes
         jz      done_hash
@@ -672,15 +668,17 @@ loop2:

 done_hash:

-# Restore GPRs
-        mov     8*0+frame_GPRSAVE(%rsp), %rbx
-        mov     8*1+frame_GPRSAVE(%rsp), %r12
-        mov     8*2+frame_GPRSAVE(%rsp), %r13
-        mov     8*3+frame_GPRSAVE(%rsp), %r14
-        mov     8*4+frame_GPRSAVE(%rsp), %r15
-
         # Restore Stack Pointer
-        mov     frame_RSPSAVE(%rsp), %rsp
+        mov     %rbp, %rsp
+        pop     %rbp
+
+        # Restore GPRs
+        pop     %r15
+        pop     %r14
+        pop     %r13
+        pop     %r12
+        pop     %rbx
+
         ret
 SYM_FUNC_END(sha512_transform_rorx)
[tip: objtool/core] x86/crypto/sha_ni: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     35a0067d2c02a7c35466db5f207b7b9265de84d9
Gitweb:        https://git.kernel.org/tip/35a0067d2c02a7c35466db5f207b7b9265de84d9
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:20 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:35 -05:00

x86/crypto/sha_ni: Standardize stack alignment prologue

Use a more standard prologue for saving the stack pointer before
realigning the stack.

This enables ORC unwinding by allowing objtool to understand the stack
realignment.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/5033e1a79867dff1b18e1b4d0783c38897d3f223.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/sha1_ni_asm.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/sha1_ni_asm.S b/arch/x86/crypto/sha1_ni_asm.S
index 11efe3a..5d8415f 100644
--- a/arch/x86/crypto/sha1_ni_asm.S
+++ b/arch/x86/crypto/sha1_ni_asm.S
@@ -59,8 +59,6 @@
 #define DATA_PTR        %rsi    /* 2nd arg */
 #define NUM_BLKS        %rdx    /* 3rd arg */

-#define RSPSAVE         %rax
-
 /* gcc conversion */
 #define FRAME_SIZE      32      /* space for 2x16 bytes */

@@ -96,7 +94,8 @@
 .text
 .align 32
 SYM_FUNC_START(sha1_ni_transform)
-        mov     %rsp, RSPSAVE
+        push    %rbp
+        mov     %rsp, %rbp
         sub     $FRAME_SIZE, %rsp
         and     $~0xF, %rsp

@@ -288,7 +287,8 @@ SYM_FUNC_START(sha1_ni_transform)
         pextrd  $3, E0, 1*16(DIGEST_PTR)

 .Ldone_hash:
-        mov     RSPSAVE, %rsp
+        mov     %rbp, %rsp
+        pop     %rbp

         ret
 SYM_FUNC_END(sha1_ni_transform)
[tip: objtool/core] x86/crypto/crc32c-pcl-intel: Standardize jump table
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     2b02ed55482a1c5c310a7f53707292fcf1601e7a
Gitweb:        https://git.kernel.org/tip/2b02ed55482a1c5c310a7f53707292fcf1601e7a
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:19 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:34 -05:00

x86/crypto/crc32c-pcl-intel: Standardize jump table

Simplify the jump table code so that it resembles a compiler-generated
table.

This enables ORC unwinding by allowing objtool to follow all the
potential code paths.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/5357a039def90b8ef6b5874ef12cda008ecf18ba.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/crc32c-pcl-intel-asm_64.S | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
index 884dc76..ac1f303 100644
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -53,7 +53,7 @@
 .endm

 .macro JMPTBL_ENTRY i
-.word crc_\i - crc_array
+.quad crc_\i
 .endm

 .macro JNC_LESS_THAN j
@@ -168,10 +168,7 @@ continue_block:
         xor     crc2, crc2

         ## branch into array
-        lea     jump_table(%rip), %bufp
-        movzwq  (%bufp, %rax, 2), len
-        lea     crc_array(%rip), %bufp
-        lea     (%bufp, len, 1), %bufp
+        mov     jump_table(,%rax,8), %bufp
         JMP_NOSPEC      bufp
[tip: objtool/core] x86/crypto/camellia-aesni-avx2: Unconditionally allocate stack buffer
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     dabe5167a3cbb4bf16b20c0e5b6497513e2e3a08
Gitweb:        https://git.kernel.org/tip/dabe5167a3cbb4bf16b20c0e5b6497513e2e3a08
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:18 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:34 -05:00

x86/crypto/camellia-aesni-avx2: Unconditionally allocate stack buffer

A conditional stack allocation violates traditional unwinding
requirements when a single instruction can have differing stack layouts.

There's no benefit in allocating the stack buffer conditionally. Just do
it unconditionally.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/85ac96613ee5784b6239c18d3f68b1f3c509caa3.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index 782e971..706f708 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -990,6 +990,7 @@ SYM_FUNC_START(camellia_cbc_dec_32way)
          *      %rdx: src (32 blocks)
          */
         FRAME_BEGIN
+        subq $(16 * 32), %rsp;

         vzeroupper;

@@ -1002,7 +1003,6 @@ SYM_FUNC_START(camellia_cbc_dec_32way)
                      %ymm8, %ymm9, %ymm10, %ymm11, %ymm12, %ymm13, %ymm14,
                      %ymm15, %rdx, (key_table)(CTX, %r8, 8));

-        movq %rsp, %r10;
         cmpq %rsi, %rdx;
         je .Lcbc_dec_use_stack;

@@ -1015,7 +1015,6 @@ SYM_FUNC_START(camellia_cbc_dec_32way)
          * dst still in-use (because dst == src), so use stack for temporary
          * storage.
          */
-        subq $(16 * 32), %rsp;
         movq %rsp, %rax;

 .Lcbc_dec_continue:
@@ -1025,7 +1024,6 @@ SYM_FUNC_START(camellia_cbc_dec_32way)
         vpxor %ymm7, %ymm7, %ymm7;
         vinserti128 $1, (%rdx), %ymm7, %ymm7;
         vpxor (%rax), %ymm7, %ymm7;
-        movq %r10, %rsp;
         vpxor (0 * 32 + 16)(%rdx), %ymm6, %ymm6;
         vpxor (1 * 32 + 16)(%rdx), %ymm5, %ymm5;
         vpxor (2 * 32 + 16)(%rdx), %ymm4, %ymm4;
@@ -1047,6 +1045,7 @@ SYM_FUNC_START(camellia_cbc_dec_32way)

         vzeroupper;

+        addq $(16 * 32), %rsp;
         FRAME_END
         ret;
 SYM_FUNC_END(camellia_cbc_dec_32way)
[tip: objtool/core] x86/crypto/sha512-ssse3: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip: Commit-ID: 27d26793f2105281d9374928448142777cef6f74 Gitweb: https://git.kernel.org/tip/27d26793f2105281d9374928448142777cef6f74 Author:Josh Poimboeuf AuthorDate:Wed, 24 Feb 2021 10:29:25 -06:00 Committer: Josh Poimboeuf CommitterDate: Mon, 19 Apr 2021 12:36:37 -05:00 x86/crypto/sha512-ssse3: Standardize stack alignment prologue Use a more standard prologue for saving the stack pointer before realigning the stack. This enables ORC unwinding by allowing objtool to understand the stack realignment. Signed-off-by: Josh Poimboeuf Tested-by: Ard Biesheuvel Acked-by: Ard Biesheuvel Tested-by: Sami Tolvanen Acked-by: Peter Zijlstra (Intel) Acked-by: Herbert Xu Link: https://lore.kernel.org/r/6ecaaac9f3828fbb903513bf90c34a08380a8e35.1614182415.git.jpoim...@redhat.com --- arch/x86/crypto/sha512-ssse3-asm.S | 41 + 1 file changed, 19 insertions(+), 22 deletions(-) diff --git a/arch/x86/crypto/sha512-ssse3-asm.S b/arch/x86/crypto/sha512-ssse3-asm.S index 50812af..bd51c90 100644 --- a/arch/x86/crypto/sha512-ssse3-asm.S +++ b/arch/x86/crypto/sha512-ssse3-asm.S @@ -74,14 +74,10 @@ tmp0 = %rax W_SIZE = 80*8 WK_SIZE = 2*8 -RSPSAVE_SIZE = 1*8 -GPRSAVE_SIZE = 5*8 frame_W = 0 frame_WK = frame_W + W_SIZE -frame_RSPSAVE = frame_WK + WK_SIZE -frame_GPRSAVE = frame_RSPSAVE + RSPSAVE_SIZE -frame_size = frame_GPRSAVE + GPRSAVE_SIZE +frame_size = frame_WK + WK_SIZE # Useful QWORD "arrays" for simpler memory references # MSG, DIGEST, K_t, W_t are arrays @@ -283,18 +279,18 @@ SYM_FUNC_START(sha512_transform_ssse3) test msglen, msglen je nowork + # Save GPRs + push%rbx + push%r12 + push%r13 + push%r14 + push%r15 + # Allocate Stack Space - mov %rsp, %rax + push%rbp + mov %rsp, %rbp sub $frame_size, %rsp and $~(0x20 - 1), %rsp - mov %rax, frame_RSPSAVE(%rsp) - - # Save GPRs - mov %rbx, frame_GPRSAVE(%rsp) - mov %r12, frame_GPRSAVE +8*1(%rsp) - mov %r13, frame_GPRSAVE +8*2(%rsp) - mov %r14, frame_GPRSAVE +8*3(%rsp) - mov 
%r15, frame_GPRSAVE +8*4(%rsp) updateblock: @@ -355,15 +351,16 @@ updateblock: dec msglen jnz updateblock - # Restore GPRs - mov frame_GPRSAVE(%rsp), %rbx - mov frame_GPRSAVE +8*1(%rsp), %r12 - mov frame_GPRSAVE +8*2(%rsp), %r13 - mov frame_GPRSAVE +8*3(%rsp), %r14 - mov frame_GPRSAVE +8*4(%rsp), %r15 - # Restore Stack Pointer - mov frame_RSPSAVE(%rsp), %rsp + mov %rbp, %rsp + pop %rbp + + # Restore GPRs + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbx nowork: ret
[tip: objtool/core] x86/crypto/sha256-avx2: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip: Commit-ID: ce5846668076aa76a17ab559f0296374e3611fec Gitweb: https://git.kernel.org/tip/ce5846668076aa76a17ab559f0296374e3611fec Author:Josh Poimboeuf AuthorDate:Wed, 24 Feb 2021 10:29:22 -06:00 Committer: Josh Poimboeuf CommitterDate: Mon, 19 Apr 2021 12:36:36 -05:00 x86/crypto/sha256-avx2: Standardize stack alignment prologue Use a more standard prologue for saving the stack pointer before realigning the stack. This enables ORC unwinding by allowing objtool to understand the stack realignment. Signed-off-by: Josh Poimboeuf Tested-by: Ard Biesheuvel Acked-by: Ard Biesheuvel Tested-by: Sami Tolvanen Acked-by: Peter Zijlstra (Intel) Acked-by: Herbert Xu Link: https://lore.kernel.org/r/8048e7444c49a8137f05265262b83dc50f8fb7f3.1614182415.git.jpoim...@redhat.com --- arch/x86/crypto/sha256-avx2-asm.S | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/arch/x86/crypto/sha256-avx2-asm.S b/arch/x86/crypto/sha256-avx2-asm.S index 11ff60c..4087f74 100644 --- a/arch/x86/crypto/sha256-avx2-asm.S +++ b/arch/x86/crypto/sha256-avx2-asm.S @@ -117,15 +117,13 @@ _XMM_SAVE_SIZE= 0 _INP_END_SIZE = 8 _INP_SIZE = 8 _CTX_SIZE = 8 -_RSP_SIZE = 8 _XFER = 0 _XMM_SAVE = _XFER + _XFER_SIZE _INP_END = _XMM_SAVE + _XMM_SAVE_SIZE _INP = _INP_END + _INP_END_SIZE _CTX = _INP + _INP_SIZE -_RSP = _CTX + _CTX_SIZE -STACK_SIZE = _RSP + _RSP_SIZE +STACK_SIZE = _CTX + _CTX_SIZE # rotate_Xs # Rotate values of symbols X0...X3 @@ -533,11 +531,11 @@ SYM_FUNC_START(sha256_transform_rorx) pushq %r14 pushq %r15 - mov %rsp, %rax + push%rbp + mov %rsp, %rbp + subq$STACK_SIZE, %rsp and $-32, %rsp # align rsp to 32 byte boundary - mov %rax, _RSP(%rsp) - shl $6, NUM_BLKS# convert to bytes jz done_hash @@ -704,7 +702,8 @@ only_one_block: done_hash: - mov _RSP(%rsp), %rsp + mov %rbp, %rsp + pop %rbp popq%r15 popq%r14
[tip: objtool/core] x86/crypto/sha1_avx2: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     20114c899cafa8313534a841cab0ab1f7ab09672
Gitweb:        https://git.kernel.org/tip/20114c899cafa8313534a841cab0ab1f7ab09672
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:21 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:35 -05:00

x86/crypto/sha1_avx2: Standardize stack alignment prologue

Use a more standard prologue for saving the stack pointer before
realigning the stack. This enables ORC unwinding by allowing objtool to
understand the stack realignment.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/fdaaf8670ed1f52f55ba9a6bbac98c1afddc1af6.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/sha1_avx2_x86_64_asm.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
index 1e594d6..5eed620 100644
--- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
+++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
@@ -645,9 +645,9 @@ _loop3:
 	RESERVE_STACK  = (W_SIZE*4 + 8+24)

 	/* Align stack */
-	mov	%rsp, %rbx
+	push	%rbp
+	mov	%rsp, %rbp
 	and	$~(0x20-1), %rsp
-	push	%rbx
 	sub	$RESERVE_STACK, %rsp

 	avx2_zeroupper

@@ -665,8 +665,8 @@ _loop3:

 	avx2_zeroupper

-	add	$RESERVE_STACK, %rsp
-	pop	%rsp
+	mov	%rbp, %rsp
+	pop	%rbp

 	pop	%r15
 	pop	%r14
[tip: objtool/core] x86/crypto: Enable objtool in crypto code
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     7d3d10e0e85fb7c23a86a70f795b1eabd2bc030b
Gitweb:        https://git.kernel.org/tip/7d3d10e0e85fb7c23a86a70f795b1eabd2bc030b
Author:        Josh Poimboeuf
AuthorDate:    Wed, 24 Feb 2021 10:29:26 -06:00
Committer:     Josh Poimboeuf
CommitterDate: Mon, 19 Apr 2021 12:36:37 -05:00

x86/crypto: Enable objtool in crypto code

Now that all the stack alignment prologues have been cleaned up in the
crypto code, enable objtool. Among other benefits, this will allow ORC
unwinding to work.

Signed-off-by: Josh Poimboeuf
Tested-by: Ard Biesheuvel
Acked-by: Ard Biesheuvel
Tested-by: Sami Tolvanen
Acked-by: Peter Zijlstra (Intel)
Acked-by: Herbert Xu
Link: https://lore.kernel.org/r/fc2a1918c50e33e46ef0e9a5de02743f2f6e3639.1614182415.git.jpoim...@redhat.com
---
 arch/x86/crypto/Makefile | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
index b28e36b..d0959e7 100644
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -2,8 +2,6 @@
 #
 # x86 crypto algorithms

-OBJECT_FILES_NON_STANDARD := y
-
 obj-$(CONFIG_CRYPTO_TWOFISH_586) += twofish-i586.o
 twofish-i586-y := twofish-i586-asm_32.o twofish_glue.o
 obj-$(CONFIG_CRYPTO_TWOFISH_X86_64) += twofish-x86_64.o
[tip: objtool/core] x86/crypto/sha512-avx: Standardize stack alignment prologue
The following commit has been merged into the objtool/core branch of tip: Commit-ID: d61684b56edf369f0a6d388088d7c9d59f1618d4 Gitweb: https://git.kernel.org/tip/d61684b56edf369f0a6d388088d7c9d59f1618d4 Author:Josh Poimboeuf AuthorDate:Wed, 24 Feb 2021 10:29:23 -06:00 Committer: Josh Poimboeuf CommitterDate: Mon, 19 Apr 2021 12:36:36 -05:00 x86/crypto/sha512-avx: Standardize stack alignment prologue Use a more standard prologue for saving the stack pointer before realigning the stack. This enables ORC unwinding by allowing objtool to understand the stack realignment. Signed-off-by: Josh Poimboeuf Tested-by: Ard Biesheuvel Acked-by: Ard Biesheuvel Tested-by: Sami Tolvanen Acked-by: Peter Zijlstra (Intel) Acked-by: Herbert Xu Link: https://lore.kernel.org/r/d36e9ea1c819d87fa89b3df3fa83e2a1ede18146.1614182415.git.jpoim...@redhat.com --- arch/x86/crypto/sha512-avx-asm.S | 41 ++- 1 file changed, 19 insertions(+), 22 deletions(-) diff --git a/arch/x86/crypto/sha512-avx-asm.S b/arch/x86/crypto/sha512-avx-asm.S index 684d58c..3d8f0fd 100644 --- a/arch/x86/crypto/sha512-avx-asm.S +++ b/arch/x86/crypto/sha512-avx-asm.S @@ -76,14 +76,10 @@ tmp0= %rax W_SIZE = 80*8 # W[t] + K[t] | W[t+1] + K[t+1] WK_SIZE = 2*8 -RSPSAVE_SIZE = 1*8 -GPRSAVE_SIZE = 5*8 frame_W = 0 frame_WK = frame_W + W_SIZE -frame_RSPSAVE = frame_WK + WK_SIZE -frame_GPRSAVE = frame_RSPSAVE + RSPSAVE_SIZE -frame_size = frame_GPRSAVE + GPRSAVE_SIZE +frame_size = frame_WK + WK_SIZE # Useful QWORD "arrays" for simpler memory references # MSG, DIGEST, K_t, W_t are arrays @@ -281,18 +277,18 @@ SYM_FUNC_START(sha512_transform_avx) test msglen, msglen je nowork + # Save GPRs + push%rbx + push%r12 + push%r13 + push%r14 + push%r15 + # Allocate Stack Space - mov %rsp, %rax + push%rbp + mov %rsp, %rbp sub $frame_size, %rsp and $~(0x20 - 1), %rsp - mov %rax, frame_RSPSAVE(%rsp) - - # Save GPRs - mov %rbx, frame_GPRSAVE(%rsp) - mov %r12, frame_GPRSAVE +8*1(%rsp) - mov %r13, frame_GPRSAVE +8*2(%rsp) - mov %r14, frame_GPRSAVE 
+8*3(%rsp) - mov %r15, frame_GPRSAVE +8*4(%rsp) updateblock: @@ -353,15 +349,16 @@ updateblock: dec msglen jnz updateblock - # Restore GPRs - mov frame_GPRSAVE(%rsp), %rbx - mov frame_GPRSAVE +8*1(%rsp), %r12 - mov frame_GPRSAVE +8*2(%rsp), %r13 - mov frame_GPRSAVE +8*3(%rsp), %r14 - mov frame_GPRSAVE +8*4(%rsp), %r15 - # Restore Stack Pointer - mov frame_RSPSAVE(%rsp), %rsp + mov %rbp, %rsp + pop %rbp + + # Restore GPRs + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbx nowork: ret
Re: [PATCH v2] X86: Makefile: Replace -pg with CC_FLAGS_FTRACE
Adding Steven Rostedt (ftrace maintainer). On Fri, Apr 16, 2021 at 01:39:28PM +0800, zhaoxiao wrote: > In preparation for x86 supporting ftrace built on other compiler > options, let's have the x86 Makefiles remove the $(CC_FLAGS_FTRACE) > flags, whatever these may be, rather than assuming '-pg'. > > There should be no functional change as a result of this patch. > > Signed-off-by: zhaoxiao > --- > v2: add the same change for the other Makefile in arch/x86 directory. > arch/x86/entry/vdso/Makefile | 8 > arch/x86/kernel/Makefile | 16 > arch/x86/kernel/cpu/Makefile | 4 ++-- > arch/x86/lib/Makefile| 2 +- > arch/x86/mm/Makefile | 4 ++-- > arch/x86/xen/Makefile| 6 +++--- > 6 files changed, 20 insertions(+), 20 deletions(-) > > diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile > index 05c4abc2fdfd..c5bd91bf9f93 100644 > --- a/arch/x86/entry/vdso/Makefile > +++ b/arch/x86/entry/vdso/Makefile > @@ -96,10 +96,10 @@ $(vobjs): KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO) > $(GCC_PLUGINS_CFLAGS) $( > # > # vDSO code runs in userspace and -pg doesn't help with profiling anyway. > # > -CFLAGS_REMOVE_vclock_gettime.o = -pg > -CFLAGS_REMOVE_vdso32/vclock_gettime.o = -pg > -CFLAGS_REMOVE_vgetcpu.o = -pg > -CFLAGS_REMOVE_vsgx.o = -pg > +CFLAGS_REMOVE_vclock_gettime.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_vdso32/vclock_gettime.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_vgetcpu.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_vsgx.o = $(CC_FLAGS_FTRACE) > > # > # X32 processes use x32 vDSO to access 64bit kernel data. 
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile > index 2ddf08351f0b..2811fc6a17ba 100644 > --- a/arch/x86/kernel/Makefile > +++ b/arch/x86/kernel/Makefile > @@ -13,14 +13,14 @@ CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE) > > ifdef CONFIG_FUNCTION_TRACER > # Do not profile debug and lowlevel utilities > -CFLAGS_REMOVE_tsc.o = -pg > -CFLAGS_REMOVE_paravirt-spinlocks.o = -pg > -CFLAGS_REMOVE_pvclock.o = -pg > -CFLAGS_REMOVE_kvmclock.o = -pg > -CFLAGS_REMOVE_ftrace.o = -pg > -CFLAGS_REMOVE_early_printk.o = -pg > -CFLAGS_REMOVE_head64.o = -pg > -CFLAGS_REMOVE_sev-es.o = -pg > +CFLAGS_REMOVE_tsc.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_paravirt-spinlocks.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_pvclock.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_kvmclock.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_ftrace.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_early_printk.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_head64.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_sev-es.o = $(CC_FLAGS_FTRACE) > endif > > KASAN_SANITIZE_head$(BITS).o := n > diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile > index 637b499450d1..4435c6de9145 100644 > --- a/arch/x86/kernel/cpu/Makefile > +++ b/arch/x86/kernel/cpu/Makefile > @@ -5,8 +5,8 @@ > > # Don't trace early stages of a secondary CPU boot > ifdef CONFIG_FUNCTION_TRACER > -CFLAGS_REMOVE_common.o = -pg > -CFLAGS_REMOVE_perf_event.o = -pg > +CFLAGS_REMOVE_common.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_perf_event.o = $(CC_FLAGS_FTRACE) > endif > > # If these files are instrumented, boot hangs during the first second. 
> diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile > index bad4dee4f0e4..0aa71b8a5bc1 100644 > --- a/arch/x86/lib/Makefile > +++ b/arch/x86/lib/Makefile > @@ -21,7 +21,7 @@ KASAN_SANITIZE_cmdline.o := n > KCSAN_SANITIZE_cmdline.o := n > > ifdef CONFIG_FUNCTION_TRACER > -CFLAGS_REMOVE_cmdline.o = -pg > +CFLAGS_REMOVE_cmdline.o = $(CC_FLAGS_FTRACE) > endif > > CFLAGS_cmdline.o := -fno-stack-protector -fno-jump-tables > diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile > index 5864219221ca..91883d5a0293 100644 > --- a/arch/x86/mm/Makefile > +++ b/arch/x86/mm/Makefile > @@ -12,8 +12,8 @@ KASAN_SANITIZE_mem_encrypt_identity.o := n > KCSAN_SANITIZE := n > > ifdef CONFIG_FUNCTION_TRACER > -CFLAGS_REMOVE_mem_encrypt.o = -pg > -CFLAGS_REMOVE_mem_encrypt_identity.o = -pg > +CFLAGS_REMOVE_mem_encrypt.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_mem_encrypt_identity.o = $(CC_FLAGS_FTRACE) > endif > > obj-y:= init.o init_$(BITS).o fault.o > ioremap.o extable.o mmap.o \ > diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile > index 40b5779fce21..179dfc124c94 100644 > --- a/arch/x86/xen/Makefile > +++ b/arch/x86/xen/Makefile > @@ -2,9 +2,9 @@ > > ifdef CONFIG_FUNCTION_TRACER > # Do not profile debug and lowlevel utilities > -CFLAGS_REMOVE_spinlock.o = -pg > -CFLAGS_REMOVE_time.o = -pg > -CFLAGS_REMOVE_irq.o = -pg > +CFLAGS_REMOVE_spinlock.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_time.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_irq.o = $(CC_FLAGS_FTRACE) > endif > > # Make sure early boot has no stackprotector > -- > 2.20.1 > > > -- Josh
Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
On Mon, Apr 12, 2021 at 05:59:33PM +0100, Mark Brown wrote: > On Fri, Apr 09, 2021 at 05:32:27PM -0500, Josh Poimboeuf wrote: > > > Hm, for that matter, even without renaming things, a comment above > > stack_trace_save_tsk_reliable() describing the meaning of "reliable" > > would be a good idea. > > Might be better to place something at the prototype for > arch_stack_walk_reliable() or cross link the two since that's where any > new architectures should be starting, or perhaps even better to extend > the document that Mark wrote further and point to that from both places. > > Some more explict pointer to live patching as the only user would > definitely be good but I think the more important thing would be writing > down any assumptions in the API that aren't already written down and > we're supposed to be relying on. Mark's document captured a lot of it > but it sounds like there's more here, and even with knowing that this > interface is only used by live patch and digging into what it does it's > not always clear what happens to work with the code right now and what's > something that's suitable to be relied on. Something like so? From: Josh Poimboeuf Subject: [PATCH] livepatch: Clarify the meaning of 'reliable' Update the comments and documentation to reflect what 'reliable' unwinding actually means, in the context of live patching. 
Suggested-by: Mark Brown Signed-off-by: Josh Poimboeuf --- .../livepatch/reliable-stacktrace.rst | 26 + arch/x86/kernel/stacktrace.c | 6 include/linux/stacktrace.h| 29 +-- kernel/stacktrace.c | 7 - 4 files changed, 53 insertions(+), 15 deletions(-) diff --git a/Documentation/livepatch/reliable-stacktrace.rst b/Documentation/livepatch/reliable-stacktrace.rst index 67459d2ca2af..e325efc7e952 100644 --- a/Documentation/livepatch/reliable-stacktrace.rst +++ b/Documentation/livepatch/reliable-stacktrace.rst @@ -72,7 +72,21 @@ The unwinding process varies across architectures, their respective procedure call standards, and kernel configurations. This section describes common details that architectures should consider. -4.1 Identifying successful termination +4.1 Only preemptible code needs reliability detection +- + +The only current user of reliable stacktracing is livepatch, which only +calls it for a) inactive tasks; or b) the current task in task context. + +Therefore, the unwinder only needs to detect the reliability of stacks +involving *preemptible* code. + +Practically speaking, reliability of stacks involving *non-preemptible* +code is a "don't-care". It may help to return a wrong reliability +result for such cases, if it results in reduced complexity, since such +cases will not happen in practice. + +4.2 Identifying successful termination -- Unwinding may terminate early for a number of reasons, including: @@ -95,7 +109,7 @@ architectures verify that a stacktrace ends at an expected location, e.g. * On a specific stack expected for a kernel entry point (e.g. if the architecture has separate task and IRQ stacks). -4.2 Identifying unwindable code +4.3 Identifying unwindable code --- Unwinding typically relies on code following specific conventions (e.g. @@ -129,7 +143,7 @@ unreliable to unwind from, e.g. * Identifying specific portions of code using bounds information. 
-4.3 Unwinding across interrupts and exceptions +4.4 Unwinding across interrupts and exceptions -- At function call boundaries the stack and other unwind state is expected to be @@ -156,7 +170,7 @@ have no such cases) should attempt to unwind across exception boundaries, as doing so can prevent unnecessarily stalling livepatch consistency checks and permits livepatch transitions to complete more quickly. -4.4 Rewriting of return addresses +4.5 Rewriting of return addresses - Some trampolines temporarily modify the return address of a function in order @@ -222,7 +236,7 @@ middle of return_to_handler and can report this as unreliable. Architectures are not required to unwind from other trampolines which modify the return address. -4.5 Obscuring of return addresses +4.6 Obscuring of return addresses - Some trampolines do not rewrite the return address in order to intercept @@ -249,7 +263,7 @@ than the link register as would usually be the case. Architectures must either ensure that unwinders either reliably unwind such cases, or report the unwinding as unreliable. -4.6 Link register unreliability +4.7 Link register unreliability --- On some other architectures, 'call' instructions place
Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
On Fri, Apr 09, 2021 at 05:32:27PM -0500, Josh Poimboeuf wrote: > On Fri, Apr 09, 2021 at 05:05:58PM -0500, Madhavan T. Venkataraman wrote: > > > FWIW, over the years we've had zero issues with encoding the frame > > > pointer on x86. After you save pt_regs, you encode the frame pointer to > > > point to it. Ideally in the same macro so it's hard to overlook. > > > > > > > I had the same opinion. In fact, in my encoding scheme, I have additional > > checks to make absolutely sure that it is a true encoding and not stack > > corruption. The chances of all of those values accidentally matching are, > > well, null. > > Right, stack corruption -- which is already exceedingly rare -- would > have to be combined with a miracle or two in order to come out of the > whole thing marked as 'reliable' :-) > > And really, we already take a similar risk today by "trusting" the frame > pointer value on the stack to a certain extent. Oh yeah, I forgot to mention some more benefits of encoding the frame pointer (or marking pt_regs in some other way): a) Stack addresses can be printed properly: '%pS' for printing regs->pc and '%pB' for printing call returns. Using '%pS' for call returns (as arm64 seems to do today) will result in printing the wrong function when you have tail calls to noreturn functions on the stack (which is actually quite common for calls to panic(), die(), etc). More details: https://lkml.kernel.org/r/20210403155948.ubbgtwmlsdyar7yp@treble b) Stack dumps to the console can dump the exception registers they find along the way. This is actually quite nice for debugging. -- Josh
Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
On Fri, Apr 09, 2021 at 05:05:58PM -0500, Madhavan T. Venkataraman wrote: > > FWIW, over the years we've had zero issues with encoding the frame > > pointer on x86. After you save pt_regs, you encode the frame pointer to > > point to it. Ideally in the same macro so it's hard to overlook. > > > > I had the same opinion. In fact, in my encoding scheme, I have additional > checks to make absolutely sure that it is a true encoding and not stack > corruption. The chances of all of those values accidentally matching are, > well, null. Right, stack corruption -- which is already exceedingly rare -- would have to be combined with a miracle or two in order to come out of the whole thing marked as 'reliable' :-) And really, we already take a similar risk today by "trusting" the frame pointer value on the stack to a certain extent. > >> I think there's a lot more code that we cannot unwind, e.g. KVM > >> exception code, or almost anything marked with SYM_CODE_END(). > > > > Just a reminder that livepatch only unwinds blocked tasks (plus the > > 'current' task which calls into livepatch). So practically speaking, it > > doesn't matter whether the 'unreliable' detection has full coverage. > > The only exceptions which really matter are those which end up calling > > schedule(), e.g. preemption or page faults. > > > > Being able to consistently detect *all* possible unreliable paths would > > be nice in theory, but it's unnecessary and may not be worth the extra > > complexity. > > > > You do have a point. I tried to think of arch_stack_walk_reliable() as > something that should be implemented independent of livepatching. But > I could not really come up with a single example of where else it would > really be useful. > > So, if we assume that the reliable stack trace is solely for the purpose > of livepatching, I agree with your earlier comments as well. One thought: if folks really view this as a problem, it might help to just rename things to reduce confusion. 
For example, instead of calling it 'reliable', we could call it something more precise, like 'klp_reliable', to indicate that it's reliable enough for live patching. Then have a comment above 'klp_reliable' and/or stack_trace_save_tsk_klp_reliable() which describes what that means. Hm, for that matter, even without renaming things, a comment above stack_trace_save_tsk_reliable() describing the meaning of "reliable" would be a good idea. -- Josh
Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks
On Fri, Apr 09, 2021 at 01:09:09PM +0100, Mark Rutland wrote: > On Mon, Apr 05, 2021 at 03:43:09PM -0500, madve...@linux.microsoft.com wrote: > > From: "Madhavan T. Venkataraman" > > > > There are a number of places in kernel code where the stack trace is not > > reliable. Enhance the unwinder to check for those cases and mark the > > stack trace as unreliable. Once all of the checks are in place, the unwinder > > can provide a reliable stack trace. But before this can be used for > > livepatch, > > some other entity needs to guarantee that the frame pointers are all set up > > correctly in kernel functions. objtool is currently being worked on to > > fill that gap. > > > > Except for the return address check, all the other checks involve checking > > the return PC of every frame against certain kernel functions. To do this, > > implement some infrastructure code: > > > > - Define a special_functions[] array and populate the array with > > the special functions > > I'm not too keen on having to manually collate this within the unwinder, > as it's very painful from a maintenance perspective. Agreed. > I'd much rather we could associate this information with the > implementations of these functions, so that they're more likely to > stay in sync. > > Further, I believe all the special cases are assembly functions, and > most of those are already in special sections to begin with. I reckon > it'd be simpler and more robust to reject unwinding based on the > section. If we need to unwind across specific functions in those > sections, we could opt-in with some metadata. So e.g. we could reject > all functions in ".entry.text", special casing the EL0 entry functions > if necessary. Couldn't this also end up being somewhat fragile? Saying "certain sections are deemed unreliable" isn't necessarily obvious to somebody who doesn't already know about it, and it could be overlooked or forgotten over time. And there's no way to enforce it stays that way. 
FWIW, over the years we've had zero issues with encoding the frame pointer on x86. After you save pt_regs, you encode the frame pointer to point to it. Ideally in the same macro so it's hard to overlook. If you're concerned about debuggers getting confused by the encoding - which debuggers specifically? In my experience, if vmlinux has debuginfo, gdb and most other debuggers will use DWARF (which is already broken in asm code) and completely ignore frame pointers. > I think there's a lot more code that we cannot unwind, e.g. KVM > exception code, or almost anything marked with SYM_CODE_END(). Just a reminder that livepatch only unwinds blocked tasks (plus the 'current' task which calls into livepatch). So practically speaking, it doesn't matter whether the 'unreliable' detection has full coverage. The only exceptions which really matter are those which end up calling schedule(), e.g. preemption or page faults. Being able to consistently detect *all* possible unreliable paths would be nice in theory, but it's unnecessary and may not be worth the extra complexity. -- Josh
Re: [PATCH 0/5] Introduce support for PSF mitigation
On Thu, Apr 08, 2021 at 09:56:47AM -0500, Saripalli, RK wrote: > Josh, thank you for taking the time to review the patches. > > On 4/7/2021 5:39 PM, Josh Poimboeuf wrote: > > On Tue, Apr 06, 2021 at 10:49:59AM -0500, Ramakrishna Saripalli wrote: > >> Because PSF speculation is limited to the current program context, > >> the impact of bad PSF speculation is very similar to that of > >> Speculative Store Bypass (Spectre v4) > >> > >> Predictive Store Forwarding controls: > >> There are two hardware control bits which influence the PSF feature: > >> - MSR 48h bit 2 – Speculative Store Bypass (SSBD) > >> - MSR 48h bit 7 – Predictive Store Forwarding Disable (PSFD) > >> > >> The PSF feature is disabled if either of these bits are set. These bits > >> are controllable on a per-thread basis in an SMT system. By default, both > >> SSBD and PSFD are 0 meaning that the speculation features are enabled. > >> > >> While the SSBD bit disables PSF and speculative store bypass, PSFD only > >> disables PSF. > >> > >> PSFD may be desirable for software which is concerned with the > >> speculative behavior of PSF but desires a smaller performance impact than > >> setting SSBD. > > > > Hi Ramakrishna, > > > > Is there a realistic scenario where an application would want to disable > > PSF, but not disable SSB? > > It is possible most applications have been reviewed and scrubbed for > SSB-type attacks but PSF-type issues may not have been looked at yet. It's "possible", but is it realistic? As far as I know, SSB is impractical to scrub an application for. Do we know of any real-world cases where this option is needed? -- Josh
Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
On Thu, Apr 08, 2021 at 08:18:21AM +0200, Greg KH wrote: > On Wed, Apr 07, 2021 at 03:17:46PM -0500, Josh Poimboeuf wrote: > > On Fri, Apr 02, 2021 at 09:54:12AM +0200, Greg KH wrote: > > > On Thu, Apr 01, 2021 at 11:59:25PM +, Luis Chamberlain wrote: > > > > As for the syfs deadlock possible with drivers, this fixes it in a > > > > generic way: > > > > > > > > commit fac43d8025727a74f80a183cc5eb74ed902a5d14 > > > > Author: Luis Chamberlain > > > > Date: Sat Mar 27 14:58:15 2021 + > > > > > > > > sysfs: add optional module_owner to attribute > > > > > > > > This is needed as otherwise the owner of the attribute > > > > or group read/store might have a shared lock used on driver removal, > > > > and deadlock if we race with driver removal. > > > > > > > > Signed-off-by: Luis Chamberlain > > > > > > No, please no. Module removal is a "best effort", if the system dies > > > when it happens, that's on you. I am not willing to expend extra energy > > > and maintance of core things like sysfs for stuff like this that does > > > not matter in any system other than a developer's box. > > > > So I mentioned this on IRC, and some folks were surprised to hear that > > module unloading is unsupported and is just a development aid. > > > > Is this stance documented anywhere? > > > > If we really believe this to be true, we should make rmmod taint the > > kernel. > > My throw-away comment here seems to have gotten way more attention than > it deserved, sorry about that everyone. > > Nothing is supported for anything here, it's all "best effort" :) > > And I would love a taint for rmmod, but what is that going to help out > with? I'm having trouble following this conclusion. Sure, we give our best effort, but if "nothing is supported" then what are we even doing here? Either we aim to support module unloading, or we don't. If we do support it, "best effort" isn't a valid reason for nacking fixes. 
If we don't support it, but still want to keep it half-working for development purposes, tainting on rmmod will make it crystal clear that the system has been destabilized. -- Josh
Re: [PATCH 0/5] Introduce support for PSF mitigation
On Tue, Apr 06, 2021 at 10:49:59AM -0500, Ramakrishna Saripalli wrote: > Because PSF speculation is limited to the current program context, > the impact of bad PSF speculation is very similar to that of > Speculative Store Bypass (Spectre v4) > > Predictive Store Forwarding controls: > There are two hardware control bits which influence the PSF feature: > - MSR 48h bit 2 – Speculative Store Bypass (SSBD) > - MSR 48h bit 7 – Predictive Store Forwarding Disable (PSFD) > > The PSF feature is disabled if either of these bits are set. These bits > are controllable on a per-thread basis in an SMT system. By default, both > SSBD and PSFD are 0 meaning that the speculation features are enabled. > > While the SSBD bit disables PSF and speculative store bypass, PSFD only > disables PSF. > > PSFD may be desirable for software which is concerned with the > speculative behavior of PSF but desires a smaller performance impact than > setting SSBD. Hi Ramakrishna, Is there a realistic scenario where an application would want to disable PSF, but not disable SSB? Maybe I'm missing something, but I'd presume an application would either care about this class of attacks, or not. -- Josh
Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
On Fri, Apr 02, 2021 at 09:54:12AM +0200, Greg KH wrote: > On Thu, Apr 01, 2021 at 11:59:25PM +, Luis Chamberlain wrote: > > As for the syfs deadlock possible with drivers, this fixes it in a generic > > way: > > > > commit fac43d8025727a74f80a183cc5eb74ed902a5d14 > > Author: Luis Chamberlain > > Date: Sat Mar 27 14:58:15 2021 + > > > > sysfs: add optional module_owner to attribute > > > > This is needed as otherwise the owner of the attribute > > or group read/store might have a shared lock used on driver removal, > > and deadlock if we race with driver removal. > > > > Signed-off-by: Luis Chamberlain > > No, please no. Module removal is a "best effort", if the system dies > when it happens, that's on you. I am not willing to expend extra energy > and maintance of core things like sysfs for stuff like this that does > not matter in any system other than a developer's box. So I mentioned this on IRC, and some folks were surprised to hear that module unloading is unsupported and is just a development aid. Is this stance documented anywhere? If we really believe this to be true, we should make rmmod taint the kernel. -- Josh
Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
On Wed, Apr 07, 2021 at 04:09:44PM +0200, Peter Zijlstra wrote: > On Tue, Apr 06, 2021 at 10:54:23AM -0500, Josh Poimboeuf wrote: > > > Same for Red Hat. Unloading livepatch modules seems to work fine, but > > isn't officially supported. > > > > That said, if rmmod is just considered a development aid, and we're > > going to be ignoring bugs, we should make it official with a new > > TAINT_RMMOD. > > Another option would be to have live-patch modules leak a module > reference by default, except when some debug sysctl is set or something. > Then only those LP modules loaded while the sysctl is set to 'YOLO' can > be unloaded. The issue is broader than just live patching. My suggestion was that if we aren't going to fix bugs in kernel module unloading, then unloading modules shouldn't be supported, and should taint the kernel. -- Josh
Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate
On Tue, Apr 06, 2021 at 02:00:19PM +0200, Miroslav Benes wrote: > Hi, > > > > Driver developers will simply have to open code these protections. In > > > light of what I see on LTP / fuzzing, I suspect the use case will grow > > > and we'll have to revisit this in the future. But for now, sure, we can > > > just open code the required protections everywhere to not crash on module > > > removal. > > > > LTP and fuzzing too do not remove modules. So I do not understand the > > root problem here, that's just something that does not happen on a real > > system. > > If I am not mistaken, the issue that Luis tries to solve here was indeed > found by running LTP. > > > On Sat, Apr 03, 2021 at 08:13:23AM +0200, Greg KH wrote: > > > On Fri, Apr 02, 2021 at 06:30:16PM +, Luis Chamberlain wrote: > > > > On Fri, Apr 02, 2021 at 09:54:12AM +0200, Greg KH wrote: > > > > > No, please no. Module removal is a "best effort", > > > > > > > > Not for live patching. I am not sure if I am missing any other valid > > > > use case? > > > > > > live patching removes modules? We have so many code paths that are > > > "best effort" when it comes to module unloading, trying to resolve this > > > one is a valiant try, but not realistic. > > > > Miroslav, your input / help here would be valuable. I did the > > generalization work because you said it would be worthy for you too... > > Yes, we have the option to revert and remove the existing live patch from > the system. I am not sure how (if) it is used in practice. > > At least at SUSE we do not support the option. But we are only one of the > many downstream users. So yes, there is the option. Same for Red Hat. Unloading livepatch modules seems to work fine, but isn't officially supported. That said, if rmmod is just considered a development aid, and we're going to be ignoring bugs, we should make it official with a new TAINT_RMMOD. -- Josh
Re: [RFC PATCH v1 0/4] arm64: Implement stack trace reliability checks
On Tue, Mar 30, 2021 at 02:09:51PM -0500, madve...@linux.microsoft.com wrote: > From: "Madhavan T. Venkataraman" > > There are a number of places in kernel code where the stack trace is not > reliable. Enhance the unwinder to check for those cases and mark the > stack trace as unreliable. Once all of the checks are in place, the unwinder > can be used for livepatching. This assumes all such places are known. That's a big assumption, as a) hand-written asm code may inadvertently skip frame pointer setup; b) for inline asm which calls a function, the compiler may blindly insert it into a function before the frame pointer setup. That's where objtool stack validation would come in. > Detect EL1 exception frame > == > > EL1 exceptions can happen on any instruction including instructions in > the frame pointer prolog or epilog. Depending on where exactly they happen, > they could render the stack trace unreliable. > > Add all of the EL1 exception handlers to special_functions[]. > > - el1_sync() > - el1_irq() > - el1_error() > - el1_sync_invalid() > - el1_irq_invalid() > - el1_fiq_invalid() > - el1_error_invalid() A possibly more robust alternative would be to somehow mark el1 exception frames so the unwinder can detect them more generally. For example, as described in my previous email, encode the frame pointer so the unwinder can detect el1 frames automatically. > Detect ftrace frame > === > > When FTRACE executes at the beginning of a traced function, it creates two > frames and calls the tracer function: > > - One frame for the traced function > > - One frame for the caller of the traced function > > That gives a sensible stack trace while executing in the tracer function. > When FTRACE returns to the traced function, the frames are popped and > everything is back to normal. > > However, in cases like live patch, the tracer function redirects execution > to a different function. When FTRACE returns, control will go to that target > function. 
A stack trace taken in the tracer function will not show the target > function. The target function is the real function that we want to track. > So, the stack trace is unreliable. I don't think this is a real problem. Livepatch only checks the stacks of blocked tasks (and the task calling into livepatch). So the reliability of unwinding from the livepatch tracer function itself (klp_ftrace_handler) isn't a concern since it doesn't sleep. > To detect FTRACE in a stack trace, add the following to special_functions[]: > > - ftrace_graph_call() > - ftrace_graph_caller() > > Please see the diff for a comment that explains why ftrace_graph_call() > must be checked. > > Also, the Function Graph Tracer modifies the return address of a traced > function to a return trampoline (return_to_handler()) to gather tracing > data on function return. Stack traces taken from the traced function and > functions it calls will not show the original caller of the traced function. > The unwinder handles this case by getting the original caller from FTRACE. > > However, stack traces taken from the trampoline itself and functions it calls > are unreliable as the original return address may not be available in > that context. This is because the trampoline calls FTRACE to gather trace > data as well as to obtain the actual return address and FTRACE discards the > record of the original return address along the way. Again, this shouldn't be a concern because livepatch won't be unwinding from a function_graph trampoline unless it got preempted somehow (and then the el1 frame would get detected anyway). > Add return_to_handler() to special_functions[]. > > Check for kretprobe > === > > For functions with a kretprobe set up, probe code executes on entry > to the function and replaces the return address in the stack frame with a > kretprobe trampoline. Whenever the function returns, control is > transferred to the trampoline. The trampoline eventually returns to the > original return address. 
> > A stack trace taken while executing in the function (or in functions that > get called from the function) will not show the original return address. > Similarly, a stack trace taken while executing in the trampoline itself > (and functions that get called from the trampoline) will not show the > original return address. This means that the caller of the probed function > will not show. This makes the stack trace unreliable. > > Add the kretprobe trampoline to special_functions[]. > > FYI, each task contains a task->kretprobe_instances list that can > theoretically be consulted to find the orginal return address. But I am > not entirely sure how to safely traverse that list for stack traces > not on the current process. So, I have taken the easy way out. For kretprobes, unwinding from the trampoline or kretprobe handler shouldn't be a reliability
Re: [RFC PATCH v2 1/1] arm64: Implement stack trace termination record
On Thu, Apr 01, 2021 at 10:24:04PM -0500, madve...@linux.microsoft.com wrote: > From: "Madhavan T. Venkataraman" > @@ -447,9 +464,9 @@ SYM_FUNC_START_LOCAL(__primary_switched) > #endif > bl switch_to_vhe // Prefer VHE if possible > add sp, sp, #16 > - mov x29, #0 > - mov x30, #0 > - b start_kernel > + setup_final_frame > + bl start_kernel > + nop > SYM_FUNC_END(__primary_switched) > > .pushsection ".rodata", "a" > @@ -606,14 +623,14 @@ SYM_FUNC_START_LOCAL(__secondary_switched) > cbz x2, __secondary_too_slow > msr sp_el0, x2 > scs_load x2, x3 > - mov x29, #0 > - mov x30, #0 > + setup_final_frame > > #ifdef CONFIG_ARM64_PTR_AUTH > ptrauth_keys_init_cpu x2, x3, x4, x5 > #endif > > - b secondary_start_kernel > + bl secondary_start_kernel > + nop > SYM_FUNC_END(__secondary_switched) I'm somewhat arm-ignorant, so take the following comments with a grain of salt. I don't think changing these to 'bl' is necessary, unless you wanted __primary_switched() and __secondary_switched() to show up in the stacktrace for some reason? If so, that seems like a separate patch. Also, why are nops added after the calls? My guess would be because, since these are basically tail calls to "noreturn" functions, the stack dump code would otherwise show the wrong function, i.e. whatever function happens to be after the 'bl'. We had the same issue for x86. It can be fixed by using '%pB' instead of '%pS' when printing the address in dump_backtrace_entry(). See sprint_backtrace() for more details. BTW I think the same issue exists for GCC-generated code. The following shows several such cases: objdump -dr vmlinux |awk '/bl / {bl=1;l=$0;next} bl == 1 && /^$/ {print l; print} // {bl=0}' However, looking at how arm64 unwinds through exceptions in kernel space, using '%pB' might have side effects when the exception LR (elr_el1) points to the beginning of a function. Then '%pB' would show the end of the previous function, instead of the function which was interrupted. 
So you may need to rethink how to unwind through in-kernel exceptions. Basically, when printing a stack return address, you want to use '%pB' for a call return address and '%pS' for an interrupted address. On x86, with the frame pointer unwinder, we encode the frame pointer by setting a bit in %rbp which tells the unwinder that it's a special pt_regs frame. Then instead of treating it like a normal call frame, the stack dump code prints the registers, and the return address (regs->ip) gets printed with '%pS'. > SYM_FUNC_START_LOCAL(__secondary_too_slow) > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c > index 325c83b1a24d..906baa232a89 100644 > --- a/arch/arm64/kernel/process.c > +++ b/arch/arm64/kernel/process.c > @@ -437,6 +437,11 @@ int copy_thread(unsigned long clone_flags, unsigned long > stack_start, > } > p->thread.cpu_context.pc = (unsigned long)ret_from_fork; > p->thread.cpu_context.sp = (unsigned long)childregs; > + /* > + * For the benefit of the unwinder, set up childregs->stackframe > + * as the final frame for the new task. > + */ > + p->thread.cpu_context.fp = (unsigned long)childregs->stackframe; > > ptrace_hw_copy_thread(p); > > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c > index ad20981dfda4..72f5af8c69dc 100644 > --- a/arch/arm64/kernel/stacktrace.c > +++ b/arch/arm64/kernel/stacktrace.c > @@ -44,16 +44,16 @@ int notrace unwind_frame(struct task_struct *tsk, struct > stackframe *frame) > unsigned long fp = frame->fp; > struct stack_info info; > > - /* Terminal record; nothing to unwind */ > - if (!fp) > + if (!tsk) > + tsk = current; > + > + /* Final frame; nothing to unwind */ > + if (fp == (unsigned long) task_pt_regs(tsk)->stackframe) > return -ENOENT; As far as I can tell, the regs stackframe value is initialized to zero during syscall entry, so isn't this basically just 'if (fp == 0)'? 
Shouldn't it instead be comparing with the _address_ of the stackframe field to make sure it reached the end? -- Josh
Re: [RFC PATCH -tip 3/3] x86/kprobes,orc: Unwind optprobe trampoline correctly
On Wed, Mar 31, 2021 at 02:44:56PM +0900, Masami Hiramatsu wrote: > +#ifdef CONFIG_UNWINDER_ORC > +unsigned long recover_optprobe_trampoline(unsigned long addr, unsigned long > *sp) > +{ > + unsigned long offset, entry, probe_addr; > + struct optimized_kprobe *op; > + struct orc_entry *orc; > + > + entry = find_kprobe_optinsn_slot_entry(addr); > + if (!entry) > + return addr; > + > + offset = addr - entry; > + > + /* Decode arg1 and get the optprobe */ > + op = (void *)extract_set_arg1((void *)(entry + TMPL_MOVE_IDX)); > + if (!op) > + return addr; > + > + probe_addr = (unsigned long)op->kp.addr; > + > + if (offset < TMPL_END_IDX) { > + orc = orc_find((unsigned long)optprobe_template_func + offset); > + if (!orc || orc->sp_reg != ORC_REG_SP) > + return addr; > + /* > + * Since optprobe trampoline doesn't push caller on the stack, > + * need to decrement 1 stack entry size > + */ > + *sp += orc->sp_offset - sizeof(long); > + return probe_addr; > + } else { > + return probe_addr + offset - TMPL_END_IDX; > + } > +} > +#endif Hm, I'd like to avoid intertwining kprobes and ORC like this. ORC unwinds other generated code by assuming the generated code uses a frame pointer. Could we do that here? With CONFIG_FRAME_POINTER, unwinding works because SAVE_REGS_STRING has ENCODE_FRAME_POINTER, but that's not going to work for ORC. Instead of these patches, can we 'push %rbp; mov %rsp, %rbp' at the beginning of the template and 'pop %rbp' at the end? I guess SAVE_REGS_STRING would need to be smart enough to push the original saved version of %rbp. Of course then that breaks the kretprobe_trampoline() usage, so it may need to be a separate macro. [ Or make the same change to kretprobe_trampoline(). Then the other patch set wouldn't be needed either ;-) ] Of course the downside is, when you get an interrupt during the frame pointer setup, unwinding is broken. But I think that's acceptable for generated code. 
We've lived with that limitation for all code, with CONFIG_FRAME_POINTER, for many years. Eventually we may want to have a way to register generated code (and the ORC for it). -- Josh
Re: [PATCH -tip v4 10/12] x86/kprobes: Push a fake return address at kretprobe_trampoline
On Fri, Mar 26, 2021 at 10:20:09AM -0400, Steven Rostedt wrote: > On Fri, 26 Mar 2021 21:03:49 +0900 > Masami Hiramatsu wrote: > > > I confirmed this is not related to this series, but occurs when I build > > kernels with different > > configs without cleanup. > > > > Once I build kernel with CONFIG_UNWIND_GUESS=y (for testing), and after > > that, > > I build kernel again with CONFIG_UNWIND_ORC=y (but without make clean), this > > happened. In this case, I guess ORC data might be corrupted? > > When I cleanup and rebuild, the stacktrace seems correct. > > Hmm, that should be fixed. Seems like we are missing a dependency somewhere. Thomas reported something similar: for example arch/x86/kernel/head_64.o doesn't get rebuilt when changing unwinders. https://lkml.kernel.org/r/87tuqublrb@nanos.tec.linutronix.de Masahiro, any idea how we can force a full tree rebuild when changing the unwinder? -- Josh
Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
On Fri, Mar 26, 2021 at 04:12:15PM +0100, Peter Zijlstra wrote: > @@ -61,3 +89,15 @@ SYM_FUNC_END(__x86_indirect_thunk_\reg) > #define GEN(reg) EXPORT_THUNK(reg) > #include <asm/GEN-for-each-reg.h> > > +#undef GEN > +#define GEN(reg) ALT_THUNK reg > +#include <asm/GEN-for-each-reg.h> > + > +#undef GEN > +#define GEN(reg) __EXPORT_THUNK(__x86_indirect_alt_call_ ## reg) > +#include <asm/GEN-for-each-reg.h> > + > +#undef GEN > +#define GEN(reg) __EXPORT_THUNK(__x86_indirect_alt_jmp_ ## reg) > +#include <asm/GEN-for-each-reg.h> > + Git complains about this last newline. Otherwise everything looks pretty good to me. Let me run it through the test matrix. -- Josh
Re: [PATCH -tip v4 10/12] x86/kprobes: Push a fake return address at kretprobe_trampoline
On Wed, Mar 24, 2021 at 10:40:58AM +0900, Masami Hiramatsu wrote: > On Tue, 23 Mar 2021 23:30:07 +0100 > Peter Zijlstra wrote: > > > On Mon, Mar 22, 2021 at 03:41:40PM +0900, Masami Hiramatsu wrote: > > > ".global kretprobe_trampoline\n" > > > ".type kretprobe_trampoline, @function\n" > > > "kretprobe_trampoline:\n" > > > #ifdef CONFIG_X86_64 > > > > So what happens if we get an NMI here? That is, after the RET but before > > the push? Then our IP points into the trampoline but we've not done that > > push yet. > > Not only NMI, but also interrupts can happen. There is no cli/sti here. > > Anyway, thanks for pointing! > I think in UNWIND_HINT_TYPE_REGS and UNWIND_HINT_TYPE_REGS_PARTIAL cases > ORC unwinder also has to check the state->ip and if it is > kretprobe_trampoline, > it should be recovered. > What about this? I think the REGS and REGS_PARTIAL cases can also be affected by function graph tracing. So should they use the generic unwind_recover_ret_addr() instead of unwind_recover_kretprobe()? -- Josh
Re: [PATCH -tip v3 05/11] x86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline code
On Sat, Mar 20, 2021 at 10:05:43PM +0900, Masami Hiramatsu wrote: > On Sat, 20 Mar 2021 21:16:16 +0900 > Masami Hiramatsu wrote: > > > On Fri, 19 Mar 2021 21:22:39 +0900 > > Masami Hiramatsu wrote: > > > > > From: Josh Poimboeuf > > > > > > Add UNWIND_HINT_FUNC on kretporbe_trampoline code so that ORC > > > information is generated on the kretprobe_trampoline correctly. > > > > > > > Test bot also found a new warning for this. > > > > > >> arch/x86/kernel/kprobes/core.o: warning: objtool: > > > >> kretprobe_trampoline()+0x25: call without frame pointer save/setup > > > > With CONFIG_FRAME_POINTER=y. > > > > Of course this can be fixed with additional "push %bp; mov %sp, %bp" before > > calling > > trampoline_handler. But actually we know that this function has a bit > > special > > stack frame too. > > > > Can I recover STACK_FRAME_NON_STANDARD(kretprobe_trampoline) when > > CONFIG_FRAME_POINTER=y ? > > So something like this. Does it work? > > diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c > index b31058a152b6..651f337dc880 100644 > --- a/arch/x86/kernel/kprobes/core.c > +++ b/arch/x86/kernel/kprobes/core.c > @@ -760,6 +760,10 @@ int kprobe_int3_handler(struct pt_regs *regs) > } > NOKPROBE_SYMBOL(kprobe_int3_handler); > > +#ifdef CONFIG_FRAME_POINTER > +#undef UNWIND_HINT_FUNC > +#define UNWIND_HINT_FUNC > +#endif This hunk isn't necessary. The unwind hints don't actually have an effect with frame pointers. > /* > * When a retprobed function returns, this code saves registers and > * calls trampoline_handler() runs, which calls the kretprobe's handler. > @@ -797,7 +801,14 @@ asm( > ".size kretprobe_trampoline, .-kretprobe_trampoline\n" > ); > NOKPROBE_SYMBOL(kretprobe_trampoline); > - > +#ifdef CONFIG_FRAME_POINTER > +/* > + * kretprobe_trampoline skips updating frame pointer. The frame pointer > + * saved in trampoline_handler points to the real caller function's > + * frame pointer. 
> + */ > +STACK_FRAME_NON_STANDARD(kretprobe_trampoline); > +#endif > > /* > * Called from kretprobe_trampoline Ack. -- Josh
Re: [PATCH -tip v3 05/11] x86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline code
On Sat, Mar 20, 2021 at 09:16:16PM +0900, Masami Hiramatsu wrote: > On Fri, 19 Mar 2021 21:22:39 +0900 > Masami Hiramatsu wrote: > > > From: Josh Poimboeuf > > > > Add UNWIND_HINT_FUNC on kretporbe_trampoline code so that ORC > > information is generated on the kretprobe_trampoline correctly. > > > > Test bot also found a new warning for this. > > > >> arch/x86/kernel/kprobes/core.o: warning: objtool: > > >> kretprobe_trampoline()+0x25: call without frame pointer save/setup > > With CONFIG_FRAME_POINTER=y. > > Of course this can be fixed with additional "push %bp; mov %sp, %bp" before > calling > trampoline_handler. But actually we know that this function has a bit special > stack frame too. > > Can I recover STACK_FRAME_NON_STANDARD(kretprobe_trampoline) when > CONFIG_FRAME_POINTER=y ? Yes, that's what I'd recommend. -- Josh
Re: [PATCH v2 14/14] objtool,x86: Rewrite retpoline thunk calls
On Fri, Mar 19, 2021 at 04:56:30PM +0100, Peter Zijlstra wrote: > On Fri, Mar 19, 2021 at 10:30:26AM -0500, Josh Poimboeuf wrote: > > On Fri, Mar 19, 2021 at 09:06:44AM +0100, Peter Zijlstra wrote: > > > > Also doesn't the alternative code already insert nops? > > > > > > Problem is that the {call,jmp} *%\reg thing is not fixed length. They're > > > 2 or 3 bytes depending on which register is picked. > > > > Why do they need to be fixed length? Objtool can use sym->len as the > > alternative replacement length. Then alternatives can add nops as > > needed. > > UNDEF has size 0. That is, unless these symbols exist in the translation > unit (they do not) we have no clue. > > Arguably I could parse the symbol name again and then we know the > register number and thus if we need REX etc.. but I figured we wanted to > avoid all that. Ah, makes sense now. -- Josh
Re: [PATCH v2 08/14] objtool: Add elf_create_reloc() helper
On Fri, Mar 19, 2021 at 04:24:40PM +0100, Peter Zijlstra wrote: > On Fri, Mar 19, 2021 at 10:12:59AM -0500, Josh Poimboeuf wrote: > > > > -void elf_add_reloc(struct elf *elf, struct reloc *reloc) > > > +int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long > > > offset, > > > + unsigned int type, struct symbol *sym, int addend) > > > { > > > - struct section *sec = reloc->sec; > > > + struct reloc *reloc; > > > + > > > + reloc = malloc(sizeof(*reloc)); > > > + if (!reloc) { > > > + perror("malloc"); > > > + return -1; > > > + } > > > + memset(reloc, 0, sizeof(*reloc)); > > > + > > > + reloc->sec = sec->reloc; > > > > Maybe elf_add_reloc() could create the reloc section if it doesn't > > already exist, that would help remove the multiple calls to > > elf_create_reloc_section(). > > I'll do that as yet another patch ;-) Ok! > > > + reloc->offset = offset; > > > + reloc->type = type; > > > + reloc->sym = sym; > > > + reloc->addend = addend; > > > > list_add_tail(&reloc->list, &sec->reloc_list); > > This should be sec->reloc->reloc_list? > Absolutely, there were a few other 'funnies' as well that I just fixed > whilst trying to make the whole new stack work again :-) I'm sure some of those were my fault, thanks for testing my crap :-) -- Josh
Re: [PATCH v2 14/14] objtool,x86: Rewrite retpoline thunk calls
On Fri, Mar 19, 2021 at 09:06:44AM +0100, Peter Zijlstra wrote: > > Also doesn't the alternative code already insert nops? > > Problem is that the {call,jmp} *%\reg thing is not fixed length. They're > 2 or 3 bytes depending on which register is picked. Why do they need to be fixed length? Objtool can use sym->len as the alternative replacement length. Then alternatives can add nops as needed. -- Josh
Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic
On Fri, Mar 19, 2021 at 10:22:54AM +0100, Peter Zijlstra wrote: > @@ -814,6 +814,11 @@ struct section *elf_create_reloc_section > } > } > > +static inline bool is_reloc_section(struct section *reloc) > +{ > + return reloc->base && reloc->base->reloc == reloc; > +} I believe the 2nd condition will always be true, so it can just be 'return reloc->base'. -- Josh
Re: [PATCH v2 08/14] objtool: Add elf_create_reloc() helper
On Fri, Mar 19, 2021 at 10:47:36AM +0100, Peter Zijlstra wrote: > Full patch, because diff on a diff on a diff is getting ludicrous hard > to read :-) Appreciated :-) > -void elf_add_reloc(struct elf *elf, struct reloc *reloc) > +int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset, > + unsigned int type, struct symbol *sym, int addend) > { > - struct section *sec = reloc->sec; > + struct reloc *reloc; > + > + reloc = malloc(sizeof(*reloc)); > + if (!reloc) { > + perror("malloc"); > + return -1; > + } > + memset(reloc, 0, sizeof(*reloc)); > + > + reloc->sec = sec->reloc; Maybe elf_add_reloc() could create the reloc section if it doesn't already exist, that would help remove the multiple calls to elf_create_reloc_section(). > + reloc->offset = offset; > + reloc->type = type; > + reloc->sym = sym; > + reloc->addend = addend; > > list_add_tail(&reloc->list, &sec->reloc_list); This should be sec->reloc->reloc_list? -- Josh
Re: [PATCH v2 10/14] objtool: Extract elf_symbol_add()
On Fri, Mar 19, 2021 at 10:54:05AM +0100, Peter Zijlstra wrote: > On Thu, Mar 18, 2021 at 09:14:03PM -0500, Josh Poimboeuf wrote: > > On Thu, Mar 18, 2021 at 06:11:13PM +0100, Peter Zijlstra wrote: > > > Create a common helper to add symbols. > > > > > > Signed-off-by: Peter Zijlstra (Intel) > > > --- > > > tools/objtool/elf.c | 57 > > > ++-- > > > 1 file changed, 33 insertions(+), 24 deletions(-) > > > > > > --- a/tools/objtool/elf.c > > > +++ b/tools/objtool/elf.c > > > @@ -290,12 +290,41 @@ static int read_sections(struct elf *elf > > > return 0; > > > } > > > > > > +static bool elf_symbol_add(struct elf *elf, struct symbol *sym) > > > > How about "elf_add_symbol()" for consistency with my other suggestions > > (elf_add_reloc() and elf_add_string()). And return an int. > > Changed the nane, how about void? This latest incarnation doesn't > actually have an error path. Still doesn't hurt to have one and not use > it I suppose... void would be my preference, just to avoid unnecessary error handling boilerplate in the caller. -- Josh
Re: [PATCH v2 14/14] objtool,x86: Rewrite retpoline thunk calls
On Thu, Mar 18, 2021 at 06:11:17PM +0100, Peter Zijlstra wrote: > When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an > indirect call, have objtool rewrite it to: > > ALTERNATIVE "call __x86_indirect_thunk_\reg", > "call *%reg", ALT_NOT(X86_FEATURE_RETPOLINE) > > Additionally, in order to not emit endless identical > .altinst_replacement chunks, use a global symbol for them, see > __x86_indirect_alt_*. > > This also avoids objtool from having to do code generation. > > Signed-off-by: Peter Zijlstra (Intel) This is better than I expected. Nice workaround for not generating code. > +.macro ALT_THUNK reg > + > + .align 1 > + > +SYM_FUNC_START_NOALIGN(__x86_indirect_alt_call_\reg) > + ANNOTATE_RETPOLINE_SAFE > +1: call*%\reg > +2: .skip 5-(2b-1b), 0x90 > +SYM_FUNC_END(__x86_indirect_alt_call_\reg) > + > +SYM_FUNC_START_NOALIGN(__x86_indirect_alt_jmp_\reg) > + ANNOTATE_RETPOLINE_SAFE > +1: jmp *%\reg > +2: .skip 5-(2b-1b), 0x90 > +SYM_FUNC_END(__x86_indirect_alt_jmp_\reg) This mysterious code needs a comment. Shouldn't it be in .altinstr_replacement or something? Also doesn't the alternative code already insert nops? > +int arch_rewrite_retpoline(struct objtool_file *file, > +struct instruction *insn, > +struct reloc *reloc) > +{ > + struct symbol *sym; > + char name[32] = ""; > + > + if (!strcmp(insn->sec->name, ".text.__x86.indirect_thunk")) > + return 0; > + > + sprintf(name, "__x86_indirect_alt_%s_%s", > + insn->type == INSN_JUMP_DYNAMIC ? "jmp" : "call", > + reloc->sym->name + 21); > + > + sym = find_symbol_by_name(file->elf, name); > + if (!sym) { > + sym = elf_create_undef_symbol(file->elf, name); > + if (!sym) { > + WARN("elf_create_undef_symbol"); > + return -1; > + } > + } > + > + elf_add_alternative(file->elf, insn, sym, > + ALT_NOT(X86_FEATURE_RETPOLINE), 5, 5); > + > + return 0; > +} Need to propagate the error. -- Josh
Re: [PATCH v2 12/14] objtool: Allow archs to rewrite retpolines
On Thu, Mar 18, 2021 at 06:11:15PM +0100, Peter Zijlstra wrote: > @@ -1212,6 +1225,8 @@ static int handle_group_alt(struct objto > dest_off = arch_jump_destination(insn); > if (dest_off == special_alt->new_off + special_alt->new_len) > insn->jump_dest = next_insn_same_sec(file, > last_orig_insn); > + else > + insn->jump_dest = find_insn(file, insn->sec, dest_off); > > if (!insn->jump_dest) { > WARN_FUNC("can't find alternative jump destination", So I assume this change is because of the ordering change: now this is done before add_jump_destinations(). But doesn't that mean the alternative jump modification (changing the dest to the end of the original insns) will get overwritten later? Also the new hunk seems to be an oversimplified version of add_jump_destinations(). I'm not quite convinced that it will always do the right thing for this case. > @@ -1797,11 +1812,15 @@ static int decode_sections(struct objtoo > if (ret) > return ret; > > - ret = add_jump_destinations(file); > + /* > + * Must be before add_{jump,call}_destination; for they can add > + * magic alternatives. > + */ > + ret = add_special_section_alts(file); This reordering is unfortunate. Maybe there's a better way, though I don't have any ideas, at least until I get to the most controversial patch. -- Josh
Re: [PATCH v2 05/14] objtool: Per arch retpoline naming
On Thu, Mar 18, 2021 at 06:11:08PM +0100, Peter Zijlstra wrote: > @@ -872,7 +877,7 @@ static int add_jump_destinations(struct > } else if (reloc->sym->type == STT_SECTION) { > dest_sec = reloc->sym->sec; > dest_off = arch_dest_reloc_offset(reloc->addend); > - } else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", > 21)) { > + } else if (arch_is_retpoline(reloc->sym)) { > /* >* Retpoline jumps are really dynamic jumps in >* disguise, so convert them accordingly. There's another one in add_call_destinations(). -- Josh
Re: [PATCH v2 11/14] objtool: Add elf_create_undef_symbol()
On Thu, Mar 18, 2021 at 06:11:14PM +0100, Peter Zijlstra wrote: > Allow objtool to create undefined symbols; this allows creating > relocations to symbols not currently in the symbol table. > > Signed-off-by: Peter Zijlstra (Intel) > --- > tools/objtool/elf.c | 63 > > tools/objtool/include/objtool/elf.h |1 > 2 files changed, 64 insertions(+) > > --- a/tools/objtool/elf.c > +++ b/tools/objtool/elf.c > @@ -724,6 +724,69 @@ static int elf_strtab_concat(struct elf > return len; > } > > +struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name) > +{ > + struct section *symtab; > + struct symbol *sym; > + Elf_Data *data; > + Elf_Scn *s; > + > + sym = malloc(sizeof(*sym)); > + if (!sym) { > + perror("malloc"); > + return NULL; > + } > + memset(sym, 0, sizeof(*sym)); > + > + sym->name = strdup(name); > + > + sym->sym.st_name = elf_strtab_concat(elf, sym->name, NULL); > + if (sym->sym.st_name == -1) > + return NULL; > + > + sym->sym.st_info = 0x10; /* STB_GLOBAL, STT_NOTYPE */ There's a generic macro for this: sym->sym.st_info = GELF_ST_INFO(STB_GLOBAL, STT_NOTYPE); And sym->bind and sym->type should probably get set. -- Josh
Re: [PATCH v2 10/14] objtool: Extract elf_symbol_add()
On Thu, Mar 18, 2021 at 06:11:13PM +0100, Peter Zijlstra wrote: > Create a common helper to add symbols. > > Signed-off-by: Peter Zijlstra (Intel) > --- > tools/objtool/elf.c | 57 > ++-- > 1 file changed, 33 insertions(+), 24 deletions(-) > > --- a/tools/objtool/elf.c > +++ b/tools/objtool/elf.c > @@ -290,12 +290,41 @@ static int read_sections(struct elf *elf > return 0; > } > > +static bool elf_symbol_add(struct elf *elf, struct symbol *sym) How about "elf_add_symbol()" for consistency with my other suggestions (elf_add_reloc() and elf_add_string()). And return an int. -- Josh
Re: [PATCH v2 09/14] objtool: Extract elf_strtab_concat()
On Thu, Mar 18, 2021 at 06:11:12PM +0100, Peter Zijlstra wrote: > Create a common helper to append strings to a strtab. > > Signed-off-by: Peter Zijlstra (Intel) > --- > tools/objtool/elf.c | 73 > +--- > 1 file changed, 42 insertions(+), 31 deletions(-) > > --- a/tools/objtool/elf.c > +++ b/tools/objtool/elf.c > @@ -676,13 +676,51 @@ struct elf *elf_open_read(const char *na > return NULL; > } > > +static int elf_strtab_concat(struct elf *elf, char *name, const char > *strtab_name) > +{ > + struct section *strtab = NULL; > + Elf_Data *data; > + Elf_Scn *s; > + int len; > + > + if (strtab_name) > + strtab = find_section_by_name(elf, strtab_name); > + if (!strtab) > + strtab = find_section_by_name(elf, ".strtab"); > + if (!strtab) { > + WARN("can't find %s or .strtab section", strtab_name); > + return -1; > + } This part's a bit mysterious (and it loses the Clang comment). Maybe we can leave the section finding logic in elf_create_section()? diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c index b85db6efb9d3..db9ad54894d8 100644 --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -676,19 +676,17 @@ struct elf *elf_open_read(const char *name, int flags) return NULL; } -static int elf_strtab_concat(struct elf *elf, char *name, const char *strtab_name) +static int elf_add_string(struct elf *elf, struct section *strtab, char *str) { struct section *strtab = NULL; Elf_Data *data; Elf_Scn *s; int len; - if (strtab_name) - strtab = find_section_by_name(elf, strtab_name); if (!strtab) strtab = find_section_by_name(elf, ".strtab"); if (!strtab) { - WARN("can't find %s or .strtab section", strtab_name); + WARN("can't find .strtab section"); return -1; } @@ -718,7 +716,7 @@ static int elf_strtab_concat(struct elf *elf, char *name, const char *strtab_nam struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr) { - struct section *sec; + struct section *sec, *shstrtab; size_t size = entsize * nr; Elf_Scn *s; @@ 
-777,7 +775,15 @@ struct section *elf_create_section(struct elf *elf, const char *name, sec->sh.sh_addralign = 1; sec->sh.sh_flags = SHF_ALLOC | sh_flags; - sec->sh.sh_name = elf_strtab_concat(elf, sec->name, ".shstrtab"); + /* Add section name to .shstrtab (or .strtab for Clang) */ + shstrtab = find_section_by_name(elf, ".shstrtab"); + if (!shstrtab) + shstrtab = find_section_by_name(elf, ".strtab"); + if (!shstrtab) { + WARN("can't find .shstrtab or .strtab section"); + return NULL; + } + sec->sh.sh_name = elf_add_string(elf, sec->name, shstrtab); if (sec->sh.sh_name == -1) return NULL;
Re: [PATCH v2 08/14] objtool: Add elf_create_reloc() helper
On Thu, Mar 18, 2021 at 06:11:11PM +0100, Peter Zijlstra wrote: > We have 4 instances of adding a relocation. Create a common helper > to avoid growing even more. > > Signed-off-by: Peter Zijlstra (Intel) I'm not a fan of the API -- how about squashing this in? Untested, of course. diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 6a52d56504b1..38ffa3b2595e 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -469,12 +469,11 @@ static int create_static_call_sections(struct objtool_file *file) memset(site, 0, sizeof(struct static_call_site)); /* populate reloc for 'addr' */ - if (!elf_create_reloc(file->elf, sec, idx * sizeof(struct static_call_site), - R_X86_64_PC32, - reloc_to_insn, insn, 0)) { - WARN_ELF("elf_create_reloc: static_call_site::addr"); + if (elf_add_reloc_to_insn(file->elf, sec, + idx * sizeof(struct static_call_site), + R_X86_64_PC32, + insn->sec, insn->offset)) return -1; - } /* find key symbol */ key_name = strdup(insn->call_dest->name); @@ -511,13 +510,11 @@ static int create_static_call_sections(struct objtool_file *file) free(key_name); /* populate reloc for 'key' */ - if (!elf_create_reloc(file->elf, sec, idx * sizeof(struct static_call_site) + 4, - R_X86_64_PC32, - reloc_to_sym, key_sym, - is_sibling_call(insn) * STATIC_CALL_SITE_TAIL)) { - WARN_ELF("elf_create_reloc: static_call_site::key"); + if (elf_add_reloc(file->elf, sec, + idx * sizeof(struct static_call_site) + 4, + R_X86_64_PC32, key_sym, + is_sibling_call(insn) * STATIC_CALL_SITE_TAIL)) return -1; - } idx++; } @@ -559,12 +556,11 @@ static int create_mcount_loc_sections(struct objtool_file *file) loc = (unsigned long *)sec->data->d_buf + idx; memset(loc, 0, sizeof(unsigned long)); - if (!elf_create_reloc(file->elf, sec, idx * sizeof(unsigned long), - R_X86_64_64, - reloc_to_insn, insn, 0)) { - WARN_ELF("elf_create_reloc: __mcount_loc"); + if (elf_add_reloc_to_insn(file->elf, sec, + idx * sizeof(unsigned long), + R_X86_64_64, + insn->sec, insn->offset)) 
return -1; - } idx++; } diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c index d2afe2454de4..b3bd97b2b034 100644 --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -447,76 +447,73 @@ static int read_symbols(struct elf *elf) return -1; } -static void elf_add_reloc(struct elf *elf, struct reloc *reloc) +int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset, + unsigned int type, struct symbol *sym, int addend) { - struct section *sec = reloc->sec; + struct reloc *reloc; + + reloc = malloc(sizeof(*reloc)); + if (!reloc) { + perror("malloc"); + return -1; + } + memset(reloc, 0, sizeof(*reloc)); + + reloc->sec = sec->reloc; + reloc->offset = offset; + reloc->type = type; + reloc->sym = sym; + reloc->addend = addend; list_add_tail(&reloc->list, &sec->reloc_list); elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc)); - sec->changed = true; -} - -void reloc_to_sym(struct reloc *reloc, void *dst) -{ - struct symbol *sym = dst; - reloc->sym = sym; + return 0; } -void __reloc_to_insn(struct reloc *reloc, struct section *sec, unsigned long offset) +static int insn_to_sym_addend(struct section *sec, unsigned long offset, + struct symbol **sym, int *addend) { if (sec->sym) { - reloc->sym = sec->sym; - reloc->addend = offset; - return; + *sym = sec->sym; + *addend = offset; + return 0; } /* * The Clang assembler strips section symbols, so we have to reference * the function symbol instead: */ - reloc->sym = find_symbol_containing(sec, offset); - if (!reloc->sym) { /*
Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic
On Thu, Mar 18, 2021 at 12:38:42PM -0500, Josh Poimboeuf wrote: > On Thu, Mar 18, 2021 at 06:04:25PM +0100, Peter Zijlstra wrote: > > On Thu, Mar 18, 2021 at 11:36:40AM -0500, Josh Poimboeuf wrote: > > > > I was thinking you could get a section changed without touching > > > > relocations, but while that is theoretically possible, it is exceedingly > > > > unlikely (and objtool doesn't do that). > > > > > > Hm? This is a *relocation* section, not a normal one. So by > > > definition, it only changes when its relocations change. > > > > The way I read this code: > > > > list_for_each_entry(sec, &elf->sections, list) { > > if (sec->changed) { > > + if (sec->reloc && > > + elf_rebuild_reloc_section(elf, sec->reloc)) { > > + WARN_ELF("elf_rebuild_reloc_section"); > > + return -1; > > + } > > > > is that we iterate the regular sections (which could be dirtied because > > we changed some data), and if that section has a relocation section, we > > rebuild that for good measure (even though it might not have altered > > relocations). > > > > Or am I just totally confused ? > > Ah, you're right. I'm the one that's confused. I guess I was also > confused when I wrote that hunk, but it just happens to work anyway. > > It would be cleaner to do something like > > if ((is_reloc_sec(sec) && > elf_rebuild_reloc_section(elf, sec)) { > > so we process the changed reloc section directly, instead of relying on > the (most likely) fact that the corresponding text section also changed. 
i.e., in actual code: diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c index 66c49c2e20a6..3b3d19a5e626 100644 --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -948,8 +948,7 @@ int elf_write(struct elf *elf) /* Update changed relocation sections and section headers: */ list_for_each_entry(sec, &elf->sections, list) { if (sec->changed) { - if (sec->reloc && - elf_rebuild_reloc_section(elf, sec->reloc)) { + if (sec->base && elf_rebuild_reloc_section(elf, sec)) { WARN_ELF("elf_rebuild_reloc_section"); return -1; }
Re: [PATCH] kbuild: rebuild GCC plugins when the compiler is upgraded
On Wed, Mar 10, 2021 at 10:44:46PM +0900, Masahiro Yamada wrote: > > Masahiro, > > > > Ping. Do you have a better approach for building GCC plugins in the > > external module directory? > > > I am not sure if building GCC plugins in the external module directory > is the right approach. I'm certainly open to any better ideas that would allow GCC plugins to be enabled in distro kernel builds. -- Josh
Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic
On Thu, Mar 18, 2021 at 06:04:25PM +0100, Peter Zijlstra wrote: > On Thu, Mar 18, 2021 at 11:36:40AM -0500, Josh Poimboeuf wrote: > > > I was thinking you could get a section changed without touching > > > relocations, but while that is theoretically possible, it is exceedingly > > > unlikely (and objtool doesn't do that). > > > > Hm? This is a *relocation* section, not a normal one. So by > > definition, it only changes when its relocations change. > > The way I read this code: > > list_for_each_entry(sec, &elf->sections, list) { > if (sec->changed) { > + if (sec->reloc && > + elf_rebuild_reloc_section(elf, sec->reloc)) { > + WARN_ELF("elf_rebuild_reloc_section"); > + return -1; > + } > > is that we iterate the regular sections (which could be dirtied because > we changed some data), and if that section has a relocation section, we > rebuild that for good measure (even though it might not have altered > relocations). > > Or am I just totally confused ? Ah, you're right. I'm the one that's confused. I guess I was also confused when I wrote that hunk, but it just happens to work anyway. It would be cleaner to do something like if ((is_reloc_sec(sec) && elf_rebuild_reloc_section(elf, sec)) { so we process the changed reloc section directly, instead of relying on the (most likely) fact that the corresponding text section also changed. -- Josh
Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic
On Thu, Mar 18, 2021 at 01:57:03PM +0100, Peter Zijlstra wrote: > On Wed, Mar 17, 2021 at 07:49:17PM -0500, Josh Poimboeuf wrote: > > On Wed, Mar 17, 2021 at 09:12:15AM +0100, Peter Zijlstra wrote: > > > On Tue, Mar 16, 2021 at 10:34:17PM -0500, Josh Poimboeuf wrote: > > > > On Fri, Mar 12, 2021 at 06:16:18PM +0100, Peter Zijlstra wrote: > > > > > --- a/tools/objtool/elf.c > > > > > +++ b/tools/objtool/elf.c > > > > > @@ -479,6 +479,8 @@ void elf_add_reloc(struct elf *elf, stru > > > > > > > > > > list_add_tail(&reloc->list, &sec->reloc_list); > > > > > elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc)); > > > > > + > > > > > + sec->rereloc = true; > > > > > } > > > > > > > > Can we just reuse sec->changed for this? Something like this on top > > > > (untested of course): > > > > > > I think my worry was that we'd dirty too much and slow down the write, > > > but I haven't done any actual performance measurements on this. > > > > Really? I thought my proposal was purely aesthetic, no functional > > change, but my brain is toasty this week due to other distractions so > > who knows. > > I was thinking you could get a section changed without touching > relocations, but while that is theoretically possible, it is exceedingly > unlikely (and objtool doesn't do that). Hm? This is a *relocation* section, not a normal one. So by definition, it only changes when its relocations change. -- Josh
Re: [PATCH 3/3] static_call: Fix static_call_update() sanity check
On Thu, Mar 18, 2021 at 12:31:59PM +0100, Peter Zijlstra wrote: > if (!kernel_text_address((unsigned long)site_addr)) { > - WARN_ONCE(1, "can't patch static call site at > %pS", > + /* > + * This skips patching __exit, which is part of > + * init_section_contains() but is not part of > + * kernel_text_address(). > + * > + * Skipping __exit is fine since it will never > + * be executed. > + */ > + WARN_ONCE(!static_call_is_init(site), > + "can't patch static call site at %pS", > site_addr); > continue; > } It might be good to clarify the situation for __exit in modules in the comment and/or changelog, as they both seem to be implicitly talking only about __exit in vmlinux. For CONFIG_MODULE_UNLOAD, the code ends up in the normal text area, so static_call_is_init() is false and kernel_text_address() is true. For !CONFIG_MODULE_UNLOAD, the code gets discarded during module load, so static_call_is_init() and kernel_text_address() are both false. I guess that will trigger a warning? -- Josh
Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic
On Wed, Mar 17, 2021 at 09:12:15AM +0100, Peter Zijlstra wrote: > On Tue, Mar 16, 2021 at 10:34:17PM -0500, Josh Poimboeuf wrote: > > On Fri, Mar 12, 2021 at 06:16:18PM +0100, Peter Zijlstra wrote: > > > --- a/tools/objtool/elf.c > > > +++ b/tools/objtool/elf.c > > > @@ -479,6 +479,8 @@ void elf_add_reloc(struct elf *elf, stru > > > > > > list_add_tail(&reloc->list, &sec->reloc_list); > > > elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc)); > > > + > > > + sec->rereloc = true; > > > } > > > > Can we just reuse sec->changed for this? Something like this on top > > (untested of course): > > I think my worry was that we'd dirty too much and slow down the write, > but I haven't done any actual performance measurements on this. Really? I thought my proposal was purely aesthetic, no functional change, but my brain is toasty this week due to other distractions so who knows. > Let me do a few runs and see if it matters at all. -- Josh
Re: [PATCH 6/9] objtool: Add elf_create_undef_symbol()
On Wed, Mar 17, 2021 at 03:13:43PM +0100, Peter Zijlstra wrote: > On Wed, Mar 17, 2021 at 02:52:23PM +0100, Miroslav Benes wrote: > > > > + if (!elf_symbol_add(elf, sym, SHN_XINDEX)) { > > > + WARN("elf_symbol_add"); > > > + return NULL; > > > + } > > > > SHN_XINDEX means that the extended section index is used. Above you seem > > to use it in the opposite sense too (assigning to shndx when shndx_data is > > NULL). While it makes the code easier to handle, it is a bit confusing > > (and maybe I am just confused now). Could you add a comment about that, > > please? elf_symbol_add() seems like a good place. > > Yes, that was a horrible thing to do :/ And you understood it right. > > Looking at it again, I'm not sure it is actually correct tho; shouldn't > elf_create_undef_symbol() also look at gelf_getsymshndx() of symtab ? > > What toolchain generates these extended sections and how? That is, how > do I test this crud.. SHN_XINDEX is basically a special-case extension to original ELF for supporting more than 64k sections. -- Josh
Re: [PATCH v4 3/9] x86/entry: Convert ret_from_fork to C
On Wed, Mar 17, 2021 at 11:12:42AM -0700, Andy Lutomirski wrote: > ret_from_fork is written in asm, slightly differently, for x86_32 and > x86_64. Convert it to C. > > This is a straight conversion without any particular cleverness. As a > further cleanup, the code that sets up the ret_from_fork argument registers > could be adjusted to put the arguments in the correct registers. > > This will cause the ORC unwinder to find pt_regs even for kernel threads on > x86_64. This seems harmless. > > The 32-bit comment above the now-deleted schedule_tail_wrapper was > obsolete: the encode_frame_pointer mechanism (see copy_thread()) solves the > same problem more cleanly. > > Cc: Josh Poimboeuf > Signed-off-by: Andy Lutomirski Acked-by: Josh Poimboeuf -- Josh
Re: [PATCH v4 2/9] x86/kthread,dumpstack: Set task_pt_regs->cs.RPL=3 for kernel threads
On Wed, Mar 17, 2021 at 11:12:41AM -0700, Andy Lutomirski wrote: > @@ -163,6 +163,19 @@ int copy_thread(unsigned long clone_flags, unsigned long > sp, unsigned long arg, > /* Kernel thread ? */ > if (unlikely(p->flags & PF_KTHREAD)) { > memset(childregs, 0, sizeof(struct pt_regs)); > + > + /* > + * Even though there is no real user state here, these > + * are were user regs belong, and kernel_execve() will s/were/where/ > + * overwrite them with user regs. Put an obviously > + * invalid value that nonetheless causes user_mode(regs) > + * to return true in CS. > + * > + * This also prevents the unwinder from thinking that there > + * is invalid kernel state at the top of the stack. > + */ > + childregs->cs = 3; > + > kthread_frame_init(frame, sp, arg); > return 0; > } Acked-by: Josh Poimboeuf -- Josh
Re: [PATCH] objtool,static_call: Don't emit static_call_site for .exit.text
On Wed, Mar 17, 2021 at 01:45:57PM +0100, Peter Zijlstra wrote: > arguably it simply isn't a good idea to use static_call() in __exit > code anyway, since module unload is never a performance critical path. Couldn't you make the same argument about __init functions, which are allowed to do static calls? We might consider a STATIC_CALL_SITE_EXIT flag, but I suppose we've run out of flag space. -- Josh
Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic
On Fri, Mar 12, 2021 at 06:16:18PM +0100, Peter Zijlstra wrote: > --- a/tools/objtool/elf.c > +++ b/tools/objtool/elf.c > @@ -479,6 +479,8 @@ void elf_add_reloc(struct elf *elf, stru > > list_add_tail(&reloc->list, &sec->reloc_list); > elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc)); > + > + sec->rereloc = true; > } Can we just reuse sec->changed for this? Something like this on top (untested of course): diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c index addcec88ac9f..b9cb74a54681 100644 --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -480,7 +480,7 @@ void elf_add_reloc(struct elf *elf, struct reloc *reloc) list_add_tail(&reloc->list, &sec->reloc_list); elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc)); - sec->rereloc = true; + sec->changed = true; } static int read_rel_reloc(struct section *sec, int i, struct reloc *reloc, unsigned int *symndx) @@ -882,9 +882,6 @@ int elf_rebuild_reloc_section(struct elf *elf, struct section *sec) struct reloc *reloc; int nr; - sec->changed = true; - elf->changed = true; - nr = 0; list_for_each_entry(reloc, &sec->reloc_list, list) nr++; @@ -894,8 +891,6 @@ int elf_rebuild_reloc_section(struct elf *elf, struct section *sec) case SHT_RELA: return elf_rebuild_rela_reloc_section(sec, nr); default: return -1; } - - sec->rereloc = false; } int elf_write_insn(struct elf *elf, struct section *sec, @@ -950,14 +945,15 @@ int elf_write(struct elf *elf) struct section *sec; Elf_Scn *s; - list_for_each_entry(sec, &elf->sections, list) { - if (sec->reloc && sec->reloc->rereloc) - elf_rebuild_reloc_section(elf, sec->reloc); - } - - /* Update section headers for changed sections: */ + /* Update changed relocation sections and section headers: */ list_for_each_entry(sec, &elf->sections, list) { if (sec->changed) { + if (sec->reloc && + elf_rebuild_reloc_section(elf, sec->reloc)) { + WARN_ELF("elf_rebuild_reloc_section"); + return -1; + } + s = elf_getscn(elf->elf, sec->idx); if (!s) { WARN_ELF("elf_getscn"); @@ -969,6 +965,7 @@ int elf_write(struct elf 
*elf) } sec->changed = false; + elf->changed = true; } } diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h index 9fdd4c5f9f32..e6890cc70a25 100644 --- a/tools/objtool/include/objtool/elf.h +++ b/tools/objtool/include/objtool/elf.h @@ -39,7 +39,7 @@ struct section { char *name; int idx; unsigned int len; - bool changed, text, rodata, noinstr, rereloc; + bool changed, text, rodata, noinstr; }; struct symbol {
Re: [PATCH 4/9] objtool: Fix static_call list generation
On Fri, Mar 12, 2021 at 06:16:17PM +0100, Peter Zijlstra wrote: > @@ -1701,6 +1706,9 @@ static int decode_sections(struct objtoo > if (ret) > return ret; > > + /* > + * Must be before add_{jump_call}_desetination. > + */ s/desetination/destination/ -- Josh
Re: [PATCH 2/9] objtool: Correctly handle retpoline thunk calls
On Fri, Mar 12, 2021 at 06:16:15PM +0100, Peter Zijlstra wrote: > Just like JMP handling, convert a direct CALL to a retpoline thunk > into a retpoline safe indirect CALL. > > Signed-off-by: Peter Zijlstra (Intel) > --- > tools/objtool/check.c | 12 > 1 file changed, 12 insertions(+) > > --- a/tools/objtool/check.c > +++ b/tools/objtool/check.c > @@ -953,6 +953,18 @@ static int add_call_destinations(struct > dest_off); > return -1; > } > + > + } else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", > 21)) { > + /* > + * Retpoline calls are really dynamic calls in > + * disguise, so convert them accodingly. s/accodingly/accordingly/ -- Josh
Re: [PATCH -tip v2 00/10] kprobes: Fix stacktrace with kretprobes
On Fri, Mar 12, 2021 at 03:41:44PM +0900, Masami Hiramatsu wrote: > Hello, > > Here is the 2nd version of the series to fix the stacktrace with kretprobe. > > The 1st series is here; > > https://lore.kernel.org/bpf/161495873696.346821.10161501768906432924.stgit@devnote2/ > > In this version I merged the ORC unwinder fix for kretprobe which discussed > in the > previous thread. [3/10] is updated according to the Miroslav's comment. > [4/10] is > updated for simplify the code. [5/10]-[9/10] are discussed in the previsous > tread > and are introduced to the series. > > Daniel, can you also test this again? I and Josh discussed a bit different > method and I've implemented it on this version. > > This actually changes the kretprobe behavisor a bit, now the instraction > pointer in > the pt_regs passed to kretprobe user handler is correctly set the real return > address. So user handlers can get it via instruction_pointer() API. When I add WARN_ON(1) to a test kretprobe, it doesn't unwind properly. show_trace_log_lvl() reads the entire stack in lockstep with calls to the unwinder so that it can decide which addresses get prefixed with '?'. So it expects to find an actual return address on the stack. Instead it finds %rsp. So it never syncs up with unwind_next_frame() and shows all remaining addresses as unreliable. Call Trace: __kretprobe_trampoline_handler+0xca/0x1a0 trampoline_handler+0x3d/0x50 kretprobe_trampoline+0x25/0x50 ? init_test_probes.cold+0x323/0x387 ? init_kprobes+0x144/0x14c ? init_optprobes+0x15/0x15 ? do_one_initcall+0x5b/0x300 ? lock_is_held_type+0xe8/0x140 ? kernel_init_freeable+0x174/0x2cd ? rest_init+0x233/0x233 ? kernel_init+0xa/0x11d ? ret_from_fork+0x22/0x30 How about pushing 'kretprobe_trampoline' instead of %rsp for the return address placeholder. 
That fixes the above test, and removes the need for the weird 'state->ip == sp' check: Call Trace: __kretprobe_trampoline_handler+0xca/0x150 trampoline_handler+0x3d/0x50 kretprobe_trampoline+0x29/0x50 ? init_test_probes.cold+0x323/0x387 elfcorehdr_read+0x10/0x10 init_kprobes+0x144/0x14c ? init_optprobes+0x15/0x15 do_one_initcall+0x72/0x280 kernel_init_freeable+0x174/0x2cd ? rest_init+0x122/0x122 kernel_init+0xa/0x10e ret_from_fork+0x22/0x30 Though, init_test_probes.cold() (the real return address) is still listed as unreliable. Maybe we need a call to kretprobe_find_ret_addr() in show_trace_log_lvl(), similar to the ftrace_graph_ret_addr() call there. diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c index 06f33bfebc50..70188fdd288e 100644 --- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -766,19 +766,19 @@ asm( "kretprobe_trampoline:\n" /* We don't bother saving the ss register */ #ifdef CONFIG_X86_64 - " pushq %rsp\n" + /* Push fake return address to tell the unwinder it's a kretprobe */ + " pushq $kretprobe_trampoline\n" UNWIND_HINT_FUNC " pushfq\n" SAVE_REGS_STRING " movq %rsp, %rdi\n" " call trampoline_handler\n" - /* Replace saved sp with true return address. */ + /* Replace the fake return address with the real one. */ " movq %rax, 19*8(%rsp)\n" RESTORE_REGS_STRING " popfq\n" #else " pushl %esp\n" - UNWIND_HINT_FUNC " pushfl\n" SAVE_REGS_STRING " movl %esp, %eax\n" diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 1d1b9388a1b1..1d3de84d2410 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -548,8 +548,7 @@ bool unwind_next_frame(struct unwind_state *state) * In those cases, find the correct return address from * task->kretprobe_instances list. */ - if (state->ip == sp || - is_kretprobe_trampoline(state->ip)) + if (is_kretprobe_trampoline(state->ip)) state->ip = kretprobe_find_ret_addr(state->task, &state->kr_iter);
Re: [GIT pull] locking/urgent for v5.12-rc3
On Mon, Mar 15, 2021 at 01:08:27PM +0100, Peter Zijlstra wrote: > On Mon, Mar 15, 2021 at 12:26:12PM +0100, Peter Zijlstra wrote: > > Ooooh, modules don't have this. They still have regular > > .static_call_sites sections, and *those* are unaligned. > > > > Section Headers: > > [Nr] Name Type Address Off Size ES > > Flg Lk Inf Al > > > > [16] .static_call_sites PROGBITS 008aa1 0006f0 00 > > WA 0 0 1 > > > > And that goes *BOOM*.. Let me see if I can figure out how to make > > objtool align those sections. > > The below seems to have cured it: > > [16] .static_call_sites PROGBITS 008aa8 0006f0 00 WA > 0 0 8 > > > So, anybody any opinion on if we ought to do this? I'd say yes to alignment, for the sake of consistency with vmlinux. Though instead of using objtool, it can be done in the module linker script: diff --git a/scripts/module.lds.S b/scripts/module.lds.S index 168cd27e6122..73345cbfe100 100644 --- a/scripts/module.lds.S +++ b/scripts/module.lds.S @@ -17,6 +17,7 @@ SECTIONS { .init_array 0 : ALIGN(8) { *(SORT(.init_array.*)) *(.init_array) } __jump_table 0 : ALIGN(8) { KEEP(*(__jump_table)) } + .static_call_sites 0 : ALIGN(8) { KEEP(*(.static_call_sites)) } __patchable_function_entries : { *(__patchable_function_entries) }
Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes
On Thu, Mar 11, 2021 at 10:54:38AM +0900, Masami Hiramatsu wrote: > On Wed, 10 Mar 2021 19:06:15 -0600 > Josh Poimboeuf wrote: > > > On Thu, Mar 11, 2021 at 09:20:18AM +0900, Masami Hiramatsu wrote: > > > > > bool unwind_next_frame(struct unwind_state *state) > > > > > { > > > > > unsigned long ip_p, sp, tmp, orig_ip = state->ip, prev_sp = > > > > > state->sp; > > > > > @@ -536,6 +561,18 @@ bool unwind_next_frame(struct unwind_state > > > > > *state) > > > > > > > > > > state->ip = ftrace_graph_ret_addr(state->task, > > > > > >graph_idx, > > > > > state->ip, (void > > > > > *)ip_p); > > > > > + /* > > > > > + * There are special cases when the stack unwinder is > > > > > called > > > > > + * from the kretprobe handler or the interrupt handler > > > > > which > > > > > + * occurs in the kretprobe trampoline code. In those > > > > > cases, > > > > > + * %sp is shown on the stack instead of the return > > > > > address. > > > > > + * Or, when the unwinder find the return address is > > > > > replaced > > > > > + * by kretprobe_trampoline. > > > > > + * In those cases, correct address can be found in > > > > > kretprobe. > > > > > + */ > > > > > + if (state->ip == sp || > > > > > > > > Why is the 'state->ip == sp' needed? > > > > > > As I commented above, until kretprobe_trampoline writes back the real > > > address to the stack, sp value is there (which has been pushed by the > > > 'pushq %rsp' at the entry of kretprobe_trampoline.) > > > > > > ".type kretprobe_trampoline, @function\n" > > > "kretprobe_trampoline:\n" > > > /* We don't bother saving the ss register */ > > > " pushq %rsp\n" // THIS > > > " pushfq\n" > > > > > > Thus, from inside the kretprobe handler, like ftrace, you'll see > > > the sp value instead of the real return address. > > > > I see. 
If you change is_kretprobe_trampoline_address() to include the > > entire function, like: > > > > static bool is_kretprobe_trampoline_address(unsigned long ip) > > { > > return (void *)ip >= kretprobe_trampoline && > >(void *)ip < kretprobe_trampoline_end; > > } > > > > then the unwinder won't ever read the bogus %rsp value into state->ip, > > and the 'state->ip == sp' check can be removed. > > Hmm, I couldn't get your point. Since sp is the address of stack, > it always out of text address. When unwinding from trampoline_handler(), state->ip will point to the instruction after the call: call trampoline_handler movq %rax, 19*8(%rsp) <-- state->ip points to this insn But then, the above version of is_kretprobe_trampoline_address() is true, so state->ip gets immediately replaced with the real return address: if (is_kretprobe_trampoline_address(state->ip)) state->ip = orc_kretprobe_correct_ip(state); so the unwinder skips over the kretprobe_trampoline() frame and goes straight to the frame of the real return address. Thus it never reads this bogus return value into state->ip: pushq %rsp which is why the weird 'state->ip == sp' check is no longer needed. The only "downside" is that the unwinder skips the kretprobe_trampoline() frame. (note that downside wouldn't exist in the case of UNWIND_HINT_REGS + valid regs->ip). > > > > And it would make the unwinder just work automatically when unwinding > > > > from the handler using the regs. > > > > > > > > It would also work when unwinding from the handler's stack, if we put an > > > > UNWIND_HINT_REGS after saving the regs. > > > > > > At that moment, the real return address is not identified. So we can not > > > put it. > > > > True, at the time the regs are originally saved, the real return address > > isn't available. But by the time the user handler is called, the return > > address *is* available. So if the real return address were placed in > > regs->ip befo
Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes
On Thu, Mar 11, 2021 at 09:20:18AM +0900, Masami Hiramatsu wrote: > > > bool unwind_next_frame(struct unwind_state *state) > > > { > > > unsigned long ip_p, sp, tmp, orig_ip = state->ip, prev_sp = state->sp; > > > @@ -536,6 +561,18 @@ bool unwind_next_frame(struct unwind_state *state) > > > > > > state->ip = ftrace_graph_ret_addr(state->task, > > > &state->graph_idx, > > > state->ip, (void *)ip_p); > > > + /* > > > + * There are special cases when the stack unwinder is called > > > + * from the kretprobe handler or the interrupt handler which > > > + * occurs in the kretprobe trampoline code. In those cases, > > > + * %sp is shown on the stack instead of the return address. > > > + * Or, when the unwinder find the return address is replaced > > > + * by kretprobe_trampoline. > > > + * In those cases, correct address can be found in kretprobe. > > > + */ > > > + if (state->ip == sp || > > > > Why is the 'state->ip == sp' needed? > > As I commented above, until kretprobe_trampoline writes back the real > address to the stack, sp value is there (which has been pushed by the > 'pushq %rsp' at the entry of kretprobe_trampoline.) > > ".type kretprobe_trampoline, @function\n" > "kretprobe_trampoline:\n" > /* We don't bother saving the ss register */ > " pushq %rsp\n" // THIS > " pushfq\n" > > Thus, from inside the kretprobe handler, like ftrace, you'll see > the sp value instead of the real return address. I see. If you change is_kretprobe_trampoline_address() to include the entire function, like: static bool is_kretprobe_trampoline_address(unsigned long ip) { return (void *)ip >= kretprobe_trampoline && (void *)ip < kretprobe_trampoline_end; } then the unwinder won't ever read the bogus %rsp value into state->ip, and the 'state->ip == sp' check can be removed. > > And it would make the unwinder just work automatically when unwinding > > from the handler using the regs. 
> > > > It would also work when unwinding from the handler's stack, if we put an > > UNWIND_HINT_REGS after saving the regs. > > At that moment, the real return address is not identified. So we can not > put it. True, at the time the regs are originally saved, the real return address isn't available. But by the time the user handler is called, the return address *is* available. So if the real return address were placed in regs->ip before calling the handler, the unwinder could find it there, when called from the handler. Then we wouldn't need the call to orc_kretprobe_correct_ip() in __unwind_start(). But maybe it's not possible due to the regs->ip expectations of legacy handlers? -- Josh
Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes
On Thu, Mar 11, 2021 at 12:55:09AM +0900, Masami Hiramatsu wrote: > +#ifdef CONFIG_KRETPROBES > +static unsigned long orc_kretprobe_correct_ip(struct unwind_state *state) > +{ > + return kretprobe_find_ret_addr( > + (unsigned long)kretprobe_trampoline_addr(), > + state->task, &state->kr_iter); > +} > + > +static bool is_kretprobe_trampoline_address(unsigned long ip) > +{ > + return ip == (unsigned long)kretprobe_trampoline_addr(); > +} > +#else > +static unsigned long orc_kretprobe_correct_ip(struct unwind_state *state) > +{ > + return state->ip; > +} > + > +static bool is_kretprobe_trampoline_address(unsigned long ip) > +{ > + return false; > +} > +#endif > + Can this code go in a kprobes file? I'd rather not clutter ORC with it, and maybe it would be useful for other arches or unwinders. > bool unwind_next_frame(struct unwind_state *state) > { > unsigned long ip_p, sp, tmp, orig_ip = state->ip, prev_sp = state->sp; > @@ -536,6 +561,18 @@ bool unwind_next_frame(struct unwind_state *state) > > state->ip = ftrace_graph_ret_addr(state->task, > &state->graph_idx, > state->ip, (void *)ip_p); > + /* > + * There are special cases when the stack unwinder is called > + * from the kretprobe handler or the interrupt handler which > + * occurs in the kretprobe trampoline code. In those cases, > + * %sp is shown on the stack instead of the return address. > + * Or, when the unwinder find the return address is replaced > + * by kretprobe_trampoline. > + * In those cases, correct address can be found in kretprobe. > + */ > + if (state->ip == sp || Why is the 'state->ip == sp' needed? > + is_kretprobe_trampoline_address(state->ip)) > + state->ip = orc_kretprobe_correct_ip(state); This is similar in concept to ftrace_graph_ret_addr(), right? Would it be possible to have a similar API? Like state->ip = kretprobe_ret_addr(state->task, &state->kr_iter, state->ip); and without the conditional.
> > state->sp = sp; > state->regs = NULL; > @@ -649,6 +686,12 @@ void __unwind_start(struct unwind_state *state, struct > task_struct *task, > state->full_regs = true; > state->signal = true; > > + /* > + * When the unwinder called with regs from kretprobe handler, > + * the regs->ip starts from kretprobe_trampoline address. > + */ > + if (is_kretprobe_trampoline_address(state->ip)) > + state->ip = orc_kretprobe_correct_ip(state); Shouldn't __kretprobe_trampoline_handler() just set regs->ip to 'correct_ret_addr' before passing the regs to the handler? I'd think that would be a less surprising value for regs->ip than '_trampoline'. And it would make the unwinder just work automatically when unwinding from the handler using the regs. It would also work when unwinding from the handler's stack, if we put an UNWIND_HINT_REGS after saving the regs. The only (rare) case it wouldn't work would be unwinding from an interrupt before regs->ip gets set properly. In which case we'd still need the above call to orc_kretprobe_correct_ip() or so. -- Josh
Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes
On Wed, Mar 10, 2021 at 06:57:34PM +0900, Masami Hiramatsu wrote: > > If I understand correctly, for #1 you need an unwind hint which treats > > the instruction *after* the "pushq %rsp" as the beginning of the > > function. > > Thanks for the patch. In that case, should I still change the stack > allocation? > Or can I continue to use a series of "push/pop" ? You can continue to use push/pop. Objtool is only getting confused by the unbalanced stack of the function (more pushes than pops). The unwind hint should fix that. -- Josh
Re: [PATCH] kbuild: rebuild GCC plugins when the compiler is upgraded
On Fri, Mar 05, 2021 at 08:50:59PM -0600, Josh Poimboeuf wrote: > > Is this a bad coding contest? > > > > I am not asking you to add ugly ifeq or whatever > > hacks to say "this worked for me". > > > > Please feel free to do this in the fedora kernel, > > but do not send it to upstream. > > > > Sorry, I really do not want to see hacks like this any more. Masahiro, Ping. Do you have a better approach for building GCC plugins in the external module directory? -- Josh
Re: [PATCH] objtool: Fix a memory leak bug
On Tue, Mar 09, 2021 at 04:46:16PM +0800, Jiapeng Chong wrote: > Fix the following cppcheck warnings: > > tools/objtool/check.c(1102): error: Memory leak: orig_alt_group. > > Reported-by: Abaci Robot > Signed-off-by: Jiapeng Chong Hi Jiapeng, Objtool is a short-running process which exits immediately on error, so we don't worry about memory leaks. -- Josh
Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes
On Mon, Mar 08, 2021 at 11:52:10AM +0900, Masami Hiramatsu wrote: > So at the kretprobe handler, we have 2 issues. > 1) the return address (caller_func+0x15) is not on the stack. > this can be solved by searching from current->kretprobe_instances. > 2) the stack frame size of kretprobe_trampoline is unknown > Since the stackframe is fixed, the fixed number (0x98) can be used. > > However, those solutions are only for the kretprobe handler. The stacktrace > from interrupt handler hit in the kretprobe_trampoline still doesn't work. > > So, here is my idea; > > 1) Change the trampoline code to prepare stack frame at first and save > registers on it, instead of "push". This will make ORC easy to setup > stackframe information for this code. > 2) change the return address fixup timing. Instead of using return value > of trampoline handler, before removing the real return address from > current->kretprobe_instances. > 3) Then, if orc_find() finds the ip is in the kretprobe_trampoline, it > checks the contents of the end of stackframe (at the place of regs->sp) > is same as the address of it. If it is, it can find the correct address > from current->kretprobe_instances. If not, there is a correct address. > > I need to find how the ORC info is prepared for 1), maybe UNWIND_HINT macro > may help? Hi Masami, If I understand correctly, for #1 you need an unwind hint which treats the instruction *after* the "pushq %rsp" as the beginning of the function.
I'm thinking this should work: diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h index 8e574c0afef8..8b33674288ea 100644 --- a/arch/x86/include/asm/unwind_hints.h +++ b/arch/x86/include/asm/unwind_hints.h @@ -52,6 +52,11 @@ UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=8 type=UNWIND_HINT_TYPE_FUNC .endm +#else + +#define UNWIND_HINT_FUNC \ + UNWIND_HINT(ORC_REG_SP, 8, UNWIND_HINT_TYPE_FUNC, 0) + #endif /* __ASSEMBLY__ */ #endif /* _ASM_X86_UNWIND_HINTS_H */ diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c index df776cdca327..38ff1387f95d 100644 --- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -767,6 +767,7 @@ asm( /* We don't bother saving the ss register */ #ifdef CONFIG_X86_64 " pushq %rsp\n" + UNWIND_HINT_FUNC " pushfq\n" SAVE_REGS_STRING " movq %rsp, %rdi\n" @@ -790,7 +791,6 @@ asm( ".size kretprobe_trampoline, .-kretprobe_trampoline\n" ); NOKPROBE_SYMBOL(kretprobe_trampoline); -STACK_FRAME_NON_STANDARD(kretprobe_trampoline); /*
[tip: x86/urgent] x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: e504e74cc3a2c092b05577ce3e8e013fae7d94e6 Gitweb: https://git.kernel.org/tip/e504e74cc3a2c092b05577ce3e8e013fae7d94e6 Author:Josh Poimboeuf AuthorDate:Fri, 05 Feb 2021 08:24:02 -06:00 Committer: Borislav Petkov CommitterDate: Sat, 06 Mar 2021 13:09:37 +01:00 x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2 KASAN reserves "redzone" areas between stack frames in order to detect stack overruns. A read or write to such an area triggers a KASAN "stack-out-of-bounds" BUG. Normally, the ORC unwinder stays in-bounds and doesn't access the redzone. But sometimes it can't find ORC metadata for a given instruction. This can happen for code which is missing ORC metadata, or for generated code. In such cases, the unwinder attempts to fall back to frame pointers, as a best-effort type thing. This fallback often works, but when it doesn't, the unwinder can get confused and go off into the weeds into the KASAN redzone, triggering the aforementioned KASAN BUG. But in this case, the unwinder's confusion is actually harmless and working as designed. It already has checks in place to prevent off-stack accesses, but those checks get short-circuited by the KASAN BUG. And a BUG is a lot more disruptive than a harmless unwinder warning. Disable the KASAN checks by using READ_ONCE_NOCHECK() for all stack accesses. This finishes the job started by commit 881125bfe65b ("x86/unwind: Disable KASAN checking in the ORC unwinder"), which only partially fixed the issue. 
Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Reported-by: Ivan Babrou Signed-off-by: Josh Poimboeuf Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Borislav Petkov Reviewed-by: Steven Rostedt (VMware) Tested-by: Ivan Babrou Cc: sta...@kernel.org Link: https://lkml.kernel.org/r/9583327904ebbbeda399eca9c56d6c7085ac20fe.1612534649.git.jpoim...@redhat.com --- arch/x86/kernel/unwind_orc.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 2a1d47f..1bcc14c 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -367,8 +367,8 @@ static bool deref_stack_regs(struct unwind_state *state, unsigned long addr, if (!stack_access_ok(state, addr, sizeof(struct pt_regs))) return false; - *ip = regs->ip; - *sp = regs->sp; + *ip = READ_ONCE_NOCHECK(regs->ip); + *sp = READ_ONCE_NOCHECK(regs->sp); return true; } @@ -380,8 +380,8 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr if (!stack_access_ok(state, addr, IRET_FRAME_SIZE)) return false; - *ip = regs->ip; - *sp = regs->sp; + *ip = READ_ONCE_NOCHECK(regs->ip); + *sp = READ_ONCE_NOCHECK(regs->sp); return true; } @@ -402,12 +402,12 @@ static bool get_reg(struct unwind_state *state, unsigned int reg_off, return false; if (state->full_regs) { - *val = ((unsigned long *)state->regs)[reg]; + *val = READ_ONCE_NOCHECK(((unsigned long *)state->regs)[reg]); return true; } if (state->prev_regs) { - *val = ((unsigned long *)state->prev_regs)[reg]; + *val = READ_ONCE_NOCHECK(((unsigned long *)state->prev_regs)[reg]); return true; }
[tip: x86/urgent] x86/unwind/orc: Silence warnings caused by missing ORC data
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: b59cc97674c947861783ca92b9a6e7d043adba96 Gitweb: https://git.kernel.org/tip/b59cc97674c947861783ca92b9a6e7d043adba96 Author:Josh Poimboeuf AuthorDate:Fri, 05 Feb 2021 08:24:03 -06:00 Committer: Borislav Petkov CommitterDate: Sat, 06 Mar 2021 13:09:45 +01:00 x86/unwind/orc: Silence warnings caused by missing ORC data The ORC unwinder attempts to fall back to frame pointers when ORC data is missing for a given instruction. It sets state->error, but then tries to keep going as a best-effort type of thing. That may result in further warnings if the unwinder gets lost. Until we have some way to register generated code with the unwinder, missing ORC will be expected, and occasionally going off the rails will also be expected. So don't warn about it. Signed-off-by: Josh Poimboeuf Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Borislav Petkov Tested-by: Ivan Babrou Link: https://lkml.kernel.org/r/06d02c4bbb220bd31668db579278b0352538efbb.1612534649.git.jpoim...@redhat.com --- arch/x86/kernel/unwind_orc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 1bcc14c..a120253 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -13,7 +13,7 @@ #define orc_warn_current(args...) \ ({ \ - if (state->task == current) \ + if (state->task == current && !state->error)\ orc_warn(args); \ })
[tip: x86/urgent] x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: 8bd7b3980ca62904814d536b3a2453001992a0c3 Gitweb: https://git.kernel.org/tip/8bd7b3980ca62904814d536b3a2453001992a0c3 Author:Josh Poimboeuf AuthorDate:Fri, 05 Feb 2021 08:24:02 -06:00 Committer: Borislav Petkov CommitterDate: Sat, 06 Mar 2021 11:37:00 +01:00 x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2 KASAN reserves "redzone" areas between stack frames in order to detect stack overruns. A read or write to such an area triggers a KASAN "stack-out-of-bounds" BUG. Normally, the ORC unwinder stays in-bounds and doesn't access the redzone. But sometimes it can't find ORC metadata for a given instruction. This can happen for code which is missing ORC metadata, or for generated code. In such cases, the unwinder attempts to fall back to frame pointers, as a best-effort type thing. This fallback often works, but when it doesn't, the unwinder can get confused and go off into the weeds into the KASAN redzone, triggering the aforementioned KASAN BUG. But in this case, the unwinder's confusion is actually harmless and working as designed. It already has checks in place to prevent off-stack accesses, but those checks get short-circuited by the KASAN BUG. And a BUG is a lot more disruptive than a harmless unwinder warning. Disable the KASAN checks by using READ_ONCE_NOCHECK() for all stack accesses. This finishes the job started by commit 881125bfe65b ("x86/unwind: Disable KASAN checking in the ORC unwinder"), which only partially fixed the issue. 
Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Reported-by: Ivan Babrou Signed-off-by: Josh Poimboeuf Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Steven Rostedt (VMware) Tested-by: Ivan Babrou Cc: sta...@kernel.org Link: https://lkml.kernel.org/r/9583327904ebbbeda399eca9c56d6c7085ac20fe.1612534649.git.jpoim...@redhat.com --- arch/x86/kernel/unwind_orc.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 2a1d47f..1bcc14c 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -367,8 +367,8 @@ static bool deref_stack_regs(struct unwind_state *state, unsigned long addr, if (!stack_access_ok(state, addr, sizeof(struct pt_regs))) return false; - *ip = regs->ip; - *sp = regs->sp; + *ip = READ_ONCE_NOCHECK(regs->ip); + *sp = READ_ONCE_NOCHECK(regs->sp); return true; } @@ -380,8 +380,8 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr if (!stack_access_ok(state, addr, IRET_FRAME_SIZE)) return false; - *ip = regs->ip; - *sp = regs->sp; + *ip = READ_ONCE_NOCHECK(regs->ip); + *sp = READ_ONCE_NOCHECK(regs->sp); return true; } @@ -402,12 +402,12 @@ static bool get_reg(struct unwind_state *state, unsigned int reg_off, return false; if (state->full_regs) { - *val = ((unsigned long *)state->regs)[reg]; + *val = READ_ONCE_NOCHECK(((unsigned long *)state->regs)[reg]); return true; } if (state->prev_regs) { - *val = ((unsigned long *)state->prev_regs)[reg]; + *val = READ_ONCE_NOCHECK(((unsigned long *)state->prev_regs)[reg]); return true; }
[tip: x86/urgent] x86/unwind/orc: Silence warnings caused by missing ORC data
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: d072f941c1e234f8495cc4828370b180318bf49b Gitweb: https://git.kernel.org/tip/d072f941c1e234f8495cc4828370b180318bf49b Author:Josh Poimboeuf AuthorDate:Fri, 05 Feb 2021 08:24:03 -06:00 Committer: Borislav Petkov CommitterDate: Sat, 06 Mar 2021 11:37:00 +01:00 x86/unwind/orc: Silence warnings caused by missing ORC data The ORC unwinder attempts to fall back to frame pointers when ORC data is missing for a given instruction. It sets state->error, but then tries to keep going as a best-effort type of thing. That may result in further warnings if the unwinder gets lost. Until we have some way to register generated code with the unwinder, missing ORC will be expected, and occasionally going off the rails will also be expected. So don't warn about it. Signed-off-by: Josh Poimboeuf Signed-off-by: Peter Zijlstra (Intel) Tested-by: Ivan Babrou Link: https://lkml.kernel.org/r/06d02c4bbb220bd31668db579278b0352538efbb.1612534649.git.jpoim...@redhat.com --- arch/x86/kernel/unwind_orc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 1bcc14c..a120253 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -13,7 +13,7 @@ #define orc_warn_current(args...) \ ({ \ - if (state->task == current) \ + if (state->task == current && !state->error)\ orc_warn(args); \ })
Re: [PATCH] kbuild: rebuild GCC plugins when the compiler is upgraded
On Sat, Mar 06, 2021 at 11:18:31AM +0900, Masahiro Yamada wrote: > On Sat, Mar 6, 2021 at 10:28 AM Josh Poimboeuf wrote: > > > > On Thu, Mar 04, 2021 at 08:25:00PM -0600, Josh Poimboeuf wrote: > > > On Thu, Mar 04, 2021 at 03:37:14PM -0800, Linus Torvalds wrote: > > > > On Thu, Mar 4, 2021 at 3:20 PM Kees Cook wrote: > > > > > > > > > > This seems fine to me, but I want to make sure Josh has somewhere to > > > > > actually go with this. Josh, does this get you any closer? > > > > > > No, this doesn't seem to help me at all. > > > > > > > > It sounds like the plugins need to move to another location for > > > > > packaged kernels? > > > > > > > > Well, it might be worth extending the stuff that gets installed with > > > > /lib/modules// with enough information and > > > > infrastructure to then build any external modules. > > > > > > The gcc plugins live in scripts/, which get installed by "make > > > modules_install" already. So the plugins' source and makefiles are in > > > /lib/modules//build/scripts/gcc-plugins. > > > > > > So everything needed for building the plugins is already there. We just > > > need the kernel makefiles to rebuild the plugins locally, when building > > > an external module. > > > > This seems to work with very limited testing... Based on top of > > Masahiro's recent patch: > > > > > > https://lkml.kernel.org/r/cak7lnarhotnz3gavhgnyb4n-wyuboxc10a6zurh1odghxwd...@mail.gmail.com > > Is this a bad coding contest? > > I am not asking you to add ugly ifeq or whatever > hacks to say "this worked for me". > > Please feel free to do this in the fedora kernel, > but do not send it to upstream. > > Sorry, I really do not want to see hacks like this any more. Geez, that's a bit harsh. I never claimed to be a brilliant makefile coder. Do you have any constructive feedback for improving the patch? > Remember, how badly objtool was integrated in the build system, > and you even blocked me from fixing that.
I have no idea what you're talking about, nor how it would be relevant to this patch. -- Josh
Re: [PATCH] kbuild: rebuild GCC plugins when the compiler is upgraded
On Thu, Mar 04, 2021 at 08:25:00PM -0600, Josh Poimboeuf wrote: > On Thu, Mar 04, 2021 at 03:37:14PM -0800, Linus Torvalds wrote: > > On Thu, Mar 4, 2021 at 3:20 PM Kees Cook wrote: > > > > > > This seems fine to me, but I want to make sure Josh has somewhere to > > > actually go with this. Josh, does this get you any closer? > > No, this doesn't seem to help me at all. > > > > It sounds like the plugins need to move to another location for > > > packaged kernels? > > > > Well, it might be worth extending the stuff that gets installed with > > /lib/modules// with enough information and > > infrastructure to then build any external modules. > > The gcc plugins live in scripts/, which get installed by "make > modules_install" already. So the plugins' source and makefiles are in > /lib/modules//build/scripts/gcc-plugins. > > So everything needed for building the plugins is already there. We just > need the kernel makefiles to rebuild the plugins locally, when building > an external module. This seems to work with very limited testing... Based on top of Masahiro's recent patch: https://lkml.kernel.org/r/cak7lnarhotnz3gavhgnyb4n-wyuboxc10a6zurh1odghxwd...@mail.gmail.com From: Josh Poimboeuf Subject: [PATCH] gcc-plugins: Rebuild plugins in external module directory When building external kernel modules, the build system doesn't require the GCC version to match the version used to build the original kernel. In fact, most distros release the compiler and the kernel in separate packages, with separate release cadences. So it's not uncommon for mismatches to occur. But with GCC plugins enabled, that's no longer allowed: cc1: error: incompatible gcc/plugin versions cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so That error comes from the plugin's call to plugin_default_version_check(), which strictly enforces the GCC version.
The strict check makes sense, because there's nothing to prevent the GCC plugin ABI from changing, and it often does. Since plugins are tightly tied to the compiler version, just rebuild them locally in the external module directory, and then use the local version in the external module build. Reported-by: Ondrej Mosnacek Signed-off-by: Josh Poimboeuf --- Makefile | 1 + scripts/Makefile.gcc-plugins | 2 +- scripts/gcc-plugins/Makefile | 8 ++-- 3 files changed, 8 insertions(+), 3 deletions(-) diff --git a/Makefile b/Makefile index bc208886fcce..90c6656de224 100644 --- a/Makefile +++ b/Makefile @@ -1784,6 +1784,7 @@ prepare: echo " The kernel was built by: "$(CONFIG_CC_VERSION_TEXT); \ echo " You are using: $(CC_VERSION_TEXT)"; \ fi + $(Q)$(MAKE) $(build)=scripts/gcc-plugins PHONY += help help: diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins index 952e46876329..be4303678942 100644 --- a/scripts/Makefile.gcc-plugins +++ b/scripts/Makefile.gcc-plugins @@ -48,7 +48,7 @@ export DISABLE_ARM_SSP_PER_TASK_PLUGIN # All the plugin CFLAGS are collected here in case a build target needs to # filter them out of the KBUILD_CFLAGS. -GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y)) +GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(if $(KBUILD_EXTMOD),$(KBUILD_EXTMOD),$(objtree))/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y)) # The sancov_plugin.so is included via CFLAGS_KCOV, so it is removed here. 
GCC_PLUGINS_CFLAGS := $(filter-out %/sancov_plugin.so, $(GCC_PLUGINS_CFLAGS)) export GCC_PLUGINS_CFLAGS diff --git a/scripts/gcc-plugins/Makefile b/scripts/gcc-plugins/Makefile index b5487cce69e8..9f8e2ef3ab56 100644 --- a/scripts/gcc-plugins/Makefile +++ b/scripts/gcc-plugins/Makefile @@ -1,10 +1,14 @@ # SPDX-License-Identifier: GPL-2.0 -$(obj)/randomize_layout_plugin.so: $(objtree)/$(obj)/randomize_layout_seed.h +ifneq ($(KBUILD_EXTMOD),) +override obj := $(KBUILD_EXTMOD)/$(obj) +endif + +$(obj)/randomize_layout_plugin.so: $(objtree)/$(src)/randomize_layout_seed.h quiet_cmd_create_randomize_layout_seed = GENSEED $@ cmd_create_randomize_layout_seed = \ $(CONFIG_SHELL) $(srctree)/$(src)/gen-random-seed.sh $@ $(objtree)/include/generated/randomize_layout_hash.h -$(objtree)/$(obj)/randomize_layout_seed.h: FORCE +$(objtree)/$(src)/randomize_layout_seed.h: FORCE $(call if_changed,create_randomize_layout_seed) targets += randomize_layout_seed.h randomize_layout_hash.h -- 2.29.2
Re: [PATCH] x86: kprobes: orc: Fix ORC walks in kretprobes
On Fri, Mar 05, 2021 at 11:25:15AM -0800, Daniel Xu wrote: > > BTW, is this a regression? or CONFIG_UNWINDER_ORC has this issue before? > > It seems that the above commit just changed the default unwinder. This means > > OCR stack unwinder has this bug before that commit. > > I see your point -- I suppose it depends on point of view. Viewed from > userspace, a change in kernel defaults means that one kernel worked and > the next one didn't -- all without the user doing anything. Consider it > from the POV of a typical linux user who just takes whatever the distro > gives them and doesn't compile their own kernels. > > From the kernel point of view, you're also right. ORC didn't regress, it > was always broken for this particular use case. But as a primarily > userspace developer, I would consider this a kernel regression. Either way, if the bug has always existed in the ORC unwinder, the Fixes tag needs to reference the original ORC commit: Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") That makes it possible for stable kernels to identify the source of the bug and corresponding fix. Many people used the ORC unwinder before it became the default. -- Josh
Re: [PATCH RFC] kbuild: Prevent compiler mismatch with external modules
On Sat, Mar 06, 2021 at 01:28:22AM +0900, Masahiro Yamada wrote: > > +orig_name := $(if $(CONFIG_CC_IS_GCC),GCC,CLANG) > > +orig_minor := $(shell expr $(if > > $(CONFIG_CC_IS_GCC),$(CONFIG_GCC_VERSION),$(CONFIG_CLANG_VERSION)) / 100) > > +cur_namever := $(shell $(srctree)/scripts/cc-version.sh $(CC)) > > +cur_name:= $(word 1,$(cur_namever)) > > +cur_minor := $(shell expr $(word 2,$(cur_namever)) / 100) > > These are still calculated by 'make M=... clean' or 'make M=... help'. > Using '=' assignment solves it, but the code is still ugly. > > > I attached my alternative implementation. Thanks for the attached patch, yours looks much cleaner. Looks like it warns on *any* mismatch, rather than just a major.minor mismatch. But that's ok with me. Acked-by: Josh Poimboeuf -- Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Sat, Mar 06, 2021 at 01:03:32AM +0900, Masahiro Yamada wrote: > > Ok. So it sounds like the best/easiest option is the original patch in > > this thread: when building an external module with a GCC mismatch, just > > disable the GCC plugin, with a warning (or an error for randstruct). > > Just for clarification, > I believe "the original patch" pointed to this one: > https://lore.kernel.org/lkml/efe6b039a544da8215d5e54aa7c4b6d1986fc2b0.1611607264.git.jpoim...@redhat.com/ > > This is dead. Please do not come back to this. Sorry, no. The patch may have been crap, but that doesn't make the problem I'm trying to solve any less valid. > See negative comments not only from me, but also from Greg, Peter, > Christoph. I responded to those. Summarizing my replies once again... - "External modules aren't supported" This doesn't even remotely match reality. Are you honestly using this "negative comment" as a reason to NAK the patch? - "External modules must be built with the same GCC version" As has been stated repeatedly, by Linus and others, there's no technical reason behind this claim. It ignores the realities of how distros release the kernel and compiler independently, with separate cadences. Minor variances in compiler version are ABI compatible. Also, for features which are dependent on compiler version, many of those are now enabled by kbuild. As I suggested to you previously, kbuild should warn when such features get disabled (which can happen due to a compiler/toolchain change or due to a .config copied from another system). -- Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Thu, Mar 04, 2021 at 11:12:42AM -0800, Linus Torvalds wrote: > On Thu, Mar 4, 2021 at 7:36 AM Masahiro Yamada wrote: > > > > All the kernel-space objects are rebuilt > > when the compiler is upgraded. > > I very much NAK'ed that one. Why did that go in? > > Or maybe I NAK'ed another version of it (I think the one I NAK'ed was > from Josh), and didn't realize that there were multiple ones. This thread is making me dizzy, but I think the patch you NAK'ed from me was different. It just added an error on GCC mismatch with external modules: https://lkml.kernel.org/r/fff056a7c9e6050c2d60910f70b6d99602f3bec4.1611863075.git.jpoim...@redhat.com though I think you were ok with downgrading it to a warning. -- Josh
Re: [PATCH] kbuild: rebuild GCC plugins when the compiler is upgraded
On Thu, Mar 04, 2021 at 03:37:14PM -0800, Linus Torvalds wrote: > On Thu, Mar 4, 2021 at 3:20 PM Kees Cook wrote: > > > > This seems fine to me, but I want to make sure Josh has somewhere to > > actually go with this. Josh, does this get you any closer? No, this doesn't seem to help me at all. > > It sounds like the plugins need to move to another location for > > packaged kernels? > > Well, it might be worth extending the stuff that gets installed with > /lib/modules// with enough information and > infrastructure to then build any external modules. The gcc plugins live in scripts/, which get installed by "make modules_install" already. So the plugins' source and makefiles are in /lib/modules//build/scripts/gcc-plugins. So everything needed for building the plugins is already there. We just need the kernel makefiles to rebuild the plugins locally, when building an external module. -- Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Thu, Mar 04, 2021 at 09:27:28PM +0900, Masahiro Yamada wrote: > I agree with rebuilding GCC plugins when the compiler is upgraded > for *in-tree* building. > Linus had reported it a couple of months before, > and I just submitted a very easy fix. Hm? So does that mean that a GCC version change won't trigger a tree-wide rebuild? So you're asserting that a GCC mismatch is ok for in-tree code, but not for external modules??? That seems backwards. For in-tree, why not just rebuild the entire tree? Some kernel features are dependent on compiler version or capability, so not rebuilding the tree could introduce silent breakage. For external modules, a tree-wide rebuild isn't an option so the risk is assumed by the user. I posted a patch earlier [1] which prints a warning if the compiler major/minor version changes with an external module build. [1] https://lkml.kernel.org/r/20210201211322.t2rxmvnrystc2ky7@treble > Rebuilding plugins for external modules is not easy; > plugins are placed in the read-only directory, > /usr/src/linux-headers-$(uname -r)/scripts/gcc-plugins/. > > The external modules must not (cannot) update in-tree > build artifacts. "Rebuild" means creating copies in a different > writable directory. > Doing that requires a lot of design changes. Ok. So it sounds like the best/easiest option is the original patch in this thread: when building an external module with a GCC mismatch, just disable the GCC plugin, with a warning (or an error for randstruct). -- Josh
[tip: x86/urgent] x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     38b6eb474ed2df3d159396c3d4312c8a7b2d5196
Gitweb:        https://git.kernel.org/tip/38b6eb474ed2df3d159396c3d4312c8a7b2d5196
Author:        Josh Poimboeuf
AuthorDate:    Fri, 05 Feb 2021 08:24:02 -06:00
Committer:     Peter Zijlstra
CommitterDate: Wed, 03 Mar 2021 16:56:29 +01:00

x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2

KASAN reserves "redzone" areas between stack frames in order to detect
stack overruns. A read or write to such an area triggers a KASAN
"stack-out-of-bounds" BUG.

Normally, the ORC unwinder stays in-bounds and doesn't access the
redzone. But sometimes it can't find ORC metadata for a given
instruction. This can happen for code which is missing ORC metadata, or
for generated code. In such cases, the unwinder attempts to fall back
to frame pointers, as a best-effort type thing.

This fallback often works, but when it doesn't, the unwinder can get
confused and go off into the weeds into the KASAN redzone, triggering
the aforementioned KASAN BUG.

But in this case, the unwinder's confusion is actually harmless and
working as designed. It already has checks in place to prevent
off-stack accesses, but those checks get short-circuited by the KASAN
BUG. And a BUG is a lot more disruptive than a harmless unwinder
warning.

Disable the KASAN checks by using READ_ONCE_NOCHECK() for all stack
accesses. This finishes the job started by commit 881125bfe65b
("x86/unwind: Disable KASAN checking in the ORC unwinder"), which only
partially fixed the issue.

Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder")
Reported-by: Ivan Babrou
Signed-off-by: Josh Poimboeuf
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Steven Rostedt (VMware)
Tested-by: Ivan Babrou
Cc: sta...@kernel.org
Link: https://lkml.kernel.org/r/9583327904ebbbeda399eca9c56d6c7085ac20fe.1612534649.git.jpoim...@redhat.com
---
 arch/x86/kernel/unwind_orc.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 2a1d47f..1bcc14c 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -367,8 +367,8 @@ static bool deref_stack_regs(struct unwind_state *state, unsigned long addr,
 	if (!stack_access_ok(state, addr, sizeof(struct pt_regs)))
 		return false;

-	*ip = regs->ip;
-	*sp = regs->sp;
+	*ip = READ_ONCE_NOCHECK(regs->ip);
+	*sp = READ_ONCE_NOCHECK(regs->sp);

 	return true;
 }
@@ -380,8 +380,8 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr
 	if (!stack_access_ok(state, addr, IRET_FRAME_SIZE))
 		return false;

-	*ip = regs->ip;
-	*sp = regs->sp;
+	*ip = READ_ONCE_NOCHECK(regs->ip);
+	*sp = READ_ONCE_NOCHECK(regs->sp);

 	return true;
 }
@@ -402,12 +402,12 @@ static bool get_reg(struct unwind_state *state, unsigned int reg_off,
 		return false;

 	if (state->full_regs) {
-		*val = ((unsigned long *)state->regs)[reg];
+		*val = READ_ONCE_NOCHECK(((unsigned long *)state->regs)[reg]);
 		return true;
 	}

 	if (state->prev_regs) {
-		*val = ((unsigned long *)state->prev_regs)[reg];
+		*val = READ_ONCE_NOCHECK(((unsigned long *)state->prev_regs)[reg]);
 		return true;
 	}
[tip: x86/urgent] x86/unwind/orc: Silence warnings caused by missing ORC data
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     86402dcc894951c0a363b6aee12d955ff923b35e
Gitweb:        https://git.kernel.org/tip/86402dcc894951c0a363b6aee12d955ff923b35e
Author:        Josh Poimboeuf
AuthorDate:    Fri, 05 Feb 2021 08:24:03 -06:00
Committer:     Peter Zijlstra
CommitterDate: Wed, 03 Mar 2021 16:56:30 +01:00

x86/unwind/orc: Silence warnings caused by missing ORC data

The ORC unwinder attempts to fall back to frame pointers when ORC data
is missing for a given instruction. It sets state->error, but then
tries to keep going as a best-effort type of thing. That may result in
further warnings if the unwinder gets lost.

Until we have some way to register generated code with the unwinder,
missing ORC will be expected, and occasionally going off the rails will
also be expected. So don't warn about it.

Signed-off-by: Josh Poimboeuf
Signed-off-by: Peter Zijlstra (Intel)
Tested-by: Ivan Babrou
Link: https://lkml.kernel.org/r/06d02c4bbb220bd31668db579278b0352538efbb.1612534649.git.jpoim...@redhat.com
---
 arch/x86/kernel/unwind_orc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 1bcc14c..a120253 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -13,7 +13,7 @@
 #define orc_warn_current(args...)				\
 ({								\
-	if (state->task == current)				\
+	if (state->task == current && !state->error)		\
 		orc_warn(args);					\
 })
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Wed, Mar 03, 2021 at 12:56:52PM -0800, Linus Torvalds wrote:
> On Wed, Mar 3, 2021 at 12:24 PM Josh Poimboeuf wrote:
> >
> > Your nack is for a different reason: GCC plugins are second-class
> > citizens. Fair enough...
>
> MNo, I didn't NAK it. Quite the reverser.
>
> I am ABSOLUTELY against rebuilding normal object files just because
> gcc versions change. A compiler version change makes zero difference
> for any normal object file.
>
> But the gcc plugins are different. They very much _are_ tied to a
> particular gcc version.
>
> Now, they are tied to a particular gcc version because they are
> horribly badly done, and bad technology, and I went off on a bit of a
> rant about just how bad they are, but the point is that gcc plugins
> depend on the exact gcc version in ways that normal object files do
> _not_.

Thanks, reading comprehension is hard. I realized after re-reading that
I interpreted your "plugins should depend on the kernel version"
statement too broadly.

Masahiro, any idea how I can make the GCC version a build dependency?

--
Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Wed, Mar 03, 2021 at 11:25:34AM -0800, Linus Torvalds wrote:
> On Wed, Mar 3, 2021 at 11:15 AM Josh Poimboeuf wrote:
> >
> > Adding Linus, who indicated in another thread that we shouldn't force
> > exact GCC versions because there's no technical reason to do so.
>
> I do not believe we should recompile everything just because the gcc
> version changes.
>
> But gcc _plugins_ certainly should depend on the kernel version.
>
> Very few people should be enabling the gcc plugins in the first place.
> Honestly, most of them are bad, and the people who really care about
> those things have already moved to clang which does the important
> parts natively without the need for a plugin. I'm personally waiting
> for the day when we can just say "let's remove them".

You might be sad to learn that some of the plugins are useful for
hardening of a production distro kernel, like stackleak and structleak.

> But in the meantime, making the plugins depend on the gcc version some
> way is certainly better than not doing so.

Currently, the plugins already do that. They require the GCC version to
be exact. If there's a mismatch, then it fails the OOT module build.

But that's not usable for a distro. When users build OOT modules with a
slight GCC mismatch, it breaks the build, effectively requiring the
exact same GCC version for *all* OOT builds going forward.

So there have been a few proposals to better handle GCC version
mismatches:

1) disable the plugin - this works fine for most plugins except
   randstruct

2) rebuild the plugin whenever the GCC version changes

3) fail the build, like today - effectively blocks distros from using
   plugins

--
Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Wed, Mar 03, 2021 at 02:24:12PM -0600, Josh Poimboeuf wrote:
> On Wed, Mar 03, 2021 at 11:57:33AM -0800, Linus Torvalds wrote:
> > On Wed, Mar 3, 2021 at 11:38 AM Josh Poimboeuf wrote:
> > >
> > > > But in the meantime, making the plugins depend on the gcc version some
> > > > way is certainly better than not doing so.
> > >
> > > So currently, the plugins already so that. They require the GCC version
> > > to be exact. If there's a mismatch, then it fails the OOT module build.
> >
> > That's not my experience.
> >
> > Yes, the build fails, but it fails not by _rebuilding_, but by failing
> > with an error.
>
> Um, that's what I said. It does not rebuild. It fails with an error.
>
> The *proposal* is to rebuild the plugin -- which Masahiro nacked because
> he claims GCC mismatches aren't supported for OOT builds (plugin or
> not).
>
> Your nack is for a different reason: GCC plugins are second-class
> citizens. Fair enough...

Or was it a nack? :-) Reading your message again, I may have
misinterpreted.

Put simply, should we rebuild plugins when the GCC version changes?

--
Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Wed, Mar 03, 2021 at 11:57:33AM -0800, Linus Torvalds wrote:
> On Wed, Mar 3, 2021 at 11:38 AM Josh Poimboeuf wrote:
> >
> > > But in the meantime, making the plugins depend on the gcc version some
> > > way is certainly better than not doing so.
> >
> > So currently, the plugins already so that. They require the GCC version
> > to be exact. If there's a mismatch, then it fails the OOT module build.
>
> That's not my experience.
>
> Yes, the build fails, but it fails not by _rebuilding_, but by failing
> with an error.

Um, that's what I said. It does not rebuild. It fails with an error.

The *proposal* is to rebuild the plugin -- which Masahiro nacked because
he claims GCC mismatches aren't supported for OOT builds (plugin or
not).

Your nack is for a different reason: GCC plugins are second-class
citizens. Fair enough...

--
Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Thu, Mar 04, 2021 at 03:49:35AM +0900, Masahiro Yamada wrote:
> > This problem is becoming more prevalent. We will need to fix it one way
> > or another, if we want to support distro adoption of these GCC
> > plugin-based features.
> >
> > Frank suggested a possibly better idea: always rebuild the plugins when
> > the GCC version changes.
>
> That is just another form of the previous patch,
> which was already rejected.
>
> - That is a hack just for external modules
> - Our consensus is, use the same version for the kernel and external modules
>
> I use Ubuntu, and I do not see such a problem.
> (I have never seen it on Debian either, except sid.)
>
> I see Fedora providing GCC whose version is different
> from the one used for building the kernel.
> That is a problem on the distribution side.

I don't understand. Are you suggesting that a distro should always
release GCC and kernel updates simultaneously? How is this problem
specific to Fedora? Please be specific about what Ubuntu and Debian do,
which Fedora doesn't do.

Adding Linus, who indicated in another thread that we shouldn't force
exact GCC versions because there's no technical reason to do so.

--
Josh
Re: [PATCH 0/3] objtool: OBJTOOL_ARGS and --backup
On Fri, Feb 26, 2021 at 11:57:42AM +0100, Peter Zijlstra wrote:
> Boris asked for an environment variable to have objtool preserve the
> original object file so that it becomes trivial to inspect what
> actually changed.

I might bikeshed the suffix ".o.orig" instead of ".obj".

Acked-by: Josh Poimboeuf

--
Josh
Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules
On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote:
> When building out-of-tree kernel modules, the build system doesn't
> require the GCC version to match the version used to build the original
> kernel. That's probably [1] fine.
>
> In fact, for many distros, the version of GCC used to build the latest
> kernel doesn't necessarily match the latest released GCC, so a GCC
> mismatch turns out to be pretty common. And with CONFIG_MODVERSIONS
> it's probably more common.
>
> So a lot of users have come to rely on being able to use a different
> version of GCC when building OOT modules.
>
> But with GCC plugins enabled, that's no longer allowed:
>
>   cc1: error: incompatible gcc/plugin versions
>   cc1: error: failed to initialize plugin
>   ./scripts/gcc-plugins/structleak_plugin.so
>
> That error comes from the plugin's call to
> plugin_default_version_check(), which strictly enforces the GCC version.
> The strict check makes sense, because there's nothing to prevent the GCC
> plugin ABI from changing -- and it often does.
>
> But failing the build isn't necessary. For most plugins, OOT modules
> will otherwise work just fine without the plugin instrumentation.
>
> When a GCC version mismatch is detected, print a warning and disable the
> plugin. The only exception is the RANDSTRUCT plugin which needs all
> code to see the same struct layouts. In that case print an error.

Hi Masahiro,

This problem is becoming more prevalent. We will need to fix it one way
or another, if we want to support distro adoption of these GCC
plugin-based features.

Frank suggested a possibly better idea: always rebuild the plugins when
the GCC version changes.

What do you think? Any suggestions on how to implement that? Otherwise
I can try to hack something together.

--
Josh