Re: [PATCH v2 05/14] objtool: Per arch retpoline naming

2021-03-19 Thread Peter Zijlstra
On Thu, Mar 18, 2021 at 09:38:14PM -0500, Josh Poimboeuf wrote: > On Thu, Mar 18, 2021 at 06:11:08PM +0100, Peter Zijlstra wrote: > > @@ -872,7 +877,7 @@ static int add_jump_destinations(struct > > } else if (reloc->sym->type == STT_SECTION) { > >

Re: [PATCH v2 14/14] objtool,x86: Rewrite retpoline thunk calls

2021-03-19 Thread Peter Zijlstra
On Thu, Mar 18, 2021 at 10:29:55PM -0500, Josh Poimboeuf wrote: > On Thu, Mar 18, 2021 at 06:11:17PM +0100, Peter Zijlstra wrote: > > When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an > > indirect call, have objtool rewrite it to: > > > > AL

Re: [PATCH v2 11/14] objtool: Add elf_create_undef_symbol()

2021-03-19 Thread Peter Zijlstra
On Thu, Mar 18, 2021 at 09:29:23PM -0500, Josh Poimboeuf wrote: > On Thu, Mar 18, 2021 at 06:11:14PM +0100, Peter Zijlstra wrote: > > Allow objtool to create undefined symbols; this allows creating > > relocations to symbols not currently in the symbol table. > > > > Si

Re: [PATCH v2 01/17] add support for Clang CFI

2021-03-18 Thread Peter Zijlstra
On Thu, Mar 18, 2021 at 10:10:55AM -0700, Sami Tolvanen wrote: > +static void update_shadow(struct module *mod, unsigned long base_addr, > + update_shadow_fn fn) > +{ > + struct cfi_shadow *prev; > + struct cfi_shadow *next; > + unsigned long min_addr, max_addr; > + > +

[RFC][PATCH] sched: Optimize cpufreq_update_util

2021-03-18 Thread Peter Zijlstra
Hi, The below replaces cpufreq_update_util()'s indirect call with a static_call(). The patch is quite gross, and we definitely need static_call_update_cpuslocked(). cpufreq folks, is there a better way to do that optimize pass? That is, we need to know when all CPUs have the *same* function set

[PATCH v2 14/14] objtool,x86: Rewrite retpoline thunk calls

2021-03-18 Thread Peter Zijlstra
to not emit endless identical .altinst_replacement chunks, use a global symbol for them, see __x86_indirect_alt_*. This also avoids objtool from having to do code generation. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/asm-prototypes.h | 12 ++- arch/x86/lib/retpoline.S | 33

[PATCH v2 06/14] objtool: Fix static_call list generation

2021-03-18 Thread Peter Zijlstra
Currently objtool generates tail call entries in add_jump_destinations() but waits until validate_branch() to generate the regular call entries. Move these to add_call_destinations() for consistency. Signed-off-by: Peter Zijlstra (Intel) --- tools/objtool/check.c | 18 +- 1 file

[PATCH v2 11/14] objtool: Add elf_create_undef_symbol()

2021-03-18 Thread Peter Zijlstra
Allow objtool to create undefined symbols; this allows creating relocations to symbols not currently in the symbol table. Signed-off-by: Peter Zijlstra (Intel) --- tools/objtool/elf.c | 63 tools/objtool/include/objtool/elf.h |1 2

[PATCH v2 08/14] objtool: Add elf_create_reloc() helper

2021-03-18 Thread Peter Zijlstra
We have 4 instances of adding a relocation. Create a common helper to avoid growing even more. Signed-off-by: Peter Zijlstra (Intel) --- tools/objtool/check.c | 76 ++-- tools/objtool/elf.c | 90

[PATCH v2 05/14] objtool: Per arch retpoline naming

2021-03-18 Thread Peter Zijlstra
The __x86_indirect_ naming is obviously not generic. Shorten to allow matching some additional magic names later. Signed-off-by: Peter Zijlstra (Intel) --- tools/objtool/arch/x86/decode.c |5 + tools/objtool/check.c|7 ++- tools/objtool/include/objtool

[PATCH v2 09/14] objtool: Extract elf_strtab_concat()

2021-03-18 Thread Peter Zijlstra
Create a common helper to append strings to a strtab. Signed-off-by: Peter Zijlstra (Intel) --- tools/objtool/elf.c | 73 +--- 1 file changed, 42 insertions(+), 31 deletions(-) --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -676,13
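As a rough illustration of what "append strings to a strtab" involves, here is a minimal userspace sketch. It is not objtool's code: `struct strtab` and `strtab_concat()` are hypothetical names, and the real helper works on libelf section data rather than a plain heap buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string table; not objtool's actual data structure. */
struct strtab {
	char *data;	/* concatenated NUL-terminated strings */
	size_t len;	/* bytes currently in use */
};

/*
 * Append @str, including its trailing NUL, to @tab.
 * Returns the offset at which the string starts, or -1 if allocation fails.
 */
static long strtab_concat(struct strtab *tab, const char *str)
{
	size_t n = strlen(str) + 1;
	char *p = realloc(tab->data, tab->len + n);
	long off;

	if (!p)
		return -1;

	memcpy(p + tab->len, str, n);
	off = (long)tab->len;
	tab->data = p;
	tab->len += n;
	return off;
}
```

The returned offset is what a symbol's name field would then refer to. Appending the empty string first gives the table the leading NUL byte that ELF string tables start with.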

[PATCH v2 12/14] objtool: Allow archs to rewrite retpolines

2021-03-18 Thread Peter Zijlstra
When retpolines are employed, compilers typically emit calls to retpoline thunks. Objtool recognises these calls and marks them as dynamic calls. Provide infrastructure for architectures to rewrite/augment what the compiler wrote for us. Signed-off-by: Peter Zijlstra (Intel) --- tools/objtool

[PATCH v2 10/14] objtool: Extract elf_symbol_add()

2021-03-18 Thread Peter Zijlstra
Create a common helper to add symbols. Signed-off-by: Peter Zijlstra (Intel) --- tools/objtool/elf.c | 57 ++-- 1 file changed, 33 insertions(+), 24 deletions(-) --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -290,12 +290,41 @@ static

[PATCH v2 13/14] objtool: Skip magical retpoline .altinstr_replacement

2021-03-18 Thread Peter Zijlstra
When the .altinstr_replacement is a retpoline, skip the alternative. We already special case retpolines anyway. Signed-off-by: Peter Zijlstra (Intel) --- tools/objtool/special.c | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) --- a/tools/objtool/special.c +++ b/tools

[PATCH v2 03/14] x86/retpoline: Simplify retpolines

2021-03-18 Thread Peter Zijlstra
r_replacement+0x5> c: 48 89 04 24 mov %rax,(%rsp) 10: c3 retq 17 bytes, we have 15 bytes NOP at the end of our 32 byte slot. (IOW, if we can shrink the retpoline by 1 byte we can pack it more dense) Signed-off-by: Peter Zijlstra (Intel) --- arch/

[PATCH v2 02/14] x86/alternatives: Optimize optimize_nops()

2021-03-18 Thread Peter Zijlstra
string of NOPs inside the alternative to larger NOPs. Also run it irrespective of patching, replacing NOPs in both the original and replaced code. A direct consequence is that padlen becomes superfluous, so remove it. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/alternative.h

[PATCH v2 04/14] objtool: Correctly handle retpoline thunk calls

2021-03-18 Thread Peter Zijlstra
Just like JMP handling, convert a direct CALL to a retpoline thunk into a retpoline safe indirect CALL. Signed-off-by: Peter Zijlstra (Intel) --- tools/objtool/check.c | 12 1 file changed, 12 insertions(+) --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -953,6

[PATCH v2 01/14] x86: Add insn_decode_kernel()

2021-03-18 Thread Peter Zijlstra
Add a helper to decode kernel instructions; there's no point in endlessly repeating those last two arguments. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/insn.h|2 ++ arch/x86/kernel/alternative.c |2 +- arch/x86/kernel/cpu/mce/severity.c |2 +-

[PATCH v2 00/14] x86,objtool: Optimize !RETPOLINE

2021-03-18 Thread Peter Zijlstra
Hi, Respin of the !RETPOLINE optimization patches. Boris, the first 3 should probably go into tip/x86/core, it's an ungodly tangle since it relies on the insn decoder patches in tip/x86/core, the NOP patches in tip/x86/cpu and the alternative patches in tip/x86/alternatives. Just to make life ea

[PATCH v2 07/14] objtool: Rework rebuild_reloc logic

2021-03-18 Thread Peter Zijlstra
Instead of manually calling elf_rebuild_reloc_section() on sections we've called elf_add_reloc() on, have elf_write() DTRT. This makes it easier to add random relocations in places without carefully tracking when we're done and need to flush what section. Signed-off-by: Peter Zijls

Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic

2021-03-18 Thread Peter Zijlstra
On Thu, Mar 18, 2021 at 11:36:40AM -0500, Josh Poimboeuf wrote: > > I was thinking you could get a section changed without touching > > relocations, but while that is theoretically possible, it is exceedingly > > unlikely (and objtool doesn't do that). > > Hm? This is a *relocation* section, not

Re: [PATCH 3/3] static_call: Fix static_call_update() sanity check

2021-03-18 Thread Peter Zijlstra
On Thu, Mar 18, 2021 at 11:13:08AM -0500, Josh Poimboeuf wrote: > On Thu, Mar 18, 2021 at 12:31:59PM +0100, Peter Zijlstra wrote: > > if (!kernel_text_address((unsigned long)site_addr)) { > > - WARN_ONCE(1, "can't patch sta

Re: [PATCH v4 1/4] sched/fair: Introduce primitives for CFS bandwidth burst

2021-03-18 Thread Peter Zijlstra
On Thu, Mar 18, 2021 at 08:59:44AM -0400, Phil Auld wrote: > I admit to not having followed all the history of this patch set. That > said, when I see the above I just think your quota is too low for your > workload. This. > The burst (mis?)feature seems to be a way to bypass the quota. And it >

Re: [PATCH v4 1/4] sched/fair: Introduce primitives for CFS bandwidth burst

2021-03-18 Thread Peter Zijlstra
On Thu, Mar 18, 2021 at 09:26:58AM +0800, changhuaixin wrote: > > On Mar 17, 2021, at 4:06 PM, Peter Zijlstra wrote: > > So what is the typical avg,stdev,max and mode for the workloads where you > > find > > you need this? > > > > I would really like to put

Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic

2021-03-18 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 07:49:17PM -0500, Josh Poimboeuf wrote: > On Wed, Mar 17, 2021 at 09:12:15AM +0100, Peter Zijlstra wrote: > > On Tue, Mar 16, 2021 at 10:34:17PM -0500, Josh Poimboeuf wrote: > > > On Fri, Mar 12, 2021 at 06:16:18PM +0100, Peter Zijlstra wrote: > >

Re: [PATCH 3/3] static_call: Fix static_call_update() sanity check

2021-03-18 Thread Peter Zijlstra
On Thu, Mar 18, 2021 at 12:31:59PM +0100, Peter Zijlstra wrote: > Sites that match init_section_contains() get marked as INIT. For > built-in code init_sections contains both __init and __exit text. OTOH > kernel_text_address() only explicitly includes __init text (and there > are no

[PATCH 0/3] static_call() vs __exit fixes

2021-03-18 Thread Peter Zijlstra
Hi, After more poking a new set of patches to fix static_call() vs __exit functions. These patches replace the patch I posted yesterday: https://lkml.kernel.org/r/yfh6br61b5gk8...@hirez.programming.kicks-ass.net Since I've reproduced the problem locally, and these patches do seem to fully cure

[PATCH 3/3] static_call: Fix static_call_update() sanity check

2021-03-18 Thread Peter Zijlstra
INIT sites. Also see the excellent changelog for commit: 8f35eaa5f2de ("jump_label: Don't warn on __exit jump entries") Fixes: 9183c3f9ed710 ("static_call: Add inline static call infrastructure") Reported-by: Sumit Garg Signed-off-by: Peter Zijlstra (Intel) --- ke

[PATCH 2/3] static_call: Align static_call_is_init() patching condition

2021-03-18 Thread Peter Zijlstra
within_module_init() always fails, while jump_label relies on the module state which is more obvious and matches the kernel logic. Signed-off-by: Peter Zijlstra (Intel) --- kernel/static_call.c | 14 -- 1 file changed, 4 insertions(+), 10 deletions(-) --- a/kernel/static_call.c +++ b/kernel

[PATCH 1/3] static_call: Fix static_call_set_init()

2021-03-18 Thread Peter Zijlstra
It turns out that static_call_set_init() does not preserve the other flags; IOW, it clears TAIL if it was set. Fixes: 9183c3f9ed710 ("static_call: Add inline static call infrastructure") Reported-by: Sumit Garg Signed-off-by: Peter Zijlstra (Intel) --- kernel/static_cal
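The kind of bug described here, where setting one flag drops another, can be sketched in plain C. This is only an illustration under assumed names: `SITE_TAIL`, `SITE_INIT` and `SITE_FLAGS` are hypothetical flag bits packed into the low bits of a word, and neither function is the kernel's actual implementation.

```c
/* Hypothetical flag bits stored in the low bits of a key word. */
#define SITE_TAIL	1UL
#define SITE_INIT	2UL
#define SITE_FLAGS	3UL

/* Lossy variant: strips all flags before setting INIT, so TAIL is lost. */
static unsigned long set_init_lossy(unsigned long word)
{
	return (word & ~SITE_FLAGS) | SITE_INIT;
}

/* Preserving variant: ORs INIT into whatever flags are already set. */
static unsigned long set_init_preserving(unsigned long word)
{
	return word | SITE_INIT;
}
```

With a word that already carries the TAIL bit, the lossy variant returns it with only INIT set, while the preserving variant returns it with both bits set.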

Re: [PATCH] objtool,static_call: Don't emit static_call_site for .exit.text

2021-03-18 Thread Peter Zijlstra
On Thu, Mar 18, 2021 at 09:30:18AM +0100, Peter Zijlstra wrote: > On Thu, Mar 18, 2021 at 08:59:45AM +0100, Peter Zijlstra wrote: > > On Wed, Mar 17, 2021 at 07:02:12PM -0500, Josh Poimboeuf wrote: > > > On Wed, Mar 17, 2021 at 01:45:57PM +0100, Peter Zijlstra wrote: > >

Re: [PATCH] objtool,static_call: Don't emit static_call_site for .exit.text

2021-03-18 Thread Peter Zijlstra
On Thu, Mar 18, 2021 at 08:59:45AM +0100, Peter Zijlstra wrote: > On Wed, Mar 17, 2021 at 07:02:12PM -0500, Josh Poimboeuf wrote: > > On Wed, Mar 17, 2021 at 01:45:57PM +0100, Peter Zijlstra wrote: > > > arguably it simply isn't a good idea to use static_call() in __exit

Re: [PATCH] objtool,static_call: Don't emit static_call_site for .exit.text

2021-03-18 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 07:02:12PM -0500, Josh Poimboeuf wrote: > On Wed, Mar 17, 2021 at 01:45:57PM +0100, Peter Zijlstra wrote: > > arguably it simply isn't a good idea to use static_call() in __exit > > code anyway, since module unload is never a performance critical pat

Re: [PATCH 6/9] objtool: Add elf_create_undef_symbol()

2021-03-18 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 07:46:14PM -0500, Josh Poimboeuf wrote: > On Wed, Mar 17, 2021 at 03:13:43PM +0100, Peter Zijlstra wrote: > > On Wed, Mar 17, 2021 at 02:52:23PM +0100, Miroslav Benes wrote: > > > > > > + if (!elf_symbol_add(elf, sym, SHN_XINDEX)) {

Re: [tip: locking/urgent] locking/ww_mutex: Treat ww_mutex_lock() like a trylock

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 02:32:27PM -0400, Waiman Long wrote: > On 3/17/21 1:45 PM, Peter Zijlstra wrote: > > > +# define __DEP_MAP_WW_MUTEX_INITIALIZER(lockname, class) \ > > > + , .dep_map = { \ > > > + .key = &(class).mutex_key, \ >

Re: [tip: locking/urgent] locking/ww_mutex: Treat ww_mutex_lock() like a trylock

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 06:40:27PM +0100, Peter Zijlstra wrote: > On Wed, Mar 17, 2021 at 05:48:48PM +0100, Peter Zijlstra wrote: > > > I think you'll find that if you use ww_mutex_init() it'll all work. Let > > me go and zap this patch, and then I'll try and f

Re: [tip: locking/urgent] locking/ww_mutex: Treat ww_mutex_lock() like a trylock

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 05:48:48PM +0100, Peter Zijlstra wrote: > I think you'll find that if you use ww_mutex_init() it'll all work. Let > me go and zap this patch, and then I'll try and figure out why > DEFINE_WW_MUTEX() is buggered. Moo, I can't get the compiler

Re: [PATCH v2] smp: kernel/panic.c - silence warnings

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 06:17:26PM +0100, Christophe Leroy wrote: > > > On 17/03/2021 at 13:23, Peter Zijlstra wrote: > > On Wed, Mar 17, 2021 at 12:00:29PM +0100, Christophe Leroy wrote: > > > What do you mean? 'extern' prototype is pointless for function p

Re: [tip: locking/urgent] locking/ww_mutex: Treat ww_mutex_lock() like a trylock

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 11:35:12AM -0400, Waiman Long wrote: > From reading the source code, nest_lock check is done in check_deadlock() so > that it won't complain. However, nest_lock isn't considered in > check_noncircular() which causes the splat to come out. Maybe we should add > a check for n

Re: [PATCH -tip 0/3] x86/kprobes: Remoev single-step trap from x86 kprobes

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 11:55:22PM +0900, Masami Hiramatsu wrote: > Hi Andy, > > Would you think you still need this series to remove iret to kernel? They're an improvement anyway, let me queue them so that they don't get lost. I'll line them up for tip/x86/core unless anybody else thinks of a b

[tip: irq/core] tasklets: Replace spin wait in tasklet_unlock_wait()

2021-03-17 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the irq/core branch of tip: Commit-ID: da044747401fc16202e223c9da970ed4e84fd84d Gitweb: https://git.kernel.org/tip/da044747401fc16202e223c9da970ed4e84fd84d Author: Peter Zijlstra AuthorDate: Tue, 09 Mar 2021 09:42:08 +01:00

[tip: irq/core] tasklets: Replace spin wait in tasklet_kill()

2021-03-17 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the irq/core branch of tip: Commit-ID: 697d8c63c4a2991a22a896a5e6adcdbb28fefe56 Gitweb: https://git.kernel.org/tip/697d8c63c4a2991a22a896a5e6adcdbb28fefe56 Author: Peter Zijlstra AuthorDate: Tue, 09 Mar 2021 09:42:09 +01:00

Re: [tip: locking/urgent] locking/ww_mutex: Simplify use_ww_ctx & ww_ctx handling

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 10:10:16AM -0400, Waiman Long wrote: > On 3/17/21 9:55 AM, Peter Zijlstra wrote: > > On Wed, Mar 17, 2021 at 09:43:20AM -0400, Waiman Long wrote: > > > > > Using gcc 8.4.1, the generated __mutex_lock function has the same size > > > (with

Re: [PATCH 6/9] objtool: Add elf_create_undef_symbol()

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 02:52:23PM +0100, Miroslav Benes wrote: > > + if (!elf_symbol_add(elf, sym, SHN_XINDEX)) { > > + WARN("elf_symbol_add"); > > + return NULL; > > + } > > SHN_XINDEX means that the extended section index is used. Above you seem > to use it in the oppo

Re: [tip: locking/urgent] locking/ww_mutex: Simplify use_ww_ctx & ww_ctx handling

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 09:43:20AM -0400, Waiman Long wrote: > Using gcc 8.4.1, the generated __mutex_lock function has the same size (with > last instruction at offset +5179) with or without this patch. Well, you can > say that this patch is a no-op wrt generated code. OK, then GCC has gotten b

Re: [tip: locking/urgent] locking/ww_mutex: Treat ww_mutex_lock() like a trylock

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 02:12:41PM +0100, Peter Zijlstra wrote: > On Wed, Mar 17, 2021 at 12:38:21PM -, tip-bot2 for Waiman Long wrote: > > + /* > > +* Treat as trylock for ww_mutex. > > +*/ > > + mutex_acquire_nest(&lock->dep_map, subclass,

Re: [tip: locking/urgent] locking/ww_mutex: Treat ww_mutex_lock() like a trylock

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 12:38:21PM -, tip-bot2 for Waiman Long wrote: > The following commit has been merged into the locking/urgent branch of tip: > > Commit-ID: b058f2e4d0a70c060e21ed122b264e9649cad57f > Gitweb: > https://git.kernel.org/tip/b058f2e4d0a70c060e21ed122b264e9649cad57

Re: [tip: locking/urgent] locking/ww_mutex: Simplify use_ww_ctx & ww_ctx handling

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 12:38:22PM -, tip-bot2 for Waiman Long wrote: > The following commit has been merged into the locking/urgent branch of tip: > > Commit-ID: 5de2055d31ea88fd9ae9709ac95c372a505a60fa > Gitweb: > https://git.kernel.org/tip/5de2055d31ea88fd9ae9709ac95c372a505a60f

[PATCH] objtool,static_call: Don't emit static_call_site for .exit.text

2021-03-17 Thread Peter Zijlstra
__exit. --- Subject: objtool,static_call: Don't emit static_call_site for .exit.text From: Peter Zijlstra Date: Wed Mar 17 13:35:05 CET 2021 Functions marked __exit are (somewhat surprisingly) discarded at runtime when built-in. This means that static_call(), when used in __exit functions, wi

Re: [PATCH v2] smp: kernel/panic.c - silence warnings

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 12:00:29PM +0100, Christophe Leroy wrote: > What do you mean ? 'extern' prototype is pointless for function prototypes > and deprecated, no new function prototypes should be added with the 'extern' > keyword. > > checkpatch.pl tells you: "extern prototypes should be avoided

Re: [PATCH] sched: swait: use wake_up_process() instead of wake_up_state()

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 10:46:18AM +0100, Ingo Molnar wrote: > > * Mike Galbraith wrote: > > > On Tue, 2021-03-16 at 19:20 +0800, Wang Qing wrote: > > > Why not just use wake_up_process(). > > > > IMO this is not an improvement. There are other places where explicit > > TASK_NORMAL is used as

Re: [PATCH v23 00/28] Control-flow Enforcement: Shadow Stack

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 10:18:00AM +0100, Ingo Molnar wrote: > > * Yu, Yu-cheng wrote: > > > On 3/16/2021 2:15 PM, Peter Zijlstra wrote: > > > On Tue, Mar 16, 2021 at 08:10:26AM -0700, Yu-cheng Yu wrote: > > > > Control-flow Enforcement (CET) i

Re: unknown NMI on AMD Rome

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 09:48:29AM +0100, Ingo Molnar wrote: > > https://developer.amd.com/wp-content/resources/56323-PUB_0.78.pdf > > So: > > > 1215 IBS (Instruction Based Sampling) Counter Valid Value > May be Incorrect After Exit From Core C6 (CC6) State > > Description > > If a cor

Re: [PATCH 5/9] objtool: Rework rebuild_reloc logic

2021-03-17 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 10:34:17PM -0500, Josh Poimboeuf wrote: > On Fri, Mar 12, 2021 at 06:16:18PM +0100, Peter Zijlstra wrote: > > --- a/tools/objtool/elf.c > > +++ b/tools/objtool/elf.c > > @@ -479,6 +479,8 @@ void elf_add_reloc(struct elf *elf, stru > > >

Re: [PATCH v4 1/4] sched/fair: Introduce primitives for CFS bandwidth burst

2021-03-17 Thread Peter Zijlstra
On Wed, Mar 17, 2021 at 03:16:18PM +0800, changhuaixin wrote: > > Why do you allow such a large burst? I would expect something like: > > > > if (burst > quota) > > return -EINVAL; > > > > That limits the variance in the system. Allowing super long bursts seems > > to defeat the

Re: [PATCH v23 00/28] Control-flow Enforcement: Shadow Stack

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 08:10:26AM -0700, Yu-cheng Yu wrote: > Control-flow Enforcement (CET) is a new Intel processor feature that blocks > return/jump-oriented programming attacks. Details are in "Intel 64 and > IA-32 Architectures Software Developer's Manual" [1]. > > CET can protect applicati

Re: [PATCH v23 6/9] x86/entry: Introduce ENDBR macro

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 01:26:52PM -0700, Yu, Yu-cheng wrote: > Then, what about moving what I had earlier to vdso.h? > If we don't want __i386__ either, then make it two macros. vdso.h seems to use CONFIG_X86_{64,32} resp. > +.macro ENDBR > +#ifdef CONFIG_X86_CET And shouldn't that be CONFIG_X8

Re: [PATCH v23 6/9] x86/entry: Introduce ENDBR macro

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 01:05:30PM -0700, Yu, Yu-cheng wrote: > On 3/16/2021 12:57 PM, Peter Zijlstra wrote: > > On Tue, Mar 16, 2021 at 10:12:39AM -0700, Yu, Yu-cheng wrote: > > > Alternatively, there is another compiler-defined macro _CET_ENDBR that can > > > be used

Re: [PATCH 2/2] futex: Leave the pi lock stealer in a consistent state upon successful fault

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 11:03:05AM -0700, Davidlohr Bueso wrote: > On Tue, 16 Mar 2021, Peter Zijlstra wrote: > > > > IIRC we made the explicit choice to never loop here. That saves having > > to worry about getting stuck in in-kernel loops. > > > > Userspace tr

Re: [PATCH v23 6/9] x86/entry: Introduce ENDBR macro

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 10:12:39AM -0700, Yu, Yu-cheng wrote: > Alternatively, there is another compiler-defined macro _CET_ENDBR that can > be used. We can put the following in calling.h: > > #ifdef __CET__ > #include > #else > #define _CET_ENDBR > #endif > > and then use _CET_ENDBR in other f

Re: unknown NMI on AMD Rome

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 04:45:02PM +0100, Jiri Olsa wrote: > hi, > when running 'perf top' on AMD Rome (/proc/cpuinfo below) > with fedora 33 kernel 5.10.22-200.fc33.x86_64 > > we got unknown NMI messages: > > [ 226.700160] Uhhuh. NMI received for unknown reason 3d on CPU 90. > [ 226.700162] Do

Re: Reply: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti (x2019cwm) wrote: > But I don't think it's a good idea to handle this in callers, because > logically the function shouldn't return negative values. Returning 0 > directly would allow idle governors to get another chance to select > again. A: Beca

Re: [PATCH RFC v2 3/8] perf/core: Add support for event removal on exec

2021-03-16 Thread Peter Zijlstra
On Wed, Mar 10, 2021 at 11:41:34AM +0100, Marco Elver wrote: > Adds bit perf_event_attr::remove_on_exec, to support removing an event > from a task on exec. > > This option supports the case where an event is supposed to be > process-wide only, and should not propagate beyond exec, to limit > moni

Re: [PATCH 06/10] timer: Report ignored local enqueue in nohz mode

2021-03-16 Thread Peter Zijlstra
Rafael J. Wysocki > Signed-off-by: Frederic Weisbecker > Cc: Peter Zijlstra > Cc: Thomas Gleixner > Cc: Ingo Molnar > Cc: Paul E. McKenney > --- > kernel/sched/core.c | 20 +++- > 1 file changed, 19 insertions(+), 1 deletion(-) > > diff --git a/k

Re: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 02:37:03PM +0100, Frederic Weisbecker wrote: > On Tue, Mar 16, 2021 at 01:21:29PM +0100, Peter Zijlstra wrote: > > On Thu, Mar 11, 2021 at 01:36:59PM +0100, Frederic Weisbecker wrote: > > > From: "Zhou Ti (x2019cwm)" > > > > >

Re: [PATCH] perf/core: fix unconditional security_locked_down() call

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 09:53:21AM -0400, Paul Moore wrote: > On Wed, Feb 24, 2021 at 4:59 PM Ondrej Mosnacek wrote: > > > > Currently, the lockdown state is queried unconditionally, even though > > its result is used only if the PERF_SAMPLE_REGS_INTR bit is set in > > attr.sample_type. While that
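The shape of the fix described in the quoted changelog, consulting the lockdown state only when the relevant sample bit is set, might look roughly like this in isolation. Everything here is a stand-in: `SAMPLE_REGS_INTR`, `query_lockdown()` and `check_attr()` are hypothetical names, and the counter exists only so the behaviour can be observed.

```c
#include <stdint.h>

/* Hypothetical stand-in for the sample_type bit named in the changelog. */
#define SAMPLE_REGS_INTR	(1ULL << 18)

/* Counts how often the (stand-in) lockdown hook is consulted. */
static int lockdown_queries;

/* Stand-in for the security hook; always reports "not locked down". */
static int query_lockdown(void)
{
	lockdown_queries++;
	return 0;
}

/* Only consult the hook when its result would actually be used. */
static int check_attr(uint64_t sample_type)
{
	if (sample_type & SAMPLE_REGS_INTR) {
		int err = query_lockdown();

		if (err)
			return err;
	}
	return 0;
}
```

Calling `check_attr(0)` leaves the counter untouched; only a sample type with the bit set triggers the query.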

Re: [PATCH 1/5] perf/x86/intel/uncore: Parse uncore discovery tables

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 08:42:25AM -0400, Liang, Kan wrote: > > > On 3/16/2021 7:43 AM, Peter Zijlstra wrote: > > On Fri, Mar 12, 2021 at 08:34:34AM -0800, kan.li...@linux.intel.com wrote: > > > From: Kan Liang > > > > > > A self-describing mechanis

Re: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-16 Thread Peter Zijlstra
On Thu, Mar 11, 2021 at 01:36:59PM +0100, Frederic Weisbecker wrote: > From: "Zhou Ti (x2019cwm)" > > If the hardware clock happens to fire its interrupts late, two possible > issues can happen while calling tick_nohz_get_sleep_length(). Either: > > 1) The next clockevent device event is due pas
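For readers unfamiliar with the problem, the failure mode is a simple subtraction that goes negative when the next event is already in the past. A minimal sketch of clamping at the source, assuming plain nanosecond timestamps rather than the kernel's ktime_t, and using the hypothetical name `sleep_length_ns()`:

```c
#include <stdint.h>

/*
 * Time until the next event, in nanoseconds. If the event is already
 * due or overdue (a late interrupt), report 0 instead of a negative value.
 */
static int64_t sleep_length_ns(int64_t next_event_ns, int64_t now_ns)
{
	int64_t delta = next_event_ns - now_ns;

	return delta > 0 ? delta : 0;
}
```

Whether the clamp belongs in the function or in its callers is exactly what the thread goes on to discuss; this sketch takes no position on that.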

Re: [PATCH 02/10] tick/nohz: Add tick_nohz_full_this_cpu()

2021-03-16 Thread Peter Zijlstra
On Thu, Mar 11, 2021 at 01:37:00PM +0100, Frederic Weisbecker wrote: > Optimize further the check for local full dynticks CPU. Testing directly > tick_nohz_full_cpu(smp_processor_id()) is suboptimal because the > compiler first fetches the CPU number and only then processes the > static key. > > I

Re: [PATCH 1/5] perf/x86/intel/uncore: Parse uncore discovery tables

2021-03-16 Thread Peter Zijlstra
On Fri, Mar 12, 2021 at 08:34:34AM -0800, kan.li...@linux.intel.com wrote: > From: Kan Liang > > A self-describing mechanism for the uncore PerfMon hardware has been > introduced with the latest Intel platforms. By reading through an MMIO > page worth of information, perf can 'discover' all the s

Re: [PATCH 1/5] perf/x86/intel/uncore: Parse uncore discovery tables

2021-03-16 Thread Peter Zijlstra
On Fri, Mar 12, 2021 at 08:34:34AM -0800, kan.li...@linux.intel.com wrote: > +static struct intel_uncore_discovery_type * > +search_uncore_discovery_type(u16 type_id) > +{ > + struct rb_node *node = discovery_tables.rb_node; > + struct intel_uncore_discovery_type *type; > + > + while (n

Re: [PATCH 2/2] futex: Leave the pi lock stealer in a consistent state upon successful fault

2021-03-16 Thread Peter Zijlstra
On Sun, Mar 14, 2021 at 10:02:24PM -0700, Davidlohr Bueso wrote: > Before 34b1a1ce145 (futex: Handle faults correctly for PI futexes) any > concurrent pi_state->owner fixup would assume that the task that fixed > things on our behalf also correctly updated the userspace value. This > is not always

Re: [PATCH v3 2/4] sched/fair: Make CFS bandwidth controller burstable

2021-03-16 Thread Peter Zijlstra
On Fri, Mar 12, 2021 at 09:54:33PM +0800, changhuaixin wrote: > > On Mar 10, 2021, at 9:04 PM, Peter Zijlstra wrote: > > There's already an #ifdef block that contains that bandwidth_slice > > thing, see the previous hunk, so why create a new #ifdef here? > >

Re: [PATCH v4 1/4] sched/fair: Introduce primitives for CFS bandwidth burst

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 12:49:28PM +0800, Huaixin Chang wrote: > And the maximum amount of CPU a group can consume in > a given period is "buffer" which is equivalent to "quota" + "burst" in > case that this group has done enough accumulation. I'm confused as heck about cfs_b->buffer. Why do you ne

Re: [PATCH v4 1/4] sched/fair: Introduce primitives for CFS bandwidth burst

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 12:49:28PM +0800, Huaixin Chang wrote: > @@ -8982,6 +8983,12 @@ static int tg_set_cfs_bandwidth(struct task_group *tg, > u64 period, u64 quota) > if (quota != RUNTIME_INF && quota > max_cfs_runtime) > return -EINVAL; > > + /* > + * Bound burst

Re: [PATCH v4 2/4] sched/fair: Make CFS bandwidth controller burstable

2021-03-16 Thread Peter Zijlstra
I can't make sense of patch 1 and 2 independent of one another. Why the split?

Re: [PATCH v4 1/4] sched/fair: Introduce primitives for CFS bandwidth burst

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 12:49:28PM +0800, Huaixin Chang wrote: > In this patch, we introduce the notion of CFS bandwidth burst. Unused > "quota" from previous "periods" might be accumulated and used in the > following "periods". The maximum amount of accumulated bandwidth is > bounded by "burst". A
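The accumulation rule quoted above can be sketched as a per-period refill that carries leftover runtime forward but caps the total at quota plus burst. This is an interpretation of the changelog text, not the patch itself: `struct bw` and `refill()` are hypothetical, and units are arbitrary.

```c
#include <stdint.h>

/* Hypothetical bandwidth state; field names are illustrative only. */
struct bw {
	uint64_t quota;		/* runtime granted each period */
	uint64_t burst;		/* maximum runtime carried over from earlier periods */
	uint64_t runtime;	/* runtime currently available */
};

/* Start of a new period: add quota, keep leftover, cap at quota + burst. */
static void refill(struct bw *b)
{
	uint64_t cap = b->quota + b->burst;

	b->runtime += b->quota;
	if (b->runtime > cap)
		b->runtime = cap;
}
```

With quota 100 and burst 50, a group that leaves 70 units unused ends the next refill with 150 available rather than 170, since only 50 of the leftover may be carried.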

Re: [PATCH v4 1/4] sched/fair: Introduce primitives for CFS bandwidth burst

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 12:49:28PM +0800, Huaixin Chang wrote: > In this patch, we introduce the notion of CFS bandwidth burst. Unused Documentation/process/submitting-patches.rst:instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy

Re: [tip:x86/cpu 2/3] arch/x86/kernel/alternative.c:96:10: warning: Undefined behaviour, pointer arithmetic 'x86nops+10' is out of bounds.

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 09:27:03AM +0100, Borislav Petkov wrote: > Yet another useless report! > > On Tue, Mar 16, 2021 at 07:50:10AM +0800, kernel test robot wrote: > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/cpu > > head: 301cddc21a157a3072d789a3097857202e550a24

Re: [PATCH 0/2] x86: Remove ideal_nops[]

2021-03-15 Thread Peter Zijlstra
On Mon, Mar 15, 2021 at 07:23:29PM +0100, Sedat Dilek wrote: > You mean something like that ^^? > > - Sedat - > > [1] > https://git.zx2c4.com/laptop-kernel/commit/?id=116badbe0a18bc36ba90acb8b80cff41f9ab0686 *shudder*, I was more thinking you'd simply add it to your CFLAGS when building. I don'

Re: [GIT pull] locking/urgent for v5.12-rc3

2021-03-15 Thread Peter Zijlstra
On Mon, Mar 15, 2021 at 11:59:12AM -0700, Linus Torvalds wrote: > Is it only the static_call_sites entry itself that needs the > alignment? Or do we end up depending on the static call function being > at least 4-byte aligned too? The way it plays games with the key makes > me worry. The only thin

Re: [PATCH 0/2] x86: Remove ideal_nops[]

2021-03-15 Thread Peter Zijlstra
On Mon, Mar 15, 2021 at 06:04:41PM +0100, Sedat Dilek wrote: > make V=1 -j4 LLVM=1 LLVM_IAS=1 So for giggles I checked, neither GCC nor LLVM seem to emit prefix NOPs when building with -march=sandybridge, they always use NOPL. Furthermore, the kernel explicitly sets: -falign-jumps=1 -falign-loop

[tip: x86/cpu] x86: Remove dynamic NOP selection

2021-03-15 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the x86/cpu branch of tip: Commit-ID: a89dfde3dc3c2dbf56910af75e2d8b11ec5308f6 Gitweb: https://git.kernel.org/tip/a89dfde3dc3c2dbf56910af75e2d8b11ec5308f6 Author: Peter Zijlstra AuthorDate: Fri, 12 Mar 2021 12:32:54 +01:00 Committer

[tip: x86/cpu] objtool/x86: Use asm/nops.h

2021-03-15 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the x86/cpu branch of tip: Commit-ID: 301cddc21a157a3072d789a3097857202e550a24 Gitweb: https://git.kernel.org/tip/301cddc21a157a3072d789a3097857202e550a24 Author: Peter Zijlstra AuthorDate: Fri, 12 Mar 2021 12:32:55 +01:00 Committer

Re: [GIT pull] locking/urgent for v5.12-rc3

2021-03-15 Thread Peter Zijlstra
On Mon, Mar 15, 2021 at 12:03:21PM -0500, Josh Poimboeuf wrote: > On Mon, Mar 15, 2021 at 01:08:27PM +0100, Peter Zijlstra wrote: > > On Mon, Mar 15, 2021 at 12:26:12PM +0100, Peter Zijlstra wrote: > > > Ooooh, modules don't have this. They still have regular > > >

Re: [tip: x86/core] x86/insn: Add an insn_decode() API

2021-03-15 Thread Peter Zijlstra
On Mon, Mar 15, 2021 at 03:47:48PM -, tip-bot2 for Borislav Petkov wrote: > x86/insn: Add an insn_decode() API Seeing as how I'm a lazy sod, do we want something like so? --- a/arch/x86/include/asm/insn.h +++ b/arch/x86/include/asm/insn.h @@ -150,6 +150,8 @@ enum insn_mode { extern int i

Re: [RFC][PATCH] x86/alternatives: Optimize optimize_nops()

2021-03-15 Thread Peter Zijlstra
On Mon, Mar 15, 2021 at 04:45:24PM +0100, Peter Zijlstra wrote: > --- a/arch/x86/kernel/alternative.c > +++ b/arch/x86/kernel/alternative.c > @@ -345,19 +345,39 @@ recompute_jump(struct alt_instr *a, u8 * > static void __init_or_module noinline optimize_nops(struct alt_instr *a,

[PATCH] x86/cpu: Resort and comment Intel models

2021-03-15 Thread Peter Zijlstra
tarting at skylake and reorder to keep the cores in chronological order. Furthermore, Intel marketed the names {Amber, Coffee, Whiskey} Lake, but those are in fact steppings of Kaby Lake, add comments for them. Signed-off-by: Peter Zijlstra (Intel) --- Note: we don't seem to have CANNONLA

[RFC][PATCH] x86/alternatives: Optimize optimize_nops()

2021-03-15 Thread Peter Zijlstra
string of NOPs inside the alternative to larger NOPs. Also run it irrespective of patching, replacing NOPs in both the original and replaced code. A direct consequence is that padlen becomes superfluous, so remove it. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/alternative.h
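To illustrate what converting a string of NOPs to larger NOPs means in practice, here is a small userspace sketch, not the kernel's code. It assumes the caller already knows that a span of `len` bytes is pure NOP padding (finding that span in real code requires instruction decoding, which is omitted here), and it refills the span with the fewest instructions using the commonly documented x86 NOP encodings of 1 to 8 bytes. The names `fill_nops`, `nops`, and `MAX_NOP_LEN` are hypothetical, and the 8-byte cap is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_NOP_LEN 8

/* nops[n] holds an n-byte x86 NOP encoding; index 0 is unused. */
static const uint8_t nops[MAX_NOP_LEN + 1][MAX_NOP_LEN] = {
    {0},
    {0x90},                                           /* nop */
    {0x66, 0x90},                                     /* xchg %ax,%ax */
    {0x0f, 0x1f, 0x00},                               /* nopl (%rax) */
    {0x0f, 0x1f, 0x40, 0x00},                         /* nopl 0x0(%rax) */
    {0x0f, 0x1f, 0x44, 0x00, 0x00},                   /* nopl 0x0(%rax,%rax,1) */
    {0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00},             /* nopw 0x0(%rax,%rax,1) */
    {0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00},       /* nopl 0x0(%rax), disp32 */
    {0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00}, /* nopl 0x0(%rax,%rax,1), disp32 */
};

/* Overwrite len bytes of NOP padding at buf with as few NOP
 * instructions as possible. Returns the number of instructions written. */
static size_t fill_nops(uint8_t *buf, size_t len)
{
    size_t count = 0;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        /* Greedily take the longest encoding that still fits. */
        size_t n = len < MAX_NOP_LEN ? len : MAX_NOP_LEN;

        memcpy(buf, nops[n], n);
        buf += n;
        len -= n;
        count++;
    }
    return count;
}
```

For example, eleven single-byte `0x90` NOPs collapse into one 8-byte and one 3-byte NOP, so the CPU executes two instructions instead of eleven.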

Re: [PATCH 1/2] futex: Fix irq mismatch in exit_pi_state_list()

2021-03-15 Thread Peter Zijlstra
On Sun, Mar 14, 2021 at 10:02:23PM -0700, Davidlohr Bueso wrote: > The pi_mutex->wait_lock is irq safe and needs to enable local > interrupts upon unlocking, matching its corresponding > raw_spin_lock_irq(). > > Fixes: c74aef2d06a9f (futex: Fix pi_state->owner serialization) > Signed-off-by: Davi

Re: [GIT pull] locking/urgent for v5.12-rc3

2021-03-15 Thread Peter Zijlstra
On Mon, Mar 15, 2021 at 12:26:12PM +0100, Peter Zijlstra wrote: > Ooooh, modules don't have this. They still have regular > .static_call_sites sections, and *those* are unaligned. > > Section Headers: > [Nr] Name Type Address Off Size

Re: [GIT pull] locking/urgent for v5.12-rc3

2021-03-15 Thread Peter Zijlstra
On Mon, Mar 15, 2021 at 12:10:10PM +0100, Peter Zijlstra wrote: > On Mon, Mar 15, 2021 at 09:33:45AM +0100, Peter Zijlstra wrote: > > On Sun, Mar 14, 2021 at 01:15:25PM -0700, Linus Torvalds wrote: > > > On Sun, Mar 14, 2021 at 8:40 AM Thomas Gleixner > > > wrote:

Re: [GIT pull] locking/urgent for v5.12-rc3

2021-03-15 Thread Peter Zijlstra
On Mon, Mar 15, 2021 at 09:33:45AM +0100, Peter Zijlstra wrote: > On Sun, Mar 14, 2021 at 01:15:25PM -0700, Linus Torvalds wrote: > > On Sun, Mar 14, 2021 at 8:40 AM Thomas Gleixner wrote: > > > > > > - A fix for the static_call mechanism so it handles unaligned

Re: [GIT pull] locking/urgent for v5.12-rc3

2021-03-15 Thread Peter Zijlstra
On Sun, Mar 14, 2021 at 01:15:25PM -0700, Linus Torvalds wrote: > On Sun, Mar 14, 2021 at 8:40 AM Thomas Gleixner wrote: > > > > - A fix for the static_call mechanism so it handles unaligned > >addresses correctly. > > I'm not disputing the fix in any way, but why weren't the relocation > in

[PATCH 1/9] x86/retpoline: Simplify retpolines

2021-03-12 Thread Peter Zijlstra
r_replacement+0x5> c: 48 89 04 24 mov %rax,(%rsp) 10: c3 retq 17 bytes, we have 15 bytes NOP at the end of our 32 byte slot. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/asm-prototypes.h |7 --- arch/x86/include/asm

[PATCH 2/9] objtool: Correctly handle retpoline thunk calls

2021-03-12 Thread Peter Zijlstra
Just like JMP handling, convert a direct CALL to a retpoline thunk into a retpoline safe indirect CALL. Signed-off-by: Peter Zijlstra (Intel) --- tools/objtool/check.c | 12 1 file changed, 12 insertions(+) --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -953,6

[PATCH 9/9] objtool,x86: Rewrite retpoline thunk calls

2021-03-12 Thread Peter Zijlstra
to not emit endless identical .altinst_replacement chunks, use a global symbol for them, see __x86_indirect_alt_*. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/asm-prototypes.h | 12 ++ arch/x86/lib/retpoline.S | 33 +++- tools/objtool/arch/x86/decode.c | 139 +++

[PATCH 0/9] x86,objtool: Optimize !RETPOLINE

2021-03-12 Thread Peter Zijlstra
Hi, Now that Juergen's paravirt rework, which included the negative alternative stuff, landed in tip, here's a respin of my retpoline patches. The main feature is replacing the compiler generated (tail) calls to __x86_indirect_thunk_\reg with an ALTERNATIVE that replaces them with regular indirec

[PATCH 7/9] objtool: Allow archs to rewrite retpolines

2021-03-12 Thread Peter Zijlstra
When retpolines are employed, compilers typically emit calls to retpoline thunks. Objtool recognises these calls and marks them as dynamic calls. Provide infrastructure for architectures to rewrite/augment what the compiler wrote for us. Signed-off-by: Peter Zijlstra (Intel) --- tools/objtool

[PATCH 3/9] objtool: Per arch retpoline naming

2021-03-12 Thread Peter Zijlstra
The __x86_indirect_ naming is obviously not generic. Shorten to allow matching some additional magic names later. Signed-off-by: Peter Zijlstra (Intel) --- tools/objtool/arch/x86/decode.c |5 + tools/objtool/check.c|9 +++-- tools/objtool/include/objtool
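The effect of shortening the name can be shown with a plain prefix match. This is a sketch under assumptions, not the patch itself: the helper name `is_retpoline_name` is hypothetical, and only the `__x86_indirect_` prefix comes from the message above. Matching on the shorter prefix accepts the existing `__x86_indirect_thunk_*` symbols and leaves room for other names that share it.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if a symbol name starts with the shortened retpoline
 * prefix. Hypothetical helper for illustration only. */
static bool is_retpoline_name(const char *name)
{
    static const char prefix[] = "__x86_indirect_";

    /* sizeof(prefix) - 1 excludes the terminating NUL. */
    return strncmp(name, prefix, sizeof(prefix) - 1) == 0;
}
```

A per-architecture hook of this shape keeps the generic checking code free of x86-specific strings; other architectures would supply their own predicate.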
