Avoid speculative indirect calls in kernel

2018-01-03 Thread Andi Kleen
This is a fix for Variant 2 in https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html Any speculative indirect call in the kernel can be tricked into executing arbitrary kernel code, which may allow side channel attacks that leak arbitrary kernel data. So we want to

[PATCH 04/11] x86/retpoline/ftrace: Convert ftrace assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Convert all indirect jumps in ftrace assembler code to use non speculative sequences. Based on code from David Woodhouse and Tim Chen Signed-off-by: Andi Kleen <a...@linux.intel.com> --- arch/x86/kernel/ftrace_32.S | 3 ++- arch/x86/kernel

[PATCH 04/11] x86/retpoline/ftrace: Convert ftrace assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen Convert all indirect jumps in ftrace assembler code to use non speculative sequences. Based on code from David Woodhouse and Tim Chen Signed-off-by: Andi Kleen --- arch/x86/kernel/ftrace_32.S | 3 ++- arch/x86/kernel/ftrace_64.S | 6 +++--- 2 files changed, 5 insertions

[PATCH 08/11] x86/retpoline/irq32: Convert assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Convert all indirect jumps in 32bit irq inline asm code to use non speculative sequences. Signed-off-by: Andi Kleen <a...@linux.intel.com> --- arch/x86/kernel/irq_32.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --gi

[PATCH 08/11] x86/retpoline/irq32: Convert assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen Convert all indirect jumps in 32bit irq inline asm code to use non speculative sequences. Signed-off-by: Andi Kleen --- arch/x86/kernel/irq_32.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/irq_32.c b/arch/x86/kernel/irq_32.c index

[PATCH 01/11] x86/retpoline: Define retpoline indirect thunk and macros

2018-01-03 Thread Andi Kleen
out of line trampoline used by the compiler, and NOSPEC_JUMP / NOSPEC_CALL macros for assembler [Originally from David and Tim, heavily hacked by AK] Signed-off-by: David Woodhouse <d...@amazon.co.uk> Signed-off-by: Tim Chen <tim.c.c...@linux.intel.com> Signed-off-by: Andi Kleen <a...@lin

[PATCH 01/11] x86/retpoline: Define retpoline indirect thunk and macros

2018-01-03 Thread Andi Kleen
/ NOSPEC_CALL macros for assembler [Originally from David and Tim, heavily hacked by AK] Signed-off-by: David Woodhouse Signed-off-by: Tim Chen Signed-off-by: Andi Kleen --- arch/x86/include/asm/jump-asm.h | 47 + arch/x86/kernel/vmlinux.lds.S | 1 + arch

[PATCH 07/11] x86/retpoline/checksum32: Convert assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Convert all indirect jumps in 32bit checksum assembler code to use non speculative sequences. Based on code from David Woodhouse and Tim Chen Signed-off-by: Andi Kleen <a...@linux.intel.com> --- arch/x86/lib/checksum_32.S | 5 +++-- 1 fil

[PATCH 02/11] x86/retpoline/crypto: Convert crypto assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Convert all indirect jumps in crypto assembler code to use non speculative sequences. Based on code from David Woodhouse and Tim Chen Signed-off-by: Andi Kleen <a...@linux.intel.com> --- arch/x86/crypto/aesni-intel_asm.S| 5 +++

[PATCH 07/11] x86/retpoline/checksum32: Convert assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen Convert all indirect jumps in 32bit checksum assembler code to use non speculative sequences. Based on code from David Woodhouse and Tim Chen Signed-off-by: Andi Kleen --- arch/x86/lib/checksum_32.S | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git

[PATCH 02/11] x86/retpoline/crypto: Convert crypto assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen Convert all indirect jumps in crypto assembler code to use non speculative sequences. Based on code from David Woodhouse and Tim Chen Signed-off-by: Andi Kleen --- arch/x86/crypto/aesni-intel_asm.S| 5 +++-- arch/x86/crypto/camellia-aesni-avx-asm_64.S | 3

[PATCH 10/11] retpoline/taint: Taint kernel for missing retpoline in compiler

2018-01-03 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> When the kernel or a module hasn't been compiled with a retpoline aware compiler, print a warning and set a taint flag. For modules it is checked at compile time, however it cannot check assembler or other non compiled objects used in the module link

[PATCH 11/11] retpoline/objtool: Disable some objtool warnings

2018-01-03 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> With the indirect call thunk enabled compiler two objtool warnings are triggered very frequently and make the build very noisy. I don't see a good way to avoid them, so just disable them for now. Signed-off-by: Andi Kleen <a...@linux.intel.com>

[PATCH 09/11] x86/retpoline: Finally enable retpoline for C code

2018-01-03 Thread Andi Kleen
From: Dave Hansen From: David Woodhouse Add retpoline compile option in Makefile Update Makefile with retpoline compile options. This requires a gcc with the retpoline compiler patches enabled. Print a warning when the compiler doesn't support

[PATCH 10/11] retpoline/taint: Taint kernel for missing retpoline in compiler

2018-01-03 Thread Andi Kleen
From: Andi Kleen When the kernel or a module hasn't been compiled with a retpoline aware compiler, print a warning and set a taint flag. For modules it is checked at compile time, however it cannot check assembler or other non compiled objects used in the module link. Due to lack of better

[PATCH 11/11] retpoline/objtool: Disable some objtool warnings

2018-01-03 Thread Andi Kleen
From: Andi Kleen With the indirect call thunk enabled compiler two objtool warnings are triggered very frequently and make the build very noisy. I don't see a good way to avoid them, so just disable them for now. Signed-off-by: Andi Kleen --- tools/objtool/check.c | 11 +++ 1 file

[PATCH 09/11] x86/retpoline: Finally enable retpoline for C code

2018-01-03 Thread Andi Kleen
From: Dave Hansen From: David Woodhouse Add retpoline compile option in Makefile Update Makefile with retpoline compile options. This requires a gcc with the retpoline compiler patches enabled. Print a warning when the compiler doesn't support retpoline [Originally from David and Tim, but

[PATCH 03/11] x86/retpoline/entry: Convert entry assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Convert all indirect jumps in core 32/64bit entry assembler code to use non speculative sequences. Based on code from David Woodhouse and Tim Chen Signed-off-by: Andi Kleen <a...@linux.intel.com> --- arch/x86/entry/entry_32.S | 5 +++-- ar

[PATCH 03/11] x86/retpoline/entry: Convert entry assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen Convert all indirect jumps in core 32/64bit entry assembler code to use non speculative sequences. Based on code from David Woodhouse and Tim Chen Signed-off-by: Andi Kleen --- arch/x86/entry/entry_32.S | 5 +++-- arch/x86/entry/entry_64.S | 12 +++- 2 files changed

[PATCH 05/11] x86/retpoline/hyperv: Convert assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Convert all indirect jumps in hyperv inline asm code to use non speculative sequences. Based on code from David Woodhouse and Tim Chen Signed-off-by: Andi Kleen <a...@linux.intel.com> --- arch/x86/include/asm/mshyperv.h | 9 + 1 fil

[PATCH 05/11] x86/retpoline/hyperv: Convert assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen Convert all indirect jumps in hyperv inline asm code to use non speculative sequences. Based on code from David Woodhouse and Tim Chen Signed-off-by: Andi Kleen --- arch/x86/include/asm/mshyperv.h | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git

[PATCH 06/11] x86/retpoline/crypto: Convert xen assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Convert all indirect jumps in xen inline assembler code to use non speculative sequences. Based on code from David Woodhouse and Tim Chen Signed-off-by: Andi Kleen <a...@linux.intel.com> --- arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 1

[PATCH 06/11] x86/retpoline/crypto: Convert xen assembler indirect jumps

2018-01-03 Thread Andi Kleen
From: Andi Kleen Convert all indirect jumps in xen inline assembler code to use non speculative sequences. Based on code from David Woodhouse and Tim Chen Signed-off-by: Andi Kleen --- arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 1 + arch/x86/include/asm/xen/hypercall.h | 3 ++- 2

Re: [RFC PATCH 2/5] perf jevents: add support for arch recommended events

2018-01-02 Thread Andi Kleen
> Can you describe how you autogenerate the JSONs? Do you have some internal > proprietary HW file format describing events, with files supplied from HW > designer, which you can just translate into a JSON? Would the files support > dereferencing events to improve scalability? For Intel JSON is an

[PATCH] perf/x86/intel: Fix minor memleak on Skylake perf initialization

2017-12-27 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Tommi reports: I'm seeing this kmemleak report in v4.15-rc4: unreferenced object 0x8801f3d5d720 (size 64): comm "swapper/0", pid 1, jiffies 4294667312 (age 2687.423s) hex dump (first 32 bytes): 60 d1 41 ad ff ff ff ff 20 d1 4

[PATCH] perf/x86/intel: Fix minor memleak on Skylake perf initialization

2017-12-27 Thread Andi Kleen
From: Andi Kleen Tommi reports: I'm seeing this kmemleak report in v4.15-rc4: unreferenced object 0x8801f3d5d720 (size 64): comm "swapper/0", pid 1, jiffies 4294667312 (age 2687.423s) hex dump (first 32 bytes): 60 d1 41 ad ff ff ff ff 20 d1 41 ad f

[PATCH 4/6] x86/kvm: Make steal_time visible

2017-12-21 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> This per cpu variable is accessed from assembler code, so needs to be visible. Signed-off-by: Andi Kleen <a...@linux.intel.com> --- arch/x86/kernel/kvm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/kvm.

[PATCH 4/6] x86/kvm: Make steal_time visible

2017-12-21 Thread Andi Kleen
From: Andi Kleen This per cpu variable is accessed from assembler code, so needs to be visible. Signed-off-by: Andi Kleen --- arch/x86/kernel/kvm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index b40ffbf156c1

[PATCH 1/6] x86/timer: Don't inline __const_udelay

2017-12-21 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> __const_udelay is marked inline, and LTO will happily inline it everywhere. Dropping the inline saves ~44k text in a LTO build. 13999560 1740864 1499136 17239560 1070e08 vmlinux-with-udelay-inline 13954764 1736768 1499136 17

[PATCH 1/6] x86/timer: Don't inline __const_udelay

2017-12-21 Thread Andi Kleen
From: Andi Kleen __const_udelay is marked inline, and LTO will happily inline it everywhere. Dropping the inline saves ~44k text in a LTO build. 13999560 1740864 1499136 17239560 1070e08 vmlinux-with-udelay-inline 13954764 1736768 1499136 17190668 1064f0c vmlinux-wo

[PATCH 3/6] locking/spinlocks: Mark spinlocks noinline when inline spinlocks are disabled

2017-12-21 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Otherwise LTO will inline them anyways and cause a large kernel text increase. Since the explicit intention here is to not inline them marking them noinline is good documentation even for the non LTO case. Signed-off-by: Andi Kleen <a...@linux.

[PATCH 3/6] locking/spinlocks: Mark spinlocks noinline when inline spinlocks are disabled

2017-12-21 Thread Andi Kleen
From: Andi Kleen Otherwise LTO will inline them anyways and cause a large kernel text increase. Since the explicit intention here is to not inline them marking them noinline is good documentation even for the non LTO case. Signed-off-by: Andi Kleen --- kernel/locking/spinlock.c | 56

x86 cleanups from the LTO tree

2017-12-21 Thread Andi Kleen
These are all the fixes for the x86 tree needed for LTO. They are strictly not needed without LTO, but I believe they can be all considered cleanups and documentation improvements and are valuable because of that. The initconst/data fixes help generating correct section permissions in the

[PATCH 2/6] x86/xen: Mark pv stub assembler symbol visible

2017-12-21 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> With LTO any external assembler symbol has to be marked __visible. Mark the generated asm PV stubs __visible to prevent a linker error. Signed-off-by: Andi Kleen <a...@linux.intel.com> --- arch/x86/include/asm/paravirt.h | 3 ++- driver

[PATCH 2/6] x86/xen: Mark pv stub assembler symbol visible

2017-12-21 Thread Andi Kleen
From: Andi Kleen With LTO any external assembler symbol has to be marked __visible. Mark the generated asm PV stubs __visible to prevent a linker error. Signed-off-by: Andi Kleen --- arch/x86/include/asm/paravirt.h | 3 ++- drivers/xen/time.c | 2 +- 2 files changed, 3 insertions

[PATCH 6/6] x86/idt: Make const __initconst

2017-12-21 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> const variables must use __initconst, not __initdata. Fix this up for the new IDT tables recently added, which got it consistently wrong. Fixes a whole range of commits between 16bc18d895ce x86/idt: Move 32-bit idt_descr to C code and dc20b2d52653 x

[PATCH 6/6] x86/idt: Make const __initconst

2017-12-21 Thread Andi Kleen
From: Andi Kleen const variables must use __initconst, not __initdata. Fix this up for the new IDT tables recently added, which got it consistently wrong. Fixes a whole range of commits between 16bc18d895ce x86/idt: Move 32-bit idt_descr to C code and dc20b2d52653 x86/idt: Move interrupt gate
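
The rule stated above (const objects go in the const init section, writable objects in the data init section) can be sketched in userspace C. This is only an illustration under assumptions: demo_initdata and demo_initconst are hypothetical stand-ins for the kernel's __initdata and __initconst, the section names are made up, and the variables are not the real IDT tables. Placing a const object and a writable object in the same named section is the kind of mismatch GCC can reject as a section type conflict.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's __initdata / __initconst. */
#if defined(__ELF__)
#define demo_initdata  __attribute__((section("demo_init_data")))
#define demo_initconst __attribute__((section("demo_init_rodata")))
#else
#define demo_initdata
#define demo_initconst
#endif

/* Writable init-time data goes into the writable section. */
static int boot_counter demo_initdata = 1;

/* const init-time data goes into the read-only section. */
static const int default_gates[3] demo_initconst = { 3, 14, 32 };

/* Touch both objects so neither is discarded as unused. */
static int sum_default_gates(void)
{
	int i, sum = 0;

	assert(boot_counter == 1);
	for (i = 0; i < 3; i++)
		sum += default_gates[i];
	return sum;
}
```

Keeping the two kinds of object in separate sections lets the linker give each section consistent permissions.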

[PATCH 5/6] x86: Make exception handler functions visible

2017-12-21 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Make the C exception handler functions that are directly called through exception tables visible. LTO needs to know they are accessed from assembler. Signed-off-by: Andi Kleen <a...@linux.intel.com> --- arch/x86/mm/extable.c | 17 +

[PATCH 5/6] x86: Make exception handler functions visible

2017-12-21 Thread Andi Kleen
From: Andi Kleen Make the C exception handler functions that are directly called through exception tables visible. LTO needs to know they are accessed from assembler. Signed-off-by: Andi Kleen --- arch/x86/mm/extable.c | 17 + 1 file changed, 9 insertions(+), 8 deletions

[PATCH] Fix read buffer overflow in delta-ipc

2017-12-21 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> The single caller passes a string to delta_ipc_open, which copies with a fixed size larger than the string. So it copies some random data after the original string in the ro segment. If the string was at the end of a page it may fault. Just copy the

[PATCH] Fix read buffer overflow in delta-ipc

2017-12-21 Thread Andi Kleen
From: Andi Kleen The single caller passes a string to delta_ipc_open, which copies with a fixed size larger than the string. So it copies some random data after the original string in the ro segment. If the string was at the end of a page it may fault. Just copy the string with a normal strcpy
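
The over-read described above can be sketched in plain C. This is a userspace illustration, not the driver code: NAME_MAX_LEN and copy_name_bounded are hypothetical names, and the bounded copy shown here is one possible variant, whereas the patch description itself mentions a normal strcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 32 /* hypothetical size of the destination field */

/*
 * Copy a NUL-terminated name into a fixed-size field without reading
 * past the end of the source string. A memcpy(dst, src, NAME_MAX_LEN)
 * would read NAME_MAX_LEN bytes from src even when src is shorter,
 * which is the kind of over-read the patch description talks about.
 * Returns the number of characters copied (excluding the NUL).
 */
static size_t copy_name_bounded(char dst[NAME_MAX_LEN], const char *src)
{
	size_t len;

	assert(dst != NULL && src != NULL);

	len = strlen(src);              /* reads only up to the terminator */
	if (len >= NAME_MAX_LEN)
		len = NAME_MAX_LEN - 1; /* truncate, keep room for the NUL */
	memcpy(dst, src, len);
	dst[len] = '\0';
	return len;
}
```

The length of the read is derived from the source string, so a short string literal at the end of a page is never read past its terminator.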

[PATCH] ftrace: Mark function tracer test functions noinline/noclone

2017-12-21 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> The ftrace function tracer self tests call some functions to verify they get traced. This relies on them not being inlined. Previously this was ensured by putting them into another file, but with LTO the compiler can inline across files, which

[PATCH] ftrace: Mark function tracer test functions noinline/noclone

2017-12-21 Thread Andi Kleen
From: Andi Kleen The ftrace function tracer self tests call some functions to verify they get traced. This relies on them not being inlined. Previously this was ensured by putting them into another file, but with LTO the compiler can inline across files, which makes the tests fail. Mark
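
A minimal sketch of the marking described above, assuming GCC-style attributes. The macro and function names (demo_no_inline_no_clone, demo_trace_target, demo_run_selftest) are hypothetical and are not the ftrace self-test code. The noclone attribute is specific to GCC, so the sketch falls back to noinline alone elsewhere.

```c
#include <assert.h>

/* noclone is a GCC attribute; other compilers get noinline only. */
#if defined(__GNUC__) && !defined(__clang__)
#define demo_no_inline_no_clone __attribute__((noinline, noclone))
#elif defined(__GNUC__)
#define demo_no_inline_no_clone __attribute__((noinline))
#else
#define demo_no_inline_no_clone
#endif

static int calls_seen;

/* Stand-in for a self-test target that must stay a real function. */
demo_no_inline_no_clone int demo_trace_target(int x)
{
	calls_seen++;
	return x + 1;
}

/* The caller: without the attributes, LTO could fold the callee in. */
static int demo_run_selftest(void)
{
	int before = calls_seen;
	int ret = demo_trace_target(41);

	assert(calls_seen == before + 1);
	return ret;
}
```

Stating the requirement on the function itself keeps it in force no matter which file the function lives in.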

Re: [PATCH V2 4/4] perf/x86: fix: disable userspace RDPMC usage for large PEBS

2017-12-20 Thread Andi Kleen
On Wed, Dec 20, 2017 at 11:42:51AM -0800, kan.li...@linux.intel.com wrote: > From: Kan Liang > > The userspace RDPMC usage never works for large PEBS since the large > PEBS is introduced by > commit b8241d20699e ("perf/x86/intel: Implement batched PEBS interrupt >

Re: [PATCH V2 4/4] perf/x86: fix: disable userspace RDPMC usage for large PEBS

2017-12-20 Thread Andi Kleen
On Wed, Dec 20, 2017 at 11:42:51AM -0800, kan.li...@linux.intel.com wrote: > From: Kan Liang > > The userspace RDPMC usage never works for large PEBS since the large > PEBS is introduced by > commit b8241d20699e ("perf/x86/intel: Implement batched PEBS interrupt > handling (large PEBS interrupt

Re: [PATCH 2/4] perf/x86/intel: fix event update for auto-reload

2017-12-19 Thread Andi Kleen
On Tue, Dec 19, 2017 at 11:07:09PM +0100, Peter Zijlstra wrote: > On Tue, Dec 19, 2017 at 03:08:58PM -0500, Liang, Kan wrote: > > > This all looks very wrong... In auto reload we should never call > > > intel_pmu_save_and_restore() in the first place I think. > > > > > > Things like

Re: [RFC PATCH 2/5] perf jevents: add support for arch recommended events

2017-12-16 Thread Andi Kleen
> Won't this all potentially have a big maintenance cost? No. It's all auto generated. The only cost is slightly bigger binary size. I would hope your event files are auto generated too. > I just don't know how this schema scales with more archs and more platforms > supported. It's just early

Re: [PATCH 4/5] -march=native: REP STOSB

2017-12-08 Thread Andi Kleen
Alexey Dobriyan writes: > > +#ifdef CONFIG_MARCH_NATIVE_REP_STOSB > +static __always_inline void clear_page(void *page) > +{ > + uint32_t len = PAGE_SIZE; > + asm volatile ( > + "rep stosb" > + : "+D" (page), "+c" (len) > + : "a"

Re: [PATCH 4/5] -march=native: REP STOSB

2017-12-08 Thread Andi Kleen
Alexey Dobriyan writes: > > +#ifdef CONFIG_MARCH_NATIVE_REP_STOSB > +static __always_inline void clear_page(void *page) > +{ > + uint32_t len = PAGE_SIZE; > + asm volatile ( > + "rep stosb" > + : "+D" (page), "+c" (len) > + : "a" (0) > + :
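
For readers unfamiliar with the instruction in the quoted patch, here is a userspace sketch of clearing a page with rep stosb. It is an illustration under assumptions, not the patch under review: DEMO_PAGE_SIZE and demo_clear_page are hypothetical names, the length is held in a size_t here, and builds that are not x86-64 with GCC-style inline asm fall back to memset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096 /* assumed page size for the illustration */

static void demo_clear_page(void *page)
{
	assert(page != NULL);
#if defined(__x86_64__) && defined(__GNUC__)
	size_t len = DEMO_PAGE_SIZE;

	/* rep stosb stores AL to [RDI], RCX times, advancing RDI. */
	__asm__ volatile("rep stosb"
			 : "+D"(page), "+c"(len)
			 : "a"(0)
			 : "memory");
#else
	/* Portable fallback for other targets and compilers. */
	memset(page, 0, DEMO_PAGE_SIZE);
#endif
}
```

The "memory" clobber tells the compiler that the asm writes memory it cannot see through the operands, so stores to the page are not reordered around it.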

Re: [PATCH RFC 2/2] KVM: x86/vPMU: ignore access to LBR-related MSRs

2017-12-06 Thread Andi Kleen
On Wed, Dec 06, 2017 at 08:02:07PM +0300, Jan Dakinevich wrote: > On Wed, 6 Dec 2017 07:57:28 -0800 > Andi Kleen <a...@linux.intel.com> wrote: > > > > > If you do all this it's only a small step to fully enable LBRs for > > guests. > > It is quite simple

Re: [PATCH RFC 2/2] KVM: x86/vPMU: ignore access to LBR-related MSRs

2017-12-06 Thread Andi Kleen
On Wed, Dec 06, 2017 at 08:02:07PM +0300, Jan Dakinevich wrote: > On Wed, 6 Dec 2017 07:57:28 -0800 > Andi Kleen wrote: > > > > > If you do all this it's only a small step to fully enable LBRs for > > guests. > > It is quite simple in a case where guest LBR

[tip:perf/core] perf script: Allow computing 'perf stat' style metrics

2017-12-06 Thread tip-bot for Andi Kleen
Commit-ID: 4bd1bef8bba2f99ff472ae3617864dda301f81bd Gitweb: https://git.kernel.org/tip/4bd1bef8bba2f99ff472ae3617864dda301f81bd Author: Andi Kleen <a...@linux.intel.com> AuthorDate: Fri, 17 Nov 2017 13:43:00 -0800 Committer: Arnaldo Carvalho de Melo <a...@redhat.com> CommitD

[tip:perf/core] perf script: Allow computing 'perf stat' style metrics

2017-12-06 Thread tip-bot for Andi Kleen
Commit-ID: 4bd1bef8bba2f99ff472ae3617864dda301f81bd Gitweb: https://git.kernel.org/tip/4bd1bef8bba2f99ff472ae3617864dda301f81bd Author: Andi Kleen AuthorDate: Fri, 17 Nov 2017 13:43:00 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Wed, 29 Nov 2017 18:18:01 -0300 perf script

[tip:perf/core] perf record: Synthesize thread map and cpu map

2017-12-06 Thread tip-bot for Andi Kleen
Commit-ID: 373565d285e8d2113f1b6c0a2e461b9c8d0da1c9 Gitweb: https://git.kernel.org/tip/373565d285e8d2113f1b6c0a2e461b9c8d0da1c9 Author: Andi Kleen <a...@linux.intel.com> AuthorDate: Fri, 17 Nov 2017 13:42:59 -0800 Committer: Arnaldo Carvalho de Melo <a...@redhat.com> CommitD

[tip:perf/core] perf record: Synthesize thread map and cpu map

2017-12-06 Thread tip-bot for Andi Kleen
Commit-ID: 373565d285e8d2113f1b6c0a2e461b9c8d0da1c9 Gitweb: https://git.kernel.org/tip/373565d285e8d2113f1b6c0a2e461b9c8d0da1c9 Author: Andi Kleen AuthorDate: Fri, 17 Nov 2017 13:42:59 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Wed, 29 Nov 2017 18:18:00 -0300 perf record

[tip:perf/core] perf record: Synthesize unit/scale/... in event update

2017-12-06 Thread tip-bot for Andi Kleen
Commit-ID: bfd8f72c2778f5bd63dc9eb6d23bd7a0d99cff6d Gitweb: https://git.kernel.org/tip/bfd8f72c2778f5bd63dc9eb6d23bd7a0d99cff6d Author: Andi Kleen <a...@linux.intel.com> AuthorDate: Fri, 17 Nov 2017 13:42:58 -0800 Committer: Arnaldo Carvalho de Melo <a...@redhat.com> CommitD

[tip:perf/core] perf record: Synthesize unit/scale/... in event update

2017-12-06 Thread tip-bot for Andi Kleen
Commit-ID: bfd8f72c2778f5bd63dc9eb6d23bd7a0d99cff6d Gitweb: https://git.kernel.org/tip/bfd8f72c2778f5bd63dc9eb6d23bd7a0d99cff6d Author: Andi Kleen AuthorDate: Fri, 17 Nov 2017 13:42:58 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Wed, 29 Nov 2017 18:18:00 -0300 perf record

Re: [PATCH RFC 2/2] KVM: x86/vPMU: ignore access to LBR-related MSRs

2017-12-06 Thread Andi Kleen
If you do all this it's only a small step to fully enable LBRs for guests. Just need to allow them to be written, expose PERF_CAPABILITIES too, and start/stop them on entry/exit, and enable context switching through perf in the host. That would be far better than creating a frankenstate where

[PATCH] perf, tools: Fix perf stat for old kernels w/o PERF_SAMPLE_IDENTIFIER

2017-12-05 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> PERF_SAMPLE_IDENTIFIER is a newer kernel feature and not available on old kernels. Since 4979d0c7d0c7 ("perf stat record: Add record command") we set it unconditionally for perf stat, which makes perf stat show all counters as "no

[PATCH] perf, tools: Fix perf stat for old kernels w/o PERF_SAMPLE_IDENTIFIER

2017-12-05 Thread Andi Kleen
From: Andi Kleen PERF_SAMPLE_IDENTIFIER is a newer kernel feature and not available on old kernels. Since 4979d0c7d0c7 ("perf stat record: Add record command") we set it unconditionally for perf stat, which makes perf stat show all counters as "not supported" on

Re: [RFC PATCH 2/5] perf jevents: add support for arch recommended events

2017-12-05 Thread Andi Kleen
On Wed, Dec 06, 2017 at 12:13:16AM +0800, John Garry wrote: > For some architectures (like arm64), there are architecture- > defined recommended events. Vendors may not be obliged to > follow the recommendation and may implement their own pmu > event for a specific event cod I would just

Re: [question] handle the page table RAS error

2017-12-05 Thread Andi Kleen
On Sun, Dec 03, 2017 at 01:22:25PM +, gengdongjiu wrote: > Hi all, > Sorry to disturb you. ARM64 now supports RAS; with this feature > enabled, we encounter an issue. If a user space application hits a page > table RAS error, > the memory error handler (memory_failure()) will do

Re: [PATCH v5 04/12] perf util: Add rbtree node_delete ops

2017-12-01 Thread Andi Kleen
On Fri, Dec 01, 2017 at 11:14:52AM -0300, Arnaldo Carvalho de Melo wrote: > Em Fri, Dec 01, 2017 at 06:57:28PM +0800, Jin Yao escreveu: > > @@ -130,7 +140,7 @@ void perf_stat__init_shadow_stats(void) > > rblist__init(&runtime_saved_values); > > runtime_saved_values.node_cmp = saved_value_cmp; > >

Re: [PATCH] KVM: VMX: Cache IA32_DEBUGCTL in memory

2017-11-29 Thread Andi Kleen
On Wed, Nov 29, 2017 at 11:26:30PM +0100, Paolo Bonzini wrote: > On 29/11/2017 19:20, Andi Kleen wrote: > > But I haven't looked too closely, but I suspect you'll clobber global > > kernel debugger state this way. > > I checked all callers of update_debugctlmsr, and couldn't

Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code

2017-11-29 Thread Andi Kleen
> We're really early in the boot -- startup_64 in decompression code -- and > I don't know a way print a message there. Is there a way? > > no_longmode handled by just hanging the machine. Is it enough for no_la57 > case too? The way to handle it is to check it early in the real mode boot code

Re: [PATCH] KVM: VMX: Cache IA32_DEBUGCTL in memory

2017-11-29 Thread Andi Kleen
On Wed, Nov 29, 2017 at 11:05:46AM -0800, Jim Mattson wrote: > An alternative is to give the L1 guest read permission for this MSR in > the MSR permission bitmaps. It's still going to be ~80 cycles, but > that's better than the cost of a VM-exit/VM-entry round-trip. It's a useful optimization, 80

Re: [PATCH] KVM: VMX: Cache IA32_DEBUGCTL in memory

2017-11-29 Thread Andi Kleen
Wanpeng Li writes: > From: Wanpeng Li > > MSR_IA32_DEBUGCTLMSR is zeroed on VMEXIT, so it is saved/restored > each time during world switch. Jim from Google pointed out that > when running schbench in L2, vmx_vcpu_run will occupy 4% cpu time, >

Re: [PATCH] KVM: VMX: Cache IA32_DEBUGCTL in memory

2017-11-29 Thread Andi Kleen
Wanpeng Li writes: > From: Wanpeng Li > > MSR_IA32_DEBUGCTLMSR is zeroed on VMEXIT, so it is saved/restored > each time during world switch. Jim from Google pointed out that > when running schbench in L2, vmx_vcpu_run will occupy 4% cpu time, > and the 25% of vmx_vcpu_run cpu time is

[tip:perf/core] perf record: Fix -c/-F options for cpu event aliases

2017-11-28 Thread tip-bot for Andi Kleen
Commit-ID: 59622fd496a3175c7bf549046e091d81c303ecff Gitweb: https://git.kernel.org/tip/59622fd496a3175c7bf549046e091d81c303ecff Author: Andi Kleen <a...@linux.intel.com> AuthorDate: Fri, 20 Oct 2017 13:27:55 -0700 Committer: Arnaldo Carvalho de Melo <a...@redhat.com> CommitD

[tip:perf/core] perf record: Fix -c/-F options for cpu event aliases

2017-11-28 Thread tip-bot for Andi Kleen
Commit-ID: 59622fd496a3175c7bf549046e091d81c303ecff Gitweb: https://git.kernel.org/tip/59622fd496a3175c7bf549046e091d81c303ecff Author: Andi Kleen AuthorDate: Fri, 20 Oct 2017 13:27:55 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 28 Nov 2017 14:19:39 -0300 perf record

Re: [PATCH 06/12] perf, tools, probe: Support a quiet argument for debug info open

2017-11-28 Thread Andi Kleen
On Wed, Nov 29, 2017 at 12:14:00PM +0900, Masami Hiramatsu wrote: > On Mon, 27 Nov 2017 16:23:15 -0800 > Andi Kleen <a...@firstfloor.org> wrote: > > > From: Andi Kleen <a...@linux.intel.com> > > > > Add an extra quiet argument to the debug info open / probe

Re: [PATCH 06/12] perf, tools, probe: Support a quiet argument for debug info open

2017-11-28 Thread Andi Kleen
On Wed, Nov 29, 2017 at 12:14:00PM +0900, Masami Hiramatsu wrote: > On Mon, 27 Nov 2017 16:23:15 -0800 > Andi Kleen wrote: > > > From: Andi Kleen > > > > Add an extra quiet argument to the debug info open / probe finder > > code that allows perf script to make t

Re: [PATCH 02/18] vchecker: introduce the valid access checker

2017-11-28 Thread Andi Kleen
js1...@gmail.com writes: > From: Joonsoo Kim Looks useful. Essentially unlimited hardware break points, combined with slab. Didn't do a full review, but noticed some things below. > + > + buf = kmalloc(PAGE_SIZE, GFP_KERNEL); > + if (!buf) > + return

Re: [PATCH 02/18] vchecker: introduce the valid access checker

2017-11-28 Thread Andi Kleen
js1...@gmail.com writes: > From: Joonsoo Kim Looks useful. Essentially unlimited hardware break points, combined with slab. Didn't do a full review, but noticed some things below. > + > + buf = kmalloc(PAGE_SIZE, GFP_KERNEL); > + if (!buf) > + return -ENOMEM; > + > + if

Re: [PATCH 1/2] mm: NUMA stats code cleanup and enhancement

2017-11-28 Thread Andi Kleen
Vlastimil Babka writes: > > I'm worried about the "for_each_possible..." approach here and elsewhere > in the patch as it can be rather excessive compared to the online number > of cpus (we've seen BIOSes report large numbers of possible CPU's). IIRC Even if they report a few

Re: [PATCH 1/2] mm: NUMA stats code cleanup and enhancement

2017-11-28 Thread Andi Kleen
Vlastimil Babka writes: > > I'm worried about the "for_each_possible..." approach here and elsewhere > in the patch as it can be rather excessive compared to the online number > of cpus (we've seen BIOSes report large numbers of possible CPU's). IIRC Even if they report a few hundred extra

Re: [PATCH 02/21] afs: Fix const confusion in AFS

2017-11-28 Thread Andi Kleen
On Tue, Nov 28, 2017 at 04:04:38PM +, David Howells wrote: > Andi Kleen <a...@firstfloor.org> wrote: > > > A trace point string cannot be const because the underlying special > > section is not marked const. An LTO build complains about the > > s

Re: [PATCH 02/21] afs: Fix const confusion in AFS

2017-11-28 Thread Andi Kleen
On Tue, Nov 28, 2017 at 04:04:38PM +, David Howells wrote: > Andi Kleen wrote: > > > A trace point string cannot be const because the underlying special > > section is not marked const. An LTO build complains about the > > section attribute mismatch. Fix it by not

[PATCH 06/12] perf, tools, probe: Support a quiet argument for debug info open

2017-11-27 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Add an extra quiet argument to the debug info open / probe finder code that allows perf script to make them quieter. Otherwise we may end up with too many error messages when lots of instructions fail debug info parsing. Signed-off-by: Andi

[PATCH 06/12] perf, tools, probe: Support a quiet argument for debug info open

2017-11-27 Thread Andi Kleen
From: Andi Kleen Add an extra quiet argument to the debug info open / probe finder code that allows perf script to make them quieter. Otherwise we may end up with too many error messages when lots of instructions fail debug info parsing. Signed-off-by: Andi Kleen --- tools/perf/util/probe

[PATCH 01/12] perf, tools, pt: Clear instruction for ptwrite samples

2017-11-27 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> When a PTWRITE sample is synthesized the PT decoder already ran ahead and sample->insn contains the next branch instruction, not the PTWRITE. Clear it for PTWRITE samples to avoid confusion. Signed-off-by: Andi Kleen <a...@linux.intel.com>

[PATCH 12/12] perf, tools, script: Implement dwarf resolving of instructions

2017-11-27 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Implement resolving arguments of instructions to dwarf variable names. When we sample an instruction, decode the instruction and try to symbolize the register or destination it is using. Also print the type. It builds on the perf probe deb

[PATCH 01/12] perf, tools, pt: Clear instruction for ptwrite samples

2017-11-27 Thread Andi Kleen
From: Andi Kleen When a PTWRITE sample is synthesized the PT decoder already ran ahead and sample->insn contains the next branch instruction, not the PTWRITE. Clear it for PTWRITE samples to avoid confusion. Signed-off-by: Andi Kleen --- tools/perf/util/intel-pt.c | 6 ++ 1 file chan

[PATCH 12/12] perf, tools, script: Implement dwarf resolving of instructions

2017-11-27 Thread Andi Kleen
From: Andi Kleen Implement resolving arguments of instructions to dwarf variable names. When we sample an instruction, decode the instruction and try to symbolize the register or destination it is using. Also print the type. It builds on the perf probe debugging information reverse lookup

[PATCH 05/12] perf, tools, probe: Print location for resolved variables

2017-11-27 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Print the location, e.g. the register, for resolved variables with perf probe -V. This is useful for debugging, and manually making sense of disassembly. I also have some scripts which can make use of this information. Before: % perf probe -x

[PATCH 05/12] perf, tools, probe: Print location for resolved variables

2017-11-27 Thread Andi Kleen
From: Andi Kleen Print the location, e.g. the register, for resolved variables with perf probe -V. This is useful for debugging, and manually making sense of disassembly. I also have some scripts which can make use of this information. Before: % perf probe -x ./tsrc/tstruct -V main+20

[PATCH 04/12] perf, tools: Store variable name and register for dwarf variable lists

2017-11-27 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Extend the strlist returned by debuginfo__find_available_vars_at to also directly include the variable name and the location of the resolved variables in each node. This makes it easier to use for callers that parse the output instead of just pr

[PATCH 04/12] perf, tools: Store variable name and register for dwarf variable lists

2017-11-27 Thread Andi Kleen
From: Andi Kleen Extend the strlist returned by debuginfo__find_available_vars_at to also directly include the variable name and the location of the resolved variables in each node. This makes it easier to use for callers that parse the output instead of just printing it. Signed-off-by: Andi

[PATCH 10/12] perf, tools: Add args and gprs shortcut for registers

2017-11-27 Thread Andi Kleen
From: Andi Kleen <a...@linux.intel.com> Writing all registers to sample with -I can give very long command lines. Add short hands for "args" and "gprs" Signed-off-by: Andi Kleen <a...@linux.intel.com> --- tools/perf/arch/x86/util/perf_regs.c | 9 + 1 fil
