[tip: objtool/core] objtool: Refactor jump table code to support other architectures
The following commit has been merged into the objtool/core branch of tip: Commit-ID: d871f7b5a6a2a30f4eba577fd56941fa3657e394 Gitweb: https://git.kernel.org/tip/d871f7b5a6a2a30f4eba577fd56941fa3657e394 Author:Raphael Gault AuthorDate:Fri, 04 Sep 2020 16:30:24 +01:00 Committer: Josh Poimboeuf CommitterDate: Thu, 10 Sep 2020 10:43:13 -05:00 objtool: Refactor jump table code to support other architectures The way to identify jump tables and retrieve all the data necessary to handle the different execution branches is not the same on all architectures. In order to be able to add other architecture support, define an arch-dependent function to process jump-tables. Reviewed-by: Miroslav Benes Signed-off-by: Raphael Gault [J.T.: Move arm64 bits out of this patch, Have only one function to find the start of the jump table, for now assume that the jump table format will be the same as x86] Signed-off-by: Julien Thierry Signed-off-by: Josh Poimboeuf --- tools/objtool/arch/x86/special.c | 95 +++- tools/objtool/check.c| 90 + tools/objtool/check.h| 1 +- tools/objtool/special.h | 4 +- 4 files changed, 103 insertions(+), 87 deletions(-) diff --git a/tools/objtool/arch/x86/special.c b/tools/objtool/arch/x86/special.c index 34e0e16..fd4af88 100644 --- a/tools/objtool/arch/x86/special.c +++ b/tools/objtool/arch/x86/special.c @@ -1,4 +1,6 @@ // SPDX-License-Identifier: GPL-2.0-or-later +#include + #include "../../special.h" #include "../../builtin.h" @@ -48,3 +50,96 @@ bool arch_support_alt_relocation(struct special_alt *special_alt, return insn->offset == special_alt->new_off && (insn->type == INSN_CALL || is_static_jump(insn)); } + +/* + * There are 3 basic jump table patterns: + * + * 1. jmpq *[rodata addr](,%reg,8) + * + *This is the most common case by far. It jumps to an address in a simple + *jump table which is stored in .rodata. + * + * 2. 
jmpq *[rodata addr](%rip) + * + *This is caused by a rare GCC quirk, currently only seen in three driver + *functions in the kernel, only with certain obscure non-distro configs. + * + *As part of an optimization, GCC makes a copy of an existing switch jump + *table, modifies it, and then hard-codes the jump (albeit with an indirect + *jump) to use a single entry in the table. The rest of the jump table and + *some of its jump targets remain as dead code. + * + *In such a case we can just crudely ignore all unreachable instruction + *warnings for the entire object file. Ideally we would just ignore them + *for the function, but that would require redesigning the code quite a + *bit. And honestly that's just not worth doing: unreachable instruction + *warnings are of questionable value anyway, and this is such a rare issue. + * + * 3. mov [rodata addr],%reg1 + *... some instructions ... + *jmpq *(%reg1,%reg2,8) + * + *This is a fairly uncommon pattern which is new for GCC 6. As of this + *writing, there are 11 occurrences of it in the allmodconfig kernel. + * + *As of GCC 7 there are quite a few more of these and the 'in between' code + *is significant. Esp. with KASAN enabled some of the code between the mov + *and jmpq uses .rodata itself, which can confuse things. + * + *TODO: Once we have DWARF CFI and smarter instruction decoding logic, + *ensure the same register is used in the mov and jump instructions. + * + *NOTE: RETPOLINE made it harder still to decode dynamic jumps. 
+ */ +struct reloc *arch_find_switch_table(struct objtool_file *file, + struct instruction *insn) +{ + struct reloc *text_reloc, *rodata_reloc; + struct section *table_sec; + unsigned long table_offset; + + /* look for a relocation which references .rodata */ + text_reloc = find_reloc_by_dest_range(file->elf, insn->sec, + insn->offset, insn->len); + if (!text_reloc || text_reloc->sym->type != STT_SECTION || + !text_reloc->sym->sec->rodata) + return NULL; + + table_offset = text_reloc->addend; + table_sec = text_reloc->sym->sec; + + if (text_reloc->type == R_X86_64_PC32) + table_offset += 4; + + /* +* Make sure the .rodata address isn't associated with a +* symbol. GCC jump tables are anonymous data. +* +* Also support C jump tables which are in the same format as +* switch jump tables. For objtool to recognize them, they +* need to be placed in the C_JUMP_TABLE_SECTION section. They +* have symbols associated with them. +*/ + if (find_symbol_containing(table_sec, table_offset) && +
Re: [RFC v4 00/18] objtool: Add support for arm64
Hi Josh, On 8/22/19 8:56 PM, Josh Poimboeuf wrote: On Fri, Aug 16, 2019 at 01:23:45PM +0100, Raphael Gault wrote: Hi, Changes since RFC V3: * Rebased on tip/master: Switch/jump table had been refactored * Take Catalin Marinas' comments into account regarding the asm macro for marking exceptions. As of now, objtool only supports the x86_64 architecture but the groundwork has already been done in order to add support for other architectures without too much effort. This series of patches adds support for the arm64 architecture based on the Armv8.5 Architecture Reference Manual. Objtool will be a valuable tool to progress and provide more guarantees on live patching, which is a work in progress for arm64. Once we have the base of objtool working, the next steps will be to port Peter Z's uaccess validation for arm64. Hi Raphael, Sorry about the long delay. I have some comments coming shortly. One general comment: I noticed that several of the (mostly minor) suggested changes I made for v1 haven't been fixed. I'll try to suggest them again here for v4, so you don't need to go back and find them. But in the future please try to incorporate all the comments from previous patch sets before posting new versions. I'm sure it wasn't intentional, as you did acknowledge and agree to most of the changes. But it does waste people's time and goodwill if you neglect to incorporate their suggestions. Thanks. Indeed, sorry about that. Thanks for your comments, I will do my best to address them shortly. However, I won't have access to my professional emails for a little while and probably won't be able to work on this for at least a week. I'll try to have a new version out soon though, using my personal email. Thanks, -- Raphael Gault
[PATCH v4 1/7] perf: arm64: Add test to check userspace access to hardware counters.
This test relies on the fact that the PMU registers are accessible from userspace. It then uses the perf_event_mmap_page to retrieve the counter index and access the underlying register. This test uses sched_setaffinity(2) in order to run on all CPU and thus check the behaviour of the PMU of all cpus in a big.LITTLE environment. Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/include/arch-tests.h | 7 + tools/perf/arch/arm64/tests/Build | 1 + tools/perf/arch/arm64/tests/arch-tests.c | 4 + tools/perf/arch/arm64/tests/user-events.c | 254 + 4 files changed, 266 insertions(+) create mode 100644 tools/perf/arch/arm64/tests/user-events.c diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h index 90ec4c8cb880..6a8483de1015 100644 --- a/tools/perf/arch/arm64/include/arch-tests.h +++ b/tools/perf/arch/arm64/include/arch-tests.h @@ -2,11 +2,18 @@ #ifndef ARCH_TESTS_H #define ARCH_TESTS_H +#include + #ifdef HAVE_DWARF_UNWIND_SUPPORT struct thread; struct perf_sample; +int test__arch_unwind_sample(struct perf_sample *sample, +struct thread *thread); #endif extern struct test arch_tests[]; +int test__rd_pmevcntr(struct test *test __maybe_unused, + int subtest __maybe_unused); + #endif diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build index a61c06bdb757..3f9a20c17fc6 100644 --- a/tools/perf/arch/arm64/tests/Build +++ b/tools/perf/arch/arm64/tests/Build @@ -1,4 +1,5 @@ perf-y += regs_load.o perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o +perf-y += user-events.o perf-y += arch-tests.o diff --git a/tools/perf/arch/arm64/tests/arch-tests.c b/tools/perf/arch/arm64/tests/arch-tests.c index 5b1543c98022..57df9b89dede 100644 --- a/tools/perf/arch/arm64/tests/arch-tests.c +++ b/tools/perf/arch/arm64/tests/arch-tests.c @@ -10,6 +10,10 @@ struct test arch_tests[] = { .func = test__dwarf_unwind, }, #endif + { + .desc = "User counter access", + .func = test__rd_pmevcntr, + }, { .func = NULL, }, diff --git 
a/tools/perf/arch/arm64/tests/user-events.c b/tools/perf/arch/arm64/tests/user-events.c new file mode 100644 index ..b048d7e392bc --- /dev/null +++ b/tools/perf/arch/arm64/tests/user-events.c @@ -0,0 +1,254 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "perf.h" +#include "debug.h" +#include "tests/tests.h" +#include "cloexec.h" +#include "util.h" +#include "arch-tests.h" + +/* + * ARMv8 ARM reserves the following encoding for system registers: + * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview", + * C5.2, version:ARM DDI 0487A.f) + * [20-19] : Op0 + * [18-16] : Op1 + * [15-12] : CRn + * [11-8] : CRm + * [7-5] : Op2 + */ +#define Op0_shift 19 +#define Op0_mask0x3 +#define Op1_shift 16 +#define Op1_mask0x7 +#define CRn_shift 12 +#define CRn_mask0xf +#define CRm_shift 8 +#define CRm_mask0xf +#define Op2_shift 5 +#define Op2_mask0x7 + +#define __stringify(x) #x + +#define read_sysreg(r) ({ \ + u64 __val; \ + asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \ + __val; \ +}) + +#define PMEVCNTR_READ_CASE(idx)\ + case idx: \ + return read_sysreg(pmevcntr##idx##_el0) + +#define PMEVCNTR_CASES(readwrite) \ + PMEVCNTR_READ_CASE(0); \ + PMEVCNTR_READ_CASE(1); \ + PMEVCNTR_READ_CASE(2); \ + PMEVCNTR_READ_CASE(3); \ + PMEVCNTR_READ_CASE(4); \ + PMEVCNTR_READ_CASE(5); \ + PMEVCNTR_READ_CASE(6); \ + PMEVCNTR_READ_CASE(7); \ + PMEVCNTR_READ_CASE(8); \ + PMEVCNTR_READ_CASE(9); \ + PMEVCNTR_READ_CASE(10); \ + PMEVCNTR_READ_CASE(11); \ + PMEVCNTR_READ_CASE(12); \ + PMEVCNTR_READ_CASE(13); \ + PMEVCNTR_READ_CASE(14); \ + PMEVCNTR_READ_CASE(15); \ + PMEVCNTR_READ_CASE(16); \ + PMEVCNTR_READ_CASE(17); \ + PMEVCNTR_READ_CASE(18); \ + PMEVCNTR_READ_CASE(19
[PATCH v4 4/7] arm64: pmu: Add hook to handle pmu-related undefined instructions
This patch introduces a protection for userspace processes which try to access the pmu registers in a big.LITTLE environment. It introduces a hook to handle undefined instructions. The goal here is to prevent the process from being interrupted by a signal when the error is caused by the task being scheduled while accessing a counter, which makes the counter access invalid. As we cannot efficiently know, in that context, the number of counters physically available on both pmus, we consider that any faulting access to a counter which is architecturally correct should not cause a SIGILL signal if the permissions are set accordingly. This commit also modifies the mask of the mrs_hook declared in arch/arm64/kernel/cpufeature.c, which emulates only feature-register accesses. This is necessary because the hook's mask was too large and thus matched any mrs instruction, even ones unrelated to the emulated registers, which made the pmu emulation inefficient. Signed-off-by: Raphael Gault --- arch/arm64/kernel/cpufeature.c | 4 +-- arch/arm64/kernel/perf_event.c | 55 ++ 2 files changed, 57 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 07be444c1e31..3a6285d0b2c0 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -2186,8 +2186,8 @@ static int emulate_mrs(struct pt_regs *regs, u32 insn) } static struct undef_hook mrs_hook = { - .instr_mask = 0xfff0, - .instr_val = 0xd530, + .instr_mask = 0x, + .instr_val = 0xd538, .pstate_mask = PSR_AA32_MODE_MASK, .pstate_val = PSR_MODE_EL0t, .fn = emulate_mrs, diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index a0b4f1bca491..64ca09c9ea65 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -8,9 +8,11 @@ * This code is based heavily on the ARMv7 perf event code. 
*/ +#include #include #include #include +#include #include #include @@ -1012,6 +1014,59 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu) return probe.present ? 0 : -ENODEV; } +static int emulate_pmu(struct pt_regs *regs, u32 insn) +{ + u32 sys_reg, rt; + u32 pmuserenr; + + sys_reg = (u32)aarch64_insn_decode_immediate(AARCH64_INSN_IMM_16, insn) << 5; + rt = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn); + pmuserenr = read_sysreg(pmuserenr_el0); + + if ((pmuserenr & (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) != + (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) + return -EINVAL; + + + /* +* Userspace is expected to only use this in the context of the scheme +* described in the struct perf_event_mmap_page comments. +* +* Given that context, we can only get here if we got migrated between +* getting the register index and doing the MRS read. This in turn +* implies we'll fail the sequence and retry, so any value returned is +* 'good', all we need is to be non-fatal. +* +* The choice of the value 0 comes from the fact that when +* accessing a register which is not counting events but is accessible, +* we get 0. +*/ + pt_regs_write_reg(regs, rt, 0); + + arm64_skip_faulting_instruction(regs, 4); + return 0; +} + +/* + * This hook will only be triggered by mrs + * instructions on PMU registers. This is mandatory + * in order to have a consistent behaviour even on + * big.LITTLE systems. + */ +static struct undef_hook pmu_hook = { + .instr_mask = 0x8800, + .instr_val = 0xd53b8800, + .fn = emulate_pmu, +}; + +static int __init enable_pmu_emulation(void) +{ + register_undef_hook(&pmu_hook); + return 0; +} + +core_initcall(enable_pmu_emulation); + static int armv8_pmu_init(struct arm_pmu *cpu_pmu) { int ret = armv8pmu_probe_pmu(cpu_pmu); -- 2.17.1
[PATCH v4 6/7] arm64: perf: Enable pmu counter direct access for perf event on armv8
Keep track of events opened with direct access to the hardware counters and modify permissions while they are open. The strategy used here is the same as on x86: every time an event is mapped, the permissions are set if required. The atomic field added to the mm_context helps keep track of the events opened, and deactivates the permissions when all are unmapped. We also need to update the permissions in the context-switch code so that tasks keep the right permissions. Signed-off-by: Raphael Gault --- arch/arm64/include/asm/mmu.h | 6 arch/arm64/include/asm/mmu_context.h | 2 ++ arch/arm64/include/asm/perf_event.h | 14 arch/arm64/kernel/perf_event.c | 1 + drivers/perf/arm_pmu.c | 54 5 files changed, 77 insertions(+) diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index fd6161336653..88ed4466bd06 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -18,6 +18,12 @@ typedef struct { atomic64_t id; + + /* +* non-zero if userspace has access to hardware +* counters directly. 
+*/ + atomic_tpmu_direct_access; void*vdso; unsigned long flags; } mm_context_t; diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index 7ed0adb187a8..6e66ff940494 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -21,6 +21,7 @@ #include #include #include +#include #include #include @@ -224,6 +225,7 @@ static inline void __switch_mm(struct mm_struct *next) } check_and_switch_context(next, cpu); + perf_switch_user_access(next); } static inline void diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h index 2bdbc79bbd01..ba58fa726631 100644 --- a/arch/arm64/include/asm/perf_event.h +++ b/arch/arm64/include/asm/perf_event.h @@ -8,6 +8,7 @@ #include #include +#include #defineARMV8_PMU_MAX_COUNTERS 32 #defineARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1) @@ -223,4 +224,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs); (regs)->pstate = PSR_MODE_EL1h; \ } +static inline void perf_switch_user_access(struct mm_struct *mm) +{ + if (!IS_ENABLED(CONFIG_PERF_EVENTS)) + return; + + if (atomic_read(>context.pmu_direct_access)) { + write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR, +pmuserenr_el0); + } else { + write_sysreg(0, pmuserenr_el0); + } +} + #endif diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index de9b001e8b7c..7de56f22d038 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -1285,6 +1285,7 @@ void arch_perf_update_userpage(struct perf_event *event, */ freq = arch_timer_get_rate(); userpg->cap_user_time = 1; + userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR); clocks_calc_mult_shift(>time_mult, , freq, NSEC_PER_SEC, 0); diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c index 2d06b8095a19..d0d3e523a4c4 100644 --- a/drivers/perf/arm_pmu.c +++ b/drivers/perf/arm_pmu.c @@ -25,6 +25,7 @@ #include #include +#include static DEFINE_PER_CPU(struct 
arm_pmu *, cpu_armpmu); static DEFINE_PER_CPU(int, cpu_irq); @@ -778,6 +779,57 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu) _pmu->node); } +static void refresh_pmuserenr(void *mm) +{ + perf_switch_user_access(mm); +} + +static int check_homogeneous_cap(struct perf_event *event, struct mm_struct *mm) +{ + pr_info("checking HAS_HOMOGENEOUS_PMU"); + if (!cpus_have_cap(ARM64_HAS_HOMOGENEOUS_PMU)) { + pr_info("Disable direct access (!HAS_HOMOGENEOUS_PMU)"); + atomic_set(>context.pmu_direct_access, 0); + on_each_cpu(refresh_pmuserenr, mm, 1); + event->hw.flags &= ~ARMPMU_EL0_RD_CNTR; + return 0; + } + + return 1; +} + +static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return; + + /* +* This function relies on not being called concurrently in two +* tasks in the same mm. Otherwise one task could observe +* pmu_direct_access > 1 and return all the way back to +* userspace with user access disabled while another task is still +* doing on_each_cpu_mask() to enable user access. +* +* For now, this can't happen because all callers hold mmap_sem +* for write. If this changes, we'll need a different
[PATCH v4 7/7] Documentation: arm64: Document PMU counters access from userspace
Add a documentation file to describe the access to the pmu hardware counters from userspace. Signed-off-by: Raphael Gault --- .../arm64/pmu_counter_user_access.txt | 42 +++ 1 file changed, 42 insertions(+) create mode 100644 Documentation/arm64/pmu_counter_user_access.txt diff --git a/Documentation/arm64/pmu_counter_user_access.txt b/Documentation/arm64/pmu_counter_user_access.txt new file mode 100644 index ..6788b1107381 --- /dev/null +++ b/Documentation/arm64/pmu_counter_user_access.txt @@ -0,0 +1,42 @@ +Access to PMU hardware counter from userspace += + +Overview + +The perf user-space tool relies on the PMU to monitor events. It offers an +abstraction layer over the hardware counters since the underlying +implementation is cpu-dependent. +Arm64 allows userspace tools to have access to the registers storing the +hardware counters' values directly. + +This targets specifically self-monitoring tasks in order to reduce the overhead +by directly accessing the registers without having to go through the kernel. + +How-to +-- +The focus is set on the armv8 pmuv3, which makes sure that access to the pmu +registers is enabled and that userspace has access to the relevant +information in order to use them. + +In order to have access to the hardware counters it is necessary to open the event +using the perf tool interface: the sys_perf_event_open syscall returns a fd which +can subsequently be used with the mmap syscall in order to retrieve a page of memory +containing information about the event. +The PMU driver uses this page to expose to the user the hardware counter's +index. Using this index enables the user to access the PMU registers using the +`mrs` instruction. + +Have a look at `tools/perf/arch/arm64/tests/user-events.c` for an example. 
It can be +run using the perf tool to check that the access to the registers works +correctly from userspace: + +./perf test -v + +About chained events + +When the user requests an event to be counted on 64 bits, two hardware +counters are used and need to be combined to retrieve the correct value: + +val = read_counter(idx); +if ((event.attr.config1 & 0x1)) + val = (val << 32) | read_counter(idx - 1); -- 2.17.1
[PATCH v4 2/7] arm64: cpu: Add accessor for boot_cpu_data
Mark boot_cpu_data as read-only after initialization. Define an accessor to read boot_cpu_data from outside of cpuinfo.c, where it is static. Signed-off-by: Raphael Gault --- arch/arm64/include/asm/cpu.h | 2 +- arch/arm64/kernel/cpuinfo.c | 7 ++- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h index d72d995b7e25..6abc2faf1a64 100644 --- a/arch/arm64/include/asm/cpu.h +++ b/arch/arm64/include/asm/cpu.h @@ -62,5 +62,5 @@ void __init cpuinfo_store_boot_cpu(void); void __init init_cpu_features(struct cpuinfo_arm64 *info); void update_cpu_features(int cpu, struct cpuinfo_arm64 *info, struct cpuinfo_arm64 *boot); - +struct cpuinfo_arm64 *get_boot_cpu_data(void); #endif /* __ASM_CPU_H */ diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c index 876055e37352..ffa00b3a148b 100644 --- a/arch/arm64/kernel/cpuinfo.c +++ b/arch/arm64/kernel/cpuinfo.c @@ -31,7 +31,7 @@ * values depending on configuration at or after reset. */ DEFINE_PER_CPU(struct cpuinfo_arm64, cpu_data); -static struct cpuinfo_arm64 boot_cpu_data; +static struct cpuinfo_arm64 boot_cpu_data __ro_after_init; static char *icache_policy_str[] = { [0 ... ICACHE_POLICY_PIPT] = "RESERVED/UNKNOWN", @@ -395,4 +395,9 @@ void __init cpuinfo_store_boot_cpu(void) init_cpu_features(&boot_cpu_data); } +struct cpuinfo_arm64 *get_boot_cpu_data(void) +{ + return &boot_cpu_data; +} + device_initcall(cpuinfo_regs_init); -- 2.17.1
[PATCH v4 3/7] arm64: cpufeature: Add feature to detect homogeneous systems
This feature is required in order to enable PMU counters direct access from userspace only when the system is homogeneous. This feature checks the model of each CPU brought online and compares it to the boot CPU. If it differs then it is heterogeneous. This CPU feature doesn't prevent different models of CPUs from being hotplugged on, however if such a scenario happens, it will turn off the feature. There is no possibility for the feature to be turned on again by hotplugging off CPUs though. Signed-off-by: Raphael Gault --- arch/arm64/include/asm/cpucaps.h| 3 ++- arch/arm64/include/asm/cpufeature.h | 10 ++ arch/arm64/kernel/cpufeature.c | 28 3 files changed, 40 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index f19fe4b9acc4..1cd73cf46116 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -52,7 +52,8 @@ #define ARM64_HAS_IRQ_PRIO_MASKING 42 #define ARM64_HAS_DCPODP 43 #define ARM64_WORKAROUND_1463225 44 +#define ARM64_HAS_HOMOGENEOUS_PMU 45 -#define ARM64_NCAPS45 +#define ARM64_NCAPS46 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 407e2bf23676..c54a87896bbd 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -430,6 +430,16 @@ static inline void cpus_set_cap(unsigned int num) } } +static inline void cpus_unset_cap(unsigned int num) +{ + if (num >= ARM64_NCAPS) { + pr_warn("Attempt to unset an illegal CPU capability (%d >= %d)\n", + num, ARM64_NCAPS); + } else { + clear_bit(num, cpu_hwcaps); + } +} + static inline int __attribute_const__ cpuid_feature_extract_signed_field_width(u64 features, int field, int width) { diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index f29f36a65175..07be444c1e31 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1248,6 +1248,23 @@ static bool 
can_use_gic_priorities(const struct arm64_cpu_capabilities *entry, } #endif +static bool has_homogeneous_pmu(const struct arm64_cpu_capabilities *entry, + int scope) +{ + u32 model = read_cpuid_id() & MIDR_CPU_MODEL_MASK; + struct cpuinfo_arm64 *boot = get_boot_cpu_data(); + + return (boot->reg_midr & MIDR_CPU_MODEL_MASK) == model; +} + +static void disable_homogeneous_cap(const struct arm64_cpu_capabilities *entry) +{ + if (!has_homogeneous_pmu(entry, entry->type)) { + pr_info("Disabling Homogeneous PMU (%d)", entry->capability); + cpus_unset_cap(entry->capability); + } +} + static const struct arm64_cpu_capabilities arm64_features[] = { { .desc = "GIC system register CPU interface", @@ -1548,6 +1565,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .min_field_value = 1, }, #endif + { + /* +* Detect whether the system is heterogeneous or +* homogeneous +*/ + .desc = "Homogeneous CPUs", + .capability = ARM64_HAS_HOMOGENEOUS_PMU, + .type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE, + .matches = has_homogeneous_pmu, + .cpu_enable = disable_homogeneous_cap, + }, {}, }; -- 2.17.1
[PATCH v4 5/7] arm64: pmu: Add function implementation to update event index in userpage.
In order to be able to access the counter directly for userspace, we need to provide the index of the counter using the userpage. We thus need to override the event_idx function to retrieve and convert the perf_event index to armv8 hardware index. Since the arm_pmu driver can be used by any implementation, even if not armv8, two components play a role into making sure the behaviour is correct and consistent with the PMU capabilities: * the ARMPMU_EL0_RD_CNTR flag which denotes the capability to access counter from userspace. * the event_idx call back, which is implemented and initialized by the PMU implementation: if no callback is provided, the default behaviour applies, returning 0 as index value. Signed-off-by: Raphael Gault --- arch/arm64/kernel/perf_event.c | 21 + include/linux/perf/arm_pmu.h | 2 ++ 2 files changed, 23 insertions(+) diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 64ca09c9ea65..de9b001e8b7c 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -820,6 +820,22 @@ static void armv8pmu_clear_event_idx(struct pmu_hw_events *cpuc, clear_bit(idx - 1, cpuc->used_mask); } +static int armv8pmu_access_event_idx(struct perf_event *event) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return 0; + + /* +* We remap the cycle counter index to 32 to +* match the offset applied to the rest of +* the counter indices. +*/ + if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER) + return 32; + + return event->hw.idx; +} + /* * Add an event filter to a given event. 
*/ @@ -913,6 +929,9 @@ static int __armv8_pmuv3_map_event(struct perf_event *event, if (armv8pmu_event_is_64bit(event)) event->hw.flags |= ARMPMU_EVT_64BIT; + if (cpus_have_cap(ARM64_HAS_HOMOGENEOUS_PMU)) + event->hw.flags |= ARMPMU_EL0_RD_CNTR; + /* Only expose micro/arch events supported by this PMU */ if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS) && test_bit(hw_event_id, armpmu->pmceid_bitmap)) { @@ -1086,6 +1105,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu) cpu_pmu->set_event_filter = armv8pmu_set_event_filter; cpu_pmu->filter_match = armv8pmu_filter_match; + cpu_pmu->pmu.event_idx = armv8pmu_access_event_idx; + return 0; } diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h index 71f525a35ac2..1106a9ac00fd 100644 --- a/include/linux/perf/arm_pmu.h +++ b/include/linux/perf/arm_pmu.h @@ -26,6 +26,8 @@ */ /* Event uses a 64bit counter */ #define ARMPMU_EVT_64BIT 1 +/* Allow access to hardware counter from userspace */ +#define ARMPMU_EL0_RD_CNTR 2 #define HW_OP_UNSUPPORTED 0x #define C(_x) PERF_COUNT_HW_CACHE_##_x -- 2.17.1
[PATCH v4 0/7] arm64: Enable access to pmu registers by user-space
Hi, Changes since v3: * Rebased on will/for-next/perf in order to include this patch [1] * Re-introduce `mrs` hook used in previous versions * Invert cpu feature to track homogeneity instead of heterogeneity * Introduce accessor for boot_cpu_data (see second commit for more info) * Apply Mark Rutland's comments The perf user-space tool relies on the PMU to monitor events. It offers an abstraction layer over the hardware counters since the underlying implementation is cpu-dependent. We want to allow userspace tools to have access to the registers storing the hardware counters' values directly. This targets specifically self-monitoring tasks in order to reduce the overhead by directly accessing the registers without having to go through the kernel. In order to do this we need to set up the pmu so that it exposes its registers to userspace access. The first patch adds a test to the perf tool so that we can test that the access to the registers works correctly from userspace. The second patch introduces an accessor for `boot_cpu_data`, which is static. Including cpu.h turned out to cause a chain of dependencies, so I opted for the accessor since it is not much used. The third patch adds a capability in the arm64 cpufeatures framework in order to detect when we are running on a homogeneous system. The fourth patch re-introduces the hooks for handling undefined instructions, for `mrs` instructions on pmu-related registers. The fifth patch focuses on the armv8 pmuv3 PMU support and makes sure that access to the pmu registers is enabled and that userspace has access to the relevant information in order to use them. The sixth patch puts in place callbacks to enable access to the hardware counters from userspace when a compatible event is opened using the perf API. The seventh patch adds short documentation about PMU counters direct access from userspace. 
[1]: https://lkml.org/lkml/2019/8/20/875 Raphael Gault (7): perf: arm64: Add test to check userspace access to hardware counters. arm64: cpu: Add accessor for boot_cpu_data arm64: cpufeature: Add feature to detect homogeneous systems arm64: pmu: Add hook to handle pmu-related undefined instructions arm64: pmu: Add function implementation to update event index in userpage. arm64: perf: Enable pmu counter direct access for perf event on armv8 Documentation: arm64: Document PMU counters access from userspace .../arm64/pmu_counter_user_access.txt | 42 +++ arch/arm64/include/asm/cpu.h | 2 +- arch/arm64/include/asm/cpucaps.h | 3 +- arch/arm64/include/asm/cpufeature.h | 10 + arch/arm64/include/asm/mmu.h | 6 + arch/arm64/include/asm/mmu_context.h | 2 + arch/arm64/include/asm/perf_event.h | 14 + arch/arm64/kernel/cpufeature.c| 32 ++- arch/arm64/kernel/cpuinfo.c | 7 +- arch/arm64/kernel/perf_event.c| 77 ++ drivers/perf/arm_pmu.c| 54 include/linux/perf/arm_pmu.h | 2 + tools/perf/arch/arm64/include/arch-tests.h| 7 + tools/perf/arch/arm64/tests/Build | 1 + tools/perf/arch/arm64/tests/arch-tests.c | 4 + tools/perf/arch/arm64/tests/user-events.c | 254 ++ 16 files changed, 512 insertions(+), 5 deletions(-) create mode 100644 Documentation/arm64/pmu_counter_user_access.txt create mode 100644 tools/perf/arch/arm64/tests/user-events.c -- 2.17.1
[PATCH] arm64: perf_event: Add missing header needed for smp_processor_id()
Acked-by: Mark Rutland Signed-off-by: Raphael Gault --- arch/arm64/kernel/perf_event.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 96e90e270042..24575c0a0065 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -19,6 +19,7 @@ #include #include #include +#include /* ARMv8 Cortex-A53 specific event types. */ #define ARMV8_A53_PERFCTR_PREF_LINEFILL0xC2 -- 2.17.1
Re: [PATCH v3 2/5] arm64: cpufeature: Add feature to detect heterogeneous systems
Hi Mark, Thank you for your comments. On 8/20/19 4:49 PM, Mark Rutland wrote: On Tue, Aug 20, 2019 at 04:23:17PM +0100, Mark Rutland wrote: Hi Raphael, On Fri, Aug 16, 2019 at 01:59:31PM +0100, Raphael Gault wrote: This feature is required in order to enable PMU counters direct access from userspace only when the system is homogeneous. This feature checks the model of each CPU brought online and compares it to the boot CPU. If it differs then it is heterogeneous. It would be worth noting that this patch prevents heterogeneous CPUs being brought online late if the system was uniform at boot time. Looking again, I think I'd misunderstood how ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU was dealt with, but we do have a problem in this area. [...] + .capability = ARM64_HAS_HETEROGENEOUS_PMU, + .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU, + .matches = has_heterogeneous_pmu, + }, I had a quick chat with Will, and we concluded that we must permit late onlining of heterogeneous CPUs here as people are likely to rely on late CPU onlining on some heterogeneous systems. I think the above permits that, but that also means that we need some support code to fail gracefully in that case (e.g. without sending a SIGILL to unaware userspace code). I understand, however, I understood that ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU did not allow later CPU to be heterogeneous if the capability wasn't already enabled. Thus if as you say we need to allow the system to switch from homogeneous to heterogeneous, then I should change the type of this capability. That means that we'll need the counter emulation code that you had in previous versions of this patch (e.g. to handle potential UNDEFs when a new CPU has fewer counters than the previously online CPUs). Further, I think the context switch (and event index) code needs to take this cap into account, and disable direct access once the system becomes heterogeneous. That is a good point indeed. Thanks, -- Raphael Gault
Re: [PATCH v3 4/5] arm64: perf: Enable pmu counter direct access for perf event on armv8
Hi, On 8/18/19 1:37 PM, kbuild test robot wrote: Hi Raphael, Thank you for the patch! Yet something to improve: [auto build test ERROR on linus/master] [cannot apply to v5.3-rc4 next-20190816] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] This patchset was based on linux-next/master and not linus' tree. Thanks, -- Raphael Gault
[PATCH v3 2/5] arm64: cpufeature: Add feature to detect heterogeneous systems
This feature is required in order to enable direct access to PMU counters from userspace only when the system is homogeneous. This feature checks the model of each CPU brought online and compares it to the boot CPU. If it differs, the system is heterogeneous. Signed-off-by: Raphael Gault --- arch/arm64/include/asm/cpucaps.h | 3 ++- arch/arm64/kernel/cpufeature.c | 20 arch/arm64/kernel/perf_event.c | 1 + 3 files changed, 23 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index f19fe4b9acc4..040370af38ad 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -52,7 +52,8 @@ #define ARM64_HAS_IRQ_PRIO_MASKING 42 #define ARM64_HAS_DCPODP 43 #define ARM64_WORKAROUND_1463225 44 +#define ARM64_HAS_HETEROGENEOUS_PMU 45 -#define ARM64_NCAPS 45 +#define ARM64_NCAPS 46 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 9323bcc40a58..bbdd809f12a6 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1260,6 +1260,15 @@ static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry, } #endif +static bool has_heterogeneous_pmu(const struct arm64_cpu_capabilities *entry, + int scope) +{ + u32 model = read_cpuid_id() & MIDR_CPU_MODEL_MASK; + struct cpuinfo_arm64 *boot = &per_cpu(cpu_data, 0); + + return (boot->reg_midr & MIDR_CPU_MODEL_MASK) != model; +} + static const struct arm64_cpu_capabilities arm64_features[] = { { .desc = "GIC system register CPU interface", @@ -1560,6 +1569,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .min_field_value = 1, }, #endif + { + /* + * Detect whether the system is heterogeneous or + * homogeneous + */ + .desc = "Detect whether we have heterogeneous CPUs", + .capability = ARM64_HAS_HETEROGENEOUS_PMU, + .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU, + .matches = has_heterogeneous_pmu, + }, {}, }; @@ -1727,6 +1746,7 @@
static void __init setup_elf_hwcaps(const struct arm64_cpu_capabilities *hwcaps) cap_set_elf_hwcap(hwcaps); } + static void update_cpu_capabilities(u16 scope_mask) { int i; diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 2d3bdebdf6df..a0b4f1bca491 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -19,6 +19,7 @@ #include #include #include +#include /* ARMv8 Cortex-A53 specific event types. */ #define ARMV8_A53_PERFCTR_PREF_LINEFILL0xC2 -- 2.17.1
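The capability above compares the masked MIDR of each late CPU against the boot CPU. A minimal standalone sketch of that comparison in plain C follows; note that `MODEL_MASK` is an assumption for illustration (the kernel's `MIDR_CPU_MODEL_MASK` covers implementer, architecture, and part number while deliberately ignoring variant and revision, so two revisions of the same core still count as the same model):

```c
#include <stdint.h>

/*
 * Standalone mirror of the has_heterogeneous_pmu() comparison above.
 * MODEL_MASK is an illustrative assumption: implementer (bits 31:24),
 * architecture (19:16) and part number (15:4); variant (23:20) and
 * revision (3:0) are ignored on purpose.
 */
#define MODEL_MASK 0xff0ffff0u

/* Returns 1 if the late CPU's MIDR denotes a different model than boot. */
static inline int midr_mismatch(uint32_t boot_midr, uint32_t late_midr)
{
    return (boot_midr & MODEL_MASK) != (late_midr & MODEL_MASK);
}
```

With this mask, two cores differing only in the variant/revision fields compare equal, while a different part number marks the system heterogeneous.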
[PATCH v3 1/5] perf: arm64: Add test to check userspace access to hardware counters.
This test relies on the fact that the PMU registers are accessible from userspace. It then uses the perf_event_mmap_page to retrieve the counter index and access the underlying register. This test uses sched_setaffinity(2) in order to run on all CPU and thus check the behaviour of the PMU of all cpus in a big.LITTLE environment. Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/include/arch-tests.h | 7 + tools/perf/arch/arm64/tests/Build | 1 + tools/perf/arch/arm64/tests/arch-tests.c | 4 + tools/perf/arch/arm64/tests/user-events.c | 254 + 4 files changed, 266 insertions(+) create mode 100644 tools/perf/arch/arm64/tests/user-events.c diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h index 90ec4c8cb880..6a8483de1015 100644 --- a/tools/perf/arch/arm64/include/arch-tests.h +++ b/tools/perf/arch/arm64/include/arch-tests.h @@ -2,11 +2,18 @@ #ifndef ARCH_TESTS_H #define ARCH_TESTS_H +#include + #ifdef HAVE_DWARF_UNWIND_SUPPORT struct thread; struct perf_sample; +int test__arch_unwind_sample(struct perf_sample *sample, +struct thread *thread); #endif extern struct test arch_tests[]; +int test__rd_pmevcntr(struct test *test __maybe_unused, + int subtest __maybe_unused); + #endif diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build index a61c06bdb757..3f9a20c17fc6 100644 --- a/tools/perf/arch/arm64/tests/Build +++ b/tools/perf/arch/arm64/tests/Build @@ -1,4 +1,5 @@ perf-y += regs_load.o perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o +perf-y += user-events.o perf-y += arch-tests.o diff --git a/tools/perf/arch/arm64/tests/arch-tests.c b/tools/perf/arch/arm64/tests/arch-tests.c index 5b1543c98022..57df9b89dede 100644 --- a/tools/perf/arch/arm64/tests/arch-tests.c +++ b/tools/perf/arch/arm64/tests/arch-tests.c @@ -10,6 +10,10 @@ struct test arch_tests[] = { .func = test__dwarf_unwind, }, #endif + { + .desc = "User counter access", + .func = test__rd_pmevcntr, + }, { .func = NULL, }, diff --git 
a/tools/perf/arch/arm64/tests/user-events.c b/tools/perf/arch/arm64/tests/user-events.c new file mode 100644 index ..b048d7e392bc --- /dev/null +++ b/tools/perf/arch/arm64/tests/user-events.c @@ -0,0 +1,254 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "perf.h" +#include "debug.h" +#include "tests/tests.h" +#include "cloexec.h" +#include "util.h" +#include "arch-tests.h" + +/* + * ARMv8 ARM reserves the following encoding for system registers: + * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview", + * C5.2, version:ARM DDI 0487A.f) + * [20-19] : Op0 + * [18-16] : Op1 + * [15-12] : CRn + * [11-8] : CRm + * [7-5] : Op2 + */ +#define Op0_shift 19 +#define Op0_mask0x3 +#define Op1_shift 16 +#define Op1_mask0x7 +#define CRn_shift 12 +#define CRn_mask0xf +#define CRm_shift 8 +#define CRm_mask0xf +#define Op2_shift 5 +#define Op2_mask0x7 + +#define __stringify(x) #x + +#define read_sysreg(r) ({ \ + u64 __val; \ + asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \ + __val; \ +}) + +#define PMEVCNTR_READ_CASE(idx)\ + case idx: \ + return read_sysreg(pmevcntr##idx##_el0) + +#define PMEVCNTR_CASES(readwrite) \ + PMEVCNTR_READ_CASE(0); \ + PMEVCNTR_READ_CASE(1); \ + PMEVCNTR_READ_CASE(2); \ + PMEVCNTR_READ_CASE(3); \ + PMEVCNTR_READ_CASE(4); \ + PMEVCNTR_READ_CASE(5); \ + PMEVCNTR_READ_CASE(6); \ + PMEVCNTR_READ_CASE(7); \ + PMEVCNTR_READ_CASE(8); \ + PMEVCNTR_READ_CASE(9); \ + PMEVCNTR_READ_CASE(10); \ + PMEVCNTR_READ_CASE(11); \ + PMEVCNTR_READ_CASE(12); \ + PMEVCNTR_READ_CASE(13); \ + PMEVCNTR_READ_CASE(14); \ + PMEVCNTR_READ_CASE(15); \ + PMEVCNTR_READ_CASE(16); \ + PMEVCNTR_READ_CASE(17); \ + PMEVCNTR_READ_CASE(18); \ + PMEVCNTR_READ_CASE(19
[PATCH v3 4/5] arm64: perf: Enable pmu counter direct access for perf event on armv8
Keep track of events opened with direct access to the hardware counters and modify permissions while they are open. The strategy used here is the same as on x86: every time an event is mapped, the permissions are set if required. The atomic field added to the mm_context helps keep track of the events opened and deactivates the permissions when all are unmapped. We also need to update the permissions in the context-switch code so that tasks keep the right permissions. Signed-off-by: Raphael Gault --- arch/arm64/include/asm/mmu.h | 6 + arch/arm64/include/asm/mmu_context.h | 2 ++ arch/arm64/include/asm/perf_event.h | 14 ++ drivers/perf/arm_pmu.c | 38 4 files changed, 60 insertions(+) diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index fd6161336653..88ed4466bd06 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -18,6 +18,12 @@ typedef struct { atomic64_t id; + + /* + * non-zero if userspace has access to hardware + * counters directly.
+ */ + atomic_t pmu_direct_access; void *vdso; unsigned long flags; } mm_context_t; diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index 7ed0adb187a8..6e66ff940494 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -21,6 +21,7 @@ #include #include #include +#include #include #include @@ -224,6 +225,7 @@ static inline void __switch_mm(struct mm_struct *next) } check_and_switch_context(next, cpu); + perf_switch_user_access(next); } static inline void diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h index 2bdbc79bbd01..ba58fa726631 100644 --- a/arch/arm64/include/asm/perf_event.h +++ b/arch/arm64/include/asm/perf_event.h @@ -8,6 +8,7 @@ #include #include +#include #define ARMV8_PMU_MAX_COUNTERS 32 #define ARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1) @@ -223,4 +224,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs); (regs)->pstate = PSR_MODE_EL1h; \ } +static inline void perf_switch_user_access(struct mm_struct *mm) +{ + if (!IS_ENABLED(CONFIG_PERF_EVENTS)) + return; + + if (atomic_read(&mm->context.pmu_direct_access)) { + write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR, + pmuserenr_el0); + } else { + write_sysreg(0, pmuserenr_el0); + } +} + #endif diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c index df352b334ea7..3a48cc9f17af 100644 --- a/drivers/perf/arm_pmu.c +++ b/drivers/perf/arm_pmu.c @@ -25,6 +25,7 @@ #include #include +#include static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu); static DEFINE_PER_CPU(int, cpu_irq); @@ -778,6 +779,41 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu) &cpu_pmu->node); } +static void refresh_pmuserenr(void *mm) +{ + perf_switch_user_access(mm); +} + +static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return; + + /* + * This function relies on not being called concurrently in two + * tasks in the
same mm. Otherwise one task could observe + * pmu_direct_access > 1 and return all the way back to + * userspace with user access disabled while another task is still + * doing on_each_cpu_mask() to enable user access. + * + * For now, this can't happen because all callers hold mmap_sem + * for write. If this changes, we'll need a different solution. + */ + lockdep_assert_held_write(&mm->mmap_sem); + + if (atomic_inc_return(&mm->context.pmu_direct_access) == 1) + on_each_cpu(refresh_pmuserenr, mm, 1); +} + +static void armpmu_event_unmapped(struct perf_event *event, struct mm_struct *mm) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return; + + if (atomic_dec_and_test(&mm->context.pmu_direct_access)) + on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1); +} + static struct arm_pmu *__armpmu_alloc(gfp_t flags) { struct arm_pmu *pmu; @@ -799,6 +835,8 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags) .pmu_enable = armpmu_enable, .pmu_disable = armpmu_disable, .event_init = armpmu_event_init, + .event_mapped = armpmu_event_mapped, + .event_unmapped = armpmu_event_unmapped, .add = armpmu_add, .del = armpmu_del, .start = armpmu_start, -- 2.17.1
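The mapped/unmapped callbacks in this patch implement a classic first-mapper-enables / last-unmapper-disables refcount: only the 0→1 transition turns user access on and only the 1→0 transition turns it off. A minimal userspace C analog of just that counting logic (all names invented for illustration; a boolean stands in for the pmuserenr_el0 state):

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userspace sketch of the gating pattern used by armpmu_event_mapped()/
 * armpmu_event_unmapped() above.  Illustrative only.
 */
static atomic_int direct_access_count;
static bool access_enabled;          /* stands in for pmuserenr_el0 state */

static void event_mapped(void)
{
    /* mirrors: atomic_inc_return(...) == 1 */
    if (atomic_fetch_add(&direct_access_count, 1) + 1 == 1)
        access_enabled = true;       /* on_each_cpu(refresh_pmuserenr, ...) */
}

static void event_unmapped(void)
{
    /* mirrors: atomic_dec_and_test(...) */
    if (atomic_fetch_sub(&direct_access_count, 1) - 1 == 0)
        access_enabled = false;
}
```

Two concurrent mappings keep access enabled until the last one goes away, which is exactly why the kernel comment above insists the callbacks are serialized by mmap_sem.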
[PATCH v3 5/5] Documentation: arm64: Document PMU counters access from userspace
Add a documentation file to describe the access to the pmu hardware counters from userspace Signed-off-by: Raphael Gault --- .../arm64/pmu_counter_user_access.txt | 42 +++ 1 file changed, 42 insertions(+) create mode 100644 Documentation/arm64/pmu_counter_user_access.txt diff --git a/Documentation/arm64/pmu_counter_user_access.txt b/Documentation/arm64/pmu_counter_user_access.txt new file mode 100644 index ..6788b1107381 --- /dev/null +++ b/Documentation/arm64/pmu_counter_user_access.txt @@ -0,0 +1,42 @@ +Access to PMU hardware counter from userspace += + +Overview + +The perf user-space tool relies on the PMU to monitor events. It offers an +abstraction layer over the hardware counters since the underlying +implementation is cpu-dependent. +Arm64 allows userspace tools to have access to the registers storing the +hardware counters' values directly. + +This targets specifically self-monitoring tasks in order to reduce the overhead +by directly accessing the registers without having to go through the kernel. + +How-to +-- +The focus is set on the armv8 pmuv3, which makes sure that access to the pmu +registers is enabled and that userspace has access to the relevant +information in order to use them. + +In order to have access to the hardware counters it is necessary to open the event +using the perf tool interface: the sys_perf_event_open syscall returns a fd which +can subsequently be used with the mmap syscall in order to retrieve a page of memory +containing information about the event. +The PMU driver uses this page to expose to the user the hardware counter's +index. Using this index enables the user to access the PMU registers using the +`mrs` instruction. + +Have a look at `tools/perf/arch/arm64/tests/user-events.c` for an example.
It can be +run using the perf tool to check that the access to the registers works +correctly from userspace: + +./perf test -v + +About chained events + +When the user requests an event to be counted on 64 bits, two hardware +counters are used and need to be combined to retrieve the correct value: + +val = read_counter(idx); +if ((event.attr.config1 & 0x1)) + val = (val << 32) | read_counter(idx - 1); -- 2.17.1
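The chained-event pseudo-code in the documentation above combines two 32-bit hardware counters into one 64-bit value: counter `idx` holds the high half and counter `idx - 1` the low half. A small self-contained sketch of that combination (here `high`/`low` stand in for the two `read_counter()` results, which on the real hardware come from `mrs` reads of pmevcntr registers):

```c
#include <stdint.h>

/*
 * Sketch of the chained-counter combination described in the document:
 * when config1 bit 0 requests a 64-bit count, the value read from
 * counter idx is the high word and counter idx - 1 supplies the low word.
 */
static uint64_t combine_chained(uint64_t high, uint64_t low, int is_64bit)
{
    uint64_t val = high;

    if (is_64bit)
        val = (val << 32) | (low & 0xffffffffu);
    return val;
}
```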
[PATCH v3 3/5] arm64: pmu: Add function implementation to update event index in userpage.
In order to be able to access the counter directly from userspace, we need to provide the index of the counter using the userpage. We thus need to override the event_idx function to retrieve and convert the perf_event index to the armv8 hardware index. Since the arm_pmu driver can be used by any implementation, even if not armv8, two components play a role in making sure the behaviour is correct and consistent with the PMU capabilities: * the ARMPMU_EL0_RD_CNTR flag which denotes the capability to access counters from userspace. * the event_idx callback, which is implemented and initialized by the PMU implementation: if no callback is provided, the default behaviour applies, returning 0 as index value. Signed-off-by: Raphael Gault --- arch/arm64/kernel/perf_event.c | 22 ++ include/linux/perf/arm_pmu.h | 2 ++ 2 files changed, 24 insertions(+) diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index a0b4f1bca491..9fe3f6909513 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -818,6 +818,22 @@ static void armv8pmu_clear_event_idx(struct pmu_hw_events *cpuc, clear_bit(idx - 1, cpuc->used_mask); } +static int armv8pmu_access_event_idx(struct perf_event *event) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return 0; + + /* + * We remap the cycle counter index to 32 to + * match the offset applied to the rest of + * the counter indices. + */ + if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER) + return 32; + + return event->hw.idx; +} + /* * Add an event filter to a given event.
*/ @@ -911,6 +927,9 @@ static int __armv8_pmuv3_map_event(struct perf_event *event, if (armv8pmu_event_is_64bit(event)) event->hw.flags |= ARMPMU_EVT_64BIT; + if (!cpus_have_const_cap(ARM64_HAS_HETEROGENEOUS_PMU)) + event->hw.flags |= ARMPMU_EL0_RD_CNTR; + /* Only expose micro/arch events supported by this PMU */ if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS) && test_bit(hw_event_id, armpmu->pmceid_bitmap)) { @@ -1031,6 +1050,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu) cpu_pmu->set_event_filter = armv8pmu_set_event_filter; cpu_pmu->filter_match = armv8pmu_filter_match; + cpu_pmu->pmu.event_idx = armv8pmu_access_event_idx; + return 0; } @@ -1209,6 +1230,7 @@ void arch_perf_update_userpage(struct perf_event *event, */ freq = arch_timer_get_rate(); userpg->cap_user_time = 1; + userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR); clocks_calc_mult_shift(>time_mult, , freq, NSEC_PER_SEC, 0); diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h index 71f525a35ac2..1106a9ac00fd 100644 --- a/include/linux/perf/arm_pmu.h +++ b/include/linux/perf/arm_pmu.h @@ -26,6 +26,8 @@ */ /* Event uses a 64bit counter */ #define ARMPMU_EVT_64BIT 1 +/* Allow access to hardware counter from userspace */ +#define ARMPMU_EL0_RD_CNTR 2 #define HW_OP_UNSUPPORTED 0x #define C(_x) PERF_COUNT_HW_CACHE_##_x -- 2.17.1
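The `armv8pmu_access_event_idx()` callback above returns 0 when direct access is not allowed and remaps the cycle counter out of the way of the event counters. A standalone mirror of that decision logic follows; `CYCLE_COUNTER_IDX` and `RD_CNTR_FLAG` are illustrative stand-ins (in the kernel headers of this series, `ARMV8_IDX_CYCLE_COUNTER` is the lowest index and event counters follow it):

```c
/*
 * Mirror of armv8pmu_access_event_idx() above.  The constants are
 * assumptions for illustration: CYCLE_COUNTER_IDX stands in for
 * ARMV8_IDX_CYCLE_COUNTER and RD_CNTR_FLAG for ARMPMU_EL0_RD_CNTR.
 */
#define CYCLE_COUNTER_IDX 0
#define RD_CNTR_FLAG      0x2

static int access_event_idx(int flags, int hw_idx)
{
    if (!(flags & RD_CNTR_FLAG))
        return 0;       /* no direct access: userspace must fall back */

    if (hw_idx == CYCLE_COUNTER_IDX)
        return 32;      /* keep it clear of the regular counter range */

    return hw_idx;
}
```

Returning 0 is meaningful to userspace: a zero index in the mmap'd event page tells the reader that direct counter access is unavailable for this event.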
[PATCH v3 0/5] arm64: Enable access to pmu registers by user-space
Hi, Changes since v2: * Rebased on linux-next/master again (next-20190814) * Use linux/compiler.h header as suggested by Arnaldo The perf user-space tool relies on the PMU to monitor events. It offers an abstraction layer over the hardware counters since the underlying implementation is cpu-dependent. We want to allow userspace tools to have access to the registers storing the hardware counters' values directly. This targets specifically self-monitoring tasks in order to reduce the overhead by directly accessing the registers without having to go through the kernel. In order to do this we need to set up the pmu so that it exposes its registers to userspace access. The first patch adds a test to the perf tool so that we can test that the access to the registers works correctly from userspace. The second patch adds a capability in the arm64 cpufeatures framework in order to detect when we are running on a heterogeneous system. The third patch focuses on the armv8 pmuv3 PMU support and makes sure that access to the pmu registers is enabled and that userspace has access to the relevant information in order to use them. The fourth patch puts in place callbacks to enable access to the hardware counters from userspace when a compatible event is opened using the perf API. The fifth patch adds a short documentation about PMU counters direct access from userspace. Raphael Gault (5): perf: arm64: Add test to check userspace access to hardware counters. arm64: cpufeature: Add feature to detect heterogeneous systems arm64: pmu: Add function implementation to update event index in userpage.
arm64: perf: Enable pmu counter direct access for perf event on armv8 Documentation: arm64: Document PMU counters access from userspace .../arm64/pmu_counter_user_access.txt | 42 +++ arch/arm64/include/asm/cpucaps.h | 3 +- arch/arm64/include/asm/mmu.h | 6 + arch/arm64/include/asm/mmu_context.h | 2 + arch/arm64/include/asm/perf_event.h | 14 + arch/arm64/kernel/cpufeature.c| 20 ++ arch/arm64/kernel/perf_event.c| 23 ++ drivers/perf/arm_pmu.c| 38 +++ include/linux/perf/arm_pmu.h | 2 + tools/perf/arch/arm64/include/arch-tests.h| 7 + tools/perf/arch/arm64/tests/Build | 1 + tools/perf/arch/arm64/tests/arch-tests.c | 4 + tools/perf/arch/arm64/tests/user-events.c | 254 ++ 13 files changed, 415 insertions(+), 1 deletion(-) create mode 100644 Documentation/arm64/pmu_counter_user_access.txt create mode 100644 tools/perf/arch/arm64/tests/user-events.c -- 2.17.1
[RFC v4 02/18] objtool: orc: Refactor ORC API for other architectures to implement.
The ORC unwinder is only supported on x86 at the moment and should thus be in the x86 architecture code. In order not to break the whole structure in case another architecture decides to support the ORC unwinder via objtool we choose to let the implementation be done in the architecture dependent code. Signed-off-by: Raphael Gault --- tools/objtool/Build | 2 - tools/objtool/arch.h| 3 + tools/objtool/arch/x86/Build| 2 + tools/objtool/{ => arch/x86}/orc_dump.c | 4 +- tools/objtool/{ => arch/x86}/orc_gen.c | 104 ++-- tools/objtool/check.c | 99 +- tools/objtool/orc.h | 4 +- 7 files changed, 111 insertions(+), 107 deletions(-) rename tools/objtool/{ => arch/x86}/orc_dump.c (98%) rename tools/objtool/{ => arch/x86}/orc_gen.c (66%) diff --git a/tools/objtool/Build b/tools/objtool/Build index 8dc4f0848362..d069d26d97fa 100644 --- a/tools/objtool/Build +++ b/tools/objtool/Build @@ -2,8 +2,6 @@ objtool-y += arch/$(SRCARCH)/ objtool-y += builtin-check.o objtool-y += builtin-orc.o objtool-y += check.o -objtool-y += orc_gen.o -objtool-y += orc_dump.o objtool-y += elf.o objtool-y += special.o objtool-y += objtool.o diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h index a9a50a25ca66..e91e12807678 100644 --- a/tools/objtool/arch.h +++ b/tools/objtool/arch.h @@ -10,6 +10,7 @@ #include #include "elf.h" #include "cfi.h" +#include "orc.h" enum insn_type { INSN_JUMP_CONDITIONAL, @@ -77,6 +78,8 @@ int arch_decode_instruction(struct elf *elf, struct section *sec, bool arch_callee_saved_reg(unsigned char reg); +int arch_orc_read_unwind_hints(struct objtool_file *file); + unsigned long arch_jump_destination(struct instruction *insn); unsigned long arch_dest_rela_offset(int addend); diff --git a/tools/objtool/arch/x86/Build b/tools/objtool/arch/x86/Build index b998412c017d..1f11b45999d0 100644 --- a/tools/objtool/arch/x86/Build +++ b/tools/objtool/arch/x86/Build @@ -1,4 +1,6 @@ objtool-y += decode.o +objtool-y += orc_dump.o +objtool-y += orc_gen.o inat_tables_script = 
arch/x86/tools/gen-insn-attr-x86.awk inat_tables_maps = arch/x86/lib/x86-opcode-map.txt diff --git a/tools/objtool/orc_dump.c b/tools/objtool/arch/x86/orc_dump.c similarity index 98% rename from tools/objtool/orc_dump.c rename to tools/objtool/arch/x86/orc_dump.c index 13ccf775a83a..cfe8f96bdd68 100644 --- a/tools/objtool/orc_dump.c +++ b/tools/objtool/arch/x86/orc_dump.c @@ -4,8 +4,8 @@ */ #include -#include "orc.h" -#include "warn.h" +#include "../../orc.h" +#include "../../warn.h" static const char *reg_name(unsigned int reg) { diff --git a/tools/objtool/orc_gen.c b/tools/objtool/arch/x86/orc_gen.c similarity index 66% rename from tools/objtool/orc_gen.c rename to tools/objtool/arch/x86/orc_gen.c index 27a4112848c2..b4f285bf5271 100644 --- a/tools/objtool/orc_gen.c +++ b/tools/objtool/arch/x86/orc_gen.c @@ -6,11 +6,11 @@ #include #include -#include "orc.h" -#include "check.h" -#include "warn.h" +#include "../../orc.h" +#include "../../check.h" +#include "../../warn.h" -int create_orc(struct objtool_file *file) +int arch_create_orc(struct objtool_file *file) { struct instruction *insn; @@ -116,7 +116,7 @@ static int create_orc_entry(struct section *u_sec, struct section *ip_relasec, return 0; } -int create_orc_sections(struct objtool_file *file) +int arch_create_orc_sections(struct objtool_file *file) { struct instruction *insn, *prev_insn; struct section *sec, *u_sec, *ip_relasec; @@ -209,3 +209,97 @@ int create_orc_sections(struct objtool_file *file) return 0; } + +int arch_orc_read_unwind_hints(struct objtool_file *file) +{ + struct section *sec, *relasec; + struct rela *rela; + struct unwind_hint *hint; + struct instruction *insn; + struct cfi_reg *cfa; + int i; + + sec = find_section_by_name(file->elf, ".discard.unwind_hints"); + if (!sec) + return 0; + + relasec = sec->rela; + if (!relasec) { + WARN("missing .rela.discard.unwind_hints section"); + return -1; + } + + if (sec->len % sizeof(struct unwind_hint)) { + WARN("struct unwind_hint size mismatch"); + 
return -1; + } + + file->hints = true; + + for (i = 0; i < sec->len / sizeof(struct unwind_hint); i++) { + hint = (struct unwind_hint *)sec->data->d_buf + i; + + rela = find_rela_by_dest(sec, i * sizeof(*hint)); + if (!rela) { + WARN("can't find rela for unwind_hints[%d]", i); +
[RFC v4 16/18] arm64: crypto: Add exceptions for crypto object to prevent stack analysis
Some crypto modules contain `.word` directives of data in the .text section. Since objtool can't make the distinction between data and an incorrect instruction, it gives a warning about the instruction being unknown and stops the analysis of the object file. The exception can be removed if the data is moved to another section or if objtool is tweaked to handle this particular case. Signed-off-by: Raphael Gault --- arch/arm64/crypto/Makefile | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile index 0435f2a0610e..e2a25919ebaa 100644 --- a/arch/arm64/crypto/Makefile +++ b/arch/arm64/crypto/Makefile @@ -43,9 +43,11 @@ aes-neon-blk-y := aes-glue-neon.o aes-neon.o obj-$(CONFIG_CRYPTO_SHA256_ARM64) += sha256-arm64.o sha256-arm64-y := sha256-glue.o sha256-core.o +OBJECT_FILES_NON_STANDARD_sha256-core.o := y obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o sha512-arm64-y := sha512-glue.o sha512-core.o +OBJECT_FILES_NON_STANDARD_sha512-core.o := y obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o @@ -58,6 +60,7 @@ aes-arm64-y := aes-cipher-core.o aes-cipher-glue.o obj-$(CONFIG_CRYPTO_AES_ARM64_BS) += aes-neon-bs.o aes-neon-bs-y := aes-neonbs-core.o aes-neonbs-glue.o +OBJECT_FILES_NON_STANDARD_aes-neonbs-core.o := y CFLAGS_aes-glue-ce.o := -DUSE_V8_CRYPTO_EXTENSIONS -- 2.17.1
[RFC v4 06/18] objtool: arm64: Adapt the stack frame checks for arm architecture
Since the way the initial stack frame is set up when entering a function is different from what is done on x86_64, we need to add some more checks to support the different cases. As opposed to x86_64, the return address is not stored by the call instruction but is instead loaded into a register. The initial stack frame is thus empty when entering a function, and 2 push operations are needed to set it up correctly. All the different combinations need to be taken into account. Signed-off-by: Raphael Gault --- tools/objtool/arch.h | 2 + tools/objtool/arch/arm64/decode.c | 28 + tools/objtool/arch/x86/decode.c | 5 ++ tools/objtool/check.c | 100 -- tools/objtool/elf.c | 3 +- 5 files changed, 131 insertions(+), 7 deletions(-) diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h index bb5ce810fb6e..68d6371a24a2 100644 --- a/tools/objtool/arch.h +++ b/tools/objtool/arch.h @@ -91,4 +91,6 @@ unsigned long arch_jump_destination(struct instruction *insn); unsigned long arch_dest_rela_offset(int addend); +bool arch_is_insn_sibling_call(struct instruction *insn); + #endif /* _ARCH_H */ diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c index 395c5777afab..be3d2eb10227 100644 --- a/tools/objtool/arch/arm64/decode.c +++ b/tools/objtool/arch/arm64/decode.c @@ -106,6 +106,34 @@ unsigned long arch_dest_rela_offset(int addend) return addend; } +/* + * In order to know if we are in the presence of a sibling + * call and not in the presence of a switch table we look + * back at the previous instructions and see if we are + * jumping inside the same function that we are already + * in.
+ */ +bool arch_is_insn_sibling_call(struct instruction *insn) +{ + struct instruction *prev; + struct list_head *l; + struct symbol *sym; + list_for_each_prev(l, &insn->list) { + prev = list_entry(l, struct instruction, list); + if (!prev->func || + prev->func->pfunc != insn->func->pfunc) + return false; + if (prev->stack_op.src.reg != ADR_SOURCE) + continue; + sym = find_symbol_containing(insn->sec, insn->immediate); + if (!sym || sym->type != STT_FUNC) + return false; + else if (sym->type == STT_FUNC) + return true; + break; + } + return false; +} static int is_arm64(struct elf *elf) { switch (elf->ehdr.e_machine) { diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index fa33b3465722..98726990714d 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -72,6 +72,11 @@ unsigned long arch_dest_rela_offset(int addend) return addend + 4; } +bool arch_is_insn_sibling_call(struct instruction *insn) +{ + return true; +} + int arch_decode_instruction(struct elf *elf, struct section *sec, unsigned long offset, unsigned int maxlen, unsigned int *len, enum insn_type *type, diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 4af6422d3428..519569b0329f 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -566,10 +566,10 @@ static int add_jump_destinations(struct objtool_file *file) dest_off = arch_jump_destination(insn); } else if (rela->sym->type == STT_SECTION) { dest_sec = rela->sym->sec; - dest_off = rela->addend + 4; + dest_off = arch_dest_rela_offset(rela->addend); } else if (rela->sym->sec->idx) { dest_sec = rela->sym->sec; - dest_off = rela->sym->sym.st_value + rela->addend + 4; + dest_off = rela->sym->sym.st_value + arch_dest_rela_offset(rela->addend); } else if (strstr(rela->sym->name, "_indirect_thunk_")) { /* * Retpoline jumps are really dynamic jumps in @@ -1368,8 +1368,8 @@ static void save_reg(struct insn_state *state, unsigned char reg, int base, static void restore_reg(struct
insn_state *state, unsigned char reg) { - state->regs[reg].base = CFI_UNDEFINED; - state->regs[reg].offset = 0; + state->regs[reg].base = initial_func_cfi.regs[reg].base; + state->regs[reg].offset = initial_func_cfi.regs[reg].offset; } /* @@ -1525,8 +1525,32 @@ static int update_insn_state(struct instruction *insn, struct insn_state *state) /* add imm, %rsp */ state->stack_size -= op->src.offset; - if (cfa->base ==
[RFC v4 10/18] objtool: arm64: Implement functions to add switch tables alternatives
This patch implements the functions required to identify and add as alternatives all the possible destinations of the switch table. This implementation relies on the new plugin introduced previously which records information about the switch-table in a .objtool_data section. Signed-off-by: Raphael Gault --- tools/objtool/arch/arm64/arch_special.c | 132 +- .../objtool/arch/arm64/include/arch_special.h | 10 ++ .../objtool/arch/arm64/include/insn_decode.h | 3 +- tools/objtool/check.c | 6 +- tools/objtool/check.h | 2 + 5 files changed, 146 insertions(+), 7 deletions(-) diff --git a/tools/objtool/arch/arm64/arch_special.c b/tools/objtool/arch/arm64/arch_special.c index 17a8a06aac2a..11284066157c 100644 --- a/tools/objtool/arch/arm64/arch_special.c +++ b/tools/objtool/arch/arm64/arch_special.c @@ -12,8 +12,13 @@ * You should have received a copy of the GNU General Public License * along with this program; if not, see <http://www.gnu.org/licenses/>. */ + +#include +#include + #include "../../special.h" #include "arch_special.h" +#include "bit_operations.h" void arch_force_alt_path(unsigned short feature, bool uaccess, @@ -21,9 +26,133 @@ void arch_force_alt_path(unsigned short feature, { } +static u32 next_offset(u8 *table, u8 entry_size) +{ + switch (entry_size) { + case 1: + return table[0]; + case 2: + return *(u16 *)(table); + default: + return *(u32 *)(table); + } +} + +static u32 get_table_entry_size(u32 insn) +{ + unsigned char size = (insn >> 30) & ONES(2); + switch (size) { + case 0: + return 1; + case 1: + return 2; + default: + return 4; + } +} + +static int add_possible_branch(struct objtool_file *file, + struct instruction *insn, + u32 base, u32 offset) +{ + struct instruction *dest_insn; + struct alternative *alt; + offset = base + 4 * offset; + + alt = calloc(1, sizeof(*alt)); + if (alt == NULL) { + WARN("allocation failure, can't add jump alternative"); + return -1; + } + + dest_insn = find_insn(file, insn->sec, offset); + if (dest_insn == NULL) { + 
free(alt); + return 0; + } + alt->insn = dest_insn; + alt->skip_orig = true; + list_add_tail(&alt->list, &insn->alts); + return 0; +} + int arch_add_jump_table(struct objtool_file *file, struct instruction *insn, struct rela *table, struct rela *next_table) { + struct rela *objtool_data_rela = NULL; + struct switch_table_info *swt_info = NULL; + struct section *objtool_data = find_section_by_name(file->elf, ".objtool_data"); + struct section *rodata_sec = find_section_by_name(file->elf, ".rodata"); + struct section *branch_sec = NULL; + u8 *switch_table = NULL; + u64 base_offset = 0; + struct instruction *pre_jump_insn; + u32 sec_size = 0; + u32 entry_size = 0; + u32 offset = 0; + u32 i, j; + + if (objtool_data == NULL) + return 0; + + /* + * 1. Using rela, identify entry for the switch table + * 2. Retrieve base offset + * 3. Retrieve branch instruction + * 4. For all entries in switch table: + * 4.1. Compute new offset + * 4.2. Create alternative instruction + * 4.3. Add alt_instr to insn->alts list + */ + sec_size = objtool_data->sh.sh_size; + for (i = 0, swt_info = (void *)objtool_data->data->d_buf; + i < sec_size / sizeof(struct switch_table_info); + i++, swt_info++) { + offset = i * sizeof(struct switch_table_info); + objtool_data_rela = find_rela_by_dest_range(objtool_data, offset, + sizeof(u64)); + /* retrieving the objtool data of the switch table we need */ + if (objtool_data_rela == NULL || + table->sym->sec != objtool_data_rela->sym->sec || + table->addend != objtool_data_rela->addend) + continue; + + /* retrieving switch table content */ + switch_table = (u8 *)(rodata_sec->data->d_buf + table->addend); + + /* retrieving pre jump instruction (ldr) */ + branch_sec = insn->sec; + pre_jump_insn = find_insn(file, branch_sec, + insn->offset - 3 * sizeof(u32)); + entry_size = get_table_entry_size(*(u32 *)(branch_sec->data->d_
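The `next_offset()`/`get_table_entry_size()` helpers earlier in this patch decode jump-table entries whose width is taken from the size field (bits 31:30) of the LDR instruction that loads them. A portable standalone mirror of that decoding follows (byte-wise loads replace the patch's unaligned pointer casts, assuming little-endian data as on arm64):

```c
#include <stdint.h>

/*
 * Mirror of get_table_entry_size(): size field 0 -> byte table (ldrb),
 * 1 -> halfword table (ldrh), anything else -> word table.
 */
static uint32_t table_entry_size(uint32_t ldr_insn)
{
    switch ((ldr_insn >> 30) & 0x3) {
    case 0:  return 1;
    case 1:  return 2;
    default: return 4;
    }
}

/* Mirror of next_offset(): read one little-endian table entry. */
static uint32_t table_next_offset(const uint8_t *table, uint32_t entry_size)
{
    switch (entry_size) {
    case 1:  return table[0];
    case 2:  return (uint32_t)table[0] | ((uint32_t)table[1] << 8);
    default: return (uint32_t)table[0] | ((uint32_t)table[1] << 8) |
                    ((uint32_t)table[2] << 16) | ((uint32_t)table[3] << 24);
    }
}
```

Each decoded entry is then scaled by the instruction size (`base + 4 * offset` in `add_possible_branch()`) to obtain a candidate branch destination.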
[RFC v4 13/18] arm64: sleep: Prevent stack frame warnings from objtool
This code deliberately does not respect the Arm PCS; adapting it to respect the PCS would alter its behaviour. In order to suppress objtool's warnings, we set up a stack frame for __cpu_suspend_enter and annotate cpu_resume and _cpu_resume as having non-standard stack frames. Signed-off-by: Raphael Gault --- arch/arm64/kernel/sleep.S | 5 + 1 file changed, 5 insertions(+) diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S index f5b04dd8a710..55c7c099d32c 100644 --- a/arch/arm64/kernel/sleep.S +++ b/arch/arm64/kernel/sleep.S @@ -1,5 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0 */ #include +#include #include #include #include @@ -90,6 +91,7 @@ ENTRY(__cpu_suspend_enter) str x0, [x1] add x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS stp x29, lr, [sp, #-16]! + mov x29, sp bl cpu_do_suspend ldp x29, lr, [sp], #16 mov x0, #1 @@ -146,3 +148,6 @@ ENTRY(_cpu_resume) mov x0, #0 ret ENDPROC(_cpu_resume) + + asm_stack_frame_non_standard cpu_resume + asm_stack_frame_non_standard _cpu_resume -- 2.17.1
[RFC v4 09/18] gcc-plugins: objtool: Add plugin to detect switch table on arm64
This plugin comes into play before the final two RTL passes of GCC. It detects switch-tables that are to be output in the ELF file and records information about them in an "objtool_data" section, which will then be used by objtool. Signed-off-by: Raphael Gault --- scripts/Makefile.gcc-plugins | 2 + scripts/gcc-plugins/Kconfig | 9 +++ .../arm64_switch_table_detection_plugin.c | 58 +++ 3 files changed, 69 insertions(+) create mode 100644 scripts/gcc-plugins/arm64_switch_table_detection_plugin.c diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins index 5f7df50cfe7a..a56736df9dc2 100644 --- a/scripts/Makefile.gcc-plugins +++ b/scripts/Makefile.gcc-plugins @@ -44,6 +44,8 @@ ifdef CONFIG_GCC_PLUGIN_ARM_SSP_PER_TASK endif export DISABLE_ARM_SSP_PER_TASK_PLUGIN +gcc-plugin-$(CONFIG_GCC_PLUGIN_SWITCH_TABLES) += arm64_switch_table_detection_plugin.so + # All the plugin CFLAGS are collected here in case a build target needs to # filter them out of the KBUILD_CFLAGS. GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y)) diff --git a/scripts/gcc-plugins/Kconfig b/scripts/gcc-plugins/Kconfig index d33de0b9f4f5..1daeffb55dce 100644 --- a/scripts/gcc-plugins/Kconfig +++ b/scripts/gcc-plugins/Kconfig @@ -113,4 +113,13 @@ config GCC_PLUGIN_ARM_SSP_PER_TASK bool depends on GCC_PLUGINS && ARM +config GCC_PLUGIN_SWITCH_TABLES + bool "GCC Plugin: Identify switch tables at compile time" + default y + depends on STACK_VALIDATION && ARM64 + help + Plugin to identify switch tables generated at compile time and store + them in a .objtool_data section. Objtool will then use that section + to analyse the different execution path of the switch table. 
+ endmenu diff --git a/scripts/gcc-plugins/arm64_switch_table_detection_plugin.c b/scripts/gcc-plugins/arm64_switch_table_detection_plugin.c new file mode 100644 index ..d7f0e13910d5 --- /dev/null +++ b/scripts/gcc-plugins/arm64_switch_table_detection_plugin.c @@ -0,0 +1,58 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include "gcc-common.h" + +__visible int plugin_is_GPL_compatible; + +static unsigned int arm64_switchtbl_rtl_execute(void) +{ + rtx_insn *insn; + rtx_insn *labelp = NULL; + rtx_jump_table_data *tablep = NULL; + section *sec = get_section(".objtool_data", SECTION_STRINGS, NULL); + section *curr_sec = current_function_section(); + + for (insn = get_insns(); insn; insn = NEXT_INSN(insn)) { + /* +* Find a tablejump_p INSN (using a dispatch table) +*/ + if (!tablejump_p(insn, &labelp, &tablep)) + continue; + + if (labelp && tablep) { + switch_to_section(sec); + assemble_integer_with_op(".quad ", gen_rtx_LABEL_REF(Pmode, labelp)); + assemble_integer_with_op(".quad ", GEN_INT(GET_NUM_ELEM(tablep->get_labels()))); + assemble_integer_with_op(".quad ", GEN_INT(ADDR_DIFF_VEC_FLAGS(tablep).offset_unsigned)); + switch_to_section(curr_sec); + } + } + return 0; +} + +#define PASS_NAME arm64_switchtbl_rtl + +#define NO_GATE +#include "gcc-generate-rtl-pass.h" + +__visible int plugin_init(struct plugin_name_args *plugin_info, + struct plugin_gcc_version *version) +{ + const char * const plugin_name = plugin_info->base_name; + int tso = 0; + int i; + + if (!plugin_default_version_check(version, &gcc_version)) { + error(G_("incompatible gcc/plugin versions")); + return 1; + } + + PASS_INFO(arm64_switchtbl_rtl, "outof_cfglayout", 1, + PASS_POS_INSERT_AFTER); + + register_callback(plugin_info->base_name, PLUGIN_PASS_MANAGER_SETUP, + NULL, &arm64_switchtbl_rtl_pass_info); + + return 0; +} -- 2.17.1
[RFC v4 11/18] arm64: alternative: Mark .altinstr_replacement as containing executable instructions
Until now, the section .altinstr_replacement wasn't marked as containing executable instructions on arm64. This patch changes that so that it is consistent with what is done on x86. Signed-off-by: Raphael Gault --- arch/arm64/include/asm/alternative.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h index b9f8d787eea9..e9e6b81e3eb4 100644 --- a/arch/arm64/include/asm/alternative.h +++ b/arch/arm64/include/asm/alternative.h @@ -71,7 +71,7 @@ static inline void apply_alternatives_module(void *start, size_t length) { } ALTINSTR_ENTRY(feature,cb) \ ".popsection\n" \ " .if " __stringify(cb) " == 0\n" \ - ".pushsection .altinstr_replacement, \"a\"\n" \ + ".pushsection .altinstr_replacement, \"ax\"\n" \ "663:\n\t" \ newinstr "\n" \ "664:\n\t" \ -- 2.17.1
[RFC v4 18/18] objtool: arm64: Enable stack validation for arm64
Signed-off-by: Raphael Gault --- arch/arm64/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 3adcec05b1f6..dc3de85b2502 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -163,6 +163,7 @@ config ARM64 select HAVE_RCU_TABLE_FREE select HAVE_RSEQ select HAVE_STACKPROTECTOR + select HAVE_STACK_VALIDATION select HAVE_SYSCALL_TRACEPOINTS select HAVE_KPROBES select HAVE_KRETPROBES -- 2.17.1
[RFC v4 08/18] objtool: Refactor switch-tables code to support other architectures
The way to identify switch-tables and retrieve all the data necessary to handle the different execution branches is not the same on all architectures. In order to be able to add other architecture support, this patch defines arch-dependent functions to process jump-tables. Signed-off-by: Raphael Gault --- tools/objtool/arch/arm64/arch_special.c | 15 tools/objtool/arch/arm64/decode.c | 4 +- tools/objtool/arch/x86/arch_special.c | 79 tools/objtool/check.c | 95 + tools/objtool/check.h | 7 ++ tools/objtool/special.h | 10 ++- 6 files changed, 114 insertions(+), 96 deletions(-) diff --git a/tools/objtool/arch/arm64/arch_special.c b/tools/objtool/arch/arm64/arch_special.c index a21d28876317..17a8a06aac2a 100644 --- a/tools/objtool/arch/arm64/arch_special.c +++ b/tools/objtool/arch/arm64/arch_special.c @@ -20,3 +20,18 @@ void arch_force_alt_path(unsigned short feature, struct special_alt *alt) { } + +int arch_add_jump_table(struct objtool_file *file, struct instruction *insn, + struct rela *table, struct rela *next_table) +{ + return 0; +} + +struct rela *arch_find_switch_table(struct objtool_file *file, + struct rela *text_rela, + struct section *rodata_sec, + unsigned long table_offset) +{ + file->ignore_unreachables = true; + return NULL; +} diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c index 4cb9402d6fe1..a20725c1bfd7 100644 --- a/tools/objtool/arch/arm64/decode.c +++ b/tools/objtool/arch/arm64/decode.c @@ -159,7 +159,7 @@ static int is_arm64(struct elf *elf) int arch_decode_instruction(struct elf *elf, struct section *sec, unsigned long offset, unsigned int maxlen, - unsigned int *len, unsigned char *type, + unsigned int *len, enum insn_type *type, unsigned long *immediate, struct stack_op *op) { int arm64 = 0; @@ -184,7 +184,7 @@ int arch_decode_instruction(struct elf *elf, struct section *sec, insn = *(u32 *)(sec->data->d_buf + offset); //dispatch according to encoding classes - return aarch64_insn_class_decode_table[(insn >> 25) 
& 0xf](insn, type, + return aarch64_insn_class_decode_table[(insn >> 25) & 0xf](insn, (unsigned char *)type, immediate, op); } diff --git a/tools/objtool/arch/x86/arch_special.c b/tools/objtool/arch/x86/arch_special.c index 6583a1770bb2..c097001d805b 100644 --- a/tools/objtool/arch/x86/arch_special.c +++ b/tools/objtool/arch/x86/arch_special.c @@ -26,3 +26,82 @@ void arch_force_alt_path(unsigned short feature, alt->skip_alt = true; } } + +int arch_add_jump_table(struct objtool_file *file, struct instruction *insn, + struct rela *table, struct rela *next_table) +{ + struct rela *rela = table; + struct instruction *dest_insn; + struct alternative *alt; + struct symbol *pfunc = insn->func->pfunc; + unsigned int prev_offset = 0; + + /* +* Each @rela is a switch table relocation which points to the target +* instruction. +*/ + list_for_each_entry_from(rela, &table->sec->rela_list, list) { + + /* Check for the end of the table: */ + if (rela != table && rela->jump_table_start) + break; + + /* Make sure the table entries are consecutive: */ + if (prev_offset && rela->offset != prev_offset + 8) + break; + + /* Detect function pointers from contiguous objects: */ + if (rela->sym->sec == pfunc->sec && + rela->addend == pfunc->offset) + break; + + dest_insn = find_insn(file, rela->sym->sec, rela->addend); + if (!dest_insn) + break; + + /* Make sure the destination is in the same function: */ + if (!dest_insn->func || dest_insn->func->pfunc != pfunc) + break; + + alt = malloc(sizeof(*alt)); + if (!alt) { + WARN("malloc failed"); + return -1; + } + + alt->insn = dest_insn; + list_add_tail(&alt->list, &insn->alts); + prev_offset = rela->offset; + } + + if (!prev_offset) { + WARN_FUNC("can't find switch jump table", + insn->sec, insn->offset); + return -1;
[RFC v4 14/18] arm64: kvm: Annotate non-standard stack frame functions
Neither __guest_enter nor __guest_exit sets up a correct stack frame. Because they can be considered callable functions, even though they are special cases, we chose to silence the warnings given by objtool by annotating them as non-standard. Signed-off-by: Raphael Gault --- arch/arm64/kvm/hyp/entry.S | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S index e5cc8d66bf53..c3443bfd0944 100644 --- a/arch/arm64/kvm/hyp/entry.S +++ b/arch/arm64/kvm/hyp/entry.S @@ -15,6 +15,7 @@ #include #include #include +#include #define CPU_GP_REG_OFFSET(x) (CPU_GP_REGS + x) #define CPU_XREG_OFFSET(x) CPU_GP_REG_OFFSET(CPU_USER_PT_REGS + 8*x) @@ -97,6 +98,7 @@ alternative_else_nop_endif eret sb ENDPROC(__guest_enter) +asm_stack_frame_non_standard __guest_enter ENTRY(__guest_exit) // x0: return code @@ -193,3 +195,4 @@ abort_guest_exit_end: orr x0, x0, x5 1: ret ENDPROC(__guest_exit) +asm_stack_frame_non_standard __guest_exit -- 2.17.1
[RFC v4 12/18] arm64: assembler: Add macro to annotate asm function having non standard stack-frame.
Some functions intentionally do not have standard stack frames. In order for objtool to ignore those particular cases, we add a macro that lets us annotate the functions we chose to mark as such. Signed-off-by: Raphael Gault --- include/linux/frame.h | 19 ++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/include/linux/frame.h b/include/linux/frame.h index 02d3ca2d9598..1e35e58ab259 100644 --- a/include/linux/frame.h +++ b/include/linux/frame.h @@ -11,14 +11,31 @@ * * For more information, see tools/objtool/Documentation/stack-validation.txt. */ +#ifndef __ASSEMBLY__ #define STACK_FRAME_NON_STANDARD(func) \ static void __used __section(.discard.func_stack_frame_non_standard) \ *__func_stack_frame_non_standard_##func = func +#else + /* +* This macro is the arm64 assembler equivalent of the +* macro STACK_FRAME_NON_STANDARD defined in +* ~/include/linux/frame.h +*/ + .macro asm_stack_frame_non_standard func + .pushsection ".discard.func_stack_frame_non_standard" + .quad \func + .popsection + .endm +#endif /* __ASSEMBLY__ */ #else /* !CONFIG_STACK_VALIDATION */ +#ifndef __ASSEMBLY__ #define STACK_FRAME_NON_STANDARD(func) - +#else + .macro asm_stack_frame_non_standard func + .endm +#endif /* __ASSEMBLY__ */ #endif /* CONFIG_STACK_VALIDATION */ #endif /* _LINUX_FRAME_H */ -- 2.17.1
[RFC v4 17/18] arm64: kernel: Annotate non-standard stack frame functions
Annotate assembler functions which are callable but do not set up a correct stack frame. Signed-off-by: Raphael Gault --- arch/arm64/kernel/hyp-stub.S | 3 +++ arch/arm64/kvm/hyp-init.S| 3 +++ 2 files changed, 6 insertions(+) diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 73d46070b315..8917d42f38c7 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -6,6 +6,7 @@ * Author: Marc Zyngier */ +#include #include #include #include @@ -42,6 +43,7 @@ ENTRY(__hyp_stub_vectors) ventry el1_fiq_invalid // FIQ 32-bit EL1 ventry el1_error_invalid // Error 32-bit EL1 ENDPROC(__hyp_stub_vectors) +asm_stack_frame_non_standard __hyp_stub_vectors .align 11 @@ -69,6 +71,7 @@ el1_sync: 9: mov x0, xzr eret ENDPROC(el1_sync) +asm_stack_frame_non_standard el1_sync .macro invalid_vector label \label: diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S index 160be2b4696d..63deea39313d 100644 --- a/arch/arm64/kvm/hyp-init.S +++ b/arch/arm64/kvm/hyp-init.S @@ -12,6 +12,7 @@ #include #include #include +#include .text .pushsection.hyp.idmap.text, "ax" @@ -118,6 +119,7 @@ CPU_BE( orr x4, x4, #SCTLR_ELx_EE) /* Hello, World! */ eret ENDPROC(__kvm_hyp_init) +asm_stack_frame_non_standard __kvm_hyp_init ENTRY(__kvm_handle_stub_hvc) cmp x0, #HVC_SOFT_RESTART @@ -159,6 +161,7 @@ reset: eret ENDPROC(__kvm_handle_stub_hvc) +asm_stack_frame_non_standard __kvm_handle_stub_hvc .ltorg -- 2.17.1
[RFC v4 15/18] arm64: kernel: Add exception on kuser32 to prevent stack analysis
Since kuser32 is used for compat support, it contains A32 instructions, which objtool does not recognise when analysing arm64 object files. Thus, we add an exception to skip validation for this particular file. Signed-off-by: Raphael Gault --- arch/arm64/kernel/Makefile | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 478491f07b4f..1239c7da4c02 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -33,6 +33,9 @@ ifneq ($(CONFIG_COMPAT_VDSO), y) obj-$(CONFIG_COMPAT) += sigreturn32.o endif obj-$(CONFIG_KUSER_HELPERS)+= kuser32.o + +OBJECT_FILES_NON_STANDARD_kuser32.o := y + obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o obj-$(CONFIG_MODULES) += module.o obj-$(CONFIG_ARM64_MODULE_PLTS)+= module-plts.o -- 2.17.1
[RFC v4 01/18] objtool: Add abstraction for computation of symbols offsets
The jump destination and relocation offset used previously are only reliable on x86_64 architecture. We abstract these computations by calling arch-dependent implementations. Signed-off-by: Raphael Gault --- tools/objtool/arch.h| 6 ++ tools/objtool/arch/x86/decode.c | 11 +++ tools/objtool/check.c | 15 ++- 3 files changed, 27 insertions(+), 5 deletions(-) diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h index ced3765c4f44..a9a50a25ca66 100644 --- a/tools/objtool/arch.h +++ b/tools/objtool/arch.h @@ -66,6 +66,8 @@ struct stack_op { struct op_src src; }; +struct instruction; + void arch_initial_func_cfi_state(struct cfi_state *state); int arch_decode_instruction(struct elf *elf, struct section *sec, @@ -75,4 +77,8 @@ int arch_decode_instruction(struct elf *elf, struct section *sec, bool arch_callee_saved_reg(unsigned char reg); +unsigned long arch_jump_destination(struct instruction *insn); + +unsigned long arch_dest_rela_offset(int addend); + #endif /* _ARCH_H */ diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index 0567c47a91b1..fa33b3465722 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -11,6 +11,7 @@ #include "lib/inat.c" #include "lib/insn.c" +#include "../../check.h" #include "../../elf.h" #include "../../arch.h" #include "../../warn.h" @@ -66,6 +67,11 @@ bool arch_callee_saved_reg(unsigned char reg) } } +unsigned long arch_dest_rela_offset(int addend) +{ + return addend + 4; +} + int arch_decode_instruction(struct elf *elf, struct section *sec, unsigned long offset, unsigned int maxlen, unsigned int *len, enum insn_type *type, @@ -497,3 +503,8 @@ void arch_initial_func_cfi_state(struct cfi_state *state) state->regs[16].base = CFI_CFA; state->regs[16].offset = -8; } + +unsigned long arch_jump_destination(struct instruction *insn) +{ + return insn->offset + insn->len + insn->immediate; +} diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 176f2f084060..479fab46b656 100644 
--- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -563,7 +563,7 @@ static int add_jump_destinations(struct objtool_file *file) insn->len); if (!rela) { dest_sec = insn->sec; - dest_off = insn->offset + insn->len + insn->immediate; + dest_off = arch_jump_destination(insn); } else if (rela->sym->type == STT_SECTION) { dest_sec = rela->sym->sec; dest_off = rela->addend + 4; @@ -659,7 +659,7 @@ static int add_call_destinations(struct objtool_file *file) rela = find_rela_by_dest_range(insn->sec, insn->offset, insn->len); if (!rela) { - dest_off = insn->offset + insn->len + insn->immediate; + dest_off = arch_jump_destination(insn); insn->call_dest = find_symbol_by_offset(insn->sec, dest_off); @@ -672,14 +672,19 @@ static int add_call_destinations(struct objtool_file *file) } } else if (rela->sym->type == STT_SECTION) { + /* +* the original x86_64 code adds 4 to the rela->addend +* which is not needed on arm64 architecture. +*/ + dest_off = arch_dest_rela_offset(rela->addend); insn->call_dest = find_symbol_by_offset(rela->sym->sec, - rela->addend+4); + dest_off); if (!insn->call_dest || insn->call_dest->type != STT_FUNC) { - WARN_FUNC("can't find call dest symbol at %s+0x%x", + WARN_FUNC("can't find call dest symbol at %s+0x%lx", insn->sec, insn->offset, rela->sym->sec->name, - rela->addend + 4); + dest_off); return -1; } } else -- 2.17.1
[RFC v4 03/18] objtool: Move registers and control flow to arch-dependent code
The control flow information and register macro definitions were based on the x86_64 architecture but should be abstract so that each architecture can define the correct values for the registers, especially the registers related to the stack frame (Frame Pointer, Stack Pointer and possibly Return Address). Signed-off-by: Raphael Gault --- tools/objtool/arch/x86/include/arch_special.h | 36 +++ tools/objtool/{ => arch/x86/include}/cfi.h| 0 tools/objtool/check.h | 1 + tools/objtool/special.c | 19 +- 4 files changed, 38 insertions(+), 18 deletions(-) create mode 100644 tools/objtool/arch/x86/include/arch_special.h rename tools/objtool/{ => arch/x86/include}/cfi.h (100%) diff --git a/tools/objtool/arch/x86/include/arch_special.h b/tools/objtool/arch/x86/include/arch_special.h new file mode 100644 index ..424ce47013e3 --- /dev/null +++ b/tools/objtool/arch/x86/include/arch_special.h @@ -0,0 +1,36 @@ +/* + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. 
+ */ +#ifndef _X86_ARCH_SPECIAL_H +#define _X86_ARCH_SPECIAL_H + +#define EX_ENTRY_SIZE 12 +#define EX_ORIG_OFFSET 0 +#define EX_NEW_OFFSET 4 + +#define JUMP_ENTRY_SIZE16 +#define JUMP_ORIG_OFFSET 0 +#define JUMP_NEW_OFFSET4 + +#define ALT_ENTRY_SIZE 13 +#define ALT_ORIG_OFFSET0 +#define ALT_NEW_OFFSET 4 +#define ALT_FEATURE_OFFSET 8 +#define ALT_ORIG_LEN_OFFSET10 +#define ALT_NEW_LEN_OFFSET 11 + +#define X86_FEATURE_POPCNT (4 * 32 + 23) +#define X86_FEATURE_SMAP (9 * 32 + 20) + +#endif /* _X86_ARCH_SPECIAL_H */ diff --git a/tools/objtool/cfi.h b/tools/objtool/arch/x86/include/cfi.h similarity index 100% rename from tools/objtool/cfi.h rename to tools/objtool/arch/x86/include/cfi.h diff --git a/tools/objtool/check.h b/tools/objtool/check.h index 6d875ca6fce0..af87b55db454 100644 --- a/tools/objtool/check.h +++ b/tools/objtool/check.h @@ -11,6 +11,7 @@ #include "cfi.h" #include "arch.h" #include "orc.h" +#include "arch_special.h" #include struct insn_state { diff --git a/tools/objtool/special.c b/tools/objtool/special.c index fdbaa611146d..b8ccee1b5382 100644 --- a/tools/objtool/special.c +++ b/tools/objtool/special.c @@ -14,24 +14,7 @@ #include "builtin.h" #include "special.h" #include "warn.h" - -#define EX_ENTRY_SIZE 12 -#define EX_ORIG_OFFSET 0 -#define EX_NEW_OFFSET 4 - -#define JUMP_ENTRY_SIZE16 -#define JUMP_ORIG_OFFSET 0 -#define JUMP_NEW_OFFSET4 - -#define ALT_ENTRY_SIZE 13 -#define ALT_ORIG_OFFSET0 -#define ALT_NEW_OFFSET 4 -#define ALT_FEATURE_OFFSET 8 -#define ALT_ORIG_LEN_OFFSET10 -#define ALT_NEW_LEN_OFFSET 11 - -#define X86_FEATURE_POPCNT (4*32+23) -#define X86_FEATURE_SMAP (9*32+20) +#include "arch_special.h" struct special_entry { const char *sec; -- 2.17.1
[RFC v4 05/18] objtool: special: Adapt special section handling
This patch abstracts the few architecture-dependent tests that are performed when handling special sections and switch tables. It enables an architecture to ignore a particular CPU feature or to opt out of switch-table handling. Signed-off-by: Raphael Gault --- tools/objtool/arch/arm64/Build| 1 + tools/objtool/arch/arm64/arch_special.c | 22 +++ .../objtool/arch/arm64/include/arch_special.h | 10 +-- tools/objtool/arch/x86/Build | 1 + tools/objtool/arch/x86/arch_special.c | 28 +++ tools/objtool/arch/x86/include/arch_special.h | 9 ++ tools/objtool/check.c | 24 ++-- tools/objtool/special.c | 9 ++ tools/objtool/special.h | 3 ++ 9 files changed, 96 insertions(+), 11 deletions(-) create mode 100644 tools/objtool/arch/arm64/arch_special.c create mode 100644 tools/objtool/arch/x86/arch_special.c diff --git a/tools/objtool/arch/arm64/Build b/tools/objtool/arch/arm64/Build index bf7a32c2b9e9..3d09be745a84 100644 --- a/tools/objtool/arch/arm64/Build +++ b/tools/objtool/arch/arm64/Build @@ -1,3 +1,4 @@ +objtool-y += arch_special.o objtool-y += decode.o objtool-y += orc_dump.o objtool-y += orc_gen.o diff --git a/tools/objtool/arch/arm64/arch_special.c b/tools/objtool/arch/arm64/arch_special.c new file mode 100644 index ..a21d28876317 --- /dev/null +++ b/tools/objtool/arch/arm64/arch_special.c @@ -0,0 +1,22 @@ +/* + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. 
+ */ +#include "../../special.h" +#include "arch_special.h" + +void arch_force_alt_path(unsigned short feature, +bool uaccess, +struct special_alt *alt) +{ +} diff --git a/tools/objtool/arch/arm64/include/arch_special.h b/tools/objtool/arch/arm64/include/arch_special.h index 63da775d0581..185103be8a51 100644 --- a/tools/objtool/arch/arm64/include/arch_special.h +++ b/tools/objtool/arch/arm64/include/arch_special.h @@ -30,7 +30,13 @@ #define ALT_ORIG_LEN_OFFSET10 #define ALT_NEW_LEN_OFFSET 11 -#define X86_FEATURE_POPCNT (4 * 32 + 23) -#define X86_FEATURE_SMAP (9 * 32 + 20) +static inline bool arch_should_ignore_feature(unsigned short feature) +{ + return false; +} +static inline bool arch_support_switch_table(void) +{ + return false; +} #endif /* _ARM64_ARCH_SPECIAL_H */ diff --git a/tools/objtool/arch/x86/Build b/tools/objtool/arch/x86/Build index 1f11b45999d0..63e167775bc8 100644 --- a/tools/objtool/arch/x86/Build +++ b/tools/objtool/arch/x86/Build @@ -1,3 +1,4 @@ +objtool-y += arch_special.o objtool-y += decode.o objtool-y += orc_dump.o objtool-y += orc_gen.o diff --git a/tools/objtool/arch/x86/arch_special.c b/tools/objtool/arch/x86/arch_special.c new file mode 100644 index ..6583a1770bb2 --- /dev/null +++ b/tools/objtool/arch/x86/arch_special.c @@ -0,0 +1,28 @@ +/* + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. 
+ */ +#include "../../special.h" +#include "arch_special.h" + +void arch_force_alt_path(unsigned short feature, +bool uaccess, +struct special_alt *alt) +{ + if (feature == X86_FEATURE_SMAP) { + if (uaccess) + alt->skip_orig = true; + else + alt->skip_alt = true; + } +} diff --git a/tools/objtool/arch/x86/include/arch_special.h b/tools/objtool/arch/x86/include/arch_special.h index 424ce47013e3..fce2b1193194 100644 --- a/tools/objtool/arch/x86/include/arch_special.h +++ b/tools/objtool/arch/x86/include/arch_special.h @@ -33,4 +3
[RFC v4 07/18] objtool: Introduce INSN_UNKNOWN type
On arm64 some object files contain data stored in the .text section. This data is interpreted by objtool as instructions but cannot be decoded as valid ones. In order to keep analysing those files, we introduce the INSN_UNKNOWN type. The "unknown instruction" warning will thus only be raised if such instructions are encountered while validating an execution branch. This change doesn't impact the x86 decoding logic, since 0 is still used to mark an unknown type there, which still raises the "unknown instruction" warning during the decoding phase. Signed-off-by: Raphael Gault --- tools/objtool/arch.h | 1 + tools/objtool/arch/arm64/decode.c | 8 tools/objtool/arch/arm64/include/insn_decode.h | 4 ++-- tools/objtool/check.c | 10 +- 4 files changed, 16 insertions(+), 7 deletions(-) diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h index 68d6371a24a2..9f2590e1df79 100644 --- a/tools/objtool/arch.h +++ b/tools/objtool/arch.h @@ -28,6 +28,7 @@ enum insn_type { INSN_CLAC, INSN_STD, INSN_CLD, + INSN_UNKNOWN, INSN_OTHER, }; diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c index be3d2eb10227..4cb9402d6fe1 100644 --- a/tools/objtool/arch/arm64/decode.c +++ b/tools/objtool/arch/arm64/decode.c @@ -37,9 +37,9 @@ */ static arm_decode_class aarch64_insn_class_decode_table[] = { [INSN_RESERVED] = arm_decode_reserved, - [INSN_UNKNOWN] = arm_decode_unknown, + [INSN_UNALLOC_1]= arm_decode_unknown, [INSN_SVE_ENC] = arm_decode_sve_encoding, - [INSN_UNALLOC] = arm_decode_unknown, + [INSN_UNALLOC_2]= arm_decode_unknown, [INSN_LD_ST_4] = arm_decode_ld_st, [INSN_DP_REG_5] = arm_decode_dp_reg, [INSN_LD_ST_6] = arm_decode_ld_st, @@ -191,7 +191,7 @@ int arch_decode_instruction(struct elf *elf, struct section *sec, int arm_decode_unknown(u32 instr, unsigned char *type, unsigned long *immediate, struct stack_op *op) { - *type = 0; + *type = INSN_UNKNOWN; return 0; } @@ -206,7 +206,7 @@ int arm_decode_reserved(u32 instr, unsigned char *type, unsigned long 
*immediate, struct stack_op *op) { *immediate = instr & ONES(16); - *type = INSN_BUG; + *type = INSN_UNKNOWN; return 0; } diff --git a/tools/objtool/arch/arm64/include/insn_decode.h b/tools/objtool/arch/arm64/include/insn_decode.h index eb54fc39dca5..a01d76306749 100644 --- a/tools/objtool/arch/arm64/include/insn_decode.h +++ b/tools/objtool/arch/arm64/include/insn_decode.h @@ -20,9 +20,9 @@ #include "../../../arch.h" #define INSN_RESERVED 0b -#define INSN_UNKNOWN 0b0001 +#define INSN_UNALLOC_1 0b0001 #define INSN_SVE_ENC 0b0010 -#define INSN_UNALLOC 0b0011 +#define INSN_UNALLOC_2 0b0011 #define INSN_DP_IMM0b1001 //0x100x #define INSN_BRANCH0b1011 //0x101x #define INSN_LD_ST_4 0b0100 //0bx1x0 diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 519569b0329f..baa6a93f37cd 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1981,6 +1981,13 @@ static int validate_branch(struct objtool_file *file, struct symbol *func, while (1) { next_insn = next_insn_same_sec(file, insn); + if (insn->type == INSN_UNKNOWN) { + WARN("%s+0x%lx unknown instruction type, should never be reached", +insn->sec->name, +insn->offset); + return 1; + } + if (file->c_file && func && insn->func && func != insn->func->pfunc) { WARN("%s() falls through to next function %s()", func->name, insn->func->name); @@ -2414,7 +2421,8 @@ static int validate_reachable_instructions(struct objtool_file *file) return 0; for_each_insn(file, insn) { - if (insn->visited || ignore_unreachable_insn(insn)) + if (insn->visited || ignore_unreachable_insn(insn) || + insn->type == INSN_UNKNOWN) continue; WARN_FUNC("unreachable instruction", insn->sec, insn->offset); -- 2.17.1
[RFC v4 04/18] objtool: arm64: Add required implementation for supporting the aarch64 architecture in objtool.
Provide implementations of the arch-dependent functions that are called by objtool's main check function. The ORC unwinder is not yet supported on arm64, so we only provide a dummy interface for now. The decoding of instructions is split into classes and subclasses as described in the instruction-encoding chapters of the Armv8.5 Architecture Reference Manual. In order to handle load/store instructions for a pair of registers, we add an extra field to the stack_op structure. We consider that the hypervisor/secure monitor behaves correctly; this lets us handle the hvc/smc/svc context-switching instructions as nops, since we consider that the context is restored correctly. Signed-off-by: Raphael Gault --- tools/objtool/arch.h |7 + tools/objtool/arch/arm64/Build|7 + tools/objtool/arch/arm64/bit_operations.c | 67 + tools/objtool/arch/arm64/decode.c | 2787 + .../objtool/arch/arm64/include/arch_special.h | 36 + .../arch/arm64/include/asm/orc_types.h| 96 + .../arch/arm64/include/bit_operations.h | 24 + tools/objtool/arch/arm64/include/cfi.h| 74 + .../objtool/arch/arm64/include/insn_decode.h | 211 ++ tools/objtool/arch/arm64/orc_dump.c | 26 + tools/objtool/arch/arm64/orc_gen.c| 40 + 11 files changed, 3375 insertions(+) create mode 100644 tools/objtool/arch/arm64/Build create mode 100644 tools/objtool/arch/arm64/bit_operations.c create mode 100644 tools/objtool/arch/arm64/decode.c create mode 100644 tools/objtool/arch/arm64/include/arch_special.h create mode 100644 tools/objtool/arch/arm64/include/asm/orc_types.h create mode 100644 tools/objtool/arch/arm64/include/bit_operations.h create mode 100644 tools/objtool/arch/arm64/include/cfi.h create mode 100644 tools/objtool/arch/arm64/include/insn_decode.h create mode 100644 tools/objtool/arch/arm64/orc_dump.c create mode 100644 tools/objtool/arch/arm64/orc_gen.c diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h index e91e12807678..bb5ce810fb6e 100644 --- a/tools/objtool/arch.h +++ 
b/tools/objtool/arch.h @@ -62,9 +62,16 @@ struct op_src { int offset; }; +struct op_extra { + unsigned char used; + unsigned char reg; + int offset; +}; + struct stack_op { struct op_dest dest; struct op_src src; + struct op_extra extra; }; struct instruction; diff --git a/tools/objtool/arch/arm64/Build b/tools/objtool/arch/arm64/Build new file mode 100644 index ..bf7a32c2b9e9 --- /dev/null +++ b/tools/objtool/arch/arm64/Build @@ -0,0 +1,7 @@ +objtool-y += decode.o +objtool-y += orc_dump.o +objtool-y += orc_gen.o +objtool-y += bit_operations.o + + +CFLAGS_decode.o += -I$(OUTPUT)arch/arm64/lib diff --git a/tools/objtool/arch/arm64/bit_operations.c b/tools/objtool/arch/arm64/bit_operations.c new file mode 100644 index ..f457a14a7f5d --- /dev/null +++ b/tools/objtool/arch/arm64/bit_operations.c @@ -0,0 +1,67 @@ +#include +#include +#include "bit_operations.h" + +#include "../../warn.h" + +u64 replicate(u64 x, int size, int n) +{ + u64 ret = 0; + + while (n >= 0) { + ret = (ret | x) << size; + n--; + } + return ret | x; +} + +u64 ror(u64 x, int size, int shift) +{ + int m = shift % size; + + if (shift == 0) + return x; + return ZERO_EXTEND((x >> m) | (x << (size - m)), size); +} + +int highest_set_bit(u32 x) +{ + int i; + + for (i = 31; i >= 0; i--, x <<= 1) + if (x & 0x8000) + return i; + return 0; +} + +/* imms and immr are both 6 bit long */ +__uint128_t decode_bit_masks(unsigned char N, unsigned char imms, +unsigned char immr, bool immediate) +{ + u64 tmask, wmask; + u32 diff, S, R, esize, welem, telem; + unsigned char levels = 0, len = 0; + + len = highest_set_bit((N << 6) | ((~imms) & ONES(6))); + levels = ZERO_EXTEND(ONES(len), 6); + + if (immediate && ((imms & levels) == levels)) { + WARN("unknown instruction"); + return -1; + } + + S = imms & levels; + R = immr & levels; + diff = ZERO_EXTEND(S - R, 6); + + esize = 1 << len; + diff = diff & ONES(len); + + welem = ZERO_EXTEND(ONES(S + 1), esize); + telem = ZERO_EXTEND(ONES(diff + 1), esize); + + wmask = 
replicate(ror(welem, esize, R), esize, 64 / esize); + tmask = replicate(telem, esize, 64 / esize); + + return ((__uint128_t)wmask << 64) | tmask; +} diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c new file mode 100644 index ..395c5777afab --- /dev/null +++ b/tools/objtool/arch/arm64/decode.c @@ -0,0 +1
[RFC v4 00/18] objtool: Add support for arm64
Hi,

Changes since RFC v3:
* Rebased on tip/master: the switch/jump-table code has been refactored.
* Took Catalin Marinas' comments into account regarding the asm macro for marking exceptions.

As of now, objtool only supports the x86_64 architecture, but the groundwork has already been done to add support for other architectures without too much effort. This series of patches adds support for the arm64 architecture, based on the Armv8.5 Architecture Reference Manual. Objtool will be a valuable tool for making progress on, and providing more guarantees for, live patching, which is a work in progress for arm64. Once we have the base of objtool working, the next step will be to port Peter Z's uaccess validation to arm64.

Raphael Gault (18): objtool: Add abstraction for computation of symbols offsets objtool: orc: Refactor ORC API for other architectures to implement. objtool: Move registers and control flow to arch-dependent code objtool: arm64: Add required implementation for supporting the aarch64 architecture in objtool. objtool: special: Adapt special section handling objtool: arm64: Adapt the stack frame checks for arm architecture objtool: Introduce INSN_UNKNOWN type objtool: Refactor switch-tables code to support other architectures gcc-plugins: objtool: Add plugin to detect switch table on arm64 objtool: arm64: Implement functions to add switch tables alternatives arm64: alternative: Mark .altinstr_replacement as containing executable instructions arm64: assembler: Add macro to annotate asm function having non standard stack-frame.
arm64: sleep: Prevent stack frame warnings from objtool arm64: kvm: Annotate non-standard stack frame functions arm64: kernel: Add exception on kuser32 to prevent stack analysis arm64: crypto: Add exceptions for crypto object to prevent stack analysis arm64: kernel: Annotate non-standard stack frame functions objtool: arm64: Enable stack validation for arm64 arch/arm64/Kconfig|1 + arch/arm64/crypto/Makefile|3 + arch/arm64/include/asm/alternative.h |2 +- arch/arm64/kernel/Makefile|3 + arch/arm64/kernel/hyp-stub.S |3 + arch/arm64/kernel/sleep.S |5 + arch/arm64/kvm/hyp-init.S |3 + arch/arm64/kvm/hyp/entry.S|3 + include/linux/frame.h | 19 +- scripts/Makefile.gcc-plugins |2 + scripts/gcc-plugins/Kconfig |9 + .../arm64_switch_table_detection_plugin.c | 58 + tools/objtool/Build |2 - tools/objtool/arch.h | 19 + tools/objtool/arch/arm64/Build|8 + tools/objtool/arch/arm64/arch_special.c | 165 + tools/objtool/arch/arm64/bit_operations.c | 67 + tools/objtool/arch/arm64/decode.c | 2815 + .../objtool/arch/arm64/include/arch_special.h | 52 + .../arch/arm64/include/asm/orc_types.h| 96 + .../arch/arm64/include/bit_operations.h | 24 + tools/objtool/arch/arm64/include/cfi.h| 74 + .../objtool/arch/arm64/include/insn_decode.h | 210 ++ tools/objtool/arch/arm64/orc_dump.c | 26 + tools/objtool/arch/arm64/orc_gen.c| 40 + tools/objtool/arch/x86/Build |3 + tools/objtool/arch/x86/arch_special.c | 107 + tools/objtool/arch/x86/decode.c | 16 + tools/objtool/arch/x86/include/arch_special.h | 45 + tools/objtool/{ => arch/x86/include}/cfi.h|0 tools/objtool/{ => arch/x86}/orc_dump.c |4 +- tools/objtool/{ => arch/x86}/orc_gen.c| 104 +- tools/objtool/check.c | 309 +- tools/objtool/check.h | 10 + tools/objtool/elf.c |3 +- tools/objtool/orc.h |4 +- tools/objtool/special.c | 28 +- tools/objtool/special.h | 13 +- 38 files changed, 4129 insertions(+), 226 deletions(-) create mode 100644 scripts/gcc-plugins/arm64_switch_table_detection_plugin.c create mode 100644 tools/objtool/arch/arm64/Build create mode 
100644 tools/objtool/arch/arm64/arch_special.c create mode 100644 tools/objtool/arch/arm64/bit_operations.c create mode 100644 tools/objtool/arch/arm64/decode.c create mode 100644 tools/objtool/arch/arm64/include/arch_special.h create mode 100644 tools/objtool/arch/arm64/include/asm/orc_types.h create mode 100644 tools/objtool/arch/arm64/include/bit_operations.h create mode 100644 tools/objtool/arch/arm64/include/cfi.h create mode 100644 tools/objtool/arch/arm64/include/insn_decode.h create mode 100644 tools/objtool/arch/arm64/orc_dump.c create mode 100644 tools/objtool/arch/arm64/orc_gen.c create mode 100644 tools/objtool/arch/x86/arch_special.c cr
Re: [PATCH v2 0/5] arm64: Enable access to pmu registers by user-space
Hi,

Any further comments on this patchset?

Cheers,

On 7/5/19 9:55 AM, Raphael Gault wrote:
The perf user-space tool relies on the PMU to monitor events. It offers an abstraction layer over the hardware counters since the underlying implementation is cpu-dependent. We want to allow userspace tools to access the registers storing the hardware counters' values directly. This specifically targets self-monitoring tasks, in order to reduce the overhead of going through the kernel by accessing the registers directly. To do this we need to set up the PMU so that it exposes its registers to userspace access.

The first patch adds a test to the perf tool so that we can check that access to the registers works correctly from userspace. The second patch adds a capability to the arm64 cpufeatures framework in order to detect when we are running on a heterogeneous system. The third patch focuses on armv8 pmuv3 PMU support and makes sure that access to the PMU registers is enabled and that userspace has access to the relevant information in order to use them. The fourth patch puts in place callbacks to enable access to the hardware counters from userspace when a compatible event is opened using the perf API. The fifth patch adds short documentation about direct access to PMU counters from userspace.

**Changes since v1**
* Rebased on linux-next/master
* Do not include rseq materials (test and utilities) since we want to enable direct access to counters only on homogeneous systems.
* Do not include the hook definitions, for the same reason as above.
* Add a cpu feature/capability to detect heterogeneous systems.

Raphael Gault (5): perf: arm64: Add test to check userspace access to hardware counters. arm64: cpufeature: Add feature to detect heterogeneous systems arm64: pmu: Add function implementation to update event index in userpage.
arm64: perf: Enable pmu counter direct access for perf event on armv8 Documentation: arm64: Document PMU counters access from userspace .../arm64/pmu_counter_user_access.txt | 42 +++ arch/arm64/include/asm/cpucaps.h | 3 +- arch/arm64/include/asm/mmu.h | 6 + arch/arm64/include/asm/mmu_context.h | 2 + arch/arm64/include/asm/perf_event.h | 14 + arch/arm64/kernel/cpufeature.c| 20 ++ arch/arm64/kernel/perf_event.c| 23 ++ drivers/perf/arm_pmu.c| 38 +++ include/linux/perf/arm_pmu.h | 2 + tools/perf/arch/arm64/include/arch-tests.h| 6 + tools/perf/arch/arm64/tests/Build | 1 + tools/perf/arch/arm64/tests/arch-tests.c | 4 + tools/perf/arch/arm64/tests/user-events.c | 255 ++ 13 files changed, 415 insertions(+), 1 deletion(-) create mode 100644 Documentation/arm64/pmu_counter_user_access.txt create mode 100644 tools/perf/arch/arm64/tests/user-events.c -- Raphael Gault
Re: [RFC V3 00/18] objtool: Add support for arm64
Hi all,

Just a gentle ping to see if anyone has comments to make about this version :)

On 6/24/19 10:55 AM, Raphael Gault wrote:
As of now, objtool only supports the x86_64 architecture, but the groundwork has already been done to add support for other architectures without too much effort. This series of patches adds support for the arm64 architecture, based on the Armv8.5 Architecture Reference Manual. Objtool will be a valuable tool for making progress on, and providing more guarantees for, live patching, which is a work in progress for arm64. Once we have the base of objtool working, the next step will be to port Peter Z's uaccess validation to arm64.

Changes since previous version:
* Rebased on tip/master: Note that I had to re-expose `struct alternative` via check.h because it is now used outside of check.c.
* Reordered commits for a more coherent progression
* Introduced a GCC plugin to help detect switch tables for arm64

This plugin could be improved: it plugs in after the RTL control-flow-graph passes but only extracts information about the switch tables. I originally intended for it to introduce new code_label/note entries within the RTL representation in order to reference them and thus get the address of the branch instruction. However, I did not manage to do it properly using gen_rtx_CODE_LABEL/emit_label_before/after. If anyone has experience with RTL plugins, I am all ears for advice.

Raphael Gault (18): objtool: Add abstraction for computation of symbols offsets objtool: orc: Refactor ORC API for other architectures to implement. objtool: Move registers and control flow to arch-dependent code objtool: arm64: Add required implementation for supporting the aarch64 architecture in objtool.
objtool: special: Adapt special section handling objtool: arm64: Adapt the stack frame checks for arm architecture objtool: Introduce INSN_UNKNOWN type objtool: Refactor switch-tables code to support other architectures gcc-plugins: objtool: Add plugin to detect switch table on arm64 objtool: arm64: Implement functions to add switch tables alternatives arm64: alternative: Mark .altinstr_replacement as containing executable instructions arm64: assembler: Add macro to annotate asm function having non standard stack-frame. arm64: sleep: Prevent stack frame warnings from objtool arm64: kvm: Annotate non-standard stack frame functions arm64: kernel: Add exception on kuser32 to prevent stack analysis arm64: crypto: Add exceptions for crypto object to prevent stack analysis arm64: kernel: Annotate non-standard stack frame functions objtool: arm64: Enable stack validation for arm64 arch/arm64/Kconfig|1 + arch/arm64/crypto/Makefile|3 + arch/arm64/include/asm/alternative.h |2 +- arch/arm64/include/asm/assembler.h| 13 + arch/arm64/kernel/Makefile|3 + arch/arm64/kernel/hyp-stub.S |2 + arch/arm64/kernel/sleep.S |4 + arch/arm64/kvm/hyp-init.S |2 + arch/arm64/kvm/hyp/entry.S|2 + scripts/Makefile.gcc-plugins |2 + scripts/gcc-plugins/Kconfig |9 + .../arm64_switch_table_detection_plugin.c | 58 + tools/objtool/Build |2 - tools/objtool/arch.h | 21 +- tools/objtool/arch/arm64/Build|8 + tools/objtool/arch/arm64/arch_special.c | 173 + tools/objtool/arch/arm64/bit_operations.c | 67 + tools/objtool/arch/arm64/decode.c | 2809 + .../objtool/arch/arm64/include/arch_special.h | 52 + .../arch/arm64/include/asm/orc_types.h| 96 + .../arch/arm64/include/bit_operations.h | 24 + tools/objtool/arch/arm64/include/cfi.h| 74 + .../objtool/arch/arm64/include/insn_decode.h | 210 ++ tools/objtool/arch/arm64/orc_dump.c | 26 + tools/objtool/arch/arm64/orc_gen.c| 40 + tools/objtool/arch/x86/Build |3 + tools/objtool/arch/x86/arch_special.c | 101 + tools/objtool/arch/x86/decode.c | 16 + 
tools/objtool/arch/x86/include/arch_special.h | 45 + tools/objtool/{ => arch/x86/include}/cfi.h|0 tools/objtool/{ => arch/x86}/orc_dump.c |4 +- tools/objtool/{ => arch/x86}/orc_gen.c| 104 +- tools/objtool/check.c | 309 +- tools/objtool/check.h | 10 + tools/objtool/elf.c |3 +- tools/objtool/orc.h |4 +- tools/objtool/special.c | 28 +- tools/objtool/special.h | 13 +- 38 files changed, 4119 insertions(+), 224 deletions(-) create mode 100644 scripts/gc
Re: [PATCH v2 1/5] perf: arm64: Add test to check userspace access to hardware counters.
Hi Arnaldo, On 7/5/19 3:54 PM, Arnaldo Carvalho de Melo wrote: Em Fri, Jul 05, 2019 at 09:55:37AM +0100, Raphael Gault escreveu: This test relies on the fact that the PMU registers are accessible from userspace. It then uses the perf_event_mmap_page to retrieve the counter index and access the underlying register. This test uses sched_setaffinity(2) in order to run on all CPU and thus check the behaviour of the PMU of all cpus in a big.LITTLE environment. Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/include/arch-tests.h | 6 + tools/perf/arch/arm64/tests/Build | 1 + tools/perf/arch/arm64/tests/arch-tests.c | 4 + tools/perf/arch/arm64/tests/user-events.c | 255 + 4 files changed, 266 insertions(+) create mode 100644 tools/perf/arch/arm64/tests/user-events.c diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h index 90ec4c8cb880..a9b17ae0560b 100644 --- a/tools/perf/arch/arm64/include/arch-tests.h +++ b/tools/perf/arch/arm64/include/arch-tests.h @@ -2,11 +2,17 @@ #ifndef ARCH_TESTS_H #define ARCH_TESTS_H +#define __maybe_unused __attribute__((unused)) What is wrong with using: #include ? [acme@quaco perf]$ find tools/perf/ -name "*.[ch]" | xargs grep __maybe_unused | wc -l 1115 [acme@quaco perf]$ grep __maybe_unused tools/include/linux/compiler.h #ifndef __maybe_unused # define __maybe_unused __attribute__((unused)) [acme@quaco perf]$ Also please don't break strings in multiple lines just to comply with the 80 column limit. That is ok when you have multiple lines ending with a newline, but otherwise just makes it look ugly. You're right, I shall correct those points. Thanks, -- Raphael Gault
[PATCH v2 1/5] perf: arm64: Add test to check userspace access to hardware counters.
This test relies on the fact that the PMU registers are accessible from userspace. It then uses the perf_event_mmap_page to retrieve the counter index and access the underlying register. This test uses sched_setaffinity(2) in order to run on all CPU and thus check the behaviour of the PMU of all cpus in a big.LITTLE environment. Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/include/arch-tests.h | 6 + tools/perf/arch/arm64/tests/Build | 1 + tools/perf/arch/arm64/tests/arch-tests.c | 4 + tools/perf/arch/arm64/tests/user-events.c | 255 + 4 files changed, 266 insertions(+) create mode 100644 tools/perf/arch/arm64/tests/user-events.c diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h index 90ec4c8cb880..a9b17ae0560b 100644 --- a/tools/perf/arch/arm64/include/arch-tests.h +++ b/tools/perf/arch/arm64/include/arch-tests.h @@ -2,11 +2,17 @@ #ifndef ARCH_TESTS_H #define ARCH_TESTS_H +#define __maybe_unused __attribute__((unused)) #ifdef HAVE_DWARF_UNWIND_SUPPORT struct thread; struct perf_sample; +int test__arch_unwind_sample(struct perf_sample *sample, +struct thread *thread); #endif extern struct test arch_tests[]; +int test__rd_pmevcntr(struct test *test __maybe_unused, + int subtest __maybe_unused); + #endif diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build index a61c06bdb757..3f9a20c17fc6 100644 --- a/tools/perf/arch/arm64/tests/Build +++ b/tools/perf/arch/arm64/tests/Build @@ -1,4 +1,5 @@ perf-y += regs_load.o perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o +perf-y += user-events.o perf-y += arch-tests.o diff --git a/tools/perf/arch/arm64/tests/arch-tests.c b/tools/perf/arch/arm64/tests/arch-tests.c index 5b1543c98022..57df9b89dede 100644 --- a/tools/perf/arch/arm64/tests/arch-tests.c +++ b/tools/perf/arch/arm64/tests/arch-tests.c @@ -10,6 +10,10 @@ struct test arch_tests[] = { .func = test__dwarf_unwind, }, #endif + { + .desc = "User counter access", + .func = test__rd_pmevcntr, 
+ }, { .func = NULL, }, diff --git a/tools/perf/arch/arm64/tests/user-events.c b/tools/perf/arch/arm64/tests/user-events.c new file mode 100644 index ..958e4cd000c1 --- /dev/null +++ b/tools/perf/arch/arm64/tests/user-events.c @@ -0,0 +1,255 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "perf.h" +#include "debug.h" +#include "tests/tests.h" +#include "cloexec.h" +#include "util.h" +#include "arch-tests.h" + +/* + * ARMv8 ARM reserves the following encoding for system registers: + * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview", + * C5.2, version:ARM DDI 0487A.f) + * [20-19] : Op0 + * [18-16] : Op1 + * [15-12] : CRn + * [11-8] : CRm + * [7-5] : Op2 + */ +#define Op0_shift 19 +#define Op0_mask0x3 +#define Op1_shift 16 +#define Op1_mask0x7 +#define CRn_shift 12 +#define CRn_mask0xf +#define CRm_shift 8 +#define CRm_mask0xf +#define Op2_shift 5 +#define Op2_mask0x7 + +#define __stringify(x) #x + +#define read_sysreg(r) ({ \ + u64 __val; \ + asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \ + __val; \ +}) + +#define PMEVCNTR_READ_CASE(idx)\ + case idx: \ + return read_sysreg(pmevcntr##idx##_el0) + +#define PMEVCNTR_CASES(readwrite) \ + PMEVCNTR_READ_CASE(0); \ + PMEVCNTR_READ_CASE(1); \ + PMEVCNTR_READ_CASE(2); \ + PMEVCNTR_READ_CASE(3); \ + PMEVCNTR_READ_CASE(4); \ + PMEVCNTR_READ_CASE(5); \ + PMEVCNTR_READ_CASE(6); \ + PMEVCNTR_READ_CASE(7); \ + PMEVCNTR_READ_CASE(8); \ + PMEVCNTR_READ_CASE(9); \ + PMEVCNTR_READ_CASE(10); \ + PMEVCNTR_READ_CASE(11); \ + PMEVCNTR_READ_CASE(12); \ + PMEVCNTR_READ_CASE(13); \ + PMEVCNTR_READ_CASE(14); \ + PMEVCNTR_READ_CASE(15); \ + PMEVCNTR_READ_CASE(16); \ + PMEVCNTR_READ_CASE(17); \ + PMEVCNTR_READ_CASE(18
[PATCH v2 4/5] arm64: perf: Enable pmu counter direct access for perf event on armv8
Keep track of events opened with direct access to the hardware counters and modify permissions while they are open. The strategy used here is the same one x86 uses: every time an event is mapped, the permissions are set if required. The atomic field added to the mm_context helps keep track of the events opened and de-activates the permissions when all of them are unmapped. We also need to update the permissions in the context-switch code so that tasks keep the right permissions.

Signed-off-by: Raphael Gault --- arch/arm64/include/asm/mmu.h | 6 + arch/arm64/include/asm/mmu_context.h | 2 ++ arch/arm64/include/asm/perf_event.h | 14 ++ drivers/perf/arm_pmu.c | 38 4 files changed, 60 insertions(+) diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index fd6161336653..88ed4466bd06 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -18,6 +18,12 @@ typedef struct { atomic64_t id; + + /* +* non-zero if userspace has access to hardware +* counters directly. 
+*/ + atomic_tpmu_direct_access; void*vdso; unsigned long flags; } mm_context_t; diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index 7ed0adb187a8..6e66ff940494 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -21,6 +21,7 @@ #include #include #include +#include #include #include @@ -224,6 +225,7 @@ static inline void __switch_mm(struct mm_struct *next) } check_and_switch_context(next, cpu); + perf_switch_user_access(next); } static inline void diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h index 2bdbc79bbd01..ba58fa726631 100644 --- a/arch/arm64/include/asm/perf_event.h +++ b/arch/arm64/include/asm/perf_event.h @@ -8,6 +8,7 @@ #include #include +#include #defineARMV8_PMU_MAX_COUNTERS 32 #defineARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1) @@ -223,4 +224,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs); (regs)->pstate = PSR_MODE_EL1h; \ } +static inline void perf_switch_user_access(struct mm_struct *mm) +{ + if (!IS_ENABLED(CONFIG_PERF_EVENTS)) + return; + + if (atomic_read(>context.pmu_direct_access)) { + write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR, +pmuserenr_el0); + } else { + write_sysreg(0, pmuserenr_el0); + } +} + #endif diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c index 2d06b8095a19..4844fe97d775 100644 --- a/drivers/perf/arm_pmu.c +++ b/drivers/perf/arm_pmu.c @@ -25,6 +25,7 @@ #include #include +#include static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu); static DEFINE_PER_CPU(int, cpu_irq); @@ -778,6 +779,41 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu) _pmu->node); } +static void refresh_pmuserenr(void *mm) +{ + perf_switch_user_access(mm); +} + +static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return; + + /* +* This function relies on not being called concurrently in two +* tasks in the 
same mm. Otherwise one task could observe +* pmu_direct_access > 1 and return all the way back to +* userspace with user access disabled while another task is still +* doing on_each_cpu_mask() to enable user access. +* +* For now, this can't happen because all callers hold mmap_sem +* for write. If this changes, we'll need a different solution. +*/ + lockdep_assert_held_write(>mmap_sem); + + if (atomic_inc_return(>context.pmu_direct_access) == 1) + on_each_cpu(refresh_pmuserenr, mm, 1); +} + +static void armpmu_event_unmapped(struct perf_event *event, struct mm_struct *mm) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return; + + if (atomic_dec_and_test(>context.pmu_direct_access)) + on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1); +} + static struct arm_pmu *__armpmu_alloc(gfp_t flags) { struct arm_pmu *pmu; @@ -799,6 +835,8 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags) .pmu_enable = armpmu_enable, .pmu_disable= armpmu_disable, .event_init = armpmu_event_init, + .event_mapped = armpmu_event_mapped, + .event_unmapped = armpmu_event_unmapped, .add= armpmu_add, .del= armpmu_del, .start = armpmu_start, -- 2.17.1
[PATCH v2 3/5] arm64: pmu: Add function implementation to update event index in userpage.
In order to be able to access the counter directly from userspace, we need to provide the index of the counter via the userpage. We thus need to override the event_idx function to retrieve the perf_event index and convert it to the armv8 hardware index. Since the arm_pmu driver can be used by any implementation, even non-armv8 ones, two components play a role in making sure the behaviour is correct and consistent with the PMU capabilities:
* the ARMPMU_EL0_RD_CNTR flag, which denotes the capability to access counters from userspace;
* the event_idx callback, which is implemented and initialized by the PMU implementation: if no callback is provided, the default behaviour applies, returning 0 as the index value.

Signed-off-by: Raphael Gault --- arch/arm64/kernel/perf_event.c | 22 ++ include/linux/perf/arm_pmu.h | 2 ++ 2 files changed, 24 insertions(+) diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 24575c0a0065..f6336197d29e 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -819,6 +819,22 @@ static void armv8pmu_clear_event_idx(struct pmu_hw_events *cpuc, clear_bit(idx - 1, cpuc->used_mask); } +static int armv8pmu_access_event_idx(struct perf_event *event) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return 0; + + /* +* We remap the cycle counter index to 32 to +* match the offset applied to the rest of +* the counter indices. +*/ + if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER) + return 32; + + return event->hw.idx; +} + /* * Add an event filter to a given event. 
*/ @@ -912,6 +928,9 @@ static int __armv8_pmuv3_map_event(struct perf_event *event, if (armv8pmu_event_is_64bit(event)) event->hw.flags |= ARMPMU_EVT_64BIT; + if (!cpus_have_const_cap(ARM64_HAS_HETEROGENEOUS_PMU)) + event->hw.flags |= ARMPMU_EL0_RD_CNTR; + /* Only expose micro/arch events supported by this PMU */ if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS) && test_bit(hw_event_id, armpmu->pmceid_bitmap)) { @@ -1032,6 +1051,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu) cpu_pmu->set_event_filter = armv8pmu_set_event_filter; cpu_pmu->filter_match = armv8pmu_filter_match; + cpu_pmu->pmu.event_idx = armv8pmu_access_event_idx; + return 0; } @@ -1210,6 +1231,7 @@ void arch_perf_update_userpage(struct perf_event *event, */ freq = arch_timer_get_rate(); userpg->cap_user_time = 1; + userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR); clocks_calc_mult_shift(>time_mult, , freq, NSEC_PER_SEC, 0); diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h index 71f525a35ac2..1106a9ac00fd 100644 --- a/include/linux/perf/arm_pmu.h +++ b/include/linux/perf/arm_pmu.h @@ -26,6 +26,8 @@ */ /* Event uses a 64bit counter */ #define ARMPMU_EVT_64BIT 1 +/* Allow access to hardware counter from userspace */ +#define ARMPMU_EL0_RD_CNTR 2 #define HW_OP_UNSUPPORTED 0x #define C(_x) PERF_COUNT_HW_CACHE_##_x -- 2.17.1
[PATCH v2 5/5] Documentation: arm64: Document PMU counters access from userspace
Add a documentation file to describe access to the PMU hardware counters from userspace. Signed-off-by: Raphael Gault --- .../arm64/pmu_counter_user_access.txt | 42 +++ 1 file changed, 42 insertions(+) create mode 100644 Documentation/arm64/pmu_counter_user_access.txt diff --git a/Documentation/arm64/pmu_counter_user_access.txt b/Documentation/arm64/pmu_counter_user_access.txt new file mode 100644 index ..6788b1107381 --- /dev/null +++ b/Documentation/arm64/pmu_counter_user_access.txt @@ -0,0 +1,42 @@ +Access to PMU hardware counter from userspace += + +Overview + +The perf user-space tool relies on the PMU to monitor events. It offers an +abstraction layer over the hardware counters since the underlying +implementation is cpu-dependent. +Arm64 allows userspace tools to access the registers storing the +hardware counters' values directly. + +This specifically targets self-monitoring tasks in order to reduce the overhead +of going through the kernel by accessing the registers directly. + +How-to -- +The focus is set on the armv8 pmuv3, which makes sure that access to the pmu +registers is enabled and that userspace has access to the relevant +information in order to use them. + +In order to have access to the hardware counter, it is necessary to open the event +using the perf tool interface: the sys_perf_event_open syscall returns an fd which +can subsequently be used with the mmap syscall in order to retrieve a page of memory +containing information about the event. +The PMU driver uses this page to expose the hardware counter's +index to the user. Using this index enables the user to access the PMU registers using the +`mrs` instruction. +
+Have a look at `tools/perf/arch/arm64/tests/user-events.c` for an example. It can be +run using the perf tool to check that the access to the registers works +correctly from userspace: + +./perf test -v + +About chained events + +When the user requests that an event be counted on 64 bits, two hardware +counters are used and need to be combined to retrieve the correct value: + +val = read_counter(idx); +if ((event.attr.config1 & 0x1)) + val = (val << 32) | read_counter(idx - 1); -- 2.17.1
[PATCH v2 0/5] arm64: Enable access to pmu registers by user-space
The perf user-space tool relies on the PMU to monitor events. It offers an abstraction layer over the hardware counters since the underlying implementation is cpu-dependent. We want to allow userspace tools to access the registers storing the hardware counters' values directly. This specifically targets self-monitoring tasks, in order to reduce the overhead of going through the kernel by accessing the registers directly. To do this we need to set up the PMU so that it exposes its registers to userspace access.

The first patch adds a test to the perf tool so that we can check that access to the registers works correctly from userspace. The second patch adds a capability to the arm64 cpufeatures framework in order to detect when we are running on a heterogeneous system. The third patch focuses on armv8 pmuv3 PMU support and makes sure that access to the PMU registers is enabled and that userspace has access to the relevant information in order to use them. The fourth patch puts in place callbacks to enable access to the hardware counters from userspace when a compatible event is opened using the perf API. The fifth patch adds short documentation about direct access to PMU counters from userspace.

**Changes since v1**
* Rebased on linux-next/master
* Do not include rseq materials (test and utilities) since we want to enable direct access to counters only on homogeneous systems.
* Do not include the hook definitions, for the same reason as above.
* Add a cpu feature/capability to detect heterogeneous systems.

Raphael Gault (5): perf: arm64: Add test to check userspace access to hardware counters. arm64: cpufeature: Add feature to detect heterogeneous systems arm64: pmu: Add function implementation to update event index in userpage.
arm64: perf: Enable pmu counter direct access for perf event on armv8 Documentation: arm64: Document PMU counters access from userspace .../arm64/pmu_counter_user_access.txt | 42 +++ arch/arm64/include/asm/cpucaps.h | 3 +- arch/arm64/include/asm/mmu.h | 6 + arch/arm64/include/asm/mmu_context.h | 2 + arch/arm64/include/asm/perf_event.h | 14 + arch/arm64/kernel/cpufeature.c| 20 ++ arch/arm64/kernel/perf_event.c| 23 ++ drivers/perf/arm_pmu.c| 38 +++ include/linux/perf/arm_pmu.h | 2 + tools/perf/arch/arm64/include/arch-tests.h| 6 + tools/perf/arch/arm64/tests/Build | 1 + tools/perf/arch/arm64/tests/arch-tests.c | 4 + tools/perf/arch/arm64/tests/user-events.c | 255 ++ 13 files changed, 415 insertions(+), 1 deletion(-) create mode 100644 Documentation/arm64/pmu_counter_user_access.txt create mode 100644 tools/perf/arch/arm64/tests/user-events.c -- 2.17.1
[PATCH v2 2/5] arm64: cpufeature: Add feature to detect heterogeneous systems
This feature is required in order to enable PMU counters direct access from userspace only when the system is homogeneous. This feature checks the model of each CPU brought online and compares it to the boot CPU. If it differs then it is heterogeneous. Cc: suzuki.poul...@arm.com Signed-off-by: Raphael Gault --- arch/arm64/include/asm/cpucaps.h | 3 ++- arch/arm64/kernel/cpufeature.c | 20 arch/arm64/kernel/perf_event.c | 1 + 3 files changed, 23 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index f19fe4b9acc4..040370af38ad 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -52,7 +52,8 @@ #define ARM64_HAS_IRQ_PRIO_MASKING 42 #define ARM64_HAS_DCPODP 43 #define ARM64_WORKAROUND_1463225 44 +#define ARM64_HAS_HETEROGENEOUS_PMU45 -#define ARM64_NCAPS45 +#define ARM64_NCAPS46 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index f29f36a65175..3527f329ba1a 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1248,6 +1248,15 @@ static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry, } #endif +static bool has_heterogeneous_pmu(const struct arm64_cpu_capabilities *entry, +int scope) +{ + u32 model = read_cpuid_id() & MIDR_CPU_MODEL_MASK; + struct cpuinfo_arm64 *boot = _cpu(cpu_data, 0); + + return (boot->reg_midr & MIDR_CPU_MODEL_MASK) != model; +} + static const struct arm64_cpu_capabilities arm64_features[] = { { .desc = "GIC system register CPU interface", @@ -1548,6 +1557,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .min_field_value = 1, }, #endif + { + /* +* Detect whether the system is heterogeneous or +* homogeneous +*/ + .desc = "Detect whether we have heterogeneous CPUs", + .capability = ARM64_HAS_HETEROGENEOUS_PMU, + .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU, + .matches = has_heterogeneous_pmu, + }, 
{}, }; @@ -1715,6 +1734,7 @@ static void __init setup_elf_hwcaps(const struct arm64_cpu_capabilities *hwcaps) cap_set_elf_hwcap(hwcaps); } + static void update_cpu_capabilities(u16 scope_mask) { int i; diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 96e90e270042..24575c0a0065 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -19,6 +19,7 @@ #include #include #include +#include /* ARMv8 Cortex-A53 specific event types. */ #define ARMV8_A53_PERFCTR_PREF_LINEFILL0xC2 -- 2.17.1
Re: [RFC V3 12/18] arm64: assembler: Add macro to annotate asm function having non standard stack-frame.
Hi, On 7/1/19 3:40 PM, Catalin Marinas wrote: On Mon, Jun 24, 2019 at 10:55:42AM +0100, Raphael Gault wrote: --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -752,4 +752,17 @@ USER(\label, icivau, \tmp2) // invalidate I line PoU .Lyield_out_\@ : .endm + /* +* This macro is the arm64 assembler equivalent of the +* macro STACK_FRAME_NON_STANDARD defined at +* ~/include/linux/frame.h +*/ + .macro asm_stack_frame_non_standard func +#ifdef CONFIG_STACK_VALIDATION + .pushsection ".discard.func_stack_frame_non_standard" + .8byte \func Nitpicks: Does .quad vs .8byte make any difference? No it doesn't, I'll use .quad then. Could we place this in include/linux/frame.h directly with a generic name (and some __ASSEMBLY__ guards)? It doesn't look to be arm specific. It might be more consistent indeed, I'll do that. Thanks, -- Raphael Gault
Re: [PATCH 3/7] perf: arm64: Use rseq to test userspace access to pmu counters
Hi Mark, Hi Will, Now that we have a better idea of what enabling this feature for heterogeneous systems would look like (both with and without using rseqs), it might be worth discussing whether this is in fact desirable in terms of the performance/complexity trade-off. Indeed, while not as scary as first thought, the rseq method might still dissuade users from using this feature. It is also worth noting that if we only enable this feature on homogeneous systems, the `mrs` hook/emulation would not be necessary. Given the complexity of the setup, we need to consider whether we want to upstream this and maintain it afterward. Thanks, -- Raphael Gault
Re: [PATCH 3/7] perf: arm64: Use rseq to test userspace access to pmu counters
Hi Mathieu, Hi Szabolcs, On 6/11/19 8:33 PM, Mathieu Desnoyers wrote: - On Jun 11, 2019, at 6:57 PM, Mark Rutland mark.rutl...@arm.com wrote: Hi Arnaldo, On Tue, Jun 11, 2019 at 11:33:46AM -0300, Arnaldo Carvalho de Melo wrote: Em Tue, Jun 11, 2019 at 01:53:11PM +0100, Raphael Gault escreveu: Add an extra test to check userspace access to pmu hardware counters. This test doesn't rely on the seqlock as a synchronisation mechanism but instead uses the restartable sequences to make sure that the thread is not interrupted when reading the index of the counter and the associated pmu register. In addition to reading the pmu counters, this test is run several time in order to measure the ratio of failures: I ran this test on the Juno development platform, which is big.LITTLE with 4 Cortex A53 and 2 Cortex A57. The results vary quite a lot (running it with 100 tests is not so long and I did it several times). I ran it once with 1 iterations: `runs: 1, abort: 62.53%, zero: 34.93%, success: 2.54%` Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/include/arch-tests.h| 5 +- tools/perf/arch/arm64/include/rseq-arm64.h| 220 ++ So, I applied the first patch in this series, but could you please break this patch into at least two, one introducing the facility (include/rseq*) and the second adding the test? We try to enforce this kind of granularity as down the line we may want to revert one part while the other already has other uses and thus wouldn't allow a straight revert. Also, can this go to tools/arch/ instead? Is this really perf specific? Isn't there any arch/arm64/include files for the kernel that we could mirror and have it checked for drift in tools/perf/check-headers.sh? The rseq bits aren't strictly perf specific, and I think the existing bits under tools/testing/selftests/rseq/ could be factored out to common locations under tools/include/ and tools/arch/*/include/. Hi Mark, Thanks for CCing me! 
Or into a stand-alone librseq project: https://github.com/compudj/librseq (currently a development branch in my own github) I don't see why this user-space code should sit in the kernel tree. It is not tooling-specific. From a scan, those already duplicate barriers and other helpers which already have definitions under tools/, which seems unfortunate. :/ Comments below are for Raphael and Matthieu. [...] +static u64 noinline mmap_read_self(void *addr, int cpu) +{ + struct perf_event_mmap_page *pc = addr; + u32 idx = 0; + u64 count = 0; + + asm volatile goto( + RSEQ_ASM_DEFINE_TABLE(0, 1f, 2f, 3f) +"nop\n" + RSEQ_ASM_STORE_RSEQ_CS(1, 0b, rseq_cs) +RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f) + RSEQ_ASM_OP_R_LOAD(pc_idx) + RSEQ_ASM_OP_R_AND(0xFF) +RSEQ_ASM_OP_R_STORE(idx) + RSEQ_ASM_OP_R_SUB(0x1) +RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f) + "msr pmselr_el0, " RSEQ_ASM_TMP_REG "\n" + "isb\n" +RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f) I really don't understand why the cpu_id needs to be compared 3 times here (?!?) Explicit comparison of the cpu_id within the rseq critical section should be done _once_. If the kernel happens to preempt and migrate the thread while in the critical section, it's the kernel's job to move user-space execution to the abort handler. + "mrs " RSEQ_ASM_TMP_REG ", pmxevcntr_el0\n" + RSEQ_ASM_OP_R_FINAL_STORE(cnt, 2) +"nop\n" + RSEQ_ASM_DEFINE_ABORT(3, abort) + :/* No output operands */ +: [cpu_id] "r" (cpu), + [current_cpu_id] "Qo" (__rseq_abi.cpu_id), + [rseq_cs] "m" (__rseq_abi.rseq_cs), + [cnt] "m" (count), + [pc_idx] "r" (>index), + [idx] "m" (idx) + :"memory" + :abort +); While baroque, this doesn't look as scary as I thought it would! That's good to hear :) However, I'm very scared that this is modifying input operands without clobbering them. IIUC this is beacause we're trying to use asm goto, which doesn't permit output operands. This is correct. 
What is wrong with modifying the target of "m" input operands in an inline asm that has a "memory" clobber ? gcc documentation at https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html states: "An asm goto statement cannot have outputs. This is due to an internal restriction of the compiler: control transfer instructions
[RFC V3 06/18] objtool: arm64: Adapt the stack frame checks for arm architecture
Since the initial stack frame set up when entering a function differs from what is done on the x86_64 architecture, we need to add some more checks to support the different cases. As opposed to x86_64, the return address is not stored by the call instruction but is instead loaded into a register. The initial stack frame is thus empty when entering a function, and 2 push operations are needed to set it up correctly. All the different combinations need to be taken into account. Signed-off-by: Raphael Gault --- tools/objtool/arch.h | 2 + tools/objtool/arch/arm64/decode.c | 28 + tools/objtool/arch/x86/decode.c | 5 ++ tools/objtool/check.c | 100 -- tools/objtool/elf.c | 3 +- 5 files changed, 131 insertions(+), 7 deletions(-) diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h index 53a72477c352..723600aae13f 100644 --- a/tools/objtool/arch.h +++ b/tools/objtool/arch.h @@ -89,4 +89,6 @@ unsigned long arch_jump_destination(struct instruction *insn); unsigned long arch_dest_rela_offset(int addend); +bool arch_is_insn_sibling_call(struct instruction *insn); + #endif /* _ARCH_H */ diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c index 6c77ad1a08ec..5be1d87b1a1c 100644 --- a/tools/objtool/arch/arm64/decode.c +++ b/tools/objtool/arch/arm64/decode.c @@ -106,6 +106,34 @@ unsigned long arch_dest_rela_offset(int addend) return addend; } +/* + * In order to know if we are in presence of a sibling + * call and not in presence of a switch table we look + * back at the previous instructions and see if we are + * jumping inside the same function that we are already + * in. 
+ */ +bool arch_is_insn_sibling_call(struct instruction *insn) +{ + struct instruction *prev; + struct list_head *l; + struct symbol *sym; + list_for_each_prev(l, &insn->list) { + prev = list_entry(l, struct instruction, list); + if (!prev->func || + prev->func->pfunc != insn->func->pfunc) + return false; + if (prev->stack_op.src.reg != ADR_SOURCE) + continue; + sym = find_symbol_containing(insn->sec, insn->immediate); + if (!sym || sym->type != STT_FUNC) + return false; + else if (sym->type == STT_FUNC) + return true; + break; + } + return false; +} static int is_arm64(struct elf *elf) { switch (elf->ehdr.e_machine) { diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index 8767ee935c47..e2087ddced69 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -72,6 +72,11 @@ unsigned long arch_dest_rela_offset(int addend) return addend + 4; } +bool arch_is_insn_sibling_call(struct instruction *insn) +{ + return true; +} + int arch_decode_instruction(struct elf *elf, struct section *sec, unsigned long offset, unsigned int maxlen, unsigned int *len, unsigned char *type, diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 0fba7b70d73a..3172f49c3a58 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -568,10 +568,10 @@ static int add_jump_destinations(struct objtool_file *file) dest_off = arch_jump_destination(insn); } else if (rela->sym->type == STT_SECTION) { dest_sec = rela->sym->sec; - dest_off = rela->addend + 4; + dest_off = arch_dest_rela_offset(rela->addend); } else if (rela->sym->sec->idx) { dest_sec = rela->sym->sec; - dest_off = rela->sym->sym.st_value + rela->addend + 4; + dest_off = rela->sym->sym.st_value + arch_dest_rela_offset(rela->addend); } else if (strstr(rela->sym->name, "_indirect_thunk_")) { /* * Retpoline jumps are really dynamic jumps in @@ -1339,8 +1339,8 @@ static void save_reg(struct insn_state *state, unsigned char reg, int base, static void restore_reg(struct 
insn_state *state, unsigned char reg) { - state->regs[reg].base = CFI_UNDEFINED; - state->regs[reg].offset = 0; + state->regs[reg].base = initial_func_cfi.regs[reg].base; + state->regs[reg].offset = initial_func_cfi.regs[reg].offset; } /* @@ -1496,8 +1496,32 @@ static int update_insn_state(struct instruction *insn, struct insn_state *state) /* add imm, %rsp */ state->stack_size -= op->src.offset; - if (cfa->base == CFI_SP) +
[RFC V3 10/18] objtool: arm64: Implement functions to add switch tables alternatives
This patch implements the functions required to identify and add as alternatives all the possible destinations of the switch table. This implementation relies on the new plugin introduced previously which records information about the switch-table in a .objtool_data section. Signed-off-by: Raphael Gault --- tools/objtool/arch/arm64/arch_special.c | 142 +- tools/objtool/arch/arm64/decode.c | 2 +- .../objtool/arch/arm64/include/arch_special.h | 10 ++ .../objtool/arch/arm64/include/insn_decode.h | 3 +- tools/objtool/check.c | 8 +- tools/objtool/check.h | 2 + 6 files changed, 157 insertions(+), 10 deletions(-) diff --git a/tools/objtool/arch/arm64/arch_special.c b/tools/objtool/arch/arm64/arch_special.c index a0f7066994b5..33f30876d339 100644 --- a/tools/objtool/arch/arm64/arch_special.c +++ b/tools/objtool/arch/arm64/arch_special.c @@ -12,8 +12,13 @@ * You should have received a copy of the GNU General Public License * along with this program; if not, see <http://www.gnu.org/licenses/>. */ + +#include +#include + #include "../../special.h" #include "arch_special.h" +#include "bit_operations.h" void arch_force_alt_path(unsigned short feature, bool uaccess, @@ -21,9 +26,141 @@ void arch_force_alt_path(unsigned short feature, { } +static u32 next_offset(u8 *table, u8 entry_size) +{ + switch (entry_size) { + case 1: + return table[0]; + case 2: + return *(u16 *)(table); + default: + return *(u32 *)(table); + } +} + +static u32 get_table_entry_size(u32 insn) +{ + unsigned char size = (insn >> 30) & ONES(2); + switch (size) { + case 0: + return 1; + case 1: + return 2; + default: + return 4; + } +} + +static int add_possible_branch(struct objtool_file *file, + struct instruction *insn, + u32 base, u32 offset) +{ + struct instruction *new_insn; + struct alternative *alt; + offset = base + 4 * offset; + new_insn = calloc(1, sizeof(*new_insn)); + + if (new_insn == NULL) { + WARN("allocation failure, can't add jump alternative"); + return -1; + } + + memcpy(new_insn, insn, 
sizeof(*insn)); + alt = calloc(1, sizeof(*alt)); + + if (alt == NULL) { + WARN("allocation failure, can't add jump alternative"); + return -1; + } + + new_insn->type = INSN_JUMP_UNCONDITIONAL; + new_insn->immediate = offset; + INIT_LIST_HEAD(&new_insn->alts); + new_insn->jump_dest = find_insn(file, insn->sec, offset); + alt->insn = new_insn; + alt->skip_orig = true; + list_add_tail(&alt->list, &insn->alts); + list_add_tail(&new_insn->list, &file->insn_list); + return 0; +} + int arch_add_switch_table(struct objtool_file *file, struct instruction *insn, - struct rela *table, struct rela *next_table) + struct rela *table, struct rela *next_table) { + struct rela *objtool_data_rela = NULL; + struct switch_table_info *swt_info = NULL; + struct section *objtool_data = find_section_by_name(file->elf, ".objtool_data"); + struct section *rodata_sec = find_section_by_name(file->elf, ".rodata"); + struct section *branch_sec = NULL; + u8 *switch_table = NULL; + u64 base_offset = 0; + struct instruction *pre_jump_insn; + u32 sec_size = 0; + u32 entry_size = 0; + u32 offset = 0; + u32 i, j; + + if (objtool_data == NULL) + return 0; + + /* +* 1. Using rela, Identify entry for the switch table +* 2. Retrieve base offset +* 3. Retrieve branch instruction +* 3. For all entries in switch table: +* 3.1. Compute new offset +* 3.2. Create alternative instruction +* 3.3. Add alt_instr to insn->alts list +*/ + sec_size = objtool_data->sh.sh_size; + for (i = 0, swt_info = (void *)objtool_data->data->d_buf; +i < sec_size / sizeof(struct switch_table_info); +i++, swt_info++) { + offset = i * sizeof(struct switch_table_info); + objtool_data_rela = find_rela_by_dest_range(objtool_data, offset, + sizeof(u64)); + /* retrieving the objtool data of the switch table we need */ + if (objtool_data_rela == NULL || + table->sym->sec != objtool_data_rela->sym->sec || + table->addend != objtool_data_rela->addend) +
[RFC V3 04/18] objtool: arm64: Add required implementation for supporting the aarch64 architecture in objtool.
Provide implementation for the arch-dependent functions that are called by the main check function of objtool. The ORC unwinder is not yet supported by the arm64 architecture so we only provide a dummy interface for now. The decoding of the instruction is split into classes and subclasses as described into the Instruction Encoding in the ArmV8.5 Architecture Reference Manual. In order to handle the load/store instructions for a pair of registers we add an extra field to the stack_op structure. We consider that the hypervisor/secure-monitor is behaving correctly. This enables us to handle hvc/smc/svc context switching instructions as nop since we consider that the context is restored correctly. Signed-off-by: Raphael Gault --- tools/objtool/arch.h |7 + tools/objtool/arch/arm64/Build|7 + tools/objtool/arch/arm64/bit_operations.c | 67 + tools/objtool/arch/arm64/decode.c | 2781 + .../objtool/arch/arm64/include/arch_special.h | 36 + .../arch/arm64/include/asm/orc_types.h| 96 + .../arch/arm64/include/bit_operations.h | 24 + tools/objtool/arch/arm64/include/cfi.h| 74 + .../objtool/arch/arm64/include/insn_decode.h | 211 ++ tools/objtool/arch/arm64/orc_dump.c | 26 + tools/objtool/arch/arm64/orc_gen.c| 40 + 11 files changed, 3369 insertions(+) create mode 100644 tools/objtool/arch/arm64/Build create mode 100644 tools/objtool/arch/arm64/bit_operations.c create mode 100644 tools/objtool/arch/arm64/decode.c create mode 100644 tools/objtool/arch/arm64/include/arch_special.h create mode 100644 tools/objtool/arch/arm64/include/asm/orc_types.h create mode 100644 tools/objtool/arch/arm64/include/bit_operations.h create mode 100644 tools/objtool/arch/arm64/include/cfi.h create mode 100644 tools/objtool/arch/arm64/include/insn_decode.h create mode 100644 tools/objtool/arch/arm64/orc_dump.c create mode 100644 tools/objtool/arch/arm64/orc_gen.c diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h index ce7db772248e..53a72477c352 100644 --- a/tools/objtool/arch.h +++ 
b/tools/objtool/arch.h @@ -60,9 +60,16 @@ struct op_src { int offset; }; +struct op_extra { + unsigned char used; + unsigned char reg; + int offset; +}; + struct stack_op { struct op_dest dest; struct op_src src; + struct op_extra extra; }; struct instruction; diff --git a/tools/objtool/arch/arm64/Build b/tools/objtool/arch/arm64/Build new file mode 100644 index ..bf7a32c2b9e9 --- /dev/null +++ b/tools/objtool/arch/arm64/Build @@ -0,0 +1,7 @@ +objtool-y += decode.o +objtool-y += orc_dump.o +objtool-y += orc_gen.o +objtool-y += bit_operations.o + + +CFLAGS_decode.o += -I$(OUTPUT)arch/arm64/lib diff --git a/tools/objtool/arch/arm64/bit_operations.c b/tools/objtool/arch/arm64/bit_operations.c new file mode 100644 index ..f457a14a7f5d --- /dev/null +++ b/tools/objtool/arch/arm64/bit_operations.c @@ -0,0 +1,67 @@ +#include +#include +#include "bit_operations.h" + +#include "../../warn.h" + +u64 replicate(u64 x, int size, int n) +{ + u64 ret = 0; + + while (n >= 0) { + ret = (ret | x) << size; + n--; + } + return ret | x; +} + +u64 ror(u64 x, int size, int shift) +{ + int m = shift % size; + + if (shift == 0) + return x; + return ZERO_EXTEND((x >> m) | (x << (size - m)), size); +} + +int highest_set_bit(u32 x) +{ + int i; + + for (i = 31; i >= 0; i--, x <<= 1) + if (x & 0x8000) + return i; + return 0; +} + +/* imms and immr are both 6 bit long */ +__uint128_t decode_bit_masks(unsigned char N, unsigned char imms, +unsigned char immr, bool immediate) +{ + u64 tmask, wmask; + u32 diff, S, R, esize, welem, telem; + unsigned char levels = 0, len = 0; + + len = highest_set_bit((N << 6) | ((~imms) & ONES(6))); + levels = ZERO_EXTEND(ONES(len), 6); + + if (immediate && ((imms & levels) == levels)) { + WARN("unknown instruction"); + return -1; + } + + S = imms & levels; + R = immr & levels; + diff = ZERO_EXTEND(S - R, 6); + + esize = 1 << len; + diff = diff & ONES(len); + + welem = ZERO_EXTEND(ONES(S + 1), esize); + telem = ZERO_EXTEND(ONES(diff + 1), esize); + + wmask = 
replicate(ror(welem, esize, R), esize, 64 / esize); + tmask = replicate(telem, esize, 64 / esize); + + return ((__uint128_t)wmask << 64) | tmask; +} diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c new file mode 100644 index ..6c77ad1a08ec --- /dev/null +++ b/tools/objtool/arch/arm64/decode.c @@ -0,0 +1
[RFC V3 02/18] objtool: orc: Refactor ORC API for other architectures to implement.
The ORC unwinder is only supported on x86 at the moment and should thus be in the x86 architecture code. In order not to break the whole structure in case another architecture decides to support the ORC unwinder via objtool we choose to let the implementation be done in the architecture dependent code. Signed-off-by: Raphael Gault --- tools/objtool/Build | 2 - tools/objtool/arch.h| 3 + tools/objtool/arch/x86/Build| 2 + tools/objtool/{ => arch/x86}/orc_dump.c | 4 +- tools/objtool/{ => arch/x86}/orc_gen.c | 104 ++-- tools/objtool/check.c | 99 +- tools/objtool/orc.h | 4 +- 7 files changed, 111 insertions(+), 107 deletions(-) rename tools/objtool/{ => arch/x86}/orc_dump.c (98%) rename tools/objtool/{ => arch/x86}/orc_gen.c (66%) diff --git a/tools/objtool/Build b/tools/objtool/Build index 749becdf5b90..2ed83344e0a5 100644 --- a/tools/objtool/Build +++ b/tools/objtool/Build @@ -2,8 +2,6 @@ objtool-y += arch/$(SRCARCH)/ objtool-y += builtin-check.o objtool-y += builtin-orc.o objtool-y += check.o -objtool-y += orc_gen.o -objtool-y += orc_dump.o objtool-y += elf.o objtool-y += special.o objtool-y += objtool.o diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h index 2a38a834cf40..ce7db772248e 100644 --- a/tools/objtool/arch.h +++ b/tools/objtool/arch.h @@ -10,6 +10,7 @@ #include #include "elf.h" #include "cfi.h" +#include "orc.h" #define INSN_JUMP_CONDITIONAL 1 #define INSN_JUMP_UNCONDITIONAL2 @@ -75,6 +76,8 @@ int arch_decode_instruction(struct elf *elf, struct section *sec, bool arch_callee_saved_reg(unsigned char reg); +int arch_orc_read_unwind_hints(struct objtool_file *file); + unsigned long arch_jump_destination(struct instruction *insn); unsigned long arch_dest_rela_offset(int addend); diff --git a/tools/objtool/arch/x86/Build b/tools/objtool/arch/x86/Build index b998412c017d..1f11b45999d0 100644 --- a/tools/objtool/arch/x86/Build +++ b/tools/objtool/arch/x86/Build @@ -1,4 +1,6 @@ objtool-y += decode.o +objtool-y += orc_dump.o +objtool-y += orc_gen.o 
inat_tables_script = arch/x86/tools/gen-insn-attr-x86.awk inat_tables_maps = arch/x86/lib/x86-opcode-map.txt diff --git a/tools/objtool/orc_dump.c b/tools/objtool/arch/x86/orc_dump.c similarity index 98% rename from tools/objtool/orc_dump.c rename to tools/objtool/arch/x86/orc_dump.c index 13ccf775a83a..cfe8f96bdd68 100644 --- a/tools/objtool/orc_dump.c +++ b/tools/objtool/arch/x86/orc_dump.c @@ -4,8 +4,8 @@ */ #include -#include "orc.h" -#include "warn.h" +#include "../../orc.h" +#include "../../warn.h" static const char *reg_name(unsigned int reg) { diff --git a/tools/objtool/orc_gen.c b/tools/objtool/arch/x86/orc_gen.c similarity index 66% rename from tools/objtool/orc_gen.c rename to tools/objtool/arch/x86/orc_gen.c index 27a4112848c2..b4f285bf5271 100644 --- a/tools/objtool/orc_gen.c +++ b/tools/objtool/arch/x86/orc_gen.c @@ -6,11 +6,11 @@ #include #include -#include "orc.h" -#include "check.h" -#include "warn.h" +#include "../../orc.h" +#include "../../check.h" +#include "../../warn.h" -int create_orc(struct objtool_file *file) +int arch_create_orc(struct objtool_file *file) { struct instruction *insn; @@ -116,7 +116,7 @@ static int create_orc_entry(struct section *u_sec, struct section *ip_relasec, return 0; } -int create_orc_sections(struct objtool_file *file) +int arch_create_orc_sections(struct objtool_file *file) { struct instruction *insn, *prev_insn; struct section *sec, *u_sec, *ip_relasec; @@ -209,3 +209,97 @@ int create_orc_sections(struct objtool_file *file) return 0; } + +int arch_orc_read_unwind_hints(struct objtool_file *file) +{ + struct section *sec, *relasec; + struct rela *rela; + struct unwind_hint *hint; + struct instruction *insn; + struct cfi_reg *cfa; + int i; + + sec = find_section_by_name(file->elf, ".discard.unwind_hints"); + if (!sec) + return 0; + + relasec = sec->rela; + if (!relasec) { + WARN("missing .rela.discard.unwind_hints section"); + return -1; + } + + if (sec->len % sizeof(struct unwind_hint)) { + WARN("struct unwind_hint 
size mismatch"); + return -1; + } + + file->hints = true; + + for (i = 0; i < sec->len / sizeof(struct unwind_hint); i++) { + hint = (struct unwind_hint *)sec->data->d_buf + i; + + rela = find_rela_by_dest(sec, i * sizeof(*hint)); + if (!rela) { + WARN("can't find rela for unwind_hints[%d]", i); +
[RFC V3 07/18] objtool: Introduce INSN_UNKNOWN type
On arm64 some object files contain data stored in the .text section. This data is interpreted by objtool as instructions but can't be identified as valid ones. In order to keep analysing those files we introduce the INSN_UNKNOWN type. The "unknown instruction" warning will thus only be raised if such instructions are encountered while validating an execution branch. This change doesn't impact the x86 decoding logic, since 0 is still used to mark an unknown type and still raises the "unknown instruction" warning during the decoding phase. Signed-off-by: Raphael Gault --- tools/objtool/arch.h | 3 ++- tools/objtool/arch/arm64/decode.c | 8 tools/objtool/arch/arm64/include/insn_decode.h | 4 ++-- tools/objtool/check.c | 10 +- 4 files changed, 17 insertions(+), 8 deletions(-) diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h index 723600aae13f..f3f94e2a1403 100644 --- a/tools/objtool/arch.h +++ b/tools/objtool/arch.h @@ -26,7 +26,8 @@ #define INSN_CLAC 12 #define INSN_STD 13 #define INSN_CLD 14 -#define INSN_OTHER 15 +#define INSN_UNKNOWN 15 +#define INSN_OTHER 16 #define INSN_LAST INSN_OTHER enum op_dest_type { diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c index 5be1d87b1a1c..a40338a895f5 100644 --- a/tools/objtool/arch/arm64/decode.c +++ b/tools/objtool/arch/arm64/decode.c @@ -37,9 +37,9 @@ */ static arm_decode_class aarch64_insn_class_decode_table[] = { [INSN_RESERVED] = arm_decode_reserved, - [INSN_UNKNOWN] = arm_decode_unknown, + [INSN_UNALLOC_1] = arm_decode_unknown, [INSN_SVE_ENC] = arm_decode_sve_encoding, - [INSN_UNALLOC] = arm_decode_unknown, + [INSN_UNALLOC_2] = arm_decode_unknown, [INSN_LD_ST_4] = arm_decode_ld_st, [INSN_DP_REG_5] = arm_decode_dp_reg, [INSN_LD_ST_6] = arm_decode_ld_st, @@ -191,7 +191,7 @@ int arch_decode_instruction(struct elf *elf, struct section *sec, int arm_decode_unknown(u32 instr, unsigned char *type, unsigned long *immediate, struct stack_op *op) { - *type = 0; + *type = INSN_UNKNOWN; 
return 0; } @@ -206,7 +206,7 @@ int arm_decode_reserved(u32 instr, unsigned char *type, unsigned long *immediate, struct stack_op *op) { *immediate = instr & ONES(16); - *type = INSN_BUG; + *type = INSN_UNKNOWN; return 0; } diff --git a/tools/objtool/arch/arm64/include/insn_decode.h b/tools/objtool/arch/arm64/include/insn_decode.h index eb54fc39dca5..a01d76306749 100644 --- a/tools/objtool/arch/arm64/include/insn_decode.h +++ b/tools/objtool/arch/arm64/include/insn_decode.h @@ -20,9 +20,9 @@ #include "../../../arch.h" #define INSN_RESERVED 0b0000 -#define INSN_UNKNOWN 0b0001 +#define INSN_UNALLOC_1 0b0001 #define INSN_SVE_ENC 0b0010 -#define INSN_UNALLOC 0b0011 +#define INSN_UNALLOC_2 0b0011 #define INSN_DP_IMM 0b1001 //0b100x #define INSN_BRANCH 0b1011 //0b101x #define INSN_LD_ST_4 0b0100 //0bx1x0 diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 3172f49c3a58..cba1d91451cc 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1952,6 +1952,13 @@ static int validate_branch(struct objtool_file *file, struct instruction *first, while (1) { next_insn = next_insn_same_sec(file, insn); + if (insn->type == INSN_UNKNOWN) { + WARN("%s+0x%lx unknown instruction type, should never be reached", +insn->sec->name, +insn->offset); + return 1; + } + if (file->c_file && func && insn->func && func != insn->func->pfunc) { WARN("%s() falls through to next function %s()", func->name, insn->func->name); @@ -2383,7 +2390,8 @@ static int validate_reachable_instructions(struct objtool_file *file) return 0; for_each_insn(file, insn) { - if (insn->visited || ignore_unreachable_insn(insn)) + if (insn->visited || ignore_unreachable_insn(insn) || + insn->type == INSN_UNKNOWN) continue; WARN_FUNC("unreachable instruction", insn->sec, insn->offset); -- 2.17.1
[RFC V3 05/18] objtool: special: Adapt special section handling
This patch abstracts the few architecture-dependent tests that are performed when handling special sections and switch tables. It enables any architecture to ignore a particular CPU feature or not to handle switch tables. Signed-off-by: Raphael Gault --- tools/objtool/arch/arm64/Build | 1 + tools/objtool/arch/arm64/arch_special.c | 22 +++ .../objtool/arch/arm64/include/arch_special.h | 10 +-- tools/objtool/arch/x86/Build | 1 + tools/objtool/arch/x86/arch_special.c | 28 +++ tools/objtool/arch/x86/include/arch_special.h | 9 ++ tools/objtool/check.c | 15 -- tools/objtool/special.c | 9 ++ tools/objtool/special.h | 3 ++ 9 files changed, 87 insertions(+), 11 deletions(-) create mode 100644 tools/objtool/arch/arm64/arch_special.c create mode 100644 tools/objtool/arch/x86/arch_special.c diff --git a/tools/objtool/arch/arm64/Build b/tools/objtool/arch/arm64/Build index bf7a32c2b9e9..3d09be745a84 100644 --- a/tools/objtool/arch/arm64/Build +++ b/tools/objtool/arch/arm64/Build @@ -1,3 +1,4 @@ +objtool-y += arch_special.o objtool-y += decode.o objtool-y += orc_dump.o objtool-y += orc_gen.o diff --git a/tools/objtool/arch/arm64/arch_special.c b/tools/objtool/arch/arm64/arch_special.c new file mode 100644 index ..a21d28876317 --- /dev/null +++ b/tools/objtool/arch/arm64/arch_special.c @@ -0,0 +1,22 @@ +/* + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. 
+ */ +#include "../../special.h" +#include "arch_special.h" + +void arch_force_alt_path(unsigned short feature, +bool uaccess, +struct special_alt *alt) +{ +} diff --git a/tools/objtool/arch/arm64/include/arch_special.h b/tools/objtool/arch/arm64/include/arch_special.h index 63da775d0581..185103be8a51 100644 --- a/tools/objtool/arch/arm64/include/arch_special.h +++ b/tools/objtool/arch/arm64/include/arch_special.h @@ -30,7 +30,13 @@ #define ALT_ORIG_LEN_OFFSET10 #define ALT_NEW_LEN_OFFSET 11 -#define X86_FEATURE_POPCNT (4 * 32 + 23) -#define X86_FEATURE_SMAP (9 * 32 + 20) +static inline bool arch_should_ignore_feature(unsigned short feature) +{ + return false; +} +static inline bool arch_support_switch_table(void) +{ + return false; +} #endif /* _ARM64_ARCH_SPECIAL_H */ diff --git a/tools/objtool/arch/x86/Build b/tools/objtool/arch/x86/Build index 1f11b45999d0..63e167775bc8 100644 --- a/tools/objtool/arch/x86/Build +++ b/tools/objtool/arch/x86/Build @@ -1,3 +1,4 @@ +objtool-y += arch_special.o objtool-y += decode.o objtool-y += orc_dump.o objtool-y += orc_gen.o diff --git a/tools/objtool/arch/x86/arch_special.c b/tools/objtool/arch/x86/arch_special.c new file mode 100644 index ..6583a1770bb2 --- /dev/null +++ b/tools/objtool/arch/x86/arch_special.c @@ -0,0 +1,28 @@ +/* + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. 
+ */ +#include "../../special.h" +#include "arch_special.h" + +void arch_force_alt_path(unsigned short feature, +bool uaccess, +struct special_alt *alt) +{ + if (feature == X86_FEATURE_SMAP) { + if (uaccess) + alt->skip_orig = true; + else + alt->skip_alt = true; + } +} diff --git a/tools/objtool/arch/x86/include/arch_special.h b/tools/objtool/arch/x86/include/arch_special.h index 424ce47013e3..fce2b1193194 100644 --- a/tools/objtool/arch/x86/include/arch_special.h +++ b/tools/objtool/arch/x86/include/arch_special.h @@ -33,4 +33,13 @@ #define
[RFC V3 08/18] objtool: Refactor switch-tables code to support other architectures
The way to identify switch tables and retrieve all the data necessary to handle the different execution branches is not the same on all architectures. In order to be able to add support for other architectures, this patch defines arch-dependent functions to process jump tables. Signed-off-by: Raphael Gault --- tools/objtool/arch/arm64/arch_special.c | 15 + tools/objtool/arch/x86/arch_special.c | 73 + tools/objtool/check.c | 84 + tools/objtool/check.h | 7 +++ tools/objtool/special.h | 10 ++- 5 files changed, 107 insertions(+), 82 deletions(-) diff --git a/tools/objtool/arch/arm64/arch_special.c b/tools/objtool/arch/arm64/arch_special.c index a21d28876317..a0f7066994b5 100644 --- a/tools/objtool/arch/arm64/arch_special.c +++ b/tools/objtool/arch/arm64/arch_special.c @@ -20,3 +20,18 @@ void arch_force_alt_path(unsigned short feature, struct special_alt *alt) { } + +int arch_add_switch_table(struct objtool_file *file, struct instruction *insn, + struct rela *table, struct rela *next_table) +{ + return 0; +} + +struct rela *arch_find_switch_table(struct objtool_file *file, + struct rela *text_rela, + struct section *rodata_sec, + unsigned long table_offset) +{ + file->ignore_unreachables = true; + return NULL; +} diff --git a/tools/objtool/arch/x86/arch_special.c b/tools/objtool/arch/x86/arch_special.c index 6583a1770bb2..38ac010f8a02 100644 --- a/tools/objtool/arch/x86/arch_special.c +++ b/tools/objtool/arch/x86/arch_special.c @@ -26,3 +26,76 @@ void arch_force_alt_path(unsigned short feature, alt->skip_alt = true; } } + +int arch_add_switch_table(struct objtool_file *file, struct instruction *insn, + struct rela *table, struct rela *next_table) +{ + struct rela *rela = table; + struct instruction *alt_insn; + struct alternative *alt; + struct symbol *pfunc = insn->func->pfunc; + unsigned int prev_offset = 0; + + list_for_each_entry_from(rela, &table->rela_sec->rela_list, list) { + if (rela == next_table) + break; + + /* Make sure the switch table entries are consecutive: */ + if 
(prev_offset && rela->offset != prev_offset + 8) + break; + + /* Detect function pointers from contiguous objects: */ + if (rela->sym->sec == pfunc->sec && + rela->addend == pfunc->offset) + break; + + alt_insn = find_insn(file, rela->sym->sec, rela->addend); + if (!alt_insn) + break; + + /* Make sure the jmp dest is in the function or subfunction: */ + if (alt_insn->func->pfunc != pfunc) + break; + + alt = malloc(sizeof(*alt)); + if (!alt) { + WARN("malloc failed"); + return -1; + } + + alt->insn = alt_insn; + list_add_tail(&alt->list, &insn->alts); + prev_offset = rela->offset; + } + + if (!prev_offset) { + WARN_FUNC("can't find switch jump table", + insn->sec, insn->offset); + return -1; + } + + return 0; +} + +struct rela *arch_find_switch_table(struct objtool_file *file, + struct rela *text_rela, + struct section *rodata_sec, + unsigned long table_offset) +{ + struct rela *rodata_rela; + + rodata_rela = find_rela_by_dest(rodata_sec, table_offset); + if (rodata_rela) { + /* + * Use of RIP-relative switch jumps is quite rare, and + * indicates a rare GCC quirk/bug which can leave dead + * code behind. + */ + if (text_rela->type == R_X86_64_PC32) + file->ignore_unreachables = true; + + return rodata_rela; + } + + return NULL; +} diff --git a/tools/objtool/check.c b/tools/objtool/check.c index cba1d91451cc..ce1165ce448a 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -18,12 +18,6 @@ #define FAKE_JUMP_OFFSET -1 -struct alternative { - struct list_head list; - struct instruction *insn; - bool skip_orig; -}; - const char *objname; struct cfi_state initial_func_cfi; @@ -901,56 +895,6 @@ static int add_special_section_alts(struct objtool_file *file) return ret; } -static int add_switch_table(st
[RFC V3 03/18] objtool: Move registers and control flow to arch-dependent code
The control flow information and register macro definitions were based on the x86_64 architecture but should be abstracted so that each architecture can define the correct values for the registers, especially the registers related to the stack frame (Frame Pointer, Stack Pointer and possibly Return Address). Signed-off-by: Raphael Gault --- tools/objtool/arch/x86/include/arch_special.h | 36 +++ tools/objtool/{ => arch/x86/include}/cfi.h| 0 tools/objtool/check.h | 1 + tools/objtool/special.c | 19 +- 4 files changed, 38 insertions(+), 18 deletions(-) create mode 100644 tools/objtool/arch/x86/include/arch_special.h rename tools/objtool/{ => arch/x86/include}/cfi.h (100%) diff --git a/tools/objtool/arch/x86/include/arch_special.h b/tools/objtool/arch/x86/include/arch_special.h new file mode 100644 index ..424ce47013e3 --- /dev/null +++ b/tools/objtool/arch/x86/include/arch_special.h @@ -0,0 +1,36 @@ +/* + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */ +#ifndef _X86_ARCH_SPECIAL_H +#define _X86_ARCH_SPECIAL_H + +#define EX_ENTRY_SIZE 12 +#define EX_ORIG_OFFSET 0 +#define EX_NEW_OFFSET 4 + +#define JUMP_ENTRY_SIZE 16 +#define JUMP_ORIG_OFFSET 0 +#define JUMP_NEW_OFFSET 4 + +#define ALT_ENTRY_SIZE 13 +#define ALT_ORIG_OFFSET 0 +#define ALT_NEW_OFFSET 4 +#define ALT_FEATURE_OFFSET 8 +#define ALT_ORIG_LEN_OFFSET 10 +#define ALT_NEW_LEN_OFFSET 11 + +#define X86_FEATURE_POPCNT (4 * 32 + 23) +#define X86_FEATURE_SMAP (9 * 32 + 20) + +#endif /* _X86_ARCH_SPECIAL_H */ diff --git a/tools/objtool/cfi.h b/tools/objtool/arch/x86/include/cfi.h similarity index 100% rename from tools/objtool/cfi.h rename to tools/objtool/arch/x86/include/cfi.h diff --git a/tools/objtool/check.h b/tools/objtool/check.h index cb60b9acf5cf..c44f9fe40178 100644 --- a/tools/objtool/check.h +++ b/tools/objtool/check.h @@ -11,6 +11,7 @@ #include "cfi.h" #include "arch.h" #include "orc.h" +#include "arch_special.h" #include struct insn_state { diff --git a/tools/objtool/special.c b/tools/objtool/special.c index fdbaa611146d..b8ccee1b5382 100644 --- a/tools/objtool/special.c +++ b/tools/objtool/special.c @@ -14,24 +14,7 @@ #include "builtin.h" #include "special.h" #include "warn.h" - -#define EX_ENTRY_SIZE 12 -#define EX_ORIG_OFFSET 0 -#define EX_NEW_OFFSET 4 - -#define JUMP_ENTRY_SIZE 16 -#define JUMP_ORIG_OFFSET 0 -#define JUMP_NEW_OFFSET 4 - -#define ALT_ENTRY_SIZE 13 -#define ALT_ORIG_OFFSET 0 -#define ALT_NEW_OFFSET 4 -#define ALT_FEATURE_OFFSET 8 -#define ALT_ORIG_LEN_OFFSET 10 -#define ALT_NEW_LEN_OFFSET 11 - -#define X86_FEATURE_POPCNT (4*32+23) -#define X86_FEATURE_SMAP (9*32+20) +#include "arch_special.h" struct special_entry { const char *sec; -- 2.17.1
[RFC V3 13/18] arm64: sleep: Prevent stack frame warnings from objtool
This code doesn't respect the Arm PCS, but this is intentional: adapting it to respect the PCS would alter its behaviour. In order to suppress objtool's warnings, we set up a stack frame for __cpu_suspend_enter and annotate cpu_resume and _cpu_resume as having non-standard stack frames. Signed-off-by: Raphael Gault --- arch/arm64/kernel/sleep.S | 4 1 file changed, 4 insertions(+) diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S index 3e53ffa07994..eb434525fe82 100644 --- a/arch/arm64/kernel/sleep.S +++ b/arch/arm64/kernel/sleep.S @@ -90,6 +90,7 @@ ENTRY(__cpu_suspend_enter) str x0, [x1] add x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS stp x29, lr, [sp, #-16]! + mov x29, sp bl cpu_do_suspend ldp x29, lr, [sp], #16 mov x0, #1 @@ -146,3 +147,6 @@ ENTRY(_cpu_resume) mov x0, #0 ret ENDPROC(_cpu_resume) + + asm_stack_frame_non_standard cpu_resume + asm_stack_frame_non_standard _cpu_resume -- 2.17.1
[RFC V3 09/18] gcc-plugins: objtool: Add plugin to detect switch table on arm64
This plugin comes into play before the final 2 RTL passes of GCC. It detects switch-tables that are to be output in the ELF and writes information to an "objtool_data" section which will be used by objtool. Signed-off-by: Raphael Gault --- scripts/Makefile.gcc-plugins | 2 + scripts/gcc-plugins/Kconfig | 9 +++ .../arm64_switch_table_detection_plugin.c | 58 +++ 3 files changed, 69 insertions(+) create mode 100644 scripts/gcc-plugins/arm64_switch_table_detection_plugin.c diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins index 5f7df50cfe7a..a56736df9dc2 100644 --- a/scripts/Makefile.gcc-plugins +++ b/scripts/Makefile.gcc-plugins @@ -44,6 +44,8 @@ ifdef CONFIG_GCC_PLUGIN_ARM_SSP_PER_TASK endif export DISABLE_ARM_SSP_PER_TASK_PLUGIN +gcc-plugin-$(CONFIG_GCC_PLUGIN_SWITCH_TABLES) += arm64_switch_table_detection_plugin.so + # All the plugin CFLAGS are collected here in case a build target needs to # filter them out of the KBUILD_CFLAGS. GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y)) diff --git a/scripts/gcc-plugins/Kconfig b/scripts/gcc-plugins/Kconfig index e9c677a53c74..a9b13d257cd2 100644 --- a/scripts/gcc-plugins/Kconfig +++ b/scripts/gcc-plugins/Kconfig @@ -113,4 +113,13 @@ config GCC_PLUGIN_ARM_SSP_PER_TASK bool depends on GCC_PLUGINS && ARM +config GCC_PLUGIN_SWITCH_TABLES + bool "GCC Plugin: Identify switch tables at compile time" + default y + depends on STACK_VALIDATION && ARM64 + help + Plugin to identify switch tables generated at compile time and store + them in a .objtool_data section. Objtool will then use that section + to analyse the different execution paths of the switch table.
+ endmenu diff --git a/scripts/gcc-plugins/arm64_switch_table_detection_plugin.c b/scripts/gcc-plugins/arm64_switch_table_detection_plugin.c new file mode 100644 index ..d7f0e13910d5 --- /dev/null +++ b/scripts/gcc-plugins/arm64_switch_table_detection_plugin.c @@ -0,0 +1,58 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include "gcc-common.h" + +__visible int plugin_is_GPL_compatible; + +static unsigned int arm64_switchtbl_rtl_execute(void) +{ + rtx_insn *insn; + rtx_insn *labelp = NULL; + rtx_jump_table_data *tablep = NULL; + section *sec = get_section(".objtool_data", SECTION_STRINGS, NULL); + section *curr_sec = current_function_section(); + + for (insn = get_insns(); insn; insn = NEXT_INSN(insn)) { + /* +* Find a tablejump_p INSN (using a dispatch table) +*/ + if (!tablejump_p(insn, &labelp, &tablep)) + continue; + + if (labelp && tablep) { + switch_to_section(sec); + assemble_integer_with_op(".quad ", gen_rtx_LABEL_REF(Pmode, labelp)); + assemble_integer_with_op(".quad ", GEN_INT(GET_NUM_ELEM(tablep->get_labels()))); + assemble_integer_with_op(".quad ", GEN_INT(ADDR_DIFF_VEC_FLAGS(tablep).offset_unsigned)); + switch_to_section(curr_sec); + } + } + return 0; +} + +#define PASS_NAME arm64_switchtbl_rtl + +#define NO_GATE +#include "gcc-generate-rtl-pass.h" + +__visible int plugin_init(struct plugin_name_args *plugin_info, + struct plugin_gcc_version *version) +{ + const char * const plugin_name = plugin_info->base_name; + int tso = 0; + int i; + + if (!plugin_default_version_check(version, &gcc_version)) { + error(G_("incompatible gcc/plugin versions")); + return 1; + } + + PASS_INFO(arm64_switchtbl_rtl, "outof_cfglayout", 1, + PASS_POS_INSERT_AFTER); + + register_callback(plugin_info->base_name, PLUGIN_PASS_MANAGER_SETUP, + NULL, &arm64_switchtbl_rtl_pass_info); + + return 0; +} -- 2.17.1
[RFC V3 15/18] arm64: kernel: Add exception on kuser32 to prevent stack analysis
Since kuser32 is used for compatibility, it contains a32 instructions which are not recognised by objtool when trying to analyse arm64 object files. Thus, we add an exception to skip validation on this particular file. Signed-off-by: Raphael Gault --- arch/arm64/kernel/Makefile | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 478491f07b4f..1239c7da4c02 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -33,6 +33,9 @@ ifneq ($(CONFIG_COMPAT_VDSO), y) obj-$(CONFIG_COMPAT) += sigreturn32.o endif obj-$(CONFIG_KUSER_HELPERS) += kuser32.o + +OBJECT_FILES_NON_STANDARD_kuser32.o := y + obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o obj-$(CONFIG_MODULES) += module.o obj-$(CONFIG_ARM64_MODULE_PLTS) += module-plts.o -- 2.17.1
[RFC V3 16/18] arm64: crypto: Add exceptions for crypto object to prevent stack analysis
Some crypto modules contain `.word`s of data in the .text section. Since objtool can't make the distinction between data and an incorrect instruction, it gives a warning about the instruction being unknown and stops the analysis of the object file. The exception can be removed if the data is moved to another section or if objtool is tweaked to handle this particular case. Signed-off-by: Raphael Gault --- arch/arm64/crypto/Makefile | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile index 0435f2a0610e..e2a25919ebaa 100644 --- a/arch/arm64/crypto/Makefile +++ b/arch/arm64/crypto/Makefile @@ -43,9 +43,11 @@ aes-neon-blk-y := aes-glue-neon.o aes-neon.o obj-$(CONFIG_CRYPTO_SHA256_ARM64) += sha256-arm64.o sha256-arm64-y := sha256-glue.o sha256-core.o +OBJECT_FILES_NON_STANDARD_sha256-core.o := y obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o sha512-arm64-y := sha512-glue.o sha512-core.o +OBJECT_FILES_NON_STANDARD_sha512-core.o := y obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o @@ -58,6 +60,7 @@ aes-arm64-y := aes-cipher-core.o aes-cipher-glue.o obj-$(CONFIG_CRYPTO_AES_ARM64_BS) += aes-neon-bs.o aes-neon-bs-y := aes-neonbs-core.o aes-neonbs-glue.o +OBJECT_FILES_NON_STANDARD_aes-neonbs-core.o := y CFLAGS_aes-glue-ce.o := -DUSE_V8_CRYPTO_EXTENSIONS -- 2.17.1
[RFC V3 18/18] objtool: arm64: Enable stack validation for arm64
Signed-off-by: Raphael Gault --- arch/arm64/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index f5eb592b8579..c5fdfb635d3d 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -159,6 +159,7 @@ config ARM64 select HAVE_RCU_TABLE_FREE select HAVE_RSEQ select HAVE_STACKPROTECTOR + select HAVE_STACK_VALIDATION select HAVE_SYSCALL_TRACEPOINTS select HAVE_KPROBES select HAVE_KRETPROBES -- 2.17.1
[RFC V3 17/18] arm64: kernel: Annotate non-standard stack frame functions
Annotate assembler functions which are callable but do not setup a correct stack frame. Signed-off-by: Raphael Gault --- arch/arm64/kernel/hyp-stub.S | 2 ++ arch/arm64/kvm/hyp-init.S| 2 ++ 2 files changed, 4 insertions(+) diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 73d46070b315..a382f0e33735 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -42,6 +42,7 @@ ENTRY(__hyp_stub_vectors) ventry el1_fiq_invalid // FIQ 32-bit EL1 ventry el1_error_invalid // Error 32-bit EL1 ENDPROC(__hyp_stub_vectors) +asm_stack_frame_non_standard __hyp_stub_vectors .align 11 @@ -69,6 +70,7 @@ el1_sync: 9: mov x0, xzr eret ENDPROC(el1_sync) +asm_stack_frame_non_standard el1_sync .macro invalid_vector label \label: diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S index 160be2b4696d..65b7c12b9aa8 100644 --- a/arch/arm64/kvm/hyp-init.S +++ b/arch/arm64/kvm/hyp-init.S @@ -118,6 +118,7 @@ CPU_BE( orr x4, x4, #SCTLR_ELx_EE) /* Hello, World! */ eret ENDPROC(__kvm_hyp_init) +asm_stack_frame_non_standard __kvm_hyp_init ENTRY(__kvm_handle_stub_hvc) cmp x0, #HVC_SOFT_RESTART @@ -159,6 +160,7 @@ reset: eret ENDPROC(__kvm_handle_stub_hvc) +asm_stack_frame_non_standard __kvm_handle_stub_hvc .ltorg -- 2.17.1
[RFC V3 14/18] arm64: kvm: Annotate non-standard stack frame functions
Neither __guest_enter nor __guest_exit sets up a correct stack frame. Because they can be considered callable functions, even if they are particular cases, we chose to silence the warnings given by objtool by annotating them as non-standard. Signed-off-by: Raphael Gault --- arch/arm64/kvm/hyp/entry.S | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S index bd34016354ba..d8e122bd3f6f 100644 --- a/arch/arm64/kvm/hyp/entry.S +++ b/arch/arm64/kvm/hyp/entry.S @@ -82,6 +82,7 @@ ENTRY(__guest_enter) eret sb ENDPROC(__guest_enter) +asm_stack_frame_non_standard __guest_enter ENTRY(__guest_exit) // x0: return code @@ -171,3 +172,4 @@ abort_guest_exit_end: orr x0, x0, x5 1: ret ENDPROC(__guest_exit) +asm_stack_frame_non_standard __guest_exit -- 2.17.1
[RFC V3 12/18] arm64: assembler: Add macro to annotate asm function having non standard stack-frame.
Some functions don't have standard stack-frames but are intended this way. In order for objtool to ignore those particular cases, we add a macro that enables us to annotate the functions we chose to mark as non-standard. Signed-off-by: Raphael Gault --- arch/arm64/include/asm/assembler.h | 13 + 1 file changed, 13 insertions(+) diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index 570d195a184d..969a59c5c276 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -752,4 +752,17 @@ USER(\label, icivau, \tmp2)// invalidate I line PoU .Lyield_out_\@ : .endm + /* +* This macro is the arm64 assembler equivalent of the +* macro STACK_FRAME_NON_STANDARD defined in +* include/linux/frame.h +*/ + .macro asm_stack_frame_non_standard func +#ifdef CONFIG_STACK_VALIDATION + .pushsection ".discard.func_stack_frame_non_standard" + .8byte \func + .popsection +#endif + .endm + #endif /* __ASM_ASSEMBLER_H */ -- 2.17.1
[RFC V3 11/18] arm64: alternative: Mark .altinstr_replacement as containing executable instructions
Until now, the section .altinstr_replacement wasn't marked as containing executable instructions on arm64. This patch changes that so that it is coherent with what is done on x86. Signed-off-by: Raphael Gault --- arch/arm64/include/asm/alternative.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h index b9f8d787eea9..e9e6b81e3eb4 100644 --- a/arch/arm64/include/asm/alternative.h +++ b/arch/arm64/include/asm/alternative.h @@ -71,7 +71,7 @@ static inline void apply_alternatives_module(void *start, size_t length) { } ALTINSTR_ENTRY(feature,cb) \ ".popsection\n" \ " .if " __stringify(cb) " == 0\n" \ - ".pushsection .altinstr_replacement, \"a\"\n" \ + ".pushsection .altinstr_replacement, \"ax\"\n" \ "663:\n\t" \ newinstr "\n" \ "664:\n\t" \ -- 2.17.1
[RFC V3 00/18] objtool: Add support for arm64
As of now, objtool only supports the x86_64 architecture but the groundwork has already been done in order to add support for other architectures without too much effort. This series of patches adds support for the arm64 architecture based on the Armv8.5 Architecture Reference Manual. Objtool will be a valuable tool to progress and provide more guarantees on live patching which is a work in progress for arm64. Once we have the base of objtool working the next steps will be to port Peter Z's uaccess validation for arm64. Changes since previous version: * Rebased on tip/master: Note that I had to re-expose the `struct alternative` using check.h because it is now used outside of check.c. * Reorder commits for a more coherent progression * Introduce GCC plugin to help detect switch-tables for arm64 This plugin could be improved: it plugs in after the RTL control flow graph passes but only extracts information about the switch tables. I originally intended for it to introduce new code_label/note within the RTL representation in order to reference them and thus get the address of the branch instruction. However I did not manage to do it properly using gen_rtx_CODE_LABEL/emit_label_before/after. If anyone has experience with RTL plugins I am all ears for advice. Raphael Gault (18): objtool: Add abstraction for computation of symbols offsets objtool: orc: Refactor ORC API for other architectures to implement. objtool: Move registers and control flow to arch-dependent code objtool: arm64: Add required implementation for supporting the aarch64 architecture in objtool.
objtool: special: Adapt special section handling objtool: arm64: Adapt the stack frame checks for arm architecture objtool: Introduce INSN_UNKNOWN type objtool: Refactor switch-tables code to support other architectures gcc-plugins: objtool: Add plugin to detect switch table on arm64 objtool: arm64: Implement functions to add switch tables alternatives arm64: alternative: Mark .altinstr_replacement as containing executable instructions arm64: assembler: Add macro to annotate asm function having non standard stack-frame. arm64: sleep: Prevent stack frame warnings from objtool arm64: kvm: Annotate non-standard stack frame functions arm64: kernel: Add exception on kuser32 to prevent stack analysis arm64: crypto: Add exceptions for crypto object to prevent stack analysis arm64: kernel: Annotate non-standard stack frame functions objtool: arm64: Enable stack validation for arm64 arch/arm64/Kconfig|1 + arch/arm64/crypto/Makefile|3 + arch/arm64/include/asm/alternative.h |2 +- arch/arm64/include/asm/assembler.h| 13 + arch/arm64/kernel/Makefile|3 + arch/arm64/kernel/hyp-stub.S |2 + arch/arm64/kernel/sleep.S |4 + arch/arm64/kvm/hyp-init.S |2 + arch/arm64/kvm/hyp/entry.S|2 + scripts/Makefile.gcc-plugins |2 + scripts/gcc-plugins/Kconfig |9 + .../arm64_switch_table_detection_plugin.c | 58 + tools/objtool/Build |2 - tools/objtool/arch.h | 21 +- tools/objtool/arch/arm64/Build|8 + tools/objtool/arch/arm64/arch_special.c | 173 + tools/objtool/arch/arm64/bit_operations.c | 67 + tools/objtool/arch/arm64/decode.c | 2809 + .../objtool/arch/arm64/include/arch_special.h | 52 + .../arch/arm64/include/asm/orc_types.h| 96 + .../arch/arm64/include/bit_operations.h | 24 + tools/objtool/arch/arm64/include/cfi.h| 74 + .../objtool/arch/arm64/include/insn_decode.h | 210 ++ tools/objtool/arch/arm64/orc_dump.c | 26 + tools/objtool/arch/arm64/orc_gen.c| 40 + tools/objtool/arch/x86/Build |3 + tools/objtool/arch/x86/arch_special.c | 101 + tools/objtool/arch/x86/decode.c | 16 + 
tools/objtool/arch/x86/include/arch_special.h | 45 + tools/objtool/{ => arch/x86/include}/cfi.h|0 tools/objtool/{ => arch/x86}/orc_dump.c |4 +- tools/objtool/{ => arch/x86}/orc_gen.c| 104 +- tools/objtool/check.c | 309 +- tools/objtool/check.h | 10 + tools/objtool/elf.c |3 +- tools/objtool/orc.h |4 +- tools/objtool/special.c | 28 +- tools/objtool/special.h | 13 +- 38 files changed, 4119 insertions(+), 224 deletions(-) create mode 100644 scripts/gcc-plugins/arm64_switch_table_detection_plugin.c create mode 100644 tools/objtool/arch/arm64/Build create mode 100644 tools/objtool/arch/arm64/arch_special.c create mode 100644 tools/objtool/arch/arm64/bit_ope
[RFC V3 01/18] objtool: Add abstraction for computation of symbols offsets
The jump destination and relocation offset used previously are only reliable on x86_64 architecture. We abstract these computations by calling arch-dependent implementations. Signed-off-by: Raphael Gault --- tools/objtool/arch.h| 6 ++ tools/objtool/arch/x86/decode.c | 11 +++ tools/objtool/check.c | 15 ++- 3 files changed, 27 insertions(+), 5 deletions(-) diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h index 580e344db3dd..2a38a834cf40 100644 --- a/tools/objtool/arch.h +++ b/tools/objtool/arch.h @@ -64,6 +64,8 @@ struct stack_op { struct op_src src; }; +struct instruction; + void arch_initial_func_cfi_state(struct cfi_state *state); int arch_decode_instruction(struct elf *elf, struct section *sec, @@ -73,4 +75,8 @@ int arch_decode_instruction(struct elf *elf, struct section *sec, bool arch_callee_saved_reg(unsigned char reg); +unsigned long arch_jump_destination(struct instruction *insn); + +unsigned long arch_dest_rela_offset(int addend); + #endif /* _ARCH_H */ diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index 584568f27a83..8767ee935c47 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -11,6 +11,7 @@ #include "lib/inat.c" #include "lib/insn.c" +#include "../../check.h" #include "../../elf.h" #include "../../arch.h" #include "../../warn.h" @@ -66,6 +67,11 @@ bool arch_callee_saved_reg(unsigned char reg) } } +unsigned long arch_dest_rela_offset(int addend) +{ + return addend + 4; +} + int arch_decode_instruction(struct elf *elf, struct section *sec, unsigned long offset, unsigned int maxlen, unsigned int *len, unsigned char *type, @@ -497,3 +503,8 @@ void arch_initial_func_cfi_state(struct cfi_state *state) state->regs[16].base = CFI_CFA; state->regs[16].offset = -8; } + +unsigned long arch_jump_destination(struct instruction *insn) +{ + return insn->offset + insn->len + insn->immediate; +} diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 172f99195726..b37ca4822f25 100644 
--- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -565,7 +565,7 @@ static int add_jump_destinations(struct objtool_file *file) insn->len); if (!rela) { dest_sec = insn->sec; - dest_off = insn->offset + insn->len + insn->immediate; + dest_off = arch_jump_destination(insn); } else if (rela->sym->type == STT_SECTION) { dest_sec = rela->sym->sec; dest_off = rela->addend + 4; @@ -659,7 +659,7 @@ static int add_call_destinations(struct objtool_file *file) rela = find_rela_by_dest_range(insn->sec, insn->offset, insn->len); if (!rela) { - dest_off = insn->offset + insn->len + insn->immediate; + dest_off = arch_jump_destination(insn); insn->call_dest = find_symbol_by_offset(insn->sec, dest_off); @@ -672,14 +672,19 @@ static int add_call_destinations(struct objtool_file *file) } } else if (rela->sym->type == STT_SECTION) { + /* +* the original x86_64 code adds 4 to the rela->addend +* which is not needed on arm64 architecture. +*/ + dest_off = arch_dest_rela_offset(rela->addend); insn->call_dest = find_symbol_by_offset(rela->sym->sec, - rela->addend+4); + dest_off); if (!insn->call_dest || insn->call_dest->type != STT_FUNC) { - WARN_FUNC("can't find call dest symbol at %s+0x%x", + WARN_FUNC("can't find call dest symbol at %s+0x%lx", insn->sec, insn->offset, rela->sym->sec->name, - rela->addend + 4); + dest_off); return -1; } } else -- 2.17.1
[tip:perf/core] perf tests arm64: Compile tests unconditionally
Commit-ID: 010e3e8fc12b1c13ce19821a11d8930226ebb4b6 Gitweb: https://git.kernel.org/tip/010e3e8fc12b1c13ce19821a11d8930226ebb4b6 Author: Raphael Gault AuthorDate: Tue, 11 Jun 2019 13:53:09 +0100 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 17 Jun 2019 15:57:16 -0300 perf tests arm64: Compile tests unconditionally In order to subsequently add more tests for the arm64 architecture we compile the tests target for arm64 systematically. Further explanation provided by Mark Rutland: Given prior questions regarding this commit, it's probably worth spelling things out more explicitly, e.g. Currently we only build the arm64/tests directory if CONFIG_DWARF_UNWIND is selected, which is fine as the only test we have is arm64/tests/dwarf-unwind.o. So that we can add more tests to the test directory, let's unconditionally build the directory, but conditionally build dwarf-unwind.o depending on CONFIG_DWARF_UNWIND. There should be no functional change as a result of this patch. Signed-off-by: Raphael Gault Acked-by: Mark Rutland Cc: Catalin Marinas Cc: Peter Zijlstra Cc: Will Deacon Cc: linux-arm-ker...@lists.infradead.org Link: http://lkml.kernel.org/r/20190611125315.18736-2-raphael.ga...@arm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/arch/arm64/Build | 2 +- tools/perf/arch/arm64/tests/Build | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/arch/arm64/Build b/tools/perf/arch/arm64/Build index 36222e64bbf7..a7dd46a5b678 100644 --- a/tools/perf/arch/arm64/Build +++ b/tools/perf/arch/arm64/Build @@ -1,2 +1,2 @@ perf-y += util/ -perf-$(CONFIG_DWARF_UNWIND) += tests/ +perf-y += tests/ diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build index 41707fea74b3..a61c06bdb757 100644 --- a/tools/perf/arch/arm64/tests/Build +++ b/tools/perf/arch/arm64/tests/Build @@ -1,4 +1,4 @@ perf-y += regs_load.o -perf-y += dwarf-unwind.o +perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o perf-y += arch-tests.o
Re: [PATCH 3/7] perf: arm64: Use rseq to test userspace access to pmu counters
Hi Mathieu, On 6/11/19 8:33 PM, Mathieu Desnoyers wrote: - On Jun 11, 2019, at 6:57 PM, Mark Rutland mark.rutl...@arm.com wrote: Hi Arnaldo, On Tue, Jun 11, 2019 at 11:33:46AM -0300, Arnaldo Carvalho de Melo wrote: Em Tue, Jun 11, 2019 at 01:53:11PM +0100, Raphael Gault escreveu: Add an extra test to check userspace access to pmu hardware counters. This test doesn't rely on the seqlock as a synchronisation mechanism but instead uses the restartable sequences to make sure that the thread is not interrupted when reading the index of the counter and the associated pmu register. In addition to reading the pmu counters, this test is run several time in order to measure the ratio of failures: I ran this test on the Juno development platform, which is big.LITTLE with 4 Cortex A53 and 2 Cortex A57. The results vary quite a lot (running it with 100 tests is not so long and I did it several times). I ran it once with 1 iterations: `runs: 1, abort: 62.53%, zero: 34.93%, success: 2.54%` Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/include/arch-tests.h| 5 +- tools/perf/arch/arm64/include/rseq-arm64.h| 220 ++ So, I applied the first patch in this series, but could you please break this patch into at least two, one introducing the facility (include/rseq*) and the second adding the test? We try to enforce this kind of granularity as down the line we may want to revert one part while the other already has other uses and thus wouldn't allow a straight revert. Also, can this go to tools/arch/ instead? Is this really perf specific? Isn't there any arch/arm64/include files for the kernel that we could mirror and have it checked for drift in tools/perf/check-headers.sh? The rseq bits aren't strictly perf specific, and I think the existing bits under tools/testing/selftests/rseq/ could be factored out to common locations under tools/include/ and tools/arch/*/include/. Hi Mark, Thanks for CCing me! 
Or into a stand-alone librseq project: https://github.com/compudj/librseq (currently a development branch in my own github) I don't see why this user-space code should sit in the kernel tree. It is not tooling-specific. From a scan, those already duplicate barriers and other helpers which already have definitions under tools/, which seems unfortunate. :/ Comments below are for Raphael and Matthieu. [...] +static u64 noinline mmap_read_self(void *addr, int cpu) +{ + struct perf_event_mmap_page *pc = addr; + u32 idx = 0; + u64 count = 0; + + asm volatile goto( + RSEQ_ASM_DEFINE_TABLE(0, 1f, 2f, 3f) +"nop\n" + RSEQ_ASM_STORE_RSEQ_CS(1, 0b, rseq_cs) +RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f) + RSEQ_ASM_OP_R_LOAD(pc_idx) + RSEQ_ASM_OP_R_AND(0xFF) +RSEQ_ASM_OP_R_STORE(idx) + RSEQ_ASM_OP_R_SUB(0x1) +RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f) + "msr pmselr_el0, " RSEQ_ASM_TMP_REG "\n" + "isb\n" +RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f) I really don't understand why the cpu_id needs to be compared 3 times here (?!?) Explicit comparison of the cpu_id within the rseq critical section should be done _once_. I understand and that's what I thought as well but I got confused with a comment in (src)/include/uapi/linux/rseq.h which states: > This CPU number value should always be compared > against the value of the cpu_id field before performing a rseq > commit or returning a value read from a data structure indexed > using the cpu_id_start value. I'll remove the unnecessary testing. If the kernel happens to preempt and migrate the thread while in the critical section, it's the kernel's job to move user-space execution to the abort handler. [...] Thanks, -- Raphael Gault
Re: [RFC V2 00/16] objtool: Add support for Arm64
Hi Josh, On 5/28/19 11:24 PM, Josh Poimboeuf wrote: On Tue, May 21, 2019 at 12:50:57PM +, Raphael Gault wrote: Hi Josh, Thanks for offering your help and sorry for the late answer. My understanding is that a table of offsets is built by GCC, those offsets being scaled by 4 before adding them to the base label. I believe the offsets are stored in the .rodata section. To find the size of that table, one needs to find a comparison, which can apparently be optimized out. In that case the end of the array can be found by locating labels pointing to data behind it (which is not 100% safe). On 5/16/19 3:29 PM, Josh Poimboeuf wrote: On Thu, May 16, 2019 at 11:36:39AM +0100, Raphael Gault wrote: Noteworthy points: * I still haven't figured out how to detect switch-tables on arm64. I have a better understanding of them but still haven't implemented checks as it doesn't look trivial at all. Switch tables were tricky to get right on x86. If you share an example (or even just a .o file) I can take a look. Hopefully they're somewhat similar to x86 switch tables. Otherwise we may want to consider a different approach (for example maybe a GCC plugin could help annotate them). The case which made me realize the issue is the one of arch/arm64/kernel/module.o:apply_relocate_add: ``` What seems to happen in the case of module.o is: 334: 9015 adrp x21, 0 which retrieves the location of an offset in the rodata section, and a bit later we do some extra computation with it in order to compute the jump destination: 3e0: 78625aa0 ldrh w0, [x21, w2, uxtw #1] 3e4: 1061 adr x1, 3f0 3e8: 8b20a820 add x0, x1, w0, sxth #2 3ec: d61f br x0 ``` Please keep in mind that the actual offsets might vary. I'm happy to provide more details about what I have identified if you want me to. I get the feeling this is going to be trickier than x86 switch tables (which have already been tricky enough). On x86, there's a .rela.rodata section which applies relocations to .rodata.
The presence of those relocations makes it relatively easy to differentiate switch tables from other read-only data. For example, we can tell that a switch table ends when either a) there's no text relocation or b) another switch table begins. But with arm64 I don't see a deterministic way to do that, because the table offsets are hard-coded in .rodata, with no relocations. From talking with Kamalesh I got the impression that we might have a similar issue for powerpc. So I'm beginning to think we'll need compiler help. Like a GCC plugin that annotates at least the following switch table metadata: - Branch instruction address - Switch table address - Switch table entry size - Switch table size The GCC plugin could write all the above metadata into a special section which gets discarded at link time. I can look at implementing it, though I'll be traveling for two out of the next three weeks so it may be a while before I can get to it. I am completely new to GCC plugins but I had a look and I think I found a possible solution to retrieve at least part of this information using the RTL representation in GCC. I can't say it will work for sure but I would be happy to discuss it with you if you want. Although there are still some areas I need to investigate related to interacting with the RTL representation and storing info into the ELF, I'd be interested in giving it a try, if you are okay with that. Thanks, -- Raphael Gault
Re: [PATCH 3/7] perf: arm64: Use rseq to test userspace access to pmu counters
Hi Mathieu, Mark, On 6/11/19 8:33 PM, Mathieu Desnoyers wrote: - On Jun 11, 2019, at 6:57 PM, Mark Rutland mark.rutl...@arm.com wrote: Hi Arnaldo, On Tue, Jun 11, 2019 at 11:33:46AM -0300, Arnaldo Carvalho de Melo wrote: Em Tue, Jun 11, 2019 at 01:53:11PM +0100, Raphael Gault escreveu: Add an extra test to check userspace access to pmu hardware counters. This test doesn't rely on the seqlock as a synchronisation mechanism but instead uses the restartable sequences to make sure that the thread is not interrupted when reading the index of the counter and the associated pmu register. In addition to reading the pmu counters, this test is run several time in order to measure the ratio of failures: I ran this test on the Juno development platform, which is big.LITTLE with 4 Cortex A53 and 2 Cortex A57. The results vary quite a lot (running it with 100 tests is not so long and I did it several times). I ran it once with 1 iterations: `runs: 1, abort: 62.53%, zero: 34.93%, success: 2.54%` Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/include/arch-tests.h| 5 +- tools/perf/arch/arm64/include/rseq-arm64.h| 220 ++ So, I applied the first patch in this series, but could you please break this patch into at least two, one introducing the facility (include/rseq*) and the second adding the test? We try to enforce this kind of granularity as down the line we may want to revert one part while the other already has other uses and thus wouldn't allow a straight revert. Also, can this go to tools/arch/ instead? Is this really perf specific? Isn't there any arch/arm64/include files for the kernel that we could mirror and have it checked for drift in tools/perf/check-headers.sh? The rseq bits aren't strictly perf specific, and I think the existing bits under tools/testing/selftests/rseq/ could be factored out to common locations under tools/include/ and tools/arch/*/include/. Hi Mark, Thanks for CCing me! 
Or into a stand-alone librseq project: https://github.com/compudj/librseq (currently a development branch in my own github) I don't see why this user-space code should sit in the kernel tree. It is not tooling-specific. I understand your point but I have to admit that I don't really see how to make it work together with the test which require those definitions. From a scan, those already duplicate barriers and other helpers which already have definitions under tools/, which seems unfortunate. :/ Also I realize that there is a duplicate with definitions introduced in the selftests but I kind of simplified the macros I'm using to get rid of what wasn't useful to me at the moment. (mainly the loop labels and parameter injections in the asm statement) I understand what both Mark and Arnaldo are saying about moving it out of perf so that it is not duplicated but my question is whether it is a good thing to do as is since it is not exactly the same content as what's in the selftests. I hope you can understand my concerns and I'd like to hear your opinions on that matter. Thanks, -- Raphael Gault
[PATCH 2/7] perf: arm64: Add test to check userspace access to hardware counters.
This test relies on the fact that the PMU registers are accessible from userspace. It then uses the perf_event_mmap_page to retrieve the counter index and access the underlying register. This test uses sched_setaffinity(2) in order to run on all CPUs and thus check the behaviour of the PMUs of all CPUs in a big.LITTLE environment. Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/include/arch-tests.h | 6 + tools/perf/arch/arm64/tests/Build | 1 + tools/perf/arch/arm64/tests/arch-tests.c | 4 + tools/perf/arch/arm64/tests/user-events.c | 255 + 4 files changed, 266 insertions(+) create mode 100644 tools/perf/arch/arm64/tests/user-events.c diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h index 90ec4c8cb880..a9b17ae0560b 100644 --- a/tools/perf/arch/arm64/include/arch-tests.h +++ b/tools/perf/arch/arm64/include/arch-tests.h @@ -2,11 +2,17 @@ #ifndef ARCH_TESTS_H #define ARCH_TESTS_H +#define __maybe_unused __attribute__((unused)) #ifdef HAVE_DWARF_UNWIND_SUPPORT struct thread; struct perf_sample; +int test__arch_unwind_sample(struct perf_sample *sample, +struct thread *thread); #endif extern struct test arch_tests[]; +int test__rd_pmevcntr(struct test *test __maybe_unused, + int subtest __maybe_unused); + #endif diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build index a61c06bdb757..3f9a20c17fc6 100644 --- a/tools/perf/arch/arm64/tests/Build +++ b/tools/perf/arch/arm64/tests/Build @@ -1,4 +1,5 @@ perf-y += regs_load.o perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o +perf-y += user-events.o perf-y += arch-tests.o diff --git a/tools/perf/arch/arm64/tests/arch-tests.c b/tools/perf/arch/arm64/tests/arch-tests.c index 5b1543c98022..57df9b89dede 100644 --- a/tools/perf/arch/arm64/tests/arch-tests.c +++ b/tools/perf/arch/arm64/tests/arch-tests.c @@ -10,6 +10,10 @@ struct test arch_tests[] = { .func = test__dwarf_unwind, }, #endif + { + .desc = "User counter access", + .func = test__rd_pmevcntr,
+ }, { .func = NULL, }, diff --git a/tools/perf/arch/arm64/tests/user-events.c b/tools/perf/arch/arm64/tests/user-events.c new file mode 100644 index ..958e4cd000c1 --- /dev/null +++ b/tools/perf/arch/arm64/tests/user-events.c @@ -0,0 +1,255 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "perf.h" +#include "debug.h" +#include "tests/tests.h" +#include "cloexec.h" +#include "util.h" +#include "arch-tests.h" + +/* + * ARMv8 ARM reserves the following encoding for system registers: + * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview", + * C5.2, version: ARM DDI 0487A.f) + * [20-19] : Op0 + * [18-16] : Op1 + * [15-12] : CRn + * [11-8] : CRm + * [7-5] : Op2 + */ +#define Op0_shift 19 +#define Op0_mask 0x3 +#define Op1_shift 16 +#define Op1_mask 0x7 +#define CRn_shift 12 +#define CRn_mask 0xf +#define CRm_shift 8 +#define CRm_mask 0xf +#define Op2_shift 5 +#define Op2_mask 0x7 + +#define __stringify(x) #x + +#define read_sysreg(r) ({ \ + u64 __val; \ + asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \ + __val; \ +}) + +#define PMEVCNTR_READ_CASE(idx) \ + case idx: \ + return read_sysreg(pmevcntr##idx##_el0) + +#define PMEVCNTR_CASES(readwrite) \ + PMEVCNTR_READ_CASE(0); \ + PMEVCNTR_READ_CASE(1); \ + PMEVCNTR_READ_CASE(2); \ + PMEVCNTR_READ_CASE(3); \ + PMEVCNTR_READ_CASE(4); \ + PMEVCNTR_READ_CASE(5); \ + PMEVCNTR_READ_CASE(6); \ + PMEVCNTR_READ_CASE(7); \ + PMEVCNTR_READ_CASE(8); \ + PMEVCNTR_READ_CASE(9); \ + PMEVCNTR_READ_CASE(10); \ + PMEVCNTR_READ_CASE(11); \ + PMEVCNTR_READ_CASE(12); \ + PMEVCNTR_READ_CASE(13); \ + PMEVCNTR_READ_CASE(14); \ + PMEVCNTR_READ_CASE(15); \ + PMEVCNTR_READ_CASE(16); \ + PMEVCNTR_READ_CASE(17); \ + PMEVCNTR_READ_CASE(18
[PATCH 7/7] Documentation: arm64: Document PMU counters access from userspace
Add a documentation file to describe the access to the pmu hardware counters from userspace Signed-off-by: Raphael Gault --- .../arm64/pmu_counter_user_access.txt | 42 +++ 1 file changed, 42 insertions(+) create mode 100644 Documentation/arm64/pmu_counter_user_access.txt diff --git a/Documentation/arm64/pmu_counter_user_access.txt b/Documentation/arm64/pmu_counter_user_access.txt new file mode 100644 index ..6788b1107381 --- /dev/null +++ b/Documentation/arm64/pmu_counter_user_access.txt @@ -0,0 +1,42 @@ +Access to PMU hardware counter from userspace +============================================= + +Overview +-------- +The perf user-space tool relies on the PMU to monitor events. It offers an +abstraction layer over the hardware counters since the underlying +implementation is cpu-dependent. +Arm64 allows userspace tools to have access to the registers storing the +hardware counters' values directly. + +This targets specifically self-monitoring tasks in order to reduce the overhead +by directly accessing the registers without having to go through the kernel. + +How-to +------ +The focus is set on the armv8 pmuv3 which makes sure that the access to the pmu +registers is enabled and that userspace has access to the relevant +information in order to use them. + +In order to have access to the hardware counter it is necessary to open the event +using the perf tool interface: the sys_perf_event_open syscall returns a fd which +can subsequently be used with the mmap syscall in order to retrieve a page of memory +containing information about the event. +The PMU driver uses this page to expose to the user the hardware counter's +index. Using this index enables the user to access the PMU registers using the +`mrs` instruction. + +Have a look at `tools/perf/arch/arm64/tests/user-events.c` for an example.
It can be +run using the perf tool to check that the access to the registers works +correctly from userspace: + +./perf test -v + +About chained events + +When the user requests an event to be counted on 64 bits, two hardware +counters are used and need to be combined to retrieve the correct value: + +val = read_counter(idx); +if ((event.attr.config1 & 0x1)) + val = (val << 32) | read_counter(idx - 1); -- 2.17.1
[PATCH 6/7] arm64: perf: Enable pmu counter direct access for perf event on armv8
Keep track of events opened with direct access to the hardware counters and modify permissions while they are open. The strategy used here is the same one that x86 uses: every time an event is mapped, the permissions are set if required. The atomic field added in the mm_context helps keep track of the different events opened and de-activate the permissions when all are unmapped. We also need to update the permissions in the context switch code so that tasks keep the right permissions. Signed-off-by: Raphael Gault --- arch/arm64/include/asm/mmu.h | 6 + arch/arm64/include/asm/mmu_context.h | 2 ++ arch/arm64/include/asm/perf_event.h | 14 ++ drivers/perf/arm_pmu.c | 38 4 files changed, 60 insertions(+) diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index 67ef25d037ea..9de4cf0b17c7 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -29,6 +29,12 @@ typedef struct { atomic64_t id; + + /* +* non-zero if userspace has access to hardware +* counters directly.
+*/ + atomic_t pmu_direct_access; void *vdso; unsigned long flags; } mm_context_t; diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index 2da3e478fd8f..33494af613d8 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -32,6 +32,7 @@ #include #include #include +#include #include #include @@ -235,6 +236,7 @@ static inline void __switch_mm(struct mm_struct *next) } check_and_switch_context(next, cpu); + perf_switch_user_access(next); } static inline void diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h index c593761ba61c..32a6d604bb3b 100644 --- a/arch/arm64/include/asm/perf_event.h +++ b/arch/arm64/include/asm/perf_event.h @@ -19,6 +19,7 @@ #include #include +#include #define ARMV8_PMU_MAX_COUNTERS 32 #define ARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1) @@ -234,4 +235,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs); (regs)->pstate = PSR_MODE_EL1h; \ } +static inline void perf_switch_user_access(struct mm_struct *mm) +{ + if (!IS_ENABLED(CONFIG_PERF_EVENTS)) + return; + + if (atomic_read(&mm->context.pmu_direct_access)) { + write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR, +pmuserenr_el0); + } else { + write_sysreg(0, pmuserenr_el0); + } +} + #endif diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c index 2d06b8095a19..6ae85fcbf297 100644 --- a/drivers/perf/arm_pmu.c +++ b/drivers/perf/arm_pmu.c @@ -25,6 +25,7 @@ #include #include +#include static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu); static DEFINE_PER_CPU(int, cpu_irq); @@ -778,6 +779,41 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu) &cpu_pmu->node); } +static void refresh_pmuserenr(void *mm) +{ + perf_switch_user_access(mm); +} + +static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return; + + /* +* This function relies on not being called concurrently in two +* tasks in the
same mm. Otherwise one task could observe +* pmu_direct_access > 1 and return all the way back to +* userspace with user access disabled while another task is still +* doing on_each_cpu_mask() to enable user access. +* +* For now, this can't happen because all callers hold mmap_sem +* for write. If this changes, we'll need a different solution. +*/ + lockdep_assert_held_exclusive(&mm->mmap_sem); + + if (atomic_inc_return(&mm->context.pmu_direct_access) == 1) + on_each_cpu(refresh_pmuserenr, mm, 1); +} + +static void armpmu_event_unmapped(struct perf_event *event, struct mm_struct *mm) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return; + + if (atomic_dec_and_test(&mm->context.pmu_direct_access)) + on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1); +} + static struct arm_pmu *__armpmu_alloc(gfp_t flags) { struct arm_pmu *pmu; @@ -799,6 +835,8 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags) .pmu_enable = armpmu_enable, .pmu_disable = armpmu_disable, .event_init = armpmu_event_init, + .event_mapped = armpmu_event_mapped, + .event_unmapped = armpmu_event_unmapped, .add = armpmu_add, .del = armpmu_del, .start = armpmu_start, -- 2.17.1
[PATCH 5/7] arm64: pmu: Add hook to handle pmu-related undefined instructions
In order to prevent userspace processes which are trying to access the pmu registers on a big.LITTLE environment from receiving a SIGILL, we introduce a hook to handle undefined instructions. The goal here is to prevent the process from being interrupted by a signal when the error is caused by the task being scheduled while accessing a counter, causing the counter access to be invalid. As we are not able to know efficiently the number of counters available physically on both pmus in that context, we consider that any faulting access to a counter which is architecturally correct should not cause a SIGILL signal if the permissions are set accordingly. This commit also modifies the mask of the mrs_hook declared in arch/arm64/kernel/cpufeature.c which emulates only feature register access. This is necessary because this hook's mask was too large and was thus matching any mrs instruction, even those not related to the emulated registers, which made the pmu emulation inefficient. Signed-off-by: Raphael Gault --- arch/arm64/kernel/cpufeature.c | 4 +-- arch/arm64/kernel/perf_event.c | 55 ++ 2 files changed, 57 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 80babf451519..d9b2be97cc06 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -2167,8 +2167,8 @@ static int emulate_mrs(struct pt_regs *regs, u32 insn) } static struct undef_hook mrs_hook = { - .instr_mask = 0xfff00000, - .instr_val = 0xd5300000, + .instr_mask = 0xffff0000, + .instr_val = 0xd5380000, .pstate_mask = PSR_AA32_MODE_MASK, .pstate_val = PSR_MODE_EL0t, .fn = emulate_mrs, diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 293e4c365a53..93ac24b51d5f 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -19,9 +19,11 @@ * along with this program. If not, see <http://www.gnu.org/licenses/>.
*/ +#include #include #include +#include #include #include @@ -1041,6 +1043,59 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu) return probe.present ? 0 : -ENODEV; } +static int emulate_pmu(struct pt_regs *regs, u32 insn) +{ + u32 sys_reg, rt; + u32 pmuserenr; + + sys_reg = (u32)aarch64_insn_decode_immediate(AARCH64_INSN_IMM_16, insn) << 5; + rt = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn); + pmuserenr = read_sysreg(pmuserenr_el0); + + if ((pmuserenr & (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) != + (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) + return -EINVAL; + + + /* +* Userspace is expected to only use this in the context of the scheme +* described in the struct perf_event_mmap_page comments. +* +* Given that context, we can only get here if we got migrated between +* getting the register index and doing the MSR read. This in turn +* implies we'll fail the sequence and retry, so any value returned is +* 'good', all we need is to be non-fatal. +* +* The choice of the value 0 comes from the fact that when +* accessing a register which is not counting events but is accessible, +* we get 0. +*/ + pt_regs_write_reg(regs, rt, 0); + + arm64_skip_faulting_instruction(regs, 4); + return 0; +} + +/* + * This hook will only be triggered by mrs + * instructions on PMU registers. This is mandatory + * in order to have a consistent behaviour even on + * big.LITTLE systems. + */ +static struct undef_hook pmu_hook = { + .instr_mask = 0xffff8800, + .instr_val = 0xd53b8800, + .fn = emulate_pmu, +}; + +static int __init enable_pmu_emulation(void) +{ + register_undef_hook(&pmu_hook); + return 0; +} + +core_initcall(enable_pmu_emulation); + static int armv8_pmu_init(struct arm_pmu *cpu_pmu) { int ret = armv8pmu_probe_pmu(cpu_pmu); -- 2.17.1
[PATCH 3/7] perf: arm64: Use rseq to test userspace access to pmu counters
Add an extra test to check userspace access to pmu hardware counters. This test doesn't rely on the seqlock as a synchronisation mechanism but instead uses the restartable sequences to make sure that the thread is not interrupted when reading the index of the counter and the associated pmu register. In addition to reading the pmu counters, this test is run several times in order to measure the ratio of failures: I ran this test on the Juno development platform, which is big.LITTLE with 4 Cortex A53 and 2 Cortex A57. The results vary quite a lot (running it with 100 tests is not so long and I did it several times). I ran it once with 1 iterations: `runs: 1, abort: 62.53%, zero: 34.93%, success: 2.54%` Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/include/arch-tests.h| 5 +- tools/perf/arch/arm64/include/rseq-arm64.h| 220 ++ tools/perf/arch/arm64/tests/Build | 1 + tools/perf/arch/arm64/tests/arch-tests.c | 6 + tools/perf/arch/arm64/tests/rseq-pmu-events.c | 219 + 5 files changed, 450 insertions(+), 1 deletion(-) create mode 100644 tools/perf/arch/arm64/include/rseq-arm64.h create mode 100644 tools/perf/arch/arm64/tests/rseq-pmu-events.c diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h index a9b17ae0560b..4164762b43c6 100644 --- a/tools/perf/arch/arm64/include/arch-tests.h +++ b/tools/perf/arch/arm64/include/arch-tests.h @@ -13,6 +13,9 @@ int test__arch_unwind_sample(struct perf_sample *sample, extern struct test arch_tests[]; int test__rd_pmevcntr(struct test *test __maybe_unused, int subtest __maybe_unused); - +#ifdef CONFIG_RSEQ +int rseq__rd_pmevcntr(struct test *test __maybe_unused, + int subtest __maybe_unused); +#endif #endif diff --git a/tools/perf/arch/arm64/include/rseq-arm64.h b/tools/perf/arch/arm64/include/rseq-arm64.h new file mode 100644 index ..00d6960915a9 --- /dev/null +++ b/tools/perf/arch/arm64/include/rseq-arm64.h @@ -0,0 +1,220 @@ +/* SPDX-License-Identifier: LGPL-2.1 OR MIT */ +/* + *
rseq-arm64.h + * + * This file is mostly a copy from + * tools/testing/selftests/rseq/rseq-arm64.h + */ + +/* + * aarch64 -mbig-endian generates mixed endianness code vs data: + * little-endian code and big-endian data. Ensure the RSEQ_SIG signature + * matches code endianness. + */ +#define __rseq_str_1(x) #x +#define __rseq_str(x)__rseq_str_1(x) + +#define RSEQ_ACCESS_ONCE(x)(*(__volatile__ __typeof__(x) *)&(x)) +#define RSEQ_SIG_CODE 0xd428bc00 /* BRK #0x45E0. */ + +#ifdef __AARCH64EB__ +#define RSEQ_SIG_DATA 0x00bc28d4 /* BRK #0x45E0. */ +#else +#define RSEQ_SIG_DATA RSEQ_SIG_CODE +#endif + +#define RSEQ_SIG RSEQ_SIG_DATA + +#define rseq_smp_mb() __asm__ __volatile__ ("dmb ish" ::: "memory") +#define rseq_smp_rmb() __asm__ __volatile__ ("dmb ishld" ::: "memory") +#define rseq_smp_wmb() __asm__ __volatile__ ("dmb ishst" ::: "memory") + +#define rseq_smp_load_acquire(p) \ +__extension__ ({ \ + __typeof(*p) p1; \ + switch (sizeof(*p)) { \ + case 1: \ + asm volatile ("ldarb %w0, %1" \ + : "=r" (*(__u8 *)p) \ + : "Q" (*p) : "memory"); \ + break; \ + case 2: \ + asm volatile ("ldarh %w0, %1" \ + : "=r" (*(__u16 *)p) \ + : "Q" (*p) : "memory"); \ + break; \ + case 4: \ + asm volatile ("ldar %w0, %1" \ + : "=r" (*(__u32 *)p) \ + : "Q" (*p) : "memory"); \ + break; \ + case 8: \ + asm volatile ("ldar %0, %1"
[PATCH 4/7] arm64: pmu: Add function implementation to update event index in userpage.
In order to be able to access the counter directly from userspace, we need to provide the index of the counter using the userpage. We thus need to override the event_idx function to retrieve and convert the perf_event index to the armv8 hardware index. Since the arm_pmu driver can be used by any implementation, even if not armv8, two components play a role in making sure the behaviour is correct and consistent with the PMU capabilities: * the ARMPMU_EL0_RD_CNTR flag which denotes the capability to access the counter from userspace. * the event_idx callback, which is implemented and initialized by the PMU implementation: if no callback is provided, the default behaviour applies, returning 0 as index value. Signed-off-by: Raphael Gault --- arch/arm64/kernel/perf_event.c | 21 + include/linux/perf/arm_pmu.h | 2 ++ 2 files changed, 23 insertions(+) diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 348d12eec566..293e4c365a53 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -829,6 +829,22 @@ static void armv8pmu_clear_event_idx(struct pmu_hw_events *cpuc, clear_bit(idx - 1, cpuc->used_mask); } +static int armv8pmu_access_event_idx(struct perf_event *event) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return 0; + + /* +* We remap the cycle counter index to 32 to +* match the offset applied to the rest of +* the counter indices. +*/ + if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER) + return 32; + + return event->hw.idx; +} + /* * Add an event filter to a given event.
*/ @@ -922,6 +938,8 @@ static int __armv8_pmuv3_map_event(struct perf_event *event, if (armv8pmu_event_is_64bit(event)) event->hw.flags |= ARMPMU_EVT_64BIT; + event->hw.flags |= ARMPMU_EL0_RD_CNTR; + /* Only expose micro/arch events supported by this PMU */ if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS) && test_bit(hw_event_id, armpmu->pmceid_bitmap)) { @@ -1042,6 +1060,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu) cpu_pmu->set_event_filter = armv8pmu_set_event_filter; cpu_pmu->filter_match = armv8pmu_filter_match; + cpu_pmu->pmu.event_idx = armv8pmu_access_event_idx; + return 0; } @@ -1220,6 +1240,7 @@ void arch_perf_update_userpage(struct perf_event *event, */ freq = arch_timer_get_rate(); userpg->cap_user_time = 1; + userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR); clocks_calc_mult_shift(&userpg->time_mult, &shift, freq, NSEC_PER_SEC, 0); diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h index 4641e850b204..3bef390c1069 100644 --- a/include/linux/perf/arm_pmu.h +++ b/include/linux/perf/arm_pmu.h @@ -30,6 +30,8 @@ */ /* Event uses a 64bit counter */ #define ARMPMU_EVT_64BIT 1 +/* Allow access to hardware counter from userspace */ +#define ARMPMU_EL0_RD_CNTR 2 #define HW_OP_UNSUPPORTED 0xffff #define C(_x) PERF_COUNT_HW_CACHE_##_x -- 2.17.1
[PATCH 1/7] perf: arm64: Compile tests unconditionally
In order to subsequently add more tests for the arm64 architecture we compile the tests target for arm64 systematically. Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/Build | 2 +- tools/perf/arch/arm64/tests/Build | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/arch/arm64/Build b/tools/perf/arch/arm64/Build index 36222e64bbf7..a7dd46a5b678 100644 --- a/tools/perf/arch/arm64/Build +++ b/tools/perf/arch/arm64/Build @@ -1,2 +1,2 @@ perf-y += util/ -perf-$(CONFIG_DWARF_UNWIND) += tests/ +perf-y += tests/ diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build index 41707fea74b3..a61c06bdb757 100644 --- a/tools/perf/arch/arm64/tests/Build +++ b/tools/perf/arch/arm64/tests/Build @@ -1,4 +1,4 @@ perf-y += regs_load.o -perf-y += dwarf-unwind.o +perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o perf-y += arch-tests.o -- 2.17.1
[PATCH 0/7] arm64: Enable access to pmu registers by user-space
The perf user-space tool relies on the PMU to monitor events. It offers an abstraction layer over the hardware counters since the underlying implementation is cpu-dependent. We want to allow userspace tools to have access to the registers storing the hardware counters' values directly. This targets specifically self-monitoring tasks in order to reduce the overhead by directly accessing the registers without having to go through the kernel. In order to do this we need to set up the pmu so that it exposes its registers to userspace access. The first patch enables the tests for the arm64 architecture in the perf tool to be compiled systematically. The second patch adds a test to the perf tool so that we can check that the access to the registers works correctly from userspace. The third patch adds another test similar to the first one but this time using rseq as the mechanism to ensure data correctness. The fourth patch focuses on the armv8 pmuv3 PMU support and makes sure that the access to the pmu registers is enabled and that userspace has access to the relevant information in order to use them. The fifth patch adds a hook to handle faulting accesses to the pmu registers. This is necessary in order to have a coherent behaviour on big.LITTLE environments. The sixth patch puts in place callbacks to enable access to the hardware counters from userspace when a compatible event is opened using the perf API. Raphael Gault (7): perf: arm64: Compile tests unconditionally perf: arm64: Add test to check userspace access to hardware counters. perf: arm64: Use rseq to test userspace access to pmu counters arm64: pmu: Add function implementation to update event index in userpage.
arm64: pmu: Add hook to handle pmu-related undefined instructions arm64: perf: Enable pmu counter direct access for perf event on armv8 Documentation: arm64: Document PMU counters access from userspace .../arm64/pmu_counter_user_access.txt | 42 +++ arch/arm64/include/asm/mmu.h | 6 + arch/arm64/include/asm/mmu_context.h | 2 + arch/arm64/include/asm/perf_event.h | 14 + arch/arm64/kernel/cpufeature.c| 4 +- arch/arm64/kernel/perf_event.c| 76 ++ drivers/perf/arm_pmu.c| 38 +++ include/linux/perf/arm_pmu.h | 2 + tools/perf/arch/arm64/Build | 2 +- tools/perf/arch/arm64/include/arch-tests.h| 9 + tools/perf/arch/arm64/include/rseq-arm64.h| 220 +++ tools/perf/arch/arm64/tests/Build | 4 +- tools/perf/arch/arm64/tests/arch-tests.c | 10 + tools/perf/arch/arm64/tests/rseq-pmu-events.c | 219 +++ tools/perf/arch/arm64/tests/user-events.c | 255 ++ 15 files changed, 899 insertions(+), 4 deletions(-) create mode 100644 Documentation/arm64/pmu_counter_user_access.txt create mode 100644 tools/perf/arch/arm64/include/rseq-arm64.h create mode 100644 tools/perf/arch/arm64/tests/rseq-pmu-events.c create mode 100644 tools/perf/arch/arm64/tests/user-events.c -- 2.17.1
Re: [RFC 4/7] arm64: pmu: Add function implementation to update event index in userpage.
Hi Peter, On 5/29/19 1:32 PM, Peter Zijlstra wrote: On Wed, May 29, 2019 at 01:25:46PM +0100, Raphael Gault wrote: Hi Robin, Hi Peter, On 5/29/19 11:50 AM, Robin Murphy wrote: On 29/05/2019 11:46, Raphael Gault wrote: Hi Peter, On 5/29/19 10:46 AM, Peter Zijlstra wrote: On Tue, May 28, 2019 at 04:03:17PM +0100, Raphael Gault wrote: +static int armv8pmu_access_event_idx(struct perf_event *event) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return 0; + + /* + * We remap the cycle counter index to 32 to + * match the offset applied to the rest of + * the counter indeces. + */ + if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER) + return 32; + + return event->hw.idx; Is there a guarantee event->hw.idx is never 0? Or should you, just like x86, use +1 here? You are right, I should use +1 here. Thanks for pointing that out. Isn't that already the case though, since we reserve index 0 for the cycle counter? I'm looking at ARMV8_IDX_TO_COUNTER() here... Well the current behaviour is correct and takes care of the zero case with the ARMV8_IDX_CYCLE_COUNTER check. But using ARMV8_IDX_TO_COUNTER() and adding 1 would also work. However this seems indeed redundant with the current value held in event->hw.idx. Note that whatever you pick now will become ABI. Also note that the comment/pseudo-code in perf_event_mmap_page suggests to use idx-1 for the actual hardware access. Indeed that's true. As for the pseudo-code in perf_event_mmap_page, it is compatible with what I do here. The two approaches are only different in form, but in both cases it is necessary to subtract 1 from the returned value in order to access the correct hardware counter. Thank you, -- Raphael Gault
Re: [RFC 4/7] arm64: pmu: Add function implementation to update event index in userpage.
Hi Robin, Hi Peter, On 5/29/19 11:50 AM, Robin Murphy wrote: On 29/05/2019 11:46, Raphael Gault wrote: Hi Peter, On 5/29/19 10:46 AM, Peter Zijlstra wrote: On Tue, May 28, 2019 at 04:03:17PM +0100, Raphael Gault wrote: +static int armv8pmu_access_event_idx(struct perf_event *event) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return 0; + + /* + * We remap the cycle counter index to 32 to + * match the offset applied to the rest of + * the counter indeces. + */ + if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER) + return 32; + + return event->hw.idx; Is there a guarantee event->hw.idx is never 0? Or should you, just like x86, use +1 here? You are right, I should use +1 here. Thanks for pointing that out. Isn't that already the case though, since we reserve index 0 for the cycle counter? I'm looking at ARMV8_IDX_TO_COUNTER() here... Well the current behaviour is correct and takes care of the zero case with the ARMV8_IDX_CYCLE_COUNTER check. But using ARMV8_IDX_TO_COUNTER() and adding 1 would also work. However this seems indeed redundant with the current value held in event->hw.idx. Robin. -- Raphael Gault
Re: [RFC 4/7] arm64: pmu: Add function implementation to update event index in userpage.
Hi Peter, On 5/29/19 10:46 AM, Peter Zijlstra wrote: On Tue, May 28, 2019 at 04:03:17PM +0100, Raphael Gault wrote: +static int armv8pmu_access_event_idx(struct perf_event *event) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return 0; + + /* +* We remap the cycle counter index to 32 to +* match the offset applied to the rest of +* the counter indices. +*/ + if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER) + return 32; + + return event->hw.idx; Is there a guarantee event->hw.idx is never 0? Or should you, just like x86, use +1 here? You are right, I should use +1 here. Thanks for pointing that out. +} Thanks, -- Raphael Gault
[RFC 6/7] arm64: perf: Enable pmu counter direct access for perf event on armv8
Keep track of events opened with direct access to the hardware counters and modify permissions while they are open. The strategy used here is the same one x86 uses: every time an event is mapped, the permissions are set if required. The atomic field added in the mm_context helps keep track of the different events opened and deactivates the permissions when all are unmapped. We also need to update the permissions in the context switch code so that tasks keep the right permissions. Signed-off-by: Raphael Gault --- arch/arm64/include/asm/mmu.h | 6 + arch/arm64/include/asm/mmu_context.h | 2 ++ arch/arm64/include/asm/perf_event.h | 14 ++ drivers/perf/arm_pmu.c | 38 4 files changed, 60 insertions(+) diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index 67ef25d037ea..9de4cf0b17c7 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -29,6 +29,12 @@ typedef struct { atomic64_t id; + + /* +* non-zero if userspace has access to hardware +* counters directly.
+*/ + atomic_t pmu_direct_access; void *vdso; unsigned long flags; } mm_context_t; diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index 2da3e478fd8f..33494af613d8 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -32,6 +32,7 @@ #include #include #include +#include #include #include @@ -235,6 +236,7 @@ static inline void __switch_mm(struct mm_struct *next) } check_and_switch_context(next, cpu); + perf_switch_user_access(next); } static inline void diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h index c593761ba61c..32a6d604bb3b 100644 --- a/arch/arm64/include/asm/perf_event.h +++ b/arch/arm64/include/asm/perf_event.h @@ -19,6 +19,7 @@ #include #include +#include #define ARMV8_PMU_MAX_COUNTERS 32 #define ARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1) @@ -234,4 +235,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs); (regs)->pstate = PSR_MODE_EL1h; \ } +static inline void perf_switch_user_access(struct mm_struct *mm) +{ + if (!IS_ENABLED(CONFIG_PERF_EVENTS)) + return; + + if (atomic_read(&mm->context.pmu_direct_access)) { + write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR, +pmuserenr_el0); + } else { + write_sysreg(0, pmuserenr_el0); + } +} + #endif diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c index eec75b97e7ea..0e5588cd2f39 100644 --- a/drivers/perf/arm_pmu.c +++ b/drivers/perf/arm_pmu.c @@ -24,6 +24,7 @@ #include #include +#include static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu); static DEFINE_PER_CPU(int, cpu_irq); @@ -777,6 +778,41 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu) &cpu_pmu->node); } +static void refresh_pmuserenr(void *mm) +{ + perf_switch_user_access(mm); +} + +static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return; + + /* +* This function relies on not being called concurrently in two +* tasks in the
same mm. Otherwise one task could observe +* pmu_direct_access > 1 and return all the way back to +* userspace with user access disabled while another task is still +* doing on_each_cpu_mask() to enable user access. +* +* For now, this can't happen because all callers hold mmap_sem +* for write. If this changes, we'll need a different solution. +*/ + lockdep_assert_held_exclusive(&mm->mmap_sem); + + if (atomic_inc_return(&mm->context.pmu_direct_access) == 1) + on_each_cpu(refresh_pmuserenr, mm, 1); +} + +static void armpmu_event_unmapped(struct perf_event *event, struct mm_struct *mm) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return; + + if (atomic_dec_and_test(&mm->context.pmu_direct_access)) + on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1); +} + static struct arm_pmu *__armpmu_alloc(gfp_t flags) { struct arm_pmu *pmu; @@ -798,6 +834,8 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags) .pmu_enable = armpmu_enable, .pmu_disable= armpmu_disable, .event_init = armpmu_event_init, + .event_mapped = armpmu_event_mapped, + .event_unmapped = armpmu_event_unmapped, .add= armpmu_add, .del= armpmu_del, .start = armpmu_start, -- 2.17.1
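The mapped/unmapped hooks above implement a simple first-enables/last-disables pattern around an atomic counter. A portable user-space sketch with C11 atomics, where a plain flag stands in for the per-CPU PMUSERENR_EL0 state and the `on_each_cpu()` refresh (names here are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int pmu_direct_access;   /* stand-in for the mm_context field */
static bool user_access_enabled;       /* stand-in for PMUSERENR_EL0 state */

/* Stand-in for refresh_pmuserenr()/on_each_cpu(): recompute from the count. */
static void refresh(void)
{
    user_access_enabled = atomic_load(&pmu_direct_access) > 0;
}

static void event_mapped(void)
{
    /* the first mapper flips user access on */
    if (atomic_fetch_add(&pmu_direct_access, 1) + 1 == 1)
        refresh();
}

static void event_unmapped(void)
{
    /* the last unmapper flips it back off */
    if (atomic_fetch_sub(&pmu_direct_access, 1) - 1 == 0)
        refresh();
}
```

Note how intermediate map/unmap pairs leave the enabled state untouched; only the 0→1 and 1→0 transitions touch the (expensive, cross-CPU in the real code) refresh path.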
[RFC 3/7] perf: arm64: Use rseq to test userspace access to pmu counters
Add an extra test to check userspace access to pmu hardware counters. This test doesn't rely on the seqlock as a synchronisation mechanism but instead uses the restartable sequences to make sure that the thread is not interrupted when reading the index of the counter and the associated pmu register. In addition to reading the pmu counters, this test is run several time in order to measure the ratio of failures: I ran this test on the Juno development platform, which is big.LITTLE with 4 Cortex A53 and 2 Cortex A57. The results vary quite a lot (running it with 100 tests is not so long and I did it several times). I ran it once with 1 iterations: `runs: 1, abort: 62.53%, zero: 34.93%, success: 2.54%` Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/include/arch-tests.h| 5 +- tools/perf/arch/arm64/include/rseq-arm64.h| 220 ++ tools/perf/arch/arm64/tests/Build | 1 + tools/perf/arch/arm64/tests/arch-tests.c | 6 + tools/perf/arch/arm64/tests/rseq-pmu-events.c | 219 + 5 files changed, 450 insertions(+), 1 deletion(-) create mode 100644 tools/perf/arch/arm64/include/rseq-arm64.h create mode 100644 tools/perf/arch/arm64/tests/rseq-pmu-events.c diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h index a9b17ae0560b..4164762b43c6 100644 --- a/tools/perf/arch/arm64/include/arch-tests.h +++ b/tools/perf/arch/arm64/include/arch-tests.h @@ -13,6 +13,9 @@ int test__arch_unwind_sample(struct perf_sample *sample, extern struct test arch_tests[]; int test__rd_pmevcntr(struct test *test __maybe_unused, int subtest __maybe_unused); - +#ifdef CONFIG_RSEQ +int rseq__rd_pmevcntr(struct test *test __maybe_unused, + int subtest __maybe_unused); +#endif #endif diff --git a/tools/perf/arch/arm64/include/rseq-arm64.h b/tools/perf/arch/arm64/include/rseq-arm64.h new file mode 100644 index ..00d6960915a9 --- /dev/null +++ b/tools/perf/arch/arm64/include/rseq-arm64.h @@ -0,0 +1,220 @@ +/* SPDX-License-Identifier: LGPL-2.1 OR MIT */ +/* + * 
rseq-arm64.h + * + * This file is mostly a copy from + * tools/testing/selftests/rseq/rseq-arm64.h + */ + +/* + * aarch64 -mbig-endian generates mixed endianness code vs data: + * little-endian code and big-endian data. Ensure the RSEQ_SIG signature + * matches code endianness. + */ +#define __rseq_str_1(x) #x +#define __rseq_str(x)__rseq_str_1(x) + +#define RSEQ_ACCESS_ONCE(x)(*(__volatile__ __typeof__(x) *)&(x)) +#define RSEQ_SIG_CODE 0xd428bc00 /* BRK #0x45E0. */ + +#ifdef __AARCH64EB__ +#define RSEQ_SIG_DATA 0x00bc28d4 /* BRK #0x45E0. */ +#else +#define RSEQ_SIG_DATA RSEQ_SIG_CODE +#endif + +#define RSEQ_SIG RSEQ_SIG_DATA + +#define rseq_smp_mb() __asm__ __volatile__ ("dmb ish" ::: "memory") +#define rseq_smp_rmb() __asm__ __volatile__ ("dmb ishld" ::: "memory") +#define rseq_smp_wmb() __asm__ __volatile__ ("dmb ishst" ::: "memory") + +#define rseq_smp_load_acquire(p) \ +__extension__ ({ \ + __typeof(*p) p1; \ + switch (sizeof(*p)) { \ + case 1: \ + asm volatile ("ldarb %w0, %1" \ + : "=r" (*(__u8 *)p) \ + : "Q" (*p) : "memory"); \ + break; \ + case 2: \ + asm volatile ("ldarh %w0, %1" \ + : "=r" (*(__u16 *)p) \ + : "Q" (*p) : "memory"); \ + break; \ + case 4: \ + asm volatile ("ldar %w0, %1" \ + : "=r" (*(__u32 *)p) \ + : "Q" (*p) : "memory"); \ + break; \ + case 8: \ + asm volatile ("ldar %0, %1"
[RFC 5/7] arm64: pmu: Add hook to handle pmu-related undefined instructions
In order to deal with userspace processes which try to access the PMU registers on a big.LITTLE environment, we introduce a hook to handle undefined instructions. The goal here is to prevent the process from being interrupted by a signal when the error is caused by the task being scheduled while accessing a counter, making the counter access invalid. As we cannot efficiently know, in that context, the number of counters physically available on both PMUs, we consider that any faulting access to a counter which is architecturally correct should not cause a SIGILL signal if the permissions are set accordingly. This commit also modifies the mask of the mrs_hook declared in arch/arm64/kernel/cpufeatures.c which emulates only feature register access. This is necessary because this hook's mask was too large and thus matched any mrs instruction, even those not related to the emulated registers, which made the PMU emulation inefficient. Signed-off-by: Raphael Gault --- arch/arm64/kernel/cpufeature.c | 4 ++-- arch/arm64/kernel/perf_event.c | 41 ++ 2 files changed, 43 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 2b807f129e60..daa7b31f2c73 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -2166,8 +2166,8 @@ static int emulate_mrs(struct pt_regs *regs, u32 insn) } static struct undef_hook mrs_hook = { - .instr_mask = 0xfff00000, - .instr_val = 0xd5300000, + .instr_mask = 0xffff0000, + .instr_val = 0xd5380000, .pstate_mask = PSR_AA32_MODE_MASK, .pstate_val = PSR_MODE_EL0t, .fn = emulate_mrs, diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 3dc1265540df..1687f6d1fa27 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -19,9 +19,11 @@ * along with this program. If not, see <http://www.gnu.org/licenses/>.
*/ +#include #include #include #include +#include #include #include @@ -1009,6 +1011,45 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu) return probe.present ? 0 : -ENODEV; } +static int emulate_pmu(struct pt_regs *regs, u32 insn) +{ + u32 sys_reg, rt; + u32 pmuserenr; + + sys_reg = (u32)aarch64_insn_decode_immediate(AARCH64_INSN_IMM_16, insn) << 5; + rt = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn); + pmuserenr = read_sysreg(pmuserenr_el0); + + if ((pmuserenr & (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) != + (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) + return -EINVAL; + + pt_regs_write_reg(regs, rt, 0); + + arm64_skip_faulting_instruction(regs, 4); + return 0; +} + +/* + * This hook will only be triggered by mrs + * instructions on PMU registers. This is mandatory + * in order to have a consistent behaviour even on + * big.LITTLE systems. + */ +static struct undef_hook pmu_hook = { + .instr_mask = 0xffff8800, + .instr_val = 0xd53b8800, + .fn = emulate_pmu, +}; + +static int __init enable_pmu_emulation(void) +{ + register_undef_hook(&pmu_hook); + return 0; +} + +core_initcall(enable_pmu_emulation); + static int armv8_pmu_init(struct arm_pmu *cpu_pmu) { int ret = armv8pmu_probe_pmu(cpu_pmu); -- 2.17.1
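The emulate_pmu() hook above extracts the system-register fields from the trapped mrs instruction. A host-side sketch of that field extraction, using the bit layout documented elsewhere in this series ([20-19] Op0, [18-16] Op1, [15-12] CRn, [11-8] CRm, [7-5] Op2, [4-0] Rt) and `mrs x0, pmccntr_el0` (0xd53b9d00) as a worked example; the mask/value pair mirrors pmu_hook, but this is an illustration, not the kernel helper:

```c
#include <assert.h>
#include <stdint.h>

struct sysreg_fields { unsigned op0, op1, crn, crm, op2, rt; };

/* Field layout per the comment in the series:
 * [20-19] Op0, [18-16] Op1, [15-12] CRn, [11-8] CRm, [7-5] Op2, [4-0] Rt. */
static struct sysreg_fields decode_mrs(uint32_t insn)
{
    struct sysreg_fields f = {
        .op0 = (insn >> 19) & 0x3,
        .op1 = (insn >> 16) & 0x7,
        .crn = (insn >> 12) & 0xf,
        .crm = (insn >>  8) & 0xf,
        .op2 = (insn >>  5) & 0x7,
        .rt  =  insn        & 0x1f,
    };
    return f;
}

/* Would this instruction be caught by the pmu_hook mask/value pair? */
static int matches_pmu_hook(uint32_t insn)
{
    return (insn & 0xffff8800u) == 0xd53b8800u;
}
```

For 0xd53b9d00 this yields Op0=3, Op1=3, CRn=9, CRm=13, Op2=0, Rt=0 (x0), which is the PMCCNTR_EL0 encoding, and the hook mask matches it.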
[RFC 4/7] arm64: pmu: Add function implementation to update event index in userpage.
In order to be able to access the counter directly from userspace, we need to provide the index of the counter using the userpage. We thus need to override the event_idx function to retrieve and convert the perf_event index to the armv8 hardware index. Since the arm_pmu driver can be used by any implementation, even if not armv8, two components play a role in making sure the behaviour is correct and consistent with the PMU capabilities: * the ARMPMU_EL0_RD_CNTR flag which denotes the capability to access the counter from userspace. * the event_idx callback, which is implemented and initialized by the PMU implementation: if no callback is provided, the default behaviour applies, returning 0 as index value. Signed-off-by: Raphael Gault --- arch/arm64/kernel/perf_event.c | 21 + include/linux/perf/arm_pmu.h | 2 ++ 2 files changed, 23 insertions(+) diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 6164d389eed6..3dc1265540df 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -809,6 +809,22 @@ static void armv8pmu_clear_event_idx(struct pmu_hw_events *cpuc, clear_bit(idx - 1, cpuc->used_mask); } +static int armv8pmu_access_event_idx(struct perf_event *event) +{ + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return 0; + + /* +* We remap the cycle counter index to 32 to +* match the offset applied to the rest of +* the counter indices. +*/ + if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER) + return 32; + + return event->hw.idx; +} + /* * Add an event filter to a given event.
*/ @@ -890,6 +906,8 @@ static int __armv8_pmuv3_map_event(struct perf_event *event, if (armv8pmu_event_is_64bit(event)) event->hw.flags |= ARMPMU_EVT_64BIT; + event->hw.flags |= ARMPMU_EL0_RD_CNTR; + /* Only expose micro/arch events supported by this PMU */ if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS) && test_bit(hw_event_id, armpmu->pmceid_bitmap)) { @@ -1010,6 +1028,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu) cpu_pmu->set_event_filter = armv8pmu_set_event_filter; cpu_pmu->filter_match = armv8pmu_filter_match; + cpu_pmu->pmu.event_idx = armv8pmu_access_event_idx; + return 0; } @@ -1188,6 +1208,7 @@ void arch_perf_update_userpage(struct perf_event *event, */ freq = arch_timer_get_rate(); userpg->cap_user_time = 1; + userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR); clocks_calc_mult_shift(&userpg->time_mult, &shift, freq, NSEC_PER_SEC, 0); diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h index 4641e850b204..3bef390c1069 100644 --- a/include/linux/perf/arm_pmu.h +++ b/include/linux/perf/arm_pmu.h @@ -30,6 +30,8 @@ */ /* Event uses a 64bit counter */ #define ARMPMU_EVT_64BIT 1 +/* Allow access to hardware counter from userspace */ +#define ARMPMU_EL0_RD_CNTR 2 #define HW_OP_UNSUPPORTED 0xffff #define C(_x) PERF_COUNT_HW_CACHE_##_x -- 2.17.1
[RFC 2/7] perf: arm64: Add test to check userspace access to hardware counters.
This test relies on the fact that the PMU registers are accessible from userspace. It then uses the perf_event_mmap_page to retrieve the counter index and access the underlying register. This test uses sched_setaffinity(2) in order to run on all CPUs and thus check the behaviour of the PMU on all CPUs in a big.LITTLE environment. Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/include/arch-tests.h | 6 + tools/perf/arch/arm64/tests/Build | 1 + tools/perf/arch/arm64/tests/arch-tests.c | 4 + tools/perf/arch/arm64/tests/user-events.c | 255 + 4 files changed, 266 insertions(+) create mode 100644 tools/perf/arch/arm64/tests/user-events.c diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h index 90ec4c8cb880..a9b17ae0560b 100644 --- a/tools/perf/arch/arm64/include/arch-tests.h +++ b/tools/perf/arch/arm64/include/arch-tests.h @@ -2,11 +2,17 @@ #ifndef ARCH_TESTS_H #define ARCH_TESTS_H +#define __maybe_unused __attribute__((unused)) #ifdef HAVE_DWARF_UNWIND_SUPPORT struct thread; struct perf_sample; +int test__arch_unwind_sample(struct perf_sample *sample, +struct thread *thread); #endif extern struct test arch_tests[]; +int test__rd_pmevcntr(struct test *test __maybe_unused, + int subtest __maybe_unused); + #endif diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build index a61c06bdb757..3f9a20c17fc6 100644 --- a/tools/perf/arch/arm64/tests/Build +++ b/tools/perf/arch/arm64/tests/Build @@ -1,4 +1,5 @@ perf-y += regs_load.o perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o +perf-y += user-events.o perf-y += arch-tests.o diff --git a/tools/perf/arch/arm64/tests/arch-tests.c b/tools/perf/arch/arm64/tests/arch-tests.c index 5b1543c98022..57df9b89dede 100644 --- a/tools/perf/arch/arm64/tests/arch-tests.c +++ b/tools/perf/arch/arm64/tests/arch-tests.c @@ -10,6 +10,10 @@ struct test arch_tests[] = { .func = test__dwarf_unwind, }, #endif + { + .desc = "User counter access", + .func = test__rd_pmevcntr,
+ }, { .func = NULL, }, diff --git a/tools/perf/arch/arm64/tests/user-events.c b/tools/perf/arch/arm64/tests/user-events.c new file mode 100644 index ..958e4cd000c1 --- /dev/null +++ b/tools/perf/arch/arm64/tests/user-events.c @@ -0,0 +1,255 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "perf.h" +#include "debug.h" +#include "tests/tests.h" +#include "cloexec.h" +#include "util.h" +#include "arch-tests.h" + +/* + * ARMv8 ARM reserves the following encoding for system registers: + * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview", + * C5.2, version:ARM DDI 0487A.f) + * [20-19] : Op0 + * [18-16] : Op1 + * [15-12] : CRn + * [11-8] : CRm + * [7-5] : Op2 + */ +#define Op0_shift 19 +#define Op0_mask0x3 +#define Op1_shift 16 +#define Op1_mask0x7 +#define CRn_shift 12 +#define CRn_mask0xf +#define CRm_shift 8 +#define CRm_mask0xf +#define Op2_shift 5 +#define Op2_mask0x7 + +#define __stringify(x) #x + +#define read_sysreg(r) ({ \ + u64 __val; \ + asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \ + __val; \ +}) + +#define PMEVCNTR_READ_CASE(idx)\ + case idx: \ + return read_sysreg(pmevcntr##idx##_el0) + +#define PMEVCNTR_CASES(readwrite) \ + PMEVCNTR_READ_CASE(0); \ + PMEVCNTR_READ_CASE(1); \ + PMEVCNTR_READ_CASE(2); \ + PMEVCNTR_READ_CASE(3); \ + PMEVCNTR_READ_CASE(4); \ + PMEVCNTR_READ_CASE(5); \ + PMEVCNTR_READ_CASE(6); \ + PMEVCNTR_READ_CASE(7); \ + PMEVCNTR_READ_CASE(8); \ + PMEVCNTR_READ_CASE(9); \ + PMEVCNTR_READ_CASE(10); \ + PMEVCNTR_READ_CASE(11); \ + PMEVCNTR_READ_CASE(12); \ + PMEVCNTR_READ_CASE(13); \ + PMEVCNTR_READ_CASE(14); \ + PMEVCNTR_READ_CASE(15); \ + PMEVCNTR_READ_CASE(16); \ + PMEVCNTR_READ_CASE(17); \ + PMEVCNTR_READ_CASE(18
[RFC 7/7] Documentation: arm64: Document PMU counters access from userspace
Add a documentation file to describe the access to the PMU hardware counters from userspace. Signed-off-by: Raphael Gault --- .../arm64/pmu_counter_user_access.txt | 42 +++ 1 file changed, 42 insertions(+) create mode 100644 Documentation/arm64/pmu_counter_user_access.txt diff --git a/Documentation/arm64/pmu_counter_user_access.txt b/Documentation/arm64/pmu_counter_user_access.txt new file mode 100644 index ..6788b1107381 --- /dev/null +++ b/Documentation/arm64/pmu_counter_user_access.txt @@ -0,0 +1,42 @@ +Access to PMU hardware counter from userspace +============================================= + +Overview +-------- +The perf user-space tool relies on the PMU to monitor events. It offers an +abstraction layer over the hardware counters since the underlying +implementation is cpu-dependent. +Arm64 allows userspace tools to have access to the registers storing the +hardware counters' values directly. + +This targets specifically self-monitoring tasks in order to reduce the overhead +by directly accessing the registers without having to go through the kernel. + +How-to +------ +The focus is set on the armv8 pmuv3 which makes sure that the access to the pmu +registers is enabled and that userspace has access to the relevant +information in order to use them. + +In order to have access to the hardware counter it is necessary to open the event +using the perf tool interface: the sys_perf_event_open syscall returns a fd which +can subsequently be used with the mmap syscall in order to retrieve a page of memory +containing information about the event. +The PMU driver uses this page to expose to the user the hardware counter's +index. Using this index enables the user to access the PMU registers using the +`mrs` instruction. + +Have a look at `tools/perf/arch/arm64/tests/user-events.c` for an example.
It can be +run using the perf tool to check that the access to the registers works +correctly from userspace: + +./perf test -v + +About chained events +-------------------- +When the user requests that an event be counted on 64 bits, two hardware +counters are used and need to be combined to retrieve the correct value: + +val = read_counter(idx); +if ((event.attr.config1 & 0x1)) + val = (val << 32) | read_counter(idx - 1); -- 2.17.1
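The chained-event combination shown in the documentation can be exercised off-target. A minimal portable sketch, with a stand-in array in place of the real `mrs` reads of pmevcntr<n>_el0 (the counter values here are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counter backing store; real code would issue an mrs on
 * pmevcntr<idx>_el0 instead. */
static uint64_t fake_counters[4] = { 0, 0, 0x2, 0x1 };

static uint64_t read_counter(int idx)
{
    return fake_counters[idx];
}

/* Combine per the documentation: when attr.config1 bit 0 was set, the event
 * is chained and counter idx - 1 holds the low 32 bits. */
static uint64_t read_chained(int idx, int chained)
{
    uint64_t val = read_counter(idx);
    if (chained)
        val = (val << 32) | read_counter(idx - 1);
    return val;
}
```

With counter 3 holding the high half (1) and counter 2 the low half (2), the chained read yields 0x100000002, while an unchained read of counter 3 returns just 1.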
[RFC V2 0/7] arm64: Enable access to pmu registers by user-space
The perf user-space tool relies on the PMU to monitor events. It offers an abstraction layer over the hardware counters since the underlying implementation is cpu-dependent. We want to allow userspace tools to have access to the registers storing the hardware counters' values directly. This targets specifically self-monitoring tasks in order to reduce the overhead by directly accessing the registers without having to go through the kernel. In order to do this we need to set up the PMU so that it exposes its registers to userspace access. The first patch enables the tests for the arm64 architecture in the perf tool to be compiled systematically. The second patch adds a test to the perf tool so that we can check that the access to the registers works correctly from userspace. The third patch adds another test similar to the first one, but this time using rseq as a mechanism to ensure data correctness. The fourth patch focuses on the armv8 pmuv3 PMU support and makes sure that the access to the PMU registers is enabled and that userspace has access to the relevant information in order to use them. The fifth patch adds a hook to handle faulting access to the PMU registers. This is necessary in order to have a coherent behaviour on a big.LITTLE environment. The sixth patch puts in place callbacks to enable access to the hardware counters from userspace when a compatible event is opened using the perf API. RFC: In my opinion there is no need to save pmselr_el0 when context switching like we do for pmuserenr_el0, since whether it's the seqlock mechanism or the restartable sequences, the user should notice right away that the value held in pmxevcntr_el0 is incorrect when the task has been rescheduled. However, I still wanted to discuss this point on the list to confirm that saving it is indeed unnecessary.
Changes since V1: Add a test using rseq Raphael Gault (7): perf: arm64: Compile tests unconditionally perf: arm64: Add test to check userspace access to hardware counters. perf: arm64: Use rseq to test userspace access to pmu counters arm64: pmu: Add function implementation to update event index in userpage. arm64: pmu: Add hook to handle pmu-related undefined instructions arm64: perf: Enable pmu counter direct access for perf event on armv8 Documentation: arm64: Document PMU counters access from userspace .../arm64/pmu_counter_user_access.txt | 42 +++ arch/arm64/include/asm/mmu.h | 6 + arch/arm64/include/asm/mmu_context.h | 2 + arch/arm64/include/asm/perf_event.h | 14 + arch/arm64/kernel/cpufeature.c| 4 +- arch/arm64/kernel/perf_event.c| 62 + drivers/perf/arm_pmu.c| 38 +++ include/linux/perf/arm_pmu.h | 2 + tools/perf/arch/arm64/Build | 2 +- tools/perf/arch/arm64/include/arch-tests.h| 9 + tools/perf/arch/arm64/include/rseq-arm64.h| 220 +++ tools/perf/arch/arm64/tests/Build | 4 +- tools/perf/arch/arm64/tests/arch-tests.c | 10 + tools/perf/arch/arm64/tests/rseq-pmu-events.c | 219 +++ tools/perf/arch/arm64/tests/user-events.c | 255 ++ 15 files changed, 885 insertions(+), 4 deletions(-) create mode 100644 Documentation/arm64/pmu_counter_user_access.txt create mode 100644 tools/perf/arch/arm64/include/rseq-arm64.h create mode 100644 tools/perf/arch/arm64/tests/rseq-pmu-events.c create mode 100644 tools/perf/arch/arm64/tests/user-events.c -- 2.17.1
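For reference, the seqlock mechanism the cover letter mentions is the usual self-monitoring retry loop against perf_event_mmap_page. A simplified, self-contained sketch: the struct below is a stand-in for the few fields the loop actually uses, and hw_read() replaces the `mrs` on the counter register, so this is a shape illustration rather than the real ABI struct:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the perf_event_mmap_page fields used here. */
struct mmap_page {
    volatile uint32_t lock;    /* seqcount bumped around updates */
    volatile uint32_t index;   /* 0 = no direct access, else hw index + 1 */
    volatile int64_t  offset;  /* value accumulated by the kernel */
};

static uint64_t hw_read(uint32_t idx)
{
    (void)idx;
    return 100;  /* stand-in for the mrs read of the counter */
}

/* Retry until the seqcount is unchanged across the reads. */
static uint64_t read_event_count(struct mmap_page *pc)
{
    uint32_t seq, idx;
    uint64_t count;
    do {
        seq = pc->lock;
        __sync_synchronize();          /* barrier, as the mmap-page doc shows */
        idx = pc->index;
        count = pc->offset;
        if (idx)                       /* direct access available */
            count += hw_read(idx - 1);
        __sync_synchronize();
    } while (pc->lock != seq);
    return count;
}
```

If the task is rescheduled mid-loop, the kernel bumps the seqcount and the read is retried, which is exactly the property the RFC question about pmselr_el0 relies on.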
[RFC 1/7] perf: arm64: Compile tests unconditionally
In order to subsequently add more tests for the arm64 architecture we compile the tests target for arm64 systematically. Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/Build | 2 +- tools/perf/arch/arm64/tests/Build | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/arch/arm64/Build b/tools/perf/arch/arm64/Build index 36222e64bbf7..a7dd46a5b678 100644 --- a/tools/perf/arch/arm64/Build +++ b/tools/perf/arch/arm64/Build @@ -1,2 +1,2 @@ perf-y += util/ -perf-$(CONFIG_DWARF_UNWIND) += tests/ +perf-y += tests/ diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build index 41707fea74b3..a61c06bdb757 100644 --- a/tools/perf/arch/arm64/tests/Build +++ b/tools/perf/arch/arm64/tests/Build @@ -1,4 +1,4 @@ perf-y += regs_load.o -perf-y += dwarf-unwind.o +perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o perf-y += arch-tests.o -- 2.17.1
Re: [RFC V2 00/16] objtool: Add support for Arm64
Hi Josh, Thanks for offering your help and sorry for the late answer. My understanding is that a table of offsets is built by GCC, those offsets being scaled by 4 before adding them to the base label. I believe the offsets are stored in the .rodata section. To find the size of that table, one needs to find a comparison, which can apparently be optimized out. In that case the end of the array can be found by locating labels pointing to data behind it (which is not 100% safe). On 5/16/19 3:29 PM, Josh Poimboeuf wrote: > On Thu, May 16, 2019 at 11:36:39AM +0100, Raphael Gault wrote: >> Noteworthy points: >> * I still haven't figured out how to detect switch-tables on arm64. I >> have a better understanding of them but still haven't implemented checks >> as it doesn't look trivial at all. > > Switch tables were tricky to get right on x86. If you share an example > (or even just a .o file) I can take a look. Hopefully they're somewhat > similar to x86 switch tables. Otherwise we may want to consider a > different approach (for example maybe a GCC plugin could help annotate > them). > The case which made me realize the issue is the one of arch/arm64/kernel/module.o:apply_relocate_add: ``` What seems to happen in the case of module.o is: 334: 90000015 adrp x21, 0 which retrieves the location of an offset in the rodata section, and a bit later we do some extra computation with it in order to compute the jump destination: 3e0: 78625aa0 ldrh w0, [x21, w2, uxtw #1] 3e4: 10000061 adr x1, 3f0 3e8: 8b20a820 add x0, x1, w0, sxth #2 3ec: d61f0000 br x0 ``` Please keep in mind that the actual offsets might vary. I'm happy to provide more details about what I have identified if you want me to. Thanks, -- Raphael Gault IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged.
If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
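The instruction sequence quoted in the mail computes the branch target as base_label + sign_extended(half-word offset) * 4. A small sketch of how a tool like objtool could compute candidate targets from such a .rodata offset table, assuming the format described above (this mirrors the `ldrh`/`adr`/`add ..., sxth #2` sequence, not any actual objtool API):

```c
#include <assert.h>
#include <stdint.h>

/* Compute the branch destination for one jump-table entry, mirroring:
 *   ldrh  w0, [x21, w2, uxtw #1]   ; load 16-bit offset, index scaled by 2
 *   adr   x1, <base_label>
 *   add   x0, x1, w0, sxth #2      ; dest = base + sign_extend(offset) * 4
 */
static uint64_t jump_table_target(uint64_t base_label,
                                  const uint16_t *table, unsigned index)
{
    int16_t off = (int16_t)table[index];   /* sxth: sign-extend the half-word */
    return base_label + (int64_t)off * 4;  /* scaled by 4, as in the mail */
}
```

Because the offsets are signed, entries can point both before and after the base label, which is part of what makes bounding the table without the comparison instruction unsafe.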
Re: [PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions
Hi, On 5/17/19 8:10 AM, Peter Zijlstra wrote: > On Thu, May 16, 2019 at 02:21:46PM +0100, Raphael Gault wrote: >> In order to prevent the userspace processes which are trying to access >> the registers from the pmu registers on a big.LITTLE environment we >> introduce a hook to handle undefined instructions. >> >> The goal here is to prevent the process to be interrupted by a signal >> when the error is caused by the task being scheduled while accessing >> a counter, causing the counter access to be invalid. As we are not able >> to know efficiently the number of counters available physically on both >> pmu in that context we consider that any faulting access to a counter >> which is architecturally correct should not cause a SIGILL signal if >> the permissions are set accordingly. > > The other approach is using rseq for this; with that you can guarantee > it will never issue the instruction on a wrong CPU. > > That said; emulating the thing isn't horrible either. > >> +/* >> + * We put 0 in the target register if we >> + * are reading from pmu register. If we are >> + * writing, we do nothing. >> + */ > > Wait _what_ ?!? userspace can _WRITE_ to these registers? > The user can write to some pmu registers but those are not the ones that interest us here. My comment was ill formed, indeed this hook can only be triggered by reads in this case. Sorry about that. Thanks, -- Raphael Gault
[PATCH 2/6] perf: arm64: Add test to check userspace access to hardware counters.
This test relies on the fact that the PMU registers are accessible from userspace. It then uses the perf_event_mmap_page to retrieve the counter index and access the underlying register. This test uses sched_setaffinity(2) in order to run on all CPUs and thus check the behaviour of the PMU on all CPUs in a big.LITTLE environment. Signed-off-by: Raphael Gault --- tools/perf/arch/arm64/include/arch-tests.h | 6 + tools/perf/arch/arm64/tests/Build | 1 + tools/perf/arch/arm64/tests/arch-tests.c | 4 + tools/perf/arch/arm64/tests/user-events.c | 255 + 4 files changed, 266 insertions(+) create mode 100644 tools/perf/arch/arm64/tests/user-events.c diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h index 90ec4c8cb880..a9b17ae0560b 100644 --- a/tools/perf/arch/arm64/include/arch-tests.h +++ b/tools/perf/arch/arm64/include/arch-tests.h @@ -2,11 +2,17 @@ #ifndef ARCH_TESTS_H #define ARCH_TESTS_H +#define __maybe_unused __attribute__((unused)) #ifdef HAVE_DWARF_UNWIND_SUPPORT struct thread; struct perf_sample; +int test__arch_unwind_sample(struct perf_sample *sample, +struct thread *thread); #endif extern struct test arch_tests[]; +int test__rd_pmevcntr(struct test *test __maybe_unused, + int subtest __maybe_unused); + #endif diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build index a61c06bdb757..3f9a20c17fc6 100644 --- a/tools/perf/arch/arm64/tests/Build +++ b/tools/perf/arch/arm64/tests/Build @@ -1,4 +1,5 @@ perf-y += regs_load.o perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o +perf-y += user-events.o perf-y += arch-tests.o diff --git a/tools/perf/arch/arm64/tests/arch-tests.c b/tools/perf/arch/arm64/tests/arch-tests.c index 5b1543c98022..57df9b89dede 100644 --- a/tools/perf/arch/arm64/tests/arch-tests.c +++ b/tools/perf/arch/arm64/tests/arch-tests.c @@ -10,6 +10,10 @@ struct test arch_tests[] = { .func = test__dwarf_unwind, }, #endif + { + .desc = "User counter access", + .func = test__rd_pmevcntr,
+ }, { .func = NULL, }, diff --git a/tools/perf/arch/arm64/tests/user-events.c b/tools/perf/arch/arm64/tests/user-events.c new file mode 100644 index ..958e4cd000c1 --- /dev/null +++ b/tools/perf/arch/arm64/tests/user-events.c @@ -0,0 +1,255 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "perf.h" +#include "debug.h" +#include "tests/tests.h" +#include "cloexec.h" +#include "util.h" +#include "arch-tests.h" + +/* + * ARMv8 ARM reserves the following encoding for system registers: + * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview", + * C5.2, version:ARM DDI 0487A.f) + * [20-19] : Op0 + * [18-16] : Op1 + * [15-12] : CRn + * [11-8] : CRm + * [7-5] : Op2 + */ +#define Op0_shift 19 +#define Op0_mask0x3 +#define Op1_shift 16 +#define Op1_mask0x7 +#define CRn_shift 12 +#define CRn_mask0xf +#define CRm_shift 8 +#define CRm_mask0xf +#define Op2_shift 5 +#define Op2_mask0x7 + +#define __stringify(x) #x + +#define read_sysreg(r) ({ \ + u64 __val; \ + asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \ + __val; \ +}) + +#define PMEVCNTR_READ_CASE(idx)\ + case idx: \ + return read_sysreg(pmevcntr##idx##_el0) + +#define PMEVCNTR_CASES(readwrite) \ + PMEVCNTR_READ_CASE(0); \ + PMEVCNTR_READ_CASE(1); \ + PMEVCNTR_READ_CASE(2); \ + PMEVCNTR_READ_CASE(3); \ + PMEVCNTR_READ_CASE(4); \ + PMEVCNTR_READ_CASE(5); \ + PMEVCNTR_READ_CASE(6); \ + PMEVCNTR_READ_CASE(7); \ + PMEVCNTR_READ_CASE(8); \ + PMEVCNTR_READ_CASE(9); \ + PMEVCNTR_READ_CASE(10); \ + PMEVCNTR_READ_CASE(11); \ + PMEVCNTR_READ_CASE(12); \ + PMEVCNTR_READ_CASE(13); \ + PMEVCNTR_READ_CASE(14); \ + PMEVCNTR_READ_CASE(15); \ + PMEVCNTR_READ_CASE(16); \ + PMEVCNTR_READ_CASE(17); \ + PMEVCNTR_READ_CASE(18
[PATCH 1/6] perf: arm64: Compile tests unconditionally
In order to subsequently add more tests for the arm64 architecture, we
now compile the tests directory for arm64 unconditionally.

Signed-off-by: Raphael Gault
---
 tools/perf/arch/arm64/Build       | 2 +-
 tools/perf/arch/arm64/tests/Build | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/arm64/Build b/tools/perf/arch/arm64/Build
index 36222e64bbf7..a7dd46a5b678 100644
--- a/tools/perf/arch/arm64/Build
+++ b/tools/perf/arch/arm64/Build
@@ -1,2 +1,2 @@
 perf-y += util/
-perf-$(CONFIG_DWARF_UNWIND) += tests/
+perf-y += tests/

diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build
index 41707fea74b3..a61c06bdb757 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,4 @@
 perf-y += regs_load.o
-perf-y += dwarf-unwind.o
+perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o

 perf-y += arch-tests.o
--
2.17.1
[PATCH 3/6] arm64: pmu: Add function implementation to update event index in userpage.
In order to be able to access the counter directly from userspace, we
need to provide the index of the counter using the userpage. We thus
need to override the event_idx function to retrieve and convert the
perf_event index to the armv8 hardware index.

Signed-off-by: Raphael Gault
---
 arch/arm64/kernel/perf_event.c |  4 ++++
 drivers/perf/arm_pmu.c         | 10 ++++++++++
 include/linux/perf/arm_pmu.h   |  2 ++
 3 files changed, 16 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 6164d389eed6..e6316f99f66b 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -890,6 +890,8 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
 	if (armv8pmu_event_is_64bit(event))
 		event->hw.flags |= ARMPMU_EVT_64BIT;

+	event->hw.flags |= ARMPMU_EL0_RD_CNTR;
+
 	/* Only expose micro/arch events supported by this PMU */
 	if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS)
 	    && test_bit(hw_event_id, armpmu->pmceid_bitmap)) {
@@ -1188,6 +1190,8 @@ void arch_perf_update_userpage(struct perf_event *event,
 	 */
 	freq = arch_timer_get_rate();
 	userpg->cap_user_time = 1;
+	userpg->cap_user_rdpmc =
+		!!(event->hw.flags & ARMPMU_EL0_RD_CNTR);

 	clocks_calc_mult_shift(&userpg->time_mult, &userpg->time_shift, freq,
 			       NSEC_PER_SEC, 0);

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index eec75b97e7ea..3f4c2ec7ff89 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -777,6 +777,15 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
 			    &cpu_pmu->node);
 }

+
+static int armpmu_event_idx(struct perf_event *event)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+		return 0;
+
+	return event->hw.idx;
+}
+
 static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 {
 	struct arm_pmu *pmu;
@@ -803,6 +812,7 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 		.start		= armpmu_start,
 		.stop		= armpmu_stop,
 		.read		= armpmu_read,
+		.event_idx	= armpmu_event_idx,
 		.filter_match	= armpmu_filter_match,
 		.attr_groups	= pmu->attr_groups,
 	/*

diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 4641e850b204..3bef390c1069 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -30,6 +30,8 @@
  */
 /* Event uses a 64bit counter */
 #define ARMPMU_EVT_64BIT		1
+/* Allow access to hardware counter from userspace */
+#define ARMPMU_EL0_RD_CNTR		2

 #define HW_OP_UNSUPPORTED		0xFFFF
 #define C(_x)				PERF_COUNT_HW_CACHE_##_x
--
2.17.1
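The `event_idx` override above establishes the contract that userspace later relies on: a zero index published in the user page means direct access is not available for this event. A host-side model of that logic (the `hw_perf_event` struct here is a two-field stand-in for the kernel type, and the flag values are the ones the patch defines) is:

```c
#include <stdint.h>

#define ARMPMU_EVT_64BIT   1	/* values from the patch above */
#define ARMPMU_EL0_RD_CNTR 2

/* Stand-in for the two fields of the kernel's struct hw_perf_event
 * that armpmu_event_idx() consults. */
struct hw_perf_event {
	uint32_t flags;
	int idx;
};

/* Mirrors armpmu_event_idx(): publish the hardware index only when
 * EL0 reads were granted, otherwise publish 0 ("no direct access"). */
int armpmu_event_idx(const struct hw_perf_event *hw)
{
	if (!(hw->flags & ARMPMU_EL0_RD_CNTR))
		return 0;

	return hw->idx;
}
```

Userspace must therefore check both `cap_user_rdpmc` and a non-zero `index` in the mmap page before issuing an `mrs` read.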
[PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions
In order to prevent userspace processes trying to access the pmu
registers on a big.LITTLE environment from being killed, we introduce a
hook to handle undefined instructions. The goal here is to prevent the
process from being interrupted by a signal when the error is caused by
the task being scheduled while accessing a counter, which makes the
counter access invalid.

As we are not able to know efficiently, in that context, the number of
counters physically available on both pmus, we consider that any
faulting access to a counter which is architecturally correct should
not cause a SIGILL signal if the permissions are set accordingly.

Signed-off-by: Raphael Gault
---
 arch/arm64/kernel/perf_event.c | 68 ++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index e6316f99f66b..760c947b58dd 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -19,9 +19,11 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */

+#include
 #include
 #include
 #include
+#include
 #include
 #include

@@ -993,6 +995,72 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
 	return probe.present ? 0 : -ENODEV;
 }

+static bool is_evcntr(u32 sys_reg)
+{
+	u32 CRn, Op0, Op1, CRm;
+
+	CRn = sys_reg_CRn(sys_reg);
+	CRm = sys_reg_CRm(sys_reg);
+	Op0 = sys_reg_Op0(sys_reg);
+	Op1 = sys_reg_Op1(sys_reg);
+
+	return (CRn == 0xE &&
+		(CRm & 0xc) == 0x8 &&
+		Op1 == 0x3 &&
+		Op0 == 0x3);
+}
+
+static int emulate_pmu(struct pt_regs *regs, u32 insn)
+{
+	u32 sys_reg, rt;
+	u32 pmuserenr;
+
+	sys_reg = (u32)aarch64_insn_decode_immediate(AARCH64_INSN_IMM_16, insn) << 5;
+	rt = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn);
+	pmuserenr = read_sysreg(pmuserenr_el0);
+
+	if ((pmuserenr & (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) !=
+	    (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR))
+		return -EINVAL;
+
+	if (sys_reg != SYS_PMXEVCNTR_EL0 &&
+	    !is_evcntr(sys_reg))
+		return -EINVAL;
+
+	/*
+	 * We put 0 in the target register if we
+	 * are reading from a pmu register. If we are
+	 * writing, we do nothing.
+	 */
+	if ((insn & 0xfff00000) == 0xd5300000)
+		pt_regs_write_reg(regs, rt, 0);
+	else if (sys_reg != SYS_PMSELR_EL0)
+		return -EINVAL;
+
+	arm64_skip_faulting_instruction(regs, 4);
+	return 0;
+}
+
+/*
+ * This hook will only be triggered by mrs
+ * instructions on PMU registers. This is mandatory
+ * in order to have a consistent behaviour even on
+ * big.LITTLE systems.
+ */
+static struct undef_hook pmu_hook = {
+	.instr_mask = 0xffff8800,
+	.instr_val = 0xd53b8800,
+	.fn = emulate_pmu,
+};
+
+static int __init enable_pmu_emulation(void)
+{
+	register_undef_hook(&pmu_hook);
+	return 0;
+}
+
+core_initcall(enable_pmu_emulation);
+
 static int armv8_pmu_init(struct arm_pmu *cpu_pmu)
 {
 	int ret = armv8pmu_probe_pmu(cpu_pmu);
--
2.17.1
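The `is_evcntr()` check above recognises the PMEVCNTR<n>_EL0 encodings (Op0 = Op1 = 3, CRn = 0xE, CRm in 0x8..0xB) within the system-register space. A host-side sketch of that decoding, using the same field layout as the Op*/CR* defines in user-events.c (the `sys_enc()` helper is added here purely to build encodings for testing), is:

```c
#include <stdint.h>

/* Field positions from the ARMv8 system-instruction encoding,
 * matching the defines in user-events.c. */
#define Op0_shift 19
#define Op0_mask  0x3
#define Op1_shift 16
#define Op1_mask  0x7
#define CRn_shift 12
#define CRn_mask  0xf
#define CRm_shift 8
#define CRm_mask  0xf
#define Op2_shift 5
#define Op2_mask  0x7

#define SYS_FIELD(reg, f) (((reg) >> f##_shift) & f##_mask)

/* Model of is_evcntr(): PMEVCNTR<n>_EL0 has Op0 = Op1 = 3, CRn = 0xE,
 * CRm = 0b10xx (CRm[1:0] and Op2 together select the counter number). */
int is_evcntr(uint32_t sys_reg)
{
	return SYS_FIELD(sys_reg, CRn) == 0xE &&
	       (SYS_FIELD(sys_reg, CRm) & 0xc) == 0x8 &&
	       SYS_FIELD(sys_reg, Op1) == 0x3 &&
	       SYS_FIELD(sys_reg, Op0) == 0x3;
}

/* Helper (not in the patch): build an encoding from its fields. */
uint32_t sys_enc(uint32_t op0, uint32_t op1, uint32_t crn,
		 uint32_t crm, uint32_t op2)
{
	return (op0 << Op0_shift) | (op1 << Op1_shift) |
	       (crn << CRn_shift) | (crm << CRm_shift) |
	       (op2 << Op2_shift);
}
```

For example, PMEVCNTR0_EL0 (Op0=3, Op1=3, CRn=0xE, CRm=0x8, Op2=0) matches, while a register with CRn = 9 does not.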
[PATCH 5/6] arm64: perf: Enable pmu counter direct access for perf event on armv8
Keep track of events opened with direct access to the hardware counters
and modify permissions while they are open.

The strategy used here is the same as the one x86 uses: every time an
event is mapped, the permissions are set if required. The atomic field
added in the mm_context helps keep track of the different events opened
and de-activate the permissions when all are unmapped.
We also need to update the permissions in the context switch code so
that tasks keep the right permissions.

Signed-off-by: Raphael Gault
---
 arch/arm64/include/asm/mmu.h         |  6 ++++++
 arch/arm64/include/asm/mmu_context.h |  2 ++
 arch/arm64/include/asm/perf_event.h  | 14 ++++++++++++++
 drivers/perf/arm_pmu.c               | 38 ++++++++++++++++++++++++++++
 4 files changed, 60 insertions(+)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 67ef25d037ea..9de4cf0b17c7 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -29,6 +29,12 @@
 typedef struct {
 	atomic64_t	id;
+
+	/*
+	 * non-zero if userspace has direct access to hardware
+	 * counters.
+	 */
+	atomic_t	pmu_direct_access;
 	void		*vdso;
 	unsigned long	flags;
 } mm_context_t;

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 2da3e478fd8f..33494af613d8 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -32,6 +32,7 @@
 #include
 #include
 #include
+#include
 #include
 #include

@@ -235,6 +236,7 @@ static inline void __switch_mm(struct mm_struct *next)
 	}

 	check_and_switch_context(next, cpu);
+	perf_switch_user_access(next);
 }

 static inline void

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index c593761ba61c..32a6d604bb3b 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -19,6 +19,7 @@
 #include
 #include
+#include

 #define	ARMV8_PMU_MAX_COUNTERS	32
 #define	ARMV8_PMU_COUNTER_MASK	(ARMV8_PMU_MAX_COUNTERS - 1)

@@ -234,4 +235,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 	(regs)->pstate = PSR_MODE_EL1h;	\
 }

+static inline void perf_switch_user_access(struct mm_struct *mm)
+{
+	if (!IS_ENABLED(CONFIG_PERF_EVENTS))
+		return;
+
+	if (atomic_read(&mm->context.pmu_direct_access)) {
+		write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR,
+			     pmuserenr_el0);
+	} else {
+		write_sysreg(0, pmuserenr_el0);
+	}
+}
+
 #endif

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 3f4c2ec7ff89..45a64f942864 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -24,6 +24,7 @@
 #include
 #include
+#include

 static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
 static DEFINE_PER_CPU(int, cpu_irq);

@@ -786,6 +787,41 @@ static int armpmu_event_idx(struct perf_event *event)
 	return event->hw.idx;
 }

+static void refresh_pmuserenr(void *mm)
+{
+	perf_switch_user_access(mm);
+}
+
+static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+		return;
+
+	/*
+	 * This function relies on not being called concurrently in two
+	 * tasks in the same mm.  Otherwise one task could observe
+	 * pmu_direct_access > 1 and return all the way back to
+	 * userspace with user access disabled while another task is still
+	 * doing on_each_cpu_mask() to enable user access.
+	 *
+	 * For now, this can't happen because all callers hold mmap_sem
+	 * for write.  If this changes, we'll need a different solution.
+	 */
+	lockdep_assert_held_exclusive(&mm->mmap_sem);
+
+	if (atomic_inc_return(&mm->context.pmu_direct_access) == 1)
+		on_each_cpu(refresh_pmuserenr, mm, 1);
+}
+
+static void armpmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+		return;
+
+	if (atomic_dec_and_test(&mm->context.pmu_direct_access))
+		on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1);
+}
+
 static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 {
 	struct arm_pmu *pmu;
@@ -807,6 +843,8 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 		.pmu_enable	= armpmu_enable,
 		.pmu_disable	= armpmu_disable,
 		.event_init	= armpmu_event_init,
+		.event_mapped	= armpmu_event_mapped,
+		.event_unmapped	= armpmu_event_unmapped,
 		.add		= armpmu_add,
 		.del		= armpmu_del,
 		.start		= armpmu_start,
--
2.17.1
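The mapped/unmapped hooks above implement a per-mm refcount: the first mapped event enables EL0 counter access, the last unmapped one revokes it. A host-side model of just that refcount transition (using C11 atomics; `user_access_enabled` stands in for the `pmuserenr_el0` write broadcast by `on_each_cpu()`, and the mmap_sem serialisation is assumed rather than modelled) is:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-ins: the refcount lives in mm_context_t in the patch, and the
 * bool models whether pmuserenr_el0 grants EL0 reads. */
atomic_int pmu_direct_access;
bool user_access_enabled;

/* Mirrors armpmu_event_mapped(): only the 0 -> 1 transition changes
 * the hardware state (broadcast with on_each_cpu() in the real code). */
void event_mapped(void)
{
	if (atomic_fetch_add(&pmu_direct_access, 1) + 1 == 1)
		user_access_enabled = true;
}

/* Mirrors armpmu_event_unmapped(): only the 1 -> 0 transition revokes
 * EL0 access again. */
void event_unmapped(void)
{
	if (atomic_fetch_sub(&pmu_direct_access, 1) - 1 == 0)
		user_access_enabled = false;
}
```

The design choice matches x86's rdpmc handling: intermediate map/unmap calls leave the permission bit untouched, so nested mappings within one mm stay cheap.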
[PATCH 6/6] Documentation: arm64: Document PMU counters access from userspace
Add a documentation file to describe the access to the pmu hardware
counters from userspace.

Signed-off-by: Raphael Gault
---
 .../arm64/pmu_counter_user_access.txt | 42 +++
 1 file changed, 42 insertions(+)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt

diff --git a/Documentation/arm64/pmu_counter_user_access.txt b/Documentation/arm64/pmu_counter_user_access.txt
new file mode 100644
index ..bccf5edbf7f5
--- /dev/null
+++ b/Documentation/arm64/pmu_counter_user_access.txt
@@ -0,0 +1,42 @@
+Access to PMU hardware counters from userspace
+==============================================
+
+Overview
+--------
+The perf user-space tool relies on the PMU to monitor events. It offers
+an abstraction layer over the hardware counters since the underlying
+implementation is cpu-dependent.
+Arm64 allows userspace tools to access the registers storing the
+hardware counters' values directly.
+
+This targets specifically self-monitoring tasks in order to reduce the
+overhead of directly accessing the registers without having to go
+through the kernel.
+
+How-to
+------
+The focus is set on the armv8 pmuv3, which makes sure that the access to
+the pmu registers is enabled and that the userspace has access to the
+relevant information in order to use them.
+
+In order to have access to the hardware counters it is necessary to open
+the event using the perf tool interface: the sys_perf_event_open syscall
+returns a fd which can subsequently be used with the mmap syscall in
+order to retrieve a page of memory containing information about the
+event. The PMU driver uses this page to expose the hardware counter's
+index to the user. Using this index enables the user to access the PMU
+registers using the `mrs` instruction.
+
+Have a look at `tools/perf/arch/arm64/tests/user-events.c` for an
+example. It can be run using the perf tool to check that the access to
+the registers works correctly from userspace:
+
+./perf test -v
+
+About chained events
+--------------------
+When the user requests an event to be counted on 64 bits, two hardware
+counters are used and need to be combined to retrieve the correct value:
+
+val = read_counter(idx);
+if ((event.attr.config1 & 0x1))
+	val = (val << 32) | read_counter(idx - 1);
--
2.17.1
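The chained-event pseudo-code from the documentation above can be written as a plain, testable function. This is a host-side sketch only: `read_counter` is injected instead of issuing `mrs`, and `sample_counter` is a hypothetical reader returning made-up values for demonstration. Note that a real reader must also wrap this in the mmap-page seqlock retry loop, omitted here.

```c
#include <stdint.h>

/* When attr.config1 bit 0 marks a chained (64-bit) event, counter idx
 * holds the high half and counter idx - 1 the low half. */
uint64_t read_chained(uint64_t config1,
		      uint64_t (*read_counter)(int idx), int idx)
{
	uint64_t val = read_counter(idx);

	if (config1 & 0x1)
		val = (val << 32) | read_counter(idx - 1);

	return val;
}

/* Hypothetical reader standing in for the mrs-based counter read. */
uint64_t sample_counter(int idx)
{
	return (uint64_t)idx + 1;
}
```

With `sample_counter`, a chained read at index 3 combines the values from counters 3 (high half) and 2 (low half).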