[tip: objtool/core] objtool: Refactor jump table code to support other architectures

2020-09-10 Thread tip-bot2 for Raphael Gault
The following commit has been merged into the objtool/core branch of tip:

Commit-ID: d871f7b5a6a2a30f4eba577fd56941fa3657e394
Gitweb:        https://git.kernel.org/tip/d871f7b5a6a2a30f4eba577fd56941fa3657e394
Author:        Raphael Gault 
AuthorDate:    Fri, 04 Sep 2020 16:30:24 +01:00
Committer:     Josh Poimboeuf 
CommitterDate: Thu, 10 Sep 2020 10:43:13 -05:00

objtool: Refactor jump table code to support other architectures

The way to identify jump tables and retrieve all the data necessary to
handle the different execution branches is not the same on all
architectures.  In order to be able to add other architecture support,
define an arch-dependent function to process jump-tables.

Reviewed-by: Miroslav Benes 
Signed-off-by: Raphael Gault 
[J.T.: Move arm64 bits out of this patch,
   Have only one function to find the start of the jump table,
   for now assume that the jump table format will be the same as
   x86]
Signed-off-by: Julien Thierry 
Signed-off-by: Josh Poimboeuf 
---
 tools/objtool/arch/x86/special.c | 95 +++-
 tools/objtool/check.c            | 90 +-
 tools/objtool/check.h            |  1 +-
 tools/objtool/special.h          |  4 +-
 4 files changed, 103 insertions(+), 87 deletions(-)

diff --git a/tools/objtool/arch/x86/special.c b/tools/objtool/arch/x86/special.c
index 34e0e16..fd4af88 100644
--- a/tools/objtool/arch/x86/special.c
+++ b/tools/objtool/arch/x86/special.c
@@ -1,4 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
+#include 
+
 #include "../../special.h"
 #include "../../builtin.h"
 
@@ -48,3 +50,96 @@ bool arch_support_alt_relocation(struct special_alt *special_alt,
return insn->offset == special_alt->new_off &&
   (insn->type == INSN_CALL || is_static_jump(insn));
 }
+
+/*
+ * There are 3 basic jump table patterns:
+ *
+ * 1. jmpq *[rodata addr](,%reg,8)
+ *
+ *This is the most common case by far.  It jumps to an address in a simple
+ *jump table which is stored in .rodata.
+ *
+ * 2. jmpq *[rodata addr](%rip)
+ *
+ *This is caused by a rare GCC quirk, currently only seen in three driver
+ *functions in the kernel, only with certain obscure non-distro configs.
+ *
+ *As part of an optimization, GCC makes a copy of an existing switch jump
+ *table, modifies it, and then hard-codes the jump (albeit with an indirect
+ *jump) to use a single entry in the table.  The rest of the jump table and
+ *some of its jump targets remain as dead code.
+ *
+ *In such a case we can just crudely ignore all unreachable instruction
+ *warnings for the entire object file.  Ideally we would just ignore them
+ *for the function, but that would require redesigning the code quite a
+ *bit.  And honestly that's just not worth doing: unreachable instruction
+ *warnings are of questionable value anyway, and this is such a rare issue.
+ *
+ * 3. mov [rodata addr],%reg1
+ *... some instructions ...
+ *jmpq *(%reg1,%reg2,8)
+ *
+ *This is a fairly uncommon pattern which is new for GCC 6.  As of this
+ *writing, there are 11 occurrences of it in the allmodconfig kernel.
+ *
+ *As of GCC 7 there are quite a few more of these and the 'in between' code
+ *is significant. Esp. with KASAN enabled some of the code between the mov
+ *and jmpq uses .rodata itself, which can confuse things.
+ *
+ *TODO: Once we have DWARF CFI and smarter instruction decoding logic,
+ *ensure the same register is used in the mov and jump instructions.
+ *
+ *NOTE: RETPOLINE made it harder still to decode dynamic jumps.
+ */
+struct reloc *arch_find_switch_table(struct objtool_file *file,
+   struct instruction *insn)
+{
+   struct reloc  *text_reloc, *rodata_reloc;
+   struct section *table_sec;
+   unsigned long table_offset;
+
+   /* look for a relocation which references .rodata */
+   text_reloc = find_reloc_by_dest_range(file->elf, insn->sec,
+ insn->offset, insn->len);
+   if (!text_reloc || text_reloc->sym->type != STT_SECTION ||
+   !text_reloc->sym->sec->rodata)
+   return NULL;
+
+   table_offset = text_reloc->addend;
+   table_sec = text_reloc->sym->sec;
+
+   if (text_reloc->type == R_X86_64_PC32)
+   table_offset += 4;
+
+   /*
+* Make sure the .rodata address isn't associated with a
+* symbol.  GCC jump tables are anonymous data.
+*
+* Also support C jump tables which are in the same format as
+* switch jump tables.  For objtool to recognize them, they
+* need to be placed in the C_JUMP_TABLE_SECTION section.  They
+* have symbols associated with them.
+*/
+   if (find_symbol_containing(table_sec, table_offset) &&
+ 
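To make the pattern-1 case concrete, here is a minimal sketch (illustrative only, not taken from the patch) of a dense C switch and the shape of the code GCC typically emits for it:

    /* A dense switch that GCC typically lowers to a .rodata jump table. */
    int dispatch(int op)
    {
    	switch (op) {
    	case 0: return 10;
    	case 1: return 11;
    	case 2: return 12;
    	case 3: return 13;
    	case 4: return 14;
    	default: return -1;
    	}
    }

    /*
     * Plausible x86-64 lowering (pattern 1 above):
     *
     *     cmpl  $4, %edi
     *     ja    .Ldefault
     *     movl  %edi, %eax
     *     jmpq  *.Ltable(,%rax,8)    # anonymous jump table in .rodata
     *
     * .Ltable carries no symbol of its own, so the STT_SECTION relocation
     * into .rodata on the jmpq operand is what arch_find_switch_table()
     * keys off, matching the checks visible in the function above.
     */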

Re: [RFC v4 00/18] objtool: Add support for arm64

2019-08-23 Thread Raphael Gault

Hi Josh,

On 8/22/19 8:56 PM, Josh Poimboeuf wrote:

On Fri, Aug 16, 2019 at 01:23:45PM +0100, Raphael Gault wrote:

Hi,

Changes since RFC V3:
* Rebased on tip/master: Switch/jump table had been refactored
* Take Catalin Marinas comments into account regarding the asm macro for
   marking exceptions.

As of now, objtool only supports the x86_64 architecture but the
groundwork has already been done in order to add support for other
architectures without too much effort.

This series of patches adds support for the arm64 architecture
based on the Armv8.5 Architecture Reference Manual.

Objtool will be a valuable tool to progress and provide more guarantees
on live patching, which is a work in progress for arm64.

Once we have the base of objtool working, the next step will be to
port Peter Z's uaccess validation to arm64.


Hi Raphael,

Sorry about the long delay.  I have some comments coming shortly.

One general comment: I noticed that several of the (mostly minor)
suggested changes I made for v1 haven't been fixed.

I'll try to suggest them again here for v4, so you don't need to go back
and find them.  But in the future please try to incorporate all the
comments from previous patch sets before posting new versions.  I'm sure
it wasn't intentional, as you did acknowledge and agree to most of the
changes.  But it does waste people's time and goodwill if you neglect to
incorporate their suggestions.  Thanks.



Indeed, sorry about that.

Thanks for your comments, I will do my best to address them shortly.
However, I won't have access to my professional emails for a little
while and probably won't be able to work on this before at least a week.
I'll try to have a new version out soon though, sent from my personal email.


Thanks,

--
Raphael Gault


[PATCH v4 1/7] perf: arm64: Add test to check userspace access to hardware counters.

2019-08-22 Thread Raphael Gault
This test relies on the fact that the PMU registers are accessible
from userspace. It then uses the perf_event_mmap_page to retrieve
the counter index and access the underlying register.

This test uses sched_setaffinity(2) in order to run on all CPUs and thus
check the behaviour of the PMUs of all CPUs in a big.LITTLE environment.

Signed-off-by: Raphael Gault 
---
 tools/perf/arch/arm64/include/arch-tests.h |   7 +
 tools/perf/arch/arm64/tests/Build  |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c   |   4 +
 tools/perf/arch/arm64/tests/user-events.c  | 254 +
 4 files changed, 266 insertions(+)
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h
index 90ec4c8cb880..6a8483de1015 100644
--- a/tools/perf/arch/arm64/include/arch-tests.h
+++ b/tools/perf/arch/arm64/include/arch-tests.h
@@ -2,11 +2,18 @@
 #ifndef ARCH_TESTS_H
 #define ARCH_TESTS_H
 
+#include 
+
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 struct thread;
 struct perf_sample;
+int test__arch_unwind_sample(struct perf_sample *sample,
+struct thread *thread);
 #endif
 
 extern struct test arch_tests[];
+int test__rd_pmevcntr(struct test *test __maybe_unused,
+ int subtest __maybe_unused);
+
 
 #endif
diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build
index a61c06bdb757..3f9a20c17fc6 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,5 @@
 perf-y += regs_load.o
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
+perf-y += user-events.o
 perf-y += arch-tests.o
diff --git a/tools/perf/arch/arm64/tests/arch-tests.c b/tools/perf/arch/arm64/tests/arch-tests.c
index 5b1543c98022..57df9b89dede 100644
--- a/tools/perf/arch/arm64/tests/arch-tests.c
+++ b/tools/perf/arch/arm64/tests/arch-tests.c
@@ -10,6 +10,10 @@ struct test arch_tests[] = {
.func = test__dwarf_unwind,
},
 #endif
+   {
+   .desc = "User counter access",
+   .func = test__rd_pmevcntr,
+   },
{
.func = NULL,
},
diff --git a/tools/perf/arch/arm64/tests/user-events.c b/tools/perf/arch/arm64/tests/user-events.c
new file mode 100644
index ..b048d7e392bc
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/user-events.c
@@ -0,0 +1,254 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "perf.h"
+#include "debug.h"
+#include "tests/tests.h"
+#include "cloexec.h"
+#include "util.h"
+#include "arch-tests.h"
+
+/*
+ * ARMv8 ARM reserves the following encoding for system registers:
+ * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview",
+ *  C5.2, version:ARM DDI 0487A.f)
+ *  [20-19] : Op0
+ *  [18-16] : Op1
+ *  [15-12] : CRn
+ *  [11-8]  : CRm
+ *  [7-5]   : Op2
+ */
+#define Op0_shift   19
+#define Op0_mask0x3
+#define Op1_shift   16
+#define Op1_mask0x7
+#define CRn_shift   12
+#define CRn_mask0xf
+#define CRm_shift   8
+#define CRm_mask0xf
+#define Op2_shift   5
+#define Op2_mask0x7
+
+#define __stringify(x) #x
+
+#define read_sysreg(r) ({  \
+   u64 __val;  \
+   asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \
+   __val;  \
+})
+
+#define PMEVCNTR_READ_CASE(idx)\
+   case idx:   \
+   return read_sysreg(pmevcntr##idx##_el0)
+
+#define PMEVCNTR_CASES(readwrite)  \
+   PMEVCNTR_READ_CASE(0);  \
+   PMEVCNTR_READ_CASE(1);  \
+   PMEVCNTR_READ_CASE(2);  \
+   PMEVCNTR_READ_CASE(3);  \
+   PMEVCNTR_READ_CASE(4);  \
+   PMEVCNTR_READ_CASE(5);  \
+   PMEVCNTR_READ_CASE(6);  \
+   PMEVCNTR_READ_CASE(7);  \
+   PMEVCNTR_READ_CASE(8);  \
+   PMEVCNTR_READ_CASE(9);  \
+   PMEVCNTR_READ_CASE(10); \
+   PMEVCNTR_READ_CASE(11); \
+   PMEVCNTR_READ_CASE(12); \
+   PMEVCNTR_READ_CASE(13); \
+   PMEVCNTR_READ_CASE(14); \
+   PMEVCNTR_READ_CASE(15); \
+   PMEVCNTR_READ_CASE(16); \
+   PMEVCNTR_READ_CASE(17); \
+   PMEVCNTR_READ_CASE(18); \
+   PMEVCNTR_READ_CASE(19
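Per the commit message, the read side of this test follows the self-monitoring scheme documented in the perf_event_mmap_page comments. A minimal sketch of that loop, with illustrative helper names rather than the patch's own:

    #include <linux/types.h>

    /* Sketch only: read_pmu_counter() stands in for the mrs-based switch
     * built by the PMEVCNTR_CASES macro above. */
    extern u64 read_pmu_counter(u32 idx);

    static u64 mmap_read_self(struct perf_event_mmap_page *pc)
    {
    	u32 seq, idx;
    	u64 count;

    	do {
    		seq = READ_ONCE(pc->lock);	/* seqcount: odd means writer active */
    		barrier();

    		count = READ_ONCE(pc->offset);
    		idx = READ_ONCE(pc->index);	/* 0: no direct access granted */
    		if (idx)
    			count += read_pmu_counter(idx - 1); /* index is hw idx + 1 */

    		barrier();
    	} while (READ_ONCE(pc->lock) != seq);	/* raced an update: retry */

    	return count;
    }

If the task migrates between reading pc->index and issuing the mrs, the sequence count changes and the loop retries; keeping that stray access non-fatal is exactly what the undefined-instruction hook in patch 4/7 is for.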

[PATCH v4 4/7] arm64: pmu: Add hook to handle pmu-related undefined instructions

2019-08-22 Thread Raphael Gault
This patch introduces a protection for userspace processes trying to
access the PMU registers on a big.LITTLE environment. It introduces
a hook to handle undefined instructions.

The goal here is to prevent the process from being interrupted by a
signal when the error is caused by the task being scheduled while
accessing a counter, causing the counter access to be invalid. As we
cannot efficiently know, in that context, the number of counters
physically available on both PMUs, we consider that any faulting access
to a counter which is architecturally correct should not cause a SIGILL
signal if the permissions are set accordingly.

This commit also modifies the mask of the mrs_hook declared in
arch/arm64/kernel/cpufeature.c, which emulates only feature-register
access. This is necessary because this hook's mask was too broad and
thus matched any mrs instruction, even those unrelated to the emulated
registers, which made the pmu emulation inefficient.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/cpufeature.c |  4 +--
 arch/arm64/kernel/perf_event.c | 55 ++
 2 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 07be444c1e31..3a6285d0b2c0 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2186,8 +2186,8 @@ static int emulate_mrs(struct pt_regs *regs, u32 insn)
 }
 
 static struct undef_hook mrs_hook = {
-   .instr_mask = 0xfff00000,
-   .instr_val  = 0xd5300000,
+   .instr_mask = 0xffff0000,
+   .instr_val  = 0xd5380000,
.pstate_mask = PSR_AA32_MODE_MASK,
.pstate_val = PSR_MODE_EL0t,
.fn = emulate_mrs,
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index a0b4f1bca491..64ca09c9ea65 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -8,9 +8,11 @@
  * This code is based heavily on the ARMv7 perf event code.
  */
 
+#include 
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -1012,6 +1014,59 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
return probe.present ? 0 : -ENODEV;
 }
 
+static int emulate_pmu(struct pt_regs *regs, u32 insn)
+{
+   u32 sys_reg, rt;
+   u32 pmuserenr;
+
+   sys_reg = (u32)aarch64_insn_decode_immediate(AARCH64_INSN_IMM_16, insn) << 5;
+   rt = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn);
+   pmuserenr = read_sysreg(pmuserenr_el0);
+
+   if ((pmuserenr & (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) !=
+   (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR))
+   return -EINVAL;
+
+
+   /*
+* Userspace is expected to only use this in the context of the scheme
+* described in the struct perf_event_mmap_page comments.
+*
+* Given that context, we can only get here if we got migrated between
+* getting the register index and doing the MSR read.  This in turn
+* implies we'll fail the sequence and retry, so any value returned is
+* 'good', all we need is to be non-fatal.
+*
+* The choice of the value 0 comes from the fact that when
+* accessing a register which is accessible but not counting events,
+* we get 0.
+*/
+   pt_regs_write_reg(regs, rt, 0);
+
+   arm64_skip_faulting_instruction(regs, 4);
+   return 0;
+}
+
+/*
+ * This hook will only be triggered by mrs
+ * instructions on PMU registers. This is mandatory
+ * in order to have a consistent behaviour even on
+ * big.LITTLE systems.
+ */
+static struct undef_hook pmu_hook = {
+   .instr_mask = 0xffff8800,
+   .instr_val  = 0xd53b8800,
+   .fn = emulate_pmu,
+};
+
+static int __init enable_pmu_emulation(void)
+{
+   register_undef_hook(&pmu_hook);
+   return 0;
+}
+
+core_initcall(enable_pmu_emulation);
+
 static int armv8_pmu_init(struct arm_pmu *cpu_pmu)
 {
int ret = armv8pmu_probe_pmu(cpu_pmu);
-- 
2.17.1
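For reference, the mask/value pair in pmu_hook is easier to read against the system-register field layout quoted in the perf test earlier in this series (Op0 at bits [20:19], Op1 [18:16], CRn [15:12], CRm [11:8], Op2 [7:5]). A small illustrative composition, not part of the patch:

    /*
     * Illustrative helper: compose the sysreg fields at the bit positions
     * quoted in the perf test's comment.  OR'd into the 0xd5300000 mrs
     * template (plus the Rt field), this gives the full encoding.
     */
    #define SYS_FIELD(op0, op1, crn, crm, op2)		\
    	(((op0) << 19) | ((op1) << 16) | ((crn) << 12) |	\
    	 ((crm) << 8) | ((op2) << 5))

    /* PMU registers live under op0=3, op1=3, for example: */
    #define MRS_PMEVCNTR0_EL0 (0xd5300000 | SYS_FIELD(3, 3, 14,  8, 0)) /* 0xd53be800 | Rt */
    #define MRS_PMCCNTR_EL0   (0xd5300000 | SYS_FIELD(3, 3,  9, 13, 0)) /* 0xd53b9d00 | Rt */

On the bits the hook compares, both encodings match .instr_val = 0xd53b8800, which is how a single undef_hook can cover the event counters as well as the cycle counter.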



[PATCH v4 6/7] arm64: perf: Enable pmu counter direct access for perf event on armv8

2019-08-22 Thread Raphael Gault
Keep track of events opened with direct access to the hardware counters
and modify permissions while they are open.

The strategy used here is the same one x86 uses: every time an event
is mapped, the permissions are set if required. The atomic field added
to the mm_context helps keep track of the different events opened and
deactivates the permissions when all are unmapped.
We also need to update the permissions in the context switch code so
that tasks keep the right permissions.

Signed-off-by: Raphael Gault 
---
 arch/arm64/include/asm/mmu.h |  6 
 arch/arm64/include/asm/mmu_context.h |  2 ++
 arch/arm64/include/asm/perf_event.h  | 14 
 arch/arm64/kernel/perf_event.c   |  1 +
 drivers/perf/arm_pmu.c   | 54 
 5 files changed, 77 insertions(+)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index fd6161336653..88ed4466bd06 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -18,6 +18,12 @@
 
 typedef struct {
atomic64_t  id;
+
+   /*
+* non-zero if userspace has access to hardware
+* counters directly.
+*/
+   atomic_tpmu_direct_access;
void*vdso;
unsigned long   flags;
 } mm_context_t;
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 7ed0adb187a8..6e66ff940494 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -224,6 +225,7 @@ static inline void __switch_mm(struct mm_struct *next)
}
 
check_and_switch_context(next, cpu);
+   perf_switch_user_access(next);
 }
 
 static inline void
diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 2bdbc79bbd01..ba58fa726631 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -8,6 +8,7 @@
 
 #include 
 #include 
+#include 
 
 #defineARMV8_PMU_MAX_COUNTERS  32
 #defineARMV8_PMU_COUNTER_MASK  (ARMV8_PMU_MAX_COUNTERS - 1)
@@ -223,4 +224,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
(regs)->pstate = PSR_MODE_EL1h; \
 }
 
+static inline void perf_switch_user_access(struct mm_struct *mm)
+{
+   if (!IS_ENABLED(CONFIG_PERF_EVENTS))
+   return;
+
+   if (atomic_read(&mm->context.pmu_direct_access)) {
+   write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR,
+pmuserenr_el0);
+   } else {
+   write_sysreg(0, pmuserenr_el0);
+   }
+}
+
 #endif
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index de9b001e8b7c..7de56f22d038 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1285,6 +1285,7 @@ void arch_perf_update_userpage(struct perf_event *event,
 */
freq = arch_timer_get_rate();
userpg->cap_user_time = 1;
+   userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR);
 
	clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
			NSEC_PER_SEC, 0);
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 2d06b8095a19..d0d3e523a4c4 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -25,6 +25,7 @@
 #include 
 
 #include 
+#include 
 
 static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
 static DEFINE_PER_CPU(int, cpu_irq);
@@ -778,6 +779,57 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
&cpu_pmu->node);
 }
 
+static void refresh_pmuserenr(void *mm)
+{
+   perf_switch_user_access(mm);
+}
+
+static int check_homogeneous_cap(struct perf_event *event, struct mm_struct *mm)
+{
+   pr_info("checking HAS_HOMOGENEOUS_PMU");
+   if (!cpus_have_cap(ARM64_HAS_HOMOGENEOUS_PMU)) {
+   pr_info("Disable direct access (!HAS_HOMOGENEOUS_PMU)");
+   atomic_set(&mm->context.pmu_direct_access, 0);
+   on_each_cpu(refresh_pmuserenr, mm, 1);
+   event->hw.flags &= ~ARMPMU_EL0_RD_CNTR;
+   return 0;
+   }
+
+   return 1;
+}
+
+static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return;
+
+   /*
+* This function relies on not being called concurrently in two
+* tasks in the same mm.  Otherwise one task could observe
+* pmu_direct_access > 1 and return all the way back to
+* userspace with user access disabled while another task is still
+* doing on_each_cpu_mask() to enable user access.
+*
+* For now, this can't happen because all callers hold mmap_sem
+* for write.  If this changes, we'll need a different solution.
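The hunk is truncated at this point; the v3 posting of the same hook later in this digest shows how the pair continues (an atomic_inc_return() on first map, a matching decrement and disable in armpmu_event_unmapped()). The userspace lifecycle that drives these hooks looks roughly like this (an illustrative sketch with error handling omitted, not part of the patch):

    #include <sys/mman.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <linux/perf_event.h>

    static void self_monitor(struct perf_event_attr *attr)
    {
    	long psz = sysconf(_SC_PAGESIZE);
    	int fd = syscall(__NR_perf_event_open, attr, 0, -1, -1, 0);

    	/* event_mapped(): the first mapping in this mm bumps
    	 * pmu_direct_access and enables EL0 counter access
    	 * (pmuserenr_el0 ER|CR) on every CPU. */
    	void *pg = mmap(NULL, psz, PROT_READ, MAP_SHARED, fd, 0);

    	/* ... read counters directly through the mapped page ... */

    	/* event_unmapped(): the last unmap in this mm drops the
    	 * refcount to zero and disables EL0 access again. */
    	munmap(pg, psz);
    	close(fd);
    }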

[PATCH v4 7/7] Documentation: arm64: Document PMU counters access from userspace

2019-08-22 Thread Raphael Gault
Add a documentation file to describe the access to the pmu hardware
counters from userspace.

Signed-off-by: Raphael Gault 
---
 .../arm64/pmu_counter_user_access.txt | 42 +++
 1 file changed, 42 insertions(+)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt

diff --git a/Documentation/arm64/pmu_counter_user_access.txt b/Documentation/arm64/pmu_counter_user_access.txt
new file mode 100644
index ..6788b1107381
--- /dev/null
+++ b/Documentation/arm64/pmu_counter_user_access.txt
@@ -0,0 +1,42 @@
+Access to PMU hardware counter from userspace
+=============================================
+
+Overview
+
+The perf user-space tool relies on the PMU to monitor events. It offers an
+abstraction layer over the hardware counters since the underlying
+implementation is cpu-dependent.
+Arm64 allows userspace tools to have access to the registers storing the
+hardware counters' values directly.
+
+This targets specifically self-monitoring tasks in order to reduce the overhead
+by directly accessing the registers without having to go through the kernel.
+
+How-to
+--
+The focus is set on the armv8 pmuv3, which makes sure that access to the pmu
+registers is enabled and that userspace has access to the relevant
+information in order to use them.
+
+In order to have access to the hardware counter it is necessary to open the
+event using the perf tool interface: the sys_perf_event_open syscall returns
+a fd which can subsequently be used with the mmap syscall in order to
+retrieve a page of memory containing information about the event.
+The PMU driver uses this page to expose to the user the hardware counter's
+index. Using this index enables the user to access the PMU registers using the
+`mrs` instruction.
+
+Have a look at `tools/perf/arch/arm64/tests/user-events.c` for an example. It
+can be run using the perf tool to check that the access to the registers works
+correctly from userspace:
+
+./perf test -v
+
+About chained events
+
+When the user requests an event to be counted on 64 bits, two hardware
+counters are used and need to be combined to retrieve the correct value:
+
+val = read_counter(idx);
+if ((event.attr.config1 & 0x1))
+   val = (val << 32) | read_counter(idx - 1);
-- 
2.17.1
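For completeness, a compilable rendering of the chained read quoted above, with read_counter() standing in for the mrs-based accessor (illustrative only):

    #include <stdint.h>

    extern uint64_t read_counter(int idx);	/* stand-in for the mrs path */

    static uint64_t read_chained_event(uint64_t config1, int idx)
    {
    	uint64_t val = read_counter(idx);	/* high half lives at idx */

    	if (config1 & 0x1)			/* event opened as 64-bit */
    		val = (val << 32) | read_counter(idx - 1);

    	return val;
    }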



[PATCH v4 2/7] arm64: cpu: Add accessor for boot_cpu_data

2019-08-22 Thread Raphael Gault
Mark boot_cpu_data as read-only after initialization.
Define an accessor to read boot_cpu_data from outside of its
compilation unit.

Signed-off-by: Raphael Gault 
---
 arch/arm64/include/asm/cpu.h | 2 +-
 arch/arm64/kernel/cpuinfo.c  | 7 ++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index d72d995b7e25..6abc2faf1a64 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -62,5 +62,5 @@ void __init cpuinfo_store_boot_cpu(void);
 void __init init_cpu_features(struct cpuinfo_arm64 *info);
 void update_cpu_features(int cpu, struct cpuinfo_arm64 *info,
 struct cpuinfo_arm64 *boot);
-
+struct cpuinfo_arm64 *get_boot_cpu_data(void);
 #endif /* __ASM_CPU_H */
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 876055e37352..ffa00b3a148b 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -31,7 +31,7 @@
  * values depending on configuration at or after reset.
  */
 DEFINE_PER_CPU(struct cpuinfo_arm64, cpu_data);
-static struct cpuinfo_arm64 boot_cpu_data;
+static struct cpuinfo_arm64 boot_cpu_data __ro_after_init;
 
 static char *icache_policy_str[] = {
[0 ... ICACHE_POLICY_PIPT]  = "RESERVED/UNKNOWN",
@@ -395,4 +395,9 @@ void __init cpuinfo_store_boot_cpu(void)
	init_cpu_features(&boot_cpu_data);
 }
 
+struct cpuinfo_arm64 *get_boot_cpu_data(void)
+{
+   return &boot_cpu_data;
+}
+
 device_initcall(cpuinfo_regs_init);
-- 
2.17.1



[PATCH v4 3/7] arm64: cpufeature: Add feature to detect homogeneous systems

2019-08-22 Thread Raphael Gault
This feature is required in order to enable direct access to the PMU
counters from userspace only when the system is homogeneous.
This feature checks the model of each CPU brought online and compares it
to the boot CPU. If it differs then the system is heterogeneous.

This CPU feature doesn't prevent different models of CPUs from being
hotplugged in; however, if such a scenario happens, it will turn off the
feature. The feature cannot be turned back on by hotplugging CPUs off,
though.

Signed-off-by: Raphael Gault 
---
 arch/arm64/include/asm/cpucaps.h|  3 ++-
 arch/arm64/include/asm/cpufeature.h | 10 ++
 arch/arm64/kernel/cpufeature.c  | 28 
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index f19fe4b9acc4..1cd73cf46116 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -52,7 +52,8 @@
 #define ARM64_HAS_IRQ_PRIO_MASKING 42
 #define ARM64_HAS_DCPODP   43
 #define ARM64_WORKAROUND_1463225   44
+#define ARM64_HAS_HOMOGENEOUS_PMU  45
 
-#define ARM64_NCAPS45
+#define ARM64_NCAPS46
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 407e2bf23676..c54a87896bbd 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -430,6 +430,16 @@ static inline void cpus_set_cap(unsigned int num)
}
 }
 
+static inline void cpus_unset_cap(unsigned int num)
+{
+   if (num >= ARM64_NCAPS) {
+   pr_warn("Attempt to unset an illegal CPU capability (%d >= %d)\n",
+   num, ARM64_NCAPS);
+   } else {
+   clear_bit(num, cpu_hwcaps);
+   }
+}
+
 static inline int __attribute_const__
 cpuid_feature_extract_signed_field_width(u64 features, int field, int width)
 {
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index f29f36a65175..07be444c1e31 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1248,6 +1248,23 @@ static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry,
 }
 #endif
 
+static bool has_homogeneous_pmu(const struct arm64_cpu_capabilities *entry,
+ int scope)
+{
+   u32 model = read_cpuid_id() & MIDR_CPU_MODEL_MASK;
+   struct cpuinfo_arm64 *boot = get_boot_cpu_data();
+
+   return  (boot->reg_midr & MIDR_CPU_MODEL_MASK) == model;
+}
+
+static void disable_homogeneous_cap(const struct arm64_cpu_capabilities *entry)
+{
+   if (!has_homogeneous_pmu(entry, entry->type)) {
+   pr_info("Disabling Homogeneous PMU (%d)", entry->capability);
+   cpus_unset_cap(entry->capability);
+   }
+}
+
 static const struct arm64_cpu_capabilities arm64_features[] = {
{
.desc = "GIC system register CPU interface",
@@ -1548,6 +1565,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.min_field_value = 1,
},
 #endif
+   {
+   /*
+* Detect whether the system is heterogeneous or
+* homogeneous
+*/
+   .desc = "Homogeneous CPUs",
+   .capability = ARM64_HAS_HOMOGENEOUS_PMU,
+   .type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
+   .matches = has_homogeneous_pmu,
+   .cpu_enable = disable_homogeneous_cap,
+   },
{},
 };
 
-- 
2.17.1



[PATCH v4 5/7] arm64: pmu: Add function implementation to update event index in userpage.

2019-08-22 Thread Raphael Gault
In order to be able to access the counter directly from userspace,
we need to provide the index of the counter using the userpage.
We thus need to override the event_idx function to retrieve and
convert the perf_event index to the armv8 hardware index.

Since the arm_pmu driver can be used by any implementation, even
if not armv8, two components play a role in making sure the
behaviour is correct and consistent with the PMU capabilities:

* the ARMPMU_EL0_RD_CNTR flag which denotes the capability to access
counters from userspace.
* the event_idx callback, which is implemented and initialized by
the PMU implementation: if no callback is provided, the default
behaviour applies, returning 0 as the index value.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/perf_event.c | 21 +
 include/linux/perf/arm_pmu.h   |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 64ca09c9ea65..de9b001e8b7c 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -820,6 +820,22 @@ static void armv8pmu_clear_event_idx(struct pmu_hw_events *cpuc,
clear_bit(idx - 1, cpuc->used_mask);
 }
 
+static int armv8pmu_access_event_idx(struct perf_event *event)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return 0;
+
+   /*
+* We remap the cycle counter index to 32 to
+* match the offset applied to the rest of
+* the counter indices.
+*/
+   if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER)
+   return 32;
+
+   return event->hw.idx;
+}
+
 /*
  * Add an event filter to a given event.
  */
@@ -913,6 +929,9 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
if (armv8pmu_event_is_64bit(event))
event->hw.flags |= ARMPMU_EVT_64BIT;
 
+   if (cpus_have_cap(ARM64_HAS_HOMOGENEOUS_PMU))
+   event->hw.flags |= ARMPMU_EL0_RD_CNTR;
+
/* Only expose micro/arch events supported by this PMU */
if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS)
&& test_bit(hw_event_id, armpmu->pmceid_bitmap)) {
@@ -1086,6 +1105,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu)
cpu_pmu->set_event_filter   = armv8pmu_set_event_filter;
cpu_pmu->filter_match   = armv8pmu_filter_match;
 
+   cpu_pmu->pmu.event_idx  = armv8pmu_access_event_idx;
+
return 0;
 }
 
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 71f525a35ac2..1106a9ac00fd 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -26,6 +26,8 @@
  */
 /* Event uses a 64bit counter */
 #define ARMPMU_EVT_64BIT   1
+/* Allow access to hardware counter from userspace */
+#define ARMPMU_EL0_RD_CNTR 2
 
 #define HW_OP_UNSUPPORTED  0x
 #define C(_x)  PERF_COUNT_HW_CACHE_##_x
-- 
2.17.1
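Taken together with the perf_event_mmap_page convention (the hardware counter is index - 1), the mapping a self-monitoring task ends up using is roughly the following sketch; the helper names are illustrative, not from the patch:

    #include <linux/types.h>

    /*
     * userpg->index == 0  : direct access not granted, use read(2) instead
     * userpg->index == 32 : the cycle counter (remapped above)
     * userpg->index == n  : event counter backed by pmevcntr(n-1)_el0
     */
    extern u64 read_cycle_counter(void);	/* mrs pmccntr_el0 (illustrative) */
    extern u64 read_event_counter(u32 hwidx);	/* mrs pmevcntr<hwidx>_el0 */

    static u64 read_by_userpg_index(u32 index)
    {
    	if (!index)
    		return 0;	/* caller must fall back to the syscall path */
    	if (index == 32)
    		return read_cycle_counter();
    	return read_event_counter(index - 1);
    }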



[PATCH v4 0/7] arm64: Enable access to pmu registers by user-space

2019-08-22 Thread Raphael Gault
Hi,

Changes since v3:
* Rebased on will/for-next/perf in order to include this patch [1]
* Re-introduce `mrs` hook used in previous versions
* Invert cpu feature to track homogeneity instead of heterogeneity
* Introduce accessor for boot_cpu_data (see second commit for more info)
* Apply Mark Rutland's comments


The perf user-space tool relies on the PMU to monitor events. It offers an
abstraction layer over the hardware counters since the underlying
implementation is cpu-dependent. We want to allow userspace tools to have
access to the registers storing the hardware counters' values directly.
This targets specifically self-monitoring tasks in order to reduce the
overhead by directly accessing the registers without having to go
through the kernel.
In order to do this we need to set up the pmu so that it exposes its
registers to userspace access.

The first patch adds a test to the perf tool so that we can test that the
access to the registers works correctly from userspace.

The second patch introduces an accessor for `boot_cpu_data`, which is
static. Including cpu.h turned out to cause a chain of dependencies, so I
opted for the accessor since it is not used much.

The third patch adds a capability to the arm64 cpufeatures framework in
order to detect when we are running on a homogeneous system.

The fourth patch re-introduces the hooks handling undefined
instructions for `mrs` instructions on pmu-related registers.

The fifth patch focuses on the armv8 pmuv3 PMU support and makes sure that
access to the pmu registers is enabled and that userspace has
access to the relevant information in order to use them.

The sixth patch puts in place callbacks to enable access to the hardware
counters from userspace when a compatible event is opened using the perf
API.

The seventh patch adds short documentation about direct access to the
PMU counters from userspace.

[1]: https://lkml.org/lkml/2019/8/20/875

Raphael Gault (7):
  perf: arm64: Add test to check userspace access to hardware counters.
  arm64: cpu: Add accessor for boot_cpu_data
  arm64: cpufeature: Add feature to detect homogeneous systems
  arm64: pmu: Add hook to handle pmu-related undefined instructions
  arm64: pmu: Add function implementation to update event index in
userpage.
  arm64: perf: Enable pmu counter direct access for perf event on armv8
  Documentation: arm64: Document PMU counters access from userspace

 .../arm64/pmu_counter_user_access.txt |  42 +++
 arch/arm64/include/asm/cpu.h  |   2 +-
 arch/arm64/include/asm/cpucaps.h  |   3 +-
 arch/arm64/include/asm/cpufeature.h   |  10 +
 arch/arm64/include/asm/mmu.h  |   6 +
 arch/arm64/include/asm/mmu_context.h  |   2 +
 arch/arm64/include/asm/perf_event.h   |  14 +
 arch/arm64/kernel/cpufeature.c|  32 ++-
 arch/arm64/kernel/cpuinfo.c   |   7 +-
 arch/arm64/kernel/perf_event.c|  77 ++
 drivers/perf/arm_pmu.c|  54 
 include/linux/perf/arm_pmu.h  |   2 +
 tools/perf/arch/arm64/include/arch-tests.h|   7 +
 tools/perf/arch/arm64/tests/Build |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c  |   4 +
 tools/perf/arch/arm64/tests/user-events.c | 254 ++
 16 files changed, 512 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

-- 
2.17.1



[PATCH] arm64: perf_event: Add missing header needed for smp_processor_id()

2019-08-20 Thread Raphael Gault
Acked-by: Mark Rutland 
Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/perf_event.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 96e90e270042..24575c0a0065 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* ARMv8 Cortex-A53 specific event types. */
 #define ARMV8_A53_PERFCTR_PREF_LINEFILL0xC2
-- 
2.17.1



Re: [PATCH v3 2/5] arm64: cpufeature: Add feature to detect heterogeneous systems

2019-08-20 Thread Raphael Gault

Hi Mark,

Thank you for your comments.

On 8/20/19 4:49 PM, Mark Rutland wrote:

On Tue, Aug 20, 2019 at 04:23:17PM +0100, Mark Rutland wrote:

Hi Raphael,

On Fri, Aug 16, 2019 at 01:59:31PM +0100, Raphael Gault wrote:

This feature is required in order to enable PMU counters direct
access from userspace only when the system is homogeneous.
This feature checks the model of each CPU brought online and compares it
to the boot CPU. If it differs then it is heterogeneous.


It would be worth noting that this patch prevents heterogeneous CPUs
being brought online late if the system was uniform at boot time.


Looking again, I think I'd misunderstood how
ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU was dealt with, but we do have a
problem in this area.

[...]




+   .capability = ARM64_HAS_HETEROGENEOUS_PMU,
+   .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU,
+   .matches = has_heterogeneous_pmu,
+   },


I had a quick chat with Will, and we concluded that we must permit late
onlining of heterogeneous CPUs here as people are likely to rely on
late CPU onlining on some heterogeneous systems.

I think the above permits that, but that also means that we need some
support code to fail gracefully in that case (e.g. without sending
a SIGILL to unaware userspace code).


I understand; however, I understood that
ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU did not allow a late CPU to be
heterogeneous if the capability wasn't already enabled. Thus, if as you
say we need to allow the system to switch from homogeneous to
heterogeneous, then I should change the type of this capability.



That means that we'll need the counter emulation code that you had in
previous versions of this patch (e.g. to handle potential UNDEFs when a
new CPU has fewer counters than the previously online CPUs).

Further, I think the context switch (and event index) code needs to take
this cap into account, and disable direct access once the system becomes
heterogeneous.


That is a good point indeed.

Thanks,

--
Raphael Gault


Re: [PATCH v3 4/5] arm64: perf: Enable pmu counter direct access for perf event on armv8

2019-08-19 Thread Raphael Gault

Hi,

On 8/18/19 1:37 PM, kbuild test robot wrote:

Hi Raphael,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.3-rc4 next-20190816]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]


This patchset was based on linux-next/master and not linus' tree.

Thanks,

--
Raphael Gault


[PATCH v3 2/5] arm64: cpufeature: Add feature to detect heterogeneous systems

2019-08-16 Thread Raphael Gault
This feature is required in order to enable direct access to the PMU
counters from userspace only when the system is homogeneous.
This feature checks the model of each CPU brought online and compares it
to the boot CPU. If it differs then the system is heterogeneous.

Signed-off-by: Raphael Gault 
---
 arch/arm64/include/asm/cpucaps.h |  3 ++-
 arch/arm64/kernel/cpufeature.c   | 20 
 arch/arm64/kernel/perf_event.c   |  1 +
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index f19fe4b9acc4..040370af38ad 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -52,7 +52,8 @@
 #define ARM64_HAS_IRQ_PRIO_MASKING 42
 #define ARM64_HAS_DCPODP   43
 #define ARM64_WORKAROUND_1463225   44
+#define ARM64_HAS_HETEROGENEOUS_PMU45
 
-#define ARM64_NCAPS45
+#define ARM64_NCAPS46
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9323bcc40a58..bbdd809f12a6 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1260,6 +1260,15 @@ static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry,
 }
 #endif
 
+static bool has_heterogeneous_pmu(const struct arm64_cpu_capabilities *entry,
+int scope)
+{
+   u32 model = read_cpuid_id() & MIDR_CPU_MODEL_MASK;
+   struct cpuinfo_arm64 *boot = &per_cpu(cpu_data, 0);
+
+   return  (boot->reg_midr & MIDR_CPU_MODEL_MASK) != model;
+}
+
 static const struct arm64_cpu_capabilities arm64_features[] = {
{
.desc = "GIC system register CPU interface",
@@ -1560,6 +1569,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.min_field_value = 1,
},
 #endif
+   {
+   /*
+* Detect whether the system is heterogeneous or
+* homogeneous
+*/
+   .desc = "Detect whether we have heterogeneous CPUs",
+   .capability = ARM64_HAS_HETEROGENEOUS_PMU,
+   .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU,
+   .matches = has_heterogeneous_pmu,
+   },
{},
 };
 
@@ -1727,6 +1746,7 @@ static void __init setup_elf_hwcaps(const struct arm64_cpu_capabilities *hwcaps)
cap_set_elf_hwcap(hwcaps);
 }
 
+
 static void update_cpu_capabilities(u16 scope_mask)
 {
int i;
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 2d3bdebdf6df..a0b4f1bca491 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* ARMv8 Cortex-A53 specific event types. */
 #define ARMV8_A53_PERFCTR_PREF_LINEFILL0xC2
-- 
2.17.1



[PATCH v3 1/5] perf: arm64: Add test to check userspace access to hardware counters.

2019-08-16 Thread Raphael Gault
This test relies on the fact that the PMU registers are accessible
from userspace. It then uses the perf_event_mmap_page to retrieve
the counter index and access the underlying register.

This test uses sched_setaffinity(2) in order to run on all CPUs and thus
check the behaviour of the PMUs of all CPUs in a big.LITTLE environment.

Signed-off-by: Raphael Gault 
---
 tools/perf/arch/arm64/include/arch-tests.h |   7 +
 tools/perf/arch/arm64/tests/Build  |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c   |   4 +
 tools/perf/arch/arm64/tests/user-events.c  | 254 +
 4 files changed, 266 insertions(+)
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h
index 90ec4c8cb880..6a8483de1015 100644
--- a/tools/perf/arch/arm64/include/arch-tests.h
+++ b/tools/perf/arch/arm64/include/arch-tests.h
@@ -2,11 +2,18 @@
 #ifndef ARCH_TESTS_H
 #define ARCH_TESTS_H
 
+#include 
+
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 struct thread;
 struct perf_sample;
+int test__arch_unwind_sample(struct perf_sample *sample,
+struct thread *thread);
 #endif
 
 extern struct test arch_tests[];
+int test__rd_pmevcntr(struct test *test __maybe_unused,
+ int subtest __maybe_unused);
+
 
 #endif
diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build
index a61c06bdb757..3f9a20c17fc6 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,5 @@
 perf-y += regs_load.o
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
+perf-y += user-events.o
 perf-y += arch-tests.o
diff --git a/tools/perf/arch/arm64/tests/arch-tests.c b/tools/perf/arch/arm64/tests/arch-tests.c
index 5b1543c98022..57df9b89dede 100644
--- a/tools/perf/arch/arm64/tests/arch-tests.c
+++ b/tools/perf/arch/arm64/tests/arch-tests.c
@@ -10,6 +10,10 @@ struct test arch_tests[] = {
.func = test__dwarf_unwind,
},
 #endif
+   {
+   .desc = "User counter access",
+   .func = test__rd_pmevcntr,
+   },
{
.func = NULL,
},
diff --git a/tools/perf/arch/arm64/tests/user-events.c b/tools/perf/arch/arm64/tests/user-events.c
new file mode 100644
index ..b048d7e392bc
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/user-events.c
@@ -0,0 +1,254 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "perf.h"
+#include "debug.h"
+#include "tests/tests.h"
+#include "cloexec.h"
+#include "util.h"
+#include "arch-tests.h"
+
+/*
+ * ARMv8 ARM reserves the following encoding for system registers:
+ * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview",
+ *  C5.2, version:ARM DDI 0487A.f)
+ *  [20-19] : Op0
+ *  [18-16] : Op1
+ *  [15-12] : CRn
+ *  [11-8]  : CRm
+ *  [7-5]   : Op2
+ */
+#define Op0_shift   19
+#define Op0_mask0x3
+#define Op1_shift   16
+#define Op1_mask0x7
+#define CRn_shift   12
+#define CRn_mask0xf
+#define CRm_shift   8
+#define CRm_mask0xf
+#define Op2_shift   5
+#define Op2_mask0x7
+
+#define __stringify(x) #x
+
+#define read_sysreg(r) ({  \
+   u64 __val;  \
+   asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \
+   __val;  \
+})
+
+#define PMEVCNTR_READ_CASE(idx)\
+   case idx:   \
+   return read_sysreg(pmevcntr##idx##_el0)
+
+#define PMEVCNTR_CASES(readwrite)  \
+   PMEVCNTR_READ_CASE(0);  \
+   PMEVCNTR_READ_CASE(1);  \
+   PMEVCNTR_READ_CASE(2);  \
+   PMEVCNTR_READ_CASE(3);  \
+   PMEVCNTR_READ_CASE(4);  \
+   PMEVCNTR_READ_CASE(5);  \
+   PMEVCNTR_READ_CASE(6);  \
+   PMEVCNTR_READ_CASE(7);  \
+   PMEVCNTR_READ_CASE(8);  \
+   PMEVCNTR_READ_CASE(9);  \
+   PMEVCNTR_READ_CASE(10); \
+   PMEVCNTR_READ_CASE(11); \
+   PMEVCNTR_READ_CASE(12); \
+   PMEVCNTR_READ_CASE(13); \
+   PMEVCNTR_READ_CASE(14); \
+   PMEVCNTR_READ_CASE(15); \
+   PMEVCNTR_READ_CASE(16); \
+   PMEVCNTR_READ_CASE(17); \
+   PMEVCNTR_READ_CASE(18); \
+   PMEVCNTR_READ_CASE(19

[PATCH v3 4/5] arm64: perf: Enable pmu counter direct access for perf event on armv8

2019-08-16 Thread Raphael Gault
Keep track of events opened with direct access to the hardware counters
and modify permissions while they are open.

The strategy used here is the same one x86 uses: every time an event
is mapped, the permissions are set if required. The atomic field added
to the mm_context helps keep track of the different events opened and
deactivates the permissions when all are unmapped.
We also need to update the permissions in the context switch code so
that tasks keep the right permissions.

Signed-off-by: Raphael Gault 
---
 arch/arm64/include/asm/mmu.h |  6 +
 arch/arm64/include/asm/mmu_context.h |  2 ++
 arch/arm64/include/asm/perf_event.h  | 14 ++
 drivers/perf/arm_pmu.c   | 38 
 4 files changed, 60 insertions(+)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index fd6161336653..88ed4466bd06 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -18,6 +18,12 @@
 
 typedef struct {
atomic64_t  id;
+
+   /*
+* non-zero if userspace has access to hardware
+* counters directly.
+*/
+   atomic_tpmu_direct_access;
void*vdso;
unsigned long   flags;
 } mm_context_t;
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 7ed0adb187a8..6e66ff940494 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -224,6 +225,7 @@ static inline void __switch_mm(struct mm_struct *next)
}
 
check_and_switch_context(next, cpu);
+   perf_switch_user_access(next);
 }
 
 static inline void
diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 2bdbc79bbd01..ba58fa726631 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -8,6 +8,7 @@
 
 #include 
 #include 
+#include 
 
 #defineARMV8_PMU_MAX_COUNTERS  32
 #defineARMV8_PMU_COUNTER_MASK  (ARMV8_PMU_MAX_COUNTERS - 1)
@@ -223,4 +224,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
(regs)->pstate = PSR_MODE_EL1h; \
 }
 
+static inline void perf_switch_user_access(struct mm_struct *mm)
+{
+   if (!IS_ENABLED(CONFIG_PERF_EVENTS))
+   return;
+
+   if (atomic_read(&mm->context.pmu_direct_access)) {
+   write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR,
+pmuserenr_el0);
+   } else {
+   write_sysreg(0, pmuserenr_el0);
+   }
+}
+
 #endif
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index df352b334ea7..3a48cc9f17af 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -25,6 +25,7 @@
 #include 
 
 #include 
+#include 
 
 static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
 static DEFINE_PER_CPU(int, cpu_irq);
@@ -778,6 +779,41 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
&cpu_pmu->node);
 }
 
+static void refresh_pmuserenr(void *mm)
+{
+   perf_switch_user_access(mm);
+}
+
+static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return;
+
+   /*
+* This function relies on not being called concurrently in two
+* tasks in the same mm.  Otherwise one task could observe
+* pmu_direct_access > 1 and return all the way back to
+* userspace with user access disabled while another task is still
+* doing on_each_cpu_mask() to enable user access.
+*
+* For now, this can't happen because all callers hold mmap_sem
+* for write.  If this changes, we'll need a different solution.
+*/
+   lockdep_assert_held_write(&mm->mmap_sem);
+
+   if (atomic_inc_return(&mm->context.pmu_direct_access) == 1)
+   on_each_cpu(refresh_pmuserenr, mm, 1);
+}
+
+static void armpmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return;
+
+   if (atomic_dec_and_test(&mm->context.pmu_direct_access))
+   on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1);
+}
+
 static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 {
struct arm_pmu *pmu;
@@ -799,6 +835,8 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
.pmu_enable = armpmu_enable,
.pmu_disable= armpmu_disable,
.event_init = armpmu_event_init,
+   .event_mapped   = armpmu_event_mapped,
+   .event_unmapped = armpmu_event_unmapped,
.add= armpmu_add,
.del= armpmu_del,
.start  = armpmu_start,
-- 
2.17.1



[PATCH v3 5/5] Documentation: arm64: Document PMU counters access from userspace

2019-08-16 Thread Raphael Gault
Add a documentation file to describe the access to the pmu hardware
counters from userspace.

Signed-off-by: Raphael Gault 
---
 .../arm64/pmu_counter_user_access.txt | 42 +++
 1 file changed, 42 insertions(+)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt

diff --git a/Documentation/arm64/pmu_counter_user_access.txt b/Documentation/arm64/pmu_counter_user_access.txt
new file mode 100644
index ..6788b1107381
--- /dev/null
+++ b/Documentation/arm64/pmu_counter_user_access.txt
@@ -0,0 +1,42 @@
+Access to PMU hardware counter from userspace
+=============================================
+
+Overview
+
+The perf user-space tool relies on the PMU to monitor events. It offers an
+abstraction layer over the hardware counters since the underlying
+implementation is cpu-dependent.
+Arm64 allows userspace tools to have access to the registers storing the
+hardware counters' values directly.
+
+This targets specifically self-monitoring tasks in order to reduce the overhead
+by directly accessing the registers without having to go through the kernel.
+
+How-to
+--
+The focus is set on the armv8 pmuv3, which makes sure that access to the pmu
+registers is enabled and that userspace has access to the relevant
+information in order to use them.
+
+In order to have access to the hardware counter it is necessary to open the
+event using the perf tool interface: the sys_perf_event_open syscall returns
+a fd which can subsequently be used with the mmap syscall in order to
+retrieve a page of memory containing information about the event.
+The PMU driver uses this page to expose to the user the hardware counter's
+index. Using this index enables the user to access the PMU registers using the
+`mrs` instruction.
+
+Have a look at `tools/perf/arch/arm64/tests/user-events.c` for an example. It
+can be run using the perf tool to check that the access to the registers works
+correctly from userspace:
+
+./perf test -v
+
+About chained events
+
+When the user requests an event to be counted on 64 bits, two hardware
+counters are used and need to be combined to retrieve the correct value:
+
+val = read_counter(idx);
+if ((event.attr.config1 & 0x1))
+   val = (val << 32) | read_counter(idx - 1);
-- 
2.17.1



[PATCH v3 3/5] arm64: pmu: Add function implementation to update event index in userpage.

2019-08-16 Thread Raphael Gault
In order to be able to access the counter directly from userspace,
we need to provide the index of the counter using the userpage.
We thus need to override the event_idx function to retrieve and
convert the perf_event index to the armv8 hardware index.

Since the arm_pmu driver can be used by any implementation, even
if not armv8, two components play a role in making sure the
behaviour is correct and consistent with the PMU capabilities:

* the ARMPMU_EL0_RD_CNTR flag which denotes the capability to access
counters from userspace.
* the event_idx callback, which is implemented and initialized by
the PMU implementation: if no callback is provided, the default
behaviour applies, returning 0 as the index value.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/perf_event.c | 22 ++
 include/linux/perf/arm_pmu.h   |  2 ++
 2 files changed, 24 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index a0b4f1bca491..9fe3f6909513 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -818,6 +818,22 @@ static void armv8pmu_clear_event_idx(struct pmu_hw_events *cpuc,
clear_bit(idx - 1, cpuc->used_mask);
 }
 
+static int armv8pmu_access_event_idx(struct perf_event *event)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return 0;
+
+   /*
+* We remap the cycle counter index to 32 to
+* match the offset applied to the rest of
+* the counter indices.
+*/
+   if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER)
+   return 32;
+
+   return event->hw.idx;
+}
+
 /*
  * Add an event filter to a given event.
  */
@@ -911,6 +927,9 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
if (armv8pmu_event_is_64bit(event))
event->hw.flags |= ARMPMU_EVT_64BIT;
 
+   if (!cpus_have_const_cap(ARM64_HAS_HETEROGENEOUS_PMU))
+   event->hw.flags |= ARMPMU_EL0_RD_CNTR;
+
/* Only expose micro/arch events supported by this PMU */
if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS)
&& test_bit(hw_event_id, armpmu->pmceid_bitmap)) {
@@ -1031,6 +1050,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu)
cpu_pmu->set_event_filter   = armv8pmu_set_event_filter;
cpu_pmu->filter_match   = armv8pmu_filter_match;
 
+   cpu_pmu->pmu.event_idx  = armv8pmu_access_event_idx;
+
return 0;
 }
 
@@ -1209,6 +1230,7 @@ void arch_perf_update_userpage(struct perf_event *event,
 */
freq = arch_timer_get_rate();
userpg->cap_user_time = 1;
+   userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR);
 
	clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
			NSEC_PER_SEC, 0);
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 71f525a35ac2..1106a9ac00fd 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -26,6 +26,8 @@
  */
 /* Event uses a 64bit counter */
 #define ARMPMU_EVT_64BIT   1
+/* Allow access to hardware counter from userspace */
+#define ARMPMU_EL0_RD_CNTR 2
 
 #define HW_OP_UNSUPPORTED  0x
 #define C(_x)  PERF_COUNT_HW_CACHE_##_x
-- 
2.17.1



[PATCH v3 0/5] arm64: Enable access to pmu registers by user-space

2019-08-16 Thread Raphael Gault
Hi,

Changes since v2:
* Rebased on linux-next/master again (next-20190814)
* Use linux/compiler.h header as suggested by Arnaldo

The perf user-space tool relies on the PMU to monitor events. It offers an
abstraction layer over the hardware counters since the underlying
implementation is cpu-dependent. We want to allow userspace tools to have
access to the registers storing the hardware counters' values directly.
This targets specifically self-monitoring tasks in order to reduce the
overhead by directly accessing the registers without having to go
through the kernel.
In order to do this we need to set up the pmu so that it exposes its
registers to userspace access.

The first patch adds a test to the perf tool so that we can test that the
access to the registers works correctly from userspace.

The second patch adds a capability to the arm64 cpufeatures framework in
order to detect when we are running on a heterogeneous system.

The third patch focuses on the armv8 pmuv3 PMU support and makes sure that
access to the pmu registers is enabled and that userspace has
access to the relevant information in order to use them.

The fourth patch puts in place callbacks to enable access to the hardware
counters from userspace when a compatible event is opened using the perf
API.

The fifth patch adds short documentation about direct access to the PMU
counters from userspace.

Raphael Gault (5):
  perf: arm64: Add test to check userspace access to hardware counters.
  arm64: cpufeature: Add feature to detect heterogeneous systems
  arm64: pmu: Add function implementation to update event index in
userpage.
  arm64: perf: Enable pmu counter direct access for perf event on armv8
  Documentation: arm64: Document PMU counters access from userspace

 .../arm64/pmu_counter_user_access.txt |  42 +++
 arch/arm64/include/asm/cpucaps.h  |   3 +-
 arch/arm64/include/asm/mmu.h  |   6 +
 arch/arm64/include/asm/mmu_context.h  |   2 +
 arch/arm64/include/asm/perf_event.h   |  14 +
 arch/arm64/kernel/cpufeature.c|  20 ++
 arch/arm64/kernel/perf_event.c|  23 ++
 drivers/perf/arm_pmu.c|  38 +++
 include/linux/perf/arm_pmu.h  |   2 +
 tools/perf/arch/arm64/include/arch-tests.h|   7 +
 tools/perf/arch/arm64/tests/Build |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c  |   4 +
 tools/perf/arch/arm64/tests/user-events.c | 254 ++
 13 files changed, 415 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

-- 
2.17.1



[RFC v4 02/18] objtool: orc: Refactor ORC API for other architectures to implement.

2019-08-16 Thread Raphael Gault
The ORC unwinder is only supported on x86 at the moment and should thus be
in the x86 architecture code. In order not to break the whole structure in
case another architecture decides to support the ORC unwinder via objtool,
we choose to let the implementation be done in the architecture-dependent
code.

Signed-off-by: Raphael Gault 
---
 tools/objtool/Build |   2 -
 tools/objtool/arch.h|   3 +
 tools/objtool/arch/x86/Build|   2 +
 tools/objtool/{ => arch/x86}/orc_dump.c |   4 +-
 tools/objtool/{ => arch/x86}/orc_gen.c  | 104 ++--
 tools/objtool/check.c   |  99 +-
 tools/objtool/orc.h |   4 +-
 7 files changed, 111 insertions(+), 107 deletions(-)
 rename tools/objtool/{ => arch/x86}/orc_dump.c (98%)
 rename tools/objtool/{ => arch/x86}/orc_gen.c (66%)

diff --git a/tools/objtool/Build b/tools/objtool/Build
index 8dc4f0848362..d069d26d97fa 100644
--- a/tools/objtool/Build
+++ b/tools/objtool/Build
@@ -2,8 +2,6 @@ objtool-y += arch/$(SRCARCH)/
 objtool-y += builtin-check.o
 objtool-y += builtin-orc.o
 objtool-y += check.o
-objtool-y += orc_gen.o
-objtool-y += orc_dump.o
 objtool-y += elf.o
 objtool-y += special.o
 objtool-y += objtool.o
diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index a9a50a25ca66..e91e12807678 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -10,6 +10,7 @@
 #include 
 #include "elf.h"
 #include "cfi.h"
+#include "orc.h"
 
 enum insn_type {
INSN_JUMP_CONDITIONAL,
@@ -77,6 +78,8 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
 
 bool arch_callee_saved_reg(unsigned char reg);
 
+int arch_orc_read_unwind_hints(struct objtool_file *file);
+
 unsigned long arch_jump_destination(struct instruction *insn);
 
 unsigned long arch_dest_rela_offset(int addend);
diff --git a/tools/objtool/arch/x86/Build b/tools/objtool/arch/x86/Build
index b998412c017d..1f11b45999d0 100644
--- a/tools/objtool/arch/x86/Build
+++ b/tools/objtool/arch/x86/Build
@@ -1,4 +1,6 @@
 objtool-y += decode.o
+objtool-y += orc_dump.o
+objtool-y += orc_gen.o
 
 inat_tables_script = arch/x86/tools/gen-insn-attr-x86.awk
 inat_tables_maps = arch/x86/lib/x86-opcode-map.txt
diff --git a/tools/objtool/orc_dump.c b/tools/objtool/arch/x86/orc_dump.c
similarity index 98%
rename from tools/objtool/orc_dump.c
rename to tools/objtool/arch/x86/orc_dump.c
index 13ccf775a83a..cfe8f96bdd68 100644
--- a/tools/objtool/orc_dump.c
+++ b/tools/objtool/arch/x86/orc_dump.c
@@ -4,8 +4,8 @@
  */
 
 #include 
-#include "orc.h"
-#include "warn.h"
+#include "../../orc.h"
+#include "../../warn.h"
 
 static const char *reg_name(unsigned int reg)
 {
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/arch/x86/orc_gen.c
similarity index 66%
rename from tools/objtool/orc_gen.c
rename to tools/objtool/arch/x86/orc_gen.c
index 27a4112848c2..b4f285bf5271 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/arch/x86/orc_gen.c
@@ -6,11 +6,11 @@
 #include 
 #include 
 
-#include "orc.h"
-#include "check.h"
-#include "warn.h"
+#include "../../orc.h"
+#include "../../check.h"
+#include "../../warn.h"
 
-int create_orc(struct objtool_file *file)
+int arch_create_orc(struct objtool_file *file)
 {
struct instruction *insn;
 
@@ -116,7 +116,7 @@ static int create_orc_entry(struct section *u_sec, struct 
section *ip_relasec,
return 0;
 }
 
-int create_orc_sections(struct objtool_file *file)
+int arch_create_orc_sections(struct objtool_file *file)
 {
struct instruction *insn, *prev_insn;
struct section *sec, *u_sec, *ip_relasec;
@@ -209,3 +209,97 @@ int create_orc_sections(struct objtool_file *file)
 
return 0;
 }
+
+int arch_orc_read_unwind_hints(struct objtool_file *file)
+{
+   struct section *sec, *relasec;
+   struct rela *rela;
+   struct unwind_hint *hint;
+   struct instruction *insn;
+   struct cfi_reg *cfa;
+   int i;
+
+   sec = find_section_by_name(file->elf, ".discard.unwind_hints");
+   if (!sec)
+   return 0;
+
+   relasec = sec->rela;
+   if (!relasec) {
+   WARN("missing .rela.discard.unwind_hints section");
+   return -1;
+   }
+
+   if (sec->len % sizeof(struct unwind_hint)) {
+   WARN("struct unwind_hint size mismatch");
+   return -1;
+   }
+
+   file->hints = true;
+
+   for (i = 0; i < sec->len / sizeof(struct unwind_hint); i++) {
+   hint = (struct unwind_hint *)sec->data->d_buf + i;
+
+   rela = find_rela_by_dest(sec, i * sizeof(*hint));
+   if (!rela) {
+   WARN("can't find rela for unwind_hints[%d]", i);
+ 

[RFC v4 16/18] arm64: crypto: Add exceptions for crypto object to prevent stack analysis

2019-08-16 Thread Raphael Gault
Some crypto modules contain `.word` data in the .text section.
Since objtool can't distinguish such data from an invalid
instruction, it warns about the instruction being unknown
and stops the analysis of the object file.

These exceptions can be removed once the data is moved to another section
or objtool is tweaked to handle this particular case.

Signed-off-by: Raphael Gault 
---
 arch/arm64/crypto/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index 0435f2a0610e..e2a25919ebaa 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -43,9 +43,11 @@ aes-neon-blk-y := aes-glue-neon.o aes-neon.o
 
 obj-$(CONFIG_CRYPTO_SHA256_ARM64) += sha256-arm64.o
 sha256-arm64-y := sha256-glue.o sha256-core.o
+OBJECT_FILES_NON_STANDARD_sha256-core.o := y
 
 obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o
 sha512-arm64-y := sha512-glue.o sha512-core.o
+OBJECT_FILES_NON_STANDARD_sha512-core.o := y
 
 obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
 chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o
@@ -58,6 +60,7 @@ aes-arm64-y := aes-cipher-core.o aes-cipher-glue.o
 
 obj-$(CONFIG_CRYPTO_AES_ARM64_BS) += aes-neon-bs.o
 aes-neon-bs-y := aes-neonbs-core.o aes-neonbs-glue.o
+OBJECT_FILES_NON_STANDARD_aes-neonbs-core.o := y
 
 CFLAGS_aes-glue-ce.o   := -DUSE_V8_CRYPTO_EXTENSIONS
 
-- 
2.17.1



[RFC v4 06/18] objtool: arm64: Adapt the stack frame checks for arm architecture

2019-08-16 Thread Raphael Gault
Since the initial stack frame set up when entering a function differs
from what is done on the x86_64 architecture, we need to add some more
checks to support the different cases.  As opposed to x86_64, the return
address is not stored on the stack by the call instruction but is instead
loaded into a register. The initial stack frame is thus empty when entering
a function, and two push operations are needed to set it up correctly. All
the different combinations need to be taken into account.
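
Concretely, a typical AArch64 prologue performs two store operations, and
the added checks must only treat the frame as valid once both have been
seen. A minimal sketch of that idea, using objtool's existing stack_op
types (an illustration, not this patch's actual code):

/*
 * A typical AArch64 prologue:
 *
 *      stp     x29, x30, [sp, #-16]!   // push FP and LR as a pair
 *      mov     x29, sp                 // FP now tracks the new frame
 */
static bool arm64_frame_established(const struct stack_op *push_pair,
                                    const struct stack_op *set_fp)
{
        /* the pair push of x29/x30 shows up as a push to the stack */
        bool saved = push_pair && push_pair->dest.type == OP_DEST_PUSH;
        /* the frame-pointer move is a register-to-register copy */
        bool fp_set = set_fp && set_fp->dest.type == OP_DEST_REG &&
                      set_fp->dest.reg == CFI_BP &&
                      set_fp->src.type == OP_SRC_REG &&
                      set_fp->src.reg == CFI_SP;

        return saved && fp_set;
}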

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch.h  |   2 +
 tools/objtool/arch/arm64/decode.c |  28 +
 tools/objtool/arch/x86/decode.c   |   5 ++
 tools/objtool/check.c | 100 --
 tools/objtool/elf.c   |   3 +-
 5 files changed, 131 insertions(+), 7 deletions(-)

diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index bb5ce810fb6e..68d6371a24a2 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -91,4 +91,6 @@ unsigned long arch_jump_destination(struct instruction *insn);
 
 unsigned long arch_dest_rela_offset(int addend);
 
+bool arch_is_insn_sibling_call(struct instruction *insn);
+
 #endif /* _ARCH_H */
diff --git a/tools/objtool/arch/arm64/decode.c 
b/tools/objtool/arch/arm64/decode.c
index 395c5777afab..be3d2eb10227 100644
--- a/tools/objtool/arch/arm64/decode.c
+++ b/tools/objtool/arch/arm64/decode.c
@@ -106,6 +106,34 @@ unsigned long arch_dest_rela_offset(int addend)
return addend;
 }
 
+/*
+ * In order to know if we are in presence of a sibling
+ * call and not in presence of a switch table we look
+ * back at the previous instructions and see if we are
+ * jumping inside the same function that we are already
+ * in.
+ */
+bool arch_is_insn_sibling_call(struct instruction *insn)
+{
+   struct instruction *prev;
+   struct list_head *l;
+   struct symbol *sym;
+   list_for_each_prev(l, &insn->list) {
+   prev = list_entry(l, struct instruction, list);
+   if (!prev->func ||
+   prev->func->pfunc != insn->func->pfunc)
+   return false;
+   if (prev->stack_op.src.reg != ADR_SOURCE)
+   continue;
+   sym = find_symbol_containing(insn->sec, insn->immediate);
+   if (!sym || sym->type != STT_FUNC)
+   return false;
+   else if (sym->type == STT_FUNC)
+   return true;
+   break;
+   }
+   return false;
+}
 static int is_arm64(struct elf *elf)
 {
switch (elf->ehdr.e_machine) {
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index fa33b3465722..98726990714d 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -72,6 +72,11 @@ unsigned long arch_dest_rela_offset(int addend)
return addend + 4;
 }
 
+bool arch_is_insn_sibling_call(struct instruction *insn)
+{
+   return true;
+}
+
 int arch_decode_instruction(struct elf *elf, struct section *sec,
unsigned long offset, unsigned int maxlen,
unsigned int *len, enum insn_type *type,
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 4af6422d3428..519569b0329f 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -566,10 +566,10 @@ static int add_jump_destinations(struct objtool_file 
*file)
dest_off = arch_jump_destination(insn);
} else if (rela->sym->type == STT_SECTION) {
dest_sec = rela->sym->sec;
-   dest_off = rela->addend + 4;
+   dest_off = arch_dest_rela_offset(rela->addend);
} else if (rela->sym->sec->idx) {
dest_sec = rela->sym->sec;
-   dest_off = rela->sym->sym.st_value + rela->addend + 4;
+   dest_off = rela->sym->sym.st_value + 
arch_dest_rela_offset(rela->addend);
} else if (strstr(rela->sym->name, "_indirect_thunk_")) {
/*
 * Retpoline jumps are really dynamic jumps in
@@ -1368,8 +1368,8 @@ static void save_reg(struct insn_state *state, unsigned 
char reg, int base,
 
 static void restore_reg(struct insn_state *state, unsigned char reg)
 {
-   state->regs[reg].base = CFI_UNDEFINED;
-   state->regs[reg].offset = 0;
+   state->regs[reg].base = initial_func_cfi.regs[reg].base;
+   state->regs[reg].offset = initial_func_cfi.regs[reg].offset;
 }
 
 /*
@@ -1525,8 +1525,32 @@ static int update_insn_state(struct instruction *insn, 
struct insn_state *state)
 
/* add imm, %rsp */
state->stack_size -= op->src.offset;
-   if (cfa->base == 

[RFC v4 10/18] objtool: arm64: Implement functions to add switch tables alternatives

2019-08-16 Thread Raphael Gault
This patch implements the functions required to identify all the possible
destinations of a switch table and add them as alternatives.
The implementation relies on the new plugin introduced previously, which
records information about the switch table in a .objtool_data section.
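
The hunks below read struct switch_table_info records from .objtool_data
without showing the structure's definition. A plausible layout, inferred
from the three .quad values the GCC plugin (patch 09) emits per table (the
field names here are illustrative assumptions):

struct switch_table_info {
        u64 jump_ref;           /* location of the dispatch table, via rela */
        u64 nb_entries;         /* number of labels in the table */
        u64 offset_unsigned;    /* whether entries are unsigned offsets */
};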

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch/arm64/arch_special.c   | 132 +-
 .../objtool/arch/arm64/include/arch_special.h |  10 ++
 .../objtool/arch/arm64/include/insn_decode.h  |   3 +-
 tools/objtool/check.c |   6 +-
 tools/objtool/check.h |   2 +
 5 files changed, 146 insertions(+), 7 deletions(-)

diff --git a/tools/objtool/arch/arm64/arch_special.c 
b/tools/objtool/arch/arm64/arch_special.c
index 17a8a06aac2a..11284066157c 100644
--- a/tools/objtool/arch/arm64/arch_special.c
+++ b/tools/objtool/arch/arm64/arch_special.c
@@ -12,8 +12,13 @@
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, see <http://www.gnu.org/licenses/>.
  */
+
+#include 
+#include 
+
 #include "../../special.h"
 #include "arch_special.h"
+#include "bit_operations.h"
 
 void arch_force_alt_path(unsigned short feature,
 bool uaccess,
@@ -21,9 +26,133 @@ void arch_force_alt_path(unsigned short feature,
 {
 }
 
+static u32 next_offset(u8 *table, u8 entry_size)
+{
+   switch (entry_size) {
+   case 1:
+   return table[0];
+   case 2:
+   return *(u16 *)(table);
+   default:
+   return *(u32 *)(table);
+   }
+}
+
+static u32 get_table_entry_size(u32 insn)
+{
+   unsigned char size = (insn >> 30) & ONES(2);
+   switch (size) {
+   case 0:
+   return 1;
+   case 1:
+   return 2;
+   default:
+   return 4;
+   }
+}
+
+static int add_possible_branch(struct objtool_file *file,
+  struct instruction *insn,
+  u32 base, u32 offset)
+{
+   struct instruction *dest_insn;
+   struct alternative *alt;
+   offset = base + 4 * offset;
+
+   alt = calloc(1, sizeof(*alt));
+   if (alt == NULL) {
+   WARN("allocation failure, can't add jump alternative");
+   return -1;
+   }
+
+   dest_insn = find_insn(file, insn->sec, offset);
+   if (dest_insn == NULL) {
+   free(alt);
+   return 0;
+   }
+   alt->insn = dest_insn;
+   alt->skip_orig = true;
+   list_add_tail(&alt->list, &insn->alts);
+   return 0;
+}
+
 int arch_add_jump_table(struct objtool_file *file, struct instruction *insn,
struct rela *table, struct rela *next_table)
 {
+   struct rela *objtool_data_rela = NULL;
+   struct switch_table_info *swt_info = NULL;
+   struct section *objtool_data = find_section_by_name(file->elf, 
".objtool_data");
+   struct section *rodata_sec = find_section_by_name(file->elf, ".rodata");
+   struct section *branch_sec = NULL;
+   u8 *switch_table = NULL;
+   u64 base_offset = 0;
+   struct instruction *pre_jump_insn;
+   u32 sec_size = 0;
+   u32 entry_size = 0;
+   u32 offset = 0;
+   u32 i, j;
+
+   if (objtool_data == NULL)
+   return 0;
+
+   /*
+* 1. Using rela, Identify entry for the switch table
+* 2. Retrieve base offset
+* 3. Retrieve branch instruction
+* 3. For all entries in switch table:
+*  3.1. Compute new offset
+*  3.2. Create alternative instruction
+*  3.3. Add alt_instr to insn->alts list
+*/
+   sec_size = objtool_data->sh.sh_size;
+   for (i = 0, swt_info = (void *)objtool_data->data->d_buf;
+i < sec_size / sizeof(struct switch_table_info);
+i++, swt_info++) {
+   offset = i * sizeof(struct switch_table_info);
+   objtool_data_rela = find_rela_by_dest_range(objtool_data, 
offset,
+   sizeof(u64));
+   /* retrieving the objtool data of the switch table we need */
+   if (objtool_data_rela == NULL ||
+   table->sym->sec != objtool_data_rela->sym->sec ||
+   table->addend != objtool_data_rela->addend)
+   continue;
+
+   /* retrieving switch table content */
+   switch_table = (u8 *)(rodata_sec->data->d_buf + table->addend);
+
+   /* retrieving pre jump instruction (ldr) */
+   branch_sec = insn->sec;
+   pre_jump_insn = find_insn(file, branch_sec,
+ insn->offset - 3 * sizeof(u32));
+   entry_size = get_table_entry_size(*(u32 
*)(branch_sec->data->d_

[RFC v4 13/18] arm64: sleep: Prevent stack frame warnings from objtool

2019-08-16 Thread Raphael Gault
This code doesn't respect the Arm Procedure Call Standard (PCS), but it
is intended that way. Adapting it to respect the PCS would alter its
behaviour.

In order to suppress objtool's warnings, we setup a stack frame
for __cpu_suspend_enter and annotate cpu_resume and _cpu_resume
as having non-standard stack frames.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/sleep.S | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index f5b04dd8a710..55c7c099d32c 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -90,6 +91,7 @@ ENTRY(__cpu_suspend_enter)
str x0, [x1]
add x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
stp x29, lr, [sp, #-16]!
+   mov x29, sp
bl  cpu_do_suspend
ldp x29, lr, [sp], #16
mov x0, #1
@@ -146,3 +148,6 @@ ENTRY(_cpu_resume)
mov x0, #0
ret
 ENDPROC(_cpu_resume)
+
+   asm_stack_frame_non_standard cpu_resume
+   asm_stack_frame_non_standard _cpu_resume
-- 
2.17.1



[RFC v4 09/18] gcc-plugins: objtool: Add plugin to detect switch table on arm64

2019-08-16 Thread Raphael Gault
This plugin comes into play before the final two RTL passes of GCC. It
detects the switch tables that are about to be output in the ELF file and
writes information about them into a ".objtool_data" section, which objtool
will then use.

Signed-off-by: Raphael Gault 
---
 scripts/Makefile.gcc-plugins  |  2 +
 scripts/gcc-plugins/Kconfig   |  9 +++
 .../arm64_switch_table_detection_plugin.c | 58 +++
 3 files changed, 69 insertions(+)
 create mode 100644 scripts/gcc-plugins/arm64_switch_table_detection_plugin.c

diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
index 5f7df50cfe7a..a56736df9dc2 100644
--- a/scripts/Makefile.gcc-plugins
+++ b/scripts/Makefile.gcc-plugins
@@ -44,6 +44,8 @@ ifdef CONFIG_GCC_PLUGIN_ARM_SSP_PER_TASK
 endif
 export DISABLE_ARM_SSP_PER_TASK_PLUGIN
 
+gcc-plugin-$(CONFIG_GCC_PLUGIN_SWITCH_TABLES)  += 
arm64_switch_table_detection_plugin.so
+
 # All the plugin CFLAGS are collected here in case a build target needs to
 # filter them out of the KBUILD_CFLAGS.
 GCC_PLUGINS_CFLAGS := $(strip $(addprefix 
-fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) 
$(gcc-plugin-cflags-y))
diff --git a/scripts/gcc-plugins/Kconfig b/scripts/gcc-plugins/Kconfig
index d33de0b9f4f5..1daeffb55dce 100644
--- a/scripts/gcc-plugins/Kconfig
+++ b/scripts/gcc-plugins/Kconfig
@@ -113,4 +113,13 @@ config GCC_PLUGIN_ARM_SSP_PER_TASK
bool
depends on GCC_PLUGINS && ARM
 
+config GCC_PLUGIN_SWITCH_TABLES
+   bool "GCC Plugin: Identify switch tables at compile time"
+   default y
+   depends on STACK_VALIDATION && ARM64
+   help
+ Plugin to identify switch tables generated at compile time and store
+ them in a .objtool_data section. Objtool will then use that section
+ to analyse the different execution path of the switch table.
+
 endmenu
diff --git a/scripts/gcc-plugins/arm64_switch_table_detection_plugin.c 
b/scripts/gcc-plugins/arm64_switch_table_detection_plugin.c
new file mode 100644
index ..d7f0e13910d5
--- /dev/null
+++ b/scripts/gcc-plugins/arm64_switch_table_detection_plugin.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+#include "gcc-common.h"
+
+__visible int plugin_is_GPL_compatible;
+
+static unsigned int arm64_switchtbl_rtl_execute(void)
+{
+   rtx_insn *insn;
+   rtx_insn *labelp = NULL;
+   rtx_jump_table_data *tablep = NULL;
+   section *sec = get_section(".objtool_data", SECTION_STRINGS, NULL);
+   section *curr_sec = current_function_section();
+
+   for (insn = get_insns(); insn; insn = NEXT_INSN(insn)) {
+   /*
+* Find a tablejump_p INSN (using a dispatch table)
+*/
+   if (!tablejump_p(insn, &labelp, &tablep))
+   continue;
+
+   if (labelp && tablep) {
+   switch_to_section(sec);
+   assemble_integer_with_op(".quad ", 
gen_rtx_LABEL_REF(Pmode, labelp));
+   assemble_integer_with_op(".quad ", 
GEN_INT(GET_NUM_ELEM(tablep->get_labels())));
+   assemble_integer_with_op(".quad ", 
GEN_INT(ADDR_DIFF_VEC_FLAGS(tablep).offset_unsigned));
+   switch_to_section(curr_sec);
+   }
+   }
+   return 0;
+}
+
+#define PASS_NAME arm64_switchtbl_rtl
+
+#define NO_GATE
+#include "gcc-generate-rtl-pass.h"
+
+__visible int plugin_init(struct plugin_name_args *plugin_info,
+ struct plugin_gcc_version *version)
+{
+   const char * const plugin_name = plugin_info->base_name;
+   int tso = 0;
+   int i;
+
+   if (!plugin_default_version_check(version, &gcc_version)) {
+   error(G_("incompatible gcc/plugin versions"));
+   return 1;
+   }
+
+   PASS_INFO(arm64_switchtbl_rtl, "outof_cfglayout", 1,
+ PASS_POS_INSERT_AFTER);
+
+   register_callback(plugin_info->base_name, PLUGIN_PASS_MANAGER_SETUP,
+ NULL, &arm64_switchtbl_rtl_pass_info);
+
+   return 0;
+}
-- 
2.17.1



[RFC v4 11/18] arm64: alternative: Mark .altinstr_replacement as containing executable instructions

2019-08-16 Thread Raphael Gault
Until now, the .altinstr_replacement section wasn't marked as containing
executable instructions on arm64. This patch changes that so that it is
consistent with what is done on x86.

Signed-off-by: Raphael Gault 
---
 arch/arm64/include/asm/alternative.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/alternative.h 
b/arch/arm64/include/asm/alternative.h
index b9f8d787eea9..e9e6b81e3eb4 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -71,7 +71,7 @@ static inline void apply_alternatives_module(void *start, 
size_t length) { }
ALTINSTR_ENTRY(feature,cb)  \
".popsection\n" \
" .if " __stringify(cb) " == 0\n"   \
-   ".pushsection .altinstr_replacement, \"a\"\n"   \
+   ".pushsection .altinstr_replacement, \"ax\"\n"  \
"663:\n\t"  \
newinstr "\n"   \
"664:\n\t"  \
-- 
2.17.1



[RFC v4 18/18] objtool: arm64: Enable stack validation for arm64

2019-08-16 Thread Raphael Gault
Signed-off-by: Raphael Gault 
---
 arch/arm64/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 3adcec05b1f6..dc3de85b2502 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -163,6 +163,7 @@ config ARM64
select HAVE_RCU_TABLE_FREE
select HAVE_RSEQ
select HAVE_STACKPROTECTOR
+   select HAVE_STACK_VALIDATION
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_KPROBES
select HAVE_KRETPROBES
-- 
2.17.1



[RFC v4 08/18] objtool: Refactor switch-tables code to support other architectures

2019-08-16 Thread Raphael Gault
The way to identify switch tables and retrieve all the data necessary
to handle the different execution branches is not the same on all
architectures. In order to be able to add support for other architectures,
this patch defines arch-dependent functions to process jump tables.

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch/arm64/arch_special.c | 15 
 tools/objtool/arch/arm64/decode.c   |  4 +-
 tools/objtool/arch/x86/arch_special.c   | 79 
 tools/objtool/check.c   | 95 +
 tools/objtool/check.h   |  7 ++
 tools/objtool/special.h | 10 ++-
 6 files changed, 114 insertions(+), 96 deletions(-)

diff --git a/tools/objtool/arch/arm64/arch_special.c 
b/tools/objtool/arch/arm64/arch_special.c
index a21d28876317..17a8a06aac2a 100644
--- a/tools/objtool/arch/arm64/arch_special.c
+++ b/tools/objtool/arch/arm64/arch_special.c
@@ -20,3 +20,18 @@ void arch_force_alt_path(unsigned short feature,
 struct special_alt *alt)
 {
 }
+
+int arch_add_jump_table(struct objtool_file *file, struct instruction *insn,
+   struct rela *table, struct rela *next_table)
+{
+   return 0;
+}
+
+struct rela *arch_find_switch_table(struct objtool_file *file,
+ struct rela *text_rela,
+ struct section *rodata_sec,
+ unsigned long table_offset)
+{
+   file->ignore_unreachables = true;
+   return NULL;
+}
diff --git a/tools/objtool/arch/arm64/decode.c 
b/tools/objtool/arch/arm64/decode.c
index 4cb9402d6fe1..a20725c1bfd7 100644
--- a/tools/objtool/arch/arm64/decode.c
+++ b/tools/objtool/arch/arm64/decode.c
@@ -159,7 +159,7 @@ static int is_arm64(struct elf *elf)
 
 int arch_decode_instruction(struct elf *elf, struct section *sec,
unsigned long offset, unsigned int maxlen,
-   unsigned int *len, unsigned char *type,
+   unsigned int *len, enum insn_type *type,
unsigned long *immediate, struct stack_op *op)
 {
int arm64 = 0;
@@ -184,7 +184,7 @@ int arch_decode_instruction(struct elf *elf, struct section 
*sec,
insn = *(u32 *)(sec->data->d_buf + offset);
 
//dispatch according to encoding classes
-   return aarch64_insn_class_decode_table[(insn >> 25) & 0xf](insn, type,
+   return aarch64_insn_class_decode_table[(insn >> 25) & 0xf](insn, 
(unsigned char *)type,
immediate, op);
 }
 
diff --git a/tools/objtool/arch/x86/arch_special.c 
b/tools/objtool/arch/x86/arch_special.c
index 6583a1770bb2..c097001d805b 100644
--- a/tools/objtool/arch/x86/arch_special.c
+++ b/tools/objtool/arch/x86/arch_special.c
@@ -26,3 +26,82 @@ void arch_force_alt_path(unsigned short feature,
alt->skip_alt = true;
}
 }
+
+int arch_add_jump_table(struct objtool_file *file, struct instruction *insn,
+   struct rela *table, struct rela *next_table)
+{
+   struct rela *rela = table;
+   struct instruction *dest_insn;
+   struct alternative *alt;
+   struct symbol *pfunc = insn->func->pfunc;
+   unsigned int prev_offset = 0;
+
+   /*
+* Each @rela is a switch table relocation which points to the target
+* instruction.
+*/
+   list_for_each_entry_from(rela, &table->sec->rela_list, list) {
+
+   /* Check for the end of the table: */
+   if (rela != table && rela->jump_table_start)
+   break;
+
+   /* Make sure the table entries are consecutive: */
+   if (prev_offset && rela->offset != prev_offset + 8)
+   break;
+
+   /* Detect function pointers from contiguous objects: */
+   if (rela->sym->sec == pfunc->sec &&
+   rela->addend == pfunc->offset)
+   break;
+
+   dest_insn = find_insn(file, rela->sym->sec, rela->addend);
+   if (!dest_insn)
+   break;
+
+   /* Make sure the destination is in the same function: */
+   if (!dest_insn->func || dest_insn->func->pfunc != pfunc)
+   break;
+
+   alt = malloc(sizeof(*alt));
+   if (!alt) {
+   WARN("malloc failed");
+   return -1;
+   }
+
+   alt->insn = dest_insn;
+   list_add_tail(&alt->list, &insn->alts);
+   prev_offset = rela->offset;
+   }
+
+   if (!prev_offset) {
+   WARN_FUNC("can't find switch jump table",
+ insn->sec, insn->offset);
+   return -1;

[RFC v4 14/18] arm64: kvm: Annotate non-standard stack frame functions

2019-08-16 Thread Raphael Gault
Both __guest_entry and __guest_exit functions do not setup
a correct stack frame. Because they can be considered as callable
functions, even if they are particular cases, we chose to silence
the warnings given by objtool by annotating them as non-standard.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kvm/hyp/entry.S | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index e5cc8d66bf53..c3443bfd0944 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define CPU_GP_REG_OFFSET(x)   (CPU_GP_REGS + x)
 #define CPU_XREG_OFFSET(x) CPU_GP_REG_OFFSET(CPU_USER_PT_REGS + 8*x)
@@ -97,6 +98,7 @@ alternative_else_nop_endif
eret
sb
 ENDPROC(__guest_enter)
+asm_stack_frame_non_standard __guest_enter
 
 ENTRY(__guest_exit)
// x0: return code
@@ -193,3 +195,4 @@ abort_guest_exit_end:
orr x0, x0, x5
 1: ret
 ENDPROC(__guest_exit)
+asm_stack_frame_non_standard __guest_exit
-- 
2.17.1



[RFC v4 12/18] arm64: assembler: Add macro to annotate asm function having non standard stack-frame.

2019-08-16 Thread Raphael Gault
Some functions deliberately don't set up a standard stack frame. In
order for objtool to ignore those particular cases, we add an assembler
macro that lets us annotate the functions we choose to mark as
non-standard.
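
A minimal usage sketch (the function name is hypothetical; the C-side macro
is unchanged by this patch, and later patches in this series use the new
assembler variant the same way):

#include <linux/frame.h>

/* a function that deliberately keeps a non-standard frame */
static void my_trampoline(void)
{
}
/* records a pointer in .discard.func_stack_frame_non_standard so that
 * objtool skips validating the function */
STACK_FRAME_NON_STANDARD(my_trampoline);

/*
 * Assembly equivalent, placed after the function's ENDPROC:
 *
 *      asm_stack_frame_non_standard my_trampoline
 */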

Signed-off-by: Raphael Gault 
---
 include/linux/frame.h | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/include/linux/frame.h b/include/linux/frame.h
index 02d3ca2d9598..1e35e58ab259 100644
--- a/include/linux/frame.h
+++ b/include/linux/frame.h
@@ -11,14 +11,31 @@
  *
  * For more information, see tools/objtool/Documentation/stack-validation.txt.
  */
+#ifndef __ASSEMBLY__
 #define STACK_FRAME_NON_STANDARD(func) \
static void __used __section(.discard.func_stack_frame_non_standard) \
*__func_stack_frame_non_standard_##func = func
+#else
+   /*
+* This macro is the arm64 assembler equivalent of the
+* macro STACK_FRAME_NON_STANDARD define at
+* ~/include/linux/frame.h
+*/
+   .macro  asm_stack_frame_non_standardfunc
+   .pushsection ".discard.func_stack_frame_non_standard"
+   .quad   \func
+   .popsection
+   .endm
 
+#endif /* __ASSEMBLY__ */
 #else /* !CONFIG_STACK_VALIDATION */
 
+#ifndef __ASSEMBLY__
 #define STACK_FRAME_NON_STANDARD(func)
-
+#else
+   .macro  asm_stack_frame_non_standardfunc
+   .endm
+#endif /* __ASSEMBLY__ */
 #endif /* CONFIG_STACK_VALIDATION */
 
 #endif /* _LINUX_FRAME_H */
-- 
2.17.1



[RFC v4 17/18] arm64: kernel: Annotate non-standard stack frame functions

2019-08-16 Thread Raphael Gault
Annotate assembler functions which are callable but do not
set up a correct stack frame.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/hyp-stub.S | 3 +++
 arch/arm64/kvm/hyp-init.S| 3 +++
 2 files changed, 6 insertions(+)

diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 73d46070b315..8917d42f38c7 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -6,6 +6,7 @@
  * Author: Marc Zyngier 
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -42,6 +43,7 @@ ENTRY(__hyp_stub_vectors)
ventry  el1_fiq_invalid // FIQ 32-bit EL1
ventry  el1_error_invalid   // Error 32-bit EL1
 ENDPROC(__hyp_stub_vectors)
+asm_stack_frame_non_standard __hyp_stub_vectors
 
.align 11
 
@@ -69,6 +71,7 @@ el1_sync:
 9: mov x0, xzr
eret
 ENDPROC(el1_sync)
+asm_stack_frame_non_standard el1_sync
 
 .macro invalid_vector  label
 \label:
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 160be2b4696d..63deea39313d 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
.text
.pushsection.hyp.idmap.text, "ax"
@@ -118,6 +119,7 @@ CPU_BE( orr x4, x4, #SCTLR_ELx_EE)
/* Hello, World! */
eret
 ENDPROC(__kvm_hyp_init)
+asm_stack_frame_non_standard __kvm_hyp_init
 
 ENTRY(__kvm_handle_stub_hvc)
cmp x0, #HVC_SOFT_RESTART
@@ -159,6 +161,7 @@ reset:
eret
 
 ENDPROC(__kvm_handle_stub_hvc)
+asm_stack_frame_non_standard __kvm_handle_stub_hvc
 
.ltorg
 
-- 
2.17.1



[RFC v4 15/18] arm64: kernel: Add exception on kuser32 to prevent stack analysis

2019-08-16 Thread Raphael Gault
kuser32 is used for 32-bit compatibility and therefore contains A32
instructions, which objtool does not recognise when analysing arm64
object files. Thus, we add an exception to skip validation for this
particular file.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 478491f07b4f..1239c7da4c02 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -33,6 +33,9 @@ ifneq ($(CONFIG_COMPAT_VDSO), y)
 obj-$(CONFIG_COMPAT)   += sigreturn32.o
 endif
 obj-$(CONFIG_KUSER_HELPERS)+= kuser32.o
+
+OBJECT_FILES_NON_STANDARD_kuser32.o := y
+
 obj-$(CONFIG_FUNCTION_TRACER)  += ftrace.o entry-ftrace.o
 obj-$(CONFIG_MODULES)  += module.o
 obj-$(CONFIG_ARM64_MODULE_PLTS)+= module-plts.o
-- 
2.17.1



[RFC v4 01/18] objtool: Add abstraction for computation of symbols offsets

2019-08-16 Thread Raphael Gault
The jump destination and relocation offset computations used previously are
only valid on the x86_64 architecture. We abstract them by calling
arch-dependent implementations.
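
For contrast, the arm64 implementations added later in this series are
expected to reduce to the following (consistent with the arm64 decode.c
hunks elsewhere in this posting: an AArch64 branch immediate is relative to
the branch instruction itself, and relocation addends need no +4
correction):

unsigned long arch_jump_destination(struct instruction *insn)
{
        return insn->offset + insn->immediate;
}

unsigned long arch_dest_rela_offset(int addend)
{
        return addend;
}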

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch.h|  6 ++
 tools/objtool/arch/x86/decode.c | 11 +++
 tools/objtool/check.c   | 15 ++-
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index ced3765c4f44..a9a50a25ca66 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -66,6 +66,8 @@ struct stack_op {
struct op_src src;
 };
 
+struct instruction;
+
 void arch_initial_func_cfi_state(struct cfi_state *state);
 
 int arch_decode_instruction(struct elf *elf, struct section *sec,
@@ -75,4 +77,8 @@ int arch_decode_instruction(struct elf *elf, struct section 
*sec,
 
 bool arch_callee_saved_reg(unsigned char reg);
 
+unsigned long arch_jump_destination(struct instruction *insn);
+
+unsigned long arch_dest_rela_offset(int addend);
+
 #endif /* _ARCH_H */
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 0567c47a91b1..fa33b3465722 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -11,6 +11,7 @@
 #include "lib/inat.c"
 #include "lib/insn.c"
 
+#include "../../check.h"
 #include "../../elf.h"
 #include "../../arch.h"
 #include "../../warn.h"
@@ -66,6 +67,11 @@ bool arch_callee_saved_reg(unsigned char reg)
}
 }
 
+unsigned long arch_dest_rela_offset(int addend)
+{
+   return addend + 4;
+}
+
 int arch_decode_instruction(struct elf *elf, struct section *sec,
unsigned long offset, unsigned int maxlen,
unsigned int *len, enum insn_type *type,
@@ -497,3 +503,8 @@ void arch_initial_func_cfi_state(struct cfi_state *state)
state->regs[16].base = CFI_CFA;
state->regs[16].offset = -8;
 }
+
+unsigned long arch_jump_destination(struct instruction *insn)
+{
+   return insn->offset + insn->len + insn->immediate;
+}
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 176f2f084060..479fab46b656 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -563,7 +563,7 @@ static int add_jump_destinations(struct objtool_file *file)
   insn->len);
if (!rela) {
dest_sec = insn->sec;
-   dest_off = insn->offset + insn->len + insn->immediate;
+   dest_off = arch_jump_destination(insn);
} else if (rela->sym->type == STT_SECTION) {
dest_sec = rela->sym->sec;
dest_off = rela->addend + 4;
@@ -659,7 +659,7 @@ static int add_call_destinations(struct objtool_file *file)
rela = find_rela_by_dest_range(insn->sec, insn->offset,
   insn->len);
if (!rela) {
-   dest_off = insn->offset + insn->len + insn->immediate;
+   dest_off = arch_jump_destination(insn);
insn->call_dest = find_symbol_by_offset(insn->sec,
dest_off);
 
@@ -672,14 +672,19 @@ static int add_call_destinations(struct objtool_file 
*file)
}
 
} else if (rela->sym->type == STT_SECTION) {
+   /*
+* the original x86_64 code adds 4 to the rela->addend
+* which is not needed on arm64 architecture.
+*/
+   dest_off = arch_dest_rela_offset(rela->addend);
insn->call_dest = find_symbol_by_offset(rela->sym->sec,
-   rela->addend+4);
+   dest_off);
if (!insn->call_dest ||
insn->call_dest->type != STT_FUNC) {
-   WARN_FUNC("can't find call dest symbol at 
%s+0x%x",
+   WARN_FUNC("can't find call dest symbol at 
%s+0x%lx",
  insn->sec, insn->offset,
  rela->sym->sec->name,
- rela->addend + 4);
+ dest_off);
return -1;
}
} else
-- 
2.17.1



[RFC v4 03/18] objtool: Move registers and control flow to arch-dependent code

2019-08-16 Thread Raphael Gault
The control-flow information and register macro definitions were based on
the x86_64 architecture but should be abstracted so that each architecture
can define the correct values for its registers, especially the registers
related to the stack frame (frame pointer, stack pointer and possibly
return address).
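
On arm64, the stack-frame registers would follow the AArch64 DWARF
numbering; a plausible excerpt of the per-arch cfi.h this refactor makes
room for (an assumption, for illustration only):

#define CFI_BP          29      /* x29, the frame pointer */
#define CFI_RA          30      /* x30, the link register */
#define CFI_SP          31      /* sp */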

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch/x86/include/arch_special.h | 36 +++
 tools/objtool/{ => arch/x86/include}/cfi.h|  0
 tools/objtool/check.h |  1 +
 tools/objtool/special.c   | 19 +-
 4 files changed, 38 insertions(+), 18 deletions(-)
 create mode 100644 tools/objtool/arch/x86/include/arch_special.h
 rename tools/objtool/{ => arch/x86/include}/cfi.h (100%)

diff --git a/tools/objtool/arch/x86/include/arch_special.h 
b/tools/objtool/arch/x86/include/arch_special.h
new file mode 100644
index ..424ce47013e3
--- /dev/null
+++ b/tools/objtool/arch/x86/include/arch_special.h
@@ -0,0 +1,36 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef _X86_ARCH_SPECIAL_H
+#define _X86_ARCH_SPECIAL_H
+
+#define EX_ENTRY_SIZE  12
+#define EX_ORIG_OFFSET 0
+#define EX_NEW_OFFSET  4
+
+#define JUMP_ENTRY_SIZE16
+#define JUMP_ORIG_OFFSET   0
+#define JUMP_NEW_OFFSET4
+
+#define ALT_ENTRY_SIZE 13
+#define ALT_ORIG_OFFSET0
+#define ALT_NEW_OFFSET 4
+#define ALT_FEATURE_OFFSET 8
+#define ALT_ORIG_LEN_OFFSET10
+#define ALT_NEW_LEN_OFFSET 11
+
+#define X86_FEATURE_POPCNT (4 * 32 + 23)
+#define X86_FEATURE_SMAP   (9 * 32 + 20)
+
+#endif /* _X86_ARCH_SPECIAL_H */
diff --git a/tools/objtool/cfi.h b/tools/objtool/arch/x86/include/cfi.h
similarity index 100%
rename from tools/objtool/cfi.h
rename to tools/objtool/arch/x86/include/cfi.h
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index 6d875ca6fce0..af87b55db454 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -11,6 +11,7 @@
 #include "cfi.h"
 #include "arch.h"
 #include "orc.h"
+#include "arch_special.h"
 #include 
 
 struct insn_state {
diff --git a/tools/objtool/special.c b/tools/objtool/special.c
index fdbaa611146d..b8ccee1b5382 100644
--- a/tools/objtool/special.c
+++ b/tools/objtool/special.c
@@ -14,24 +14,7 @@
 #include "builtin.h"
 #include "special.h"
 #include "warn.h"
-
-#define EX_ENTRY_SIZE  12
-#define EX_ORIG_OFFSET 0
-#define EX_NEW_OFFSET  4
-
-#define JUMP_ENTRY_SIZE16
-#define JUMP_ORIG_OFFSET   0
-#define JUMP_NEW_OFFSET4
-
-#define ALT_ENTRY_SIZE 13
-#define ALT_ORIG_OFFSET0
-#define ALT_NEW_OFFSET 4
-#define ALT_FEATURE_OFFSET 8
-#define ALT_ORIG_LEN_OFFSET10
-#define ALT_NEW_LEN_OFFSET 11
-
-#define X86_FEATURE_POPCNT (4*32+23)
-#define X86_FEATURE_SMAP   (9*32+20)
+#include "arch_special.h"
 
 struct special_entry {
const char *sec;
-- 
2.17.1



[RFC v4 05/18] objtool: special: Adapt special section handling

2019-08-16 Thread Raphael Gault
This patch abstracts the few architecture-dependent tests that are
performed when handling special sections and switch tables. It enables an
architecture to ignore a particular CPU feature, or to not handle switch
tables at all.

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch/arm64/Build|  1 +
 tools/objtool/arch/arm64/arch_special.c   | 22 +++
 .../objtool/arch/arm64/include/arch_special.h | 10 +--
 tools/objtool/arch/x86/Build  |  1 +
 tools/objtool/arch/x86/arch_special.c | 28 +++
 tools/objtool/arch/x86/include/arch_special.h |  9 ++
 tools/objtool/check.c | 24 ++--
 tools/objtool/special.c   |  9 ++
 tools/objtool/special.h   |  3 ++
 9 files changed, 96 insertions(+), 11 deletions(-)
 create mode 100644 tools/objtool/arch/arm64/arch_special.c
 create mode 100644 tools/objtool/arch/x86/arch_special.c

diff --git a/tools/objtool/arch/arm64/Build b/tools/objtool/arch/arm64/Build
index bf7a32c2b9e9..3d09be745a84 100644
--- a/tools/objtool/arch/arm64/Build
+++ b/tools/objtool/arch/arm64/Build
@@ -1,3 +1,4 @@
+objtool-y += arch_special.o
 objtool-y += decode.o
 objtool-y += orc_dump.o
 objtool-y += orc_gen.o
diff --git a/tools/objtool/arch/arm64/arch_special.c 
b/tools/objtool/arch/arm64/arch_special.c
new file mode 100644
index ..a21d28876317
--- /dev/null
+++ b/tools/objtool/arch/arm64/arch_special.c
@@ -0,0 +1,22 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include "../../special.h"
+#include "arch_special.h"
+
+void arch_force_alt_path(unsigned short feature,
+bool uaccess,
+struct special_alt *alt)
+{
+}
diff --git a/tools/objtool/arch/arm64/include/arch_special.h 
b/tools/objtool/arch/arm64/include/arch_special.h
index 63da775d0581..185103be8a51 100644
--- a/tools/objtool/arch/arm64/include/arch_special.h
+++ b/tools/objtool/arch/arm64/include/arch_special.h
@@ -30,7 +30,13 @@
 #define ALT_ORIG_LEN_OFFSET10
 #define ALT_NEW_LEN_OFFSET 11
 
-#define X86_FEATURE_POPCNT (4 * 32 + 23)
-#define X86_FEATURE_SMAP   (9 * 32 + 20)
+static inline bool arch_should_ignore_feature(unsigned short feature)
+{
+   return false;
+}
 
+static inline bool arch_support_switch_table(void)
+{
+   return false;
+}
 #endif /* _ARM64_ARCH_SPECIAL_H */
diff --git a/tools/objtool/arch/x86/Build b/tools/objtool/arch/x86/Build
index 1f11b45999d0..63e167775bc8 100644
--- a/tools/objtool/arch/x86/Build
+++ b/tools/objtool/arch/x86/Build
@@ -1,3 +1,4 @@
+objtool-y += arch_special.o
 objtool-y += decode.o
 objtool-y += orc_dump.o
 objtool-y += orc_gen.o
diff --git a/tools/objtool/arch/x86/arch_special.c 
b/tools/objtool/arch/x86/arch_special.c
new file mode 100644
index ..6583a1770bb2
--- /dev/null
+++ b/tools/objtool/arch/x86/arch_special.c
@@ -0,0 +1,28 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include "../../special.h"
+#include "arch_special.h"
+
+void arch_force_alt_path(unsigned short feature,
+bool uaccess,
+struct special_alt *alt)
+{
+   if (feature == X86_FEATURE_SMAP) {
+   if (uaccess)
+   alt->skip_orig = true;
+   else
+   alt->skip_alt = true;
+   }
+}
diff --git a/tools/objtool/arch/x86/include/arch_special.h 
b/tools/objtool/arch/x86/include/arch_special.h
index 424ce47013e3..fce2b1193194 100644
--- a/tools/objtool/arch/x86/include/arch_special.h
+++ b/tools/objtool/arch/x86/include/arch_special.h
@@ -33,4 +3

[RFC v4 07/18] objtool: Introduce INSN_UNKNOWN type

2019-08-16 Thread Raphael Gault
On arm64, some object files contain data stored in the .text section.
This data is interpreted by objtool as instructions but can't be
identified as valid ones. In order to keep analysing those files we
introduce the INSN_UNKNOWN type. The "unknown instruction" warning will
thus only be raised if such an instruction is encountered while
validating an execution branch.

This change doesn't impact the x86 decoding logic, which still uses 0 to
mark an unknown type and therefore still raises the "unknown instruction"
warning during the decoding phase.

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch.h   |  1 +
 tools/objtool/arch/arm64/decode.c  |  8 
 tools/objtool/arch/arm64/include/insn_decode.h |  4 ++--
 tools/objtool/check.c  | 10 +-
 4 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index 68d6371a24a2..9f2590e1df79 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -28,6 +28,7 @@ enum insn_type {
INSN_CLAC,
INSN_STD,
INSN_CLD,
+   INSN_UNKNOWN,
INSN_OTHER,
 };
 
diff --git a/tools/objtool/arch/arm64/decode.c 
b/tools/objtool/arch/arm64/decode.c
index be3d2eb10227..4cb9402d6fe1 100644
--- a/tools/objtool/arch/arm64/decode.c
+++ b/tools/objtool/arch/arm64/decode.c
@@ -37,9 +37,9 @@
  */
 static arm_decode_class aarch64_insn_class_decode_table[] = {
[INSN_RESERVED] = arm_decode_reserved,
-   [INSN_UNKNOWN]  = arm_decode_unknown,
+   [INSN_UNALLOC_1]= arm_decode_unknown,
[INSN_SVE_ENC]  = arm_decode_sve_encoding,
-   [INSN_UNALLOC]  = arm_decode_unknown,
+   [INSN_UNALLOC_2]= arm_decode_unknown,
[INSN_LD_ST_4]  = arm_decode_ld_st,
[INSN_DP_REG_5] = arm_decode_dp_reg,
[INSN_LD_ST_6]  = arm_decode_ld_st,
@@ -191,7 +191,7 @@ int arch_decode_instruction(struct elf *elf, struct section 
*sec,
 int arm_decode_unknown(u32 instr, unsigned char *type,
   unsigned long *immediate, struct stack_op *op)
 {
-   *type = 0;
+   *type = INSN_UNKNOWN;
return 0;
 }
 
@@ -206,7 +206,7 @@ int arm_decode_reserved(u32 instr, unsigned char *type,
unsigned long *immediate, struct stack_op *op)
 {
*immediate = instr & ONES(16);
-   *type = INSN_BUG;
+   *type = INSN_UNKNOWN;
return 0;
 }
 
diff --git a/tools/objtool/arch/arm64/include/insn_decode.h 
b/tools/objtool/arch/arm64/include/insn_decode.h
index eb54fc39dca5..a01d76306749 100644
--- a/tools/objtool/arch/arm64/include/insn_decode.h
+++ b/tools/objtool/arch/arm64/include/insn_decode.h
@@ -20,9 +20,9 @@
 #include "../../../arch.h"
 
 #define INSN_RESERVED  0b
-#define INSN_UNKNOWN   0b0001
+#define INSN_UNALLOC_1 0b0001
 #define INSN_SVE_ENC   0b0010
-#define INSN_UNALLOC   0b0011
+#define INSN_UNALLOC_2 0b0011
 #define INSN_DP_IMM0b1001  //0x100x
 #define INSN_BRANCH0b1011  //0x101x
 #define INSN_LD_ST_4   0b0100  //0bx1x0
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 519569b0329f..baa6a93f37cd 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1981,6 +1981,13 @@ static int validate_branch(struct objtool_file *file, 
struct symbol *func,
while (1) {
next_insn = next_insn_same_sec(file, insn);
 
+   if (insn->type == INSN_UNKNOWN) {
+   WARN("%s+0x%lx unknown instruction type, should never 
be reached",
+insn->sec->name,
+insn->offset);
+   return 1;
+   }
+
if (file->c_file && func && insn->func && func != 
insn->func->pfunc) {
WARN("%s() falls through to next function %s()",
 func->name, insn->func->name);
@@ -2414,7 +2421,8 @@ static int validate_reachable_instructions(struct 
objtool_file *file)
return 0;
 
for_each_insn(file, insn) {
-   if (insn->visited || ignore_unreachable_insn(insn))
+   if (insn->visited || ignore_unreachable_insn(insn) ||
+   insn->type == INSN_UNKNOWN)
continue;
 
WARN_FUNC("unreachable instruction", insn->sec, insn->offset);
-- 
2.17.1



[RFC v4 04/18] objtool: arm64: Add required implementation for supporting the aarch64 architecture in objtool.

2019-08-16 Thread Raphael Gault
Provide implementations for the arch-dependent functions that are called by
the main check function of objtool.  The ORC unwinder is not yet supported
on the arm64 architecture, so we only provide a dummy interface for now.
The decoding of instructions is split into classes and subclasses as
described in the Instruction Encoding chapter of the Armv8.5 Architecture
Reference Manual.

In order to handle load/store instructions operating on a pair of registers,
we add an extra field to the stack_op structure.

We assume that the hypervisor/secure monitor behaves correctly. This lets
us handle the hvc/smc/svc context-switching instructions as nops, since
the context is restored correctly on return.
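
As an illustration of the new field, a store-pair such as
"stp x29, x30, [sp, #-16]!" could be described to the checker roughly as
follows (a sketch with assumed register constants; the real decoder in this
patch is considerably more detailed):

        op->src.type     = OP_SRC_REG;
        op->src.reg      = CFI_BP;       /* x29, first register of the pair */
        op->dest.type    = OP_DEST_PUSH; /* pre-indexed store to the stack */
        op->extra.used   = 1;            /* a second register is involved */
        op->extra.reg    = CFI_RA;       /* x30, the link register */
        op->extra.offset = 8;            /* stored in the pair's second slot */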

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch.h  |7 +
 tools/objtool/arch/arm64/Build|7 +
 tools/objtool/arch/arm64/bit_operations.c |   67 +
 tools/objtool/arch/arm64/decode.c | 2787 +
 .../objtool/arch/arm64/include/arch_special.h |   36 +
 .../arch/arm64/include/asm/orc_types.h|   96 +
 .../arch/arm64/include/bit_operations.h   |   24 +
 tools/objtool/arch/arm64/include/cfi.h|   74 +
 .../objtool/arch/arm64/include/insn_decode.h  |  211 ++
 tools/objtool/arch/arm64/orc_dump.c   |   26 +
 tools/objtool/arch/arm64/orc_gen.c|   40 +
 11 files changed, 3375 insertions(+)
 create mode 100644 tools/objtool/arch/arm64/Build
 create mode 100644 tools/objtool/arch/arm64/bit_operations.c
 create mode 100644 tools/objtool/arch/arm64/decode.c
 create mode 100644 tools/objtool/arch/arm64/include/arch_special.h
 create mode 100644 tools/objtool/arch/arm64/include/asm/orc_types.h
 create mode 100644 tools/objtool/arch/arm64/include/bit_operations.h
 create mode 100644 tools/objtool/arch/arm64/include/cfi.h
 create mode 100644 tools/objtool/arch/arm64/include/insn_decode.h
 create mode 100644 tools/objtool/arch/arm64/orc_dump.c
 create mode 100644 tools/objtool/arch/arm64/orc_gen.c

diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index e91e12807678..bb5ce810fb6e 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -62,9 +62,16 @@ struct op_src {
int offset;
 };
 
+struct op_extra {
+   unsigned char used;
+   unsigned char reg;
+   int offset;
+};
+
 struct stack_op {
struct op_dest dest;
struct op_src src;
+   struct op_extra extra;
 };
 
 struct instruction;
diff --git a/tools/objtool/arch/arm64/Build b/tools/objtool/arch/arm64/Build
new file mode 100644
index ..bf7a32c2b9e9
--- /dev/null
+++ b/tools/objtool/arch/arm64/Build
@@ -0,0 +1,7 @@
+objtool-y += decode.o
+objtool-y += orc_dump.o
+objtool-y += orc_gen.o
+objtool-y += bit_operations.o
+
+
+CFLAGS_decode.o += -I$(OUTPUT)arch/arm64/lib
diff --git a/tools/objtool/arch/arm64/bit_operations.c 
b/tools/objtool/arch/arm64/bit_operations.c
new file mode 100644
index ..f457a14a7f5d
--- /dev/null
+++ b/tools/objtool/arch/arm64/bit_operations.c
@@ -0,0 +1,67 @@
+#include 
+#include 
+#include "bit_operations.h"
+
+#include "../../warn.h"
+
+u64 replicate(u64 x, int size, int n)
+{
+   u64 ret = 0;
+
+   while (n >= 0) {
+   ret = (ret | x) << size;
+   n--;
+   }
+   return ret | x;
+}
+
+u64 ror(u64 x, int size, int shift)
+{
+   int m = shift % size;
+
+   if (shift == 0)
+   return x;
+   return ZERO_EXTEND((x >> m) | (x << (size - m)), size);
+}
+
+int highest_set_bit(u32 x)
+{
+   int i;
+
+   for (i = 31; i >= 0; i--, x <<= 1)
+   if (x & 0x8000)
+   return i;
+   return 0;
+}
+
+/* imms and immr are both 6 bit long */
+__uint128_t decode_bit_masks(unsigned char N, unsigned char imms,
+unsigned char immr, bool immediate)
+{
+   u64 tmask, wmask;
+   u32 diff, S, R, esize, welem, telem;
+   unsigned char levels = 0, len = 0;
+
+   len = highest_set_bit((N << 6) | ((~imms) & ONES(6)));
+   levels = ZERO_EXTEND(ONES(len), 6);
+
+   if (immediate && ((imms & levels) == levels)) {
+   WARN("unknown instruction");
+   return -1;
+   }
+
+   S = imms & levels;
+   R = immr & levels;
+   diff = ZERO_EXTEND(S - R, 6);
+
+   esize = 1 << len;
+   diff = diff & ONES(len);
+
+   welem = ZERO_EXTEND(ONES(S + 1), esize);
+   telem = ZERO_EXTEND(ONES(diff + 1), esize);
+
+   wmask = replicate(ror(welem, esize, R), esize, 64 / esize);
+   tmask = replicate(telem, esize, 64 / esize);
+
+   return ((__uint128_t)wmask << 64) | tmask;
+}
diff --git a/tools/objtool/arch/arm64/decode.c 
b/tools/objtool/arch/arm64/decode.c
new file mode 100644
index ..395c5777afab
--- /dev/null
+++ b/tools/objtool/arch/arm64/decode.c
@@ -0,0 +1

[RFC v4 00/18] objtool: Add support for arm64

2019-08-16 Thread Raphael Gault
Hi,

Changes since RFC V3:
* Rebased on tip/master: the switch/jump-table code had been refactored
* Took Catalin Marinas' comments into account regarding the asm macro for
  marking exceptions.

As of now, objtool only supports the x86_64 architecture but the
groundwork has already been done in order to add support for other
architectures without too much effort.

This series of patches adds support for the arm64 architecture
based on the Armv8.5 Architecture Reference Manual.

Objtool will be a valuable tool for making progress on live patching, which
is a work in progress for arm64, and for providing more guarantees about it.

Once we have the base of objtool working, the next step will be to
port Peter Z's uaccess validation to arm64.


Raphael Gault (18):
  objtool: Add abstraction for computation of symbols offsets
  objtool: orc: Refactor ORC API for other architectures to implement.
  objtool: Move registers and control flow to arch-dependent code
  objtool: arm64: Add required implementation for supporting the aarch64
architecture in objtool.
  objtool: special: Adapt special section handling
  objtool: arm64: Adapt the stack frame checks for arm architecture
  objtool: Introduce INSN_UNKNOWN type
  objtool: Refactor switch-tables code to support other architectures
  gcc-plugins: objtool: Add plugin to detect switch table on arm64
  objtool: arm64: Implement functions to add switch tables alternatives
  arm64: alternative: Mark .altinstr_replacement as containing
executable instructions
  arm64: assembler: Add macro to annotate asm function having non
standard stack-frame.
  arm64: sleep: Prevent stack frame warnings from objtool
  arm64: kvm: Annotate non-standard stack frame functions
  arm64: kernel: Add exception on kuser32 to prevent stack analysis
  arm64: crypto: Add exceptions for crypto object to prevent stack
analysis
  arm64: kernel: Annotate non-standard stack frame functions
  objtool: arm64: Enable stack validation for arm64

 arch/arm64/Kconfig|1 +
 arch/arm64/crypto/Makefile|3 +
 arch/arm64/include/asm/alternative.h  |2 +-
 arch/arm64/kernel/Makefile|3 +
 arch/arm64/kernel/hyp-stub.S  |3 +
 arch/arm64/kernel/sleep.S |5 +
 arch/arm64/kvm/hyp-init.S |3 +
 arch/arm64/kvm/hyp/entry.S|3 +
 include/linux/frame.h |   19 +-
 scripts/Makefile.gcc-plugins  |2 +
 scripts/gcc-plugins/Kconfig   |9 +
 .../arm64_switch_table_detection_plugin.c |   58 +
 tools/objtool/Build   |2 -
 tools/objtool/arch.h  |   19 +
 tools/objtool/arch/arm64/Build|8 +
 tools/objtool/arch/arm64/arch_special.c   |  165 +
 tools/objtool/arch/arm64/bit_operations.c |   67 +
 tools/objtool/arch/arm64/decode.c | 2815 +
 .../objtool/arch/arm64/include/arch_special.h |   52 +
 .../arch/arm64/include/asm/orc_types.h|   96 +
 .../arch/arm64/include/bit_operations.h   |   24 +
 tools/objtool/arch/arm64/include/cfi.h|   74 +
 .../objtool/arch/arm64/include/insn_decode.h  |  210 ++
 tools/objtool/arch/arm64/orc_dump.c   |   26 +
 tools/objtool/arch/arm64/orc_gen.c|   40 +
 tools/objtool/arch/x86/Build  |3 +
 tools/objtool/arch/x86/arch_special.c |  107 +
 tools/objtool/arch/x86/decode.c   |   16 +
 tools/objtool/arch/x86/include/arch_special.h |   45 +
 tools/objtool/{ => arch/x86/include}/cfi.h|0
 tools/objtool/{ => arch/x86}/orc_dump.c   |4 +-
 tools/objtool/{ => arch/x86}/orc_gen.c|  104 +-
 tools/objtool/check.c |  309 +-
 tools/objtool/check.h |   10 +
 tools/objtool/elf.c   |3 +-
 tools/objtool/orc.h   |4 +-
 tools/objtool/special.c   |   28 +-
 tools/objtool/special.h   |   13 +-
 38 files changed, 4129 insertions(+), 226 deletions(-)
 create mode 100644 scripts/gcc-plugins/arm64_switch_table_detection_plugin.c
 create mode 100644 tools/objtool/arch/arm64/Build
 create mode 100644 tools/objtool/arch/arm64/arch_special.c
 create mode 100644 tools/objtool/arch/arm64/bit_operations.c
 create mode 100644 tools/objtool/arch/arm64/decode.c
 create mode 100644 tools/objtool/arch/arm64/include/arch_special.h
 create mode 100644 tools/objtool/arch/arm64/include/asm/orc_types.h
 create mode 100644 tools/objtool/arch/arm64/include/bit_operations.h
 create mode 100644 tools/objtool/arch/arm64/include/cfi.h
 create mode 100644 tools/objtool/arch/arm64/include/insn_decode.h
 create mode 100644 tools/objtool/arch/arm64/orc_dump.c
 create mode 100644 tools/objtool/arch/arm64/orc_gen.c
 create mode 100644 tools/objtool/arch/x86/arch_special.c
 cr

Re: [PATCH v2 0/5] arm64: Enable access to pmu registers by user-space

2019-07-23 Thread Raphael Gault

Hi,

Any further comments on this patchset?

Cheers,

On 7/5/19 9:55 AM, Raphael Gault wrote:

The perf user-space tool relies on the PMU to monitor events. It offers an
abstraction layer over the hardware counters since the underlying
implementation is cpu-dependent. We want to allow userspace tools to have
access to the registers storing the hardware counters' values directly.
This targets specifically self-monitoring tasks in order to reduce the
overhead by directly accessing the registers without having to go
through the kernel.
In order to do this, we need to set up the PMU so that it exposes its
registers to userspace access.

The first patch adds a test to the perf tool so that we can check that the
access to the registers works correctly from userspace.

The second patch adds a capability to the arm64 cpufeatures framework in
order to detect when we are running on a heterogeneous system.

The third patch focuses on the armv8 pmuv3 PMU support and makes sure that
the access to the PMU registers is enabled and that userspace has access to
the relevant information in order to use them.

The fourth patch puts in place callbacks to enable access to the hardware
counters from userspace when a compatible event is opened using the perf
API.

The fifth patch adds short documentation about direct access to PMU
counters from userspace.

**Changes since v1**

* Rebased on linux-next/master
* Do not include RSEQ materials (test and utilities) since we want to
   enable direct access to counters only on homogeneous systems.
* Do not include the hook definitions for the same reason as above.
* Add a cpu feature/capability to detect heterogeneous systems.

Raphael Gault (5):
   perf: arm64: Add test to check userspace access to hardware counters.
   arm64: cpufeature: Add feature to detect heterogeneous systems
   arm64: pmu: Add function implementation to update event index in
 userpage.
   arm64: perf: Enable pmu counter direct access for perf event on armv8
   Documentation: arm64: Document PMU counters access from userspace

  .../arm64/pmu_counter_user_access.txt |  42 +++
  arch/arm64/include/asm/cpucaps.h  |   3 +-
  arch/arm64/include/asm/mmu.h  |   6 +
  arch/arm64/include/asm/mmu_context.h  |   2 +
  arch/arm64/include/asm/perf_event.h   |  14 +
  arch/arm64/kernel/cpufeature.c|  20 ++
  arch/arm64/kernel/perf_event.c|  23 ++
  drivers/perf/arm_pmu.c|  38 +++
  include/linux/perf/arm_pmu.h  |   2 +
  tools/perf/arch/arm64/include/arch-tests.h|   6 +
  tools/perf/arch/arm64/tests/Build |   1 +
  tools/perf/arch/arm64/tests/arch-tests.c  |   4 +
  tools/perf/arch/arm64/tests/user-events.c | 255 ++
  13 files changed, 415 insertions(+), 1 deletion(-)
  create mode 100644 Documentation/arm64/pmu_counter_user_access.txt
  create mode 100644 tools/perf/arch/arm64/tests/user-events.c



--
Raphael Gault


Re: [RFC V3 00/18] objtool: Add support for arm64

2019-07-10 Thread Raphael Gault

Hi all,

Just a gentle ping to see if anyone has comments to make about this 
version :)


On 6/24/19 10:55 AM, Raphael Gault wrote:

As of now, objtool only supports the x86_64 architecture but the
groundwork has already been done in order to add support for other
architectures without too much effort.

This series of patches adds support for the arm64 architecture
based on the Armv8.5 Architecture Reference Manual.

Objtool will be a valuable tool for making progress on live patching, which
is a work in progress for arm64, and for providing more guarantees about it.

Once we have the base of objtool working, the next step will be to
port Peter Z's uaccess validation to arm64.

Changes since previous version:
* Rebased on tip/master: Note that I had to re-expose the
`struct alternative` using check.h because it is now used outside of
check.c.
* Reorder commits for a more coherent progression
* Introduce GCC plugin to help detect switch-tables for arm64
This plugin could be improved: it plugs in after the RTL control-flow
graph passes but only extracts information about the switch tables. I
originally intended for it to introduce new code_label/note entries within
the RTL representation in order to reference them and thus get the address
of the branch instruction. However, I did not manage to do it properly
using gen_rtx_CODE_LABEL/emit_label_before/after. If anyone has
experience with RTL plugins, I am all ears for advice.

Raphael Gault (18):
   objtool: Add abstraction for computation of symbols offsets
   objtool: orc: Refactor ORC API for other architectures to implement.
   objtool: Move registers and control flow to arch-dependent code
   objtool: arm64: Add required implementation for supporting the aarch64
 architecture in objtool.
   objtool: special: Adapt special section handling
   objtool: arm64: Adapt the stack frame checks for arm architecture
   objtool: Introduce INSN_UNKNOWN type
   objtool: Refactor switch-tables code to support other architectures
   gcc-plugins: objtool: Add plugin to detect switch table on arm64
   objtool: arm64: Implement functions to add switch tables alternatives
   arm64: alternative: Mark .altinstr_replacement as containing
 executable instructions
   arm64: assembler: Add macro to annotate asm function having non
 standard stack-frame.
   arm64: sleep: Prevent stack frame warnings from objtool
   arm64: kvm: Annotate non-standard stack frame functions
   arm64: kernel: Add exception on kuser32 to prevent stack analysis
   arm64: crypto: Add exceptions for crypto object to prevent stack
 analysis
   arm64: kernel: Annotate non-standard stack frame functions
   objtool: arm64: Enable stack validation for arm64

  arch/arm64/Kconfig|1 +
  arch/arm64/crypto/Makefile|3 +
  arch/arm64/include/asm/alternative.h  |2 +-
  arch/arm64/include/asm/assembler.h|   13 +
  arch/arm64/kernel/Makefile|3 +
  arch/arm64/kernel/hyp-stub.S  |2 +
  arch/arm64/kernel/sleep.S |4 +
  arch/arm64/kvm/hyp-init.S |2 +
  arch/arm64/kvm/hyp/entry.S|2 +
  scripts/Makefile.gcc-plugins  |2 +
  scripts/gcc-plugins/Kconfig   |9 +
  .../arm64_switch_table_detection_plugin.c |   58 +
  tools/objtool/Build   |2 -
  tools/objtool/arch.h  |   21 +-
  tools/objtool/arch/arm64/Build|8 +
  tools/objtool/arch/arm64/arch_special.c   |  173 +
  tools/objtool/arch/arm64/bit_operations.c |   67 +
  tools/objtool/arch/arm64/decode.c | 2809 +
  .../objtool/arch/arm64/include/arch_special.h |   52 +
  .../arch/arm64/include/asm/orc_types.h|   96 +
  .../arch/arm64/include/bit_operations.h   |   24 +
  tools/objtool/arch/arm64/include/cfi.h|   74 +
  .../objtool/arch/arm64/include/insn_decode.h  |  210 ++
  tools/objtool/arch/arm64/orc_dump.c   |   26 +
  tools/objtool/arch/arm64/orc_gen.c|   40 +
  tools/objtool/arch/x86/Build  |3 +
  tools/objtool/arch/x86/arch_special.c |  101 +
  tools/objtool/arch/x86/decode.c   |   16 +
  tools/objtool/arch/x86/include/arch_special.h |   45 +
  tools/objtool/{ => arch/x86/include}/cfi.h|0
  tools/objtool/{ => arch/x86}/orc_dump.c   |4 +-
  tools/objtool/{ => arch/x86}/orc_gen.c|  104 +-
  tools/objtool/check.c |  309 +-
  tools/objtool/check.h |   10 +
  tools/objtool/elf.c   |3 +-
  tools/objtool/orc.h   |4 +-
  tools/objtool/special.c   |   28 +-
  tools/objtool/special.h   |   13 +-
  38 files changed, 4119 insertions(+), 224 deletions(-)
  create mode 100644 scripts/gc

Re: [PATCH v2 1/5] perf: arm64: Add test to check userspace access to hardware counters.

2019-07-08 Thread Raphael Gault

Hi Arnaldo,

On 7/5/19 3:54 PM, Arnaldo Carvalho de Melo wrote:

Em Fri, Jul 05, 2019 at 09:55:37AM +0100, Raphael Gault escreveu:

This test relies on the fact that the PMU registers are accessible
from userspace. It then uses the perf_event_mmap_page to retrieve
the counter index and access the underlying register.

This test uses sched_setaffinity(2) in order to run on all CPUs and thus
check the behaviour of the PMU of all CPUs in a big.LITTLE environment.

Signed-off-by: Raphael Gault 
---
  tools/perf/arch/arm64/include/arch-tests.h |   6 +
  tools/perf/arch/arm64/tests/Build  |   1 +
  tools/perf/arch/arm64/tests/arch-tests.c   |   4 +
  tools/perf/arch/arm64/tests/user-events.c  | 255 +
  4 files changed, 266 insertions(+)
  create mode 100644 tools/perf/arch/arm64/tests/user-events.c

diff --git a/tools/perf/arch/arm64/include/arch-tests.h 
b/tools/perf/arch/arm64/include/arch-tests.h
index 90ec4c8cb880..a9b17ae0560b 100644
--- a/tools/perf/arch/arm64/include/arch-tests.h
+++ b/tools/perf/arch/arm64/include/arch-tests.h
@@ -2,11 +2,17 @@
  #ifndef ARCH_TESTS_H
  #define ARCH_TESTS_H
  
+#define __maybe_unused	__attribute__((unused))



What is wrong with using:

#include 

?

[acme@quaco perf]$ find tools/perf/ -name "*.[ch]" | xargs grep __maybe_unused 
| wc -l
1115
[acme@quaco perf]$ grep __maybe_unused tools/include/linux/compiler.h
#ifndef __maybe_unused
# define __maybe_unused __attribute__((unused))
[acme@quaco perf]$

Also please don't break strings in multiple lines just to comply with
the 80 column limit. That is ok when you have multiple lines ending with
a newline, but otherwise just makes it look ugly.



You're right, I shall correct those points.

Thanks,

--
Raphael Gault


[PATCH v2 1/5] perf: arm64: Add test to check userspace access to hardware counters.

2019-07-05 Thread Raphael Gault
This test relies on the fact that the PMU registers are accessible
from userspace. It then uses the perf_event_mmap_page to retrieve
the counter index and access the underlying register.

This test uses sched_setaffinity(2) in order to run on all CPUs and thus
check the behaviour of the PMU of all CPUs in a big.LITTLE environment.
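
For reference, the counter read such a test performs boils down to the
usual perf mmap-page seqlock pattern. Below is a minimal sketch, not the
actual test code: read_evcnt_direct() stands in for an mrs-based accessor
like the one built from the PMEVCNTR_READ_CASE() macros further down, and
error handling is omitted.

/*
 * Sketch only: read the counter published in the mmap page, retrying
 * if the kernel updates the page (seqlock) while we read it.
 */
static u64 mmap_read_count(struct perf_event_mmap_page *pc,
			   u64 (*read_evcnt_direct)(u32))
{
	u32 seq, idx;
	u64 count;

	do {
		seq = pc->lock;
		barrier();
		idx = pc->index;	/* 0 means no direct access */
		count = pc->offset;
		if (idx)
			count += read_evcnt_direct(idx - 1);
		barrier();
	} while (pc->lock != seq);	/* raced with an update: retry */

	return count;
}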

Signed-off-by: Raphael Gault 
---
 tools/perf/arch/arm64/include/arch-tests.h |   6 +
 tools/perf/arch/arm64/tests/Build  |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c   |   4 +
 tools/perf/arch/arm64/tests/user-events.c  | 255 +
 4 files changed, 266 insertions(+)
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

diff --git a/tools/perf/arch/arm64/include/arch-tests.h 
b/tools/perf/arch/arm64/include/arch-tests.h
index 90ec4c8cb880..a9b17ae0560b 100644
--- a/tools/perf/arch/arm64/include/arch-tests.h
+++ b/tools/perf/arch/arm64/include/arch-tests.h
@@ -2,11 +2,17 @@
 #ifndef ARCH_TESTS_H
 #define ARCH_TESTS_H
 
+#define __maybe_unused __attribute__((unused))
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 struct thread;
 struct perf_sample;
+int test__arch_unwind_sample(struct perf_sample *sample,
+struct thread *thread);
 #endif
 
 extern struct test arch_tests[];
+int test__rd_pmevcntr(struct test *test __maybe_unused,
+ int subtest __maybe_unused);
+
 
 #endif
diff --git a/tools/perf/arch/arm64/tests/Build 
b/tools/perf/arch/arm64/tests/Build
index a61c06bdb757..3f9a20c17fc6 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,5 @@
 perf-y += regs_load.o
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
+perf-y += user-events.o
 perf-y += arch-tests.o
diff --git a/tools/perf/arch/arm64/tests/arch-tests.c 
b/tools/perf/arch/arm64/tests/arch-tests.c
index 5b1543c98022..57df9b89dede 100644
--- a/tools/perf/arch/arm64/tests/arch-tests.c
+++ b/tools/perf/arch/arm64/tests/arch-tests.c
@@ -10,6 +10,10 @@ struct test arch_tests[] = {
.func = test__dwarf_unwind,
},
 #endif
+   {
+   .desc = "User counter access",
+   .func = test__rd_pmevcntr,
+   },
{
.func = NULL,
},
diff --git a/tools/perf/arch/arm64/tests/user-events.c 
b/tools/perf/arch/arm64/tests/user-events.c
new file mode 100644
index ..958e4cd000c1
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/user-events.c
@@ -0,0 +1,255 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "perf.h"
+#include "debug.h"
+#include "tests/tests.h"
+#include "cloexec.h"
+#include "util.h"
+#include "arch-tests.h"
+
+/*
+ * ARMv8 ARM reserves the following encoding for system registers:
+ * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview",
+ *  C5.2, version:ARM DDI 0487A.f)
+ *  [20-19] : Op0
+ *  [18-16] : Op1
+ *  [15-12] : CRn
+ *  [11-8]  : CRm
+ *  [7-5]   : Op2
+ */
+#define Op0_shift   19
+#define Op0_mask0x3
+#define Op1_shift   16
+#define Op1_mask0x7
+#define CRn_shift   12
+#define CRn_mask0xf
+#define CRm_shift   8
+#define CRm_mask0xf
+#define Op2_shift   5
+#define Op2_mask0x7
+
+#define __stringify(x) #x
+
+#define read_sysreg(r) ({  \
+   u64 __val;  \
+   asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \
+   __val;  \
+})
+
+#define PMEVCNTR_READ_CASE(idx)\
+   case idx:   \
+   return read_sysreg(pmevcntr##idx##_el0)
+
+#define PMEVCNTR_CASES(readwrite)  \
+   PMEVCNTR_READ_CASE(0);  \
+   PMEVCNTR_READ_CASE(1);  \
+   PMEVCNTR_READ_CASE(2);  \
+   PMEVCNTR_READ_CASE(3);  \
+   PMEVCNTR_READ_CASE(4);  \
+   PMEVCNTR_READ_CASE(5);  \
+   PMEVCNTR_READ_CASE(6);  \
+   PMEVCNTR_READ_CASE(7);  \
+   PMEVCNTR_READ_CASE(8);  \
+   PMEVCNTR_READ_CASE(9);  \
+   PMEVCNTR_READ_CASE(10); \
+   PMEVCNTR_READ_CASE(11); \
+   PMEVCNTR_READ_CASE(12); \
+   PMEVCNTR_READ_CASE(13); \
+   PMEVCNTR_READ_CASE(14); \
+   PMEVCNTR_READ_CASE(15); \
+   PMEVCNTR_READ_CASE(16); \
+   PMEVCNTR_READ_CASE(17); \
+   PMEVCNTR_READ_CASE(18

[PATCH v2 4/5] arm64: perf: Enable pmu counter direct access for perf event on armv8

2019-07-05 Thread Raphael Gault
Keep track of events opened with direct access to the hardware counters
and modify permissions while they are open.

The strategy used here is the same as the one x86 uses: every time an event
is mapped, the permissions are set if required. The atomic field added
in the mm_context helps keep track of the different events opened and
deactivates the permissions when all are unmapped.
We also need to update the permissions in the context switch code so
that tasks keep the right permissions.

Signed-off-by: Raphael Gault 
---
 arch/arm64/include/asm/mmu.h |  6 +
 arch/arm64/include/asm/mmu_context.h |  2 ++
 arch/arm64/include/asm/perf_event.h  | 14 ++
 drivers/perf/arm_pmu.c   | 38 
 4 files changed, 60 insertions(+)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index fd6161336653..88ed4466bd06 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -18,6 +18,12 @@
 
 typedef struct {
atomic64_t  id;
+
+   /*
+    * non-zero if userspace has access to hardware
+    * counters directly.
+    */
+   atomic_t        pmu_direct_access;
void*vdso;
unsigned long   flags;
 } mm_context_t;
diff --git a/arch/arm64/include/asm/mmu_context.h 
b/arch/arm64/include/asm/mmu_context.h
index 7ed0adb187a8..6e66ff940494 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -224,6 +225,7 @@ static inline void __switch_mm(struct mm_struct *next)
}
 
check_and_switch_context(next, cpu);
+   perf_switch_user_access(next);
 }
 
 static inline void
diff --git a/arch/arm64/include/asm/perf_event.h 
b/arch/arm64/include/asm/perf_event.h
index 2bdbc79bbd01..ba58fa726631 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -8,6 +8,7 @@
 
 #include 
 #include 
+#include 
 
#define ARMV8_PMU_MAX_COUNTERS  32
#define ARMV8_PMU_COUNTER_MASK  (ARMV8_PMU_MAX_COUNTERS - 1)
@@ -223,4 +224,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
(regs)->pstate = PSR_MODE_EL1h; \
 }
 
+static inline void perf_switch_user_access(struct mm_struct *mm)
+{
+   if (!IS_ENABLED(CONFIG_PERF_EVENTS))
+   return;
+
+   if (atomic_read(&mm->context.pmu_direct_access)) {
+   write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR,
+pmuserenr_el0);
+   } else {
+   write_sysreg(0, pmuserenr_el0);
+   }
+}
+
 #endif
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 2d06b8095a19..4844fe97d775 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -25,6 +25,7 @@
 #include 
 
 #include 
+#include 
 
 static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
 static DEFINE_PER_CPU(int, cpu_irq);
@@ -778,6 +779,41 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
&cpu_pmu->node);
 }
 
+static void refresh_pmuserenr(void *mm)
+{
+   perf_switch_user_access(mm);
+}
+
+static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return;
+
+   /*
+* This function relies on not being called concurrently in two
+* tasks in the same mm.  Otherwise one task could observe
+* pmu_direct_access > 1 and return all the way back to
+* userspace with user access disabled while another task is still
+* doing on_each_cpu_mask() to enable user access.
+*
+* For now, this can't happen because all callers hold mmap_sem
+* for write.  If this changes, we'll need a different solution.
+*/
+   lockdep_assert_held_write(&mm->mmap_sem);
+
+   if (atomic_inc_return(&mm->context.pmu_direct_access) == 1)
+   on_each_cpu(refresh_pmuserenr, mm, 1);
+}
+
+static void armpmu_event_unmapped(struct perf_event *event, struct mm_struct 
*mm)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return;
+
+   if (atomic_dec_and_test(&mm->context.pmu_direct_access))
+   on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1);
+}
+
 static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 {
struct arm_pmu *pmu;
@@ -799,6 +835,8 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
.pmu_enable = armpmu_enable,
.pmu_disable= armpmu_disable,
.event_init = armpmu_event_init,
+   .event_mapped   = armpmu_event_mapped,
+   .event_unmapped = armpmu_event_unmapped,
.add= armpmu_add,
.del= armpmu_del,
.start  = armpmu_start,
-- 
2.17.1



[PATCH v2 3/5] arm64: pmu: Add function implementation to update event index in userpage.

2019-07-05 Thread Raphael Gault
In order to be able to access the counter directly from userspace,
we need to provide the index of the counter using the userpage.
We thus need to override the event_idx function to retrieve and
convert the perf_event index to the armv8 hardware index.

Since the arm_pmu driver can be used by any implementation, even
if not armv8, two components play a role in making sure the
behaviour is correct and consistent with the PMU capabilities:

* the ARMPMU_EL0_RD_CNTR flag, which denotes the capability to access a
counter from userspace.
* the event_idx callback, which is implemented and initialized by
the PMU implementation: if no callback is provided, the default
behaviour applies, returning 0 as the index value.
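
For orientation, the userspace side of this contract might look like the
sketch below. This is not the actual test code: read_pmevcntr() is a
hypothetical mrs-based accessor such as the one in patch 1, and the
idx - 1 adjustment follows the convention of the x86 rdpmc test (the
rseq-based variant of this test does the same before writing pmselr_el0).

/*
 * Sketch: consume the index published in the perf_event_mmap_page.
 * A real accessor would also special-case idx == 32, the remapped
 * cycle counter, and read pmccntr_el0 instead.
 */
static u64 read_user_counter(struct perf_event_mmap_page *pc)
{
	u32 idx = pc->index;

	if (!pc->cap_user_rdpmc || !idx)
		return 0;	/* no direct access: fall back to read(2) */

	/* event counters are published with a +1 offset */
	return read_pmevcntr(idx - 1);
}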

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/perf_event.c | 22 ++
 include/linux/perf/arm_pmu.h   |  2 ++
 2 files changed, 24 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 24575c0a0065..f6336197d29e 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -819,6 +819,22 @@ static void armv8pmu_clear_event_idx(struct pmu_hw_events 
*cpuc,
clear_bit(idx - 1, cpuc->used_mask);
 }
 
+static int armv8pmu_access_event_idx(struct perf_event *event)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return 0;
+
+   /*
+    * We remap the cycle counter index to 32 to
+    * match the offset applied to the rest of
+    * the counter indices.
+    */
+   if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER)
+   return 32;
+
+   return event->hw.idx;
+}
+
 /*
  * Add an event filter to a given event.
  */
@@ -912,6 +928,9 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
if (armv8pmu_event_is_64bit(event))
event->hw.flags |= ARMPMU_EVT_64BIT;
 
+   if (!cpus_have_const_cap(ARM64_HAS_HETEROGENEOUS_PMU))
+   event->hw.flags |= ARMPMU_EL0_RD_CNTR;
+
/* Only expose micro/arch events supported by this PMU */
if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS)
&& test_bit(hw_event_id, armpmu->pmceid_bitmap)) {
@@ -1032,6 +1051,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu)
cpu_pmu->set_event_filter   = armv8pmu_set_event_filter;
cpu_pmu->filter_match   = armv8pmu_filter_match;
 
+   cpu_pmu->pmu.event_idx  = armv8pmu_access_event_idx;
+
return 0;
 }
 
@@ -1210,6 +1231,7 @@ void arch_perf_update_userpage(struct perf_event *event,
 */
freq = arch_timer_get_rate();
userpg->cap_user_time = 1;
+   userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR);
 
clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
NSEC_PER_SEC, 0);
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 71f525a35ac2..1106a9ac00fd 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -26,6 +26,8 @@
  */
 /* Event uses a 64bit counter */
 #define ARMPMU_EVT_64BIT   1
+/* Allow access to hardware counter from userspace */
+#define ARMPMU_EL0_RD_CNTR 2
 
 #define HW_OP_UNSUPPORTED  0x
 #define C(_x)  PERF_COUNT_HW_CACHE_##_x
-- 
2.17.1



[PATCH v2 5/5] Documentation: arm64: Document PMU counters access from userspace

2019-07-05 Thread Raphael Gault
Add a documentation file to describe the access to the pmu hardware
counters from userspace

Signed-off-by: Raphael Gault 
---
 .../arm64/pmu_counter_user_access.txt | 42 +++
 1 file changed, 42 insertions(+)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt

diff --git a/Documentation/arm64/pmu_counter_user_access.txt 
b/Documentation/arm64/pmu_counter_user_access.txt
new file mode 100644
index ..6788b1107381
--- /dev/null
+++ b/Documentation/arm64/pmu_counter_user_access.txt
@@ -0,0 +1,42 @@
+Access to PMU hardware counter from userspace
+=
+
+Overview
+
+The perf user-space tool relies on the PMU to monitor events. It offers an
+abstraction layer over the hardware counters since the underlying
+implementation is cpu-dependent.
+Arm64 allows userspace tools to have access to the registers storing the
+hardware counters' values directly.
+
+This targets specifically self-monitoring tasks in order to reduce the overhead
+by directly accessing the registers without having to go through the kernel.
+
+How-to
+--
+The focus is set on the armv8 pmuv3, which makes sure that access to the PMU
+registers is enabled and that userspace has access to the relevant
+information in order to use them.
+
+In order to have access to the hardware counter it is necessary to open the
+event using the perf tool interface: the sys_perf_event_open syscall returns
+a fd which can subsequently be used with the mmap syscall in order to
+retrieve a page of memory containing information about the event.
+The PMU driver uses this page to expose to the user the hardware counter's
+index. Using this index enables the user to access the PMU registers using the
+`mrs` instruction.
+
+Have a look at `tools/perf/arch/arm64/tests/user-events.c` for an example. It
+can be run using the perf tool to check that the access to the registers
+works correctly from userspace:
+
+./perf test -v
+
+About chained events
+
+When the user requests an event to be counted on 64 bits, two hardware
+counters are used and need to be combined to retrieve the correct value:
+
+val = read_counter(idx);
+if ((event.attr.config1 & 0x1))
+   val = (val << 32) | read_counter(idx - 1);
-- 
2.17.1



[PATCH v2 0/5] arm64: Enable access to pmu registers by user-space

2019-07-05 Thread Raphael Gault
The perf user-space tool relies on the PMU to monitor events. It offers an
abstraction layer over the hardware counters since the underlying
implementation is cpu-dependent. We want to allow userspace tools to have
access to the registers storing the hardware counters' values directly.
This targets specifically self-monitoring tasks in order to reduce the
overhead by directly accessing the registers without having to go
through the kernel.
In order to do this we need to set up the PMU so that it exposes its registers
to userspace access.

The first patch adds a test to the perf tool so that we can check that the
access to the registers works correctly from userspace.

The second patch adds a capability to the arm64 cpufeatures framework in
order to detect when we are running on a heterogeneous system.

The third patch focuses on the armv8 pmuv3 PMU support and makes sure that
access to the PMU registers is enabled and that userspace has access to
the relevant information in order to use them.

The fourth patch puts in place callbacks to enable access to the hardware
counters from userspace when a compatible event is opened using the perf
API.

The fifth patch adds short documentation about direct access to PMU
counters from userspace.

**Changes since v1**

* Rebased on linux-next/master
* Do not include RSEQ materials (test and utilities) since we want to
  enable direct access to counters only on homogeneous systems.
* Do not include the hook definitions for the same reason as above.
* Add a cpu feature/capability to detect heterogeneous systems.

Raphael Gault (5):
  perf: arm64: Add test to check userspace access to hardware counters.
  arm64: cpufeature: Add feature to detect heterogeneous systems
  arm64: pmu: Add function implementation to update event index in
userpage.
  arm64: perf: Enable pmu counter direct access for perf event on armv8
  Documentation: arm64: Document PMU counters access from userspace

 .../arm64/pmu_counter_user_access.txt |  42 +++
 arch/arm64/include/asm/cpucaps.h  |   3 +-
 arch/arm64/include/asm/mmu.h  |   6 +
 arch/arm64/include/asm/mmu_context.h  |   2 +
 arch/arm64/include/asm/perf_event.h   |  14 +
 arch/arm64/kernel/cpufeature.c|  20 ++
 arch/arm64/kernel/perf_event.c|  23 ++
 drivers/perf/arm_pmu.c|  38 +++
 include/linux/perf/arm_pmu.h  |   2 +
 tools/perf/arch/arm64/include/arch-tests.h|   6 +
 tools/perf/arch/arm64/tests/Build |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c  |   4 +
 tools/perf/arch/arm64/tests/user-events.c | 255 ++
 13 files changed, 415 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

-- 
2.17.1



[PATCH v2 2/5] arm64: cpufeature: Add feature to detect heterogeneous systems

2019-07-05 Thread Raphael Gault
This feature is required in order to enable direct access to PMU counters
from userspace only when the system is homogeneous.
This feature checks the model of each CPU brought online and compares it
to that of the boot CPU. If they differ, the system is heterogeneous.

Cc: suzuki.poul...@arm.com

Signed-off-by: Raphael Gault 
---
 arch/arm64/include/asm/cpucaps.h |  3 ++-
 arch/arm64/kernel/cpufeature.c   | 20 
 arch/arm64/kernel/perf_event.c   |  1 +
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index f19fe4b9acc4..040370af38ad 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -52,7 +52,8 @@
 #define ARM64_HAS_IRQ_PRIO_MASKING 42
 #define ARM64_HAS_DCPODP   43
 #define ARM64_WORKAROUND_1463225   44
+#define ARM64_HAS_HETEROGENEOUS_PMU45
 
-#define ARM64_NCAPS45
+#define ARM64_NCAPS46
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index f29f36a65175..3527f329ba1a 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1248,6 +1248,15 @@ static bool can_use_gic_priorities(const struct 
arm64_cpu_capabilities *entry,
 }
 #endif
 
+static bool has_heterogeneous_pmu(const struct arm64_cpu_capabilities *entry,
+int scope)
+{
+   u32 model = read_cpuid_id() & MIDR_CPU_MODEL_MASK;
+   struct cpuinfo_arm64 *boot = &per_cpu(cpu_data, 0);
+
+   return (boot->reg_midr & MIDR_CPU_MODEL_MASK) != model;
+}
+
 static const struct arm64_cpu_capabilities arm64_features[] = {
{
.desc = "GIC system register CPU interface",
@@ -1548,6 +1557,16 @@ static const struct arm64_cpu_capabilities 
arm64_features[] = {
.min_field_value = 1,
},
 #endif
+   {
+   /*
+* Detect whether the system is heterogeneous or
+* homogeneous
+*/
+   .desc = "Detect whether we have heterogeneous CPUs",
+   .capability = ARM64_HAS_HETEROGENEOUS_PMU,
+   .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU | 
ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU,
+   .matches = has_heterogeneous_pmu,
+   },
{},
 };
 
@@ -1715,6 +1734,7 @@ static void __init setup_elf_hwcaps(const struct 
arm64_cpu_capabilities *hwcaps)
cap_set_elf_hwcap(hwcaps);
 }
 
+
 static void update_cpu_capabilities(u16 scope_mask)
 {
int i;
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 96e90e270042..24575c0a0065 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* ARMv8 Cortex-A53 specific event types. */
 #define ARMV8_A53_PERFCTR_PREF_LINEFILL0xC2
-- 
2.17.1



Re: [RFC V3 12/18] arm64: assembler: Add macro to annotate asm function having non standard stack-frame.

2019-07-02 Thread Raphael Gault

Hi,

On 7/1/19 3:40 PM, Catalin Marinas wrote:

On Mon, Jun 24, 2019 at 10:55:42AM +0100, Raphael Gault wrote:

--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -752,4 +752,17 @@ USER(\label, icivau, \tmp2)// 
invalidate I line PoU
  .Lyield_out_\@ :
.endm
  
+	/*

+* This macro is the arm64 assembler equivalent of the
+* macro STACK_FRAME_NON_STANDARD define at
+* ~/include/linux/frame.h
+*/
+   .macro  asm_stack_frame_non_standardfunc
+#ifdef CONFIG_STACK_VALIDATION
+   .pushsection ".discard.func_stack_frame_non_standard"
+   .8byte  \func


Nitpicks:

Does .quad vs .8byte make any difference?



No it doesn't, I'll use .quad then.


Could we place this in include/linux/frame.h directly with a generic
name (and some __ASSEMBLY__ guards)? It doesn't look to be arm specific.



It might be more consistent indeed, I'll do that.

Thanks,

--
Raphael Gault


Re: [PATCH 3/7] perf: arm64: Use rseq to test userspace access to pmu counters

2019-06-25 Thread Raphael Gault

Hi Mark, Hi Will,

Now that we have a better idea of what enabling this feature for 
heterogeneous systems would look like (both with and without using 
rseqs), it might be worth discussing whether this is in fact a desirable 
thing in terms of the performance/complexity trade-off.


Indeed, while not as scary as first thought, maybe the rseq method would 
still dissuade users from using this feature. It is also worth noting 
that if we only enable this feature on homogeneous systems, the `mrs` 
hook/emulation would not be necessary.
Given the complexity of the setup, we need to consider whether we 
want to upstream this and have to maintain it afterwards.


Thanks,

--
Raphael Gault


Re: [PATCH 3/7] perf: arm64: Use rseq to test userspace access to pmu counters

2019-06-25 Thread Raphael Gault

Hi Mathieu, Hi Szabolcs,

On 6/11/19 8:33 PM, Mathieu Desnoyers wrote:

- On Jun 11, 2019, at 6:57 PM, Mark Rutland mark.rutl...@arm.com wrote:


Hi Arnaldo,

On Tue, Jun 11, 2019 at 11:33:46AM -0300, Arnaldo Carvalho de Melo wrote:

Em Tue, Jun 11, 2019 at 01:53:11PM +0100, Raphael Gault escreveu:

Add an extra test to check userspace access to pmu hardware counters.
This test doesn't rely on the seqlock as a synchronisation mechanism but
instead uses the restartable sequences to make sure that the thread is
not interrupted when reading the index of the counter and the associated
pmu register.

In addition to reading the pmu counters, this test is run several times
in order to measure the ratio of failures:
I ran this test on the Juno development platform, which is big.LITTLE
with 4 Cortex A53 and 2 Cortex A57. The results vary quite a lot
(running it with 100 tests is not so long and I did it several times).
I ran it once with 1 iterations:
`runs: 1, abort: 62.53%, zero: 34.93%, success: 2.54%`

Signed-off-by: Raphael Gault 
---
  tools/perf/arch/arm64/include/arch-tests.h|   5 +-
  tools/perf/arch/arm64/include/rseq-arm64.h| 220 ++


So, I applied the first patch in this series, but could you please break
this patch into at least two, one introducing the facility
(include/rseq*) and the second adding the test?

We try to enforce this kind of granularity as down the line we may want
to revert one part while the other already has other uses and thus
wouldn't allow a straight revert.

Also, can this go to tools/arch/ instead? Is this really perf specific?
Isn't there any arch/arm64/include files for the kernel that we could
mirror and have it checked for drift in tools/perf/check-headers.sh?


The rseq bits aren't strictly perf specific, and I think the existing
bits under tools/testing/selftests/rseq/ could be factored out to common
locations under tools/include/ and tools/arch/*/include/.


Hi Mark,

Thanks for CCing me!

Or into a stand-alone librseq project:

https://github.com/compudj/librseq (currently a development branch in
my own github)

I don't see why this user-space code should sit in the kernel tree.
It is not tooling-specific.



 From a scan, those already duplicate barriers and other helpers which
already have definitions under tools/, which seems unfortunate. :/

Comments below are for Raphael and Matthieu.

[...]


+static u64 noinline mmap_read_self(void *addr, int cpu)
+{
+   struct perf_event_mmap_page *pc = addr;
+   u32 idx = 0;
+   u64 count = 0;
+
+   asm volatile goto(
+ RSEQ_ASM_DEFINE_TABLE(0, 1f, 2f, 3f)
+"nop\n"
+ RSEQ_ASM_STORE_RSEQ_CS(1, 0b, rseq_cs)
+RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)
+ RSEQ_ASM_OP_R_LOAD(pc_idx)
+ RSEQ_ASM_OP_R_AND(0xFF)
+RSEQ_ASM_OP_R_STORE(idx)
+ RSEQ_ASM_OP_R_SUB(0x1)
+RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)
+ "msr pmselr_el0, " RSEQ_ASM_TMP_REG "\n"
+ "isb\n"
+RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)


I really don't understand why the cpu_id needs to be compared 3 times
here (?!?)

Explicit comparison of the cpu_id within the rseq critical section
should be done _once_.

If the kernel happens to preempt and migrate the thread while in the
critical section, it's the kernel's job to move user-space execution
to the abort handler.


+ "mrs " RSEQ_ASM_TMP_REG ", pmxevcntr_el0\n"
+ RSEQ_ASM_OP_R_FINAL_STORE(cnt, 2)
+"nop\n"
+ RSEQ_ASM_DEFINE_ABORT(3, abort)
+ :/* No output operands */
+:  [cpu_id] "r" (cpu),
+   [current_cpu_id] "Qo" (__rseq_abi.cpu_id),
+   [rseq_cs] "m" (__rseq_abi.rseq_cs),
+   [cnt] "m" (count),
+   [pc_idx] "r" (&pc->index),
+   [idx] "m" (idx)
+ :"memory"
+ :abort
+);


While baroque, this doesn't look as scary as I thought it would!


That's good to hear :)



However, I'm very scared that this is modifying input operands without
clobbering them. IIUC this is beacause we're trying to use asm goto,
which doesn't permit output operands.


This is correct. What is wrong with modifying the target of "m" input
operands in an inline asm that has a "memory" clobber ?

gcc documentation at https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
states:

"An asm goto statement cannot have outputs. This is due to an internal
restriction of the compiler: control transfer instructions 

[RFC V3 06/18] objtool: arm64: Adapt the stack frame checks for arm architecture

2019-06-24 Thread Raphael Gault
Since the way the initial stack frame is set up when entering a function is
different from what is done on x86_64, we need to add some more checks to
support the different cases.  As opposed to x86_64, the return address is
not stored on the stack by a call instruction but is instead loaded into a
register. The initial stack frame is thus empty when entering a function
and 2 push operations are needed to set it up correctly (typically a
`stp x29, x30, [sp, #-16]!` storing the frame pointer and link register,
followed by `mov x29, sp`). All the different combinations need to be taken
into account.

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch.h  |   2 +
 tools/objtool/arch/arm64/decode.c |  28 +
 tools/objtool/arch/x86/decode.c   |   5 ++
 tools/objtool/check.c | 100 --
 tools/objtool/elf.c   |   3 +-
 5 files changed, 131 insertions(+), 7 deletions(-)

diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index 53a72477c352..723600aae13f 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -89,4 +89,6 @@ unsigned long arch_jump_destination(struct instruction *insn);
 
 unsigned long arch_dest_rela_offset(int addend);
 
+bool arch_is_insn_sibling_call(struct instruction *insn);
+
 #endif /* _ARCH_H */
diff --git a/tools/objtool/arch/arm64/decode.c 
b/tools/objtool/arch/arm64/decode.c
index 6c77ad1a08ec..5be1d87b1a1c 100644
--- a/tools/objtool/arch/arm64/decode.c
+++ b/tools/objtool/arch/arm64/decode.c
@@ -106,6 +106,34 @@ unsigned long arch_dest_rela_offset(int addend)
return addend;
 }
 
+/*
+ * In order to know whether we are looking at a sibling
+ * call rather than a switch table, we look
+ * back at the previous instructions and see if we are
+ * jumping inside the same function that we are already
+ * in.
+ */
+bool arch_is_insn_sibling_call(struct instruction *insn)
+{
+   struct instruction *prev;
+   struct list_head *l;
+   struct symbol *sym;
+   list_for_each_prev(l, >list) {
+   prev = list_entry(l, struct instruction, list);
+   if (!prev->func ||
+   prev->func->pfunc != insn->func->pfunc)
+   return false;
+   if (prev->stack_op.src.reg != ADR_SOURCE)
+   continue;
+   sym = find_symbol_containing(insn->sec, insn->immediate);
+   if (!sym || sym->type != STT_FUNC)
+   return false;
+   else if (sym->type == STT_FUNC)
+   return true;
+   break;
+   }
+   return false;
+}
 static int is_arm64(struct elf *elf)
 {
switch (elf->ehdr.e_machine) {
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 8767ee935c47..e2087ddced69 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -72,6 +72,11 @@ unsigned long arch_dest_rela_offset(int addend)
return addend + 4;
 }
 
+bool arch_is_insn_sibling_call(struct instruction *insn)
+{
+   return true;
+}
+
 int arch_decode_instruction(struct elf *elf, struct section *sec,
unsigned long offset, unsigned int maxlen,
unsigned int *len, unsigned char *type,
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 0fba7b70d73a..3172f49c3a58 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -568,10 +568,10 @@ static int add_jump_destinations(struct objtool_file 
*file)
dest_off = arch_jump_destination(insn);
} else if (rela->sym->type == STT_SECTION) {
dest_sec = rela->sym->sec;
-   dest_off = rela->addend + 4;
+   dest_off = arch_dest_rela_offset(rela->addend);
} else if (rela->sym->sec->idx) {
dest_sec = rela->sym->sec;
-   dest_off = rela->sym->sym.st_value + rela->addend + 4;
+   dest_off = rela->sym->sym.st_value + 
arch_dest_rela_offset(rela->addend);
} else if (strstr(rela->sym->name, "_indirect_thunk_")) {
/*
 * Retpoline jumps are really dynamic jumps in
@@ -1339,8 +1339,8 @@ static void save_reg(struct insn_state *state, unsigned 
char reg, int base,
 
 static void restore_reg(struct insn_state *state, unsigned char reg)
 {
-   state->regs[reg].base = CFI_UNDEFINED;
-   state->regs[reg].offset = 0;
+   state->regs[reg].base = initial_func_cfi.regs[reg].base;
+   state->regs[reg].offset = initial_func_cfi.regs[reg].offset;
 }
 
 /*
@@ -1496,8 +1496,32 @@ static int update_insn_state(struct instruction *insn, 
struct insn_state *state)
 
/* add imm, %rsp */
state->stack_size -= op->src.offset;
-   if (cfa->base == CFI_SP)
+   

[RFC V3 10/18] objtool: arm64: Implement functions to add switch tables alternatives

2019-06-24 Thread Raphael Gault
This patch implements the functions required to identify and add as
alternatives all the possible destinations of the switch table.
This implementation relies on the new plugin introduced previously, which
records information about the switch table in a .objtool_data section.
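
Note that arch_add_switch_table() below walks .objtool_data as an array
of struct switch_table_info. The struct itself is defined in the plugin
patch and is not quoted here; the shape sketched below is a guess, purely
for orientation, and the real definition may differ.

/*
 * Hypothetical layout, for orientation only; the authoritative
 * definition lives in the GCC plugin patch.
 */
struct switch_table_info {
	u64 switch_table_ref;	/* location of the jump table in .rodata */
	u64 sibling;
	u64 orig_insn;		/* address of the branching instruction */
};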

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch/arm64/arch_special.c   | 142 +-
 tools/objtool/arch/arm64/decode.c |   2 +-
 .../objtool/arch/arm64/include/arch_special.h |  10 ++
 .../objtool/arch/arm64/include/insn_decode.h  |   3 +-
 tools/objtool/check.c |   8 +-
 tools/objtool/check.h |   2 +
 6 files changed, 157 insertions(+), 10 deletions(-)

diff --git a/tools/objtool/arch/arm64/arch_special.c 
b/tools/objtool/arch/arm64/arch_special.c
index a0f7066994b5..33f30876d339 100644
--- a/tools/objtool/arch/arm64/arch_special.c
+++ b/tools/objtool/arch/arm64/arch_special.c
@@ -12,8 +12,13 @@
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, see <http://www.gnu.org/licenses/>.
  */
+
+#include 
+#include 
+
 #include "../../special.h"
 #include "arch_special.h"
+#include "bit_operations.h"
 
 void arch_force_alt_path(unsigned short feature,
 bool uaccess,
@@ -21,9 +26,141 @@ void arch_force_alt_path(unsigned short feature,
 {
 }
 
+static u32 next_offset(u8 *table, u8 entry_size)
+{
+   switch (entry_size) {
+   case 1:
+   return table[0];
+   case 2:
+   return *(u16 *)(table);
+   default:
+   return *(u32 *)(table);
+   }
+}
+
+static u32 get_table_entry_size(u32 insn)
+{
+   unsigned char size = (insn >> 30) & ONES(2);
+   switch (size) {
+   case 0:
+   return 1;
+   case 1:
+   return 2;
+   default:
+   return 4;
+   }
+}
+
+static int add_possible_branch(struct objtool_file *file,
+  struct instruction *insn,
+  u32 base, u32 offset)
+{
+   struct instruction *new_insn;
+   struct alternative *alt;
+   offset = base + 4 * offset;
+   new_insn = calloc(1, sizeof(*new_insn));
+
+   if (new_insn == NULL) {
+   WARN("allocation failure, can't add jump alternative");
+   return -1;
+   }
+
+   memcpy(new_insn, insn, sizeof(*insn));
+   alt = calloc(1, sizeof(*alt));
+
+   if (alt == NULL) {
+   WARN("allocation failure, can't add jump alternative");
+   return -1;
+   }
+
+   new_insn->type = INSN_JUMP_UNCONDITIONAL;
+   new_insn->immediate = offset;
+   INIT_LIST_HEAD(&new_insn->alts);
+   new_insn->jump_dest = find_insn(file, insn->sec, offset);
+   alt->insn = new_insn;
+   alt->skip_orig = true;
+   list_add_tail(&alt->list, &insn->alts);
+   list_add_tail(&new_insn->list, &file->insn_list);
+   return 0;
+}
+
 int arch_add_switch_table(struct objtool_file *file, struct instruction *insn,
-   struct rela *table, struct rela *next_table)
+ struct rela *table, struct rela *next_table)
 {
+   struct rela *objtool_data_rela = NULL;
+   struct switch_table_info *swt_info = NULL;
+   struct section *objtool_data = find_section_by_name(file->elf, 
".objtool_data");
+   struct section *rodata_sec = find_section_by_name(file->elf, ".rodata");
+   struct section *branch_sec = NULL;
+   u8 *switch_table = NULL;
+   u64 base_offset = 0;
+   struct instruction *pre_jump_insn;
+   u32 sec_size = 0;
+   u32 entry_size = 0;
+   u32 offset = 0;
+   u32 i, j;
+
+   if (objtool_data == NULL)
+   return 0;
+
+   /*
+    * 1. Using rela, identify entry for the switch table
+    * 2. Retrieve base offset
+    * 3. Retrieve branch instruction
+    * 4. For all entries in switch table:
+    *  4.1. Compute new offset
+    *  4.2. Create alternative instruction
+    *  4.3. Add alt_instr to insn->alts list
+    */
+   sec_size = objtool_data->sh.sh_size;
+   for (i = 0, swt_info = (void *)objtool_data->data->d_buf;
+i < sec_size / sizeof(struct switch_table_info);
+i++, swt_info++) {
+   offset = i * sizeof(struct switch_table_info);
+   objtool_data_rela = find_rela_by_dest_range(objtool_data, 
offset,
+   sizeof(u64));
+   /* retrieving the objtool data of the switch table we need */
+   if (objtool_data_rela == NULL ||
+   table->sym->sec != objtool_data_rela->sym->sec ||
+   table->addend != objtool_data_rela->addend)
+

[RFC V3 04/18] objtool: arm64: Add required implementation for supporting the aarch64 architecture in objtool.

2019-06-24 Thread Raphael Gault
Provide an implementation for the arch-dependent functions that are called by
the main check function of objtool.  The ORC unwinder is not yet supported
by the arm64 architecture so we only provide a dummy interface for now.
The decoding of the instructions is split into classes and subclasses as
described in the Instruction Encoding section of the Armv8.5 Architecture
Reference Manual.

In order to handle the load/store instructions for a pair of registers
we add an extra field to the stack_op structure.

We consider that the hypervisor/secure-monitor is behaving
correctly. This enables us to handle the hvc/smc/svc context switching
instructions as nops, since we consider that the context is restored
correctly.
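
Concretely, the top-level decode step amounts to indexing a table of
per-class decoder callbacks with the instruction's class bits. The sketch
below assumes that op0 (bits [28:25] of the instruction word) selects the
class, as laid out in the Arm manual; the real table is
aarch64_insn_class_decode_table in decode.c.

typedef int (*arm_decode_class)(u32 instr, unsigned char *type,
				unsigned long *immediate,
				struct stack_op *op);

/* Dispatch on op0 and let the class decoder fill in the results. */
static int decode_one_insn(u32 instr, unsigned char *type,
			   unsigned long *immediate, struct stack_op *op)
{
	unsigned char op0 = (instr >> 25) & 0xf;

	return aarch64_insn_class_decode_table[op0](instr, type,
						    immediate, op);
}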

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch.h  |7 +
 tools/objtool/arch/arm64/Build|7 +
 tools/objtool/arch/arm64/bit_operations.c |   67 +
 tools/objtool/arch/arm64/decode.c | 2781 +
 .../objtool/arch/arm64/include/arch_special.h |   36 +
 .../arch/arm64/include/asm/orc_types.h|   96 +
 .../arch/arm64/include/bit_operations.h   |   24 +
 tools/objtool/arch/arm64/include/cfi.h|   74 +
 .../objtool/arch/arm64/include/insn_decode.h  |  211 ++
 tools/objtool/arch/arm64/orc_dump.c   |   26 +
 tools/objtool/arch/arm64/orc_gen.c|   40 +
 11 files changed, 3369 insertions(+)
 create mode 100644 tools/objtool/arch/arm64/Build
 create mode 100644 tools/objtool/arch/arm64/bit_operations.c
 create mode 100644 tools/objtool/arch/arm64/decode.c
 create mode 100644 tools/objtool/arch/arm64/include/arch_special.h
 create mode 100644 tools/objtool/arch/arm64/include/asm/orc_types.h
 create mode 100644 tools/objtool/arch/arm64/include/bit_operations.h
 create mode 100644 tools/objtool/arch/arm64/include/cfi.h
 create mode 100644 tools/objtool/arch/arm64/include/insn_decode.h
 create mode 100644 tools/objtool/arch/arm64/orc_dump.c
 create mode 100644 tools/objtool/arch/arm64/orc_gen.c

diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index ce7db772248e..53a72477c352 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -60,9 +60,16 @@ struct op_src {
int offset;
 };
 
+struct op_extra {
+   unsigned char used;
+   unsigned char reg;
+   int offset;
+};
+
 struct stack_op {
struct op_dest dest;
struct op_src src;
+   struct op_extra extra;
 };
 
 struct instruction;
diff --git a/tools/objtool/arch/arm64/Build b/tools/objtool/arch/arm64/Build
new file mode 100644
index ..bf7a32c2b9e9
--- /dev/null
+++ b/tools/objtool/arch/arm64/Build
@@ -0,0 +1,7 @@
+objtool-y += decode.o
+objtool-y += orc_dump.o
+objtool-y += orc_gen.o
+objtool-y += bit_operations.o
+
+
+CFLAGS_decode.o += -I$(OUTPUT)arch/arm64/lib
diff --git a/tools/objtool/arch/arm64/bit_operations.c 
b/tools/objtool/arch/arm64/bit_operations.c
new file mode 100644
index ..f457a14a7f5d
--- /dev/null
+++ b/tools/objtool/arch/arm64/bit_operations.c
@@ -0,0 +1,67 @@
+#include 
+#include 
+#include "bit_operations.h"
+
+#include "../../warn.h"
+
+u64 replicate(u64 x, int size, int n)
+{
+   u64 ret = 0;
+
+   while (n >= 0) {
+   ret = (ret | x) << size;
+   n--;
+   }
+   return ret | x;
+}
+
+u64 ror(u64 x, int size, int shift)
+{
+   int m = shift % size;
+
+   if (shift == 0)
+   return x;
+   return ZERO_EXTEND((x >> m) | (x << (size - m)), size);
+}
+
+int highest_set_bit(u32 x)
+{
+   int i;
+
+   for (i = 31; i >= 0; i--, x <<= 1)
+   if (x & 0x80000000)
+   return i;
+   return 0;
+}
+
+/* imms and immr are both 6 bit long */
+__uint128_t decode_bit_masks(unsigned char N, unsigned char imms,
+unsigned char immr, bool immediate)
+{
+   u64 tmask, wmask;
+   u32 diff, S, R, esize, welem, telem;
+   unsigned char levels = 0, len = 0;
+
+   len = highest_set_bit((N << 6) | ((~imms) & ONES(6)));
+   levels = ZERO_EXTEND(ONES(len), 6);
+
+   if (immediate && ((imms & levels) == levels)) {
+   WARN("unknown instruction");
+   return -1;
+   }
+
+   S = imms & levels;
+   R = immr & levels;
+   diff = ZERO_EXTEND(S - R, 6);
+
+   esize = 1 << len;
+   diff = diff & ONES(len);
+
+   welem = ZERO_EXTEND(ONES(S + 1), esize);
+   telem = ZERO_EXTEND(ONES(diff + 1), esize);
+
+   wmask = replicate(ror(welem, esize, R), esize, 64 / esize);
+   tmask = replicate(telem, esize, 64 / esize);
+
+   return ((__uint128_t)wmask << 64) | tmask;
+}
diff --git a/tools/objtool/arch/arm64/decode.c 
b/tools/objtool/arch/arm64/decode.c
new file mode 100644
index ..6c77ad1a08ec
--- /dev/null
+++ b/tools/objtool/arch/arm64/decode.c
@@ -0,0 +1

[RFC V3 02/18] objtool: orc: Refactor ORC API for other architectures to implement.

2019-06-24 Thread Raphael Gault
The ORC unwinder is only supported on x86 at the moment and should thus live
in the x86 architecture code. In order not to break the overall structure in
case another architecture decides to support the ORC unwinder via objtool,
we choose to let the implementation be done in the architecture-dependent
code.

Signed-off-by: Raphael Gault 
---
 tools/objtool/Build |   2 -
 tools/objtool/arch.h|   3 +
 tools/objtool/arch/x86/Build|   2 +
 tools/objtool/{ => arch/x86}/orc_dump.c |   4 +-
 tools/objtool/{ => arch/x86}/orc_gen.c  | 104 ++--
 tools/objtool/check.c   |  99 +-
 tools/objtool/orc.h |   4 +-
 7 files changed, 111 insertions(+), 107 deletions(-)
 rename tools/objtool/{ => arch/x86}/orc_dump.c (98%)
 rename tools/objtool/{ => arch/x86}/orc_gen.c (66%)

diff --git a/tools/objtool/Build b/tools/objtool/Build
index 749becdf5b90..2ed83344e0a5 100644
--- a/tools/objtool/Build
+++ b/tools/objtool/Build
@@ -2,8 +2,6 @@ objtool-y += arch/$(SRCARCH)/
 objtool-y += builtin-check.o
 objtool-y += builtin-orc.o
 objtool-y += check.o
-objtool-y += orc_gen.o
-objtool-y += orc_dump.o
 objtool-y += elf.o
 objtool-y += special.o
 objtool-y += objtool.o
diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index 2a38a834cf40..ce7db772248e 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -10,6 +10,7 @@
 #include 
 #include "elf.h"
 #include "cfi.h"
+#include "orc.h"
 
 #define INSN_JUMP_CONDITIONAL  1
 #define INSN_JUMP_UNCONDITIONAL2
@@ -75,6 +76,8 @@ int arch_decode_instruction(struct elf *elf, struct section 
*sec,
 
 bool arch_callee_saved_reg(unsigned char reg);
 
+int arch_orc_read_unwind_hints(struct objtool_file *file);
+
 unsigned long arch_jump_destination(struct instruction *insn);
 
 unsigned long arch_dest_rela_offset(int addend);
diff --git a/tools/objtool/arch/x86/Build b/tools/objtool/arch/x86/Build
index b998412c017d..1f11b45999d0 100644
--- a/tools/objtool/arch/x86/Build
+++ b/tools/objtool/arch/x86/Build
@@ -1,4 +1,6 @@
 objtool-y += decode.o
+objtool-y += orc_dump.o
+objtool-y += orc_gen.o
 
 inat_tables_script = arch/x86/tools/gen-insn-attr-x86.awk
 inat_tables_maps = arch/x86/lib/x86-opcode-map.txt
diff --git a/tools/objtool/orc_dump.c b/tools/objtool/arch/x86/orc_dump.c
similarity index 98%
rename from tools/objtool/orc_dump.c
rename to tools/objtool/arch/x86/orc_dump.c
index 13ccf775a83a..cfe8f96bdd68 100644
--- a/tools/objtool/orc_dump.c
+++ b/tools/objtool/arch/x86/orc_dump.c
@@ -4,8 +4,8 @@
  */
 
 #include 
-#include "orc.h"
-#include "warn.h"
+#include "../../orc.h"
+#include "../../warn.h"
 
 static const char *reg_name(unsigned int reg)
 {
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/arch/x86/orc_gen.c
similarity index 66%
rename from tools/objtool/orc_gen.c
rename to tools/objtool/arch/x86/orc_gen.c
index 27a4112848c2..b4f285bf5271 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/arch/x86/orc_gen.c
@@ -6,11 +6,11 @@
 #include 
 #include 
 
-#include "orc.h"
-#include "check.h"
-#include "warn.h"
+#include "../../orc.h"
+#include "../../check.h"
+#include "../../warn.h"
 
-int create_orc(struct objtool_file *file)
+int arch_create_orc(struct objtool_file *file)
 {
struct instruction *insn;
 
@@ -116,7 +116,7 @@ static int create_orc_entry(struct section *u_sec, struct 
section *ip_relasec,
return 0;
 }
 
-int create_orc_sections(struct objtool_file *file)
+int arch_create_orc_sections(struct objtool_file *file)
 {
struct instruction *insn, *prev_insn;
struct section *sec, *u_sec, *ip_relasec;
@@ -209,3 +209,97 @@ int create_orc_sections(struct objtool_file *file)
 
return 0;
 }
+
+int arch_orc_read_unwind_hints(struct objtool_file *file)
+{
+   struct section *sec, *relasec;
+   struct rela *rela;
+   struct unwind_hint *hint;
+   struct instruction *insn;
+   struct cfi_reg *cfa;
+   int i;
+
+   sec = find_section_by_name(file->elf, ".discard.unwind_hints");
+   if (!sec)
+   return 0;
+
+   relasec = sec->rela;
+   if (!relasec) {
+   WARN("missing .rela.discard.unwind_hints section");
+   return -1;
+   }
+
+   if (sec->len % sizeof(struct unwind_hint)) {
+   WARN("struct unwind_hint size mismatch");
+   return -1;
+   }
+
+   file->hints = true;
+
+   for (i = 0; i < sec->len / sizeof(struct unwind_hint); i++) {
+   hint = (struct unwind_hint *)sec->data->d_buf + i;
+
+   rela = find_rela_by_dest(sec, i * sizeof(*hint));
+   if (!rela) {
+   WARN("can't find rela for unwind_hints[%d]", i);
+ 

[RFC V3 07/18] objtool: Introduce INSN_UNKNOWN type

2019-06-24 Thread Raphael Gault
On arm64 some object files contain data stored in the .text section.
This data is interpreted by objtool as instructions but can't be
identified as valid ones. In order to keep analysing those files we
introduce the INSN_UNKNOWN type. The "unknown instruction" warning will thus
only be raised if such instructions are encountered while validating an
execution branch.

This change doesn't impact the x86 decoding logic since 0 is still used
as a way to specify an unknown type, still raising the "unknown instruction"
warning during the decoding phase.

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch.h   |  3 ++-
 tools/objtool/arch/arm64/decode.c  |  8 
 tools/objtool/arch/arm64/include/insn_decode.h |  4 ++--
 tools/objtool/check.c  | 10 +-
 4 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index 723600aae13f..f3f94e2a1403 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -26,7 +26,8 @@
 #define INSN_CLAC  12
 #define INSN_STD   13
 #define INSN_CLD   14
-#define INSN_OTHER 15
+#define INSN_UNKNOWN   15
+#define INSN_OTHER 16
 #define INSN_LAST  INSN_OTHER
 
 enum op_dest_type {
diff --git a/tools/objtool/arch/arm64/decode.c 
b/tools/objtool/arch/arm64/decode.c
index 5be1d87b1a1c..a40338a895f5 100644
--- a/tools/objtool/arch/arm64/decode.c
+++ b/tools/objtool/arch/arm64/decode.c
@@ -37,9 +37,9 @@
  */
 static arm_decode_class aarch64_insn_class_decode_table[] = {
[INSN_RESERVED] = arm_decode_reserved,
-   [INSN_UNKNOWN]  = arm_decode_unknown,
+   [INSN_UNALLOC_1]= arm_decode_unknown,
[INSN_SVE_ENC]  = arm_decode_sve_encoding,
-   [INSN_UNALLOC]  = arm_decode_unknown,
+   [INSN_UNALLOC_2]= arm_decode_unknown,
[INSN_LD_ST_4]  = arm_decode_ld_st,
[INSN_DP_REG_5] = arm_decode_dp_reg,
[INSN_LD_ST_6]  = arm_decode_ld_st,
@@ -191,7 +191,7 @@ int arch_decode_instruction(struct elf *elf, struct section 
*sec,
 int arm_decode_unknown(u32 instr, unsigned char *type,
   unsigned long *immediate, struct stack_op *op)
 {
-   *type = 0;
+   *type = INSN_UNKNOWN;
return 0;
 }
 
@@ -206,7 +206,7 @@ int arm_decode_reserved(u32 instr, unsigned char *type,
unsigned long *immediate, struct stack_op *op)
 {
*immediate = instr & ONES(16);
-   *type = INSN_BUG;
+   *type = INSN_UNKNOWN;
return 0;
 }
 
diff --git a/tools/objtool/arch/arm64/include/insn_decode.h 
b/tools/objtool/arch/arm64/include/insn_decode.h
index eb54fc39dca5..a01d76306749 100644
--- a/tools/objtool/arch/arm64/include/insn_decode.h
+++ b/tools/objtool/arch/arm64/include/insn_decode.h
@@ -20,9 +20,9 @@
 #include "../../../arch.h"
 
 #define INSN_RESERVED  0b0000
-#define INSN_UNKNOWN   0b0001
+#define INSN_UNALLOC_1 0b0001
 #define INSN_SVE_ENC   0b0010
-#define INSN_UNALLOC   0b0011
+#define INSN_UNALLOC_2 0b0011
 #define INSN_DP_IMM    0b1001  //0x100x
 #define INSN_BRANCH    0b1011  //0x101x
 #define INSN_LD_ST_4   0b0100  //0bx1x0
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 3172f49c3a58..cba1d91451cc 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1952,6 +1952,13 @@ static int validate_branch(struct objtool_file *file, 
struct instruction *first,
while (1) {
next_insn = next_insn_same_sec(file, insn);
 
+   if (insn->type == INSN_UNKNOWN) {
+   WARN("%s+0x%lx unknown instruction type, should never 
be reached",
+insn->sec->name,
+insn->offset);
+   return 1;
+   }
+
if (file->c_file && func && insn->func && func != 
insn->func->pfunc) {
WARN("%s() falls through to next function %s()",
 func->name, insn->func->name);
@@ -2383,7 +2390,8 @@ static int validate_reachable_instructions(struct 
objtool_file *file)
return 0;
 
for_each_insn(file, insn) {
-   if (insn->visited || ignore_unreachable_insn(insn))
+   if (insn->visited || ignore_unreachable_insn(insn) ||
+   insn->type == INSN_UNKNOWN)
continue;
 
WARN_FUNC("unreachable instruction", insn->sec, insn->offset);
-- 
2.17.1



[RFC V3 05/18] objtool: special: Adapt special section handling

2019-06-24 Thread Raphael Gault
This patch abstracts the few architecture-dependent tests that are
performed when handling special sections and switch tables. It enables any
architecture to ignore a particular CPU feature or not to handle switch
tables.

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch/arm64/Build|  1 +
 tools/objtool/arch/arm64/arch_special.c   | 22 +++
 .../objtool/arch/arm64/include/arch_special.h | 10 +--
 tools/objtool/arch/x86/Build  |  1 +
 tools/objtool/arch/x86/arch_special.c | 28 +++
 tools/objtool/arch/x86/include/arch_special.h |  9 ++
 tools/objtool/check.c | 15 --
 tools/objtool/special.c   |  9 ++
 tools/objtool/special.h   |  3 ++
 9 files changed, 87 insertions(+), 11 deletions(-)
 create mode 100644 tools/objtool/arch/arm64/arch_special.c
 create mode 100644 tools/objtool/arch/x86/arch_special.c

diff --git a/tools/objtool/arch/arm64/Build b/tools/objtool/arch/arm64/Build
index bf7a32c2b9e9..3d09be745a84 100644
--- a/tools/objtool/arch/arm64/Build
+++ b/tools/objtool/arch/arm64/Build
@@ -1,3 +1,4 @@
+objtool-y += arch_special.o
 objtool-y += decode.o
 objtool-y += orc_dump.o
 objtool-y += orc_gen.o
diff --git a/tools/objtool/arch/arm64/arch_special.c 
b/tools/objtool/arch/arm64/arch_special.c
new file mode 100644
index ..a21d28876317
--- /dev/null
+++ b/tools/objtool/arch/arm64/arch_special.c
@@ -0,0 +1,22 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include "../../special.h"
+#include "arch_special.h"
+
+void arch_force_alt_path(unsigned short feature,
+bool uaccess,
+struct special_alt *alt)
+{
+}
diff --git a/tools/objtool/arch/arm64/include/arch_special.h 
b/tools/objtool/arch/arm64/include/arch_special.h
index 63da775d0581..185103be8a51 100644
--- a/tools/objtool/arch/arm64/include/arch_special.h
+++ b/tools/objtool/arch/arm64/include/arch_special.h
@@ -30,7 +30,13 @@
 #define ALT_ORIG_LEN_OFFSET10
 #define ALT_NEW_LEN_OFFSET 11
 
-#define X86_FEATURE_POPCNT (4 * 32 + 23)
-#define X86_FEATURE_SMAP   (9 * 32 + 20)
+static inline bool arch_should_ignore_feature(unsigned short feature)
+{
+   return false;
+}
 
+static inline bool arch_support_switch_table(void)
+{
+   return false;
+}
 #endif /* _ARM64_ARCH_SPECIAL_H */
diff --git a/tools/objtool/arch/x86/Build b/tools/objtool/arch/x86/Build
index 1f11b45999d0..63e167775bc8 100644
--- a/tools/objtool/arch/x86/Build
+++ b/tools/objtool/arch/x86/Build
@@ -1,3 +1,4 @@
+objtool-y += arch_special.o
 objtool-y += decode.o
 objtool-y += orc_dump.o
 objtool-y += orc_gen.o
diff --git a/tools/objtool/arch/x86/arch_special.c 
b/tools/objtool/arch/x86/arch_special.c
new file mode 100644
index ..6583a1770bb2
--- /dev/null
+++ b/tools/objtool/arch/x86/arch_special.c
@@ -0,0 +1,28 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include "../../special.h"
+#include "arch_special.h"
+
+void arch_force_alt_path(unsigned short feature,
+bool uaccess,
+struct special_alt *alt)
+{
+   if (feature == X86_FEATURE_SMAP) {
+   if (uaccess)
+   alt->skip_orig = true;
+   else
+   alt->skip_alt = true;
+   }
+}
diff --git a/tools/objtool/arch/x86/include/arch_special.h 
b/tools/objtool/arch/x86/include/arch_special.h
index 424ce47013e3..fce2b1193194 100644
--- a/tools/objtool/arch/x86/include/arch_special.h
+++ b/tools/objtool/arch/x86/include/arch_special.h
@@ -33,4 +33,13 @@
 #define 

[RFC V3 08/18] objtool: Refactor switch-tables code to support other architectures

2019-06-24 Thread Raphael Gault
The way to identify switch-tables and retrieve all the data necessary
to handle the different execution branches is not the same on all
architectures. In order to be able to add other architecture support,
this patch defines arch-dependent functions to process jump-tables.

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch/arm64/arch_special.c | 15 +
 tools/objtool/arch/x86/arch_special.c   | 73 +
 tools/objtool/check.c   | 84 +
 tools/objtool/check.h   |  7 +++
 tools/objtool/special.h | 10 ++-
 5 files changed, 107 insertions(+), 82 deletions(-)

diff --git a/tools/objtool/arch/arm64/arch_special.c 
b/tools/objtool/arch/arm64/arch_special.c
index a21d28876317..a0f7066994b5 100644
--- a/tools/objtool/arch/arm64/arch_special.c
+++ b/tools/objtool/arch/arm64/arch_special.c
@@ -20,3 +20,18 @@ void arch_force_alt_path(unsigned short feature,
 struct special_alt *alt)
 {
 }
+
+int arch_add_switch_table(struct objtool_file *file, struct instruction *insn,
+   struct rela *table, struct rela *next_table)
+{
+   return 0;
+}
+
+struct rela *arch_find_switch_table(struct objtool_file *file,
+   struct rela *text_rela,
+   struct section *rodata_sec,
+   unsigned long table_offset)
+{
+   file->ignore_unreachables = true;
+   return NULL;
+}
diff --git a/tools/objtool/arch/x86/arch_special.c 
b/tools/objtool/arch/x86/arch_special.c
index 6583a1770bb2..38ac010f8a02 100644
--- a/tools/objtool/arch/x86/arch_special.c
+++ b/tools/objtool/arch/x86/arch_special.c
@@ -26,3 +26,76 @@ void arch_force_alt_path(unsigned short feature,
alt->skip_alt = true;
}
 }
+
+int arch_add_switch_table(struct objtool_file *file, struct instruction *insn,
+   struct rela *table, struct rela *next_table)
+{
+   struct rela *rela = table;
+   struct instruction *alt_insn;
+   struct alternative *alt;
+   struct symbol *pfunc = insn->func->pfunc;
+   unsigned int prev_offset = 0;
+
+   list_for_each_entry_from(rela, &table->rela_sec->rela_list, list) {
+   if (rela == next_table)
+   break;
+
+   /* Make sure the switch table entries are consecutive: */
+   if (prev_offset && rela->offset != prev_offset + 8)
+   break;
+
+   /* Detect function pointers from contiguous objects: */
+   if (rela->sym->sec == pfunc->sec &&
+   rela->addend == pfunc->offset)
+   break;
+
+   alt_insn = find_insn(file, rela->sym->sec, rela->addend);
+   if (!alt_insn)
+   break;
+
+   /* Make sure the jmp dest is in the function or subfunction: */
+   if (alt_insn->func->pfunc != pfunc)
+   break;
+
+   alt = malloc(sizeof(*alt));
+   if (!alt) {
+   WARN("malloc failed");
+   return -1;
+   }
+
+   alt->insn = alt_insn;
+   list_add_tail(&alt->list, &insn->alts);
+   prev_offset = rela->offset;
+   }
+
+   if (!prev_offset) {
+   WARN_FUNC("can't find switch jump table",
+ insn->sec, insn->offset);
+   return -1;
+   }
+
+   return 0;
+}
+
+struct rela *arch_find_switch_table(struct objtool_file *file,
+   struct rela *text_rela,
+   struct section *rodata_sec,
+   unsigned long table_offset)
+{
+   struct rela *rodata_rela;
+
+   rodata_rela = find_rela_by_dest(rodata_sec, table_offset);
+   if (rodata_rela) {
+   /*
+* Use of RIP-relative switch jumps is quite rare, and
+* indicates a rare GCC quirk/bug which can leave dead
+* code behind.
+*/
+   if (text_rela->type == R_X86_64_PC32)
+   file->ignore_unreachables = true;
+
+   return rodata_rela;
+   }
+
+   return NULL;
+}
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index cba1d91451cc..ce1165ce448a 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -18,12 +18,6 @@
 
 #define FAKE_JUMP_OFFSET -1
 
-struct alternative {
-   struct list_head list;
-   struct instruction *insn;
-   bool skip_orig;
-};
-
 const char *objname;
 struct cfi_state initial_func_cfi;
 
@@ -901,56 +895,6 @@ static int add_special_section_alts(struct objtool_file 
*file)
return ret;
 }
 
-static int add_switch_table(st

[RFC V3 03/18] objtool: Move registers and control flow to arch-dependent code

2019-06-24 Thread Raphael Gault
The control flow information and register macro definitions were based on
the x86_64 architecture but should be abstracted so that each architecture
can define the correct values for the registers, especially the registers
related to the stack frame (Frame Pointer, Stack Pointer and possibly
Return Address).

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch/x86/include/arch_special.h | 36 +++
 tools/objtool/{ => arch/x86/include}/cfi.h|  0
 tools/objtool/check.h |  1 +
 tools/objtool/special.c   | 19 +-
 4 files changed, 38 insertions(+), 18 deletions(-)
 create mode 100644 tools/objtool/arch/x86/include/arch_special.h
 rename tools/objtool/{ => arch/x86/include}/cfi.h (100%)

diff --git a/tools/objtool/arch/x86/include/arch_special.h 
b/tools/objtool/arch/x86/include/arch_special.h
new file mode 100644
index ..424ce47013e3
--- /dev/null
+++ b/tools/objtool/arch/x86/include/arch_special.h
@@ -0,0 +1,36 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef _X86_ARCH_SPECIAL_H
+#define _X86_ARCH_SPECIAL_H
+
+#define EX_ENTRY_SIZE  12
+#define EX_ORIG_OFFSET 0
+#define EX_NEW_OFFSET  4
+
+#define JUMP_ENTRY_SIZE16
+#define JUMP_ORIG_OFFSET   0
+#define JUMP_NEW_OFFSET4
+
+#define ALT_ENTRY_SIZE 13
+#define ALT_ORIG_OFFSET0
+#define ALT_NEW_OFFSET 4
+#define ALT_FEATURE_OFFSET 8
+#define ALT_ORIG_LEN_OFFSET    10
+#define ALT_NEW_LEN_OFFSET 11
+
+#define X86_FEATURE_POPCNT (4 * 32 + 23)
+#define X86_FEATURE_SMAP   (9 * 32 + 20)
+
+#endif /* _X86_ARCH_SPECIAL_H */
diff --git a/tools/objtool/cfi.h b/tools/objtool/arch/x86/include/cfi.h
similarity index 100%
rename from tools/objtool/cfi.h
rename to tools/objtool/arch/x86/include/cfi.h
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index cb60b9acf5cf..c44f9fe40178 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -11,6 +11,7 @@
 #include "cfi.h"
 #include "arch.h"
 #include "orc.h"
+#include "arch_special.h"
 #include 
 
 struct insn_state {
diff --git a/tools/objtool/special.c b/tools/objtool/special.c
index fdbaa611146d..b8ccee1b5382 100644
--- a/tools/objtool/special.c
+++ b/tools/objtool/special.c
@@ -14,24 +14,7 @@
 #include "builtin.h"
 #include "special.h"
 #include "warn.h"
-
-#define EX_ENTRY_SIZE  12
-#define EX_ORIG_OFFSET 0
-#define EX_NEW_OFFSET  4
-
-#define JUMP_ENTRY_SIZE16
-#define JUMP_ORIG_OFFSET   0
-#define JUMP_NEW_OFFSET4
-
-#define ALT_ENTRY_SIZE 13
-#define ALT_ORIG_OFFSET0
-#define ALT_NEW_OFFSET 4
-#define ALT_FEATURE_OFFSET 8
-#define ALT_ORIG_LEN_OFFSET    10
-#define ALT_NEW_LEN_OFFSET 11
-
-#define X86_FEATURE_POPCNT (4*32+23)
-#define X86_FEATURE_SMAP   (9*32+20)
+#include "arch_special.h"
 
 struct special_entry {
const char *sec;
-- 
2.17.1



[RFC V3 13/18] arm64: sleep: Prevent stack frame warnings from objtool

2019-06-24 Thread Raphael Gault
This code doesn't respect the Arm PCS but it is intended this
way. Adapting it to respect the PCS would result in altering the
behaviour.

In order to suppress objtool's warnings, we setup a stack frame
for __cpu_suspend_enter and annotate cpu_resume and _cpu_resume
as having non-standard stack frames.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/sleep.S | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index 3e53ffa07994..eb434525fe82 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -90,6 +90,7 @@ ENTRY(__cpu_suspend_enter)
str x0, [x1]
add x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
stp x29, lr, [sp, #-16]!
+   mov x29, sp
bl  cpu_do_suspend
ldp x29, lr, [sp], #16
mov x0, #1
@@ -146,3 +147,6 @@ ENTRY(_cpu_resume)
mov x0, #0
ret
 ENDPROC(_cpu_resume)
+
+   asm_stack_frame_non_standard cpu_resume
+   asm_stack_frame_non_standard _cpu_resume
-- 
2.17.1



[RFC V3 09/18] gcc-plugins: objtool: Add plugin to detect switch table on arm64

2019-06-24 Thread Raphael Gault
This plugin comes into play before the final two RTL passes of GCC and
detects switch-tables that are to be output in the ELF file, writing
information to an "objtool_data" section which will be used by objtool.

Signed-off-by: Raphael Gault 
---
 scripts/Makefile.gcc-plugins  |  2 +
 scripts/gcc-plugins/Kconfig   |  9 +++
 .../arm64_switch_table_detection_plugin.c | 58 +++
 3 files changed, 69 insertions(+)
 create mode 100644 scripts/gcc-plugins/arm64_switch_table_detection_plugin.c

diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
index 5f7df50cfe7a..a56736df9dc2 100644
--- a/scripts/Makefile.gcc-plugins
+++ b/scripts/Makefile.gcc-plugins
@@ -44,6 +44,8 @@ ifdef CONFIG_GCC_PLUGIN_ARM_SSP_PER_TASK
 endif
 export DISABLE_ARM_SSP_PER_TASK_PLUGIN
 
+gcc-plugin-$(CONFIG_GCC_PLUGIN_SWITCH_TABLES)  += 
arm64_switch_table_detection_plugin.so
+
 # All the plugin CFLAGS are collected here in case a build target needs to
 # filter them out of the KBUILD_CFLAGS.
 GCC_PLUGINS_CFLAGS := $(strip $(addprefix 
-fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) 
$(gcc-plugin-cflags-y))
diff --git a/scripts/gcc-plugins/Kconfig b/scripts/gcc-plugins/Kconfig
index e9c677a53c74..a9b13d257cd2 100644
--- a/scripts/gcc-plugins/Kconfig
+++ b/scripts/gcc-plugins/Kconfig
@@ -113,4 +113,13 @@ config GCC_PLUGIN_ARM_SSP_PER_TASK
bool
depends on GCC_PLUGINS && ARM
 
+config GCC_PLUGIN_SWITCH_TABLES
+   bool "GCC Plugin: Identify switch tables at compile time"
+   default y
+   depends on STACK_VALIDATION && ARM64
+   help
+ Plugin to identify switch tables generated at compile time and store
+ them in a .objtool_data section. Objtool will then use that section
+ to analyse the different execution paths of the switch table.
+
 endmenu
diff --git a/scripts/gcc-plugins/arm64_switch_table_detection_plugin.c 
b/scripts/gcc-plugins/arm64_switch_table_detection_plugin.c
new file mode 100644
index ..d7f0e13910d5
--- /dev/null
+++ b/scripts/gcc-plugins/arm64_switch_table_detection_plugin.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+#include "gcc-common.h"
+
+__visible int plugin_is_GPL_compatible;
+
+static unsigned int arm64_switchtbl_rtl_execute(void)
+{
+   rtx_insn *insn;
+   rtx_insn *labelp = NULL;
+   rtx_jump_table_data *tablep = NULL;
+   section *sec = get_section(".objtool_data", SECTION_STRINGS, NULL);
+   section *curr_sec = current_function_section();
+
+   for (insn = get_insns(); insn; insn = NEXT_INSN(insn)) {
+   /*
+* Find a tablejump_p INSN (using a dispatch table)
+*/
+   if (!tablejump_p(insn, &labelp, &tablep))
+   continue;
+
+   if (labelp && tablep) {
+   switch_to_section(sec);
+   assemble_integer_with_op(".quad ", 
gen_rtx_LABEL_REF(Pmode, labelp));
+   assemble_integer_with_op(".quad ", 
GEN_INT(GET_NUM_ELEM(tablep->get_labels(;
+   assemble_integer_with_op(".quad ", 
GEN_INT(ADDR_DIFF_VEC_FLAGS(tablep).offset_unsigned));
+   switch_to_section(curr_sec);
+   }
+   }
+   return 0;
+}
+
+#define PASS_NAME arm64_switchtbl_rtl
+
+#define NO_GATE
+#include "gcc-generate-rtl-pass.h"
+
+__visible int plugin_init(struct plugin_name_args *plugin_info,
+ struct plugin_gcc_version *version)
+{
+   const char * const plugin_name = plugin_info->base_name;
+   int tso = 0;
+   int i;
+
+   if (!plugin_default_version_check(version, &gcc_version)) {
+   error(G_("incompatible gcc/plugin versions"));
+   return 1;
+   }
+
+   PASS_INFO(arm64_switchtbl_rtl, "outof_cfglayout", 1,
+ PASS_POS_INSERT_AFTER);
+
+   register_callback(plugin_info->base_name, PLUGIN_PASS_MANAGER_SETUP,
+ NULL, &arm64_switchtbl_rtl_pass_info);
+
+   return 0;
+}
-- 
2.17.1



[RFC V3 15/18] arm64: kernel: Add exception on kuser32 to prevent stack analysis

2019-06-24 Thread Raphael Gault
kuser32 is used for compatibility and contains a32 instructions,
which are not recognised by objtool when analysing arm64
object files. Thus, we add an exception to skip validation for this
particular file.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 478491f07b4f..1239c7da4c02 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -33,6 +33,9 @@ ifneq ($(CONFIG_COMPAT_VDSO), y)
 obj-$(CONFIG_COMPAT)   += sigreturn32.o
 endif
 obj-$(CONFIG_KUSER_HELPERS)+= kuser32.o
+
+OBJECT_FILES_NON_STANDARD_kuser32.o := y
+
 obj-$(CONFIG_FUNCTION_TRACER)  += ftrace.o entry-ftrace.o
 obj-$(CONFIG_MODULES)  += module.o
 obj-$(CONFIG_ARM64_MODULE_PLTS)+= module-plts.o
-- 
2.17.1



[RFC V3 16/18] arm64: crypto: Add exceptions for crypto object to prevent stack analysis

2019-06-24 Thread Raphael Gault
Some crypto modules contain `.word` data in the .text section.
Since objtool can't distinguish data from an incorrect
instruction, it gives a warning about the instruction being unknown
and stops the analysis of the object file.

The exception can be removed if the data are moved to another section
or if objtool is tweaked to handle this particular case.

Signed-off-by: Raphael Gault 
---
 arch/arm64/crypto/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index 0435f2a0610e..e2a25919ebaa 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -43,9 +43,11 @@ aes-neon-blk-y := aes-glue-neon.o aes-neon.o
 
 obj-$(CONFIG_CRYPTO_SHA256_ARM64) += sha256-arm64.o
 sha256-arm64-y := sha256-glue.o sha256-core.o
+OBJECT_FILES_NON_STANDARD_sha256-core.o := y
 
 obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o
 sha512-arm64-y := sha512-glue.o sha512-core.o
+OBJECT_FILES_NON_STANDARD_sha512-core.o := y
 
 obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
 chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o
@@ -58,6 +60,7 @@ aes-arm64-y := aes-cipher-core.o aes-cipher-glue.o
 
 obj-$(CONFIG_CRYPTO_AES_ARM64_BS) += aes-neon-bs.o
 aes-neon-bs-y := aes-neonbs-core.o aes-neonbs-glue.o
+OBJECT_FILES_NON_STANDARD_aes-neonbs-core.o := y
 
 CFLAGS_aes-glue-ce.o   := -DUSE_V8_CRYPTO_EXTENSIONS
 
-- 
2.17.1



[RFC V3 18/18] objtool: arm64: Enable stack validation for arm64

2019-06-24 Thread Raphael Gault
Signed-off-by: Raphael Gault 
---
 arch/arm64/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f5eb592b8579..c5fdfb635d3d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -159,6 +159,7 @@ config ARM64
select HAVE_RCU_TABLE_FREE
select HAVE_RSEQ
select HAVE_STACKPROTECTOR
+   select HAVE_STACK_VALIDATION
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_KPROBES
select HAVE_KRETPROBES
-- 
2.17.1



[RFC V3 17/18] arm64: kernel: Annotate non-standard stack frame functions

2019-06-24 Thread Raphael Gault
Annotate assembler functions which are callable but do not
set up a correct stack frame.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/hyp-stub.S | 2 ++
 arch/arm64/kvm/hyp-init.S| 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 73d46070b315..a382f0e33735 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -42,6 +42,7 @@ ENTRY(__hyp_stub_vectors)
ventry  el1_fiq_invalid // FIQ 32-bit EL1
ventry  el1_error_invalid   // Error 32-bit EL1
 ENDPROC(__hyp_stub_vectors)
+asm_stack_frame_non_standard __hyp_stub_vectors
 
.align 11
 
@@ -69,6 +70,7 @@ el1_sync:
 9: mov x0, xzr
eret
 ENDPROC(el1_sync)
+asm_stack_frame_non_standard el1_sync
 
 .macro invalid_vector  label
 \label:
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 160be2b4696d..65b7c12b9aa8 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -118,6 +118,7 @@ CPU_BE( orr x4, x4, #SCTLR_ELx_EE)
/* Hello, World! */
eret
 ENDPROC(__kvm_hyp_init)
+asm_stack_frame_non_standard __kvm_hyp_init
 
 ENTRY(__kvm_handle_stub_hvc)
cmp x0, #HVC_SOFT_RESTART
@@ -159,6 +160,7 @@ reset:
eret
 
 ENDPROC(__kvm_handle_stub_hvc)
+asm_stack_frame_non_standard __kvm_handle_stub_hvc
 
.ltorg
 
-- 
2.17.1



[RFC V3 14/18] arm64: kvm: Annotate non-standard stack frame functions

2019-06-24 Thread Raphael Gault
Both the __guest_enter and __guest_exit functions do not set up
a correct stack frame. Because they can be considered callable
functions, even if they are particular cases, we chose to silence
the warnings given by objtool by annotating them as non-standard.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kvm/hyp/entry.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index bd34016354ba..d8e122bd3f6f 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -82,6 +82,7 @@ ENTRY(__guest_enter)
eret
sb
 ENDPROC(__guest_enter)
+asm_stack_frame_non_standard __guest_enter
 
 ENTRY(__guest_exit)
// x0: return code
@@ -171,3 +172,4 @@ abort_guest_exit_end:
orr x0, x0, x5
 1: ret
 ENDPROC(__guest_exit)
+asm_stack_frame_non_standard __guest_exit
-- 
2.17.1



[RFC V3 12/18] arm64: assembler: Add macro to annotate asm function having non standard stack-frame.

2019-06-24 Thread Raphael Gault
Some functions don't have standard stack frames, but this is
intentional. In order for objtool to ignore those particular cases,
we add a macro that enables us to annotate the functions we chose
to mark as non-standard.

Signed-off-by: Raphael Gault 
---
 arch/arm64/include/asm/assembler.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/arm64/include/asm/assembler.h 
b/arch/arm64/include/asm/assembler.h
index 570d195a184d..969a59c5c276 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -752,4 +752,17 @@ USER(\label, ic ivau, \tmp2)	// invalidate I line PoU
 .Lyield_out_\@ :
.endm
 
+   /*
+* This macro is the arm64 assembler equivalent of the
+* macro STACK_FRAME_NON_STANDARD defined in
+* ~/include/linux/frame.h
+*/
+   .macro  asm_stack_frame_non_standardfunc
+#ifdef CONFIG_STACK_VALIDATION
+   .pushsection ".discard.func_stack_frame_non_standard"
+   .8byte  \func
+   .popsection
+#endif
+   .endm
+
 #endif /* __ASM_ASSEMBLER_H */
-- 
2.17.1



[RFC V3 11/18] arm64: alternative: Mark .altinstr_replacement as containing executable instructions

2019-06-24 Thread Raphael Gault
Until now, the section .altinstr_replacement wasn't marked as containing
executable instructions on arm64. This patch changes that so that it is
consistent with what is done on x86.

Signed-off-by: Raphael Gault 
---
 arch/arm64/include/asm/alternative.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/alternative.h 
b/arch/arm64/include/asm/alternative.h
index b9f8d787eea9..e9e6b81e3eb4 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -71,7 +71,7 @@ static inline void apply_alternatives_module(void *start, 
size_t length) { }
ALTINSTR_ENTRY(feature,cb)  \
".popsection\n" \
" .if " __stringify(cb) " == 0\n"   \
-   ".pushsection .altinstr_replacement, \"a\"\n"   \
+   ".pushsection .altinstr_replacement, \"ax\"\n"  \
"663:\n\t"  \
newinstr "\n"   \
"664:\n\t"  \
-- 
2.17.1



[RFC V3 00/18] objtool: Add support for arm64

2019-06-24 Thread Raphael Gault
As of now, objtool only supports the x86_64 architecture but the
groundwork has already been done in order to add support for other
architectures without too much effort.

This series of patches adds support for the arm64 architecture
based on the Armv8.5 Architecture Reference Manual.

Objtool will be a valuable tool for making progress on, and providing more
guarantees for, live patching, which is a work in progress for arm64.

Once we have the base of objtool working the next steps will be to
port Peter Z's uaccess validation for arm64.

Changes since previous version:
* Rebased on tip/master: Note that I had to re-expose the
`struct alternative` using check.h because it is now used outside of
check.c.
* Reorder commits for a more coherent progression
* Introduce GCC plugin to help detect switch-tables for arm64
This plugin could be improved: it plugs in after the RTL control flow
graph passes but only extracts information about the switch tables. I
originally intended for it to introduce new code_label/note within the
RTL representation in order to reference them and thus get the address
of the branch instruction. However I did not manage to do it properly
using gen_rtx_CODE_LABEL/emit_label_before/after. If anyone has some
experience with RTL plugins I am all ears for advice.

Raphael Gault (18):
  objtool: Add abstraction for computation of symbols offsets
  objtool: orc: Refactor ORC API for other architectures to implement.
  objtool: Move registers and control flow to arch-dependent code
  objtool: arm64: Add required implementation for supporting the aarch64
architecture in objtool.
  objtool: special: Adapt special section handling
  objtool: arm64: Adapt the stack frame checks for arm architecture
  objtool: Introduce INSN_UNKNOWN type
  objtool: Refactor switch-tables code to support other architectures
  gcc-plugins: objtool: Add plugin to detect switch table on arm64
  objtool: arm64: Implement functions to add switch tables alternatives
  arm64: alternative: Mark .altinstr_replacement as containing
executable instructions
  arm64: assembler: Add macro to annotate asm function having non
standard stack-frame.
  arm64: sleep: Prevent stack frame warnings from objtool
  arm64: kvm: Annotate non-standard stack frame functions
  arm64: kernel: Add exception on kuser32 to prevent stack analysis
  arm64: crypto: Add exceptions for crypto object to prevent stack
analysis
  arm64: kernel: Annotate non-standard stack frame functions
  objtool: arm64: Enable stack validation for arm64

 arch/arm64/Kconfig|1 +
 arch/arm64/crypto/Makefile|3 +
 arch/arm64/include/asm/alternative.h  |2 +-
 arch/arm64/include/asm/assembler.h|   13 +
 arch/arm64/kernel/Makefile|3 +
 arch/arm64/kernel/hyp-stub.S  |2 +
 arch/arm64/kernel/sleep.S |4 +
 arch/arm64/kvm/hyp-init.S |2 +
 arch/arm64/kvm/hyp/entry.S|2 +
 scripts/Makefile.gcc-plugins  |2 +
 scripts/gcc-plugins/Kconfig   |9 +
 .../arm64_switch_table_detection_plugin.c |   58 +
 tools/objtool/Build   |2 -
 tools/objtool/arch.h  |   21 +-
 tools/objtool/arch/arm64/Build|8 +
 tools/objtool/arch/arm64/arch_special.c   |  173 +
 tools/objtool/arch/arm64/bit_operations.c |   67 +
 tools/objtool/arch/arm64/decode.c | 2809 +
 .../objtool/arch/arm64/include/arch_special.h |   52 +
 .../arch/arm64/include/asm/orc_types.h|   96 +
 .../arch/arm64/include/bit_operations.h   |   24 +
 tools/objtool/arch/arm64/include/cfi.h|   74 +
 .../objtool/arch/arm64/include/insn_decode.h  |  210 ++
 tools/objtool/arch/arm64/orc_dump.c   |   26 +
 tools/objtool/arch/arm64/orc_gen.c|   40 +
 tools/objtool/arch/x86/Build  |3 +
 tools/objtool/arch/x86/arch_special.c |  101 +
 tools/objtool/arch/x86/decode.c   |   16 +
 tools/objtool/arch/x86/include/arch_special.h |   45 +
 tools/objtool/{ => arch/x86/include}/cfi.h|0
 tools/objtool/{ => arch/x86}/orc_dump.c   |4 +-
 tools/objtool/{ => arch/x86}/orc_gen.c|  104 +-
 tools/objtool/check.c |  309 +-
 tools/objtool/check.h |   10 +
 tools/objtool/elf.c   |3 +-
 tools/objtool/orc.h   |4 +-
 tools/objtool/special.c   |   28 +-
 tools/objtool/special.h   |   13 +-
 38 files changed, 4119 insertions(+), 224 deletions(-)
 create mode 100644 scripts/gcc-plugins/arm64_switch_table_detection_plugin.c
 create mode 100644 tools/objtool/arch/arm64/Build
 create mode 100644 tools/objtool/arch/arm64/arch_special.c
 create mode 100644 tools/objtool/arch/arm64/bit_ope

[RFC V3 01/18] objtool: Add abstraction for computation of symbols offsets

2019-06-24 Thread Raphael Gault
The jump destination and relocation offset used previously are only
reliable on the x86_64 architecture. We abstract these computations by calling
arch-dependent implementations.

Signed-off-by: Raphael Gault 
---
 tools/objtool/arch.h|  6 ++
 tools/objtool/arch/x86/decode.c | 11 +++
 tools/objtool/check.c   | 15 ++-
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index 580e344db3dd..2a38a834cf40 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -64,6 +64,8 @@ struct stack_op {
struct op_src src;
 };
 
+struct instruction;
+
 void arch_initial_func_cfi_state(struct cfi_state *state);
 
 int arch_decode_instruction(struct elf *elf, struct section *sec,
@@ -73,4 +75,8 @@ int arch_decode_instruction(struct elf *elf, struct section 
*sec,
 
 bool arch_callee_saved_reg(unsigned char reg);
 
+unsigned long arch_jump_destination(struct instruction *insn);
+
+unsigned long arch_dest_rela_offset(int addend);
+
 #endif /* _ARCH_H */
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 584568f27a83..8767ee935c47 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -11,6 +11,7 @@
 #include "lib/inat.c"
 #include "lib/insn.c"
 
+#include "../../check.h"
 #include "../../elf.h"
 #include "../../arch.h"
 #include "../../warn.h"
@@ -66,6 +67,11 @@ bool arch_callee_saved_reg(unsigned char reg)
}
 }
 
+unsigned long arch_dest_rela_offset(int addend)
+{
+   return addend + 4;
+}
+
 int arch_decode_instruction(struct elf *elf, struct section *sec,
unsigned long offset, unsigned int maxlen,
unsigned int *len, unsigned char *type,
@@ -497,3 +503,8 @@ void arch_initial_func_cfi_state(struct cfi_state *state)
state->regs[16].base = CFI_CFA;
state->regs[16].offset = -8;
 }
+
+unsigned long arch_jump_destination(struct instruction *insn)
+{
+   return insn->offset + insn->len + insn->immediate;
+}
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 172f99195726..b37ca4822f25 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -565,7 +565,7 @@ static int add_jump_destinations(struct objtool_file *file)
   insn->len);
if (!rela) {
dest_sec = insn->sec;
-   dest_off = insn->offset + insn->len + insn->immediate;
+   dest_off = arch_jump_destination(insn);
} else if (rela->sym->type == STT_SECTION) {
dest_sec = rela->sym->sec;
dest_off = rela->addend + 4;
@@ -659,7 +659,7 @@ static int add_call_destinations(struct objtool_file *file)
rela = find_rela_by_dest_range(insn->sec, insn->offset,
   insn->len);
if (!rela) {
-   dest_off = insn->offset + insn->len + insn->immediate;
+   dest_off = arch_jump_destination(insn);
insn->call_dest = find_symbol_by_offset(insn->sec,
dest_off);
 
@@ -672,14 +672,19 @@ static int add_call_destinations(struct objtool_file 
*file)
}
 
} else if (rela->sym->type == STT_SECTION) {
+   /*
+* The original x86_64 code adds 4 to the rela->addend,
+* which is not needed on the arm64 architecture.
+*/
+   dest_off = arch_dest_rela_offset(rela->addend);
insn->call_dest = find_symbol_by_offset(rela->sym->sec,
-   rela->addend+4);
+   dest_off);
if (!insn->call_dest ||
insn->call_dest->type != STT_FUNC) {
-   WARN_FUNC("can't find call dest symbol at 
%s+0x%x",
+   WARN_FUNC("can't find call dest symbol at 
%s+0x%lx",
  insn->sec, insn->offset,
  rela->sym->sec->name,
- rela->addend + 4);
+ dest_off);
return -1;
}
} else
-- 
2.17.1



[tip:perf/core] perf tests arm64: Compile tests unconditionally

2019-06-22 Thread tip-bot for Raphael Gault
Commit-ID:  010e3e8fc12b1c13ce19821a11d8930226ebb4b6
Gitweb: https://git.kernel.org/tip/010e3e8fc12b1c13ce19821a11d8930226ebb4b6
Author: Raphael Gault 
AuthorDate: Tue, 11 Jun 2019 13:53:09 +0100
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 17 Jun 2019 15:57:16 -0300

perf tests arm64: Compile tests unconditionally

In order to subsequently add more tests for the arm64 architecture we
compile the tests target for arm64 systematically.

Further explanation provided by Mark Rutland:

Given prior questions regarding this commit, it's probably worth
spelling things out more explicitly, e.g.

  Currently we only build the arm64/tests directory if
  CONFIG_DWARF_UNWIND is selected, which is fine as the only test we
  have is arm64/tests/dwarf-unwind.o.

  So that we can add more tests to the test directory, let's
  unconditionally build the directory, but conditionally build
  dwarf-unwind.o depending on CONFIG_DWARF_UNWIND.

  There should be no functional change as a result of this patch.

Signed-off-by: Raphael Gault 
Acked-by: Mark Rutland 
Cc: Catalin Marinas 
Cc: Peter Zijlstra 
Cc: Will Deacon 
Cc: linux-arm-ker...@lists.infradead.org
Link: http://lkml.kernel.org/r/20190611125315.18736-2-raphael.ga...@arm.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/arch/arm64/Build   | 2 +-
 tools/perf/arch/arm64/tests/Build | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/arm64/Build b/tools/perf/arch/arm64/Build
index 36222e64bbf7..a7dd46a5b678 100644
--- a/tools/perf/arch/arm64/Build
+++ b/tools/perf/arch/arm64/Build
@@ -1,2 +1,2 @@
 perf-y += util/
-perf-$(CONFIG_DWARF_UNWIND) += tests/
+perf-y += tests/
diff --git a/tools/perf/arch/arm64/tests/Build 
b/tools/perf/arch/arm64/tests/Build
index 41707fea74b3..a61c06bdb757 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,4 @@
 perf-y += regs_load.o
-perf-y += dwarf-unwind.o
+perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
 perf-y += arch-tests.o


Re: [PATCH 3/7] perf: arm64: Use rseq to test userspace access to pmu counters

2019-06-13 Thread Raphael Gault

Hi Mathieu,

On 6/11/19 8:33 PM, Mathieu Desnoyers wrote:

- On Jun 11, 2019, at 6:57 PM, Mark Rutland mark.rutl...@arm.com wrote:


Hi Arnaldo,

On Tue, Jun 11, 2019 at 11:33:46AM -0300, Arnaldo Carvalho de Melo wrote:

Em Tue, Jun 11, 2019 at 01:53:11PM +0100, Raphael Gault escreveu:

Add an extra test to check userspace access to pmu hardware counters.
This test doesn't rely on the seqlock as a synchronisation mechanism but
instead uses the restartable sequences to make sure that the thread is
not interrupted when reading the index of the counter and the associated
pmu register.

In addition to reading the pmu counters, this test is run several time
in order to measure the ratio of failures:
I ran this test on the Juno development platform, which is big.LITTLE
with 4 Cortex A53 and 2 Cortex A57. The results vary quite a lot
(running it with 100 tests is not so long and I did it several times).
I ran it once with 1 iterations:
`runs: 1, abort: 62.53%, zero: 34.93%, success: 2.54%`

Signed-off-by: Raphael Gault 
---
  tools/perf/arch/arm64/include/arch-tests.h|   5 +-
  tools/perf/arch/arm64/include/rseq-arm64.h| 220 ++


So, I applied the first patch in this series, but could you please break
this patch into at least two, one introducing the facility
(include/rseq*) and the second adding the test?

We try to enforce this kind of granularity as down the line we may want
to revert one part while the other already has other uses and thus
wouldn't allow a straight revert.

Also, can this go to tools/arch/ instead? Is this really perf specific?
Isn't there any arch/arm64/include files for the kernel that we could
mirror and have it checked for drift in tools/perf/check-headers.sh?


The rseq bits aren't strictly perf specific, and I think the existing
bits under tools/testing/selftests/rseq/ could be factored out to common
locations under tools/include/ and tools/arch/*/include/.


Hi Mark,

Thanks for CCing me!

Or into a stand-alone librseq project:

https://github.com/compudj/librseq (currently a development branch in
my own github)

I don't see why this user-space code should sit in the kernel tree.
It is not tooling-specific.



 From a scan, those already duplicate barriers and other helpers which
already have definitions under tools/, which seems unfortunate. :/

Comments below are for Raphael and Matthieu.

[...]


+static u64 noinline mmap_read_self(void *addr, int cpu)
+{
+   struct perf_event_mmap_page *pc = addr;
+   u32 idx = 0;
+   u64 count = 0;
+
+   asm volatile goto(
+ RSEQ_ASM_DEFINE_TABLE(0, 1f, 2f, 3f)
+"nop\n"
+ RSEQ_ASM_STORE_RSEQ_CS(1, 0b, rseq_cs)
+RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)
+ RSEQ_ASM_OP_R_LOAD(pc_idx)
+ RSEQ_ASM_OP_R_AND(0xFF)
+RSEQ_ASM_OP_R_STORE(idx)
+ RSEQ_ASM_OP_R_SUB(0x1)
+RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)
+ "msr pmselr_el0, " RSEQ_ASM_TMP_REG "\n"
+ "isb\n"
+RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)


I really don't understand why the cpu_id needs to be compared 3 times
here (?!?)

Explicit comparison of the cpu_id within the rseq critical section
should be done _once_.



I understand and that's what I thought as well but I got confused with a 
comment in (src)/include/uapi/linux/rseq.h which states:

> This CPU number value should always be compared
> against the value of the cpu_id field before performing a rseq
> commit or returning a value read from a data structure indexed
> using the cpu_id_start value.

I'll remove the unnecessary testing.



If the kernel happens to preempt and migrate the thread while in the
critical section, it's the kernel's job to move user-space execution
to the abort handler.
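
To make that concrete, the critical section Mathieu describes only needs the
shape below (a conceptual C-style sketch, not the actual RSEQ_ASM_* macro
expansion; read_pmu_counter() is a hypothetical helper for the mrs read):

```c
/*
 * Conceptual flow of the rseq critical section (sketch only):
 *
 * start_ip:
 *	if (rseq->cpu_id != expected_cpu)	// the single explicit check
 *		goto abort_ip;
 *	idx = pc->index;			// counter index from the user page
 *	val = read_pmu_counter(idx);		// hypothetical mrs-based read
 * post_commit_ip:
 *	return val;
 * abort_ip:
 *	// the kernel redirects execution here on preemption/migration
 *	goto retry;
 */
```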


[...]

Thanks,

--
Raphael Gault


Re: [RFC V2 00/16] objtool: Add support for Arm64

2019-06-13 Thread Raphael Gault

Hi Josh,

On 5/28/19 11:24 PM, Josh Poimboeuf wrote:

On Tue, May 21, 2019 at 12:50:57PM +, Raphael Gault wrote:

Hi Josh,

Thanks for offering your help and sorry for the late answer.

My understanding is that a table of offsets is built by GCC, those
offsets being scaled by 4 before adding them to the base label.
I believe the offsets are stored in the .rodata section. To find the
size of that table, one needs to find a comparison, which can apparently
be optimized out. In that case the end of the array can be found
by locating labels pointing to data behind it (which is not 100% safe).

On 5/16/19 3:29 PM, Josh Poimboeuf wrote:

On Thu, May 16, 2019 at 11:36:39AM +0100, Raphael Gault wrote:

Noteworthy points:
* I still haven't figured out how to detect switch-tables on arm64. I
have a better understanding of them but still haven't implemented checks
as it doesn't look trivial at all.


Switch tables were tricky to get right on x86.  If you share an example
(or even just a .o file) I can take a look.  Hopefully they're somewhat
similar to x86 switch tables.  Otherwise we may want to consider a
different approach (for example maybe a GCC plugin could help annotate
them).



The case which made me realize the issue is the one of
arch/arm64/kernel/module.o:apply_relocate_add:

```
What seems to happen in the case of module.o is:
   334:   90000015        adrp    x21, 0 
which retrieves the location of an offset in the rodata section, and a
bit later we do some extra computation with it in order to compute the
jump destination:
   3e0:   78625aa0        ldrh    w0, [x21, w2, uxtw #1]
   3e4:   10000061        adr     x1, 3f0 
   3e8:   8b20a820        add     x0, x1, w0, sxth #2
   3ec:   d61f0000        br      x0
```

Please keep in mind that the actual offsets might vary.

I'm happy to provide more details about what I have identified if you
want me to.


I get the feeling this is going to be trickier than x86 switch tables
(which have already been tricky enough).

On x86, there's a .rela.rodata section which applies relocations to
.rodata.  The presence of those relocations makes it relatively easy to
differentiate switch tables from other read-only data.  For example, we
can tell that a switch table ends when either a) there's not a text
relocation or b) another switch table begins.

But with arm64 I don't see a deterministic way to do that, because the
table offsets are hard-coded in .rodata, with no relocations.

 From talking with Kamalesh I got the impression that we might have a
similar issue for powerpc.

So I'm beginning to think we'll need compiler help.  Like a GCC plugin
that annotates at least the following switch table metadata:

- Branch instruction address
- Switch table address
- Switch table entry size
- Switch table size

The GCC plugin could write all the above metadata into a special section
which gets discarded at link time.  I can look at implementing it,
though I'll be traveling for two out of the next three weeks so it may
be a while before I can get to it.
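
The plugin Raphael posts as patch 09/18 above ends up taking this route,
emitting three .quad values per jump table. As a rough C view of one such
record (the struct and field names are made up here for illustration):

```c
#include <linux/types.h>

/* Hypothetical layout of one .objtool_data record (names illustrative). */
struct objtool_data_entry {
	u64 table_addr;		/* address of the jump table in .rodata */
	u64 nr_entries;		/* GET_NUM_ELEM() of the ADDR_DIFF_VEC */
	u64 offset_unsigned;	/* whether the stored offsets are unsigned */
};
```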



I am completely new to GCC plugins but I had a look and I think I found 
a possible solution to retrieve at least part of this information using 
the RTL representation in GCC. I can't say it will work for sure but I 
would be happy to discuss it with you if you want.
Although there are still some area I need to investigate related to 
interacting with the RTL representation and storing info into the ELF

I'd be interested in giving it a try, if you are okay with that.

Thanks,
--
Raphael Gault


Re: [PATCH 3/7] perf: arm64: Use rseq to test userspace access to pmu counters

2019-06-13 Thread Raphael Gault

Hi Mathieu, Mark,

On 6/11/19 8:33 PM, Mathieu Desnoyers wrote:

- On Jun 11, 2019, at 6:57 PM, Mark Rutland mark.rutl...@arm.com wrote:


Hi Arnaldo,

On Tue, Jun 11, 2019 at 11:33:46AM -0300, Arnaldo Carvalho de Melo wrote:

Em Tue, Jun 11, 2019 at 01:53:11PM +0100, Raphael Gault escreveu:

Add an extra test to check userspace access to pmu hardware counters.
This test doesn't rely on the seqlock as a synchronisation mechanism but
instead uses the restartable sequences to make sure that the thread is
not interrupted when reading the index of the counter and the associated
pmu register.

In addition to reading the pmu counters, this test is run several time
in order to measure the ratio of failures:
I ran this test on the Juno development platform, which is big.LITTLE
with 4 Cortex A53 and 2 Cortex A57. The results vary quite a lot
(running it with 100 tests is not so long and I did it several times).
I ran it once with 1 iterations:
`runs: 1, abort: 62.53%, zero: 34.93%, success: 2.54%`

Signed-off-by: Raphael Gault 
---
  tools/perf/arch/arm64/include/arch-tests.h|   5 +-
  tools/perf/arch/arm64/include/rseq-arm64.h| 220 ++


So, I applied the first patch in this series, but could you please break
this patch into at least two, one introducing the facility
(include/rseq*) and the second adding the test?

We try to enforce this kind of granularity as down the line we may want
to revert one part while the other already has other uses and thus
wouldn't allow a straight revert.

Also, can this go to tools/arch/ instead? Is this really perf specific?
Isn't there any arch/arm64/include files for the kernel that we could
mirror and have it checked for drift in tools/perf/check-headers.sh?


The rseq bits aren't strictly perf specific, and I think the existing
bits under tools/testing/selftests/rseq/ could be factored out to common
locations under tools/include/ and tools/arch/*/include/.


Hi Mark,

Thanks for CCing me!

Or into a stand-alone librseq project:

https://github.com/compudj/librseq (currently a development branch in
my own github)

I don't see why this user-space code should sit in the kernel tree.
It is not tooling-specific.



I understand your point but I have to admit that I don't really see how 
to make it work together with the test which requires those definitions.




 From a scan, those already duplicate barriers and other helpers which
already have definitions under tools/, which seems unfortunate. :/



Also I realize that there is a duplicate with definitions introduced in 
the selftests but I kind of simplified the macros I'm using to get rid 
of what wasn't useful to me at the moment. (mainly the loop labels and 
parameter injections in the asm statement)
I understand what both Mark and Arnaldo are saying about moving it out 
of perf so that it is not duplicated but my question is whether it is a 
good thing to do as is since it is not exactly the same content as 
what's in the selftests.


I hope you can understand my concerns and I'd like to hear your opinions 
on that matter.


Thanks,

--
Raphael Gault


[PATCH 2/7] perf: arm64: Add test to check userspace access to hardware counters.

2019-06-11 Thread Raphael Gault
This test relies on the fact that the PMU registers are accessible
from userspace. It then uses the perf_event_mmap_page to retrieve
the counter index and access the underlying register.

This test uses sched_setaffinity(2) in order to run on all CPUs and thus
check the behaviour of the PMU on all cpus in a big.LITTLE environment.
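
A sketch of the affinity loop (illustrative only, not the exact code from
user-events.c, whose body is truncated below):

```c
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

/*
 * Sketch: pin the current thread to each CPU in turn so the counter
 * read is exercised on every core of a big.LITTLE system.
 */
static int run_on_each_cpu(int (*fn)(int cpu))
{
	int cpu, ret = 0;
	cpu_set_t set;

	for (cpu = 0; cpu < sysconf(_SC_NPROCESSORS_ONLN); cpu++) {
		CPU_ZERO(&set);
		CPU_SET(cpu, &set);
		if (sched_setaffinity(0, sizeof(set), &set))
			continue;	/* CPU may be offline */
		ret |= fn(cpu);
	}
	return ret;
}
```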

Signed-off-by: Raphael Gault 
---
 tools/perf/arch/arm64/include/arch-tests.h |   6 +
 tools/perf/arch/arm64/tests/Build  |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c   |   4 +
 tools/perf/arch/arm64/tests/user-events.c  | 255 +
 4 files changed, 266 insertions(+)
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

diff --git a/tools/perf/arch/arm64/include/arch-tests.h 
b/tools/perf/arch/arm64/include/arch-tests.h
index 90ec4c8cb880..a9b17ae0560b 100644
--- a/tools/perf/arch/arm64/include/arch-tests.h
+++ b/tools/perf/arch/arm64/include/arch-tests.h
@@ -2,11 +2,17 @@
 #ifndef ARCH_TESTS_H
 #define ARCH_TESTS_H
 
+#define __maybe_unused __attribute__((unused))
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 struct thread;
 struct perf_sample;
+int test__arch_unwind_sample(struct perf_sample *sample,
+struct thread *thread);
 #endif
 
 extern struct test arch_tests[];
+int test__rd_pmevcntr(struct test *test __maybe_unused,
+ int subtest __maybe_unused);
+
 
 #endif
diff --git a/tools/perf/arch/arm64/tests/Build 
b/tools/perf/arch/arm64/tests/Build
index a61c06bdb757..3f9a20c17fc6 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,5 @@
 perf-y += regs_load.o
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
+perf-y += user-events.o
 perf-y += arch-tests.o
diff --git a/tools/perf/arch/arm64/tests/arch-tests.c 
b/tools/perf/arch/arm64/tests/arch-tests.c
index 5b1543c98022..57df9b89dede 100644
--- a/tools/perf/arch/arm64/tests/arch-tests.c
+++ b/tools/perf/arch/arm64/tests/arch-tests.c
@@ -10,6 +10,10 @@ struct test arch_tests[] = {
.func = test__dwarf_unwind,
},
 #endif
+   {
+   .desc = "User counter access",
+   .func = test__rd_pmevcntr,
+   },
{
.func = NULL,
},
diff --git a/tools/perf/arch/arm64/tests/user-events.c 
b/tools/perf/arch/arm64/tests/user-events.c
new file mode 100644
index ..958e4cd000c1
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/user-events.c
@@ -0,0 +1,255 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "perf.h"
+#include "debug.h"
+#include "tests/tests.h"
+#include "cloexec.h"
+#include "util.h"
+#include "arch-tests.h"
+
+/*
+ * ARMv8 ARM reserves the following encoding for system registers:
+ * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview",
+ *  C5.2, version:ARM DDI 0487A.f)
+ *  [20-19] : Op0
+ *  [18-16] : Op1
+ *  [15-12] : CRn
+ *  [11-8]  : CRm
+ *  [7-5]   : Op2
+ */
+#define Op0_shift   19
+#define Op0_mask0x3
+#define Op1_shift   16
+#define Op1_mask0x7
+#define CRn_shift   12
+#define CRn_mask0xf
+#define CRm_shift   8
+#define CRm_mask0xf
+#define Op2_shift   5
+#define Op2_mask0x7
+
+#define __stringify(x) #x
+
+#define read_sysreg(r) ({  \
+   u64 __val;  \
+   asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \
+   __val;  \
+})
+
+#define PMEVCNTR_READ_CASE(idx)\
+   case idx:   \
+   return read_sysreg(pmevcntr##idx##_el0)
+
+#define PMEVCNTR_CASES(readwrite)  \
+   PMEVCNTR_READ_CASE(0);  \
+   PMEVCNTR_READ_CASE(1);  \
+   PMEVCNTR_READ_CASE(2);  \
+   PMEVCNTR_READ_CASE(3);  \
+   PMEVCNTR_READ_CASE(4);  \
+   PMEVCNTR_READ_CASE(5);  \
+   PMEVCNTR_READ_CASE(6);  \
+   PMEVCNTR_READ_CASE(7);  \
+   PMEVCNTR_READ_CASE(8);  \
+   PMEVCNTR_READ_CASE(9);  \
+   PMEVCNTR_READ_CASE(10); \
+   PMEVCNTR_READ_CASE(11); \
+   PMEVCNTR_READ_CASE(12); \
+   PMEVCNTR_READ_CASE(13); \
+   PMEVCNTR_READ_CASE(14); \
+   PMEVCNTR_READ_CASE(15); \
+   PMEVCNTR_READ_CASE(16); \
+   PMEVCNTR_READ_CASE(17); \
+   PMEVCNTR_READ_CASE(18

[PATCH 7/7] Documentation: arm64: Document PMU counters access from userspace

2019-06-11 Thread Raphael Gault
Add a documentation file to describe the access to the pmu hardware
counters from userspace

Signed-off-by: Raphael Gault 
---
 .../arm64/pmu_counter_user_access.txt | 42 +++
 1 file changed, 42 insertions(+)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt

diff --git a/Documentation/arm64/pmu_counter_user_access.txt 
b/Documentation/arm64/pmu_counter_user_access.txt
new file mode 100644
index ..6788b1107381
--- /dev/null
+++ b/Documentation/arm64/pmu_counter_user_access.txt
@@ -0,0 +1,42 @@
+Access to PMU hardware counter from userspace
+=
+
+Overview
+
+The perf user-space tool relies on the PMU to monitor events. It offers an
+abstraction layer over the hardware counters since the underlying
+implementation is cpu-dependent.
+Arm64 allows userspace tools to have access to the registers storing the
+hardware counters' values directly.
+
+This targets specifically self-monitoring tasks in order to reduce the overhead
+by directly accessing the registers without having to go through the kernel.
+
+How-to
+--
+The focus is set on the armv8 pmuv3, which makes sure that access to the pmu
+registers is enabled and that userspace has access to the relevant
+information in order to use them.
+
+In order to have access to the hardware counter it is necessary to open the 
event
+using the perf tool interface: the sys_perf_event_open syscall returns a fd 
which
+can subsequently be used with the mmap syscall in order to retrieve a page of 
memory
+containing information about the event.
+The PMU driver uses this page to expose to the user the hardware counter's
+index. Using this index enables the user to access the PMU registers using the
+`mrs` instruction.
+
+Have a look at `tools/perf/arch/arm64/tests/user-events.c` for an example. It 
can be
+run using the perf tool to check that the access to the registers works
+correctly from userspace:
+
+./perf test -v
+
+About chained events
+
+When the user requests an event to be counted on 64 bits, two hardware
+counters are used and need to be combined to retrieve the correct value:
+
+val = read_counter(idx);
+if ((event.attr.config1 & 0x1))
+   val = (val << 32) | read_counter(idx - 1);
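
To make the how-to above concrete, a minimal sketch of the open-and-mmap flow
(error handling omitted; open_counter_page() is an illustrative name, while
the syscall and structures are the standard perf ABI):

```c
#include <linux/perf_event.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Sketch: open an event and map its user page. */
static struct perf_event_mmap_page *open_counter_page(struct perf_event_attr *attr)
{
	int fd = syscall(__NR_perf_event_open, attr, 0, -1, -1, 0);

	if (fd < 0)
		return NULL;
	/* Page 0 of the mapping is the perf_event_mmap_page header. */
	return mmap(NULL, sysconf(_SC_PAGESIZE), PROT_READ,
		    MAP_SHARED, fd, 0);
}
```

The `index` field of the returned page then selects which counter register the
`mrs` instruction should read, with an index of 0 meaning that direct access
is not available.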
-- 
2.17.1



[PATCH 6/7] arm64: perf: Enable pmu counter direct access for perf event on armv8

2019-06-11 Thread Raphael Gault
Keep track of event opened with direct access to the hardware counters
and modify permissions while they are open.

The strategy used here is the same as on x86: every time an event
is mapped, the permissions are set if required. The atomic field added
in the mm_context helps keep track of the different events opened and
deactivates the permissions when all are unmapped.
We also need to update the permissions in the context switch code so
that tasks keep the right permissions.

Signed-off-by: Raphael Gault 
---
 arch/arm64/include/asm/mmu.h |  6 +
 arch/arm64/include/asm/mmu_context.h |  2 ++
 arch/arm64/include/asm/perf_event.h  | 14 ++
 drivers/perf/arm_pmu.c   | 38 
 4 files changed, 60 insertions(+)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 67ef25d037ea..9de4cf0b17c7 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -29,6 +29,12 @@
 
 typedef struct {
atomic64_t  id;
+
+   /*
+* non-zero if userspace has access to hardware
+* counters directly.
+*/
+   atomic_tpmu_direct_access;
void*vdso;
unsigned long   flags;
 } mm_context_t;
diff --git a/arch/arm64/include/asm/mmu_context.h 
b/arch/arm64/include/asm/mmu_context.h
index 2da3e478fd8f..33494af613d8 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -235,6 +236,7 @@ static inline void __switch_mm(struct mm_struct *next)
}
 
check_and_switch_context(next, cpu);
+   perf_switch_user_access(next);
 }
 
 static inline void
diff --git a/arch/arm64/include/asm/perf_event.h 
b/arch/arm64/include/asm/perf_event.h
index c593761ba61c..32a6d604bb3b 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -19,6 +19,7 @@
 
 #include 
 #include 
+#include 
 
 #define ARMV8_PMU_MAX_COUNTERS  32
 #define ARMV8_PMU_COUNTER_MASK  (ARMV8_PMU_MAX_COUNTERS - 1)
@@ -234,4 +235,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
(regs)->pstate = PSR_MODE_EL1h; \
 }
 
+static inline void perf_switch_user_access(struct mm_struct *mm)
+{
+   if (!IS_ENABLED(CONFIG_PERF_EVENTS))
+   return;
+
+   if (atomic_read(&mm->context.pmu_direct_access)) {
+   write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR,
+pmuserenr_el0);
+   } else {
+   write_sysreg(0, pmuserenr_el0);
+   }
+}
+
 #endif
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 2d06b8095a19..6ae85fcbf297 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -25,6 +25,7 @@
 #include 
 
 #include 
+#include 
 
 static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
 static DEFINE_PER_CPU(int, cpu_irq);
@@ -778,6 +779,41 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
 &cpu_pmu->node);
 }
 
+static void refresh_pmuserenr(void *mm)
+{
+   perf_switch_user_access(mm);
+}
+
+static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return;
+
+   /*
+* This function relies on not being called concurrently in two
+* tasks in the same mm.  Otherwise one task could observe
+* pmu_direct_access > 1 and return all the way back to
+* userspace with user access disabled while another task is still
+* doing on_each_cpu_mask() to enable user access.
+*
+* For now, this can't happen because all callers hold mmap_sem
+* for write.  If this changes, we'll need a different solution.
+*/
+   lockdep_assert_held_exclusive(&mm->mmap_sem);
+
+   if (atomic_inc_return(&mm->context.pmu_direct_access) == 1)
+   on_each_cpu(refresh_pmuserenr, mm, 1);
+}
+
+static void armpmu_event_unmapped(struct perf_event *event, struct mm_struct 
*mm)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return;
+
+   if (atomic_dec_and_test(&mm->context.pmu_direct_access))
+   on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1);
+}
+
 static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 {
struct arm_pmu *pmu;
@@ -799,6 +835,8 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
.pmu_enable = armpmu_enable,
.pmu_disable= armpmu_disable,
.event_init = armpmu_event_init,
+   .event_mapped   = armpmu_event_mapped,
+   .event_unmapped = armpmu_event_unmapped,
.add= armpmu_add,
.del= armpmu_del,
.start  = armpmu_start,
-- 
2.17.1



[PATCH 5/7] arm64: pmu: Add hook to handle pmu-related undefined instructions

2019-06-11 Thread Raphael Gault
In order to protect userspace processes that try to access the pmu
registers in a big.LITTLE environment, we introduce a hook to handle
undefined instructions.

The goal here is to prevent the process from being interrupted by a
signal when the error is caused by the task being scheduled while
accessing a counter, making the counter access invalid. As we are not
able to know efficiently the number of counters physically available on
both pmus in that context, we consider that any faulting access to a
counter which is architecturally correct should not cause a SIGILL
signal if the permissions are set accordingly.

This commit also modifies the mask of the mrs_hook declared in
arch/arm64/kernel/cpufeature.c, which emulates only feature register
access. This is necessary because this hook's mask was too large and
thus matched any mrs instruction, even those not related to the
emulated registers, which made the pmu emulation inefficient.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/cpufeature.c |  4 +--
 arch/arm64/kernel/perf_event.c | 55 ++
 2 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 80babf451519..d9b2be97cc06 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2167,8 +2167,8 @@ static int emulate_mrs(struct pt_regs *regs, u32 insn)
 }
 
 static struct undef_hook mrs_hook = {
-   .instr_mask = 0xfff00000,
-   .instr_val  = 0xd5300000,
+   .instr_mask = 0xffff0000,
+   .instr_val  = 0xd5380000,
.pstate_mask = PSR_AA32_MODE_MASK,
.pstate_val = PSR_MODE_EL0t,
.fn = emulate_mrs,
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 293e4c365a53..93ac24b51d5f 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -19,9 +19,11 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include 
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -1041,6 +1043,59 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
return probe.present ? 0 : -ENODEV;
 }
 
+static int emulate_pmu(struct pt_regs *regs, u32 insn)
+{
+   u32 sys_reg, rt;
+   u32 pmuserenr;
+
+   sys_reg = (u32)aarch64_insn_decode_immediate(AARCH64_INSN_IMM_16, insn) 
<< 5;
+   rt = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn);
+   pmuserenr = read_sysreg(pmuserenr_el0);
+
+   if ((pmuserenr & (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) !=
+   (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR))
+   return -EINVAL;
+
+
+   /*
+* Userspace is expected to only use this in the context of the scheme
+* described in the struct perf_event_mmap_page comments.
+*
+* Given that context, we can only get here if we got migrated between
+* getting the register index and doing the MSR read.  This in turn
+* implies we'll fail the sequence and retry, so any value returned is
+* 'good', all we need is to be non-fatal.
+*
+* The choice of the value 0 comes from the fact that when
+* accessing a register which is not counting events but is accessible,
+* we get 0.
+*/
+   pt_regs_write_reg(regs, rt, 0);
+
+   arm64_skip_faulting_instruction(regs, 4);
+   return 0;
+}
+
+/*
+ * This hook will only be triggered by mrs
+ * instructions on PMU registers. This is mandatory
+ * in order to have a consistent behaviour even on
+ * big.LITTLE systems.
+ */
+static struct undef_hook pmu_hook = {
+   .instr_mask = 0xffff8800,
+   .instr_val  = 0xd53b8800,
+   .fn = emulate_pmu,
+};
+
+static int __init enable_pmu_emulation(void)
+{
+   register_undef_hook(&pmu_hook);
+   return 0;
+}
+
+core_initcall(enable_pmu_emulation);
+
 static int armv8_pmu_init(struct arm_pmu *cpu_pmu)
 {
int ret = armv8pmu_probe_pmu(cpu_pmu);
-- 
2.17.1



[PATCH 3/7] perf: arm64: Use rseq to test userspace access to pmu counters

2019-06-11 Thread Raphael Gault
Add an extra test to check userspace access to pmu hardware counters.
This test doesn't rely on the seqlock as a synchronisation mechanism but
instead uses the restartable sequences to make sure that the thread is
not interrupted when reading the index of the counter and the associated
pmu register.

In addition to reading the pmu counters, this test is run several time
in order to measure the ratio of failures:
I ran this test on the Juno development platform, which is big.LITTLE
with 4 Cortex A53 and 2 Cortex A57. The results vary quite a lot
(running it with 100 tests is not so long and I did it several times).
I ran it once with 1 iterations:
`runs: 1, abort: 62.53%, zero: 34.93%, success: 2.54%`
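
For contrast, the documented seqlock scheme that this test deliberately avoids
looks roughly like this (a sketch based on the perf_event_mmap_page comments;
read_pmu_counter() is a hypothetical helper for the mrs access, rmb() the
usual read barrier from tools/include):

```c
#include <linux/perf_event.h>
#include <linux/types.h>

/*
 * Sketch of the seqlock-based read loop: retry whenever pc->lock
 * changes underneath us.
 */
static u64 seqlock_read(struct perf_event_mmap_page *pc)
{
	u32 seq, idx;
	u64 count;

	do {
		seq = pc->lock;
		rmb();
		idx = pc->index;
		count = pc->offset;
		if (idx)			/* 0 means no direct access */
			count += read_pmu_counter(idx - 1);
		rmb();
	} while (pc->lock != seq);

	return count;
}
```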

Signed-off-by: Raphael Gault 
---
 tools/perf/arch/arm64/include/arch-tests.h|   5 +-
 tools/perf/arch/arm64/include/rseq-arm64.h| 220 ++
 tools/perf/arch/arm64/tests/Build |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c  |   6 +
 tools/perf/arch/arm64/tests/rseq-pmu-events.c | 219 +
 5 files changed, 450 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/arch/arm64/include/rseq-arm64.h
 create mode 100644 tools/perf/arch/arm64/tests/rseq-pmu-events.c

diff --git a/tools/perf/arch/arm64/include/arch-tests.h 
b/tools/perf/arch/arm64/include/arch-tests.h
index a9b17ae0560b..4164762b43c6 100644
--- a/tools/perf/arch/arm64/include/arch-tests.h
+++ b/tools/perf/arch/arm64/include/arch-tests.h
@@ -13,6 +13,9 @@ int test__arch_unwind_sample(struct perf_sample *sample,
 extern struct test arch_tests[];
 int test__rd_pmevcntr(struct test *test __maybe_unused,
  int subtest __maybe_unused);
-
+#ifdef CONFIG_RSEQ
+int rseq__rd_pmevcntr(struct test *test __maybe_unused,
+ int subtest __maybe_unused);
+#endif
 
 #endif
diff --git a/tools/perf/arch/arm64/include/rseq-arm64.h 
b/tools/perf/arch/arm64/include/rseq-arm64.h
new file mode 100644
index ..00d6960915a9
--- /dev/null
+++ b/tools/perf/arch/arm64/include/rseq-arm64.h
@@ -0,0 +1,220 @@
+/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
+/*
+ * rseq-arm64.h
+ *
+ * This file is mostly a copy from
+ * tools/testing/selftests/rseq/rseq-arm64.h
+ */
+
+/*
+ * aarch64 -mbig-endian generates mixed endianness code vs data:
+ * little-endian code and big-endian data. Ensure the RSEQ_SIG signature
+ * matches code endianness.
+ */
+#define __rseq_str_1(x)  #x
+#define __rseq_str(x)		__rseq_str_1(x)
+
+#define RSEQ_ACCESS_ONCE(x)	(*(__volatile__ __typeof__(x) *)&(x))
+#define RSEQ_SIG_CODE  0xd428bc00  /* BRK #0x45E0.  */
+
+#ifdef __AARCH64EB__
+#define RSEQ_SIG_DATA  0x00bc28d4  /* BRK #0x45E0.  */
+#else
+#define RSEQ_SIG_DATA  RSEQ_SIG_CODE
+#endif
+
+#define RSEQ_SIG   RSEQ_SIG_DATA
+
+#define rseq_smp_mb()  __asm__ __volatile__ ("dmb ish" ::: "memory")
+#define rseq_smp_rmb() __asm__ __volatile__ ("dmb ishld" ::: "memory")
+#define rseq_smp_wmb() __asm__ __volatile__ ("dmb ishst" ::: "memory")
+
+#define rseq_smp_load_acquire(p)					\
+__extension__ ({							\
+	__typeof(*p) p1;						\
+	switch (sizeof(*p)) {						\
+	case 1:								\
+		asm volatile ("ldarb %w0, %1"				\
+			: "=r" (*(__u8 *)p)				\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	case 2:								\
+		asm volatile ("ldarh %w0, %1"				\
+			: "=r" (*(__u16 *)p)				\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	case 4:								\
+		asm volatile ("ldar %w0, %1"				\
+			: "=r" (*(__u32 *)p)				\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	case 8:								\
+		asm volatile ("ldar %0, %1"

[PATCH 4/7] arm64: pmu: Add function implementation to update event index in userpage.

2019-06-11 Thread Raphael Gault
In order to be able to access the counter directly from userspace,
we need to provide the index of the counter using the userpage.
We thus need to override the event_idx function to retrieve and
convert the perf_event index to armv8 hardware index.

Since the arm_pmu driver can be used by any implementation, even
if not armv8, two components play a role into making sure the
behaviour is correct and consistent with the PMU capabilities:

* the ARMPMU_EL0_RD_CNTR flag which denotes the capability to access the
counter from userspace.
* the event_idx callback, which is implemented and initialized by
the PMU implementation: if no callback is provided, the default
behaviour applies, returning 0 as the index value (see the sketch below
for how userspace consumes that index).
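
As an illustration of how userspace consumes this index, here is a
hedged sketch following the scheme described in the perf_event_mmap_page
comments. read_pmevcntr() is a stand-in for an mrs-based accessor such
as the one in the perf test of this series; the exposed index is the
hardware counter number plus one, 0 meaning direct access is
unavailable:

```
#include <stdint.h>
#include <linux/perf_event.h>

#define barrier() asm volatile("" ::: "memory")

extern uint64_t read_pmevcntr(int counter);	/* mrs pmevcntrN_el0 */

static uint64_t read_event(struct perf_event_mmap_page *pc)
{
	uint32_t seq, idx;
	uint64_t count;

	do {
		seq = pc->lock;		/* seqlock: retry if it changes */
		barrier();
		idx = pc->index;
		count = pc->offset;
		if (pc->cap_user_rdpmc && idx)
			count += read_pmevcntr(idx - 1);
		barrier();
	} while (pc->lock != seq);

	return count;
}
```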

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/perf_event.c | 21 +
 include/linux/perf/arm_pmu.h   |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 348d12eec566..293e4c365a53 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -829,6 +829,22 @@ static void armv8pmu_clear_event_idx(struct pmu_hw_events 
*cpuc,
clear_bit(idx - 1, cpuc->used_mask);
 }
 
+static int armv8pmu_access_event_idx(struct perf_event *event)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return 0;
+
+   /*
+* We remap the cycle counter index to 32 to
+* match the offset applied to the rest of
+* the counter indices.
+*/
+   if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER)
+   return 32;
+
+   return event->hw.idx;
+}
+
 /*
  * Add an event filter to a given event.
  */
@@ -922,6 +938,8 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
if (armv8pmu_event_is_64bit(event))
event->hw.flags |= ARMPMU_EVT_64BIT;
 
+   event->hw.flags |= ARMPMU_EL0_RD_CNTR;
+
/* Only expose micro/arch events supported by this PMU */
if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS)
&& test_bit(hw_event_id, armpmu->pmceid_bitmap)) {
@@ -1042,6 +1060,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu)
cpu_pmu->set_event_filter   = armv8pmu_set_event_filter;
cpu_pmu->filter_match   = armv8pmu_filter_match;
 
+   cpu_pmu->pmu.event_idx  = armv8pmu_access_event_idx;
+
return 0;
 }
 
@@ -1220,6 +1240,7 @@ void arch_perf_update_userpage(struct perf_event *event,
 */
freq = arch_timer_get_rate();
userpg->cap_user_time = 1;
+   userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR);
 
	clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
NSEC_PER_SEC, 0);
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 4641e850b204..3bef390c1069 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -30,6 +30,8 @@
  */
 /* Event uses a 64bit counter */
 #define ARMPMU_EVT_64BIT   1
+/* Allow access to hardware counter from userspace */
+#define ARMPMU_EL0_RD_CNTR 2
 
 #define HW_OP_UNSUPPORTED  0xFFFF
 #define C(_x)  PERF_COUNT_HW_CACHE_##_x
-- 
2.17.1



[PATCH 1/7] perf: arm64: Compile tests unconditionally

2019-06-11 Thread Raphael Gault
In order to subsequently add more tests for the arm64 architecture
we compile the tests target for arm64 systematically.

Signed-off-by: Raphael Gault 
---
 tools/perf/arch/arm64/Build   | 2 +-
 tools/perf/arch/arm64/tests/Build | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/arm64/Build b/tools/perf/arch/arm64/Build
index 36222e64bbf7..a7dd46a5b678 100644
--- a/tools/perf/arch/arm64/Build
+++ b/tools/perf/arch/arm64/Build
@@ -1,2 +1,2 @@
 perf-y += util/
-perf-$(CONFIG_DWARF_UNWIND) += tests/
+perf-y += tests/
diff --git a/tools/perf/arch/arm64/tests/Build 
b/tools/perf/arch/arm64/tests/Build
index 41707fea74b3..a61c06bdb757 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,4 @@
 perf-y += regs_load.o
-perf-y += dwarf-unwind.o
+perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
 perf-y += arch-tests.o
-- 
2.17.1



[PATCH 0/7] arm64: Enable access to pmu registers by user-space

2019-06-11 Thread Raphael Gault
The perf user-space tool relies on the PMU to monitor events. It offers an
abstraction layer over the hardware counters since the underlying
implementation is cpu-dependent. We want to allow userspace tools to have
access to the registers storing the hardware counters' values directly.
This targets specifically self-monitoring tasks in order to reduce the
overhead by directly accessing the registers without having to go
through the kernel.
In order to do this we need to set up the pmu so that it exposes its registers
to userspace access.

The first patch enables the tests for arm64 architecture in the perf
tool to be compiled systematically.

The second patch adds a test to the perf tool so that we can check that the
access to the registers works correctly from userspace.

The third patch adds another test similar to the previous one but this time
using rseq as the mechanism to ensure data correctness.

The fourth patch focuses on the armv8 pmuv3 PMU support and makes sure that
the access to the pmu registers is enabled and that userspace has
access to the relevant information in order to use them.

The fifth patch adds a hook to handle faulting access to the pmu
registers. This is necessary in order to have a coherent behaviour
on big.LITTLE environments.

The sixth patch puts in place callbacks to enable access to the hardware
counters from userspace when a compatible event is opened using the perf
API.

Raphael Gault (7):
  perf: arm64: Compile tests unconditionally
  perf: arm64: Add test to check userspace access to hardware counters.
  perf: arm64: Use rseq to test userspace access to pmu counters
  arm64: pmu: Add function implementation to update event index in
userpage.
  arm64: pmu: Add hook to handle pmu-related undefined instructions
  arm64: perf: Enable pmu counter direct access for perf event on armv8
  Documentation: arm64: Document PMU counters access from userspace

 .../arm64/pmu_counter_user_access.txt |  42 +++
 arch/arm64/include/asm/mmu.h  |   6 +
 arch/arm64/include/asm/mmu_context.h  |   2 +
 arch/arm64/include/asm/perf_event.h   |  14 +
 arch/arm64/kernel/cpufeature.c|   4 +-
 arch/arm64/kernel/perf_event.c|  76 ++
 drivers/perf/arm_pmu.c|  38 +++
 include/linux/perf/arm_pmu.h  |   2 +
 tools/perf/arch/arm64/Build   |   2 +-
 tools/perf/arch/arm64/include/arch-tests.h|   9 +
 tools/perf/arch/arm64/include/rseq-arm64.h| 220 +++
 tools/perf/arch/arm64/tests/Build |   4 +-
 tools/perf/arch/arm64/tests/arch-tests.c  |  10 +
 tools/perf/arch/arm64/tests/rseq-pmu-events.c | 219 +++
 tools/perf/arch/arm64/tests/user-events.c | 255 ++
 15 files changed, 899 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt
 create mode 100644 tools/perf/arch/arm64/include/rseq-arm64.h
 create mode 100644 tools/perf/arch/arm64/tests/rseq-pmu-events.c
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

-- 
2.17.1



Re: [RFC 4/7] arm64: pmu: Add function implementation to update event index in userpage.

2019-05-29 Thread Raphael Gault

Hi Peter,

On 5/29/19 1:32 PM, Peter Zijlstra wrote:

On Wed, May 29, 2019 at 01:25:46PM +0100, Raphael Gault wrote:

Hi Robin, Hi Peter,

On 5/29/19 11:50 AM, Robin Murphy wrote:

On 29/05/2019 11:46, Raphael Gault wrote:

Hi Peter,

On 5/29/19 10:46 AM, Peter Zijlstra wrote:

On Tue, May 28, 2019 at 04:03:17PM +0100, Raphael Gault wrote:

+static int armv8pmu_access_event_idx(struct perf_event *event)
+{
+    if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+    return 0;
+
+    /*
+ * We remap the cycle counter index to 32 to
+ * match the offset applied to the rest of
+ * the counter indices.
+ */
+    if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER)
+    return 32;
+
+    return event->hw.idx;


Is there a guarantee event->hw.idx is never 0? Or should you, just like
x86, use +1 here?



You are right, I should use +1 here. Thanks for pointing that out.


Isn't that already the case though, since we reserve index 0 for the
cycle counter? I'm looking at ARMV8_IDX_TO_COUNTER() here...



Well the current behaviour is correct and takes care of the zero case with
the ARMV8_IDX_CYCLE_COUNTER check. But using ARMV8_IDX_TO_COUNTER() and
adding 1 would also work. However, this seems redundant with the current
value held in event->hw.idx.


Note that whatever you pick now will become ABI. Also note that the
comment/pseudo-code in perf_event_mmap_page suggests to use idx-1 for
the actual hardware access.



Indeed, that's true. As for the pseudo-code in perf_event_mmap_page, it
is compatible with what I do here. The two approaches differ only in
form; in both cases it is necessary to subtract 1 from the returned
value in order to access the correct hardware counter.
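
To make the equivalence explicit, a small sketch assuming the usual
armv8 definitions (ARMV8_IDX_CYCLE_COUNTER == 0, remaining counters
starting at index 1, ARMV8_IDX_TO_COUNTER() mapping idx to idx - 1):

```
/* Form used in the patch: special-case the cycle counter. */
static int event_idx_patch(int hw_idx)
{
	if (hw_idx == 0)	/* ARMV8_IDX_CYCLE_COUNTER */
		return 32;
	return hw_idx;		/* already "hardware counter + 1" */
}

/*
 * Form suggested in review: go through ARMV8_IDX_TO_COUNTER(),
 * assumed here to map idx to idx - 1, then add 1 back.
 */
static int event_idx_review(int hw_idx)
{
	if (hw_idx == 0)
		return 32;
	return (hw_idx - 1) + 1;
}

/* Either way, userspace performs the mrs on (returned index - 1). */
```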


Thank you,

--
Raphael Gault


Re: [RFC 4/7] arm64: pmu: Add function implementation to update event index in userpage.

2019-05-29 Thread Raphael Gault

Hi Robin, Hi Peter,

On 5/29/19 11:50 AM, Robin Murphy wrote:

On 29/05/2019 11:46, Raphael Gault wrote:

Hi Peter,

On 5/29/19 10:46 AM, Peter Zijlstra wrote:

On Tue, May 28, 2019 at 04:03:17PM +0100, Raphael Gault wrote:

+static int armv8pmu_access_event_idx(struct perf_event *event)
+{
+    if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+    return 0;
+
+    /*
+ * We remap the cycle counter index to 32 to
+ * match the offset applied to the rest of
+ * the counter indices.
+ */
+    if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER)
+    return 32;
+
+    return event->hw.idx;


Is there a guarantee event->hw.idx is never 0? Or should you, just like
x86, use +1 here?



You are right, I should use +1 here. Thanks for pointing that out.


Isn't that already the case though, since we reserve index 0 for the 
cycle counter? I'm looking at ARMV8_IDX_TO_COUNTER() here...




Well the current behaviour is correct and takes care of the zero case
with the ARMV8_IDX_CYCLE_COUNTER check. But using ARMV8_IDX_TO_COUNTER()
and adding 1 would also work. However, this seems redundant with the
current value held in event->hw.idx.



Robin.


--
Raphael Gault


Re: [RFC 4/7] arm64: pmu: Add function implementation to update event index in userpage.

2019-05-29 Thread Raphael Gault

Hi Peter,

On 5/29/19 10:46 AM, Peter Zijlstra wrote:

On Tue, May 28, 2019 at 04:03:17PM +0100, Raphael Gault wrote:

+static int armv8pmu_access_event_idx(struct perf_event *event)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return 0;
+
+   /*
+* We remap the cycle counter index to 32 to
+* match the offset applied to the rest of
+* the counter indices.
+*/
+   if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER)
+   return 32;
+
+   return event->hw.idx;


Is there a guarantee event->hw.idx is never 0? Or should you, just like
x86, use +1 here?



You are right, I should use +1 here. Thanks for pointing that out.


+}


Thanks,

--
Raphael Gault


[RFC 6/7] arm64: perf: Enable pmu counter direct access for perf event on armv8

2019-05-28 Thread Raphael Gault
Keep track of events opened with direct access to the hardware counters
and modify permissions while they are open.

The strategy used here is the same one x86 uses: every time an event
is mapped, the permissions are set if required. The atomic field added
in the mm_context helps keep track of the different events opened and
de-activates the permissions when all are unmapped (see the sketch
below for the userspace sequence that drives this).
We also need to update the permissions in the context switch code so
that tasks keep the right permissions.
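
For context, a hedged sketch of the userspace sequence that drives these
callbacks: the mmap() of the event's first page is what invokes
->event_mapped() and bumps pmu_direct_access, and the matching munmap()
invokes ->event_unmapped():

```
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* Returns the event fd, or -1; *pc gets the mmap'ed event page. */
static int open_and_map(struct perf_event_mmap_page **pc)
{
	struct perf_event_attr attr;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_INSTRUCTIONS;
	attr.size = sizeof(attr);

	fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
	if (fd < 0)
		return -1;

	/* This mmap triggers ->event_mapped() and enables user access. */
	*pc = mmap(NULL, sysconf(_SC_PAGESIZE), PROT_READ, MAP_SHARED, fd, 0);
	if (*pc == MAP_FAILED) {
		close(fd);
		return -1;
	}
	return fd;
}
```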

Signed-off-by: Raphael Gault 
---
 arch/arm64/include/asm/mmu.h |  6 +
 arch/arm64/include/asm/mmu_context.h |  2 ++
 arch/arm64/include/asm/perf_event.h  | 14 ++
 drivers/perf/arm_pmu.c   | 38 
 4 files changed, 60 insertions(+)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 67ef25d037ea..9de4cf0b17c7 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -29,6 +29,12 @@
 
 typedef struct {
atomic64_t  id;
+
+   /*
+    * non-zero if userspace has access to the hardware
+    * counters directly.
+    */
+   atomic_t	pmu_direct_access;
	void	*vdso;
unsigned long   flags;
 } mm_context_t;
diff --git a/arch/arm64/include/asm/mmu_context.h 
b/arch/arm64/include/asm/mmu_context.h
index 2da3e478fd8f..33494af613d8 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -235,6 +236,7 @@ static inline void __switch_mm(struct mm_struct *next)
}
 
check_and_switch_context(next, cpu);
+   perf_switch_user_access(next);
 }
 
 static inline void
diff --git a/arch/arm64/include/asm/perf_event.h 
b/arch/arm64/include/asm/perf_event.h
index c593761ba61c..32a6d604bb3b 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -19,6 +19,7 @@
 
 #include 
 #include 
+#include 
 
 #define ARMV8_PMU_MAX_COUNTERS	32
 #define ARMV8_PMU_COUNTER_MASK	(ARMV8_PMU_MAX_COUNTERS - 1)
@@ -234,4 +235,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
(regs)->pstate = PSR_MODE_EL1h; \
 }
 
+static inline void perf_switch_user_access(struct mm_struct *mm)
+{
+   if (!IS_ENABLED(CONFIG_PERF_EVENTS))
+   return;
+
+   if (atomic_read(&mm->context.pmu_direct_access)) {
+   write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR,
+pmuserenr_el0);
+   } else {
+   write_sysreg(0, pmuserenr_el0);
+   }
+}
+
 #endif
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index eec75b97e7ea..0e5588cd2f39 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -24,6 +24,7 @@
 #include 
 
 #include 
+#include 
 
 static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
 static DEFINE_PER_CPU(int, cpu_irq);
@@ -777,6 +778,41 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
					    &cpu_pmu->node);
 }
 
+static void refresh_pmuserenr(void *mm)
+{
+   perf_switch_user_access(mm);
+}
+
+static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return;
+
+   /*
+* This function relies on not being called concurrently in two
+* tasks in the same mm.  Otherwise one task could observe
+* pmu_direct_access > 1 and return all the way back to
+* userspace with user access disabled while another task is still
+* doing on_each_cpu_mask() to enable user access.
+*
+* For now, this can't happen because all callers hold mmap_sem
+* for write.  If this changes, we'll need a different solution.
+*/
+   lockdep_assert_held_exclusive(&mm->mmap_sem);
+
+   if (atomic_inc_return(&mm->context.pmu_direct_access) == 1)
+   on_each_cpu(refresh_pmuserenr, mm, 1);
+}
+
+static void armpmu_event_unmapped(struct perf_event *event, struct mm_struct 
*mm)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return;
+
+   if (atomic_dec_and_test(&mm->context.pmu_direct_access))
+   on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1);
+}
+
 static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 {
struct arm_pmu *pmu;
@@ -798,6 +834,8 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
.pmu_enable = armpmu_enable,
.pmu_disable= armpmu_disable,
.event_init = armpmu_event_init,
+   .event_mapped   = armpmu_event_mapped,
+   .event_unmapped = armpmu_event_unmapped,
.add= armpmu_add,
.del= armpmu_del,
.start  = armpmu_start,
-- 
2.17.1



[RFC 3/7] perf: arm64: Use rseq to test userspace access to pmu counters

2019-05-28 Thread Raphael Gault
Add an extra test to check userspace access to pmu hardware counters.
This test doesn't rely on the seqlock as a synchronisation mechanism but
instead uses the restartable sequences to make sure that the thread is
not interrupted when reading the index of the counter and the associated
pmu register.

In addition to reading the pmu counters, this test is run several times
in order to measure the ratio of failures:
I ran this test on the Juno development platform, which is big.LITTLE
with 4 Cortex A53 and 2 Cortex A57. The results vary quite a lot
(running it with 100 tests is not so long and I did it several times).
I ran it once with 1 iterations:
`runs: 1, abort: 62.53%, zero: 34.93%, success: 2.54%`

Signed-off-by: Raphael Gault 
---
 tools/perf/arch/arm64/include/arch-tests.h|   5 +-
 tools/perf/arch/arm64/include/rseq-arm64.h| 220 ++
 tools/perf/arch/arm64/tests/Build |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c  |   6 +
 tools/perf/arch/arm64/tests/rseq-pmu-events.c | 219 +
 5 files changed, 450 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/arch/arm64/include/rseq-arm64.h
 create mode 100644 tools/perf/arch/arm64/tests/rseq-pmu-events.c

diff --git a/tools/perf/arch/arm64/include/arch-tests.h 
b/tools/perf/arch/arm64/include/arch-tests.h
index a9b17ae0560b..4164762b43c6 100644
--- a/tools/perf/arch/arm64/include/arch-tests.h
+++ b/tools/perf/arch/arm64/include/arch-tests.h
@@ -13,6 +13,9 @@ int test__arch_unwind_sample(struct perf_sample *sample,
 extern struct test arch_tests[];
 int test__rd_pmevcntr(struct test *test __maybe_unused,
  int subtest __maybe_unused);
-
+#ifdef CONFIG_RSEQ
+int rseq__rd_pmevcntr(struct test *test __maybe_unused,
+ int subtest __maybe_unused);
+#endif
 
 #endif
diff --git a/tools/perf/arch/arm64/include/rseq-arm64.h 
b/tools/perf/arch/arm64/include/rseq-arm64.h
new file mode 100644
index ..00d6960915a9
--- /dev/null
+++ b/tools/perf/arch/arm64/include/rseq-arm64.h
@@ -0,0 +1,220 @@
+/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
+/*
+ * rseq-arm64.h
+ *
+ * This file is mostly a copy from
+ * tools/testing/selftests/rseq/rseq-arm64.h
+ */
+
+/*
+ * aarch64 -mbig-endian generates mixed endianness code vs data:
+ * little-endian code and big-endian data. Ensure the RSEQ_SIG signature
+ * matches code endianness.
+ */
+#define __rseq_str_1(x)  #x
+#define __rseq_str(x)		__rseq_str_1(x)
+
+#define RSEQ_ACCESS_ONCE(x)	(*(__volatile__ __typeof__(x) *)&(x))
+#define RSEQ_SIG_CODE  0xd428bc00  /* BRK #0x45E0.  */
+
+#ifdef __AARCH64EB__
+#define RSEQ_SIG_DATA  0x00bc28d4  /* BRK #0x45E0.  */
+#else
+#define RSEQ_SIG_DATA  RSEQ_SIG_CODE
+#endif
+
+#define RSEQ_SIG   RSEQ_SIG_DATA
+
+#define rseq_smp_mb()  __asm__ __volatile__ ("dmb ish" ::: "memory")
+#define rseq_smp_rmb() __asm__ __volatile__ ("dmb ishld" ::: "memory")
+#define rseq_smp_wmb() __asm__ __volatile__ ("dmb ishst" ::: "memory")
+
+#define rseq_smp_load_acquire(p)					\
+__extension__ ({							\
+	__typeof(*p) p1;						\
+	switch (sizeof(*p)) {						\
+	case 1:								\
+		asm volatile ("ldarb %w0, %1"				\
+			: "=r" (*(__u8 *)p)				\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	case 2:								\
+		asm volatile ("ldarh %w0, %1"				\
+			: "=r" (*(__u16 *)p)				\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	case 4:								\
+		asm volatile ("ldar %w0, %1"				\
+			: "=r" (*(__u32 *)p)				\
+			: "Q" (*p) : "memory");				\
+		break;							\
+	case 8:								\
+		asm volatile ("ldar %0, %1"

[RFC 5/7] arm64: pmu: Add hook to handle pmu-related undefined instructions

2019-05-28 Thread Raphael Gault
In order to prevent userspace processes which are trying to access the
pmu registers on a big.LITTLE environment from being killed, we
introduce a hook to handle undefined instructions.

The goal here is to prevent the process from being interrupted by a
signal when the error is caused by the task being scheduled while
accessing a counter, which makes the counter access invalid. As we are
not able to efficiently know the number of counters physically available
on both pmus in that context, we consider that any faulting access to a
counter which is architecturally correct should not cause a SIGILL
signal if the permissions are set accordingly.
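
As a toy illustration (not the patch itself) of the mechanism involved:
an undef hook matches the faulting instruction against an
instr_mask/instr_val pair, and on a hit the kernel calls its ->fn to
emulate the instruction instead of delivering SIGILL:

```
#include <stdbool.h>
#include <stdint.h>

struct toy_undef_hook {
	uint32_t instr_mask;
	uint32_t instr_val;
	int (*fn)(uint32_t insn);	/* emulation handler */
};

static bool hook_matches(const struct toy_undef_hook *h, uint32_t insn)
{
	return (insn & h->instr_mask) == h->instr_val;
}

/*
 * The pmu_hook added below matches EL0 mrs reads of PMU counter
 * registers; emulate_pmu() then writes 0 to the target register and
 * skips the instruction, so no signal is raised.
 */
```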

This commit also modifies the mask of the mrs_hook declared in
arch/arm64/kernel/cpufeature.c, which emulates only feature register
access. This is necessary because this hook's mask was too broad and
thus matched any mrs instruction, even those unrelated to the emulated
registers, which made the pmu emulation inefficient.
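
The narrowing can be seen from the encoding arithmetic. Below is a
sketch using the field layout documented in the test added earlier in
this series (op0 at bits [20:19], op1 at [18:16], CRn at [15:12], CRm
at [11:8], op2 at [7:5], Rt at [4:0]); note that op0 is 2 or 3 for any
mrs, so bit 20 is already set in the base opcode:

```
#include <stdint.h>

#define MRS_BASE	0xd5300000u	/* mrs x0, <sysreg> skeleton */

static uint32_t mrs_encode(uint32_t op0, uint32_t op1, uint32_t crn,
			   uint32_t crm, uint32_t op2, uint32_t rt)
{
	return MRS_BASE | (op0 & 0x3) << 19 | (op1 & 0x7) << 16 |
	       (crn & 0xf) << 12 | (crm & 0xf) << 8 | (op2 & 0x7) << 5 |
	       (rt & 0x1f);
}

/*
 * mrs_encode(3, 0, 0, 0, 0, 0) == 0xd5380000: an ID-register read.
 * Checking (insn & 0xffff0000) == 0xd5380000 therefore accepts only
 * op0 == 3 / op1 == 0 / CRn == 0 accesses, instead of matching every
 * mrs instruction as the previous, wider mask did.
 */
```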

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/cpufeature.c |  4 ++--
 arch/arm64/kernel/perf_event.c | 41 ++
 2 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 2b807f129e60..daa7b31f2c73 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2166,8 +2166,8 @@ static int emulate_mrs(struct pt_regs *regs, u32 insn)
 }
 
 static struct undef_hook mrs_hook = {
-   .instr_mask = 0xfff00000,
-   .instr_val  = 0xd5300000,
+   .instr_mask = 0xffff0000,
+   .instr_val  = 0xd5380000,
.pstate_mask = PSR_AA32_MODE_MASK,
.pstate_val = PSR_MODE_EL0t,
.fn = emulate_mrs,
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 3dc1265540df..1687f6d1fa27 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -19,9 +19,11 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include 
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -1009,6 +1011,45 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
return probe.present ? 0 : -ENODEV;
 }
 
+static int emulate_pmu(struct pt_regs *regs, u32 insn)
+{
+   u32 sys_reg, rt;
+   u32 pmuserenr;
+
+   sys_reg = (u32)aarch64_insn_decode_immediate(AARCH64_INSN_IMM_16, insn) << 5;
+   rt = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn);
+   pmuserenr = read_sysreg(pmuserenr_el0);
+
+   if ((pmuserenr & (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) !=
+   (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR))
+   return -EINVAL;
+
+   pt_regs_write_reg(regs, rt, 0);
+
+   arm64_skip_faulting_instruction(regs, 4);
+   return 0;
+}
+
+/*
+ * This hook will only be triggered by mrs
+ * instructions on PMU registers. This is mandatory
+ * in order to have a consistent behaviour even on
+ * big.LITTLE systems.
+ */
+static struct undef_hook pmu_hook = {
+   .instr_mask = 0xffff8800,
+   .instr_val  = 0xd53b8800,
+   .fn = emulate_pmu,
+};
+
+static int __init enable_pmu_emulation(void)
+{
+   register_undef_hook(&pmu_hook);
+   return 0;
+}
+
+core_initcall(enable_pmu_emulation);
+
 static int armv8_pmu_init(struct arm_pmu *cpu_pmu)
 {
int ret = armv8pmu_probe_pmu(cpu_pmu);
-- 
2.17.1



[RFC 4/7] arm64: pmu: Add function implementation to update event index in userpage.

2019-05-28 Thread Raphael Gault
In order to be able to access the counter directly from userspace,
we need to provide the index of the counter using the userpage.
We thus need to override the event_idx function to retrieve and
convert the perf_event index to armv8 hardware index.

Since the arm_pmu driver can be used by any implementation, even
if not armv8, two components play a role into making sure the
behaviour is correct and consistent with the PMU capabilities:

* the ARMPMU_EL0_RD_CNTR flag which denotes the capability to access the
counter from userspace.
* the event_idx callback, which is implemented and initialized by
the PMU implementation: if no callback is provided, the default
behaviour applies, returning 0 as the index value.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/perf_event.c | 21 +
 include/linux/perf/arm_pmu.h   |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 6164d389eed6..3dc1265540df 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -809,6 +809,22 @@ static void armv8pmu_clear_event_idx(struct pmu_hw_events 
*cpuc,
clear_bit(idx - 1, cpuc->used_mask);
 }
 
+static int armv8pmu_access_event_idx(struct perf_event *event)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return 0;
+
+   /*
+* We remap the cycle counter index to 32 to
+* match the offset applied to the rest of
+* the counter indices.
+*/
+   if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER)
+   return 32;
+
+   return event->hw.idx;
+}
+
 /*
  * Add an event filter to a given event.
  */
@@ -890,6 +906,8 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
if (armv8pmu_event_is_64bit(event))
event->hw.flags |= ARMPMU_EVT_64BIT;
 
+   event->hw.flags |= ARMPMU_EL0_RD_CNTR;
+
/* Only expose micro/arch events supported by this PMU */
if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS)
&& test_bit(hw_event_id, armpmu->pmceid_bitmap)) {
@@ -1010,6 +1028,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu)
cpu_pmu->set_event_filter   = armv8pmu_set_event_filter;
cpu_pmu->filter_match   = armv8pmu_filter_match;
 
+   cpu_pmu->pmu.event_idx  = armv8pmu_access_event_idx;
+
return 0;
 }
 
@@ -1188,6 +1208,7 @@ void arch_perf_update_userpage(struct perf_event *event,
 */
freq = arch_timer_get_rate();
userpg->cap_user_time = 1;
+   userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR);
 
	clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
NSEC_PER_SEC, 0);
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 4641e850b204..3bef390c1069 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -30,6 +30,8 @@
  */
 /* Event uses a 64bit counter */
 #define ARMPMU_EVT_64BIT   1
+/* Allow access to hardware counter from userspace */
+#define ARMPMU_EL0_RD_CNTR 2
 
 #define HW_OP_UNSUPPORTED  0xFFFF
 #define C(_x)  PERF_COUNT_HW_CACHE_##_x
-- 
2.17.1



[RFC 2/7] perf: arm64: Add test to check userspace access to hardware counters.

2019-05-28 Thread Raphael Gault
This test relies on the fact that the PMU registers are accessible
from userspace. It then uses the perf_event_mmap_page to retrieve
the counter index and access the underlying register.

This test uses sched_setaffinity(2) in order to run on all CPUs and thus
check the behaviour of the PMU of all cpus in a big.LITTLE environment.
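
A minimal sketch of that affinity loop; run_counter_check() is a
hypothetical stand-in for the per-cpu body of the test:

```
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

extern int run_counter_check(void);	/* hypothetical per-cpu test body */

static int test_all_cpus(void)
{
	long nr_cpus = sysconf(_SC_NPROCESSORS_ONLN);

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		cpu_set_t set;

		CPU_ZERO(&set);
		CPU_SET(cpu, &set);
		/* Pin to this cpu so its own PMU is the one exercised. */
		if (sched_setaffinity(0, sizeof(set), &set))
			return -1;
		if (run_counter_check())
			return -1;
	}
	return 0;
}
```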

Signed-off-by: Raphael Gault 
---
 tools/perf/arch/arm64/include/arch-tests.h |   6 +
 tools/perf/arch/arm64/tests/Build  |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c   |   4 +
 tools/perf/arch/arm64/tests/user-events.c  | 255 +
 4 files changed, 266 insertions(+)
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

diff --git a/tools/perf/arch/arm64/include/arch-tests.h 
b/tools/perf/arch/arm64/include/arch-tests.h
index 90ec4c8cb880..a9b17ae0560b 100644
--- a/tools/perf/arch/arm64/include/arch-tests.h
+++ b/tools/perf/arch/arm64/include/arch-tests.h
@@ -2,11 +2,17 @@
 #ifndef ARCH_TESTS_H
 #define ARCH_TESTS_H
 
+#define __maybe_unused __attribute__((unused))
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 struct thread;
 struct perf_sample;
+int test__arch_unwind_sample(struct perf_sample *sample,
+struct thread *thread);
 #endif
 
 extern struct test arch_tests[];
+int test__rd_pmevcntr(struct test *test __maybe_unused,
+ int subtest __maybe_unused);
+
 
 #endif
diff --git a/tools/perf/arch/arm64/tests/Build 
b/tools/perf/arch/arm64/tests/Build
index a61c06bdb757..3f9a20c17fc6 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,5 @@
 perf-y += regs_load.o
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
+perf-y += user-events.o
 perf-y += arch-tests.o
diff --git a/tools/perf/arch/arm64/tests/arch-tests.c 
b/tools/perf/arch/arm64/tests/arch-tests.c
index 5b1543c98022..57df9b89dede 100644
--- a/tools/perf/arch/arm64/tests/arch-tests.c
+++ b/tools/perf/arch/arm64/tests/arch-tests.c
@@ -10,6 +10,10 @@ struct test arch_tests[] = {
.func = test__dwarf_unwind,
},
 #endif
+   {
+   .desc = "User counter access",
+   .func = test__rd_pmevcntr,
+   },
{
.func = NULL,
},
diff --git a/tools/perf/arch/arm64/tests/user-events.c 
b/tools/perf/arch/arm64/tests/user-events.c
new file mode 100644
index ..958e4cd000c1
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/user-events.c
@@ -0,0 +1,255 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "perf.h"
+#include "debug.h"
+#include "tests/tests.h"
+#include "cloexec.h"
+#include "util.h"
+#include "arch-tests.h"
+
+/*
+ * ARMv8 ARM reserves the following encoding for system registers:
+ * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview",
+ *  C5.2, version:ARM DDI 0487A.f)
+ *  [20-19] : Op0
+ *  [18-16] : Op1
+ *  [15-12] : CRn
+ *  [11-8]  : CRm
+ *  [7-5]   : Op2
+ */
+#define Op0_shift   19
+#define Op0_mask0x3
+#define Op1_shift   16
+#define Op1_mask0x7
+#define CRn_shift   12
+#define CRn_mask0xf
+#define CRm_shift   8
+#define CRm_mask0xf
+#define Op2_shift   5
+#define Op2_mask0x7
+
+#define __stringify(x) #x
+
+#define read_sysreg(r) ({  \
+   u64 __val;  \
+   asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \
+   __val;  \
+})
+
+#define PMEVCNTR_READ_CASE(idx)\
+   case idx:   \
+   return read_sysreg(pmevcntr##idx##_el0)
+
+#define PMEVCNTR_CASES(readwrite)  \
+   PMEVCNTR_READ_CASE(0);  \
+   PMEVCNTR_READ_CASE(1);  \
+   PMEVCNTR_READ_CASE(2);  \
+   PMEVCNTR_READ_CASE(3);  \
+   PMEVCNTR_READ_CASE(4);  \
+   PMEVCNTR_READ_CASE(5);  \
+   PMEVCNTR_READ_CASE(6);  \
+   PMEVCNTR_READ_CASE(7);  \
+   PMEVCNTR_READ_CASE(8);  \
+   PMEVCNTR_READ_CASE(9);  \
+   PMEVCNTR_READ_CASE(10); \
+   PMEVCNTR_READ_CASE(11); \
+   PMEVCNTR_READ_CASE(12); \
+   PMEVCNTR_READ_CASE(13); \
+   PMEVCNTR_READ_CASE(14); \
+   PMEVCNTR_READ_CASE(15); \
+   PMEVCNTR_READ_CASE(16); \
+   PMEVCNTR_READ_CASE(17); \
+   PMEVCNTR_READ_CASE(18

[RFC 7/7] Documentation: arm64: Document PMU counters access from userspace

2019-05-28 Thread Raphael Gault
Add a documentation file to describe the access to the pmu hardware
counters from userspace

Signed-off-by: Raphael Gault 
---
 .../arm64/pmu_counter_user_access.txt | 42 +++
 1 file changed, 42 insertions(+)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt

diff --git a/Documentation/arm64/pmu_counter_user_access.txt 
b/Documentation/arm64/pmu_counter_user_access.txt
new file mode 100644
index ..6788b1107381
--- /dev/null
+++ b/Documentation/arm64/pmu_counter_user_access.txt
@@ -0,0 +1,42 @@
+Access to PMU hardware counter from userspace
+=============================================
+
+Overview
+
+The perf user-space tool relies on the PMU to monitor events. It offers an
+abstraction layer over the hardware counters since the underlying
+implementation is cpu-dependent.
+Arm64 allows userspace tools to have access to the registers storing the
+hardware counters' values directly.
+
+This targets specifically self-monitoring tasks in order to reduce the overhead
+by directly accessing the registers without having to go through the kernel.
+
+How-to
+------
+The focus is set on the armv8 pmuv3 which makes sure that the access to the pmu
+registers is enabled and that userspace has access to the relevant
+information in order to use them.
+
+In order to have access to the hardware counter it is necessary to open the
+event using the perf tool interface: the sys_perf_event_open syscall returns
+a fd which can subsequently be used with the mmap syscall in order to
+retrieve a page of memory containing information about the event.
+The PMU driver uses this page to expose to the user the hardware counter's
+index. Using this index enables the user to access the PMU registers using the
+`mrs` instruction.
+
+Have a look at `tools/perf/arch/arm64/tests/user-events.c` for an example. It
+can be run using the perf tool to check that the access to the registers works
+correctly from userspace:
+
+./perf test -v
+
+About chained events
+
+When the user requests an event to be counted on 64 bits, two hardware
+counters are used and need to be combined to retrieve the correct value:
+
+val = read_counter(idx);
+if ((event.attr.config1 & 0x1))
+   val = (val << 32) | read_counter(idx - 1);
-- 
2.17.1
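
As a follow-up to the documentation above, a hedged sketch of opening a
chained (64-bit) event and doing the combined read. read_counter()
stands in for the mrs-based accessor and idx for the index exposed
through the event's mmap page:

```
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

extern uint64_t read_counter(int idx);	/* hypothetical mrs-based read */

static int open_chained_cycles(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.size = sizeof(attr);
	attr.config1 = 0x1;	/* request a 64-bit, i.e. chained, counter */

	return syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}

static uint64_t read_chained(int idx)
{
	/* Combined read, exactly as in the documentation above. */
	uint64_t val = read_counter(idx);

	return (val << 32) | read_counter(idx - 1);
}
```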



[RFC V2 0/7] arm64: Enable access to pmu registers by user-space

2019-05-28 Thread Raphael Gault
The perf user-space tool relies on the PMU to monitor events. It offers an
abstraction layer over the hardware counters since the underlying
implementation is cpu-dependent. We want to allow userspace tools to have
access to the registers storing the hardware counters' values directly.
This targets specifically self-monitoring tasks in order to reduce the
overhead by directly accessing the registers without having to go
through the kernel.
In order to do this we need to set up the pmu so that it exposes its registers
to userspace access.

The first patch enables the tests for arm64 architecture in the perf
tool to be compiled systematically.

The second patch adds a test to the perf tool so that we can check that the
access to the registers works correctly from userspace.

The third patch adds another test similar to the previous one but this time
using rseq as the mechanism to ensure data correctness.

The fourth patch focuses on the armv8 pmuv3 PMU support and makes sure that
the access to the pmu registers is enabled and that userspace has
access to the relevant information in order to use them.

The fifth patch adds a hook to handle faulting access to the pmu
registers. This is necessary in order to have a coherent behaviour
on big.LITTLE environments.

The sixth patch puts in place callbacks to enable access to the hardware
counters from userspace when a compatible event is opened using the perf
API.

RFC: In my opinion there is no need to save pmselr_el0 when context
switching like we do for pmuserenr_el0 since, whether it's the seqlock
mechanism or the restartable sequences, the user should notice right
away that the value held in pmxevcntr_el0 is incorrect when the task has
been rescheduled. However, I still wanted to raise this point on the
list to confirm that saving it is indeed unnecessary.

Changes since V1: Add a test using rseq

Raphael Gault (7):
  perf: arm64: Compile tests unconditionally
  perf: arm64: Add test to check userspace access to hardware counters.
  perf: arm64: Use rseq to test userspace access to pmu counters
  arm64: pmu: Add function implementation to update event index in
userpage.
  arm64: pmu: Add hook to handle pmu-related undefined instructions
  arm64: perf: Enable pmu counter direct access for perf event on armv8
  Documentation: arm64: Document PMU counters access from userspace

 .../arm64/pmu_counter_user_access.txt |  42 +++
 arch/arm64/include/asm/mmu.h  |   6 +
 arch/arm64/include/asm/mmu_context.h  |   2 +
 arch/arm64/include/asm/perf_event.h   |  14 +
 arch/arm64/kernel/cpufeature.c|   4 +-
 arch/arm64/kernel/perf_event.c|  62 +
 drivers/perf/arm_pmu.c|  38 +++
 include/linux/perf/arm_pmu.h  |   2 +
 tools/perf/arch/arm64/Build   |   2 +-
 tools/perf/arch/arm64/include/arch-tests.h|   9 +
 tools/perf/arch/arm64/include/rseq-arm64.h| 220 +++
 tools/perf/arch/arm64/tests/Build |   4 +-
 tools/perf/arch/arm64/tests/arch-tests.c  |  10 +
 tools/perf/arch/arm64/tests/rseq-pmu-events.c | 219 +++
 tools/perf/arch/arm64/tests/user-events.c | 255 ++
 15 files changed, 885 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt
 create mode 100644 tools/perf/arch/arm64/include/rseq-arm64.h
 create mode 100644 tools/perf/arch/arm64/tests/rseq-pmu-events.c
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

-- 
2.17.1



[RFC 1/7] perf: arm64: Compile tests unconditionally

2019-05-28 Thread Raphael Gault
In order to subsequently add more tests for the arm64 architecture
we compile the tests target for arm64 systematically.

Signed-off-by: Raphael Gault 
---
 tools/perf/arch/arm64/Build   | 2 +-
 tools/perf/arch/arm64/tests/Build | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/arm64/Build b/tools/perf/arch/arm64/Build
index 36222e64bbf7..a7dd46a5b678 100644
--- a/tools/perf/arch/arm64/Build
+++ b/tools/perf/arch/arm64/Build
@@ -1,2 +1,2 @@
 perf-y += util/
-perf-$(CONFIG_DWARF_UNWIND) += tests/
+perf-y += tests/
diff --git a/tools/perf/arch/arm64/tests/Build 
b/tools/perf/arch/arm64/tests/Build
index 41707fea74b3..a61c06bdb757 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,4 @@
 perf-y += regs_load.o
-perf-y += dwarf-unwind.o
+perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
 perf-y += arch-tests.o
-- 
2.17.1



Re: [RFC V2 00/16] objtool: Add support for Arm64

2019-05-21 Thread Raphael Gault
Hi Josh,

Thanks for offering your help and sorry for the late answer.

My understanding is that a table of offsets is built by GCC, those
offsets being scaled by 4 before being added to the base label.
I believe the offsets are stored in the .rodata section. To find the
size of that table, one needs to find a comparison, which can
apparently be optimized out. In that case the end of the array can be
found by locating labels pointing to data behind it (which is not 100%
safe).

On 5/16/19 3:29 PM, Josh Poimboeuf wrote:
> On Thu, May 16, 2019 at 11:36:39AM +0100, Raphael Gault wrote:
>> Noteworthy points:
>> * I still haven't figured out how to detect switch-tables on arm64. I
>> have a better understanding of them but still haven't implemented checks
>> as it doesn't look trivial at all.
>
> Switch tables were tricky to get right on x86.  If you share an example
> (or even just a .o file) I can take a look.  Hopefully they're somewhat
> similar to x86 switch tables.  Otherwise we may want to consider a
> different approach (for example maybe a GCC plugin could help annotate
> them).
>

The case which made me realize the issue is the one of
arch/arm64/kernel/module.o:apply_relocate_add:

```
What seems to happen in the case of module.o is:
  334:   90000015        adrp    x21, 0
which retrieves the location of an offset in the rodata section, and a
bit later we do some extra computation with it in order to compute the
jump destination:
  3e0:   78625aa0        ldrh    w0, [x21, w2, uxtw #1]
  3e4:   10000061        adr     x1, 3f0
  3e8:   8b20a820        add     x0, x1, w0, sxth #2
  3ec:   d61f0000        br      x0
```

Please keep in mind that the actual offsets might vary.
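
For reference, a sketch of the kind of C that typically produces this
pattern: a dense switch whose case labels are close together, so GCC can
store 16-bit offsets in .rodata (the ldrh), scale them by 4 (the sxth #2
in the add) and add them to a base label (the adr) before the br.
Whether a table is actually emitted depends on compiler version and
flags:

```
/* Illustrative only; not taken from apply_relocate_add(). */
long handle(int op, long a, long b)
{
	switch (op) {
	case 0:  return a + b;
	case 1:  return a - b;
	case 2:  return a * b;
	case 3:  return b ? a / b : 0;
	case 4:  return a & b;
	case 5:  return a | b;
	case 6:  return a ^ b;
	case 7:  return a << (b & 63);
	case 8:  return a >> (b & 63);
	default: return -1;
	}
}
```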

I'm happy to provide more details about what I have identified if you
want me to.

Thanks,

--
Raphael Gault


Re: [PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions

2019-05-17 Thread Raphael Gault
Hi,

On 5/17/19 8:10 AM, Peter Zijlstra wrote:
> On Thu, May 16, 2019 at 02:21:46PM +0100, Raphael Gault wrote:
>> In order to prevent userspace processes which are trying to access the
>> pmu registers on a big.LITTLE environment from being killed, we
>> introduce a hook to handle undefined instructions.
>>
>> The goal here is to prevent the process from being interrupted by a
>> signal when the error is caused by the task being scheduled while
>> accessing a counter, which makes the counter access invalid. As we are
>> not able to efficiently know the number of counters physically
>> available on both pmus in that context, we consider that any faulting
>> access to a counter which is architecturally correct should not cause
>> a SIGILL signal if the permissions are set accordingly.
>
> The other approach is using rseq for this; with that you can guarantee
> it will never issue the instruction on a wrong CPU.
>
> That said; emulating the thing isn't horrible either.
>
>> +/*
>> + * We put 0 in the target register if we
>> + * are reading from pmu register. If we are
>> + * writing, we do nothing.
>> + */
>
> Wait _what_ ?!? userspace can _WRITE_ to these registers?
>

The user can write to some pmu registers but those are not the ones that
interest us here. My comment was ill-formed; indeed, this hook can only
be triggered by reads in this case.
Sorry about that.

Thanks,

--
Raphael Gault


[PATCH 2/6] perf: arm64: Add test to check userspace access to hardware counters.

2019-05-16 Thread Raphael Gault
This test relies on the fact that the PMU registers are accessible
from userspace. It then uses the perf_event_mmap_page to retrieve
the counter index and access the underlying register.

This test uses sched_setaffinity(2) in order to run on all CPUs and thus
check the behaviour of the PMU of all cpus in a big.LITTLE environment.

Signed-off-by: Raphael Gault 
---
 tools/perf/arch/arm64/include/arch-tests.h |   6 +
 tools/perf/arch/arm64/tests/Build  |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c   |   4 +
 tools/perf/arch/arm64/tests/user-events.c  | 255 +
 4 files changed, 266 insertions(+)
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

diff --git a/tools/perf/arch/arm64/include/arch-tests.h 
b/tools/perf/arch/arm64/include/arch-tests.h
index 90ec4c8cb880..a9b17ae0560b 100644
--- a/tools/perf/arch/arm64/include/arch-tests.h
+++ b/tools/perf/arch/arm64/include/arch-tests.h
@@ -2,11 +2,17 @@
 #ifndef ARCH_TESTS_H
 #define ARCH_TESTS_H
 
+#define __maybe_unused __attribute__((unused))
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 struct thread;
 struct perf_sample;
+int test__arch_unwind_sample(struct perf_sample *sample,
+struct thread *thread);
 #endif
 
 extern struct test arch_tests[];
+int test__rd_pmevcntr(struct test *test __maybe_unused,
+ int subtest __maybe_unused);
+
 
 #endif
diff --git a/tools/perf/arch/arm64/tests/Build 
b/tools/perf/arch/arm64/tests/Build
index a61c06bdb757..3f9a20c17fc6 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,5 @@
 perf-y += regs_load.o
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
+perf-y += user-events.o
 perf-y += arch-tests.o
diff --git a/tools/perf/arch/arm64/tests/arch-tests.c 
b/tools/perf/arch/arm64/tests/arch-tests.c
index 5b1543c98022..57df9b89dede 100644
--- a/tools/perf/arch/arm64/tests/arch-tests.c
+++ b/tools/perf/arch/arm64/tests/arch-tests.c
@@ -10,6 +10,10 @@ struct test arch_tests[] = {
.func = test__dwarf_unwind,
},
 #endif
+   {
+   .desc = "User counter access",
+   .func = test__rd_pmevcntr,
+   },
{
.func = NULL,
},
diff --git a/tools/perf/arch/arm64/tests/user-events.c 
b/tools/perf/arch/arm64/tests/user-events.c
new file mode 100644
index ..958e4cd000c1
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/user-events.c
@@ -0,0 +1,255 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "perf.h"
+#include "debug.h"
+#include "tests/tests.h"
+#include "cloexec.h"
+#include "util.h"
+#include "arch-tests.h"
+
+/*
+ * ARMv8 ARM reserves the following encoding for system registers:
+ * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview",
+ *  C5.2, version:ARM DDI 0487A.f)
+ *  [20-19] : Op0
+ *  [18-16] : Op1
+ *  [15-12] : CRn
+ *  [11-8]  : CRm
+ *  [7-5]   : Op2
+ */
+#define Op0_shift   19
+#define Op0_mask0x3
+#define Op1_shift   16
+#define Op1_mask0x7
+#define CRn_shift   12
+#define CRn_mask0xf
+#define CRm_shift   8
+#define CRm_mask0xf
+#define Op2_shift   5
+#define Op2_mask0x7
+
+#define __stringify(x) #x
+
+#define read_sysreg(r) ({  \
+   u64 __val;  \
+   asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \
+   __val;  \
+})
+
+#define PMEVCNTR_READ_CASE(idx)\
+   case idx:   \
+   return read_sysreg(pmevcntr##idx##_el0)
+
+#define PMEVCNTR_CASES(readwrite)  \
+   PMEVCNTR_READ_CASE(0);  \
+   PMEVCNTR_READ_CASE(1);  \
+   PMEVCNTR_READ_CASE(2);  \
+   PMEVCNTR_READ_CASE(3);  \
+   PMEVCNTR_READ_CASE(4);  \
+   PMEVCNTR_READ_CASE(5);  \
+   PMEVCNTR_READ_CASE(6);  \
+   PMEVCNTR_READ_CASE(7);  \
+   PMEVCNTR_READ_CASE(8);  \
+   PMEVCNTR_READ_CASE(9);  \
+   PMEVCNTR_READ_CASE(10); \
+   PMEVCNTR_READ_CASE(11); \
+   PMEVCNTR_READ_CASE(12); \
+   PMEVCNTR_READ_CASE(13); \
+   PMEVCNTR_READ_CASE(14); \
+   PMEVCNTR_READ_CASE(15); \
+   PMEVCNTR_READ_CASE(16); \
+   PMEVCNTR_READ_CASE(17); \
+   PMEVCNTR_READ_CASE(18

[PATCH 1/6] perf: arm64: Compile tests unconditionally

2019-05-16 Thread Raphael Gault
In order to subsequently add more tests for the arm64 architecture
we compile the tests target for arm64 systematically.

Signed-off-by: Raphael Gault 
---
 tools/perf/arch/arm64/Build   | 2 +-
 tools/perf/arch/arm64/tests/Build | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/arm64/Build b/tools/perf/arch/arm64/Build
index 36222e64bbf7..a7dd46a5b678 100644
--- a/tools/perf/arch/arm64/Build
+++ b/tools/perf/arch/arm64/Build
@@ -1,2 +1,2 @@
 perf-y += util/
-perf-$(CONFIG_DWARF_UNWIND) += tests/
+perf-y += tests/
diff --git a/tools/perf/arch/arm64/tests/Build 
b/tools/perf/arch/arm64/tests/Build
index 41707fea74b3..a61c06bdb757 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,4 @@
 perf-y += regs_load.o
-perf-y += dwarf-unwind.o
+perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
 perf-y += arch-tests.o
-- 
2.17.1



[PATCH 3/6] arm64: pmu: Add function implementation to update event index in userpage.

2019-05-16 Thread Raphael Gault
In order to be able to access the counter directly from userspace,
we need to provide the index of the counter using the userpage.
We thus need to override the event_idx function to retrieve and
convert the perf_event index to armv8 hardware index.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/perf_event.c |  4 
 drivers/perf/arm_pmu.c | 10 ++
 include/linux/perf/arm_pmu.h   |  2 ++
 3 files changed, 16 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 6164d389eed6..e6316f99f66b 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -890,6 +890,8 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
if (armv8pmu_event_is_64bit(event))
event->hw.flags |= ARMPMU_EVT_64BIT;
 
+   event->hw.flags |= ARMPMU_EL0_RD_CNTR;
+
/* Only expose micro/arch events supported by this PMU */
if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS)
&& test_bit(hw_event_id, armpmu->pmceid_bitmap)) {
@@ -1188,6 +1190,8 @@ void arch_perf_update_userpage(struct perf_event *event,
 */
freq = arch_timer_get_rate();
userpg->cap_user_time = 1;
+   userpg->cap_user_rdpmc =
+   !!(event->hw.flags & ARMPMU_EL0_RD_CNTR);
 
	clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
NSEC_PER_SEC, 0);
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index eec75b97e7ea..3f4c2ec7ff89 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -777,6 +777,15 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
					    &cpu_pmu->node);
 }
 
+
+static int armpmu_event_idx(struct perf_event *event)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return 0;
+
+   return event->hw.idx;
+}
+
 static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 {
struct arm_pmu *pmu;
@@ -803,6 +812,7 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
.start  = armpmu_start,
.stop   = armpmu_stop,
.read   = armpmu_read,
+   .event_idx  = armpmu_event_idx,
.filter_match   = armpmu_filter_match,
.attr_groups= pmu->attr_groups,
/*
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 4641e850b204..3bef390c1069 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -30,6 +30,8 @@
  */
 /* Event uses a 64bit counter */
 #define ARMPMU_EVT_64BIT   1
+/* Allow access to hardware counter from userspace */
+#define ARMPMU_EL0_RD_CNTR 2
 
 #define HW_OP_UNSUPPORTED  0xFFFF
 #define C(_x)  PERF_COUNT_HW_CACHE_##_x
-- 
2.17.1



[PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions

2019-05-16 Thread Raphael Gault
In order to prevent userspace processes which are trying to access the
pmu registers on a big.LITTLE environment from being killed, we
introduce a hook to handle undefined instructions.

The goal here is to prevent the process from being interrupted by a
signal when the error is caused by the task being scheduled while
accessing a counter, which makes the counter access invalid. As we are
not able to efficiently know the number of counters physically
available on both pmus in that context, we consider that any faulting
access to a counter which is architecturally correct should not cause
a SIGILL signal if the permissions are set accordingly.

Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/perf_event.c | 68 ++
 1 file changed, 68 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index e6316f99f66b..760c947b58dd 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -19,9 +19,11 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include 
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -993,6 +995,72 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
return probe.present ? 0 : -ENODEV;
 }
 
+static bool is_evcntr(u32 sys_reg)
+{
+   u32 CRn, Op0, Op1, CRm;
+
+   CRn = sys_reg_CRn(sys_reg);
+   CRm = sys_reg_CRm(sys_reg);
+   Op0 = sys_reg_Op0(sys_reg);
+   Op1 = sys_reg_Op1(sys_reg);
+
+   return (CRn == 0xE &&
+   (CRm & 0xc) == 0x8 &&
+   Op1 == 0x3 &&
+   Op0 == 0x3);
+}
+
+static int emulate_pmu(struct pt_regs *regs, u32 insn)
+{
+   u32 sys_reg, rt;
+   u32 pmuserenr;
+
+   sys_reg = (u32)aarch64_insn_decode_immediate(AARCH64_INSN_IMM_16, insn) << 5;
+   rt = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn);
+   pmuserenr = read_sysreg(pmuserenr_el0);
+
+   if ((pmuserenr & (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) !=
+   (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR))
+   return -EINVAL;
+
+   if (sys_reg != SYS_PMXEVCNTR_EL0 &&
+   !is_evcntr(sys_reg))
+   return -EINVAL;
+
+   /*
+* We put 0 in the target register if we
+* are reading from pmu register. If we are
+* writing, we do nothing.
+*/
+   if ((insn & 0xfff00000) == 0xd5300000)
+   pt_regs_write_reg(regs, rt, 0);
+   else if (sys_reg != SYS_PMSELR_EL0)
+   return -EINVAL;
+
+   arm64_skip_faulting_instruction(regs, 4);
+   return 0;
+}
+
+/*
+ * This hook will only be triggered by mrs
+ * instructions on PMU registers. This is mandatory
+ * in order to have a consistent behaviour even on
+ * big.LITTLE systems.
+ */
+static struct undef_hook pmu_hook = {
+   .instr_mask = 0xffff8800,
+   .instr_val  = 0xd53b8800,
+   .fn = emulate_pmu,
+};
+
+static int __init enable_pmu_emulation(void)
+{
+   register_undef_hook(&pmu_hook);
+   return 0;
+}
+
+core_initcall(enable_pmu_emulation);
+
 static int armv8_pmu_init(struct arm_pmu *cpu_pmu)
 {
int ret = armv8pmu_probe_pmu(cpu_pmu);
-- 
2.17.1



[PATCH 5/6] arm64: perf: Enable pmu counter direct access for perf event on armv8

2019-05-16 Thread Raphael Gault
Keep track of events opened with direct access to the hardware counters
and modify permissions while they are open.

The strategy used here is the same one x86 uses: every time an event
is mapped, the permissions are set if required. The atomic field added
in the mm_context helps keep track of the different events opened and
de-activates the permissions when all are unmapped.
We also need to update the permissions in the context switch code so
that tasks keep the right permissions.

Signed-off-by: Raphael Gault 
---
 arch/arm64/include/asm/mmu.h |  6 +
 arch/arm64/include/asm/mmu_context.h |  2 ++
 arch/arm64/include/asm/perf_event.h  | 14 ++
 drivers/perf/arm_pmu.c   | 38 
 4 files changed, 60 insertions(+)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 67ef25d037ea..9de4cf0b17c7 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -29,6 +29,12 @@
 
 typedef struct {
atomic64_t  id;
+
+   /*
+    * non-zero if userspace has access to the hardware
+    * counters directly.
+    */
+   atomic_t	pmu_direct_access;
	void	*vdso;
unsigned long   flags;
 } mm_context_t;
diff --git a/arch/arm64/include/asm/mmu_context.h 
b/arch/arm64/include/asm/mmu_context.h
index 2da3e478fd8f..33494af613d8 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -235,6 +236,7 @@ static inline void __switch_mm(struct mm_struct *next)
}
 
check_and_switch_context(next, cpu);
+   perf_switch_user_access(next);
 }
 
 static inline void
diff --git a/arch/arm64/include/asm/perf_event.h 
b/arch/arm64/include/asm/perf_event.h
index c593761ba61c..32a6d604bb3b 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -19,6 +19,7 @@
 
 #include 
 #include 
+#include 
 
 #define ARMV8_PMU_MAX_COUNTERS	32
 #define ARMV8_PMU_COUNTER_MASK	(ARMV8_PMU_MAX_COUNTERS - 1)
@@ -234,4 +235,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
(regs)->pstate = PSR_MODE_EL1h; \
 }
 
+static inline void perf_switch_user_access(struct mm_struct *mm)
+{
+   if (!IS_ENABLED(CONFIG_PERF_EVENTS))
+   return;
+
+   if (atomic_read(&mm->context.pmu_direct_access)) {
+   write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR,
+pmuserenr_el0);
+   } else {
+   write_sysreg(0, pmuserenr_el0);
+   }
+}
+
 #endif
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 3f4c2ec7ff89..45a64f942864 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -24,6 +24,7 @@
 #include 
 
 #include 
+#include 
 
 static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
 static DEFINE_PER_CPU(int, cpu_irq);
@@ -786,6 +787,41 @@ static int armpmu_event_idx(struct perf_event *event)
return event->hw.idx;
 }
 
+static void refresh_pmuserenr(void *mm)
+{
+   perf_switch_user_access(mm);
+}
+
+static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return;
+
+   /*
+* This function relies on not being called concurrently in two
+* tasks in the same mm.  Otherwise one task could observe
+* pmu_direct_access > 1 and return all the way back to
+* userspace with user access disabled while another task is still
+* doing on_each_cpu_mask() to enable user access.
+*
+* For now, this can't happen because all callers hold mmap_sem
+* for write.  If this changes, we'll need a different solution.
+*/
+   lockdep_assert_held_exclusive(&mm->mmap_sem);
+
+   if (atomic_inc_return(&mm->context.pmu_direct_access) == 1)
+   on_each_cpu(refresh_pmuserenr, mm, 1);
+}
+
+static void armpmu_event_unmapped(struct perf_event *event, struct mm_struct 
*mm)
+{
+   if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+   return;
+
+   if (atomic_dec_and_test(&mm->context.pmu_direct_access))
+   on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1);
+}
+
 static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 {
struct arm_pmu *pmu;
@@ -807,6 +843,8 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
.pmu_enable = armpmu_enable,
.pmu_disable= armpmu_disable,
.event_init = armpmu_event_init,
+   .event_mapped   = armpmu_event_mapped,
+   .event_unmapped = armpmu_event_unmapped,
.add= armpmu_add,
.del= armpmu_del,
.start  = armpmu_start,
-- 
2.17.1



[PATCH 6/6] Documentation: arm64: Document PMU counters access from userspace

2019-05-16 Thread Raphael Gault
Add a documentation file to describe the access to the pmu hardware
counters from userspace

Signed-off-by: Raphael Gault 
---
 .../arm64/pmu_counter_user_access.txt | 42 +++
 1 file changed, 42 insertions(+)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt

diff --git a/Documentation/arm64/pmu_counter_user_access.txt 
b/Documentation/arm64/pmu_counter_user_access.txt
new file mode 100644
index ..bccf5edbf7f5
--- /dev/null
+++ b/Documentation/arm64/pmu_counter_user_access.txt
@@ -0,0 +1,42 @@
+Access to PMU hardware counter from userspace
+=============================================
+
+Overview
+
+The perf user-space tool relies on the PMU to monitor events. It offers an
+abstraction layer over the hardware counters since the underlying
+implementation is cpu-dependent.
+Arm64 allows userspace tools to have access to the registers storing the
+hardware counters' values directly.
+
+This targets specifically self-monitoring tasks in order to reduce the overhead
+by directly accessing the registers without having to go through the kernel.
+
+How-to
+------
+The focus is set on the armv8 pmuv3 which makes sure that the access to the pmu
+registers is enabled and that userspace has access to the relevant
+information in order to use them.
+
+In order to have access to the hardware counter it is necessary to open the
+event using the perf tool interface: the sys_perf_event_open syscall returns
+a fd which can subsequently be used with the mmap syscall in order to
+retrieve a page of memory containing information about the event.
+The PMU driver uses this page to expose to the user the hardware counter's
+index. Using this index enables the user to access the PMU registers using the
+`mrs` instruction.
+
+Have a look at `tools/perf/arch/arm64/tests/user-events.c` for an example. It
+can be run using the perf tool to check that the access to the registers works
+correctly from userspace:
+
+./perf test -v
+
+About chained events
+
+When the user requests an event to be counted on 64 bits, two hardware
+counters are used and need to be combined to retrieve the correct value:
+
+val = read_counter(idx);
+if ((event.attr.config1 & 0x1))
+   val = (val << 32) | read_counter(idx - 1);
-- 
2.17.1


