Re: [RFC PATCH] perf/kvm: Guest Symbol Resolution for powerpc
Hi Arnaldo,

On 06/16/2015 09:08 PM, Arnaldo Carvalho de Melo wrote:
> On Tue, Jun 16, 2015 at 08:20:53AM +0530, Hemant Kumar wrote:
>> "perf kvm {record|report}" is used to record and report the performance
>> profile of any workload on a guest. From the host, we can collect guest
>> kernel statistics, which are useful for finding contention in guest
>> kernel symbols for a certain workload.
>>
>> This feature is not available on powerpc because "perf" relies on the
>> "cycles" event (a PMU event) to profile the guest. However, for powerpc,
>> this can't be used from the host because the PMUs are controlled by the
>> guest rather than the host.
>>
>> Due to this problem, we need a different approach to profile the
>> workload in the guest. There exists a tracepoint "kvm_hv:kvm_guest_exit"
>> in powerpc which is hit whenever any of the threads exit the guest
>> context. The guest instruction pointer dumped along with this tracepoint
>> data in the field "pc" can be used as the guest instruction pointer
>> while postprocessing the trace data to map this IP to a symbol from
>> guest.kallsyms.
>>
>> However, to have some kind of periodicity, we can't use all the kvm
>> exits, only exits which are bound to happen at certain intervals.
>> The HV_DECREMENTER interrupt forces the threads to exit after an
>> interval of 10 ms.
>>
>> This patch makes use of the "kvm_guest_exit" tracepoint and checks the
>> exit reason for any kvm exit. If it is HV_DECREMENTER, then the
>> instruction pointer dumped along with this tracepoint is retrieved and
>> mapped with the guest kallsyms.
>>
>> This patch is a prototype asking for suggestions/comments as to whether
>> the approach is right or whether there is a better way (like using a
>> different event to profile for, etc.) to profile the guest from the
>> host.
>>
>> Thank You.
>>
>> Signed-off-by: Hemant Kumar
>> ---
>>  tools/perf/arch/powerpc/Makefile        |  1 +
>>  tools/perf/arch/powerpc/util/parse-tp.c | 55 +
>>  tools/perf/builtin-report.c             |  9 ++
>>  tools/perf/util/event.c                 |  7 -
>>  tools/perf/util/evsel.c                 |  7 +
>>  tools/perf/util/evsel.h                 |  4 +++
>>  tools/perf/util/session.c               |  7 +++--
>>  7 files changed, 86 insertions(+), 4 deletions(-)
>>  create mode 100644 tools/perf/arch/powerpc/util/parse-tp.c
>>
>> diff --git a/tools/perf/arch/powerpc/Makefile b/tools/perf/arch/powerpc/Makefile
>> index 6f7782b..992a0d5 100644
>> --- a/tools/perf/arch/powerpc/Makefile
>> +++ b/tools/perf/arch/powerpc/Makefile
>> @@ -4,3 +4,4 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
>>  LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/skip-callchain-idx.o
>>  endif
>>  LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/header.o
>> +LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/parse-tp.o
>> diff --git a/tools/perf/arch/powerpc/util/parse-tp.c b/tools/perf/arch/powerpc/util/parse-tp.c
>> new file mode 100644
>> index 000..4c6e49c
>> --- /dev/null
>> +++ b/tools/perf/arch/powerpc/util/parse-tp.c
>> @@ -0,0 +1,55 @@
>> +#include "../../util/evsel.h"
>> +#include "../../util/trace-event.h"
>> +#include "../../util/session.h"
>> +
>> +#define KVMPPC_EXIT "kvm_hv:kvm_guest_exit"
>> +#define HV_DECREMENTER 2432
>> +#define HV_BIT 3
>> +#define PR_BIT 49
>> +#define PPC_MAX 63
>> +
>> +/*
>> + * Get the instruction pointer from the tracepoint data
>> + */
>> +u64 arch__get_ip(struct perf_evsel *evsel, struct perf_sample *data)
>> +{
>> +	u64 tp_ip = data->ip;
>> +	int trap;
>> +
>> +	if (!strcmp(KVMPPC_EXIT, evsel->name)) {
>
> Can't you cache this somewhere? I.e. something like
>
> 	static int kvmppc_exit = -1;
>
> 	if (evsel->attr.type != PERF_TYPE_TRACEPOINT)
> 		goto out;
>
> 	if (unlikely(kvmppc_exit == -1)) {
> 		if (strcmp(KVMPPC_EXIT, evsel->name))
> 			goto out;
>
> 		kvmppc_exit = evsel->attr.config;
> 	} else if (kvmppc_exit != evsel->attr.config)
> 		goto out;

Will try this.

>> +		trap = raw_field_value(evsel->tp_format, "trap", data->raw_data);
>> +
>> +		if (trap == HV_DECREMENTER)
>> +			tp_ip = raw_field_value(evsel->tp_format, "pc",
>> +						data->raw_data);
>
> out:
>
>> +	return tp_ip;
>> +}
>
> Also we have:
>
> u64 perf_evsel__intval(struct perf_evsel *evsel, struct perf_sample *sample,
> 		       const char *name);
>
> So:
>
> 	trap = perf_evsel__intval(evsel, sample, "trap");
>
> And:
>
> 	tp_ip = perf_evsel__intval(evsel, sample, "pc");
>
> Makes it a bit shorter and allows for optimizations in how to find that
> field by name made at the evsel code.

Thanks, missed perf_evsel__intval, will use this in the next iteration.

> - Arnaldo
>
>> +
>> +/*
>> + * Get the HV and PR bits and accordingly, determine the cpumode
>> + */
>> +u8 arch__get_cpumode(union perf_event *event, struct perf_evsel *evsel,
>> +		     struct perf_sample *data)
>> +{
>> +	unsigned long hv, pr, msr;
>> +	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
>> +
>> +	if (strcmp(KVMPPC_EXIT, evsel->name))
>> +		goto ret;
>> +
>> +	if
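
For readers following the review, the caching idea Arnaldo suggests in this thread (resolve the tracepoint by name once, then compare numeric ids) can be sketched outside perf as below. This is only an illustration: struct mock_evsel, the MOCK_TYPE_* constants and the strcmp_calls counter are hypothetical stand-ins invented for the sketch, not perf's real struct perf_evsel or perf_event_attr.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for perf's event types; NOT the real definitions. */
enum { MOCK_TYPE_HARDWARE = 0, MOCK_TYPE_TRACEPOINT = 2 };

struct mock_evsel {
	unsigned int type;         /* event class, playing the role of attr.type */
	unsigned long long config; /* tracepoint id, playing the role of attr.config */
	const char *name;
};

#define KVMPPC_EXIT "kvm_hv:kvm_guest_exit"

static int strcmp_calls; /* instrumentation for the demo only */

/*
 * Return 1 if evsel is the guest-exit tracepoint, 0 otherwise.
 * The name is compared only until the first match; after that the
 * cached numeric id is used, so the per-sample cost is an integer compare.
 */
static int is_kvmppc_exit(const struct mock_evsel *evsel)
{
	static long long cached_id = -1; /* -1 means "not resolved yet" */

	assert(evsel != NULL);

	if (evsel->type != MOCK_TYPE_TRACEPOINT)
		return 0;

	if (cached_id == -1) {
		strcmp_calls++;
		if (strcmp(KVMPPC_EXIT, evsel->name))
			return 0;
		cached_id = (long long)evsel->config;
		return 1;
	}

	return cached_id == (long long)evsel->config;
}
```

Note that until the first match the name comparison still runs for every tracepoint event, which is the same behaviour as the snippet in the review.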
Re: [PATCH 0/2] KVM: PPC: Book3S HV: Dynamic micro-threading/split-core
[Resending this message because the mailing lists rejected the first one, which was in HTML]

On 28/05/2015 07:17, Paul Mackerras wrote:
> This patch series provides a way to use more of the capacity of each
> processor core when running guests configured with threads=1, 2 or 4
> on a POWER8 host with HV KVM, without having to change the static
> micro-threading (the official name for split-core) mode for the whole
> machine. The problem with setting the machine to static 2-way or
> 4-way micro-threading mode is that (a) then you can't run guests with
> threads=8 and (b) selecting the right mode can be tricky and requires
> knowledge of what guests you will be running.
>
> Instead, with these two patches, we can now run more than one virtual
> core (vcore) on a given physical core if possible, and if that means
> we need to switch the core to 2-way or 4-way micro-threading mode,
> then we do that on entry to the guests and switch back to whole-core
> mode on exit (and we only switch the one core, not the whole machine).
> The core mode switching is only done if the machine is in static
> whole-core mode.
>
> All of this only comes into effect when a core is over-committed.
> When the machine is lightly loaded everything operates the same with
> these patches as without. Only when some core has a vcore that is
> able to run while there is also another vcore that was wanting to run
> on that core but got preempted does the logic kick in to try to run
> both vcores at once.
>
> Paul.
> ---
>
>  arch/powerpc/include/asm/kvm_book3s_asm.h |  20 +
>  arch/powerpc/include/asm/kvm_host.h       |  22 +-
>  arch/powerpc/kernel/asm-offsets.c         |   9 +
>  arch/powerpc/kvm/book3s_hv.c              | 648 ++
>  arch/powerpc/kvm/book3s_hv_builtin.c      |  32 +-
>  arch/powerpc/kvm/book3s_hv_rm_xics.c      |   4 +-
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 111 -
>  7 files changed, 740 insertions(+), 106 deletions(-)

Tested-by: Laurent Vivier

Performance is better, but Paul, could you explain why it is even better
when I disable dynamic micro-threading? Did I miss something?

My test system is an IBM Power S822L.

I run two guests with 8 vCPUs (-smp 8,sockets=8,cores=1,threads=1), both
pinned to the same core (with the pinning option of virt-manager). Then I
measure the time needed to compile a kernel in parallel in both guests
with "make -j 16".

My kernel without micro-threading:

real    37m23.424s      real    37m24.959s
user    167m31.474s     user    165m44.142s
sys     113m26.195s     sys     113m45.072s

With micro-threading patches (PATCH 1+2):

target_smt_mode 0 [in fact it was 8 here, but it should behave like 0,
as it is > max threads/sub-core]
dynamic_mt_modes 6

real    32m13.338s      real    32m26.652s
user    139m21.181s     user    140m20.994s
sys     77m35.339s      sys     78m16.599s

It's better, but if I disable dynamic micro-threading (still with PATCH 1+2):

target_smt_mode 0
dynamic_mt_modes 0

real    30m49.100s      real    30m48.161s
user    144m22.989s     user    142m53.886s
sys     65m4.942s       sys     66m8.159s

it's even better.

Without the dynamic micro-threading patch (with PATCH1 but not PATCH2):

target_smt_mode 0

real    33m57.279s      real    34m19.524s
user    158m43.064s     user    156m19.863s
sys     74m25.442s      sys     76m42.994s

Laurent
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] powerpc: implement barrier primitives
On 17.06.15 12:15, Will Deacon wrote:
> On Wed, Jun 17, 2015 at 10:43:48AM +0100, Andre Przywara wrote:
>> Instead of referring to the Linux header including the barrier
>> macros, copy over the rather simple implementation for the PowerPC
>> barrier instructions kvmtool uses. This fixes build for powerpc.
>>
>> Signed-off-by: Andre Przywara
>> ---
>> Hi,
>>
>> I just took what kvmtool seems to have used before, I actually have
>> no idea if "sync" is the right instruction or "lwsync" would do.
>> Would be nice if some people with PowerPC knowledge could comment.
>
> I *think* we can use lwsync for rmb and wmb, but would want confirmation
> from a ppc guy before making that change!

Also I'd prefer to play safe for now :)

Alex
Re: [PATCH 1/3] powerpc: implement barrier primitives
On Wed, Jun 17, 2015 at 10:43:48AM +0100, Andre Przywara wrote:
> Instead of referring to the Linux header including the barrier
> macros, copy over the rather simple implementation for the PowerPC
> barrier instructions kvmtool uses. This fixes build for powerpc.
>
> Signed-off-by: Andre Przywara
> ---
> Hi,
>
> I just took what kvmtool seems to have used before, I actually have
> no idea if "sync" is the right instruction or "lwsync" would do.
> Would be nice if some people with PowerPC knowledge could comment.

I *think* we can use lwsync for rmb and wmb, but would want confirmation
from a ppc guy before making that change!

Will
Re: [PATCH 3/3] powerpc: add hvcall.h header from Linux
On Wed, Jun 17, 2015 at 10:43:50AM +0100, Andre Przywara wrote:
> The powerpc code uses some PAPR hypercalls, of which we need the
> hypercall number. Copy the macro definition parts from the kernel's
> (private) hvcall.h file and remove the extra tricks formerly used
> to be able to include this header file directly.
>
> Signed-off-by: Andre Przywara
> ---
> Hi,
>
> I copied most of the Linux header, without removing
> definitions that kvmtool doesn't use. That should make updates
> easier. If people would prefer a bespoke header, let me know.

I'd rather just #define the stuff we need now that we're outside of the
kernel source tree.

Will
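
As a rough idea of what the bespoke header Will asks for might look like, here is a minimal sketch. It uses only return codes whose values appear in the posted hvcall.h; which subset kvmtool actually needs is an assumption, and the hypercall numbers themselves are not visible in this excerpt, so they are left out. The two helpers, h_is_long_busy() and h_long_busy_msecs(), are hypothetical names introduced here to show how the "long busy" retry hints (1 msec for 9900 up to 100 sec for 9905) map to a delay.

```c
#include <assert.h>

/* Return codes copied from the posted hvcall.h; the subset is a guess. */
#define H_SUCCESS	0
#define H_BUSY		1	/* Hardware busy -- retry later */
#define H_HARDWARE	-1	/* Hardware error */
#define H_FUNCTION	-2	/* Function not supported */
#define H_PRIVILEGE	-3	/* Caller not privileged */
#define H_PARAMETER	-4	/* Parameter invalid, out-of-range or conflicting */

#define H_LONG_BUSY_START_RANGE		9900	/* Start of long busy range */
#define H_LONG_BUSY_ORDER_1_MSEC	9900
#define H_LONG_BUSY_END_RANGE		9905	/* End of long busy range */

/* Hypothetical helper: does rc fall inside the long busy range? */
static int h_is_long_busy(long rc)
{
	return rc >= H_LONG_BUSY_START_RANGE && rc <= H_LONG_BUSY_END_RANGE;
}

/*
 * Hypothetical helper: translate a long busy code into the suggested
 * retry delay in milliseconds. Each step above 9900 multiplies the
 * hint by ten: 9900 is 1 msec, 9903 is 1 sec, 9905 is 100 sec.
 */
static long h_long_busy_msecs(long rc)
{
	long ms = 1;
	long code;

	assert(h_is_long_busy(rc));

	for (code = H_LONG_BUSY_ORDER_1_MSEC; code < rc; code++)
		ms *= 10;

	return ms;
}
```

The trade-off raised in the thread still applies: a trimmed header like this is easier to read but has to be extended by hand whenever kvmtool starts using another definition.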
[PATCH 2/3] powerpc: use default endianness for converting guest/init
For converting the guest/init binary into an object file, we call the
linker binary, setting the endianness to big endian explicitly when
compiling kvmtool for powerpc.
This breaks if the compiler is actually targeting little endian (which
is true for the Debian port, for instance).
Remove the explicit big endianness switch from the linker call to allow
linking on little endian PowerPC builds again.

Signed-off-by: Andre Przywara
---
Hi,

this fixed the powerpc64le build for me, while still compiling fine for
big endian. Admittedly this whole init->guest_init.o conversion has its
issues (with MIPS, for instance), which deserve proper fixing, but let's
just fix that build for now.

Andre.

 Makefile | 1 -
 1 file changed, 1 deletion(-)

diff --git a/Makefile b/Makefile
index 6110b8e..c118e1a 100644
--- a/Makefile
+++ b/Makefile
@@ -149,7 +149,6 @@ ifeq ($(ARCH), powerpc)
 	OBJS	+= powerpc/xics.o
 	ARCH_INCLUDE := powerpc/include
 	CFLAGS	+= -m64
-	LDFLAGS	+= -m elf64ppc
 	ARCH_WANT_LIBFDT := y
 endif
--
2.3.5
[PATCH 3/3] powerpc: add hvcall.h header from Linux
The powerpc code uses some PAPR hypercalls, of which we need the
hypercall number. Copy the macro definition parts from the kernel's
(private) hvcall.h file and remove the extra tricks formerly used
to be able to include this header file directly.

Signed-off-by: Andre Przywara
---
Hi,

I copied most of the Linux header, without removing
definitions that kvmtool doesn't use. That should make updates
easier. If people would prefer a bespoke header, let me know.

Andre.

 powerpc/include/asm/hvcall.h | 287 +++
 powerpc/spapr.h              |   3 -
 2 files changed, 287 insertions(+), 3 deletions(-)
 create mode 100644 powerpc/include/asm/hvcall.h

diff --git a/powerpc/include/asm/hvcall.h b/powerpc/include/asm/hvcall.h
new file mode 100644
index 000..b6dc250
--- /dev/null
+++ b/powerpc/include/asm/hvcall.h
@@ -0,0 +1,287 @@
+#ifndef _ASM_POWERPC_HVCALL_H
+#define _ASM_POWERPC_HVCALL_H
+
+#define HVSC	.long 0x4422
+
+#define H_SUCCESS	0
+#define H_BUSY	1	/* Hardware busy -- retry later */
+#define H_CLOSED	2	/* Resource closed */
+#define H_NOT_AVAILABLE	3
+#define H_CONSTRAINED	4	/* Resource request constrained to max allowed */
+#define H_PARTIAL	5
+#define H_IN_PROGRESS	14	/* Kind of like busy */
+#define H_PAGE_REGISTERED	15
+#define H_PARTIAL_STORE	16
+#define H_PENDING	17	/* returned from H_POLL_PENDING */
+#define H_CONTINUE	18	/* Returned from H_Join on success */
+#define H_LONG_BUSY_START_RANGE	9900	/* Start of long busy range */
+#define H_LONG_BUSY_ORDER_1_MSEC	9900	/* Long busy, hint that 1msec \
+						 is a good time to retry */
+#define H_LONG_BUSY_ORDER_10_MSEC	9901	/* Long busy, hint that 10msec \
+						 is a good time to retry */
+#define H_LONG_BUSY_ORDER_100_MSEC	9902	/* Long busy, hint that 100msec \
+						 is a good time to retry */
+#define H_LONG_BUSY_ORDER_1_SEC	9903	/* Long busy, hint that 1sec \
+						 is a good time to retry */
+#define H_LONG_BUSY_ORDER_10_SEC	9904	/* Long busy, hint that 10sec \
+						 is a good time to retry */
+#define H_LONG_BUSY_ORDER_100_SEC	9905	/* Long busy, hint that 100sec \
+						 is a good time to retry */
+#define H_LONG_BUSY_END_RANGE	9905	/* End of long busy range */
+
+/* Internal value used in book3s_hv kvm support; not returned to guests */
+#define H_TOO_HARD
+
+#define H_HARDWARE	-1	/* Hardware error */
+#define H_FUNCTION	-2	/* Function not supported */
+#define H_PRIVILEGE	-3	/* Caller not privileged */
+#define H_PARAMETER	-4	/* Parameter invalid, out-of-range or conflicting */
+#define H_BAD_MODE	-5	/* Illegal msr value */
+#define H_PTEG_FULL	-6	/* PTEG is full */
+#define H_NOT_FOUND	-7	/* PTE was not found" */
+#define H_RESERVED_DABR	-8	/* DABR address is reserved by the hypervisor on this processor" */
+#define H_NO_MEM	-9
+#define H_AUTHORITY	-10
+#define H_PERMISSION	-11
+#define H_DROPPED	-12
+#define H_SOURCE_PARM	-13
+#define H_DEST_PARM	-14
+#define H_REMOTE_PARM	-15
+#define H_RESOURCE	-16
+#define H_ADAPTER_PARM	-17
+#define H_RH_PARM	-18
+#define H_RCQ_PARM	-19
+#define H_SCQ_PARM	-20
+#define H_EQ_PARM	-21
+#define H_RT_PARM	-22
+#define H_ST_PARM	-23
+#define H_SIGT_PARM	-24
+#define H_TOKEN_PARM	-25
+#define H_MLENGTH_PARM	-27
+#define H_MEM_PARM	-28
+#define H_MEM_ACCESS_PARM	-29
+#define H_ATTR_PARM	-30
+#define H_PORT_PARM	-31
+#define H_MCG_PARM	-32
+#define H_VL_PARM	-33
+#define H_TSIZE_PARM	-34
+#define H_TRACE_PARM	-35
+
+#define H_MASK_PARM	-37
+#define H_MCG_FULL	-38
+#define H_ALIAS_EXIST	-39
+#define H_P_COUNTER	-40
+#define H_TABLE_FULL	-41
+#define H_ALT_TABLE	-42
+#define H_MR_CONDITION	-43
+#define H_NOT_ENOUGH_RESOURCES	-44
+#define H_R_STATE	-45
+#define H_RESCINDED	-46
+#define H_P2	-55
+#define H_P3	-56
+#define H_P4	-57
+#define H_P5	-58
+#define H_P6	-59
+#define H_P7	-60
+#define H_P8	-61
+#define H_P9	-62
+#define H_TOO_BIG	-64
+#define H_OVERLAP	-68
+#define H_INTERRUPT	-69
+#define H_BAD_DATA	-70
+#define H_NOT_ACTIVE	-71
+#define H_SG_LIST	-72
+#define H_OP_MODE	-73
+#define H_COP_HW	-74
+#define H_UNSUPPORTED_FLAG_START	-256
+#define H_UNSUPPORTED_FLAG_END	-511
+#define H_MULTI_THREADS_ACTIVE	-9005
+#define H_OUTSTANDING_COP_OPS	-9006
+
+
+/* Long Busy is a condition that can be returned by the firmware
[PATCH 0/3] kvmtool: fixes for PowerPC
Hello,

some patches to fix at least the build of the new kvmtool for PowerPC.
I could only compile test it so far, so I'd be grateful if people more
familiar with that architecture can have a look and maybe even test it
on actual machines.

Cheers,
Andre.

Andre Przywara (3):
  powerpc: implement barrier primitives
  powerpc: use default endianness for converting guest/init
  powerpc: add hvcall.h header from Linux

 Makefile                      |   1 -
 powerpc/include/asm/hvcall.h  | 287 ++
 powerpc/include/kvm/barrier.h |   4 +-
 powerpc/spapr.h               |   3 -
 4 files changed, 290 insertions(+), 5 deletions(-)
 create mode 100644 powerpc/include/asm/hvcall.h

--
2.3.5
[PATCH 1/3] powerpc: implement barrier primitives
Instead of referring to the Linux header including the barrier
macros, copy over the rather simple implementation for the PowerPC
barrier instructions kvmtool uses. This fixes build for powerpc.

Signed-off-by: Andre Przywara
---
Hi,

I just took what kvmtool seems to have used before, I actually have
no idea if "sync" is the right instruction or "lwsync" would do.
Would be nice if some people with PowerPC knowledge could comment.

Cheers,
Andre.

 powerpc/include/kvm/barrier.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/powerpc/include/kvm/barrier.h b/powerpc/include/kvm/barrier.h
index dd5115a..4b708ae 100644
--- a/powerpc/include/kvm/barrier.h
+++ b/powerpc/include/kvm/barrier.h
@@ -1,6 +1,8 @@
 #ifndef _KVM_BARRIER_H_
 #define _KVM_BARRIER_H_
 
-#include
+#define mb()	asm volatile ("sync" : : : "memory")
+#define rmb()	asm volatile ("sync" : : : "memory")
+#define wmb()	asm volatile ("sync" : : : "memory")
 
 #endif /* _KVM_BARRIER_H_ */
--
2.3.5
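
For comparison, the lwsync variant discussed in the replies could look like the sketch below. It is not the posted patch, which uses "sync" for all three macros. It rests on two assumptions: that the barriers only order accesses to ordinary cacheable memory (lwsync orders load-load, load-store and store-store there, but not store-load, so mb() stays "sync"), and that the build targets a 64-bit server processor that implements lwsync. The non-PowerPC fallback using compiler fences and the struct slot producer/consumer pair are hypothetical additions so the sketch can be built and exercised elsewhere.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__powerpc__) || defined(__powerpc64__)
/* Full barrier stays "sync"; read/write barriers use the lighter lwsync. */
#define mb()	asm volatile ("sync"   : : : "memory")
#define rmb()	asm volatile ("lwsync" : : : "memory")
#define wmb()	asm volatile ("lwsync" : : : "memory")
#else
/* Fallback so the sketch builds on other hosts: compiler-provided fences. */
#define mb()	__atomic_thread_fence(__ATOMIC_SEQ_CST)
#define rmb()	__atomic_thread_fence(__ATOMIC_ACQUIRE)
#define wmb()	__atomic_thread_fence(__ATOMIC_RELEASE)
#endif

/* Hypothetical one-entry mailbox shared between a producer and a consumer. */
struct slot {
	volatile int data;
	volatile int ready;
};

/* Producer: the data store must become visible before the flag store. */
static void publish(struct slot *s, int value)
{
	assert(s != NULL);
	s->data = value;
	wmb();
	s->ready = 1;
}

/* Consumer: the flag load must be ordered before the data load. */
static int consume(struct slot *s, int *out)
{
	assert(s != NULL && out != NULL);
	if (!s->ready)
		return 0;
	rmb();
	*out = s->data;
	return 1;
}
```

Whether this is safe for every barrier use in kvmtool is exactly the open question in the thread, so keeping "sync" everywhere remains the conservative choice until someone with PowerPC knowledge confirms.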