Re: [PATCH 13/23] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
On Fri, Mar 20, 2015 at 12:35:43PM +0100, Alexander Graf wrote: On 20.03.15 12:25, Paul Mackerras wrote: On Fri, Mar 20, 2015 at 12:15:15PM +0100, Alexander Graf wrote: Have you measure the additional overhead this brings? I haven't - in fact I did this patch so I could measure the overhead or improvement from other changes I did, but it doesn't measure its own overhead, of course. I guess I need a workload that does a defined number of guest entries and exits and measure how fast it runs with and without the patch (maybe something like H_SET_MODE in a loop). I'll figure something out and post the results. Yeah, just measure the number of exits you can handle for a simple hcall. If there is measurable overhead, it's probably a good idea to move the statistics gathering into #ifdef paths for DEBUGFS or maybe even a separate EXIT_TIMING config option as we have it for booke. For 1-vcpu guest on POWER8, it adds 29ns to the time for an hcall that is handled in real mode (H_SET_DABR), which is 25%. It adds 43ns to the time for an hcall that is handled in the host kernel in virtual mode (H_PROD), which is 1.2%. I'll add a config option. Paul. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 13/23] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
On 20.03.15 10:39, Paul Mackerras wrote: This reads the timebase at various points in the real-mode guest entry/exit code and uses that to accumulate total, minimum and maximum time spent in those parts of the code. Currently these times are accumulated per vcpu in 5 parts of the code: * rm_entry - time taken from the start of kvmppc_hv_entry() until just before entering the guest. * rm_intr - time from when we take a hypervisor interrupt in the guest until we either re-enter the guest or decide to exit to the host. This includes time spent handling hcalls in real mode. * rm_exit - time from when we decide to exit the guest until the return from kvmppc_hv_entry(). * guest - time spend in the guest * cede - time spent napping in real mode due to an H_CEDE hcall while other threads in the same vcore are active. These times are exposed in debugfs in a directory per vcpu that contains a file called timings. This file contains one line for each of the 5 timings above, with the name followed by a colon and 4 numbers, which are the count (number of times the code has been executed), the total time, the minimum time, and the maximum time, all in nanoseconds. Signed-off-by: Paul Mackerras pau...@samba.org Have you measure the additional overhead this brings? --- arch/powerpc/include/asm/kvm_host.h | 19 + arch/powerpc/include/asm/time.h | 3 + arch/powerpc/kernel/asm-offsets.c | 11 +++ arch/powerpc/kernel/time.c | 6 ++ arch/powerpc/kvm/book3s_hv.c| 135 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 105 - 6 files changed, 276 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index f1d0bbc..286c0ce 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -369,6 +369,14 @@ struct kvmppc_slb { u8 base_page_size; /* MMU_PAGE_xxx */ }; +/* Struct used to accumulate timing information in HV real mode code */ +struct kvmhv_tb_accumulator { + u64 seqcount; /* used to synchronize access, also count * 2 */ + u64 tb_total; /* total time in timebase ticks */ + u64 tb_min; /* min time */ + u64 tb_max; /* max time */ +}; + # ifdef CONFIG_PPC_FSL_BOOK3E #define KVMPPC_BOOKE_IAC_NUM 2 #define KVMPPC_BOOKE_DAC_NUM 2 @@ -656,6 +664,17 @@ struct kvm_vcpu_arch { u64 busy_preempt; u32 emul_inst; + + struct kvmhv_tb_accumulator *cur_activity; /* What we're timing */ + u64 cur_tb_start; /* when it started */ + struct kvmhv_tb_accumulator rm_entry; /* real-mode entry code */ + struct kvmhv_tb_accumulator rm_intr;/* real-mode intr handling */ + struct kvmhv_tb_accumulator rm_exit;/* real-mode exit code */ + struct kvmhv_tb_accumulator guest_time; /* guest execution */ + struct kvmhv_tb_accumulator cede_time; /* time napping inside guest */ + + struct dentry *debugfs_dir; + struct dentry *debugfs_timings; #endif }; diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h index 03cbada..10fc784 100644 --- a/arch/powerpc/include/asm/time.h +++ b/arch/powerpc/include/asm/time.h @@ -211,5 +211,8 @@ extern void secondary_cpu_time_init(void); DECLARE_PER_CPU(u64, decrementers_next_tb); +/* Convert timebase ticks to nanoseconds */ +unsigned long long tb_to_ns(unsigned long long tb_ticks); + #endif /* __KERNEL__ */ #endif /* __POWERPC_TIME_H */ diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 4717859..ec9f59c 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -458,6 +458,17 @@ int main(void) DEFINE(VCPU_SPRG1, offsetof(struct kvm_vcpu, arch.shregs.sprg1)); DEFINE(VCPU_SPRG2, offsetof(struct kvm_vcpu, arch.shregs.sprg2)); DEFINE(VCPU_SPRG3, offsetof(struct kvm_vcpu, arch.shregs.sprg3)); + DEFINE(VCPU_TB_RMENTRY, offsetof(struct kvm_vcpu, arch.rm_entry)); + DEFINE(VCPU_TB_RMINTR, offsetof(struct kvm_vcpu, arch.rm_intr)); + DEFINE(VCPU_TB_RMEXIT, offsetof(struct kvm_vcpu, arch.rm_exit)); + DEFINE(VCPU_TB_GUEST, offsetof(struct kvm_vcpu, arch.guest_time)); + DEFINE(VCPU_TB_CEDE, offsetof(struct kvm_vcpu, arch.cede_time)); + DEFINE(VCPU_CUR_ACTIVITY, offsetof(struct kvm_vcpu, arch.cur_activity)); + DEFINE(VCPU_ACTIVITY_START, offsetof(struct kvm_vcpu, arch.cur_tb_start)); + DEFINE(TAS_SEQCOUNT, offsetof(struct kvmhv_tb_accumulator, seqcount)); + DEFINE(TAS_TOTAL, offsetof(struct kvmhv_tb_accumulator, tb_total)); + DEFINE(TAS_MIN, offsetof(struct kvmhv_tb_accumulator, tb_min)); + DEFINE(TAS_MAX, offsetof(struct kvmhv_tb_accumulator, tb_max)); #endif DEFINE(VCPU_SHARED_SPRG3,
Re: [PATCH 13/23] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
On Fri, Mar 20, 2015 at 12:15:15PM +0100, Alexander Graf wrote: On 20.03.15 10:39, Paul Mackerras wrote: This reads the timebase at various points in the real-mode guest entry/exit code and uses that to accumulate total, minimum and maximum time spent in those parts of the code. Currently these times are accumulated per vcpu in 5 parts of the code: * rm_entry - time taken from the start of kvmppc_hv_entry() until just before entering the guest. * rm_intr - time from when we take a hypervisor interrupt in the guest until we either re-enter the guest or decide to exit to the host. This includes time spent handling hcalls in real mode. * rm_exit - time from when we decide to exit the guest until the return from kvmppc_hv_entry(). * guest - time spend in the guest * cede - time spent napping in real mode due to an H_CEDE hcall while other threads in the same vcore are active. These times are exposed in debugfs in a directory per vcpu that contains a file called timings. This file contains one line for each of the 5 timings above, with the name followed by a colon and 4 numbers, which are the count (number of times the code has been executed), the total time, the minimum time, and the maximum time, all in nanoseconds. Signed-off-by: Paul Mackerras pau...@samba.org Have you measure the additional overhead this brings? I haven't - in fact I did this patch so I could measure the overhead or improvement from other changes I did, but it doesn't measure its own overhead, of course. I guess I need a workload that does a defined number of guest entries and exits and measure how fast it runs with and without the patch (maybe something like H_SET_MODE in a loop). I'll figure something out and post the results. Paul. -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 13/23] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
On 20.03.15 12:25, Paul Mackerras wrote: On Fri, Mar 20, 2015 at 12:15:15PM +0100, Alexander Graf wrote: On 20.03.15 10:39, Paul Mackerras wrote: This reads the timebase at various points in the real-mode guest entry/exit code and uses that to accumulate total, minimum and maximum time spent in those parts of the code. Currently these times are accumulated per vcpu in 5 parts of the code: * rm_entry - time taken from the start of kvmppc_hv_entry() until just before entering the guest. * rm_intr - time from when we take a hypervisor interrupt in the guest until we either re-enter the guest or decide to exit to the host. This includes time spent handling hcalls in real mode. * rm_exit - time from when we decide to exit the guest until the return from kvmppc_hv_entry(). * guest - time spend in the guest * cede - time spent napping in real mode due to an H_CEDE hcall while other threads in the same vcore are active. These times are exposed in debugfs in a directory per vcpu that contains a file called timings. This file contains one line for each of the 5 timings above, with the name followed by a colon and 4 numbers, which are the count (number of times the code has been executed), the total time, the minimum time, and the maximum time, all in nanoseconds. Signed-off-by: Paul Mackerras pau...@samba.org Have you measure the additional overhead this brings? I haven't - in fact I did this patch so I could measure the overhead or improvement from other changes I did, but it doesn't measure its own overhead, of course. I guess I need a workload that does a defined number of guest entries and exits and measure how fast it runs with and without the patch (maybe something like H_SET_MODE in a loop). I'll figure something out and post the results. Yeah, just measure the number of exits you can handle for a simple hcall. If there is measurable overhead, it's probably a good idea to move the statistics gathering into #ifdef paths for DEBUGFS or maybe even a separate EXIT_TIMING config option as we have it for booke. Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 13/23] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
On 20.03.15 10:39, Paul Mackerras wrote: This reads the timebase at various points in the real-mode guest entry/exit code and uses that to accumulate total, minimum and maximum time spent in those parts of the code. Currently these times are accumulated per vcpu in 5 parts of the code: * rm_entry - time taken from the start of kvmppc_hv_entry() until just before entering the guest. * rm_intr - time from when we take a hypervisor interrupt in the guest until we either re-enter the guest or decide to exit to the host. This includes time spent handling hcalls in real mode. * rm_exit - time from when we decide to exit the guest until the return from kvmppc_hv_entry(). * guest - time spend in the guest * cede - time spent napping in real mode due to an H_CEDE hcall while other threads in the same vcore are active. These times are exposed in debugfs in a directory per vcpu that contains a file called timings. This file contains one line for each of the 5 timings above, with the name followed by a colon and 4 numbers, which are the count (number of times the code has been executed), the total time, the minimum time, and the maximum time, all in nanoseconds. Signed-off-by: Paul Mackerras pau...@samba.org Have you measure the additional overhead this brings? --- arch/powerpc/include/asm/kvm_host.h | 19 + arch/powerpc/include/asm/time.h | 3 + arch/powerpc/kernel/asm-offsets.c | 11 +++ arch/powerpc/kernel/time.c | 6 ++ arch/powerpc/kvm/book3s_hv.c| 135 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 105 - 6 files changed, 276 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index f1d0bbc..286c0ce 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -369,6 +369,14 @@ struct kvmppc_slb { u8 base_page_size; /* MMU_PAGE_xxx */ }; +/* Struct used to accumulate timing information in HV real mode code */ +struct kvmhv_tb_accumulator { + u64 seqcount; /* used to synchronize access, also count * 2 */ + u64 tb_total; /* total time in timebase ticks */ + u64 tb_min; /* min time */ + u64 tb_max; /* max time */ +}; + # ifdef CONFIG_PPC_FSL_BOOK3E #define KVMPPC_BOOKE_IAC_NUM 2 #define KVMPPC_BOOKE_DAC_NUM 2 @@ -656,6 +664,17 @@ struct kvm_vcpu_arch { u64 busy_preempt; u32 emul_inst; + + struct kvmhv_tb_accumulator *cur_activity; /* What we're timing */ + u64 cur_tb_start; /* when it started */ + struct kvmhv_tb_accumulator rm_entry; /* real-mode entry code */ + struct kvmhv_tb_accumulator rm_intr;/* real-mode intr handling */ + struct kvmhv_tb_accumulator rm_exit;/* real-mode exit code */ + struct kvmhv_tb_accumulator guest_time; /* guest execution */ + struct kvmhv_tb_accumulator cede_time; /* time napping inside guest */ + + struct dentry *debugfs_dir; + struct dentry *debugfs_timings; #endif }; diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h index 03cbada..10fc784 100644 --- a/arch/powerpc/include/asm/time.h +++ b/arch/powerpc/include/asm/time.h @@ -211,5 +211,8 @@ extern void secondary_cpu_time_init(void); DECLARE_PER_CPU(u64, decrementers_next_tb); +/* Convert timebase ticks to nanoseconds */ +unsigned long long tb_to_ns(unsigned long long tb_ticks); + #endif /* __KERNEL__ */ #endif /* __POWERPC_TIME_H */ diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 4717859..ec9f59c 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -458,6 +458,17 @@ int main(void) DEFINE(VCPU_SPRG1, offsetof(struct kvm_vcpu, arch.shregs.sprg1)); DEFINE(VCPU_SPRG2, offsetof(struct kvm_vcpu, arch.shregs.sprg2)); DEFINE(VCPU_SPRG3, offsetof(struct kvm_vcpu, arch.shregs.sprg3)); + DEFINE(VCPU_TB_RMENTRY, offsetof(struct kvm_vcpu, arch.rm_entry)); + DEFINE(VCPU_TB_RMINTR, offsetof(struct kvm_vcpu, arch.rm_intr)); + DEFINE(VCPU_TB_RMEXIT, offsetof(struct kvm_vcpu, arch.rm_exit)); + DEFINE(VCPU_TB_GUEST, offsetof(struct kvm_vcpu, arch.guest_time)); + DEFINE(VCPU_TB_CEDE, offsetof(struct kvm_vcpu, arch.cede_time)); + DEFINE(VCPU_CUR_ACTIVITY, offsetof(struct kvm_vcpu, arch.cur_activity)); + DEFINE(VCPU_ACTIVITY_START, offsetof(struct kvm_vcpu, arch.cur_tb_start)); + DEFINE(TAS_SEQCOUNT, offsetof(struct kvmhv_tb_accumulator, seqcount)); + DEFINE(TAS_TOTAL, offsetof(struct kvmhv_tb_accumulator, tb_total)); + DEFINE(TAS_MIN, offsetof(struct kvmhv_tb_accumulator, tb_min)); + DEFINE(TAS_MAX, offsetof(struct kvmhv_tb_accumulator, tb_max)); #endif DEFINE(VCPU_SHARED_SPRG3,
Re: [PATCH 13/23] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
On Fri, Mar 20, 2015 at 12:15:15PM +0100, Alexander Graf wrote: On 20.03.15 10:39, Paul Mackerras wrote: This reads the timebase at various points in the real-mode guest entry/exit code and uses that to accumulate total, minimum and maximum time spent in those parts of the code. Currently these times are accumulated per vcpu in 5 parts of the code: * rm_entry - time taken from the start of kvmppc_hv_entry() until just before entering the guest. * rm_intr - time from when we take a hypervisor interrupt in the guest until we either re-enter the guest or decide to exit to the host. This includes time spent handling hcalls in real mode. * rm_exit - time from when we decide to exit the guest until the return from kvmppc_hv_entry(). * guest - time spend in the guest * cede - time spent napping in real mode due to an H_CEDE hcall while other threads in the same vcore are active. These times are exposed in debugfs in a directory per vcpu that contains a file called timings. This file contains one line for each of the 5 timings above, with the name followed by a colon and 4 numbers, which are the count (number of times the code has been executed), the total time, the minimum time, and the maximum time, all in nanoseconds. Signed-off-by: Paul Mackerras pau...@samba.org Have you measure the additional overhead this brings? I haven't - in fact I did this patch so I could measure the overhead or improvement from other changes I did, but it doesn't measure its own overhead, of course. I guess I need a workload that does a defined number of guest entries and exits and measure how fast it runs with and without the patch (maybe something like H_SET_MODE in a loop). I'll figure something out and post the results. Paul. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 13/23] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
On 20.03.15 12:25, Paul Mackerras wrote: On Fri, Mar 20, 2015 at 12:15:15PM +0100, Alexander Graf wrote: On 20.03.15 10:39, Paul Mackerras wrote: This reads the timebase at various points in the real-mode guest entry/exit code and uses that to accumulate total, minimum and maximum time spent in those parts of the code. Currently these times are accumulated per vcpu in 5 parts of the code: * rm_entry - time taken from the start of kvmppc_hv_entry() until just before entering the guest. * rm_intr - time from when we take a hypervisor interrupt in the guest until we either re-enter the guest or decide to exit to the host. This includes time spent handling hcalls in real mode. * rm_exit - time from when we decide to exit the guest until the return from kvmppc_hv_entry(). * guest - time spend in the guest * cede - time spent napping in real mode due to an H_CEDE hcall while other threads in the same vcore are active. These times are exposed in debugfs in a directory per vcpu that contains a file called timings. This file contains one line for each of the 5 timings above, with the name followed by a colon and 4 numbers, which are the count (number of times the code has been executed), the total time, the minimum time, and the maximum time, all in nanoseconds. Signed-off-by: Paul Mackerras pau...@samba.org Have you measure the additional overhead this brings? I haven't - in fact I did this patch so I could measure the overhead or improvement from other changes I did, but it doesn't measure its own overhead, of course. I guess I need a workload that does a defined number of guest entries and exits and measure how fast it runs with and without the patch (maybe something like H_SET_MODE in a loop). I'll figure something out and post the results. Yeah, just measure the number of exits you can handle for a simple hcall. If there is measurable overhead, it's probably a good idea to move the statistics gathering into #ifdef paths for DEBUGFS or maybe even a separate EXIT_TIMING config option as we have it for booke. Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 13/23] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
This reads the timebase at various points in the real-mode guest entry/exit code and uses that to accumulate total, minimum and maximum time spent in those parts of the code. Currently these times are accumulated per vcpu in 5 parts of the code: * rm_entry - time taken from the start of kvmppc_hv_entry() until just before entering the guest. * rm_intr - time from when we take a hypervisor interrupt in the guest until we either re-enter the guest or decide to exit to the host. This includes time spent handling hcalls in real mode. * rm_exit - time from when we decide to exit the guest until the return from kvmppc_hv_entry(). * guest - time spend in the guest * cede - time spent napping in real mode due to an H_CEDE hcall while other threads in the same vcore are active. These times are exposed in debugfs in a directory per vcpu that contains a file called timings. This file contains one line for each of the 5 timings above, with the name followed by a colon and 4 numbers, which are the count (number of times the code has been executed), the total time, the minimum time, and the maximum time, all in nanoseconds. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/include/asm/kvm_host.h | 19 + arch/powerpc/include/asm/time.h | 3 + arch/powerpc/kernel/asm-offsets.c | 11 +++ arch/powerpc/kernel/time.c | 6 ++ arch/powerpc/kvm/book3s_hv.c| 135 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 105 - 6 files changed, 276 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index f1d0bbc..286c0ce 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -369,6 +369,14 @@ struct kvmppc_slb { u8 base_page_size; /* MMU_PAGE_xxx */ }; +/* Struct used to accumulate timing information in HV real mode code */ +struct kvmhv_tb_accumulator { + u64 seqcount; /* used to synchronize access, also count * 2 */ + u64 tb_total; /* total time in timebase ticks */ + u64 tb_min; /* min time */ + u64 tb_max; /* max time */ +}; + # ifdef CONFIG_PPC_FSL_BOOK3E #define KVMPPC_BOOKE_IAC_NUM 2 #define KVMPPC_BOOKE_DAC_NUM 2 @@ -656,6 +664,17 @@ struct kvm_vcpu_arch { u64 busy_preempt; u32 emul_inst; + + struct kvmhv_tb_accumulator *cur_activity; /* What we're timing */ + u64 cur_tb_start; /* when it started */ + struct kvmhv_tb_accumulator rm_entry; /* real-mode entry code */ + struct kvmhv_tb_accumulator rm_intr;/* real-mode intr handling */ + struct kvmhv_tb_accumulator rm_exit;/* real-mode exit code */ + struct kvmhv_tb_accumulator guest_time; /* guest execution */ + struct kvmhv_tb_accumulator cede_time; /* time napping inside guest */ + + struct dentry *debugfs_dir; + struct dentry *debugfs_timings; #endif }; diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h index 03cbada..10fc784 100644 --- a/arch/powerpc/include/asm/time.h +++ b/arch/powerpc/include/asm/time.h @@ -211,5 +211,8 @@ extern void secondary_cpu_time_init(void); DECLARE_PER_CPU(u64, decrementers_next_tb); +/* Convert timebase ticks to nanoseconds */ +unsigned long long tb_to_ns(unsigned long long tb_ticks); + #endif /* __KERNEL__ */ #endif /* __POWERPC_TIME_H */ diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 4717859..ec9f59c 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -458,6 +458,17 @@ int main(void) DEFINE(VCPU_SPRG1, offsetof(struct kvm_vcpu, arch.shregs.sprg1)); DEFINE(VCPU_SPRG2, offsetof(struct kvm_vcpu, arch.shregs.sprg2)); DEFINE(VCPU_SPRG3, offsetof(struct kvm_vcpu, arch.shregs.sprg3)); + DEFINE(VCPU_TB_RMENTRY, offsetof(struct kvm_vcpu, arch.rm_entry)); + DEFINE(VCPU_TB_RMINTR, offsetof(struct kvm_vcpu, arch.rm_intr)); + DEFINE(VCPU_TB_RMEXIT, offsetof(struct kvm_vcpu, arch.rm_exit)); + DEFINE(VCPU_TB_GUEST, offsetof(struct kvm_vcpu, arch.guest_time)); + DEFINE(VCPU_TB_CEDE, offsetof(struct kvm_vcpu, arch.cede_time)); + DEFINE(VCPU_CUR_ACTIVITY, offsetof(struct kvm_vcpu, arch.cur_activity)); + DEFINE(VCPU_ACTIVITY_START, offsetof(struct kvm_vcpu, arch.cur_tb_start)); + DEFINE(TAS_SEQCOUNT, offsetof(struct kvmhv_tb_accumulator, seqcount)); + DEFINE(TAS_TOTAL, offsetof(struct kvmhv_tb_accumulator, tb_total)); + DEFINE(TAS_MIN, offsetof(struct kvmhv_tb_accumulator, tb_min)); + DEFINE(TAS_MAX, offsetof(struct kvmhv_tb_accumulator, tb_max)); #endif DEFINE(VCPU_SHARED_SPRG3, offsetof(struct kvm_vcpu_arch_shared, sprg3)); DEFINE(VCPU_SHARED_SPRG4, offsetof(struct kvm_vcpu_arch_shared, sprg4)); diff --git
[PATCH 13/23] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
This reads the timebase at various points in the real-mode guest entry/exit code and uses that to accumulate total, minimum and maximum time spent in those parts of the code. Currently these times are accumulated per vcpu in 5 parts of the code: * rm_entry - time taken from the start of kvmppc_hv_entry() until just before entering the guest. * rm_intr - time from when we take a hypervisor interrupt in the guest until we either re-enter the guest or decide to exit to the host. This includes time spent handling hcalls in real mode. * rm_exit - time from when we decide to exit the guest until the return from kvmppc_hv_entry(). * guest - time spend in the guest * cede - time spent napping in real mode due to an H_CEDE hcall while other threads in the same vcore are active. These times are exposed in debugfs in a directory per vcpu that contains a file called timings. This file contains one line for each of the 5 timings above, with the name followed by a colon and 4 numbers, which are the count (number of times the code has been executed), the total time, the minimum time, and the maximum time, all in nanoseconds. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/include/asm/kvm_host.h | 19 + arch/powerpc/include/asm/time.h | 3 + arch/powerpc/kernel/asm-offsets.c | 11 +++ arch/powerpc/kernel/time.c | 6 ++ arch/powerpc/kvm/book3s_hv.c| 135 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 105 - 6 files changed, 276 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index f1d0bbc..286c0ce 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -369,6 +369,14 @@ struct kvmppc_slb { u8 base_page_size; /* MMU_PAGE_xxx */ }; +/* Struct used to accumulate timing information in HV real mode code */ +struct kvmhv_tb_accumulator { + u64 seqcount; /* used to synchronize access, also count * 2 */ + u64 tb_total; /* total time in timebase ticks */ + u64 tb_min; /* min time */ + u64 tb_max; /* max time */ +}; + # ifdef CONFIG_PPC_FSL_BOOK3E #define KVMPPC_BOOKE_IAC_NUM 2 #define KVMPPC_BOOKE_DAC_NUM 2 @@ -656,6 +664,17 @@ struct kvm_vcpu_arch { u64 busy_preempt; u32 emul_inst; + + struct kvmhv_tb_accumulator *cur_activity; /* What we're timing */ + u64 cur_tb_start; /* when it started */ + struct kvmhv_tb_accumulator rm_entry; /* real-mode entry code */ + struct kvmhv_tb_accumulator rm_intr;/* real-mode intr handling */ + struct kvmhv_tb_accumulator rm_exit;/* real-mode exit code */ + struct kvmhv_tb_accumulator guest_time; /* guest execution */ + struct kvmhv_tb_accumulator cede_time; /* time napping inside guest */ + + struct dentry *debugfs_dir; + struct dentry *debugfs_timings; #endif }; diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h index 03cbada..10fc784 100644 --- a/arch/powerpc/include/asm/time.h +++ b/arch/powerpc/include/asm/time.h @@ -211,5 +211,8 @@ extern void secondary_cpu_time_init(void); DECLARE_PER_CPU(u64, decrementers_next_tb); +/* Convert timebase ticks to nanoseconds */ +unsigned long long tb_to_ns(unsigned long long tb_ticks); + #endif /* __KERNEL__ */ #endif /* __POWERPC_TIME_H */ diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 4717859..ec9f59c 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -458,6 +458,17 @@ int main(void) DEFINE(VCPU_SPRG1, offsetof(struct kvm_vcpu, arch.shregs.sprg1)); DEFINE(VCPU_SPRG2, offsetof(struct kvm_vcpu, arch.shregs.sprg2)); DEFINE(VCPU_SPRG3, offsetof(struct kvm_vcpu, arch.shregs.sprg3)); + DEFINE(VCPU_TB_RMENTRY, offsetof(struct kvm_vcpu, arch.rm_entry)); + DEFINE(VCPU_TB_RMINTR, offsetof(struct kvm_vcpu, arch.rm_intr)); + DEFINE(VCPU_TB_RMEXIT, offsetof(struct kvm_vcpu, arch.rm_exit)); + DEFINE(VCPU_TB_GUEST, offsetof(struct kvm_vcpu, arch.guest_time)); + DEFINE(VCPU_TB_CEDE, offsetof(struct kvm_vcpu, arch.cede_time)); + DEFINE(VCPU_CUR_ACTIVITY, offsetof(struct kvm_vcpu, arch.cur_activity)); + DEFINE(VCPU_ACTIVITY_START, offsetof(struct kvm_vcpu, arch.cur_tb_start)); + DEFINE(TAS_SEQCOUNT, offsetof(struct kvmhv_tb_accumulator, seqcount)); + DEFINE(TAS_TOTAL, offsetof(struct kvmhv_tb_accumulator, tb_total)); + DEFINE(TAS_MIN, offsetof(struct kvmhv_tb_accumulator, tb_min)); + DEFINE(TAS_MAX, offsetof(struct kvmhv_tb_accumulator, tb_max)); #endif DEFINE(VCPU_SHARED_SPRG3, offsetof(struct kvm_vcpu_arch_shared, sprg3)); DEFINE(VCPU_SHARED_SPRG4, offsetof(struct kvm_vcpu_arch_shared, sprg4)); diff --git