Re: [PATCH 0/4] Powerpc: Better preemption for shared processor
* Waiman Long [2020-10-28 20:01:30]:

> > Srikar Dronamraju (4):
> >   powerpc: Refactor is_kvm_guest declaration to new header
> >   powerpc: Rename is_kvm_guest to check_kvm_guest
> >   powerpc: Reintroduce is_kvm_guest
> >   powerpc/paravirt: Use is_kvm_guest in vcpu_is_preempted
> >
> >  arch/powerpc/include/asm/firmware.h  |  6 --
> >  arch/powerpc/include/asm/kvm_guest.h | 25 +
> >  arch/powerpc/include/asm/kvm_para.h  |  2 +-
> >  arch/powerpc/include/asm/paravirt.h  | 18 ++
> >  arch/powerpc/kernel/firmware.c       |  5 -
> >  arch/powerpc/platforms/pseries/smp.c |  3 ++-
> >  6 files changed, 50 insertions(+), 9 deletions(-)
> >  create mode 100644 arch/powerpc/include/asm/kvm_guest.h
> >
> This patch series looks good to me and the performance is nice too.
>
> Acked-by: Waiman Long

Thank you.

> Just curious, is the performance mainly from the use of static_branch
> (patches 1 - 3) or from reducing call to yield_count_of().

Because of the reduced call to yield_count

> Cheers,
> Longman

-- 
Thanks and Regards
Srikar Dronamraju
Re: [PATCH 0/4] Powerpc: Better preemption for shared processor
On 10/28/20 8:35 AM, Srikar Dronamraju wrote:
> Currently, vcpu_is_preempted will return the yield_count for
> shared_processor. On a PowerVM LPAR, Phyp schedules at SMT8 core boundary
> i.e all CPUs belonging to a core are either group scheduled in or group
> scheduled out. This can be used to better predict non-preempted CPUs on
> PowerVM shared LPARs.
>
> perf stat -r 5 -a perf bench sched pipe -l 1000 (lesser time is better)
>
> powerpc/next
>      35,107,951.20 msec cpu-clock                #  255.898 CPUs utilized          ( +-  0.31% )
>         23,655,348      context-switches         #    0.674 K/sec                  ( +-  3.72% )
>             14,465      cpu-migrations           #    0.000 K/sec                  ( +-  5.37% )
>             82,463      page-faults              #    0.002 K/sec                  ( +-  8.40% )
>  1,127,182,328,206      cycles                   #    0.032 GHz                    ( +-  1.60% )  (66.67%)
>     78,587,300,622      stalled-cycles-frontend  #    6.97% frontend cycles idle   ( +-  0.08% )  (50.01%)
>    654,124,218,432      stalled-cycles-backend   #   58.03% backend cycles idle    ( +-  1.74% )  (50.01%)
>    834,013,059,242      instructions             #    0.74  insn per cycle
>                                                  #    0.78  stalled cycles per insn  ( +-  0.73% )  (66.67%)
>    132,911,454,387      branches                 #    3.786 M/sec                  ( +-  0.59% )  (50.00%)
>      2,890,882,143      branch-misses            #    2.18% of all branches        ( +-  0.46% )  (50.00%)
>
>            137.195 +- 0.419 seconds time elapsed  ( +-  0.31% )
>
> powerpc/next + patchset
>      29,981,702.64 msec cpu-clock                #  255.881 CPUs utilized          ( +-  1.30% )
>         40,162,456      context-switches         #    0.001 M/sec                  ( +-  0.01% )
>              1,110      cpu-migrations           #    0.000 K/sec                  ( +-  5.20% )
>             62,616      page-faults              #    0.002 K/sec                  ( +-  3.93% )
>  1,430,030,626,037      cycles                   #    0.048 GHz                    ( +-  1.41% )  (66.67%)
>     83,202,707,288      stalled-cycles-frontend  #    5.82% frontend cycles idle   ( +-  0.75% )  (50.01%)
>    744,556,088,520      stalled-cycles-backend   #   52.07% backend cycles idle    ( +-  1.39% )  (50.01%)
>    940,138,418,674      instructions             #    0.66  insn per cycle
>                                                  #    0.79  stalled cycles per insn  ( +-  0.51% )  (66.67%)
>    146,452,852,283      branches                 #    4.885 M/sec                  ( +-  0.80% )  (50.00%)
>      3,237,743,996      branch-misses            #    2.21% of all branches        ( +-  1.18% )  (50.01%)
>
>             117.17 +- 1.52 seconds time elapsed  ( +-  1.30% )
>
> This is around 14.6% improvement in performance.
>
> Cc: linuxppc-dev
> Cc: LKML
> Cc: Michael Ellerman
> Cc: Nicholas Piggin
> Cc: Nathan Lynch
> Cc: Gautham R Shenoy
> Cc: Peter Zijlstra
> Cc: Valentin Schneider
> Cc: Juri Lelli
> Cc: Waiman Long
> Cc: Phil Auld
>
> Srikar Dronamraju (4):
>   powerpc: Refactor is_kvm_guest declaration to new header
>   powerpc: Rename is_kvm_guest to check_kvm_guest
>   powerpc: Reintroduce is_kvm_guest
>   powerpc/paravirt: Use is_kvm_guest in vcpu_is_preempted
>
>  arch/powerpc/include/asm/firmware.h  |  6 --
>  arch/powerpc/include/asm/kvm_guest.h | 25 +
>  arch/powerpc/include/asm/kvm_para.h  |  2 +-
>  arch/powerpc/include/asm/paravirt.h  | 18 ++
>  arch/powerpc/kernel/firmware.c       |  5 -
>  arch/powerpc/platforms/pseries/smp.c |  3 ++-
>  6 files changed, 50 insertions(+), 9 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/kvm_guest.h
>

This patch series looks good to me and the performance is nice too.

Acked-by: Waiman Long

Just curious, is the performance mainly from the use of static_branch
(patches 1 - 3) or from reducing call to yield_count_of().

Cheers,
Longman
[PATCH 0/4] Powerpc: Better preemption for shared processor
Currently, vcpu_is_preempted will return the yield_count for
shared_processor. On a PowerVM LPAR, Phyp schedules at SMT8 core boundary
i.e all CPUs belonging to a core are either group scheduled in or group
scheduled out. This can be used to better predict non-preempted CPUs on
PowerVM shared LPARs.

perf stat -r 5 -a perf bench sched pipe -l 1000 (lesser time is better)

powerpc/next
     35,107,951.20 msec cpu-clock                #  255.898 CPUs utilized          ( +-  0.31% )
        23,655,348      context-switches         #    0.674 K/sec                  ( +-  3.72% )
            14,465      cpu-migrations           #    0.000 K/sec                  ( +-  5.37% )
            82,463      page-faults              #    0.002 K/sec                  ( +-  8.40% )
 1,127,182,328,206      cycles                   #    0.032 GHz                    ( +-  1.60% )  (66.67%)
    78,587,300,622      stalled-cycles-frontend  #    6.97% frontend cycles idle   ( +-  0.08% )  (50.01%)
   654,124,218,432      stalled-cycles-backend   #   58.03% backend cycles idle    ( +-  1.74% )  (50.01%)
   834,013,059,242      instructions             #    0.74  insn per cycle
                                                 #    0.78  stalled cycles per insn  ( +-  0.73% )  (66.67%)
   132,911,454,387      branches                 #    3.786 M/sec                  ( +-  0.59% )  (50.00%)
     2,890,882,143      branch-misses            #    2.18% of all branches        ( +-  0.46% )  (50.00%)

           137.195 +- 0.419 seconds time elapsed  ( +-  0.31% )

powerpc/next + patchset
     29,981,702.64 msec cpu-clock                #  255.881 CPUs utilized          ( +-  1.30% )
        40,162,456      context-switches         #    0.001 M/sec                  ( +-  0.01% )
             1,110      cpu-migrations           #    0.000 K/sec                  ( +-  5.20% )
            62,616      page-faults              #    0.002 K/sec                  ( +-  3.93% )
 1,430,030,626,037      cycles                   #    0.048 GHz                    ( +-  1.41% )  (66.67%)
    83,202,707,288      stalled-cycles-frontend  #    5.82% frontend cycles idle   ( +-  0.75% )  (50.01%)
   744,556,088,520      stalled-cycles-backend   #   52.07% backend cycles idle    ( +-  1.39% )  (50.01%)
   940,138,418,674      instructions             #    0.66  insn per cycle
                                                 #    0.79  stalled cycles per insn  ( +-  0.51% )  (66.67%)
   146,452,852,283      branches                 #    4.885 M/sec                  ( +-  0.80% )  (50.00%)
     3,237,743,996      branch-misses            #    2.21% of all branches        ( +-  1.18% )  (50.01%)

            117.17 +- 1.52 seconds time elapsed  ( +-  1.30% )

This is around 14.6% improvement in performance.

Cc: linuxppc-dev
Cc: LKML
Cc: Michael Ellerman
Cc: Nicholas Piggin
Cc: Nathan Lynch
Cc: Gautham R Shenoy
Cc: Peter Zijlstra
Cc: Valentin Schneider
Cc: Juri Lelli
Cc: Waiman Long
Cc: Phil Auld

Srikar Dronamraju (4):
  powerpc: Refactor is_kvm_guest declaration to new header
  powerpc: Rename is_kvm_guest to check_kvm_guest
  powerpc: Reintroduce is_kvm_guest
  powerpc/paravirt: Use is_kvm_guest in vcpu_is_preempted

 arch/powerpc/include/asm/firmware.h  |  6 --
 arch/powerpc/include/asm/kvm_guest.h | 25 +
 arch/powerpc/include/asm/kvm_para.h  |  2 +-
 arch/powerpc/include/asm/paravirt.h  | 18 ++
 arch/powerpc/kernel/firmware.c       |  5 -
 arch/powerpc/platforms/pseries/smp.c |  3 ++-
 6 files changed, 50 insertions(+), 9 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_guest.h

-- 
2.18.4
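
For readers who want the core-boundary idea from the cover letter in
concrete form, here is a small user-space model in C. It is only a sketch
and not the code from the patches: the helper names, the fixed
THREADS_PER_CORE of 8, the plain yield_count[] array, and the convention
that an odd yield count means "not dispatched" are all assumptions made for
illustration. The point it shows is that on a PowerVM shared LPAR a CPU in
the caller's own core can be reported as not preempted without reading its
yield count at all, because the caller is running and the whole core is
scheduled as a group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: SMT8, so 8 consecutive CPU numbers form one core. */
#define THREADS_PER_CORE 8

/* Hypothetical helper: first thread of the core that contains cpu. */
static int first_thread_of_core(int cpu)
{
	assert(cpu >= 0);
	return cpu - (cpu % THREADS_PER_CORE);
}

/*
 * Model of the prediction described in the cover letter.
 *
 * this_cpu          CPU executing the check (it is running by definition)
 * cpu               CPU being asked about
 * shared_processor  true on a shared-processor LPAR
 * kvm_guest         true under KVM, where group scheduling is not assumed
 * yield_count       stand-in for the per-vCPU yield counters; assumption:
 *                   an odd value means the vCPU is currently not dispatched
 */
static bool model_vcpu_is_preempted(int this_cpu, int cpu,
				    bool shared_processor, bool kvm_guest,
				    const uint32_t *yield_count)
{
	/* Dedicated processors are never preempted by the hypervisor. */
	if (!shared_processor)
		return false;

	/*
	 * PowerVM schedules whole cores in and out. If cpu shares a core
	 * with the running CPU, it must be scheduled in as well, so the
	 * yield count lookup can be skipped.
	 */
	if (!kvm_guest &&
	    first_thread_of_core(cpu) == first_thread_of_core(this_cpu))
		return false;

	/* Otherwise fall back to the yield count. */
	return (yield_count[cpu] & 1) != 0;
}
```

With 16 modelled CPUs and this_cpu = 0, a query about CPU 3 returns false
on PowerVM regardless of its yield count, while a query about CPU 9 (a
different core) still depends on whether its count is odd. In a KVM guest
the shortcut is disabled and every query goes through the yield count.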