On 26/10/2017 10:14, kemi wrote:
> Some regression is found by LKP-tools (linux kernel performance) on this
> patch series, tested on Intel 2s/4s Skylake platforms.
> The regression results are sorted by the metric will-it-scale.per_process_ops.
Hi Kemi,

Thanks for reporting this. I'll try to address it by turning some features
of the SPF path off when the process is single threaded.

Laurent.

> Branch: Laurent-Dufour/Speculative-page-faults/20171011-213456 (V4 patch series)
> Commit id:
>   base: 9a4b4dd1d8700dd5771f11dd2c048e4363efb493
>   head: 56a4a8962fb32555a42eefdc9a19eeedd3e8c2e6
> Benchmark suite: will-it-scale
> Download link: https://github.com/antonblanchard/will-it-scale/tree/master/tests
> Metrics:
>   will-it-scale.per_process_ops = processes/nr_cpu
>   will-it-scale.per_thread_ops  = threads/nr_cpu
>
> tbox: lkp-skl-4sp1 (nr_cpu=192, memory=768G)
> kconfig: CONFIG_TRANSPARENT_HUGEPAGE is not set
> testcase          base      change   head      metric
> brk1              2251803   -18.1%   1843535   will-it-scale.per_process_ops
>                   341101    -17.5%   281284    will-it-scale.per_thread_ops
> malloc1           48833     -9.2%    44343     will-it-scale.per_process_ops
>                   31555     +2.9%    32473     will-it-scale.per_thread_ops
> page_fault3       913019    -8.5%    835203    will-it-scale.per_process_ops
>                   233978    -18.1%   191593    will-it-scale.per_thread_ops
> mmap2             95892     -6.6%    89536     will-it-scale.per_process_ops
>                   90180     -13.7%   77803     will-it-scale.per_thread_ops
> mmap1             109586    -4.7%    104414    will-it-scale.per_process_ops
>                   104477    -12.4%   91484     will-it-scale.per_thread_ops
> sched_yield       4964649   -2.1%    4859927   will-it-scale.per_process_ops
>                   4946759   -1.7%    4864924   will-it-scale.per_thread_ops
> write1            1345159   -1.3%    1327719   will-it-scale.per_process_ops
>                   1228754   -2.2%    1201915   will-it-scale.per_thread_ops
> page_fault2       202519    -1.0%    200545    will-it-scale.per_process_ops
>                   96573     -10.4%   86526     will-it-scale.per_thread_ops
> page_fault1       225608    -0.9%    223585    will-it-scale.per_process_ops
>                   105945    +14.4%   121199    will-it-scale.per_thread_ops
>
> tbox: lkp-skl-4sp1 (nr_cpu=192, memory=768G)
> kconfig: CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
> testcase          base      change   head      metric
> context_switch1   333780    -23.0%   256927    will-it-scale.per_process_ops
> brk1              2263539   -18.8%   1837462   will-it-scale.per_process_ops
>                   325854    -15.7%   274752    will-it-scale.per_thread_ops
> malloc1           48746     -13.5%   42148     will-it-scale.per_process_ops
> mmap1             106860    -12.4%   93634     will-it-scale.per_process_ops
>                   98082     -18.9%   79506     will-it-scale.per_thread_ops
> mmap2             92468     -11.3%   82059     will-it-scale.per_process_ops
>                   80468     -8.9%    73343     will-it-scale.per_thread_ops
> page_fault3       900709    -9.1%    818851    will-it-scale.per_process_ops
>                   229837    -18.3%   187769    will-it-scale.per_thread_ops
> write1            1327409   -1.7%    1305048   will-it-scale.per_process_ops
>                   1215658   -1.6%    1196479   will-it-scale.per_thread_ops
> writeseek3        300639    -1.6%    295882    will-it-scale.per_process_ops
>                   231118    -2.2%    225929    will-it-scale.per_thread_ops
> signal1           122011    -1.5%    120155    will-it-scale.per_process_ops
> futex1            5123778   -1.2%    5062087   will-it-scale.per_process_ops
> page_fault2       202321    -1.0%    200289    will-it-scale.per_process_ops
>                   93073     -9.8%    83927     will-it-scale.per_thread_ops
>
> tbox: lkp-skl-2sp2 (nr_cpu=112, memory=64G)
> kconfig: CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
> testcase          base      change   head      metric
> brk1              2177903   -20.0%   1742054   will-it-scale.per_process_ops
>                   434558    -15.3%   367896    will-it-scale.per_thread_ops
> malloc1           64871     -10.3%   58174     will-it-scale.per_process_ops
> page_fault3       882435    -9.0%    802892    will-it-scale.per_process_ops
>                   299176    -15.7%   252170    will-it-scale.per_thread_ops
> mmap2             124567    -8.3%    114214    will-it-scale.per_process_ops
>                   110674    -12.1%   97272     will-it-scale.per_thread_ops
> mmap1             137205    -7.8%    126440    will-it-scale.per_process_ops
>                   128973    -15.1%   109560    will-it-scale.per_thread_ops
> context_switch1   343790    -7.2%    319209    will-it-scale.per_process_ops
> page_fault2       161891    -2.1%    158458    will-it-scale.per_process_ops
>                   123278    -5.4%    116629    will-it-scale.per_thread_ops
> malloc2           14354856  -1.8%    14096856  will-it-scale.per_process_ops
> read2             1204838   -1.7%    1183993   will-it-scale.per_process_ops
> futex1            5017718   -1.6%    4938677   will-it-scale.per_process_ops
>                   1408250   -1.0%    1394022   will-it-scale.per_thread_ops
> writeseek3        399651    -1.4%    393935    will-it-scale.per_process_ops
> signal1           157952    -1.0%    156302    will-it-scale.per_process_ops
>
> On 11/10/2017 21:52, Laurent Dufour wrote:
>> This patch enables the speculative page fault on the PowerPC
>> architecture.
>>
>> This will try a speculative page fault without holding the mmap_sem;
>> if it returns with VM_FAULT_RETRY, the mmap_sem is acquired and the
>> traditional page fault processing is done.
>>
>> This is built only if CONFIG_SPF is defined (currently for
>> BOOK3S_64 && SMP).
>>
>> Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/mm/fault.c | 17 +++++++++++++++++
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
>> index 4797d08581ce..c018c2554cc8 100644
>> --- a/arch/powerpc/mm/fault.c
>> +++ b/arch/powerpc/mm/fault.c
>> @@ -442,6 +442,20 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>>  	if (is_exec)
>>  		flags |= FAULT_FLAG_INSTRUCTION;
>>  
>> +#ifdef CONFIG_SPF
>> +	if (is_user) {
>> +		/* let's try a speculative page fault without grabbing the
>> +		 * mmap_sem.
>> +		 */
>> +		fault = handle_speculative_fault(mm, address, flags);
>> +		if (!(fault & VM_FAULT_RETRY)) {
>> +			perf_sw_event(PERF_COUNT_SW_SPF, 1,
>> +				      regs, address);
>> +			goto done;
>> +		}
>> +	}
>> +#endif /* CONFIG_SPF */
>> +
>>  	/* When running in the kernel we expect faults to occur only to
>>  	 * addresses in user space. All other faults represent errors in the
>>  	 * kernel and should generate an OOPS. Unfortunately, in the case of an
>> @@ -526,6 +540,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>>  
>>  	up_read(&current->mm->mmap_sem);
>>  
>> +#ifdef CONFIG_SPF
>> +done:
>> +#endif
>>  	if (unlikely(fault & VM_FAULT_ERROR))
>>  		return mm_fault_error(regs, address, fault);