Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-31 Thread Feng Tang
On Mon, Aug 31, 2020 at 09:55:17AM +0100, Mel Gorman wrote: > On Mon, Aug 31, 2020 at 04:23:06PM +0800, Feng Tang wrote: > > On Mon, Aug 31, 2020 at 08:56:11AM +0100, Mel Gorman wrote: > > > On Mon, Aug 31, 2020 at 10:16:38AM +0800, Feng Tang wrote: > > > > > So why don't you define both variables

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-31 Thread Mel Gorman
On Mon, Aug 31, 2020 at 04:23:06PM +0800, Feng Tang wrote: > On Mon, Aug 31, 2020 at 08:56:11AM +0100, Mel Gorman wrote: > > On Mon, Aug 31, 2020 at 10:16:38AM +0800, Feng Tang wrote: > > > > So why don't you define both variables with DEFINE_PER_CPU_ALIGNED and > > > > check if all your bad

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-31 Thread Feng Tang
On Mon, Aug 31, 2020 at 08:56:11AM +0100, Mel Gorman wrote: > On Mon, Aug 31, 2020 at 10:16:38AM +0800, Feng Tang wrote: > > > So why don't you define both variables with DEFINE_PER_CPU_ALIGNED and > > > check if all your bad measurements go away this way? > > > > For 'arch_freq_scale', there are

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-31 Thread Mel Gorman
On Mon, Aug 31, 2020 at 10:16:38AM +0800, Feng Tang wrote: > > So why don't you define both variables with DEFINE_PER_CPU_ALIGNED and > > check if all your bad measurements go away this way? > > For 'arch_freq_scale', there are other percpu variables in the same > smpboot.c: 'arch_prev_aperf' and

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-30 Thread Feng Tang
On Fri, Aug 28, 2020 at 07:48:39PM +0200, Borislav Petkov wrote: > On Tue, Aug 25, 2020 at 02:23:05PM +0800, Feng Tang wrote: > > Also one good news is, we seem to identify the 2 key percpu variables > > out of the list mentioned in previous email: > > 'arch_freq_scale' > > 'tsc_adjust'

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-28 Thread Borislav Petkov
On Tue, Aug 25, 2020 at 02:23:05PM +0800, Feng Tang wrote: > Also one good news is, we seem to identify the 2 key percpu variables > out of the list mentioned in previous email: > 'arch_freq_scale' > 'tsc_adjust' > > These 2 variables are accessed in 2 hot call stacks (for this 288

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-25 Thread Feng Tang
On Wed, Aug 26, 2020 at 12:44:37AM +0800, Luck, Tony wrote: > > These 2 variables are accessed in 2 hot call stacks (for this 288 CPU > > Xeon Phi platform): > > This might be the key element of "weirdness" for this system. It > has 288 CPUs ... cache alignment problems are often not too bad > on

RE: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-25 Thread Luck, Tony
> These 2 variables are accessed in 2 hot call stacks (for this 288 CPU > Xeon Phi platform): This might be the key element of "weirdness" for this system. It has 288 CPUs ... cache alignment problems are often not too bad on "small" systems. Then as you scale up to bigger machines you suddenly

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-25 Thread Feng Tang
On Mon, Aug 24, 2020 at 05:56:53PM +0100, Mel Gorman wrote: > On Mon, Aug 24, 2020 at 06:12:38PM +0200, Borislav Petkov wrote: > > > > > :) Right, this is what I'm doing right now. Some test job is queued on > > > the test box, and it may needs some iterations of new patch. Hopefully we > > >

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-25 Thread Feng Tang
On Mon, Aug 24, 2020 at 06:12:38PM +0200, Borislav Petkov wrote: > > -DEFINE_PER_CPU(struct mce, injectm); > > +DEFINE_PER_CPU_ALIGNED(struct mce, injectm); > > EXPORT_PER_CPU_SYMBOL_GPL(injectm); > > I don't think this is the right fix. Agreed :) This is a debug patch, what we want is to root
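The debug patch quoted above swaps DEFINE_PER_CPU for DEFINE_PER_CPU_ALIGNED so that the variable starts on its own cache line. A rough userspace analogy, not the kernel macro itself, can be sketched with C11 alignas. The 64-byte line size and the names hot_counter, packed_pair and padded_pair are assumptions made only for illustration.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; 64 bytes is common on x86 but not guaranteed. */
#define CACHE_LINE_BYTES 64

/* Hypothetical stand-in for a small, frequently accessed variable. */
struct hot_counter {
	uint64_t value;
};

/* Default layout: the two members sit next to each other, so they
 * normally end up in the same cache line. */
struct packed_pair {
	struct hot_counter a;
	struct hot_counter b;
};

/* Aligned layout: each member starts on its own cache-line boundary,
 * which is the effect the aligned definition is meant to achieve. */
struct padded_pair {
	alignas(CACHE_LINE_BYTES) struct hot_counter a;
	alignas(CACHE_LINE_BYTES) struct hot_counter b;
};
```

With these definitions, offsetof(struct packed_pair, b) is 8 while offsetof(struct padded_pair, b) is 64, so the padded members can no longer interfere with each other through a shared line, at the cost of extra memory per variable.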

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-24 Thread Mel Gorman
On Mon, Aug 24, 2020 at 06:12:38PM +0200, Borislav Petkov wrote: > > > :) Right, this is what I'm doing right now. Some test job is queued on > > the test box, and it may needs some iterations of new patch. Hopefully we > > can isolate some specific variable given some luck. > > ... yes,

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-24 Thread Borislav Petkov
On Mon, Aug 24, 2020 at 11:33:00PM +0800, Feng Tang wrote: > diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c > index 43b1519..2c020ef 100644 > --- a/arch/x86/kernel/cpu/mce/core.c > +++ b/arch/x86/kernel/cpu/mce/core.c > @@ -95,7 +95,7 @@ struct mca_config mca_cfg

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-24 Thread Feng Tang
On Mon, Aug 24, 2020 at 11:38:53PM +0800, Luck, Tony wrote: > > Yes, that's what we suspected. And I just did another try to force the > > percpu mce structure aligned. And the regression seems to be gone (reduced > > from 14.1% to 2%), which further proved it. > > I wonder whether it would be

RE: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-24 Thread Luck, Tony
> Yes, that's what we suspected. And I just did another try to force the > percpu mce structure aligned. And the regression seems to be gone (reduced > from 14.1% to 2%), which further proved it. I wonder whether it would be useful for bisection of performance issues for you to change the global

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-24 Thread Feng Tang
On Mon, Aug 24, 2020 at 05:14:25PM +0200, Borislav Petkov wrote: > On Fri, Aug 21, 2020 at 10:02:59AM +0800, Feng Tang wrote: > > 1de08dccd383 x86/mce: Add a struct mce.kflags field > > 9554bfe403bd x86/mce: Convert the CEC to use the MCE notifier > > > > And strange thing is after using gcc9

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-24 Thread Borislav Petkov
On Fri, Aug 21, 2020 at 10:02:59AM +0800, Feng Tang wrote: > 1de08dccd383 x86/mce: Add a struct mce.kflags field > 9554bfe403bd x86/mce: Convert the CEC to use the MCE notifier > > And strange thing is after using gcc9 and debian10 rootfs, with same commits > the regression turns to a

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-18 Thread Feng Tang
On Wed, Aug 19, 2020 at 10:23:11AM +0800, Luck, Tony wrote: > 00019260 D pqr_state > > Do you have /sys/fs/resctrl mounted? This variable is read on every context > switch. > If your benchmark does a lot of context switching and this now shares a cache > line > with something

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-18 Thread Feng Tang
On Wed, Aug 19, 2020 at 10:23:11AM +0800, Luck, Tony wrote: > 00019260 D pqr_state > > Do you have /sys/fs/resctrl mounted? This variable is read on every context > switch. > If your benchmark does a lot of context switching and this now shares a cache > line > with something

RE: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-18 Thread Luck, Tony
00019260 D pqr_state Do you have /sys/fs/resctrl mounted? This variable is read on every context switch. If your benchmark does a lot of context switching and this now shares a cache line with something different (especially something that is sometimes modified from another
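The concern in this message, that pqr_state may now share a cache line with unrelated data, can be checked arithmetically from symbol offsets such as the 00019260 shown above. The following is a minimal sketch that assumes a 64-byte cache line; the helper names are hypothetical and are not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed cache line size; 64 bytes is common on x86 but not guaranteed. */
#define CACHE_LINE_BYTES 64u

/* Index of the cache line that contains the given byte offset. */
static inline uint64_t cache_line_index(uint64_t offset)
{
	return offset / CACHE_LINE_BYTES;
}

/* True when two byte offsets within the same area fall in one cache line. */
static inline bool shares_cache_line(uint64_t a, uint64_t b)
{
	return cache_line_index(a) == cache_line_index(b);
}
```

Under that assumption, offset 0x19260 lies in the line covering 0x19240 through 0x1927f, so any other symbol whose offset falls in that range would share the line with pqr_state.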

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-18 Thread Feng Tang
On Tue, Aug 18, 2020 at 01:06:54PM -0700, Luck, Tony wrote: > On Tue, Aug 18, 2020 at 04:29:43PM +0800, Feng Tang wrote: > > Hi Borislav, > > > > On Sat, Apr 25, 2020 at 03:01:36PM +0200, Borislav Petkov wrote: > > > On Sat, Apr 25, 2020 at 07:44:14PM +0800, kernel test robot wrote: > > > >

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-18 Thread Luck, Tony
On Tue, Aug 18, 2020 at 04:29:43PM +0800, Feng Tang wrote: > Hi Borislav, > > On Sat, Apr 25, 2020 at 03:01:36PM +0200, Borislav Petkov wrote: > > On Sat, Apr 25, 2020 at 07:44:14PM +0800, kernel test robot wrote: > > > Greeting, > > > > > > FYI, we noticed a -14.1% regression of

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-18 Thread Feng Tang
Hi Borislav, On Sat, Apr 25, 2020 at 03:01:36PM +0200, Borislav Petkov wrote: > On Sat, Apr 25, 2020 at 07:44:14PM +0800, kernel test robot wrote: > > Greeting, > > > > FYI, we noticed a -14.1% regression of will-it-scale.per_process_ops due to > > commit: > > > > > > commit: