Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-09-14 Thread Jirka Hladky
Hi Mel, we have tried to revert the following 2 commits: 305c1fac3225 and 2d4056fafa196e1ab. We had to revert 10864a9e222048a862da2c21efa28929a4dfed15 as well. The performance of the kernel was better than when only 2d4056fafa196e1ab was reverted, but still worse than the performance of the 4.18 kernel.

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-09-07 Thread Jirka Hladky
> Maybe 305c1fac3225dfa7eeb89bfe91b7335a6edd5172. That introduces a weird > condition in terms of idle CPU handling that has been problematic. We will try that, thanks! > I would suggest contacting Srikar directly. I will do that right away. Whom should I put on Cc? Just you and

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-09-06 Thread Mel Gorman
On Thu, Sep 06, 2018 at 10:16:28AM +0200, Jirka Hladky wrote: > Hi Mel, > > we have results with 2d4056fafa196e1ab4e7161bae4df76f9602d56d reverted. > > * Compared to 4.18, there is still performance regression - > especially with NAS (sp_C_x subtest) and SPECjvm2008. On 4 NUMA > systems,

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-09-06 Thread Jirka Hladky
Hi Mel, we have results with 2d4056fafa196e1ab4e7161bae4df76f9602d56d reverted. * Compared to 4.18, there is still performance regression - especially with NAS (sp_C_x subtest) and SPECjvm2008. On 4 NUMA systems, regression is around 10-15% * Compared to 4.19rc1 there is a clear gain across

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-09-04 Thread Jirka Hladky
Hi Mel, thanks for sharing the background information! We will check if 2d4056fafa196e1ab4e7161bae4df76f9602d56d is causing the current regression in 4.19 rc1 and let you know the outcome. Jirka On Tue, Sep 4, 2018 at 11:00 AM, Mel Gorman wrote: > On Mon, Sep 03, 2018 at 05:07:15PM +0200,

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-09-04 Thread Mel Gorman
On Mon, Sep 03, 2018 at 05:07:15PM +0200, Jirka Hladky wrote: > Resending in the plain text mode. > > > My own testing completed and the results are within expectations and I > > saw no red flags. Unfortunately, I consider it unlikely they'll be merged > > for 4.18. Srikar Dronamraju's series is

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-09-03 Thread Jirka Hladky
Resending in the plain text mode. > My own testing completed and the results are within expectations and I > saw no red flags. Unfortunately, I consider it unlikely they'll be merged > for 4.18. Srikar Dronamraju's series is likely to need another update > and I would need to rebase my patches on

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-07-17 Thread Mel Gorman
On Tue, Jul 17, 2018 at 10:45:51AM +0200, Jirka Hladky wrote: > Hi Mel, > > we have compared 4.18 + git:// > git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git > sched-numa-fast-crossnode-v1r12 against 4.16 kernel and performance results > look very good! > Excellent, thanks to both Kamil

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-27 Thread Mel Gorman
On Wed, Jun 27, 2018 at 12:18:37AM +0200, Jirka Hladky wrote: > Hi Mel, > > we have results for the "Fixes for sched/numa_balancing" series and overall > it looks very promising. > > We see improvements in the range 15-20% for the stream benchmark and > upto 60% for the OpenMP NAS benchmark.

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-21 Thread Mel Gorman
On Wed, Jun 20, 2018 at 07:25:19PM +0200, Jirka Hladky wrote: > Hi Mel and others, > > I would like to let you know that I have tested following patch > Understood. FWIW, there is a lot in flight at the moment but the first likely patch is removing rate limiting entirely and see what falls out.

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-19 Thread Mel Gorman
On Tue, Jun 19, 2018 at 03:36:53PM +0200, Jirka Hladky wrote: > Hi Mel, > > we have tested following variants: > > var1: 4.16 + 2c83362734dad8e48ccc0710b5cd2436a0323893 > fix1: var1+ ratelimit_pages __read_mostly increased by factor 4x > -static unsigned int ratelimit_pages __read_mostly = 128

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-15 Thread Mel Gorman
On Fri, Jun 15, 2018 at 02:23:17PM +0200, Jirka Hladky wrote: > I added configurations that used half of the CPUs. However, that would > > mean it fits too nicely within sockets. I've added another set for one > > third of the CPUs and scheduled the tests. Unfortunately, they will not > > complete

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-15 Thread Mel Gorman
On Fri, Jun 15, 2018 at 01:07:32AM +0200, Jirka Hladky wrote: > > > > In terms of the speed of migration, it may be worth checking how often the > > mm_numa_migrate_ratelimit tracepoint is triggered with bonus points for > > using > > the nr_pages to calculate how many pages get throttled from

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-14 Thread Mel Gorman
On Mon, Jun 11, 2018 at 06:07:58PM +0200, Jirka Hladky wrote: > > > > Fixing any part of it for STREAM will end up regressing something else. > > > I fully understand that. We run a set of benchmarks and we always look at > the results as the ensemble. Looking only at one benchmark would be >

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-11 Thread Mel Gorman
On Mon, Jun 11, 2018 at 12:04:34PM +0200, Jirka Hladky wrote: > Hi Mel, > > your suggestion about the commit which has caused the regression was right > - it's indeed this commit: > > 2c83362734dad8e48ccc0710b5cd2436a0323893 > > The question now is what can be done to improve the results. I

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-08 Thread Mel Gorman
On Fri, Jun 08, 2018 at 01:02:54PM +0200, Jirka Hladky wrote: > > > > Unknown and unknowable. It depends entirely on the reference pattern of > > the different threads. If they are fully parallelised with private buffers > > that are page-aligned then I expect it to be quick (to pass the

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-08 Thread Mel Gorman
On Fri, Jun 08, 2018 at 10:49:03AM +0200, Jirka Hladky wrote: > Hi Mel, > > automatic NUMA balancing doesn't run long enough to migrate all the > > memory. That would definitely be the case for STREAM. > > This could explain the behavior we observe. stream is running ~20 seconds > at the moment.

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-08 Thread Mel Gorman
On Fri, Jun 08, 2018 at 07:49:37AM +0200, Jirka Hladky wrote: > Hi Mel, > > we will do the bisection today and report the results back. > The most likely outcome is 2c83362734dad8e48ccc0710b5cd2436a0323893 which is a patch that restricts newly forked processes from selecting a remote node when

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-07 Thread Mel Gorman
On Wed, Jun 06, 2018 at 02:27:32PM +0200, Jakub Racek wrote: > There is a huge performance regression on the 2 and 4 NUMA node systems on > stream benchmark with 4.17 kernel compared to 4.16 kernel. Stream, Linpack > and NAS parallel benchmarks show upto 50% performance drop. > I have not

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-07 Thread Jakub Raček
Hi, On 06/07/2018 01:07 PM, Michal Hocko wrote: [CCing Mel and MM mailing list] On Wed 06-06-18 14:27:32, Jakub Racek wrote: Hi, There is a huge performance regression on the 2 and 4 NUMA node systems on stream benchmark with 4.17 kernel compared to 4.16 kernel. Stream, Linpack and NAS

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-07 Thread Michal Hocko
[CCing Mel and MM mailing list] On Wed 06-06-18 14:27:32, Jakub Racek wrote: > Hi, > > There is a huge performance regression on the 2 and 4 NUMA node systems on > stream benchmark with 4.17 kernel compared to 4.16 kernel. Stream, Linpack > and NAS parallel benchmarks show upto 50% performance

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-06 Thread Rafael J. Wysocki
On Wed, Jun 6, 2018 at 2:34 PM, Rafael J. Wysocki wrote: > On Wed, Jun 6, 2018 at 2:27 PM, Jakub Racek wrote: >> Hi, >> >> There is a huge performance regression on the 2 and 4 NUMA node systems on >> stream benchmark with 4.17 kernel compared to 4.16 kernel. Stream, Linpack >> and NAS parallel

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-06 Thread Rafael J. Wysocki
On Wed, Jun 6, 2018 at 2:27 PM, Jakub Racek wrote: > Hi, > > There is a huge performance regression on the 2 and 4 NUMA node systems on > stream benchmark with 4.17 kernel compared to 4.16 kernel. Stream, Linpack > and NAS parallel benchmarks show upto 50% performance drop. > > When running for

[4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-06 Thread Jakub Racek
Hi, There is a huge performance regression on the 2 and 4 NUMA node systems on stream benchmark with 4.17 kernel compared to 4.16 kernel. Stream, Linpack and NAS parallel benchmarks show upto 50% performance drop. When running for example 20 stream processes in parallel, we see the following
