Re: aim7 -30% regression in 2.6.24-rc1
On Mon, 2007-11-05 at 10:37 +0100, Cyrus Massoumi wrote:
> Zhang, Yanmin wrote:
> > On Thu, 2007-11-01 at 11:02 +0100, Cyrus Massoumi wrote:
> > > Zhang, Yanmin wrote:
> > > > On Wed, 2007-10-31 at 17:57 +0800, Zhang, Yanmin wrote:
> > > > > On Tue, 2007-10-30 at 16:36 +0800, Zhang, Yanmin wrote:
> > > > > > On Tue, 2007-10-30 at 08:26 +0100, Ingo Molnar wrote:
> > > > > > > * Zhang, Yanmin <[EMAIL PROTECTED]> wrote:
> > > > > > >
> > > > > > > > sub-bisecting captured patch
> > > > > > > > 38ad464d410dadceda1563f36bdb0be7fe4c8938 (sched: uniform tunings)
> > > > > > > > caused a 20% regression of aim7.
> > > > > > > >
> > > > > > > > The last 10% should also be related to sched parameters, such as
> > > > > > > > sysctl_sched_min_granularity.
> > > > > > >
> > > > > > > ah, interesting. Since you have CONFIG_SCHED_DEBUG enabled, could you
> > > > > > > please try to figure out what the best value for
> > > > > > > /proc/sys/kernel/sched_latency, /proc/sys/kernel/sched_nr_latency and
> > > > > > > /proc/sys/kernel/sched_min_granularity is?
> > > > > > >
> > > > > > > there's a tuning constraint for kernel_sched_nr_latency:
> > > > > > >
> > > > > > > - kernel_sched_nr_latency should always be set to
> > > > > > >   kernel_sched_latency/kernel_sched_min_granularity. (it's not a free
> > > > > > >   tunable)
> > > > > > >
> > > > > > > i suspect a good approach would be to double the value of
> > > > > > > kernel_sched_latency and kernel_sched_nr_latency in each tuning
> > > > > > > iteration, while keeping kernel_sched_min_granularity unchanged. That
> > > > > > > will exercise the tuning values of the 2.6.23 kernel as well.
> > > > > > I followed your idea to test 2.6.24-rc1. The improvement is slow.
> > > > > > When sched_nr_latency=2560 and sched_latency_ns=64000, the performance
> > > > > > is still about 15% less than 2.6.23.
> > > > > I got the aim7 30% regression on my newly upgraded stoakley machine. I
> > > > > found this machine is slower than the old one. Maybe the BIOS has
> > > > > issues, or the memory (might not be dual-channel?) is slow. So I
> > > > > retested it on the old machine and found that on the old stoakley
> > > > > machine, the regression is about 6%, quite similar to the regression
> > > > > on the tigerton machine.
> > > > >
> > > > > With sched_nr_latency=640 and sched_latency_ns=64000 on the old
> > > > > stoakley machine, the regression becomes about 2%. Other latency
> > > > > settings have more regression.
> > > > >
> > > > > On my tulsa machine, with sched_nr_latency=640 and
> > > > > sched_latency_ns=64000, the regression becomes less than 1% (the
> > > > > original regression is about 20%).
> > > > I reran SPECjbb with sched_nr_latency=640 and sched_latency_ns=64000.
> > > > On tigerton, the regression is still more than 40%. On the stoakley
> > > > machine, it becomes worse (26%, originally 9%). I will do more
> > > > investigation to make sure the SPECjbb regression is also caused by
> > > > the bad default values.
> > > >
> > > > We need a smarter method to calculate the best default values for the
> > > > key tuning parameters.
> > > >
> > > > One interesting thing is that sysbench+mysql (readonly) got the same
> > > > result as 2.6.22 (no regression). Good job!
> > > Do you mean you couldn't reproduce the regression which was reported
> > > with 2.6.23 (http://lkml.org/lkml/2007/10/30/53) with 2.6.24-rc1?
> > It looks like you missed my emails.
>
> Yeah :(
>
> > Firstly, I reproduced (or just found the same myself :) ) the issue with
> > kernel 2.6.22, 2.6.23-rc and 2.6.23.
> >
> > Ingo wrote a big patch to fix it and the new patch is in 2.6.24-rc1 now.
>
> That's nice, could you please point me to the commit?

The patch is very big.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b5869ce7f68b233ceb81465a7644be0d9a5f3dbb
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: aim7 -30% regression in 2.6.24-rc1
Zhang, Yanmin wrote:
> On Thu, 2007-11-01 at 11:02 +0100, Cyrus Massoumi wrote:
> > Zhang, Yanmin wrote:
> > > On Wed, 2007-10-31 at 17:57 +0800, Zhang, Yanmin wrote:
> > > > On Tue, 2007-10-30 at 16:36 +0800, Zhang, Yanmin wrote:
> > > > > On Tue, 2007-10-30 at 08:26 +0100, Ingo Molnar wrote:
> > > > > > * Zhang, Yanmin <[EMAIL PROTECTED]> wrote:
> > > > > >
> > > > > > > sub-bisecting captured patch
> > > > > > > 38ad464d410dadceda1563f36bdb0be7fe4c8938 (sched: uniform tunings)
> > > > > > > caused a 20% regression of aim7.
> > > > > > >
> > > > > > > The last 10% should also be related to sched parameters, such as
> > > > > > > sysctl_sched_min_granularity.
> > > > > >
> > > > > > ah, interesting. Since you have CONFIG_SCHED_DEBUG enabled, could you
> > > > > > please try to figure out what the best value for
> > > > > > /proc/sys/kernel/sched_latency, /proc/sys/kernel/sched_nr_latency and
> > > > > > /proc/sys/kernel/sched_min_granularity is?
> > > > > >
> > > > > > there's a tuning constraint for kernel_sched_nr_latency:
> > > > > >
> > > > > > - kernel_sched_nr_latency should always be set to
> > > > > >   kernel_sched_latency/kernel_sched_min_granularity. (it's not a free
> > > > > >   tunable)
> > > > > >
> > > > > > i suspect a good approach would be to double the value of
> > > > > > kernel_sched_latency and kernel_sched_nr_latency in each tuning
> > > > > > iteration, while keeping kernel_sched_min_granularity unchanged. That
> > > > > > will exercise the tuning values of the 2.6.23 kernel as well.
> > > > > I followed your idea to test 2.6.24-rc1. The improvement is slow.
> > > > > When sched_nr_latency=2560 and sched_latency_ns=64000, the performance
> > > > > is still about 15% less than 2.6.23.
> > > > I got the aim7 30% regression on my newly upgraded stoakley machine. I
> > > > found this machine is slower than the old one. Maybe the BIOS has
> > > > issues, or the memory (might not be dual-channel?) is slow. So I
> > > > retested it on the old machine and found that on the old stoakley
> > > > machine, the regression is about 6%, quite similar to the regression
> > > > on the tigerton machine.
> > > >
> > > > With sched_nr_latency=640 and sched_latency_ns=64000 on the old
> > > > stoakley machine, the regression becomes about 2%. Other latency
> > > > settings have more regression.
> > > >
> > > > On my tulsa machine, with sched_nr_latency=640 and
> > > > sched_latency_ns=64000, the regression becomes less than 1% (the
> > > > original regression is about 20%).
> > > I reran SPECjbb with sched_nr_latency=640 and sched_latency_ns=64000.
> > > On tigerton, the regression is still more than 40%. On the stoakley
> > > machine, it becomes worse (26%, originally 9%). I will do more
> > > investigation to make sure the SPECjbb regression is also caused by
> > > the bad default values.
> > >
> > > We need a smarter method to calculate the best default values for the
> > > key tuning parameters.
> > >
> > > One interesting thing is that sysbench+mysql (readonly) got the same
> > > result as 2.6.22 (no regression). Good job!
> > Do you mean you couldn't reproduce the regression which was reported
> > with 2.6.23 (http://lkml.org/lkml/2007/10/30/53) with 2.6.24-rc1?
> It looks like you missed my emails.

Yeah :(

> Firstly, I reproduced (or just found the same myself :) ) the issue with
> kernel 2.6.22, 2.6.23-rc and 2.6.23.
>
> Ingo wrote a big patch to fix it and the new patch is in 2.6.24-rc1 now.

That's nice, could you please point me to the commit?

> Then I retested it with 2.6.24-rc1 on a couple of x86_64 machines. The
> issue disappeared. You could test it with 2.6.24-rc1.

Will do!

> > It would be nice if you could provide some numbers for 2.6.22, 2.6.23
> > and 2.6.24-rc1.
> Sorry. Intel policy doesn't allow me to publish the numbers because only
> specific departments in Intel can do that. But I can talk about the
> regression percentage.

Fair enough :)

> -yanmin

greetings
Cyrus
Re: aim7 -30% regression in 2.6.24-rc1
On Thu, 2007-11-01 at 11:02 +0100, Cyrus Massoumi wrote:
> Zhang, Yanmin wrote:
> > On Wed, 2007-10-31 at 17:57 +0800, Zhang, Yanmin wrote:
> > > On Tue, 2007-10-30 at 16:36 +0800, Zhang, Yanmin wrote:
> > > > On Tue, 2007-10-30 at 08:26 +0100, Ingo Molnar wrote:
> > > > > * Zhang, Yanmin <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > sub-bisecting captured patch
> > > > > > 38ad464d410dadceda1563f36bdb0be7fe4c8938 (sched: uniform tunings)
> > > > > > caused a 20% regression of aim7.
> > > > > >
> > > > > > The last 10% should also be related to sched parameters, such as
> > > > > > sysctl_sched_min_granularity.
> > > > >
> > > > > ah, interesting. Since you have CONFIG_SCHED_DEBUG enabled, could you
> > > > > please try to figure out what the best value for
> > > > > /proc/sys/kernel/sched_latency, /proc/sys/kernel/sched_nr_latency and
> > > > > /proc/sys/kernel/sched_min_granularity is?
> > > > >
> > > > > there's a tuning constraint for kernel_sched_nr_latency:
> > > > >
> > > > > - kernel_sched_nr_latency should always be set to
> > > > >   kernel_sched_latency/kernel_sched_min_granularity. (it's not a free
> > > > >   tunable)
> > > > >
> > > > > i suspect a good approach would be to double the value of
> > > > > kernel_sched_latency and kernel_sched_nr_latency in each tuning
> > > > > iteration, while keeping kernel_sched_min_granularity unchanged. That
> > > > > will exercise the tuning values of the 2.6.23 kernel as well.
> > > > I followed your idea to test 2.6.24-rc1. The improvement is slow.
> > > > When sched_nr_latency=2560 and sched_latency_ns=64000, the performance
> > > > is still about 15% less than 2.6.23.
> > > I got the aim7 30% regression on my newly upgraded stoakley machine. I
> > > found this machine is slower than the old one. Maybe the BIOS has
> > > issues, or the memory (might not be dual-channel?) is slow. So I
> > > retested it on the old machine and found that on the old stoakley
> > > machine, the regression is about 6%, quite similar to the regression on
> > > the tigerton machine.
> > >
> > > With sched_nr_latency=640 and sched_latency_ns=64000 on the old
> > > stoakley machine, the regression becomes about 2%. Other latency
> > > settings have more regression.
> > >
> > > On my tulsa machine, with sched_nr_latency=640 and
> > > sched_latency_ns=64000, the regression becomes less than 1% (the
> > > original regression is about 20%).
> > I reran SPECjbb with sched_nr_latency=640 and sched_latency_ns=64000. On
> > tigerton, the regression is still more than 40%. On the stoakley
> > machine, it becomes worse (26%, originally 9%). I will do more
> > investigation to make sure the SPECjbb regression is also caused by the
> > bad default values.
> >
> > We need a smarter method to calculate the best default values for the
> > key tuning parameters.
> >
> > One interesting thing is that sysbench+mysql (readonly) got the same
> > result as 2.6.22 (no regression). Good job!
> Do you mean you couldn't reproduce the regression which was reported
> with 2.6.23 (http://lkml.org/lkml/2007/10/30/53) with 2.6.24-rc1?
It looks like you missed my emails.

Firstly, I reproduced (or just found the same myself :) ) the issue with
kernel 2.6.22, 2.6.23-rc and 2.6.23.

Ingo wrote a big patch to fix it and the new patch is in 2.6.24-rc1 now.
Then I retested it with 2.6.24-rc1 on a couple of x86_64 machines. The
issue disappeared. You could test it with 2.6.24-rc1.

> It would be nice if you could provide some numbers for 2.6.22, 2.6.23
> and 2.6.24-rc1.
Sorry. Intel policy doesn't allow me to publish the numbers because only
specific departments in Intel can do that. But I can talk about the
regression percentage.

-yanmin
Re: aim7 -30% regression in 2.6.24-rc1
* Peter Zijlstra <[EMAIL PROTECTED]> wrote:

> > > We don't have min_granularity anymore.
> >
> > i think we should reintroduce it in the SCHED_DEBUG case and make it
> > the main tunable item - sched_nr is a nice performance optimization
> > but quite unintuitive as a tuning knob.
>
> ok, I don't particularly care either way, could be because I wrote the
> stuff :-)

heh :-) I've applied your patch, it looks good to me.

	Ingo
Re: aim7 -30% regression in 2.6.24-rc1
(restoring CCs which I inadvertently dropped)

On Thu, 2007-11-01 at 16:00 +0100, Ingo Molnar wrote:
> * Peter Zijlstra <[EMAIL PROTECTED]> wrote:
>
> > > could we instead just make sched_nr_latency non-tunable, and
> > > recalculate it from the sysctl handler whenever sched_latency or
> > > sched_min_granularity changes? That would avoid not only the
> > > division by zero bug but also other out-of-spec tunings.
> >
> > We don't have min_granularity anymore.
>
> i think we should reintroduce it in the SCHED_DEBUG case and make it the
> main tunable item - sched_nr is a nice performance optimization but
> quite unintuitive as a tuning knob.

ok, I don't particularly care either way, could be because I wrote the
stuff :-)

Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
---
Index: linux-2.6/include/linux/sched.h
===================================================================
--- linux-2.6.orig/include/linux/sched.h
+++ linux-2.6/include/linux/sched.h
@@ -1466,12 +1466,16 @@ extern void sched_idle_next(void);

 #ifdef CONFIG_SCHED_DEBUG
 extern unsigned int sysctl_sched_latency;
-extern unsigned int sysctl_sched_nr_latency;
+extern unsigned int sysctl_sched_min_granularity;
 extern unsigned int sysctl_sched_wakeup_granularity;
 extern unsigned int sysctl_sched_batch_wakeup_granularity;
 extern unsigned int sysctl_sched_child_runs_first;
 extern unsigned int sysctl_sched_features;
 extern unsigned int sysctl_sched_migration_cost;
+
+int sched_nr_latency_handler(struct ctl_table *table, int write,
+		struct file *file, void __user *buffer, size_t *length,
+		loff_t *ppos);
 #endif

 extern unsigned int sysctl_sched_compat_yield;
Index: linux-2.6/kernel/sched_debug.c
===================================================================
--- linux-2.6.orig/kernel/sched_debug.c
+++ linux-2.6/kernel/sched_debug.c
@@ -210,7 +210,7 @@ static int sched_debug_show(struct seq_f
 #define PN(x) \
 	SEQ_printf(m, "  .%-40s: %Ld.%06ld\n", #x, SPLIT_NS(x))
 	PN(sysctl_sched_latency);
-	PN(sysctl_sched_nr_latency);
+	PN(sysctl_sched_min_granularity);
 	PN(sysctl_sched_wakeup_granularity);
 	PN(sysctl_sched_batch_wakeup_granularity);
 	PN(sysctl_sched_child_runs_first);
Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -35,16 +35,21 @@
 const_debug unsigned int sysctl_sched_latency = 20000000ULL;

 /*
- * After fork, child runs first. (default) If set to 0 then
- * parent will (try to) run first.
+ * Minimal preemption granularity for CPU-bound tasks:
+ * (default: 1 msec, units: nanoseconds)
  */
-const_debug unsigned int sysctl_sched_child_runs_first = 1;
+const_debug unsigned int sysctl_sched_min_granularity = 1000000ULL;

 /*
- * Minimal preemption granularity for CPU-bound tasks:
- * (default: 2 msec, units: nanoseconds)
+ * is kept at sysctl_sched_latency / sysctl_sched_min_granularity
+ */
+const_debug unsigned int sched_nr_latency = 20;
+
+/*
+ * After fork, child runs first. (default) If set to 0 then
+ * parent will (try to) run first.
  */
-const_debug unsigned int sysctl_sched_nr_latency = 20;
+const_debug unsigned int sysctl_sched_child_runs_first = 1;

 /*
  * sys_sched_yield() compat mode
@@ -301,6 +306,21 @@ static inline struct sched_entity *__pic
  * Scheduling class statistics methods:
  */

+#ifdef CONFIG_SCHED_DEBUG
+int sched_nr_latency_handler(struct ctl_table *table, int write,
+		struct file *filp, void __user *buffer, size_t *lenp,
+		loff_t *ppos)
+{
+	int ret = proc_dointvec_minmax(table, write, filp, buffer, lenp, ppos);
+
+	if (!ret && write) {
+		sched_nr_latency =
+			sysctl_sched_latency / sysctl_sched_min_granularity;
+	}
+
+	return ret;
+}
+#endif

 /*
  * The idea is to set a period in which each task runs once.
@@ -313,7 +333,7 @@ static inline struct sched_entity *__pic
 static u64 __sched_period(unsigned long nr_running)
 {
 	u64 period = sysctl_sched_latency;
-	unsigned long nr_latency = sysctl_sched_nr_latency;
+	unsigned long nr_latency = sched_nr_latency;

 	if (unlikely(nr_running > nr_latency)) {
 		period *= nr_running;
Index: linux-2.6/kernel/sysctl.c
===================================================================
--- linux-2.6.orig/kernel/sysctl.c
+++ linux-2.6/kernel/sysctl.c
@@ -235,11 +235,14 @@ static struct ctl_table kern_table[] = {
 #ifdef CONFIG_SCHED_DEBUG
 	{
 		.ctl_name	= CTL_UNNUMBERED,
-		.procname	= "sched_nr_latency",
-		.data		= &sysctl_sched_nr_latency,
+		.procname	= "sched_min_granularity_ns",
+		.data		= &sysctl_sched_min_granularity,
 		.maxlen		= sizeof(unsigned int),
 		.mode		= 0644,
-
Re: aim7 -30% regression in 2.6.24-rc1
Zhang, Yanmin wrote:
> On Wed, 2007-10-31 at 17:57 +0800, Zhang, Yanmin wrote:
> > On Tue, 2007-10-30 at 16:36 +0800, Zhang, Yanmin wrote:
> > > On Tue, 2007-10-30 at 08:26 +0100, Ingo Molnar wrote:
> > > > * Zhang, Yanmin <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > sub-bisecting captured patch
> > > > > 38ad464d410dadceda1563f36bdb0be7fe4c8938 (sched: uniform tunings)
> > > > > caused a 20% regression of aim7.
> > > > >
> > > > > The last 10% should also be related to sched parameters, such as
> > > > > sysctl_sched_min_granularity.
> > > >
> > > > ah, interesting. Since you have CONFIG_SCHED_DEBUG enabled, could you
> > > > please try to figure out what the best value for
> > > > /proc/sys/kernel/sched_latency, /proc/sys/kernel/sched_nr_latency and
> > > > /proc/sys/kernel/sched_min_granularity is?
> > > >
> > > > there's a tuning constraint for kernel_sched_nr_latency:
> > > >
> > > > - kernel_sched_nr_latency should always be set to
> > > >   kernel_sched_latency/kernel_sched_min_granularity. (it's not a free
> > > >   tunable)
> > > >
> > > > i suspect a good approach would be to double the value of
> > > > kernel_sched_latency and kernel_sched_nr_latency in each tuning
> > > > iteration, while keeping kernel_sched_min_granularity unchanged. That
> > > > will exercise the tuning values of the 2.6.23 kernel as well.
> > > I followed your idea to test 2.6.24-rc1. The improvement is slow.
> > > When sched_nr_latency=2560 and sched_latency_ns=64000, the performance
> > > is still about 15% less than 2.6.23.
> > I got the aim7 30% regression on my newly upgraded stoakley machine. I
> > found this machine is slower than the old one. Maybe the BIOS has
> > issues, or the memory (might not be dual-channel?) is slow. So I
> > retested it on the old machine and found that on the old stoakley
> > machine, the regression is about 6%, quite similar to the regression on
> > the tigerton machine.
> >
> > With sched_nr_latency=640 and sched_latency_ns=64000 on the old
> > stoakley machine, the regression becomes about 2%. Other latency
> > settings have more regression.
> >
> > On my tulsa machine, with sched_nr_latency=640 and
> > sched_latency_ns=64000, the regression becomes less than 1% (the
> > original regression is about 20%).
> I reran SPECjbb with sched_nr_latency=640 and sched_latency_ns=64000. On
> tigerton, the regression is still more than 40%. On the stoakley
> machine, it becomes worse (26%, originally 9%). I will do more
> investigation to make sure the SPECjbb regression is also caused by the
> bad default values.
>
> We need a smarter method to calculate the best default values for the
> key tuning parameters.
>
> One interesting thing is that sysbench+mysql (readonly) got the same
> result as 2.6.22 (no regression). Good job!

Do you mean you couldn't reproduce the regression which was reported
with 2.6.23 (http://lkml.org/lkml/2007/10/30/53) with 2.6.24-rc1? It
would be nice if you could provide some numbers for 2.6.22, 2.6.23 and
2.6.24-rc1.

> -yanmin

greetings
Cyrus
Re: aim7 -30% regression in 2.6.24-rc1
On Wed, 2007-10-31 at 17:57 +0800, Zhang, Yanmin wrote:
> On Tue, 2007-10-30 at 16:36 +0800, Zhang, Yanmin wrote:
> > On Tue, 2007-10-30 at 08:26 +0100, Ingo Molnar wrote:
> > > * Zhang, Yanmin <[EMAIL PROTECTED]> wrote:
> > >
> > > > sub-bisecting captured patch
> > > > 38ad464d410dadceda1563f36bdb0be7fe4c8938 (sched: uniform tunings)
> > > > caused a 20% regression of aim7.
> > > >
> > > > The last 10% should also be related to sched parameters, such as
> > > > sysctl_sched_min_granularity.
> > >
> > > ah, interesting. Since you have CONFIG_SCHED_DEBUG enabled, could you
> > > please try to figure out what the best value for
> > > /proc/sys/kernel/sched_latency, /proc/sys/kernel/sched_nr_latency and
> > > /proc/sys/kernel/sched_min_granularity is?
> > >
> > > there's a tuning constraint for kernel_sched_nr_latency:
> > >
> > > - kernel_sched_nr_latency should always be set to
> > >   kernel_sched_latency/kernel_sched_min_granularity. (it's not a free
> > >   tunable)
> > >
> > > i suspect a good approach would be to double the value of
> > > kernel_sched_latency and kernel_sched_nr_latency in each tuning
> > > iteration, while keeping kernel_sched_min_granularity unchanged. That
> > > will exercise the tuning values of the 2.6.23 kernel as well.
> > I followed your idea to test 2.6.24-rc1. The improvement is slow.
> > When sched_nr_latency=2560 and sched_latency_ns=64000, the performance
> > is still about 15% less than 2.6.23.
> I got the aim7 30% regression on my newly upgraded stoakley machine. I
> found this machine is slower than the old one. Maybe the BIOS has issues,
> or the memory (might not be dual-channel?) is slow. So I retested it on
> the old machine and found that on the old stoakley machine, the
> regression is about 6%, quite similar to the regression on the tigerton
> machine.
>
> With sched_nr_latency=640 and sched_latency_ns=64000 on the old stoakley
> machine, the regression becomes about 2%. Other latency settings have
> more regression.
>
> On my tulsa machine, with sched_nr_latency=640 and sched_latency_ns=64000,
> the regression becomes less than 1% (the original regression is about 20%).

I reran SPECjbb with sched_nr_latency=640 and sched_latency_ns=64000. On
tigerton, the regression is still more than 40%. On the stoakley machine,
it becomes worse (26%, originally 9%). I will do more investigation to
make sure the SPECjbb regression is also caused by the bad default values.

We need a smarter method to calculate the best default values for the key
tuning parameters.

One interesting thing is that sysbench+mysql (readonly) got the same
result as 2.6.22 (no regression). Good job!

-yanmin
Re: aim7 -30% regression in 2.6.24-rc1
* Peter Zijlstra <[EMAIL PROTECTED]> wrote:

>  static int one_hundred = 100;
> +static int int_max = INT_MAX;
>
>  /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
>  static int maxolduid = 65535;
> @@ -239,7 +240,10 @@ static struct ctl_table kern_table[] = {
>  		.data		= &sysctl_sched_nr_latency,
>  		.maxlen		= sizeof(unsigned int),
>  		.mode		= 0644,
> -		.proc_handler	= &proc_dointvec,
> +		.proc_handler	= &proc_dointvec_minmax,
> +		.strategy	= &sysctl_intvec,
> +		.extra1		= &one,
> +		.extra2		= &int_max,

could we instead just make sched_nr_latency non-tunable, and recalculate it from the sysctl handler whenever sched_latency or sched_min_granularity changes? That would avoid not only the division-by-zero bug but also other out-of-spec tunings.

	Ingo
Re: aim7 -30% regression in 2.6.24-rc1
Zhang, Yanmin wrote: On Wed, 2007-10-31 at 17:57 +0800, Zhang, Yanmin wrote: On Tue, 2007-10-30 at 16:36 +0800, Zhang, Yanmin wrote: On Tue, 2007-10-30 at 08:26 +0100, Ingo Molnar wrote: * Zhang, Yanmin [EMAIL PROTECTED] wrote: Sub-bisecting captured patch 38ad464d410dadceda1563f36bdb0be7fe4c8938 (sched: uniform tunings), which caused a 20% regression in aim7. The remaining 10% should also be related to sched parameters, such as sysctl_sched_min_granularity. ah, interesting. Since you have CONFIG_SCHED_DEBUG enabled, could you please try to figure out what the best values for /proc/sys/kernel_sched_latency, /proc/sys/kernel_sched_nr_latency and /proc/sys/kernel_sched_min_granularity are? there's a tuning constraint for kernel_sched_nr_latency: - kernel_sched_nr_latency should always be set to kernel_sched_latency/kernel_sched_min_granularity. (it's not a free tunable) i suspect a good approach would be to double the value of kernel_sched_latency and kernel_sched_nr_latency in each tuning iteration, while keeping kernel_sched_min_granularity unchanged. That will exercise the tuning values of the 2.6.23 kernel as well. I followed your idea to test 2.6.24-rc1. The improvement is slow. When sched_nr_latency=2560 and sched_latency_ns=64000, the performance is still about 15% less than 2.6.23. I got the aim7 30% regression on my newly upgraded stoakley machine. I found this machine is slower than the old one. Maybe the BIOS has issues, or the memory (might not be dual-channel?) is slow. So I retested on the old machine and found that on the old stoakley machine the regression is about 6%, quite similar to the regression on the tigerton machine. With sched_nr_latency=640 and sched_latency_ns=64000 on the old stoakley machine, the regression becomes about 2%. Other latency values show more regression. On my tulsa machine, with sched_nr_latency=640 and sched_latency_ns=64000, the regression becomes less than 1% (the original regression is about 20%). I reran SPECjbb with sched_nr_latency=640 and sched_latency_ns=64000. On tigerton, the regression is still more than 40%. On the stoakley machine, it becomes worse (26%, original is 9%). I will do more investigation to make sure the SPECjbb regression is also caused by the bad default values. We need a smarter method to calculate the best default values for the key tuning parameters. One interesting result: sysbench+mysql (readonly) got the same result as 2.6.22 (no regression). Good job! -yanmin Do you mean you couldn't reproduce the regression which was reported with 2.6.23 (http://lkml.org/lkml/2007/10/30/53) with 2.6.24-rc1? It would be nice if you could provide some numbers for 2.6.22, 2.6.23 and 2.6.24-rc1. greetings Cyrus
Re: aim7 -30% regression in 2.6.24-rc1
(restoring CCs which I inadvertently dropped) On Thu, 2007-11-01 at 16:00 +0100, Ingo Molnar wrote: * Peter Zijlstra [EMAIL PROTECTED] wrote: could we instead just make sched_nr_latency non-tunable, and recalculate it from the sysctl handler whenever sched_latency or sched_min_granularity changes? That would avoid not only the division-by-zero bug but also other out-of-spec tunings. We don't have min_granularity anymore. i think we should reintroduce it in the SCHED_DEBUG case and make it the main tunable item - sched_nr is a nice performance optimization but quite unintuitive as a tuning knob. ok, I don't particularly care either way, could be because I wrote the stuff :-)

Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
---
Index: linux-2.6/include/linux/sched.h
===================================================================
--- linux-2.6.orig/include/linux/sched.h
+++ linux-2.6/include/linux/sched.h
@@ -1466,12 +1466,16 @@ extern void sched_idle_next(void);
 #ifdef CONFIG_SCHED_DEBUG
 extern unsigned int sysctl_sched_latency;
-extern unsigned int sysctl_sched_nr_latency;
+extern unsigned int sysctl_sched_min_granularity;
 extern unsigned int sysctl_sched_wakeup_granularity;
 extern unsigned int sysctl_sched_batch_wakeup_granularity;
 extern unsigned int sysctl_sched_child_runs_first;
 extern unsigned int sysctl_sched_features;
 extern unsigned int sysctl_sched_migration_cost;
+
+int sched_nr_latency_handler(struct ctl_table *table, int write,
+		struct file *file, void __user *buffer, size_t *length,
+		loff_t *ppos);
 #endif
 
 extern unsigned int sysctl_sched_compat_yield;
Index: linux-2.6/kernel/sched_debug.c
===================================================================
--- linux-2.6.orig/kernel/sched_debug.c
+++ linux-2.6/kernel/sched_debug.c
@@ -210,7 +210,7 @@ static int sched_debug_show(struct seq_f
 #define PN(x) \
 	SEQ_printf(m, "  .%-40s: %Ld.%06ld\n", #x, SPLIT_NS(x))
 	PN(sysctl_sched_latency);
-	PN(sysctl_sched_nr_latency);
+	PN(sysctl_sched_min_granularity);
 	PN(sysctl_sched_wakeup_granularity);
 	PN(sysctl_sched_batch_wakeup_granularity);
 	PN(sysctl_sched_child_runs_first);
Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -35,16 +35,21 @@
 const_debug unsigned int sysctl_sched_latency = 20000000ULL;
 
 /*
- * After fork, child runs first. (default) If set to 0 then
- * parent will (try to) run first.
+ * Minimal preemption granularity for CPU-bound tasks:
+ * (default: 1 msec, units: nanoseconds)
  */
-const_debug unsigned int sysctl_sched_child_runs_first = 1;
+const_debug unsigned int sysctl_sched_min_granularity = 1000000ULL;
 
 /*
- * Minimal preemption granularity for CPU-bound tasks:
- * (default: 2 msec, units: nanoseconds)
+ * is kept at sysctl_sched_latency / sysctl_sched_min_granularity
+ */
+const_debug unsigned int sched_nr_latency = 20;
+
+/*
+ * After fork, child runs first. (default) If set to 0 then
+ * parent will (try to) run first.
  */
-const_debug unsigned int sysctl_sched_nr_latency = 20;
+const_debug unsigned int sysctl_sched_child_runs_first = 1;
 
 /*
  * sys_sched_yield() compat mode
@@ -301,6 +306,21 @@ static inline struct sched_entity *__pic
  * Scheduling class statistics methods:
  */
 
+#ifdef CONFIG_SCHED_DEBUG
+int sched_nr_latency_handler(struct ctl_table *table, int write,
+		struct file *filp, void __user *buffer, size_t *lenp,
+		loff_t *ppos)
+{
+	int ret = proc_dointvec_minmax(table, write, filp, buffer, lenp, ppos);
+
+	if (!ret && write) {
+		sched_nr_latency =
+			sysctl_sched_latency / sysctl_sched_min_granularity;
+	}
+
+	return ret;
+}
+#endif
 
 /*
  * The idea is to set a period in which each task runs once.
@@ -313,7 +333,7 @@ static inline struct sched_entity *__pic
 static u64 __sched_period(unsigned long nr_running)
 {
 	u64 period = sysctl_sched_latency;
-	unsigned long nr_latency = sysctl_sched_nr_latency;
+	unsigned long nr_latency = sched_nr_latency;
 
 	if (unlikely(nr_running > nr_latency)) {
 		period *= nr_running;
Index: linux-2.6/kernel/sysctl.c
===================================================================
--- linux-2.6.orig/kernel/sysctl.c
+++ linux-2.6/kernel/sysctl.c
@@ -235,11 +235,14 @@ static struct ctl_table kern_table[] = {
 #ifdef CONFIG_SCHED_DEBUG
 	{
 		.ctl_name	= CTL_UNNUMBERED,
-		.procname	= "sched_nr_latency",
-		.data		= &sysctl_sched_nr_latency,
+		.procname	= "sched_min_granularity_ns",
+		.data		= &sysctl_sched_min_granularity,
 		.maxlen		= sizeof(unsigned int),
 		.mode		= 0644,
-
Re: aim7 -30% regression in 2.6.24-rc1
* Peter Zijlstra <[EMAIL PROTECTED]> wrote: We don't have min_granularity anymore. i think we should reintroduce it in the SCHED_DEBUG case and make it the main tunable item - sched_nr is a nice performance optimization but quite unintuitive as a tuning knob. ok, I don't particularly care either way, could be because I wrote the stuff :-) heh :-) I've applied your patch, it looks good to me. Ingo
Re: aim7 -30% regression in 2.6.24-rc1
On Wed, 2007-10-31 at 17:57 +0800, Zhang, Yanmin wrote: > On Tue, 2007-10-30 at 16:36 +0800, Zhang, Yanmin wrote: > > On Tue, 2007-10-30 at 08:26 +0100, Ingo Molnar wrote: > > > * Zhang, Yanmin <[EMAIL PROTECTED]> wrote: > > > > > > > Sub-bisecting captured patch > > > > 38ad464d410dadceda1563f36bdb0be7fe4c8938 (sched: uniform tunings), > > > > which caused a 20% regression in aim7. > > > > > > > > The remaining 10% should also be related to sched parameters, such as > > > > sysctl_sched_min_granularity. > > > > > > ah, interesting. Since you have CONFIG_SCHED_DEBUG enabled, could you > > > please try to figure out what the best values for > > > /proc/sys/kernel_sched_latency, /proc/sys/kernel_sched_nr_latency and > > > /proc/sys/kernel_sched_min_granularity are? > > > > > > there's a tuning constraint for kernel_sched_nr_latency: > > > > > > - kernel_sched_nr_latency should always be set to > > > kernel_sched_latency/kernel_sched_min_granularity. (it's not a free > > > tunable) > > > > > > i suspect a good approach would be to double the value of > > > kernel_sched_latency and kernel_sched_nr_latency in each tuning > > > iteration, while keeping kernel_sched_min_granularity unchanged. That > > > will exercise the tuning values of the 2.6.23 kernel as well. > > I followed your idea to test 2.6.24-rc1. The improvement is slow. > > When sched_nr_latency=2560 and sched_latency_ns=64000, the performance > > is still about 15% less than 2.6.23. > > I got the aim7 30% regression on my newly upgraded stoakley machine. I found > this machine is slower than the old one. Maybe the BIOS has issues, or the > memory (might not be dual-channel?) is slow. So I retested on the old machine > and found that on the old stoakley machine the regression is about 6%, quite > similar to the regression on the tigerton machine. > > With sched_nr_latency=640 and sched_latency_ns=64000 on the old stoakley > machine, the regression becomes about 2%. Other latency values show more > regression. 
> > On my tulsa machine, with sched_nr_latency=640 and sched_latency_ns=64000, > the regression becomes less than 1% (the original regression is about 20%). > > When I ran a bad script to change the values of sched_nr_latency and > sched_latency_ns, I hit an OOPS on my tulsa machine. Below is the log. It > looks like sched_nr_latency becomes 0.

Oops, yeah I think I overlooked that case :-/

I think limiting the sysctl parameters makes the most sense, as a 0 value really doesn't.

Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
---
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 3b4efbe..0f34c91 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -94,6 +94,7 @@ static int two = 2;
 static int zero;
 static int one_hundred = 100;
+static int int_max = INT_MAX;
 
 /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
 static int maxolduid = 65535;
@@ -239,7 +240,10 @@ static struct ctl_table kern_table[] = {
 		.data		= &sysctl_sched_nr_latency,
 		.maxlen		= sizeof(unsigned int),
 		.mode		= 0644,
-		.proc_handler	= &proc_dointvec,
+		.proc_handler	= &proc_dointvec_minmax,
+		.strategy	= &sysctl_intvec,
+		.extra1		= &one,
+		.extra2		= &int_max,
 	},
 	{
 		.ctl_name	= CTL_UNNUMBERED,
Re: aim7 -30% regression in 2.6.24-rc1
On Tue, 2007-10-30 at 16:36 +0800, Zhang, Yanmin wrote: > On Tue, 2007-10-30 at 08:26 +0100, Ingo Molnar wrote: > > * Zhang, Yanmin <[EMAIL PROTECTED]> wrote: > > > > > > Sub-bisecting captured patch > > > 38ad464d410dadceda1563f36bdb0be7fe4c8938 (sched: uniform tunings), > > > which caused a 20% regression in aim7. > > > > > > The remaining 10% should also be related to sched parameters, such as > > > sysctl_sched_min_granularity. > > > > ah, interesting. Since you have CONFIG_SCHED_DEBUG enabled, could you > > please try to figure out what the best values for > > /proc/sys/kernel_sched_latency, /proc/sys/kernel_sched_nr_latency and > > /proc/sys/kernel_sched_min_granularity are? > > > > there's a tuning constraint for kernel_sched_nr_latency: > > > > - kernel_sched_nr_latency should always be set to > > kernel_sched_latency/kernel_sched_min_granularity. (it's not a free > > tunable) > > > > i suspect a good approach would be to double the value of > > kernel_sched_latency and kernel_sched_nr_latency in each tuning > > iteration, while keeping kernel_sched_min_granularity unchanged. That > > will exercise the tuning values of the 2.6.23 kernel as well. > I followed your idea to test 2.6.24-rc1. The improvement is slow. > When sched_nr_latency=2560 and sched_latency_ns=64000, the performance > is still about 15% less than 2.6.23. I got the aim7 30% regression on my newly upgraded stoakley machine. I found this machine is slower than the old one. Maybe the BIOS has issues, or the memory (might not be dual-channel?) is slow. So I retested on the old machine and found that on the old stoakley machine the regression is about 6%, quite similar to the regression on the tigerton machine. With sched_nr_latency=640 and sched_latency_ns=64000 on the old stoakley machine, the regression becomes about 2%. Other latency values show more regression. On my tulsa machine, with sched_nr_latency=640 and sched_latency_ns=64000, the regression becomes less than 1% (the original regression is about 20%). 
When I ran a bad script to change the values of sched_nr_latency and sched_latency_ns, I hit an OOPS on my tulsa machine. Below is the log. It looks like sched_nr_latency becomes 0.

***Log
divide error: [1] SMP
CPU 1
Modules linked in: megaraid_mbox megaraid_mm
Pid: 7326, comm: sh Not tainted 2.6.24-rc1 #2
RIP: 0010:[8022c2bf] [8022c2bf] __sched_period+0x22/0x2e
RSP: 0018:810105909e38 EFLAGS: 00010046
RAX: 5a00 RBX: RCX: 2d00 RDX: RSI: 0002 RDI: 0002
RBP: 810105909e40 R08: 810103bfed50 R09: R10: 0038 R11: 0296
R12: 810100d6db40 R13: 8101058c4148 R14: 0001 R15: 810104c34088
FS: 2b851bc59f50() GS:810100cb1b40() knlGS:
CS: 0010 DS: ES: CR0: 8005003b
CR2: 006c64d8 CR3: 00010752c000 CR4: 06e0
DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400
Process sh (pid: 7326, threadinfo 810105908000, task 810104c34040)
Stack: 0800 810105909e58 8022c2db 079d292b 810105909e88 8022c36e
 810100d6db40 8101058c4148 8101058c4100 0001 810105909ec8 80232d0a
Call Trace:
 [8022c2db] __sched_vslice+0x10/0x1d
 [8022c36e] place_entity+0x86/0xc3
 [80232d0a] task_new_fair+0x48/0xa5
 [8020b63e] system_call+0x7e/0x83
 [80233325] wake_up_new_task+0x70/0xa4
 [80235612] do_fork+0x137/0x204
 [802818bd] vfs_write+0x121/0x136
 [8023f017] recalc_sigpending+0xe/0x25
 [8023f0ef] sigprocmask+0x9e/0xc0
 [8020b957] ptregscall_common+0x67/0xb0
Code: 48 f7 f3 48 89 c1 5b c9 48 89 c8 c3 55 48 89 e5 53 48 89 fb
RIP [8022c2bf] __sched_period+0x22/0x2e
 RSP 810105909e38

divide error: [2] SMP
CPU 0
Modules linked in: megaraid_mbox megaraid_mm
Pid: 3674, comm: automount Tainted: G D 2.6.24-rc1 #2
RIP: 0010:[8022c2bf] [8022c2bf] __sched_period+0x22/0x2e
RSP: 0018:81010690de38 EFLAGS: 00010046
RAX: 5a00 RBX: RCX: 2d00 RDX: RSI: 0002 RDI: 0002
RBP: 81010690de40 R08: 81010690c000 R09: R10: 0038 R11: 810104007040
R12: 810001033880 R13: 810100f2a828 R14: 0001 R15: 810104007088
FS: 40021950(0063) GS:8074e000() knlGS:
CS: 0010 DS: ES: CR0: 8005003b
CR2: 2b6cc4245000 CR3: 000105972000 CR4: 06e0
DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400
Process automount (pid: 3674, threadinfo 81010690c000, task 810104007040)
Stack: 0800 81010690de58 8022c2db 00057aef240d 81010690de88 8022c36e
 810001033880 810100f2a828 810100f2a7e0 81010690dec8 80232d0a
Re: aim7 -30% regression in 2.6.24-rc1
On Tue, 2007-10-30 at 08:26 +0100, Ingo Molnar wrote: > * Zhang, Yanmin <[EMAIL PROTECTED]> wrote: > > > Sub-bisecting captured patch > > 38ad464d410dadceda1563f36bdb0be7fe4c8938 (sched: uniform tunings), > > which caused a 20% regression in aim7. > > > > The remaining 10% should also be related to sched parameters, such as > > sysctl_sched_min_granularity. > > ah, interesting. Since you have CONFIG_SCHED_DEBUG enabled, could you > please try to figure out what the best values for > /proc/sys/kernel_sched_latency, /proc/sys/kernel_sched_nr_latency and > /proc/sys/kernel_sched_min_granularity are? > > there's a tuning constraint for kernel_sched_nr_latency: > > - kernel_sched_nr_latency should always be set to > kernel_sched_latency/kernel_sched_min_granularity. (it's not a free > tunable) > > i suspect a good approach would be to double the value of > kernel_sched_latency and kernel_sched_nr_latency in each tuning > iteration, while keeping kernel_sched_min_granularity unchanged. That > will exercise the tuning values of the 2.6.23 kernel as well. I followed your idea to test 2.6.24-rc1. The improvement is slow. When sched_nr_latency=2560 and sched_latency_ns=64000, the performance is still about 15% less than 2.6.23. -yanmin
Re: aim7 -30% regression in 2.6.24-rc1
* Zhang, Yanmin <[EMAIL PROTECTED]> wrote: > Sub-bisecting captured patch > 38ad464d410dadceda1563f36bdb0be7fe4c8938 (sched: uniform tunings), > which caused a 20% regression in aim7. > > The remaining 10% should also be related to sched parameters, such as > sysctl_sched_min_granularity. ah, interesting. Since you have CONFIG_SCHED_DEBUG enabled, could you please try to figure out what the best values for /proc/sys/kernel_sched_latency, /proc/sys/kernel_sched_nr_latency and /proc/sys/kernel_sched_min_granularity are? there's a tuning constraint for kernel_sched_nr_latency: - kernel_sched_nr_latency should always be set to kernel_sched_latency/kernel_sched_min_granularity. (it's not a free tunable) i suspect a good approach would be to double the value of kernel_sched_latency and kernel_sched_nr_latency in each tuning iteration, while keeping kernel_sched_min_granularity unchanged. That will exercise the tuning values of the 2.6.23 kernel as well. Ingo
Re: aim7 -30% regression in 2.6.24-rc1
On Mon, 2007-10-29 at 17:37 +0800, Zhang, Yanmin wrote:
> On Mon, 2007-10-29 at 10:22 +0800, Zhang, Yanmin wrote:
> > On Fri, 2007-10-26 at 13:23 +0200, Ingo Molnar wrote:
> > > * Zhang, Yanmin <[EMAIL PROTECTED]> wrote:
> > > >
> > > > I tested 2.6.24-rc1 on my x86_64 machine, which has 2 quad-core
> > > > processors.
> > > >
> > > > Compared with 2.6.23, aim7 has about a -30% regression. I did a bisect
> > > > and found patch
> > > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b5869ce7f68b233ceb81465a7644be0d9a5f3dbb
> > > > caused the issue.
> > >
> > > weird, that's a commit diff - i.e. it changes no code.
> > I got the tag from git log. As for the above link, I just prepended the
> > http address so readers could check the patch by clicking.
> >
> > > > kbuild/SPECjbb2000/SPECjbb2005 also have big regressions. On another of
> > > > my machines, a tigerton (4 quad-core processors), SPECjbb2005 has more
> > > > than a -40% regression. I didn't do a bisect of those benchmarks, but I
> > > > suspect the root cause is the same as aim7's.
> > >
> > > these two commits might be relevant:
> > >
> > > 7a6c6bcee029a978f866511d6e41dbc7301fde4c
> > I did a quick test. This patch has no impact.
> >
> > > 95dbb421d12fdd9796ed153853daf3679809274f
> > The big patch above doesn't include this one, which means that if I do
> > 'git checkout b5869ce7f68b233ceb81465a7644be0d9a5f3dbb', the kernel doesn't
> > include 95dbb421d12fdd9796ed153853daf3679809274f.
> >
> > > but a bisection result would be the best info.
> > I will do a bisect between 2.6.23 and tag
> > 9c63d9c021f375a2708ad79043d6f4dd1291a085.
> I ran git bisect with the kernel version as the tag; git seemed to behave
> erratically that way. So I checked the ChangeLog, used the commit id in
> place of the kernel version, and retested.
>
> It looks like at least 2 patches were responsible for the regression. I'm
> doing a sub-bisect now.
> I could find the aim7 regression on all my test machines, although the
> regression percentage differs:
>
> Machine                                 regression
> 8-core stoakley                         30%
> 16-core tigerton                        6%
> tulsa (dual-core+HT, 16 logical cpus)   20%

sub-bisecting captured patch
38ad464d410dadceda1563f36bdb0be7fe4c8938 (sched: uniform tunings),
which caused a 20% regression of aim7.

The last 10% should also be related to sched parameters, such as
sysctl_sched_min_granularity.

-yanmin
Re: aim7 -30% regression in 2.6.24-rc1
On Mon, 2007-10-29 at 10:22 +0800, Zhang, Yanmin wrote:
> On Fri, 2007-10-26 at 13:23 +0200, Ingo Molnar wrote:
> > * Zhang, Yanmin <[EMAIL PROTECTED]> wrote:
> > >
> > > I tested 2.6.24-rc1 on my x86_64 machine, which has 2 quad-core
> > > processors.
> > >
> > > Compared with 2.6.23, aim7 has about a -30% regression. I did a bisect
> > > and found patch
> > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b5869ce7f68b233ceb81465a7644be0d9a5f3dbb
> > > caused the issue.
> >
> > weird, that's a commit diff - i.e. it changes no code.
> I got the tag from git log. As for the above link, I just prepended the
> http address so readers could check the patch by clicking.
>
> > > kbuild/SPECjbb2000/SPECjbb2005 also have big regressions. On another of
> > > my machines, a tigerton (4 quad-core processors), SPECjbb2005 has more
> > > than a -40% regression. I didn't do a bisect of those benchmarks, but I
> > > suspect the root cause is the same as aim7's.
> >
> > these two commits might be relevant:
> >
> > 7a6c6bcee029a978f866511d6e41dbc7301fde4c
> I did a quick test. This patch has no impact.
>
> > 95dbb421d12fdd9796ed153853daf3679809274f
> The big patch above doesn't include this one, which means that if I do
> 'git checkout b5869ce7f68b233ceb81465a7644be0d9a5f3dbb', the kernel doesn't
> include 95dbb421d12fdd9796ed153853daf3679809274f.
>
> > but a bisection result would be the best info.
> I will do a bisect between 2.6.23 and tag
> 9c63d9c021f375a2708ad79043d6f4dd1291a085.

I ran git bisect with the kernel version as the tag; git seemed to behave
erratically that way. So I checked the ChangeLog, used the commit id in
place of the kernel version, and retested.

It looks like at least 2 patches were responsible for the regression. I'm
doing a sub-bisect now.

I could find the aim7 regression on all my test machines, although the
regression percentage differs:
Machine                                 regression
8-core stoakley                         30%
16-core tigerton                        6%
tulsa (dual-core+HT, 16 logical cpus)   20%

-yanmin
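The planned bisection between 2.6.23 and the suspect tip commit would look roughly like the following. This is a dry-run sketch: each command is echoed rather than executed (swap `echo` for real execution on a test box), and the per-step build/boot/benchmark work is only described in comments.

```shell
#!/bin/sh
# Dry-run sketch of bisecting between 2.6.23 and the suspect range.
# Explicit commit ids are used as endpoints rather than version names.
GOOD=v2.6.23
BAD=9c63d9c021f375a2708ad79043d6f4dd1291a085
run() { echo "+ $*"; }   # prints each step; drop to actually execute

run git bisect start
run git bisect bad "$BAD"
run git bisect good "$GOOD"
# At each step git checks out a midpoint commit: build the kernel,
# boot it, run aim7, then mark the commit with 'git bisect good' or
# 'git bisect bad' according to throughput, until the first bad
# commit is isolated.
```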
Re: aim7 -30% regression in 2.6.24-rc1
On Fri, 2007-10-26 at 13:23 +0200, Ingo Molnar wrote:
> * Zhang, Yanmin <[EMAIL PROTECTED]> wrote:
> >
> > I tested 2.6.24-rc1 on my x86_64 machine, which has 2 quad-core processors.
> >
> > Compared with 2.6.23, aim7 has about a -30% regression. I did a bisect
> > and found patch
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b5869ce7f68b233ceb81465a7644be0d9a5f3dbb
> > caused the issue.
>
> weird, that's a commit diff - i.e. it changes no code.
I got the tag from git log. As for the above link, I just prepended the http
address so readers could check the patch by clicking.

> > kbuild/SPECjbb2000/SPECjbb2005 also have big regressions. On another of my
> > machines, a tigerton (4 quad-core processors), SPECjbb2005 has more than
> > a -40% regression. I didn't do a bisect of those benchmarks, but I
> > suspect the root cause is the same as aim7's.
>
> these two commits might be relevant:
>
> 7a6c6bcee029a978f866511d6e41dbc7301fde4c
I did a quick test. This patch has no impact.

> 95dbb421d12fdd9796ed153853daf3679809274f
The big patch above doesn't include this one, which means that if I do
'git checkout b5869ce7f68b233ceb81465a7644be0d9a5f3dbb', the kernel doesn't
include 95dbb421d12fdd9796ed153853daf3679809274f.

> but a bisection result would be the best info.
I will do a bisect between 2.6.23 and tag
9c63d9c021f375a2708ad79043d6f4dd1291a085.

-yanmin
Re: aim7 -30% regression in 2.6.24-rc1
On Fri, 2007-10-26 at 11:53 +0200, Peter Zijlstra wrote:
> On Fri, 2007-10-26 at 17:43 +0800, Zhang, Yanmin wrote:
> > I tested 2.6.24-rc1 on my x86_64 machine, which has 2 quad-core
> > processors.
> >
> > Compared with 2.6.23, aim7 has about a -30% regression. I did a bisect and
> > found patch
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b5869ce7f68b233ceb81465a7644be0d9a5f3dbb
> > caused the issue.
>
> Bit weird that you point to a merge commit, and not an actual patch. Are
> you sure git bisect pointed at this one?

When I did the bisect, the kernel couldn't boot, and my testing log showed it
was at b5869ce7f68b233ceb81465a7644be0d9a5f3dbb. So I did a manual checkout:

#git clone ...
#git pull ...
#git checkout b5869ce7f68b233ceb81465a7644be0d9a5f3dbb

Then I compiled the kernel and tested it. Then I reverted the above patch and
recompiled/retested. If I run git log, I can see this commit in the list.

-yanmin
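The checkout-then-revert verification described above can be sketched end to end. This is a dry-run sketch: each command is echoed rather than executed, and the build target, `-j8` parallelism, and `run-aim7.sh` harness are placeholder assumptions, not commands from the thread.

```shell
#!/bin/sh
# Dry-run sketch of verifying a suspect merge: build and benchmark at
# the merge, then revert it and benchmark again.
SUSPECT=b5869ce7f68b233ceb81465a7644be0d9a5f3dbb
run() { echo "+ $*"; }   # prints each step; drop to actually execute

run git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
run cd linux-2.6
run git checkout "$SUSPECT"
run make -j8 bzImage modules        # build and boot the suspect kernel
run ./run-aim7.sh suspect.log       # hypothetical benchmark harness
# The suspect is a merge commit, so reverting it must name which parent
# to revert toward (-m 1 picks the mainline parent).
run git revert -m 1 "$SUSPECT"
run make -j8 bzImage modules
run ./run-aim7.sh reverted.log
```

Comparing the two aim7 results then shows whether the merge alone accounts for the regression.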
Re: aim7 -30% regression in 2.6.24-rc1
* Zhang, Yanmin <[EMAIL PROTECTED]> wrote:

> I tested 2.6.24-rc1 on my x86_64 machine, which has 2 quad-core processors.
>
> Compared with 2.6.23, aim7 has about a -30% regression. I did a bisect
> and found patch
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b5869ce7f68b233ceb81465a7644be0d9a5f3dbb
> caused the issue.

weird, that's a commit diff - i.e. it changes no code.

> kbuild/SPECjbb2000/SPECjbb2005 also have big regressions. On another of my
> machines, a tigerton (4 quad-core processors), SPECjbb2005 has more than
> a -40% regression. I didn't do a bisect of those benchmarks, but I
> suspect the root cause is the same as aim7's.

these two commits might be relevant:

7a6c6bcee029a978f866511d6e41dbc7301fde4c
95dbb421d12fdd9796ed153853daf3679809274f

but a bisection result would be the best info.

	Ingo
Re: aim7 -30% regression in 2.6.24-rc1
On Fri, 2007-10-26 at 17:43 +0800, Zhang, Yanmin wrote:
> I tested 2.6.24-rc1 on my x86_64 machine, which has 2 quad-core processors.
>
> Compared with 2.6.23, aim7 has about a -30% regression. I did a bisect and
> found patch
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b5869ce7f68b233ceb81465a7644be0d9a5f3dbb
> caused the issue.

Bit weird that you point to a merge commit, and not an actual patch. Are
you sure git bisect pointed at this one?