Re: oprofile broken in 2.6.21 SMP (was Re: Remove constructor from buffer_head)
On Sun, 13 May 2007 16:38:16 -0400 Benjamin LaHaise <[EMAIL PROTECTED]> wrote:

> On Sat, May 05, 2007 at 11:31:20AM +0200, Andi Kleen wrote:
> > Hmm, after a opcontrol --reset i see the same issue now. Don't know what's
> > wrong, but it must be something different from the .20 perfctr allocation
> > problem.
> >
> > It looks like the daemon doesn't get any data from the kernel
>
> I finally had time to track this down.  The breakage is caused by "[PATCH]
> x86-64: Let oprofile reserve MSR on all CPUs".  Oprofile is already calling
> the reserve functions on each CPU in the system when it sets up the MSRs.
> This results in oprofile getting a reservation failure on CPUs above 0.  The
> following makes oprofile adapt to the API change for now -- oprofile
> still needs to be modified to perform the reservations earlier during its
> initialization, but that's a little bit more involved than the immediate
> bug fix.  This only affects systems with more than 1 CPU.  This patch has
> been through limited testing (Athlon 64 X2 and Core 2, but not on the P4) on
> x86 and x86-64 (Core 2 only).

Unfortunately we've left this a bit too late - your patch is patching code
which isn't there any more in mainline and we also need a 2.6.21.x fix.

So perhaps we could merge your "immediate bugfix" into -stable and implement
the "more involved" fix for 2.6.22.

Andi, any preferences?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
oprofile broken in 2.6.21 SMP (was Re: Remove constructor from buffer_head)
On Sat, May 05, 2007 at 11:31:20AM +0200, Andi Kleen wrote:
> Hmm, after a opcontrol --reset i see the same issue now. Don't know what's
> wrong, but it must be something different from the .20 perfctr allocation
> problem.
>
> It looks like the daemon doesn't get any data from the kernel

I finally had time to track this down.  The breakage is caused by "[PATCH]
x86-64: Let oprofile reserve MSR on all CPUs".  Oprofile is already calling
the reserve functions on each CPU in the system when it sets up the MSRs.
This results in oprofile getting a reservation failure on CPUs above 0.  The
following makes oprofile adapt to the API change for now -- oprofile
still needs to be modified to perform the reservations earlier during its
initialization, but that's a little bit more involved than the immediate
bug fix.  This only affects systems with more than 1 CPU.  This patch has
been through limited testing (Athlon 64 X2 and Core 2, but not on the P4) on
x86 and x86-64 (Core 2 only).

		-ben

Signed-off-by: Benjamin LaHaise <[EMAIL PROTECTED]>
diff --git a/arch/i386/kernel/nmi.c b/arch/i386/kernel/nmi.c
index 84c3497..21fc74d 100644
--- a/arch/i386/kernel/nmi.c
+++ b/arch/i386/kernel/nmi.c
@@ -148,7 +148,7 @@ int avail_to_resrv_perfctr_nmi(unsigned int msr)
 	return 1;
 }

-static int __reserve_perfctr_nmi(int cpu, unsigned int msr)
+int __reserve_perfctr_nmi(int cpu, unsigned int msr)
 {
 	unsigned int counter;
 	if (cpu < 0)
@@ -162,7 +162,7 @@ static int __reserve_perfctr_nmi(int cpu, unsigned int msr)
 	return 0;
 }

-static void __release_perfctr_nmi(int cpu, unsigned int msr)
+void __release_perfctr_nmi(int cpu, unsigned int msr)
 {
 	unsigned int counter;
 	if (cpu < 0)
@@ -212,7 +212,7 @@ int __reserve_evntsel_nmi(int cpu, unsigned int msr)
 	return 0;
 }

-static void __release_evntsel_nmi(int cpu, unsigned int msr)
+void __release_evntsel_nmi(int cpu, unsigned int msr)
 {
 	unsigned int counter;
 	if (cpu < 0)
@@ -1188,5 +1188,9 @@ EXPORT_SYMBOL(reserve_perfctr_nmi);
 EXPORT_SYMBOL(release_perfctr_nmi);
 EXPORT_SYMBOL(reserve_evntsel_nmi);
 EXPORT_SYMBOL(release_evntsel_nmi);
+EXPORT_SYMBOL(__reserve_perfctr_nmi);
+EXPORT_SYMBOL(__release_perfctr_nmi);
+EXPORT_SYMBOL(__reserve_evntsel_nmi);
+EXPORT_SYMBOL(__release_evntsel_nmi);
 EXPORT_SYMBOL(disable_timer_nmi_watchdog);
 EXPORT_SYMBOL(enable_timer_nmi_watchdog);
diff --git a/arch/i386/oprofile/op_model_athlon.c b/arch/i386/oprofile/op_model_athlon.c
index 3057a19..738a579 100644
--- a/arch/i386/oprofile/op_model_athlon.c
+++ b/arch/i386/oprofile/op_model_athlon.c
@@ -45,14 +45,14 @@ static void athlon_fill_in_addresses(struct op_msrs * const msrs)
 	int i;

 	for (i=0; i < NUM_COUNTERS; i++) {
-		if (reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i))
+		if (__reserve_perfctr_nmi(-1, MSR_K7_PERFCTR0 + i))
 			msrs->counters[i].addr = MSR_K7_PERFCTR0 + i;
 		else
 			msrs->counters[i].addr = 0;
 	}

 	for (i=0; i < NUM_CONTROLS; i++) {
-		if (reserve_evntsel_nmi(MSR_K7_EVNTSEL0 + i))
+		if (__reserve_evntsel_nmi(-1, MSR_K7_EVNTSEL0 + i))
 			msrs->controls[i].addr = MSR_K7_EVNTSEL0 + i;
 		else
 			msrs->controls[i].addr = 0;
@@ -160,11 +160,11 @@ static void athlon_shutdown(struct op_msrs const * const msrs)
 	for (i = 0 ; i < NUM_COUNTERS ; ++i) {
 		if (CTR_IS_RESERVED(msrs,i))
-			release_perfctr_nmi(MSR_K7_PERFCTR0 + i);
+			__release_perfctr_nmi(-1, MSR_K7_PERFCTR0 + i);
 	}
 	for (i = 0 ; i < NUM_CONTROLS ; ++i) {
 		if (CTRL_IS_RESERVED(msrs,i))
-			release_evntsel_nmi(MSR_K7_EVNTSEL0 + i);
+			__release_evntsel_nmi(-1, MSR_K7_EVNTSEL0 + i);
 	}
 }
diff --git a/arch/i386/oprofile/op_model_p4.c b/arch/i386/oprofile/op_model_p4.c
index 4792592..ce096dc 100644
--- a/arch/i386/oprofile/op_model_p4.c
+++ b/arch/i386/oprofile/op_model_p4.c
@@ -413,7 +413,7 @@ static void p4_fill_in_addresses(struct op_msrs * const msrs)
 	for (i = 0; i < num_counters; ++i) {
 		addr = p4_counters[VIRT_CTR(stag, i)].counter_address;
 		cccraddr = p4_counters[VIRT_CTR(stag, i)].cccr_address;
-		if (reserve_perfctr_nmi(addr)){
+		if (__reserve_perfctr_nmi(-1, addr)){
 			msrs->counters[i].addr = addr;
 			msrs->controls[i].addr = cccraddr;
 		}
@@ -422,7 +422,7 @@ static void p4_fill_in_addresses(struct op_msrs * const msrs)
 	/* 43 ESCR registers in three or four discontiguous group */
 	for (addr = MSR_P4_BSU_ESCR0 + stag;
 	     addr < MSR_P4_IQ_ESCR0; ++i, addr += addr_increment()) {
-		if (reserve_evntsel_nmi(addr))
+		if (__reserve_evntsel_nmi(-1,
Re: Remove constructor from buffer_head
On Fri, May 04, 2007 at 04:45:29PM -0700, Andrew Morton wrote:
> On Sat, 5 May 2007 01:22:05 +0200
> Andi Kleen <[EMAIL PROTECTED]> wrote:
>
> > > 2.6.21:
> > >
> > > akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50
> > > opreport error: No sample file found: try running opcontrol --dump
> > > or specify a session containing sample files
> >
> > For me it works on a slightly post 2.6.21 kernel with suse oprofile-0.9.2-21
> >
> > Did you try opcontrol --dump?
>
> Yes, tried various things.  There's just nothing turning up in
> /var/lib/oprofile.
>
> Chuck appears to be claiming that 2.6.21 oprofile is known to be broken,
> but I never heard anything about that.

Hmm, after a opcontrol --reset i see the same issue now. Don't know what's
wrong, but it must be something different from the .20 perfctr allocation
problem.

It looks like the daemon doesn't get any data from the kernel

-Andi
Re: Remove constructor from buffer_head
On Sat, 5 May 2007 01:22:05 +0200 Andi Kleen <[EMAIL PROTECTED]> wrote:

> > 2.6.21:
> >
> > akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50
> > opreport error: No sample file found: try running opcontrol --dump
> > or specify a session containing sample files
>
> For me it works on a slightly post 2.6.21 kernel with suse oprofile-0.9.2-21
>
> Did you try opcontrol --dump?

Yes, tried various things.  There's just nothing turning up in
/var/lib/oprofile.

Chuck appears to be claiming that 2.6.21 oprofile is known to be broken,
but I never heard anything about that.
Re: Remove constructor from buffer_head
On Friday 04 May 2007 23:33:47 Andrew Morton wrote:
> On Fri, 4 May 2007 13:42:12 -0700
>
> 2.6.20:
>
> akpm2:/home/akpm> opcontrol --start-daemon
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/enabled: No such file or
> directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/event: No such file or
> directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/count: No such file or
> directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/kernel: No such file or
> directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/user: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/unit_mask: No such file or
> directory

This isn't a problem anymore since the nmi watchdog is off by default now.

> 2.6.21:
>
> akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50
> opreport error: No sample file found: try running opcontrol --dump
> or specify a session containing sample files

For me it works on a slightly post 2.6.21 kernel with suse oprofile-0.9.2-21

Did you try opcontrol --dump?

-Andi
Re: Remove constructor from buffer_head
Andrew Morton wrote:
>
> I'd investigate further, but someone has gone and broken oprofile.
>

Did you just notice that?  Apparently it's been broken since 2.6.21-final.
Re: Remove constructor from buffer_head
On Fri, 4 May 2007 14:42:02 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Fri, 4 May 2007, Andrew Morton wrote:
>
> > So the patch took the average system time from 4.42 seconds up to 4.582
> > seconds.  Nice slowdown!
>
> All of that from a memset and a list head init on a cacheline we already
> use?

Seems unlikely, especially when you consider all the other stuff which a
write() has to do.
Re: Remove constructor from buffer_head
On Fri, 4 May 2007, Andrew Morton wrote:

> So the patch took the average system time from 4.42 seconds up to 4.582
> seconds.  Nice slowdown!

All of that from a memset and a list head init on a cacheline we already
use?
Re: Remove constructor from buffer_head
On Fri, 4 May 2007 13:42:12 -0700 Andrew Morton <[EMAIL PROTECTED]> wrote:

> I'd investigate further, but someone has gone and broken oprofile.

Damn, we went and merged that bustage?

2.6.20:

akpm2:/home/akpm> opcontrol --start-daemon
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/enabled: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/event: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/count: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/kernel: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/user: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/unit_mask: No such file or directory

2.6.21:

akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50
opreport error: No sample file found: try running opcontrol --dump
or specify a session containing sample files

This is an FC6 machine.  `yum update oprofile' says

Could not find update match for oprofile
No Packages marked for Update/Obsoletion

akpm2:/home/akpm> rpm -q oprofile
oprofile-0.9.2-3.fc6

I'm quite stunned that we did this.  Now what?
Re: Remove constructor from buffer_head
On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:

> Performance tests show a slight improvements in netperf (not a
> strong case for a performance improvement but removing the
> constructor has definitely no negative impact so why keep
> this around?).
>
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost
> (127.0.0.1) port 0 AF_INET
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
>
> Before:
>  87380  16384  16384    10.01    6026.04
>  87380  16384  16384    10.01    5992.17
>  87380  16384  16384    10.01    6071.23
>
> After:
>  87380  16384  16384    10.01    6090.20
>  87380  16384  16384    10.01    6078.3
>  87380  16384  16384    10.00    6013.52
>
>
> Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
>
> ---
>  fs/buffer.c |   22 --

So I benchmarked this by repeatedly extending (via write()) and truncating
a 10MB file, on ext2.  Using create-delete.c from
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz

Machine is a fast 2x4 core Woodcrest.

CONFIG_SLAB=y

The command used was

	time create-delete -s $((16 * 1024 * 1024)) -n 300 foo

which will allocate and free 300*4096 buffer_heads.

With patch:

akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.56s system 99% cpu 4.565 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.60s system 99% cpu 4.612 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.60s system 99% cpu 4.602 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.56s system 99% cpu 4.567 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.59s system 95% cpu 4.824 total

Without patch:

akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% cpu 4.419 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% cpu 4.421 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% cpu 4.427 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% cpu 4.417 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% cpu 4.435 total

So the patch took the average system time from 4.42 seconds up to 4.582
seconds.  Nice slowdown!

It could just be the usual inter-kernel-build noise, dunno.

I'd investigate further, but someone has gone and broken oprofile.
Re: Remove constructor from buffer_head
On Thu, 3 May 2007, William Lee Irwin III wrote:

> On Thu, May 03, 2007 at 08:08:41PM -0700, Christoph Lameter wrote:
> > Performance tests show a slight improvements in netperf (not a
> > strong case for a performance improvement but removing the
> > constructor has definitely no negative impact so why keep
> > this around?).
>
> Cache effects are not so easily visible. Cache profile results from
> more realistic workloads (e.g. major macrobenchmarks) are more
> appropriate for gauging this.

Yeah, I really ought to stick a performance counter in this, but that would
require some effort. Defer for now, I guess.
Re: Remove constructor from buffer_head
On Thu, May 03, 2007 at 08:08:41PM -0700, Christoph Lameter wrote:
> Performance tests show a slight improvements in netperf (not a
> strong case for a performance improvement but removing the
> constructor has definitely no negative impact so why keep
> this around?).

Cache effects are not so easily visible. Cache profile results from
more realistic workloads (e.g. major macrobenchmarks) are more
appropriate for gauging this.


-- wli
Re: Remove constructor from buffer_head
On Thu, May 03, 2007 at 08:08:41PM -0700, Christoph Lameter wrote: Performance tests show a slight improvements in netperf (not a strong case for a performance improvement but removing the constructor has definitely no negative impact so why keep this around?). Cache effects are not so easily visible. Cache profile results from more realistic workloads (e.g. major macrobenchmarks) are more appropriate for gauging this. -- wli - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Remove constructor from buffer_head
On Thu, 3 May 2007, William Lee Irwin III wrote: On Thu, May 03, 2007 at 08:08:41PM -0700, Christoph Lameter wrote: Performance tests show a slight improvements in netperf (not a strong case for a performance improvement but removing the constructor has definitely no negative impact so why keep this around?). Cache effects are not so easily visible. Cache profile results from more realistic workloads (e.g. major macrobenchmarks) are more appropriate for gauging this. Yeah I really out to stick a performance counter in this but that would require some effort. Defer for now I guess. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Remove constructor from buffer_head
On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter [EMAIL PROTECTED] wrote: Performance tests show a slight improvements in netperf (not a strong case for a performance improvement but removing the constructor has definitely no negative impact so why keep this around?). TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET Recv SendSend Socket Socket Message Elapsed Size SizeSize Time Throughput bytes bytes bytessecs.10^6bits/sec Before: 87380 16384 1638410.016026.04 87380 16384 1638410.015992.17 87380 16384 1638410.016071.23 After: 87380 16384 1638410.016090.20 87380 16384 1638410.016078.3 87380 16384 1638410.006013.52 Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- fs/buffer.c | 22 -- So I benchmarked this by repeatedly extending (via write()) and truncating a 10MB file, on ext2. Using create-delete.c from http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz Machine is a fast 2x4 core Woodcrest. CONFIG_SLAB=y The command used was time create-delete -s $((16 * 1024 * 1024)) -n 300 foo which will allocate and free 300*4096 buffer_heads. 
With patch: akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.56s system 99% cpu 4.565 total akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.60s system 99% cpu 4.612 total akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.60s system 99% cpu 4.602 total akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.56s system 99% cpu 4.567 total akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.59s system 95% cpu 4.824 total Without patch: akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.42s system 99% cpu 4.419 total akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.42s system 99% cpu 4.421 total akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.42s system 99% cpu 4.427 total akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.42s system 99% cpu 4.417 total akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo create-delete -s $((16 * 1024 * 1024)) -n 300 foo 0.00s user 4.42s system 99% cpu 4.435 total So the patch took the average system time from 4.42 seconds up to 4.582 seconds. Nice slowdown! It could just be the usual inter-kernel-build noise, dunno. I'd investigate further, but someone has gone and broken oprofile. 
- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Remove constructor from buffer_head
On Fri, 4 May 2007 13:42:12 -0700 Andrew Morton [EMAIL PROTECTED] wrote: I'd investigate further, but someone has gone and broken oprofile. Damn, we went and merged that bustage? 2.6.20: akpm2:/home/akpm opcontrol --start-daemon /usr/bin/opcontrol: line 1098: /dev/oprofile/0/enabled: No such file or directory /usr/bin/opcontrol: line 1098: /dev/oprofile/0/event: No such file or directory /usr/bin/opcontrol: line 1098: /dev/oprofile/0/count: No such file or directory /usr/bin/opcontrol: line 1098: /dev/oprofile/0/kernel: No such file or directory /usr/bin/opcontrol: line 1098: /dev/oprofile/0/user: No such file or directory /usr/bin/opcontrol: line 1098: /dev/oprofile/0/unit_mask: No such file or directory 2.6.21: akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50 opreport error: No sample file found: try running opcontrol --dump or specify a session containing sample files This is an FC6 machine. `yum update oprofile' says Could not find update match for oprofile No Packages marked for Update/Obsoletion akpm2:/home/akpm rpm -q oprofile oprofile-0.9.2-3.fc6 I'm quite stunned that we did this. Now what? - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Remove constructor from buffer_head
On Fri, 4 May 2007, Andrew Morton wrote: So the patch took the average system time from 4.42 seconds up to 4.582 seconds. Nice slowdown! All of that from a memset and a list head init on a cacheline we already use? - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Remove constructor from buffer_head
On Fri, 4 May 2007 14:42:02 -0700 (PDT) Christoph Lameter [EMAIL PROTECTED] wrote: On Fri, 4 May 2007, Andrew Morton wrote: So the patch took the average system time from 4.42 seconds up to 4.582 seconds. Nice slowdown! All of that from a memset and a list head init on a cacheline we already use? Seems unlikely, especially when you consider all the other stuff which a write() has to do. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Remove constructor from buffer_head
Andrew Morton wrote:

> I'd investigate further, but someone has gone and broken oprofile.

Did you just notice that? Apparently it's been broken since 2.6.21-final.
Re: Remove constructor from buffer_head
On Friday 04 May 2007 23:33:47 Andrew Morton wrote:

> On Fri, 4 May 2007 13:42:12 -0700
>
> 2.6.20:
>
> akpm2:/home/akpm opcontrol --start-daemon
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/enabled: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/event: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/count: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/kernel: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/user: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/unit_mask: No such file or directory

This isn't a problem anymore since the nmi watchdog is off by default now.

> 2.6.21:
>
> akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50
> opreport error: No sample file found: try running opcontrol --dump or specify a session containing sample files

For me it works on a slightly post 2.6.21 kernel with suse oprofile-0.9.2-21

Did you try opcontrol --dump?

-Andi
Re: Remove constructor from buffer_head
On Sat, 5 May 2007 01:22:05 +0200 Andi Kleen [EMAIL PROTECTED] wrote:

> > 2.6.21:
> >
> > akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50
> > opreport error: No sample file found: try running opcontrol --dump or specify a session containing sample files
>
> For me it works on a slightly post 2.6.21 kernel with suse oprofile-0.9.2-21
>
> Did you try opcontrol --dump?

Yes, tried various things. There's just nothing turning up in /var/lib/oprofile.

Chuck appears to be claiming that 2.6.21 oprofile is known to be broken, but I never heard anything about that.
Re: Remove constructor from buffer_head
On Thu, 3 May 2007 20:34:48 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Thu, 3 May 2007, Andrew Morton wrote:
>
> > On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:
> >
> > > Performance tests show a slight improvements in netperf (not a
> > > strong case for a performance improvement but removing the
> > > constructor has definitely no negative impact so why keep
> > > this around?).
> > >
> > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost
> > > (127.0.0.1) port 0 AF_INET
> > > Recv   Send    Send
> > > Socket Socket  Message  Elapsed
> > > Size   Size    Size     Time     Throughput
> > > bytes  bytes   bytes    secs.    10^6bits/sec
> > >
> > > Before:
> > > 87380  16384   16384    10.01    6026.04
> > > 87380  16384   16384    10.01    5992.17
> > > 87380  16384   16384    10.01    6071.23
> > >
> > > After:
> > > 87380  16384   16384    10.01    6090.20
> > > 87380  16384   16384    10.01    6078.3
> > > 87380  16384   16384    10.00    6013.52
> >
> > How could a filesystem change affect networking performance?
> >
> > The change looks nice, but I'd microbenchmark it with a
> > write-to-ext2-on-ramdisk or something like that.
>
> H.. I was told in another thread that this is the most frequently used
> slab for this benchmark

That would be hair-raising ;) I suspect confusion with sk_buff.

buffer_heads do get used quite a bit though. A good microbenchmark would be
to sit in a tight loop extending and truncating an ext2 file
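The extend-and-truncate loop suggested above could look roughly like the following. This is only a sketch of the idea, not a benchmark anyone in the thread ran: the function name `extend_truncate_loop` and its `path`, `len` and `iterations` parameters are made up for illustration, and it uses nothing beyond the standard POSIX `open`, `write`, `ftruncate` and `lseek` calls. To exercise buffer_head allocation the file would need to live on a mounted ext2 filesystem, for example on a ramdisk.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Repeatedly grow a file to `len` bytes with write(), then truncate it
 * back to zero. Returns 0 on success, -1 on the first failing call.
 * All names and parameters here are illustrative assumptions.
 */
int extend_truncate_loop(const char *path, size_t len, int iterations)
{
	static char buf[65536];	/* zero-filled payload */
	int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);

	if (fd < 0)
		return -1;

	for (int i = 0; i < iterations; i++) {
		size_t done = 0;

		/* Extend: append until the file holds len bytes. */
		while (done < len) {
			size_t chunk = len - done;
			ssize_t n;

			if (chunk > sizeof(buf))
				chunk = sizeof(buf);
			n = write(fd, buf, chunk);
			if (n <= 0) {
				close(fd);
				return -1;
			}
			done += (size_t)n;
		}

		/* Truncate, then rewind so the next pass starts at offset 0. */
		if (ftruncate(fd, 0) < 0 || lseek(fd, 0, SEEK_SET) < 0) {
			close(fd);
			return -1;
		}
	}

	close(fd);
	return 0;
}
```

Timing a call such as `extend_truncate_loop("/media/testfile", 1 << 20, 1000)` with `time` on kernels with and without the patch would give numbers comparable to the cp tests; the path and sizes are placeholders.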
Re: Remove constructor from buffer_head
On Thu, 3 May 2007, Andrew Morton wrote:

> The change looks nice, but I'd microbenchmark it with a
> write-to-ext2-on-ramdisk or something like that.

Hmmm... How does one benchmark buffer head performance? Guess just by
copying files? Not sure if the following will cut it.

Two tests. First copying 8M of small files into a 16M ramdisk:

for i in 1 2 3 4 5 6 7 8 9; do
	mke2fs /dev/ram0 >/dev/null
	mount /dev/ram0 /media >/dev/null
	time cp -a /etc /media
	umount /dev/ram0
done;

No constructor

real 0m0.104s  user 0m0.016s  sys 0m0.056s
real 0m0.090s  user 0m0.008s  sys 0m0.056s
real 0m0.089s  user 0m0.016s  sys 0m0.048s
real 0m0.097s  user 0m0.004s  sys 0m0.064s
real 0m0.091s  user 0m0.008s  sys 0m0.052s
real 0m0.091s  user 0m0.004s  sys 0m0.060s
real 0m0.098s  user 0m0.008s  sys 0m0.060s
real 0m0.091s  user 0m0.000s  sys 0m0.064s
real 0m0.090s  user 0m0.012s  sys 0m0.052s

W/constructor

real 0m0.099s  user 0m0.004s  sys 0m0.100s
real 0m0.098s  user 0m0.008s  sys 0m0.096s
real 0m0.091s  user 0m0.016s  sys 0m0.080s
real 0m0.091s  user 0m0.012s  sys 0m0.084s
real 0m0.090s  user 0m0.012s  sys 0m0.080s
real 0m0.090s  user 0m0.020s  sys 0m0.076s
real 0m1.269s  user 0m0.012s  sys 0m0.084s
real 0m0.095s  user 0m0.016s  sys 0m0.084s
real 0m0.096s  user 0m0.020s  sys 0m0.084s

The no constructor numbers are generally lower. Lowest is no constructor
with 0.089.

Second. Copy vmlinux (52M) to 128M ramdisk:

for i in 1 2 3 4 5 6 7 8 9; do
	mke2fs /dev/ram0 >/dev/null
	mount /dev/ram0 /media >/dev/null
	time cp slub/vmlinux /media
	umount /dev/ram0
done;

No constructor:

real 0m2.095s  user 0m0.000s  sys 0m0.168s
real 0m0.187s  user 0m0.008s  sys 0m0.124s
real 0m0.186s  user 0m0.008s  sys 0m0.120s
real 0m0.195s  user 0m0.008s  sys 0m0.128s
real 0m0.177s  user 0m0.004s  sys 0m0.120s
real 0m0.182s  user 0m0.004s  sys 0m0.120s
real 0m0.186s  user 0m0.008s  sys 0m0.120s
real 0m0.190s  user 0m0.004s  sys 0m0.128s
real 0m0.174s  user 0m0.004s  sys 0m0.116s

Constructor

real 0m0.183s  user 0m0.004s  sys 0m0.188s
real 0m0.183s  user 0m0.004s  sys 0m0.192s
real 0m0.177s  user 0m0.012s  sys 0m0.176s
real 0m0.186s  user 0m0.004s  sys 0m0.192s
real 0m0.187s  user 0m0.008s  sys 0m0.188s
real 0m0.184s  user 0m0.004s  sys 0m0.192s
real 0m0.177s  user 0m0.012s  sys 0m0.176s
real 0m0.183s  user 0m0.004s  sys 0m0.192s
real 0m0.182s  user 0m0.004s  sys 0m0.188s

Same here. Low is 0.174 no constructor.
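For readers who want a single figure per configuration, one option is to average the sys column of the runs above. The snippet below is only an illustration of that arithmetic: the sample arrays are the sys times from the first test (8M of small files), copied from the listing, while the helper name `mean_seconds` and the choice to average sys time at all are assumptions made here, not something the thread did.

```c
#include <stddef.h>

/* sys times in seconds from the 8M small-files test, copied from the runs above */
const double sys_no_ctor[] = {
	0.056, 0.056, 0.048, 0.064, 0.052, 0.060, 0.060, 0.064, 0.052
};
const double sys_with_ctor[] = {
	0.100, 0.096, 0.080, 0.084, 0.080, 0.076, 0.084, 0.084, 0.084
};

/* Arithmetic mean of n samples; returns 0.0 for an empty set. */
double mean_seconds(const double *samples, size_t n)
{
	double sum = 0.0;

	for (size_t i = 0; i < n; i++)
		sum += samples[i];
	return n ? sum / (double)n : 0.0;
}
```

Calling `mean_seconds(sys_no_ctor, sizeof(sys_no_ctor) / sizeof(sys_no_ctor[0]))` and the same for `sys_with_ctor` gives the two averages to compare. Nine runs is a small sample, so the result is indicative at best.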
Re: Remove constructor from buffer_head
On Thu, 3 May 2007, Andrew Morton wrote:

> On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:
>
> > Performance tests show a slight improvements in netperf (not a
> > strong case for a performance improvement but removing the
> > constructor has definitely no negative impact so why keep
> > this around?).
> >
> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost
> > (127.0.0.1) port 0 AF_INET
> > Recv   Send    Send
> > Socket Socket  Message  Elapsed
> > Size   Size    Size     Time     Throughput
> > bytes  bytes   bytes    secs.    10^6bits/sec
> >
> > Before:
> > 87380  16384   16384    10.01    6026.04
> > 87380  16384   16384    10.01    5992.17
> > 87380  16384   16384    10.01    6071.23
> >
> > After:
> > 87380  16384   16384    10.01    6090.20
> > 87380  16384   16384    10.01    6078.3
> > 87380  16384   16384    10.00    6013.52
>
> How could a filesystem change affect networking performance?
>
> The change looks nice, but I'd microbenchmark it with a
> write-to-ext2-on-ramdisk or something like that.

H.. I was told in another thread that this is the most frequently used
slab for this benchmark .. Just accepted that as true.
Re: Remove constructor from buffer_head
On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:

> Performance tests show a slight improvements in netperf (not a
> strong case for a performance improvement but removing the
> constructor has definitely no negative impact so why keep
> this around?).
>
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost
> (127.0.0.1) port 0 AF_INET
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
>
> Before:
> 87380  16384   16384    10.01    6026.04
> 87380  16384   16384    10.01    5992.17
> 87380  16384   16384    10.01    6071.23
>
> After:
> 87380  16384   16384    10.01    6090.20
> 87380  16384   16384    10.01    6078.3
> 87380  16384   16384    10.00    6013.52

How could a filesystem change affect networking performance?

The change looks nice, but I'd microbenchmark it with a write-to-ext2-on-ramdisk
or something like that.
Remove constructor from buffer_head
Performance tests show a slight improvements in netperf (not a
strong case for a performance improvement but removing the
constructor has definitely no negative impact so why keep
this around?).

TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost
(127.0.0.1) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

Before:
87380  16384   16384    10.01    6026.04
87380  16384   16384    10.01    5992.17
87380  16384   16384    10.01    6071.23

After:
87380  16384   16384    10.01    6090.20
87380  16384   16384    10.01    6078.3
87380  16384   16384    10.00    6013.52

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 fs/buffer.c | 22 --
 1 file changed, 4 insertions(+), 18 deletions(-)

Index: slub/fs/buffer.c
===
--- slub.orig/fs/buffer.c	2007-05-03 19:17:09.0 -0700
+++ slub/fs/buffer.c	2007-05-03 19:57:30.0 -0700
@@ -2907,9 +2907,10 @@ static void recalc_bh_state(void)
 
 struct buffer_head *alloc_buffer_head(gfp_t gfp_flags)
 {
-	struct buffer_head *ret = kmem_cache_alloc(bh_cachep,
+	struct buffer_head *ret = kmem_cache_zalloc(bh_cachep,
 				set_migrateflags(gfp_flags, __GFP_RECLAIMABLE));
 	if (ret) {
+		INIT_LIST_HEAD(&ret->b_assoc_buffers);
 		get_cpu_var(bh_accounting).nr++;
 		recalc_bh_state();
 		put_cpu_var(bh_accounting);
@@ -2928,17 +2929,6 @@ void free_buffer_head(struct buffer_head
 }
 EXPORT_SYMBOL(free_buffer_head);
 
-static void
-init_buffer_head(void *data, struct kmem_cache *cachep, unsigned long flags)
-{
-	if (flags & SLAB_CTOR_CONSTRUCTOR) {
-		struct buffer_head * bh = (struct buffer_head *)data;
-
-		memset(bh, 0, sizeof(*bh));
-		INIT_LIST_HEAD(&bh->b_assoc_buffers);
-	}
-}
-
 static void buffer_exit_cpu(int cpu)
 {
 	int i;
@@ -2965,12 +2955,8 @@ void __init buffer_init(void)
 {
 	int nrpages;
 
-	bh_cachep = kmem_cache_create("buffer_head",
-			sizeof(struct buffer_head), 0,
-			(SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
-			SLAB_MEM_SPREAD),
-			init_buffer_head,
-			NULL);
+	bh_cachep = KMEM_CACHE(buffer_head,
+			SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);
 
 	/*
 	 * Limit the bh occupancy to 10% of ZONE_NORMAL
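The patch moves initialization out of the slab constructor and into the allocation path: zero the object on every allocation, then set up the list head. A userspace analogue of that allocation-time pattern might look like the sketch below. It is not kernel code; `struct toy_bh`, `toy_alloc_bh`, `toy_free_bh` and `init_list_head` are hypothetical stand-ins, with `calloc` playing the role of `kmem_cache_zalloc`.

```c
#include <stdlib.h>

/* Minimal doubly linked list head, modelled on the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-in for struct buffer_head; only two fields are modelled. */
struct toy_bh {
	unsigned long state;
	struct list_head assoc;
};

/* An empty list head points at itself in both directions. */
static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Zero the object on every allocation, then initialise its list head. */
struct toy_bh *toy_alloc_bh(void)
{
	struct toy_bh *bh = calloc(1, sizeof(*bh));

	if (bh)
		init_list_head(&bh->assoc);
	return bh;
}

void toy_free_bh(struct toy_bh *bh)
{
	free(bh);
}
```

With this arrangement the object is in a known state on every allocation regardless of what the previous user left behind, at the cost of clearing it each time instead of once per slab object.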