Re: oprofile broken in 2.6.21 SMP (was Re: Remove constructor from buffer_head)

2007-05-14 Thread Andrew Morton
On Sun, 13 May 2007 16:38:16 -0400 Benjamin LaHaise <[EMAIL PROTECTED]> wrote:

> On Sat, May 05, 2007 at 11:31:20AM +0200, Andi Kleen wrote:
> > Hmm, after an opcontrol --reset I see the same issue now. Don't know what's 
> > wrong, but it must be something different from the .20 perfctr allocation
> > problem.
> > 
> > It looks like the daemon doesn't get any data from the kernel
> 
> I finally had time to track this down.  The breakage is caused by "[PATCH] 
> x86-64: Let oprofile reserve MSR on all CPUs".  Oprofile is already calling 
> the reserve functions on each CPU in the system when it sets up the MSRs.  
> This results in oprofile getting a reservation failure on CPUs above 0.  The 
> following makes oprofile adapt to the API change for now -- oprofile 
> still needs to be modified to perform the reservations earlier during its 
> initialization, but that's a little bit more involved than the immediate 
> bug fix.  This only affects systems with more than 1 CPU.  This patch has 
> been through limited testing (Athlon 64 X2 and Core 2, but not on the P4) on 
> x86 and x86-64 (Core 2 only).

Unfortunately we've left this a bit too late - your patch is patching code which
isn't there any more in mainline and we also need a 2.6.21.x fix.

So perhaps we could merge your "immediate bugfix" into -stable and implement the
"more involved" fix for 2.6.22.

Andi, any preferences?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


oprofile broken in 2.6.21 SMP (was Re: Remove constructor from buffer_head)

2007-05-13 Thread Benjamin LaHaise
On Sat, May 05, 2007 at 11:31:20AM +0200, Andi Kleen wrote:
> Hmm, after an opcontrol --reset I see the same issue now. Don't know what's 
> wrong, but it must be something different from the .20 perfctr allocation
> problem.
> 
> It looks like the daemon doesn't get any data from the kernel

I finally had time to track this down.  The breakage is caused by "[PATCH] 
x86-64: Let oprofile reserve MSR on all CPUs".  Oprofile is already calling 
the reserve functions on each CPU in the system when it sets up the MSRs.  
This results in oprofile getting a reservation failure on CPUs above 0.  The 
following makes oprofile adapt to the API change for now -- oprofile 
still needs to be modified to perform the reservations earlier during its 
initialization, but that's a little bit more involved than the immediate 
bug fix.  This only affects systems with more than 1 CPU.  This patch has 
been through limited testing (Athlon 64 X2 and Core 2, but not on the P4) on 
x86 and x86-64 (Core 2 only).
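
To make the failure mode concrete, here is a small userspace model, not kernel code. It assumes, going by the title of the offending patch, that the public reserve call now claims a counter on every CPU at once, while the caller still runs its setup once per CPU as described above. All `model_*` names and the table sizes are hypothetical and exist only for this sketch.

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_NR_CPUS     2	/* hypothetical: a two-CPU machine */
#define MODEL_NR_COUNTERS 4	/* hypothetical number of counters */

/* reserved[cpu][counter] is true once that counter is claimed on that CPU. */
static bool reserved[MODEL_NR_CPUS][MODEL_NR_COUNTERS];

void model_reset(void)
{
	memset(reserved, 0, sizeof(reserved));
}

/* Per-CPU reservation: returns 1 on success, 0 if already taken. */
int model_reserve_one(int cpu, int counter)
{
	if (reserved[cpu][counter])
		return 0;
	reserved[cpu][counter] = true;
	return 1;
}

/*
 * All-CPU reservation (assumed behaviour after the API change): succeeds
 * only if the counter is free everywhere, then claims it on every CPU.
 */
int model_reserve_all(int counter)
{
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (reserved[cpu][counter])
			return 0;
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		reserved[cpu][counter] = true;
	return 1;
}

/*
 * A caller that repeats its setup on each CPU but uses the all-CPU call.
 * Returns how many CPUs obtained the counter.
 */
int model_setup_with_all_cpu_call(int counter)
{
	int cpu, ok = 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		ok += model_reserve_all(counter);
	return ok;
}

/* The same per-CPU setup, switched to the per-CPU call. */
int model_setup_with_per_cpu_call(int counter)
{
	int cpu, ok = 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		ok += model_reserve_one(cpu, counter);
	return ok;
}
```

In this model the first setup succeeds only for the first CPU, because that call already claimed the counter everywhere, which matches the "reservation failure on CPUs above 0" symptom. The second setup succeeds on every CPU, which is the effect the patch below aims for by switching oprofile to the double-underscore variants.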

-ben

Signed-off-by: Benjamin LaHaise <[EMAIL PROTECTED]>
diff --git a/arch/i386/kernel/nmi.c b/arch/i386/kernel/nmi.c
index 84c3497..21fc74d 100644
--- a/arch/i386/kernel/nmi.c
+++ b/arch/i386/kernel/nmi.c
@@ -148,7 +148,7 @@ int avail_to_resrv_perfctr_nmi(unsigned int msr)
 	return 1;
 }
 
-static int __reserve_perfctr_nmi(int cpu, unsigned int msr)
+int __reserve_perfctr_nmi(int cpu, unsigned int msr)
 {
 	unsigned int counter;
 	if (cpu < 0)
@@ -162,7 +162,7 @@ static int __reserve_perfctr_nmi(int cpu, unsigned int msr)
 	return 0;
 }
 
-static void __release_perfctr_nmi(int cpu, unsigned int msr)
+void __release_perfctr_nmi(int cpu, unsigned int msr)
 {
 	unsigned int counter;
 	if (cpu < 0)
@@ -212,7 +212,7 @@ int __reserve_evntsel_nmi(int cpu, unsigned int msr)
 	return 0;
 }
 
-static void __release_evntsel_nmi(int cpu, unsigned int msr)
+void __release_evntsel_nmi(int cpu, unsigned int msr)
 {
 	unsigned int counter;
 	if (cpu < 0)
@@ -1188,5 +1188,9 @@ EXPORT_SYMBOL(reserve_perfctr_nmi);
 EXPORT_SYMBOL(release_perfctr_nmi);
 EXPORT_SYMBOL(reserve_evntsel_nmi);
 EXPORT_SYMBOL(release_evntsel_nmi);
+EXPORT_SYMBOL(__reserve_perfctr_nmi);
+EXPORT_SYMBOL(__release_perfctr_nmi);
+EXPORT_SYMBOL(__reserve_evntsel_nmi);
+EXPORT_SYMBOL(__release_evntsel_nmi);
 EXPORT_SYMBOL(disable_timer_nmi_watchdog);
 EXPORT_SYMBOL(enable_timer_nmi_watchdog);
diff --git a/arch/i386/oprofile/op_model_athlon.c b/arch/i386/oprofile/op_model_athlon.c
index 3057a19..738a579 100644
--- a/arch/i386/oprofile/op_model_athlon.c
+++ b/arch/i386/oprofile/op_model_athlon.c
@@ -45,14 +45,14 @@ static void athlon_fill_in_addresses(struct op_msrs * const msrs)
 	int i;
 
 	for (i=0; i < NUM_COUNTERS; i++) {
-		if (reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i))
+		if (__reserve_perfctr_nmi(-1, MSR_K7_PERFCTR0 + i))
 			msrs->counters[i].addr = MSR_K7_PERFCTR0 + i;
 		else
 			msrs->counters[i].addr = 0;
 	}
 
 	for (i=0; i < NUM_CONTROLS; i++) {
-		if (reserve_evntsel_nmi(MSR_K7_EVNTSEL0 + i))
+		if (__reserve_evntsel_nmi(-1, MSR_K7_EVNTSEL0 + i))
 			msrs->controls[i].addr = MSR_K7_EVNTSEL0 + i;
 		else
 			msrs->controls[i].addr = 0;
@@ -160,11 +160,11 @@ static void athlon_shutdown(struct op_msrs const * const msrs)
 
 	for (i = 0 ; i < NUM_COUNTERS ; ++i) {
 		if (CTR_IS_RESERVED(msrs,i))
-			release_perfctr_nmi(MSR_K7_PERFCTR0 + i);
+			__release_perfctr_nmi(-1, MSR_K7_PERFCTR0 + i);
 	}
 	for (i = 0 ; i < NUM_CONTROLS ; ++i) {
 		if (CTRL_IS_RESERVED(msrs,i))
-			release_evntsel_nmi(MSR_K7_EVNTSEL0 + i);
+			__release_evntsel_nmi(-1, MSR_K7_EVNTSEL0 + i);
 	}
 }
 
diff --git a/arch/i386/oprofile/op_model_p4.c b/arch/i386/oprofile/op_model_p4.c
index 4792592..ce096dc 100644
--- a/arch/i386/oprofile/op_model_p4.c
+++ b/arch/i386/oprofile/op_model_p4.c
@@ -413,7 +413,7 @@ static void p4_fill_in_addresses(struct op_msrs * const msrs)
 	for (i = 0; i < num_counters; ++i) {
 		addr = p4_counters[VIRT_CTR(stag, i)].counter_address;
 		cccraddr = p4_counters[VIRT_CTR(stag, i)].cccr_address;
-		if (reserve_perfctr_nmi(addr)){
+		if (__reserve_perfctr_nmi(-1, addr)){
 			msrs->counters[i].addr = addr;
 			msrs->controls[i].addr = cccraddr;
 		}
@@ -422,7 +422,7 @@ static void p4_fill_in_addresses(struct op_msrs * const msrs)
 	/* 43 ESCR registers in three or four discontiguous group */
 	for (addr = MSR_P4_BSU_ESCR0 + stag;
 	     addr < MSR_P4_IQ_ESCR0; ++i, addr += addr_increment()) {
-		if (reserve_evntsel_nmi(addr))
+		if (__reserve_evntsel_nmi(-1,


Re: Remove constructor from buffer_head

2007-05-05 Thread Andi Kleen
On Fri, May 04, 2007 at 04:45:29PM -0700, Andrew Morton wrote:
> On Sat, 5 May 2007 01:22:05 +0200
> Andi Kleen <[EMAIL PROTECTED]> wrote:
> 
> > > 2.6.21:
> > > 
> > > akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50
> > > opreport error: No sample file found: try running opcontrol --dump
> > > or specify a session containing sample files
> > 
> > For me it works on a slightly post 2.6.21 kernel with suse oprofile-0.9.2-21
> > 
> > Did you try opcontrol --dump? 
> 
> Yes, tried various things.  There's just nothing turning up in 
> /var/lib/oprofile.
> 
> Chuck appears to be claiming that 2.6.21 oprofile is known to be broken,
> but I never heard anything about that.

Hmm, after an opcontrol --reset I see the same issue now. Don't know what's 
wrong, but it must be something different from the .20 perfctr allocation
problem.

It looks like the daemon doesn't get any data from the kernel


-Andi


Re: Remove constructor from buffer_head

2007-05-04 Thread Andrew Morton
On Sat, 5 May 2007 01:22:05 +0200
Andi Kleen <[EMAIL PROTECTED]> wrote:

> > 2.6.21:
> > 
> > akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50
> > opreport error: No sample file found: try running opcontrol --dump
> > or specify a session containing sample files
> 
> For me it works on a slightly post 2.6.21 kernel with suse oprofile-0.9.2-21
> 
> Did you try opcontrol --dump? 

Yes, tried various things.  There's just nothing turning up in 
/var/lib/oprofile.

Chuck appears to be claiming that 2.6.21 oprofile is known to be broken,
but I never heard anything about that.


Re: Remove constructor from buffer_head

2007-05-04 Thread Andi Kleen
On Friday 04 May 2007 23:33:47 Andrew Morton wrote:
> On Fri, 4 May 2007 13:42:12 -0700

> 
> 2.6.20:
> 
> akpm2:/home/akpm> opcontrol --start-daemon
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/enabled: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/event: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/count: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/kernel: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/user: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/unit_mask: No such file or directory

This isn't a problem anymore since the nmi watchdog is off by default now.

> 2.6.21:
> 
> akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50
> opreport error: No sample file found: try running opcontrol --dump
> or specify a session containing sample files

For me it works on a slightly post 2.6.21 kernel with suse oprofile-0.9.2-21

Did you try opcontrol --dump? 

-Andi


Re: Remove constructor from buffer_head

2007-05-04 Thread Chuck Ebbert
Andrew Morton wrote:
> 
> I'd investigate further, but someone has gone and broken oprofile.
> 

Did you just notice that? Apparently it's been broken since 2.6.21-final.



Re: Remove constructor from buffer_head

2007-05-04 Thread Andrew Morton
On Fri, 4 May 2007 14:42:02 -0700 (PDT)
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Fri, 4 May 2007, Andrew Morton wrote:
> 
> > So the patch took the average system time from 4.42 seconds up to 4.582
> > seconds.  Nice slowdown!
> 
> All of that from a memset and a list head init on a cacheline we already 
> use?

Seems unlikely, especially when you consider all the other stuff which a write()
has to do.


Re: Remove constructor from buffer_head

2007-05-04 Thread Christoph Lameter
On Fri, 4 May 2007, Andrew Morton wrote:

> So the patch took the average system time from 4.42 seconds up to 4.582
> seconds.  Nice slowdown!

All of that from a memset and a list head init on a cacheline we already 
use?



Re: Remove constructor from buffer_head

2007-05-04 Thread Andrew Morton
On Fri, 4 May 2007 13:42:12 -0700
Andrew Morton <[EMAIL PROTECTED]> wrote:

> I'd investigate further, but someone has gone and broken oprofile.

Damn, we went and merged that bustage?


2.6.20:

akpm2:/home/akpm> opcontrol --start-daemon
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/enabled: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/event: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/count: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/kernel: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/user: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/unit_mask: No such file or directory

2.6.21:

akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50
opreport error: No sample file found: try running opcontrol --dump
or specify a session containing sample files



This is an FC6 machine.  `yum update oprofile' says

Could not find update match for oprofile
No Packages marked for Update/Obsoletion

akpm2:/home/akpm> rpm -q oprofile
oprofile-0.9.2-3.fc6


I'm quite stunned that we did this.

Now what?


Re: Remove constructor from buffer_head

2007-05-04 Thread Andrew Morton
On Thu, 3 May 2007 20:08:41 -0700 (PDT)
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> Performance tests show a slight improvement in netperf (not a
> strong case for a performance improvement but removing the
> constructor has definitely no negative impact so why keep
> this around?).
> 
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
> 
> Before:
>  87380  16384  16384    10.01    6026.04
>  87380  16384  16384    10.01    5992.17
>  87380  16384  16384    10.01    6071.23
> 
> After:
>  87380  16384  16384    10.01    6090.20
>  87380  16384  16384    10.01    6078.3
>  87380  16384  16384    10.00    6013.52
> 
> 
> Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
> 
> ---
>  fs/buffer.c |   22 --


So I benchmarked this by repeatedly extending (via write()) and truncating
a 16MB file, on ext2.  Using create-delete.c from
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz

Machine is a fast 2x4 core Woodcrest.  CONFIG_SLAB=y

The command used was

time create-delete -s $((16 * 1024 * 1024)) -n 300 foo

which will allocate and free 300*4096 buffer_heads.
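
As a rough illustration of that workload, here is a minimal sketch of an extend-then-truncate loop. It is not the actual create-delete.c, whose source is not shown in this thread; the function names, the 4096-byte write chunk, and the error handling are assumptions made for the example. The second helper reproduces the 300*4096 figure under the assumption of one buffer_head per 4096-byte filesystem block.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the benchmark loop: grow `path` to `size`
 * bytes with write(), truncate it back to zero, and repeat `iterations`
 * times.  Returns 0 on success, -1 on any I/O error.
 */
int extend_and_truncate(const char *path, size_t size, int iterations)
{
	enum { CHUNK = 4096 };	/* assumed write size, one block at a time */
	char buf[CHUNK];
	int fd, i;

	memset(buf, 0xaa, sizeof(buf));
	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	for (i = 0; i < iterations; i++) {
		size_t done = 0;

		/* Extend the file one chunk at a time. */
		while (done < size) {
			size_t want = size - done < CHUNK ? size - done : CHUNK;
			ssize_t n = write(fd, buf, want);

			if (n <= 0)
				goto fail;
			done += (size_t)n;
		}
		/* Drop all blocks again and rewind for the next pass. */
		if (ftruncate(fd, 0) < 0 || lseek(fd, 0, SEEK_SET) < 0)
			goto fail;
	}
	close(fd);
	return 0;
fail:
	close(fd);
	return -1;
}

/* Buffer heads touched per run, assuming one buffer_head per block. */
unsigned long buffer_heads_per_run(unsigned long file_size,
				   unsigned long block_size,
				   unsigned long iterations)
{
	return file_size / block_size * iterations;
}
```

With a 16 * 1024 * 1024 byte file, 4096-byte blocks and 300 iterations, `buffer_heads_per_run()` gives 300 * 4096 = 1228800 allocations and frees, which is the count quoted above.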

With patch:

akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.56s system 99% cpu 4.565 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.60s system 99% cpu 4.612 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.60s system 99% cpu 4.602 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.56s system 99% cpu 4.567 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.59s system 95% cpu 4.824 total

Without patch:

akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% cpu 4.419 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% cpu 4.421 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% cpu 4.427 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% cpu 4.417 total
akpm2:/mnt/sda2> time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% cpu 4.435 total

So the patch took the average system time from 4.42 seconds up to 4.582
seconds.  Nice slowdown!

It could just be the usual inter-kernel-build noise, dunno.
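
For anyone who wants to re-check the averages, the arithmetic can be sketched as follows. The sample arrays are the "system" times copied from the runs above; the helper names are made up for this example, and with only five runs per kernel this says nothing about statistical significance.

```c
#include <stddef.h>

/* System times in seconds, copied from the timing runs quoted above. */
static const double sys_with_patch[]    = { 4.56, 4.60, 4.60, 4.56, 4.59 };
static const double sys_without_patch[] = { 4.42, 4.42, 4.42, 4.42, 4.42 };

/* Arithmetic mean of n samples; returns 0.0 for an empty set. */
double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return n ? sum / (double)n : 0.0;
}

/* Relative change from `before` to `after`, in percent. */
double relative_change_percent(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

Feeding the two arrays to `mean()` gives 4.582 and 4.42 seconds, and `relative_change_percent(4.42, 4.582)` comes out at roughly 3.7%.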

I'd investigate further, but someone has gone and broken oprofile.



Re: Remove constructor from buffer_head

2007-05-04 Thread Christoph Lameter
On Thu, 3 May 2007, William Lee Irwin III wrote:

> On Thu, May 03, 2007 at 08:08:41PM -0700, Christoph Lameter wrote:
> > Performance tests show a slight improvement in netperf (not a
> > strong case for a performance improvement but removing the
> > constructor has definitely no negative impact so why keep
> > this around?).
> 
> Cache effects are not so easily visible. Cache profile results from
> more realistic workloads (e.g. major macrobenchmarks) are more
> appropriate for gauging this.

Yeah I really ought to stick a performance counter in this but that would 
require some effort. Defer for now I guess.



Re: Remove constructor from buffer_head

2007-05-04 Thread William Lee Irwin III
On Thu, May 03, 2007 at 08:08:41PM -0700, Christoph Lameter wrote:
> Performance tests show a slight improvement in netperf (not a
> strong case for a performance improvement but removing the
> constructor has definitely no negative impact so why keep
> this around?).

Cache effects are not so easily visible. Cache profile results from
more realistic workloads (e.g. major macrobenchmarks) are more
appropriate for gauging this.


-- wli


Re: Remove constructor from buffer_head

2007-05-04 Thread William Lee Irwin III
On Thu, May 03, 2007 at 08:08:41PM -0700, Christoph Lameter wrote:
 Performance tests show a slight improvements in netperf (not a
 strong case for a performance improvement but removing the
 constructor has definitely no negative impact so why keep
 this around?).

Cache effects are not so easily visible. Cache profile results from
more realistic workloads (e.g. major macrobenchmarks) are more
appropriate for gauging this.


-- wli
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Remove constructor from buffer_head

2007-05-04 Thread Christoph Lameter
On Thu, 3 May 2007, William Lee Irwin III wrote:

 On Thu, May 03, 2007 at 08:08:41PM -0700, Christoph Lameter wrote:
  Performance tests show a slight improvements in netperf (not a
  strong case for a performance improvement but removing the
  constructor has definitely no negative impact so why keep
  this around?).
 
 Cache effects are not so easily visible. Cache profile results from
 more realistic workloads (e.g. major macrobenchmarks) are more
 appropriate for gauging this.

Yeah I really out to stick a performance counter in this but that would 
require some effort. Defer for now I guess.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Remove constructor from buffer_head

2007-05-04 Thread Andrew Morton
On Thu, 3 May 2007 20:08:41 -0700 (PDT)
Christoph Lameter [EMAIL PROTECTED] wrote:

 Performance tests show a slight improvements in netperf (not a
 strong case for a performance improvement but removing the
 constructor has definitely no negative impact so why keep
 this around?).
 
 TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost 
 (127.0.0.1) port 0 AF_INET
 Recv   SendSend
 Socket Socket  Message  Elapsed
 Size   SizeSize Time Throughput
 bytes  bytes   bytessecs.10^6bits/sec
 
 Before:
  87380  16384  1638410.016026.04
  87380  16384  1638410.015992.17
  87380  16384  1638410.016071.23
 
 After:
  87380  16384  1638410.016090.20
  87380  16384  1638410.016078.3
  87380  16384  1638410.006013.52
 
 
 Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
 
 ---
  fs/buffer.c |   22 --


So I benchmarked this by repeatedly extending (via write()) and truncating
a 10MB file, on ext2.  Using create-delete.c from
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz

Machine is a fast 2x4 core Woodcrest.  CONFIG_SLAB=y

The command used was

time create-delete -s $((16 * 1024 * 1024)) -n 300 foo

which will allocate and free 300*4096 buffer_heads.

With patch:

akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.56s system 99% 
cpu 4.565 total
akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.60s system 99% 
cpu 4.612 total
akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.60s system 99% 
cpu 4.602 total
akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.56s system 99% 
cpu 4.567 total
akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.59s system 95% 
cpu 4.824 total

Without patch:

akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% 
cpu 4.419 total
akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% 
cpu 4.421 total
akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% 
cpu 4.427 total
akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% 
cpu 4.417 total
akpm2:/mnt/sda2 time create-delete -s $((16 * 1024 * 1024)) -n 300 foo
create-delete -s $((16 * 1024 * 1024)) -n 300 foo  0.00s user 4.42s system 99% 
cpu 4.435 total

So the patch took the average system time from 4.42 seconds up to 4.582
seconds.  Nice slowdown!

It could just be the usual inter-kernel-build noise, dunno.

I'd investigate further, but someone has gone and broken oprofile.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Remove constructor from buffer_head

2007-05-04 Thread Andrew Morton
On Fri, 4 May 2007 13:42:12 -0700
Andrew Morton [EMAIL PROTECTED] wrote:

 I'd investigate further, but someone has gone and broken oprofile.

Damn, we went and merged that bustage?


2.6.20:

akpm2:/home/akpm opcontrol --start-daemon
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/enabled: No such file or 
directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/event: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/count: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/kernel: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/user: No such file or directory
/usr/bin/opcontrol: line 1098: /dev/oprofile/0/unit_mask: No such file or 
directory

2.6.21:

akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50
opreport error: No sample file found: try running opcontrol --dump
or specify a session containing sample files



This is an FC6 machine.  `yum update oprofile' says

Could not find update match for oprofile
No Packages marked for Update/Obsoletion

akpm2:/home/akpm rpm -q oprofile
oprofile-0.9.2-3.fc6


I'm quite stunned that we did this.

Now what?


Re: Remove constructor from buffer_head

2007-05-04 Thread Christoph Lameter
On Fri, 4 May 2007, Andrew Morton wrote:

> So the patch took the average system time from 4.42 seconds up to 4.582
> seconds.  Nice slowdown!

All of that from a memset and a list head init on a cacheline we already 
use?



Re: Remove constructor from buffer_head

2007-05-04 Thread Andrew Morton
On Fri, 4 May 2007 14:42:02 -0700 (PDT)
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Fri, 4 May 2007, Andrew Morton wrote:
> 
> > So the patch took the average system time from 4.42 seconds up to 4.582
> > seconds.  Nice slowdown!
> 
> All of that from a memset and a list head init on a cacheline we already 
> use?

Seems unlikely, especially when you consider all the other stuff which a write()
has to do.


Re: Remove constructor from buffer_head

2007-05-04 Thread Chuck Ebbert
Andrew Morton wrote:
> 
> I'd investigate further, but someone has gone and broken oprofile.
> 

Did you just notice that? Apparently it's been broken since 2.6.21-final.



Re: Remove constructor from buffer_head

2007-05-04 Thread Andi Kleen
On Friday 04 May 2007 23:33:47 Andrew Morton wrote:
> On Fri, 4 May 2007 13:42:12 -0700
> 
> 2.6.20:
> 
> akpm2:/home/akpm> opcontrol --start-daemon
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/enabled: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/event: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/count: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/kernel: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/user: No such file or directory
> /usr/bin/opcontrol: line 1098: /dev/oprofile/0/unit_mask: No such file or directory

This isn't a problem anymore since the nmi watchdog is off by default now.

> 2.6.21:
> 
> akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50
> opreport error: No sample file found: try running opcontrol --dump
> or specify a session containing sample files

For me it works on a slightly post 2.6.21 kernel with suse oprofile-0.9.2-21

Did you try opcontrol --dump? 

-Andi


Re: Remove constructor from buffer_head

2007-05-04 Thread Andrew Morton
On Sat, 5 May 2007 01:22:05 +0200
Andi Kleen <[EMAIL PROTECTED]> wrote:

> > 2.6.21:
> > 
> > akpm2:/home/akpm# opreport -l /boot/vmlinux-$(uname -r) | head -50
> > opreport error: No sample file found: try running opcontrol --dump
> > or specify a session containing sample files
> 
> For me it works on a slightly post 2.6.21 kernel with suse oprofile-0.9.2-21
> 
> Did you try opcontrol --dump? 

Yes, tried various things.  There's just nothing turning up in 
/var/lib/oprofile.

Chuck appears to be claiming that 2.6.21 oprofile is known to be broken,
but I never heard anything about that.


Re: Remove constructor from buffer_head

2007-05-03 Thread Andrew Morton
On Thu, 3 May 2007 20:34:48 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> 
wrote:

> On Thu, 3 May 2007, Andrew Morton wrote:
> 
> > On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:
> > 
> > > Performance tests show a slight improvement in netperf (not a
> > > strong case for a performance improvement but removing the
> > > constructor has definitely no negative impact so why keep
> > > this around?).
> > > 
> > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET
> > > Recv   Send    Send
> > > Socket Socket  Message  Elapsed
> > > Size   Size    Size     Time     Throughput
> > > bytes  bytes   bytes    secs.    10^6bits/sec
> > > 
> > > Before:
> > >  87380  16384  16384    10.01    6026.04
> > >  87380  16384  16384    10.01    5992.17
> > >  87380  16384  16384    10.01    6071.23
> > > 
> > > After:
> > >  87380  16384  16384    10.01    6090.20
> > >  87380  16384  16384    10.01    6078.3
> > >  87380  16384  16384    10.00    6013.52
> > 
> > How could a filesystem change affect networking performance?
> > 
> > The change looks nice, but I'd microbenchmark it with a 
> > write-to-ext2-on-ramdisk
> > or something like that.
> 
> H.. I was told in another thread that this is the most frequently used 
> slab for this benchmark

That would be hair-raising ;)  I suspect confusion with sk_buff.

buffer_heads do get used quite a bit though.  A good microbenchmark would
be to sit in a tight loop extending and truncating an ext2 file



Re: Remove constructor from buffer_head

2007-05-03 Thread Christoph Lameter
On Thu, 3 May 2007, Andrew Morton wrote:

> The change looks nice, but I'd microbenchmark it with a 
> write-to-ext2-on-ramdisk
> or something like that.

Hmmm... How does one benchmark buffer head performance? Guess just by 
copying files? Not sure if the following will cut it.

Two tests. First copying 8M of small files into a 16M ramdisk:

for i in 1 2 3 4 5 6 7 8 9; do

mke2fs /dev/ram0 >/dev/null
mount /dev/ram0 /media >/dev/null
time cp -a /etc /media
umount /dev/ram0

done;


No constructor

real    0m0.104s
user    0m0.016s
sys     0m0.056s

real    0m0.090s
user    0m0.008s
sys     0m0.056s

real    0m0.089s
user    0m0.016s
sys     0m0.048s

real    0m0.097s
user    0m0.004s
sys     0m0.064s

real    0m0.091s
user    0m0.008s
sys     0m0.052s

real    0m0.091s
user    0m0.004s
sys     0m0.060s

real    0m0.098s
user    0m0.008s
sys     0m0.060s

real    0m0.091s
user    0m0.000s
sys     0m0.064s

real    0m0.090s
user    0m0.012s
sys     0m0.052s

W/constructor

real    0m0.099s
user    0m0.004s
sys     0m0.100s

real    0m0.098s
user    0m0.008s
sys     0m0.096s

real    0m0.091s
user    0m0.016s
sys     0m0.080s

real    0m0.091s
user    0m0.012s
sys     0m0.084s

real    0m0.090s
user    0m0.012s
sys     0m0.080s

real    0m0.090s
user    0m0.020s
sys     0m0.076s

real    0m1.269s
user    0m0.012s
sys     0m0.084s

real    0m0.095s
user    0m0.016s
sys     0m0.084s

real    0m0.096s
user    0m0.020s
sys     0m0.084s

The no constructor numbers are generally lower.
Lowest is no constructor with 0.089.

Second. Copy vmlinux (52M) to 128M ramdisk:

for i in 1 2 3 4 5 6 7 8 9; do

mke2fs /dev/ram0 >/dev/null
mount /dev/ram0 /media >/dev/null
time cp slub/vmlinux /media
umount /dev/ram0

done;


No constructor:

real    0m2.095s
user    0m0.000s
sys     0m0.168s

real    0m0.187s
user    0m0.008s
sys     0m0.124s

real    0m0.186s
user    0m0.008s
sys     0m0.120s

real    0m0.195s
user    0m0.008s
sys     0m0.128s

real    0m0.177s
user    0m0.004s
sys     0m0.120s

real    0m0.182s
user    0m0.004s
sys     0m0.120s

real    0m0.186s
user    0m0.008s
sys     0m0.120s

real    0m0.190s
user    0m0.004s
sys     0m0.128s

real    0m0.174s
user    0m0.004s
sys     0m0.116s


Constructor

real    0m0.183s
user    0m0.004s
sys     0m0.188s

real    0m0.183s
user    0m0.004s
sys     0m0.192s

real    0m0.177s
user    0m0.012s
sys     0m0.176s

real    0m0.186s
user    0m0.004s
sys     0m0.192s

real    0m0.187s
user    0m0.008s
sys     0m0.188s

real    0m0.184s
user    0m0.004s
sys     0m0.192s

real    0m0.177s
user    0m0.012s
sys     0m0.176s

real    0m0.183s
user    0m0.004s
sys     0m0.192s

real    0m0.182s
user    0m0.004s
sys     0m0.188s

Same here. Low is 0.174 no constructor.




Re: Remove constructor from buffer_head

2007-05-03 Thread Christoph Lameter
On Thu, 3 May 2007, Andrew Morton wrote:

> On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> 
> wrote:
> 
> > Performance tests show a slight improvement in netperf (not a
> > strong case for a performance improvement but removing the
> > constructor has definitely no negative impact so why keep
> > this around?).
> > 
> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET
> > Recv   Send    Send
> > Socket Socket  Message  Elapsed
> > Size   Size    Size     Time     Throughput
> > bytes  bytes   bytes    secs.    10^6bits/sec
> > 
> > Before:
> >  87380  16384  16384    10.01    6026.04
> >  87380  16384  16384    10.01    5992.17
> >  87380  16384  16384    10.01    6071.23
> > 
> > After:
> >  87380  16384  16384    10.01    6090.20
> >  87380  16384  16384    10.01    6078.3
> >  87380  16384  16384    10.00    6013.52
> 
> How could a filesystem change affect networking performance?
> 
> The change looks nice, but I'd microbenchmark it with a 
> write-to-ext2-on-ramdisk
> or something like that.

H.. I was told in another thread that this is the most frequently used 
slab for this benchmark .. Just accepted that as true.
 


Re: Remove constructor from buffer_head

2007-05-03 Thread Andrew Morton
On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> 
wrote:

> Performance tests show a slight improvement in netperf (not a
> strong case for a performance improvement but removing the
> constructor has definitely no negative impact so why keep
> this around?).
> 
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
> 
> Before:
>  87380  16384  16384    10.01    6026.04
>  87380  16384  16384    10.01    5992.17
>  87380  16384  16384    10.01    6071.23
> 
> After:
>  87380  16384  16384    10.01    6090.20
>  87380  16384  16384    10.01    6078.3
>  87380  16384  16384    10.00    6013.52

How could a filesystem change affect networking performance?

The change looks nice, but I'd microbenchmark it with a write-to-ext2-on-ramdisk
or something like that.


Remove constructor from buffer_head

2007-05-03 Thread Christoph Lameter
Performance tests show a slight improvement in netperf (not a
strong case for a performance improvement but removing the
constructor has definitely no negative impact so why keep
this around?).

TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

Before:
 87380  16384  16384    10.01    6026.04
 87380  16384  16384    10.01    5992.17
 87380  16384  16384    10.01    6071.23

After:
 87380  16384  16384    10.01    6090.20
 87380  16384  16384    10.01    6078.3
 87380  16384  16384    10.00    6013.52


Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 fs/buffer.c |   22 --
 1 file changed, 4 insertions(+), 18 deletions(-)

Index: slub/fs/buffer.c
===
--- slub.orig/fs/buffer.c   2007-05-03 19:17:09.0 -0700
+++ slub/fs/buffer.c2007-05-03 19:57:30.0 -0700
@@ -2907,9 +2907,10 @@ static void recalc_bh_state(void)

 struct buffer_head *alloc_buffer_head(gfp_t gfp_flags)
 {
-   struct buffer_head *ret = kmem_cache_alloc(bh_cachep,
+   struct buffer_head *ret = kmem_cache_zalloc(bh_cachep,
set_migrateflags(gfp_flags, __GFP_RECLAIMABLE));
if (ret) {
+   INIT_LIST_HEAD(&ret->b_assoc_buffers);
get_cpu_var(bh_accounting).nr++;
recalc_bh_state();
put_cpu_var(bh_accounting);
@@ -2928,17 +2929,6 @@ void free_buffer_head(struct buffer_head
 }
 EXPORT_SYMBOL(free_buffer_head);
 
-static void
-init_buffer_head(void *data, struct kmem_cache *cachep, unsigned long flags)
-{
-   if (flags & SLAB_CTOR_CONSTRUCTOR) {
-   struct buffer_head * bh = (struct buffer_head *)data;
-
-   memset(bh, 0, sizeof(*bh));
-   INIT_LIST_HEAD(&bh->b_assoc_buffers);
-   }
-}
-
 static void buffer_exit_cpu(int cpu)
 {
int i;
@@ -2965,12 +2955,8 @@ void __init buffer_init(void)
 {
int nrpages;
 
-   bh_cachep = kmem_cache_create("buffer_head",
-   sizeof(struct buffer_head), 0,
-   (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
-   SLAB_MEM_SPREAD),
-   init_buffer_head,
-   NULL);
+   bh_cachep = KMEM_CACHE(buffer_head,
+   SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);
 
/*
 * Limit the bh occupancy to 10% of ZONE_NORMAL

