Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-20 Thread Jakob Oestergaard
On Tue, Apr 19, 2005 at 06:46:28PM -0400, Trond Myklebust wrote:
> On Tuesday 19.04.2005 at 21:45 (+0200), Jakob Oestergaard wrote:
> 
> > It mounts a home directory from a 2.6.6 NFS server - the client and
> > server are on a hub'ed 100Mbit network.
> > 
> > On the earlier 2.6 client I/O performance was as one would expect on
> > hub'ed 100Mbit - meaning, not exactly stellar, but you'd get around 4-5
> > MB/sec and decent interactivity.
> 
> OK, hold it right there...
> 
...
> Also, does that hub support NICs that do autonegotiation? (I'll bet the
> answer is "no").

*blush*

Ok Trond, you got me there - I don't know why upgrading the client made
the problem much more visible though, but the *server* had negotiated
full duplex rather than half (the client negotiated half ok). Fixing
that on the server side made the client pleasant to work with again.
Mom's a happy camper now again  ;)
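
For anyone hitting the same thing: checking and forcing the duplex is an
ethtool/mii-tool job - roughly along these lines, assuming the usual eth0
and a driver that supports it:

  ethtool eth0                                   # show negotiated speed/duplex
  ethtool -s eth0 speed 100 duplex half autoneg off
  # or, on drivers without ethtool support:
  mii-tool -F 100baseTx-HD eth0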

Sorry for jumping the gun there...

To get back to the original problem;

I wonder if (as was discussed) the tg3 driver on my NFS server is
dropping packets, causing the 2.6.11 NFS client to misbehave... This
didn't make sense to me before (since earlier clients worked well), but
having just seen this other case where a broken server setup caused
2.6.11 clients to misbehave (where earlier clients were fine), maybe it
could be an explanation...

Will try either changing the tg3 driver or putting in an e1000 on my NFS
server - I will let you know about the status on this when I know more.

Thanks all,

-- 

 / jakob

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-19 Thread Trond Myklebust
On Tuesday 19.04.2005 at 21:45 (+0200), Jakob Oestergaard wrote:

> It mounts a home directory from a 2.6.6 NFS server - the client and
> server are on a hub'ed 100Mbit network.
> 
> On the earlier 2.6 client I/O performance was as one would expect on
> hub'ed 100Mbit - meaning, not exactly stellar, but you'd get around 4-5
> MB/sec and decent interactivity.

OK, hold it right there...

So, IIRC the problem was that you were seeing abominable retrans rates
on UDP and TCP, and you are using a 100Mbit hub rather than a switch?

What does the collision LED look like, when you see these performance
problems?
Also, does that hub support NICs that do autonegotiation? (I'll bet the
answer is "no").

Cheers,
  Trond

-- 
Trond Myklebust <[EMAIL PROTECTED]>


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-19 Thread Jakob Oestergaard
On Tue, Apr 12, 2005 at 11:28:43AM +0200, Jakob Oestergaard wrote:
...
> 
> But still, guys, it is the *same* server with tg3 that runs well with a
> 2.4 client but poorly with a 2.6 client.
> 
> Maybe I'm just staring myself blind at this, but I can't see how a
> general problem on the server (such as packet loss, latency or whatever)
> would cause no problems with a 2.4 client but major problems with a 2.6
> client.

Another data point;

I upgraded my mom's machine from an earlier 2.6 (don't remember which,
but I can find out) to 2.6.11.6.

It mounts a home directory from a 2.6.6 NFS server - the client and
server are on a hub'ed 100Mbit network.

On the earlier 2.6 client I/O performance was as one would expect on
hub'ed 100Mbit - meaning, not exactly stellar, but you'd get around 4-5
MB/sec and decent interactivity.

The typical workload here is storing or retrieving large TIFF files on
the client, while working with other things in KDE. So, if the
large-file NFS I/O causes NFS client stalls, it will be noticeable on the
desktop (probably as Konqueror or whatever is accessing configuration or
cache files).

With 2.6.11.6 the client is virtually unusable when large files are
transferred.  A "df -h" will hang on the mounted filesystem for several
seconds, and I have my mom on the phone complaining that various windows
won't close and that her machine is too slow (*again* - it's no more than
half a year since she got the new P4)  ;)

Now there's plenty of things to start optimizing; RPC over TCP, using a
switch or crossover cable instead of the hub, etc. etc.

However, what changed here was the client kernel going from an earlier
2.6 to 2.6.11.6.

A lot happened to the NFS client in 2.6.11 - I wonder whether any of
those patches would be worth trying to revert?  I have several setups
that suck currently, so I'm willing to try a thing or two :)

I would try 
---
<[EMAIL PROTECTED]>
RPC: Convert rpciod into a work queue for greater flexibility.
Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
---
if no one has a better idea...  But that's just a hunch based solely on
my observation of rpciod being a CPU hog on one of the earlier client
systems.  I didn't observe large 'sy' times in vmstat on this client
while it hung on NFS though...
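
Reverting it would be a matter of extracting that change as a diff and
applying it in reverse - something like this, with the file name being
just an example:

  cd linux-2.6.11.6
  patch -R -p1 < rpciod-workqueue.diff   # hypothetical name for the extracted change
  # rebuild, reboot, and re-run the large-file copy test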

Any suggestions would be greatly appreciated,

-- 

 / jakob


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-12 Thread Jakob Oestergaard
On Tue, Apr 12, 2005 at 11:03:29AM +1000, Greg Banks wrote:
> On Tue, 2005-04-12 at 01:42, Jakob Oestergaard wrote:
> > Yes, as far as I know - the Broadcom Tigon3 driver does not have the
> > option of enabling/disabling RX polling (if we agree that is what we're
> > talking about), but looking in tg3.c it seems that it *always*
> > unconditionally uses NAPI...
> 
> I've whined and moaned about this in the past, but for all its
> faults NAPI on tg3 doesn't lose packets.  It does cause a huge
> increase in irq cpu time on multiple fast CPUs.  What irq rate
> are you seeing?

Around 20,000 interrupts per second during the large write, on the IRQ
where eth0 is (this is not shared with anything else).

[sparrow:joe] $ cat /proc/interrupts
   CPU0   CPU1
...
169:    3853488  412570512   IO-APIC-level  eth0
...
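
The rate can be estimated by sampling the counters above over an
interval - e.g. something like this, where the awk fields assume the
two-CPU layout shown:

  a=$(awk '/eth0/ { print $2 + $3 }' /proc/interrupts); sleep 10
  b=$(awk '/eth0/ { print $2 + $3 }' /proc/interrupts)
  echo "$(( (b - a) / 10 )) eth0 interrupts/sec"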


But still, guys, it is the *same* server with tg3 that runs well with a
2.4 client but poorly with a 2.6 client.

Maybe I'm just staring myself blind at this, but I can't see how a
general problem on the server (such as packet loss, latency or whatever)
would cause no problems with a 2.4 client but major problems with a 2.6
client.

-- 

 / jakob


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Greg Banks
On Tue, 2005-04-12 at 01:42, Jakob Oestergaard wrote:
> Yes, as far as I know - the Broadcom Tigon3 driver does not have the
> option of enabling/disabling RX polling (if we agree that is what we're
> talking about), but looking in tg3.c it seems that it *always*
> unconditionally uses NAPI...

I've whined and moaned about this in the past, but for all its
faults NAPI on tg3 doesn't lose packets.  It does cause a huge
increase in irq cpu time on multiple fast CPUs.  What irq rate
are you seeing?

I did once post a patch to make NAPI for tg3 selectable at
configure time.
http://marc.theaimsgroup.com/?l=linux-netdev&m=107183822710263&w=2

> No dropped packets... I wonder if the tg3 driver is being completely
> honest about this...

At one point it wasn't, since this patch it is:
http://marc.theaimsgroup.com/?l=linux-netdev&m=108433829603319&w=2

Greg.
-- 
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.



Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Jakob Oestergaard
On Mon, Apr 11, 2005 at 11:21:45AM -0400, Trond Myklebust wrote:
> On Monday 11.04.2005 at 16:41 (+0200), Jakob Oestergaard wrote:
> 
> > > That can mean either that the server is dropping fragments, or that the
> > > client is dropping the replies. Can you generate a similar tcpdump on
> > > the server?
> > 
> > Certainly;  http://unthought.net/sparrow.dmp.bz2
> 
> So, it looks to me as if "sparrow" is indeed dropping packets (missed
> sequences). Is it running with NAPI enabled too?

Yes, as far as I know - the Broadcom Tigon3 driver does not have the
option of enabling/disabling RX polling (if we agree that is what we're
talking about), but looking in tg3.c it seems that it *always*
unconditionally uses NAPI...

sparrow:~# ifconfig
eth0  Link encap:Ethernet  HWaddr 00:09:3D:10:BB:1E
  inet addr:10.0.1.20  Bcast:10.0.1.255  Mask:255.255.255.0
  inet6 addr: fe80::209:3dff:fe10:bb1e/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:2304578 errors:0 dropped:0 overruns:0 frame:0
  TX packets:2330829 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:2381644307 (2.2 GiB)  TX bytes:2191756317 (2.0 GiB)
  Interrupt:169

No dropped packets... I wonder if the tg3 driver is being completely
honest about this...
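
One way to cross-check ifconfig is the NIC-level statistics, if the
driver exports them - e.g.:

  ethtool -S eth0 | grep -Ei 'drop|discard|err|fifo'
  netstat -s | grep -i error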

Still, 2.4 manages to perform twice as fast against the same server.

And, the 2.6 client still has extremely heavy CPU usage (from rpciod
mainly, which doesn't show up in profiles)

The plot thickens...

Trond (or anyone else feeling they might have some insight they would
like to share on this one), I'll do anything you say (ok, *almost*
anything you say) - any ideas?


-- 

 / jakob


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Trond Myklebust
On Monday 11.04.2005 at 16:41 (+0200), Jakob Oestergaard wrote:

> > That can mean either that the server is dropping fragments, or that the
> > client is dropping the replies. Can you generate a similar tcpdump on
> > the server?
> 
> Certainly;  http://unthought.net/sparrow.dmp.bz2

So, it looks to me as if "sparrow" is indeed dropping packets (missed
sequences). Is it running with NAPI enabled too?

Cheers,
  Trond
-- 
Trond Myklebust <[EMAIL PROTECTED]>


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Jakob Oestergaard
On Mon, Apr 11, 2005 at 10:35:25AM -0400, Trond Myklebust wrote:
> On Monday 11.04.2005 at 15:47 (+0200), Jakob Oestergaard wrote:
> 
> > Certainly;
> > 
> > http://unthought.net/binary.dmp.bz2
> > 
> > I got an 'invalid snaplen' with the 9 you suggested, the above dump
> > is done with 9000 - if you need another snaplen please just let me know.
> 
> So, the RPC itself looks good, but it also looks as if after a while you
> are running into some heavy retransmission problems with TCP too (at the
> TCP level now, instead of at the RPC level). When you get into that
> mode, it looks as if every 2nd or 3rd TCP segment being sent from the
> client is being lost...

Odd...

I'm really sorry for using your time if this ends up being just a
networking problem.

> That can mean either that the server is dropping fragments, or that the
> client is dropping the replies. Can you generate a similar tcpdump on
> the server?

Certainly;  http://unthought.net/sparrow.dmp.bz2


-- 

 / jakob


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Trond Myklebust
On Monday 11.04.2005 at 15:47 (+0200), Jakob Oestergaard wrote:

> Certainly;
> 
> http://unthought.net/binary.dmp.bz2
> 
> I got an 'invalid snaplen' with the 9 you suggested, the above dump
> is done with 9000 - if you need another snaplen please just let me know.

So, the RPC itself looks good, but it also looks as if after a while you
are running into some heavy retransmission problems with TCP too (at the
TCP level now, instead of at the RPC level). When you get into that
mode, it looks as if every 2nd or 3rd TCP segment being sent from the
client is being lost...
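
You can also watch this without staring at dumps - the TCP retransmit
counters in "netstat -s" on the client should be climbing steadily while
the copy runs, e.g.:

  netstat -s | grep -i retrans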

That can mean either that the server is dropping fragments, or that the
client is dropping the replies. Can you generate a similar tcpdump on
the server?

Cheers,
  Trond
-- 
Trond Myklebust <[EMAIL PROTECTED]>


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Jakob Oestergaard
On Mon, Apr 11, 2005 at 08:35:39AM -0400, Trond Myklebust wrote:
...
> That certainly shouldn't be the case (and isn't on any of my setups). Is
> the behaviour identical on both the PIII and the Opteron systems?

The dual opteron is the nfs server

The dual athlon is the 2.4 nfs client

The dual PIII is the 2.6 nfs client

> As for the WRITE rates, could you send me a short tcpdump from the
> "sequential write" section of the above test? Just use "tcpdump -s 9
> -w binary.dmp"  just for a couple of seconds. I'd like to check the
> latencies, and just check that you are indeed sending unstable writes
> with not too many commit or getattr calls.

Certainly;

http://unthought.net/binary.dmp.bz2

I got an 'invalid snaplen' with the 9 you suggested, the above dump
is done with 9000 - if you need another snaplen please just let me know.
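
For completeness, a capture restricted to the NFS traffic would look
something like this - the interface name and the port 2049 filter are
assumptions on my part:

  tcpdump -i eth0 -s 9000 -w binary.dmp host 10.0.1.20 and port 2049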

A little explanation for the IPs you see;
 sparrow/10.0.1.20 - nfs server
 raven/10.0.1.7 - 2.6 nfs client
 osprey/10.0.1.13 - NIS/DNS server

Thanks,

-- 

 / jakob


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Trond Myklebust
On Monday 11.04.2005 at 09:48 (+0200), Jakob Oestergaard wrote:

> tcp with timeo=600 causes retransmits (as seen with nfsstat) to drop to
> zero.

Good.

>          File   Block  Num  Seq Read    Rand Read   Seq Write   Rand Write
>   Dir    Size   Size   Thr  Rate (CPU%) Rate (CPU%) Rate (CPU%) Rate (CPU%)
> ------- ------ ------- ---  ----------- ----------- ----------- -----------
>    .     2000   4096    1   60.50 67.6% 30.12 14.4% 22.54 30.1% 7.075 27.8%
>    .     2000   4096    2   59.87 69.0% 34.34 19.0% 24.09 35.2% 7.805 30.0%
>    .     2000   4096    4   62.27 69.8% 44.87 29.9% 23.07 34.3% 8.239 30.9%
> 
> So, reads start off better, it seems, but writes are still half speed of
> 2.4.25.
>
> I should say that it is common to see a single rpciod thread hogging
> 100% CPU for 20-30 seconds - that looks suspicious to me, other than
> that, I can't really point my finger at anything in this setup.

That certainly shouldn't be the case (and isn't on any of my setups). Is
the behaviour identical on both the PIII and the Opteron systems?

As for the WRITE rates, could you send me a short tcpdump from the
"sequential write" section of the above test? Just use "tcpdump -s 9
-w binary.dmp"  just for a couple of seconds. I'd like to check the
latencies, and just check that you are indeed sending unstable writes
with not too many commit or getattr calls.

Cheers
  Trond
-- 
Trond Myklebust <[EMAIL PROTECTED]>


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Jakob Oestergaard
On Sat, Apr 09, 2005 at 05:52:32PM -0400, Trond Myklebust wrote:
> On Saturday 09.04.2005 at 23:35 (+0200), Jakob Oestergaard wrote:
> 
> > 2.6.11.6: (dual PIII 1GHz, 2G RAM, Intel e1000)
> > 
> >          File   Block  Num  Seq Read    Rand Read   Seq Write   Rand Write
> >   Dir    Size   Size   Thr  Rate (CPU%) Rate (CPU%) Rate (CPU%) Rate (CPU%)
> > ------- ------ ------- ---  ----------- ----------- ----------- -----------
> >    .     2000   4096    1   38.34 18.8% 19.61 6.77% 22.53 23.4% 6.947 15.6%
> >    .     2000   4096    2   52.82 29.0% 24.42 9.37% 24.20 27.0% 7.755 16.7%
> >    .     2000   4096    4   62.48 34.8% 33.65 17.0% 24.73 27.6% 8.027 15.4%
> > 
> > 
> > 44MiB/sec for 2.4 versus 22MiB/sec for 2.6 - any suggestions as to how
> > this could be improved?
> 
> What happened to the retransmission rates when you changed to TCP?

tcp with timeo=600 causes retransmits (as seen with nfsstat) to drop to
zero.

> 
> Note that on TCP (besides bumping the value for timeo) I would also
> recommend using a full 32k r/wsize instead of 4k (if the network is of
> decent quality, I'd recommend 32k for UDP too).

32k seems to be the default for both UDP and TCP.

The network should be of decent quality - e1000 on client, tg3 on
server, both with short cables into a gigabit switch with plenty of
backplane headroom.

> The other tweak you can apply for TCP is to bump the value
> for /proc/sys/sunrpc/tcp_slot_table_entries. That will allow you to have
> several more RPC requests in flight (although that will also tie up more
> threads on the server).
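
For reference, bumping it is just a matter of the following on the
client, before mounting (64 is the value used further down):

  echo 64 > /proc/sys/sunrpc/tcp_slot_table_entries
  # or, persistently, sunrpc.tcp_slot_table_entries = 64 in /etc/sysctl.conf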

Changing only to TCP gives me:

         File   Block  Num  Seq Read    Rand Read   Seq Write   Rand Write
  Dir    Size   Size   Thr  Rate (CPU%) Rate (CPU%) Rate (CPU%) Rate (CPU%)
------- ------ ------- ---  ----------- ----------- ----------- -----------
   .     2000   4096    1   47.04 65.2% 50.57 26.2% 24.24 29.7% 6.896 28.7%
   .     2000   4096    2   55.77 66.1% 61.72 31.9% 24.13 33.0% 7.646 26.6%
   .     2000   4096    4   61.94 68.9% 70.52 42.5% 25.65 35.6% 8.042 26.7%

Pretty much the same as before - with writes being suspiciously slow
(compared to good ole' 2.4.25)

With tcp_slot_table_entries bumped to 64 on the client (128 knfsd
threads on the server, same as in all tests), I see:

         File   Block  Num  Seq Read    Rand Read   Seq Write   Rand Write
  Dir    Size   Size   Thr  Rate (CPU%) Rate (CPU%) Rate (CPU%) Rate (CPU%)
------- ------ ------- ---  ----------- ----------- ----------- -----------
   .     2000   4096    1   60.50 67.6% 30.12 14.4% 22.54 30.1% 7.075 27.8%
   .     2000   4096    2   59.87 69.0% 34.34 19.0% 24.09 35.2% 7.805 30.0%
   .     2000   4096    4   62.27 69.8% 44.87 29.9% 23.07 34.3% 8.239 30.9%

So, reads start off better, it seems, but writes are still half speed of
2.4.25.

I should say that it is common to see a single rpciod thread hogging
100% CPU for 20-30 seconds - that looks suspicious to me, other than
that, I can't really point my finger at anything in this setup.

Any suggestions Trond?   I'd be happy to run some tests for you if you
have any idea how we can speed up those writes (or reads for that
matter, although I am fairly happy with those).


-- 

 / jakob


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-09 Thread Trond Myklebust
On Saturday 09.04.2005 at 23:35 (+0200), Jakob Oestergaard wrote:

> 2.6.11.6: (dual PIII 1GHz, 2G RAM, Intel e1000)
> 
>          File   Block  Num  Seq Read    Rand Read   Seq Write   Rand Write
>   Dir    Size   Size   Thr  Rate (CPU%) Rate (CPU%) Rate (CPU%) Rate (CPU%)
> ------- ------ ------- ---  ----------- ----------- ----------- -----------
>    .     2000   4096    1   38.34 18.8% 19.61 6.77% 22.53 23.4% 6.947 15.6%
>    .     2000   4096    2   52.82 29.0% 24.42 9.37% 24.20 27.0% 7.755 16.7%
>    .     2000   4096    4   62.48 34.8% 33.65 17.0% 24.73 27.6% 8.027 15.4%
> 
> 
> 44MiB/sec for 2.4 versus 22MiB/sec for 2.6 - any suggestions as to how
> this could be improved?

What happened to the retransmission rates when you changed to TCP?

Note that on TCP (besides bumping the value for timeo) I would also
recommend using a full 32k r/wsize instead of 4k (if the network is of
decent quality, I'd recommend 32k for UDP too).

The other tweak you can apply for TCP is to bump the value
for /proc/sys/sunrpc/tcp_slot_table_entries. That will allow you to have
several more RPC requests in flight (although that will also tie up more
threads on the server).
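
Putting those together, the client side would end up with something like
this (server name, export path and slot-table value here are just
placeholders):

  echo 64 > /proc/sys/sunrpc/tcp_slot_table_entries
  mount -t nfs -o tcp,timeo=600,rsize=32768,wsize=32768 server:/export /mnt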

Don't forget that you need to unmount then mount again after making
these changes (-oremount won't suffice).

Cheers,
  Trond

-- 
Trond Myklebust <[EMAIL PROTECTED]>


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-09 Thread Jakob Oestergaard
On Thu, Apr 07, 2005 at 12:17:51PM -0400, Trond Myklebust wrote:
> On Thursday 07.04.2005 at 17:38 (+0200), Jakob Oestergaard wrote:
> 
> > I tweaked the VM a bit, put the following in /etc/sysctl.conf:
> >  vm.dirty_writeback_centisecs=100
> >  vm.dirty_expire_centisecs=200
> > 
> > The defaults are 500 and 3000 respectively...
> > 
> > This improved things a lot; the client is now "almost not very laggy",
> > and load stays in the saner 1-2 range.
> 
> OK. That hints at what is causing the latencies on the server: I'll bet
> it is the fact that the page reclaim code tries to be clever, and uses
> NFSv3 STABLE writes in order to be able to free up the dirty pages
> immediately. Could you try the following patch, and see if that makes a
> difference too?

The patch alone without the tweaked VM settings doesn't cure the lag - I
think it's better than without the patch (I can actually type this mail
with a large copy running). With the tweaked VM settings too, it's
pretty good - still a little lag, but not enough to really make it
annoying.

Performance is pretty much the same as before (copying an 8GiB file with
15-16MiB/sec - about half the performance of what I get locally on the
file server).

Something that worries me;  It seems that 2.4.25 is a lot faster as an NFS
client than 2.6.11.6, most notably on writes - see the following
tiobench results (2000 KiB file, tests with 1, 2 and 4 threads) up
against the same NFS server:

2.4.25:  (dual athlon MP 1.4GHz, 1G RAM, Intel e1000)

         File   Block  Num  Seq Read    Rand Read   Seq Write   Rand Write
  Dir    Size   Size   Thr  Rate (CPU%) Rate (CPU%) Rate (CPU%) Rate (CPU%)
------- ------ ------- ---  ----------- ----------- ----------- -----------
   .     2000   4096    1   58.87 54.9% 5.615 5.03% 44.40 44.2% 4.534 8.41%
   .     2000   4096    2   56.98 58.3% 6.926 6.64% 41.61 58.0% 4.462 10.8%
   .     2000   4096    4   53.90 59.0% 7.764 9.44% 39.85 61.5% 4.256 10.8%


2.6.11.6: (dual PIII 1GHz, 2G RAM, Intel e1000)

         File   Block  Num  Seq Read    Rand Read   Seq Write   Rand Write
  Dir    Size   Size   Thr  Rate (CPU%) Rate (CPU%) Rate (CPU%) Rate (CPU%)
------- ------ ------- ---  ----------- ----------- ----------- -----------
   .     2000   4096    1   38.34 18.8% 19.61 6.77% 22.53 23.4% 6.947 15.6%
   .     2000   4096    2   52.82 29.0% 24.42 9.37% 24.20 27.0% 7.755 16.7%
   .     2000   4096    4   62.48 34.8% 33.65 17.0% 24.73 27.6% 8.027 15.4%


44MiB/sec for 2.4 versus 22MiB/sec for 2.6 - any suggestions as to how
this could be improved?

(note; the write performance doesn't change notably with VM tuning nor
with the one-liner change that Trond suggested)

-- 

 / jakob


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Trond Myklebust
On Thursday 07.04.2005 at 17:38 (+0200), Jakob Oestergaard wrote:

> I tweaked the VM a bit, put the following in /etc/sysctl.conf:
>  vm.dirty_writeback_centisecs=100
>  vm.dirty_expire_centisecs=200
> 
> The defaults are 500 and 3000 respectively...
> 
> This improved things a lot; the client is now "almost not very laggy",
> and load stays in the saner 1-2 range.

OK. That hints at what is causing the latencies on the server: I'll bet
it is the fact that the page reclaim code tries to be clever, and uses
NFSv3 STABLE writes in order to be able to free up the dirty pages
immediately. Could you try the following patch, and see if that makes a
difference too?

Cheers,
  Trond

 fs/nfs/write.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.12-rc2/fs/nfs/write.c
===================================================================
--- linux-2.6.12-rc2.orig/fs/nfs/write.c
+++ linux-2.6.12-rc2/fs/nfs/write.c
@@ -305,7 +305,7 @@ do_it:
 		if (err >= 0) {
 			err = 0;
 			if (wbc->for_reclaim)
-				nfs_flush_inode(inode, 0, 0, FLUSH_STABLE);
+				nfs_flush_inode(inode, 0, 0, 0);
 		}
 	} else {
 		err = nfs_writepage_sync(ctx, inode, page, 0,
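
To test it, save the hunk above to a file and apply it to the client
tree before rebuilding - the file name below is made up:

  cd linux-2.6.12-rc2
  patch -p1 < nfs-flush-unstable.diff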


-- 
Trond Myklebust <[EMAIL PROTECTED]>


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Greg Banks
On Thu, Apr 07, 2005 at 05:38:48PM +0200, Jakob Oestergaard wrote:
> On Thu, Apr 07, 2005 at 09:19:06AM +1000, Greg Banks wrote:
> ...
> > How large is the client's RAM? 
> 
> 2GB - (32 bit kernel because it's dual PIII, so I use highmem)

Ok, that's probably not enough to fully trigger some of the problems
I've seen on large-memory NFS clients.

> A few more details:
> 
> With standard VM settings, the client will be laggy during the copy, but
> it will also have a load average around 10 (!)   And really, the only
> thing I do with it is one single 'cp' operation.  The CPU hogs are
> pdflush, rpciod/0 and rpciod/1.

NFS writes of single files much larger than client RAM still have
interesting issues.

> I tweaked the VM a bit, put the following in /etc/sysctl.conf:
>  vm.dirty_writeback_centisecs=100
>  vm.dirty_expire_centisecs=200
> 
> The defaults are 500 and 3000 respectively...

Yes, you want more frequent and smaller writebacks.  It may help to
reduce vm.dirty_ratio and possibly vm.dirty_background_ratio.
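
E.g. something like the following on top of the centisecs settings you
already have - the exact numbers are just a starting point:

  sysctl -w vm.dirty_ratio=10
  sysctl -w vm.dirty_background_ratio=2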

> This improved things a lot; the client is now "almost not very laggy",
> and load stays in the saner 1-2 range.
> 
> Still, system CPU utilization is very high (still from rpciod and
> pdflush - more rpciod and less pdflush though),

This is probably the rpciod's and pdflush all trying to do things
at the same time and contending for the BKL.

> During the copy I typically see:
> 
> nfs_write_data  681   952 480  8 1 : tunables  54 27 8 : slabdata 119 119 108
> nfs_page  15639 18300  64 61 1 : tunables 120 60 8 : slabdata 300 300 180

That's not so bad, it's only about 3% of the system's pages.

Greg.
-- 
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Jakob Oestergaard
On Thu, Apr 07, 2005 at 09:19:06AM +1000, Greg Banks wrote:
...
> How large is the client's RAM? 

2GB - (32 bit kernel because it's dual PIII, so I use highmem)

A few more details:

With standard VM settings, the client will be laggy during the copy, but
it will also have a load average around 10 (!)   And really, the only
thing I do with it is one single 'cp' operation.  The CPU hogs are
pdflush, rpciod/0 and rpciod/1.

I tweaked the VM a bit, put the following in /etc/sysctl.conf:
 vm.dirty_writeback_centisecs=100
 vm.dirty_expire_centisecs=200

The defaults are 500 and 3000 respectively...
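
These can also be applied on the fly, either with "sysctl -p" after
editing the file, or directly:

  sysctl -w vm.dirty_writeback_centisecs=100
  sysctl -w vm.dirty_expire_centisecs=200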

This improved things a lot; the client is now "almost not very laggy",
and load stays in the saner 1-2 range.

Still, system CPU utilization is very high (still from rpciod and
pdflush - more rpciod and less pdflush though), and the file copying
performance over NFS is roughly half of what I get locally on the server
(8G file copy with 16MB/sec over NFS versus 32 MB/sec locally).

(I run with plenty of knfsd threads on the server, and generally the
server is not very loaded when the client is pounding it as much as it
can)

> What does the following command report
> before and during the write?
> 
> egrep 'nfs_page|nfs_write_data' /proc/slabinfo

During the copy I typically see:

nfs_write_data  681   952 480  8 1 : tunables  54 27 8 : slabdata 119 119 108
nfs_page  15639 18300  64 61 1 : tunables 120 60 8 : slabdata 300 300 180

The "18300" above typically goes from 12000 to 25000...

After the copy I see:

nfs_write_data  36  48 480  8 1 : tunables   54   27 8 : slabdata  5  6  0
nfs_page 1  61  64 61 1 : tunables  120   60 8 : slabdata  1  1  0

-- 

 / jakob


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Jakob Oestergaard
On Wed, Apr 06, 2005 at 05:28:56PM -0400, Trond Myklebust wrote:
...
> A look at "nfsstat" might help, as might "netstat -s".
> 
> In particular, I suggest looking at the "retrans" counter in nfsstat.

When doing a 'cp largefile1 largefile2' on the client, I see approx. 10
retransmissions per second in nfsstat.

I don't really know if this is a lot...

I also see packets dropped in ifconfig - approx. 10 per second...  I
wonder if these two are related.

Client has an Intel e1000 card - I just set the RX ring buffer to the
max. of 4096 (up from the default of 256), but this doesn't seem to help
a lot (I see the 10 drops/sec with the large RX buffer).
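
For reference, the ring resize itself is just ethtool:

  ethtool -g eth0        # show current and maximum ring sizes
  ethtool -G eth0 rx 4096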

I use NAPI - is there anything else I can do to make the card not drop
packets?   I'm just assuming that this might at least be a part of the
problem, but with large RX ring and NAPI I don't know how much else I
can do to not make the box drop incoming data...

> When you say that TCP did not help, please note that if retrans is high,
> then using TCP with a large value for timeo (for instance -otimeo=600)
> is a good idea. It is IMHO a bug for the "mount" program to be setting
> default timeout values of less than 30 seconds when using TCP.

I can try that.

Thanks!

-- 

 / jakob

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Greg Banks
On Thu, Apr 07, 2005 at 05:38:48PM +0200, Jakob Oestergaard wrote:
> On Thu, Apr 07, 2005 at 09:19:06AM +1000, Greg Banks wrote:
> ...
> > How large is the client's RAM?
> 
> 2GB - (32 bit kernel because it's dual PIII, so I use highmem)

Ok, that's probably not enough to fully trigger some of the problems
I've seen on large-memory NFS clients.

> A few more details:
> 
> With standard VM settings, the client will be laggy during the copy, but
> it will also have a load average around 10 (!)   And really, the only
> thing I do with it is one single 'cp' operation.  The CPU hogs are
> pdflush, rpciod/0 and rpciod/1.

NFS writes of single files much larger than client RAM still have
interesting issues.

> I tweaked the VM a bit, put the following in /etc/sysctl.conf:
>  vm.dirty_writeback_centisecs=100
>  vm.dirty_expire_centisecs=200
> 
> The defaults are 500 and 3000 respectively...

Yes, you want more frequent and smaller writebacks.  It may help to
reduce vm.dirty_ratio and possibly vm.dirty_background_ratio.
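
For instance, in /etc/sysctl.conf terms (the numbers below are only
illustrative starting points, not tested values):

 vm.dirty_ratio=10
 vm.dirty_background_ratio=2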

> This improved things a lot; the client is now almost not very laggy,
> and load stays in the saner 1-2 range.
> 
> Still, system CPU utilization is very high (still from rpciod and
> pdflush - more rpciod and less pdflush though),

This is probably the rpciods and pdflush all trying to do things
at the same time and contending for the BKL.

> During the copy I typically see:
> 
> nfs_write_data  681   952 480  8 1 : tunables  54 27 8 : slabdata 119 119 108
> nfs_page  15639 18300  64 61 1 : tunables 120 60 8 : slabdata 300 300 180

That's not so bad, it's only about 3% of the system's pages.
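
(Back-of-the-envelope, assuming one nfs_page per outstanding 4K page:
18300 * 4 KB ~ 71 MB in flight, against 2 GB / 4 KB = 524288 pages,
i.e. 18300 / 524288 ~ 3.5%.)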

Greg.
-- 
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Trond Myklebust
On Thursday 07.04.2005 at 17:38 (+0200), Jakob Oestergaard wrote:

> I tweaked the VM a bit, put the following in /etc/sysctl.conf:
>  vm.dirty_writeback_centisecs=100
>  vm.dirty_expire_centisecs=200
> 
> The defaults are 500 and 3000 respectively...
> 
> This improved things a lot; the client is now almost not very laggy,
> and load stays in the saner 1-2 range.

OK. That hints at what is causing the latencies on the server: I'll bet
it is the fact that the page reclaim code tries to be clever, and uses
NFSv3 STABLE writes in order to be able to free up the dirty pages
immediately. Could you try the following patch, and see if that makes a
difference too?

Cheers,
  Trond

 fs/nfs/write.c |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.12-rc2/fs/nfs/write.c
===================================================================
--- linux-2.6.12-rc2.orig/fs/nfs/write.c
+++ linux-2.6.12-rc2/fs/nfs/write.c
@@ -305,7 +305,7 @@ do_it:
 		if (err >= 0) {
 			err = 0;
 			if (wbc->for_reclaim)
-				nfs_flush_inode(inode, 0, 0, FLUSH_STABLE);
+				nfs_flush_inode(inode, 0, 0, 0);
 		}
 	} else {
 		err = nfs_writepage_sync(ctx, inode, page, 0,


-- 
Trond Myklebust <[EMAIL PROTECTED]>


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-06 Thread Greg Banks
On Wed, Apr 06, 2005 at 06:01:23PM +0200, Jakob Oestergaard wrote:
> 
> Problem; during simple tests such as a 'cp largefile0 largefile1' on the
> client (under the mountpoint from the NFS server), the client becomes
> extremely laggy, NFS writes are slow, and I see very high CPU
> utilization by bdflush and rpciod.
> 
> For example, writing a single 8G file with dd will give me about
> 20MB/sec (I get 60+ MB/sec locally on the server), and the client rarely
> drops below 40% system CPU utilization.

How large is the client's RAM?  What does the following command report
before and during the write?

egrep 'nfs_page|nfs_write_data' /proc/slabinfo

Greg.
-- 
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.


Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-06 Thread Trond Myklebust
On Wednesday 06.04.2005 at 18:01 (+0200), Jakob Oestergaard wrote:
> What do I do?
> 
> Performance sucks and the profiles do not make sense...
> 
> Any suggestions would be greatly appreciated,

A look at "nfsstat" might help, as might "netstat -s".

In particular, I suggest looking at the "retrans" counter in nfsstat.
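
For instance (output details vary a bit between nfs-utils versions):

 nfsstat -rc                   # client-side RPC counters, incl. retrans
 netstat -s | grep -i retrans  # TCP retransmission counters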

When you say that TCP did not help, please note that if retrans is high,
then using TCP with a large value for timeo (for instance -otimeo=600)
is a good idea. It is IMHO a bug for the "mount" program to be setting
default timeout values of less than 30 seconds when using TCP.

Cheers,
  Trond

-- 
Trond Myklebust <[EMAIL PROTECTED]>


bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-06 Thread Jakob Oestergaard

Hello list,

Setup; 
 NFS server (dual opteron, HW RAID, SCA disk enclosure) on 2.6.11.6
 NFS client (dual PIII) on 2.6.11.6

Both on switched gigabit ethernet - I use NFSv3 over UDP (tried TCP but
this makes no difference).

Problem; during simple tests such as a 'cp largefile0 largefile1' on the
client (under the mountpoint from the NFS server), the client becomes
extremely laggy, NFS writes are slow, and I see very high CPU
utilization by bdflush and rpciod.

For example, writing a single 8G file with dd will give me about
20MB/sec (I get 60+ MB/sec locally on the server), and the client rarely
drops below 40% system CPU utilization.
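
(The dd invocation is nothing special - something along the lines of

 dd if=/dev/zero of=/mnt/nfs/testfile bs=1M count=8192

with the output path being a placeholder for whatever sits under the
NFS mount.)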

I tried profiling the client (booting with profile=2), but the profile
traces do not make sense; a profile from a single write test where the
client never dropped below 30% system time (and frequently sat at
40-50%) gives me something like:

raven:~# less profile3 | sort -nr | head
257922 total                      2.6394
254739 default_idle            5789.5227
   960 smp_call_function          4.
   888 __might_sleep              5.6923
   569 finish_task_switch         4.7417
   176 kmap_atomic                1.7600
   113 __wake_up                  1.8833
    74 kmap                       1.5417
    64 kunmap_atomic              5.
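
(For anyone reproducing this: with profile=2 the kernel profile buffer
is normally dumped with readprofile, e.g.

 readprofile -m /boot/System.map | sort -nr | head
 readprofile -r                # reset the counters between runs

where the System.map path may differ per install.)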

The difference between default_idle and total is 1.2% - but I never saw
system CPU utilization under 30%...
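
(That is, (257922 - 254739) / 257922 ~ 1.2% of all profile ticks landed
outside the idle loop.)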

Besides, there's basically nothing in the profile that rhymes with
rpciod or bdflush (the two high-hitters on top during the test).

What do I do?

Performance sucks and the profiles do not make sense...

Any suggestions would be greatly appreciated,

Thank you!

-- 

 / jakob

