Re: Async NFS exports?

1999-08-22 Thread Doug Rabson

On Fri, 20 Aug 1999, Matthew Dillon wrote:

> : Just to be clear... I am wondering if mounting (on the NFS _server_) a
> : partition (that is exportable) as async will have any performance
> : benefits to the NFS clients?
> :
> :As a first guess, probably not unless you have a large number of active
> :clients.  Any modern hard disc will outperform ethernet/fast ethernet,
> :especially for larger read/writes.  For large numbers of smaller
> :operations, or when there is a large number of simultaneous outstanding
> :requests from clients, maybe.  I'd say watch the disc itself (iostat is
> :your friend), and if it's pegged (especially large numbers of tps) async
> :might buy you some increase.
> :--
> :Matthew Fuller (MF4839) |[EMAIL PROTECTED]
>
> Not much if at all, whether you have a large number of clients or not,
> at least if you are using NFSv3 mounts.
>
> The reason is due to the way NFSv3 issues writes.  NFSv3 issues a
> write but no longer assumes that the write has been synced to the
> server's disk as of when the reply comes back.  Instead it keeps the
> buffer around and does a later commit rpc to do the sync, presumably
> long after the server has already synced the data.
>
> So, effectively, all NFSv3 writes are async insofar as the client's
> buffer cache is able to keep abreast of the write-rate.
>
> Hmm, interesting.  I see another optimization I can do to fix the
> buffer cache saturation case in CURRENT on the client.  The COMMIT rpc's
> aren't being issued async.

You need to track the return value of the commit so that you can detect
server reboots and sync-write the data again. If you change to async, make
sure that you still keep this part - it's essential to the protocol.
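
The check described here can be sketched roughly as follows. In NFSv3
(RFC 1813) the WRITE and COMMIT replies both carry a write verifier that
the server changes when it reboots, so a mismatch tells the client that
uncommitted data may have been lost. The Server and Client classes below
are hypothetical toy models, not FreeBSD code, and the sketch resends the
data and commits again rather than issuing a true synchronous write.

```python
class Server:
    """Toy NFSv3 server: unstable writes sit in memory until COMMIT."""

    def __init__(self):
        self.verifier = 1      # changes on every reboot (assumed: a boot counter)
        self.cache = {}        # unstable data, lost on a crash
        self.disk = {}         # stable storage

    def write(self, offset, data):
        self.cache[offset] = data
        return self.verifier

    def commit(self):
        self.disk.update(self.cache)
        self.cache.clear()
        return self.verifier

    def reboot(self):
        self.cache.clear()     # uncommitted data is gone
        self.verifier += 1


class Client:
    def __init__(self, server):
        self.server = server
        self.pending = {}      # offset -> (data, verifier seen at WRITE time)

    def write(self, offset, data):
        self.pending[offset] = (data, self.server.write(offset, data))

    def commit(self):
        """Commit, resending any buffer whose verifier no longer matches."""
        while self.pending:
            verf = self.server.commit()
            stale = {o: d for o, (d, v) in self.pending.items() if v != verf}
            self.pending = {}
            for offset, data in stale.items():
                # The server rebooted after this WRITE: send the data again.
                self.write(offset, data)
```

The point of the loop is that a buffer is only released once a COMMIT
reply carries the same verifier the buffer was written under.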

--
Doug Rabson Mail:  [EMAIL PROTECTED]
Nonlinear Systems Ltd.  Phone: +44 181 442 9037




To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: Async NFS exports?

1999-08-22 Thread Matthew Dillon
: buffer cache is able to keep abreast of the write-rate.
: 
: Hmm, interesting.  I see another optimization I can do to fix the
: buffer cache saturation case in CURRENT on the client.  The COMMIT rpc's
: aren't being issued async.
:
:You need to track the return value of the commit so that you can detect
:server reboots and sync-write the data again. If you change to async, make
:sure that you still keep this part - it's essential to the protocol.
:
:--
:Doug Rabson             Mail:  d...@nlsystems.com
:Nonlinear Systems Ltd.  Phone: +44 181 442 9037

   These are buffer-cache entities we are talking about here, so they won't
   go away until NFS tells the system they can go away.  In that respect
   async I/O is no different than sync I/O.  async I/O is simply run
   synchronously from an nfsiod context.

-Matt
Matthew Dillon 
dil...@backplane.com





Re: Async NFS exports?

1999-08-21 Thread Matthew Dillon

:The problem that occurs on the FreeBSD server is simply that the
:nfsrv_commit() procedure calls fsync() on the file... on the *ENTIRE*
:file, for every commit rpc, rather than syncing just the offset/range
:requested.  I am looking into ways to fix this.
:

Ok, I've verified the problem.  The nfsrv_commit() code running on the
server is definitely the culprit.  I was able to make a tentative patch
which increased NFSv3 write performance to 10 MBytes/sec -- the maximum
my 100BaseTX network can do.

CURRENT in tree:                                  2.5 MBytes/sec
CURRENT w/ asynchronized commit rpc:              4.5 MBytes/sec
CURRENT w/ async commit and fixed nfsrv_commit:    10 MBytes/sec (1)

note(1): network is maxed out.
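
For scale, the three figures can be put side by side with a few lines of
arithmetic. This is only a sketch: the 12.5 MBytes/sec ceiling assumes
the nominal 100 Mbit/s rate of 100BaseTX with no protocol overhead, and
the function name is made up for the example.

```python
LINE_RATE_MBYTES = 100 / 8   # nominal 100BaseTX: 100 Mbit/s = 12.5 MBytes/sec

# Write throughput figures from the table above, in MBytes/sec.
MEASURED = {
    "CURRENT in tree": 2.5,
    "CURRENT w/ asynchronized commit rpc": 4.5,
    "CURRENT w/ async commit and fixed nfsrv_commit": 10.0,
}


def summarize(measured, baseline_key="CURRENT in tree"):
    """Map each configuration to (speedup over baseline, share of line rate)."""
    base = measured[baseline_key]
    return {name: (rate / base, rate / LINE_RATE_MBYTES)
            for name, rate in measured.items()}


for name, (speedup, share) in summarize(MEASURED).items():
    print(f"{name:48s} x{speedup:.1f}  {share:.0%} of nominal line rate")
```

On these numbers the fully patched case is four times the in-tree rate
and about 80% of the nominal wire speed, which is consistent with the
network being the remaining limit.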

Running bonnie returned around 5.2 MBytes/sec using putc, 3.5 MBytes/sec
doing rewrite (but also 3.5 MBytes/sec going the other way), and
10 MBytes/sec writing intelligently.  Throughout the test the low
level disk I/O on the server was able to do sustained clustering 
at 64KB/t.

I have a backlog of patches so it may be a week or more before this one
gets tested and committed into CURRENT.  It's a little racy for putting
into STABLE but maybe after a month of testing on CURRENT we will be
able to do it.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]





Async NFS exports?

1999-08-20 Thread Steve Ames


I asked this on stable but didn't get a response... Would I get any
performance increases by mounting NFS exported partition as Async?

Would my soul be tormented in purgatory for doing it?

Just to be clear... I am wondering if mounting (on the NFS _server_) a
partition (that is exportable) as async will have any performance 
benefits to the NFS clients?

-Steve





Re: Async NFS exports?

1999-08-20 Thread Matthew D. Fuller

[ Caveat: I'm making this up as I go along ]

On Fri, Aug 20, 1999 at 01:13:06PM -0500, a little birdie told me
that Steve Ames remarked
 
> I asked this on stable but didn't get a response... Would I get any
> performance increases by mounting NFS exported partition as Async?
>
> Would my soul be tormented in purgatory for doing it?
>
> Just to be clear... I am wondering if mounting (on the NFS _server_) a
> partition (that is exportable) as async will have any performance
> benefits to the NFS clients?

As a first guess, probably not unless you have a large number of active
clients.  Any modern hard disc will outperform ethernet/fast ethernet,
especially for larger read/writes.  For large numbers of smaller
operations, or when there is a large number of simultaneous outstanding
requests from clients, maybe.  I'd say watch the disc itself (iostat is
your friend), and if it's pegged (especially large numbers of tps) async
might buy you some increase.



-- 
Matthew Fuller (MF4839) |[EMAIL PROTECTED]
Unix Systems Administrator  |[EMAIL PROTECTED]
Specializing in FreeBSD |http://www.over-yonder.net/
FutureSouth Communications  |ISPHelp ISP Consulting

"The only reason I'm burning my candle at both ends, is because I
  haven't figured out how to light the middle yet"





Re: Async NFS exports?

1999-08-20 Thread Matthew Dillon

:I asked this on stable but didn't get a response... Would I get any
:performance increases by mounting NFS exported partition as Async?
:
:Would my soul be tormented in purgatory for doing it?
:
:Just to be clear... I am wondering if mounting (on the NFS _server_) a
:partition (that is exportable) as async will have any performance 
:benefits to the NFS clients?
:
:-Steve

w/NFSv3 I doubt mounting the exported partitions async will increase
performance much.  I would not use an async mount - if this is an NFS
server it needs to be as reliable as possible and async mounting the
partition is going to hurt if the machine crashes.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]





Re: Async NFS exports?

1999-08-20 Thread Matthew Dillon


:I asked this on stable but didn't get a response... Would I get any
:performance increases by mounting NFS exported partition as Async?
:
:Would my soul be tormented in purgatory for doing it?
:
:Just to be clear... I am wondering if mounting (on the NFS _server_) a
:partition (that is exportable) as async will have any performance 
:benefits to the NFS clients?
:
:-Steve

Ok, I've run some more tests.  Basically you want to run NFSv3 under
CURRENT and you want to run at least 3 nfsiod's.  On a 100BaseTX network
this will give you unsaturated write performance in the ballpark of
9 MBytes/sec.  Saturated write performance, that is where you write more
than the client-side buffer cache can handle, will stabilize at
2.5 MBytes/sec.  I have a patch for CURRENT which will increase the
saturated write performance to 4.5 MBytes/sec (basically by moving the
nfs_commit() from nfs_writebp() to nfs_doio() so it can be asynchronized).
Hopefully that patch will go in soon but there's a pretty big backlog of
patches that haven't gone in yet, some over a week and a half old, so...

In any case, even without the patch if you run a couple of nfsiod's and
do not saturate the buffer cache you should get optimal performance.

Back-porting the patch for nfs_commit to STABLE is possible but is
not likely to help much because the major performance restriction in
STABLE is related to buffer cache management, not NFS.


OS        #nfsiod's   unsaturated     saturated
                      write perf.     write perf.
                      ( ...... 100BASETX ...... )

CURRENT   0           9 MBytes/sec    2.5 MBytes/sec
CURRENT   4           9 MBytes/sec    4.5 MBytes/sec (w/patch)

STABLE    0           3 MBytes/sec    3 MBytes/sec (1)
STABLE    4           4 MBytes/sec    3 MBytes/sec (1)

note(1): saturated performance under STABLE is extremely inconsistent

-Matt
Matthew Dillon 
[EMAIL PROTECTED]






Re: Async NFS exports?

1999-08-20 Thread Rob Snow

Emm, I guess that answers my earlier question/mail:

Why?---



basil# uname -a
FreeBSD basil.dympna.com 3.2-RELEASE FreeBSD 3.2-RELEASE #7: Thu Aug 19
23:59:50 CDT 1999
[EMAIL PROTECTED]:/export/current/src/sys/compile/Basil-SMP
[Dual PPro-233's]

basil# cd /stripe
basil# df -k .

Filesystem        1K-blocks   Used    Avail Capacity  Mounted on
/dev/vinum/stripe  17197511  86511 15735200     1%    /stripe

basil# Bonnie -s 256

              -------Sequential Output-------- ---Sequential Input--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
          256 10817 97.3 15805 93.1  6338 41.4  9943 97.5 15796 51.2


basil# mount_nfs -3 localhost:/stripe /mnt
basil# cd /mnt
basil# Bonnie -s 256

              -------Sequential Output-------- ---Sequential Input--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
          256  4270 57.6  6639 30.6  1877 11.7  3804 55.3  6201 18.7







Re: Async NFS exports?

1999-08-20 Thread Matthew Dillon


:Emm, I guess that answers my earlier question/mail:
:
:Why?---
:
:/dev/vinum/stripe  17197511  86511 15735200     1%    /stripe
:basil# Bonnie -s 256
:              -------Sequential Output-------- ---Sequential Input--
:              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
:Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
:          256 10817 97.3 15805 93.1  6338 41.4  9943 97.5 15796 51.2
:
:
:basil# mount_nfs -3 localhost:/stripe /mnt
:basil# cd /mnt
:basil# Bonnie -s 256
:
:              -------Sequential Output-------- ---Sequential Input--
:              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
:Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
:          256  4270 57.6  6639 30.6  1877 11.7  3804 55.3  6201 18.7

Buffer copy and protocol overhead, plus the data is being cached 
twice: once on the server and once on the client (which happens to be
the same machine).   The machine becomes cpu-bound with all the extra
work.

This is what I get.  Ignore the read numbers (the machine has a lot 
of memory).  This is with nfsd -n 4 and nfsiod -n 4, NFSv3 UDP mount, 
on a dual P-III/450 running CURRENT.

/dev/da0h            2338236  1235921   915257    57%    /usr/obj
         -------Sequential Output-------- ---Sequential Input-- --Random--
         -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
      MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
     256  9979 51.4  9154 19.0  9814 22.5 19727 99.8 103863 100.0 2163.4 43.2

localhost:/usr/obj/  2338236  1235921   915257    57%    /mnt
         -------Sequential Output-------- ---Sequential Input-- --Random--
         -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
      MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
     256  3532 30.9  4164  7.6  4364 14.9 13700 99.9 75964 99.7 2779.7 162.1

 As you can see, I get very similar numbers for writing.  9 MB/sec on
 the local filesystem and 4.3 MB/sec via a localhost: NFS mount.
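
To make the comparison concrete, the write columns of the two Bonnie
runs can be reduced to "NFS as a fraction of local". This is just a
sketch over the K/sec figures quoted in this message; the dictionary
layout and function name are made up for the example.

```python
# Sequential output (write) figures in K/sec, copied from the Bonnie runs above.
RUNS = {
    "Rob Snow, dual PPro": {
        "local": {"per_char": 10817, "block": 15805, "rewrite": 6338},
        "nfs":   {"per_char": 4270,  "block": 6639,  "rewrite": 1877},
    },
    "Matthew Dillon, dual P-III/450": {
        "local": {"per_char": 9979, "block": 9154, "rewrite": 9814},
        "nfs":   {"per_char": 3532, "block": 4164, "rewrite": 4364},
    },
}


def nfs_fraction(run):
    """Return NFS throughput divided by local throughput for each test."""
    return {test: run["nfs"][test] / run["local"][test] for test in run["local"]}


for who, run in RUNS.items():
    pretty = ", ".join(f"{t}: {f:.0%}" for t, f in nfs_fraction(run).items())
    print(f"{who}: {pretty}")
```

On both machines the loopback NFS mount lands at well under half of the
local block-write rate, which fits the explanation of extra copying and
double caching.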

-Matt
Matthew Dillon 
[EMAIL PROTECTED]





Re: Async NFS exports?

1999-08-20 Thread Matthew Dillon


: The reason is due to the way NFSv3 issues writes.  NFSv3 issues a 
: write but no longer assumes that the write has been synced to the 
: server's disk as of when the reply comes back.  Instead it keeps the
:..
:
:If you are looking for more optimizations, you can delay NFS write
:operations for some interval X, on all write buffers, entirely.
:
:Sun does this (it is technically called "write gathering" in the
:literature).

The client-side pushes a buffer out the moment it fills up.  Since
this corresponds to a maximally-sized NFS data packet the client will
not be more efficient by delaying it.

The server-side in the FreeBSD implementation *DOES* do write-clustering,
as I demonstrate below.  The server does a better job of write-clustering
when the client is a FreeBSD-CURRENT box, though, because FreeBSD-CURRENT
clients generate a more uniform data flow.

The server is able to do write-clustering w/ NFSv3 mounts because data
writes are not required to be synched to disk until the client requests
a commit.
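
The idea of write-clustering can be sketched as below: because nothing
has to reach the disk until a commit, contiguous dirty blocks can be
grouped into larger transfers. This is not the FreeBSD implementation;
the 8 KB block size and the 64 KB cluster limit are assumptions chosen
to match the 64 KB/t figure in the iostat output.

```python
def cluster_writes(dirty_offsets, block_size=8192, max_cluster=65536):
    """Group contiguous dirty blocks into (offset, length) disk transfers."""
    clusters = []
    for off in sorted(set(dirty_offsets)):
        if clusters:
            start, length = clusters[-1]
            # Extend the previous cluster if this block is adjacent and it fits.
            if start + length == off and length + block_size <= max_cluster:
                clusters[-1] = (start, length + block_size)
                continue
        clusters.append((off, block_size))
    return clusters


# Sixteen contiguous 8 KB blocks collapse into two 64 KB transfers.
print(cluster_writes(range(0, 16 * 8192, 8192)))
```

A stream of writes with gaps in it produces more, smaller transfers,
which is roughly what lower KB/t and higher tps look like in iostat.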

Note the KB/t fields below.


CURRENT server, CURRENT client. iostat on server while client writes

apollo:/usr/src/sys/nfs# iostat da0 1
      tty       da0            cpu
 tin tout  KB/t tps  MB/s  us ni sy in id
   0   20  0.00   0  0.00   1  0  1  0 98
   0   43  0.00   0  0.00   0  0  1  0 99
   0   43  1.00   1  0.00   0  0  0  0100
   0   42  6.86   7  0.05   0  0 10  0 90
   0   43 61.80  68  4.12   1  0 25 21 53
   0   43 64.00  72  4.52   0  0 18 28 54
   0   43 64.00  73  4.58   0  0 21 25 54
   0   42 62.29  74  4.51   0  0 24 32 45
   0   43 63.23  72  4.46   0  0 22 23 54
   0   43 64.00  32  1.98   0  0  9  9 82

CURRENT server, STABLE client.  iostat on server while client writes

apollo:/usr/src/sys/nfs# iostat da0 1
      tty       da0            cpu
 tin tout  KB/t tps  MB/s  us ni sy in id
   0   20  0.00   0  0.00   1  0  1  0 98
   0   43 43.43  98  4.15   0  0 15 21 64
   0   42 17.48 107  1.82   0  0  9  7 84
   0   43 62.49  73  4.47   0  0 19 33 48
   0   43 19.93 121  2.35   0  0  8  9 84
   0   43 43.07  85  3.58   0  0 17 22 61
   0   42 46.77  90  4.11   0  0 21 22 57
   0   43 15.63 108  1.65   0  0  8  7 85
   0   43 64.00  70  4.39   2  0 17 27 54
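
A small check of how the iostat columns relate, and of the difference
between the two runs. This is a sketch: it assumes MB/s is KB/t times
tps divided by 1024, which matches the rows above to within rounding,
and it uses only the rows where the disk was busy.

```python
def mb_per_sec(kb_per_transfer, tps):
    """Throughput implied by transfer size and transfers per second."""
    return kb_per_transfer * tps / 1024


def mean_kb_per_transfer(samples):
    """Average transfer size, weighted by the number of transfers."""
    total_kb = sum(kb * tps for kb, tps in samples)
    return total_kb / sum(tps for _, tps in samples)


# (KB/t, tps) pairs from the steady-state rows of the two runs above.
CURRENT_CLIENT = [(61.80, 68), (64.00, 72), (64.00, 73), (62.29, 74), (63.23, 72)]
STABLE_CLIENT = [(43.43, 98), (17.48, 107), (62.49, 73), (19.93, 121),
                 (43.07, 85), (46.77, 90), (15.63, 108), (64.00, 70)]

print(f"64 KB/t at 72 tps -> {mb_per_sec(64.0, 72):.2f} MB/s")
print(f"CURRENT client: {mean_kb_per_transfer(CURRENT_CLIENT):.1f} KB/t")
print(f"STABLE client:  {mean_kb_per_transfer(STABLE_CLIENT):.1f} KB/t")
```

The CURRENT client keeps the server close to full 64 KB clusters, while
the STABLE client averages a little over half of that.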


However, even a FreeBSD-CURRENT client breaks down when its buffer cache
saturates.  The example below demonstrates this:


CURRENT server, CURRENT client, iostat of server disks while client writes

(client buffer cache not saturated)
   0   43  0.00   0  0.00   0  0  0  0100
   0   43 56.00  26  1.41   0  0 14  5 81   
   0   42 64.00  72  4.52   0  0 21 26 53
   0   43 64.00  73  4.58   0  0 25 19 57
  11   54 64.00  71  4.46   2  0 17 28 53
   0   42 62.27  73  4.45   0  0 26 22 52
   5   48 64.00  72  4.52   0  0 22 24 53
   5   48 52.72  82  4.23   0  0 20 26 53
(client buffer cache saturates)
  11   54 18.76 163  2.99   2  0 12 16 71
   1   43 19.20 129  2.41   1  0 13 11 75
   7   70 20.65 146  2.95   2  0 14 16 67
   4   47 17.53 130  2.22   0  0  9 15 77
   6   51 24.63 113  2.72   1  0 10 13 76
   4   73 15.17 153  2.27   0  0 11  7 82


The problem that occurs on the FreeBSD server is simply that the
nfsrv_commit() procedure calls fsync() on the file... on the *ENTIRE*
file, for every commit rpc, rather than syncing just the offset/range 
requested.  I am looking into ways to fix this.
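
The difference between the two behaviours can be sketched like this.
It is a toy model, not nfsrv_commit() itself: the function name, the
8 KB block size and the example numbers are all assumptions.

```python
def blocks_to_flush(dirty, offset, count, whole_file):
    """Return the dirty block offsets one COMMIT has to push to disk.

    dirty      -- offsets of blocks written but not yet on disk
    offset     -- start of the range named in the COMMIT
    count      -- length of that range in bytes
    whole_file -- True models fsync() on the entire file
    """
    if whole_file:
        return sorted(dirty)
    return sorted(b for b in dirty if offset <= b < offset + count)


BLOCK = 8192
dirty = {i * BLOCK for i in range(32)}     # 256 KB of outstanding writes

ranged = blocks_to_flush(dirty, 0, 8 * BLOCK, whole_file=False)
entire = blocks_to_flush(dirty, 0, 8 * BLOCK, whole_file=True)
print(len(ranged), "blocks for a ranged commit,", len(entire), "for a whole-file fsync")
```

With many commits in flight, each one dragging every dirty block of the
file to disk leaves little chance for the remaining data to be
clustered.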

-Matt






Re: Async NFS exports?

1999-08-20 Thread Matthew Dillon
: Just to be clear... I am wondering if mounting (on the NFS _server_) a
: partition (that is exportable) as async will have any performance 
: benefits to the NFS clients?
:
:As a first guess, probably not unless you have a large number of active
:clients.  Any modern hard disc will outperform ethernet/fast ethernet,
:especially for larger read/writes.  For large numbers of smaller
:operations, or when there is a large number of simultaneous outstanding
:requests from clients, maybe.  I'd say watch the disc itself (iostat is
:your friend), and if it's pegged (especially large numbers of tps) async
:might buy you some increase.
:-- 
:Matthew Fuller (MF4839) |fulle...@over-yonder.net

Not much if at all, whether you have a large number of clients or not,
at least if you are using NFSv3 mounts.

The reason is due to the way NFSv3 issues writes.  NFSv3 issues a 
write but no longer assumes that the write has been synced to the 
server's disk as of when the reply comes back.  Instead it keeps the
buffer around and does a later commit rpc to do the sync, presumably
long after the server has already synced the data. 

So, effectively, all NFSv3 writes are async insofar as the client's 
buffer cache is able to keep abreast of the write-rate.
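
A rough way to picture "keeping abreast of the write-rate" is a tank
that fills at the application's write rate and drains at whatever rate
the network and server can absorb. The sketch below is only that toy
model: every number in it is hypothetical, and it ignores the cost of
the COMMIT rpc's themselves.

```python
def seconds_until_saturated(cache_mbytes, app_rate, drain_rate):
    """Time before dirty buffers fill the cache, or None if they never do."""
    if app_rate <= drain_rate:
        return None
    return cache_mbytes / (app_rate - drain_rate)


def effective_rate(elapsed, cache_mbytes, app_rate, drain_rate):
    """Write rate the application sees after `elapsed` seconds."""
    limit = seconds_until_saturated(cache_mbytes, app_rate, drain_rate)
    if limit is None or elapsed < limit:
        return app_rate        # writes complete into the buffer cache
    return drain_rate          # the writer now waits for buffers to free up


# Hypothetical: 32 MB of cache, a 20 MBytes/sec writer, a 9 MBytes/sec drain.
print(seconds_until_saturated(32, 20, 9))
print(effective_rate(1, 32, 20, 9), effective_rate(10, 32, 20, 9))
```

Until the cache fills, the writer never waits on the server's disk,
which is why an async mount on the server has little left to speed up.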

Hmm, interesting.  I see another optimization I can do to fix the
buffer cache saturation case in CURRENT on the client.  The COMMIT rpc's
aren't being issued async.

-Matt
Matthew Dillon 
dil...@backplane.com





Re: Async NFS exports?

1999-08-20 Thread Alfred Perlstein
On Fri, 20 Aug 1999, Steve Ames wrote:

 
> I asked this on stable but didn't get a response... Would I get any
> performance increases by mounting NFS exported partition as Async?
>
> Would my soul be tormented in purgatory for doing it?
>
> Just to be clear... I am wondering if mounting (on the NFS _server_) a
> partition (that is exportable) as async will have any performance
> benefits to the NFS clients?

of course it will, but it'll also potentially
cause catastrophic damage to your filesystem if
the machine dies for some reason.

have a look at:
/usr/src/sys/contrib/softupdates/README
or possibly 
/usr/src/contrib/softupdates/README

also, play with the nfs sysctl's and mount options to tune performance.

sysctl -a | grep nfs

good luck,
-Alfred Perlstein - [bri...@rush.net|alf...@freebsd.org]
Wintelcom systems administrator and programmer
   - http://www.wintelcom.net/ [bri...@wintelcom.net]







Re: Async NFS exports?

1999-08-20 Thread Matthew Dillon
:Ok, I've run some more tests.  Basically you want to run NFSv3 under
:CURRENT and you want to run at least 3 nfsiod's.  On a 100BaseTX network

Oh, let me be a bit more clear:  Run 3 nfsiod's on the client.  Run 
4 nfsd's on the server.  e.g. 'nfsiod -n 3' on the client and 'nfsd -n 4'
on the server.  To realize maximum NFS performance both the server and
client should be running CURRENT.  

If non-FreeBSD clients cannot achieve good performance, the problem 
is with those clients and nothing you do on the server will help.

-Matt






Re: Async NFS exports?

1999-08-20 Thread Terry Lambert
> The reason is due to the way NFSv3 issues writes.  NFSv3 issues a
> write but no longer assumes that the write has been synced to the
> server's disk as of when the reply comes back.  Instead it keeps the
> buffer around and does a later commit rpc to do the sync, presumably
> long after the server has already synced the data.
>
> So, effectively, all NFSv3 writes are async insofar as the client's
> buffer cache is able to keep abreast of the write-rate.
>
> Hmm, interesting.  I see another optimization I can do to fix the
> buffer cache saturation case in CURRENT on the client.  The COMMIT rpc's
> aren't being issued async.

If you are looking for more optimizations, you can delay NFS write
operations for some interval X, on all write buffers, entirely.

Sun does this (it is technically called "write gathering" in the
literature).

The upshot of introducing this latency between request and action
is that multiple writes to adjacent regions on non-buffer boundaries
are batched up for processing by the NFS server.

The net result is that you can do more writes, while generating
less wire traffic.

Be forewarned, however, that you must include synchronization
primitives at the appropriate places.  Among these are:

1)  When a write lock is acquired, a write is issued in a
    region spanned by the lock, and the lock released, the
    lock release must also be delayed.

    The reason for this is that the lock is being used to
    ensure distributed cache coherency (actually, in effect,
    a lease-stall for other clients wanting to do writes).

2)  When a conflicting lock is requested, you need to initiate
    a synchronization point.

    This is because even though they are proxied from the
    same system ID, the NFS locks are on different process
    ID's, and some NFS servers will enforce based on this,
    even though the locks are, putatively, advisory, not
    mandatory.

3)  On file deletions.

    Generally, this just boils down to not being able to
    delete an NFS file in actuality, unless you are the
    last closer on the file.  Implementation-wise, it means
    that a delete operation is a request-to-sync operation.

4)  Etc..
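
The gathering and the synchronization points above can be sketched as
follows. This is a toy model, not Sun's or FreeBSD's code: the class
name, the `send` callback and the 50 ms delay are hypothetical, and it
only merges a write that directly follows an earlier pending one.

```python
class WriteGatherer:
    """Hold small writes for up to `delay` seconds and merge adjacent ones."""

    def __init__(self, send, delay=0.05):
        self.send = send       # callable(offset, data): issues one WRITE
        self.delay = delay
        self.pending = []      # entries of [offset, bytearray, deadline]

    def write(self, offset, data, now):
        for entry in self.pending:
            if entry[0] + len(entry[1]) == offset:
                entry[1].extend(data)      # extends an earlier pending write
                return
        self.pending.append([offset, bytearray(data), now + self.delay])

    def tick(self, now):
        """Send every gathered write whose delay has expired."""
        due = [e for e in self.pending if e[2] <= now]
        self.pending = [e for e in self.pending if e[2] > now]
        for offset, buf, _deadline in due:
            self.send(offset, bytes(buf))

    def sync(self):
        """Synchronization point: lock release, conflicting lock, delete."""
        self.tick(float("inf"))
```

A caller would invoke sync() before letting a lock release, a
conflicting lock request or a delete go out on the wire, so that the
delayed writes are never reordered past those operations.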


While these aren't technically necessary right now (in FreeBSD,
since FreeBSD fails to implement NFS locking), omitting them now
will make it nearly impossible to repair later, should NFS locking
support arise from my (or someone else's) kernel patches being applied,
and someone hacking up the lock and stat daemons to make the proper
system calls.


Terry Lambert
te...@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.





Re: Async NFS exports?

1999-08-20 Thread Rob Snow

Emm, I guess that answers my earlier question/mail:

Why?---



basil# uname -a
FreeBSD basil.dympna.com 3.2-RELEASE FreeBSD 3.2-RELEASE #7: Thu Aug 19
23:59:50 CDT 1999
rs...@basil.dympna.com:/export/current/src/sys/compile/Basil-SMP
[Dual PPro-233's]

basil# cd /stripe
basil# df -k .

Filesystem         1K-blocks   Used    Avail Capacity  Mounted on
/dev/vinum/stripe   17197511  86511 15735200     1%    /stripe

basil# Bonnie -s 256

              -------Sequential Output-------- ---Sequential Input--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
          256 10817 97.3 15805 93.1  6338 41.4  9943 97.5 15796 51.2


basil# mount_nfs -3 localhost:/stripe /mnt
basil# cd /mnt
basil# Bonnie -s 256

              -------Sequential Output-------- ---Sequential Input--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
          256  4270 57.6  6639 30.6  1877 11.7  3804 55.3  6201 18.7




Matthew Dillon wrote:
 
 :I asked this on stable but didn't get a response... Would I get any
 :performance increases by mounting NFS exported partition as Async?
 :
 :Would my soul be tormented in purgatory for doing it?
 :
 :Just to be clear... I am wondering if mounting (on the NFS _server_) a
 :partition (that is exportable) as async will have any performance
 :benefits to the NFS clients?
 :
 :-Steve
 
 Ok, I've run some more tests.  Basically you want to run NFSv3 under
 CURRENT and you want to run at least 3 nfsiod's.  On a 100BaseTX network
 this will give you unsaturated write performance in the ballpark of
 9 MBytes/sec.  Saturated write performance, that is where you write more
 than the client-side buffer cache can handle, will stabilize at
 2.5 MBytes/sec.  I have a patch for CURRENT which will increase the
 saturated write performance to 4.5 MBytes/sec (basically by moving the
 nfs_commit() from nfs_writebp() to nfs_doio() so it can be asynchronized).
 Hopefully that patch will go in soon but there's a pretty big backlog of
 patches that haven't gone in yet, some over a week and a half old, so...
 
 In any case, even without the patch, if you run a couple of nfsiod's and
 do not saturate the buffer cache you should get optimal performance.
 
 Back-porting the patch for nfs_commit to STABLE is possible but is
 not likely to help much because the major performance restriction in
 STABLE is related to buffer cache management, not NFS.
 
 OS        #nfsiod's   unsaturated     saturated
                       write perf.     write perf.
                       (......... 100BASETX .........)
 
 CURRENT   0           9 MBytes/sec    2.5 MBytes/sec
 CURRENT   4           9 MBytes/sec    4.5 MBytes/sec (w/patch)
 
 STABLE    0           3 MBytes/sec    3 MBytes/sec (1)
 STABLE    4           4 MBytes/sec    3 MBytes/sec (1)
 
 note(1): saturated performance under STABLE is extremely inconsistent
 
 -Matt
 Matthew Dillon
 dil...@backplane.com
 





Re: Async NFS exports?

1999-08-20 Thread Matthew Dillon

:Emm, I guess that answers my earlier question/mail:
:
:Why?---
:
:/dev/vinum/stripe   17197511  86511 15735200     1%    /stripe
:basil# Bonnie -s 256
:              -------Sequential Output-------- ---Sequential Input--
:              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
:Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
:          256 10817 97.3 15805 93.1  6338 41.4  9943 97.5 15796 51.2
:
:
:basil# mount_nfs -3 localhost:/stripe /mnt
:basil# cd /mnt
:basil# Bonnie -s 256
:
:              -------Sequential Output-------- ---Sequential Input--
:              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
:Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
:          256  4270 57.6  6639 30.6  1877 11.7  3804 55.3  6201 18.7

Buffer copy and protocol overhead, plus the data is being cached 
twice: once on the server and once on the client (which happens to be
the same machine).   The machine becomes cpu-bound with all the extra
work.

This is what I get.  Ignore the read numbers (the machine has a lot 
of memory).  This is with nfsd -n 4 and nfsiod -n 4, NFSv3 UDP mount, 
on a dual P-III/450 running CURRENT.

/dev/da0h            2338236  1235921   915257    57%    /usr/obj
              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
           MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
          256  9979 51.4  9154 19.0  9814 22.5 19727 99.8 103863 100.0 2163.4 43.2

localhost:/usr/obj/  2338236  1235921   915257    57%    /mnt
              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
           MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
          256  3532 30.9  4164  7.6  4364 14.9 13700 99.9 75964 99.7 2779.7 162.1

 As you can see, I get very similar numbers for writing.  9 MB/sec on
 the local filesystem and 4.3 MB/sec via a localhost: NFS mount.
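
To make the comparison concrete, the sketch below computes NFS-over-localhost throughput as a fraction of local throughput for both sets of Bonnie runs in this thread. The numbers are copied from the sequential-output columns above; the dictionary keys and the helper name are made up for illustration.

```python
# Bonnie sequential-output figures (K/sec) copied from the runs in this thread.
RUNS = {
    "basil (vinum stripe)": {
        "local": {"per_char": 10817, "block": 15805, "rewrite": 6338},
        "nfs":   {"per_char": 4270,  "block": 6639,  "rewrite": 1877},
    },
    "P-III/450 (da0h)": {
        "local": {"per_char": 9979, "block": 9154, "rewrite": 9814},
        "nfs":   {"per_char": 3532, "block": 4164, "rewrite": 4364},
    },
}

def nfs_fraction(run):
    """Return NFS throughput as a fraction of local throughput, per test."""
    return {test: run["nfs"][test] / run["local"][test] for test in run["local"]}

for name, run in RUNS.items():
    fractions = nfs_fraction(run)
    print(name, {test: f"{frac:.0%}" for test, frac in fractions.items()})
```

On both machines block writes through the loopback NFS mount come out at a bit under half of the local figure (about 42% and 45%), which fits the explanation that the extra copying and double caching make the machine cpu-bound.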

-Matt
Matthew Dillon 
dil...@backplane.com





Re: Async NFS exports?

1999-08-20 Thread Matthew Dillon

: The reason is due to the way NFSv3 issues writes.  NFSv3 issues a 
: write but no longer assumes that the write has been synced to the 
: server's disk as of when the reply comes back.  Instead it keeps the
:..
:
:If you are looking for more optimizations, you can delay NFS write
:operations for some interval X, on all write buffers, entirely.
:
:Sun does this (it is technically called write gathering in the
:literature).

The client-side pushes a buffer out the moment it fills up.  Since
this corresponds to a maximally-sized NFS data packet the client will
not be more efficient by delaying it.

The server-side in the FreeBSD implementation *DOES* do write-clustering,
as I demonstrate below.  The server does a better job of write-clustering
when the client is a FreeBSD-CURRENT box, though, because FreeBSD-CURRENT
clients generate a more uniform data flow.

The server is able to do write-clustering w/ NFSv3 mounts because data
writes are not required to be synched to disk until the client requests
a commit.

Note the KB/t fields below.


CURRENT server, CURRENT client. iostat on server while client writes

apollo:/usr/src/sys/nfs# iostat da0 1
  tty da0 cpu
 tin tout  KB/t tps  MB/s  us ni sy in id
   0   20  0.00   0  0.00   1  0  1  0 98
   0   43  0.00   0  0.00   0  0  1  0 99
   0   43  1.00   1  0.00   0  0  0  0 100
   0   42  6.86   7  0.05   0  0 10  0 90
   0   43 61.80  68  4.12   1  0 25 21 53
   0   43 64.00  72  4.52   0  0 18 28 54
   0   43 64.00  73  4.58   0  0 21 25 54
   0   42 62.29  74  4.51   0  0 24 32 45
   0   43 63.23  72  4.46   0  0 22 23 54
   0   43 64.00  32  1.98   0  0  9  9 82

CURRENT server, STABLE client.  iostat on server while client writes

apollo:/usr/src/sys/nfs# iostat da0 1
  tty da0 cpu
 tin tout  KB/t tps  MB/s  us ni sy in id
   0   20  0.00   0  0.00   1  0  1  0 98
   0   43 43.43  98  4.15   0  0 15 21 64
   0   42 17.48 107  1.82   0  0  9  7 84
   0   43 62.49  73  4.47   0  0 19 33 48
   0   43 19.93 121  2.35   0  0  8  9 84
   0   43 43.07  85  3.58   0  0 17 22 61
   0   42 46.77  90  4.11   0  0 21 22 57
   0   43 15.63 108  1.65   0  0  8  7 85
   0   43 64.00  70  4.39   2  0 17 27 54
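
One way to summarize the two runs is the average transfer size, weighted by the number of transfers in each one-second sample. The sketch below uses the (KB/t, tps) pairs copied from the tables above; leaving out the idle and ramp-up samples (fewer than 10 tps) is a choice made here, not something stated in the original post.

```python
# (KB/t, tps) pairs copied from the two server-side iostat runs above,
# leaving out idle and ramp-up samples with fewer than 10 transfers/sec.
CURRENT_CLIENT = [(61.80, 68), (64.00, 72), (64.00, 73),
                  (62.29, 74), (63.23, 72), (64.00, 32)]
STABLE_CLIENT = [(43.43, 98), (17.48, 107), (62.49, 73), (19.93, 121),
                 (43.07, 85), (46.77, 90), (15.63, 108), (64.00, 70)]

def mean_transfer_kb(samples):
    """Average KB per disk transfer, weighted by transfers per second."""
    total_kb = sum(kb_per_t * tps for kb_per_t, tps in samples)
    total_transfers = sum(tps for _, tps in samples)
    return total_kb / total_transfers

print(f"CURRENT client: {mean_transfer_kb(CURRENT_CLIENT):.1f} KB/transfer")
print(f"STABLE client:  {mean_transfer_kb(STABLE_CLIENT):.1f} KB/transfer")
```

With a CURRENT client the server's disk transfers average roughly 63 KB, close to the 64 KB seen in the best samples; with a STABLE client the average falls to roughly 36 KB, which is the weaker clustering described above.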


However, even a FreeBSD-CURRENT client breaks down when its buffer cache
saturates.  The example below demonstrates this:


CURRENT server, CURRENT client, iostat of server disks while client writes

(client buffer cache not saturated)
   0   43  0.00   0  0.00   0  0  0  0 100
   0   43 56.00  26  1.41   0  0 14  5 81   
   0   42 64.00  72  4.52   0  0 21 26 53
   0   43 64.00  73  4.58   0  0 25 19 57
  11   54 64.00  71  4.46   2  0 17 28 53
   0   42 62.27  73  4.45   0  0 26 22 52
   5   48 64.00  72  4.52   0  0 22 24 53
   5   48 52.72  82  4.23   0  0 20 26 53
(client buffer cache saturates)
  11   54 18.76 163  2.99   2  0 12 16 71
   1   43 19.20 129  2.41   1  0 13 11 75
   7   70 20.65 146  2.95   2  0 14 16 67
   4   47 17.53 130  2.22   0  0  9 15 77
   6   51 24.63 113  2.72   1  0 10 13 76
   4   73 15.17 153  2.27   0  0 11  7 82


The problem that occurs on the FreeBSD server is simply that the
nfsrv_commit() procedure calls fsync() on the file... on the *ENTIRE*
file, for every commit rpc, rather than syncing just the offset/range 
requested.  I am looking into ways to fix this.
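
The difference between the two behaviours can be sketched with a toy model of a file's dirty byte ranges. None of this is FreeBSD code: the helper names and the dirty-range list are hypothetical. The only protocol detail relied on is that an NFSv3 COMMIT carries an offset and a count, with a count of 0 meaning "from offset to the end of the file".

```python
def ranges_to_flush(dirty, offset, count):
    """Pick the dirty (start, end) byte ranges that COMMIT(offset, count) covers.

    A count of 0 means "from offset to end of file".  Any dirty range that
    overlaps the request is flushed whole rather than clipped.
    """
    end = float("inf") if count == 0 else offset + count
    return [(s, e) for s, e in dirty if s < end and e > offset]

def bytes_flushed(ranges):
    """Total number of bytes in a list of (start, end) ranges."""
    return sum(e - s for s, e in ranges)

# Hypothetical state: three 64 KB regions of one file are still dirty.
KB = 1024
dirty = [(0, 64 * KB), (64 * KB, 128 * KB), (512 * KB, 576 * KB)]

# fsync()-style behaviour: every commit pushes all dirty data for the file.
whole_file = bytes_flushed(dirty)

# Range-aware behaviour: a commit for the second 64 KB block pushes only that.
ranged = bytes_flushed(ranges_to_flush(dirty, 64 * KB, 64 * KB))

print(whole_file // KB, "KB flushed by whole-file sync")
print(ranged // KB, "KB flushed by ranged commit")
```

In this model the whole-file sync writes three times as much as the commit actually asked for, and the gap widens as more of the file is dirty when each commit arrives.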

-Matt


