Re: Async NFS exports?
On Fri, 20 Aug 1999, Matthew Dillon wrote:

> : Just to be clear... I am wondering if mounting (on the NFS _server_) a
> : partition (that is exportable) as async will have any performance
> : benefits to the NFS clients?
> :
> :As a first guess, probably not unless you have a large number of active
> :clients. Any modern hard disc will outperform ethernet/fast ethernet,
> :especially for larger read/writes. For large numbers of smaller
> :operations, or when there is a large number of simultaneous outstanding
> :requests from clients, maybe. I'd say watch the disc itself (iostat is
> :your friend), and if it's pegged (especially large numbers of tps) async
> :might buy you some increase.
> :--
> :Matthew Fuller (MF4839) |[EMAIL PROTECTED]
>
> Not much if at all, whether you have a large number of clients or not,
> at least if you are using NFSv3 mounts.
>
> The reason is the way NFSv3 issues writes. NFSv3 issues a write but
> no longer assumes that the write has been synced to the server's disk
> as of when the reply comes back. Instead it keeps the buffer around and
> does a later commit rpc to do the sync, presumably long after the server
> has already synced the data. So, effectively, all NFSv3 writes are async
> insofar as the client's buffer cache is able to keep abreast of the
> write-rate.
>
> Hmm, interesting. I see another optimization I can do to fix the
> buffer cache saturation case in CURRENT on the client. The COMMIT rpc's
> aren't being issued async.

You need to track the return value of the commit so that you can detect
server reboots and sync-write the data again. If you change to async, make
sure that you still keep this part - it's essential to the protocol.

--
Doug Rabson                             Mail:  [EMAIL PROTECTED]
Nonlinear Systems Ltd.                  Phone: +44 181 442 9037

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message
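The reboot check Doug describes can be sketched as a toy model. NFSv3 WRITE and COMMIT replies both carry a write verifier that changes when the server restarts, so the client compares the verifier from the COMMIT reply with the one it saw for each buffered write. Everything below (ToyServer, ToyClient, the integer verifier) is hypothetical and illustrative only, not the FreeBSD code. For brevity the resend goes out as another unstable write followed by a new commit, where a real client might issue a synchronous write instead.

```python
import itertools


class ToyServer:
    """Hypothetical in-memory stand-in for an NFSv3 server."""

    _boot_ids = itertools.count(1)

    def __init__(self):
        self.stable = {}          # offset -> data that reached "disk"
        self.reboot()

    def reboot(self):
        # A reboot loses unstable data and changes the write verifier.
        self.verifier = next(self._boot_ids)
        self.unstable = {}

    def write_unstable(self, offset, data):
        self.unstable[offset] = data
        return self.verifier

    def commit(self):
        self.stable.update(self.unstable)
        self.unstable.clear()
        return self.verifier


class ToyClient:
    """Keeps each buffer until a commit confirms it with a matching verifier."""

    def __init__(self, server):
        self.server = server
        self.pending = {}         # offset -> (data, verifier from WRITE reply)

    def write(self, offset, data):
        verf = self.server.write_unstable(offset, data)
        self.pending[offset] = (data, verf)

    def commit(self):
        while self.pending:
            commit_verf = self.server.commit()
            stale = {off: data
                     for off, (data, verf) in self.pending.items()
                     if verf != commit_verf}
            # Buffers whose verifier matches are safely on disk; drop them.
            self.pending = {}
            # Buffers written before a reboot may be lost: send them again.
            for off, data in stale.items():
                self.write(off, data)


server = ToyServer()
client = ToyClient(server)
client.write(0, b"first")
server.reboot()                   # the first write is lost on the server
client.write(8192, b"second")
client.commit()
print(sorted(server.stable))
```

The point of the sketch is only that the buffer cannot be released on the WRITE reply; it has to survive until a COMMIT reply with a matching verifier arrives, whether or not the COMMIT itself is issued asynchronously.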
Re: Async NFS exports?
: buffer cache is able to keep abreast of the write-rate.
:
: Hmm, interesting. I see another optimization I can do to fix the
: buffer cache saturation case in CURRENT on the client. The COMMIT rpc's
: aren't being issued async.
:
:You need to track the return value of the commit so that you can detect
:server reboots and sync-write the data again. If you change to async, make
:sure that you still keep this part - it's essential to the protocol.
:
:--
:Doug Rabson                            Mail:  d...@nlsystems.com
:Nonlinear Systems Ltd.                 Phone: +44 181 442 9037

    These are buffer-cache entities we are talking about here, so they won't
    go away until NFS tells the system they can go away. In that respect
    async I/O is no different than sync I/O. Async I/O is simply run
    synchronously from an nfsiod context.

                                        -Matt
                                        Matthew Dillon
                                        dil...@backplane.com
Re: Async NFS exports?
:The problem that occurs on the FreeBSD server is simply that the
:nfsrv_commit() procedure calls fsync() on the file... on the *ENTIRE*
:file, for every commit rpc, rather than syncing just the offset/range
:requested. I am looking into ways to fix this.
:

    Ok, I've verified the problem. The nfsrv_commit() code running on the
    server is definitely the culprit. I was able to make a tentative patch
    which increased NFSv3 write performance to 10 MBytes/sec -- the maximum
    my 100BaseTX network can do.

    CURRENT in tree:                                  2.5 MBytes/sec
    CURRENT w/ asynchronized commit rpc:              4.5 MBytes/sec
    CURRENT w/ async commit and fixed nfsrv_commit:   10  MBytes/sec (1)

    note(1): network is maxed out.

    Running bonnie returned around 5.2 MBytes/sec using putc, 3.5 MBytes/sec
    doing rewrite (but also 3.5 MBytes/sec going the other way), and
    10 MBytes/sec writing intelligently. Throughout the test the low level
    disk I/O on the server was able to do sustained clustering at 64KB/t.

    I have a backlog of patches so it may be a week or more before this one
    gets tested and committed into CURRENT. It's a little racy for putting
    into STABLE but maybe after a month of testing on CURRENT we will be
    able to do it.

                                        -Matt
                                        Matthew Dillon
                                        [EMAIL PROTECTED]
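To make the whole-file versus ranged commit difference concrete, here is a toy model that counts how many dirty blocks each strategy forces to disk for a single commit request. It is a sketch under stated assumptions: the 8 KB block size, the set-of-dirty-blocks representation and both function names are invented for illustration and have nothing to do with the actual nfsrv_commit() code or the patch. The one protocol detail it relies on is that an NFSv3 COMMIT with a count of 0 means "from offset to end of file".

```python
BLOCK = 8192  # assumed block size, for illustration only


def commit_whole_file(dirty, offset, count):
    """Model of fsync()-style behaviour: flush every dirty block,
    ignoring the range the client asked for."""
    flushed = sorted(dirty)
    dirty.clear()
    return flushed


def commit_range(dirty, offset, count):
    """Model of a ranged commit: flush only dirty blocks that overlap
    [offset, offset + count). A count of 0 means 'to end of file'."""
    first = offset // BLOCK
    if count == 0:
        hit = {b for b in dirty if b >= first}
    else:
        last = (offset + count - 1) // BLOCK
        hit = {b for b in dirty if first <= b <= last}
    dirty -= hit          # in-place removal from the caller's set
    return sorted(hit)


# Sixteen dirty blocks; the client commits only the first four.
whole = set(range(16))
ranged = set(range(16))
print(len(commit_whole_file(whole, 0, 4 * BLOCK)), "blocks flushed (whole file)")
print(len(commit_range(ranged, 0, 4 * BLOCK)), "blocks flushed (ranged)")
print(len(ranged), "blocks still free to be clustered later")
```

In the ranged model the remaining dirty blocks stay in the cache, which is what leaves the server room to keep clustering its disk writes.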
Async NFS exports?
I asked this on stable but didn't get a response... Would I get any
performance increases by mounting NFS exported partition as Async?

Would my soul be tormented in purgatory for doing it?

Just to be clear... I am wondering if mounting (on the NFS _server_) a
partition (that is exportable) as async will have any performance
benefits to the NFS clients?

-Steve
Re: Async NFS exports?
[ Caveat: I'm making this up as I go along ]

On Fri, Aug 20, 1999 at 01:13:06PM -0500, a little birdie told me
that Steve Ames remarked
> I asked this on stable but didn't get a response... Would I get any
> performance increases by mounting NFS exported partition as Async?
>
> Would my soul be tormented in purgatory for doing it?
>
> Just to be clear... I am wondering if mounting (on the NFS _server_) a
> partition (that is exportable) as async will have any performance
> benefits to the NFS clients?

As a first guess, probably not unless you have a large number of active
clients. Any modern hard disc will outperform ethernet/fast ethernet,
especially for larger read/writes. For large numbers of smaller
operations, or when there is a large number of simultaneous outstanding
requests from clients, maybe. I'd say watch the disc itself (iostat is
your friend), and if it's pegged (especially large numbers of tps) async
might buy you some increase.

--
Matthew Fuller (MF4839)     |[EMAIL PROTECTED]
Unix Systems Administrator  |[EMAIL PROTECTED]
Specializing in FreeBSD     |http://www.over-yonder.net/
FutureSouth Communications  |ISPHelp ISP Consulting

"The only reason I'm burning my candle at both ends, is because I
haven't figured out how to light the middle yet"
Re: Async NFS exports?
:I asked this on stable but didn't get a response... Would I get any
:performance increases by mounting NFS exported partition as Async?
:
:Would my soul be tormented in purgatory for doing it?
:
:Just to be clear... I am wondering if mounting (on the NFS _server_) a
:partition (that is exportable) as async will have any performance
:benefits to the NFS clients?
:
:-Steve

    w/NFSv3 I doubt mounting the exported partitions async will increase
    performance much. I would not use an async mount - if this is an NFS
    server it needs to be as reliable as possible and async mounting the
    partition is going to hurt if the machine crashes.

                                        -Matt
                                        Matthew Dillon
                                        [EMAIL PROTECTED]
Re: Async NFS exports?
:I asked this on stable but didn't get a response... Would I get any
:performance increases by mounting NFS exported partition as Async?
:
:Would my soul be tormented in purgatory for doing it?
:
:Just to be clear... I am wondering if mounting (on the NFS _server_) a
:partition (that is exportable) as async will have any performance
:benefits to the NFS clients?
:
:-Steve

    Ok, I've run some more tests. Basically you want to run NFSv3 under
    CURRENT and you want to run at least 3 nfsiod's. On a 100BaseTX network
    this will give you unsaturated write performance in the ballpark of
    9 MBytes/sec. Saturated write performance, that is where you write more
    than the client-side buffer cache can handle, will stabilize at
    2.5 MBytes/sec.

    I have a patch for CURRENT which will increase the saturated write
    performance to 4.5 MBytes/sec (basically by moving the nfs_commit() from
    nfs_writebp() to nfs_doio() so it can be asynchronized). Hopefully that
    patch will go in soon but there's a pretty big backlog of patches that
    haven't gone in yet, some over a week and a half old, so...

    In any case, even without the patch if you run a couple of nfsiod's and
    do not saturate the buffer cache you should get optimal performance.

    Back-porting the patch for nfs_commit to STABLE is possible but is not
    likely to help much because the major performance restriction in STABLE
    is related to buffer cache management, not NFS.

    OS        #nfsiod's   unsaturated     saturated
                          write perf.     write perf.
                          ( ......... 100BASETX ......... )
    CURRENT   0           9 MBytes/sec    2.5 MBytes/sec
    CURRENT   4           9 MBytes/sec    4.5 MBytes/sec (w/patch)
    STABLE    0           3 MBytes/sec    3 MBytes/sec (1)
    STABLE    4           4 MBytes/sec    3 MBytes/sec (1)

    note(1): saturated performance under STABLE is extremely inconsistent

                                        -Matt
                                        Matthew Dillon
                                        [EMAIL PROTECTED]
Re: Async NFS exports?
Emm, I guess that answers my earlier question/mail:

Why?---

basil# uname -a
FreeBSD basil.dympna.com 3.2-RELEASE FreeBSD 3.2-RELEASE #7: Thu Aug 19 23:59:50 CDT 1999
    [EMAIL PROTECTED]:/export/current/src/sys/compile/Basil-SMP
[Dual PPro-233's]

basil# cd /stripe
basil# df -k .
Filesystem         1K-blocks   Used     Avail Capacity  Mounted on
/dev/vinum/stripe   17197511  86511  15735200     1%    /stripe
basil# Bonnie -s 256
              -------Sequential Output-------- ---Sequential Input--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
          256 10817 97.3 15805 93.1  6338 41.4  9943 97.5 15796 51.2

basil# mount_nfs -3 localhost:/stripe /mnt
basil# cd /mnt
basil# Bonnie -s 256
              -------Sequential Output-------- ---Sequential Input--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
          256  4270 57.6  6639 30.6  1877 11.7  3804 55.3  6201 18.7
Re: Async NFS exports?
:Emm, I guess that answers my earlier question/mail:
:
:Why?---
:
:/dev/vinum/stripe   17197511  86511  15735200     1%    /stripe
:basil# Bonnie -s 256
:              -------Sequential Output-------- ---Sequential Input--
:              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
:Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
:          256 10817 97.3 15805 93.1  6338 41.4  9943 97.5 15796 51.2
:
:
:basil# mount_nfs -3 localhost:/stripe /mnt
:basil# cd /mnt
:basil# Bonnie -s 256
:
:              -------Sequential Output-------- ---Sequential Input--
:              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
:Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
:          256  4270 57.6  6639 30.6  1877 11.7  3804 55.3  6201 18.7

    Buffer copy and protocol overhead, plus the data is being cached twice:
    once on the server and once on the client (which happens to be the same
    machine). The machine becomes cpu-bound with all the extra work.

    This is what I get. Ignore the read numbers (the machine has a lot of
    memory). This is with nfsd -n 4 and nfsiod -n 4, NFSv3 UDP mount, on a
    dual P-III/450 running CURRENT.

/dev/da0h             2338236 1235921   915257    57%    /usr/obj

     -------Sequential Output-------- ---Sequential Input-- --Random--
     -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
  MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  K/sec %CPU   /sec %CPU
 256  9979 51.4  9154 19.0  9814 22.5 19727 99.8 103863 100.0 2163.4 43.2

localhost:/usr/obj/   2338236 1235921   915257    57%    /mnt

     -------Sequential Output-------- ---Sequential Input-- --Random--
     -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
  MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  K/sec %CPU   /sec %CPU
 256  3532 30.9  4164  7.6  4364 14.9 13700 99.9  75964 99.7 2779.7 162.1

    As you can see, I get very similar numbers for writing. 9 MB/sec on the
    local filesystem and 4.3 MB/sec via a localhost: NFS mount.

                                        -Matt
                                        Matthew Dillon
                                        [EMAIL PROTECTED]
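As a rough check on "very similar numbers", the block-write figures from the two Bonnie runs can be compared as a fraction of local-disk throughput. The K/sec values below are copied from the output above; the dictionary labels and the nfs_fraction helper are just illustrative names.

```python
# Block sequential-output figures (K/sec) copied from the Bonnie runs above.
runs = {
    "3.2-RELEASE, dual PPro-233": {"local": 15805, "nfs": 6639},
    "CURRENT, dual P-III/450": {"local": 9154, "nfs": 4164},
}


def nfs_fraction(local_kps, nfs_kps):
    """Throughput over the localhost NFS mount as a fraction of local disk."""
    return nfs_kps / local_kps


for name, r in runs.items():
    frac = nfs_fraction(r["local"], r["nfs"])
    print(f"{name}: localhost NFS block writes at {frac:.0%} of local speed")
```

Both machines land at somewhat under half of local block-write speed over a localhost mount, which fits the explanation of copy overhead and double caching on a single cpu-bound machine.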
Re: Async NFS exports?
: The reason is due to the way NFSv3 issues writes. NFSv3 issues a
: write but no longer assumes that the write has been synced to the
: server's disk as of when the reply comes back. Instead it keeps the
:..
:
:If you are looking for more optimizations, you can delay NFS write
:operations for some interval X, on all write buffers, entirely.
:
:Sun does this (it is technically called "write gathering" in the
:literature).

    The client-side pushes a buffer out the moment it fills up. Since this
    corresponds to a maximally-sized NFS data packet the client will not be
    more efficient by delaying it.

    The server-side in the FreeBSD implementation *DOES* do write-clustering,
    as I demonstrate below. The server does a better job of write-clustering
    when the client is a FreeBSD-CURRENT box, though, because FreeBSD-CURRENT
    clients generate a more uniform data flow. The server is able to do
    write-clustering w/ NFSv3 mounts because data writes are not required to
    be synched to disk until the client requests a commit. Note the KB/t
    fields below.

    CURRENT server, CURRENT client. iostat on server while client writes

apollo:/usr/src/sys/nfs# iostat da0 1
      tty             da0              cpu
 tin tout   KB/t tps  MB/s  us ni sy in  id
   0   20   0.00   0  0.00   1  0  1  0  98
   0   43   0.00   0  0.00   0  0  1  0  99
   0   43   1.00   1  0.00   0  0  0  0 100
   0   42   6.86   7  0.05   0  0 10  0  90
   0   43  61.80  68  4.12   1  0 25 21  53
   0   43  64.00  72  4.52   0  0 18 28  54
   0   43  64.00  73  4.58   0  0 21 25  54
   0   42  62.29  74  4.51   0  0 24 32  45
   0   43  63.23  72  4.46   0  0 22 23  54
   0   43  64.00  32  1.98   0  0  9  9  82

    CURRENT server, STABLE client. iostat on server while client writes

apollo:/usr/src/sys/nfs# iostat da0 1
      tty             da0              cpu
 tin tout   KB/t tps  MB/s  us ni sy in  id
   0   20   0.00   0  0.00   1  0  1  0  98
   0   43  43.43  98  4.15   0  0 15 21  64
   0   42  17.48 107  1.82   0  0  9  7  84
   0   43  62.49  73  4.47   0  0 19 33  48
   0   43  19.93 121  2.35   0  0  8  9  84
   0   43  43.07  85  3.58   0  0 17 22  61
   0   42  46.77  90  4.11   0  0 21 22  57
   0   43  15.63 108  1.65   0  0  8  7  85
   0   43  64.00  70  4.39   2  0 17 27  54

    However, even a FreeBSD-CURRENT client breaks down when its buffer cache
    saturates. The example below demonstrates this:

    CURRENT server, CURRENT client, iostat of server disks while client writes

    (client buffer cache not saturated)
   0   43   0.00   0  0.00   0  0  0  0 100
   0   43  56.00  26  1.41   0  0 14  5  81
   0   42  64.00  72  4.52   0  0 21 26  53
   0   43  64.00  73  4.58   0  0 25 19  57
  11   54  64.00  71  4.46   2  0 17 28  53
   0   42  62.27  73  4.45   0  0 26 22  52
   5   48  64.00  72  4.52   0  0 22 24  53
   5   48  52.72  82  4.23   0  0 20 26  53
    (client buffer cache saturates)
  11   54  18.76 163  2.99   2  0 12 16  71
   1   43  19.20 129  2.41   1  0 13 11  75
   7   70  20.65 146  2.95   2  0 14 16  67
   4   47  17.53 130  2.22   0  0  9 15  77
   6   51  24.63 113  2.72   1  0 10 13  76
   4   73  15.17 153  2.27   0  0 11  7  82

    The problem that occurs on the FreeBSD server is simply that the
    nfsrv_commit() procedure calls fsync() on the file... on the *ENTIRE*
    file, for every commit rpc, rather than syncing just the offset/range
    requested. I am looking into ways to fix this.

                                        -Matt
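For anyone reading the iostat columns above, the MB/s column is roughly KB/t times tps divided by 1024, so the drop in KB/t is what costs throughput when the client's buffer cache saturates. A small worked check, using two rows taken from the tables above (the function name and the sample labels are just for illustration):

```python
def mb_per_sec(kb_per_transfer, transfers_per_sec):
    """Approximate iostat MB/s from its KB/t and tps columns."""
    return kb_per_transfer * transfers_per_sec / 1024.0


# (label, KB/t, tps, MB/s as reported) copied from the iostat output above.
samples = [
    ("clustered, cache not saturated", 64.00, 72, 4.52),
    ("client buffer cache saturated", 18.76, 163, 2.99),
]

for label, kbt, tps, reported in samples:
    print(f"{label}: computed {mb_per_sec(kbt, tps):.2f} MB/s, "
          f"iostat reported {reported:.2f} MB/s")
```

The saturated case issues more than twice as many transfers per second yet moves less data, because each transfer is under a third of the 64 KB cluster size. The small gap between computed and reported figures comes from iostat rounding the displayed columns.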
Re: Async NFS exports?
: Just to be clear... I am wondering if mounting (on the NFS _server_) a
: partition (that is exportable) as async will have any performance
: benefits to the NFS clients?
:
:As a first guess, probably not unless you have a large number of active
:clients. Any modern hard disc will outperform ethernet/fast ethernet,
:especially for larger read/writes. For large numbers of smaller
:operations, or when there is a large number of simultaneous outstanding
:requests from clients, maybe. I'd say watch the disc itself (iostat is
:your friend), and if it's pegged (especially large numbers of tps) async
:might buy you some increase.
:--
:Matthew Fuller (MF4839) |fulle...@over-yonder.net

    Not much if at all, whether you have a large number of clients or not,
    at least if you are using NFSv3 mounts.

    The reason is the way NFSv3 issues writes. NFSv3 issues a write but
    no longer assumes that the write has been synced to the server's disk
    as of when the reply comes back. Instead it keeps the buffer around and
    does a later commit rpc to do the sync, presumably long after the server
    has already synced the data. So, effectively, all NFSv3 writes are async
    insofar as the client's buffer cache is able to keep abreast of the
    write-rate.

    Hmm, interesting. I see another optimization I can do to fix the
    buffer cache saturation case in CURRENT on the client. The COMMIT rpc's
    aren't being issued async.

                                        -Matt
                                        Matthew Dillon
                                        dil...@backplane.com
Re: Async NFS exports?
On Fri, 20 Aug 1999, Steve Ames wrote:

> I asked this on stable but didn't get a response... Would I get any
> performance increases by mounting NFS exported partition as Async?
>
> Would my soul be tormented in purgatory for doing it?
>
> Just to be clear... I am wondering if mounting (on the NFS _server_) a
> partition (that is exportable) as async will have any performance
> benefits to the NFS clients?

of course it will, but it'll also potentially cause catastrophic damage
to your filesystem if the machine dies for some reason.

have a look at:
/usr/src/sys/contrib/softupdates/README
or possibly
/usr/src/contrib/softupdates/README

also, play with the nfs sysctl's and mount options to tune performance.

sysctl -a | grep nfs

good luck,
-Alfred Perlstein - [bri...@rush.net|alf...@freebsd.org]
Wintelcom systems administrator and programmer
   - http://www.wintelcom.net/ [bri...@wintelcom.net]
Re: Async NFS exports?
:Ok, I've run some more tests. Basically you want to run NFSv3 under
:CURRENT and you want to run at least 3 nfsiod's. On a 100BaseTX network

    Oh, let me be a bit more clear: Run 3 nfsiod's on the client. Run 4
    nfsd's on the server. e.g. 'nfsiod -n 3' on the client and 'nfsd -n 4'
    on the server.

    To realize maximum NFS performance both the server and client should be
    running CURRENT. If non-FreeBSD clients cannot achieve good performance,
    the problem is with those clients and nothing you do on the server will
    help.

                                        -Matt
Re: Async NFS exports?
> The reason is due to the way NFSv3 issues writes. NFSv3 issues a
> write but no longer assumes that the write has been synced to the
> server's disk as of when the reply comes back. Instead it keeps the
> buffer around and does a later commit rpc to do the sync, presumably
> long after the server has already synced the data. So, effectively,
> all NFSv3 writes are async insofar as the client's buffer cache is
> able to keep abreast of the write-rate.
>
> Hmm, interesting. I see another optimization I can do to fix the
> buffer cache saturation case in CURRENT on the client. The COMMIT rpc's
> aren't being issued async.

If you are looking for more optimizations, you can delay NFS write
operations for some interval X, on all write buffers, entirely.

Sun does this (it is technically called "write gathering" in the
literature).

The upshot of introducing this latency between request and action
is that multiple writes to adjacent regions on non-buffer boundaries
are batched up for processing by the NFS server.

The net result is that you can do more writes, while generating
less wire traffic.

Be forewarned, however, that you must include synchronization
primitives at the appropriate places. Among these are:

1)	When a write lock is acquired, a write is issued in a
	region spanned by the lock, and the lock released, the
	lock release must also be delayed. The reason for this
	is that the lock is being used to ensure distributed
	cache coherency (actually, in effect, a lease-stall for
	other clients wanting to do writes).

2)	When a conflicting lock is requested, you need to initiate
	a synchronization point. This is because even though they
	are proxied from the same system ID, the NFS locks are on
	different process ID's, and some NFS servers will enforce
	based on this, even though the locks are, putatively,
	advisory, not mandatory.

3)	On file deletions. Generally, this just boils down to
	not being able to delete an NFS file in actuality, unless
	you are the last closer on the file. Implementation-wise,
	it means that a delete operation is a request-to-sync
	operation.

4)	Etc..

While these aren't technically necessary right now (in FreeBSD,
since FreeBSD fails to implement NFS locking), omitting them now
will make it nearly impossible to repair later, should NFS locking
support arise from my (or someone else's) kernel patches being
applied, and someone hacking up the lock and stat daemons to make
the proper system calls.

					Terry Lambert
					te...@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
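The batching idea behind write gathering can be illustrated with a toy coalescing step: writes held back for the delay interval are merged when their byte ranges touch or overlap, so fewer requests go out. This is only a sketch of the general technique, not Sun's implementation; the gather_writes name and the (offset, length) representation are assumptions, and it leaves out the synchronization points (lock release, conflicting lock, delete) listed above, each of which would have to force the pending writes out.

```python
def gather_writes(pending):
    """Coalesce pending (offset, length) writes whose byte ranges touch
    or overlap into the smallest list of contiguous (offset, length) runs."""
    merged = []
    for offset, length in sorted(pending):
        end = offset + length
        if merged and offset <= merged[-1][1]:
            # Touches or overlaps the previous run: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


# Three small adjacent writes and one distant write, in arrival order.
delayed = [(0, 512), (512, 512), (4096, 100), (1024, 512)]
print(gather_writes(delayed))
```

With these inputs the three adjacent 512-byte writes collapse into a single 1536-byte request and the distant write stays separate, so four requests become two.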
Re: Async NFS exports?
Emm, I guess that answers my earlier question/mail:

Why?---

basil# uname -a
FreeBSD basil.dympna.com 3.2-RELEASE FreeBSD 3.2-RELEASE #7: Thu Aug 19 23:59:50 CDT 1999 rs...@basil.dympna.com:/export/current/src/sys/compile/Basil-SMP
[Dual PPro-233's]

basil# cd /stripe
basil# df -k .
Filesystem         1K-blocks   Used    Avail Capacity  Mounted on
/dev/vinum/stripe   17197511  86511 15735200     1%    /stripe

basil# Bonnie -s 256
              -------Sequential Output-------- ---Sequential Input--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
          256 10817 97.3 15805 93.1  6338 41.4  9943 97.5 15796 51.2

basil# mount_nfs -3 localhost:/stripe /mnt
basil# cd /mnt
basil# Bonnie -s 256
              -------Sequential Output-------- ---Sequential Input--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
          256  4270 57.6  6639 30.6  1877 11.7  3804 55.3  6201 18.7

Matthew Dillon wrote:

:I asked this on stable but didn't get a response... Would I get any
:performance increases by mounting NFS exported partition as Async?
:
:Would my soul be tormented in purgatory for doing it?
:
:Just to be clear... I am wondering if mounting (on the NFS _server_) a
:partition (that is exportable) as async will have any performance
:benefits to the NFS clients?
:
:-Steve

Ok, I've run some more tests. Basically you want to run NFSv3 under CURRENT and you want to run at least 3 nfsiod's. On a 100BaseTX network this will give you unsaturated write performance in the ballpark of 9 MBytes/sec. Saturated write performance, that is, where you write more than the client-side buffer cache can handle, will stabilize at 2.5 MBytes/sec.

I have a patch for CURRENT which will increase the saturated write performance to 4.5 MBytes/sec (basically by moving the nfs_commit() from nfs_writebp() to nfs_doio() so it can be asynchronized). Hopefully that patch will go in soon but there's a pretty big backlog of patches that haven't gone in yet, some over a week and a half old, so...

In any case, even without the patch, if you run a couple of nfsiod's and do not saturate the buffer cache you should get optimal performance.

Back-porting the patch for nfs_commit to STABLE is possible but is not likely to help much because the major performance restriction in STABLE is related to buffer cache management, not NFS.

OS       #nfsiod's  unsaturated    saturated
                    write perf.    write perf.
                    ( ........ 100BASETX ........ )
CURRENT  0          9 MBytes/sec   2.5 MBytes/sec
CURRENT  4          9 MBytes/sec   4.5 MBytes/sec (w/patch)
STABLE   0          3 MBytes/sec   3 MBytes/sec (1)
STABLE   4          4 MBytes/sec   3 MBytes/sec (1)

note(1): saturated performance under STABLE is extremely inconsistent

-Matt
Matthew Dillon
dil...@backplane.com
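For anyone following the nfs_commit() discussion, here is a rough sketch of the NFSv3 unstable-write/commit handshake that any asynchronous commit still has to honor. It is a toy model, not FreeBSD code: the Server and Client classes, the integer verifier, and the dict standing in for the disk are all assumptions made for illustration. In the real protocol the write verifier is an opaque 8-byte cookie returned by both WRITE and COMMIT, and a change in it tells the client that uncommitted data may have been lost.

```python
import itertools


class Server:
    """Toy NFSv3 server (hypothetical): unstable writes stay in memory until COMMIT."""

    _boot_ids = itertools.count(1)  # stand-in for a per-boot write verifier

    def __init__(self):
        self.disk = {}    # offset -> data that has reached stable storage
        self.cache = {}   # offset -> data written UNSTABLE, not yet synced
        self.verifier = next(self._boot_ids)

    def write_unstable(self, offset, data):
        self.cache[offset] = data
        return self.verifier

    def commit(self):
        self.disk.update(self.cache)
        self.cache.clear()
        return self.verifier

    def reboot(self):
        # Uncommitted data is lost and the verifier changes.
        self.cache.clear()
        self.verifier = next(self._boot_ids)


class Client:
    """Toy client (hypothetical): keeps each buffer until a COMMIT confirms it."""

    def __init__(self, server):
        self.server = server
        self.pending = {}  # offset -> (data, verifier returned by WRITE)

    def write(self, offset, data):
        verf = self.server.write_unstable(offset, data)
        self.pending[offset] = (data, verf)

    def commit(self):
        while self.pending:
            verf = self.server.commit()
            # Buffers written under a different verifier may be gone: resend them.
            stale = {off: data for off, (data, v) in self.pending.items() if v != verf}
            self.pending.clear()
            for off, data in stale.items():
                self.write(off, data)


server = Server()
client = Client(server)
client.write(0, b"first")
server.reboot()                 # server loses the uncommitted block
client.write(8192, b"second")
client.commit()                 # verifier mismatch forces a resend of offset 0
print(sorted(server.disk))
```

The point of the sketch is only that the commit result has to be checked somewhere, whether the RPC is issued synchronously or from an nfsiod.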
Re: Async NFS exports?
:Emm, I guess that answers my earlier question/mail:
:
:Why?---
:
:/dev/vinum/stripe   17197511  86511 15735200     1%    /stripe
:basil# Bonnie -s 256
:              -------Sequential Output-------- ---Sequential Input--
:              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
:Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
:          256 10817 97.3 15805 93.1  6338 41.4  9943 97.5 15796 51.2
:
:
:basil# mount_nfs -3 localhost:/stripe /mnt
:basil# cd /mnt
:basil# Bonnie -s 256
:
:              -------Sequential Output-------- ---Sequential Input--
:              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
:Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU
:          256  4270 57.6  6639 30.6  1877 11.7  3804 55.3  6201 18.7

Buffer copy and protocol overhead, plus the data is being cached twice: once on the server and once on the client (which happens to be the same machine). The machine becomes cpu-bound with all the extra work.

This is what I get. Ignore the read numbers (the machine has a lot of memory). This is with nfsd -n 4 and nfsiod -n 4, NFSv3 UDP mount, on a dual P-III/450 running CURRENT.

/dev/da0h             2338236 1235921   915257    57%    /usr/obj

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
           MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
          256  9979 51.4  9154 19.0  9814 22.5 19727 99.8 103863 100.0 2163.4 43.2

localhost:/usr/obj/   2338236 1235921   915257    57%    /mnt

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
           MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
          256  3532 30.9  4164  7.6  4364 14.9 13700 99.9 75964 99.7 2779.7 162.1

As you can see, I get very similar numbers for writing. 9 MB/sec on the local filesystem and 4.3 MB/sec via a localhost: NFS mount.

-Matt
Matthew Dillon
dil...@backplane.com
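To put a number on "very similar", the block-write columns from the two sets of Bonnie runs above can be compared directly. This is just arithmetic on the figures already posted; the dictionary layout and the nfs_fraction() helper are made up for the example.

```python
# Block-write throughput in K/sec, copied from the Bonnie runs in this thread.
runs = {
    "basil (3.2-RELEASE, vinum stripe)": {"local": 15805, "nfs": 6639},
    "dual P-III/450 (CURRENT, da0h)": {"local": 9154, "nfs": 4164},
}


def nfs_fraction(run):
    """Fraction of local block-write throughput seen over the loopback NFS mount."""
    return run["nfs"] / run["local"]


for name, run in runs.items():
    print(f"{name}: loopback NFS reaches {nfs_fraction(run):.0%} of local block-write speed")
```

Both machines end up at a bit under half of their local write rate, which fits the double-caching and copy overhead explanation, though two data points are not much to go on.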
Re: Async NFS exports?
:    The reason is due to the way NFSv3 issues writes. NFSv3 issues a
:    write but no longer assumes that the write has been synced to the
:    server's disk as of when the reply comes back. Instead it keeps the
:..
:
:If you are looking for more optimizations, you can delay NFS write
:operations for some interval X, on all write buffers, entirely.
:
:Sun does this (it is technically called write gathering in the
:literature).

The client-side pushes a buffer out the moment it fills up. Since this corresponds to a maximally-sized NFS data packet the client will not be more efficient by delaying it.

The server-side in the FreeBSD implementation *DOES* do write-clustering, as I demonstrate below. The server does a better job of write-clustering when the client is a FreeBSD-CURRENT box, though, because FreeBSD-CURRENT clients generate a more uniform data flow. The server is able to do write-clustering w/ NFSv3 mounts because data writes are not required to be synced to disk until the client requests a commit. Note the KB/t fields below.

CURRENT server, CURRENT client. iostat on server while client writes

apollo:/usr/src/sys/nfs# iostat da0 1
      tty             da0              cpu
 tin tout   KB/t tps  MB/s   us ni sy in id
   0   20   0.00   0  0.00    1  0  1  0 98
   0   43   0.00   0  0.00    0  0  1  0 99
   0   43   1.00   1  0.00    0  0  0  0 100
   0   42   6.86   7  0.05    0  0 10  0 90
   0   43  61.80  68  4.12    1  0 25 21 53
   0   43  64.00  72  4.52    0  0 18 28 54
   0   43  64.00  73  4.58    0  0 21 25 54
   0   42  62.29  74  4.51    0  0 24 32 45
   0   43  63.23  72  4.46    0  0 22 23 54
   0   43  64.00  32  1.98    0  0  9  9 82

CURRENT server, STABLE client. iostat on server while client writes

apollo:/usr/src/sys/nfs# iostat da0 1
      tty             da0              cpu
 tin tout   KB/t tps  MB/s   us ni sy in id
   0   20   0.00   0  0.00    1  0  1  0 98
   0   43  43.43  98  4.15    0  0 15 21 64
   0   42  17.48 107  1.82    0  0  9  7 84
   0   43  62.49  73  4.47    0  0 19 33 48
   0   43  19.93 121  2.35    0  0  8  9 84
   0   43  43.07  85  3.58    0  0 17 22 61
   0   42  46.77  90  4.11    0  0 21 22 57
   0   43  15.63 108  1.65    0  0  8  7 85
   0   43  64.00  70  4.39    2  0 17 27 54

However, even FreeBSD-CURRENT breaks down when its buffer cache saturates. The example below demonstrates this:

CURRENT server, CURRENT client, iostat of server disks while client writes

(client buffer cache not saturated)
   0   43   0.00   0  0.00    0  0  0  0 100
   0   43  56.00  26  1.41    0  0 14  5 81
   0   42  64.00  72  4.52    0  0 21 26 53
   0   43  64.00  73  4.58    0  0 25 19 57
  11   54  64.00  71  4.46    2  0 17 28 53
   0   42  62.27  73  4.45    0  0 26 22 52
   5   48  64.00  72  4.52    0  0 22 24 53
   5   48  52.72  82  4.23    0  0 20 26 53
(client buffer cache saturates)
  11   54  18.76 163  2.99    2  0 12 16 71
   1   43  19.20 129  2.41    1  0 13 11 75
   7   70  20.65 146  2.95    2  0 14 16 67
   4   47  17.53 130  2.22    0  0  9 15 77
   6   51  24.63 113  2.72    1  0 10 13 76
   4   73  15.17 153  2.27    0  0 11  7 82

The problem that occurs on the FreeBSD server is simply that the nfsrv_commit() procedure calls fsync() on the file... on the *ENTIRE* file, for every commit rpc, rather than syncing just the offset/range requested. I am looking into ways to fix this.

-Matt
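The difference between a whole-file sync and a ranged commit can be illustrated with a small model. This is only a sketch under stated assumptions, not the nfsrv_commit() code: the FileState class, its method names, and the 8192-byte block size are invented for the example. The one protocol detail relied on is that an NFSv3 COMMIT carries an offset and a count, with a count of 0 meaning "from offset to end of file".

```python
BLOCK = 8192  # illustrative block size, not taken from the thread


class FileState:
    """Hypothetical per-file server state: which blocks are dirty in the cache."""

    def __init__(self):
        self.dirty = set()   # block-aligned offsets not yet on disk
        self.flushed = 0     # total blocks pushed to disk so far

    def write(self, offset):
        self.dirty.add(offset - offset % BLOCK)

    def commit_whole_file(self, offset, count):
        # Behaviour described above: the requested range is ignored
        # and every dirty block of the file is synced.
        n = len(self.dirty)
        self.flushed += n
        self.dirty.clear()
        return n

    def commit_range(self, offset, count):
        # Sync only the dirty blocks overlapping [offset, offset + count).
        end = float("inf") if count == 0 else offset + count
        hit = {b for b in self.dirty if b + BLOCK > offset and b < end}
        self.dirty -= hit
        self.flushed += len(hit)
        return len(hit)


f = FileState()
for i in range(100):
    f.write(i * BLOCK)

# The client commits one block while 99 others are still being gathered.
print("ranged commit flushed", f.commit_range(0, BLOCK), "block(s)")
print("whole-file commit flushed", f.commit_whole_file(0, BLOCK), "block(s)")
```

With the whole-file variant, each commit forces out blocks the server might otherwise have clustered into larger writes, which would be consistent with the drop in KB/t once the client cache saturates and commits become frequent.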