Re: Low nfs write throughput

2011-12-01 Thread John Baldwin
On Thursday, December 01, 2011 12:35:23 am Jeremy Chadwick wrote:
> On Tue, Nov 29, 2011 at 10:36:44AM -0500, John Baldwin wrote:
> > On Monday, November 28, 2011 7:12:39 pm Daryl Sayers wrote:
> > > > "Bengt" == Bengt Ahlgren  writes:
> > > 
> > > > Daryl Sayers  writes:
> > > >> Can anyone suggest why I am getting poor write performance from my nfs 
> > > >> setup.
> > > >> I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus mother 
> > > >> boards,
> > > >> 4G mem and Dual core 3g processor using 147G 15k Seagate SAS drives 
> > > >> with
> > > >> onboard Gb network cards connected to an idle network. The results 
> > > >> below show
> > > >> that I get nearly 100Mb/s with a dd over rsh but only 15Mbs using nfs. 
> > > >> It
> > > >> improves if I use async but a smbfs mount still beats it. I am using 
> > > >> the same
> > > >> file, source and destinations for all tests. I have tried alternate 
> > > >> Network
> > > >> cards with no resulting benefit.
> > > 
> > > > [...]
> > > 
> > > >> Looking at a systat -v on the destination I see that the nfs test does 
> > > >> not
> > > >> exceed 16KB/t with 100% busy where the other tests reach up to 128KB/t.
> > > >> For the record I get reads of 22Mb/s without and 77Mb/s with async 
> > > >> turned on
> > > >> for the nfs mount.
> > > 
> > > > On an UFS filesystem you get NFS writes with the same size as the
> > > > filesystem blocksize.  So an easy way to improve performance is to
> > > > create a filesystem with larger blocks.  I accidentally found this out
> > > > when I had two NFS exported filesystems from the same box with 16K and
> > > > 64K blocksizes respectively.
> > > 
> > > > (Larger blocksize also tremendously improves the performance of UFS
> > > > snapshots!)
> > > 
> > > Thanks to all that answered. I did try the 'sysctl -w vfs.nfsrv.async=1' 
> > > with
> > > no reportable change in performance. We are using a UFS2 filesystem so the
> > > zfs command was not required. I did not try the patch as we would like to 
> > > stay
> > > as standard as possible but will upgrade if the patch is released in new
> > > kernel.
> > 
> > If you can test the patch then it is something I will likely put into the
> > next release.  I have already tested it as far as robustness locally, what
> > I don't have are good performance tests.  It would really be helpful if you
> > were able to test it.
> 
> John,
> 
> We'd like to test this patch[1], but need to know if it needs to be
> applied to just the system acting as the NFS server, or the NFS clients
> as well.
> 
> [1]: http://www.freebsd.org/~jhb/patches/nfs_server_cluster.patch

Just the NFS server.  I'm going to commit it to HEAD later today.
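
For reference, a rough sketch of applying the patch on the server side only,
assuming a stock /usr/src tree and the LOCAL kernel config name that appears in
Daryl's dmesg further down; paths and the config name are illustrative:

# fetch -o /tmp/nfs_server_cluster.patch http://www.freebsd.org/~jhb/patches/nfs_server_cluster.patch
# cd /usr/src && patch < /tmp/nfs_server_cluster.patch
# make buildkernel KERNCONF=LOCAL && make installkernel KERNCONF=LOCAL
# shutdown -r now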

-- 
John Baldwin


Re: Low nfs write throughput

2011-11-30 Thread Jeremy Chadwick
On Tue, Nov 29, 2011 at 10:36:44AM -0500, John Baldwin wrote:
> On Monday, November 28, 2011 7:12:39 pm Daryl Sayers wrote:
> > > "Bengt" == Bengt Ahlgren  writes:
> > 
> > > Daryl Sayers  writes:
> > >> Can anyone suggest why I am getting poor write performance from my nfs 
> > >> setup.
> > >> I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus mother 
> > >> boards,
> > >> 4G mem and Dual core 3g processor using 147G 15k Seagate SAS drives with
> > >> onboard Gb network cards connected to an idle network. The results below 
> > >> show
> > >> that I get nearly 100Mb/s with a dd over rsh but only 15Mbs using nfs. It
> > >> improves if I use async but a smbfs mount still beats it. I am using the 
> > >> same
> > >> file, source and destinations for all tests. I have tried alternate 
> > >> Network
> > >> cards with no resulting benefit.
> > 
> > > [...]
> > 
> > >> Looking at a systat -v on the destination I see that the nfs test does 
> > >> not
> > >> exceed 16KB/t with 100% busy where the other tests reach up to 128KB/t.
> > >> For the record I get reads of 22Mb/s without and 77Mb/s with async 
> > >> turned on
> > >> for the nfs mount.
> > 
> > > On an UFS filesystem you get NFS writes with the same size as the
> > > filesystem blocksize.  So an easy way to improve performance is to
> > > create a filesystem with larger blocks.  I accidentally found this out
> > > when I had two NFS exported filesystems from the same box with 16K and
> > > 64K blocksizes respectively.
> > 
> > > (Larger blocksize also tremendously improves the performance of UFS
> > > snapshots!)
> > 
> > Thanks to all that answered. I did try the 'sysctl -w vfs.nfsrv.async=1' 
> > with
> > no reportable change in performance. We are using a UFS2 filesystem so the
> > zfs command was not required. I did not try the patch as we would like to 
> > stay
> > as standard as possible but will upgrade if the patch is released in new
> > kernel.
> 
> If you can test the patch then it is something I will likely put into the
> next release.  I have already tested it as far as robustness locally, what
> I don't have are good performance tests.  It would really be helpful if you
> were able to test it.

John,

We'd like to test this patch[1], but need to know if it needs to be
applied to just the system acting as the NFS server, or the NFS clients
as well.

[1]: http://www.freebsd.org/~jhb/patches/nfs_server_cluster.patch

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: Low nfs write throughput

2011-11-30 Thread Daryl Sayers
> "John" == John Baldwin  writes:

> On Tuesday, November 29, 2011 6:56:27 pm Daryl Sayers wrote:
>> > "John" == John Baldwin  writes:
>> 
>> > On Monday, November 28, 2011 7:12:39 pm Daryl Sayers wrote:
>> >> > "Bengt" == Bengt Ahlgren  writes:
>> >> 
>> >> > Daryl Sayers  writes:
>> >> >> Can anyone suggest why I am getting poor write performance from my nfs 
>> >> >> setup.
>> >> >> I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus mother 
>> >> >> boards,
>> >> >> 4G mem and Dual core 3g processor using 147G 15k Seagate SAS drives 
>> >> >> with
>> >> >> onboard Gb network cards connected to an idle network. The results 
>> >> >> below show
>> >> >> that I get nearly 100Mb/s with a dd over rsh but only 15Mbs using nfs. 
>> >> >> It
>> >> >> improves if I use async but a smbfs mount still beats it. I am using 
>> >> >> the same
>> >> >> file, source and destinations for all tests. I have tried alternate 
>> >> >> Network
>> >> >> cards with no resulting benefit.
>> >> 
>> >> > [...]
>> >> 
>> >> >> Looking at a systat -v on the destination I see that the nfs test does 
>> >> >> not
>> >> >> exceed 16KB/t with 100% busy where the other tests reach up to 128KB/t.
>> >> >> For the record I get reads of 22Mb/s without and 77Mb/s with async 
>> >> >> turned on
>> >> >> for the nfs mount.
>> >> 
>> >> > On an UFS filesystem you get NFS writes with the same size as the
>> >> > filesystem blocksize.  So an easy way to improve performance is to
>> >> > create a filesystem with larger blocks.  I accidentally found this out
>> >> > when I had two NFS exported filesystems from the same box with 16K and
>> >> > 64K blocksizes respectively.
>> >> 
>> >> > (Larger blocksize also tremendously improves the performance of UFS
>> >> > snapshots!)
>> >> 
>> >> Thanks to all that answered. I did try the 'sysctl -w vfs.nfsrv.async=1' 
>> >> with
>> >> no reportable change in performance. We are using a UFS2 filesystem so the
>> >> zfs command was not required. I did not try the patch as we would like to 
>> >> stay
>> >> as standard as possible but will upgrade if the patch is released in new
>> >> kernel.
>> 
>> > If you can test the patch then it is something I will likely put into the
>> > next release.  I have already tested it as far as robustness locally, what
>> > I don't have are good performance tests.  It would really be helpful if you
>> > were able to test it.
>> 
>> >> Thanks Bengt for the suggestion of block size. Increasing the block size 
>> >> to
>> >> 64k made a significant improvement to performance.
>> 
>> > In theory the patch might have given you similar gains.  During my simple 
>> > tests
>> > I was able to raise the average I/O size in iostat to 70 to 80k from 16k.
>> 
>> OK, I downloaded and installed the patch and did some basic testing, and I can
>> confirm that the patch does improve performance. I can also see that the KB/t
>> now exceeds the 16KB/t that seemed to be the limiting factor before.

> Ok, thanks.  Does it give similar performance results to using 64k block size?

From the tests I have done, I get similar results to the block size change.


-- 
Daryl Sayers Direct: +612 95525510
Corinthian Engineering   Office: +612 95525500
Suite 54, Jones Bay Wharf   Fax: +612 95525549
26-32 Pirrama Rd  email: da...@ci.com.au
Pyrmont NSW 2009 Australia  www: http://www.ci.com.au


Re: Low nfs write throughput

2011-11-30 Thread John Baldwin
On Tuesday, November 29, 2011 6:56:27 pm Daryl Sayers wrote:
> > "John" == John Baldwin  writes:
> 
> > On Monday, November 28, 2011 7:12:39 pm Daryl Sayers wrote:
> >> > "Bengt" == Bengt Ahlgren  writes:
> >> 
> >> > Daryl Sayers  writes:
> >> >> Can anyone suggest why I am getting poor write performance from my nfs 
> >> >> setup.
> >> >> I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus mother 
> >> >> boards,
> >> >> 4G mem and Dual core 3g processor using 147G 15k Seagate SAS drives with
> >> >> onboard Gb network cards connected to an idle network. The results 
> >> >> below show
> >> >> that I get nearly 100Mb/s with a dd over rsh but only 15Mbs using nfs. 
> >> >> It
> >> >> improves if I use async but a smbfs mount still beats it. I am using 
> >> >> the same
> >> >> file, source and destinations for all tests. I have tried alternate 
> >> >> Network
> >> >> cards with no resulting benefit.
> >> 
> >> > [...]
> >> 
> >> >> Looking at a systat -v on the destination I see that the nfs test does 
> >> >> not
> >> >> exceed 16KB/t with 100% busy where the other tests reach up to 128KB/t.
> >> >> For the record I get reads of 22Mb/s without and 77Mb/s with async 
> >> >> turned on
> >> >> for the nfs mount.
> >> 
> >> > On an UFS filesystem you get NFS writes with the same size as the
> >> > filesystem blocksize.  So an easy way to improve performance is to
> >> > create a filesystem with larger blocks.  I accidentally found this out
> >> > when I had two NFS exported filesystems from the same box with 16K and
> >> > 64K blocksizes respectively.
> >> 
> >> > (Larger blocksize also tremendously improves the performance of UFS
> >> > snapshots!)
> >> 
> >> Thanks to all that answered. I did try the 'sysctl -w vfs.nfsrv.async=1' 
> >> with
> >> no reportable change in performance. We are using a UFS2 filesystem so the
> >> zfs command was not required. I did not try the patch as we would like to 
> >> stay
> >> as standard as possible but will upgrade if the patch is released in new
> >> kernel.
> 
> > If you can test the patch then it is something I will likely put into the
> > next release.  I have already tested it as far as robustness locally, what
> > I don't have are good performance tests.  It would really be helpful if you
> > were able to test it.
> 
> >> Thanks Bengt for the suggestion of block size. Increasing the block size to
> >> 64k made a significant improvement to performance.
> 
> > In theory the patch might have given you similar gains.  During my simple 
> > tests
> > I was able to raise the average I/O size in iostat to 70 to 80k from 16k.
> 
> OK, I downloaded and installed the patch and did some basic testing, and I can
> confirm that the patch does improve performance. I can also see that the KB/t
> now exceeds the 16KB/t that seemed to be the limiting factor before.

Ok, thanks.  Does it give similar performance results to using 64k block size?

-- 
John Baldwin


Re: Low nfs write throughput

2011-11-29 Thread Daryl Sayers
> "John" == John Baldwin  writes:

> On Monday, November 28, 2011 7:12:39 pm Daryl Sayers wrote:
>> > "Bengt" == Bengt Ahlgren  writes:
>> 
>> > Daryl Sayers  writes:
>> >> Can anyone suggest why I am getting poor write performance from my nfs 
>> >> setup.
>> >> I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus mother 
>> >> boards,
>> >> 4G mem and Dual core 3g processor using 147G 15k Seagate SAS drives with
>> >> onboard Gb network cards connected to an idle network. The results below 
>> >> show
>> >> that I get nearly 100Mb/s with a dd over rsh but only 15Mbs using nfs. It
>> >> improves if I use async but a smbfs mount still beats it. I am using the 
>> >> same
>> >> file, source and destinations for all tests. I have tried alternate 
>> >> Network
>> >> cards with no resulting benefit.
>> 
>> > [...]
>> 
>> >> Looking at a systat -v on the destination I see that the nfs test does not
>> >> exceed 16KB/t with 100% busy where the other tests reach up to 128KB/t.
>> >> For the record I get reads of 22Mb/s without and 77Mb/s with async turned 
>> >> on
>> >> for the nfs mount.
>> 
>> > On an UFS filesystem you get NFS writes with the same size as the
>> > filesystem blocksize.  So an easy way to improve performance is to
>> > create a filesystem with larger blocks.  I accidentally found this out
>> > when I had two NFS exported filesystems from the same box with 16K and
>> > 64K blocksizes respectively.
>> 
>> > (Larger blocksize also tremendously improves the performance of UFS
>> > snapshots!)
>> 
>> Thanks to all that answered. I did try the 'sysctl -w vfs.nfsrv.async=1' with
>> no reportable change in performance. We are using a UFS2 filesystem so the
>> zfs command was not required. I did not try the patch as we would like to 
>> stay
>> as standard as possible but will upgrade if the patch is released in new
>> kernel.

> If you can test the patch then it is something I will likely put into the
> next release.  I have already tested it as far as robustness locally, what
> I don't have are good performance tests.  It would really be helpful if you
> were able to test it.

>> Thanks Bengt for the suggestion of block size. Increasing the block size to
>> 64k made a significant improvement to performance.

> In theory the patch might have given you similar gains.  During my simple 
> tests
> I was able to raise the average I/O size in iostat to 70 to 80k from 16k.

OK, I downloaded and installed the patch and did some basic testing, and I can
confirm that the patch does improve performance. I can also see that the KB/t
now exceeds the 16KB/t that seemed to be the limiting factor before.
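
A quick way to watch the same number outside systat, for anyone repeating the
test, is iostat on the server's data disk while the dd is in flight; the KB/t
column should climb well past 16. The device name here is only an example:

gemini# iostat -w 1 da0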

-- 
Daryl Sayers Direct: +612 95525510
Corinthian Engineering   Office: +612 95525500
Suite 54, Jones Bay Wharf   Fax: +612 95525549
26-32 Pirrama Rd  email: da...@ci.com.au
Pyrmont NSW 2009 Australia  www: http://www.ci.com.au


Re: Low nfs write throughput

2011-11-29 Thread John Baldwin
On Monday, November 28, 2011 7:12:39 pm Daryl Sayers wrote:
> > "Bengt" == Bengt Ahlgren  writes:
> 
> > Daryl Sayers  writes:
> >> Can anyone suggest why I am getting poor write performance from my nfs 
> >> setup.
> >> I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus mother 
> >> boards,
> >> 4G mem and Dual core 3g processor using 147G 15k Seagate SAS drives with
> >> onboard Gb network cards connected to an idle network. The results below 
> >> show
> >> that I get nearly 100Mb/s with a dd over rsh but only 15Mbs using nfs. It
> >> improves if I use async but a smbfs mount still beats it. I am using the 
> >> same
> >> file, source and destinations for all tests. I have tried alternate Network
> >> cards with no resulting benefit.
> 
> > [...]
> 
> >> Looking at a systat -v on the destination I see that the nfs test does not
> >> exceed 16KB/t with 100% busy where the other tests reach up to 128KB/t.
> >> For the record I get reads of 22Mb/s without and 77Mb/s with async turned 
> >> on
> >> for the nfs mount.
> 
> > On an UFS filesystem you get NFS writes with the same size as the
> > filesystem blocksize.  So an easy way to improve performance is to
> > create a filesystem with larger blocks.  I accidentally found this out
> > when I had two NFS exported filesystems from the same box with 16K and
> > 64K blocksizes respectively.
> 
> > (Larger blocksize also tremendously improves the performance of UFS
> > snapshots!)
> 
> Thanks to all that answered. I did try the 'sysctl -w vfs.nfsrv.async=1' with
> no reportable change in performance. We are using a UFS2 filesystem so the
> zfs command was not required. I did not try the patch as we would like to stay
> as standard as possible but will upgrade if the patch is released in new
> kernel.

If you can test the patch then it is something I will likely put into the
next release.  I have already tested it as far as robustness locally, what
I don't have are good performance tests.  It would really be helpful if you
were able to test it.

> Thanks Bengt for the suggestion of block size. Increasing the block size to
> 64k made a significant improvement to performance.

In theory the patch might have given you similar gains.  During my simple tests
I was able to raise the average I/O size in iostat to 70 to 80k from 16k.

-- 
John Baldwin


Re: Low nfs write throughput

2011-11-28 Thread Daryl Sayers
> "Bengt" == Bengt Ahlgren  writes:

> Daryl Sayers  writes:
>> Can anyone suggest why I am getting poor write performance from my nfs setup.
>> I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus mother boards,
>> 4G mem and Dual core 3g processor using 147G 15k Seagate SAS drives with
>> onboard Gb network cards connected to an idle network. The results below show
>> that I get nearly 100Mb/s with a dd over rsh but only 15Mbs using nfs. It
>> improves if I use async but a smbfs mount still beats it. I am using the same
>> file, source and destinations for all tests. I have tried alternate Network
>> cards with no resulting benefit.

> [...]

>> Looking at a systat -v on the destination I see that the nfs test does not
>> exceed 16KB/t with 100% busy where the other tests reach up to 128KB/t.
>> For the record I get reads of 22Mb/s without and 77Mb/s with async turned on
>> for the nfs mount.

> On an UFS filesystem you get NFS writes with the same size as the
> filesystem blocksize.  So an easy way to improve performance is to
> create a filesystem with larger blocks.  I accidentally found this out
> when I had two NFS exported filesystems from the same box with 16K and
> 64K blocksizes respectively.

> (Larger blocksize also tremendously improves the performance of UFS
> snapshots!)

Thanks to all who answered. I did try 'sysctl -w vfs.nfsrv.async=1', with
no reportable change in performance. We are using a UFS2 filesystem, so the
zfs command was not required. I did not try the patch, as we would like to stay
as standard as possible, but will upgrade if the patch is released in a new
kernel.

Thanks, Bengt, for the suggestion about block size. Increasing the block size to
64k made a significant improvement to performance.
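
For anyone wanting to repeat the block size change, a rough sketch of what the
64k rebuild looks like; the device name is made up, newfs destroys the existing
filesystem (back up first), and the final mount assumes an fstab entry for
/dsk/ufs:

gemini# umount /dsk/ufs
gemini# newfs -U -b 65536 -f 8192 /dev/da0s1d
gemini# mount /dsk/ufs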

-- 
Daryl Sayers Direct: +612 95525510
Corinthian Engineering   Office: +612 95525500
Suite 54, Jones Bay Wharf   Fax: +612 95525549
26-32 Pirrama Rd  email: da...@ci.com.au
Pyrmont NSW 2009 Australia  www: http://www.ci.com.au


Re: Low nfs write throughput

2011-11-28 Thread Bengt Ahlgren
Daryl Sayers  writes:

> Can anyone suggest why I am getting poor write performance from my nfs setup.
> I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus mother boards,
> 4G mem and Dual core 3g processor using 147G 15k Seagate SAS drives with
> onboard Gb network cards connected to an idle network. The results below show
> that I get nearly 100Mb/s with a dd over rsh but only 15Mbs using nfs. It
> improves if I use async but a smbfs mount still beats it. I am using the same
> file, source and destinations for all tests. I have tried alternate Network
> cards with no resulting benefit.

[...]

> Looking at a systat -v on the destination I see that the nfs test does not
> exceed 16KB/t with 100% busy where the other tests reach up to 128KB/t.
> For the record I get reads of 22Mb/s without and 77Mb/s with async turned on
> for the nfs mount.

On an UFS filesystem you get NFS writes with the same size as the
filesystem blocksize.  So an easy way to improve performance is to
create a filesystem with larger blocks.  I accidentally found this out
when I had two NFS exported filesystems from the same box with 16K and
64K blocksizes respectively.

(Larger blocksize also tremendously improves the performance of UFS
snapshots!)
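
If it is not obvious what block size an exported filesystem was created with,
dumpfs will show it; the device name below is only an example:

# dumpfs /dev/da0s1d | grep -w bsize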

Bengt


Re: Low nfs write throughput

2011-11-21 Thread John Baldwin
On Friday, November 18, 2011 7:36:47 pm Xin LI wrote:
> Hi,
> 
> > I don't know if it will help with your performance, but I have some patches
> > to allow the NFS server to cluster writes.  You can try
> > www.freebsd.org/~jhb/patches/nfs_server_cluster.patch.  I've tested it on 8,
> > but it should probably apply fine to 9.
> 
> I think 9 would need some changes, I just made them with minimal
> compile testing, though.

Oops, 8 has the same problems, and actually it needs more fixes than that, as
the uio isn't initialized at that point.  I've updated the patch at the URL, so
it should now work for the new server.  Sorry. :/

-- 
John Baldwin


Re: Low nfs write throughput

2011-11-18 Thread Xin LI
Hi,

> I don't know if it will help with your performance, but I have some patches
> to allow the NFS server to cluster writes.  You can try
> www.freebsd.org/~jhb/patches/nfs_server_cluster.patch.  I've tested it on 8,
> but it should probably apply fine to 9.

I think 9 would need some changes; I just made them, though only with minimal
compile testing.

Cheers,
-- 
Xin LI  https://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die
Index: sys/fs/nfsserver/nfs_nfsdport.c
===
--- sys/fs/nfsserver/nfs_nfsdport.c	(revision 227689)
+++ sys/fs/nfsserver/nfs_nfsdport.c	(working copy)
@@ -90,20 +90,78 @@ SYSCTL_INT(_vfs_nfsd, OID_AUTO, issue_delegations,
 SYSCTL_INT(_vfs_nfsd, OID_AUTO, enable_locallocks, CTLFLAG_RW,
 &nfsrv_dolocallocks, 0, "Enable nfsd to acquire local locks on files");
 
-#define	NUM_HEURISTIC		1017
+#define	MAX_REORDERED_RPC	16
+#define	NUM_HEURISTIC		1031
 #define	NHUSE_INIT		64
 #define	NHUSE_INC		16
 #define	NHUSE_MAX		2048
 
 static struct nfsheur {
 	struct vnode *nh_vp;	/* vp to match (unreferenced pointer) */
-	off_t nh_nextr;		/* next offset for sequential detection */
+	off_t nh_nextoff;	/* next offset for sequential detection */
 	int nh_use;		/* use count for selection */
 	int nh_seqcount;	/* heuristic */
 } nfsheur[NUM_HEURISTIC];
 
 
 /*
+ * Heuristic to detect sequential operation.
+ */
+static struct nfsheur *
+nfsrv_sequential_heuristic(struct uio *uio, struct vnode *vp)
+{
+	struct nfsheur *nh;
+	int hi, try;
+
+	/* Locate best candidate. */
+	try = 32;
+	hi = ((int)(vm_offset_t)vp / sizeof(struct vnode)) % NUM_HEURISTIC;
+	nh = &nfsheur[hi];
+	while (try--) {
+		if (nfsheur[hi].nh_vp == vp) {
+			nh = &nfsheur[hi];
+			break;
+		}
+		if (nfsheur[hi].nh_use > 0)
+			--nfsheur[hi].nh_use;
+		hi = (hi + 1) % NUM_HEURISTIC;
+		if (nfsheur[hi].nh_use < nh->nh_use)
+			nh = &nfsheur[hi];
+	}
+
+	/* Initialize hint if this is a new file. */
+	if (nh->nh_vp != vp) {
+		nh->nh_vp = vp;
+		nh->nh_nextoff = uio->uio_offset;
+		nh->nh_use = NHUSE_INIT;
+		if (uio->uio_offset == 0)
+			nh->nh_seqcount = 4;
+		else
+			nh->nh_seqcount = 1;
+	}
+
+	/* Calculate heuristic. */
+	if ((uio->uio_offset == 0 && nh->nh_seqcount > 0) ||
+	uio->uio_offset == nh->nh_nextoff) {
+		/* See comments in vfs_vnops.c:sequential_heuristic(). */
+		nh->nh_seqcount += howmany(uio->uio_resid, 16384);
+		if (nh->nh_seqcount > IO_SEQMAX)
+			nh->nh_seqcount = IO_SEQMAX;
+	} else if (qabs(uio->uio_offset - nh->nh_nextoff) <= MAX_REORDERED_RPC *
+	imax(vp->v_mount->mnt_stat.f_iosize, uio->uio_resid)) {
+		/* Probably a reordered RPC, leave seqcount alone. */
+	} else if (nh->nh_seqcount > 1) {
+		nh->nh_seqcount /= 2;
+	} else {
+		nh->nh_seqcount = 0;
+	}
+	nh->nh_use += NHUSE_INC;
+	if (nh->nh_use > NHUSE_MAX)
+		nh->nh_use = NHUSE_MAX;
+	return (nh);
+}
+
+/*
  * Get attributes into nfsvattr structure.
  */
 int
@@ -567,58 +625,12 @@ nfsvno_read(struct vnode *vp, off_t off, int cnt,
 	int i;
 	struct iovec *iv;
 	struct iovec *iv2;
-	int error = 0, len, left, siz, tlen, ioflag = 0, hi, try = 32;
+	int error = 0, len, left, siz, tlen, ioflag = 0;
 	struct mbuf *m2 = NULL, *m3;
 	struct uio io, *uiop = &io;
 	struct nfsheur *nh;
 
-	/*
-	 * Calculate seqcount for heuristic
-	 */
-	/*
-	 * Locate best candidate
-	 */
-
-	hi = ((int)(vm_offset_t)vp / sizeof(struct vnode)) % NUM_HEURISTIC;
-	nh = &nfsheur[hi];
-
-	while (try--) {
-		if (nfsheur[hi].nh_vp == vp) {
-			nh = &nfsheur[hi];
-			break;
-		}
-		if (nfsheur[hi].nh_use > 0)
-			--nfsheur[hi].nh_use;
-		hi = (hi + 1) % NUM_HEURISTIC;
-		if (nfsheur[hi].nh_use < nh->nh_use)
-			nh = &nfsheur[hi];
-	}
-
-	if (nh->nh_vp != vp) {
-		nh->nh_vp = vp;
-		nh->nh_nextr = off;
-		nh->nh_use = NHUSE_INIT;
-		if (off == 0)
-			nh->nh_seqcount = 4;
-		else
-			nh->nh_seqcount = 1;
-	}
-
-	/*
-	 * Calculate heuristic
-	 */
-
-	if ((off == 0 && nh->nh_seqcount > 0) || off == nh->nh_nextr) {
-		if (++nh->nh_seqcount > IO_SEQMAX)
-			nh->nh_seqcount = IO_SEQMAX;
-	} else if (nh->nh_seqcount > 1) {
-		nh->nh_seqcount = 1;
-	} else {
-		nh->nh_seqcount = 0;
-	}
-	nh->nh_use += NHUSE_INC;
-	if (nh->nh_use > NHUSE_MAX)
-		nh->nh_use = NHUSE_MAX;
+	nh = nfsrv_sequential_heuristic(uiop, vp);
 	ioflag |= nh->nh_seqcount << IO_SEQSHIFT;
 
 	len = left = NFSM_RNDUP(cnt);
@@ -672,6 +684,7 @@ nfsvno_read(struct vnode *vp, off_t off, int cnt,
 		*mpp = NULL;
 		goto out;
 	}
+	nh->nh_nextoff = uiop->uio_offset;
 	tlen = len - uiop->uio_resid;
 	cnt = cnt < tlen ? cnt : tlen;
 	tlen = NFSM_RNDUP(cnt);
@@ -700,6 +713,7 @@ nfsvno_write(struct vnode *vp, off_t off, int retl
 	struct iovec *iv;
 	int ioflags, error;
 	struct uio io, *uiop = &io;
+	struct nfsheur *nh;
 
 	MALLOC(ivp, struct iovec *, cnt * sizeof (struct iovec), M_TEMP,
 	M_WAITOK);
@@ -733,7 +747,11 @@ nfsvno_write(struct vnode *vp, off_t off, int retl
 	uiop->uio_segflg = UIO_SYSSPACE;
 	NFSUIOPROC(uiop, p);
 	uiop->uio_offset = off;
+	n

Re: Low nfs write throughput

2011-11-18 Thread Rick Macklem
Bane Ivosev wrote:
> and if you use zfs also try this
> 
> zfs set sync=disabled
> 
I know diddly about zfs, but I believe some others have improved
zfs performance for NFS writing by moving the ZIL (the ZFS intent log) to a
dedicated device, sometimes an SSD. Apparently (again, I'm not knowledgeable)
you do have to be careful what SSD you use and how full you make it, if you
want good write performance on the SSD.
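
For the archives, a minimal sketch of what that looks like, assuming a pool
named tank and a spare SSD showing up as ada1 (both names are hypothetical):

# zpool add tank log ada1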

I should also note that use of these options (vfs.nfsrv.async=1 and the
above for zfs) is risky in the sense that recently written data can be
lost when a server crashes/reboots because the NFS clients don't know
to hold onto the data and re-write it after a server crash/reboot.

rick
ps: NFS write performance has been an issue since SUN released their
first implementation of it in 1985. The "big" server vendors typically
solve the problem with lots of non-volatile RAM in the server boxes.
(This solution requires server code that specifically knows how to
 use this non-volatile RAM. Such code is not in the FreeBSD servers.)

> On 11/18/11 04:10, Daryl Sayers wrote:
> > Can anyone suggest why I am getting poor write performance from my
> > nfs setup.
> > I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus
> > mother boards,
> > 4G mem and Dual core 3g processor using 147G 15k Seagate SAS drives
> > with
> > onboard Gb network cards connected to an idle network. The results
> > below show
> > that I get nearly 100Mb/s with a dd over rsh but only 15Mbs using
> > nfs. It
> > improves if I use async but a smbfs mount still beats it. I am using
> > the same
> > file, source and destinations for all tests. I have tried alternate
> > Network
> > cards with no resulting benefit.
> >
> > oguido# dd if=/u0/tmp/D2 | rsh castor dd of=/dsk/ufs/D2
> > 1950511+1 records in
> > 1950511+1 records out
> > 998661755 bytes transferred in 10.402483 secs (96002246 bytes/sec)
> > 1950477+74 records in
> > 1950511+1 records out
> > 998661755 bytes transferred in 10.115458 secs (98726301 bytes/sec)
> > (98Mb/s)
> >
> >
> > oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp gemini:/dsk/ufs
> > /mnt
> > oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> > 7619+1 records in
> > 7619+1 records out
> > 998661755 bytes transferred in 62.570260 secs (15960646 bytes/sec)
> > (15Mb/s)
> >
> >
> > oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp,async
> > gemini:/dsk/ufs /mnt
> > oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> > 7619+1 records in
> > 7619+1 records out
> > 998661755 bytes transferred in 50.697024 secs (19698627 bytes/sec)
> > (19Mb/s)
> >
> >
> > oguido# mount -t smbfs //gemini/ufs /mnt
> > oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> > 7619+1 records in
> > 7619+1 records out
> > 998661755 bytes transferred in 29.787616 secs (33526072 bytes/sec)
> > (33Mb/s)
> >
> > Looking at a systat -v on the destination I see that the nfs test
> > does not
> > exceed 16KB/t with 100% busy where the other tests reach up to
> > 128KB/t.
> > For the record I get reads of 22Mb/s without and 77Mb/s with async
> > turned on
> > for the nfs mount.
> >
> >
> > A copy of dmesg:
> > 
> >
> > Copyright (c) 1992-2011 The FreeBSD Project.
> > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993,
> > 1994
> > The Regents of the University of California. All rights
> > reserved.
> > FreeBSD is a registered trademark of The FreeBSD Foundation.
> > FreeBSD 8.2-STABLE #0: Tue Jul 26 02:49:49 UTC 2011
> > root@fm32-8-1106:/usr/obj/usr/src/sys/LOCAL i386
> > Timecounter "i8254" frequency 1193182 Hz quality 0
> > CPU: Intel(R) Core(TM)2 Duo CPU E6850 @ 3.00GHz (2995.21-MHz
> > 686-class CPU)
> >   Origin = "GenuineIntel" Id = 0x6fb Family = 6 Model = f Stepping =
> >   11
> >   
> > Features=0xbfebfbff
> >   
> > Features2=0xe3fd
> >   AMD Features=0x2010
> >   AMD Features2=0x1
> >   TSC: P-state invariant
> > real memory = 4294967296 (4096 MB)
> > avail memory = 3141234688 (2995 MB)
> > ACPI APIC Table: 
> > FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> > FreeBSD/SMP: 1 package(s) x 2 core(s)
> >  cpu0 (BSP): APIC ID: 0
> >  cpu1 (AP): APIC ID: 1
> > ioapic0  irqs 0-23 on motherboard
> > kbd1 at kbdmux0
> > cryptosoft0:  on motherboard
> > acpi0:  on motherboard
> > acpi0: [ITHREAD]
> > acpi0: Power Button (fixed)
> > acpi0: reservation of 0, a (3) failed
> > acpi0: reservation of 10, bff0 (3) failed
> > Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> > acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> > cpu0:  on acpi0
> > ACPI Warning: Incorrect checksum in table [OEMB] - 0xBE, should be
> > 0xB1 (20101013/tbutils-354)
> > cpu1:  on acpi0
> > pcib0:  port 0xcf8-0xcff on acpi0
> > pci0:  on pcib0
> > pcib1:  irq 16 at device 1.0 on pci0
> > pci1:  on pcib1
> > mpt0:  port 0x7800-0x78ff mem
> > 0xfd4fc000-0xfd4f,0xfd4e-0xfd4e irq 16 at device 0.0 on
> > pci1
> > mpt0: [ITHREAD]
> > mpt0: MPI Version=1.5.18.0

Re: Low nfs write throughput

2011-11-18 Thread John Baldwin
On Thursday, November 17, 2011 10:10:27 pm Daryl Sayers wrote:
> 
> Can anyone suggest why I am getting poor write performance from my nfs 
setup.
> I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus mother 
boards,
> 4G mem and Dual core 3g processor using 147G 15k Seagate SAS drives with
> onboard Gb network cards connected to an idle network. The results below 
show
> that I get nearly 100Mb/s with a dd over rsh but only 15Mbs using nfs. It
> improves if I use async but a smbfs mount still beats it. I am using the 
same
> file, source and destinations for all tests. I have tried alternate Network
> cards with no resulting benefit.
> 
> oguido# dd if=/u0/tmp/D2 | rsh castor dd of=/dsk/ufs/D2
> 1950511+1 records in
> 1950511+1 records out
> 998661755 bytes transferred in 10.402483 secs (96002246 bytes/sec)
> 1950477+74 records in
> 1950511+1 records out
> 998661755 bytes transferred in 10.115458 secs (98726301 bytes/sec) 
(98Mb/s)
> 
> 
> oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp gemini:/dsk/ufs /mnt
> oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> 7619+1 records in
> 7619+1 records out
> 998661755 bytes transferred in 62.570260 secs (15960646 bytes/sec) 
(15Mb/s)
> 
> 
> oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp,async gemini:/dsk/ufs 
/mnt
> oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> 7619+1 records in
> 7619+1 records out
> 998661755 bytes transferred in 50.697024 secs (19698627 bytes/sec) 
(19Mb/s)
> 
> 
> oguido# mount -t smbfs //gemini/ufs /mnt
> oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> 7619+1 records in
> 7619+1 records out
> 998661755 bytes transferred in 29.787616 secs (33526072 bytes/sec) 
(33Mb/s)
> 
> Looking at a systat -v on the destination I see that the nfs test does not
> exceed 16KB/t with 100% busy where the other tests reach up to 128KB/t.
> For the record I get reads of 22Mb/s without and 77Mb/s with async turned on
> for the nfs mount.

I don't know if it will help with your performance, but I have some patches
to allow the NFS server to cluster writes.  You can try 
www.freebsd.org/~jhb/patches/nfs_server_cluster.patch.  I've tested it on 8, 
but it should probably apply fine to 9.

-- 
John Baldwin


Re: Low nfs write throughput

2011-11-18 Thread Bane Ivosev
And if you use ZFS, also try this (the property is set per pool or dataset;
the name below is just a placeholder):

zfs set sync=disabled <pool/dataset>

On 11/18/11 04:10, Daryl Sayers wrote:
> Can anyone suggest why I am getting poor write performance from my nfs setup.
> I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus mother boards,
> 4G mem and Dual core 3g processor using 147G 15k Seagate SAS drives with
> onboard Gb network cards connected to an idle network. The results below show
> that I get nearly 100Mb/s with a dd over rsh but only 15Mbs using nfs. It
> improves if I use async but a smbfs mount still beats it. I am using the same
> file, source and destinations for all tests. I have tried alternate Network
> cards with no resulting benefit.
> 
> oguido# dd if=/u0/tmp/D2 | rsh castor dd of=/dsk/ufs/D2
> 1950511+1 records in
> 1950511+1 records out
> 998661755 bytes transferred in 10.402483 secs (96002246 bytes/sec)
> 1950477+74 records in
> 1950511+1 records out
> 998661755 bytes transferred in 10.115458 secs (98726301 bytes/sec) 
> (98Mb/s)
> 
> 
> oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp gemini:/dsk/ufs /mnt
> oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> 7619+1 records in
> 7619+1 records out
> 998661755 bytes transferred in 62.570260 secs (15960646 bytes/sec) 
> (15Mb/s)
> 
> 
> oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp,async gemini:/dsk/ufs /mnt
> oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> 7619+1 records in
> 7619+1 records out
> 998661755 bytes transferred in 50.697024 secs (19698627 bytes/sec) 
> (19Mb/s)
> 
> 
> oguido# mount -t smbfs //gemini/ufs /mnt
> oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> 7619+1 records in
> 7619+1 records out
> 998661755 bytes transferred in 29.787616 secs (33526072 bytes/sec) 
> (33Mb/s)
> 
> Looking at a systat -v on the destination I see that the nfs test does not
> exceed 16KB/t with 100% busy where the other tests reach up to 128KB/t.
> For the record I get reads of 22Mb/s without and 77Mb/s with async turned on
> for the nfs mount.
> 
> 
> A copy of dmesg:
> 
> 
> Copyright (c) 1992-2011 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.2-STABLE #0: Tue Jul 26 02:49:49 UTC 2011
> root@fm32-8-1106:/usr/obj/usr/src/sys/LOCAL i386
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Core(TM)2 Duo CPU E6850  @ 3.00GHz (2995.21-MHz 686-class 
> CPU)
>   Origin = "GenuineIntel"  Id = 0x6fb  Family = 6  Model = f  Stepping = 11
>   
> Features=0xbfebfbff
>   
> Features2=0xe3fd
>   AMD Features=0x2010
>   AMD Features2=0x1
>   TSC: P-state invariant
> real memory  = 4294967296 (4096 MB)
> avail memory = 3141234688 (2995 MB)
> ACPI APIC Table: 
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> FreeBSD/SMP: 1 package(s) x 2 core(s)
>  cpu0 (BSP): APIC ID:  0
>  cpu1 (AP): APIC ID:  1
> ioapic0  irqs 0-23 on motherboard
> kbd1 at kbdmux0
> cryptosoft0:  on motherboard
> acpi0:  on motherboard
> acpi0: [ITHREAD]
> acpi0: Power Button (fixed)
> acpi0: reservation of 0, a (3) failed
> acpi0: reservation of 10, bff0 (3) failed
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> cpu0:  on acpi0
> ACPI Warning: Incorrect checksum in table [OEMB] - 0xBE, should be 0xB1 
> (20101013/tbutils-354)
> cpu1:  on acpi0
> pcib0:  port 0xcf8-0xcff on acpi0
> pci0:  on pcib0
> pcib1:  irq 16 at device 1.0 on pci0
> pci1:  on pcib1
> mpt0:  port 0x7800-0x78ff mem 
> 0xfd4fc000-0xfd4f,0xfd4e-0xfd4e irq 16 at device 0.0 on pci1
> mpt0: [ITHREAD]
> mpt0: MPI Version=1.5.18.0
> mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
> mpt0: 0 Active Volumes (2 Max)
> mpt0: 0 Hidden Drive Members (14 Max)
> uhci0:  port 0xdc00-0xdc1f irq 16 
> at device 26.0 on pci0
> uhci0: [ITHREAD]
> uhci0: LegSup = 0x2f00
> usbus0:  on uhci0
> uhci1:  port 0xe000-0xe01f irq 17 
> at device 26.1 on pci0
> uhci1: [ITHREAD]
> uhci1: LegSup = 0x2f00
> usbus1:  on uhci1
> ehci0:  mem 
> 0xfebffc00-0xfebf irq 18 at device 26.7 on pci0
> ehci0: [ITHREAD]
> usbus2: EHCI version 1.0
> usbus2:  on ehci0
> pci0:  at device 27.0 (no driver attached)
> pcib2:  irq 16 at device 28.0 on pci0
> pci5:  on pcib2
> atapci0:  port 0xac00-0xac7f mem 
> 0xfd9ffc00-0xfd9ffc7f,0xfd9f8000-0xfd9fbfff irq 16 at device 0.0 on pci5
> atapci0: [ITHREAD]
> ata2:  on atapci0
> ata2: [ITHREAD]
> ata3:  on atapci0
> ata3: [ITHREAD]
> pcib3:  irq 17 at device 28.1 on pci0
> pci4:  on pcib3
> em0:  port 0x9c00-0x9c1f mem 
> 0xfd7e-0xfd7f,0xfd7c-0xfd7d irq 17 at device 0.0 on pci4
> em0: Using an MSI interrupt
> em0: [FILTER]
> em0: Ethernet address: 00:1b:21:04:ac:11
> pcib4:  irq 19 at device 28.3 on pci0
> pci3:  on pcib4
> age0:  mem 
> 0xfd6c-0xfd6f irq 19 at device 0.0 on pci3
> age0: 1280 Tx FIF

Re: Low nfs write throughput

2011-11-18 Thread Bane Ivosev
did you try this ?

sysctl -w vfs.nfsrv.async=1
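
To keep that setting across reboots it would normally go in /etc/sysctl.conf as
the single line below; note Rick Macklem's caveat elsewhere in the thread that
async mode can lose recently written data if the server crashes or reboots:

vfs.nfsrv.async=1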

On 11/18/11 04:10, Daryl Sayers wrote:
> Can anyone suggest why I am getting poor write performance from my nfs setup.
> I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus mother boards,
> 4G mem and Dual core 3g processor using 147G 15k Seagate SAS drives with
> onboard Gb network cards connected to an idle network. The results below show
> that I get nearly 100Mb/s with a dd over rsh but only 15Mbs using nfs. It
> improves if I use async but a smbfs mount still beats it. I am using the same
> file, source and destinations for all tests. I have tried alternate Network
> cards with no resulting benefit.
> 
> oguido# dd if=/u0/tmp/D2 | rsh castor dd of=/dsk/ufs/D2
> 1950511+1 records in
> 1950511+1 records out
> 998661755 bytes transferred in 10.402483 secs (96002246 bytes/sec)
> 1950477+74 records in
> 1950511+1 records out
> 998661755 bytes transferred in 10.115458 secs (98726301 bytes/sec) 
> (98Mb/s)
> 
> 
> oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp gemini:/dsk/ufs /mnt
> oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> 7619+1 records in
> 7619+1 records out
> 998661755 bytes transferred in 62.570260 secs (15960646 bytes/sec) 
> (15Mb/s)
> 
> 
> oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp,async gemini:/dsk/ufs /mnt
> oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> 7619+1 records in
> 7619+1 records out
> 998661755 bytes transferred in 50.697024 secs (19698627 bytes/sec) 
> (19Mb/s)
> 
> 
> oguido# mount -t smbfs //gemini/ufs /mnt
> oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> 7619+1 records in
> 7619+1 records out
> 998661755 bytes transferred in 29.787616 secs (33526072 bytes/sec) 
> (33Mb/s)
> 
> Looking at a systat -v on the destination I see that the nfs test does not
> exceed 16KB/t with 100% busy where the other tests reach up to 128KB/t.
> For the record I get reads of 22Mb/s without and 77Mb/s with async turned on
> for the nfs mount.
> 
> 
> A copy of dmesg:
> 
> 
> Copyright (c) 1992-2011 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.2-STABLE #0: Tue Jul 26 02:49:49 UTC 2011
> root@fm32-8-1106:/usr/obj/usr/src/sys/LOCAL i386
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Core(TM)2 Duo CPU E6850  @ 3.00GHz (2995.21-MHz 686-class 
> CPU)
>   Origin = "GenuineIntel"  Id = 0x6fb  Family = 6  Model = f  Stepping = 11
>   
> Features=0xbfebfbff
>   
> Features2=0xe3fd
>   AMD Features=0x2010
>   AMD Features2=0x1
>   TSC: P-state invariant
> real memory  = 4294967296 (4096 MB)
> avail memory = 3141234688 (2995 MB)
> ACPI APIC Table: 
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> FreeBSD/SMP: 1 package(s) x 2 core(s)
>  cpu0 (BSP): APIC ID:  0
>  cpu1 (AP): APIC ID:  1
> ioapic0  irqs 0-23 on motherboard
> kbd1 at kbdmux0
> cryptosoft0:  on motherboard
> acpi0:  on motherboard
> acpi0: [ITHREAD]
> acpi0: Power Button (fixed)
> acpi0: reservation of 0, a (3) failed
> acpi0: reservation of 10, bff0 (3) failed
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> cpu0:  on acpi0
> ACPI Warning: Incorrect checksum in table [OEMB] - 0xBE, should be 0xB1 
> (20101013/tbutils-354)
> cpu1:  on acpi0
> pcib0:  port 0xcf8-0xcff on acpi0
> pci0:  on pcib0
> pcib1:  irq 16 at device 1.0 on pci0
> pci1:  on pcib1
> mpt0:  port 0x7800-0x78ff mem 
> 0xfd4fc000-0xfd4f,0xfd4e-0xfd4e irq 16 at device 0.0 on pci1
> mpt0: [ITHREAD]
> mpt0: MPI Version=1.5.18.0
> mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
> mpt0: 0 Active Volumes (2 Max)
> mpt0: 0 Hidden Drive Members (14 Max)
> uhci0:  port 0xdc00-0xdc1f irq 16 
> at device 26.0 on pci0
> uhci0: [ITHREAD]
> uhci0: LegSup = 0x2f00
> usbus0:  on uhci0
> uhci1:  port 0xe000-0xe01f irq 17 
> at device 26.1 on pci0
> uhci1: [ITHREAD]
> uhci1: LegSup = 0x2f00
> usbus1:  on uhci1
> ehci0:  mem 
> 0xfebffc00-0xfebf irq 18 at device 26.7 on pci0
> ehci0: [ITHREAD]
> usbus2: EHCI version 1.0
> usbus2:  on ehci0
> pci0:  at device 27.0 (no driver attached)
> pcib2:  irq 16 at device 28.0 on pci0
> pci5:  on pcib2
> atapci0:  port 0xac00-0xac7f mem 
> 0xfd9ffc00-0xfd9ffc7f,0xfd9f8000-0xfd9fbfff irq 16 at device 0.0 on pci5
> atapci0: [ITHREAD]
> ata2:  on atapci0
> ata2: [ITHREAD]
> ata3:  on atapci0
> ata3: [ITHREAD]
> pcib3:  irq 17 at device 28.1 on pci0
> pci4:  on pcib3
> em0:  port 0x9c00-0x9c1f mem 
> 0xfd7e-0xfd7f,0xfd7c-0xfd7d irq 17 at device 0.0 on pci4
> em0: Using an MSI interrupt
> em0: [FILTER]
> em0: Ethernet address: 00:1b:21:04:ac:11
> pcib4:  irq 19 at device 28.3 on pci0
> pci3:  on pcib4
> age0:  mem 
> 0xfd6c-0xfd6f irq 19 at device 0.0 on pci3
> age0: 1280 Tx FIFO, 2364 

Low nfs write throughput

2011-11-17 Thread Daryl Sayers

Can anyone suggest why I am getting poor write performance from my NFS setup?
I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-Plus motherboards,
4GB of memory and a dual-core 3GHz processor, using 147GB 15k Seagate SAS drives
with onboard gigabit network cards connected to an idle network. The results
below show that I get nearly 100MB/s with a dd over rsh but only 15MB/s using
NFS. It improves if I use async, but an smbfs mount still beats it. I am using
the same file, source and destination for all tests. I have tried alternate
network cards with no resulting benefit.

oguido# dd if=/u0/tmp/D2 | rsh castor dd of=/dsk/ufs/D2
1950511+1 records in
1950511+1 records out
998661755 bytes transferred in 10.402483 secs (96002246 bytes/sec)
1950477+74 records in
1950511+1 records out
998661755 bytes transferred in 10.115458 secs (98726301 bytes/sec) (98MB/s)


oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp gemini:/dsk/ufs /mnt
oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
7619+1 records in
7619+1 records out
998661755 bytes transferred in 62.570260 secs (15960646 bytes/sec) (15MB/s)


oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp,async gemini:/dsk/ufs /mnt
oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
7619+1 records in
7619+1 records out
998661755 bytes transferred in 50.697024 secs (19698627 bytes/sec) (19MB/s)


oguido# mount -t smbfs //gemini/ufs /mnt
oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
7619+1 records in
7619+1 records out
998661755 bytes transferred in 29.787616 secs (33526072 bytes/sec) (33MB/s)

Looking at systat -v on the destination, I see that the NFS test does not
exceed 16KB/t at 100% busy, whereas the other tests reach up to 128KB/t.
For the record, I get reads of 22MB/s without and 77MB/s with async turned on
for the NFS mount.
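
One extra data point that may help separate the disk from NFS: a purely local
write of roughly the same ~1GB on the server, reusing the export path from the
mounts above, with /dev/zero standing in for the test file:

gemini# dd if=/dev/zero of=/dsk/ufs/tmp/D2.local bs=128k count=7620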


A copy of dmesg:


Copyright (c) 1992-2011 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.2-STABLE #0: Tue Jul 26 02:49:49 UTC 2011
root@fm32-8-1106:/usr/obj/usr/src/sys/LOCAL i386
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Duo CPU E6850  @ 3.00GHz (2995.21-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x6fb  Family = 6  Model = f  Stepping = 11
  
Features=0xbfebfbff
  Features2=0xe3fd
  AMD Features=0x2010
  AMD Features2=0x1
  TSC: P-state invariant
real memory  = 4294967296 (4096 MB)
avail memory = 3141234688 (2995 MB)
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0  irqs 0-23 on motherboard
kbd1 at kbdmux0
cryptosoft0:  on motherboard
acpi0:  on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a (3) failed
acpi0: reservation of 10, bff0 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0:  on acpi0
ACPI Warning: Incorrect checksum in table [OEMB] - 0xBE, should be 0xB1 
(20101013/tbutils-354)
cpu1:  on acpi0
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib1:  irq 16 at device 1.0 on pci0
pci1:  on pcib1
mpt0:  port 0x7800-0x78ff mem 
0xfd4fc000-0xfd4f,0xfd4e-0xfd4e irq 16 at device 0.0 on pci1
mpt0: [ITHREAD]
mpt0: MPI Version=1.5.18.0
mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
mpt0: 0 Active Volumes (2 Max)
mpt0: 0 Hidden Drive Members (14 Max)
uhci0:  port 0xdc00-0xdc1f irq 16 at 
device 26.0 on pci0
uhci0: [ITHREAD]
uhci0: LegSup = 0x2f00
usbus0:  on uhci0
uhci1:  port 0xe000-0xe01f irq 17 at 
device 26.1 on pci0
uhci1: [ITHREAD]
uhci1: LegSup = 0x2f00
usbus1:  on uhci1
ehci0:  mem 
0xfebffc00-0xfebf irq 18 at device 26.7 on pci0
ehci0: [ITHREAD]
usbus2: EHCI version 1.0
usbus2:  on ehci0
pci0:  at device 27.0 (no driver attached)
pcib2:  irq 16 at device 28.0 on pci0
pci5:  on pcib2
atapci0:  port 0xac00-0xac7f mem 
0xfd9ffc00-0xfd9ffc7f,0xfd9f8000-0xfd9fbfff irq 16 at device 0.0 on pci5
atapci0: [ITHREAD]
ata2:  on atapci0
ata2: [ITHREAD]
ata3:  on atapci0
ata3: [ITHREAD]
pcib3:  irq 17 at device 28.1 on pci0
pci4:  on pcib3
em0:  port 0x9c00-0x9c1f mem 
0xfd7e-0xfd7f,0xfd7c-0xfd7d irq 17 at device 0.0 on pci4
em0: Using an MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:1b:21:04:ac:11
pcib4:  irq 19 at device 28.3 on pci0
pci3:  on pcib4
age0:  mem 0xfd6c-0xfd6f 
irq 19 at device 0.0 on pci3
age0: 1280 Tx FIFO, 2364 Rx FIFO
age0: Using 1 MSI messages.
miibus0:  on age0
atphy0:  PHY 0 on miibus0
atphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, 
1000baseT-FDX-master, auto
age0: Ethernet address: 00:1a:92:d2:de:cc
age0: [FILTER]
pcib5:  irq 16 at device 28.4 on pci0
pci2:  on pcib5
atapci1:  port 
0x8c00-0x8c07,0x8880-0x8883,0x8800-0x8807,0x8480-0x8483,0x84