Re: Weird NFS problems
Jon Dama wrote:
> Try switching to TCP NFS.
>
> A 100Mbit interface cannot keep up with a 1Gbit interface in a bridge
> configuration, so in the long run, at full bore, you'd expect to drop
> nine out of every ten Ethernet frames. The MTU is 1500, so 1kB works
> (it fits in one frame) but 2kB doesn't: the NFS transaction is split
> across frames, one of which will almost certainly be dropped, and
> because it's UDP the loss of one frame invalidates the whole
> transaction.
>
> This is the same reason you can't use UDP with a block size greater
> than the MTU to run NFS over your DSL or some such arrangement.
> Incidentally, this has nothing to do with FreeBSD. So if using TCP
> mounts solves your problem, don't expect Solaris NFS to magically make
> the UDP case work...

The thing is that UDP NFS has been working for us for years. A big part
of our work is performance analysis, so changing our network
architecture would invalidate a large part of our data.

-- 
-- Skylar Thompson ([EMAIL PROTECTED])
-- http://www.cs.earlham.edu/~skylar/
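For what it's worth, trying a TCP mount on a single Linux client is a small, reversible experiment. A sketch of the commands (hostname and paths are placeholders, not from the thread; the `tcp` option is the one documented in the Linux nfs(5) man page):

```shell
# Hypothetical remount of one export over TCP instead of UDP,
# keeping the 4kB block size that breaks under UDP.
umount /mnt/export
mount -t nfs -o tcp,rsize=4096,wsize=4096 fileserver:/export /mnt/export
```

Because TCP retransmits lost segments itself, a dropped frame costs one segment rather than the whole RPC, which makes it a useful control when comparing against the UDP numbers.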
Re: Weird NFS problems
Yes, but surely you weren't bridging gigabit and 100Mbit before? Did you
try my suggestion about binding the IP address of the NFS server to the
100Mbit side?

-Jon

On Tue, 31 May 2005, Skylar Thompson wrote:
> Jon Dama wrote:
>> Try switching to TCP NFS. [...] So if using TCP mounts solves your
>> problem, don't expect Solaris NFS to magically make the UDP case
>> work...
>
> The thing is that UDP NFS has been working for us for years. A big
> part of our work is performance analysis, so changing our network
> architecture would invalidate a large part of our data.

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Weird NFS problems
Jon Dama wrote:
> Yes, but surely you weren't bridging gigabit and 100Mbit before? Did
> you try my suggestion about binding the IP address of the NFS server
> to the 100Mbit side?

Yeah. Unfortunately networking on the server fell apart when I did that.
Traffic was still passed and I could get through to the server on the
100Mb/s side, but not on the 1000Mb/s side. It looked like the ARP
tables weren't being forwarded properly, but I couldn't convince FreeBSD
to do proxy ARP.

After doing some more poking around, it actually looks like it might be
a misfeature in the Linux 2.4 kernel with respect to ipfilter (which is
running on the bridge). Apparently 2.4 sends UDP fragments in the
reverse order from every other UNIX-like operating system, which throws
off ipfilter's state tables.

I'm going to do some testing to see whether the difference between UDP
and TCP NFS is small enough for us to disregard. Thanks for the
suggestions!

-- 
-- Skylar Thompson ([EMAIL PROTECTED])
-- http://www.cs.earlham.edu/~skylar/
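If ipfilter's state tables are indeed the problem, one thing worth checking is whether the UDP rules on the bridge track fragments at all. A minimal sketch of what such rules might look like in ipf.conf (the interface name and rule set are assumptions, not taken from the thread; IPFilter's `keep frags` keyword asks the state engine to match later fragments against an existing state entry):

```
# Hypothetical ipf.conf fragment for the NFS port (2049):
# trailing fragments arrive without UDP port numbers, so without
# "keep frags" a stateful rule can silently drop them.
pass in quick on fxp0 proto udp from any to any port = 2049 keep state keep frags
pass in quick on fxp0 proto tcp from any to any port = 2049 flags S keep state
```

Note that fragments arriving out of order, as the Linux 2.4 behavior described above would produce, can still defeat fragment caching, so this is a diagnostic sketch rather than a guaranteed fix.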
Re: Weird NFS problems
Oh, something else to try: I checked through my notes and discovered
that I had gotten UDP to work in a similar configuration before. What I
did was bind the IP address to fxp0 instead of em0. By doing this, the
kernel seems to send the data at a pace suitable for the slow interface.

-Jon

On Fri, 27 May 2005, Don Lewis wrote:
> On 26 May, Skylar Thompson wrote:
>> I'm having some problems with NFS serving on a FreeBSD 5.4-RELEASE
>> machine. The FreeBSD machine is the NFS/NIS server for a group of
>> four Linux clusters. [...] The problem manifests itself in large UDP
>> NFS requests from Clusters 3 and 4. [...] Reducing the read and write
>> NFS buffers on the Linux clients to 1kB from the default of 4kB
>> solves the issue, but kills the transfer rate. The moment I go to
>> 2kB, the problem reappears.
>>
>> Poking through the list archives, I ran across this message
>> (http://lists.freebsd.org/pipermail/freebsd-stable/2003-May/001007.html)
>> that reveals a bug in the fxp(4) driver in 4-RELEASE that incorrectly
>> detects the capabilities of the NIC. Is this still an issue in
>> 5-RELEASE, or am I looking at a different problem?
>
> That problem was fixed quite some time ago.
>
> Which transfer direction fails? Client writing to server, client
> reading from server, or both?
>
> Do you see all the fragments in the retransmitted request?

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
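On FreeBSD 5.x, the binding Jon describes would be done in /etc/rc.conf. A minimal sketch under stated assumptions: the IP address below is a placeholder (the thread only names the subnets as 234/24 and 230/24), and the bridge sysctl syntax should be checked against bridge(4) for the release in use.

```
# /etc/rc.conf sketch (address is a placeholder, not from the thread):
# bind the server's IP to the 100Mbit fxp0 so the kernel paces NFS
# replies at fxp0's rate instead of em0's gigabit rate.
ifconfig_fxp0="inet 10.0.234.1 netmask 255.255.255.0"
ifconfig_em0="up"

# /etc/sysctl.conf: enable bridging between the two NICs
# (see bridge(4) on your release for the exact list syntax).
net.link.ether.bridge.enable=1
net.link.ether.bridge.config=em0,fxp0
```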
Re: Weird NFS problems
On 26 May, Skylar Thompson wrote:
> I'm having some problems with NFS serving on a FreeBSD 5.4-RELEASE
> machine. The FreeBSD machine is the NFS/NIS server for a group of four
> Linux clusters. [...] The problem manifests itself in large UDP NFS
> requests from Clusters 3 and 4. [...]
>
> Poking through the list archives, I ran across this message
> (http://lists.freebsd.org/pipermail/freebsd-stable/2003-May/001007.html)
> that reveals a bug in the fxp(4) driver in 4-RELEASE that incorrectly
> detects the capabilities of the NIC. Is this still an issue in
> 5-RELEASE, or am I looking at a different problem? Any ideas on how I
> can get the NFS buffers up to a reasonable level?

That problem was fixed quite some time ago.

Which transfer direction fails? Client writing to server, client reading
from server, or both?

Do you see all the fragments in the retransmitted request?
Re: Weird NFS problems
Try switching to TCP NFS.

A 100Mbit interface cannot keep up with a 1Gbit interface in a bridge
configuration, so in the long run, at full bore, you'd expect to drop
nine out of every ten Ethernet frames. The MTU is 1500, so 1kB works (it
fits in one frame) but 2kB doesn't: the NFS transaction is split across
frames, one of which will almost certainly be dropped, and because it's
UDP the loss of one frame invalidates the whole transaction.

This is the same reason you can't use UDP with a block size greater than
the MTU to run NFS over your DSL or some such arrangement. Incidentally,
this has nothing to do with FreeBSD. So if using TCP mounts solves your
problem, don't expect Solaris NFS to magically make the UDP case work...

-Jon

On Fri, 27 May 2005, Don Lewis wrote:
> On 26 May, Skylar Thompson wrote:
>> I'm having some problems with NFS serving on a FreeBSD 5.4-RELEASE
>> machine. [...] Reducing the read and write NFS buffers on the Linux
>> clients to 1kB from the default of 4kB solves the issue, but kills
>> the transfer rate. The moment I go to 2kB, the problem reappears.
>> [...]
>
> That problem was fixed quite some time ago.
>
> Which transfer direction fails? Client writing to server, client
> reading from server, or both?
>
> Do you see all the fragments in the retransmitted request?
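Jon's arithmetic can be checked with a quick back-of-the-envelope script. This is an added illustration, not something from the thread; it assumes a 1500-byte MTU, a 20-byte IP header, and an 8-byte UDP header, and counts how many IP fragments a UDP NFS request of each block size needs (NFS/RPC headers are ignored for simplicity):

```shell
#!/bin/sh
# Count IP fragments for a UDP datagram carrying an NFS block of a
# given size. Each fragment carries at most MTU - 20 = 1480 bytes of
# IP payload; the UDP header adds 8 bytes of data to be fragmented.
frags() {
    datagram=$(( $1 + 8 ))       # NFS block + UDP header
    per_frag=$(( 1500 - 20 ))    # IP payload per fragment
    echo $(( (datagram + per_frag - 1) / per_frag ))
}

frags 1024   # -> 1: fits in a single Ethernet frame, so UDP survives
frags 2048   # -> 2: two fragments; losing either kills the whole RPC
frags 4096   # -> 3
```

With 2kB blocks every request already depends on two frames arriving intact, which is exactly why the mounts hang the moment the congested side of the bridge starts dropping frames.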
Weird NFS problems
I'm having some problems with NFS serving on a FreeBSD 5.4-RELEASE
machine. The FreeBSD machine is the NFS/NIS server for a group of four
Linux clusters. The network architecture looks like this:

     234/24                                234/24
  Cluster 1 ---|                       |--- Cluster 3
               |--- em0 File server fxp0 ---|
  Cluster 2 ---|                       |--- Cluster 4
     234/24                                230/24

em0 and fxp0 are bridged, and em0 has a 234/24 IP address while fxp0 is
just in promiscuous mode. 234/24 is an 802.1q VLAN on the fxp0 side of
the server, so packets are untagged at the switch just before fxp0, and
are forwarded to em0 through the bridge.

The problem manifests itself in large UDP NFS requests from Clusters 3
and 4. The export can be mounted fine from both those clusters, and
small transfers such as with ls work fine, but the moment any serious
data transfer starts, the entire mount just hangs. Running ethereal on
the file server shows a lot of fragmented packets, and RPC
retransmissions on just a single request. Reducing the read and write
NFS buffers on the Linux clients to 1kB from the default of 4kB solves
the issue, but kills the transfer rate. The moment I go to 2kB, the
problem reappears. Clusters 1 and 2 use the default of 4kB buffers, and
have no problems communicating with em0.

Poking through the list archives, I ran across this message
(http://lists.freebsd.org/pipermail/freebsd-stable/2003-May/001007.html)
that reveals a bug in the fxp(4) driver in 4-RELEASE that incorrectly
detects the capabilities of the NIC. Is this still an issue in
5-RELEASE, or am I looking at a different problem? Any ideas on how I
can get the NFS buffers up to a reasonable level?

-- 
-- Skylar Thompson ([EMAIL PROTECTED])
-- http://www.cs.earlham.edu/~skylar/
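For reference, the 1kB workaround described above would look something like this in a Linux client's /etc/fstab (server and mount-point names are placeholders, not from the thread):

```
# Hypothetical fstab line: rsize/wsize of 1024 keeps each UDP NFS
# request inside a single 1500-byte Ethernet frame, avoiding
# fragmentation at the cost of throughput.
fileserver:/export  /mnt/export  nfs  rw,udp,rsize=1024,wsize=1024  0  0
```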