Re: Weird NFS problems

2005-05-31 Thread Skylar Thompson

Jon Dama wrote:


Try switching to TCP NFS.

A 100Mbit interface cannot keep up with a 1Gbit interface in a bridge
configuration.  Therefore, in the long run, at full bore you'd expect to
drop 9 out of every 10 ethernet frames.

The MTU is 1500, so 1K works (it fits in one frame) but 2K doesn't: your
NFS transactions are split across frames, one of which will almost
certainly be dropped, and since it's UDP the loss of one frame
invalidates the whole transaction.

This is the same reason you can't use NFS over UDP with a block size
greater than the MTU across your DSL line or some such arrangement.

Incidentally, this has nothing to do with FreeBSD.  So if using TCP
mounts solves your problem, don't expect Solaris NFS to magically make
the UDP case work...

The thing is that UDP NFS has been working for us for years. A big part
of our work is performance analysis, so changing our network
architecture would invalidate a large part of our data.


--
-- Skylar Thompson ([EMAIL PROTECTED])
-- http://www.cs.earlham.edu/~skylar/



signature.asc
Description: OpenPGP digital signature


Re: Weird NFS problems

2005-05-31 Thread Jon Dama
Yes, but surely you weren't bridging gigabit and 100Mbit before?

Did you try my suggestion about binding the IP address of the NFS server
to the 100Mbit side?

-Jon

On Tue, 31 May 2005, Skylar Thompson wrote:

 Jon Dama wrote:

 Try switching to TCP NFS.
 
 A 100Mbit interface cannot keep up with a 1Gbit interface in a bridge
 configuration.  Therefore, in the long run, at full bore you'd expect to
 drop 9 out of every 10 ethernet frames.

 The MTU is 1500, so 1K works (it fits in one frame) but 2K doesn't: your
 NFS transactions are split across frames, one of which will almost
 certainly be dropped, and since it's UDP the loss of one frame
 invalidates the whole transaction.

 This is the same reason you can't use NFS over UDP with a block size
 greater than the MTU across your DSL line or some such arrangement.

 Incidentally, this has nothing to do with FreeBSD.  So if using TCP
 mounts solves your problem, don't expect Solaris NFS to magically make
 the UDP case work...

 The thing is that UDP NFS has been working for us for years. A big part
 of our work is performance analysis, so changing our network
 architecture would invalidate a large part of our data.

 --
 -- Skylar Thompson ([EMAIL PROTECTED])
 -- http://www.cs.earlham.edu/~skylar/


___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Weird NFS problems

2005-05-31 Thread Skylar Thompson

Jon Dama wrote:


Yes, but surely you weren't bridging gigabit and 100Mbit before?

Did you try my suggestion about binding the IP address of the NFS server
to the 100Mbit side?

Yeah. Unfortunately networking on the server fell apart when I did that.
Traffic was still passed and I could get through to the server on the
100Mb/s side, but not on the 1000Mb/s side. It looked like the ARP
tables weren't being forwarded properly, and I couldn't convince FreeBSD
to do proxy ARP.

After doing some more poking around, it actually looks like it might be
a misfeature in the Linux 2.4 kernel with respect to ipfilter (which is
running on the bridge). Apparently 2.4 fragments UDP packets in the
reverse order from every other UNIX-like operating system, which throws
off ipfilter's state tables. I'm going to do some testing to see whether
the performance difference between UDP and TCP NFS is small enough for
us to disregard.


Thanks for the suggestions!

--
-- Skylar Thompson ([EMAIL PROTECTED])
-- http://www.cs.earlham.edu/~skylar/





Re: Weird NFS problems

2005-05-28 Thread Jon Dama
Oh, something else to try:

I checked through my notes and discovered that I had gotten UDP to work in
a similar configuration before.  What I did was bind the IP address to
fxp0 instead of em0.  By doing this, the kernel seems to send the data at
a pace suitable for the slow interface.
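For reference, a minimal sketch of that arrangement in /etc/rc.conf on the
FreeBSD 5.x server (the addresses here are hypothetical; the point is only
which interface owns the server's IP):

```shell
# /etc/rc.conf -- sketch only, hypothetical addresses.
# Put the server's address on the 100Mbit fxp0 so the kernel paces its
# own transmissions for the slow side; leave em0 up but unnumbered.
# (Bridging between em0 and fxp0 stays configured as before.)
ifconfig_fxp0="inet 192.168.234.1 netmask 255.255.255.0"
ifconfig_em0="up"
```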



-Jon

On Fri, 27 May 2005, Don Lewis wrote:

 On 26 May, Skylar Thompson wrote:
  I'm having some problems with NFS serving on a FreeBSD 5.4-RELEASE
  machine. The FreeBSD machine is the NFS/NIS server for a group of four
  Linux clusters. The network architecture looks like this:
 
    234/24                        234/24
  Cluster 1 ---|              |--- Cluster 3
               | ------------ |
            em0| File server  |fxp0
               | ------------ |
  Cluster 2 ---|              |--- Cluster 4
    234/24                        230/24
 
 
  em0 and fxp0 are bridged, and em0 has a 234/24 IP address while fxp0 is
  just in promiscuous mode. 234/24 is an 802.1q VLAN on the fxp0 side of
  the server, so packets are untagged at the switch just before fxp0, and
  are forwarded to em0 through the bridge.
 
  The problem manifests itself in large UDP NFS requests from Clusters 3
  and 4. The export can be mounted fine from both those clusters, and
  small transfers such as with ls work fine, but the moment any serious
  data transfer starts, the entire mount just hangs. Running ethereal on
  the file server shows a lot of fragmented packets, and RPC
  retransmissions on just a single request. Reducing the read and write
  NFS buffers on the Linux clients to 1kB from the default of 4kB solves
  the issue, but kills the transfer rate. The moment I go to 2kB, the
  problem reappears. Clusters 1 and 2 use the default of 4kB buffers, and
  have no problems communicating to em0.
 
  Poking through the list archives, I ran across this message
  (http://lists.freebsd.org/pipermail/freebsd-stable/2003-May/001007.html)
  that reveals a bug in the fxp(4) driver in 4-RELEASE that incorrectly
  detects the capabilities of the NIC. Is this still an issue in
  5-RELEASE, or am I looking at a different problem? Any ideas on how I
  can get the NFS buffers up to a reasonable level?

 That problem was fixed quite some time ago.

 Which transfer direction fails?
   Client writing to server
   Client reading from server
   Both?

 Do you see all the fragments in the retransmitted request?

 ___
 freebsd-stable@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-stable
 To unsubscribe, send any mail to [EMAIL PROTECTED]



Re: Weird NFS problems

2005-05-27 Thread Don Lewis
On 26 May, Skylar Thompson wrote:
 I'm having some problems with NFS serving on a FreeBSD 5.4-RELEASE 
 machine. The FreeBSD machine is the NFS/NIS server for a group of four 
 Linux clusters. The network architecture looks like this:
 
   234/24                        234/24
 Cluster 1 ---|              |--- Cluster 3
              | ------------ |
           em0| File server  |fxp0
              | ------------ |
 Cluster 2 ---|              |--- Cluster 4
   234/24                        230/24
 
 
 em0 and fxp0 are bridged, and em0 has a 234/24 IP address while fxp0 is 
 just in promiscuous mode. 234/24 is an 802.1q VLAN on the fxp0 side of 
 the server, so packets are untagged at the switch just before fxp0, and 
 are forwarded to em0 through the bridge.
 
 The problem manifests itself in large UDP NFS requests from Clusters 3 
 and 4. The export can be mounted fine from both those clusters, and 
 small transfers such as with ls work fine, but the moment any serious 
 data transfer starts, the entire mount just hangs. Running ethereal on 
 the file server shows a lot of fragmented packets, and RPC
 retransmissions on just a single request. Reducing the read and write 
 NFS buffers on the Linux clients to 1kB from the default of 4kB solves 
 the issue, but kills the transfer rate. The moment I go to 2kB, the 
 problem reappears. Clusters 1 and 2 use the default of 4kB buffers, and
 have no problems communicating to em0.
 
 Poking through the list archives, I ran across this message 
 (http://lists.freebsd.org/pipermail/freebsd-stable/2003-May/001007.html) 
 that reveals a bug in the fxp(4) driver in 4-RELEASE that incorrectly 
 detects the capabilities of the NIC. Is this still an issue in 
 5-RELEASE, or am I looking at a different problem? Any ideas on how I 
 can get the NFS buffers up to a reasonable level?

That problem was fixed quite some time ago.

Which transfer direction fails?
Client writing to server
Client reading from server
Both?

Do you see all the fragments in the retransmitted request?



Re: Weird NFS problems

2005-05-27 Thread Jon Dama
Try switching to TCP NFS.

A 100Mbit interface cannot keep up with a 1Gbit interface in a bridge
configuration.  Therefore, in the long run, at full bore you'd expect to
drop 9 out of every 10 ethernet frames.

The MTU is 1500, so 1K works (it fits in one frame) but 2K doesn't: your
NFS transactions are split across frames, one of which will almost
certainly be dropped, and since it's UDP the loss of one frame
invalidates the whole transaction.
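The arithmetic behind that claim can be sketched as follows (a hypothetical
back-of-the-envelope calculation, not from the thread; it assumes independent
per-frame drops and counts only the 20-byte IP header as overhead):

```python
import math

MTU = 1500          # bytes per ethernet frame
IP_HEADER = 20      # bytes of IP header per fragment

def fragments(datagram_bytes):
    """IP fragments needed to carry one UDP datagram of this size."""
    return math.ceil(datagram_bytes / (MTU - IP_HEADER))

def transaction_success(datagram_bytes, frame_drop_prob):
    """Probability that every fragment of one NFS-over-UDP transaction
    arrives, assuming independent per-frame drops."""
    return (1 - frame_drop_prob) ** fragments(datagram_bytes)

# 1K rides in a single frame; 2K and 4K need several, and losing any
# one fragment voids the whole RPC, so success drops geometrically.
for size in (1024, 2048, 4096):
    print(size, fragments(size), transaction_success(size, 0.9))
```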

This is the same reason you can't use NFS over UDP with a block size
greater than the MTU across your DSL line or some such arrangement.

Incidentally, this has nothing to do with FreeBSD.  So if using TCP
mounts solves your problem, don't expect Solaris NFS to magically make
the UDP case work...
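Concretely, the switch would look something like this on a Linux client
(the host and paths here are placeholders, not from the thread):

```shell
# Remount the export over TCP instead of UDP (hypothetical host/path).
umount /mnt/export
mount -t nfs -o tcp,rsize=4096,wsize=4096 fileserver:/export /mnt/export

# Equivalent /etc/fstab entry:
#   fileserver:/export  /mnt/export  nfs  tcp,rsize=4096,wsize=4096  0  0
```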

-Jon

On Fri, 27 May 2005, Don Lewis wrote:

 On 26 May, Skylar Thompson wrote:
  I'm having some problems with NFS serving on a FreeBSD 5.4-RELEASE
  machine. The FreeBSD machine is the NFS/NIS server for a group of four
  Linux clusters. The network architecture looks like this:
 
    234/24                        234/24
  Cluster 1 ---|              |--- Cluster 3
               | ------------ |
            em0| File server  |fxp0
               | ------------ |
  Cluster 2 ---|              |--- Cluster 4
    234/24                        230/24
 
 
  em0 and fxp0 are bridged, and em0 has a 234/24 IP address while fxp0 is
  just in promiscuous mode. 234/24 is an 802.1q VLAN on the fxp0 side of
  the server, so packets are untagged at the switch just before fxp0, and
  are forwarded to em0 through the bridge.
 
  The problem manifests itself in large UDP NFS requests from Clusters 3
  and 4. The export can be mounted fine from both those clusters, and
  small transfers such as with ls work fine, but the moment any serious
  data transfer starts, the entire mount just hangs. Running ethereal on
  the file server shows a lot of fragmented packets, and RPC
  retransmissions on just a single request. Reducing the read and write
  NFS buffers on the Linux clients to 1kB from the default of 4kB solves
  the issue, but kills the transfer rate. The moment I go to 2kB, the
  problem reappears. Clusters 1 and 2 use the default of 4kB buffers, and
  have no problems communicating to em0.
 
  Poking through the list archives, I ran across this message
  (http://lists.freebsd.org/pipermail/freebsd-stable/2003-May/001007.html)
  that reveals a bug in the fxp(4) driver in 4-RELEASE that incorrectly
  detects the capabilities of the NIC. Is this still an issue in
  5-RELEASE, or am I looking at a different problem? Any ideas on how I
  can get the NFS buffers up to a reasonable level?

 That problem was fixed quite some time ago.

 Which transfer direction fails?
   Client writing to server
   Client reading from server
   Both?

 Do you see all the fragments in the retransmitted request?




Weird NFS problems

2005-05-26 Thread Skylar Thompson
I'm having some problems with NFS serving on a FreeBSD 5.4-RELEASE 
machine. The FreeBSD machine is the NFS/NIS server for a group of four 
Linux clusters. The network architecture looks like this:


  234/24                        234/24
Cluster 1 ---|              |--- Cluster 3
             | ------------ |
          em0| File server  |fxp0
             | ------------ |
Cluster 2 ---|              |--- Cluster 4
  234/24                        230/24


em0 and fxp0 are bridged, and em0 has a 234/24 IP address while fxp0 is 
just in promiscuous mode. 234/24 is an 802.1q VLAN on the fxp0 side of 
the server, so packets are untagged at the switch just before fxp0, and 
are forwarded to em0 through the bridge.


The problem manifests itself in large UDP NFS requests from Clusters 3 
and 4. The export can be mounted fine from both those clusters, and 
small transfers such as with ls work fine, but the moment any serious 
data transfer starts, the entire mount just hangs. Running ethereal on 
the file server shows a lot of fragmented packets, and RPC 
retransmissions on just a single request. Reducing the read and write 
NFS buffers on the Linux clients to 1kB from the default of 4kB solves 
the issue, but kills the transfer rate. The moment I go to 2kB, the 
problem reappears. Clusters 1 and 2 use the default of 4kB buffers, and 
have no problems communicating to em0.
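The client-side workaround described above corresponds to mount options like
these on the Linux clusters (host and path are placeholders):

```shell
# 1kB buffers: each NFS request fits in a single 1500-byte frame; works.
mount -t nfs -o udp,rsize=1024,wsize=1024 fileserver:/export /mnt/export

# 2kB or the default 4kB: requests fragment across frames and the mount
# hangs under load in this setup.
# mount -t nfs -o udp,rsize=2048,wsize=2048 fileserver:/export /mnt/export
```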


Poking through the list archives, I ran across this message 
(http://lists.freebsd.org/pipermail/freebsd-stable/2003-May/001007.html) 
that reveals a bug in the fxp(4) driver in 4-RELEASE that incorrectly 
detects the capabilities of the NIC. Is this still an issue in 
5-RELEASE, or am I looking at a different problem? Any ideas on how I 
can get the NFS buffers up to a reasonable level?


--
-- Skylar Thompson ([EMAIL PROTECTED])
-- http://www.cs.earlham.edu/~skylar/


