Re: [scottish] NFS problem

Paul Millar Fri, 07 Jun 2002 04:11:47 -0700

Hi Thomas,

I've been meaning to reply, but life's been rather hectic at the mo.

On Wed, 5 Jun 2002, Thomas McLaughlin wrote:
> I am getting the following error while trying to copy a file (186k) to 
> an NFS mounted filesystem

> cp: closing `/jupiter/public/nfs.ps': Input/output error

The "Input/output error" message is a catch-all; you have to look at
kernel-level output to see what's actually happening.

You can turn on NFS debugging with some proc magic:
  echo 1 > /proc/sys/sunrpc/nfs_debug
for client side, and:
  echo 1 > /proc/sys/sunrpc/nfsd_debug
for server side.  *Make sure you turn it off again*, as it will fill 
syslog very quickly otherwise. Do "echo 0 > /proc/..." to turn off.
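
Since forgetting the "echo 0" is the easy mistake here, a small wrapper
can do the on/off for you. This is only a sketch: with_nfs_debug is a
made-up helper name, not a standard tool, and it assumes you are root and
that the /proc files above exist on your kernel.

```shell
#!/bin/sh
# Hypothetical helper (not a standard tool): switch a debug flag on,
# run one command, then always switch the flag off again.
#   $1   = flag file, e.g. /proc/sys/sunrpc/nfs_debug
#   rest = the command that triggers the fault
with_nfs_debug() {
    _flag=$1
    shift
    echo 1 > "$_flag" || return 1   # fails here if you are not root
    "$@"                            # run the command being debugged
    _status=$?
    echo 0 > "$_flag"               # turn debugging off whatever happened
    return $_status
}

# Example use, as root on the client (the source file name is assumed):
#   with_nfs_debug /proc/sys/sunrpc/nfs_debug cp nfs.ps /jupiter/public/
```

The wrapper returns the exit status of the command it ran, so a failing
cp still shows up as a failure.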

Last time I looked, most of the output was unintelligible gibberish,
but it might say something useful.

> I can copy smaller files ok and can copy from the NFS mounted
> filesystems without any problems. I cannot find any errors in log
> files in any of the machines.

The fact that it works for smaller files is very suggestive.  I
agree with Martin that this sounds like a problem with network
packets on one machine.  The "rsize=8192,wsize=8192" options will
definitely cause IP fragmentation (for files > ~1.5 kiB), so the remote
machine has to reassemble these packets.  If your network stack is in any
way non-standard (i.e. firewall, rewriting rules, Masq., ...), or
the machine is not running a stock Linus kernel (e.g. a RedHat kernel),
try upgrading to the latest 2.4-series kernel (and choose the networking
options carefully).
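
To put a rough number on the fragmentation, here is the arithmetic as a
shell function. It is a back-of-envelope sketch under stated assumptions:
a 1500-byte Ethernet MTU, a 20-byte IP header with no options, an 8-byte
UDP header, and the RPC/NFS header ignored (so the real datagram is a
little bigger). fragments_for is just an illustrative name.

```shell
#!/bin/sh
# Estimate how many IP fragments one UDP datagram needs.
#   $1 = bytes of data in the datagram (e.g. the wsize value)
#   $2 = link MTU in bytes (assumed 1500 if not given)
fragments_for() {
    _data=$1
    _mtu=${2:-1500}
    # Each fragment carries (MTU - 20-byte IP header) bytes, rounded
    # down to a multiple of 8 as IP fragment offsets require.
    _per_frag=$(( (_mtu - 20) / 8 * 8 ))
    # The 8-byte UDP header travels in the first fragment.
    _total=$(( _data + 8 ))
    # Round up to a whole number of fragments.
    echo $(( (_total + _per_frag - 1) / _per_frag ))
}

fragments_for 8192    # prints 6
fragments_for 1024    # prints 1
```

So under these assumptions each 8192-byte write goes out as six
fragments, and losing any one of them loses the whole request.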

Another possibility is packets being dropped.  NFS works over sunrpc,
which uses either UDP or TCP.  AFAIK, Linux doesn't support NFS over TCP,
so you'll be using UDP packets, which are unreliable.  If your network
is dropping these (for some reason), and the kernel is unable to recover
(it _should_ be able to recover), then you might get this I/O error.  It
may be that your kernel(s) is/are less likely to recover from fragmented
IP packets, but this would be a pretty major bug.
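
One cheap way to see whether fragments are going missing is to compare
the kernel's IP reassembly counters before and after triggering the
fault. The sketch below reads them from /proc/net/snmp; ip_counter is a
made-up helper name, and it assumes the usual layout of that file (an
"Ip:" line of column names followed by an "Ip:" line of values).

```shell
#!/bin/sh
# Print one counter from the "Ip:" section of /proc/net/snmp.
#   $1 = counter name, e.g. ReasmFails or ReasmReqds
#   $2 = file to read (defaults to /proc/net/snmp)
ip_counter() {
    awk -v name="$1" '
        # First "Ip:" line: column names. Remember where ours is.
        $1 == "Ip:" && !seen {
            for (i = 2; i <= NF; i++) if ($i == name) col = i
            seen = 1
            next
        }
        # Second "Ip:" line: values. Print the matching column.
        $1 == "Ip:" && col { print $col; exit }
    ' "${2:-/proc/net/snmp}"
}

# Example use on the receiving machine:
#   ip_counter ReasmFails     # note the value
#   ... trigger the failing cp ...
#   ip_counter ReasmFails     # has it gone up?
```

If ReasmFails climbs each time the copy fails, that points at fragments
being lost or mangled somewhere before reassembly.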

Try running tcpdump on the local and remote sites.  Limit your search by
only looking for UDP traffic from the remote site ("tcpdump -i eth0 host
<remote host> and host <local host> and udp"  should do it), then trigger
the fault.  Might be interesting to see what's happening at the IP-level.
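
Spelled out, the capture looks something like the sketch below. The
function name and the host names in the example are placeholders; the
filter itself is the one quoted above.

```shell
#!/bin/sh
# Build the tcpdump filter for UDP traffic between two hosts.
#   $1 = remote host, $2 = local host
nfs_capture_filter() {
    printf 'host %s and host %s and udp\n' "$1" "$2"
}

# Example use, as root (host names are placeholders):
#   tcpdump -n -i eth0 "$(nfs_capture_filter remote.example local.example)"
```

Filtering on plain "udp" rather than on a port number is deliberate:
only the first fragment of a datagram carries the UDP header, so a port
filter would hide the later fragments, which are exactly what you want
to see here.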

BTW, make sure you've got an up-to-the-minute version of tcpdump.
A buffer overflow vulnerability was announced a few days ago, and
RedHat/SuSE/... have only just started to release updated versions.

Cheers,

Paul.

-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
Particle Physics (Theory & Experimental) Groups                   Paul Millar 
Department of Physics and Astronomy                     [EMAIL PROTECTED]
University of Glasgow                                 [EMAIL PROTECTED]
Glasgow, G12 8QQ, Scotland             http://www.astro.gla.ac.uk/users/paulm 
+44 (0)141 330 4717        A54C A9FC 6A77 1664 2E4E  90E3 FFD2 704B BF0F 03E9
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 

--------------------------------------------------------------------
http://www.lug.org.uk                   http://www.linuxportal.co.uk
http://www.linuxjob.co.uk               http://www.linuxshop.co.uk
--------------------------------------------------------------------