> On Friday 07 June 2002 12:30 pm, you (Paul Millar) wrote:
> Hi Thomas,
>
> I've been meaning to reply, but life's been rather hectic at the mo.
>
> On Wed, 5 Jun 2002, Thomas McLaughlin wrote:
> > I am getting the following error while trying to copy a file (186k) to
> > an NFS mounted filesystem
> >
> > cp: closing `/jupiter/public/nfs.ps': Input/output error
Thanks for the reply.
I have now tried turning on NFS debugging as suggested and ran tcpdump,
but neither seemed to give any information on the problem. The log entries
either just stopped without any errors, or looked as though the file had
copied fine with the status set to ok. I can see from the tcpdump output
that packets are fragmented when I use a buffer size of 4096.
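
For anyone wanting to repeat this, here is a rough sketch of switching the
client-side debug flag on for a single command and off again afterwards. The
names nfs_debug_run and NFS_DEBUG_CTL are my own, made up for illustration;
the /proc path is the client-side one from the quoted message below, and
writing to it needs root.

```shell
# Run one command with NFS client debugging on, then switch it off again.
# nfs_debug_run and NFS_DEBUG_CTL are illustrative names, not standard tools.
nfs_debug_run() {
    # Default to the client-side control file; override for dry runs.
    ctl=${NFS_DEBUG_CTL:-/proc/sys/sunrpc/nfs_debug}
    # Writing to the real /proc file needs root.
    echo 1 > "$ctl" || return 1
    "$@"
    status=$?
    # Always turn debugging off again so syslog does not fill up.
    echo 0 > "$ctl"
    # Pass back the exit status of the command that was run.
    return "$status"
}
```

Something like "nfs_debug_run cp nfs.ps /jupiter/public/" should then leave
only the messages for that one copy in syslog.
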
There is nothing obviously different in the log files or tcpdump output
when the copy does work (using a buffer size of 1024). In fact, with the
higher buffer size I can sometimes copy my test file once just after
restarting NFS, and then I get the error on every later attempt.
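
The repeat test can be scripted roughly as follows. This is only a sketch:
copy_test is a made-up helper name, and it relies on cp returning a non-zero
exit status when the copy fails.

```shell
# copy_test SRC DEST_DIR [RUNS]
# Copies SRC into DEST_DIR RUNS times and prints how many copies failed.
# copy_test is an illustrative helper, not an existing tool.
copy_test() {
    src=$1
    dest_dir=$2
    runs=${3:-5}
    fails=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Count every copy that cp reports as failed.
        if ! cp "$src" "$dest_dir/copytest.$i" 2>/dev/null; then
            fails=$((fails + 1))
        fi
        i=$((i + 1))
    done
    echo "$fails"
}
```

Running "copy_test nfs.ps /jupiter/public 5" once with each buffer size
gives a failure count that is easier to compare than one-off copies.
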
It has been suggested that it could be a problem with one of the network
cards or with packets, but in that case should I not have seen problems with
other applications such as ftp (which is ok)? Does ftp use a smaller packet
size?
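
As a back-of-envelope check on the packet sizes involved, the sketch below
estimates how many IP fragments one NFS write over UDP would need. It assumes
an Ethernet MTU of 1500, a 20-byte IP header with no options and an 8-byte
UDP header. The 150 bytes allowed for RPC/NFS headers is a guess for
illustration, not a measured value, and frag_count is a made-up name.

```shell
# frag_count UDP_PAYLOAD_BYTES [MTU]
# Prints how many IP fragments one UDP datagram needs.
# Assumes a 20-byte IP header (no options) and an 8-byte UDP header.
frag_count() {
    payload=$1
    mtu=${2:-1500}
    # Fragment data must be a multiple of 8 bytes (except the last one).
    per_frag=$(( (mtu - 20) / 8 * 8 ))
    # The UDP header travels inside the IP payload.
    total=$(( payload + 8 ))
    # Integer ceiling of total / per_frag.
    echo $(( (total + per_frag - 1) / per_frag ))
}

# RPC_OVERHEAD is an assumed figure for RPC/NFS headers, not a measured one.
RPC_OVERHEAD=150
for wsize in 1024 4096 8192; do
    echo "wsize=$wsize: $(frag_count $((wsize + RPC_OVERHEAD))) fragment(s)"
done
```

Under those assumptions a 1024-byte write fits in a single packet, a
4096-byte write needs 3 fragments and an 8192-byte write needs 6.
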
Both machines are running Mandrake 8.2 with no firewall software running.
My machine is also running vmware and a jabber daemon, but these do not
appear to be causing problems.
I will look at upgrading the kernel and check for updates on the Mandrake
web site.
> Input/output error message is a catch-all, you have to look at
> kernel-level output for what's actually happening.
>
> You can turn on NFS debugging with some proc magic:
> echo 1 > /proc/sys/sunrpc/nfs_debug
> for client side, and:
> echo 1 > /proc/sys/sunrpc/nfsd_debug
> for server side. *Make sure you turn it off again*, as it will fill
> syslog very quickly otherwise. Do "echo 0 > /proc/..." to turn off.
>
> Last time I looked, most of the information is unintelligible gibberish.
> But it might say something useful.
>
> > I can copy smaller files ok and can copy from the NFS mounted
> > filesystems without any problems. I cannot find any errors in log
> > files on any of the machines.
>
> The fact that it works for smaller buffer sizes is very suggestive. I
> agree with Martin that this sounds like it's a problem with network
> packets on one machine. The "rsize=8192,wsize=8192" options will
> definitely cause IP-fragmentation (for files > ~1.5 kiB), so the remote
> machine has to reassemble these packets. If you've got in any way a
> non-standard network-stack (i.e. firewall, rewriting rules, Masq., ...), or
> the machine is not running a Linus kernel (e.g. RedHat), try upgrading to
> the latest 2.4-series kernel (and choose the networking options
> carefully).
>
> Another possibility is packets being dropped. NFS works over/with sunrpc,
> which uses either udp or tcp. AFAIK, Linux doesn't support NFS over TCP,
> so you'll be using UDP packets, which are unreliable. If your network
> is dropping these (for some reason), and the kernel is unable to recover
> (it _should_ be able to recover), then you might get this i/o error. It
> may be that your kernel(s) is/are less likely to recover from frag. IP
> packets, but this would be a pretty major bug.
>
> Try running tcpdump on the local and remote sites. Limit your search by
> only looking for UDP traffic from the remote site ("tcpdump -i eth0 host
> <remote host> and host <local host> and udp" should do it), then trigger
> the fault. Might be interesting to see what's happening at the IP-level.
>
> BTW, make sure you've got an up-to-the-minute version of tcpdump.
> There's a buffer overflow vulnerability announced a few days ago, and
> RedHat/SuSE/... have only just started to release updated versions.
>
> Cheers,
>
> Paul.
>
> -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
> Particle Physics (Theory & Experimental) Groups   Paul Millar
> Department of Physics and Astronomy               [EMAIL PROTECTED]
> University of Glasgow                             [EMAIL PROTECTED]
> Glasgow, G12 8QQ, Scotland                        http://www.astro.gla.ac.uk/users/paulm
> +44 (0)141 330 4717                               A54C A9FC 6A77 1664 2E4E 90E3 FFD2 704B BF0F 03E9
> -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
>
>
> --------------------------------------------------------------------
> http://www.lug.org.uk http://www.linuxportal.co.uk
> http://www.linuxjob.co.uk http://www.linuxshop.co.uk
> --------------------------------------------------------------------