Re: NFS data integrity failure

2004-10-16 Thread Gary Dunn
On Thu, 2004-10-14 at 07:58, Bigbrother wrote:

 
 MachineB mounts machineA:/disk and puts 1.2 GB of data from its disk to the
 machineA disk.  A CRC check performed on the copied files shows that
 everything is correct. (always!)

Then do it this way :-)

Seriously, though, to isolate NFS you need to exercise the network and
file systems using other methods. How about transferring the same files
using a) ftp and b) scp. If the problem is dropped packets or
fragmentation or stuck bits in the NIC, those methods will be equally
unsuccessful.

Does either machine ever display an error message about nfs going down
then coming back? I can't remember the exact words, something like
connection lost then restored. When this happens to me at work it is due
to the ethernet switch port one system is connected to coming up in half
duplex instead of full duplex. Once it was a bad cat5 cable. 

Are the file sizes different?
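
One rough way to answer that, and to see where a bad copy differs, is to
compare the two files block by block. This is only a sketch: the function
name is made up for illustration, and the 8192-byte block size is an
assumption chosen to match the 8K NFS read/write size mentioned in the
original post.

```python
import os


def differing_blocks(original, copy, block_size=8192):
    """Return (size_original, size_copy, indexes of blocks that differ)."""
    size_original = os.path.getsize(original)
    size_copy = os.path.getsize(copy)
    bad_blocks = []
    with open(original, "rb") as a, open(copy, "rb") as b:
        index = 0
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if not block_a and not block_b:
                break  # both files exhausted
            if block_a != block_b:
                bad_blocks.append(index)
            index += 1
    return size_original, size_copy, bad_blocks
```

If the sizes match and only a few isolated blocks differ, that hints at
corruption in transit or in memory rather than truncated writes.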
 
-- 

Gary Dunn
[EMAIL PROTECTED]
Honolulu

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


NFS data integrity failure

2004-10-15 Thread Bigbrother


Dear all,

I have noticed a very strange NFS problem between two FreeBSD machines,
both running 4.10-Release-p2.


-Short description:

NFS copy transfers from B to A where A controls the transfer (i.e. A mounts
B's disk and gets the data) always produce CRC errors and MD5 mismatches
between some of the original files and their copies (6-7 files out of 90).
NFS copy transfers from B to A where B controls the transfer (i.e. B mounts
A's disk and puts the data) make exact copies EVERY TIME!!!


NFS mounts have been tried with TCP and UDP, read/write sizes of 8K and 16K,
and both NFSv2 and NFSv3.







Long description:

MachineA mounts machineB:/disk and copies 1.2 GB of data from machineB:/disk
to its local disk (gets data): almost 90 files of 15MB each. After the
transfer, I compare the CRC of every copied file with the original CRC, and
some files produce different CRCs. If I copy a failed file again, the CRC
is correct. Of course this means that I have to manually verify every time
that the copies are 100% identical to the originals, which is a waste of time.
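
The manual verification step could be automated along these lines. This is
a minimal sketch, not the procedure actually used for the tests above: the
function names and the directory arguments are hypothetical, and MD5 is used
only because it is one of the checks already mentioned.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_dir, copy_dir):
    """List relative paths whose copy is missing or has a different MD5."""
    source_dir, copy_dir = Path(source_dir), Path(copy_dir)
    bad = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        dst = copy_dir / rel
        if not dst.is_file() or md5_of(src) != md5_of(dst):
            bad.append(str(rel))
    return bad
```

Called as find_mismatches("/mnt/machineB", "/local/copy") (both paths are
placeholders), it returns the files that need to be copied again. Note that
reading the originals through the NFS mount also goes over the network, so a
mismatch could in principle come from either read.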


MachineB mounts machineA:/disk and puts 1.2 GB of data from its disk to the
machineA disk.  A CRC check performed on the copied files shows that
everything is correct. (always!)

Other tests:

MachineB mounts machineA:/disk and gets 1.2GB of data from machineA:/disk.
Everything is correct

MachineA mounts machineB:/disk and puts data on machineB. Some files have
CRC errors!!




The damaged files are different every time.

NFS mounts are done with the same parameters every time

Different combinations of NFS mount parameters have been tried and every
time the results are the same.



MachineA:
CPU: AMD Athlon(tm) Processor (807.19-MHz 686-class CPU)
real memory  = 134135808 (130992K bytes)
Network card: Realtek 8139

MachineB:
CPU: Intel Pentium III (731.47-MHz 686-class CPU)
real memory  = 536870912 (524288K bytes)
Network card: 3Com 3c905C-TX Fast Etherlink XL 


Neither machine is under any load.

No errors reported by syslog!!!




-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

What is happening? How can I find out what is causing this?
Is it possible that the Realtek card causes such behavior?
On the other hand, why do some transfers succeed every time?
I am not in a position to buy another network card for my home
machine, so if you have any suggestions on how to resolve this problem,
let me know. Have you run into any similar situations? How did you solve them?


I have searched the net and have not found any useful information about it.

Thank you a lot in advance!!




---
Give a man fire, and he'll be warm for a day; set a man on fire, and he'll
be warm for the rest of his life 
