On Fri, 30 May 2008 09:59:25 -0500
Troy Benjegerdes <[EMAIL PROTECTED]> wrote:

> So a bad network card (or maybe pci-X slot) is causing corruption
> that still has a correct TCP checksum?
> 

Obviously! And one port produces more errors than the other: on one
port I saw about one error per ~8 GB transferred, on the other many
more. The really bad news is that the error was completely silent... I
have 15 TB of possibly corrupted data that I'll have to recreate
entirely :(
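
To illustrate how corrupted data can still carry a matching checksum, here is a minimal Python sketch. It is not a diagnosis of this card. It only shows that a 16-bit ones' complement sum of the kind TCP uses (RFC 1071 style) cannot see some changes, such as two swapped 16-bit words, while an end-to-end hash like SHA-256 does. The payload and the swapped-word corruption are made-up examples, and the function ignores the TCP pseudo-header.

```python
import hashlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        # add each big-endian 16-bit word
        total += (data[i] << 8) | data[i + 1]
    # fold any carries back into the low 16 bits
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"ABCDEFGH"
# Hypothetical corruption: the first two 16-bit words are swapped.
corrupted = b"CDABEFGH"

same_tcp_sum = internet_checksum(original) == internet_checksum(corrupted)
same_sha256 = (
    hashlib.sha256(original).digest() == hashlib.sha256(corrupted).digest()
)

print(same_tcp_sum, same_sha256)  # prints: True False
```

Comparing a strong hash of each file on the sender and the receiver is one way to find out which parts of the transferred data actually need to be recreated.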

BTW I've plugged the card into different PCI slots and it behaved the
same, so I'm pretty sure it's the card.

> Can you run tcpdump on both server and client and save a trace?

Well, I'll have to put the faulty card back into another machine. It
may be interesting to check what happens with ssh, nfs, etc.

> You could  also try turning off any TCP or checksum offload in the 
> network card.

Yes, the Intel Pro1000 has a TOE; it may be faulty. Unfortunately, I
don't know how to disable it.
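
One possible approach, assuming a Linux host with the ethtool utility installed: checksum and segmentation offloads are usually toggled with `ethtool -K`. The Python sketch below only builds and prints the command rather than running it. The interface name `eth0` and the helper name `offload_off_command` are assumptions for illustration, and which offload features this particular card and driver expose is not known from this thread.

```python
def offload_off_command(iface):
    """Build an ethtool command that turns off RX/TX checksum offload and TSO.

    The command is returned as an argument list and is not executed here.
    """
    return ["ethtool", "-K", iface, "rx", "off", "tx", "off", "tso", "off"]


# "eth0" is a placeholder; substitute the interface of the suspect port.
cmd = offload_off_command("eth0")
print(" ".join(cmd))
```

The printed command needs root privileges to run, and `ethtool -k eth0` (lowercase k) lists the current offload settings so the change can be checked afterwards.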


-- 
----------------------------------------
Emmanuel Florac     |   Intellique
----------------------------------------


_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
