On Fri, 30 May 2008 09:59:25 -0500, Troy Benjegerdes <[EMAIL PROTECTED]> wrote:

> So a bad network card (or maybe pci-X slot) is causing corruption
> that still has a correct TCP checksum?

Obviously! And there are more errors with one port than with the other. On one port I had one error every ~8 GB transferred; on the other, many more. The really bad news is that it was a completely silent error... I have 15TB of possibly corrupted data that I'll have to recreate entirely :(

BTW, I've plugged the card into different PCI slots and it behaved the same, so I'm pretty sure it's the card.

> Can you run tcpdump on both server and client and save a trace?

Well, I'll have to put the faulty card back in another machine. It may be interesting to check what happens with ssh, nfs, etc.

> You could also try turning off any TCP or checksum offload in the
> network card.

Yes, the Intel Pro1000 has a TOE; it may be faulty. Unfortunately, I don't know how to disable it.

-- 
----------------------------------------
Emmanuel Florac     |   Intellique
----------------------------------------
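
Since the corruption described above passed the TCP checksum silently, one way to find out which of the transferred files are actually damaged is to compare an end-to-end digest of each file on the sending and receiving machines. The following is only a minimal sketch under assumptions: the function names `file_sha256` and `copies_match` are hypothetical, SHA-256 is an arbitrary choice of digest, and both copies are assumed to be readable from wherever the script runs (otherwise, compute the digest on each host and compare the hex strings by hand).

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in `chunk_size` pieces (1 MiB by default) so that
    very large files do not have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source_path, copy_path):
    """Hypothetical helper: True if both files have the same SHA-256 digest."""
    return file_sha256(source_path) == file_sha256(copy_path)
```

A mismatch only says that the two copies differ, not which one is the good one, so the digest should be taken from a copy that never crossed the suspect card.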
