hi steve,

> The nsdChksum settings for non-GNR/ESS based systems are not officially
> supported. It will perform checksum on data transfer over the network
> only and can be used to help debug data corruption when network is a
> suspect.

i'll take not officially supported over silent bitrot any day.

> Did any of those "Encountered XYZ checksum errors on network I/O to NSD
> Client disk" warning messages result in a disk being changed to "down"
> state due to IO error?

no.

> If no disk IO error was reported in GPFS log,
> that means data was retransmitted successfully on retry.

we suspected as much. as sven already asked, mmfsck now reports a clean filesystem. i have an ibdump of the 2 involved nsds taken during the reported checksum errors; i'll have a closer look to see if i can spot these retries.

> As sven said, only GNR/ESS provides the full end to end data integrity.

so with the silent network error, we have a high probability that the data is corrupted. we are now looking for a test to find out which adapters are affected. we hoped that nsdperf with verify=on would tell us, but it doesn't.

> Steve Y. Xiao

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
