Really? Are you sure? I just set up a new 1.6.5.1 filesystem this week:

[EMAIL PROTECTED] ~]# cat /proc/fs/lustre/llite/nobackup-0000010037e27c00/checksum_pages
0

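For anyone who wants to run the same check across every mounted Lustre client filesystem at once, here is a minimal sketch. It assumes the layout shown in the command above (one directory per mount under /proc/fs/lustre/llite, each holding a checksum_pages file) and assumes that "0" means checksums are off and anything else means on. The function name read_checksum_flags is made up for illustration and is not part of any Lustre tool.

```python
from pathlib import Path


def read_checksum_flags(llite_root="/proc/fs/lustre/llite"):
    """Return {mount_instance_name: bool} for each checksum_pages file found.

    Assumption: the file holds "0" when client checksums are disabled and a
    non-zero value when they are enabled.
    """
    flags = {}
    # One subdirectory per mounted filesystem instance, e.g. nobackup-<id>.
    for entry in sorted(Path(llite_root).glob("*/checksum_pages")):
        value = entry.read_text().strip()
        flags[entry.parent.name] = value != "0"
    return flags


if __name__ == "__main__":
    # Prints nothing on a machine with no Lustre client mounts.
    for name, enabled in read_checksum_flags().items():
        print(f"{name}: checksums {'on' if enabled else 'off'}")
```

Running it on a client node prints one line per mount, which makes it easy to confirm the setting before and after a benchmark run.
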
I am curious to test whether they were on. My MPI_File_write() of a large
file performed worse than I expected, but it looked like the OSTs were
CPU bound (two x4500s).

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985

On Aug 21, 2008, at 2:59 PM, Andreas Dilger wrote:

> On Aug 21, 2008 10:55 -0400, Brock Palen wrote:
>> On Aug 21, 2008, at 10:22 AM, Troy Benjegerdes wrote:
>>> This is a big nasty issue, particularly for HPC applications where
>>> performance is a big issue.
>>>
>>> How does one even begin to benchmark the performance overhead of a
>>> parallel filesystem with checksumming? I am having nightmares over
>>> the ways vendors will try to play games with performance numbers.
>>
>> True
>
> Actually, Lustre 1.6.5 does checksumming by default, and that is how
> we do our benchmarking. Some customers will turn it off because the
> overhead hurts them. New customers may not even notice it...
> Also, for many workloads the data integrity is much more important
> than the speed.
>
>>> My suspicion is that whenever a parallel filesystem with
>>> checksumming is available and works, that all the end-users will
>>> just turn it off anyway because the applications will run twice as
>>> fast without it, regardless of what the benchmarks say.. leaving us
>>> back at the same problem.
>>
>> I don't think this will be a problem. On current systems it may be
>> the case of the checksummed filesystem becoming cpu bound. I think
>> the OST's will be bailed out by cpu speeds going up faster than disk
>> speeds. You just need to limit the number of OST's/OSS.
>
> I agree that CPU speeds will almost certainly cover this in the
> future.
>
>> Where I could see it being a problem is on the client side. That
>> assumes that writes and reads are competing with the application for
>> cycles. So far on our clusters I see applications do either compute
>> or IO on a thread/rank. Not both, freeing up allocated cpus for IO.
>
> Yes, that is our experience also.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
