On Aug 21, 2008, at 10:22 AM, Troy Benjegerdes wrote:

> This is a big nasty issue, particularly for HPC applications where
> performance is a big issue.
>
> How does one even begin to benchmark the performance overhead of a
> parallel filesystem with checksumming? I am having nightmares over the
> ways vendors will try to play games with performance numbers.

True.

> My suspicion is that whenever a parallel filesystem with checksumming is
> available and works, that all the end-users will just turn it off anyway
> because the applications will run twice as fast without it, regardless
> of what the benchmarks say.. leaving us back at the same problem.

I don't think this will be a problem. On current systems the checksummed
filesystem may well become CPU bound, but I think the OSTs will be bailed
out by CPU speeds going up faster than disk speeds. You just need to limit
the number of OSTs per OSS.

Where I could see it being a problem is on the client side. That assumes
that writes and reads are competing with the application for cycles. So far
on our clusters I see applications do either compute or IO on a thread/rank,
not both, which frees up the allocated CPUs for IO. Then again, maybe I
should ask our users why they don't do any async IO. It probably depends.

My 2 cents.

> On Wed, Aug 20, 2008 at 07:12:10PM +0200, Bernd Schubert wrote:
>> Oh damn, I'm always afraid of silent data corruptions due to bad
>> harddisks. We also already had this issue, fortunately we found this
>> disk before taking the system into production.
>>
>> Will lustre-2.0 use the ZFS checksum feature?
>>
>> Thanks,
>> Bernd
>>
>> On Wednesday 20 August 2008 19:08:34 Peter Jones wrote:
>>> Hi there
>>>
>>> I got the following background information from Juergen Kreuels at SGI
>>>
>>> "It turned out that a bad disk ( which did NOT report itself as being
>>> bad ) killed the lustre leading to data corruption due to inode areas
>>> on that disk.
>>> It was finally decided to remake the whole FS and only during that
>>> action we finally ( after nearly 48 h ) found that bad drive.
>>>
>>> It had nothing to do with the lustre FS itself. Lustre had been the
>>> victim of a HW failure on a Raid6 lun."
>>>
>>> I hope that this helps
>>>
>>> PJones
>>>
>>> Heiko Schroeter wrote:
>>>> Hello list,
>>>>
>>>> does anyone has more background infos of what happened there ?
>>>>
>>>> Regards
>>>> Heiko
>>>>
>>>> HLRN News
>>>> ---------
>>>>
>>>> Since Mon Aug 18, 2008 12:00 HLRN-II complex Berlin is open for
>>>> users, again.
>>>>
>>>> During the maintenance it turned out that the Lustre file system
>>>> holding the users $WORK and $TMPDIR was damaged completely.
>>>> The file system had to be reconstructed from scratch. All user data
>>>> in $WORK are lost.
>>>>
>>>> We hope that this event remains an exception. SGI apologizes for
>>>> this event.
>>>>
>>>> /Bka
>>>>
>>>> ========================================================================
>>>> This is an announcement for all HLRN Users
>>>> _______________________________________________
>>>> Lustre-discuss mailing list
>>>> [email protected]
>>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>
>> --
>> Bernd Schubert
>> Q-Leap Networks GmbH
>
> --
> --------------------------------------------------------------------------
> Troy Benjegerdes 'da hozer'
> [EMAIL PROTECTED]
>
> Somone asked me why I work on this free (http://www.gnu.org/philosophy/)
> software stuff and not get a real job. Charles Shultz had the best
> answer:
>
> "Why do musicians compose symphonies and poets write poems? They do it
> because life wouldn't have any meaning for them if they didn't. That's
> why I draw cartoons. It's my life."
> -- Charles Shultz

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
