On Thu, Aug 19, 2010 at 01:29:37PM +0100, Gregory Matthews wrote:
>Article by Jeff Layton:
>
>http://www.linux-mag.com/id/7839
>
>anyone have views on whether this sort of caching would be useful for
>the MDT? My feeling is that MDT reads are probably pretty random but
>writes might benefit...?

if you look at the tiny size of inodes in slabtop on an MDS you'll see
that all read ops for most filesystems are probably 100% cached in ram
by a decent sized MDS. ie. once you have traversed all the inodes of a
filesystem once, the MDT is likely to be an effectively write-only
medium, and the ram of the MDS is a faster iops machine than any SSD
could ever be.

you are then left with an MDT workload of entirely small writes. that
is definitely not an SSD sweet spot - many SSDs will fragment badly and
slow down horrendously, which eg. JBODs of 15k rpm SAS disks will not
do.

basically beware of cheap SSDs, possibly any SSD, and certainly any SSD
that isn't an Intel X25-E or better. the Marvell controller SSDs we
sadly have many of now, I would not inflict upon any MDT.

also, having experimented with ramdisk MDTs (not in production,
obviously), it is clear that even this 'perfect' medium doesn't solve
all Lustre iops problems. far from it. usually it just means that you
hit algorithmic or NUMA problems in the Lustre MDS code, or (more
likely) the ops just flow on to the OSTs and those become the
bottleneck instead.

basically ramdisk MDT speedups weren't big, even compared to just, say,
16 fast FC or SAS disks. SSDs would be in between if they were behaving
perfectly, and it would take extensive testing to determine whether
they were.

looking at it a different way, Lustre's statahead kinda works ok,
creates are (IIRC) batched so they also scale ok, so deletes might be
the only workload left where the fastest MDT money can buy would get
you any significant benefit... probably not worth the spend for most
folks.

assuming for a moment that SSDs worked as they should, other Lustre
related workloads for which SSDs might be suitable are external
journals for OSTs, md bitmaps, or (one day) perhaps ZFS intent logs.

cheers,
robin
--
Dr Robin Humble, HPC Systems Analyst, NCI National Facility
