OK, so the point is that MDS writes are so small that one could never stripe a single write across multiple disks anyway. Very good, one less point to worry about.
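(To make that concrete, a back-of-the-envelope sketch, assuming Linux md software RAID and made-up device names - the 11-pair RAID-10 layout is from the thread below, the 64K chunk is just a typical default:)

    # hypothetical: 22 disks as 11 RAID-1 pairs striped into a RAID-10;
    # --chunk is the amount written to one pair before moving to the next
    mdadm --create /dev/md0 --level=10 --raid-devices=22 --chunk=64 /dev/sd[b-w]
    # the largest MDS write (~4.5K, per Cliff's figure below) is far below
    # the 64K chunk, so a single metadata write never spans two pairs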
Btw, files on the MDT - why does the apparent file size there sometimes reflect the size of the real file, and sometimes not? For example, on an ldiskfs-mounted copy of our MDT, I have a directory under ROOT/ with

-rw-rw-r-- 1 935M 15. Jul 2009 09000075278027.140
-rw-rw-r-- 1    0 15. Jul 2009 09000075278027.150

As they should, both entries are 0-sized as seen by e.g. "du". On Lustre, both files exist and both have size 935M. So for some reason one has a metadata entry that appears as a huge sparse file, and the other does not. Is there a reason, or is this just an illness of our installation?

Cheers,
Thomas

On 01/21/2011 09:31 PM, Cliff White wrote:
> On Fri, Jan 21, 2011 at 3:43 AM, Thomas Roth <[email protected]> wrote:
>
> > Hi all,
> >
> > we have gotten new MDS hardware, and I've got two questions:
> >
> > What are the recommendations for the RAID configuration and formatting
> > options? I was following the recent discussion about these aspects on
> > an OST: chunk size, strip size, stride-size, stripe-width etc. in the
> > light of the 1MB chunks of Lustre ... So what about the MDT? I will
> > have a RAID 10 that consists of 11 RAID-1 pairs striped over, giving
> > me roughly 3TB of space. What would be the correct value for <insert
> > your favorite term>, the amount of data written to one disk before
> > proceeding to the next disk?
>
> The MDS does very small random IO - inodes and directories. Afaik, the
> largest chunk of data read/written would be 4.5K, and you would see
> that only with large OST stripe counts. RAID 10 is fine. You will not
> be doing IO that spans more than one spindle, so I'm not sure if
> there's a real need to tune here.
> Also, the size of the data on the MDS is determined by the number of
> files in the filesystem (~4K per file is good); unless you are buried
> in petabytes, 3TB is likely way oversized for an MDT.
> cliffw
>
> > Secondly, it is not yet decided whether we wouldn't use this hardware
> > to set up a second Lustre cluster. The manual recommends having only
> > one MGS per site, but doesn't elaborate: what would be the drawback
> > of having two MGSes, i.e. two different network addresses the clients
> > have to connect to in order to mount the Lustre filesystems?
> > I know that it didn't work in Lustre 1.6.3 ;-) and there are no
> > apparent issues when connecting a Lustre client to a test cluster now
> > (version 1.8.4), but what about production?
> >
> > Cheers,
> > Thomas

--
--------------------------------------------------------------------
Thomas Roth
Department: Informationstechnologie
Location: SB3 1.262
Phone: +49-6159-71 1453
Fax: +49-6159-71 2986

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1
64291 Darmstadt
www.gsi.de

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
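(Relating to the size question at the top of this message - a minimal sketch for reproducing the comparison, assuming a hypothetical mount point and directory name; "ls -l" reports the apparent size stored in the inode, while "du" and the block count from "stat" report the blocks actually allocated, which is why a huge-looking MDT entry can still be 0-sized for du:)

    # hypothetical paths for an ldiskfs-mounted copy of the MDT
    cd /mnt/mdt-copy/ROOT/somedir
    ls -l 09000075278027.140       # apparent size, e.g. 935M
    du -k 09000075278027.140       # allocated space: 0
    stat -c '%s bytes apparent, %b blocks allocated' 09000075278027.140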
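(On the two-MGS question - a sketch of what clients would do, with hypothetical NIDs and filesystem names; each filesystem is simply mounted via its own MGS NID, using the standard Lustre 1.8 client mount syntax:)

    # hypothetical: two independent clusters, each with its own MGS
    mount -t lustre 10.0.0.1@tcp0:/alpha /mnt/alpha
    mount -t lustre 10.0.0.2@tcp0:/beta  /mnt/beta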
