On Thu, Feb 9, 2012 at 8:18 PM, Andreas Dilger <[email protected]> wrote:
> On 2012-02-09, at 6:20 AM, Jack David wrote:
>> In the output of "lfs getstripe <filename> | <dirname>", the obdidx
>> denotes the OST index (I assume).
>>
>> Consider the following output:
>>
>> lmm_stripe_count:   2
>> lmm_stripe_size:    1048576
>> lmm_stripe_offset:  1
>>      obdidx     objid     objid     group
>>           1         2       0x2         0
>>           0         3       0x3         0
>>
>> where I have a setup consisting of two OSTs. If I have more than two
>> OSTs, is it possible that I get the obdidx values out of order? Or will
>> the obdidx values always be linear?
>>
>> For example, in the above output, the values are linear (like 1, 0 - and
>> this pattern will be repeated while storing the data, I assume). If I
>> have 4 OSTs, can the values be non-linear? Something like 2,0,1,3 or
>> 2,1,3,0 (or any pattern for that matter)?
>
> Typically the ordering will be linear, but this depends on a number of
> different factors:
> - what order the OSTs were created in: without --index=N the OST order
>   depends on the order in which they were first mounted, so using --index
>   is always recommended, and will be mandatory in the future
> - the distribution of OSTs among OSS nodes: the MDS object allocator
>   will normally select one OST from each OSS before allocating another
>   object from a different OST on the same OSS

Thanks for this information.

> - the space available on each OST: when OST free space is imbalanced
>   the OSTs will be selected in part based on how full they are

I have a question here. Let's say I have 4 OSTs, but the Lustre client
issues a write request that can be accommodated by a single OST (e.g.
the write request is 512 bytes and stripe_size is 1MB). In this case,
how will the data be stored? Will the MDS maintain the index of the next
OST which should serve the request?

>> My assumption on how the data is stored on OSTs:
>> Based upon the values of obdidx, each OST will store a stripe_size
>> worth of data into the objid (a file under the ldiskfs volume of that
>> OST) in rotation. So if I get the obdidx like 2,1,3,0 and stripe_size
>> is 1MB, then the data will be stored in the following order:
>>
>> 1st MB: 2nd OST
>> 2nd MB: 1st OST
>> 3rd MB: 3rd OST
>> 4th MB: 0th OST
>> 5th MB: 2nd OST (Again - repeating the pattern)
>> 6th MB: 1st OST
>>
>> Is this understanding correct? I hope I am clear on my question.
>
> Correct. The data is strictly round-robin on the objects once they
> are allocated to a file.

Thanks again,
J

> Cheers, Andreas
> --
> Andreas Dilger                       Whamcloud, Inc.
> Principal Engineer                   http://www.whamcloud.com/
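
For reference, here is a minimal sketch of the round-robin placement confirmed above, assuming plain striping where each stripe_size chunk of the file goes to the next object in the file's obdidx list. The names `locate`, `layout` and `MB` are made up for illustration and are not part of Lustre or the lfs tool; the obdidx order 2,1,3,0 is the hypothetical one from the question.

```python
def locate(offset, stripe_size, obdidx_order):
    """Map a file byte offset to (OST index, byte offset inside that OST's object).

    Assumes strict round-robin striping over the objects already
    allocated to the file, as described in the thread.
    """
    stripe_count = len(obdidx_order)
    stripe_number = offset // stripe_size        # which stripe_size chunk of the file
    slot = stripe_number % stripe_count          # position in the file's object list
    full_rounds = stripe_number // stripe_count  # complete passes over all objects so far
    object_offset = full_rounds * stripe_size + offset % stripe_size
    return obdidx_order[slot], object_offset


MB = 1048576
layout = [2, 1, 3, 0]  # hypothetical obdidx order from the question

# Print which OST holds the start of each of the first six 1MB chunks.
for n in range(6):
    ost, obj_off = locate(n * MB, MB, layout)
    print(f"MB {n + 1}: OST {ost}, object offset {obj_off}")
```

Under this model the target OST is a function of the file offset and the object list fixed when the file was allocated, so a 512-byte write at offset 0 falls entirely inside the first stripe and maps to the first object in the list (OST 2 in this example); no separate "next OST" counter is needed per write.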
