No, the algorithm is not purely random; it is weighted on QOS, space, and a few other factors. When a stripe is chosen on one OSS, we add a penalty to the other OSTs on that OSS to prevent I/O bunching on a single OSS.

cliffw
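A toy sketch of the idea Cliff describes (this is illustrative Python, not Lustre's actual allocator; the weighting inputs and the `penalty` factor are assumptions): weight each OST by its free space, and whenever a stripe lands on an OSS, penalize that OSS's remaining OSTs so the next stripe tends to land on a different server.

```python
import random

# Hypothetical sketch of weighted stripe allocation with an OSS penalty.
# Each OST is a dict: 'idx' (OST index), 'oss' (owning OSS id), 'free'
# (free space used as the base weight). Not Lustre source code.

def choose_stripes(osts, stripe_count, penalty=0.5):
    """Pick stripe_count distinct OSTs, weighted by free space,
    penalizing sibling OSTs on an already-chosen OSS."""
    weights = {o['idx']: float(o['free']) for o in osts}
    chosen = []
    for _ in range(stripe_count):
        pool = [o for o in osts if o['idx'] not in chosen]
        total = sum(weights[o['idx']] for o in pool)
        pick = random.uniform(0, total)
        acc = 0.0
        for o in pool:
            acc += weights[o['idx']]
            if pick <= acc:
                chosen.append(o['idx'])
                # Penalize the other OSTs on the same OSS so later
                # stripes of this file prefer other servers.
                for p in osts:
                    if p['oss'] == o['oss'] and p['idx'] != o['idx']:
                        weights[p['idx']] *= penalty
                break
    return chosen
```

For example, with 9 equally full OSTs spread 3-per-OSS, a 3-stripe file chosen this way will usually (though not always, since it is still randomized) spread across all three OSSes.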
On Thu, Mar 31, 2011 at 1:59 PM, Jeremy Filizetti <[email protected]> wrote:
> Is this a feature implemented after 1.8.5? In the past, default striping
> without an offset resulted in sequential stripe allocation according to
> client device order for a striped file. Basically, the order OSTs were
> mounted after the last --writeconf is the order the targets are added to
> the client llog and allocated.
>
> It's probably not a big deal for lots of clients, but for a small number
> of clients doing large sequential IO, or working over the WAN, it is. So
> regardless of an A or B configuration, a file with a stripe count of 3
> could end up issuing IO to a single OSS instead of using round-robin
> between the socket/queue pair to each OSS.
>
> Jeremy
>
> On Thu, Mar 31, 2011 at 11:06 AM, Kevin Van Maren <[email protected]> wrote:
>> It used to be that multi-stripe files were created with sequential OST
>> indexes. It also used to be that OST indexes were sequentially assigned
>> to newly-created files.
>> As Lustre now adds greater randomization, the strategy for assigning
>> OSTs to OSS nodes (and storage hardware, which often limits the
>> aggregate performance of multiple OSTs) is less important.
>>
>> While I have normally gone with "a", "b" can make it easier to remember
>> where OSTs are located, and also keeps a uniform convention if the
>> storage system is later grown.
>>
>> Kevin
>>
>> Heckes, Frank wrote:
>> > Hi all,
>> >
>> > Sorry if this question has been answered before.
>> >
>> > What is the optimal 'strategy' for assigning OSTs to OSS nodes:
>> >
>> > -a- Assign OSTs via round-robin to the OSSes
>> > -b- Assign in consecutive order (as long as the backend storage
>> >     provides enough capacity for IOPS and bandwidth)
>> > -c- Something 'in-between' the 'extremes' of -a- and -b-
>> >
>> > E.g.:
>> >
>> > -a- OSS_1   OSS_2   OSS_3
>> >     |_      |_      |_
>> >     OST_1   OST_2   OST_3
>> >     OST_4   OST_5   OST_6
>> >     OST_7   OST_8   OST_9
>> >
>> > -b- OSS_1   OSS_2   OSS_3
>> >     |_      |_      |_
>> >     OST_1   OST_4   OST_7
>> >     OST_2   OST_5   OST_8
>> >     OST_3   OST_6   OST_9
>> >
>> > I thought -a- would be best for task-local I/O (each task writes to
>> > its own file) and single-file I/O (all tasks write to a single file),
>> > since it is like the RAID-0 approach used for disk I/O (and Sun
>> > created our first FS this way).
>> > Has anyone made any systematic investigation into which approach is
>> > best, or does anyone have an educated opinion?
>> > Many thanks in advance.
>> > BR
>> >
>> > -Frank Heckes
>> >
>> > ------------------------------------------------------------------------------------------------
>> > Forschungszentrum Juelich GmbH
>> > 52425 Juelich
>> > Registered office: Juelich
>> > Registered in the commercial register of the Amtsgericht Dueren, No. HR B 3498
>> > Chairman of the Supervisory Board: MinDirig Dr. Karl Eugen Huthmacher
>> > Management: Prof. Dr. Achim Bachem (Chairman),
>> > Dr. Ulrich Krafft (Deputy Chairman), Prof. Dr.-Ing. Harald Bolt,
>> > Prof. Dr. Sebastian M. Schmidt
>> > ------------------------------------------------------------------------------------------------
>> >
>> > Visit us at our new web site at www.fz-juelich.de
>> > _______________________________________________
>> > Lustre-discuss mailing list
>> > [email protected]
>> > http://lists.lustre.org/mailman/listinfo/lustre-discuss

--
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
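The two assignment schemes Frank asks about can be written out as simple index maps (illustrative Python only; the helper names are made up, and scheme (b) assumes the OST count divides evenly among the OSSes):

```python
# Illustration of the two OST-to-OSS assignment schemes from the thread.
# Hypothetical helper names; indices are 0-based here.

def assign_round_robin(n_ost, n_oss):
    """Scheme (a): OST i goes to OSS (i mod n_oss)."""
    return {i: i % n_oss for i in range(n_ost)}

def assign_consecutive(n_ost, n_oss):
    """Scheme (b): each OSS gets a consecutive block of OSTs.
    Assumes n_ost is a multiple of n_oss."""
    per_oss = n_ost // n_oss
    return {i: i // per_oss for i in range(n_ost)}
```

This makes Jeremy's point concrete: under the old sequential allocation, a 3-stripe file on OSTs 0, 1, 2 touches three different OSSes in scheme (a) but a single OSS in scheme (b); with the weighted, penalized allocator described above, the stripes spread across OSSes under either scheme.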
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
