There are several things that could have been done. The most likely are:

1) you deactivated the OSTs on the MDS, using something like:
  # lctl set_param ost.work-OST0001.active=0
  # lctl set_param ost.work-OST0002.active=0

2) you set the file stripe on the directory to use only OST0, as with:

  # lfs setstripe -i 0 .

I would think that you'd remember #1, so my guess would be #2, which could
have happened when someone intended to do "lfs setstripe -c 0".  Do an
"lfs getstripe ."  A simple:

  # lfs setstripe -i -1 .

in each directory should clear it up going forward.  Note that existing
files will NOT be re-striped, but new files will be balanced going forward.

Kevin

Aaron Everett wrote:
> Thanks for the reply.
>
> File sizes are all <1GB and most files are <1MB. For a test, I copied a
> typical result set from a non-lustre mount to my lustre directory. Total
> size of the test is 42GB. I included before/after results for lfs df -i
> from a client.
>
> Before test:
> [r...@englogin01 backups]# lfs df
> UUID                  1K-blocks       Used  Available Use% Mounted on
> fortefs-MDT0000_UUID 1878903960  129326660 1749577300   6% /lustre/work[MDT:0]
> fortefs-OST0000_UUID 1264472876  701771484  562701392  55% /lustre/work[OST:0]
> fortefs-OST0001_UUID 1264472876  396097912  868374964  31% /lustre/work[OST:1]
> fortefs-OST0002_UUID 1264472876  393607384  870865492  31% /lustre/work[OST:2]
>
> filesystem summary: 3793418628 1491476780 2301941848  39% /lustre/work
>
> [r...@englogin01 backups]# lfs df -i
> UUID                    Inodes     IUsed      IFree IUse% Mounted on
> fortefs-MDT0000_UUID 497433511  33195991  464237520    6% /lustre/work[MDT:0]
> fortefs-OST0000_UUID  80289792  13585653   66704139   16% /lustre/work[OST:0]
> fortefs-OST0001_UUID  80289792   7014185   73275607    8% /lustre/work[OST:1]
> fortefs-OST0002_UUID  80289792   7013859   73275933    8% /lustre/work[OST:2]
>
> filesystem summary: 497433511  33195991  464237520    6% /lustre/work
>
> After test:
>
> [aever...@englogin01 ~]$ lfs df
> UUID                  1K-blocks       Used  Available Use% Mounted on
> fortefs-MDT0000_UUID 1878903960  129425104 1749478856   6% /lustre/work[MDT:0]
> fortefs-OST0000_UUID 1264472876  759191664  505281212  60%
> /lustre/work[OST:0]
> fortefs-OST0001_UUID 1264472876  395929536  868543340  31% /lustre/work[OST:1]
> fortefs-OST0002_UUID 1264472876  393392924  871079952  31% /lustre/work[OST:2]
>
> filesystem summary: 3793418628 1548514124 2244904504  40% /lustre/work
>
> [aever...@englogin01 ~]$ lfs df -i
> UUID                    Inodes     IUsed      IFree IUse% Mounted on
> fortefs-MDT0000_UUID 497511996  33298931  464213065    6% /lustre/work[MDT:0]
> fortefs-OST0000_UUID  80289792  13665028   66624764   17% /lustre/work[OST:0]
> fortefs-OST0001_UUID  80289792   7013783   73276009    8% /lustre/work[OST:1]
> fortefs-OST0002_UUID  80289792   7013456   73276336    8% /lustre/work[OST:2]
>
> filesystem summary: 497511996  33298931  464213065    6% /lustre/work
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Brian J. Murrell
> Sent: Thursday, March 19, 2009 3:13 PM
> To: [email protected]
> Subject: Re: [Lustre-discuss] Unbalanced load across OST's
>
> On Thu, 2009-03-19 at 14:33 -0400, Aaron Everett wrote:
>
>> Hello all,
>
> Hi,
>
>> We are running 1.6.6 with a shared mgs/mdt and 3 ost's. We run a set
>> of tests that write heavily, then we review the results and delete the
>> data. Usually the load is evenly spread across all 3 ost's. I noticed
>> this afternoon that the load does not seem to be distributed.
>
> Striping as well as file count and size affects OST distribution as well.
> Are any of the data involved striped? Are you writing very few large files
> before you measure distribution?
>
>> OST0000 has a load of 50+ with iowait of around 10%
>>
>> OST0001 has a load of <1 with >99% idle
>>
>> OST0002 has a load of <1 with >99% idle
>
> What does lfs df say before and after such a test that produces the above
> results? Does it bear out even use amongst the OSTs before, and after the
> test?
>
>> df confirms the lopsided writes:
>
> lfs df [-i] from a client is usually more illustrative of use.
> As I say above, if you can quiesce the filesystem for the test above, do
> an lfs df; lfs df -i before the test and after. Assuming you were
> successful in quiescing, you should see the change to the OSTs that your
> test effected.
>
>> OST0000:
>>
>> Filesystem  Size  Used Avail Use% Mounted on
>> /dev/sdb1   1.2T  602G  544G  53% /mnt/fortefs/ost0
>
> What's important is what it looked like before the test too. Your test
> could have, for example, written a single object (i.e. file) of nearly
> 300G for all we can tell from what you've posted so far.
>
> b.

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
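[Editor's note] Kevin's per-directory check-and-fix sequence can be sketched
as a small script. This is only a sketch: the directory path is a
placeholder, the `lfs` invocations are the ones quoted in his reply, and the
`command -v` guard is just there so the script degrades gracefully when run
off a Lustre client.

```shell
#!/bin/sh
# Sketch of the diagnosis/fix Kevin describes, per affected directory.
# Assumption: the first argument (default ".") is the directory to inspect.

check_and_fix_stripe() {
    dir=${1:-.}
    if command -v lfs >/dev/null 2>&1; then
        # A stripe offset of 0 here would confirm the directory is
        # pinned to OST0 (the suspected "lfs setstripe -i 0" mistake).
        lfs getstripe "$dir"
        # -i -1 restores the default starting OST for NEW files;
        # existing files are NOT re-striped.
        lfs setstripe -i -1 "$dir"
    else
        echo "lfs not found: run this on a Lustre client"
    fi
}

check_and_fix_stripe .
```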
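[Editor's note] The before/after comparison Brian asks for is easier to
eyeball with a little awk over the `lfs df` output. A minimal sketch, using
the "before test" OST lines quoted above as sample input; it assumes the
1.6.x `lfs df` column layout shown in this thread (use% in column 5, mount
point in column 6):

```shell
#!/bin/sh
# Summarize per-OST usage and the spread between the most- and least-used
# OST; a large spread suggests pinned striping or a few huge files.
# Sample lines are copied verbatim from the "before test" lfs df above.
lfs_df_output='fortefs-OST0000_UUID 1264472876 701771484 562701392 55% /lustre/work[OST:0]
fortefs-OST0001_UUID 1264472876 396097912 868374964 31% /lustre/work[OST:1]
fortefs-OST0002_UUID 1264472876 393607384 870865492 31% /lustre/work[OST:2]'

printf '%s\n' "$lfs_df_output" | awk '
/OST/ {
    pct = $5; sub(/%/, "", pct); pct += 0   # strip "%", force numeric
    printf "%s %s%%\n", $6, pct
    if (pct > max) max = pct
    if (min == "" || pct < min) min = pct
}
END { printf "spread: %d%%\n", max - min }'
```

On the sample data this prints each OST at its use% and a 24% spread
(55% on OST:0 vs 31% on the others), matching the imbalance Aaron reported.
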
