Greetings Sebastien,

On Wed, Aug 19, 2009 at 10:39 AM, Sébastien Buisson
<[email protected]> wrote:
> Hi,
>
> To me:
> 12 OSTs x 1.2 GB = 14.4 GB < 16 GB
>
> So you are clearly within the recommendation.
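For reference, that arithmetic is the manual's rule of thumb applied to this
setup. A minimal sketch of the same check; the 1.2 GB-per-OST figure is the
one quoted in this thread, not a hard limit:

    awk 'BEGIN {
        osts = 12; per_ost_gb = 1.2; ram_gb = 16   # figures from this thread
        need_gb = osts * per_ost_gb                # 14.4 GB
        printf "need ~%.1f GB, have %d GB -> %s\n",
               need_gb, ram_gb, (need_gb <= ram_gb ? "OK" : "undersized")
    }'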
I thought I would be within the spec *if* my OSTs were smaller units.
As they are JBODs in sections of 6+ TB each, I thought I was
"coloring outside the lines".

Thanks,
megan

> Cheers,
> Sebastien.
>
>
> Ms. Megan Larko wrote:
>>
>> Responding to what Sebastien has written:
>>>
>>> Hi,
>>>
>>> Just a small feedback from our own experience.
>>> I agree with Brian that there is no hard limit in the Lustre code on
>>> the number of OSTs per OSS. But one should really take into account
>>> the available memory on OSSes when defining the number of OSTs per
>>> OSS (and so the size of each OST). If you do not have 1 GB or 1.2 GB
>>> of memory per OST on your OSSes, you will run into serious trouble
>>> with "out of memory" messages.
>>>
>>> For instance, if you want 8 OSTs per OSS, your OSSes should have at
>>> least 10 GB of RAM.
>>>
>>> Unfortunately we experienced those "out of memory" problems, so I
>>> advise you to read Lustre Operations Manual chapter 33.12, "OSS RAM
>>> Size for a Single OST".
>>>
>>> Cheers,
>>> Sebastien.
>>
>> We have one OSS running Lustre 2.6.18-53.1.13.el5_lustre.1.6.4.3smp.
>> This OSS has 16 GB of RAM for 76 TB of formatted Lustre disk space.
>>
>> [r...@oss4 ~]# cat /proc/meminfo
>> MemTotal:     16439360 kB
>> MemFree:         88204 kB
>>
>> Client sees: ic-m...@o2ib:/crew8   Total Usable Space 76 TB
>>
>> The OSS has 6 JBODs, each of which is partitioned in two parts to stay
>> below the Lustre limit of 8 TB per partition.
>> /dev/sdb1   6.3T  3.8T  2.3T  63%  /srv/lustre/OST/crew8-OST0000
>> /dev/sdb2   6.3T  3.7T  2.3T  62%  /srv/lustre/OST/crew8-OST0001
>> /dev/sdc1   6.3T  3.8T  2.3T  63%  /srv/lustre/OST/crew8-OST0002
>> /dev/sdc2   6.3T  3.8T  2.2T  64%  /srv/lustre/OST/crew8-OST0003
>> /dev/sdd1   6.3T  3.8T  2.2T  64%  /srv/lustre/OST/crew8-OST0004
>> /dev/sdd2   6.3T  4.2T  1.8T  70%  /srv/lustre/OST/crew8-OST0005
>> /dev/sdi1   6.3T  4.3T  1.8T  71%  /srv/lustre/OST/crew8-OST0006
>> /dev/sdi2   6.3T  3.8T  2.2T  64%  /srv/lustre/OST/crew8-OST0007
>> /dev/sdj1   6.3T  3.8T  2.3T  63%  /srv/lustre/OST/crew8-OST0008
>> /dev/sdj2   6.3T  3.8T  2.2T  63%  /srv/lustre/OST/crew8-OST0009
>> /dev/sdk1   6.3T  3.7T  2.3T  62%  /srv/lustre/OST/crew8-OST0010
>> /dev/sdk2   6.3T  3.7T  2.3T  63%  /srv/lustre/OST/crew8-OST0011
>>
>> As you can see, this is nowhere near the recommendation of 1 GB of
>> RAM per OST. Yes, under load we do occasionally see kernel panics
>> due to what we believe is insufficient memory and swap. These panics
>> occur approximately once per month. We also see watchdog messages
>> stating "swap page allocation failure", sometimes a day before a
>> kernel panic. Only after this Lustre disk was up and running was I
>> enlightened that this was too much load for a single OSS. Ah well,
>> live and learn. I am planning to split this one large group across
>> two OSSes in the next month. Hopefully the kernel panics and
>> watchdog errors will go away with the OST load shared across two OSS
>> machines.
>>
>> Just one real-life scenario for your consideration.
>>
>> megan
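For anyone wanting to run the same check against a live OSS, a rough sketch,
assuming the /srv/lustre/OST mount-point layout from the df output above (the
pattern and the 1.2 GB figure are site-specific values taken from this thread):

    #!/bin/sh
    # Compare MemTotal against ~1.2 GB of RAM per mounted OST.
    # The mount-point pattern is the one from the df output above;
    # adjust it (and the per-OST figure) for your own installation.
    mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    osts=$(df -P | grep -c '/srv/lustre/OST/')
    awk -v mem_kb="$mem_kb" -v osts="$osts" 'BEGIN {
        need_gb = osts * 1.2
        have_gb = mem_kb / 1024 / 1024
        printf "%d OSTs want ~%.1f GB RAM; this OSS has %.1f GB\n",
               osts, need_gb, have_gb
        if (need_gb > have_gb) print "WARNING: below the rule of thumb"
    }'

On the OSS above this would report 12 OSTs wanting ~14.4 GB against the
~15.7 GB present (16439360 kB), i.e. just within the recommendation, which
matches Sebastien's reading; the panics megan describes occurred despite that.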
