On Sat, Feb 3, 2018 at 4:45 AM, Dilger, Andreas <[email protected]> wrote:
> On Jan 26, 2018, at 07:56, Thomas Roth <[email protected]> wrote:
>>
>> Hmm, option-testing leads to more confusion:
>>
>> With this 922GB sdb1 I do
>>
>>   mkfs.lustre --reformat --mgs --mdt ... /dev/sdb1
>>
>> The output of the command says
>>
>>   Permanent disk data:
>>   Target:     test0:MDT0000
>>   ...
>>
>>   device size = 944137MB
>>   formatting backing filesystem ldiskfs on /dev/sdb1
>>     target name   test0:MDT0000
>>     4k blocks     241699072
>>     options       -J size=4096 -I 1024 -i 2560 -q -O dirdata,uninit_bg,^extents,mmp,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F
>>
>>   mkfs_cmd = mke2fs -j -b 4096 -L test0:MDT0000 -J size=4096 -I 1024 -i 2560 -q -O dirdata,uninit_bg,^extents,mmp,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F /dev/sdb1 241699072
>
> The default options have to be conservative, as we don't know in advance
> how a filesystem will be used. It may be that some sites will have lots
> of hard links or long filenames (which consume directory space == blocks,
> but not inodes), or they will have widely-striped files (which also
> consume xattr blocks). The 2KB/inode ratio includes the space for the
> inode itself (512B in 2.7.x, 1024B in 2.10), at least one directory entry
> (~64 bytes), some fixed overhead for the journal (up to 4GB on the MDT),
> and Lustre-internal overhead (OI entry = ~64 bytes), ChangeLog, etc.
>
> If you have a better idea of space usage at your site, you can specify
> different parameters.
>
>> Mounting this as ldiskfs gives 369M inodes.
>>
>> One would assume that specifying one or more of the mke2fs options here
>> in the mkfs.lustre command would change nothing.
>>
>> However,
>>
>>   mkfs.lustre --reformat --mgs --mdt ... --mkfsoptions="-I 1024" /dev/sdb1
>>
>> says
>>
>>   device size = 944137MB
>>   formatting backing filesystem ldiskfs on /dev/sdb1
>>     target name   test0:MDT0000
>>     4k blocks     241699072
>>     options       -I 1024 -J size=4096 -i 1536 -q -O dirdata,uninit_bg,^extents,mmp,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F
>>
>>   mkfs_cmd = mke2fs -j -b 4096 -L test0:MDT0000 -I 1024 -J size=4096 -i 1536 -q -O dirdata,uninit_bg,^extents,mmp,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F /dev/sdb1 241699072
>>
>> and the mounted device now has 615M inodes.
>>
>> So, whatever calculates the "-i" (bytes-per-inode) value becomes
>> ineffective if I specify the inode size by hand?
>
> This is a bit surprising. I agree that specifying the same inode size
> value as the default should not affect the calculation for the
> bytes-per-inode ratio.
>
>> How many bytes-per-inode do I need?
>>
>> This ratio, is it what the manual specifies as "one inode created for
>> each 2kB of LUN"?
>
> That was true with 512B inodes, but with the increase to 1024B inodes in
> 2.10 (to allow for PFL file layouts, since they are larger) the inode
> ratio has also gone up by 512B, to 2560B/inode.

Does this mean that someone who updates their servers from 2.x to 2.10
will not be able to use PFL, since the MDT was formatted in a way that
can't support it? (In our case, formatted under Lustre 2.5, currently
running 2.8.)

Thanks,
Eli

>> Perhaps the raw size of an MDT device should better be such that it
>> leads to "-I 1024 -i 2048"?
>
> Yes, that is probably reasonable, since the larger inode also means that
> there is less chance of external xattr blocks being allocated.
>
> Note that with ZFS there is no need to specify the inode ratio at all.
> It will dynamically allocate inode blocks as needed, along with directory
> blocks, OI tables, etc., until the filesystem is full.
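For what it's worth, the two inode counts above can be sanity-checked with
back-of-the-envelope arithmetic (a rough sketch only; it ignores the journal
and other fixed overhead, so the real counts come out somewhat lower than the
estimate):

```shell
# Rough estimate: mke2fs creates about (device bytes / bytes-per-inode)
# inodes. The mounted filesystem shows fewer, since the journal, OI tables,
# etc. consume part of the same space.
blocks=241699072   # "4k blocks" reported by mkfs.lustre
bs=4096

for ratio in 2560 1536; do
  printf -- "-i %s -> ~%s inodes\n" "$ratio" "$(( blocks * bs / ratio ))"
done
```

That gives roughly 387M inodes for -i 2560 and 645M for -i 1536, consistent
with the observed 369M and 615M once overhead is subtracted.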
>> Cheers, Andreas
>>
>> On 01/26/2018 03:10 PM, Thomas Roth wrote:
>>> Hi all,
>>> what is the relation between raw device size and size of a formatted
>>> MDT? Size of inodes + free space = raw size?
>>> The example:
>>> The MDT device has 922 GB in /proc/partitions.
>>> Formatted under Lustre 2.5.3 with default values for mkfs.lustre, this
>>> resulted in a 'df -h' MDT of 692G and, more importantly, 462M inodes.
>>> So, the space used for inodes + the 'df -h' output add up to the raw
>>> size:
>>> 462M inodes * 0.5kB/inode + 692 GB = 922 GB
>>> On that system there are now 330M files, more than 70% of the
>>> available inodes.
>>> 'df -h' says '692G 191G 456G 30% /srv/mds0'
>>> What do I need the remaining 450G for? (Or the ~400G left once all the
>>> inodes are eaten?)
>>> Should the format command not be tuned towards more inodes?
>>> Btw, on a Lustre 2.10.2 MDT I get 369M inodes and 550G space (with a
>>> 922G raw device): inode size is now 1024.
>>> However, according to the manual and various Jira/LUDOC tickets the
>>> size should be 2k nowadays?
>>> Actually, the command within mkfs.lustre reads
>>> mke2fs -j -b 4096 -L test0:MDT0000 -J size=4096 -I 1024 -i 2560 -F /dev/sdb 241699072
>>> -i 2560?
>>> Cheers,
>>> Thomas
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Intel Corporation
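Thomas's accounting quoted above (462M inodes * 0.5kB/inode + 692 GB = 922 GB)
can be checked the same way (a rough sketch; the inputs are the quoted 'df'
numbers, and GB-vs-GiB rounding makes the sum only approximate):

```shell
# Space accounting for the 2.5.3-formatted MDT: the inode table plus the
# capacity visible in 'df' should roughly equal the raw device size.
inodes=462000000   # ~462M inodes reported for the mounted MDT
isize=512          # 512-byte inodes under Lustre 2.5
df_g=692           # 'df -h' capacity of the mounted MDT, in G
raw_g=922          # raw size from /proc/partitions, in G

itab_g=$(( inodes * isize / 1024 / 1024 / 1024 ))
echo "inode table ~${itab_g}G + ${df_g}G visible = $(( itab_g + df_g ))G vs ${raw_g}G raw"
```

That lands around 912G against the 922G raw device; the remaining ~10G is
plausibly the journal (up to 4GB on an MDT) and other fixed overhead
mentioned earlier in the thread.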
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
