I’m trying to finalize my file system configuration for production. I’ll be 
moving 3-3.5B files (about 1.8PB) from my legacy storage to ESS. The legacy 
file systems use a 256K block size with 8K subblocks.

Target ESS is a GL4 with 8TB drives (2.2PB using 8+2p).

For file systems configured on the ESS, the vdisk block size must equal the 
file system block size. Using 8+2p, the smallest block size is 512K. Looking at 
the overall file size histogram, a block size of 1MB might be a good compromise 
between efficiency and subblock size (with the V4 format’s fixed 32 subblocks 
per block, a 1MB block means a 32K subblock). With 4K inodes, somewhere around 
60-70% of the current files end up in inodes. Files in the 4K-32K range are the 
ones that would potentially “waste” some space, since they are too big for an 
inode but smaller than a subblock. That’s roughly 10-15% of the files. This 
ends up being a compromise forced by our inability to use the V5 file system 
format, which would allow a smaller subblock with a large block size (clients 
are still at CentOS 6 / Scale 4.2.3).
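
For anyone who wants to poke at the numbers, this is roughly how I’m estimating 
the data-space overhead. It’s just a sketch: the histogram buckets below are 
made-up placeholders (the real ones come from a policy scan of the legacy file 
systems), and the ~3.5K “fits in a 4K inode” threshold is an assumption.

    # Back-of-envelope data-space overhead for a candidate block size under the
    # V4 on-disk format (fixed 32 subblocks per block, 4K inodes).
    KIB = 1024

    def allocated_size(file_size, block_size, inode_capacity=3500):
        """Data space a file of file_size bytes occupies on disk."""
        if file_size <= inode_capacity:       # small files live in the inode
            return 0
        subblock = block_size // 32           # V4: always 32 subblocks per block
        full_blocks, remainder = divmod(file_size, block_size)
        tail = -(-remainder // subblock) * subblock   # round tail up to a subblock
        return full_blocks * block_size + tail

    # (file_size_bytes, file_count) buckets -- hypothetical numbers only
    histogram = [
        (2 * KIB, 2_000_000_000),        # fits in a 4K inode
        (16 * KIB, 400_000_000),         # too big for an inode, under 32K
        (256 * KIB, 500_000_000),
        (4 * 1024 * KIB, 100_000_000),   # 4 MiB
    ]

    logical = sum(size * count for size, count in histogram)
    for bs in (512 * KIB, 1024 * KIB):
        allocated = sum(allocated_size(size, bs) * count for size, count in histogram)
        print(f"block size {bs // KIB:4d}K: allocated/logical = {allocated / logical:.2f}")

With these placeholder buckets, the only place 1MB loses ground to 512K is the 
bucket between the inode limit and 32K, i.e. the 10-15% of files mentioned above.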

For metadata, the file systems are currently using about 15TB of space 
(replicated, across roughly 1.7PB of usage). That’s a mix of 256B and 4K inodes 
(70% 256B). Assuming an 8x increase, the upper limit of what we’d need is 
around 128TB. Since some of the metadata is already in 4K inodes, I feel an 
allocation of 90-100TB (4-5% of data space) is closer to reality. I’m also not 
sure a separate metadata pool makes sense if I’m using the V4 format, where the 
metadata block size is the same as the data block size anyway.
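
The bracket I’m reasoning from looks roughly like this. Again just a sketch — I 
don’t have an exact split of the 15TB between inode space and other metadata 
(directories, indirect blocks, EAs), so that fraction is the assumption being 
varied:

    # Rough metadata growth bracket. 15TB and the 70% 256B share are the figures
    # above; the share of that 15TB that is actually inode space is an assumption.
    current_md_tb = 15.0             # replicated metadata in use today
    share_256b = 0.70                # share of inodes still at 256B
    growth_256b = 4096 / 256         # 16x if every 256B inode becomes 4K

    for inode_fraction in (0.3, 0.5, 0.7):   # assumed inode share of the 15TB
        inode_tb = current_md_tb * inode_fraction
        other_tb = current_md_tb - inode_tb  # directories etc., assumed roughly flat
        grown = inode_tb * (share_256b * growth_256b + (1 - share_256b)) + other_tb
        pct = grown / (2 * 1000.0)           # against two 1PB file systems
        print(f"inode share {inode_fraction:.0%}: ~{grown:.0f} TB ({pct:.1%} of data space)")

If roughly half of the current 15TB is inode space, the scaled total lands 
right around the 90-100TB / 4-5% figure.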

In summary, I think the best options are:

Option (1): 2 file systems of 1PB each. 1PB data pool, 50TB system pool, 1MB 
block size, 2x replicated metadata (see the rough headroom check below)
Option (2): 2 file systems of 1PB each. 1PB combined data/metadata pool, 1MB 
block size, 2x replicated metadata (preferred, since then I don’t have to 
manage my metadata space separately)
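
One quick check on Option (1), assuming the 90-100TB projection (like the 15TB 
it was scaled from) already includes the 2x replication:

    # Does a 50TB system pool per file system cover the metadata projection?
    projected_md_tb = (90, 100)   # total across both file systems, replicated
    n_filesystems = 2
    system_pool_tb = 50           # per file system under Option (1)

    for total in projected_md_tb:
        per_fs = total / n_filesystems
        headroom = system_pool_tb - per_fs
        print(f"{total} TB total -> {per_fs:.0f} TB per file system, "
              f"{headroom:+.0f} TB headroom in a {system_pool_tb} TB system pool")

At the high end of the projection the system pools would be essentially full, 
so Option (1) doesn’t leave much headroom unless I make them larger.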

Any thoughts would be appreciated.


Bob Oesterlin
Sr Principal Storage Engineer, Nuance
507-269-0413
