>Someone mentioned encryption will bypass this feature, but it's actually 
>encryption that perhaps requires larger inode sizes to store all the key meta 
>info (you can have up to 8 keys per inode I believe).

I believe that is incorrect.  If encryption is used, the size of the inode 
makes no difference, because only data, NOT metadata, is encrypted on the 
file system.  So storing data blocks in MD space is out.
See the Scale documentation, and older GPFS documentation, for more 
information (such as Encryption - IBM Documentation 
<https://www.ibm.com/docs/en/storage-scale/5.1.3?topic=administering-encryption>).
Until such time as they start encrypting the metadata, it's pointless to 
size MD for small files.

Ed Wahl
Ohio Supercomputer Center

From: gpfsug-discuss <[email protected]> On Behalf Of Alec
Sent: Wednesday, August 2, 2023 12:07 PM
To: gpfsug main discussion list <[email protected]>
Subject: Re: [gpfsug-discuss] Inode size, and system pool subblock

I think things are conflated here...

The inode size is really just a call on how much functionality you need in an 
inode.  I wouldn't even think about disk block size when setting this.  
Essentially the smaller the inode the less space I need for metadata but also 
the less capacity I have in my inode.

The default is 4k and if you don't change it then GPFS will put up to a 3.8k 
file in the inode itself vs going to an indirect disk allocation.  Someone 
mentioned encryption will bypass this feature, but it's actually encryption 
that perhaps requires larger inode sizes to store all the key meta info (you 
can have up to 8 keys per inode I believe).

So essentially if you've got a smaller inode size, your directories will max 
out sooner, your ACLs could be constrained, long file names can exhaust the 
space, and you may not have enough room for encryption details.  But the upshot 
is you need to dedicate less space to metadata and can handle more file 
entries.  So if you've got billions of files and are managing replicas then you 
should consider fine tuning inode size down.

You can go from 3.5% of space going to inodes to 1% if you went from 4k to 512 
bytes.. but there is a reason GPFS defaults to 4k... And doesn't expand on it 
too much.  If you've guessed wrong you're kind of hosed.
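
A rough sketch of the trade-off above, counting only the raw space taken by the 
inodes themselves. It does not reproduce the percentages quoted, since those 
depend on total capacity and on the other metadata in the pool. The function 
name, the file count and the replica count are assumptions for illustration:

```python
# Illustrative only: raw space consumed by inodes, ignoring directories,
# indirect blocks and other metadata. All names here are hypothetical.

def inode_space_bytes(num_inodes: int, inode_size: int, replicas: int = 1) -> int:
    """Bytes needed to hold num_inodes inodes of inode_size bytes each."""
    return num_inodes * inode_size * replicas

TIB = 1024 ** 4
files = 1_000_000_000  # "billions of files" scale, chosen for illustration

for size in (512, 4096):
    space = inode_space_bytes(files, size, replicas=2)
    print(f"{size:>4}-byte inodes, 2 replicas: {space / TIB:.2f} TiB")
```

Whatever the absolute numbers, the inode space scales linearly with inode size, 
so 4k inodes need eight times the space of 512-byte inodes.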

None of this has to do with hardware block sizes, subblock allocation and 
fragment sizes.  Those are further compounded by the 4k native block size vs 
the emulated 512-byte block size that some disk hardware does.

For GPFS you generally will have a very large block size, 256kb or 1MB, and 
GPFS will divide those blocks into 32 fragments.  So you may have your smallest 
unit being an 8kb or 32kb fragment.  If you have a dedicated MD pool (highly 
recommended) you'd definitely specify a smaller block size than 1MB (128kb = 
4kb fragments).
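
The arithmetic above can be sketched as follows, assuming the 32 fragments per 
block described in this message (the ratio may differ on other file system 
format versions, and the function name is made up for illustration):

```python
# Illustrative only: fragment (subblock) size under the assumption that
# every block is divided into 32 fragments, as described above.
FRAGMENTS_PER_BLOCK = 32  # assumption taken from the text

def fragment_size_kib(block_size_kib: int) -> float:
    """Return the fragment size in KiB for a given block size in KiB."""
    return block_size_kib / FRAGMENTS_PER_BLOCK

for block_kib in (128, 256, 1024):
    print(f"{block_kib} KiB block -> {fragment_size_kib(block_kib):g} KiB fragment")
```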

The balance you're trying to strike here is the least amount of commands to 
retrieve your data efficiently.  Think about the roundtrip on the bus being the 
same for a 4kb read vs a 1mb read so try to maximize this.

Generally the goal of the file system is to ensure that the excess data that is 
read when trying to pull fragments is as useful as possible.

I may also be confused, but I wouldn't worry so much about inode size relative 
to block size.. just worry about getting large blocks working well for the 
regular storage pool if your data is huge, and about using a smaller block size 
for MD if you have a dedicated pool, which is almost always recommended.

Be very careful of specifying a small inode size because it's not just max 
filenames and max file counts in a directory.. it is much more.. and if you 
have a lot of small files don't underestimate the advantage of those files 
being stored directly in the inode.  A 512 byte inode could only store about a 
380 byte file vs a 4k inode storing a 3800 byte file.  These files tend to be 
shell scripts and config files which you really don't want to be waiting around 
for, occupying a huge 1mb read for, and wasting a potentially larger 64kb 
fragment allocation on.
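
To get a feel for this on your own data, here is a minimal sketch that uses the 
approximate capacities quoted above (about 380 bytes for a 512 byte inode, about 
3800 bytes for a 4k inode). The real limit depends on what else is stored in the 
inode, and the sample file sizes are hypothetical:

```python
# Illustrative only: approximate in-inode data capacity, using the rough
# figures quoted above (not values taken from GPFS documentation).
APPROX_INLINE_CAPACITY = {512: 380, 4096: 3800}  # inode size -> bytes of file data

def fits_in_inode(file_size: int, inode_size: int) -> bool:
    """True if a file of file_size bytes would fit under the quoted estimate."""
    return file_size <= APPROX_INLINE_CAPACITY[inode_size]

sample_sizes = [120, 900, 2500, 3800, 5000]  # hypothetical small-file sizes in bytes
for inode_size in (512, 4096):
    inline = [s for s in sample_sizes if fits_in_inode(s, inode_size)]
    print(f"{inode_size}-byte inode: {len(inline)} of {len(sample_sizes)} files stored inline")
```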

Alec



On Wed, Aug 2, 2023, 4:47 AM Olaf Weiser 
<[email protected]<mailto:[email protected]>> wrote:
Hallo Peter,

[1] [...] having a smaller inode size than the subblock size means there's a 
big wastage on disk usage, with no performance benefit to doing so[...]
in short - yes 😉



[2]  [...]  I believe I'm correct in saying that inodes are not the only things 
to live on the metadata pool, so I assume that some other metadata might 
benefit from the larger block/subblock size. But looking at the number of 
inodes, the inode size, and the space consumed in the system pool, it really 
looks like the majority of space consumed is by inodes.[...]
You may need to consider snapshots and directories, which all contribute to 
MD space.

Predicting the MD space requirements for directories is always hard, because 
the size of a directory depends on the length of the file names the users 
will create...


Furthermore, using an inode size of less than 4k also does not make much sense 
when you take into account that NVMe drives and other modern block storage 
devices come with a hardware block size of 4k (even though GPFS can still deal 
with 512 bytes per sector).


hope this helps ..




________________________________
From: gpfsug-discuss <[email protected]> 
on behalf of Peter Chase <[email protected]>
Sent: Wednesday, 2 August 2023 11:09
To: [email protected] <[email protected]>
Subject: [EXTERNAL] [gpfsug-discuss] Inode size, and system pool subblock

Good Morning,

I have a question about inode size vs subblock size. Can anyone think of a 
reason that the chosen inode size of a scale filesystem should be smaller than 
the subblock size for the metadata pool?
I'm looking at an existing filesystem, the inode size is 2KiB, and the subblock 
is 4KiB.
It feels like I'm missing something. If I've understood the docs on blocks and 
subblocks correctly, it sounds like the subblock is the smallest atomic access 
size. Meaning with a 4K subblock, and a 2K inode, reading the inode would 
return its contents and 2K of empty subblock every time. So, in my head (and 
maybe only there), having a smaller inode size than the subblock size means 
there's a big wastage on disk usage, with no performance benefit to doing so.
I believe I'm correct in saying that inodes are not the only things to live on 
the metadata pool, so I assume that some other metadata might benefit from the 
larger block/subblock size. But looking at the number of inodes, the inode 
size, and the space consumed in the system pool, it really looks like the 
majority of space consumed is by inodes.

As I said, I feel like I'm missing something, so if anyone can tell me where 
I'm wrong it would be greatly appreciated!

Sincerely,


Pete Chase

UKMO
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
