I guess that particular table is not the whole truth, nor a specification, nor a promise, but a simplified summary of what you get when there is just one block size that applies to both meta-data and data-data.
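To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in a few lines of Python -- my speculation about the behavior, not the shipped implementation, and every name in it is made up for illustration. It just shows how a single subblocks-per-full-block value, derived from the metadata block size via that table, would reproduce exactly the numbers in the mmlsfs output below.

KiB = 1024
MiB = 1024 * KiB

# Per the mmcrfs man-page table, a 1 MiB block size carries an 8 KiB subblock,
# i.e. 1 MiB / 8 KiB = 128 subblocks per full block.
metadata_block_size = 1 * MiB
table_subblock_for_metadata = 8 * KiB
subblocks_per_full_block = metadata_block_size // table_subblock_for_metadata

# Speculation: that one value is then applied unchanged to every pool.
data_block_size = 4 * MiB
print(subblocks_per_full_block)                          # 128   (--subblocks-per-full-block)
print(metadata_block_size // subblocks_per_full_block)   # 8192  (-f, system pool)
print(data_block_size // subblocks_per_full_block)       # 32768 (-f, other pools)

In other words, 1 MiB / 8 KiB would fix the parameter at 128, and 4 MiB / 128 then lands on 32 KiB for the data pools.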
You have discovered that it does not apply to systems where metadata has a different blocksize than data-data. My guesstimate (speculation!) is that the deployed code chooses one subblocks-per-full-block parameter and applies that to both, which would explain the results we're seeing. Further, it seems that the mmlsfs command assumes, at least in some places, that there is only one subblocks-per-block parameter... Looking deeper into the code is another story for another day -- but I'll say that there seems to be sufficient flexibility that if this were deemed a burning issue, there could be further "enhancements..." ;-)

From: "Buterbaugh, Kevin L" <[email protected]>
To: gpfsug main discussion list <[email protected]>
Date: 08/01/2018 02:24 PM
Subject: Re: [gpfsug-discuss] Sub-block size wrong on GPFS 5 filesystem?
Sent by: [email protected]

Hi Marc,

Thanks for the response … I understand what you're saying, but since I'm asking for a 1 MB block size for metadata and a 4 MB block size for data, and according to the chart in the mmcrfs man page both result in an 8 KB sub-block size, I'm still confused as to why I've got a 32 KB sub-block size for my non-system (i.e. data) pools? Especially when you consider that 32 KB isn't the default even if I had chosen an 8 or 16 MB block size!

Kevin

Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
[email protected] - (615)875-9633

On Aug 1, 2018, at 12:21 PM, Marc A Kaplan <[email protected]> wrote:

I haven't looked into all the details, but here's a clue -- notice there is only one "subblocks-per-full-block" parameter, and it is the same for both metadata blocks and data-data blocks. So maybe (MAYBE) that is a constraint somewhere... Certainly, in the currently supported code, that's what you get.

From: "Buterbaugh, Kevin L" <[email protected]>
To: gpfsug main discussion list <[email protected]>
Date: 08/01/2018 12:55 PM
Subject: [gpfsug-discuss] Sub-block size wrong on GPFS 5 filesystem?
Sent by: [email protected]

Hi All,

Our production cluster is still on GPFS 4.2.3.x, but in preparation for moving to GPFS 5 I have upgraded our small (7 node) test cluster to GPFS 5.0.1-1. I am setting up a new filesystem there using hardware that we recently life-cycled out of our production environment.

I "successfully" created a filesystem, but I believe the sub-block size is wrong. I'm using a 4 MB filesystem block size, so according to the mmcrfs man page the sub-block size should be 8K:
Table 1. Block sizes and subblock sizes

+---------------------------------------+---------------+
| Block size                            | Subblock size |
+---------------------------------------+---------------+
| 64 KiB                                | 2 KiB         |
+---------------------------------------+---------------+
| 128 KiB                               | 4 KiB         |
+---------------------------------------+---------------+
| 256 KiB, 512 KiB, 1 MiB, 2 MiB, 4 MiB | 8 KiB         |
+---------------------------------------+---------------+
| 8 MiB, 16 MiB                         | 16 KiB        |
+---------------------------------------+---------------+

However, it appears that it's 8K for the system pool but 32K for the other pools:

flag                value                    description
------------------- ------------------------ -----------------------------------
 -f                 8192                     Minimum fragment (subblock) size in bytes (system pool)
                    32768                    Minimum fragment (subblock) size in bytes (other pools)
 -i                 4096                     Inode size in bytes
 -I                 32768                    Indirect block size in bytes
 -m                 2                        Default number of metadata replicas
 -M                 3                        Maximum number of metadata replicas
 -r                 1                        Default number of data replicas
 -R                 3                        Maximum number of data replicas
 -j                 scatter                  Block allocation type
 -D                 nfs4                     File locking semantics in effect
 -k                 all                      ACL semantics in effect
 -n                 32                       Estimated number of nodes that will mount file system
 -B                 1048576                  Block size (system pool)
                    4194304                  Block size (other pools)
 -Q                 user;group;fileset       Quotas accounting enabled
                    user;group;fileset       Quotas enforced
                    none                     Default quotas enabled
 --perfileset-quota No                       Per-fileset quota enforcement
 --filesetdf        No                       Fileset df enabled?
 -V                 19.01 (5.0.1.0)          File system version
 --create-time      Wed Aug 1 11:39:39 2018  File system creation time
 -z                 No                       Is DMAPI enabled?
 -L                 33554432                 Logfile size
 -E                 Yes                      Exact mtime mount option
 -S                 relatime                 Suppress atime mount option
 -K                 whenpossible             Strict replica allocation option
 --fastea           Yes                      Fast external attributes enabled?
 --encryption       No                       Encryption enabled?
 --inode-limit      101095424                Maximum number of inodes
 --log-replicas     0                        Number of log replicas
 --is4KAligned      Yes                      is4KAligned?
 --rapid-repair     Yes                      rapidRepair enabled?
 --write-cache-threshold 0                   HAWC Threshold (max 65536)
 --subblocks-per-full-block 128              Number of subblocks per full block
 -P                 system;raid1;raid6       Disk storage pools in file system
 --file-audit-log   No                       File Audit Logging enabled?
 --maintenance-mode No                       Maintenance Mode enabled?
 -d                 test21A3nsd;test21A4nsd;test21B3nsd;test21B4nsd;test23Ansd;test23Bnsd;test23Cnsd;test24Ansd;test24Bnsd;test24Cnsd;test25Ansd;test25Bnsd;test25Cnsd  Disks in file system
 -A                 yes                      Automatic mount option
 -o                 none                     Additional mount options
 -T                 /gpfs5                   Default mount point
 --mount-priority   0                        Mount priority

Output of mmcrfs:

mmcrfs gpfs5 -F ~/gpfs/gpfs5.stanza -A yes -B 4M -E yes -i 4096 -j scatter -k all -K whenpossible -m 2 -M 3 -n 32 -Q yes -r 1 -R 3 -T /gpfs5 -v yes --nofilesetdf --metadata-block-size 1M

The following disks of gpfs5 will be formatted on node testnsd3:
    test21A3nsd: size 953609 MB
    test21A4nsd: size 953609 MB
    test21B3nsd: size 953609 MB
    test21B4nsd: size 953609 MB
    test23Ansd: size 15259744 MB
    test23Bnsd: size 15259744 MB
    test23Cnsd: size 1907468 MB
    test24Ansd: size 15259744 MB
    test24Bnsd: size 15259744 MB
    test24Cnsd: size 1907468 MB
    test25Ansd: size 15259744 MB
    test25Bnsd: size 15259744 MB
    test25Cnsd: size 1907468 MB
Formatting file system ...
Disks up to size 8.29 TB can be added to storage pool system.
Disks up to size 16.60 TB can be added to storage pool raid1.
Disks up to size 132.62 TB can be added to storage pool raid6.
Creating Inode File
   8 % complete on Wed Aug 1 11:39:19 2018
  18 % complete on Wed Aug 1 11:39:24 2018
  27 % complete on Wed Aug 1 11:39:29 2018
  37 % complete on Wed Aug 1 11:39:34 2018
  48 % complete on Wed Aug 1 11:39:39 2018
  60 % complete on Wed Aug 1 11:39:44 2018
  72 % complete on Wed Aug 1 11:39:49 2018
  83 % complete on Wed Aug 1 11:39:54 2018
  95 % complete on Wed Aug 1 11:39:59 2018
 100 % complete on Wed Aug 1 11:40:01 2018
Creating Allocation Maps
Creating Log Files
   3 % complete on Wed Aug 1 11:40:07 2018
  28 % complete on Wed Aug 1 11:40:14 2018
  53 % complete on Wed Aug 1 11:40:19 2018
  78 % complete on Wed Aug 1 11:40:24 2018
 100 % complete on Wed Aug 1 11:40:25 2018
Clearing Inode Allocation Map
Clearing Block Allocation Map
Formatting Allocation Map for storage pool system
  85 % complete on Wed Aug 1 11:40:32 2018
 100 % complete on Wed Aug 1 11:40:33 2018
Formatting Allocation Map for storage pool raid1
  53 % complete on Wed Aug 1 11:40:38 2018
 100 % complete on Wed Aug 1 11:40:42 2018
Formatting Allocation Map for storage pool raid6
  20 % complete on Wed Aug 1 11:40:47 2018
  39 % complete on Wed Aug 1 11:40:52 2018
  60 % complete on Wed Aug 1 11:40:57 2018
  79 % complete on Wed Aug 1 11:41:02 2018
 100 % complete on Wed Aug 1 11:41:08 2018
Completed creation of file system /dev/gpfs5.
mmcrfs: Propagating the cluster configuration data to all affected nodes.  This is an asynchronous process.

And contents of stanza file:

%nsd: nsd=test21A3nsd usage=metadataOnly failureGroup=210 pool=system servers=testnsd3,testnsd1,testnsd2 device=dm-15
%nsd: nsd=test21A4nsd usage=metadataOnly failureGroup=210 pool=system servers=testnsd1,testnsd2,testnsd3 device=dm-14
%nsd: nsd=test21B3nsd usage=metadataOnly failureGroup=211 pool=system servers=testnsd1,testnsd2,testnsd3 device=dm-17
%nsd: nsd=test21B4nsd usage=metadataOnly failureGroup=211 pool=system servers=testnsd2,testnsd3,testnsd1 device=dm-16
%nsd: nsd=test23Ansd usage=dataOnly failureGroup=23 pool=raid6 servers=testnsd2,testnsd3,testnsd1 device=dm-10
%nsd: nsd=test23Bnsd usage=dataOnly failureGroup=23 pool=raid6 servers=testnsd3,testnsd1,testnsd2 device=dm-9
%nsd: nsd=test23Cnsd usage=dataOnly failureGroup=23 pool=raid1 servers=testnsd1,testnsd2,testnsd3 device=dm-5
%nsd: nsd=test24Ansd usage=dataOnly failureGroup=24 pool=raid6 servers=testnsd3,testnsd1,testnsd2 device=dm-6
%nsd: nsd=test24Bnsd usage=dataOnly failureGroup=24 pool=raid6 servers=testnsd1,testnsd2,testnsd3 device=dm-0
%nsd: nsd=test24Cnsd usage=dataOnly failureGroup=24 pool=raid1 servers=testnsd2,testnsd3,testnsd1 device=dm-2
%nsd: nsd=test25Ansd usage=dataOnly failureGroup=25 pool=raid6 servers=testnsd1,testnsd2,testnsd3 device=dm-6
%nsd: nsd=test25Bnsd usage=dataOnly failureGroup=25 pool=raid6 servers=testnsd2,testnsd3,testnsd1 device=dm-6
%nsd: nsd=test25Cnsd usage=dataOnly failureGroup=25 pool=raid1 servers=testnsd3,testnsd1,testnsd2 device=dm-3
%pool: pool=system blockSize=1M usage=metadataOnly layoutMap=scatter allowWriteAffinity=no
%pool: pool=raid6 blockSize=4M usage=dataOnly layoutMap=scatter allowWriteAffinity=no
%pool: pool=raid1 blockSize=4M usage=dataOnly layoutMap=scatter allowWriteAffinity=no
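For reference, a quick sanity check of what Table 1 promises for each pool could look something like this little Python sketch (the file name and the way the stanzas are parsed are assumptions based on what's pasted above; it's nothing GPFS ships):

import re

KiB = 1024
MiB = 1024 * KiB

# Table 1 from the mmcrfs man page: block size -> subblock size.
TABLE = {64 * KiB: 2 * KiB, 128 * KiB: 4 * KiB,
         256 * KiB: 8 * KiB, 512 * KiB: 8 * KiB,
         1 * MiB: 8 * KiB, 2 * MiB: 8 * KiB, 4 * MiB: 8 * KiB,
         8 * MiB: 16 * KiB, 16 * MiB: 16 * KiB}

def parse_size(text):
    """Turn a blockSize value like '1M' or '4M' into bytes."""
    number, unit = re.fullmatch(r"(\d+)([KM])", text.strip()).groups()
    return int(number) * (KiB if unit == "K" else MiB)

# "gpfs5.stanza" is assumed to be a local copy of the stanza file shown above.
with open("gpfs5.stanza") as stanza:
    for line in stanza:
        if line.startswith("%pool:"):
            attrs = dict(field.split("=", 1) for field in line.split()[1:])
            subblock = TABLE[parse_size(attrs["blockSize"])]
            print(f'{attrs["pool"]:8s} blockSize={attrs["blockSize"]:4s} '
                  f'-> Table 1 subblock {subblock // KiB} KiB')

By that table, both the 1M metadata pool and the 4M data pools should land on 8 KiB subblocks, which is why the 32 KiB value reported for the data pools looks wrong to me.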
What am I missing or what have I done wrong?

Thanks…

Kevin

Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
[email protected] - (615)875-9633
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss
