Hello,

On 31.08.2018 at 13:59, Shyam Ranganathan wrote:
> I suspect you have hit this:
> https://bugzilla.redhat.com/show_bug.cgi?id=1602262#c5
>
> I further suspect your older setup was 3.10 based and not 3.12 based.
>
> There is an additional feature added in 3.12 that stores GFID to path
> conversion details using xattrs (see "GFID to path" in
> https://docs.gluster.org/en/latest/release-notes/3.12.0/#major-changes-and-features
> )
>
> Due to which xattr storage limit is reached/breached on ext4 based bricks.
>
> To check if you are facing similar issue to the one in the bug provided
> above, I would check if the brick logs throw up the no space error on a
> gfid2path set failure.

Thanks for the hint. From the log output (no gfid2path errors) this does not
seem to be the problem, although the old gluster volume was set up with
version 3.10.x (or even 3.8.x, I think).

I wrote that I could reproduce it on new ext4 and on old xfs gluster volumes
with version 3.12.13, while it was running without problems on ~3.12.8 half a
year ago. But I just saw that my old main volume is not xfs but also ext4.
Digging into the logs, I could see that in January I was still running
3.10.8 / 3.10.9 and first switched to 3.12.9 (the 3.12 version branch) in
April.

Judging by the entry sizes and their differences, your suggestion would fit:

https://manpages.debian.org/testing/manpages/xattr.7.en.html
or
http://man7.org/linux/man-pages/man5/attr.5.html

    In the current ext2, ext3, and ext4 filesystem implementations, the
    total bytes used by the names and values of all of a file's extended
    attributes must fit in a single filesystem block (1024, 2048 or 4096
    bytes, depending on the block size specified when the filesystem was
    created).

because I can see differences depending on how the brick filesystem was set up:

* with the ext4 "defaults" setup I got the error after 44 successful links:

/etc/mke2fs.conf:
[defaults]
    base_features = sparse_super,large_file,filetype,resize_inode,dir_index,ext_attr
    default_mntopts = acl,user_xattr
    enable_periodic_fsck = 0
    blocksize = 4096
    inode_size = 256
    inode_ratio = 16384

[fs_types]
    ext3 = {
        features = has_journal
    }
    ext4 = {
        features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isize
        inode_size = 256
    }
...

* with the ext4 "small" setup (with inode_size changed back to 256 when I
  formatted it) I could create only 10 successful links:

    small = {
        blocksize = 1024
        inode_size = 128    # in my volume case also 256
        inode_ratio = 4096
    }

which would match the blocksize limitation. Here on the default ext4 fs:

# attr -l test
Attribute "gfid2path.3951a8fec4234683" has a 41 byte value for test
Attribute "gfid" has a 16 byte value for test
Attribute "afr.dirty" has a 12 byte value for test
Attribute "gfid2path.003214300fcd4d34" has a 44 byte value for test
...
Attribute "gfid2path.fe4d3e4d0bc31351" has a 44 byte value for test
# attr -l test | grep gfid2path | wc -l
46

41 + 16 + 12 + 45 * 44 = 2049 (+ 256 inode_size + ???) <= 4096

With 1k blocksize I got only:

# attr -l test
Attribute "gfid2path.7a3f0fa0e8f7eba3" has a 41 byte value for test
Attribute "gfid" has a 16 byte value for test
Attribute "afr.dirty" has a 12 byte value for test
Attribute "gfid2path.13e24c98a492d7f1" has a 43 byte value for test
Attribute "gfid2path.1efa5641f9785d6c" has a 43 byte value for test
Attribute "gfid2path.551dfafc5d4a7bda" has a 43 byte value for test
Attribute "gfid2path.578dc56f20801437" has a 43 byte value for test
Attribute "gfid2path.8e983883502e3c57" has a 43 byte value for test
Attribute "gfid2path.94b700e1c7f156e3" has a 43 byte value for test
Attribute "gfid2path.cbeb1108f9a34dac" has a 43 byte value for test
Attribute "gfid2path.cd6ba60f624abc2b" has a 43 byte value for test
Attribute "gfid2path.dbf95647d59cd047" has a 43 byte value for test
Attribute "gfid2path.ec6198adc227befe" has a 44 byte value for test

41 + 16 + 12 + 9 * 43 + 44 = 500 (+ 256 inode_size + ???) <= 1024

whatever the unknown missing (different) size is needed for.
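
The sums above count only the attribute values, while the man page excerpt says the names count against the block as well. A minimal Python sketch for adding up both from `attr -l` output follows. The function name `xattr_usage` is hypothetical, and the name length is taken as printed by `attr -l`; any namespace prefix, padding, or per-entry bookkeeping the filesystem adds is not included, so the result is only a lower bound on what the block has to hold.

```python
import re

# Matches one line of `attr -l` output in the format shown in the listings above.
ATTR_LINE = re.compile(r'Attribute "(?P<name>[^"]+)" has a (?P<size>\d+) byte value')


def xattr_usage(listing: str) -> dict:
    """Sum name and value bytes over all attributes in an `attr -l` listing.

    Lower bound only: filesystem-internal overhead per entry is an unknown
    here and is deliberately left out.
    """
    count = name_bytes = value_bytes = 0
    for match in ATTR_LINE.finditer(listing):
        count += 1
        name_bytes += len(match.group("name"))   # name as printed by attr -l
        value_bytes += int(match.group("size"))  # value size as reported
    return {
        "count": count,
        "name_bytes": name_bytes,
        "value_bytes": value_bytes,
        "total": name_bytes + value_bytes,
    }


if __name__ == "__main__":
    # Three lines copied from the 1k-blocksize listing above.
    sample = (
        'Attribute "gfid2path.7a3f0fa0e8f7eba3" has a 41 byte value for test\n'
        'Attribute "gfid" has a 16 byte value for test\n'
        'Attribute "afr.dirty" has a 12 byte value for test\n'
    )
    print(xattr_usage(sample))
```

Feeding the complete `attr -l test` output into the function gives a names-plus-values total that can be compared against the block size, which may account for part of the "???" term.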

But in the log I can see only this error, which is not very helpful (tested
here on another volume with ext4 "default" settings):

[2018-08-31 13:21:11.306022] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-staging-prudsys-client-0: remote operation failed: (/test/test-45 -> /test/test-46) [No space left on device]
[2018-08-31 13:21:11.306420] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-staging-prudsys-client-2: remote operation failed: (/test/test-45 -> /test/test-46) [No space left on device]
[2018-08-31 13:21:11.306466] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-staging-prudsys-client-1: remote operation failed: (/test/test-45 -> /test/test-46) [No space left on device]
[2018-08-31 13:21:11.307452] W [fuse-bridge.c:540:fuse_entry_cbk] 0-glusterfs-fuse: 23122: LINK() /test/test-46 => -1 (No space left on device)
[2018-08-31 13:21:11.339428] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-staging-prudsys-client-0: remote operation failed: (/test/test-45 -> /test/test-47) [No space left on device]
[2018-08-31 13:21:11.339991] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-staging-prudsys-client-1: remote operation failed: (/test/test-45 -> /test/test-47) [No space left on device]
[2018-08-31 13:21:11.340039] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-staging-prudsys-client-2: remote operation failed: (/test/test-45 -> /test/test-47) [No space left on device]
[2018-08-31 13:21:11.341036] W [fuse-bridge.c:540:fuse_entry_cbk] 0-glusterfs-fuse: 23125: LINK() /test/test-47 => -1 (No space left on device)
...
[2018-08-31 13:21:12.097966] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-staging-prudsys-client-0: remote operation failed: (/test/test-45 -> /test/test-100) [No space left on device]
[2018-08-31 13:21:12.098326] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-staging-prudsys-client-1: remote operation failed: (/test/test-45 -> /test/test-100) [No space left on device]
[2018-08-31 13:21:12.098412] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-staging-prudsys-client-2: remote operation failed: (/test/test-45 -> /test/test-100) [No space left on device]
[2018-08-31 13:21:12.101533] W [fuse-bridge.c:540:fuse_entry_cbk] 0-glusterfs-fuse: 23285: LINK() /test/test-100 => -1 (No space left on device)
[2018-08-31 13:32:48.613484] I [MSGID: 109063] [dht-layout.c:716:dht_layout_normalize] 0-staging-prudsys-dht: Found anomalies in (null) (gfid = 1923da4d-9661-4d53-84d6-7d196276a0fc). Holes=1 overlaps=0
[2018-08-31 13:32:48.613529] I [MSGID: 109063] [dht-layout.c:716:dht_layout_normalize] 0-staging-prudsys-dht: Found anomalies in (null) (gfid = a04f8ab2-5b7a-490c-a3a6-71d9899295fa). Holes=1 overlaps=0
[2018-08-31 13:32:48.613556] I [MSGID: 109063] [dht-layout.c:716:dht_layout_normalize] 0-staging-prudsys-dht: Found anomalies in (null) (gfid = 6d5ed713-7cff-4cf9-bb57-197a217051db). Holes=1 overlaps=0

Same log output with the old ext4 filesystem:

[2018-08-31 14:06:05.882886] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-mygluster-client-2: remote operation failed: (/test/test-45 -> /test/test-46) [No space left on device]
[2018-08-31 14:06:05.883427] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-mygluster-client-3: remote operation failed: (/test/test-45 -> /test/test-46) [No space left on device]
[2018-08-31 14:06:05.884821] W [fuse-bridge.c:540:fuse_entry_cbk] 0-glusterfs-fuse: 15575982: LINK() /test/test-46 => -1 (No space left on device)
[2018-08-31 14:06:05.901852] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-mygluster-client-2: remote operation failed: (/test/test-45 -> /test/test-47) [No space left on device]
[2018-08-31 14:06:05.902410] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-mygluster-client-3: remote operation failed: (/test/test-45 -> /test/test-47) [No space left on device]
[2018-08-31 14:06:05.903968] W [fuse-bridge.c:540:fuse_entry_cbk] 0-glusterfs-fuse: 15575985: LINK() /test/test-47 => -1 (No space left on device)
...
[2018-08-31 14:06:06.727908] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-mygluster-client-2: remote operation failed: (/test/test-45 -> /test/test-100) [No space left on device]
[2018-08-31 14:06:06.728409] W [MSGID: 114031] [client-rpc-fops.c:2701:client3_3_link_cbk] 0-mygluster-client-3: remote operation failed: (/test/test-45 -> /test/test-100) [No space left on device]
[2018-08-31 14:06:06.729631] W [fuse-bridge.c:540:fuse_entry_cbk] 0-glusterfs-fuse: 15576145: LINK() /test/test-100 => -1 (No space left on device)

and no more log lines referencing my test. I can see none of the gfid2path
errors you mentioned, but the error seems related to the inode size, as shown
above.

Also interesting, as you mentioned: with the current 3.12.13 version on
another "old" GlusterFS volume with an xfs backend it is working fine.
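
The check suggested in the quoted reply (a no-space error on a gfid2path set failure in the brick logs) can be sketched as a plain substring scan. This is only a sketch: the exact wording of the brick-side message is not known here, so it matches on the two substrings alone, and the function name `scan_log` is hypothetical. The log text would have to be read from the brick log files, whose location depends on the installation.

```python
ENOSPC_TEXT = "No space left on device"


def scan_log(log_text: str, needle: str = "gfid2path") -> tuple:
    """Return (number of ENOSPC lines, ENOSPC lines that also mention `needle`)."""
    enospc_lines = [line for line in log_text.splitlines() if ENOSPC_TEXT in line]
    with_needle = [line for line in enospc_lines if needle in line]
    return len(enospc_lines), with_needle


if __name__ == "__main__":
    # One client-side line copied from the log output above; it reports
    # ENOSPC but does not mention gfid2path.
    sample = (
        "[2018-08-31 14:06:05.884821] W [fuse-bridge.c:540:fuse_entry_cbk] "
        "0-glusterfs-fuse: 15575982: LINK() /test/test-46 => -1 "
        "(No space left on device)\n"
    )
    total, matching = scan_log(sample)
    print(total, len(matching))
```

An empty second list on the brick logs would match what is described above: no-space errors are present, but none of them mention gfid2path.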

> To check if you are facing similar issue to the one in the bug provided
> above, I would check if the brick logs throw up the no space error on a
> gfid2path set failure.

Is there some parameter to get more detailed error logging? From the
documentation it looks like the defaults are already reasonable:

https://docs.gluster.org/en/v3/Administrator%20Guide/Managing%20Volumes/

diagnostics.brick-log-level
    Changes the log-level of the bricks.
    Default: INFO. Options: DEBUG/WARNING/ERROR/CRITICAL/NONE/TRACE
diagnostics.client-log-level
    Changes the log-level of the clients.
    Default: INFO. Options: DEBUG/WARNING/ERROR/CRITICAL/NONE/TRACE
diagnostics.latency-measurement
    Statistics related to the latency of each operation would be tracked.
    Default: Off. Options: On/Off
diagnostics.dump-fd-stats
    Statistics related to file-operations would be tracked.
    Default: Off. Options: On

> To get around the problem, I would suggest using xfs as the backing FS
> for the brick (considering you have close to 250 odd hardlinks to a
> file). I would not attempt to disable the gfid2path feature, as that is
> useful in getting to the real file just given a GFID and is already part
> of core on disk Gluster metadata (It can be shut off, but I would
> refrain from it).

Since only some 10xGB of small files are duplicated like this, it is much
easier to go back to using duplicated content, and perhaps I can also get
people to clean up unneeded files.

>
>> My search for documentation found only the parameter
>> "storage.max-hardlinks" with default of 100 for version 4.0.
>> I checked it in my gluster 3.12.13 but here the parameter is not yet
>> implemented.

If this problem is related to the backend filesystem, it would be good to
have it documented for 4.0 as well: the storage.max-hardlinks parameter would
work only if the backend is e.g. xfs and has enough inode space for it
(ideally with a reference or a short example of how to calculate it).

Thanks and have a nice weekend,
Reiner
