Hi,

Thanks for your response, please find the xfs_info for each brick on the arbiter below:

root@uk3-prod-gfs-arb-01:~# xfs_info /data/glusterfs/gv1/brick1
meta-data=/dev/sdc1              isize=512    agcount=31, agsize=131007 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=3931899, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

root@uk3-prod-gfs-arb-01:~# xfs_info /data/glusterfs/gv1/brick2
meta-data=/dev/sde1              isize=512    agcount=13, agsize=327616 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=3931899, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

root@uk3-prod-gfs-arb-01:~# xfs_info /data/glusterfs/gv1/brick3
meta-data=/dev/sdd1              isize=512    agcount=13, agsize=327616 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=3931899, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
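As a rough sanity check on those numbers (using the usual XFS sizing assumption that the inode ceiling is about blocks * bsize * imaxpct% / isize; this is my own back-of-the-envelope, not something taken from the output above):

echo $(( 3931899 * 4096 / 4 / 512 ))    # = 7863798, i.e. roughly the 7.5M inode ceiling that df -hi reports for brick1 and brick3 below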
I've also copied below some df output from the arb server:

root@uk3-prod-gfs-arb-01:~# df -hi
Filesystem            Inodes IUsed IFree IUse% Mounted on
udev                    992K   473  991K    1% /dev
tmpfs                   995K   788  994K    1% /run
/dev/sda1               768K  105K  664K   14% /
tmpfs                   995K     3  995K    1% /dev/shm
tmpfs                   995K     4  995K    1% /run/lock
tmpfs                   995K    18  995K    1% /sys/fs/cgroup
/dev/sdb1               128K   113  128K    1% /var/lib/glusterd
/dev/sdd1               7.5M  2.6M  5.0M   35% /data/glusterfs/gv1/brick3
/dev/sdc1               7.5M  600K  7.0M    8% /data/glusterfs/gv1/brick1
/dev/sde1               6.4M  2.9M  3.5M   46% /data/glusterfs/gv1/brick2
uk1-prod-gfs-01:/gv1    150M  6.5M  144M    5% /mnt/gfs
tmpfs                   995K    21  995K    1% /run/user/1004

root@uk3-prod-gfs-arb-01:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
udev                  3.9G     0  3.9G   0% /dev
tmpfs                 796M  916K  795M   1% /run
/dev/sda1              12G  3.9G  7.3G  35% /
tmpfs                 3.9G  8.0K  3.9G   1% /dev/shm
tmpfs                 5.0M     0  5.0M   0% /run/lock
tmpfs                 3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sdb1             2.0G  456K  1.9G   1% /var/lib/glusterd
/dev/sdd1              15G   12G  3.5G  78% /data/glusterfs/gv1/brick3
/dev/sdc1              15G  2.6G   13G  18% /data/glusterfs/gv1/brick1
/dev/sde1              15G   14G  1.8G  89% /data/glusterfs/gv1/brick2
uk1-prod-gfs-01:/gv1  300G  139G  162G  47% /mnt/gfs
tmpfs                 796M     0  796M   0% /run/user/1004

Something I forgot to mention in my initial message is that the op-version was upgraded from 70200 to 100000, which seems like it could also have been a trigger for the issue.

Thanks,

Liam Smith
Linux Systems Support Engineer

________________________________
From: Strahil Nikolov <hunter86...@yahoo.com>
Sent: 03 July 2023 18:28
To: Liam Smith <liam.sm...@ek.co>; gluster-users@gluster.org <gluster-users@gluster.org>
Subject: Re: [Gluster-users] remove_me files building up

Hi,

you mentioned that the arbiter bricks run out of inodes.
Are you using XFS?
Can you provide the xfs_info of each brick?

Best Regards,
Strahil Nikolov

On Sat, Jul 1, 2023 at 19:41, Liam Smith <liam.sm...@ek.co> wrote:

Hi,

We're running a cluster with two data nodes and one arbiter, and have sharding enabled.

We had an issue a while back where one of the servers crashed; we got the server back up and running, ensured that all healing entries cleared, and also increased the server spec (CPU/Mem), as this seemed to be the potential cause.

Since then, however, we've seen some strange behaviour, whereby a lot of 'remove_me' files are building up under `/data/glusterfs/gv1/brick2/brick/.shard/.remove_me/` and `/data/glusterfs/gv1/brick3/brick/.shard/.remove_me/`.

This is causing the arbiter to run out of space on brick2 and brick3, as the remove_me files are constantly increasing. brick1 appears to be fine: its disk usage increases throughout the day and drops back down, in line with the trend of the brick on the data nodes.

We see the disk usage increase and drop throughout the day on the data nodes for brick2 and brick3 as well, but while the arbiter follows the same trend of increasing disk usage, it doesn't drop at any point.
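(A rough way to gauge that backlog per brick, just a sketch using the paths above; `ls -f` avoids sorting the very large directories, and the count includes the '.' and '..' entries:)

for b in /data/glusterfs/gv1/brick1 /data/glusterfs/gv1/brick2 /data/glusterfs/gv1/brick3; do
  echo "$b: $(ls -f "$b"/brick/.shard/.remove_me 2>/dev/null | wc -l) entries under .shard/.remove_me"
done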
This is the output of some gluster commands; occasional heal entries come and go:

root@uk3-prod-gfs-arb-01:~# gluster volume info gv1

Volume Name: gv1
Type: Distributed-Replicate
Volume ID: d3d1fdec-7df9-4f71-b9fc-660d12c2a046
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: uk1-prod-gfs-01:/data/glusterfs/gv1/brick1/brick
Brick2: uk2-prod-gfs-01:/data/glusterfs/gv1/brick1/brick
Brick3: uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick1/brick (arbiter)
Brick4: uk1-prod-gfs-01:/data/glusterfs/gv1/brick3/brick
Brick5: uk2-prod-gfs-01:/data/glusterfs/gv1/brick3/brick
Brick6: uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick3/brick (arbiter)
Brick7: uk1-prod-gfs-01:/data/glusterfs/gv1/brick2/brick
Brick8: uk2-prod-gfs-01:/data/glusterfs/gv1/brick2/brick
Brick9: uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick2/brick (arbiter)
Options Reconfigured:
cluster.entry-self-heal: on
cluster.metadata-self-heal: on
cluster.data-self-heal: on
performance.client-io-threads: off
storage.fips-mode-rchecksum: on
transport.address-family: inet
cluster.lookup-optimize: off
performance.readdir-ahead: off
cluster.readdir-optimize: off
cluster.self-heal-daemon: enable
features.shard: enable
features.shard-block-size: 512MB
cluster.min-free-disk: 10%
cluster.use-anonymous-inode: yes

root@uk3-prod-gfs-arb-01:~# gluster peer status
Number of Peers: 2

Hostname: uk2-prod-gfs-01
Uuid: 2fdfa4a2-195d-4cc5-937c-f48466e76149
State: Peer in Cluster (Connected)

Hostname: uk1-prod-gfs-01
Uuid: 43ec93d1-ad83-4103-aea3-80ded0903d88
State: Peer in Cluster (Connected)

root@uk3-prod-gfs-arb-01:~# gluster volume heal gv1 info
Brick uk1-prod-gfs-01:/data/glusterfs/gv1/brick1/brick
<gfid:5b57e1f6-3e3d-4334-a0db-b2560adae6d1>
Status: Connected
Number of entries: 1

Brick uk2-prod-gfs-01:/data/glusterfs/gv1/brick1/brick
Status: Connected
Number of entries: 0

Brick uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick1/brick
Status: Connected
Number of entries: 0

Brick uk1-prod-gfs-01:/data/glusterfs/gv1/brick3/brick
Status: Connected
Number of entries: 0

Brick uk2-prod-gfs-01:/data/glusterfs/gv1/brick3/brick
Status: Connected
Number of entries: 0

Brick uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick3/brick
Status: Connected
Number of entries: 0

Brick uk1-prod-gfs-01:/data/glusterfs/gv1/brick2/brick
Status: Connected
Number of entries: 0

Brick uk2-prod-gfs-01:/data/glusterfs/gv1/brick2/brick
<gfid:6ba9c472-9232-4b45-b12f-a1232d6f4627>
/.shard/.remove_me
<gfid:0f042518-248d-426a-93f4-cfaa92b6ef3e>
Status: Connected
Number of entries: 3

Brick uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick2/brick
<gfid:6ba9c472-9232-4b45-b12f-a1232d6f4627>
/.shard/.remove_me
<gfid:0f042518-248d-426a-93f4-cfaa92b6ef3e>
Status: Connected
Number of entries: 3

root@uk3-prod-gfs-arb-01:~# gluster volume get all cluster.op-version
Option                                  Value
------                                  -----
cluster.op-version                      100000

We're not sure if this is a potential bug or if something's corrupted that we don't have visibility of, so any pointers/suggestions about how to approach this would be appreciated.

Thanks,
Liam
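(One more detail that might be relevant, with the caveat that I'm only sketching it from the documentation: the shard translator is supposed to clean up the entries under .shard/.remove_me in background batches, and the batch size is controlled by the features.shard-deletion-rate volume option, so checking or raising it would look roughly like this:)

gluster volume get gv1 features.shard-deletion-rate       # current background shard-deletion batch size
gluster volume set gv1 features.shard-deletion-rate 500   # purely an example value, not a recommendation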
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users