Hi,

Thanks for your response, please find the xfs_info for each brick on the arbiter below:

root@uk3-prod-gfs-arb-01:~# xfs_info /data/glusterfs/gv1/brick1
meta-data=/dev/sdc1              isize=512    agcount=31, agsize=131007 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=3931899, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

root@uk3-prod-gfs-arb-01:~# xfs_info /data/glusterfs/gv1/brick2
meta-data=/dev/sde1              isize=512    agcount=13, agsize=327616 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=3931899, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

root@uk3-prod-gfs-arb-01:~# xfs_info /data/glusterfs/gv1/brick3
meta-data=/dev/sdd1              isize=512    agcount=13, agsize=327616 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=3931899, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
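As a rough sanity check on those numbers (using the usual XFS sizing assumption that the inode ceiling is about blocks * bsize * imaxpct% / isize; this is my own back-of-the-envelope, not something taken from the output above):

echo $(( 3931899 * 4096 / 4 / 512 ))    # = 7863798, i.e. roughly the 7.5M inode ceiling that df -hi reports for brick1 and brick3 below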
I've also copied below some df output from the arb server:

root@uk3-prod-gfs-arb-01:~# df -hi
Filesystem            Inodes IUsed IFree IUse% Mounted on
udev                    992K   473  991K    1% /dev
tmpfs                   995K   788  994K    1% /run
/dev/sda1               768K  105K  664K   14% /
tmpfs                   995K     3  995K    1% /dev/shm
tmpfs                   995K     4  995K    1% /run/lock
tmpfs                   995K    18  995K    1% /sys/fs/cgroup
/dev/sdb1               128K   113  128K    1% /var/lib/glusterd
/dev/sdd1               7.5M  2.6M  5.0M   35% /data/glusterfs/gv1/brick3
/dev/sdc1               7.5M  600K  7.0M    8% /data/glusterfs/gv1/brick1
/dev/sde1               6.4M  2.9M  3.5M   46% /data/glusterfs/gv1/brick2
uk1-prod-gfs-01:/gv1    150M  6.5M  144M    5% /mnt/gfs
tmpfs                   995K    21  995K    1% /run/user/1004

root@uk3-prod-gfs-arb-01:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
udev                  3.9G     0  3.9G   0% /dev
tmpfs                 796M  916K  795M   1% /run
/dev/sda1              12G  3.9G  7.3G  35% /
tmpfs                 3.9G  8.0K  3.9G   1% /dev/shm
tmpfs                 5.0M     0  5.0M   0% /run/lock
tmpfs                 3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sdb1             2.0G  456K  1.9G   1% /var/lib/glusterd
/dev/sdd1              15G   12G  3.5G  78% /data/glusterfs/gv1/brick3
/dev/sdc1              15G  2.6G   13G  18% /data/glusterfs/gv1/brick1
/dev/sde1              15G   14G  1.8G  89% /data/glusterfs/gv1/brick2
uk1-prod-gfs-01:/gv1  300G  139G  162G  47% /mnt/gfs
tmpfs                 796M     0  796M   0% /run/user/1004

Something I forgot to mention in my initial message is that the op-version was upgraded from 70200 to 100000, which seems like it could also have been a trigger for the issue.

Thanks,

Liam Smith
Linux Systems Support Engineer

________________________________
From: Strahil Nikolov <hunter86...@yahoo.com>
Sent: 03 July 2023 18:28
To: Liam Smith <liam.sm...@ek.co>; gluster-users@gluster.org <gluster-users@gluster.org>
Subject: Re: [Gluster-users] remove_me files building up

Hi,

you mentioned that the arbiter bricks run out of inodes.
Are you using XFS?
Can you provide the xfs_info of each brick?

Best Regards,
Strahil Nikolov

On Sat, Jul 1, 2023 at 19:41, Liam Smith <liam.sm...@ek.co> wrote:

Hi,

We're running a cluster with two data nodes and one arbiter, and have sharding enabled.

We had an issue a while back where one of the servers crashed; we got the server back up and running, ensured that all healing entries cleared, and also increased the server spec (CPU/Mem), as this seemed to be the potential cause.

Since then, however, we've seen some strange behaviour, whereby a lot of 'remove_me' files are building up under `/data/glusterfs/gv1/brick2/brick/.shard/.remove_me/` and `/data/glusterfs/gv1/brick3/brick/.shard/.remove_me/`.

This is causing the arbiter to run out of space on brick2 and brick3, as the remove_me files are constantly increasing. brick1 appears to be fine: its disk usage increases throughout the day and drops back down, in line with the trend of the brick on the data nodes.

We see the disk usage increase and drop throughout the day on the data nodes for brick2 and brick3 as well, but while the arbiter follows the same trend of increasing disk usage, it doesn't drop at any point.
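(A rough way to gauge that backlog per brick, just a sketch using the paths above; `ls -f` avoids sorting the very large directories, and the count includes the '.' and '..' entries:)

for b in /data/glusterfs/gv1/brick1 /data/glusterfs/gv1/brick2 /data/glusterfs/gv1/brick3; do
  echo "$b: $(ls -f "$b"/brick/.shard/.remove_me 2>/dev/null | wc -l) entries under .shard/.remove_me"
done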
This is the output of some gluster commands; occasional heal entries come and go:

root@uk3-prod-gfs-arb-01:~# gluster volume info gv1

Volume Name: gv1
Type: Distributed-Replicate
Volume ID: d3d1fdec-7df9-4f71-b9fc-660d12c2a046
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: uk1-prod-gfs-01:/data/glusterfs/gv1/brick1/brick
Brick2: uk2-prod-gfs-01:/data/glusterfs/gv1/brick1/brick
Brick3: uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick1/brick (arbiter)
Brick4: uk1-prod-gfs-01:/data/glusterfs/gv1/brick3/brick
Brick5: uk2-prod-gfs-01:/data/glusterfs/gv1/brick3/brick
Brick6: uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick3/brick (arbiter)
Brick7: uk1-prod-gfs-01:/data/glusterfs/gv1/brick2/brick
Brick8: uk2-prod-gfs-01:/data/glusterfs/gv1/brick2/brick
Brick9: uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick2/brick (arbiter)
Options Reconfigured:
cluster.entry-self-heal: on
cluster.metadata-self-heal: on
cluster.data-self-heal: on
performance.client-io-threads: off
storage.fips-mode-rchecksum: on
transport.address-family: inet
cluster.lookup-optimize: off
performance.readdir-ahead: off
cluster.readdir-optimize: off
cluster.self-heal-daemon: enable
features.shard: enable
features.shard-block-size: 512MB
cluster.min-free-disk: 10%
cluster.use-anonymous-inode: yes

root@uk3-prod-gfs-arb-01:~# gluster peer status
Number of Peers: 2

Hostname: uk2-prod-gfs-01
Uuid: 2fdfa4a2-195d-4cc5-937c-f48466e76149
State: Peer in Cluster (Connected)

Hostname: uk1-prod-gfs-01
Uuid: 43ec93d1-ad83-4103-aea3-80ded0903d88
State: Peer in Cluster (Connected)

root@uk3-prod-gfs-arb-01:~# gluster volume heal gv1 info
Brick uk1-prod-gfs-01:/data/glusterfs/gv1/brick1/brick
<gfid:5b57e1f6-3e3d-4334-a0db-b2560adae6d1>
Status: Connected
Number of entries: 1

Brick uk2-prod-gfs-01:/data/glusterfs/gv1/brick1/brick
Status: Connected
Number of entries: 0

Brick uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick1/brick
Status: Connected
Number of entries: 0

Brick uk1-prod-gfs-01:/data/glusterfs/gv1/brick3/brick
Status: Connected
Number of entries: 0

Brick uk2-prod-gfs-01:/data/glusterfs/gv1/brick3/brick
Status: Connected
Number of entries: 0

Brick uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick3/brick
Status: Connected
Number of entries: 0

Brick uk1-prod-gfs-01:/data/glusterfs/gv1/brick2/brick
Status: Connected
Number of entries: 0

Brick uk2-prod-gfs-01:/data/glusterfs/gv1/brick2/brick
<gfid:6ba9c472-9232-4b45-b12f-a1232d6f4627>
/.shard/.remove_me
<gfid:0f042518-248d-426a-93f4-cfaa92b6ef3e>
Status: Connected
Number of entries: 3

Brick uk3-prod-gfs-arb-01:/data/glusterfs/gv1/brick2/brick
<gfid:6ba9c472-9232-4b45-b12f-a1232d6f4627>
/.shard/.remove_me
<gfid:0f042518-248d-426a-93f4-cfaa92b6ef3e>
Status: Connected
Number of entries: 3

root@uk3-prod-gfs-arb-01:~# gluster volume get all cluster.op-version
Option                                  Value
------                                  -----
cluster.op-version                      100000

We're not sure if this is a potential bug or if something's corrupted that we don't have visibility of, so any pointers/suggestions about how to approach this would be appreciated.

Thanks,
Liam
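(One more detail that might be relevant, with the caveat that I'm only sketching it from the documentation: the shard translator is supposed to clean up the entries under .shard/.remove_me in background batches, and the batch size is controlled by the features.shard-deletion-rate volume option, so checking or raising it would look roughly like this:)

gluster volume get gv1 features.shard-deletion-rate       # current background shard-deletion batch size
gluster volume set gv1 features.shard-deletion-rate 500   # purely an example value, not a recommendation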
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users