That is probably the case as a lot of files were deleted some time ago.

I'm on version 5.2 but was on 3.12 until about a week ago.

Here is the quorum info.  I'm running a distributed replicated volume in a 2 x 3 = 6 configuration.

cluster.quorum-type auto
cluster.quorum-count (null)
cluster.server-quorum-type off
cluster.server-quorum-ratio 0
cluster.quorum-reads                    no

Where exactly do I remove the gfid entries from - the .glusterfs directory?  Do I just delete all the directories and files under this directory?

Where do I put the cluster.heal-timeout option - which file?

I think you've hit on the cause of the issue.  Thinking back, we've had some extended power outages, and due to a misconfiguration in the swap file device name, a couple of the nodes did not come up.  I didn't catch it for a while, so the deletes may have occurred then.

Thank you.

On 12/31/18 2:58 AM, Davide Obbi wrote:
If the long GFID does not correspond to any file, it could mean the file was deleted by the client mounting the volume. I think this happens when the delete was issued while the number of active bricks did not reach quorum majority, or when a second brick was taken down while another was still down or had not finished the self-heal - the latter being more likely.
It would be interesting to see:
- which version of glusterfs you're running - it happened to me on 3.12
- volume quorum rules: "gluster volume get vol all | grep quorum"

To clean it up, if I remember correctly, it should be possible to delete the gfid entries from the brick mounts on the glusterfs server nodes reporting the files to heal.
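As a sketch of what that lookup looks like (the gfid and brick path below are placeholder values, not taken from your cluster): on each brick, a gfid entry lives under .glusterfs at a path derived from the first four hex characters of the gfid, so you would remove the specific entries reported by heal info rather than wiping the whole .glusterfs directory:

```shell
# Sketch only - GFID and BRICK are hypothetical placeholders.
GFID="aabbccdd-1122-3344-5566-77889900aabb"   # as reported by "gluster volume heal <vol> info"
BRICK="/data/brick1"                          # your actual brick mount point

# GlusterFS stores each gfid entry at .glusterfs/<first 2 hex chars>/<next 2 hex chars>/<gfid>
GFID_PATH="$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"
echo "$GFID_PATH"
```

Before deleting anything, it is worth checking the entry's hard-link count (e.g. with stat) to confirm it really is an orphan with no corresponding file on the brick.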

As a side note, you might want to consider changing the self-heal timeout to a more aggressive schedule via the cluster.heal-timeout option.
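For what it's worth, cluster.heal-timeout is a volume option set through the gluster CLI rather than edited in a file; something like the following (the volume name "gv0" is a placeholder, and the interval is in seconds):

```shell
# Hypothetical volume name; sets the self-heal daemon crawl interval to 120s
gluster volume set gv0 cluster.heal-timeout 120
# Confirm the value took effect
gluster volume get gv0 cluster.heal-timeout
```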
_______________________________________________
Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
