On 05/28/2017 10:31 PM, Markus Stockhausen wrote:
Hi,

I'm fairly new to gluster and quite happy with it. We are using it in an OVirt environment that stores its VM images on the gluster volume. The setup is as follows; clients mount the volume with the gluster native FUSE protocol.

3 storage nodes: CentOS 7, Gluster 3.8.12 (managed by me), 2 bricks each
5 virtualization nodes: CentOS 7, Gluster 3.8.12 (managed by the OVirt engine)

After today's reboot of one of the storage nodes, the recovery did not finish
successfully. One brick kept reporting an unhealed entry:

[root@cfiler301 dom_md]# gluster volume heal gluster1 info
...
Brick cfilers201:/var/data/brick1/brick
/b1de7818-020b-4f47-938f-f3ebb51836a3/dom_md/ids
Status: Connected
Number of entries: 1
...

The above file is used by sanlock running on the OVirt nodes to handle VM
image locking. Issuing a manual heal with "gluster volume heal gluster1" fixed
the problem, but the unsynced entry reappeared a few seconds later.
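
In case it helps, the sequence looked roughly like this (volume name gluster1 as above; the 10-second pause is just an example):

[root@cfiler301 ~]# gluster volume heal gluster1        # trigger a manual heal of the pending entries
[root@cfiler301 ~]# gluster volume heal gluster1 info   # the 'ids' entry is gone at this point
[root@cfiler301 ~]# sleep 10
[root@cfiler301 ~]# gluster volume heal gluster1 info   # ...and it is listed again shortly afterwards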

My question: should this situation recover automatically, and if so,
what might be the culprit?

Our QE folks have observed this during testing too, but in all cases there was an intermittent disconnect between the fuse mount and the bricks, which left the 'ids' file needing heal (and being healed on reconnect) again and again. Perhaps you should check whether, and why, the mount is getting disconnected from the bricks.
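
A minimal sketch of what to grep for, assuming the default log locations (the client log under /var/log/glusterfs/ is named after the mount path, so the exact filename on your OVirt nodes will differ; "ovirt-node" is just a placeholder hostname):

# on an OVirt node: look for disconnect/reconnect messages in the fuse client log
[root@ovirt-node ~]# grep -iE "disconnect|connected to" /var/log/glusterfs/*.log

# on a storage node: check the brick logs for the matching client disconnects
[root@cfiler301 ~]# grep -i disconnect /var/log/glusterfs/bricks/*.log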

HTH,
Ravi

Best regards.

Markus

P.S. I finally fixed the issue by remounting the filesystems on the OVirt nodes,
which restarted sanlock as well.
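
For completeness, the remount amounted to something like this on each OVirt node (the mount point below is only a placeholder; cfiler301 is one of the storage nodes named above):

# unmount the volume and mount it again over the native fuse protocol
[root@ovirt-node ~]# umount /path/to/gluster1-mountpoint
[root@ovirt-node ~]# mount -t glusterfs cfiler301:/gluster1 /path/to/gluster1-mountpoint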




_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users

