On 05/28/2017 10:31 PM, Markus Stockhausen wrote:
Hi,
I'm fairly new to Gluster and quite happy with it. We are using it in an oVirt
environment that stores its VM images on the Gluster volume. The setup is as
follows:
3 storage nodes: CentOS 7, Gluster 3.8.12 (managed by me), 2 bricks each
5 virtualization nodes: CentOS 7, Gluster 3.8.12 (managed by the oVirt engine)
The clients mount the volume with the Gluster native FUSE protocol.
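For reference, such a FUSE mount looks roughly like the following (server and
volume name are from this setup, the mount point is only an example since oVirt
manages its own paths):

# mount the volume via the Gluster FUSE client
mount -t glusterfs cfilers201:/gluster1 /mnt/gluster1

# or the equivalent /etc/fstab entry
cfilers201:/gluster1  /mnt/gluster1  glusterfs  defaults,_netdev  0 0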
After today's reboot of one of the storage nodes the recovery did not finish
successfully. The heal state of one brick remained stuck at:
[root@cfiler301 dom_md]# gluster volume heal gluster1 info
...
Brick cfilers201:/var/data/brick1/brick
/b1de7818-020b-4f47-938f-f3ebb51836a3/dom_md/ids
Status: Connected
Number of entries: 1
...
The above file is used by sanlock running on the oVirt nodes to handle VM
image locking. Issuing a manual heal with "gluster volume heal gluster1" fixed
the problem, but the unsynced entry reappeared a few seconds later.
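What I did, roughly (commands as I ran them; the watch interval is arbitrary):

# trigger a heal of all pending entries on the volume
gluster volume heal gluster1

# keep watching the pending entries; the ids file showed up again shortly after
watch -n 5 'gluster volume heal gluster1 info'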
My question: should this situation recover automatically, and if so, what
might be the culprit?
Our QE folks have observed this while testing too, but in all cases there was
an intermittent disconnect between the FUSE mount and the bricks, which led to
the 'ids' file needing heal (and being healed on reconnect) again and again.
Perhaps you should check if and why the mount is getting disconnected from the
bricks.
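Something along these lines should show it (the client log file name depends on
the mount point, the path below is just an example):

# on the client, the FUSE mount log records broken connections to the bricks
grep -i disconnect /var/log/glusterfs/mnt-gluster1.log

# from a storage node, list which clients are currently connected to the bricks
gluster volume status gluster1 clients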
HTH,
Ravi
Best regards,
Markus
P.S. I finally fixed the issue by remounting the filesystems (on the oVirt
nodes), so sanlock was restarted too.
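For the record, the remount on the virtualization nodes was essentially the
following (mount point shown as a placeholder, since oVirt manages its own
paths there):

# on each oVirt node
umount <ovirt-gluster-mountpoint>
mount -t glusterfs cfilers201:/gluster1 <ovirt-gluster-mountpoint>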
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users