Am testing replacing a brick in a replica 3 test volume. Gluster 3.7.11. The volume hosts two VMs. Three nodes: vna, vnb and vng.

*First off, I tried removing and re-adding a brick.*

gluster v remove-brick test1 replica 2 vng.proxmox.softlog:/tank/vmdata/test1 force

That worked fine; the VMs (running on another node) kept going without a hiccup.
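For reference, the state after the remove can be confirmed with the normal CLI (volume name test1 taken from the replace-brick command further down):

# volume should now show replica 2 with only the vna and vnb bricks
gluster v info test1

# remaining brick processes should still be online
gluster v status test1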


I deleted /tank/vmdata/test1, then ran:

gluster v add-brick test1 replica 3 vng.proxmox.softlog:/tank/vmdata/test1 force


That succeeded, and heal statistics immediately showed 3000+ shards being healed on vna and vnb.

Unfortunately they also showed hundreds of shards being healed on vng, which should not have been happening since vng had no data on it. Basically a reverse heal.
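The counts above are just from the standard heal CLI; for anyone reproducing this, something along these lines shows the per-brick numbers and makes the reverse heals on vng obvious:

# running count of entries pending heal on each brick
gluster v heal test1 statistics heal-count

# list the shards/gfids queued for heal per brick - anything
# listed under the vng brick is a "reverse" heal
gluster v heal test1 info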

Eventually all the heals completed, but the VMs were hopelessly corrupted.

*Then I retried the above, but with all VMs shut down*
i.e. no reads or writes happening on the volume.

This worked: all the shards on vna & vnb healed, nothing in reverse. Once the heal completed, the data (the VMs) was fine.
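(Heal completion here just means the pending-heal count reaching zero on every brick before the VMs are started again, e.g. as reported by:

gluster v heal test1 info
gluster v heal test1 statistics heal-count
)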

Unfortunately this isn't practical in production; we can't bring all the VMs down for the 1-2 days it would take to heal.


*Replacing the brick*

I tried this: killed the glusterfsd process on vng, then ran

gluster v replace-brick test1 vng.proxmox.softlog:/tank/vmdata/test1 vng.proxmox.softlog:/tank/vmdata/test1.1 commit force

The vna & vnb shards started healing, but vng showed 5 reverse heals happening. Eventually it got down to 4-5 shards needing healing on each brick and stopped, and they didn't go away until I removed the test1.1 brick.
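For clarity, "killed the glusterfsd process" above just means finding the vng brick's PID for this volume and killing it by hand, roughly:

# the Pid column in the status output identifies the vng brick process
gluster v status test1

# then kill that specific glusterfsd (PID will obviously vary)
kill <pid-of-vng-brick>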

*Currently the replace-brick process seems to be unusable except when the volume is not in use.*

--
Lindsay Mathieson

