Am testing replacing a brick in a replica 3 test volume. Gluster 3.7.11. The volume hosts two VMs. Three nodes: vna, vnb and vng.

*First off, I tried removing and re-adding a brick.*

gluster v remove-brick test1 replica 2 vng.proxmox.softlog:/tank/vmdata/test1 force

That worked fine; the VMs (running on another node) kept going without a hiccup.
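For reference, the state after the remove can be confirmed with the normal CLI (volume name test1 taken from the replace-brick command further down):

# volume should now show replica 2 with only the vna and vnb bricks
gluster v info test1

# remaining brick processes should still be online
gluster v status test1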


I deleted /tank/vmdata/test1, then ran:

gluster v add-brick test1 replica 3 vng.proxmox.softlog:/tank/vmdata/test1 force


That succeeded, and heal statistics immediately showed 3000+ shards being healed on vna and vnb.

Unfortunately they also showed hundreds of shards being healed on vng, which should not have been happening since vng had no data on it. Basically a reverse heal.
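The counts above are just from the standard heal CLI; for anyone reproducing this, something along these lines shows the per-brick numbers and makes the reverse heals on vng obvious:

# running count of entries pending heal on each brick
gluster v heal test1 statistics heal-count

# list the shards/gfids queued for heal per brick - anything
# listed under the vng brick is a "reverse" heal
gluster v heal test1 info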

Eventually all the heals completed, but the VMs were hopelessly corrupted.

*Then I retried the above, but with all VMs shut down*
i.e. no reads or writes happening on the volume.

This worked: all the shards on vna & vnb healed, nothing in reverse. Once the heal completed, the data (the VMs) was fine.
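(Heal completion here just means the pending-heal count reaching zero on every brick before the VMs are started again, e.g. as reported by:

gluster v heal test1 info
gluster v heal test1 statistics heal-count
)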

Unfortunately this isn't practical in production; we can't bring all the VMs down for the 1-2 days it would take to heal.


*Replacing the brick*

I tried this: killed the glusterfsd process on vng, then ran

gluster v replace-brick test1 vng.proxmox.softlog:/tank/vmdata/test1 vng.proxmox.softlog:/tank/vmdata/test1.1 commit force

The vna & vnb shards started healing, but vng showed 5 reverse heals happening. Eventually it got down to 4-5 shards needing healing on each brick and stopped, and they didn't go away until I removed the test1.1 brick.
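For clarity, "killed the glusterfsd process" above just means finding the vng brick's PID for this volume and killing it by hand, roughly:

# the Pid column in the status output identifies the vng brick process
gluster v status test1

# then kill that specific glusterfsd (PID will obviously vary)
kill <pid-of-vng-brick>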

*Currently the replace-brick process seems to be unusable except when the volume is not in use.*

--
Lindsay Mathieson

