Re: [Gluster-users] remove-brick removed unexpected bricks

Ravishankar N Tue, 13 Aug 2013 06:49:55 -0700

On 08/13/2013 06:21 PM, Cool wrote:

I'm pretty sure I ran "watch ... remove-brick ... status" until it reported everything as completed before triggering the commit; I should have made that clear in my previous mail.
Actually you can read my mail again - in step #5, files on /sdc1 got migrated instead of /sdd1, even though my command was trying to remove-brick /sdd1,

Ah, my bad. Got it now. This is strange..

this is the root cause (to me) of the problem: data on /sdc1 migrated to /sdb1 and /sdd1, then commit simply removed /sdd1 from gfs_v0. It seems something is wrong with the volume definition information in gluster.
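One rough way to confirm which brick is actually draining is to count the files on each brick directory while the remove-brick is running. The sketch below is an illustration only: `brick_file_count` is a made-up helper name, it assumes the brick path is given without a trailing slash, and the count is approximate because it may include gluster's own internal placeholder files.

```shell
# brick_file_count: count regular files under a brick directory,
# skipping gluster's internal .glusterfs tree at the top of the brick.
brick_file_count() {
    find "$1" -path "$1/.glusterfs" -prune -o -type f -print | wc -l | tr -d ' '
}

# Usage on each node, repeated while the remove-brick is in progress:
#   for b in /sdb1 /sdc1 /sdd1; do echo "$b $(brick_file_count "$b")"; done
```

If the count on /sdc1 falls while the count on /sdd1 rises, the migration is running against a different brick than the one named in the command.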

If you are able to reproduce the issue, does 'gluster volume info' show the correct bricks before and after the start-status-commit operations of removing sdd1? You could also see if there are any error messages in /var/log/glusterfs/<volname>-rebalance.log
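The before/after comparison could be done by saving the brick list at each stage and diffing the snapshots. This is a minimal sketch, assuming 'gluster volume info' prints one "BrickN: host:/path" line per brick; `list_bricks` and the snapshot file names are hypothetical.

```shell
# list_bricks: read `gluster volume info <volname>` output on stdin and
# print one brick (host:/path) per line.
# Assumption: bricks appear as lines of the form "BrickN: host:/path".
list_bricks() {
    sed -n 's/^Brick[0-9][0-9]*: *//p'
}

# Usage (file names are arbitrary):
#   gluster volume info gfs_v0 | list_bricks > bricks.before
#   ... remove-brick start / status / commit ...
#   gluster volume info gfs_v0 | list_bricks > bricks.after
#   diff bricks.before bricks.after
```

After a correct removal of sdd1, the diff should show only the two sdd1 bricks disappearing.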


-Ravi

-C.B.

On 8/12/2013 9:51 PM, Ravishankar N wrote:
On 08/13/2013 03:43 AM, Cool wrote:
remove-brick in 3.4.0 seems to be removing the wrong bricks; can someone help review the environment/steps to see if I did anything stupid?
setup - Ubuntu 12.04 LTS on gfs11 and gfs12, with the following packages from ppa; both nodes have 3 xfs partitions sdb1, sdc1, sdd1:
ii  glusterfs-client  3.4.0final-ubuntu1~precise1  clustered file-system (client package)
ii  glusterfs-common  3.4.0final-ubuntu1~precise1  GlusterFS common libraries and translator modules
ii  glusterfs-server  3.4.0final-ubuntu1~precise1  clustered file-system (server package)
steps to reproduce the problem:
1. create volume gfs_v0 in replica 2 with gfs11:/sdb1 and gfs12:/sdb1
2. add-brick gfs11:/sdc1 and gfs12:/sdc1
3. add-brick gfs11:/sdd1 and gfs12:/sdd1
4. rebalance to make files distributed to all three pairs of disks
5. remove-brick gfs11:/sdd1 and gfs12:/sdd1 start, files on ***/sdc1*** are migrating out
6. remove-brick commit led to data loss in gfs_v0
If between step 5 and 6 I initiate a remove-brick targeting /sdc1, then after commit I would not lose anything since all data will be migrated back to /sdb1.
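For anyone trying to reproduce this, the steps above roughly correspond to the commands printed by the sketch below. It only prints the commands for review and does not run them. The `volume start` line is an assumption (the steps do not mention it), `print_repro_steps` is a made-up name, and the commit is deliberately left out.

```shell
# print_repro_steps: print (not run) the gluster commands matching the
# reproduction steps. Test files would be written through a client mount
# before the rebalance. The commit step is intentionally omitted.
print_repro_steps() {
    vol=${1:-gfs_v0}
    cat <<EOF
gluster volume create $vol replica 2 gfs11:/sdb1 gfs12:/sdb1
gluster volume start $vol
gluster volume add-brick $vol gfs11:/sdc1 gfs12:/sdc1
gluster volume add-brick $vol gfs11:/sdd1 gfs12:/sdd1
gluster volume rebalance $vol start
gluster volume remove-brick $vol gfs11:/sdd1 gfs12:/sdd1 start
gluster volume remove-brick $vol gfs11:/sdd1 gfs12:/sdd1 status
EOF
}
```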
You should ensure that a 'remove-brick start' has completed and then commit it before initiating the second one. The correct way to do this would be:
5.   # gluster volume remove-brick gfs_v0 gfs11:/sdd1 gfs12:/sdd1 start
6.   Check that the data migration has been completed using the status command:
     # gluster volume remove-brick gfs_v0 gfs11:/sdd1 gfs12:/sdd1 status
7.   # gluster volume remove-brick gfs_v0 gfs11:/sdd1 gfs12:/sdd1 commit
8.   # gluster volume remove-brick gfs_v0 gfs11:/sdc1 gfs12:/sdc1 start
9.   # gluster volume remove-brick gfs_v0 gfs11:/sdc1 gfs12:/sdc1 status
10. # gluster volume remove-brick gfs_v0 gfs11:/sdc1 gfs12:/sdc1 commit
This would leave you with the original replica 2 volume that you had begun with. Hope this helps.
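The "wait for status before commit" part of the sequence above could be scripted along these lines. This is only a sketch: it assumes the status output contains the words "in progress", "completed" or "failed" in its status column, and the names `migration_state`, `wait_for_remove_brick`, `GLUSTER` and `POLL_SECONDS` are invented for the example. It never commits on its own.

```shell
GLUSTER="${GLUSTER:-gluster}"        # CLI to invoke; overridable for testing
POLL_SECONDS="${POLL_SECONDS:-10}"   # delay between status checks

# migration_state: classify remove-brick status output read from stdin.
# "in progress" is checked first so that one busy node outweighs
# other nodes that have already completed.
migration_state() {
    case "$(cat)" in
        *"in progress"*) echo in-progress ;;
        *failed*)        echo failed ;;
        *completed*)     echo completed ;;
        *)               echo unknown ;;
    esac
}

# wait_for_remove_brick VOLUME BRICK...: poll status until migration is
# completed (return 0) or in a state that is not safe to commit (return 1).
wait_for_remove_brick() {
    vol=$1; shift
    while :; do
        state=$("$GLUSTER" volume remove-brick "$vol" "$@" status | migration_state)
        case "$state" in
            completed)   return 0 ;;
            in-progress) sleep "$POLL_SECONDS" ;;
            *)           echo "remove-brick state: $state; not safe to commit" >&2
                         return 1 ;;
        esac
    done
}

# Usage:
#   wait_for_remove_brick gfs_v0 gfs11:/sdd1 gfs12:/sdd1 && echo "ready to commit"
```

Leaving the commit as a manual step keeps a human check between the status report and the point where data can be lost.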
Note:
The latest version of glusterfs has the check that prevents a second remove-brick operation until the first one has been committed. (You would receive a message thus: "volume remove-brick start: failed: An earlier remove-brick task exists for volume <volname>. Either commit it or stop it before starting a new task.")
-Ravi
-C.B.
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users

