On 10/30/2013 09:10 PM, Brian Cipriano wrote:
I had the exact same experience recently with a 3.4 distributed
cluster I set up. I spent some time on the IRC but couldn’t track it
down. Seems remove-brick is broken in 3.3 and 3.4. I guess folks don’t
remove bricks very often :)
- brian
Brian,
I have tried remove-brick a couple of times and it worked for me. From
your experience it seems remove-brick has a bug. I would suggest you
file a bug, or give us steps to reproduce so that I can reproduce it
in my environment and file a bug for it.
On Oct 30, 2013, at 11:21 AM, Lalatendu Mohanty <[email protected]
<mailto:[email protected]>> wrote:
On 10/30/2013 08:40 PM, Lalatendu Mohanty wrote:
On 10/30/2013 03:43 PM, B.K.Raghuram wrote:
I have gluster 3.4.1 on 4 boxes with hostnames n9, n10, n11, n12. I
did the following sequence of steps and ended up with losing data so
what did I do wrong?!
- Create a distributed volume with bricks on n9 and n10
- Started the volume
- NFS mounted the volume and created 100 files on it. Found that n9
had 45, n10 had 55
- Added a brick n11 to this volume
- Removed brick n10 from the volume with gluster volume remove-brick
<vol> <n10 brick name> start
- n9 now has 45 files, n10 has 55 files and n11 has 45 files (all the
same as on n9)
- Checked status; it showed no rebalanced files, but n10 had scanned
100 files and completed. 0 scanned for all the others
- I then did a rebalance start force on the volume and found that n9
had 0 files, n10 had 55 files and n11 had 45 files. Weird - it looked
like n9 had been removed, but I double-checked and found that n10 had
indeed been removed.
- Did a remove-brick commit. Same file distribution after that.
volume info now shows the volume to have n9 and n11 as bricks.
- Did a rebalance start again on the volume. The rebalance status now
shows n11 with 45 rebalanced files; all the brick nodes show 45 files
scanned and all show complete. The file layout after this is: n9 has
45 files, n10 has 55 files, and n11 has 0 files!
- An ls on the NFS mount now shows only 45 files; the other 55 are not
visible because they are on n10, which is no longer part of the volume!
What have I done wrong in this sequence?
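In case it helps reproduce this, the sequence above would look roughly like the following (brick paths here are placeholders - /export/brick1 is a stand-in, not the actual path used):

```shell
# Hypothetical reconstruction of the sequence described above.
# Hostnames n9-n11 are from the report; brick paths are placeholders.
gluster volume create testvol n9:/export/brick1 n10:/export/brick1
gluster volume start testvol

# NFS-mount the volume and create 100 files on it (45 land on n9, 55 on n10)
mount -t nfs n9:/testvol /mnt/testvol

gluster volume add-brick testvol n11:/export/brick1
gluster volume remove-brick testvol n10:/export/brick1 start
gluster volume remove-brick testvol n10:/export/brick1 status

# The rebalance below ran BEFORE the commit - the suspected problem step
gluster volume rebalance testvol start force
gluster volume remove-brick testvol n10:/export/brick1 commit
gluster volume rebalance testvol start
```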
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
I think running rebalance (force) in between "remove-brick start" and
"remove-brick commit" is the issue. Can you please paste your
commands in the order you ran them? That would make it clearer.
Below are the steps I use to replace a brick, and they work for me:
1. gluster volume add-brick VOLNAME NEW-BRICK
2. gluster volume remove-brick VOLNAME BRICK start
3. gluster volume remove-brick VOLNAME BRICK status
4. gluster volume remove-brick VOLNAME BRICK commit
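Concretely, with hypothetical hostnames and brick paths, that looks like the following - the important part is waiting for the status to report completed before committing:

```shell
# Hypothetical example; volume name, hostnames and brick paths are
# placeholders, not from the original report.
VOL=myvol
NEW=n11:/export/brick1
OLD=n10:/export/brick1

gluster volume add-brick $VOL $NEW
gluster volume remove-brick $VOL $OLD start

# Poll status until the data migration reports "completed",
# then (and only then) commit the removal.
until gluster volume remove-brick $VOL $OLD status | grep -q completed; do
    sleep 10
done
gluster volume remove-brick $VOL $OLD commit
```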
I would also suggest using distributed-replicate volumes, so that you
always have a replica copy; that reduces the probability of losing
data.
-Lala