On 06/18/2013 11:43 AM, [email protected] wrote:
Hello,
While trying to recover from a failed node and replace its brick with a
spare one, I have trashed my cluster and it is now stuck.
Any ideas on how to reintroduce/remove those nodes and bring peace and
order back to the cluster?
There was a pending replace-brick operation from 0031 to 0028 (it is
still not committed according to the rbstate file).
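(For anyone wanting to check: glusterd keeps the replace-brick state in
a per-volume rbstate file, typically /var/lib/glusterd/vols/<VOLNAME>/rbstate,
so for this volume something like:

    cat /var/lib/glusterd/vols/glustervmstore/rbstate

should show an rb_status entry other than 0 while an operation is pending.)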
There was a hardware failure on node 0022.
I was not able to commit the replace-brick for 0031 because 0022 was not
responding and would not grant the cluster lock to the requesting node.
I was not able to start the replacement of 0022 with 0028 because of the
pending brick replacement.
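(For completeness: replace-brick also has an abort sub-command that is
meant to clear a pending operation, along the lines of:

    gluster volume replace-brick glustervmstore 0031:/mnt/vmstore/brick 0028:/mnt/vmstore/brick abort

though with 0022 down and holding the cluster lock, it would presumably
have failed for the same reason as the commit.)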
I forced peer removal from the cluster, hoping that afterwards I would
be able to complete the operations. Unfortunately I removed not only
0022 but also 0031.
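(The forced removals were done with peer detach, i.e. something like:

    gluster peer detach 0022 force
    gluster peer detach 0031 force

the detach of 0031 being the unintended one.)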
I have peer probed 0031 again, successfully, and both gluster volume
info and gluster volume status still list the 0031 node. But when I
attempt a brick operation I get:
    gluster volume remove-brick glustervmstore 0031:/mnt/vmstore/brick 0036:/mnt/vmstore/brick force
    Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
    Incorrect brick 0031:/mnt/vmstore/brick for volume glustervmstore

    gluster volume replace-brick glustervmstore 0031:/mnt/vmstore/brick 0028:/mnt/vmstore/brick commit force
    brick: 0031:/mnt/vmstore/brick does not exist in volume: glustervmstore
It looks like these commands are being rejected by a node whose volume
information is not current. Can you please provide the glusterd logs
from the node where these commands were issued?
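(On a default installation the glusterd log should be
/var/log/glusterfs/etc-glusterfs-glusterd.vol.log, unless glusterd was
started with a different log file.)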
Thanks,
Vijay