On 10/07/2015 12:06 AM, Lindsay Mathieson wrote:
First up - one of the things that concerns me re gluster is the
incoherent state of documentation. The only docs linked on the main
webpage are for 3.2 and there is almost nothing on how to handle
failure modes such as dead disks/bricks etc, which is one of gluster's
primary functions.
Every link under Documentation at http://gluster.org points to the
gluster.readthedocs.org pages that are all current. Where is this "main
webpage" in which you found links to the old wiki pages?
My problem - I have a replica 2 volume, 2 nodes, 2 bricks (zfs datasets).
As a test, I destroyed one brick (zfs destroy the dataset).
Can't start the datastore1:
volume start: datastore1: failed: Failed to find brick directory
/glusterdata/datastore1 for volume datastore1. Reason : No such file
or directory
A bit disturbing, I was hoping it would work off the remaining brick.
It *is* still working off the remaining brick. It won't start the
missing brick because the missing brick is missing. This is by design.
If, for whatever reason, your brick did not mount, you don't want
gluster to start filling your root device with replication from the
other brick.
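To make the reasoning concrete, here is a minimal shell sketch of that kind of guard. It is not gluster's actual code: the function names are hypothetical, and it only checks that the brick directory exists before anything would be written under it.

```shell
# Hypothetical helpers, not part of gluster. They mirror the safety check
# described above: no brick directory, no brick start.
brick_dir_present() {
    # Succeeds only if the path exists and is a directory.
    [ -d "$1" ]
}

start_brick_if_present() {
    brick="$1"
    if brick_dir_present "$brick"; then
        echo "brick directory found: $brick"
    else
        # Refusing here keeps replicated data from landing on the root
        # device when the brick filesystem did not mount.
        echo "refusing to start: brick directory $brick is missing" >&2
        return 1
    fi
}

# Example call with the path from the error message above.
start_brick_if_present /glusterdata/datastore1 || echo "brick left offline"
```

The surviving brick keeps serving the volume; only the brick whose directory is missing stays down.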
I documented this on my blog at
https://joejulian.name/blog/replacing-a-brick-on-glusterfs-340/ which is
still accurate for the latest version.
The bug report I filed for this was closed without resolution. I assume
there are no plans to ever make this easy for administrators.
https://bugzilla.redhat.com/show_bug.cgi?id=991084
Can't replace the brick:
gluster volume replace-brick datastore1
vnb.proxmox.softlog:/glusterdata/datastore1
vnb.proxmox.softlog:/glusterdata/datastore1-2 commit force
because the store is not running.
After a lot of googling I found list messages referencing the remove
brick command:
gluster volume remove-brick datastore1 replica 2
vnb.proxmox.softlog:/glusterdata/datastore1c commit force
Fails with the unhelpful error:
wrong brick type: commit, use <HOSTNAME>:<export-dir-abs-path>
Usage: volume remove-brick <VOLNAME> [replica <COUNT>] <BRICK> ...
<start|stop|status|commit|force>
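Going by the usage line, the command takes a single trailing keyword, and the error suggests that with "commit force" the word "commit" was parsed as a brick name. A hedged sketch of what a corrected invocation might look like follows. build_remove_brick_cmd is a hypothetical helper that only prints the command for review, and "replica 1" assumes the count given is the one left after dropping one brick of the replica 2 pair.

```shell
# Sketch only: prints the command instead of running it.
# Assumptions: the replica count is the count remaining after removal,
# and the brick path is the one from the replace-brick attempt above.
build_remove_brick_cmd() {
    volume="$1"
    new_replica="$2"
    brick="$3"
    # The usage line allows exactly one trailing keyword, so "force" is
    # passed alone rather than "commit force".
    printf 'gluster volume remove-brick %s replica %s %s force\n' \
        "$volume" "$new_replica" "$brick"
}

build_remove_brick_cmd datastore1 1 vnb.proxmox.softlog:/glusterdata/datastore1
```

Check the printed command against your own volume layout before running it.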
In the end I destroyed and recreated the volume so I could resume
testing, but I have no idea how I would handle a real failed brick in
the future.
--
Lindsay
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users