Hi Stefan,

I think what you propose will work, though you should test it thoroughly.
More generally, I think "the GlusterFS way" would be to use 2-way replication instead of a distributed volume; then you can lose one of your servers without an outage and re-synchronize when it comes back up. Chances are that if you weren't using the SAN volumes, you could have purchased two servers, each with enough local disk to hold a full copy of the data, for less money. I've appended two rough command sketches below your quoted message: one for a replicated setup, and one for failing the brick back once the repaired server returns.

Regards,
Alex

On Mon, Dec 11, 2017 at 12:52 PM, Stefan Solbrig <stefan.solb...@ur.de> wrote:
> Dear all,
>
> I'm rather new to glusterfs but have some experience running larger Lustre
> and BeeGFS installations. These filesystems provide active/active
> failover. Now I discovered that I can also do this in glusterfs, although
> I didn't find detailed documentation about it. (I'm using glusterfs 3.10.8.)
>
> So my question is: can I really use glusterfs to do failover in the way
> described below, or am I misusing glusterfs (and potentially corrupting my
> data)?
>
> My setup is: I have two servers (qlogin and gluster2) that access a shared
> SAN storage. Both servers connect to the same SAN (SAS multipath), and I
> implement locking via lvm2 and sanlock, so I can mount the same storage on
> either server.
> The idea is that normally each server serves one brick, but in case one
> server fails, the other server can serve both bricks. (I'm not interested
> in automatic failover; I'll always do this manually. I could also use this
> to do maintenance on one server, with only minimal downtime.)
>
>
> # normal setup:
> [root@qlogin ~]# gluster volume info g2
> # ...
> # Volume Name: g2
> # Type: Distribute
> # Brick1: qlogin:/glust/castor/brick
> # Brick2: gluster2:/glust/pollux/brick
>
> # failover: let's artificially fail one server by killing one glusterfsd:
> [root@qlogin] systemctl status glusterd
> [root@qlogin] kill -9 <pid/of/glusterfsd/running/brick/castor>
>
> # unmount the brick
> [root@qlogin] umount /glust/castor/
>
> # deactivate the LV
> [root@qlogin] lvchange -a n vgosb06vd05/castor
>
>
> ### now do the failover:
>
> # activate the same storage on the other server:
> [root@gluster2] lvchange -a y vgosb06vd05/castor
>
> # mount on the other server
> [root@gluster2] mount /dev/mapper/vgosb06vd05-castor /glust/castor
>
> # now move the "failed" brick to the other server
> [root@gluster2] gluster volume replace-brick g2 qlogin:/glust/castor/brick gluster2:/glust/castor/brick commit force
> ### The last line is the one I have doubts about.
>
> # now I'm in failover state:
> # both bricks on one server:
> [root@qlogin ~]# gluster volume info g2
> # ...
> # Volume Name: g2
> # Type: Distribute
> # Brick1: gluster2:/glust/castor/brick
> # Brick2: gluster2:/glust/pollux/brick
>
>
> Is it intended to work this way?
>
> Thanks a lot!
>
> Best wishes,
> Stefan
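PS: For reference, here is a minimal sketch of what the replicated alternative could look like if you were setting the volume up from scratch. The host names and brick paths (qlogin, gluster2, /glust/castor/brick, /glust/pollux/brick) are just reused from your mail as an example; with replication, each brick would have to sit on local disk big enough for a full copy of the data.

# create a 2-way replicated volume instead of a distributed one
gluster volume create g2 replica 2 qlogin:/glust/castor/brick gluster2:/glust/pollux/brick
gluster volume start g2

# after a failed server comes back, trigger the re-sync and watch it finish
gluster volume heal g2
gluster volume heal g2 info

Be aware that plain replica 2 can run into split-brain situations; if you can spare a third (small) node, a replica 3 or replica-2-plus-arbiter layout is usually recommended.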
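PPS: For completeness, failing the brick back once the repaired server is available again should just be the mirror image of your procedure. This is only a sketch using your host names and the vgosb06vd05/castor LV; I have not tested it against your SAN/sanlock setup.

# on gluster2: stop the brick process, unmount, and release the LV
[root@gluster2] kill <pid/of/glusterfsd/running/brick/castor>
[root@gluster2] umount /glust/castor/
[root@gluster2] lvchange -a n vgosb06vd05/castor

# on qlogin: re-activate the LV, mount it, and take the brick back
[root@qlogin] lvchange -a y vgosb06vd05/castor
[root@qlogin] mount /dev/mapper/vgosb06vd05-castor /glust/castor
[root@qlogin] gluster volume replace-brick g2 gluster2:/glust/castor/brick qlogin:/glust/castor/brick commit force

# verify that both bricks are online and served by the right hosts
[root@qlogin] gluster volume status g2
[root@qlogin] gluster volume info g2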