Re: [ovirt-users] Rebooting gluster nodes make VMs pause due to storage error
On Tue, Oct 27, 2015 at 4:59 PM, wrote:
> Hi,
>
> We're using oVirt 3.5.3.1, and as storage backend we use GlusterFS. We
> added a Storage Domain with the path "gluster.fqdn1:/volume", and as
> options, we used "backup-volfile-servers=gluster.fqdn2". We now need to
> restart both gluster.fqdn1 and gluster.fqdn2 machines due to a system
> update (not at the same time, obviously). We're worried because in
> previous attempts, when we restarted the main gluster node
> (gluster.fqdn1 in this case), all the VMs running against that storage
> backend got paused due to storage errors, and we couldn't resume them
> and finally had to power them off the hard way and start them again.
>
> Gluster version on gluster.fqdn1 and gluster.fqdn2 is 3.6.3-1.
>
> Gluster configuration for that volume is:
>
> Volume Name: volume
> Type: Replicate
> Volume ID: a2d7e52c-2f63-4e72-9635-4e311baae6ff
> Status: Started
> Number of Bricks: 1 x 2 = 2

This is replica 2 - not supported.

> Transport-type: tcp
> Bricks:
> Brick1: gluster.fqdn1:/gluster/brick_01/brick
> Brick2: gluster.fqdn2:/gluster/brick_01/brick
> Options Reconfigured:
> storage.owner-gid: 36
> storage.owner-uid: 36
> cluster.server-quorum-type: server
> cluster.quorum-type: none
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
>
> We would like to know if there's a "clean" way to do such a procedure.
> We know that pausing all the VMs and then restarting the gluster nodes
> works with no harm, but the downtime of the VMs is important to us and
> we would like to avoid it, especially when we have 2 gluster nodes for
> that.

You should use replica 3. That configuration should be able to survive a
reboot of one of the nodes. If two nodes are down, the file system will
become read-only, and VMs will pause because of write errors.

Adding Sahina.

Nir

> Any hints are appreciated,
>
> Thanks.
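For reference, a rough sketch of how such a volume could be grown to replica 3 and client quorum enabled (the third host gluster.fqdn3 and its brick path are hypothetical, and the syntax should be checked against the gluster documentation for your version):

  # Run from an existing gluster node; gluster.fqdn3 is a hypothetical third
  # server with a prepared, empty brick directory.
  gluster peer probe gluster.fqdn3
  gluster volume add-brick volume replica 3 gluster.fqdn3:/gluster/brick_01/brick

  # With three replicas, client-side quorum can be switched from "none" to
  # "auto", so writes are only blocked when a majority of bricks is unreachable.
  gluster volume set volume cluster.quorum-type auto

  # Wait until the new brick has been fully populated before relying on it.
  gluster volume heal volume info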
Re: [ovirt-users] Rebooting gluster nodes make VMs pause due to storage error
- Original Message -
> From: nico...@devels.es
> To: users@ovirt.org
> Sent: Tuesday, October 27, 2015 4:59:31 PM
> Subject: [ovirt-users] Rebooting gluster nodes make VMs pause due to storage error
>
> Hi,
>
> We're using oVirt 3.5.3.1, and as storage backend we use GlusterFS. We
> added a Storage Domain with the path "gluster.fqdn1:/volume", and as
> options, we used "backup-volfile-servers=gluster.fqdn2". We now need to
> restart both gluster.fqdn1 and gluster.fqdn2 machines due to a system
> update (not at the same time, obviously). We're worried because in
> previous attempts, when we restarted the main gluster node
> (gluster.fqdn1 in this case), all the VMs running against that storage
> backend got paused due to storage errors, and we couldn't resume them
> and finally had to power them off the hard way and start them again.
>
> Gluster version on gluster.fqdn1 and gluster.fqdn2 is 3.6.3-1.
>
> Gluster configuration for that volume is:
>
> Volume Name: volume
> Type: Replicate
> Volume ID: a2d7e52c-2f63-4e72-9635-4e311baae6ff
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: gluster.fqdn1:/gluster/brick_01/brick
> Brick2: gluster.fqdn2:/gluster/brick_01/brick
> Options Reconfigured:
> storage.owner-gid: 36
> storage.owner-uid: 36
> cluster.server-quorum-type: server
> cluster.quorum-type: none
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
>
> We would like to know if there's a "clean" way to do such a procedure.
> We know that pausing all the VMs and then restarting the gluster nodes
> works with no harm, but the downtime of the VMs is important to us and
> we would like to avoid it, especially when we have 2 gluster nodes for
> that.
>
> Any hints are appreciated,
>
> Thanks.

Hi Nicolas,

I'd suggest asking on the gluster mailing list as well: unless there is
some misconfiguration, or a problematic version of one of the components
is in use, the scenario you described should work (assuming you wait for
the server's heal process to finish after it comes back up). I'd also
advise using a 3-node cluster.

Regardless of your question, I suggest you take a look at the gluster
virt store use case page:
http://www.gluster.org/community/documentation/index.php/Virt-store-usecase
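As a rough outline of the "wait for the heal process" part, the per-node checks could look something like this (volume name taken from the thread; untested against this setup):

  # From any node in the trusted pool, before rebooting a server:
  gluster peer status              # all peers should show "Peer in Cluster (Connected)"
  gluster volume status volume     # all brick processes should be online
  gluster volume heal volume info  # should report 0 entries pending heal on every brick

  # Reboot one node, wait for it to rejoin the pool and for the heal count
  # to drop back to 0, then repeat the same steps for the next node.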
[ovirt-users] Rebooting gluster nodes make VMs pause due to storage error
Hi,

We're using oVirt 3.5.3.1, and as storage backend we use GlusterFS. We
added a Storage Domain with the path "gluster.fqdn1:/volume", and as
options, we used "backup-volfile-servers=gluster.fqdn2". We now need to
restart both gluster.fqdn1 and gluster.fqdn2 machines due to a system
update (not at the same time, obviously). We're worried because in
previous attempts, when we restarted the main gluster node
(gluster.fqdn1 in this case), all the VMs running against that storage
backend got paused due to storage errors, and we couldn't resume them
and finally had to power them off the hard way and start them again.

Gluster version on gluster.fqdn1 and gluster.fqdn2 is 3.6.3-1.

Gluster configuration for that volume is:

Volume Name: volume
Type: Replicate
Volume ID: a2d7e52c-2f63-4e72-9635-4e311baae6ff
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: gluster.fqdn1:/gluster/brick_01/brick
Brick2: gluster.fqdn2:/gluster/brick_01/brick
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
cluster.server-quorum-type: server
cluster.quorum-type: none
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off

We would like to know if there's a "clean" way to do such a procedure.
We know that pausing all the VMs and then restarting the gluster nodes
works with no harm, but the downtime of the VMs is important to us and
we would like to avoid it, especially when we have 2 gluster nodes for
that.

Any hints are appreciated,

Thanks.
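For context, the storage domain described above ends up as a GlusterFS FUSE mount on each hypervisor, roughly like this (the mount point path is illustrative of what oVirt typically uses, not copied from this setup):

  mount -t glusterfs -o backup-volfile-servers=gluster.fqdn2 \
      gluster.fqdn1:/volume /rhev/data-center/mnt/glusterSD/gluster.fqdn1:_volume

Note that backup-volfile-servers only controls which server is asked for the volume file at mount time; once mounted, the client connects to all bricks directly, so surviving the reboot of a brick server depends on the replication and quorum settings rather than on this option.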