Re: [ovirt-users] Client-quorum not met - Distributed-Replicate gluster volume

2015-02-16 Thread George Skorup
You would need six storage hosts in total (a 2 x 3 distributed-replicate,
i.e. replica 3) to maintain quorum when one of the hosts goes down. With
replica 2 there is no way to decide which copy is right; with replica 3,
when 2 out of 3 bricks are online, majority rules.


I have a four-node cluster doing replica 4, no distribute. I can take
one host down. If two are down, quorum is not met and the volumes go
read-only. The same issue applies: with only 50% online there is no
majority.


On 2/16/2015 5:20 AM, Wesley Schaft wrote:

Hi,

I've set up 4 oVirt nodes with Gluster storage to provide highly available
virtual machines.
The Gluster volumes are Distributed-Replicate with a replica count of 2.

The following extra volume options (the virt group) are configured:

cat /var/lib/glusterd/groups/virt
quick-read=off
read-ahead=off
io-cache=off
stat-prefetch=off
eager-lock=enable
remote-dio=enable
quorum-type=auto
server-quorum-type=server


Volume for the self-hosted engine:
gluster volume info engine

Volume Name: engine
Type: Distributed-Replicate
Volume ID: 9e7a3265-1e91-46e1-a0ba-09c5cc1fc1c1
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gluster004:/gluster/engine/004
Brick2: gluster005:/gluster/engine/005
Brick3: gluster006:/gluster/engine/006
Brick4: gluster007:/gluster/engine/007
Options Reconfigured:
cluster.quorum-type: auto
storage.owner-gid: 36
storage.owner-uid: 36
cluster.server-quorum-type: server
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
network.ping-timeout: 10


Volume for the virtual machines:
gluster volume info data

Volume Name: data
Type: Distributed-Replicate
Volume ID: 896db323-7ac4-4023-82a6-a8815a4d06b4
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gluster004:/gluster/data/004
Brick2: gluster005:/gluster/data/005
Brick3: gluster006:/gluster/data/006
Brick4: gluster007:/gluster/data/007
Options Reconfigured:
cluster.quorum-type: auto
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
storage.owner-uid: 36
storage.owner-gid: 36
cluster.server-quorum-type: server
network.ping-timeout: 10


Everything seems to be working fine.
However, when I stop the storage network on gluster004 or gluster006, 
client-quorum is lost.
Client-quorum isn't lost when the storage network is stopped on gluster005 or 
gluster007.

[2015-02-16 07:05:58.541531] W [MSGID: 108001] [afr-common.c:3635:afr_notify] 
0-data-replicate-1: Client-quorum is not met
[2015-02-16 07:05:58.541579] W [MSGID: 108001] [afr-common.c:3635:afr_notify] 
0-engine-replicate-1: Client-quorum is not met

As a result, the volumes become read-only and the VMs are paused.
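
This asymmetry matches how cluster.quorum-type=auto is documented to
behave with an even replica count: a replica pair keeps client-quorum if
more than half of its bricks are up, or if exactly half are up and the
first brick of the pair is among them. gluster004 and gluster006 hold the
first brick of replicate-0 and replicate-1 respectively, so losing either
of them leaves that pair at half strength without its first brick.
Something like the following shows the brick ordering and which bricks
are currently online:

# Brick1/Brick2 form the first replica pair, Brick3/Brick4 the second
gluster volume info data | grep Brick
# show which brick processes are currently online
gluster volume status data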

I've added a dummy gluster node for quorum use (no bricks, only running 
glusterd), but that didn't help.
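
Note that a peer without bricks only counts toward server-side quorum
(cluster.server-quorum-type, evaluated by glusterd across the trusted
pool); the client-side quorum enforced by cluster.quorum-type=auto is
computed per replica pair over that pair's bricks, so a brick-less dummy
node cannot satisfy it. Roughly, these are the two knobs involved (the
51% ratio is only an example value):

# server-side quorum: glusterd stops local bricks when the pool drops below the ratio
gluster volume set all cluster.server-quorum-ratio 51%
# client-side quorum: evaluated per replica set by the mounting client
gluster volume set data cluster.quorum-type auto

Turning client quorum off (cluster.quorum-type none) would avoid the
read-only behaviour, but it risks split-brain on the VM images, which is
why replica 3 is usually recommended instead.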

gluster peer status
Number of Peers: 4

Hostname: gluster005
Uuid: 6c5253b4-b1c6-4d0a-9e6b-1f3efc1e8086
State: Peer in Cluster (Connected)

Hostname: gluster006
Uuid: 4b3d15c4-2de0-4d2e-aa4c-3981e47dadbd
State: Peer in Cluster (Connected)

Hostname: gluster007
Uuid: 165e9ada-addb-496e-abf7-4a4efda4d5d3
State: Peer in Cluster (Connected)

Hostname: glusterdummy
Uuid: 3ef8177b-2394-429b-a58e-ecf0f6ce79a0
State: Peer in Cluster (Connected)


The 4 nodes are running CentOS 7, with the following oVirt / Gluster packages:

glusterfs-3.6.2-1.el7.x86_64
glusterfs-api-3.6.2-1.el7.x86_64
glusterfs-cli-3.6.2-1.el7.x86_64
glusterfs-fuse-3.6.2-1.el7.x86_64
glusterfs-libs-3.6.2-1.el7.x86_64
glusterfs-rdma-3.6.2-1.el7.x86_64
glusterfs-server-3.6.2-1.el7.x86_64
ovirt-engine-sdk-python-3.5.1.0-1.el7.centos.noarch
ovirt-host-deploy-1.3.1-1.el7.noarch
ovirt-hosted-engine-ha-1.2.5-1.el7.centos.noarch
ovirt-hosted-engine-setup-1.2.2-1.el7.centos.noarch
vdsm-gluster-4.16.10-8.gitc937927.el7.noarch


The self-hosted engine is running CentOS 6 with ovirt-engine-3.5.1-1.el6.noarch.

Regards,
Wesley





___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users

