Could you share the client logs and information about the approx time/day when you saw this issue?
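If the VMs go through the native FUSE mount, the client log lives under /var/log/glusterfs/ on the hypervisor and is named after the mount point. The exact path below is only a guess (Proxmox usually mounts GlusterFS storage under /mnt/pve/<storage-id>), so adjust it to your actual mount point:

    # hypothetical mount point /mnt/pve/vm-storage -> log file mnt-pve-vm-storage.log
    ls -lh /var/log/glusterfs/mnt-pve-vm-storage.log
    # the current heal backlog, run on one of the servers, would also help
    gluster volume heal vm-storage info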
-Krutika

On Sat, Apr 16, 2016 at 12:57 AM, Kevin Lemonnier <[email protected]> wrote:
> Hi,
>
> We have a small GlusterFS 3.7.6 cluster with 3 nodes running Proxmox VMs
> on it. I did set up the recommended options like the virt group, but by
> hand since it's on Debian. The shards are 256MB, if that matters.
>
> This morning the second node crashed, and as it came back up it started a
> heal, but that basically froze all the VMs running on that volume. Since
> we really, really can't have 40 minutes of downtime in the middle of the
> day, I just removed the node from the network, which stopped the heal and
> let the VMs access their disks again. The plan was to reconnect the node
> in a couple of hours and let it heal at night.
> But a VM has crashed now, and it can't boot up again: it seems to freeze
> trying to access its disks.
>
> Looking at the heal info for the volume, the count has gone way up since
> this morning; it looks like the VMs aren't writing to both nodes, just
> the one they are on. That seems pretty bad. We have 2 of the 3 nodes up,
> so I would expect the volume to work just fine since it has quorum. What
> am I missing?
>
> It is still too early to start the heal, so is there a way to start the
> VM anyway right now? I mean, it was running a moment ago, so the data is
> there; it just needs to let the VM access it.
>
>
> Volume Name: vm-storage
> Type: Replicate
> Volume ID: a5b19324-f032-4136-aaac-5e9a4c88aaef
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: first_node:/mnt/vg1-storage
> Brick2: second_node:/mnt/vg1-storage
> Brick3: third_node:/mnt/vg1-storage
> Options Reconfigured:
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> features.shard: on
> features.shard-block-size: 256MB
> cluster.server-quorum-ratio: 51%
>
>
> Thanks for your help
>
> --
> Kevin Lemonnier
> PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
>
> _______________________________________________
> Gluster-users mailing list
> [email protected]
> http://www.gluster.org/mailman/listinfo/gluster-users
>
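Side note on the options above: on installs where glusterfs-server ships the group file, the recommended virt profile can usually be applied in one step rather than option by option; setting the individual options by hand, as you did, should be equivalent.

    # applies the options listed in /var/lib/glusterd/groups/virt, if that file is present
    gluster volume set vm-storage group virt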
