Sorry, I was referring to the GlusterFS client logs. Assuming you are using a FUSE mount, your log file will be at /var/log/glusterfs/<hyphenated-mount-point-path>.log
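For example, if the volume were mounted at /mnt/pve/vm-storage (a made-up path here; substitute your actual mount point), the client log should be

    /var/log/glusterfs/mnt-pve-vm-storage.log

i.e. the mount path with the leading slash dropped and the remaining slashes turned into hyphens.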
-Krutika

On Sun, Apr 17, 2016 at 9:37 PM, Kevin Lemonnier <[email protected]> wrote:
> I believe Proxmox is just an interface to KVM that uses the lib, so if I'm
> not mistaken there aren't any client logs?
>
> It's not the first time I've had this issue; it happens on every heal on the
> 2 clusters I have.
>
> I did let the heal finish that night and the VMs are working now, but it
> is pretty scary for future crashes or brick replacements.
> Should I maybe lower the shard size? It won't solve the fact that 2 bricks
> out of 3 aren't keeping the filesystem usable, but it might make the healing
> quicker, right?
>
> Thanks
>
> On 17 April 2016 17:56:37 GMT+02:00, Krutika Dhananjay <
> [email protected]> wrote:
> >Could you share the client logs and information about the approximate
> >time/day when you saw this issue?
> >
> >-Krutika
> >
> >On Sat, Apr 16, 2016 at 12:57 AM, Kevin Lemonnier
> ><[email protected]>
> >wrote:
> >
> >> Hi,
> >>
> >> We have a small GlusterFS 3.7.6 cluster with 3 nodes running Proxmox
> >> VMs. I did set up the different recommended options like the virt
> >> group, but by hand since it's on Debian. The shards are 256MB, if
> >> that matters.
> >>
> >> This morning the second node crashed, and as it came back up it
> >> started a heal, but that basically froze all the VMs running on that
> >> volume. Since we really, really can't have 40 minutes of downtime in
> >> the middle of the day, I just removed the node from the network and
> >> that stopped the heal, allowing the VMs to access their disks again.
> >> The plan was to reconnect the node in a couple of hours to let it
> >> heal at night.
> >> But a VM has crashed now, and it can't boot up again: it seems to
> >> freeze trying to access the disks.
> >>
> >> Looking at the heal info for the volume, it has gone way up since
> >> this morning; it looks like the VMs aren't writing to both nodes,
> >> just the one they are on.
> >> It seems pretty bad: we have 2 nodes out of 3 up, so I would expect
> >> the volume to work just fine since it has quorum. What am I missing?
> >>
> >> It is still too early to start the heal; is there a way to start the
> >> VM anyway right now? I mean, it was running a moment ago so the data
> >> is there, it just needs to let the VM access it.
> >>
> >>
> >>
> >> Volume Name: vm-storage
> >> Type: Replicate
> >> Volume ID: a5b19324-f032-4136-aaac-5e9a4c88aaef
> >> Status: Started
> >> Number of Bricks: 1 x 3 = 3
> >> Transport-type: tcp
> >> Bricks:
> >> Brick1: first_node:/mnt/vg1-storage
> >> Brick2: second_node:/mnt/vg1-storage
> >> Brick3: third_node:/mnt/vg1-storage
> >> Options Reconfigured:
> >> cluster.quorum-type: auto
> >> cluster.server-quorum-type: server
> >> network.remote-dio: enable
> >> cluster.eager-lock: enable
> >> performance.readdir-ahead: on
> >> performance.quick-read: off
> >> performance.read-ahead: off
> >> performance.io-cache: off
> >> performance.stat-prefetch: off
> >> features.shard: on
> >> features.shard-block-size: 256MB
> >> cluster.server-quorum-ratio: 51%
> >>
> >>
> >> Thanks for your help
> >>
> >> --
> >> Kevin Lemonnier
> >> PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
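P.S. For reference, the non-default options shown in the volume info above (the by-hand equivalent of applying the virt group) are set with commands of the form below; "vm-storage" is the volume name from the thread, and the values are the ones quoted:

    gluster volume set vm-storage features.shard on
    gluster volume set vm-storage features.shard-block-size 256MB
    gluster volume set vm-storage cluster.quorum-type auto
    gluster volume set vm-storage cluster.server-quorum-type server
    gluster volume set all cluster.server-quorum-ratio 51%

The pending-heal backlog mentioned above can be checked with

    gluster volume heal vm-storage info

though the exact output varies between 3.7.x releases.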
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
