On Thu, Jul 31, 2014 at 12:02 PM, Humble Devassy Chirammal <[email protected]> wrote:

> I second Jason, either the quorum=auto has to be disabled or just add one
> more server to the trusted pool and find the result.
>
> --Humble
>
>
> On Fri, Aug 1, 2014 at 12:22 AM, Jason Brooks <[email protected]> wrote:
>
>> ----- Original Message -----
>> > From: "Vince Loschiavo" <[email protected]>
>> > To: [email protected]
>> > Sent: Thursday, July 31, 2014 9:22:16 AM
>> > Subject: [Gluster-users] Virt-store use case - HA failure issue -
>> > suggestions needed
>> >
>> > I'm currently testing Gluster 3.5.1 in a two server QEMU/KVM
>> > environment.
>> > CentOS 6.5:
>> > Two servers (KVM07 & KVM08), two-brick (one brick per server)
>> > replicated volume
>> >
>> > I've tuned the volume per the documentation here:
>> > http://gluster.org/documentation/use_cases/Virt-store-usecase/
>> >
>> > I have the gluster volume fuse mounted on KVM07 and KVM08 and am using
>> > it to store raw disk images.
>> >
>> > KVM is using the fuse mounted volume as a "dir: Filesystem Directory"
>> > storage pool.
>> >
>> > With dynamic_ownership = 0 set in /etc/libvirt/qemu.conf and the files
>> > chown-ed to qemu:qemu, live migration works great.
>> >
>> > Problem:
>> > If I need to take down one of these servers for maintenance, I live
>> > migrate the VMs to the other server.
>> > service gluster stop
>> > then kill all the remaining gluster and brick processes.
>>
>> The guide says that quorum-type=auto sets a rule such that at least half
>> of the bricks in the replica group should be UP and running. If not,
>> the replica group becomes read-only. I think the rule is actually 51%,
>> so bringing down one of the two servers makes your volume read-only.
>>
>> If you want two servers, you need to unset this rule. Better to add a
>> third server and a third replica, though.
>>
>> Regards, Jason
>>
>> >
>> > At this point, the VMs die. The Fuse mount recovers and remains
>> > attached to the volume via the other server, but the VIRT disk images
>> > are not fully synced.
>> >
>> > This causes the VMs to go into a read-only file system state, then
>> > kernel panic. Reboots/restarts of the VMs just cause kernel panics.
>> > This effectively brings down the two node cluster.
>> >
>> > Bringing the gluster node, bricks, etc. back up prompts a self-heal.
>> > Once self-heal is completed, the VMs can boot normally.
>> >
>> > Question: is there a better way to accomplish HA with live/running Virt
>> > images? The goal is to be able to bring down any one server in the pair
>> > and perform maintenance without interrupting the VMs.
>> >
>> > I assume my shutdown process is flawed but haven't been able to find a
>> > better process.
>> >
>> > Any suggestions are welcome.
>> >
>> >
>> > --
>> > -Vince Loschiavo
>> >
>> > _______________________________________________
>> > Gluster-users mailing list
>> > [email protected]
>> > http://supercolony.gluster.org/mailman/listinfo/gluster-users
>> _______________________________________________
>> Gluster-users mailing list
>> [email protected]
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>
>

That was it. Thank you. I'm somewhat space constrained in my lab, so I
chose to disable quorum and set server-quorum to 50%. I assume that was
redundant, but it works for me.

--
-Vince Loschiavo
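
For anyone finding this thread later, the arithmetic behind Jason's point can be sketched in a few lines of Python. This is only an illustration of the two readings of the rule mentioned in the thread ("at least half" versus "more than half"); it is not GlusterFS code, and the function name `quorum_met` and its `strict` parameter are made up for this example. It also ignores any tie-breaking the real implementation may apply.

```python
# Sketch only: models the quorum arithmetic discussed in this thread.
# quorum_met() and its parameters are hypothetical names, not a Gluster API.

def quorum_met(up: int, total: int, ratio_percent: int = 50,
               strict: bool = True) -> bool:
    """Return True if enough members are up to keep the volume writable.

    strict=True  models the "more than half" (the "51%") reading.
    strict=False models the "at least half" reading from the guide.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= up <= total:
        raise ValueError("up must be between 0 and total")
    # Compare up/total against ratio_percent/100 using integers only,
    # so there is no floating-point rounding at the boundary.
    if strict:
        return up * 100 > ratio_percent * total
    return up * 100 >= ratio_percent * total


if __name__ == "__main__":
    for up, total in [(1, 2), (2, 2), (2, 3), (1, 3)]:
        print(f"{up} of {total} up:",
              "strict =", quorum_met(up, total),
              "| at-least =", quorum_met(up, total, strict=False))
```

Under the strict reading, one server out of two never satisfies the rule, which matches the behaviour reported above; with three servers, any single one can go down for maintenance and two of three still pass.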
