I'm not planning to move to oVirt 4 until it is stable, so it would be great to backport this to 3.6 or, ideally, have it developed in the next release of the 3.6 branch. Considering the urgency (it's a single point of failure) versus the complexity, the proposed fix shouldn't be hard to make.
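To make the proposal concrete: these are just standard glusterfs mount options that you can already pass by hand today; the deploy flow only needs to expose them. A rough, untested sketch using the placeholder hostnames from the example quoted below (the mount point is arbitrary):

    mount -t glusterfs \
        -o backupvolfile-server=gluster2.xyz.com,fetch-attempts=3,log-level=WARNING,log-file=/var/log/glusterfs/gluster_engine_domain.log \
        gluster1.xyz.com:/engine /mnt/engine

If the same option string could be given at hosted-engine --deploy time (or edited afterwards), losing gluster1.xyz.com would no longer take the engine storage domain down with it.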
I'm running a production environment today on top of gluster replica 3, and this is the only SPOF I have.

Thanks,
Luiz

On Fri, Apr 15, 2016 at 03:05, Sandro Bonazzola <sbona...@redhat.com> wrote:

> On Thu, Apr 14, 2016 at 7:35 PM, Nir Soffer <nsof...@redhat.com> wrote:
>
>> On Wed, Apr 13, 2016 at 4:34 PM, Luiz Claudio Prazeres Goncalves <luiz...@gmail.com> wrote:
>> > Nir, here is the problem:
>> > https://bugzilla.redhat.com/show_bug.cgi?id=1298693
>> >
>> > When you do a hosted-engine --deploy and pick "glusterfs" you don't have a way to define the mount options, and therefore no way to use "backupvol-server"; however, when you create a storage domain from the UI you can, as in the attached screenshot.
>> >
>> > In the hosted-engine --deploy, I would expect a flow which includes not only the "gluster" entry point, but also the gluster mount options, which are missing today. This option would be optional, but would remove the single point of failure described in Bug 1298693.
>> >
>> > For example:
>> >
>> > Existing entry point in the "hosted-engine --deploy" flow:
>> > gluster1.xyz.com:/engine
>>
>> I agree, this feature must be supported.
>
> It will, and it's currently targeted to 4.0.
>
>> > Missing option in the "hosted-engine --deploy" flow:
>> > backupvolfile-server=gluster2.xyz.com,fetch-attempts=3,log-level=WARNING,log-file=/var/log/glusterfs/gluster_engine_domain.log
>> >
>> > Sandro, it seems to me a simple solution which can be easily fixed.
>> >
>> > What do you think?
>> >
>> > Regards
>> > -Luiz
>> >
>> > 2016-04-13 4:15 GMT-03:00 Sandro Bonazzola <sbona...@redhat.com>:
>> >>
>> >> On Tue, Apr 12, 2016 at 6:47 PM, Nir Soffer <nsof...@redhat.com> wrote:
>> >>>
>> >>> On Tue, Apr 12, 2016 at 3:05 PM, Luiz Claudio Prazeres Goncalves <luiz...@gmail.com> wrote:
>> >>> > Hi Sandro, I've been using gluster with 3 external hosts for a while and things are working pretty well, however this single point of failure looks like a simple feature to implement, but critical to anyone who wants to use gluster in production. This is not hyperconvergence, which has other issues/implications. So, why not have this feature out on the 3.6 branch? It looks like it would just mean letting vdsm use the 'backupvol-server' option when mounting the engine domain and making the property tests.
>> >>>
>> >>> Can you explain what is the problem, and what is the suggested solution?
>> >>>
>> >>> Engine and vdsm already support the backupvol-server option - you can define this option in the storage domain options when you create a gluster storage domain. With this option vdsm should be able to connect to the gluster storage domain even if a brick is down.
>> >>>
>> >>> If you don't have this option in engine, you probably cannot add it with hosted engine setup, since for editing it you must put the storage domain in maintenance, and if you do this the engine vm will be killed :-) This is one of the issues with the engine managing the storage domain it runs on.
>> >>>
>> >>> I think the best way to avoid this issue is to add a DNS entry providing the addresses of all the gluster bricks, and use this address for the gluster storage domain.
>> >>> This way the glusterfs mount helper can mount the domain even if one of the gluster bricks is down.
>> >>>
>> >>> Again, we will need some magic from the hosted engine developers to modify the address of the hosted engine gluster domain on an existing system.
>> >>
>> >> Magic won't happen without a bz :-) please open one describing what's requested.
>> >>
>> >>> Nir
>> >>>
>> >>> > Could you add this feature to the next release of the 3.6 branch?
>> >>> >
>> >>> > Thanks
>> >>> > Luiz
>> >>> >
>> >>> > On Tue, Apr 12, 2016 at 05:03, Sandro Bonazzola <sbona...@redhat.com> wrote:
>> >>> >>
>> >>> >> On Mon, Apr 11, 2016 at 11:44 PM, Bond, Darryl <db...@nrggos.com.au> wrote:
>> >>> >>>
>> >>> >>> My setup is hyperconverged. I have placed my test results in https://bugzilla.redhat.com/show_bug.cgi?id=1298693
>> >>> >>
>> >>> >> Ok, so you're aware of the limitation of the single point of failure. If you drop the host referenced in the hosted engine configuration for the initial setup, it won't be able to connect to shared storage even if the other hosts in the cluster are up, since the entry point is down.
>> >>> >> Note that hyperconverged deployment is not supported in 3.6.
>> >>> >>
>> >>> >>> Short description of setup:
>> >>> >>>
>> >>> >>> 3 hosts with 2 disks each, set up with gluster replica 3 across the 6 disks, volume name hosted-engine.
>> >>> >>>
>> >>> >>> Hostname hosted-storage configured in /etc/hosts to point to host1.
>> >>> >>>
>> >>> >>> Installed hosted engine on host1 with the hosted engine storage path = hosted-storage:/hosted-engine
>> >>> >>>
>> >>> >>> Install of the first engine on h1 successful. Hosts h2 and h3 added to the hosted engine. All works fine.
>> >>> >>>
>> >>> >>> Additional storage and non-hosted-engine hosts added etc.
>> >>> >>>
>> >>> >>> Additional VMs added to hosted-engine storage (oVirt Reports VM and Cinder VM). Additional VMs are hosted by other storage - cinder and NFS.
>> >>> >>>
>> >>> >>> The system is in production.
>> >>> >>>
>> >>> >>> Engine can be migrated around with the web interface.
>> >>> >>>
>> >>> >>> - 3.6.4 upgrade released, follow the upgrade guide, engine is upgraded first, new CentOS kernel requires host reboot.
>> >>> >>> - Engine placed on h2 - h3 into maintenance (local), upgrade and reboot h3 - no issues - local maintenance removed from h3.
>> >>> >>> - Engine placed on h3 - h2 into maintenance (local), upgrade and reboot h2 - no issues - local maintenance removed from h2.
>> >>> >>> - Engine placed on h3 - h1 into maintenance (local), upgrade and reboot h1 - engine crashes and does not start elsewhere, VM (cinder) on h3 on the same gluster volume pauses.
>> >>> >>> - Host 1 takes about 5 minutes to reboot (enterprise box with all its normal BIOS probing).
>> >>> >>> - Engine starts after h1 comes back and stabilises.
>> >>> >>> - VM (cinder) unpauses itself, VM (reports) continued fine the whole time. I can do no diagnosis on the 2 VMs as the engine is not available.
>> >>> >>> - Local maintenance removed from h1.
>> >>> >>>
>> >>> >>> I don't believe the issue is with gluster itself, as the volume remains accessible on all hosts during this time, albeit with a missing server (gluster volume status) as each gluster server is rebooted.
>> >>> >>>
>> >>> >>> Gluster was upgraded as part of the process, no issues were seen here.
>> >>> >>>
>> >>> >>> I have been able to duplicate the issue without the upgrade by following the same sort of timeline.
>> >>> >>>
>> >>> >>> ________________________________
>> >>> >>> From: Sandro Bonazzola <sbona...@redhat.com>
>> >>> >>> Sent: Monday, 11 April 2016 7:11 PM
>> >>> >>> To: Richard Neuboeck; Simone Tiraboschi; Roy Golan; Martin Sivak; Sahina Bose
>> >>> >>> Cc: Bond, Darryl; users
>> >>> >>> Subject: Re: [ovirt-users] Hosted engine on gluster problem
>> >>> >>>
>> >>> >>> On Mon, Apr 11, 2016 at 9:37 AM, Richard Neuboeck <h...@tbi.univie.ac.at> wrote:
>> >>> >>> Hi Darryl,
>> >>> >>>
>> >>> >>> I'm still experimenting with my oVirt installation so I tried to recreate the problems you've described.
>> >>> >>>
>> >>> >>> My setup has three HA hosts for virtualization and three machines for the gluster replica 3 setup.
>> >>> >>>
>> >>> >>> I manually migrated the Engine from the initial install host (one) to host three. Then I shut down host one manually and interrupted the fencing mechanisms so the host stayed down. This didn't bother the Engine VM at all.
>> >>> >>>
>> >>> >>> Did you move host one to maintenance before shutting it down? Or is this a crash recovery test?
>> >>> >>>
>> >>> >>> To make things a bit more challenging I then shut down host three while running the Engine VM. Of course the Engine was down for some time until host two detected the problem. It started the Engine VM and everything seems to be running quite well without the initial install host.
>> >>> >>>
>> >>> >>> Thanks for the feedback!
>> >>> >>>
>> >>> >>> My only problem is that the HA agents on hosts two and three refuse to start after a reboot due to the fact that the configuration of the hosted engine is missing. I wrote another mail to users@ovirt.org about that.
>> >>> >>>
>> >>> >>> This is weird. Martin, Simone, can you please investigate this?
>> >>> >>>
>> >>> >>> Cheers
>> >>> >>> Richard
>> >>> >>>
>> >>> >>> On 04/08/2016 01:38 AM, Bond, Darryl wrote:
>> >>> >>> > There seems to be a pretty severe bug with using hosted engine on gluster.
>> >>> >>> >
>> >>> >>> > If the host that was used as the initial hosted-engine --deploy host goes away, the engine VM will crash and cannot be restarted until the host comes back.
>> >>> >>>
>> >>> >>> Is this a hyperconverged setup?
>> >>> >>>
>> >>> >>> >
>> >>> >>> > This is regardless of which host the engine was currently running on.
>> >>> >>> >
>> >>> >>> > The issue seems to be buried in the bowels of VDSM and is not an issue with gluster itself.
>> >>> >>>
>> >>> >>> Sahina, can you please investigate this?
>> >>> >>> >
>> >>> >>> > The gluster filesystem is still accessible from the host that was running the engine. The issue has been submitted to bugzilla but the fix is some way off (4.1).
>> >>> >>> >
>> >>> >>> > Can my hosted engine be converted to use NFS (using the gluster NFS server on the same filesystem) without rebuilding my hosted engine (i.e. change domainType=glusterfs to domainType=nfs)?
>> >>> >>> >
>> >>> >>> > What effect would that have on the hosted-engine storage domain inside oVirt, i.e. would the same filesystem be mounted twice or would it just break?
>> >>> >>> >
>> >>> >>> > Will this actually fix the problem, or does it have the same issue when the hosted engine is on NFS?
>> >>> >>> >
>> >>> >>> > Darryl
>> >>> >>>
>> >>> >>> --
>> >>> >>> /dev/null
>
> --
> Sandro Bonazzola
> Better technology. Faster innovation. Powered by community collaboration.
> See how it works at redhat.com
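PS: regarding Nir's suggestion of a single DNS name that covers all the gluster bricks, here is roughly what I have in mind. The zone snippet is only illustrative - the name and addresses are made up and I haven't tested this with hosted-engine:

    ; round-robin A records in the xyz.com zone, one per brick host
    gluster-engine    IN    A    192.168.1.11    ; gluster1
    gluster-engine    IN    A    192.168.1.12    ; gluster2
    gluster-engine    IN    A    192.168.1.13    ; gluster3

The storage path given at deploy time would then be gluster-engine.xyz.com:/engine, so the mount helper has more than one address to fall back on if the first brick host is down, as Nir describes.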
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users