Hi Martin,

> just as a random comment, do you still have the database backup from
> the bare metal -> VM attempt? It might be possible to just try again
> using it. Or in the worst case.. update the offending value there
> before restoring it to the new engine instance.

I still have the backup. I'd rather do the latter, as re-running the
HE deployment is quite lengthy and involved (I have to re-initialise
the FC storage each time). Do you know what the offending value(s)
would be? Would it be in the Postgres DB or in a config file
somewhere?

Cheers,

Cam

> Regards
>
> Martin Sivak
>
> On Thu, Jun 22, 2017 at 11:39 AM, cmc <[email protected]> wrote:
>> Hi Yanir,
>>
>> Thanks for the reply.
>>
>>> First of all, maybe a chain reaction of :
>>> WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm'
>>> failed for user SYSTEM. Reasons:
>>> VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>> is causing the hosted engine vm not to be set up correctly and further
>>> actions were made when the hosted engine vm wasnt in a stable state.
>>>
>>> As for now, are you trying to revert back to a previous/initial state ?
>>
>> I'm not trying to revert it to a previous state for now. This was a
>> migration from a bare metal engine, and it didn't report any error
>> during the migration. I'd had some problems on my first attempts at
>> this migration, whereby it never completed (due to a proxy issue) but
>> I managed to resolve this. Do you know of a way to get the Hosted
>> Engine VM into a stable state, without rebuilding the entire cluster
>> from scratch (since I have a lot of VMs on it)?
>>
>> Thanks for any help.
>>
>> Regards,
>>
>> Cam
>>
>>> Regards,
>>> Yanir
>>>
>>> On Wed, Jun 21, 2017 at 4:32 PM, cmc <[email protected]> wrote:
>>>>
>>>> Hi Jenny/Martin,
>>>>
>>>> Any idea what I can do here?
>>>> The hosted engine VM has no log on any
>>>> host in /var/log/libvirt/qemu, and I fear that if I need to put the
>>>> host into maintenance, e.g., to upgrade it that I created it on (which
>>>> I think is hosting it), or if it fails for any reason, it won't get
>>>> migrated to another host, and I will not be able to manage the
>>>> cluster. It seems to be a very dangerous position to be in.
>>>>
>>>> Thanks,
>>>>
>>>> Cam
>>>>
>>>> On Wed, Jun 21, 2017 at 11:48 AM, cmc <[email protected]> wrote:
>>>> > Thanks Martin. The hosts are all part of the same cluster.
>>>> >
>>>> > I get these errors in the engine.log on the engine:
>>>> >
>>>> > 2017-06-19 03:28:05,030Z WARN
>>>> > [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>> > (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm'
>>>> > failed for user SYSTEM. Reasons:
>>>> > VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>> > 2017-06-19 03:28:05,030Z INFO
>>>> > [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>>> > (org.ovirt.thread.pool-6-thread-23) [] Lock freed to object
>>>> > 'EngineLock:{exclusiveLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<VM,
>>>> > ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>,
>>>> > HostedEngine=<VM_NAME, ACTION_TYPE_FAILED_NAME_ALREADY_USED>]',
>>>> > sharedLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<REMOTE_VM,
>>>> > ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>]'}'
>>>> > 2017-06-19 03:28:05,030Z ERROR
>>>> > [org.ovirt.engine.core.bll.HostedEngineImporter]
>>>> > (org.ovirt.thread.pool-6-thread-23) [] Failed importing the Hosted
>>>> > Engine VM
>>>> >
>>>> > The sanlock.log reports conflicts on that same host, and a different
>>>> > error on the other hosts, not sure if they are related.
>>>> >
>>>> > And this in the /var/log/ovirt-hosted-engine-ha/agent log on the host
>>>> > which I deployed the hosted engine VM on:
>>>> >
>>>> > MainThread::ERROR::2017-06-19 13:09:49,743::ovf_store::124::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>> > Unable to extract HEVM OVF
>>>> > MainThread::ERROR::2017-06-19 13:09:49,743::config::445::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>> > Failed extracting VM OVF from the OVF_STORE volume, falling back to
>>>> > initial vm.conf
>>>> >
>>>> > I've seen some of these issues reported in bugzilla, but they were for
>>>> > older versions of oVirt (and appear to be resolved).
>>>> >
>>>> > I will install that package on the other two hosts, for which I will
>>>> > put them in maintenance as vdsm is installed as an upgrade. I guess
>>>> > restarting vdsm is a good idea after that?
>>>> >
>>>> > Thanks,
>>>> >
>>>> > Campbell
>>>> >
>>>> > On Wed, Jun 21, 2017 at 10:51 AM, Martin Sivak <[email protected]>
>>>> > wrote:
>>>> >> Hi,
>>>> >>
>>>> >> you do not have to install it on all hosts. But you should have more
>>>> >> than one and ideally all hosted engine enabled nodes should belong to
>>>> >> the same engine cluster.
>>>> >>
>>>> >> Best regards
>>>> >>
>>>> >> Martin Sivak
>>>> >>
>>>> >> On Wed, Jun 21, 2017 at 11:29 AM, cmc <[email protected]> wrote:
>>>> >>> Hi Jenny,
>>>> >>>
>>>> >>> Does ovirt-hosted-engine-ha need to be installed across all hosts?
>>>> >>> Could that be the reason it is failing to see it properly?
>>>> >>>
>>>> >>> Thanks,
>>>> >>>
>>>> >>> Cam
>>>> >>>
>>>> >>> On Mon, Jun 19, 2017 at 1:27 PM, cmc <[email protected]> wrote:
>>>> >>>> Hi Jenny,
>>>> >>>>
>>>> >>>> Logs are attached. I can see errors in there, but am unsure how they
>>>> >>>> arose.
>>>> >>>>
>>>> >>>> Thanks,
>>>> >>>>
>>>> >>>> Campbell
>>>> >>>>
>>>> >>>> On Mon, Jun 19, 2017 at 12:29 PM, Evgenia Tokar <[email protected]>
>>>> >>>> wrote:
>>>> >>>>> From the output it looks like the agent is down, try starting it by
>>>> >>>>> running:
>>>> >>>>> systemctl start ovirt-ha-agent.
>>>> >>>>>
>>>> >>>>> The engine is supposed to see the hosted engine storage domain and
>>>> >>>>> import it to the system, then it should import the hosted engine vm.
>>>> >>>>>
>>>> >>>>> Can you attach the agent log from the host
>>>> >>>>> (/var/log/ovirt-hosted-engine-ha/agent.log)
>>>> >>>>> and the engine log from the engine vm
>>>> >>>>> (/var/log/ovirt-engine/engine.log)?
>>>> >>>>>
>>>> >>>>> Thanks,
>>>> >>>>> Jenny
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> On Mon, Jun 19, 2017 at 12:41 PM, cmc <[email protected]> wrote:
>>>> >>>>>>
>>>> >>>>>> Hi Jenny,
>>>> >>>>>>
>>>> >>>>>> > What version are you running?
>>>> >>>>>>
>>>> >>>>>> 4.1.2.2-1.el7.centos
>>>> >>>>>>
>>>> >>>>>> > For the hosted engine vm to be imported and displayed in the
>>>> >>>>>> > engine, you must first create a master storage domain.
>>>> >>>>>>
>>>> >>>>>> To provide a bit more detail: this was a migration of a bare-metal
>>>> >>>>>> engine in an existing cluster to a hosted engine VM for that
>>>> >>>>>> cluster.
>>>> >>>>>> As part of this migration, I built an entirely new host and ran
>>>> >>>>>> 'hosted-engine --deploy' (followed these instructions:
>>>> >>>>>> http://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_Metal_to_an_EL-Based_Self-Hosted_Environment/).
>>>> >>>>>> I restored the backup from the engine and it completed without any
>>>> >>>>>> errors. I didn't see any instructions regarding a master storage
>>>> >>>>>> domain in the page above.
>>>> >>>>>> The cluster has two existing master storage
>>>> >>>>>> domains, one is fibre channel, which is up, and one ISO domain,
>>>> >>>>>> which is currently offline.
>>>> >>>>>>
>>>> >>>>>> > What do you mean the hosted engine commands are failing? What
>>>> >>>>>> > happens when you run hosted-engine --vm-status now?
>>>> >>>>>>
>>>> >>>>>> Interestingly, whereas when I ran it before, it exited with no
>>>> >>>>>> output and a return code of '1', it now reports:
>>>> >>>>>>
>>>> >>>>>> --== Host 1 status ==--
>>>> >>>>>>
>>>> >>>>>> conf_on_shared_storage : True
>>>> >>>>>> Status up-to-date      : False
>>>> >>>>>> Hostname               : kvm-ldn-03.ldn.fscfc.co.uk
>>>> >>>>>> Host ID                : 1
>>>> >>>>>> Engine status          : unknown stale-data
>>>> >>>>>> Score                  : 0
>>>> >>>>>> stopped                : True
>>>> >>>>>> Local maintenance      : False
>>>> >>>>>> crc32                  : 0217f07b
>>>> >>>>>> local_conf_timestamp   : 2911
>>>> >>>>>> Host timestamp         : 2897
>>>> >>>>>> Extra metadata (valid at timestamp):
>>>> >>>>>>     metadata_parse_version=1
>>>> >>>>>>     metadata_feature_version=1
>>>> >>>>>>     timestamp=2897 (Thu Jun 15 16:22:54 2017)
>>>> >>>>>>     host-id=1
>>>> >>>>>>     score=0
>>>> >>>>>>     vm_conf_refresh_time=2911 (Thu Jun 15 16:23:08 2017)
>>>> >>>>>>     conf_on_shared_storage=True
>>>> >>>>>>     maintenance=False
>>>> >>>>>>     state=AgentStopped
>>>> >>>>>>     stopped=True
>>>> >>>>>>
>>>> >>>>>> Yet I can login to the web GUI fine. I guess it is not HA due to
>>>> >>>>>> being in an unknown state currently? Does the hosted-engine-ha rpm
>>>> >>>>>> need to be installed across all nodes in the cluster, btw?
>>>> >>>>>>
>>>> >>>>>> Thanks for the help,
>>>> >>>>>>
>>>> >>>>>> Cam
>>>> >>>>>>
>>>> >>>>>> >
>>>> >>>>>> > Jenny Tokar
>>>> >>>>>> >
>>>> >>>>>> >
>>>> >>>>>> > On Thu, Jun 15, 2017 at 6:32 PM, cmc <[email protected]> wrote:
>>>> >>>>>> >>
>>>> >>>>>> >> Hi,
>>>> >>>>>> >>
>>>> >>>>>> >> I've migrated from a bare-metal engine to a hosted engine. There
>>>> >>>>>> >> were no errors during the install, however, the hosted engine
>>>> >>>>>> >> did not get started. I tried running:
>>>> >>>>>> >>
>>>> >>>>>> >> hosted-engine --status
>>>> >>>>>> >>
>>>> >>>>>> >> on the host I deployed it on, and it returns nothing (exit code
>>>> >>>>>> >> is 1 however). I could not ping it either. So I tried starting
>>>> >>>>>> >> it via 'hosted-engine --vm-start' and it returned:
>>>> >>>>>> >>
>>>> >>>>>> >> Virtual machine does not exist
>>>> >>>>>> >>
>>>> >>>>>> >> But it then became available. I logged into it successfully. It
>>>> >>>>>> >> is not in the list of VMs however.
>>>> >>>>>> >>
>>>> >>>>>> >> Any ideas why the hosted-engine commands fail, and why it is not
>>>> >>>>>> >> in the list of virtual machines?
>>>> >>>>>> >>
>>>> >>>>>> >> Thanks for any help,
>>>> >>>>>> >>
>>>> >>>>>> >> Cam
>>>> >>>>>> >> _______________________________________________
>>>> >>>>>> >> Users mailing list
>>>> >>>>>> >> [email protected]
>>>> >>>>>> >> http://lists.ovirt.org/mailman/listinfo/users

