[ovirt-users] Re: Failed to create a new VM in ovirt due to "The host did not satisfy internal filter Memory" error

2020-09-10 Thread Martin Sivak
Hi,

As others told you already, this is the expected behavior. The reason
is very simple. All the VMs expect their allocated memory to be
available and might try to use it. And when that happens and the
memory is not there, the VMs will crash. That is not something you
typically want.

You can configure memory over-commitment to allow this behavior. It is
typically used for desktop VMs that do not all use their full memory at
the same time. See, for example, the docs here
https://www.ovirt.org/documentation/administration_guide/#memory_optimization
and here
https://www.ovirt.org/documentation/administration_guide/#Cluster_Optimization_Settings_Explained
for an explanation.

I also recommend enabling at least one of Enable Memory Balloon
Optimization and KSM control to allow nodes to rebalance free
memory. You should also set the Guaranteed memory for VMs, as explained
in the above links, to limit how aggressively ballooning can reclaim
memory from them.

Best regards

--
Martin Sivak
ex-oVirt maintainer of this area

On Mon, Sep 7, 2020 at 9:38 PM KISHOR K  wrote:
>
> Hi All,
>
> I'm new to oVirt and have not had a perfect experience with it yet.
> I ran into a strange issue today when I tried to create a new VM with 32 GB in
> oVirt. VM creation failed with the error pasted below.
>
> Cannot run VM. There is no host that satisfies current scheduling 
> constraints. See below for details:, The host host-01 did not satisfy 
> internal filter Memory., The host host-01 did not satisfy internal filter 
> Memory.
>
> After some troubleshooting, I found that there was enough available memory
> (around 120 GB free) on the host for the new VM, and there were around 10 VMs
> already running on this host.
> But I noticed that oVirt actually schedules and creates VMs based on the
> "Max free Memory for scheduling new VMs" value, which seems to be set/updated
> based on the total memory allocated to all VMs running on the host, not on
> the memory those VMs are actually consuming at the time.
>
> Can anyone explain whether this is some kind of bug in oVirt or expected
> behavior?
> If it is expected behavior, is there any way to change it so that
> VMs are created based on actual free memory?
>
> Thanks a lot in advance for your support.
>
> /Kishore
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/EZVZI225KR5YBOGI5HXPTYIX3Z3QDYHV/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3GWPRQTPXLU6J2ZDTVUHTI64IC3GVAUW/


[ovirt-users] Re: power_saving schedule not powering off hosts

2020-04-05 Thread Martin Sivak
On Sun, Apr 5, 2020 at 4:07 PM Strahil Nikolov  wrote:
>
> On April 5, 2020 3:36:42 PM GMT+03:00, "Maton, Brett" 
>  wrote:
> >I've got a cluster made up of five physical hosts, (Dells with idrac 7
> >management)
> >Power management / fencing enabled on all hosts.
> >
> >I've enabled the power_saving scheduling policy on my cluster, it's
> >migrated all the VM's to a couple of physical hosts so three are
> >sitting
> >idle with no VM's.
> >
> >Shouldn't the power_saving policy shut down the idle hosts?
>
> I don't think so...

On the contrary, it will shut down unneeded hosts indeed. But you have
to configure the fields Liran mentioned:

Edit the cluster, select scheduling policy tab and add two parameters
to your Power saving policy. One (EnableAutomaticHostPowerManagement)
enables the power cycling mechanism and the second one
(HostsInReserve) controls how many empty hosts are allowed to stay up.
When not enough hosts are empty anymore, a new one will be started.
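
As a rough illustration of how the two parameters interact, here is a minimal Python sketch. It is not the engine's actual code: the function name power_action and its return strings are made up for this example, and it only models the HostsInReserve behavior described above.

```python
def power_action(empty_hosts_up, hosts_in_reserve, hosts_powered_down,
                 automatic_power_management=True):
    """Decide what a power-saving policy might do next (simplified model).

    empty_hosts_up: hosts that are up but run no VMs
    hosts_in_reserve: value of HostsInReserve
    hosts_powered_down: hosts currently switched off
    automatic_power_management: value of EnableAutomaticHostPowerManagement
    """
    if not automatic_power_management:
        # Without the enabling parameter, nothing is power cycled.
        return "none"
    if empty_hosts_up > hosts_in_reserve:
        # More empty hosts than allowed to stay up: one can go down.
        return "power_off_one_host"
    if empty_hosts_up < hosts_in_reserve and hosts_powered_down > 0:
        # Not enough empty hosts anymore: bring one back.
        return "power_on_one_host"
    return "none"


# Three idle hosts, one allowed in reserve (illustrative numbers).
print(power_action(3, 1, 0))
# No empty host left, two hosts are switched off.
print(power_action(0, 1, 2))
# Same idle hosts, but the mechanism is not enabled.
print(power_action(3, 1, 0, automatic_power_management=False))
```

With three idle hosts and HostsInReserve set to 1, this model would power hosts off one at a time until a single empty host remains.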

Best regards

--
Martin Sivak

> But you can set up a script that changes the tuned daemon's performance profile
> based on the number of VMs.
>
> For example, if no VMs are running, tuned's power-saving profile is
> enabled, while if a VM is powered on (or migrated), the script can enable a
> high-performance profile.
>
>
> Best Regards,
> Strahil Nikolov
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/RRXKMRBMIME2KMBWW7GAA3DJBG5S24G7/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7ZLKFEPYMPJWX3M7XSCI73CGHIU3Q7CH/


[ovirt-users] Re: memory overcommit in oVirt

2020-02-04 Thread Martin Sivak
Hi,

memory overcommit in oVirt has a different meaning indeed.

oVirt tracks the amount of memory allocated to all running VMs and
will not let you start an additional VM once the full capacity of the
node is reached. The important thing to realize is that this is in no
way related to the current memory consumption of a VM. This resource
tracking is making sure all VMs will have enough memory on the node
even if they all decide to use their full allocation at the same time.

Memory overcommit in oVirt exploits the fact that the situation
described above does not happen all that often. In fact, it almost
never happens, especially in the virtual desktop use case. So
we let the user specify how far the total VM allocation can grow above the
physical memory capacity of a node (minus some overhead for the system and
such).
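
To make the accounting concrete, here is a minimal Python sketch of an allocation-based check. It is a simplified model under stated assumptions, not the actual engine formula: the function name, the reserved_mb overhead parameter and all numbers are hypothetical.

```python
def fits_by_allocation(host_physical_mb, overcommit_percent, reserved_mb,
                       allocated_mb, new_vm_mb):
    """Return True if a new VM fits when counting allocations only.

    The check ignores how much memory the running VMs consume right now;
    it only compares the sum of their configured sizes with the capacity.
    """
    # Schedulable capacity: physical memory scaled by the overcommit
    # percentage, minus a (hypothetical) reserve for the system.
    capacity_mb = host_physical_mb * overcommit_percent // 100 - reserved_mb
    return sum(allocated_mb) + new_vm_mb <= capacity_mb


# Illustrative numbers: a 128 GiB host running ten VMs of 12 GiB each.
running = [12 * 1024] * 10
host_mb = 128 * 1024

# A 32 GiB VM does not fit at 100%, even if the running VMs are mostly idle.
print(fits_by_allocation(host_mb, 100, 1024, running, 32 * 1024))
# With overcommit set to 150% the same VM is accepted.
print(fits_by_allocation(host_mb, 150, 1024, running, 32 * 1024))
```

The point of the sketch is only that current consumption never appears in the check, which matches the behavior described above.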

Does this make sense to you?

Best regards

--
Martin Sivak
ex-oVirt maintainer :)

On Tue, Feb 4, 2020 at 3:21 AM yam yam  wrote:
>
> Hello,
>
> I thought the memory overcommit feature in oVirt used the host's overcommit
> features by manipulating kernel variables like vm.overcommit_ratio and
> vm.overcommit_memory,
> but I've just confirmed that those variables didn't change at all.
>
> I wonder if oVirt's overcommit really has nothing to do with that of the host
> and, if so, how oVirt manages memory overcommit.
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/6JWQJUGCZ4NXB7HE7XA5WRRECNCFUKT7/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/NPXS74HQ52ZUALTUEGDA5JKDD7IYOSJ7/


[ovirt-users] Re: Memory ballon question

2019-06-14 Thread Martin Sivak
Hi,

> 2019-06-13 07:11:40,973 - mom.Controllers.Balloon - INFO - Ballooning 
> guest:node1 from 695648 to 660865
> 2019-06-13 07:12:51,437 - mom.GuestMonitor.Thread - INFO - GuestMonitor-node1 
> ending
>
> Can someone clarify what exactly this (from xxxx to yyyy) means?

It is the ballooning operation log:

- From - how much memory was left in the VM before the action
- To - how much after (could be both lower and higher)

I do not remember the units, but I think it was in KiB.
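
If you want to see the step sizes, a small Python sketch like the one below can pull the from/to pairs out of mom.log. It assumes the message format shown in the quoted log lines and treats the values as KiB, which is my recollection and not a confirmed fact; the function name balloon_steps is made up for this example.

```python
import re

# Matches "Ballooning guest:<name> from <n> to <m>", tolerating line wraps.
BALLOON_RE = re.compile(r"Ballooning\s+guest:(\S+)\s+from\s+(\d+)\s+to\s+(\d+)")


def balloon_steps(log_text):
    """Return (guest, before, after, delta) tuples for each balloon action."""
    steps = []
    for name, before, after in BALLOON_RE.findall(log_text):
        before, after = int(before), int(after)
        steps.append((name, before, after, after - before))
    return steps


sample = (
    "2019-06-13 07:11:25,827 - mom.Controllers.Balloon - INFO - Ballooning "
    "guest:node1 from 732260 to 695647\n"
    "2019-06-13 07:11:40,973 - mom.Controllers.Balloon - INFO - Ballooning "
    "guest:node1 from 695648 to 660865\n"
)

for name, before, after, delta in balloon_steps(sample):
    # A negative delta means memory was taken away from the guest.
    print(f"{name}: {before} -> {after} ({delta:+d}, assumed KiB)")
```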

Martin


On Thu, Jun 13, 2019 at 9:26 PM Strahil Nikolov  wrote:
>
> Hi Martin,Darrell,
>
> thanks for your feedback.
>
> I have checked the /var/log/vdsm/mom.log and it seems that MOM was actually 
> working:
>
> 2019-06-13 07:08:47,690 - mom.GuestMonitor.Thread - INFO - GuestMonitor-node1 
> starting
> 2019-06-13 07:09:39,490 - mom.Controllers.Balloon - INFO - Ballooning 
> guest:node1 from 1048576 to 996147
> 2019-06-13 07:09:54,658 - mom.Controllers.Balloon - INFO - Ballooning 
> guest:node1 from 996148 to 946340
> 2019-06-13 07:10:09,853 - mom.Controllers.Balloon - INFO - Ballooning 
> guest:node1 from 946340 to 899023
> 2019-06-13 07:10:25,053 - mom.Controllers.Balloon - INFO - Ballooning 
> guest:node1 from 899024 to 854072
> 2019-06-13 07:10:40,233 - mom.Controllers.Balloon - INFO - Ballooning 
> guest:node1 from 854072 to 811368
> 2019-06-13 07:10:55,428 - mom.Controllers.Balloon - INFO - Ballooning 
> guest:node1 from 811368 to 770799
> 2019-06-13 07:11:10,621 - mom.Controllers.Balloon - INFO - Ballooning 
> guest:node1 from 770800 to 732260
> 2019-06-13 07:11:25,827 - mom.Controllers.Balloon - INFO - Ballooning 
> guest:node1 from 732260 to 695647
> 2019-06-13 07:11:40,973 - mom.Controllers.Balloon - INFO - Ballooning 
> guest:node1 from 695648 to 660865
> 2019-06-13 07:12:51,437 - mom.GuestMonitor.Thread - INFO - GuestMonitor-node1 
> ending
>
> Can someone clarify what exactly this (from xxxx to yyyy) means?
>
> Best Regards,
> Strahil Nikolov
>
> On Thursday, June 13, 2019, 17:27:01 GMT+3, Martin Sivak wrote:
>
>
> Hi,
>
> iirc the guest agent is not needed anymore as we get almost the same
> stats from the balloon driver directly.
>
> Ballooning has to be enabled on cluster level though. So that is one
> thing to check. If that is fine then I guess a more detailed
> description is needed.
>
> oVirt generally starts ballooning when the memory load gets over 80%
> of available memory.
>
> The host agent that handles ballooning is called mom and the logs are
> located in /var/log/vdsm/mom* iirc. It might be a good idea to check
> whether the virtual machines were declared ready (meaning all data
> sources we collect provided data).
>
> --
> Martin Sivak
> used to be maintainer of mom
>
> On Thu, Jun 13, 2019 at 12:26 AM Darrell Budic  wrote:
> >
> > Do you have the ovirt-guest-agent running on your VMs? It’s required for
> > ballooning to control allocations on the guest side.
> >
> > On Jun 12, 2019, at 11:32 AM, Strahil  wrote:
> >
> > Hello All,
> >
> > as a KVM user I know how useful the memory balloon is and how you can both
> > increase and decrease memory live (both Linux & Windows).
> > I have noticed that I cannot decrease the memory in oVirt.
> >
> > Does anyone have a clue why the situation is like that?
> >
> > I was expecting that the guaranteed memory is the minimum below which the
> > balloon driver will not go, but when I put my host under pressure,
> > the host just started to swap instead of reducing some of the VM memory
> > (and my VMs had plenty of free space).
> >
> > It would be great if oVirt could decrease the memory (if the VM has
> > unallocated memory) when the host is under pressure and the VM cannot be
> > relocated.
> >
> > Best Regards,
> > Strahil Nikolov
> >
> > List Archives: 
> > https://lists.ovirt.org/archives/list/users@ovirt.org/message/LUWCN2MLNTDJUEZBCTVXFMVABGPUSEFH/

[ovirt-users] Re: Memory ballon question

2019-06-13 Thread Martin Sivak
Hi,

iirc the guest agent is not needed anymore as we get almost the same
stats from the balloon driver directly.

Ballooning has to be enabled on cluster level though. So that is one
thing to check. If that is fine then I guess a more detailed
description is needed.

oVirt generally starts ballooning when the memory load gets over 80%
of available memory.

The host agent that handles ballooning is called mom and the logs are
located in /var/log/vdsm/mom* iirc. It might be a good idea to check
whether the virtual machines were declared ready (meaning all data
sources we collect provided data).

--
Martin Sivak
used to be maintainer of mom

On Thu, Jun 13, 2019 at 12:26 AM Darrell Budic  wrote:
>
> Do you have the ovirt-guest-agent running on your VMs? It’s required for
> ballooning to control allocations on the guest side.
>
> On Jun 12, 2019, at 11:32 AM, Strahil  wrote:
>
> Hello All,
>
> as a KVM user I know how useful the memory balloon is and how you can both
> increase and decrease memory live (both Linux & Windows).
> I have noticed that I cannot decrease the memory in oVirt.
>
> Does anyone have a clue why the situation is like that?
>
> I was expecting that the guaranteed memory is the minimum below which the
> balloon driver will not go, but when I put my host under pressure,
> the host just started to swap instead of reducing some of the VM memory (and
> my VMs had plenty of free space).
>
> It would be great if oVirt could decrease the memory (if the VM has unallocated
> memory) when the host is under pressure and the VM cannot be relocated.
>
> Best Regards,
> Strahil Nikolov
>
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/LUWCN2MLNTDJUEZBCTVXFMVABGPUSEFH/
>
>
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/22XYJD7XAYZLVYCJUB6TW3RZ5VJFJ3ET/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7UDVZJ3EDXYNMI7DCZTA7YLWNIQU3EVO/


[ovirt-users] Re: Stale hosted engine node information harmful ?

2019-05-02 Thread Martin Sivak
Hi,

the stale records are not an issue at all. You can remove them for
visually cleaner reports (hosted-engine --clean-metadata command, check
the man page), but it makes no difference to the algorithms.

Best regards

Martin Sivak

On Thu, May 2, 2019 at 11:31 AM Andreas Elvers
 wrote:
>
> I have 5 nodes (node01 to node05). Originally all those nodes were part of 
> our default datacenter/cluster with a NFS storage domain for vmdisk, engine 
> and iso-images. All five nodes were engine HA nodes.
> Later node01, node02 and node03 were re-installed to have engine HA removed. 
> Then those nodes were removed from the default cluster. Eventually node01,02 
> and 03 were completely re-installed to host our new Ceph/Gluster based 
> datacenter. The engine is still running on the old default Datacenter. Now I 
> wish to move it over to our ceph/gluster datacenter.
>
> when I look at the current output of "hosted-engine --vm-status" I see:
>
> --== Host node01.infra.solutions.work (id: 1) status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : False
> Hostname   : node01.infra.solutions.work
> Host ID: 1
> Engine status  : unknown stale-data
> Score  : 0
> stopped: True
> Local maintenance  : False
> crc32  : e437bff4
> local_conf_timestamp   : 155627
> Host timestamp : 155877
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=155877 (Fri Aug  3 13:09:19 2018)
> host-id=1
> score=0
> vm_conf_refresh_time=155627 (Fri Aug  3 13:05:08 2018)
> conf_on_shared_storage=True
> maintenance=False
> state=AgentStopped
> stopped=True
>
>
> --== Host node02.infra.solutions.work (id: 2) status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : False
> Hostname   : node02.infra.solutions.work
> Host ID: 2
> Engine status  : unknown stale-data
> Score  : 0
> stopped: True
> Local maintenance  : False
> crc32  : 11185b04
> local_conf_timestamp   : 154757
> Host timestamp : 154856
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=154856 (Fri Aug  3 13:22:19 2018)
> host-id=2
> score=0
> vm_conf_refresh_time=154757 (Fri Aug  3 13:20:40 2018)
> conf_on_shared_storage=True
> maintenance=False
> state=AgentStopped
> stopped=True
>
>
> --== Host node03.infra.solutions.work (id: 3) status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : False
> Hostname   : node03.infra.solutions.work
> Host ID: 3
> Engine status  : unknown stale-data
> Score  : 0
> stopped: False
> Local maintenance  : True
> crc32  : 9595bed9
> local_conf_timestamp   : 14363
> Host timestamp : 14362
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=14362 (Thu Aug  2 18:03:25 2018)
> host-id=3
> score=0
> vm_conf_refresh_time=14363 (Thu Aug  2 18:03:25 2018)
> conf_on_shared_storage=True
> maintenance=True
> state=LocalMaintenance
> stopped=False
>
>
> --== Host node04.infra.solutions.work (id: 4) status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : True
> Hostname   : node04.infra.solutions.work
> Host ID: 4
> Engine status  : {"health": "good", "vm": "up", "detail": 
> "Up"}
> Score  : 3400
> stopped: False
> Local maintenance  : False
> crc32  : 245854b1
> local_conf_timestamp   : 317498
> Host timestamp : 317498
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_ver

[ovirt-users] Re: Hosted engine Migration

2019-03-11 Thread Martin Sivak
Hi,

as far as I know, you can manually migrate the hosted engine from the
webadmin UI by clicking the migrate button.
The question is, why would you want to?

Best regards

Martin Sivak

On Mon, Mar 11, 2019 at 2:01 PM  wrote:
>
> Am I reading these right in that manual migration is not possible?
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/7NP3HP3Q6DCDDOEZZK7LS4P2C6TUMZEB/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FPPVYQ656TFHUOBLMKO66E7WDWVG3FN3/


[ovirt-users] Re: Huge Pages

2019-02-15 Thread Martin Sivak
Hi,

the hugepages configuration has two aspects: hugepage size and count. You
can have multiple sizes at the same time. I am not sure what the
recommended configuration is or how to pass it to the kernel, so others will
have to help here (I think 2M and 1G are the usual sizes).

The VM memory configuration is then taken from two places. The custom
property tells oVirt which size of hugepages you want to use (in KiB iirc)
and the actual number of used hugepages is computed from the total memory
configured for the VM at the usual place (it has to be a multiple of the
selected hugepage size).
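
The arithmetic can be sketched in a few lines of Python. This is only an illustration of the rule above, assuming the VM memory is given in MiB and the hugepage size in KiB; the function name hugepage_count is made up for this example.

```python
def hugepage_count(vm_memory_mib, hugepage_size_kib):
    """Return how many hugepages a VM of the given size would use.

    Raises ValueError when the VM memory is not a multiple of the
    selected hugepage size.
    """
    vm_memory_kib = vm_memory_mib * 1024
    if vm_memory_kib % hugepage_size_kib != 0:
        raise ValueError(
            f"{vm_memory_mib} MiB is not a multiple of "
            f"{hugepage_size_kib} KiB hugepages"
        )
    return vm_memory_kib // hugepage_size_kib


# A 4096 MiB VM backed by 2 MiB (2048 KiB) pages.
print(hugepage_count(4096, 2048))
# The same VM backed by 1 GiB (1048576 KiB) pages.
print(hugepage_count(4096, 1048576))
```

The host needs at least that many free pages of the matching size, which is what the HugePages_Free line in /proc/meminfo reports for the default size.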

One of our colleagues wrote a blog post about this:
https://mpolednik.github.io/2017/06/26/hugepages-and-ovirt/

Best regards

Martin Sivak

On Fri, Feb 15, 2019 at 5:35 AM Vincent Royer  wrote:

> How do I know how many huge pages my hosts can support?
>
> cat /proc/meminfo | grep Huge
>  AnonHugePages:  17684480 kB
> HugePages_Total:   0
> HugePages_Free:0
> HugePages_Rsvd:0
> HugePages_Surp:0
> Hugepagesize:   2048 kB
>
> [image: image.png]
>
> And once I know, I set the kernel parameters here, and reboot the host,
> correct?
>
> [image: image.png]
>
>
> And then I assume I assign them to the VM here?  How do I decide how many
> huge pages and what size a particular VM can benefit from?
>
> [image: image.png]
>
>
> Is there a part of the docs I am not finding that covers this?
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZMGN7YZXSQ5UUZ43U5RPSXTZOTYI2XAZ/
>
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7CGYXOQDIAIJIGRYTF3QUZX7JSVMQ74J/


[ovirt-users] Re: engine mails about FSM states

2018-12-14 Thread Martin Sivak
Hi,

> Host id is not set

This is an internal error report that should normally not happen. It means
the ovirt-ha-agent asked for a storage operation before it registered
itself with the broker. If this happens seldom then it looks like a race
condition.

I would recommend opening a bug report with all the logs we talked about
and all the RPM versions (ovirt-hosted-engine-ha and
ovirt-hosted-engine-setup packages). Use this link to go directly to the
right component:
https://bugzilla.redhat.com/enter_bug.cgi?product=ovirt-hosted-engine-ha

Since all is well except the emails I recommend filtering out the emails as
a workaround before this can be fully investigated and fixed.

Simone, Denis: I can't do more here, looks like a race in agent - broker
initialization and host id management.

Best regards

Martin Sivak


On Fri, Dec 14, 2018 at 12:35 PM fsoyer  wrote:

> In broker.log I found this, just before 05:59am:
>
> Thread-3::INFO::2018-12-13
> 05:58:45,634::mem_free::51::mem_free.MemFree::(action) memFree: 82101
> Thread-1::INFO::2018-12-13 05:58:46,322::ping::60::ping.Ping::(action)
> Successfully pinged 10.0.1.254
> Thread-5::INFO::2018-12-13
> 05:58:46,611::engine_health::241::engine_health.EngineHealth::(_result_from_stats)
> VM is up on this host with healthy engine
> Thread-2::INFO::2018-12-13
> 05:58:49,144::mgmt_bridge::62::mgmt_bridge.MgmtBridge::(action) Found
> bridge ovirtmgmt with ports
> StatusStorageThread::ERROR::2018-12-13
> 05:58:54,935::status_broker::90::ovirt_hosted_engine_ha.broker.status_broker.StatusBroker.Update::(run)
> Failed to update state.
> Traceback (most recent call last):
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/status_broker.py",
> line 82, in run
> if (self._status_broker._inquire_whiteboard_lock() or
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/status_broker.py",
> line 190, in _inquire_whiteboard_lock
> self.host_id, self._lease_file)
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/status_broker.py",
> line 128, in host_id
> raise ex.HostIdNotLockedError("Host id is not set")
> HostIdNotLockedError: Host id is not set
> StatusStorageThread::ERROR::2018-12-13
> 05:58:54,937::status_broker::70::ovirt_hosted_engine_ha.broker.status_broker.StatusBroker.Update::(trigger_restart)
> Trying to restart the broker
>
> "Host is not set" ???
> --
>
> Regards,
>
> *Frank*
>
>
> On Friday, December 14, 2018 12:27 CET, Martin Sivak wrote:
>
>
> Hi,
>
> check the broker.log as well. The connect call in your traceback is used to
> talk to the ovirt-ha-broker service socket.
>
> Best regards
>
> Martin Sivak
>
>
>
> On Fri, Dec 14, 2018 at 12:20 PM fsoyer  wrote:
>
>> I think I have it in agent.log. What can be this "file not found" ?
>>
>> MainThread::ERROR::2018-12-13
>> 05:59:03,909::hosted_engine::431::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Unhandled monitoring loop exception
>> Traceback (most recent call last):
>>   File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 428, in start_monitoring
>> self._monitoring_loop()
>>   File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 447, in _monitoring_loop
>> for old_state, state, delay in self.fsm:
>>   File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py",
>> line 127, in next
>> new_data = self.refresh(self._state.data)
>>   File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py",
>> line 81, in refresh
>> stats.update(self.hosted_engine.collect_stats())
>>   File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 736, in collect_stats
>> all_stats = self._broker.get_stats_from_storage()
>>   File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
>> line 135, in get_stats_from_storage
>> result = self._proxy.get_stats()
>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>> return self.__send(self.__name, args)
>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
>> verbose=self.__verbose
>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>> return self.single_request(host, handler, request_body, verbose)
>>   File "/usr/lib64/python2.7/xmlrpclib

[ovirt-users] Re: engine mails about FSM states

2018-12-14 Thread Martin Sivak
Hi,

check the broker.log as well. The connect call in your traceback is used to
talk to the ovirt-ha-broker service socket.

Best regards

Martin Sivak



On Fri, Dec 14, 2018 at 12:20 PM fsoyer  wrote:

> I think I have it in agent.log. What can be this "file not found" ?
>
> MainThread::ERROR::2018-12-13
> 05:59:03,909::hosted_engine::431::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Unhandled monitoring loop exception
> Traceback (most recent call last):
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 428, in start_monitoring
> self._monitoring_loop()
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 447, in _monitoring_loop
> for old_state, state, delay in self.fsm:
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py",
> line 127, in next
> new_data = self.refresh(self._state.data)
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py",
> line 81, in refresh
> stats.update(self.hosted_engine.collect_stats())
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 736, in collect_stats
> all_stats = self._broker.get_stats_from_storage()
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
> line 135, in get_stats_from_storage
> result = self._proxy.get_stats()
>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
> return self.__send(self.__name, args)
>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
> verbose=self.__verbose
>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
> return self.single_request(host, handler, request_body, verbose)
>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
> self.send_content(h, request_body)
>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
> connection.endheaders(request_body)
>   File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
> self._send_output(message_body)
>   File "/usr/lib64/python2.7/httplib.py", line 881, in _send_output
> self.send(msg)
>   File "/usr/lib64/python2.7/httplib.py", line 843, in send
> self.connect()
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py",
> line 52, in connect
> self.sock.connect(base64.b16decode(self.host))
>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
> return getattr(self._sock,name)(*args)
> error: [Errno 2] No such file or directory
> MainThread::ERROR::2018-12-13
> 05:59:04,043::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
> Traceback (most recent call last):
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
> line 131, in _run_agent
> return action(he)
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
> line 55, in action_proper
> return he.start_monitoring()
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 435, in start_monitoring
> self.publish(stopped)
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 337, in publish
> self._push_to_storage(blocks)
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 707, in _push_to_storage
> self._broker.put_stats_on_storage(self.host_id, blocks)
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
> line 105, in put_stats_on_storage
> self._proxy.put_stats(host_id, xmlrpclib.Binary(data))
>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
> return self.__send(self.__name, args)
>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
> verbose=self.__verbose
>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
> return self.single_request(host, handler, request_body, verbose)
>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
> self.send_content(h, request_body)
>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
> connection.endheaders(request_body)
>   File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
> self._send_

[ovirt-users] Re: engine mails about FSM states

2018-12-14 Thread Martin Sivak
Hi,

No, StartState is not common; it is only ever entered when the agent
boots up. So something restarted or killed the agent process. Check
the agent log in /var/log/ovirt-hosted-engine-ha for errors.
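
To find the errors quickly, something like the following Python sketch can list the ERROR entries. It assumes the Thread::LEVEL::timestamp::module::line:: layout seen in the log excerpts in this thread; the function name error_entries is made up, and the file name agent.log in that directory is taken from the thread, not verified.

```python
import re

# One log entry header: Thread::LEVEL::YYYY-MM-DD HH:MM:SS,mmm::module::line::
ENTRY_RE = re.compile(
    r"^(?P<thread>\S+?)::(?P<level>[A-Z]+)::"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>\w+)::(?P<lineno>\d+)::",
    re.MULTILINE,
)


def error_entries(log_text):
    """Return (timestamp, thread, module) for every ERROR entry header."""
    found = []
    for m in ENTRY_RE.finditer(log_text):
        if m["level"] == "ERROR":
            # Collapse whitespace in case a mail client wrapped the timestamp.
            found.append((" ".join(m["stamp"].split()), m["thread"], m["module"]))
    return found


# In practice log_text would be read from the agent or broker log, e.g.
# /var/log/ovirt-hosted-engine-ha/agent.log (assumed file name).
sample = (
    "Thread-3::INFO::2018-12-13 05:58:45,634::mem_free::51::"
    "mem_free.MemFree::(action) memFree: 82101\n"
    "StatusStorageThread::ERROR::2018-12-13 05:58:54,935::status_broker::90::"
    "ovirt_hosted_engine_ha.broker.status_broker.StatusBroker.Update::(run) "
    "Failed to update state.\n"
)

for stamp, thread, module in error_entries(sample):
    print(stamp, thread, module)
```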

Best regards

Martin Sivak

On Fri, Dec 14, 2018 at 12:05 PM fsoyer  wrote:
>
> Hi Martin,
> my problem is that nobody restarted the agent. Do you mean that this is not
> normal behavior? Is it possible that it restarts itself?
>
> Thanks
> --
>
> Regards,
>
> Frank
>
>
>
> On Thursday, December 13, 2018 15:25 CET, Martin Sivak wrote:
>
>
> Hi,
>
> those are state change notifications from the hosted engine agent. It
> basically means somebody restarted the ha-agent process and it found
> out the VM is still running fine and returned to the proper state.
>
> Configuring it is possible using the broker.conf file in
> /etc/ovirt-hosted-engine-ha (look for the notification section) or the
> hosted-engine tool (search --help for set config) depending on the
> version of hosted engine you are using.
>
> Best regards
>
> --
> Martin Sivak
>
>
> On Thu, Dec 13, 2018 at 3:10 PM fsoyer  wrote:
> >
> > Hi,
> > I can't find a relevant answer about this. Sorry if this was already asked.
> > I randomly receive (once or twice a week, at different hours) 3 mails with
> > these subjects:
> > first: ovirt-hosted-engine state transition StartState-ReinitializeFSM
> > second: ovirt-hosted-engine state transition ReinitializeFSM-EngineStarting
> > third: ovirt-hosted-engine state transition EngineStarting-EngineUp
> > all at exactly the same time. The "events" in the GUI don't indicate anything
> > about this. No impact on engine or VMs.
> > So I wonder what these messages mean? And, if they are just "info"
> > messages, is there a way to disable them?
> >
> > Thanks.
> > --
> >
> > Regards,
> >
> > Frank
> >
> > ___
> > Users mailing list -- users@ovirt.org
> > To unsubscribe send an email to users-le...@ovirt.org
> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> > oVirt Code of Conduct: 
> > https://www.ovirt.org/community/about/community-guidelines/
> > List Archives: 
> > https://lists.ovirt.org/archives/list/users@ovirt.org/message/CVEHTWILWDEHASTCQHFHX62U4K4ZCOSK/
>
>
>
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6B7QUFZXKMNMRTK4KKDEVZUEAP2FROII/


[ovirt-users] Re: engine mails about FSM states

2018-12-13 Thread Martin Sivak
Hi,

those are state change notifications from the hosted engine agent. It
basically means somebody restarted the ha-agent process and it found
out the VM is still running fine and returned to the proper state.

Configuring it is possible using the broker.conf file in
/etc/ovirt-hosted-engine-ha (look for the notification section) or the
hosted-engine tool (search --help for set config) depending on the
version of hosted engine you are using.

Best regards

--
Martin Sivak


On Thu, Dec 13, 2018 at 3:10 PM fsoyer  wrote:
>
> Hi,
> I can't find a relevant answer about this. Sorry if this was already asked.
> I randomly receive (once or twice a week, at different hours) 3 mails with
> these subjects:
> first: ovirt-hosted-engine state transition StartState-ReinitializeFSM
> second: ovirt-hosted-engine state transition ReinitializeFSM-EngineStarting
> third: ovirt-hosted-engine state transition EngineStarting-EngineUp
> all at exactly the same time. The "events" in the GUI don't indicate anything
> about this. There is no impact on the engine or the VMs.
> So I wonder what these messages mean? And, if they are just "info" messages,
> is there a way to disable them?
>
> Thanks.
> --
>
> Regards,
>
> Frank
>


[ovirt-users] Re: Memory usage is inaccurate on ovirt web manager.

2018-11-13 Thread Martin Sivak
Hi,

please notice the Shared Memory lines:

node2:
 Shared Memory:57%

vs.

node3:
 Shared Memory:98%

It seems you have overcommit enabled and KSM started deduplicating
pages. This frees memory when there are multiple VMs that use
identical pages. And node3 either has VMs that are similar or was just
faster when searching for identical pages.
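
A rough way to see how much KSM is saving on a given host is to read the kernel's KSM counters directly. The sketch below assumes a Linux host that exposes /sys/kernel/mm/ksm; it is only an estimate based on the pages_sharing counter and is not how the web UI computes its "Shared Memory" percentage.

```python
import os
from pathlib import Path

KSM_DIR = Path("/sys/kernel/mm/ksm")


def ksm_saved_bytes(pages_sharing: int, page_size: int) -> int:
    """Approximate memory saved by KSM.

    Each page counted in pages_sharing is a duplicate that is mapped
    onto an already shared page and no longer needs its own physical page.
    """
    return pages_sharing * page_size


def read_counter(name: str) -> int:
    """Read one integer counter from the KSM sysfs directory."""
    return int((KSM_DIR / name).read_text().strip())


if __name__ == "__main__":
    if (KSM_DIR / "pages_sharing").exists():
        page_size = os.sysconf("SC_PAGE_SIZE")
        saved = ksm_saved_bytes(read_counter("pages_sharing"), page_size)
        print(f"KSM is saving roughly {saved / 2**20:.0f} MiB")
    else:
        print("KSM counters are not available on this system")
```

Running it on both nodes while the same 25 VMs are up should show a noticeably larger saving on the node with the higher sharing figure.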

Best regards

--
Martin Sivak


On Tue, Nov 13, 2018 at 8:13 AM,   wrote:
> Hi, I encountered an issue.
> Environment: oVirt 3.5, node2 + node3 + engine; node2 and node3 are in the same
> cluster.
> node2: 64 GB memory
> node3: 64 GB memory
> If I start 25 VMs on node2, node2's memory usage goes up to almost 80%.
> But if I start 25 VMs on node3, node3's memory usage is normal, about 70%.
> The VMs are all the same.
> I also checked "Hosts > General".
> on node2:
>    Physical Memory: 64301 MB total, 49512 MB used, 14789 MB free
>    Swap Size: 28567 MB total, -858090 MB used, 886657 MB free
>    Shared Memory: 57%
>    Max free Memory for scheduling new VMs: 24191 MB
>    Memory Page Sharing: Active
> on node3:
>    Physical Memory: 64301 MB total, 37938 MB used, 26363 MB free
>    Swap Size: 28567 MB total, 161 MB used, 28406 MB free
>    Shared Memory: 98%
>    Max free Memory for scheduling new VMs: 24191 MB
>    Memory Page Sharing: Active
>
> Best regards


[ovirt-users] Re: ovirt 4.2.7 nested not importing she domain

2018-11-09 Thread Martin Sivak
Hi,

> It completed without errors and the hosted_engine storage domain and the
> HostedEngine inside it were already visible, without the former dependency
> to create a data domain

Glad it works for you.

This is indeed one of the few small improvements in the new deployment
procedure :) We do not recommend using the old procedure anymore
unless there is something special that does not work there. In other
words, try ansible first from now on.

Best regards

--
Martin Sivak
HE ex-maintainer :)

On Fri, Nov 9, 2018 at 1:56 PM, Gianluca Cecchi
 wrote:
>
> On Fri, Nov 9, 2018 at 11:28 AM Simone Tiraboschi 
> wrote:
>>
>>
>>
>> On Fri, Nov 9, 2018 at 12:45 AM Gianluca Cecchi
>>  wrote:
>>>
>>> Hello,
>>> I'm configuring a nested self hosted engine environment with 4.2.7 and
>>> CentOS 7.5.
>>> Domain type is NFS.
>>> I deployed with
>>>
>>> hosted-engine --deploy --noansible
>>>
>>> Everything apparently went well, but after creating the master storage domain I
>>> see that the hosted engine domain is not automatically imported.
>>> At the moment I have only one host.
>>>
>>> ovirt-ha-agent status gives every 10 seconds:
>>> Nov 09 00:36:30 ovirtdemo01.localdomain.local ovirt-ha-agent[18407]:
>>> ovirt-ha-agent
>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config.vm ERROR
>>> Unable to identify the OVF_STORE volume, falling back to initial vm.conf.
>>> Please ensure you already added your first data domain for regular VMs
>>>
>>> In engine.log I see a dumpxml output every 15 seconds, and the message:
>>>
>>> 2018-11-09 00:31:52,822+01 WARN
>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder]
>>> (EE-ManagedThreadFactory-engineScheduled-Thread-52) [7fcce3cb] null
>>> architecture type, replacing with x86_64, VM [HostedEngine]
>>>
>>> see full below.
>>>
>>> Any hint?
>>>
>>
>> Hi Gianluca,
>> unfortunately it's a known regression: it's currently tracked here
>> https://bugzilla.redhat.com/1639604
>>
>> In the meantime I'd suggest using the new ansible flow, which is not
>> affected by this issue, or deploying with an engine-appliance shipped before
>> 4.2.5 and completing the upgrade on the engine side only when everything is
>> there as expected.
>
>
> Thanks Simone,
> I scratched and reinstalled using the 4.2. appliance and the default option
> (with ansible) executing the command:
>
>  hosted-engine --deploy
>

> Gianluca
>


[ovirt-users] Re: Affinity rules in ovirt

2018-10-16 Thread Martin Sivak
Hi,

> "hosts_rule": {"enabled": "true", "enforcing": "true", "positive": "false" },

Notice positive: false. You created a rule which repels those VMs from
the host you selected.
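
A simplified sketch of what the positive flag of an enforcing hosts_rule does to the set of hosts a VM may run on. This only illustrates the semantics described in this thread; it is not the engine's scheduler code, and the function name is hypothetical.

```python
from typing import Iterable, Set


def allowed_hosts(all_hosts: Iterable[str], rule_hosts: Iterable[str],
                  enabled: bool, positive: bool) -> Set[str]:
    """Hosts where a VM from the affinity group may run under an
    enforcing hosts_rule."""
    hosts, selected = set(all_hosts), set(rule_hosts)
    if not enabled:
        return hosts
    # positive: keep the VMs on the selected hosts;
    # negative: keep the VMs away from the selected hosts.
    return hosts & selected if positive else hosts - selected


cluster = ["TestHost1", "TestHost2"]
print(allowed_hosts(cluster, ["TestHost2"], enabled=True, positive=False))
print(allowed_hosts(cluster, ["TestHost2"], enabled=True, positive=True))
```

With the configuration from the question (TestHost2 selected, positive false) only TestHost1 remains, which matches what was observed. Setting positive to true would leave only TestHost2.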

Martin

On Tue, Oct 16, 2018 at 7:47 AM, Hari Prasanth Loganathan
 wrote:
> Hi Team,
>
> The affinity rule is not working as expected. I created an affinity group and
> added two VMs and one host (say TestHost2; NOTE: two hosts are available) with
> the following configuration:
>
>     "enforcing": "true",
>     "hosts_rule": {
>         "enabled": "true",
>         "enforcing": "true",
>         "positive": "false"
>     },
>     "positive": "true",
>     "vms_rule": {
>         "enabled": "true",
>         "enforcing": "true",
>         "positive": "true"
>     },
>
> But both VMs are coming up on TestHost1. Am I missing something?
>
> Thanks,
> Hari
>
>
> On Mon, Oct 15, 2018 at 4:05 PM Martin Sivak  wrote:
>>
>> > I have found that migration doesn't work when using
>> > vms_rule Enforcing Hard.
>> > Scenario: VMs in an affinity group with Enforcing Hard and no host members.
>> > When trying to migrate the VMs, it comes back with no host to migrate to, even
>> > when selecting all VMs in the affinity group.
>>
>> Yes, you are right, the engine checks the VMs one by one. We do not support
>> group migration in the scheduler. It was never considered important
>> enough to offset the development costs (the internal architecture
>> would have to change significantly). The workaround is to relax the
>> affinity rule a bit first, do the migration, and then restore the affinity
>> rule.
>>
>> Martin Sivak
>>
>> On Mon, Oct 15, 2018 at 12:13 PM, Staniforth, Paul
>>  wrote:
>> > Hello,
>> > I have found that migration doesn't work when using
>> > vms_rule Enforcing Hard.
>> > Scenario: VMs in an affinity group with Enforcing Hard and no host members.
>> > When trying to migrate the VMs, it comes back with no host to migrate to, even
>> > when selecting all VMs in the affinity group.
>> >
>> > Regards,
>> >   Paul S.
>> > 
>> > From: Martin Sivak 
>> > Sent: 15 October 2018 09:43
>> > To: Darrell Budic
>> > Cc: Ovirt Users
>> > Subject: [ovirt-users] Re: Affinity rules in ovirt
>> >
>> > Hi,
>> >
>> >> 3) What is the difference between VM to VM affinity and VM to Host
>> >> affinity?
>> >
>> > - VM - VM rule expresses the relationship of all running VMs from the
>> > specified list to each other.
>> > - VM - Host rule describes the relationship of all running VMs from
>> > the list to the selected hosts
>> >
>> >> Say I have 2 hosts, so what happens in case of failure of host running
>> >> the VMs?
>> >
>> > VMs that are not up are not considered by the scheduler. So it will
>> > (assuming those are highly available VMs) start the first VM somewhere
>> > and then the second VM will be started on the same host where the
>> > first one is running.
>> >
>> >> Enforcing will keep the scheduler from launching a VM if it can’t meet
>> >> those criteria, and will try and make changes to which hosts are running
>> >> where if it can. If you don’t set it, it will still try and launch VMs on
>> >> different hosts, but if it can’t, it will still launch the VM.
>> >
>> > Correct.
>> >
>> >> It also won’t make
>> >> changes to bring all VMs into compliance with the Affinity rules, from
>> >> what I can tell.
>> >
>> > It will try that too, but rather less aggressively.
>> >
>> >
>> > Best regards
>> >
>> > Martin Sivak
>> >
>> > On Sun, Oct 14, 2018 at 7:59 PM, Darrell Budic 
>> > wrote:
>> >> VM to VM affinity will try and run the VMs on the same host if
>> >> positive, and
>> >> different hosts if negative.
>> >>
>> >> VM to Host affinity will try and run the VM on a specific set of Hosts
>> >

[ovirt-users] Re: Affinity rules in ovirt

2018-10-15 Thread Martin Sivak
> I have found that migration doesn't work when using vms_rule
> Enforcing Hard.
> Scenario: VMs in an affinity group with Enforcing Hard and no host members.
> When trying to migrate the VMs, it comes back with no host to migrate to, even
> when selecting all VMs in the affinity group.

Yes, you are right, the engine checks the VMs one by one. We do not support
group migration in the scheduler. It was never considered important
enough to offset the development costs (the internal architecture
would have to change significantly). The workaround is to relax the
affinity rule a bit first, do the migration, and then restore the affinity
rule.
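
To illustrate why checking VMs one by one blocks the migration, here is a simplified model of a hard positive VM-to-VM rule. It is a sketch of the behavior described above, not the engine's scheduler; the function and the VM/host names are hypothetical.

```python
from typing import Dict, Iterable, Set


def migration_targets(vm: str, group: Iterable[str],
                      placement: Dict[str, str],
                      hosts: Iterable[str]) -> Set[str]:
    """Destinations for one VM under a hard positive VM-to-VM rule,
    evaluated for that VM alone while its peers stay where they are."""
    source = placement[vm]
    peer_hosts = {placement[peer] for peer in group
                  if peer != vm and peer in placement}
    candidates = set(hosts) - {source}
    if peer_hosts:
        # The VM must land on the host(s) where its running peers already are.
        candidates &= peer_hosts
    return candidates


placement = {"vm1": "hostA", "vm2": "hostA"}
hosts = ["hostA", "hostB"]

# Rule in force: vm2 is still on hostA, so vm1 has nowhere to go.
print(migration_targets("vm1", ["vm1", "vm2"], placement, hosts))  # set()

# Rule relaxed (modelled here as an empty group): hostB becomes a valid target.
print(migration_targets("vm1", [], placement, hosts))  # {'hostB'}
```

Because each VM is evaluated on its own, the peers that have not moved yet pin it to the source host, which is why relaxing the rule for the duration of the migration works.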

Martin Sivak

On Mon, Oct 15, 2018 at 12:13 PM, Staniforth, Paul
 wrote:
> Hello,
> I have found that migration doesn't work when using vms_rule
> Enforcing Hard.
> Scenario: VMs in an affinity group with Enforcing Hard and no host members.
> When trying to migrate the VMs, it comes back with no host to migrate to, even
> when selecting all VMs in the affinity group.
>
> Regards,
>   Paul S.
> ____
> From: Martin Sivak 
> Sent: 15 October 2018 09:43
> To: Darrell Budic
> Cc: Ovirt Users
> Subject: [ovirt-users] Re: Affinity rules in ovirt
>
> Hi,
>
>> 3) What is the difference between VM to VM affinity and VM to Host affinity?
>
> - VM - VM rule expresses the relationship of all running VMs from the
> specified list to each other.
> - VM - Host rule describes the relationship of all running VMs from
> the list to the selected hosts
>
>> Say I have 2 hosts, so what happens in case of failure of host running the 
>> VMs?
>
> VMs that are not up are not considered by the scheduler. So it will
> (assuming those are highly available VMs) start the first VM somewhere
> and then the second VM will be started on the same host where the
> first one is running.
>
>> Enforcing will keep the scheduler from launching a VM if it can’t meet those
>> criteria, and will try and make changes to which hosts are running where if
>> it can. If you don’t set it, it will still try and launch VMs on different
>> hosts, but if it can’t, it will still launch the VM.
>
> Correct.
>
>> It also won’t make
>> changes to bring all VMs into compliance with the Affinity rules, from what
>> I can tell.
>
> It will try that too, but rather less aggressively.
>
>
> Best regards
>
> Martin Sivak
>
> On Sun, Oct 14, 2018 at 7:59 PM, Darrell Budic  wrote:
>> VM to VM affinity will try and run the VMs on the same host if positive, and
>> different hosts if negative.
>>
>> VM to Host affinity will try and run the VM on a specific set of Hosts if
>> positive, and not on those hosts if negative.
>>
>> Enforcing will keep the scheduler from launching a VM if it can’t meet those
>> criteria, and will try and make changes to which hosts are running where if
>> it can. If you don’t set it, it will still try and launch VMs on different
>> hosts, but if it can’t, it will still launch the VM. It also won’t make
>> changes to bring all VMs into compliance with the Affinity rules, from what
>> I can tell.
>>
>> On Oct 14, 2018, at 8:15 AM, Hari Prasanth Loganathan
>>  wrote:
>>
>> Hi Team,
>>
>> I tried to follow the affinity rules using this tutorial:
>> https://www.youtube.com/watch?v=rs_5BSqacWE but I need a few clarifications
>> on it.
>>
>> 1) I understand VM to VM affinity: it means both individual VMs
>> need to run on a single common host. Say I have 2 hosts, so what happens in
>> case of failure of the host running the VMs?
>> 2) If I create an affinity group and get the following data,
>> i) What do host rule / vms rule - enabled, enforcing and
>> positive mean?
>>
>>     "enforcing": "true",
>>     "hosts_rule": {
>>         "enabled": "true",
>>         "enforcing": "true",
>>         "positive": "false"
>>     },
>>     "vms_rule": {
>>         "enabled": "false",
>>         "enforcing": "true",
>>         "positive": "false"
>>     }
>>
>> 3) What is the difference between VM to VM affinity and VM to Host affinity?
>>
>> The docs are not very clear, so any help is appreciated.
>>
>> Thanks,
>> Hari
>>
>>
>>
>> DISCLAIMER - MSysTechnologies LLC
>>
>>
>> This email message, contents a

[ovirt-users] Re: Affinity rules in ovirt

2018-10-15 Thread Martin Sivak
Hi,

> 3) What is the difference between VM to VM affinity and VM to Host affinity?

- VM - VM rule expresses the relationship of all running VMs from the
specified list to each other.
- VM - Host rule describes the relationship of all running VMs from
the list to the selected hosts

> Say I have 2 hosts, so what happens in case of failure of host running the 
> VMs?

VMs that are not up are not considered by the scheduler. So it will
(assuming those are highly available VMs) start the first VM somewhere
and then the second VM will be started on the same host where the
first one is running.

> Enforcing will keep the scheduler from launching a VM if it can’t meet those
> criteria, and will try and make changes to which hosts are running where if
> it can. If you don’t set it, it will still try and launch VMs on different
> hosts, but if it can’t, it will still launch the VM.

Correct.

> It also won’t make
> changes to bring all VMs into compliance with the Affinity rules, from what
> I can tell.

It will try that too, but rather less aggressively.


Best regards

Martin Sivak

On Sun, Oct 14, 2018 at 7:59 PM, Darrell Budic  wrote:
> VM to VM affinity will try and run the VMs on the same host if positive, and
> different hosts if negative.
>
> VM to Host affinity will try and run the VM on a specific set of Hosts if
> positive, and not on those hosts if negative.
>
> Enforcing will keep the scheduler from launching a VM if it can’t meet those
> criteria, and will try and make changes to which hosts are running where if
> it can. If you don’t set it, it will still try and launch VMs on different
> hosts, but if it can’t, it will still launch the VM. It also won’t make
> changes to bring all VMs into compliance with the Affinity rules, from what
> I can tell.
>
> On Oct 14, 2018, at 8:15 AM, Hari Prasanth Loganathan
>  wrote:
>
> Hi Team,
>
> I tried to follow the affinity rules using this tutorial:
> https://www.youtube.com/watch?v=rs_5BSqacWE but I need a few clarifications
> on it.
>
> 1) I understand VM to VM affinity: it means both individual VMs
> need to run on a single common host. Say I have 2 hosts, so what happens in
> case of failure of the host running the VMs?
> 2) If I create an affinity group and get the following data,
> i) What do host rule / vms rule - enabled, enforcing and
> positive mean?
>
>     "enforcing": "true",
>     "hosts_rule": {
>         "enabled": "true",
>         "enforcing": "true",
>         "positive": "false"
>     },
>     "vms_rule": {
>         "enabled": "false",
>         "enforcing": "true",
>         "positive": "false"
>     }
>
> 3) What is the difference between VM to VM affinity and VM to Host affinity?
>
> The docs are not very clear, so any help is appreciated.
>
> Thanks,
> Hari
>
>
>


[ovirt-users] Re: Turn Off Email Alerts

2018-08-29 Thread Martin Sivak
Hi,

> Yes, indeed!
>
> How can I change this internal Python setting?

The same way as the notification regex, using hosted-engine
--set-shared-config   --type=

Executing hosted-engine --get-shared-config xxx (or anything else that
does not exist) should give you the list of all types and keys you can
change (on 4.2 for sure, and very probably on 4.1 too).
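
For scripting, the query can be wrapped in a small helper. The sketch below only uses the invocation and output format that appear in this thread (hosted-engine --get-shared-config notify.state_transition --type=broker and its "key : value, type : broker" reply); the helper names are hypothetical, and the output format on other versions is an assumption.

```python
import re
from typing import List, NamedTuple, Optional


class SharedConfigEntry(NamedTuple):
    key: str
    value: str
    config_type: str


# Matches lines such as:
#   notify.state_transition : maintenance|start|stop|migrate|up|down, type : broker
_LINE = re.compile(
    r"^\s*(?P<key>\S+)\s*:\s*(?P<value>.*?),\s*type\s*:\s*(?P<type>\S+)\s*$"
)


def build_get_command(key: str, config_type: str) -> List[str]:
    """Argument list for querying one shared-config key."""
    return ["hosted-engine", "--get-shared-config", key, f"--type={config_type}"]


def parse_shared_config_line(line: str) -> Optional[SharedConfigEntry]:
    """Parse one output line; return None if it has a different shape."""
    match = _LINE.match(line)
    if match is None:
        return None
    return SharedConfigEntry(match["key"], match["value"], match["type"])


print(build_get_command("notify.state_transition", "broker"))
print(parse_shared_config_line(
    "notify.state_transition : maintenance|start|stop|migrate|up|down, type : broker"))
```

The argument list can be passed to subprocess.run on a hosted-engine host, and the parsed value compared before and after a change.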

Best regards

Martin Sivak


On Wed, Aug 29, 2018 at 11:56 PM, Douglas Duckworth
 wrote:
> Yes, indeed!
>
> How can I change this internal Python setting?
>
> On Wed, Aug 29, 2018, 5:43 PM Martin Sivak  wrote:
>>
>> Hi,
>>
>> two clarifications:
>>
>> Hosted engine is sending those emails using the built-in Python SMTP
>> client that talks directly to the SMTP server specified during install
>> time. We default to localhost, but you might have changed it.
>>
>> > notify.state_transition : maintenance|start|stop|migrate|up|down, type :
>> > broker
>>
>> The value here is a regular expression that is matched against the
>> state transition string in the email.
>>
>> Best regards
>>
>> Martin Sivak
>>
>> On Wed, Aug 29, 2018 at 10:34 PM, Douglas Duckworth
>>  wrote:
>> > Thanks for sharing
>> >
>> > I may want to do that
>> >
>> > Though first I want to understand how the emails are arriving.
>> >
>> > I stopped ovirt-engine-notifier.service and postfix.service on all hosts
>> > and the hosted engine. So how are the emails being delivered? They are not
>> > running sendmail, so I don't understand what daemon is sending these
>> > messages.
>> >
>> > Thanks,
>> >
>> > Douglas Duckworth, MSc, LFCS
>> > HPC System Administrator
>> > Scientific Computing Unit
>> > Weill Cornell Medicine
>> > 1300 York Avenue
>> > New York, NY 10065
>> > E: d...@med.cornell.edu
>> > O: 212-746-6305
>> > F: 212-746-8690
>> >
>> >
>> >
>> > On Wed, Aug 29, 2018 at 3:16 PM, Simone Tiraboschi 
>> > wrote:
>> >>
>> >> Hi,
>> >> you can change the list of status you want to be notified about with
>> >> hosted-engine --set-shared-config notify.state_transition
>> >> The default is:
>> >>
>> >> [root@hehost01 ~]# hosted-engine --get-shared-config
>> >> notify.state_transition --type=broker
>> >>
>> >> notify.state_transition : maintenance|start|stop|migrate|up|down, type
>> >> :
>> >> broker
>> >>
>> >>
>> >> On Wed, Aug 29, 2018 at 7:47 PM Douglas Duckworth
>> >>  wrote:
>> >>>
>> >>> I agree; however, we are in a testing phase, so I make changes a lot.
>> >>> Therefore alerts are not presently needed.
>> >>>
>> >>> So how do I turn them off?
>> >>>
>> >>> These steps do not work on hosted engine:
>> >>>
>> >>>
>> >>> me@ovirt-engine[~]$ sudo systemctl stop postfix.service
>> >>> me@ovirt-engine[~]$ sudo systemctl stop ovirt-engine-notifier.service
>> >>> me@ovirt-engine[~]$ sudo systemctl status
>> >>> ovirt-engine-notifier.service
>> >>> ● ovirt-engine-notifier.service - oVirt Engine Notifier
>> >>>Loaded: loaded
>> >>> (/usr/lib/systemd/system/ovirt-engine-notifier.service;
>> >>> enabled; vendor preset: disabled)
>> >>>Active: inactive (dead) since Wed 2018-08-29 13:41:55 EDT; 3s ago
>> >>>   Process: 1814
>> >>>
>> >>> ExecStart=/usr/share/ovirt-engine/services/ovirt-engine-notifier/ovirt-engine-notifier.py
>> >>> --redirect-output --systemd=notify $EXTRA_ARGS start (code=exited,
>> >>> status=0/SUCCESS)
>> >>>  Main PID: 1814 (code=exited, status=0/SUCCESS)
>> >>>
>> >>> Aug 25 12:09:31 ovirt-engine.pbtech systemd[1]: Starting oVirt Engine
>> >>> Notifier...
>> >>> Aug 25 12:09:33 ovirt-engine.pbtech systemd[1]: Started oVirt Engine
>> >>> Notifier.
>> >>> Aug 29 13:41:54 ovirt-engine.pbtech systemd[1]: Stopping oVirt Engine
>> >>> Notifier...
>> >>> Aug 29 13:41:55 ovirt-engine.pbtech systemd[1]: Stopped oVirt Engine
>> >>> Notifier.
>> >>>
>> >>> I still get an email every time I put a host in maint mode for
>> >>> example.
>> >>>
>

[ovirt-users] Re: Turn Off Email Alerts

2018-08-29 Thread Martin Sivak
Hi,

two clarifications:

Hosted engine is sending those emails using the built-in Python SMTP
client that talks directly to the SMTP server specified during install
time. We default to localhost, but you might have changed it.

> notify.state_transition : maintenance|start|stop|migrate|up|down, type : 
> broker

The value here is a regular expression that is matched against the
state transition string in the email.
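
A small sketch of how such a pattern filters transitions. The pattern is the default shown above and the transition strings are the mail subjects reported on this list; whether the broker uses a case-insensitive search (as assumed by default here) or a stricter match is an assumption, so both modes are shown.

```python
import re

DEFAULT_PATTERN = "maintenance|start|stop|migrate|up|down"


def matches_transition(pattern: str, transition: str,
                       ignore_case: bool = True) -> bool:
    """True if the pattern is found anywhere in the transition string."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, transition, flags) is not None


for transition in ("StartState-ReinitializeFSM",
                   "ReinitializeFSM-EngineStarting",
                   "EngineStarting-EngineUp"):
    print(transition,
          matches_transition(DEFAULT_PATTERN, transition),
          matches_transition("maintenance", transition))
```

Under these assumptions the default pattern matches all three transitions, while a narrower pattern such as "maintenance" matches none of them, so narrowing the regular expression is one way to reduce the mails.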

Best regards

Martin Sivak

On Wed, Aug 29, 2018 at 10:34 PM, Douglas Duckworth
 wrote:
> Thanks for sharing
>
> I may want to do that
>
> Though first I want to understand how the emails are arriving.
>
> I stopped ovirt-engine-notifier.service and postfix.service on all hosts and
> the hosted engine.  So how are the emails being delivered?  They are not
> running sendmail, so I don't understand what daemon is sending these messages.
>
> Thanks,
>
> Douglas Duckworth, MSc, LFCS
> HPC System Administrator
> Scientific Computing Unit
> Weill Cornell Medicine
> 1300 York Avenue
> New York, NY 10065
> E: d...@med.cornell.edu
> O: 212-746-6305
> F: 212-746-8690
>
>
>
> On Wed, Aug 29, 2018 at 3:16 PM, Simone Tiraboschi 
> wrote:
>>
>> Hi,
>> you can change the list of status you want to be notified about with
>> hosted-engine --set-shared-config notify.state_transition
>> The default is:
>>
>> [root@hehost01 ~]# hosted-engine --get-shared-config
>> notify.state_transition --type=broker
>>
>> notify.state_transition : maintenance|start|stop|migrate|up|down, type :
>> broker
>>
>>
>> On Wed, Aug 29, 2018 at 7:47 PM Douglas Duckworth
>>  wrote:
>>>
>>> I agree; however, we are in a testing phase, so I make changes a lot.
>>> Therefore alerts are not presently needed.
>>>
>>> So how do I turn them off?
>>>
>>> These steps do not work on hosted engine:
>>>
>>>
>>> me@ovirt-engine[~]$ sudo systemctl stop postfix.service
>>> me@ovirt-engine[~]$ sudo systemctl stop ovirt-engine-notifier.service
>>> me@ovirt-engine[~]$ sudo systemctl status ovirt-engine-notifier.service
>>> ● ovirt-engine-notifier.service - oVirt Engine Notifier
>>>Loaded: loaded (/usr/lib/systemd/system/ovirt-engine-notifier.service;
>>> enabled; vendor preset: disabled)
>>>Active: inactive (dead) since Wed 2018-08-29 13:41:55 EDT; 3s ago
>>>   Process: 1814
>>> ExecStart=/usr/share/ovirt-engine/services/ovirt-engine-notifier/ovirt-engine-notifier.py
>>> --redirect-output --systemd=notify $EXTRA_ARGS start (code=exited,
>>> status=0/SUCCESS)
>>>  Main PID: 1814 (code=exited, status=0/SUCCESS)
>>>
>>> Aug 25 12:09:31 ovirt-engine.pbtech systemd[1]: Starting oVirt Engine
>>> Notifier...
>>> Aug 25 12:09:33 ovirt-engine.pbtech systemd[1]: Started oVirt Engine
>>> Notifier.
>>> Aug 29 13:41:54 ovirt-engine.pbtech systemd[1]: Stopping oVirt Engine
>>> Notifier...
>>> Aug 29 13:41:55 ovirt-engine.pbtech systemd[1]: Stopped oVirt Engine
>>> Notifier.
>>>
>>> I still get an email every time I put a host in maint mode for example.
>>>
>>>
>>> Thanks,
>>>
>>> Douglas Duckworth, MSc, LFCS
>>> HPC System Administrator
>>> Scientific Computing Unit
>>> Weill Cornell Medicine
>>> 1300 York Avenue
>>> New York, NY 10065
>>> E: d...@med.cornell.edu
>>> O: 212-746-6305
>>> F: 212-746-8690
>>>
>>>
>>>
>>> On Tue, Aug 28, 2018 at 4:43 PM, Johan Bernhardsson 
>>> wrote:
>>>>
>>>> Those alerts are also coming from hosted-engine that keeps ovirt manager
>>>> running.
>>>>
>>>> I would rather have a filter in my email client for them than disabling
>>>> all of the alerting stuff
>>>>
>>>> /Johan
>>>>
>>>> On August 28, 2018 22:36:34 Douglas Duckworth 
>>>> wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> Can someone please help?  I keep getting ovirt alerts via email despite
>>>>> turning off postfix and ovirt-engine-notifier.service
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Douglas Duckworth, MSc, LFCS
>>>>> HPC System Administrator
>>>>> Scientific Computing Unit
>>>>> Weill Cornell Medicine
>>>>> 1300 York Avenue
>>>>> New York, NY 10065
>>>>> E: d...@med.cornell.edu
>>>>> O: 212-746-6305
>>>>> F

[ovirt-users] Re: MoM - Not working ? How to use it ?

2018-08-02 Thread Martin Sivak
Hi,

Did you put the host into maintenance and activate it again after you
enabled ballooning? Alternatively, look for the slightly hidden Sync MoM
Policy link somewhere in the host area, which would force it.

The log seems to indicate ballooning is either disabled or "not necessary".
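
As a rough mental model of when ballooning is "not necessary", consider the deliberately simplified policy below. It is not MoM's actual policy code: the function name, the 20% pressure threshold, and the 5% step are assumptions chosen only to show the roles of host memory pressure and of the guaranteed memory floor.

```python
def balloon_target_mb(current_mb: int, guaranteed_mb: int, max_mb: int,
                      host_free_fraction: float,
                      pressure_threshold: float = 0.20,
                      step_fraction: float = 0.05) -> int:
    """Next balloon target for one guest under a simplified policy."""
    step = int(max_mb * step_fraction)
    if host_free_fraction >= pressure_threshold:
        # No memory pressure on the host: let the guest grow back.
        target = current_mb + step
    else:
        # Host under pressure: reclaim a small step from the guest.
        target = current_mb - step
    # Never go below the guaranteed memory or above the configured maximum.
    return max(guaranteed_mb, min(max_mb, target))


# Guest sized as in the report below: 28672 MB maximum, 2048 MB guaranteed.
print(balloon_target_mb(28672, 2048, 28672, host_free_fraction=0.50))
print(balloon_target_mb(28672, 2048, 28672, host_free_fraction=0.10))
```

In this model nothing is reclaimed while the host has enough free memory, and reclaiming happens in small steps, so a guest is not shrunk to its guaranteed size immediately.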

Martin

On Tue, Jul 31, 2018 at 12:07 PM,   wrote:
> Hello,
>
> I'm trying to test the Memory Balloon functionality, but I'm unable to
> make it work successfully.
>
> First, I enabled MoM on my cluster:
> Memory Balloon
> Enable Memory Balloon Optimization
>
> Second, I configured a virtual machine like this:
> On the System tab:
> Memory Size: 28672 MB
> Maximum memory: 28672 MB
> On the Resource Allocation tab:
> Memory Allocation:
> Physical Memory Guaranteed: 2048 MB
> Memory Balloon Device: Enabled
>
> Third, I tested:
> 1- I boot the VM. After boot the VM consumes 200 MB.
> 2- I launch a Perl program which consumes 26 GB.
> 3- During this, host RAM usage is near 90%.
> 4- I stop the Perl program, and on the guest side RAM usage drops to 200 MB.
> 5- On the host side, RAM usage stays at 24 - 26 GB. I waited some time to see
> whether the qemu-vm process would drop its host RAM usage. It stayed at 24 GB.
> 6- I try to launch another VM like the first one (but with 8 GB RAM and 2 GB
> guaranteed) on the host (to "force" MoM?), but I'm unable to launch it: host
> free memory is insufficient.
>
> I changed the MoM log level to debug to show something:
> 2018-07-31 11:49:31,887 - mom.RPCServer - INFO - ping()
> 2018-07-31 11:49:31,888 - mom.RPCServer - INFO - getStatistics()
> 2018-07-31 11:49:31,888 - mom.Monitor - DEBUG - Field 'mem_free' not known. 
> Ignoring.
>
>
> 2018-07-31 11:49:43,313 - mom.VdsmRpcBase - DEBUG - VM List: 
> [u'9632443f-d302-43f7-a279-778e64ee98f4']
> 2018-07-31 11:49:43,465 - mom.VdsmRpcBase - DEBUG - Memory stats: 
> {'swap_out': 0, 'swap_usage': 0, 'mem_free': 28221020, 'major_fault': 0, 
> 'swap_in': 0, 'swap_total': 0, 'mem_available': 28650472, 'minor_fault': 99, 
> 'mem_unused': 28221020}
> 2018-07-31 11:49:43,468 - mom.Monitor - DEBUG - Collector 
>  0x167de18> did not return any data
> 2018-07-31 11:49:43,931 - mom.RPCServer - INFO - ping()
> 2018-07-31 11:49:43,932 - mom.RPCServer - INFO - getStatistics()
> 2018-07-31 11:49:43,932 - mom.Monitor - DEBUG - Field 'mem_free' not known. 
> Ignoring.
> 2018-07-31 11:49:44,258 - mom.Monitor - DEBUG - Field 'mem_free' not known. 
> Ignoring.
> 2018-07-31 11:49:44,284 - mom.Evaluator - DEBUG - debug: ('No shared pages, 
> setting ksm_merge_across_nodes to', 1)
> 2018-07-31 11:49:44,286 - mom.Evaluator - DEBUG - debug: ('entry: 
> apply_NUMA_policy',)
> 2018-07-31 11:49:44,286 - mom.Evaluator - DEBUG - debug: (1, 
> '=ksm_merge_across_nodes ACTUAL from kernel')
> 2018-07-31 11:49:44,286 - mom.Evaluator - DEBUG - debug: (1, 
> '=ksmMergeAcrossNodes REQUIRED from oVirt-engine')
> 2018-07-31 11:49:44,289 - mom.Evaluator - DEBUG - debug: ('exit: 
> apply_NUMA_policy return_value = ', 0)
> 2018-07-31 11:49:44,310 - mom.Policy - DEBUG - Results: [0, 1, 1, 1, 0, 1, 1, 
> 0.2, 0.05, 0.2, 0.05, 0.0025, 'change_big_enough', 'shrink_guest', 
> 'grow_guest', 0.22506353015141964, 'balloon_logic', [0], 'guest_qos', [0], 
> 300, -50, 64, 1250, 10, 0.2, 'change_npages', 'apply_NUMA_policy', 6550590.4, 
> 30058208, None, 0, 0, None, -1, 10, 'check_and_set_quota', 
> 'reset_quota_and_period', [None], 0, 'set_io_limits', 'reset_io_limits', [0]]
>
>
> Am I missing something? Is this the expected MoM behavior?
>
> Thanks for all !
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/QQ7EIZMCBXHTKOY5X4RYCLU3KRE6D4H4/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6SRKVYNSB767WSA4WCX5ZAG24JUL65VR/


[ovirt-users] Re: MoM - Not working ? How to use it ?

2018-08-01 Thread Martin Sivak
Hi,

> 2018-07-31 11:49:44,258 - mom.Monitor - DEBUG - Field 'mem_free' not known. 
> Ignoring.

No, this is "normal". Should not affect ballooning.

> I can't find a INFO log file with My_VM- is ready

And the line where the GuestMonitor is starting? You need to get back
to the time the VM was started or restart MOM (mom-vdsm service).

> All Centos guests are fully supported ? (Centos 7.X branch)

Yes I think so.

Martin

On Wed, Aug 1, 2018 at 2:28 PM,   wrote:
> Hi Martin,
>
> No problem, thanks for your answer!
> In posted log, you can see:
> 2018-07-31 11:49:44,258 - mom.Monitor - DEBUG - Field 'mem_free' not known. 
> Ignoring.
> I think the issue starts here :/ ?
> I can't find a INFO log file with My_VM- is ready
> All Centos guests are fully supported ? (Centos 7.X branch)
>
> Regards,
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/AWZLKUETHSHTECBE7EO5MINMNYMVY6LM/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/NOS7R6SXAJA62QUDMSVK4NJ2RMO6L3UL/


[ovirt-users] Re: MoM - Not working ? How to use it ?

2018-08-01 Thread Martin Sivak
Hi,

sorry I could not answer yesterday.

What you need to check is whether the VM is reported as ready in the
log (/var/log/vdsm/mom.log). You should see lines like:

2018-08-01 10:52:25,648 - mom.GuestMonitor.Thread - INFO -
GuestMonitor-ms-vhost-1 starting
2018-08-01 10:52:25,648 - mom.GuestManager - DEBUG - added monitor for
guest c4f1ea89-457a-4049-97d7-11cf751d00f0
2018-08-01 10:52:25,648 - mom.Monitor - DEBUG - Using optional fields:
set(['swap_out', 'swap_usage', 'balloon_cur', 'swap_total',
'balloon_min', 'major_fault', 'swap_in', 'io_tune_current',
'mem_unused', 'balloon_max', 'minor_fault', 'io_tune',
'mem_available'])
2018-08-01 10:52:25,649 - mom.Monitor - DEBUG - Using fields:
set(['vcpu_count', 'vcpu_period', 'vcpu_user_limit', 'vcpu_quota'])

...

2018-08-01 10:52:25,808 - mom.Monitor - INFO - ms-vhost-1 is ready

There might be a message reporting that some required data fields are
not available and the ballooning won't work for that VM in such case.
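
To automate the check described above, a minimal Python sketch along these lines could scan the log for guests whose monitor started but never reported ready. The two regular expressions are modelled only on the sample lines shown above and are an assumption about the log format; the function name guests_not_ready and the second guest in the sample (other-vm) are made up for illustration.

```python
import re

# Patterns modelled on the sample log lines above; the exact format may
# differ between MoM versions, so treat both as assumptions.
STARTING = re.compile(r"GuestMonitor-(?P<name>\S+) starting")
READY = re.compile(r"mom\.Monitor - INFO - (?P<name>\S+) is ready")


def guests_not_ready(lines):
    """Return guests whose monitor started but never logged 'is ready'."""
    started, ready = [], set()
    for line in lines:
        match = STARTING.search(line)
        if match and match.group("name") not in started:
            started.append(match.group("name"))
        match = READY.search(line)
        if match:
            ready.add(match.group("name"))
    return [name for name in started if name not in ready]


# First two lines follow the example above; the third guest is hypothetical.
sample = [
    "2018-08-01 10:52:25,648 - mom.GuestMonitor.Thread - INFO - GuestMonitor-ms-vhost-1 starting",
    "2018-08-01 10:52:25,808 - mom.Monitor - INFO - ms-vhost-1 is ready",
    "2018-08-01 10:52:26,000 - mom.GuestMonitor.Thread - INFO - GuestMonitor-other-vm starting",
]
print(guests_not_ready(sample))  # prints ['other-vm']
```

To run it against a real host, pass an open file instead of the sample list, for example guests_not_ready(open("/var/log/vdsm/mom.log")). A guest that shows up in the result is a candidate for the "required data fields are not available" case.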

Best regards

Martin Sivak


On Wed, Aug 1, 2018 at 10:45 AM,   wrote:
> Hello,
>
> Does someone have an idea regarding my memory balloon misconfiguration :/ ?
>
> Thanks !
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/FTNY4FQ4W52Z62GGFCXC567KOBJ2DJOI/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/VJSSYKXAVSLSIOYETGT6HTSVWPQOJV3B/


[ovirt-users] Re: Broker fails to start after upgrade 4.1 to 4.2 metadata_image_UUID can't be ''

2018-06-26 Thread Martin Sivak
> [root @ h4 /] # sanlock direct init -s hosted-engine: 0:
> /rhev/data-center/mnt/ssd.lan \: _ ovirt /
> 8905c9ac-d892-478d-8346-63b8fa1c5763 / images / badd5883-ef71-
> 45bb-9073-a573f46a3b44 / e4408917-fe00-4567-8db0-bf464472ec01.lockspace
> init done -19

Do you really have all those spaces in the path? Also make sure the
backslash in \:ovirt  is doubled if you execute this from bash like
you seem to be doing (\\:ovirt)
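
The effect of the doubled backslash can be illustrated with Python's shlex module, which in its default POSIX mode treats a backslash in an unquoted word much like bash does. This is only an approximation of shell behaviour, and the path fragment is the one from the quoted command with the stray spaces removed, which is an assumption about what was intended. The helper name seen_by_program is made up for illustration.

```python
import shlex


def seen_by_program(word):
    """Return the argument a program would receive after shell-style parsing."""
    return shlex.split(word)[0]


# Path fragment from the quoted command, spaces removed (assumption).
single = r"/rhev/data-center/mnt/ssd.lan\:_ovirt"
doubled = r"/rhev/data-center/mnt/ssd.lan\\:_ovirt"
quoted = "'" + single + "'"  # single quotes also keep the backslash

print(seen_by_program(single))   # backslash is consumed by the parser
print(seen_by_program(doubled))  # one backslash survives
print(seen_by_program(quoted))   # one backslash survives
```

With the single backslash the program receives a bare colon in the path; with the doubled backslash or with single quotes it receives the escaped form.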

Martin

On Mon, Jun 25, 2018 at 3:38 PM, Reznikov Alexei  wrote:
> 25.06.2018 15:12, Martin Sivak wrote:
>
>> Hi,
>>
>> yes there is a solution described directly in the bug:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1521011#c20
>>
>> The provided script worked only for cases that had the necessary
>> disks, but where the uuids were not written to the config files.
>>
>> You need to follow the procedure from comment 20 when no disks for
>> lockspace and metadata exist at all.
>>
>> Best regards
>>
>> Martin Sivak
>>
>>
>>
>> On Mon, Jun 25, 2018 at 9:52 AM, Reznikov Alexei 
>> wrote:
>>>
>>> 21.06.2018 20:15, reznikov...@soskol.com wrote:
>>>>
>>>> Hi list!
>>>>
>>>> After upgrading my cluster from 4.1.9 to 4.2.2, the agent and broker can't
>>>> start on the host...
>>>>
>>>> cat /var/log/ovirt-hosted-engine-ha/agent.log
>>>> MainThread::ERROR::2018-06-21
>>>>
>>>> 03:25:34,603::hosted_engine::538::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
>>>> Failed to start necessary monitors
>>>> MainThread::ERROR::2018-06-21
>>>>
>>>> 03:25:34,604::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
>>>> Traceback (most recent call last)
>>>>
>>>> cat /var/log/ovirt-hosted-engine-ha/broker.log
>>>> MainThread::INFO::2018-06-21
>>>>
>>>> 03:25:40,406::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> Finished loading submonitors
>>>> MainThread::WARNING::2018-06-21
>>>>
>>>> 03:25:40,406::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__)
>>>> Can't connect vdsm storage: 'metadata_image_UUID can't be ''
>>>>
>>>> cat /etc/ovirt-hosted-engine/hosted-engine.conf | grep metadata_image
>>>> metadata_image_UUID=
>>>>
>>>> Also is:
>>>> cat /etc/ovirt-hosted-engine/hosted-engine.conf | grep lock
>>>> lockspace_image_UUID=
>>>> lockspace_volume_UUID=
>>>>
>>>> This bug is very much like this
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1521011 My cluster started
>>>> with
>>>> version 3.3...
>>>>
>>>> But I can't resolve this bug correctly.
>>>>
>>>> Guru please help me!!!
>>>>
>>>> Thanx, Alex!
>>>>
> Thanks for answer Martin.
>
> I get the problem in step 5, the procedure described in comment 20
> [root @ h4 /] # sanlock direct init -s hosted-engine: 0:
> /rhev/data-center/mnt/ssd.lan \: _ ovirt /
> 8905c9ac-d892-478d-8346-63b8fa1c5763 / images / badd5883-ef71-
> 45bb-9073-a573f46a3b44 / e4408917-fe00-4567-8db0-bf464472ec01.lockspace
> init done -19
>
> What does "init done -19" mean, and why do I not see any changes?
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/4KPMQ5KY7V2KK6WXHAR6SKFEG4GZ4D57/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/HYICZDMTY6JZCCTE7V5BTSN6ETPJ2PPU/


[ovirt-users] Re: Broker fails to start after upgrade 4.1 to 4.2 metadata_image_UUID can't be ''

2018-06-25 Thread Martin Sivak
Hi,

yes there is a solution described directly in the bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1521011#c20

The provided script worked only for cases that had the necessary
disks, but where the uuids were not written to the config files.

You need to follow the procedure from comment 20 when no disks for
lockspace and metadata exist at all.
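
As a first step before choosing between the script and the manual procedure, it may help to confirm which UUID keys are actually empty in the configuration file. The following is a minimal sketch, assuming the simple key=value layout seen in the reporter's output quoted below; the function name empty_uuid_keys is made up for illustration, and the check does not tell whether the disks themselves exist.

```python
# Keys taken from the reporter's hosted-engine.conf output in this thread.
REQUIRED_KEYS = (
    "lockspace_image_UUID",
    "lockspace_volume_UUID",
    "metadata_image_UUID",
    "metadata_volume_UUID",
)


def empty_uuid_keys(conf_text):
    """Return the required keys that are missing or set to an empty value."""
    values = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Excerpt of the values reported in this thread.
sample = """\
sdUUID=8905c9ac-d892-478d-8346-63b8fa1c5763
lockspace_volume_UUID=
lockspace_image_UUID=
metadata_volume_UUID=
metadata_image_UUID=
conf_volume_UUID=a20d9700-1b9a-41d8-bb4b-f2b7c168104f
"""
print(empty_uuid_keys(sample))
```

On a host, the text would come from /etc/ovirt-hosted-engine/hosted-engine.conf. An empty result means all four keys are filled in.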

Best regards

Martin Sivak



On Mon, Jun 25, 2018 at 9:52 AM, Reznikov Alexei  wrote:
> 21.06.2018 20:15, reznikov...@soskol.com wrote:
>>
>> Hi list!
>>
>> After upgrading my cluster from 4.1.9 to 4.2.2, the agent and broker can't
>> start on the host...
>>
>> cat /var/log/ovirt-hosted-engine-ha/agent.log
>> MainThread::ERROR::2018-06-21
>> 03:25:34,603::hosted_engine::538::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
>> Failed to start necessary monitors
>> MainThread::ERROR::2018-06-21
>> 03:25:34,604::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
>> Traceback (most recent call last)
>>
>> cat /var/log/ovirt-hosted-engine-ha/broker.log
>> MainThread::INFO::2018-06-21
>> 03:25:40,406::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>> Finished loading submonitors
>> MainThread::WARNING::2018-06-21
>> 03:25:40,406::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__)
>> Can't connect vdsm storage: 'metadata_image_UUID can't be ''
>>
>> cat /etc/ovirt-hosted-engine/hosted-engine.conf | grep metadata_image
>> metadata_image_UUID=
>>
>> Also is:
>> cat /etc/ovirt-hosted-engine/hosted-engine.conf | grep lock
>> lockspace_image_UUID=
>> lockspace_volume_UUID=
>>
>> This bug is very much like this
>> https://bugzilla.redhat.com/show_bug.cgi?id=1521011 My cluster started with
>> version 3.3...
>>
>> But I can't resolve this bug correctly.
>>
>> Guru please help me!!!
>>
>> Thanx, Alex!
>>
> Bump.
>
> I tried running the workaround script from Simone Tiraboschi, but it does not
> work properly for me.
>
> I do not see the volumes ... hosted-engine.lockspace and hosted-engine.metadata
> are null.
>
> [root@h4 ~]# ./workaround_1521011.sh
> + source /etc/ovirt-hosted-engine/hosted-engine.conf
> ++ fqdn=eng.lan
> ++ vm_disk_id=e9d7a377-e109-4b28-9a43-7a8c8b603749
> ++ vm_disk_vol_id=cd12a59e-7d84-4b4e-98c7-4c68e83ecd7b
> ++ vmid=ccdd675a-a58b-495a-9502-3e6a4b7e5228
> ++ storage=ssd:/ovirt
> ++ mnt_options=
> ++ conf=/var/run/ovirt-hosted-engine-ha/vm.conf
> ++ host_id=4
> ++ console=vnc
> ++ domainType=nfs3
> ++ spUUID=----
> ++ sdUUID=8905c9ac-d892-478d-8346-63b8fa1c5763
> ++ connectionUUID=ce84071b-86a2-4e82-b4d9-06abf23dfbc4
> ++ ca_cert=/etc/pki/vdsm/libvirt-spice/ca-cert.pem
> ++ ca_subject='C=EN, L=Test, O=Test, CN=Test'
> ++ vdsm_use_ssl=true
> ++ gateway=10.245.183.1
> ++ bridge=ovirtmgmt
> ++ lockspace_volume_UUID=
> ++ lockspace_image_UUID=
> ++ metadata_volume_UUID=
> ++ metadata_image_UUID=
> ++ conf_volume_UUID=a20d9700-1b9a-41d8-bb4b-f2b7c168104f
> ++ conf_image_UUID=b5f353f5-9357-4aad-b1a3-751d411e6278
> ++ iqn=
> ++ portal=
> ++ user=
> ++ password=
> ++ port=
> ++ vdsm-client StorageDomain getImages
> storagedomainID=8905c9ac-d892-478d-8346-63b8fa1c5763
> storagepoolID=----
> ++ grep -
> ++ tr -d ,
> ++ xargs
> + for i in '$(vdsm-client StorageDomain getImages storagedomainID=${sdUUID}
> storagepoolID=${spUUID} | grep - | tr -d '\'','\'' | xargs)'
> ++ vdsm-client StorageDomain getVolumes
> storagedomainID=8905c9ac-d892-478d-8346-63b8fa1c5763
> storagepoolID=----
> imageID=83e0550b-0fc3-40b1-955d-b07ebfbb3994
> ++ grep -
> ++ tr -d ,
> ++ xargs
> + for v in '$(vdsm-client StorageDomain getVolumes storagedomainID=${sdUUID}
> storagepoolID=${spUUID} imageID=${i} | grep - | tr -d '\'','\'' | xargs)'
> ++ vdsm-client Volume getInfo
> storagedomainID=8905c9ac-d892-478d-8346-63b8fa1c5763
> storagepoolID=----
> imageID=83e0550b-0fc3-40b1-955d-b07ebfbb3994
> volumeID=5a26be32-6c5b-4dcc-ac67-5c442f24df55
> ++ jq '. | select(.description=="hosted-engine.lockspace") | .uuid'
> ++ xargs
> + lockspace_vol_uuid=
> + [[ ! -z '' ]]
> ++ vdsm-client Volume getInfo
> storagedomainID=8905c9ac-d892-478d-8346-63b8fa1c5763
> storagepoolID=----
> imageID=83e0550b-0fc3-40b1-955d-b07ebfbb3994
> volumeID=5a26be32-6c5b-4dcc-ac67-5c442f24df55
> ++ jq '. | select(.description=="hosted-engine.lockspace") | .image'
> ++ xargs
> + lockspace_img_uuid=
>

[ovirt-users] Re: HostedEngine and affinity groups

2018-06-18 Thread Martin Sivak
Hi,

yes, the affinity enforcement engine does work with soft rules.
However, soft affinity is just one input to the weighting part of the
algorithm and other inputs (free memory, cpu load, ..) can counter it.

You could increase the weight of the affinity policy units in the
Scheduling policy configuration, but it should already be high enough
(I think the default is 100).
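
How a soft rule can be outvoted is easiest to see with a toy model. The sketch below is not the engine's algorithm (the real scheduler normalizes and ranks the policy unit results, see the link further down); it only assumes a plain weighted sum of per-unit penalties where the lowest total wins. Host names, penalty values, and factors are all hypothetical.

```python
def pick_host(penalties, factors):
    """Return (best_host, totals); the lowest weighted penalty wins."""
    totals = {
        host: sum(factors[unit] * value for unit, value in units.items())
        for host, units in penalties.items()
    }
    best = min(totals, key=totals.get)
    return best, totals


# Hypothetical penalties on a 0-10 scale (lower is better).
penalties = {
    "host-with-HE": {"affinity": 10, "memory": 0, "cpu": 0},
    "other-host": {"affinity": 0, "memory": 6, "cpu": 6},
}

equal = {"affinity": 1, "memory": 1, "cpu": 1}
boosted = {"affinity": 2, "memory": 1, "cpu": 1}

print(pick_host(penalties, equal))    # affinity is outvoted by memory + cpu
print(pick_host(penalties, boosted))  # a higher affinity factor flips the result
```

With equal factors the host that violates the soft affinity still wins (10 against 12); doubling the affinity factor moves the choice to the other host (20 against 12).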

There is a table in the log when debug logging is enabled:
https://ovirt.org/develop/release-management/features/sla/scheduling-weight-normalization/#testing

You can check the internal weights and the selection process by
enabling it as follows:

- use the logcontrol script from here
https://github.com/oVirt/ovirt-engine/blob/master/contrib/log-control.sh
- and enable DEBUG for
org.ovirt.engine.core.bll.scheduling.policyunits.RankSelectorPolicyUnit

Best regards

--
Martin Sivak
SLA, oVirt

On Mon, Jun 18, 2018 at 4:52 PM,   wrote:
> Hi,
>
> i operate a 3 node ovirt42 cluster (hyperconverged).
>
> Now i would like to isolate the HostedEngine from the other VMs.
> The reason is to minimize the impact of outage of one of the nodes.
> When the node with the HE is away there is no impact at all on the running 
> VMs.
> If one of the other nodes goes down, then i have still a running HE which 
> restarts
> the missing VMs on remaining nodes fast again.
>
> I cannot use hard affinity because in case of failure i need all resources of 
> the 2 remaining nodes,
> so i configured a soft negative affinity group for every VM between this VM 
> and the HE.
>
> As a first test i migrated the HE to a node with VMs but even after a while 
> nothing happened.
> I would have expected that either the HE or the VMs on this node would have 
> been migrated to
> fulfill the soft negative affinity.
>
> Now my questions:
>
> Are my thoughts correct?
> Is there a better solution than making one (soft negative) affinity group for 
> every VM to isolate from HE?
> Does the affinity group enforcement engine work at all with soft rules?
>
> Regards,
> Robert
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/XKHYWZD6CO5FQG67IL6OIBAHVKVBSX5Q/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/X52PVRF2U7R5XFPL7M4IUZIJOXL54HXC/


[ovirt-users] Re: Hosted engine cannot migrate

2018-06-14 Thread Martin Sivak
It should not be. Was there anything in the log? Storage failure or something?

Best regards

Martin Sivak

On Thu, Jun 14, 2018 at 11:59 AM, Callum Smith  wrote:
> So the one host with the stale-data, putting that into maintenance and then
> rebooting seems to have brought it back and stopped the errors. It seems the
> ha-broker is a bit more temperamental than usual after the 4.2.3.1 release.
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e. cal...@well.ox.ac.uk
>
> On 14 Jun 2018, at 10:34, Martin Sivak  wrote:
>
> Dear Callum,
>
> unknown stale-data means the host did not submit a status update during
> the last minute. That might be just a glitch, or something happened to
> the storage connection there.
>
> Best regards
>
> Martin Sivak
>
>
> On Thu, Jun 14, 2018 at 11:28 AM, Callum Smith  wrote:
>
> Dear Martin,
>
> The engine is running happily and migration appears to work although it
> appears one node has dropped into "unknown stale-data" on vm-status (the
> node that is neither the originator nor the target for the migration).
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e. cal...@well.ox.ac.uk
>
> On 14 Jun 2018, at 10:22, Martin Sivak  wrote:
>
> Hi,
>
> is the engine running even though you see the errors in the log?
> Hosted engine agents fight for the lock when starting the engine VM.
> One wins and the others report an issue. We have some checks in place
> to silence those, but maybe it leaked again.
>
> This might be just annoying as long as the engine is up. Manually
> clicking the migrate button should also work.
>
> Best regards
>
> Martin Sivak
>
> On Thu, Jun 14, 2018 at 10:41 AM, Callum Smith  wrote:
>
> Dear All,
>
> Getting an issue where the HE can't be migrated, the log is full of:
> "VM HostedEngine is down with error. Exit message: resource busy: Failed to
> acquire lock: Lease is held by another host."
>
> engine.log attached
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e. cal...@well.ox.ac.uk
>
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/SR5RVQ2WFDMKNO4RXSCU45GJ7RTXDZX4/
>
>
>
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/N552YEVKWL55CQJPJLTRZZMNBJY3KVMP/


[ovirt-users] Re: Hosted engine cannot migrate

2018-06-14 Thread Martin Sivak
Dear Callum,

unknown stale-data means the host did not submit a status update during
the last minute. That might be just a glitch, or something happened to
the storage connection there.

Best regards

Martin Sivak


On Thu, Jun 14, 2018 at 11:28 AM, Callum Smith  wrote:
> Dear Martin,
>
> The engine is running happily and migration appears to work although it
> appears one node has dropped into "unknown stale-data" on vm-status (the
> node that is neither the originator nor the target for the migration).
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e. cal...@well.ox.ac.uk
>
> On 14 Jun 2018, at 10:22, Martin Sivak  wrote:
>
> Hi,
>
> is the engine running even though you see the errors in the log?
> Hosted engine agents fight for the lock when starting the engine VM.
> One wins and the others report an issue. We have some checks in place
> to silence those, but maybe it leaked again.
>
> This might be just annoying as long as the engine is up. Manually
> clicking the migrate button should also work.
>
> Best regards
>
> Martin Sivak
>
> On Thu, Jun 14, 2018 at 10:41 AM, Callum Smith  wrote:
>
> Dear All,
>
> Getting an issue where the HE can't be migrated, the log is full of:
> "VM HostedEngine is down with error. Exit message: resource busy: Failed to
> acquire lock: Lease is held by another host."
>
> engine.log attached
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e. cal...@well.ox.ac.uk
>
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/SR5RVQ2WFDMKNO4RXSCU45GJ7RTXDZX4/
>
>
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ESIJDSAYDLDBIB4FBKEHGJCEP64ASR62/


[ovirt-users] Re: Hosted engine cannot migrate

2018-06-14 Thread Martin Sivak
Hi,

is the engine running even though you see the errors in the log?
Hosted engine agents fight for the lock when starting the engine VM.
One wins and the others report an issue. We have some checks in place
to silence those, but maybe it leaked again.

This might be just annoying as long as the engine is up. Manually
clicking the migrate button should also work.

Best regards

Martin Sivak

On Thu, Jun 14, 2018 at 10:41 AM, Callum Smith  wrote:
> Dear All,
>
> Getting an issue where the HE can't be migrated, the log is full of:
> "VM HostedEngine is down with error. Exit message: resource busy: Failed to
> acquire lock: Lease is held by another host."
>
> engine.log attached
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e. cal...@well.ox.ac.uk
>
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/SR5RVQ2WFDMKNO4RXSCU45GJ7RTXDZX4/
>
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/B4747X4ULVN3IQBP3JSRM5Z5XDX6OAMZ/


[ovirt-users] Re: Question re: MaxFreeMemoryforOverUtlized and MinFreeMemoryforUnderUtilized

2018-06-13 Thread Martin Sivak
> when a quantity exceeds a Maximum value this should be considered an error 
> condition

Which does work for the under-utilized threshold, where the normal situation is
when free memory is below the threshold (or used memory above it). But
it does not work for the over-utilized threshold, where normal is when free
memory is above the threshold.

There are many ways to describe those (above, below, used memory, free
memory) and it is too late to change it anyway.

Best regards

Martin Sivak



On Thu, Jun 14, 2018 at 12:12 AM, Alastair Neil  wrote:
> when the free memory is below defined maximum value
>
>  this is the problem, this  statement is veiled in double negatives
>
> When a quantity is below a "Maximum" value then this should be considered
> normal
> when a quantity exceeds a Maximum value this should be considered an error
> condition
>
> but this is not the case - when free memory falls below the threshold of
> MaxFreeMemoryforOverUtlized we are overutilized
>
> On Wed, 13 Jun 2018 at 17:14, Martin Sivak  wrote:
>>
>> Hi, it is just a matter of perspective:
>>
>> MaxFreeMemoryforOverUtlized - the host is considered over utilized
>> when the free memory is below defined maximum value
>> MinFreeMemoryforUnderUtilized - the host is considered under utilized
>> when the free memory is at least the defined minimal value
>>
>> Best regards
>>
>> --
>> Martin Sivak
>> oVirt
>>
>>
>> On Wed, Jun 13, 2018 at 7:14 PM, Alastair Neil 
>> wrote:
>> > Can someone clarify these settings for me? I am having difficulty parsing
>> > what exactly they mean. They seem to me to be backwards.
>> >
>> > If I wish to set a threshold at which I want my host to be considered over
>> > utilized, not schedule any new VMs, and migrate VMs away, then surely I
>> > should specify a minimum threshold of free memory. I.e. if free memory
>> > drops below my threshold (or memory use exceeds a maximum threshold),
>> > migrate
>> > VMs off of this system.
>> >
>> > Conversely if a system is underutilized I should set a maximum threshold
>> > of
>> > free memory (Or minimum used memory).
>> >
>> > Thanks,
>> >
>> > --Alastair
>> >
>> >
>> > List Archives:
>> >
>> > https://lists.ovirt.org/archives/list/users@ovirt.org/message/C25Z6WQRPXXU4TMSDDR7HLS5SW3D2LCF/
>> >
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FLWFBNYMPOVUG23OPAKJ6GZZJ3BJHLQA/


[ovirt-users] Re: Question re: MaxFreeMemoryforOverUtlized and MinFreeMemoryforUnderUtilized

2018-06-13 Thread Martin Sivak
Hi, it is just a matter of perspective:

MaxFreeMemoryforOverUtlized - the host is considered over utilized
when the free memory is below defined maximum value
MinFreeMemoryforUnderUtilized - the host is considered under utilized
when the free memory is at least the defined minimal value
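
The two definitions above can be written down as a small decision function. This is only a sketch of the wording in this message, not engine code: the function and parameter names are made up, the threshold values are hypothetical, and it assumes the over-utilized threshold is set lower than the under-utilized one.

```python
def classify_host(free_mb, max_free_for_over_utilized, min_free_for_under_utilized):
    """Classify a host by free memory, following the two definitions above."""
    # Over-utilized: free memory is BELOW the defined maximum value.
    if free_mb < max_free_for_over_utilized:
        return "over-utilized"
    # Under-utilized: free memory is AT LEAST the defined minimal value.
    if free_mb >= min_free_for_under_utilized:
        return "under-utilized"
    return "normal"


# Hypothetical thresholds in MB.
OVER, UNDER = 4096, 16384

for free in (2048, 8192, 16384):
    print(free, classify_host(free, OVER, UNDER))
```

Read this way, "Max" and "Min" describe the bound on free memory inside each condition: the largest free memory at which a host still counts as over-utilized, and the smallest free memory at which it counts as under-utilized.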

Best regards

--
Martin Sivak
oVirt


On Wed, Jun 13, 2018 at 7:14 PM, Alastair Neil  wrote:
> Can someone clarify these settings for me? I am having difficulty parsing
> what exactly they mean. They seem to me to be backwards.
>
> If I wish to set a threshold at which I want my host to be considered over
> utilized, not schedule any new VMs, and migrate VMs away, then surely I
> should specify a minimum threshold of free memory. I.e. if free memory
> drops below my threshold (or memory use exceeds a maximum threshold), migrate
> VMs off of this system.
>
> Conversely if a system is underutilized I should set a maximum threshold of
> free memory (Or minimum used memory).
>
> Thanks,
>
> --Alastair
>
>
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/C25Z6WQRPXXU4TMSDDR7HLS5SW3D2LCF/
>
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/NIR5OP5432DHNFKM72PCVRPDYAZE6L3Y/


[ovirt-users] Re: Centos7.5 or Ovirt node for new hci build?

2018-06-13 Thread Martin Sivak
Hi,

it actually does not make much difference unless you need some special
customization. Node is plug and play; CentOS + oVirt repos might have
fresher packages (snapshots, nightlies) and allow live changes.

--
Martin Sivak


On Wed, Jun 13, 2018 at 1:07 PM, Jayme  wrote:
> I'm about to build a 3 host Ovirt hyperconverged cluster and was wondering
> if it's recommended to use straight up Centos or would Ovirt node be a
> better choice?
>
> Thanks
>
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/H724RRGWIMA5WDCO5KIKW3IJEYRHW5U6/
>
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4EO6RW47OKPVHE7ER5SO3DIQQDJIIPT3/


[ovirt-users] Re: Non-responsive vm's due to crashed host and hosted vm liveliness check fails

2018-06-04 Thread Martin Sivak
Hi all,

we decided to issue an updated package to the oVirt 4.1 repository
that should fix this for all users. We still consider 4.1 an EOL
release, but we think this upgrade path should be fixed anyway.

Metadata are refreshing as we speak. You can also download the package
manually from the repositories:

For example the URL for CentOS 7 based installation:
http://resources.ovirt.org/pub/ovirt-4.1/rpm/el7/noarch/ovirt-hosted-engine-ha-2.1.9-1.el7.noarch.rpm

Best regards

--
Martin Sivak
SLA / oVirt

On Mon, Jun 4, 2018 at 2:45 PM,   wrote:
> Thank you again so very much Andrej!  I am going to try this right now...
>
> Respectfully,
> Charles
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/RM45ON3RADJD5WPM3IHGQ3EF6PWOQ4PX/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/HUGTS4WY2DB743AE7XGJUUKRX5Z4JWGV/


[ovirt-users] Re: Hosted Engine w Gluster - increase number of hosts

2018-05-24 Thread Martin Sivak
Hi,

hosted engine does not care about disks. Gluster might, but I do not
know enough about best practice brick layout there, sorry.

Martin

On Wed, May 23, 2018 at 11:44 AM, femi adegoke  wrote:
> Must all 7 hosts have the same amount of storage/number of disks?


[ovirt-users] Re: Hosted Engine w Gluster - increase number of hosts

2018-05-23 Thread Martin Sivak
Hi,

we recommend up to 7 HE hosts; it does not matter whether the number is
odd or even. The real implementation limit is much higher and you
won't reach it. But since we do not test more than 7 as part of the QE
process, we can't recommend it.

Best regards

Martin Sivak

On Wed, May 23, 2018 at 9:43 AM, femi adegoke <ov...@fateknollogee.com> wrote:
>> On Wed, May 23, 2018 at 8:47 AM, femi adegoke <ovirt(a)fateknollogee.com
>> wrote:
>>
>>
>> Additional hosts should be directly added from the engine.
> How many hosts should I add per best practice?
> I have 3 now; do I add 2 or 4 more?
> Should the total number of hosts be odd or even?
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org


[ovirt-users] Re: Use local ansible to talk to engineVM and other vm

2018-05-22 Thread Martin Sivak
Hi,

you can use multiple different inventory sources at the same time - so
use your file + ovirt4.py

https://docs.ansible.com/ansible/2.5/user_guide/intro_dynamic_inventory.html#using-inventory-directories-and-multiple-inventory-sources
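
A minimal sketch of combining both sources from a local machine, assuming a
static inventory file named "hosts" and a playbook named "site.yml" (both
names are placeholders, not taken from this thread). Repeating -i is how
ansible-playbook accepts several inventory sources; the helper below only
builds the argument list and does not run anything.

```python
def ansible_command(playbook, inventories):
    """Build an ansible-playbook argument list with one -i per inventory source."""
    command = ["ansible-playbook"]
    for source in inventories:
        # Each source (static file, directory, or dynamic script) gets its own -i.
        command += ["-i", source]
    command.append(playbook)
    return command


if __name__ == "__main__":
    # "hosts" and "site.yml" are placeholder names; ovirt4.py is the dynamic
    # inventory script mentioned above.
    print(" ".join(ansible_command("site.yml", ["hosts", "./ovirt4.py"])))
```

The resulting list can be handed to subprocess.run() as is, or typed on the
command line in the same order.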

Best regards

Martin Sivak

On Tue, May 22, 2018 at 9:50 AM, Sumit Bharadia <03ce...@gmail.com> wrote:
> Thanks.
>
> Is there a way to use an existing inventory file?
>
> I asked because the server where the engineVM is running is listed into my
> existing a ansible inventory file on my local machine, but how do I
> specify/run the subsequent tasks which I want to run on vms running on
> engineVM as I don't see the /etc/hosts file updated where engineVM runs. How
> would my local ansible playbook know which vms are available, etc?
>
>
>
> On Tue, 22 May 2018, 8:36 am Ondra Machacek, <omach...@redhat.com> wrote:
>>
>> On 05/21/2018 11:51 AM, 03ce...@gmail.com wrote:
>> > I have a self-hosted-engine (4.2) running on a centos 7.4 server.
>> >
>> > I have downloaded ovirt ansible roles from ansible-galaxy and can run
>> > them from the server where the engineVM is running and able to deploy new
>> > vms, clusters, dc, etc.
>> >
>> > I have seen the use of ovirt4.py file to target and group hosts which
>> > you can target for specific plays. However, the box where 
>> > self-hosted-engine
>> > is running is a physical server but I am looking to run ansible from my
>> > local machine instead to manage vms running on engineVM. Is there a way to
>> > achieve this?
>>
>> Sure you can use your own computer to manage the VMs.
>>
>> In your playbook you just need to specify group/host where the tasks of
>> the playbook should run.
>>
>> So if using the ovirt4.py script as your inventory file, you need to
>> just specify specific group where you want to run the tasks in your
>> playbook like this:
>>
>> - hosts: tag_httpd
>>   tasks:
>>     ...
>>
>> If you want to Create/Delete VMs using ovirt_* modules, you can do it
>> from your computer as well, but you need to install Python SDK version
>> 4. You can download it from pip using following command: pip install
>> ovirt-engine-sdk-python.
>>
>> >
>> > Thank you in advance.
>> >


[ovirt-users] Re: Hosted Engine Setup error (oVirt v4.2.3)

2018-05-21 Thread Martin Sivak
> Does the MAC address ever change during the deployment process?

No.

> What is meant by "proper MAC reservation"?

For a given MAC, the DHCP server must always return the same IP, and
that IP must be the one paired with the requested FQDN in the DNS
server (or injected into /etc/hosts).
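
The requirement can be written down as a small consistency check. This is
only a sketch: the two dictionaries stand in for whatever your DHCP and DNS
servers are configured with, and the function name, MAC, IP and FQDN values
are made-up examples, not anything from the deployment in this thread.

```python
def reservation_is_consistent(mac, fqdn, dhcp_reservations, dns_records):
    """Return True when the IP reserved for `mac` is the IP `fqdn` resolves to.

    dhcp_reservations maps lowercase MAC -> IP, dns_records maps lowercase
    FQDN -> IP. Both are hypothetical stand-ins for real server configuration.
    """
    reserved_ip = dhcp_reservations.get(mac.lower())
    resolved_ip = dns_records.get(fqdn.lower())
    # A missing entry on either side counts as inconsistent.
    return reserved_ip is not None and reserved_ip == resolved_ip


if __name__ == "__main__":
    dhcp = {"00:16:3e:aa:bb:cc": "192.0.2.10"}   # example reservation
    dns = {"engine.example.com": "192.0.2.10"}   # example A record
    print(reservation_is_consistent("00:16:3E:AA:BB:CC", "engine.example.com", dhcp, dns))
```

If the check fails for the engine VM, either the reservation or the DNS
record (or the /etc/hosts entry) needs to be corrected before deploying.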


Best regards

Martin Sivak

On Mon, May 21, 2018 at 10:40 AM,  <ov...@fateknollogee.com> wrote:
> Does the MAC address ever change during the deployment process?
>
> What is meant by "proper MAC reservation"?
>
>
> On 2018-05-21 01:35, Martin Sivak wrote:
>>
>> Hi,
>>
>>> It failed with a password like this ##ZtHouse1234##
>>
>>
>> I smell an issue with the Ansible var file we use to pass the password
>> into the job. We should probably base64 encode it or something.
>>
>>> With DHCP, the deployment failed when it was trying to add the gluster
>>> host.
>>> Once I changed the engine to static, the deployment passed.
>>
>>
>> DHCP is tricky as you need to make sure that the FQDN always matches
>> the same VM (proper MAC, DNS and DHCP reservation come into play). But
>> this is the default for all installs I did and always worked fine.
>>
>> Best regards
>>
>> Martin Sivak
>>
>> On Mon, May 21, 2018 at 10:18 AM,  <ov...@fateknollogee.com> wrote:
>>>
>>> It failed with a password like this ##ZtHouse1234##
>>> I changed the password to ZtHouse1234 & the error went away.
>>>
>>> With DHCP, the deployment failed when it was trying to add the gluster
>>> host.
>>> Once I changed the engine to static, the deployment passed.
>>>
>>> Also Squeakz (on ovirt channel - IRC) said he has also seen the problem
>>> with
>>> dhcp.
>>>
>>>
>>>
>>>
>>> On 2018-05-21 01:07, Simone Tiraboschi wrote:
>>>>
>>>>
>>>> On Mon, May 21, 2018 at 7:41 AM, <ov...@fateknollogee.com> wrote:
>>>>
>>>>> I finally got it to install...all green lights!!
>>>>>
>>>>> Thanks to Simone and Squeakz (on IRC) for all the help.
>>>>> I have a few follow up questions, I'll post them in a new email
>>>>> thread.
>>>>>
>>>>> Do not:
>>>>> (these 2 items seemed to be the main cause of my deployment errors)
>>>>> -use special characters in any of your passwords - letters & numbers
>>>>> only.
>>>>
>>>>
>>>>
>>>> We had a bug in the past (
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1528253 ) but we never
>>>> really reproduced it, better to double check.
>>>> Can you please provide an example of password that make it failing?
>>>>
>>>>> -hosted engine vm - assign static IP, no dhcp reservation.
>>>>
>>>>
>>>>
>>>> I'm pretty sure that DHCP based configuration are correctly working,
>>>> I'd suggest to double check your DHCP server.
>>>>
>>>>> On 2018-05-18 00:56, Simone Tiraboschi wrote:
>>>>> On Fri, May 18, 2018 at 9:54 AM, <ov...@fateknollogee.com> wrote:
>>>>>
>>>>> Thanks for the info.
>>>>> Unfortunately, I do not have the ovirt-node3 logs since I deleted
>>>>> all 3 nodes.
>>>>> I have now re-installed ovirt node on all 3 hosts.
>>>>> I will double check that hosts can resolve each other.
>>>>>
>>>>> All hosts should correctly resolve the engine VM fqdn, the engine VM
>>>>> has to correctly resolve the address of all the three hosts.
>>>>>
>>>>> Any other advise (or things to check) before I re-try Hosted Engine
>>>>> Setup?
>>>>>
>>>>> Nothing special on my side.
>>>>
>>>>
>>>>


[ovirt-users] Re: Hosted Engine Setup error (oVirt v4.2.3)

2018-05-21 Thread Martin Sivak
Hi,

> It failed with a password like this ##ZtHouse1234##

I smell an issue with the Ansible var file we use to pass the password
into the job. We should probably base64 encode it or something.
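
For illustration only, this is what base64 encoding would buy. How the
deployment really passes the password is not shown in this thread, so the
helper names below are hypothetical. One plausible cause (an assumption, not
confirmed here) is that in a YAML file an unquoted value starting with '#' is
read as a comment; base64 output never contains '#', so the encoded value
would survive such a file unchanged.

```python
import base64


def encode_secret(secret):
    """Encode a password so it only contains base64 characters (A-Z, a-z, 0-9, +, /, =)."""
    return base64.b64encode(secret.encode("utf-8")).decode("ascii")


def decode_secret(encoded):
    """Reverse encode_secret() and return the original password."""
    return base64.b64decode(encoded.encode("ascii")).decode("utf-8")


if __name__ == "__main__":
    password = "##ZtHouse1234##"  # the failing example from this thread
    encoded = encode_secret(password)
    print(encoded)
    print(decode_secret(encoded) == password)
```

Note that base64 is an encoding, not encryption; it only protects the value
from being misparsed, not from being read.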

> With DHCP, the deployment failed when it was trying to add the gluster host.
> Once I changed the engine to static, the deployment passed.

DHCP is tricky as you need to make sure that the FQDN always matches
the same VM (proper MAC, DNS and DHCP reservation come into play). But
this is the default for all installs I did and always worked fine.

Best regards

Martin Sivak

On Mon, May 21, 2018 at 10:18 AM,  <ov...@fateknollogee.com> wrote:
> It failed with a password like this ##ZtHouse1234##
> I changed the password to ZtHouse1234 & the error went away.
>
> With DHCP, the deployment failed when it was trying to add the gluster host.
> Once I changed the engine to static, the deployment passed.
>
> Also Squeakz (on ovirt channel - IRC) said he has also seen the problem with
> dhcp.
>
>
>
>
> On 2018-05-21 01:07, Simone Tiraboschi wrote:
>>
>> On Mon, May 21, 2018 at 7:41 AM, <ov...@fateknollogee.com> wrote:
>>
>>> I finally got it to install...all green lights!!
>>>
>>> Thanks to Simone and Squeakz (on IRC) for all the help.
>>> I have a few follow up questions, I'll post them in a new email
>>> thread.
>>>
>>> Do not:
>>> (these 2 items seemed to be the main cause of my deployment errors)
>>> -use special characters in any of your passwords - letters & numbers
>>> only.
>>
>>
>> We had a bug in the past (
>> https://bugzilla.redhat.com/show_bug.cgi?id=1528253 ) but we never
>> really reproduced it, better to double check.
>> Can you please provide an example of password that make it failing?
>>
>>> -hosted engine vm - assign static IP, no dhcp reservation.
>>
>>
>> I'm pretty sure that DHCP based configuration are correctly working,
>> I'd suggest to double check your DHCP server.
>>
>>> On 2018-05-18 00:56, Simone Tiraboschi wrote:
>>> On Fri, May 18, 2018 at 9:54 AM, <ov...@fateknollogee.com> wrote:
>>>
>>> Thanks for the info.
>>> Unfortunately, I do not have the ovirt-node3 logs since I deleted
>>> all 3 nodes.
>>> I have now re-installed ovirt node on all 3 hosts.
>>> I will double check that hosts can resolve each other.
>>>
>>> All hosts should correctly resolve the engine VM fqdn, the engine VM
>>> has to correctly resolve the address of all the three hosts.
>>>
>>> Any other advise (or things to check) before I re-try Hosted Engine
>>> Setup?
>>>
>>> Nothing special on my side.
>>
>>


Re: [ovirt-users] unable to start engine

2018-05-02 Thread Martin Sivak
Hi,

you are probably running 4.2 in global maintenance mode, right? We do
not download the vm.conf unless we need it, and since you just rebooted
the machine it may indeed be missing.

It should recover properly if you let the agent do its job and start
the engine by itself. It will download the vm.conf in the process.

Best regards

Martin Sivak

On Wed, May 2, 2018 at 2:52 AM, Justin Zygmont <jzygm...@proofpoint.com> wrote:
> After rebooting the node hosting the engine, I get this:
>
>
>
> # hosted-engine --connect-storage
>
> # hosted-engine --vm-start
>
> The hosted engine configuration has not been retrieved from shared storage.
> Please ensure that ovirt-ha-agent is running and the storage server is
> reachable.
>
>
>
> ovirt-ha-agent is running and the NFS server is reachable, it used to work.
> I don’t see which log to check or where to look
>
>


Re: [ovirt-users] Any way to access the console of the hosted-engine?

2018-04-27 Thread Martin Sivak
Hello,

please take a look at
https://www.ovirt.org/documentation/how-to/hosted-engine/#handle-engine-vm-boot-problems

If the console device is missing completely, add this line to
the vm.conf you create:

devices={device:vnc,type:graphics,deviceId:f1d0394e-b077-4ea6-99e5-b9b6b8fe073c,address:None}

Best regards

--
Martin Sivak
SLA / oVirt

On Fri, Apr 27, 2018 at 4:40 PM, Nico De Ranter <nico.deran...@esaturnus.com> wrote:

>
> It seems I messed up my glusterfs filesystem as a result the hosted-engine
> didn't start anymore.
>
> Sigh.
>
> Nico
>
> On Fri, Apr 27, 2018 at 4:10 PM, Nico De Ranter <nico.deran...@esaturnus.com> wrote:
>
>>
>> I installed 2 Ovirt 4.2.2 servers and installed the
>> hosted-engine-appliance on one of them.  I used the web interface to add
>> storage domains and add the second host but suddenly I lost connection to
>> the hosted-engine.
>> I tried rebooting the host but the reboot just hangs. The server is still
>> pingable but the server console is black. I eventually did a hard reset of
>> the server.  Currently the host is back up-and-running but the
>> hosted-engine seems to be unable to start.
>> I am getting lots of e-mails
>>
>>  ovirt-hosted-engine state transition EngineDown-EngineStart
>>  ovirt-hosted-engine state transition EngineStart-EngineStarting
>>  ovirt-hosted-engine state transition EngineStarting-EngineForceStop
>>  ovirt-hosted-engine state transition EngineForceStop-EngineDown
>>  ...
>>  ad infinitum
>>
>> I tried reinstalling the whole setup but the same thing happened again.
>>
>> I was hoping to be able to access the console of the hosted-engine, but I
>> don't know how to access it without using the hosted-engine itself.
>>
>> Any ideas what might be happening?
>>
>> Nico
>>
>> --
>>
>> Nico De Ranter
>>
>> Operations Engineer
>>
>> T. +32 16 38 72 10
>>
>>
>> <http://www.esaturnus.com>
>>
>> <http://www.esaturnus.com>
>>
>>
>> eSATURNUS
>> Romeinse straat 12
>> 3001 Leuven – Belgium
>>
>> T. +32 16 40 12 82
>> F. +32 16 40 84 77
>> www.esaturnus.com
>>
>> <http://www.esaturnus.com/>
>>
>> *For Service & Support *
>>
>> Support Line: +32 16 387210 or via email : supp...@esaturnus.com
>>
>>
>>
>>
>
>


Re: [ovirt-users] Reply: Re: Hosted-engine can not_switch

2018-04-26 Thread Martin Sivak
Hi,

Martin, can you please take a look? Even though the setup is a bit
weird (the hostname mainly..) it seems to be running, but the health
endpoint returns 404. Is there something maybe in SSO hostname
detection that would cause that? How can we debug this more?

A current summary of the issue at hand:

> hosted-engine.ovirt.com=192.168.122.91, it is engine VM, visit  
> hosted-engine.ovirt.com show me web UI.

> [root@hosted-engine2 ~]# curl http://hosted-engine.ovirt.com/ovirt-engine/services/health
> Error404 - Not Found

Best regards

Martin Sivak

On Thu, Apr 26, 2018 at 9:32 AM, dhy336 <dhy...@sina.com> wrote:
> sorry, I used 192.168.223 to replace 192.168.122.65, forget tell you,
> hosted-engine.ovirt.com=192.168.122.91, it is engine VM, visit
> hosted-engine.ovirt.com show me web UI.
>
> Sent from the NetEase Mail mobile app
>
> On 2018-04-26 14:52, Martin Sivak wrote:
>
> Hi,
>
>> hosted-engine1 : 192.168.122.66
>> hosted-engine2 : 192.168.122.223
>
> But you said in an earlier email that:
>
>> I hava two node, A:192.168.122.65 ,   B:192.168.122.66
>
> Make sure your names resolve properly. So far it does exactly what it
> is supposed to do - when the engine is unreachable, it tries
> restarting it. Did you really use hosted-engine.ovirt.com as the fqdn?
> Are you sure it resolves to whatever IP the VM has (192.168.122.91)?
>
> Maybe you used /etc/hosts to configure the name on the first host and
> in the VM, but miss the record on the second host?
>
> What does $(host hosted-engine.ovirt.com) show you?
>
>> I can not visit web UI, but my engine VM is run, i can login it.  engine
>> has
>> some error
>>
>>
>> VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})'
>>  execution failed: java.net.NoRouteToHostException: No route to host
>
> I told you before. This is normal as it is trying to figure out
> whether the host is up.
>
>
> Best regards
>
> Martin Sivak
>
>
> On Thu, Apr 26, 2018 at 4:14 AM,  <dhy...@sina.com> wrote:
>> engine VM:192.168.122.91
>> hosted-engine1 : 192.168.122.66
>> hosted-engine2 : 192.168.122.223
>>
>> I can not visit web UI, but my engine VM is run, i can login it.  engine
>> has
>> some error
>>
>>  2018-04-25 18:35:03,401+08 INFO
>>  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp
>> Reactor)
>>  [] Connecting to hosted-engine1/192.168.122.66
>>  2018-04-25 18:35:06,411+08 ERROR
>>  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
>>  (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Command
>>  'GetAllVmStatsVDSCommand(HostName = hosted-engine1,
>>
>>
>> VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})'
>>  execution failed: java.net.NoRouteToHostException: No route to host
>>
>> 
>> [root@hosted-engine2 ~]# hosted-engine --check-liveliness
>> Hosted Engine is not up!
>>
>> -
>> [root@hosted-engine2 ~]# curl
>> http://hosted-engine.ovirt.com/ovirt-engine/services/health
>> Error404 - Not Found
>>
>> Note: this command is blocked ,it takes 5 minutes
>>
>> -
>> --== Host 1 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date  : False
>> Hostname   : hosted-engine1
>> Host ID: 1
>> Engine status  : unknown stale-data
>> Score  : 3400
>> stopped: False
>> Local maintenance  : False
>> crc32  : 1eae8968
>> local_conf_timestamp   : 48907
>> Host timestamp : 48907
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=48907 (Thu Apr 26 01:57:14 2018)
>> host-id=1
>> score=3400
>> vm_conf_refresh_time=48907 (Thu Apr 26 01:57:15 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=EngineUp
>> stopped=False
>>
>>
>> --== Host 2 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date  : Tru

Re: [ovirt-users] Reply: Re: Hosted-engine can not_switch

2018-04-26 Thread Martin Sivak
Hi,

> hosted-engine1 : 192.168.122.66
> hosted-engine2 : 192.168.122.223

But you said in an earlier email that:

> I hava two node, A:192.168.122.65 ,   B:192.168.122.66

Make sure your names resolve properly. So far it does exactly what it
is supposed to do - when the engine is unreachable, it tries
restarting it. Did you really use hosted-engine.ovirt.com as the fqdn?
Are you sure it resolves to whatever IP the VM has (192.168.122.91)?

Maybe you used /etc/hosts to configure the name on the first host and
in the VM, but the record is missing on the second host?

What does $(host hosted-engine.ovirt.com) show you?
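
The same check can be scripted and run on every host. A minimal sketch,
assuming the engine FQDN and IP quoted in this thread; the function name is
made up, and the resolver argument exists only so the check can be exercised
without touching real DNS. socket.gethostbyname also honours /etc/hosts, so
it sees the same records the host itself would use.

```python
import socket


def resolves_to(fqdn, expected_ip, resolver=socket.gethostbyname):
    """Return True when `fqdn` resolves to `expected_ip` on this machine."""
    try:
        return resolver(fqdn) == expected_ip
    except OSError:
        # socket.gaierror (name not found) is a subclass of OSError.
        return False


if __name__ == "__main__":
    # Values from this thread; run this on each hosted-engine host.
    print(resolves_to("hosted-engine.ovirt.com", "192.168.122.91"))
```

A False on one host and True on the other would point at a missing or stale
/etc/hosts record on the failing host.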

> I can not visit web UI, but my engine VM is run, i can login it.  engine has
> some error
>
> VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})'
>  execution failed: java.net.NoRouteToHostException: No route to host

I told you before. This is normal as it is trying to figure out
whether the host is up.


Best regards

Martin Sivak


On Thu, Apr 26, 2018 at 4:14 AM,  <dhy...@sina.com> wrote:
> engine VM:192.168.122.91
> hosted-engine1 : 192.168.122.66
> hosted-engine2 : 192.168.122.223
>
> I can not visit web UI, but my engine VM is run, i can login it.  engine has
> some error
>
>  2018-04-25 18:35:03,401+08 INFO
>  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
>  [] Connecting to hosted-engine1/192.168.122.66
>  2018-04-25 18:35:06,411+08 ERROR
>  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
>  (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Command
>  'GetAllVmStatsVDSCommand(HostName = hosted-engine1,
>
> VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})'
>  execution failed: java.net.NoRouteToHostException: No route to host
> 
> [root@hosted-engine2 ~]# hosted-engine --check-liveliness
> Hosted Engine is not up!
> -
> [root@hosted-engine2 ~]# curl
> http://hosted-engine.ovirt.com/ovirt-engine/services/health
> Error404 - Not Found
>
> Note: this command is blocked ,it takes 5 minutes
> -
> --== Host 1 status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : False
> Hostname   : hosted-engine1
> Host ID: 1
> Engine status  : unknown stale-data
> Score  : 3400
> stopped: False
> Local maintenance  : False
> crc32  : 1eae8968
> local_conf_timestamp   : 48907
> Host timestamp : 48907
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=48907 (Thu Apr 26 01:57:14 2018)
> host-id=1
> score=3400
> vm_conf_refresh_time=48907 (Thu Apr 26 01:57:15 2018)
> conf_on_shared_storage=True
> maintenance=False
> state=EngineUp
> stopped=False
>
>
> --== Host 2 status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : True
> Hostname   : hosted-engine2
> Host ID: 2
> Engine status  : {"reason": "failed liveliness check",
> "health": "bad", "vm": "up", "detail": "Up"}
> Score  : 3000
> stopped: False
> Local maintenance  : False
> crc32  : 1b92756d
> local_conf_timestamp   : 44057
> Host timestamp : 44057
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=44057 (Thu Apr 26 02:00:57 2018)
> host-id=2
> score=3000
> vm_conf_refresh_time=44057 (Thu Apr 26 02:00:57 2018)
> conf_on_shared_storage=True
> maintenance=False
> state=EngineStarting
> stopped=False
>
>
>
>
>
>
> - Original Message -ovirt
> From: Martin Sivak <msi...@redhat.com>
> To: dhy336 <dhy...@sina.com>
> Cc: users <users@ovirt.org>
> Subject: Re: Re: [ovirt-users] Reply: Re: Hosted-engine can not_switch
> Date: 2018-04-25 20:41
>
>
>> 2018-04-25 18:35:06,411+08 ERROR
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.G

Re: [ovirt-users] Reply: Re: Hosted-engine can not_switch

2018-04-25 Thread Martin Sivak
> 2018-04-25 18:35:06,411+08 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
> (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Command
> 'GetAllVmStatsVDSCommand(HostName = hosted-engine1,
> VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})'
> execution failed: java.net.NoRouteToHostException: No route to host

This is expected and normal. The ovirt-engine service is trying to
find out whether host A is still unreachable or not. This is not the
issue you are looking for.

> 192.168.122.66 has been powered off, and hosted engine VM run in
> 192.168.122.223, I think engine should connect to 192.168.122.223,

You are mixing the IP of the engine VM and the IP of a host. The
engine runs in a VM with a stable address, .122.223 (independent of
which host the VM runs on), and manages two hosts, .122.65 and .122.66.
The engine constantly monitors all its hosts, and that means it tries
to connect to them every now and then.

Please execute the two following commands on Host B and show us the
results (use the proper fqdn):

$(hosted-engine --check-liveliness)
$(curl http://{fqdn}/ovirt-engine/services/health)


Best regards

Martin Sivak

On Wed, Apr 25, 2018 at 2:34 PM,  <dhy...@sina.com> wrote:
> I  login in engine VM by (#hosted-engine --console) , I find ovirt-engine
> process. and I find some error in /var/log/ovirt-engine/engine.log
>
> 192.168.122.66 has been powered off, and hosted engine VM run in
> 192.168.122.223, I think engine should connect to 192.168.122.223,
>
>
> 2018-04-25 18:35:03,401+08 INFO
> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
> [] Connecting to hosted-engine1/192.168.122.66
> 2018-04-25 18:35:06,411+08 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
> (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Command
> 'GetAllVmStatsVDSCommand(HostName = hosted-engine1,
> VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})'
> execution failed: java.net.NoRouteToHostException: No route to host
> 2018-04-25 18:35:06,411+08 INFO
> [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher]
> (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Failed to fetch vms
> info for host 'hosted-engine1' - skipping VMs monitoring.
> 2018-04-25 18:35:21,420+08 INFO
> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
> [] Connecting to hosted-engine1/192.168.122.66
> 2018-04-25 18:35:24,430+08 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
> (EE-ManagedThreadFactory-engineScheduled-Thread-1) [] Command
> 'GetAllVmStatsVDSCommand(HostName = hosted-engine1,
> VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})'
> execution failed: java.net.NoRouteToHostException: No route to host
> 2018-04-25 18:35:24,431+08 INFO
> [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher]
> (EE-ManagedThreadFactory-engineScheduled-Thread-1) [] Failed to fetch vms
> info for host 'hosted-engine1' - skipping VMs monitoring.
> 2018-04-25 18:35:39,438+08 INFO
> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
> [] Connecting to hosted-engine1/192.168.122.66
>
>
>
> - Original Message -
> From: Martin Sivak <msi...@redhat.com>
> To: dhy336 <dhy...@sina.com>
> Cc: users <users@ovirt.org>
> Subject: Re: [ovirt-users] Reply: Re: Hosted-engine can not_switch
> Date: 2018-04-25 20:27
>
>
> The engine will try connecting to all registered hosts all the time.
> That is normal.
> If your host can reach the engine then check whether it can reach
> http://{fqdn}/ovirt-engine/services/health as that is what is used to
> make sure the engine is alive.
> Best regards
> Martin Sivak
> On Wed, Apr 25, 2018 at 2:15 PM, <dhy...@sina.com> wrote:
>> Hi Martin,
>>
>> thank you for answer
>> my host can reach the engine, I confuse why engine connect to another host
>> which has been power off by me?
>>
>> - Original Message -
>> From: Martin Sivak <msi...@redhat.com>
>> To: dhy336 <dhy...@sina.com>, users <users@ovirt.org>
>> Subject: Re: Re: Re: Re: Re: Re: [ovirt-users] Reply: Re: Hosted-engine can
>> not_switch
>> Date: 2018-04-25 19:12
>>
>> It is as I expected:
>> Engine status : {"reason": "failed liveliness check"
>> The host can't talk to the ovirt-engine service. Please make sure the
>> host can reach the engine fqdn as configured in
>> /etc/ovirt-hosted-engine/hosted-engine.conf on the fqdn= line.
>> You can check it manually by executing $(hosted-engine
>> --check-liveliness) from the host.
>> Be

Re: [ovirt-users] Reply: Re: Hosted-engine can not_switch

2018-04-25 Thread Martin Sivak
The engine will try connecting to all registered hosts all the time.
That is normal.

If your host can reach the engine then check whether it can reach
http://{fqdn}/ovirt-engine/services/health as that is what is used to
make sure the engine is alive.
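
A rough way to script that check from a host is sketched below. The URL
pattern is the one given above; treating HTTP 200 as "healthy" is an
assumption made for this sketch, and the HA agent's own check may look at
more than the status code. The function names and the opener argument are
illustrative only.

```python
import urllib.error
import urllib.request


def health_url(fqdn):
    """Build the engine health URL for the given engine FQDN."""
    return "http://{}/ovirt-engine/services/health".format(fqdn)


def engine_is_healthy(fqdn, timeout=10, opener=urllib.request.urlopen):
    """Return True when the health URL answers with HTTP 200 (assumed criterion)."""
    try:
        with opener(health_url(fqdn), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # urlopen raises HTTPError (a URLError) for 404, and URLError or
        # OSError for DNS failures, refused connections and timeouts.
        return False


if __name__ == "__main__":
    # Replace with the fqdn= value from your hosted-engine.conf.
    print(engine_is_healthy("hosted-engine.ovirt.com"))
```

A 404 from this URL, as reported elsewhere in this thread, would make the
sketch return False even though the web server itself answers.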

Best regards

Martin Sivak

On Wed, Apr 25, 2018 at 2:15 PM,  <dhy...@sina.com> wrote:
> Hi Martin,
>
> thank you for answer
> my host can reach the engine, I confuse why engine connect to another host
> which has been power off by me?
>
> - Original Message -
> From: Martin Sivak <msi...@redhat.com>
> To: dhy336 <dhy...@sina.com>, users <users@ovirt.org>
> Subject: Re: Re: Re: Re: Re: Re: [ovirt-users] Reply: Re: Hosted-engine can
> not_switch
> Date: 2018-04-25 19:12
>
> It is as I expected:
> Engine status : {"reason": "failed liveliness check"
> The host can't talk to the ovirt-engine service. Please make sure the
> host can reach the engine fqdn as configured in
> /etc/ovirt-hosted-engine/hosted-engine.conf on the fqdn= line.
> You can check it manually by executing $(hosted-engine
> --check-liveliness) from the host.
> Best regards
> Martin Sivak
> On Wed, Apr 25, 2018 at 12:51 PM, <dhy...@sina.com> wrote:
>> Hi,
>>
>> two node :
>> 192.168.122.66 hosted-engine1
>> 192.168.122.223 hosted-engine2
>>
>> I power off hosted-engine1, so I do not attach hosted-engine1`s log,
>>
>> [root@hosted-engine2 ~]# hosted-engine --vm-status
>>
>> --== Host 1 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date : False
>> Hostname : hosted-engine1
>> Host ID : 1
>> Engine status : unknown stale-data
>> Score : 3400
>> stopped : False
>> Local maintenance : False
>> crc32 : a7af0afa
>> local_conf_timestamp : 11485
>> Host timestamp : 11485
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=11485 (Wed Apr 25 10:08:34 2018)
>> host-id=1
>> score=3400
>> vm_conf_refresh_time=11485 (Wed Apr 25 10:08:34 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=EngineUp
>> stopped=False
>>
>>
>> --== Host 2 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date : True
>> Hostname : hosted-engine2
>> Host ID : 2
>> Engine status : {"reason": "failed liveliness check",
>> "health": "bad", "vm": "up", "detail": "Up"}
>> Score : 3000
>> stopped : False
>> Local maintenance : False
>> crc32 : a2e82883
>> local_conf_timestamp : 6278
>> Host timestamp : 6278
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=6278 (Wed Apr 25 10:37:44 2018)
>> host-id=2
>> score=3000
>> vm_conf_refresh_time=6278 (Wed Apr 25 10:37:44 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=EngineStop
>> stopped=False
>> timeout=Thu Jan 1 09:49:38 1970
>>
>>
>>
>> - Original Message -
>> From: Martin Sivak <msi...@redhat.com>
>> To: dhy336 <dhy...@sina.com>, users <users@ovirt.org>
>> Subject: Re: Re: Re: Re: Re: [ovirt-users] Reply: Re: Hosted-engine can
>> not_switch
>> Date: 2018-04-25 17:41
>>
>>
>> Please attach the output of hosted-engine --vm-status and the
>> /var/log/ovirt-hosted-engine-ha/agent.log file from both hosts.
>> The VM will restart if the ovirt-engine service does not become
>> available within timeout. And that might mean couple of things - the
>> FQDN of the engine is wrong, the engine needs something that was only
>> available on the dead host (A) like some storage, host B cannot ping
>> the gateway..
>> Best regards
>> Martin Sivak
>> On Wed, Apr 25, 2018 at 11:33 AM, <dhy...@sina.com> wrote:
>>> sorry, I mis-represent,
>>>
>>> I hava two node, A:192.168.122.65 , B:192.168.122.66 with hosted-engine.
>>>
>>> testing engine HA :
>>>
>>> first two node is up, and hosted-engine VM run in A, then I poweroff A,
>>> and
>>> after 3 minutes, B start it`s hosted engine VM,
>>> But it`s ovirt-engine connect to host A, and continue for about 10
>>> minutes,
>>> then hosted engine VM restart.
>>> - Original Message -
>>> From: Martin Sivak <msi...@redhat.com>
>>> To: dhy336 <dhy...@sina.com>
>>> Subject: Re: Re: Re: Re: [ovirt-users] Reply: Re: 

Re: [ovirt-users] Reply: Re: Hosted-engine can not_switch

2018-04-25 Thread Martin Sivak
It is as I expected:

Engine status : {"reason": "failed liveliness check"

The host can't talk to the ovirt-engine service. Please make sure the
host can reach the engine fqdn as configured in
/etc/ovirt-hosted-engine/hosted-engine.conf on the fqdn= line.

You can check it manually by executing $(hosted-engine
--check-liveliness) from the host.
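
To see which name the host will try, the fqdn= line can be read straight from
that file. A small sketch, assuming the file is a plain list of key=value
lines; the function name and the sample content in the demo are made up, only
the fqdn key and the file path come from this message.

```python
def read_engine_fqdn(conf_text):
    """Return the value of the fqdn= line, or None if the key is absent."""
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Skip comments and anything that is not a key=value pair.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "fqdn":
            return value.strip()
    return None


if __name__ == "__main__":
    # On a real host, read /etc/ovirt-hosted-engine/hosted-engine.conf instead.
    sample = "# sample content\nfqdn=engine.example.com\nsome_other_key=value\n"
    print(read_engine_fqdn(sample))
```

The returned name is the one that has to resolve and be reachable from every
hosted-engine host.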

Best regards

Martin Sivak

On Wed, Apr 25, 2018 at 12:51 PM,  <dhy...@sina.com> wrote:
> Hi,
>
>  two node :
> 192.168.122.66 hosted-engine1
> 192.168.122.223 hosted-engine2
>
> I power off  hosted-engine1, so I do not attach  hosted-engine1`s log,
>
> [root@hosted-engine2 ~]# hosted-engine --vm-status
>
> --== Host 1 status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : False
> Hostname   : hosted-engine1
> Host ID: 1
> Engine status  : unknown stale-data
> Score  : 3400
> stopped: False
> Local maintenance  : False
> crc32  : a7af0afa
> local_conf_timestamp   : 11485
> Host timestamp : 11485
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=11485 (Wed Apr 25 10:08:34 2018)
> host-id=1
> score=3400
> vm_conf_refresh_time=11485 (Wed Apr 25 10:08:34 2018)
> conf_on_shared_storage=True
> maintenance=False
> state=EngineUp
> stopped=False
>
>
> --== Host 2 status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : True
> Hostname   : hosted-engine2
> Host ID: 2
> Engine status  : {"reason": "failed liveliness check",
> "health": "bad", "vm": "up", "detail": "Up"}
> Score  : 3000
> stopped: False
> Local maintenance  : False
> crc32  : a2e82883
> local_conf_timestamp   : 6278
> Host timestamp : 6278
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=6278 (Wed Apr 25 10:37:44 2018)
> host-id=2
> score=3000
> vm_conf_refresh_time=6278 (Wed Apr 25 10:37:44 2018)
> conf_on_shared_storage=True
> maintenance=False
> state=EngineStop
> stopped=False
> timeout=Thu Jan  1 09:49:38 1970
>
>
>
> - Original Message -
> From: Martin Sivak <msi...@redhat.com>
> To: dhy336 <dhy...@sina.com>, users <users@ovirt.org>
> Subject: Re: Re: Re: Re: Re: [ovirt-users] 回复:Re: Hosted-engine can
> not_switch
> Date: 2018-04-25 17:41
>
>
> Please attach the output of hosted-engine --vm-status and the
> /var/log/ovirt-hosted-engine-ha/agent.log file from both hosts.
> The VM will restart if the ovirt-engine service does not become
> available within the timeout. That can mean a couple of things: the
> FQDN of the engine is wrong, the engine needs something that was only
> available on the dead host (A), such as some storage, or host B cannot
> ping the gateway.
> Best regards
> Martin Sivak
> On Wed, Apr 25, 2018 at 11:33 AM, <dhy...@sina.com> wrote:
>> Sorry, I described it incorrectly.
>>
>> I have two nodes with hosted-engine, A: 192.168.122.65 and B: 192.168.122.66.
>>
>> Testing engine HA:
>>
>> At first both nodes are up and the hosted-engine VM runs on A. Then I
>> power off A, and after 3 minutes B starts its hosted engine VM.
>> But its ovirt-engine still tries to connect to host A. This continues for
>> about 10 minutes, then the hosted engine VM restarts.
>> - Original Message -
>> From: Martin Sivak <msi...@redhat.com>
>> To: dhy336 <dhy...@sina.com>
>> Subject: Re: Re: Re: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
>> Date: 2018-04-25 17:11
>>
>>
>> Your hosted engine VM has its own address that does not depend on
>> which host it is currently running. So it should be available on the
>> same address no matter where the VM is running.
>> Best regards
>> Martin Sivak
>> On Wed, Apr 25, 2018 at 9:07 AM, <dhy...@sina.com> wrote:
>>>>> I deployed two nodes for hosted engine. At first the hosted engine VM
>>>>> ran on 192.168.122.65. I powered off this host and the hosted-engine VM
>>>>> switched to the other host, but ovirt engine still tries to connect to
>>>>> 192.168.122.65. If I restart the ovirt-engine server, it works.
>>>

Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch

2018-04-25 Thread Martin Sivak
Please attach the output of hosted-engine --vm-status and the
/var/log/ovirt-hosted-engine-ha/agent.log file from both hosts.

The VM will restart if the ovirt-engine service does not become
available within the timeout. That can mean a couple of things: the
FQDN of the engine is wrong, the engine needs something that was only
available on the dead host (A), such as some storage, or host B cannot
ping the gateway.

Best regards

Martin Sivak

On Wed, Apr 25, 2018 at 11:33 AM,  <dhy...@sina.com> wrote:
> Sorry, I described it incorrectly.
>
> I have two nodes with hosted-engine, A: 192.168.122.65 and B: 192.168.122.66.
>
> Testing engine HA:
>
> At first both nodes are up and the hosted-engine VM runs on A. Then I
> power off A, and after 3 minutes B starts its hosted engine VM.
> But its ovirt-engine still tries to connect to host A. This continues for
> about 10 minutes, then the hosted engine VM restarts.
> - Original Message -
> From: Martin Sivak <msi...@redhat.com>
> To: dhy336 <dhy...@sina.com>
> Subject: Re: Re: Re: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
> Date: 2018-04-25 17:11
>
>
> Your hosted engine VM has its own address that does not depend on
> which host it is currently running. So it should be available on the
> same address no matter where the VM is running.
> Best regards
> Martin Sivak
> On Wed, Apr 25, 2018 at 9:07 AM, <dhy...@sina.com> wrote:
>>>> I deployed two nodes for hosted engine. At first the hosted engine VM
>>>> ran on 192.168.122.65. I powered off this host and the hosted-engine VM
>>>> switched to the other host, but ovirt engine still tries to connect to
>>>> 192.168.122.65. If I restart the ovirt-engine server, it works.
>>
>> I think this is an error: the hosted engine VM has powered up on the
>> other host (192.168.122.66), so the hosted engine should connect to
>> host 192.168.122.66, not to host 192.168.122.65?
>>
>> thanks
>>
>> - Original Message -
>> From: Martin Sivak <msi...@redhat.com>
>> To: dhy336 <dhy...@sina.com>
>> Cc: users <users@ovirt.org>
>> Subject: Re: Re: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
>> Date: 2018-04-20 18:28
>>
>>
>> Hi,
>> No, this is not an error. You killed the host without moving it to
>> maintenance first. The engine has no way to distinguish this from
>> temporary network failure for example. Give it some time and the host
>> will move its status to one of the error states and handle the highly
>> available VMs on it (if fencing is properly configured).
>> Best regards
>> Martin Sivak
>> On Fri, Apr 20, 2018 at 12:13 PM, <dhy...@sina.com> wrote:
>>> Is this behavior not an error?
>>> - Original Message -
>>> From: Martin Sivak <msi...@redhat.com>
>>> To: dhy336 <dhy...@sina.com>
>>> Cc: users <users@ovirt.org>
>>> Subject: Re: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
>>> Date: 2018-04-20 18:05
>>>
>>>
>>> Hi,
>>> the engine does not know you killed the host. It will notice
>>> eventually and handle the situation. Just give it time (5 minutes or
>>> so).
>>> Best regards
>>> --
>>> Martin Sivak
>>> SLA / oVirt
>>> On Fri, Apr 20, 2018 at 12:00 PM, <dhy...@sina.com> wrote:
>>>> Hi, thanks for your feedback. I have another question.
>>>>
>>>> I deployed two nodes for hosted engine. At first the hosted engine VM
>>>> ran on 192.168.122.65. I powered off this host and the hosted-engine VM
>>>> switched to the other host, but ovirt engine still tries to connect to
>>>> 192.168.122.65. If I restart the ovirt-engine server, it works.
>>>>
>>>>
>>>> 2018-04-20 17:13:04,692+08 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-98) [] Command 'GetAllVmStatsVDSCommand(HostName = hosted-engine2, VdsIdVDSCommandParametersBase:{hostId='a5428ef7-9df6-4a86-91de-7e36fda340fa'})' execution failed: java.net.NoRouteToHostException: No route to host
>>>> 2018-04-20 17:13:04,693+08 INFO [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (EE-ManagedThreadFactory-engineScheduled-Thread-98) [] Failed to fetch vms info for host 'hosted-engin2' - skipping VMs monitoring.
>>>> 2018-04-20 17:13:19,710+08 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to hosted-e

Re: [ovirt-users] hosted-engine --deploy ERROR

2018-04-24 Thread Martin Sivak
Hi,

was that a clean host? What does virsh -r net-list show?
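
If you prefer to check this from a script, here is a rough sketch that
parses the table virsh prints and tells you whether a network called
"default" is defined. The function name and the sample text are mine and
purely illustrative. I am assuming you capture the output of
virsh -r net-list --all, so that inactive networks are listed as well.

```python
def libvirt_networks(net_list_output):
    """Parse the text printed by `virsh net-list --all` into {name: state}."""
    networks = {}
    for line in net_list_output.splitlines():
        fields = line.split()
        # Skip blank lines, the dashed separator and the column header.
        if len(fields) < 2 or fields[0] == "Name":
            continue
        networks[fields[0]] = fields[1]
    return networks


# Illustrative sample only, not output from the host in this thread.
SAMPLE = """\
 Name      State    Autostart   Persistent
--------------------------------------------
 default   active   yes         yes
"""

networks = libvirt_networks(SAMPLE)
print("default network state:", networks.get("default", "missing"))
```

If "default" is missing from the result, that would match the "no network
with matching name 'default'" error in your deployment log.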

Best regards

Martin Sivak

On Tue, Apr 24, 2018 at 9:12 AM,  <dhy...@sina.com> wrote:
> Hi,
>
> I deploy hosted engine but it has some error,
>
> # hosted-engine --deploy
>
> [ INFO  ] TASK [Check status of default libvirt network]
> [ INFO  ] changed: [localhost]
> [ INFO  ] TASK [Activate default libvirt network]
> [ ERROR ] fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["virsh",
> "net-start", "default"], "delta": "0:00:00.074655", "end": "2018-04-24
> 14:59:12.548459", "msg": "non-zero return code", "rc": 1, "start":
> "2018-04-24 14:59:12.473804", "stderr": "error: failed to get network
> 'default'\nerror: Network not found: no network with matching name
> 'default'", "stderr_lines": ["error: failed to get network 'default'",
> "error: Network not found: no network with matching name 'default'"],
> "stdout": "", "stdout_lines": []}
> [ INFO  ] TASK [include_tasks]
> [ INFO  ] ok: [localhost]
> [ INFO  ] TASK [Remove local vm dir]
> [ INFO  ] changed: [localhost]
> [ INFO  ] TASK [Notify the user about a failure]
> [ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The
> system may not be provisioned according to the playbook results: please
> check the logs for the issue, fix accordingly or re-deploy from scratch.\n"}
> [ ERROR ] Failed to execute stage 'Closing up': Failed executing
> ansible-playbook
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>


Re: [ovirt-users] Duplicate host_id after deployment via GUI, how is the hosted engine host_id selected?

2018-04-24 Thread Martin Sivak
Hi Thomas,

I posted https://gerrit.ovirt.org/#/c/90549/ in an attempt to see how
to improve the clarity. The result is in the attachment.

The ID is still different, but the section header shows the hostname
as well. Do you think it would be enough for you? We can't expect
people to touch the DB every time they add a host out of order and the
SPM ID is not selectable by user, because it needs to fit some storage
constraints (we use it for protecting storage metadata) and must match
the hosted engine ID now.

Best regards

Martin Sivak

On Mon, Apr 23, 2018 at 2:47 PM, Thomas Klute <kl...@ingenit.com> wrote:
> Hi Martin,
>
> I fully understand.
> The only reason is that it's not only me administering this oVirt setup,
> it's a setup for a client that has 2 local administrators.
> We are trying to eliminate any configuration that might cause confusion.
> When doing a
>
> hosted-engine --vm-status
>
> a list like this is shown:
>
> --== Host 1 status ==--
> ...
> Hostname   : ovirt1.munk.de
> Host ID: 1
> ...
>
> --== Host 2 status ==--
> ...
> Hostname   : ovirt3.munk.de
> Host ID: 2
> ...
>
> and so on.
> This "Host 2" -> ovirt3 will sooner or later cause confusion and will
> lead to administrative actions applied to the wrong hosts.
>
> I'll give it a try and try to align the hosts and ids. I'll backup the
> old values in order to be able to roll back the changes if it does not work.
>
> Thanks for your help!!
> Best regards,
>  Thomas
>
>
>
>> Hi Thomas,
>>
>> is there a real need to have them aligned? Both SPM ID and Hosted
>> engine ids are quite hidden and accompanied by hostname where visible.
>>
>> I understand the urge to have everything neatly organized, but playing
>> with hosted engine IDs and SPM IDs will bring you nothing but trouble.
>>
>> Theoretically you could put all hosts to maintenance, hosted engine to
>> global maintenance, stop the engine service and update the DB and
>> hosted engine config files. Changing anything when active is a big
>> NO-NO. And I can't promise stability even if you stop everything
>> first.
>>
>> The short version: Please change hosted engine ID to match SPM ID
>> (/etc/ovirt-hosted-engine/hosted-engine.conf) and ignore the hostname
>> vs ID mismatch. All other options might cost you..
>>
>> Best regards
>>
>> --
>> Martin Sivak
>> SLA / oVirt
>>
>> On Mon, Apr 23, 2018 at 2:26 PM, Thomas Klute <kl...@ingenit.com> wrote:
>>> Dear Simone,
>>>
>>> thanks for the help, I already have noticed, that there is a mismatch
>>> between SPM IDs and the hosts.
>>> From my knowledge this may be our fault.
>>>
>>>  vds_name | vds_spm_id
>>>  oVirt4   |  3
>>>  oVirt2   |  4
>>>  oVirt6   |  6
>>>  oVirt5   |  5
>>>  oVirt1   |  1
>>>  oVirt3   |  2
>>>
>>> oVirt3 was re-installed using the node ng image - and was assigned id 2 that
>>> was available at that time (taken into account this database query).
>>> It was not available in fact because of the host ovirt2 having the host_id=2
>>> entry in the configuration on the host and the sanlock having locked the
>>> appropriate space.
>>>
>>> I can't remember where this mismatch came from but I assume that some time
>>> ago someone maybe manually changed the entry on oVirt2 host in order to
>>> match with the hostname (oVirtX -> Host id X).
>>> Thus I think it is not a bug at the moment.
>>>
>>> But that leads me to the next question:
>>>
>>> Is there a manual way to align numbers in the hostnames with the SPM IDs?
>>> (oVirt4 should be spm id 4)
>>>
>>> Does this work? Update the database, change the entry in hosted-engine.conf
>>> and then reboot the node?
>>>
>>> Thanks & best regards,
>>> Thomas
>>>
>>>
>>>
>>>
>>> On Mon, Apr 23, 2018 at 1:04 PM, Martin Sivak <msi...@redhat.com> wrote:
>>>> Hello,
>>>>
>>>> the ID should be coming from the SPM ID of the host as assigned by the
>>>> engine and duplicates should not be happening indeed.
>>>>
>>>> Can you please tell us what version of engine you have and how was
>>>> first hosted engine node deployed? We recently changed the default
>>>> deployment method (4.2.2 in fact) and we would

Re: [ovirt-users] Hosted Engine won't start, how to debug?

2018-04-23 Thread Martin Sivak
Hi,

the VM configuration is stored in the engine database and we generate
the vm.conf using the data the engine exports. So you are right, we
overwrite the file.

You might be able to simply edit the VM in the webadmin and enable the
VNC (or Spice) console there. I am not entirely sure if we allow that
atm, but we might.

If that is not allowed, then you will have to edit the DB. The best
way would be to change the vm_static table's origin (should be 6 -
managed hosted engine) field to 3 - ovirt. The webadmin will then
allow you to edit the VM fully. Then change the origin back to 6. All
that for a VM with vm_name = "HostedEngine".

So the fallback procedure for a default install of webadmin (the
database name = engine):

- put hosted engine into global maintenance
- ssh into the engine VM
- sudo -i -u postgres
- psql engine -c "UPDATE vm_static SET origin=3 WHERE vm_name='HostedEngine'"
- edit the VM in webadmin (it should lose its crown status after a
while) and add VNC
- psql engine -c "UPDATE vm_static SET origin=6 WHERE vm_name='HostedEngine'"
- remove global maintenance
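
The same steps can be written down as a dry-run checklist. This is only a
sketch: it prints the commands in order and runs nothing. The two psql
commands are the ones from the list above. The
hosted-engine --set-maintenance calls are how I would enter and leave global
maintenance, and the function name fallback_plan is made up for this
example.

```python
def fallback_plan(db_name="engine", vm_name="HostedEngine"):
    """Return the fallback steps in order as (where, command or action) pairs."""

    def set_origin(value):
        # Build the psql call that flips the origin field of the engine VM.
        sql = "UPDATE vm_static SET origin=%d WHERE vm_name='%s'" % (value, vm_name)
        return 'psql %s -c "%s"' % (db_name, sql)

    return [
        ("host", "hosted-engine --set-maintenance --mode=global"),
        ("engine VM", "sudo -i -u postgres"),
        ("engine VM", set_origin(3)),
        ("webadmin", "edit the %s VM and add VNC" % vm_name),
        ("engine VM", set_origin(6)),
        ("host", "hosted-engine --set-maintenance --mode=none"),
    ]


# Dry run: print the checklist, execute nothing.
for number, (where, step) in enumerate(fallback_plan(), start=1):
    print("%d. [%s] %s" % (number, where, step))
```

Keeping the origin=3 and origin=6 updates in one place makes it harder to
forget the second one, which is the step that gives the VM its hosted engine
status back.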

All usual disclaimers apply. I recommend a backup if it is a production
environment. I can only propose this because I used it repeatedly
during development tests, so production use is at your own risk.

Best regards

Martin Sivak

On Mon, Apr 23, 2018 at 3:45 PM, Thomas Klute <kl...@ingenit.com> wrote:
> Dear Martin,
>
> a follow up question regarding this case:
> we managed to get the hosted engine up and running again, but we'd like
> to permanently add the missing graphics device to the vm.conf (instead
> of starting with the vm-custom.conf in case of trouble).
> How do I add this line to the vm.conf config file (it seems to be
> overwritten after a few seconds - I suppose it's copied from somewhere?)
>
> Thanks,
> Thomas
>
>> Hi,
>>
>> the vnc device is there by default (I copied it out of my own hosted
>> engine instance), I do not know why it was missing in your case.
>>
>> Best regards
>>
>> Martin Sivak
>>
>> On Fri, Apr 13, 2018 at 5:13 PM, Thomas Klute <kl...@ingenit.com> wrote:
>>> Dear Martin,
>>>
>>> yes, that worked. Thank you so much!!
>>> We were able to see that the latest kernel update failed and did not
>>> create the initramfs file and thus the boot process failed with a kernel
>>> panic.
>>>
>>> Debugging this problem took us many hours... It felt so complicated to
>>> connect a vnc to this vm - compared to a bare metal setup with display
>>> and keyboard.
>>> Wouldn't it be a good idea to have the vnc device in the config by default?
>>>
>>> Best regards,
>>>  Thomas
>>>
>>>> You need to be in global maintenance, but I think you already know that.
>>>> Then try updating the vm.conf like you already did and add this line:
>>>>
>>>> devices={device:vnc,type:graphics,deviceId:f1d0394e-b077-4ea6-99e5-b9b6b8fe073c,address:None}
>>>>
>>>> Then restart the VM using hosted-engine commands and try the VNC approach 
>>>> again.
>>>>
>>>> Best regards
>>>>
>>>> Martin Sivak
>>>>
>>>> On Fri, Apr 13, 2018 at 2:35 PM, Thomas Klute <kl...@ingenit.com> wrote:
>>>>> Dear Martin,
>>>>>
>>>>> thanks for the feedback.
>>>>> We already read this and tried it.
>>>>> It seems to me that the graphics device was removed from the hosted
>>>>> engine by some ovirt release.
>>>>>
>>>>> If I try to set a console password I see this message:
>>>>>
>>>>> hosted-engine --add-console-password
>>>>> Enter password:
>>>>> no graphics devices configured
>>>>>
>>>>> Furthermore, there is nothing listening on port 5900 after that.
>>>>> The HostedEngine qemu process shows a " -display none " as parameter and
>>>>> I have no idea where this can be changed.
>>>>>
>>>>> I already created a /var/run/ovirt-hosted-engine-ha/vm-custom.conf
>>>>> containing:
>>>>> display=vnc
>>>>> kvmEnable=true
>>>>>
>>>>> But nothing changed.
>>>>> I also edited the HostedEngine VM config using virsh and added a vnc
>>>>> display:
>>>>> /usr/bin/virsh -c
>>>>> qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf edit
>>>>> HostedEngine
>>>>>
>>>>> 

Re: [ovirt-users] Duplicate host_id after deployment via GUI, how is the hosted engine host_id selected?

2018-04-23 Thread Martin Sivak
Hi Thomas,

is there a real need to have them aligned? Both SPM ID and Hosted
engine ids are quite hidden and accompanied by hostname where visible.

I understand the urge to have everything neatly organized, but playing
with hosted engine IDs and SPM IDs will bring you nothing but trouble.

Theoretically you could put all hosts to maintenance, hosted engine to
global maintenance, stop the engine service and update the DB and
hosted engine config files. Changing anything when active is a big
NO-NO. And I can't promise stability even if you stop everything
first.

The short version: Please change hosted engine ID to match SPM ID
(/etc/ovirt-hosted-engine/hosted-engine.conf) and ignore the hostname
vs ID mismatch. All other options might cost you..
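
To see which hosts need that change, something like the following sketch
would do. It assumes hosted-engine.conf holds a plain host_id=N line and
that you already have the SPM ID from the vds_spm_id column. The function
names are hypothetical, and the sample values are the oVirt2 case from your
mail (host_id=2 on the host, SPM ID 4 in the database).

```python
def read_host_id(conf_text):
    """Return the integer host_id from hosted-engine.conf text, or None."""
    for line in conf_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.strip() == "host_id":
            return int(value.strip())
    return None


def host_id_matches_spm(conf_text, spm_id):
    """True if the hosted engine host_id equals the SPM ID from the engine DB."""
    return read_host_id(conf_text) == spm_id


# oVirt2 as described in the thread: host_id=2 in the config, vds_spm_id 4.
OVIRT2_CONF = "host_id=2\n"
OVIRT2_SPM_ID = 4

if not host_id_matches_spm(OVIRT2_CONF, OVIRT2_SPM_ID):
    print("host_id should be changed to %d" % OVIRT2_SPM_ID)
```

The sketch only reports the mismatch. The edit itself should still be done
with the host in maintenance, as described above.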

Best regards

--
Martin Sivak
SLA / oVirt

On Mon, Apr 23, 2018 at 2:26 PM, Thomas Klute <kl...@ingenit.com> wrote:
> Dear Simone,
>
> thanks for the help, I already have noticed, that there is a mismatch
> between SPM IDs and the hosts.
> From my knowledge this may be our fault.
>
>  vds_name | vds_spm_id
>  oVirt4   |  3
>  oVirt2   |  4
>  oVirt6   |  6
>  oVirt5   |  5
>  oVirt1   |  1
>  oVirt3   |  2
>
> oVirt3 was re-installed using the node ng image - and was assigned id 2 that
> was available at that time (taken into account this database query).
> It was not available in fact because of the host ovirt2 having the host_id=2
> entry in the configuration on the host and the sanlock having locked the
> appropriate space.
>
> I can't remember where this mismatch came from but I assume that some time
> ago someone maybe manually changed the entry on oVirt2 host in order to
> match with the hostname (oVirtX -> Host id X).
> Thus I think it is not a bug at the moment.
>
> But that leads me to the next question:
>
> Is there a manual way to align numbers in the hostnames with the SPM IDs?
> (oVirt4 should be spm id 4)
>
> Does this work? Update the database, change the entry in hosted-engine.conf
> and then reboot the node?
>
> Thanks & best regards,
> Thomas
>
>
>
>
> On Mon, Apr 23, 2018 at 1:04 PM, Martin Sivak <msi...@redhat.com> wrote:
>>
>> Hello,
>>
>> the ID should be coming from the SPM ID of the host as assigned by the
>> engine and duplicates should not be happening indeed.
>>
>> Can you please tell us what version of the engine you have and how the
>> first hosted engine node was deployed? We recently changed the default
>> deployment method (in 4.2.2, in fact) and we would like to know whether
>> this might be related to that change or not.
>>
>> Simone, do you know how to debug this? Are there logs we could use to
>> check the behavior? The host-deploy logs maybe?
>
>
> Thomas,
> can you please share the output of
>   sudo -u postgres scl enable rh-postgresql95 -- psql -d engine -c "select
> vds_id, vds_name, vds_spm_id from vds"
> executed on the engine VM
>
> and the content of /var/log/ovirt-engine/host-deploy (still from the engine
> VM)?
>
>>
>>
>> Best regards
>>
>> --
>> Martin Sivak
>> SLA / oVirt
>>
>> On Mon, Apr 23, 2018 at 12:54 PM, Thomas Klute <kl...@ingenit.com> wrote:
>> > Dear oVirt Users,
>> >
>> > we installed a third ovirt node using the ovirt node ng image (4.2.2),
>> > and
>> > added that node using the web interface of the hosted engine:
>> >
>> > Compute->Hosts->New - including the Hosted engine option "Choose hosted
>> > engine deployment action" set to "Deploy"
>> >
>> > After that we found out, that on the new node the host_id entry was
>> > causing
>> > problems. In
>> >
>> > /etc/ovirt-hosted-engine/hosted-engine.conf
>> >
>> > we found the entry
>> >
>> > ...
>> > host_id=2
>> > ...
>> >
>> > But that host id already exists and was alive during deployment process.
>> > So my questions are:
>> >
>> > How is the host id chosen (when adding a new host via GUI)?
>> > Is there an option to preselect the host id in the GUI for a new node?
>> > How can we prevent further duplicate host ids?
>> >
>> > Thanks and best regards,
>> > Thomas
>> >
>> >
>> > --
>> > 
>> >
>> >  Dipl.-Inform. Thomas Klute   kl...@ingenit.com
>> >  Geschäftsführer / CEO
>> >  -

Re: [ovirt-users] Duplicate host_id after deployment via GUI, how is the hosted engine host_id selected?

2018-04-23 Thread Martin Sivak
Hello,

the ID should be coming from the SPM ID of the host as assigned by the
engine and duplicates should not be happening indeed.
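
As a quick way to confirm the duplicate, you could collect the host_id value
from each host and group them. A minimal sketch follows; the host names in
the sample are hypothetical and the function name is mine.

```python
from collections import defaultdict


def duplicate_host_ids(host_ids):
    """Given {hostname: host_id}, return {host_id: [hostnames]} for ids used twice or more."""
    by_id = defaultdict(list)
    for hostname, host_id in sorted(host_ids.items()):
        by_id[host_id].append(hostname)
    return {hid: names for hid, names in by_id.items() if len(names) > 1}


# Hypothetical values collected from each host's hosted-engine.conf.
collected = {"node-a": 1, "node-b": 2, "node-c": 2}
print(duplicate_host_ids(collected))
```

An empty result means every host has its own ID.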

Can you please tell us what version of the engine you have and how the
first hosted engine node was deployed? We recently changed the default
deployment method (in 4.2.2, in fact) and we would like to know whether
this might be related to that change or not.

Simone, do you know how to debug this? Are there logs we could use to
check the behavior? The host-deploy logs maybe?

Best regards

--
Martin Sivak
SLA / oVirt

On Mon, Apr 23, 2018 at 12:54 PM, Thomas Klute <kl...@ingenit.com> wrote:
> Dear oVirt Users,
>
> we installed a third ovirt node using the ovirt node ng image (4.2.2), and
> added that node using the web interface of the hosted engine:
>
> Compute->Hosts->New - including the Hosted engine option "Choose hosted
> engine deployment action" set to "Deploy"
>
> After that we found out, that on the new node the host_id entry was causing
> problems. In
>
> /etc/ovirt-hosted-engine/hosted-engine.conf
>
> we found the entry
>
> ...
> host_id=2
> ...
>
> But that host id already exists and was alive during deployment process.
> So my questions are:
>
> How is the host id chosen (when adding a new host via GUI)?
> Is there an option to preselect the host id in the GUI for a new node?
> How can we prevent further duplicate host ids?
>
> Thanks and best regards,
> Thomas
>
>
> --
> 
>
>  Dipl.-Inform. Thomas Klute   kl...@ingenit.com
>  Geschäftsführer / CEO
>  --
>  ingenit GmbH & Co. KG   Tel. +49 (0)231 58 698-120
>  Emil-Figge-Strasse 76-80Fax. +49 (0)231 58 698-121
>  D-44227 Dortmund   www.ingenit.com
>
>  Registergericht: Amtsgericht Dortmund, HRA 13 914
>  Gesellschafter : Thomas Klute, Marc-Christian Schröer
> 
>
>
>


Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch

2018-04-20 Thread Martin Sivak
Hi,

No, this is not an error. You killed the host without moving it to
maintenance first. The engine has no way to distinguish this from a
temporary network failure, for example. Give it some time and the host
will move to one of the error states, and the engine will then handle the
highly available VMs on it (if fencing is properly configured).

Best regards

Martin Sivak

On Fri, Apr 20, 2018 at 12:13 PM,  <dhy...@sina.com> wrote:
> Is this behavior not an error?
> - Original Message -----
> From: Martin Sivak <msi...@redhat.com>
> To: dhy336 <dhy...@sina.com>
> Cc: users <users@ovirt.org>
> Subject: Re: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
> Date: 2018-04-20 18:05
>
>
> Hi,
> the engine does not know you killed the host. It will notice
> eventually and handle the situation. Just give it time (5 minutes or
> so).
> Best regards
> --
> Martin Sivak
> SLA / oVirt
> On Fri, Apr 20, 2018 at 12:00 PM, <dhy...@sina.com> wrote:
>> Hi, thanks for your feedback. I have another question.
>>
>> I deployed two nodes for hosted engine. At first the hosted engine VM
>> ran on 192.168.122.65. I powered off this host and the hosted-engine VM
>> switched to the other host, but ovirt engine still tries to connect to
>> 192.168.122.65. If I restart the ovirt-engine server, it works.
>>
>>
>> 2018-04-20 17:13:04,692+08 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-98) [] Command 'GetAllVmStatsVDSCommand(HostName = hosted-engine2, VdsIdVDSCommandParametersBase:{hostId='a5428ef7-9df6-4a86-91de-7e36fda340fa'})' execution failed: java.net.NoRouteToHostException: No route to host
>> 2018-04-20 17:13:04,693+08 INFO [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (EE-ManagedThreadFactory-engineScheduled-Thread-98) [] Failed to fetch vms info for host 'hosted-engin2' - skipping VMs monitoring.
>> 2018-04-20 17:13:19,710+08 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to hosted-engine2/192.168.122.65
>> 2018-04-20 17:13:22,730+08 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [] Command 'GetAllVmStatsVDSCommand(HostName = hosted-engine-tchyp2, VdsIdVDSCommandParametersBase:{hostId='a5428ef7-9df6-4a86-91de-7e36fda340fa'})' execution failed: java.net.NoRouteToHostException: No route to host
>> 2018-04-20 17:13:22,732+08 INFO [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [] Failed to fetch vms info for host 'hosted-engine2' - skipping VMs monitoring.
>>
>> - Original Message -
>> From: Martin Sivak <msi...@redhat.com>
>> To: dhy336 <dhy...@sina.com>
>> Cc: users <users@ovirt.org>
>> Subject: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
>> Date: 2018-04-20 16:40
>>
>>
>> Hi,
>> your ovirt-hosted-engine-ha package is too old. You need at least
>> 2.1.9 to properly support 4.2 engine. The same applies to vdsm. Please
>> upgrade the node.
>> Best regards
>> Martin Sivak
>> On Fri, Apr 20, 2018 at 3:58 AM, <dhy...@sina.com> wrote:
>>> Hi I find some error logs in /var/log/ovirt-hosted-engine-ha/broker.
>>>
>>> [root@hosted-engine2 ~]# ll /rhev/data-center/mnt
>>> total 0
>>> drwxr-xr-x. 3 vdsm kvm 76 Apr 18 22:28 192.168.122.218:_exports_data
>>> drwxr-xr-x. 3 vdsm kvm 76 Apr 18 22:12
>>> 192.168.122.218:_exports_hosted-engine-test1
>>> [root@hosted-engine2 ~]# ll
>>> /rhev/data-center/mnt/192.168.122.218\:_exports_hosted-engine-test1/
>>> total 0
>>> drwxr-xr-x. 5 vdsm kvm 50 Apr 18 22:14
>>> 8a734205-65b7-4801-b7f0-d380eb45dbae
>>> -rwxr-xr-x. 1 vdsm kvm 0 Apr 20 09:54 __DIRECT_IO_TEST__
>>>
>>> The UUID 8a734205-65b7-4801-b7f0-d380eb45dbae is in
>>> /rhev/data-center/mnt/192.168.122.218\:_exports_hosted-engine-test1/
>>> but the broker looks for it in /rhev/data-center/mnt. Is my version
>>> wrong? My ovirt-hosted-engine-ha version is 2.1.5, vdsm is 4.20.5, and
>>> ovirt-engine is 4.2.
>>>
>>> MainThread::INFO::2018-04-19
>>>
>>>
>>> 19:26:31,479::listener::41::ovirt_hosted_engine_ha.broker.listener.Listener::(__init__)
>>> Initializing SocketServer
>>> MainThread::INFO::2018-04-19
>>>
>>>
>>> 19:26:31,480::listener::56::ovirt_hosted_

Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch

2018-04-20 Thread Martin Sivak
Hi,

the engine does not know you killed the host. It will notice
eventually and handle the situation. Just give it time (5 minutes or
so).
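
If you are scripting this kind of failover test, it helps to poll with a
deadline instead of checking once. The sketch below is generic: the check
callable is whatever test you use to decide that the engine has noticed (it
is not an oVirt API), and the 300 second default simply mirrors the
"5 minutes or so" above.

```python
import time


def wait_until(check, timeout=300.0, interval=10.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call check() every `interval` seconds until it returns True or `timeout` passes."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Demo with a fake check that succeeds on the third attempt and a no-op sleep,
# so the example finishes immediately.
answers = iter([False, False, True])
print(wait_until(lambda: next(answers), sleep=lambda seconds: None))
```

The clock and sleep parameters exist only so the loop can be exercised
without really waiting five minutes.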

Best regards

--
Martin Sivak
SLA / oVirt

On Fri, Apr 20, 2018 at 12:00 PM,  <dhy...@sina.com> wrote:
> Hi, thanks for your feedback. I have another question.
>
> I deployed two nodes for hosted engine. At first the hosted engine VM
> ran on 192.168.122.65. I powered off this host and the hosted-engine VM
> switched to the other host, but ovirt engine still tries to connect to
> 192.168.122.65. If I restart the ovirt-engine server, it works.
>
>
> 2018-04-20 17:13:04,692+08 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-98) [] Command 'GetAllVmStatsVDSCommand(HostName = hosted-engine2, VdsIdVDSCommandParametersBase:{hostId='a5428ef7-9df6-4a86-91de-7e36fda340fa'})' execution failed: java.net.NoRouteToHostException: No route to host
> 2018-04-20 17:13:04,693+08 INFO [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (EE-ManagedThreadFactory-engineScheduled-Thread-98) [] Failed to fetch vms info for host 'hosted-engin2' - skipping VMs monitoring.
> 2018-04-20 17:13:19,710+08 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to hosted-engine2/192.168.122.65
> 2018-04-20 17:13:22,730+08 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [] Command 'GetAllVmStatsVDSCommand(HostName = hosted-engine-tchyp2, VdsIdVDSCommandParametersBase:{hostId='a5428ef7-9df6-4a86-91de-7e36fda340fa'})' execution failed: java.net.NoRouteToHostException: No route to host
> 2018-04-20 17:13:22,732+08 INFO [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [] Failed to fetch vms info for host 'hosted-engine2' - skipping VMs monitoring.
>
> - Original Message -
> From: Martin Sivak <msi...@redhat.com>
> To: dhy336 <dhy...@sina.com>
> Cc: users <users@ovirt.org>
> Subject: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
> Date: 2018-04-20 16:40
>
>
> Hi,
> your ovirt-hosted-engine-ha package is too old. You need at least
> 2.1.9 to properly support 4.2 engine. The same applies to vdsm. Please
> upgrade the node.
> Best regards
> Martin Sivak
> On Fri, Apr 20, 2018 at 3:58 AM, <dhy...@sina.com> wrote:
>> Hi, I found some error logs in /var/log/ovirt-hosted-engine-ha/broker.
>>
>> [root@hosted-engine2 ~]# ll /rhev/data-center/mnt
>> total 0
>> drwxr-xr-x. 3 vdsm kvm 76 Apr 18 22:28 192.168.122.218:_exports_data
>> drwxr-xr-x. 3 vdsm kvm 76 Apr 18 22:12
>> 192.168.122.218:_exports_hosted-engine-test1
>> [root@hosted-engine2 ~]# ll
>> /rhev/data-center/mnt/192.168.122.218\:_exports_hosted-engine-test1/
>> total 0
>> drwxr-xr-x. 5 vdsm kvm 50 Apr 18 22:14
>> 8a734205-65b7-4801-b7f0-d380eb45dbae
>> -rwxr-xr-x. 1 vdsm kvm 0 Apr 20 09:54 __DIRECT_IO_TEST__
>>
>> The UUID 8a734205-65b7-4801-b7f0-d380eb45dbae is in
>> /rhev/data-center/mnt/192.168.122.218\:_exports_hosted-engine-test1/
>> but the broker looks for it in /rhev/data-center/mnt. Is my version
>> wrong? My ovirt-hosted-engine-ha version is 2.1.5, vdsm is 4.20.5, and
>> ovirt-engine is 4.2.
>>
>> MainThread::INFO::2018-04-19 19:26:31,479::listener::41::ovirt_hosted_engine_ha.broker.listener.Listener::(__init__) Initializing SocketServer
>> MainThread::INFO::2018-04-19 19:26:31,480::listener::56::ovirt_hosted_engine_ha.broker.listener.Listener::(__init__) SocketServer ready
>> Thread-1::INFO::2018-04-19 19:26:31,558::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
>> Thread-1::ERROR::2018-04-19 19:26:31,559::listener::192::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Error handling request, data: 'set-storage-domain FilesystemBackend dom_type=nfs3 sd_uuid=8a734205-65b7-4801-b7f0-d380eb45dbae'
>> Traceback (most recent call last):
>> File
>>
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py",
>> line 166, in handle
>> data)
>> File
>>
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py",
>> line 299, in _dispatch
>> .set_storage_domain(client, sd_type, **options)
>> File
>>
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",

Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch

2018-04-20 Thread Martin Sivak
Hi,

your ovirt-hosted-engine-ha package is too old. You need at least
2.1.9 to properly support 4.2 engine. The same applies to vdsm. Please
upgrade the node.
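
For reference, the comparison is a plain numeric one on the dotted version.
Here is a minimal sketch, assuming versions look like 2.1.5 with an optional
release suffix after a dash; the function names are made up for this
example.

```python
def parse_version(text):
    """Turn '2.1.5' or '2.1.5-1.el7' into a tuple of ints such as (2, 1, 5)."""
    upstream = text.split("-", 1)[0]
    return tuple(int(part) for part in upstream.split("."))


# Minimum ovirt-hosted-engine-ha version for a 4.2 engine, as stated above.
MINIMUM_HA = parse_version("2.1.9")


def ha_package_ok(installed):
    """True if the installed ovirt-hosted-engine-ha version is at least 2.1.9."""
    return parse_version(installed) >= MINIMUM_HA


print(ha_package_ok("2.1.5"))
```

Comparing tuples of integers avoids the trap of string comparison, where
"2.1.10" would sort before "2.1.9".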

Best regards

Martin Sivak

On Fri, Apr 20, 2018 at 3:58 AM,  <dhy...@sina.com> wrote:
> Hi, I found some error logs in /var/log/ovirt-hosted-engine-ha/broker.
>
> [root@hosted-engine2 ~]# ll /rhev/data-center/mnt
> total 0
> drwxr-xr-x. 3 vdsm kvm 76 Apr 18 22:28 192.168.122.218:_exports_data
> drwxr-xr-x. 3 vdsm kvm 76 Apr 18 22:12
> 192.168.122.218:_exports_hosted-engine-test1
> [root@hosted-engine2 ~]# ll
> /rhev/data-center/mnt/192.168.122.218\:_exports_hosted-engine-test1/
> total 0
> drwxr-xr-x. 5 vdsm kvm 50 Apr 18 22:14 8a734205-65b7-4801-b7f0-d380eb45dbae
> -rwxr-xr-x. 1 vdsm kvm  0 Apr 20 09:54 __DIRECT_IO_TEST__
>
> The UUID 8a734205-65b7-4801-b7f0-d380eb45dbae is in
> /rhev/data-center/mnt/192.168.122.218\:_exports_hosted-engine-test1/
> but the broker looks for it in /rhev/data-center/mnt. Is my version wrong? My
> ovirt-hosted-engine-ha version is 2.1.5, vdsm is 4.20.5,
> ovirt-engine is 4.2
>
> MainThread::INFO::2018-04-19
> 19:26:31,479::listener::41::ovirt_hosted_engine_ha.broker.listener.Listener::(__init__)
> Initializing SocketServer
> MainThread::INFO::2018-04-19
> 19:26:31,480::listener::56::ovirt_hosted_engine_ha.broker.listener.Listener::(__init__)
> SocketServer ready
> Thread-1::INFO::2018-04-19
> 19:26:31,558::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
> Connection established
> Thread-1::ERROR::2018-04-19
> 19:26:31,559::listener::192::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
> Error handling request, data: 'set-storage-domain FilesystemBackend
> dom_type=nfs3 sd_uuid=8a734205-65b7-4801-b7f0-d380eb45dbae'
> Traceback (most recent call last):
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py",
> line 166, in handle
> data)
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py",
> line 299, in _dispatch
> .set_storage_domain(client, sd_type, **options)
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",
> line 66, in set_storage_domain
> self._backends[client].connect()
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
> line 462, in connect
> self._dom_type)
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
> line 107, in get_domain_path
> " in {1}".format(sd_uuid, parent))
> BackendFailureException: path to storage domain
> 8a734205-65b7-4801-b7f0-d380eb45dbae not found in /rhev/data-center/mnt
> Thread-1::INFO::2018-04-19
> 19:26:31,563::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
> Connection closed
> Thread-2::INFO::2018-04-19
> 19:26:44,601::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
> Connection established
>
> - Original Message -
> From: <dhy...@sina.com>
> To: "Martin Sivak" <msi...@redhat.com>
> Cc: users <users@ovirt.org>
> Subject: [ovirt-users] Reply: Re: Hosted-engine can not_switch
> Date: 2018-04-20 09:30
>
> libvirt has no error logs. I only found some errors in vdsm.
> The vdsm log is:
> 2018-04-20 09:24:52,610+0800 INFO  (jsonrpc/1) [vdsm.api] FINISH
> getVolumeInfo return={'info': {'status': 'OK', 'domain':
> '8a734205-65b7-4801-b7f0-d380eb45dbae', 'voltype': 'LEAF', 'description':
> 'hosted-engine.lockspace', 'parent': '----',
> 'format': 'RAW', 'generation': 0, 'image':
> '611272bd-c2cc-42bc-94e2-9aa52e754c35', 'ctime': '1524032037', 'disktype':
> '2', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '1048576',
> 'children': [], 'pool': '', 'capacity': '1048576', 'uuid':
> u'7037aac6-7c8e-4efd-82f7-ca618c953fe6', 'truesize': '1048576', 'type':
> 'PREALLOCATED', 'lease': {'owners': [], 'version': None}}} from=::1,48306,
> task_id=03a7938e-8afb-4b16-b8dd-126c2b1f5d52 (api:52)
> 2018-04-20 09:24:52,611+0800 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC
> call Volume.getInfo succeeded in 0.03 seconds (__init__:630)
> 2018-04-20 09:24:54,113+0800 ERROR (periodic/3) [virt.periodic.Operation]
>  operation failed
> (periodic:215)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 213,
> in __call__
> self._func()
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 522,
> in __call__
> self._send_metrics()
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/sampl

Re: [ovirt-users] Hosted-engine can not switch

2018-04-19 Thread Martin Sivak
We need more than just this small log snippet. Please check the vdsm
and libvirt logs as well.

Best regards

Martin Sivak

On Thu, Apr 19, 2018 at 2:05 PM,  <dhy...@sina.com> wrote:
> Hi,
> I deployed three nodes with hosted engine. I forced a shutdown of the node
> on which the hosted engine VM was running, but the hosted engine VM cannot
> start on the other nodes.
>
> I found some errors in /var/log/ovirt-hosted-engine-ha/agent.log
>
> MainThread::INFO::2018-04-19
> 19:56:35,787::hosted_engine::1192::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state)
> Cleaning state for non-running VM
> MainThread::INFO::2018-04-19
> 19:56:42,587::hosted_engine::1176::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state)
> Vdsm state for VM clean
> MainThread::INFO::2018-04-19
> 19:56:42,589::hosted_engine::1125::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
> Starting vm using `/usr/sbin/hosted-engine --vm-start`
> MainThread::INFO::2018-04-19
> 19:56:47,599::hosted_engine::1131::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
> stdout:
> MainThread::INFO::2018-04-19
> 19:56:47,600::hosted_engine::1132::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
> stderr: Virtual machine does not exist: {'vmId':
> u'08bbd680-a8a7-4267-82e7-89f36e87e930'}
>
> MainThread::INFO::2018-04-19
> 19:56:47,600::hosted_engine::1144::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
> Engine VM started on localhost
> MainThread::INFO::2018-04-19
> 19:56:47,609::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1524139007.61 type=state_transition
> detail=EngineStart-EngineStarting hostname='hosted-engine2'
> MainThread::INFO::2018-04-19
> 19:56:47,670::brokerlink::121::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition (EngineStart-EngineStarting)
> sent? sent
> MainThread::INFO::2018-04-19
> 19:56:47,670::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm)
> Initializing VDSM
> MainThread::INFO::2018-04-19
> 19:56:50,095::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images)
> Connecting the storage
> MainThread::INFO::2018-04-19
> 19:56:50,096::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server)
> Validating storage server
> MainThread::INFO::2018-04-19
> 19:56:52,449::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images)
> Storage domain reported as valid and reconnect is not forced.
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] custom hosted-engine issues

2018-04-18 Thread Martin Sivak
Hi,

That part is related to the hosted engine storage. You need an
additional storage domain for regular VMs as specified in the note I
sent you. Add the storage using the webadmin UI.

Best regards

--
Martin Sivak
SLA / oVirt

On Wed, Apr 18, 2018 at 11:55 AM,  <dhy...@sina.com> wrote:
> Select the type of storage to use.
>
>  Please specify the storage you would like to use (glusterfs, iscsi, fc,
> nfs3, nfs4)[nfs3]:
>
> For NFS storage types, specify the full address, using either the FQDN or IP
> address, and path name of the shared storage domain.
>
>   Please specify the full shared storage connection path to use (example:
> host:/path): storage.example.com:/hosted_engine/nfs
>
> I followed this guide to configure my NFS shared storage, but the storage was
> not added to the oVirt engine automatically. Why was it not added
> automatically?
>
> - Original Message -
> From: Martin Sivak <msi...@redhat.com>
> To: dhy336 <dhy...@sina.com>
> Cc: users <users@ovirt.org>
> Subject: Re: [ovirt-users] custom hosted-engine issues
> Date: 2018-04-18 17:40
>
>
> Hi,
> you need to add a storage domain for VMs first. The hosted engine
> domain and VM will then be auto imported.
> See the following in the Hosted engine deployment guide:
> https://www.ovirt.org/documentation/self-hosted/chap-Deploying_Self-Hosted_Engine/
> "Important: Log in as the admin@internal user to continue configuring
> the Engine and add further resources. You must create another data
> domain for the data center to be initialized to host regular virtual
> machine data, and for the Engine virtual machine to be visible."
> You seem to be using oVirt 4.1 so please note that the oVirt 4.2.2
> release now supports a much better and safer deployment method.
> Best regards
> --
> Martin Sivak
> SLA / oVirt
> On Wed, Apr 18, 2018 at 11:08 AM, <dhy...@sina.com> wrote:
>> Hi,
>> I set up the hosted engine and it succeeded, but it did not add my shared
>> storage (NFS) as a storage domain.
>> I cannot find the engine VM in the webadmin UI under Compute -> Virtual
>> Machines, and there is no Hosted Engine sub-tab in the webadmin UI when I
>> add a host to the oVirt engine.
>>
>> Could you give me some advice? Thanks.
>>
>>


Re: [ovirt-users] custom hosted-engine issues

2018-04-18 Thread Martin Sivak
Hi,

You need to add a storage domain for VMs first. The hosted engine
domain and VM will then be auto imported.

See the following in the Hosted engine deployment guide:
https://www.ovirt.org/documentation/self-hosted/chap-Deploying_Self-Hosted_Engine/

"Important: Log in as the admin@internal user to continue configuring
the Engine and add further resources. You must create another data
domain for the data center to be initialized to host regular virtual
machine data, and for the Engine virtual machine to be visible."

You seem to be using oVirt 4.1 so please note that the oVirt 4.2.2
release now supports a much better and safer deployment method.

Best regards

--
Martin Sivak
SLA / oVirt

On Wed, Apr 18, 2018 at 11:08 AM,  <dhy...@sina.com> wrote:
> Hi,
> I set up the hosted engine and it succeeded, but it did not add my shared
> storage (NFS) as a storage domain.
> I cannot find the engine VM in the webadmin UI under Compute -> Virtual
> Machines, and there is no Hosted Engine sub-tab in the webadmin UI when I
> add a host to the oVirt engine.
>
> Could you give me some advice? Thanks.
>
>


Re: [ovirt-users] Hosted Engine won't start, how to debug?

2018-04-16 Thread Martin Sivak
Hi,

> The graphical_console parameter was introduced in the ovirt_vms Ansible
> module only in Ansible 2.5.

But that isn't Thomas' case, as he has a legacy deployment that was
upgraded across versions. This might affect new installs of 4.2 only.

Martin

On Mon, Apr 16, 2018 at 10:04 AM, Simone Tiraboschi <stira...@redhat.com> wrote:
>
>
> On Sat, Apr 14, 2018 at 1:47 PM, Martin Sivak <msi...@redhat.com> wrote:
>>
>> Hi,
>>
>> the vnc device is there by default (I copied it out of my own hosted
>> engine instance), I do not know why it was missing in your case.
>
>
> The graphical_console parameter was introduced in the ovirt_vms Ansible
> module only in Ansible 2.5.
> http://docs.ansible.com/ansible/latest/modules/ovirt_vms_module.html#ovirt-vms
>
> And so we have to consume it as well.
>
> I just opened https://bugzilla.redhat.com/show_bug.cgi?id=1567772
> to track it.
>
>>
>> Best regards
>>
>> Martin Sivak
>>
>> On Fri, Apr 13, 2018 at 5:13 PM, Thomas Klute <kl...@ingenit.com> wrote:
>> > Dear Martin,
>> >
>> > yes, that worked. Thank you so much!!
>> > We were able to see that the latest kernel update failed and did not
>> > create the initramfs file and thus the boot process failed with a kernel
>> > panic.
>> >
>> > Debugging this problem took us many hours... It felt so complicated to
>> > connect a vnc to this vm - compared to a bare metal setup with display
>> > and keyboard.
>> > Wouldn't it be a good idea to have the vnc device in the config by
>> > default?
>> >
>> > Best regards,
>> >  Thomas
>> >
>> >> You need to be in global maintenance, but I think you already know
>> >> that.
>> >> Then try updating the vm.conf like you already did and add this line:
>> >>
>> >>
>> >> devices={device:vnc,type:graphics,deviceId:f1d0394e-b077-4ea6-99e5-b9b6b8fe073c,address:None}
>> >>
>> >> Then restart the VM using hosted-engine commands and try the VNC
>> >> approach again.
>> >>
>> >> Best regards
>> >>
>> >> Martin Sivak
>> >>
>> >> On Fri, Apr 13, 2018 at 2:35 PM, Thomas Klute <kl...@ingenit.com>
>> >> wrote:
>> >>> Dear Martin,
>> >>>
>> >>> thanks for the feedback.
>> >>> We already read this and tried it.
>> >>> It seems to me that the graphics device was removed from the hosted
>> >>> engine by some ovirt release.
>> >>>
>> >>> If I try to set a console password I see this message:
>> >>>
>> >>> hosted-engine --add-console-password
>> >>> Enter password:
>> >>> no graphics devices configured
>> >>>
>> >>> Furthermore, there is nothing listening on port 5900 after that.
>> >>> The HostedEngine qemu process shows a " -display none " as parameter
>> >>> and
>> >>> I have no idea where this can be changed.
>> >>>
>> >>> I already created a /var/run/ovirt-hosted-engine-ha/vm-custom.conf
>> >>> containing:
>> >>> display=vnc
>> >>> kvmEnable=true
>> >>>
>> >>> But nothing changed.
>> >>> I also edited the HostedEngine VM config using virsh and added a vnc
>> >>> display:
>> >>> /usr/bin/virsh -c
>> >>> qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf edit
>> >>> HostedEngine
>> >>>
>> >>> [XML device definition stripped by the list archive; only the fragments
>> >>> heads='1' primary='yes'/> and function='0x0'/> survive]
>> >>>
>> >>> But there is still "display none" passed as command line parameter to
>> >>> qemu and thus, I suppose, there's no display.
>> >>>
>> >>> Any help is appreciated, thanks,
>> >>> Thomas
>> >>>
>> >>>
>> >>>> Hi,
>> >>>>
>> >>>> the serial console will show output if the kernel knows to use it.
>> >>>>
>> >>>> The VNC approach is also possible and I believe we already have a
>> >>>> graphical device present. What you are looking for is probably this

Re: [ovirt-users] Hosted Engine won't start, how to debug?

2018-04-14 Thread Martin Sivak
Hi,

The VNC device is there by default (I copied it out of my own hosted
engine instance). I do not know why it was missing in your case.

Best regards

Martin Sivak

On Fri, Apr 13, 2018 at 5:13 PM, Thomas Klute <kl...@ingenit.com> wrote:
> Dear Martin,
>
> yes, that worked. Thank you so much!!
> We were able to see that the latest kernel update failed and did not
> create the initramfs file and thus the boot process failed with a kernel
> panic.
>
> Debugging this problem took us many hours... It felt so complicated to
> connect a vnc to this vm - compared to a bare metal setup with display
> and keyboard.
> Wouldn't it be a good idea to have the vnc device in the config by default?
>
> Best regards,
>  Thomas
>
>> You need to be in global maintenance, but I think you already know that.
>> Then try updating the vm.conf like you already did and add this line:
>>
>> devices={device:vnc,type:graphics,deviceId:f1d0394e-b077-4ea6-99e5-b9b6b8fe073c,address:None}
>>
>> Then restart the VM using hosted-engine commands and try the VNC approach 
>> again.
>>
>> Best regards
>>
>> Martin Sivak
>>
>> On Fri, Apr 13, 2018 at 2:35 PM, Thomas Klute <kl...@ingenit.com> wrote:
>>> Dear Martin,
>>>
>>> thanks for the feedback.
>>> We already read this and tried it.
>>> It seems to me that the graphics device was removed from the hosted
>>> engine by some ovirt release.
>>>
>>> If I try to set a console password I see this message:
>>>
>>> hosted-engine --add-console-password
>>> Enter password:
>>> no graphics devices configured
>>>
>>> Furthermore, there is nothing listening on port 5900 after that.
>>> The HostedEngine qemu process shows a " -display none " as parameter and
>>> I have no idea where this can be changed.
>>>
>>> I already created a /var/run/ovirt-hosted-engine-ha/vm-custom.conf
>>> containing:
>>> display=vnc
>>> kvmEnable=true
>>>
>>> But nothing changed.
>>> I also edited the HostedEngine VM config using virsh and added a vnc
>>> display:
>>> /usr/bin/virsh -c
>>> qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf edit
>>> HostedEngine
>>>
>>> [XML device definition stripped by the list archive; only the fragments
>>> heads='1' primary='yes'/> and function='0x0'/> survive]
>>>
>>> But there is still "display none" passed as command line parameter to
>>> qemu and thus, I suppose, there's no display.
>>>
>>> Any help is appreciated, thanks,
>>> Thomas
>>>
>>>
>>>> Hi,
>>>>
>>>> the serial console will show output if the kernel knows to use it.
>>>>
>>>> The VNC approach is also possible and I believe we already have a
>>>> graphical device present. What you are looking for is probably this
>>>> (VNC method is described there):
>>>> https://www.ovirt.org/documentation/how-to/hosted-engine/#handle-engine-vm-boot-problems
>>>>
>>>> Best regards
>>>>
>>>> --
>>>> Martin Sivak
>>>> SLA / oVirt
>>>>
>>>> On Fri, Apr 13, 2018 at 11:26 AM, Thomas Klute <kl...@ingenit.com> wrote:
>>>>> Dear oVirt Team,
>>>>>
>>>>> after trying to reboot a hosted engine setup on oVirt 4.2 the VM won't
>>>>> come up anymore.
>>>>> The qemu-kvm process is there but we're unable to access the VM using
>>>>> - the serial console (simply does not show anything, does not react to
>>>>> characters typed)
>>>>> - VNC / Spice because the hosted engine vm.conf does not contain any
>>>>> graphics device.
>>>>>
>>>>> Before trying to reinstall, we'd like to recover and debug what is going 
>>>>> on.
>>>>> We mounted a Centos7 install .iso and started the VM using
>>>>> hosted-engine --vm-start
>>>>> --vm-conf=/var/run/ovirt-hosted-engine-ha/vm-custom.conf
>>>>> But we still have the problem that the serial console does not show
>>>>> anything and there is no way to connect using VNC.
>>>>>
>>>>> So, what is the recommended way to move forward in such situation?
>>>>> IMHO the classical way would be to add a graphics device and connect via
>>>>> VNC?
>>>>> I did not use t

Re: [ovirt-users] Hosted Engine won't start, how to debug?

2018-04-13 Thread Martin Sivak
You need to be in global maintenance, but I think you already know that.
Then try updating the vm.conf like you already did and add this line:

devices={device:vnc,type:graphics,deviceId:f1d0394e-b077-4ea6-99e5-b9b6b8fe073c,address:None}

Then restart the VM using hosted-engine commands and try the VNC approach again.
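
The edit described above can also be scripted. The following is a minimal sketch, not an official tool: the function names are invented for illustration, the check for an existing graphics device is a simple text match on `devices=` lines, and the device line is exactly the one quoted above (whether a different deviceId would be accepted is not covered here).

```python
VNC_DEVICE_LINE = (
    "devices={device:vnc,type:graphics,"
    "deviceId:f1d0394e-b077-4ea6-99e5-b9b6b8fe073c,address:None}"
)


def has_graphics_device(conf_text):
    """Return True if any devices= line declares a graphics device."""
    for line in conf_text.splitlines():
        line = line.strip()
        if line.startswith("devices={") and "type:graphics" in line:
            return True
    return False


def ensure_vnc_device(conf_text):
    """Append the VNC device line unless a graphics device already exists."""
    if has_graphics_device(conf_text):
        return conf_text
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + VNC_DEVICE_LINE + "\n"
```

One cautious way to use it is to read a copy of vm.conf, write the result to a separate file, and start the VM with that file through `hosted-engine --vm-start --vm-conf=...` as shown elsewhere in this thread, so the original configuration stays untouched.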

Best regards

Martin Sivak

On Fri, Apr 13, 2018 at 2:35 PM, Thomas Klute <kl...@ingenit.com> wrote:
> Dear Martin,
>
> thanks for the feedback.
> We already read this and tried it.
> It seems to me that the graphics device was removed from the hosted
> engine by some ovirt release.
>
> If I try to set a console password I see this message:
>
> hosted-engine --add-console-password
> Enter password:
> no graphics devices configured
>
> Furthermore, there is nothing listening on port 5900 after that.
> The HostedEngine qemu process shows a " -display none " as parameter and
> I have no idea where this can be changed.
>
> I already created a /var/run/ovirt-hosted-engine-ha/vm-custom.conf
> containing:
> display=vnc
> kvmEnable=true
>
> But nothing changed.
> I also edited the HostedEngine VM config using virsh and added a vnc
> display:
> /usr/bin/virsh -c
> qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf edit
> HostedEngine
>
> [XML device definition stripped by the list archive; only the fragments
> heads='1' primary='yes'/> and function='0x0'/> survive]
>
> But there is still "display none" passed as command line parameter to
> qemu and thus, I suppose, there's no display.
>
> Any help is appreciated, thanks,
> Thomas
>
>
>> Hi,
>>
>> the serial console will show output if the kernel knows to use it.
>>
>> The VNC approach is also possible and I believe we already have a
>> graphical device present. What you are looking for is probably this
>> (VNC method is described there):
>> https://www.ovirt.org/documentation/how-to/hosted-engine/#handle-engine-vm-boot-problems
>>
>> Best regards
>>
>> --
>> Martin Sivak
>> SLA / oVirt
>>
>> On Fri, Apr 13, 2018 at 11:26 AM, Thomas Klute <kl...@ingenit.com> wrote:
>>> Dear oVirt Team,
>>>
>>> after trying to reboot a hosted engine setup on oVirt 4.2 the VM won't
>>> come up anymore.
>>> The qemu-kvm process is there but we're unable to access the VM using
>>> - the serial console (simply does not show anything, does not react to
>>> characters typed)
>>> - VNC / Spice because the hosted engine vm.conf does not contain any
>>> graphics device.
>>>
>>> Before trying to reinstall, we'd like to recover and debug what is going on.
>>> We mounted a Centos7 install .iso and started the VM using
>>> hosted-engine --vm-start
>>> --vm-conf=/var/run/ovirt-hosted-engine-ha/vm-custom.conf
>>> But we still have the problem that the serial console does not show
>>> anything and there is no way to connect using VNC.
>>>
>>> So, what is the recommended way to move forward in such situation?
>>> IMHO the classical way would be to add a graphics device and connect via
>>> VNC?
>>> I did not use the serial console much, up to now. Should the serial
>>> console show any output during boot?
>>>
>>> Thanks for your help,
>>>  Thomas
>>>
>>> --
>>> 
>>>
>>>  Dipl.-Inform. Thomas Klute   kl...@ingenit.com
>>>  Geschäftsführer / CEO
>>>  --
>>>  ingenit GmbH & Co. KG      Tel. +49 (0)231 58 698-120
>>>  Emil-Figge-Strasse 76-80   Fax. +49 (0)231 58 698-121
>>>  D-44227 Dortmund           www.ingenit.com
>>>
>>>  Registergericht: Amtsgericht Dortmund, HRA 13 914
>>>  Gesellschafter : Thomas Klute, Marc-Christian Schröer
>>> 
>>>
>>
>
> If you have any further questions, we are of course happy to help at any
> time.
>
> With regards from Dortmund,
>  Thomas Klute
>


Re: [ovirt-users] Hosted Engine won't start, how to debug?

2018-04-13 Thread Martin Sivak
Hi,

The serial console will show output if the guest kernel is configured to use it.
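
In practice this usually means the guest kernel was booted with a `console=` parameter naming the console device. As a rough sketch (the function name is hypothetical, and which device name the hosted engine VM actually exposes, for example a `ttyS*` or an `hvc*` device, is an assumption to verify on your own setup), the following lists the consoles named on a kernel command line such as the contents of `/proc/cmdline` on a guest that still boots:

```python
def console_devices(cmdline):
    """Return the device names of all console= parameters, in order."""
    devices = []
    for token in cmdline.split():
        if token.startswith("console="):
            # Drop options such as ',115200' that may follow the device name.
            devices.append(token[len("console="):].split(",")[0])
    return devices
```

An empty result suggests that the kernel was never told to use a serial or virtual console, which would match a console that stays blank during boot.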

The VNC approach is also possible and I believe we already have a
graphical device present. What you are looking for is probably this
(VNC method is described there):
https://www.ovirt.org/documentation/how-to/hosted-engine/#handle-engine-vm-boot-problems

Best regards

--
Martin Sivak
SLA / oVirt

On Fri, Apr 13, 2018 at 11:26 AM, Thomas Klute <kl...@ingenit.com> wrote:
> Dear oVirt Team,
>
> after trying to reboot a hosted engine setup on oVirt 4.2 the VM won't
> come up anymore.
> The qemu-kvm process is there but we're unable to access the VM using
> - the serial console (simply does not show anything, does not react to
> characters typed)
> - VNC / Spice because the hosted engine vm.conf does not contain any
> graphics device.
>
> Before trying to reinstall, we'd like to recover and debug what is going on.
> We mounted a Centos7 install .iso and started the VM using
> hosted-engine --vm-start
> --vm-conf=/var/run/ovirt-hosted-engine-ha/vm-custom.conf
> But we still have the problem that the serial console does not show
> anything and there is no way to connect using VNC.
>
> So, what is the recommended way to move forward in such situation?
> IMHO the classical way would be to add a graphics device and connect via
> VNC?
> I did not use the serial console much, up to now. Should the serial
> console show any output during boot?
>
> Thanks for your help,
>  Thomas
>


Re: [ovirt-users] Reply: Re: Reply: Re: ovirt engine HA

2018-04-09 Thread Martin Sivak
> You mentioned that can be done via the web GUI. Do you mean by cockpit or by
> oVirt Engine Web Interface itself?

Only via the oVirt web admin interface. You first add a standard storage
domain, wait for a VM that represents the engine to appear (name:
HostedEngine) and then you can add an additional host with hosted engine
bits directly from the webadmin UI (HostedEngine side tab of Add new host
dialog, select Deploy).

Best regards

Martin Sivak

On Mon, Apr 9, 2018 at 6:21 PM, FERNANDO FREDIANI <fernando.fredi...@upx.com> wrote:

> Hello Simone
>
> My question is: once the hosted engine is deployed on one of the hosts, is
> the process to deploy it on the second one exactly the same, or does it
> differ in some minor way? You mentioned that can be done via the web GUI. Do
> you mean by cockpit or by oVirt Engine Web Interface itself?
>
> Thanks
> Fernando
>
> 2018-04-09 7:32 GMT-03:00 Simone Tiraboschi <stira...@redhat.com>:
>
>>
>>
>> On Sun, Apr 8, 2018 at 3:53 PM, <dhy...@sina.com> wrote:
>>
>>> Sorry, I do not know how to describe my question.
>>> I want to add a hosted engine. I used the same NFS path as the first
>>> hosted engine, but I get some errors.
>>>   --== STORAGE CONFIGURATION ==--
>>>
>>>   Please specify the storage you would like to use (glusterfs,
>>> iscsi, fc, nfs3, nfs4)[nfs3]:
>>>   Please specify the full shared storage connection path to use
>>> (example: host:/path): 192.168.122.218:/exports/hosted-engine-test1
>>> [ ERROR ] The selected device already contains a storage domain.
>>> [ ERROR ] Setup of additional hosts using this software is not allowed
>>> anymore. Please use the engine web interface to deploy any additional hosts.
>>>
>>
>> Since 4.0 you should add additional hosted-engine hosts directly from the
>> webadmin UI.
>>
>>
>>> [ ERROR ] Failed to execute stage 'Environment customization': Setup of
>>> additional hosts using this software is not allowed anymore. Please use the
>>> engine web interface to deploy any additional hosts.
>>> [ INFO  ] Stage: Clean up
>>> [ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20180408214554.conf'
>>> [ INFO  ] Stage: Pre-termination
>>> [ INFO  ] Stage: Termination
>>> [ ERROR ] Hosted Engine deployment failed
>>>   Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180408214515-4vofq6.log
>>>
>>>
>>>
>>> - Original Message -
>>> From: Alex K <rightkickt...@gmail.com>
>>> To: dhy...@sina.com
>>> Cc: FERNANDO FREDIANI <fernando.fredi...@upx.com>, users <users@ovirt.org>
>>> Subject: Re: [ovirt-users] Reply: Re: ovirt engine HA
>>> Date: 2018-04-08 19:45
>>>
>>> Are you a troll?
>>>
>>>
>>> On Sun, Apr 8, 2018, 12:15 <dhy...@sina.com> wrote:
>>>
>>>
>>> Hi, I have two nodes and deployed hosted-engine with
>>> # hosted-engine --deploy. I find that the two hosted engines I deployed
>>> are independent. How do I make my hosted engine HA?
>>>
>>> - Original Message -
>>> From: Alex K <rightkickt...@gmail.com>
>>> To: FERNANDO FREDIANI <fernando.fredi...@upx.com>
>>> Cc: users <users@ovirt.org>
>>> Subject: Re: [ovirt-users] ovirt engine HA
>>> Date: 2018-04-04 01:40
>>>
>>> In case you need HA for the engine you need to deploy it to other hosts
>>> also through the GUI.
>>>
>>>
>>> On Tue, Apr 3, 2018 at 4:47 PM, FERNANDO FREDIANI <fernando.fredi...@upx.com> wrote:
>>>
>>> Is it enough to deploy the Self-Hosted engine in just one Host of the
>>> cluster or is it necessary to repeat the process in each of the nodes that
>>> must be able to run it ?
>>>
>>> Thanks
>>> Fernando
>>>
>>> 2018-04-03 2:01 GMT-03:00 Vincent Royer <vinc...@epicenergy.ca>:
>>>
>>> Same thing, the engine in this case is "self-hosted", as in, it runs in
>>> a VM hosted on the cluster that it is managing.  I am a beginner here, but
>>> from my understanding, each node is always checking on the health of the
>>> engine VM.  If the engine is missing (ie, the host running it has gone
>>> down), then another available, healthy host will spawn up the engine and
>>> you will regain access.
>>>
>>> In my experience this has worked very reliabl

Re: [ovirt-users] hosted-engine deploy error

2018-04-04 Thread Martin Sivak
Hi,

The hostname is invalid for hosted engine because it is set to
localhost.localdomain.

Check the shell prompt, do you see the localhost there?
[root@localhost ~]# hosted-engine --deploy

You can change the hostname by using many different tools, check
https://fedoramagazine.org/set-hostname-fedora/ for examples. Most of
them (if not all) should be valid for CentOS as well.

The hostname you set must be resolvable to IP and that IP has to point
back to the host you are on.
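
These conditions can be partly checked before running the deployment. The sketch below is an assumption-laden illustration rather than what hosted-engine-setup itself does: the function name and messages are invented, the resolver is injectable so the logic can be exercised without DNS, and it only rules out the localhost default and loopback addresses. It does not prove that the resolved IP belongs to an interface of this host.

```python
import socket


def hostname_problem(hostname, resolve=socket.gethostbyname):
    """Return a description of the problem, or None if the name looks usable."""
    if hostname.split(".")[0] == "localhost":
        return "hostname is still the localhost default"
    try:
        address = resolve(hostname)
    except OSError as exc:
        return "hostname does not resolve: {}".format(exc)
    if address.startswith("127."):
        return "hostname resolves to loopback address {}".format(address)
    return None


if __name__ == "__main__":
    # Check the name this machine currently reports for itself.
    print(hostname_problem(socket.gethostname()))
```

A result of `None` only means that none of the obvious problems were found; comparing the resolved address with the output of `ip addr` on the host is still worthwhile.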

Best regards

Martin Sivak

On Wed, Apr 4, 2018 at 12:50 PM, dhy336 <dhy...@sina.com> wrote:
> Thanks, but I do not know why my hostname is invalid.
>
>
> On 2018-04-04 18:41, Simone Tiraboschi wrote:
>
>
> On Wed, Apr 4, 2018 at 12:28 PM, <dhy...@sina.com> wrote:
>>
>> Hi,
>>
>> I am deploying hosted-engine by following this guide:
>> https://ovirt.org/documentation/self-hosted/chap-Deploying_Self-Hosted_Engine/
>> but I am facing some errors. Could anyone give me some advice?
>>
>> [root@localhost ~]# hosted-engine --deploy
>
>
> ^^^
>
> The error is here: please set a valid hostname for your host: the engine VM
> should be able to reach it.
> If you add the host as localhost the engine VM will reach itself and it's
> not what we want.
>
> If you don't have a working DNS system, hosted-engine-setup will ask about
> injecting an entry for you under /etc/hosts on the engine VM but you still
> need a valid hostname.
>
>
>>
>>
>>
>>
>>   --== STORAGE CONFIGURATION ==--
>>
>>   Please specify the storage you would like to use (glusterfs,
>> iscsi, fc, nfs3, nfs4)[nfs3]:
>>   Please specify the full shared storage connection path to use
>> (example: host:/path): 192.168.122.134:/home/exports/hosted-engine
>>
>>   --== HOST NETWORK CONFIGURATION ==--
>>
>>   iptables was detected on your computer, do you wish setup to
>> configure it? (Yes, No)[Yes]:
>>   Please indicate a pingable gateway IP address [192.168.122.1]:
>>   Please indicate a nic to set ovirtmgmt bridge on: (eth0) [eth0]:
>>
>>   --== VM CONFIGURATION ==--
>>
>>   The following appliance have been found on your system:
>>   [1] - The oVirt Engine Appliance image (OVA) -
>> 4.1-20180124.1.el7.centos
>>   [2] - Directly select an OVA file
>>   Please select an appliance (1, 2) [1]:
>> [ INFO  ] Verifying its sha1sum
>> [ INFO  ] Checking OVF archive content (could take a few minutes depending
>> on archive size)
>> [ INFO  ] Checking OVF XML content (could take a few minutes depending on
>> archive size)
>>   Please specify the console type you would like to use to connect
>> to the VM (vnc, spice) [vnc]:
>> [ INFO  ] Detecting host timezone.
>>   Would you like to use cloud-init to customize the appliance on
>> the first boot (Yes, No)[Yes]?
>>   Would you like to generate on-fly a cloud-init ISO image (of
>> no-cloud type)
>>   or do you have an existing one (Generate, Existing)[Generate]?
>>   Please provide the FQDN you would like to use for the engine
>> appliance.
>>   Note: This will be the FQDN of the engine VM you are now going
>> to launch,
>>   it should not point to the base host or to any other existing
>> machine.
>>   Engine VM FQDN: (leave it empty to skip):  []: engine.tchyp.com
>>   Please provide the domain name you would like to use for the
>> engine appliance.
>>   Engine VM domain: [tchyp.com]
>>   Automatically execute engine-setup on the engine appliance on
>> first boot (Yes, No)[Yes]?
>>   Automatically restart the engine VM as a monitored service after
>> engine-setup (Yes, No)[Yes]?
>>   Enter root password that will be used for the engine appliance
>> (leave it empty to skip):
>> [WARNING] Skipping appliance root password
>>   Enter ssh public key for the root user that will be used for the
>> engine appliance (leave it empty to skip):
>> [WARNING] Skipping appliance root ssh public key
>>   Do you want to enable ssh access for the root user (yes, no,
>> without-password) [yes]:
>> [WARNING] The oVirt engine appliance is not configured with a default
>> password, please consider configuring it via cloud-init
>>   Please specify the size of the VM disk in GB: [58]:
>> [WARNING] Minimum requirements not met by available memory: Required: 4096
>> MB. Available: 3064 MB
>>   Please specify the memory size of the V

Re: [ovirt-users] Hosted-Engine Agent broken

2018-03-16 Thread Martin Sivak
Hi,

make sure you have at least ovirt-hosted-engine-ha-2.2.1 installed and
that the service was properly restarted.

The situation you are describing can happen when you run an older hosted
engine agent with ovirt-engine 4.2.

It was tracked as: https://bugzilla.redhat.com/1518887
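
A minimal sketch of the version check, assuming the installed version string
has already been obtained (for example with
`rpm -q --qf '%{VERSION}' ovirt-hosted-engine-ha`). The helper names are made
up for illustration and only handle plain dotted numeric versions:

```python
def parse_version(version):
    """Turn a dotted numeric version such as '2.2.4' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum="2.2.1"):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # Illustrative version strings, not taken from any particular host.
    for candidate in ("2.1.8", "2.2.1", "2.2.4"):
        print(candidate, meets_minimum(candidate))
```

If the version is new enough, restarting ovirt-ha-broker and ovirt-ha-agent
on the host is the remaining step.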

Best regards

Martin Sivak


On Fri, Mar 16, 2018 at 2:09 PM, Sven Achtelik <sven.achte...@eps.aero> wrote:
> Hi All,
>
>
>
> after upgrading my engine to 4.2 and upgrading my hosts to the latest
> versions the HA for the hosted engine is not working anymore. The Agent
> fails with the following errors. Did I miss anything while upgrading ? The
> Engine is still running – what would be the correct approach to get the HA
> services up and running ?
>
>
>
> ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback
> (most recent call last):
>
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
> line 191, in _run_agent
>
>
> return action(he)
>
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
> line 64, in action_proper
>
>
> return he.start_monitoring()
>
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 421, in start_monitoring
>
>
> self._config.refresh_vm_conf()
>
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py",
> line 496, in refresh_vm_conf
>
>
> content_from_ovf = self._get_vm_conf_content_from_ovf_store()
>
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py",
> line 438, in _get_vm_conf_content_from_ovf_store
>
>
> conf = ovf2VmParams.confFromOvf(heovf)
>
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/ovf/ovf2VmParams.py",
> line 283, in confFromOvf
>
>
> vmConf = toDict(ovf)
>
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/ovf/ovf2VmParams.py",
> line 210, in toDict
>
>
> vmParams['vmId'] = tree.find('Content/Section').attrib[OVF_NS + 'id']
>
> File
> "lxml.etree.pyx", line 2272, in lxml.etree._Attrib.__getitem__
> (src/lxml/lxml.etree.c:55336)
>
>
> KeyError: '{http://schemas.dmtf.org/ovf/envelope/1/}id'
>
> ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to
> restart agent
>
> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR
> Unable to refresh vm.conf from the shared storage. Has this HE cluster
> correctly reached 3.6 level?
>
>
>
> If anyone could give a hint on where to look, it would be very helpful.
>
>
>
> Thank you,
>
> Sven
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] 4.2.2.2-1 Starting hosted engine on all hosts

2018-03-16 Thread Martin Sivak
>
> Why did it trash it?

Split brain and concurrent filesystem access...

The bug only affected 4.2.2, which was never released officially apart
from development builds. It should be fixed now.

Martin

On Fri, Mar 16, 2018 at 11:04 AM, Yaniv Kaul  wrote:
>
>
> On Mar 15, 2018 9:21 PM, "Maton, Brett"  wrote:
>
> Ok cool, glad you already have enough information as it's trashed my
> hosted-engine beyond recovery...
>
>
> Why did it trash it?
> Y.
>
>
> On 15 March 2018 at 17:47, Gianluca Cecchi 
> wrote:
>>
>>
>> On 15 Mar 2018 6:34 PM, "Simone Tiraboschi"  wrote:
>>
>>
>>
>> On Thu, Mar 15, 2018 at 8:18 AM, Yedidyah Bar David 
>> wrote:
>>>
>>> On Thu, Mar 15, 2018 at 8:50 AM, Maton, Brett 
>>> wrote:
>>> > The last three 4.2.2 release candidates that I've tried have been
>>> > starting
>>> > self hosted engine on all physical hosts at the same time.
>>> >
>>> > Same with the latest RC, what logs  do you need to investigate the
>>> > problem?
>>
>>
>> It was this one: https://bugzilla.redhat.com/show_bug.cgi?id=1547479
>>
>> It got fixed today but still not available in RC4.
>>
>>>
>>>
>>> /var/log/ovirt-hosted-engine-ha/*
>>> /var/log/sanlock.log
>>> /var/log/vdsm/*
>>>
>>> Adding Martin.
>>>
>>> Thanks and best regards,
>>> --
>>> Didi
>>> __
>>
>>
>> If I understood correctly, this kind of risk is not present in 4.1.x and
>> in 4.2.y for every x and for y <= 1?
>>
>>


Re: [ovirt-users] cocpit is not running on hosts

2018-03-13 Thread Martin Sivak
> systemctl enable cockpit.socket
> systemctl start cockpit.socket

There is a slight difference in when the service gets activated (on
startup vs. on first access), but I guess either way is fine.

The documentation indeed mentions the socket way:

http://cockpit-project.org/guide/133/startup.html
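
Either way, a quick way to confirm that something is listening is to open a
TCP connection to port 9090. With socket activation the first connection is
also what starts cockpit.service. A minimal sketch, assuming the default port
9090; the helper name `port_is_open` is made up for illustration:

```python
import socket


def port_is_open(host, port=9090, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    print("cockpit reachable:", port_is_open("127.0.0.1"))
```

A False result from the host itself points at the service or socket unit,
while False only from a remote machine points at the firewall.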

Best regards

Martin Sivak

On Tue, Mar 13, 2018 at 4:45 PM, Gianluca Cecchi
<gianluca.cec...@gmail.com> wrote:
> On Tue, Mar 13, 2018 at 4:14 PM, Martin Sivak <msi...@redhat.com> wrote:
>>
>> Hi,
>>
>> make sure the service is actually started and the firewall is
>> configured properly:
>>
>> systemctl status cockpit
>> firewall-cmd --list-all
>>
>> You can make sure all is fine by doing the following:
>>
>> systemctl enable cockpit
>> systemctl start cockpit
>> firewall-cmd --add-service=cockpit --permanent
>> firewall-cmd --reload
>>
>> Best regards
>>
>> Martin Sivak
>>
>
> Actually from what I understood, the "cockpit service" has to remain
> configured as "static", while the "cockpit socket" has to be enabled.
> And in cockpit.service unit file in [Unit] section:
>
> Requires=cockpit.socket
>
>
> So in my case on a plain CentOS server acting as a node I executed:
>
> systemctl enable cockpit.socket
> systemctl start cockpit.socket
>
> And I verified I could connect to the hypervisor on port 9090 and then also
> the status of cockpit.service was "active".
>
> In messages:
>
> Mar 13 16:30:29 ov42 systemd: Starting Cockpit Web Service Socket.
> Mar 13 16:30:29 ov42 systemd: Listening on Cockpit Web Service Socket.
>
> And when I connect with browser to port 9090 some seconds later:
>
> Mar 13 16:30:47 ov42 systemd: Starting Cockpit Web Service...
> Mar 13 16:30:47 ov42 systemd: Started Cockpit Web Service.
> Mar 13 16:30:47 ov42 cockpit-ws: Using certificate:
> /etc/cockpit/ws-certs.d/0-self-signed.cert
> Mar 13 16:30:47 ov42 cockpit-ws: couldn't read from connection: Error
> reading data from TLS socket: A TLS fatal alert has been received.
> Mar 13 16:30:57 ov42 cockpit-session: pam_ssh_add: Failed adding some keys
> Mar 13 16:30:57 ov42 systemd-logind: New session 3407 of user root.
> Mar 13 16:30:57 ov42 systemd: Started Session 3407 of user root.
> Mar 13 16:30:57 ov42 systemd: Starting Session 3407 of user root.
> Mar 13 16:30:58 ov42 cockpit-ws: logged in user session
> Mar 13 16:30:58 ov42 cockpit-ws: New connection to session from 10.4.4.12
> ...
>
> For further stop/start of cockpit.socket, I see that the start of the
> cockpit.service is instead immediate when cockpit.socket starts
>
> eg:
>
> Mar 13 16:37:37 ov42 systemd: Starting Cockpit Web Service Socket.
> Mar 13 16:37:37 ov42 systemd: Listening on Cockpit Web Service Socket.
> Mar 13 16:37:37 ov42 systemd: Starting Cockpit Web Service...
> Mar 13 16:37:37 ov42 systemd: Started Cockpit Web Service.
> Mar 13 16:37:37 ov42 cockpit-ws: Using certificate:
> /etc/cockpit/ws-certs.d/0-self-signed.cert
>
> Gianluca
>
>
>


Re: [ovirt-users] cocpit is not running on hosts

2018-03-13 Thread Martin Sivak
Hi,

make sure the service is actually started and the firewall is
configured properly:

systemctl status cockpit
firewall-cmd --list-all

You can make sure all is fine by doing the following:

systemctl enable cockpit
systemctl start cockpit
firewall-cmd --add-service=cockpit --permanent
firewall-cmd --reload

Best regards

Martin Sivak

On Tue, Mar 13, 2018 at 3:33 PM, Peter Hudec <phu...@cnc.sk> wrote:
> Hi,
>
> after the upgrade to 4.2, cockpit was running on each host.
> Right now, there is no service on port 9090. Is there any special setup
> needed to bring it back?
>
> [PROD] r...@dipovirt03.cnc.sk: /home/phudec # rpm -qa | grep ovirt
> ovirt-imageio-common-1.2.1-0.el7.centos.noarch
> ovirt-vmconsole-1.0.4-1.el7.centos.noarch
> ovirt-provider-ovn-driver-1.2.5-1.el7.centos.noarch
> ovirt-setup-lib-1.1.4-1.el7.centos.noarch
> ovirt-host-4.2.1-1.el7.centos.x86_64
> ovirt-host-deploy-1.7.2-1.el7.centos.noarch
> ovirt-engine-sdk-python-3.6.9.1-1.el7.centos.noarch
> ovirt-host-dependencies-4.2.1-1.el7.centos.x86_64
> ovirt-hosted-engine-setup-2.2.9-1.el7.centos.noarch
> ovirt-vmconsole-host-1.0.4-1.el7.centos.noarch
> python-ovirt-engine-sdk4-4.2.4-2.el7.centos.x86_64
> ovirt-imageio-daemon-1.2.1-0.el7.centos.noarch
> cockpit-ovirt-dashboard-0.11.11-0.1.el7.centos.noarch
> ovirt-hosted-engine-ha-2.2.4-1.el7.centos.noarch
> ovirt-engine-appliance-4.2-20180214.1.el7.centos.noarch
> ovirt-release42-4.2.1.1-1.el7.centos.noarch
>
> [PROD] r...@dipovirt03.cnc.sk: /home/phudec # rpm -qa | grep cockpit
> cockpit-system-160-1.el7.centos.noarch
> cockpit-networkmanager-160-1.el7.centos.noarch
> cockpit-160-1.el7.centos.x86_64
> cockpit-bridge-160-1.el7.centos.x86_64
> cockpit-dashboard-160-1.el7.centos.x86_64
> cockpit-storaged-160-1.el7.centos.noarch
> cockpit-ovirt-dashboard-0.11.11-0.1.el7.centos.noarch
> cockpit-ws-160-1.el7.centos.x86_64
>
> regards
> Peter
>
> --
> *Peter Hudec*
> Infraštruktúrny architekt
> phu...@cnc.sk <mailto:phu...@cnc.sk>
>
> *CNC, a.s.*
> Borská 6, 841 04 Bratislava
> Recepcia: +421 2  35 000 100
>
> Mobil:+421 905 997 203
> *www.cnc.sk* <http://www.cnc.sk>
>


Re: [ovirt-users] 'Sanlock lockspace add failure' when trying to set up hosted engine 4.2 on existing gluster cluster

2018-03-09 Thread Martin Sivak
Hi Oliver,

which version of oVirt are you running?

The issue seems to be that a correctly deployed hosted engine does not
have any storage available in the webadmin, if I understand you
correctly. Is that right?

We used to require two separate storage domains in 4.1 and older
releases. One for hosted engine and one for the rest of the VMs. 4.2.1
changed that. So if you are running 4.1, just add another storage
domain [1] and the engine will finish the hosted engine initialization
automatically after that.

[1] 
https://www.ovirt.org/documentation/self-hosted/chap-Deploying_Self-Hosted_Engine/

See the paragraph: "Important: Log in as the admin@internal user to
continue configuring the Engine and add further resources. You must
create another data domain for the data center to be initialized to
host regular virtual machine data, and for the Engine virtual machine
to be visible. See "Storage" in the Administration Guide for different
storage options and on how to add a data storage domain."

Best regards

Martin Sivak

On Fri, Mar 9, 2018 at 10:08 AM, Oliver Dietzel <o.diet...@rto.de> wrote:
> Install from node iso on gluster works fine, the hosted engine vm installs
> on gluster, but after installation is finished and rebooted the gluster
> cluster is not added as data storage.
> Hosted engine is able to boot from gluster but not able to use it. Looks
> like HE doesn't use the gluster volume it booted from, but tries to add the
> same gluster volume a second time.
>
> Is there a workaround?
>
> Error message in web gui:
> VDSM ovirt-gluster.rto.de command CreateStoragePoolVDS failed: Cannot
> acquire host id: (u'e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb',
> SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
>
> Error message in sanlock.log:
>
> [root@ovirt-gluster ~]# tail /var/log/sanlock.log
> 2018-03-09 09:37:05 812 [1082]: s5 host 1 2 791
> f571ebc1-2572-4689-b64e-6999433f0597.ovirt-glus
> 2018-03-09 09:37:05 812 [1082]: s5 host 250 1 0
> f571ebc1-2572-4689-b64e-6999433f0597.ovirt-glus
> 2018-03-09 09:37:20 828 [1093]: s5:r4 resource
> e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb:e555168d-719d-4f73-8541-62395c97c1ff:/rhev/data-center/mnt/glusterSD/gluster01:_gv0/e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb/images/7b6675f7-f739-40b8-a554-b19795fe57c0/e555168d-719d-4f73-8541-62395c97c1ff.lease:0
> for 3,12,5566
> 2018-03-09 09:40:58 1046 [1093]: s6 lockspace
> hosted-engine:1:/var/run/vdsm/storage/e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb/3b22b2fe-f4d5-4d0c-995a-03b2850b674b/eb7218ff-bdbc-49c5-af17-a62a7385d299:0
> 2018-03-09 09:41:20 1067 [1082]: s6 host 1 1 1046
> f571ebc1-2572-4689-b64e-6999433f0597.ovirt-glus
> 2018-03-09 09:42:23 1130 [1093]: s5:r5 resource
> e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb:e555168d-719d-4f73-8541-62395c97c1ff:/rhev/data-center/mnt/glusterSD/gluster01:_gv0/e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb/images/7b6675f7-f739-40b8-a554-b19795fe57c0/e555168d-719d-4f73-8541-62395c97c1ff.lease:0
> for 2,9,11635
> 2018-03-09 09:44:30 1258 [1093]: add_lockspace
> e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb:250:/rhev/data-center/mnt/glusterSD/gluster01:_gv0/e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb/dom_md/ids:0
> conflicts with name of list1 s5
> e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb:1:/rhev/data-center/mnt/glusterSD/gluster01:_gv0/e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb/dom_md/ids:0
> 2018-03-09 09:44:37 1264 [1093]: s7 lockspace
> hosted-engine:1:/var/run/vdsm/storage/e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb/3b22b2fe-f4d5-4d0c-995a-03b2850b674b/eb7218ff-bdbc-49c5-af17-a62a7385d299:0
> 2018-03-09 09:44:58 1285 [1082]: s7 host 1 2 1264
> f571ebc1-2572-4689-b64e-6999433f0597.ovirt-glus
> 2018-03-09 09:46:42 1389 [1093]: add_lockspace
> e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb:250:/rhev/data-center/mnt/glusterSD/gluster01:_gv0/e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb/dom_md/ids:0
> conflicts with name of list1 s5
> e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb:1:/rhev/data-center/mnt/glusterSD/gluster01:_gv0/e6d008f7-e5e8-4064-9f6a-6ab7c8723eeb/dom_md/ids:0
> ___
> Oliver Dietzel
> RTO GmbH
> Hanauer Landstraße 439
> 60314 Frankfurt
>
>


Re: [ovirt-users] Start VM automatically

2018-02-27 Thread Martin Sivak
Hi,

we are considering the feature and its many angles:

- starting without the management using local storage (
https://bugzilla.redhat.com/show_bug.cgi?id=1166657)
- starting without the management with shared storage (
https://bugzilla.redhat.com/show_bug.cgi?id=817363)
- starting the VM via the management running as Hosted Engine (
https://bugzilla.redhat.com/show_bug.cgi?id=1325468)

All three cases have their pain points, e.g. which host should start the VM
and how do you protect against split brain?

If you would be so kind, please describe your use case in the relevant RFE
bug so we can consider it when planning the feature. And rest assured that
we are thinking about how to implement this properly.

Best regards

--
Martin Sivak
SLA / oVirt

On Mon, Feb 26, 2018 at 10:09 PM, Fabrice SOLER <
fabrice.so...@ac-guadeloupe.fr> wrote:

> Hello,
>
> My node (IP ovirtmgmt) is behind a router that is running on the
> hypervisor (the node itself).
>
> So I need the VM (the router) to start automatically after the node starts.
>
> The ovirt engine is running on another infrastructure and the version is
> 4.2.0. The node is also in this version.
>
> Is there a solution ?
>
> Sincerely,
> --
>


Re: [ovirt-users] Reinitializing lockspace

2018-02-21 Thread Martin Sivak
Hi,

the bug you found describes the right procedure for reinitializing the
lockspace indeed. The hosted engine tool just packages the script to
make it easier for you to use.

You should make sure all hosted engine tools are stopped
(systemctl stop ovirt-ha-agent ovirt-ha-broker) on all hosts before
attempting the reinitialization. Also check the storage connection to
the lockspace and that all hosts have a different host_id in
/etc/ovirt-hosted-engine/hosted-engine.conf.
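
The host_id check can be sketched as follows. This is only an illustration:
it assumes the file uses plain key=value lines (such as host_id=1) and that
you have already collected the file contents from each host. The function
names and the host names in the example are hypothetical:

```python
from collections import defaultdict


def read_host_id(conf_text):
    """Extract host_id from the text of a key=value style config file."""
    for line in conf_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "host_id":
            return int(value.strip())
    return None


def find_duplicate_host_ids(conf_by_host):
    """Map each host_id used by more than one host to the sorted host names."""
    hosts_by_id = defaultdict(list)
    for host, conf_text in conf_by_host.items():
        hosts_by_id[read_host_id(conf_text)].append(host)
    return {hid: sorted(hosts) for hid, hosts in hosts_by_id.items()
            if hid is not None and len(hosts) > 1}


if __name__ == "__main__":
    # Hypothetical hosts and file contents, for illustration only.
    confs = {
        "host-a": "host_id=1\n",
        "host-b": "host_id=1\n",
        "host-c": "host_id=3\n",
    }
    print(find_duplicate_host_ids(confs))
```

An empty result means every host has its own id; any entry in the result
names the hosts that would collide on the same lockspace slot.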

I can't help you more without logs and more details about the issue,
such as which version you are using and what made you start looking
into the logs in the first place.

Best regards

Martin Sivak

On Wed, Feb 21, 2018 at 12:04 AM, Jamie Lawrence
<jlawre...@squaretrade.com> wrote:
> Hello,
>
> I have a sanlock problem. I don't fully understand the logs, but from what I 
> can gather, messages like this means it ain't working.
>
> 2018-02-16 14:51:46 22123 [15036]: s1 renewal error -107 delta_length 0 
> last_success 22046
> 2018-02-16 14:51:47 22124 [15036]: 53977885 aio collect RD 
> 0x7fe5040008c0:0x7fe5040008d0:0x7fe518922000 result -107:0 match res
> 2018-02-16 14:51:47 22124 [15036]: s1 delta_renew read rv -107 offset 0 
> /rhev/data-center/mnt/glusterSD/sc5-gluster-10g-1.squaretrade.com:ovirt__images/53977885-0887-48d0-a02c-8d9e3faec93c/dom_md/ids
>
> I attempted `hosted-engine --reinitialize-lockspace --force`, which didn't 
> appear to do anything, but who knows.
>
> I downed everything and tried `sanlock direct init -s `, which caused
> sanlock to dump core.
>
> At this point the only thing I can think of to do is down everything, whack 
> and manually recreate the lease files and try again. I'm worried that that 
> will lose something that the setup did or will otherwise destroy the 
> installation. It looks like this has been done by others[1], but the 
> references I can find are a bit old, so I'm unsure if that is still a valid 
> approach.
>
> So, questions:
>
>  - Will that work?
>  - Is there something I should do instead of that?
>
> Thanks,
>
> -j
>
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1116469


Re: [ovirt-users] Fwd: why host is not capable to run HE?

2018-02-19 Thread Martin Sivak
Hi Artem,

just a restart of ovirt-ha-agent services should be enough.
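
Before restarting, it may be worth confirming that the ids now agree. A
minimal sketch, assuming you paste in the output of the psql query quoted
below and supply the host_id values you read from hosted-engine.conf on each
host; the function names are made up for illustration:

```python
def parse_psql_rows(output):
    """Parse 'name | id' rows from psql's default aligned output."""
    mapping = {}
    for line in output.splitlines():
        name, sep, value = line.partition("|")
        value = value.strip()
        # Header, separator and '(N rows)' lines have no numeric second column.
        if sep and value.isdigit():
            mapping[name.strip()] = int(value)
    return mapping


def find_mismatches(engine_ids, conf_ids):
    """Return {host: (engine_id, conf_id)} for hosts whose ids differ."""
    return {host: (engine_ids[host], conf_ids.get(host))
            for host in engine_ids
            if conf_ids.get(host) != engine_ids[host]}


if __name__ == "__main__":
    psql_output = """
       vds_name   | vds_spm_id
    --------------+------------
     ovirt1.local |          2
     ovirt2.local |          1
    (2 rows)
    """
    engine_ids = parse_psql_rows(psql_output)
    # host_id values as reported in this thread before the fix.
    conf_ids = {"ovirt1.local": 1, "ovirt2.local": 2}
    print(find_mismatches(engine_ids, conf_ids))
```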

Best regards

Martin Sivak

On Mon, Feb 19, 2018 at 4:40 PM, Artem Tambovskiy
<artem.tambovs...@gmail.com> wrote:
> Ok, understood.
> Once I set the correct host_id on both hosts, how do I bring the changes into
> force with minimal downtime? Or do I need to reboot both hosts anyway?
>
> Regards,
> Artem
>
> On Feb 19, 2018 at 18:18, "Simone Tiraboschi"
> <stira...@redhat.com> wrote:
>
>>
>>
>> On Mon, Feb 19, 2018 at 4:12 PM, Artem Tambovskiy
>> <artem.tambovs...@gmail.com> wrote:
>>>
>>>
>>> Thanks a lot, Simone!
>>>
>>> This clearly shows a problem:
>>>
>>> [root@ov-eng ovirt-engine]# sudo -u postgres psql -d engine -c 'select
>>> vds_name, vds_spm_id from vds'
>>>    vds_name   | vds_spm_id
>>> --------------+------------
>>>  ovirt1.local |          2
>>>  ovirt2.local |          1
>>> (2 rows)
>>>
>>> Meanwhile hosted-engine.conf on ovirt1.local has host_id=1 and on
>>> ovirt2.local host_id=2, so the values are exactly opposite.
>>> So how do I get this fixed in a simple way? Update the engine DB?
>>
>>
>> I'd suggest manually fixing /etc/ovirt-hosted-engine/hosted-engine.conf on
>> both hosts.
>>
>>>
>>>
>>> Regards,
>>> Artem
>>>
>>> On Mon, Feb 19, 2018 at 5:37 PM, Simone Tiraboschi <stira...@redhat.com>
>>> wrote:
>>>>
>>>>
>>>>
>>>> On Mon, Feb 19, 2018 at 12:13 PM, Artem Tambovskiy
>>>> <artem.tambovs...@gmail.com> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> Last weekend my cluster suffered from a massive power outage due to
>>>>> human error.
>>>>> I'm using a SHE setup with Gluster. I managed to bring the cluster up
>>>>> quickly, but once again I have a problem with a duplicated host_id
>>>>> (https://bugzilla.redhat.com/show_bug.cgi?id=1543988) on the second host,
>>>>> and because of this the second host is not capable of running HE.
>>>>>
>>>>> I manually updated hosted-engine.conf with the correct host_id and
>>>>> restarted agent & broker - no effect. Then I rebooted the host itself -
>>>>> still no change. How can I fix this issue?
>>>>
>>>>
>>>> I'd suggest running this command on the engine VM:
>>>> sudo -u postgres scl enable rh-postgresql95 --  psql -d engine -c
>>>> 'select vds_name, vds_spm_id from vds'
>>>> (just  sudo -u postgres psql -d engine -c 'select vds_name, vds_spm_id
>>>> from vds'  if still on 4.1) and checking
>>>> /etc/ovirt-hosted-engine/hosted-engine.conf on all the involved hosts.
>>>> You may also have a leftover configuration file on an undeployed
>>>> host.
>>>>
>>>> When you find a conflict, you should manually bring down sanlock.
>>>> If in doubt, a reboot of both hosts will certainly solve it.
>>>>
>>>>
>>>>>
>>>>>
>>>>> Regards,
>>>>> Artem
>>>>>


Re: [ovirt-users] CentOS 7 Hyperconverged oVirt 4.2 with Self-Hosted-Engine with glusterfs with 2 Hypervisors and 1 glusterfs-Arbiter-only

2018-02-12 Thread Martin Sivak
Hi,

this should work according to my gluster colleagues.

The recommended way to install this would be by using one of the
"full" nodes and deploying hosted engine via cockpit there. The
gdeploy plugin in cockpit should allow you to configure the arbiter
node.

The documentation for deploying RHHI (hyper converged RH product) is
here: 
https://access.redhat.com/documentation/en-us/red_hat_hyperconverged_infrastructure/1.1/html-single/deploying_red_hat_hyperconverged_infrastructure/index#deploy

Best regards

Martin Sivak

On Mon, Feb 12, 2018 at 4:25 PM, Philipp Richter
<philipp.rich...@linforge.com> wrote:
> Hi,
>
> I'm trying to install oVirt 4.2 as 2-Node Hyperconverged System based on 
> glusterfs.
> A third node should be used as glusterfs arbiter Node and to provide Quorum 
> for the Cluster.
> The third node is a small PCEngines APU2 Host, so it is not usable as 
> Hypervisor.
>
> My question is: Is this kind of setup possible?
> What is the best way to install a cluster like this one?
>
> Thanks,
> --
>
> : Philipp Richter
> : LINFORGE | Peace of mind for your IT
> :
> : T: +43 1 890 79 99
> : E: philipp.rich...@linforge.com
> : https://www.xing.com/profile/Philipp_Richter15
> : https://www.linkedin.com/in/philipp-richter
> :
> : LINFORGE Technologies GmbH
> : Brehmstraße 10
> : 1110 Wien
> : Österreich
> :
> : Firmenbuchnummer: FN 216034y
> : USt.- Nummer : ATU53054901
> : Gerichtsstand: Wien
> :
> : LINFORGE® is a registered trademark of LINFORGE, Austria.


Re: [ovirt-users] Use the "hosted_engine" data domain as data domain for others VM

2018-02-09 Thread Martin Sivak
Hi,

it should work in general, but there are a couple of corner cases to be aware of.

Hosted engine VM should have its disks only on the HE storage domain.
The HE should be installed using the new Node 0 approach (default in
4.2.1+) or it must not use any custom mount options
(https://bugzilla.redhat.com/show_bug.cgi?id=1373930)

For all those reasons we do not recommend using it in production, but
we are not aware of anything that would really block you from doing
it. It just hasn't been tested and polished enough yet.

Best regards

Martin Sivak

On Fri, Feb 9, 2018 at 1:02 PM, Simone Tiraboschi <stira...@redhat.com> wrote:
>
>
> On Fri, Feb 9, 2018 at 10:20 AM, yayo (j) <jag...@gmail.com> wrote:
>>
>> Hi,
>>
>> Is there any problem with using the "hosted_engine" data domain to store
>> the disks of other VMs?
>
>
> Hi,
> it will be the default in 4.3.
>
> It's tracked here:
> https://bugzilla.redhat.com/show_bug.cgi?id=1451653
>
> I'm not aware of any specific blocker on 4.2, so it should work on the
> technical side, although it's not the recommended architecture for
> production systems.
> You can certainly face some performance degradation on block device
> storage domains (iSCSI and FC) if you have a lot of inactive VMs on the
> hosted-engine storage domain due to
> https://bugzilla.redhat.com/show_bug.cgi?id=1443819
>
>
>>
>> I have created a "too big" "hosted_engine" data domain so I want to use
>> that space...
>>
>> Thank you
>>


Re: [ovirt-users] when creating VMs, I don't want hosted_storage to be an option

2018-02-09 Thread Martin Sivak
Hi,

we got much closer to officially removing the specialty status of both
the domain and the VM in 4.2 with features like Node 0 deployment
(default since 4.2.1) and direct libvirtxml support in engine and HE
(4.2.2 iirc).

There are a couple of outstanding issues:

- HE needs to know how to connect all storage domains necessary for HE
VM disks (not 100% related, but close)
- (live) storage migration is not supported yet - HE nodes need to
learn about the new connection details
- changes to gluster topology are not supported yet - same reason as above
- we have a bug with regards to block devices - will be fixed by
https://gerrit.ovirt.org/#/c/87325/
- fencing and SPM role need to be tested a bit more to make sure we
have no surprises there
- old deployments might not have some data in the engine DB
(https://bugzilla.redhat.com/show_bug.cgi?id=1373930)

We will not be adding any additional limits as all seems to work in
the usual cases and we work on removing the remaining restrictions. I
am not 100% certain when it will be finished exactly, but you can use
it now if you are careful (basically do not use custom mount options
and do not add disks to the HE VM that would come from a different
SD!!).

We have two tracking bugs for the related work:
https://bugzilla.redhat.com/show_bug.cgi?id=1455169 and
https://bugzilla.redhat.com/show_bug.cgi?id=1393902 - most of what was
needed was fixed already.

Best regards

Martin Sivak

On Fri, Feb 9, 2018 at 11:06 AM, Gianluca Cecchi
<gianluca.cec...@gmail.com> wrote:
> On Thu, Jun 22, 2017 at 11:37 AM, Martin Sivak <msi...@redhat.com> wrote:
>>
>> Hi,
>>
>> Chris is right. We want to remove the specialty status from that
>> storage domain. It is one of the highest priority items for hosted
>> engine right now.
>>
>> There is currently no way to hide it I am afraid.
>>
>> Best regards
>>
>> --
>> Martin Sivak
>> SLA / oVirt
>>
>
> Hello Martin (and list),
> any update on this item to remove specialty of hosted_engine storage?
> Any bugzilla RFE or pointer? I think it didn't catch 4.2, correct?
>
> Thanks,
> Gianluca


Re: [ovirt-users] Maximum time node can be offline.

2018-02-09 Thread Martin Sivak
Hi,

the hosts are almost stateless and we set up most of what is needed
during activation. Hosted engine has some configuration stored
locally, but that is just the path to the storage domain.

I think you should be fine unless you change the network topology
significantly. I would also install security updates once in a while.

We can even shut down the hosts for you when you configure two cluster
scheduling properties: EnableAutomaticPM and HostsInReserve.
HostsInReserve should be at least 1 though. It behaves like this: as
long as the reserve host is empty, we shut down all the other empty
hosts. And we boot another host once a VM does not fit on the other used
hosts and is placed on the running reserve host. That would save you
the power of just one host, but it would still be highly available (if
hosted engine and storage allows that too).
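
To make the described behaviour concrete, here is a toy model of one
balancing pass. It is only a sketch of the rule as stated above, not the
engine's actual implementation; the function name, its arguments and the
host names are all assumptions made for illustration:

```python
def plan_power_actions(vm_count_by_host, powered_off, hosts_in_reserve=1):
    """Return (hosts_to_shut_down, hosts_to_boot) for one balancing pass.

    vm_count_by_host: running hosts mapped to their number of VMs.
    powered_off: hosts that are currently shut down.
    """
    empty = sorted(h for h, vms in vm_count_by_host.items() if vms == 0)
    if len(empty) > hosts_in_reserve:
        # More empty hosts than the reserve requires: power off the surplus.
        return empty[hosts_in_reserve:], []
    missing = hosts_in_reserve - len(empty)
    # The reserve was consumed by a VM: boot replacements if any are available.
    return [], sorted(powered_off)[:missing]


if __name__ == "__main__":
    # Two empty hosts, reserve of one: the surplus empty host is shut down.
    print(plan_power_actions({"h1": 3, "h2": 0, "h3": 0}, []))
    # The reserve host received a VM: a powered-off host is booted.
    print(plan_power_actions({"h1": 3, "h2": 1}, ["h3"]))
```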

Bear in mind that a single-host cluster is not highly available at all.

Best regards

Martin Sivak

On Fri, Feb 9, 2018 at 8:25 AM, Gianluca Cecchi
<gianluca.cec...@gmail.com> wrote:
> On Fri, Feb 9, 2018 at 2:30 AM, Thomas Letherby <xrs...@xrs444.net> wrote:
>>
>> Thanks, that answers my follow up question! :)
>>
>> My concern is that I could have a host off-line for a month say, is that
>> going to cause any issues?
>>
>> Thanks,
>>
>> Thomas
>>
>
> I think that if in the mean time you don't make any configuration changes
> and you don't update anything, there is no reason to have problems.
> In case of changes done, it could depend on what they are: are you thinking
> about any particular scenario?
>
>


Re: [ovirt-users] Cannot Remove Disk

2018-02-08 Thread Martin Sivak
Andrej, this might be related to the recent fixes of yours in that
area. Can you take a look please?

Best regards

Martin Sivak

On Thu, Feb 8, 2018 at 4:18 PM, Donny Davis <do...@fortnebula.com> wrote:
> Ovirt 4.2 has been humming away quite nicely for me in the last few months,
> and now I am hitting an issue when I try to touch any API call that has to do
> with a specific disk. This disk resides on a hyperconverged DC, and none of
> the other disks seem to be affected. Here is the error thrown.
>
> 2018-02-08 10:13:20,005-05 ERROR
> [org.ovirt.engine.core.bll.storage.disk.RemoveDiskCommand] (default task-22)
> [7b48d1ec-53a7-497a-af8e-938f30a321cf] Error during ValidateFailure.:
> org.ovirt.engine.core.bll.quota.InvalidQuotaParametersException: Quota
> 6156b8dd-50c9-4e8f-b1f3-4a6449b02c7b does not match storage pool
> 5a497956-0380-021e-0025-035e
>
>
>
> Any ideas what can be done to fix this?
>


Re: [ovirt-users] ovirt and gateway behavior

2018-02-06 Thread Martin Sivak
> This is expected behaviour, even if it’s not very bright. It’s being used as
> a way to detect network is operating correctly.

Correct, it is used to check whether users can reach the host and the
VM that runs on it. There aren't that many options to check that. All
require data exchange of some kind (ICMP req/res, TCP SYN/ACK, some
UDP echo..).

> It is insane as there are so many ways it breaks.  My network admin turns
> off ICMP responses and death to network.

ICMP is an important signaling mechanism.. seriously, it is usually a
bad idea to block it.

> I got this trying to install on a network with out a gateway.

How were your users accessing the VMs? Was this some kind of super
secure deployment with no outside connectivity?


Best regards

Martin Sivak

On Tue, Feb 6, 2018 at 4:32 PM, Ben De Luca <bdel...@gmail.com> wrote:
> This is expected behaviour, even if it’s not very bright. It’s being used as
> a way to detect network is operating correctly.
>
> I got this trying to install on a network with out a gateway.
>
> It is insane as there are so many ways it breaks.  My network admin turns
> off ICMP responses and death to network.
>
> On Tue 6. Feb 2018 at 16:27, Alex K <rightkickt...@gmail.com> wrote:
>>
>> Hi,
>>
>> I have seen hosts rendered unresponsive when gateway is lost.
>> I will be able to provide more info once I prepare an environment and test
>> this further.
>>
>> Thanx,
>> Alex
>>
>> On Tue, Feb 6, 2018 at 10:40 AM, Yaniv Kaul <yk...@redhat.com> wrote:
>>>
>>>
>>>
>>> On Feb 5, 2018 2:21 PM, "Alex K" <rightkickt...@gmail.com> wrote:
>>>
>>> Hi all,
>>>
>>> I have a 3 nodes ovirt 4.1 cluster, self hosted on top of glusterfs. The
>>> cluster is used to host several VMs.
>>> I have observed that when gateway is lost (say the gateway device is
>>> down) the ovirt cluster goes down.
>>>
>>>
>>> Is the cluster down, or just the self-hosted engine?
>>>
>>>
>>> It seems a bit extreme behavior especially when one does not care if the
>>> hosted VMs have connectivity to Internet or not.
>>>
>>>
>>> Are the VMs down?
>>> The hosts?
>>> Y.
>>>
>>>
>>> Can this behavior be disabled?
>>>
>>> Thanx,
>>> Alex


Re: [ovirt-users] ovirt and gateway behavior

2018-02-06 Thread Martin Sivak
Hi,

We use the ping check to see whether the host running the hosted engine
has connectivity with the rest of the cluster and users. If the check
fails, we kill the VM in the hope that some other host will make the
engine available to users again.

We use the gateway by default, as it is pretty common to have a separate
network for the data center, but you can change the address if your
topology is different.
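
The check described above can be sketched roughly as follows. This is
only an illustration, not the agent's actual code: the function name
gateway_reachable, the injectable run parameter and the 2 second timeout
are assumptions made for the example. Only the ping options (-c, -W) and
its exit status convention are real.

```python
import subprocess


def gateway_reachable(address, run=subprocess.run, timeout_s=2):
    """Return True if a single ICMP echo to `address` gets a reply.

    Hypothetical helper; `run` is injectable so the logic can be
    exercised without sending real packets.
    """
    # -c 1: send one echo request, -W: seconds to wait for the reply
    # (Linux iputils ping options).
    cmd = ["ping", "-c", "1", "-W", str(timeout_s), address]
    result = run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    # ping exits with 0 only when at least one reply was received.
    return result.returncode == 0
```

A caller would pass whatever address is configured for the check, and
treat a False result as "this host may be cut off from users".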

Best regards

Martin Sivak

On Tue, Feb 6, 2018 at 4:27 PM, Alex K <rightkickt...@gmail.com> wrote:
> Hi,
>
> I have seen hosts rendered unresponsive when gateway is lost.
> I will be able to provide more info once I prepare an environment and test
> this further.
>
> Thanx,
> Alex
>
> On Tue, Feb 6, 2018 at 10:40 AM, Yaniv Kaul <yk...@redhat.com> wrote:
>>
>>
>>
>> On Feb 5, 2018 2:21 PM, "Alex K" <rightkickt...@gmail.com> wrote:
>>
>> Hi all,
>>
>> I have a 3 nodes ovirt 4.1 cluster, self hosted on top of glusterfs. The
>> cluster is used to host several VMs.
>> I have observed that when gateway is lost (say the gateway device is down)
>> the ovirt cluster goes down.
>>
>>
>> Is the cluster down, or just the self-hosted engine?
>>
>>
>> It seems a bit extreme behavior especially when one does not care if the
>> hosted VMs have connectivity to Internet or not.
>>
>>
>> Are the VMs down?
>> The hosts?
>> Y.
>>
>>
>> Can this behavior be disabled?
>>
>> Thanx,
>> Alex


Re: [ovirt-users] hosted-engine unknow stale-data

2018-01-22 Thread Martin Sivak
Hi Artem,

make sure the IDs are different, change them manually if you must!

That is all you need to do to get the agent up I think. The symlink
issue is probably related to another change we did (it happens when a
new hosted engine node is deployed by the engine) and a simple broker
restart should fix it too.

Best regards

Martin Sivak

On Mon, Jan 22, 2018 at 8:03 AM, Artem Tambovskiy
<artem.tambovs...@gmail.com> wrote:
> Hello Kasturi,
>
> Yes, I set global maintenance mode intentionally.
> I've run out of ideas for troubleshooting my cluster and decided to undeploy
> the hosted engine from the second host, clean the installation and add it
> to the cluster again.
> I also cleaned the metadata with hosted-engine --clean-metadata --host-id=2
> --force-clean. But once I added the second host to the cluster again, it
> didn't show the capability to run the hosted engine, and it doesn't even
> appear in the output of hosted-engine --vm-status:
> [root@ovirt1 ~]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : True
> Hostname   : ovirt1.telia.ru
> Host ID: 1
> Engine status  : {"health": "good", "vm": "up", "detail": "up"}
> Score  : 3400
> stopped: False
> Local maintenance  : False
> crc32  : a23c7cbd
> local_conf_timestamp   : 848931
> Host timestamp : 848930
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=848930 (Mon Jan 22 09:53:29 2018)
> host-id=1
> score=3400
> vm_conf_refresh_time=848931 (Mon Jan 22 09:53:29 2018)
> conf_on_shared_storage=True
> maintenance=False
> state=GlobalMaintenance
> stopped=False
>
> On redeployed second host I see unknown-stale-data again, and second host
> doesn't show up as a hosted-engine capable.
> [root@ovirt2 ~]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : False
> Hostname   : ovirt1.telia.ru
> Host ID: 1
> Engine status  : unknown stale-data
> Score  : 0
> stopped: False
> Local maintenance  : False
> crc32  : 18765f68
> local_conf_timestamp   : 848951
> Host timestamp : 848951
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=848951 (Mon Jan 22 09:53:49 2018)
> host-id=1
> score=0
> vm_conf_refresh_time=848951 (Mon Jan 22 09:53:50 2018)
> conf_on_shared_storage=True
> maintenance=False
> state=ReinitializeFSM
> stopped=False
>
>
> Really strange situation ...
>
> Regards,
> Artem
>
>
>
> On Mon, Jan 22, 2018 at 9:46 AM, Kasturi Narra <kna...@redhat.com> wrote:
>>
>> Hello Artem,
>>
>> Any reason why you chose hosted-engine undeploy action for the second
>> host ? I see that the cluster is in global maintenance mode, was this
>> intended ?
>>
>> command to clear the entries from hosted-engine --vm-status is
>> "hosted-engine --clean-metadata --host-id= --force-clean"
>>
>> Hope this helps !!
>>
>> Thanks
>> kasturi
>>
>>
>> On Fri, Jan 19, 2018 at 12:07 AM, Artem Tambovskiy
>> <artem.tambovs...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> Ok, i decided to remove second host from the cluster.
>>> I reinstalled it from the web UI with the hosted-engine action UNDEPLOY, and
>>> removed it from the cluster afterwards.
>>> All VM's are fine hosted engine running ok,
>>> But hosted-engine --vm-status still showing 2 hosts.
>>>
>>> How I can clean the traces of second host in a correct way?
>>>
>>>
>>> --== Host 1 status ==--
>>>
>>> conf_on_shared_storage : True
>>> Status up-to-date  : True
>>> Hostname   : ovirt1.telia.ru
>>> Host ID: 1
>>> Engine status  : {"health": "good", "vm": "up",
>>> "detail": "up"}
>>> Score  : 3400
>>> stopped: False
>>> Local maintenance  : False
>>> crc32  : 1b1b6f6d
>>> local_conf_timestamp   : 545385
>>> Host timestamp : 545385
>>> Extra metadata (valid at t

Re: [ovirt-users] oVirt storage access failure from host

2018-01-19 Thread Martin Sivak
Hi,

Have you been adding or redeploying a host lately? If yes, then try
restarting ovirt-ha-broker service. If it helps then it might be a
case of this bug: https://bugzilla.redhat.com/1527394

The ovirt-ha-agent and brokers from oVirt 4.2 are fixed already, but
we haven't backported the fix yet.

Best regards

Martin Sivak

On Fri, Jan 19, 2018 at 1:01 PM, Alex K <rightkickt...@gmail.com> wrote:
> Hi All,
>
> I have a 3 server ovirt 4.1 self hosted setup with gluster replica 3.
>
> I see that suddenly one of the hosts reported as unresponsive and at same
> time the /var/log/messages logged:
>
> ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler
> ERROR Error handling request, data: 'set-storage-domain FilesystemBackend
> dom_type=glusterfs
> sd_uuid=ad7b9e2a-7ae3-46ad-9429-5f5ef452eac8'#012Traceback (most recent call
> last):#012  File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py",
> line 166, in handle#012data)#012  File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py",
> line 299, in _dispatch#012.set_storage_domain(client, sd_type,
> **options)#012  File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",
> line 66, in set_storage_domain#012self._backends[client].connect()#012
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
> line 462, in connect#012self._dom_type)#012  File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
> line 107, in get_domain_path#012" in {1}".format(sd_uuid,
> parent))#012BackendFailureException: path to storage domain
> ad7b9e2a-7ae3-46ad-9429-5f5ef452eac8 not found in
> /rhev/data-center/mnt/glusterSD
> Jan 15 11:04:56 v1 journal: vdsm root ERROR failed to retrieve Hosted Engine
> HA info#012Traceback (most recent call last):#012  File
> "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 231, in
> _getHaInfo#012stats = instance.get_all_stats()#012  File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
> line 103, in get_all_stats#012self._configure_broker_conn(broker)#012
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
> line 180, in _configure_broker_conn#012dom_type=dom_type)#012  File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
> line 177, in set_storage_domain#012.format(sd_type, options,
> e))#012RequestError: Failed to set storage domain FilesystemBackend, options
> {'dom_type': 'glusterfs', 'sd_uuid':
> 'ad7b9e2a-7ae3-46ad-9429-5f5ef452eac8'}: Request failed: <class 'ovirt_hosted_engine_ha.lib.storage_backends.BackendFailureException'>
>
>
> At VDSM logs i see the following continuously logged:
> [jsonrpc.JsonRpcServer] RPC call VM.getStats failed (error 1) in 0.00
> seconds (__init__:539)
>
> No errors seen at gluster at same time frame.
>
> Any hints on what is causing this issue? It seems a storage access issue but
> gluster was up and volumes ok. The VMs that I am running on top are Windows
> 10 and Windows 2016 64 bit.
>
>
> Thanx,
> Alex


Re: [ovirt-users] hosted-engine unknow stale-data

2018-01-16 Thread Martin Sivak
Hi everybody,

there are a couple of things to check here.

- what version of hosted engine agent is this? The logs look like they
come from 4.1
- what version of engine is used?
- check the host ID in /etc/ovirt-hosted-engine/hosted-engine.conf on
both hosts, the numbers must be different
- it looks like the agent or broker on host 2 is not active (or there
would be a report)
- the second host does not see data from the first host (unknown
stale-data), wait for a minute and check again, then check the storage
connection

And then the general troubleshooting:

- put hosted engine in global maintenance mode (and check that it is
visible from the other host using hosted-engine --vm-status)
- mount storage domain (hosted-engine --connect-storage)
- check sanlock client status to see if proper lockspaces are present
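
The hostname and stale-data checks above can be partly automated. A
minimal sketch, assuming the plain-text layout of hosted-engine
--vm-status as pasted in this thread (Hostname, Host ID, Status
up-to-date fields); parse_vm_status and find_problems are hypothetical
helper names, not part of any oVirt tool:

```python
import re

# Only these fields from each host block are kept.
FIELDS = ("Hostname", "Host ID", "Status up-to-date", "Engine status", "Score")


def parse_vm_status(text):
    """Split `hosted-engine --vm-status` text into one dict per host block."""
    hosts = []
    current = None
    for line in text.splitlines():
        # Quoted mail copies carry leading '>' markers; drop them.
        line = line.lstrip("> ").strip()
        if re.match(r"--== Host \d+ status ==--", line):
            current = {}
            hosts.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            if key in FIELDS:
                # Keep only the first occurrence of each field in a block.
                current.setdefault(key, value.strip())
    return hosts


def find_problems(hosts):
    """Return warnings for duplicate hostnames and stale host data."""
    problems = []
    seen = {}
    for host in hosts:
        name, host_id = host.get("Hostname"), host.get("Host ID")
        if name in seen and seen[name] != host_id:
            problems.append(
                "hostname %s reported for host IDs %s and %s"
                % (name, seen[name], host_id))
        seen.setdefault(name, host_id)
        if host.get("Status up-to-date") == "False":
            problems.append("host ID %s has stale data" % host_id)
    return problems
```

Feeding it the output captured on each host makes it easier to spot a
hostname that shows up under two host IDs, or a host whose report is
not being refreshed. It does not replace checking hosted-engine.conf.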

Best regards

Martin Sivak

On Tue, Jan 16, 2018 at 1:16 PM, Derek Atkins <de...@ihtfp.com> wrote:
> Why are both hosts reporting as ovirt 1?
> Look at the hostname fields to see what I mean.
>
> -derek
> Sent using my mobile device. Please excuse any typos.
>
> On January 16, 2018 7:11:09 AM Artem Tambovskiy <artem.tambovs...@gmail.com>
> wrote:
>>
>> Hello,
>>
>> Yes, I followed exactly the same procedure while reinstalling the hosts
>> (the only difference that I have SSH key configured instead of the
>> password).
>>
>> I just reinstalled the second host one more time. After 20 min the host
>> still hasn't reached the active score of 3400 (Hosted Engine HA: Not Active)
>> and I still don't see the crown icon for this host.
>>
>> hosted-engine --vm-status  from ovirt1 host
>>
>> [root@ovirt1 ~]# hosted-engine --vm-status
>>
>>
>> --== Host 1 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date  : True
>> Hostname   : ovirt1.telia.ru
>> Host ID: 1
>> Engine status  : {"health": "good", "vm": "up",
>> "detail": "up"}
>> Score  : 3400
>> stopped: False
>> Local maintenance  : False
>> crc32  : 3f94156a
>> local_conf_timestamp   : 349144
>> Host timestamp : 349144
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=349144 (Tue Jan 16 15:03:45 2018)
>> host-id=1
>> score=3400
>> vm_conf_refresh_time=349144 (Tue Jan 16 15:03:45 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=EngineUp
>> stopped=False
>>
>>
>> --== Host 2 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date  : False
>> Hostname   : ovirt1.telia.ru
>> Host ID: 2
>> Engine status  : unknown stale-data
>> Score  : 0
>> stopped: True
>> Local maintenance  : False
>> crc32  : c7037c03
>> local_conf_timestamp   : 7530
>> Host timestamp : 7530
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=7530 (Fri Jan 12 16:10:12 2018)
>> host-id=2
>> score=0
>> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=AgentStopped
>> stopped=True
>>
>>
>> hosted-engine --vm-status output from ovirt2 host
>>
>> [root@ovirt2 ovirt-hosted-engine-ha]# hosted-engine --vm-status
>>
>>
>> --== Host 1 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date  : False
>> Hostname   : ovirt1.telia.ru
>> Host ID: 1
>> Engine status  : unknown stale-data
>> Score  : 3400
>> stopped: False
>> Local maintenance  : False
>> crc32  : 6d3606f1
>> local_conf_timestamp   : 349264
>> Host timestamp : 349264
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>>

Re: [ovirt-users] ha-agent and broker continually crashing after 4.2 update

2018-01-15 Thread Martin Sivak
I actually do not agree with Simone here. The fix he talks about adds
a call to prepareImage, but your log clearly shows that prepareImage
is the call that fails:

Jan 12 16:52:36 cultivar0 journal: vdsm storage.Dispatcher ERROR
FINISH prepareImage error=Volume does not exist:
(u'8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8',)
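
The longer log lines quoted in this thread have their tracebacks
flattened onto one line with #012 where the newlines were, which makes
it hard to see which call failed. A small sketch for unfolding them,
assuming the usual rsyslog escaping of control characters as # followed
by three octal digits; unescape_syslog is a hypothetical helper name:

```python
import re


def unescape_syslog(line):
    """Turn #NNN octal escapes (e.g. #012 for newline) back into characters."""
    # Caveat: a literal '#' followed by three octal digits in the
    # message text would be decoded as well.
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), line)
```

Printing the result of unescape_syslog on a /var/log/messages line
gives back the multi-line traceback, with prepareImage as the last
frame before the VolumeDoesNotExist error.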

I have to ask how old the environment is. Was it by any chance
installed back in 3.3/3.4 days and upgraded since then?

Martin

On Mon, Jan 15, 2018 at 10:17 AM, Simone Tiraboschi  wrote:
>
>
> On Fri, Jan 12, 2018 at 9:54 PM, Jayme  wrote:
>>
>> recently upgraded to 4.2 and had some problems with engine vm running, got
>> that cleared up now my only remaining issue is that now it seems
>> ovirt-ha-broker and ovirt-ha-agent are continually crashing on all three of
>> my hosts.  Everything is up and working fine otherwise, all VMs running and
>> hosted engine VM is running along with interface etc.
>
>
> I think it's due to https://bugzilla.redhat.com/show_bug.cgi?id=1527394 which
> got recently fixed.
> ovirt-hosted-engine-ha-2.2.3 should address it, please let us know if not.
>
>
>>
>>
>> Jan 12 16:52:34 cultivar0 journal: vdsm storage.Dispatcher ERROR FINISH
>> prepareImage error=Volume does not exist:
>> (u'8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8',)
>> Jan 12 16:52:34 cultivar0 python: detected unhandled Python exception in
>> '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
>> Jan 12 16:52:34 cultivar0 abrt-server: Not saving repeating crash in
>> '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
>> Jan 12 16:52:34 cultivar0 systemd: ovirt-ha-broker.service: main process
>> exited, code=exited, status=1/FAILURE
>> Jan 12 16:52:34 cultivar0 systemd: Unit ovirt-ha-broker.service entered
>> failed state.
>> Jan 12 16:52:34 cultivar0 systemd: ovirt-ha-broker.service failed.
>> Jan 12 16:52:34 cultivar0 systemd: ovirt-ha-broker.service holdoff time
>> over, scheduling restart.
>> Jan 12 16:52:34 cultivar0 systemd: Cannot add dependency job for unit
>> lvm2-lvmetad.socket, ignoring: Unit is masked.
>> Jan 12 16:52:34 cultivar0 systemd: Started oVirt Hosted Engine High
>> Availability Communications Broker.
>> Jan 12 16:52:34 cultivar0 systemd: Starting oVirt Hosted Engine High
>> Availability Communications Broker...
>> Jan 12 16:52:36 cultivar0 journal: vdsm storage.TaskManager.Task ERROR
>> (Task='73141dec-9d8f-4164-9c4e-67c43a102eff') Unexpected error#012Traceback
>> (most recent call last):#012  File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in
>> _run#012return fn(*args, **kargs)#012  File "", line 2, in
>> prepareImage#012  File
>> "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in
>> method#012ret = func(*args, **kwargs)#012  File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 3162, in
>> prepareImage#012raise
>> se.VolumeDoesNotExist(leafUUID)#012VolumeDoesNotExist: Volume does not
>> exist: (u'8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8',)
>> Jan 12 16:52:36 cultivar0 journal: vdsm storage.Dispatcher ERROR FINISH
>> prepareImage error=Volume does not exist:
>> (u'8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8',)
>> Jan 12 16:52:36 cultivar0 python: detected unhandled Python exception in
>> '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
>> Jan 12 16:52:36 cultivar0 abrt-server: Not saving repeating crash in
>> '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
>> Jan 12 16:52:36 cultivar0 systemd: ovirt-ha-broker.service: main process
>> exited, code=exited, status=1/FAILURE
>> Jan 12 16:52:36 cultivar0 systemd: Unit ovirt-ha-broker.service entered
>> failed state.
>> Jan 12 16:52:36 cultivar0 systemd: ovirt-ha-broker.service failed.
>>
>> Jan 12 16:52:36 cultivar0 systemd: ovirt-ha-broker.service holdoff time
>> over, scheduling restart.
>> Jan 12 16:52:36 cultivar0 systemd: Cannot add dependency job for unit
>> lvm2-lvmetad.socket, ignoring: Unit is masked.
>> Jan 12 16:52:36 cultivar0 systemd: Started oVirt Hosted Engine High
>> Availability Communications Broker.
>> Jan 12 16:52:36 cultivar0 systemd: Starting oVirt Hosted Engine High
>> Availability Communications Broker...
>> Jan 12 16:52:37 cultivar0 journal: vdsm storage.TaskManager.Task ERROR
>> (Task='bc7af1e2-0ab2-4164-ae88-d2bee03500f9') Unexpected error#012Traceback
>> (most recent call last):#012  File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in
>> _run#012return fn(*args, **kargs)#012  File "", line 2, in
>> prepareImage#012  File
>> "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in
>> method#012ret = func(*args, **kwargs)#012  File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 3162, in
>> prepareImage#012raise
>> se.VolumeDoesNotExist(leafUUID)#012VolumeDoesNotExist: Volume does not
>> exist: (u'8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8',)
>> Jan 12 16:52:37 cultivar0 journal: vdsm storage.Dispatcher ERROR FINISH
>> prepareImage error=Volume does 

Re: [ovirt-users] unable to bring up hosted engine after botched 4.2 upgrade

2018-01-12 Thread Martin Sivak
Hi,

the VM is up according to the status (at least for a while). You
should be able to use the console and diagnose anything that happened
inside (like the need for fsck and such) now.

Check the presence of those links again now, the metadata file content
is not important, but the file has to exist (agents will populate it
with status data). I have no new idea about what is wrong with that
though.

Best regards

Martin



On Fri, Jan 12, 2018 at 5:47 PM, Jayme <jay...@gmail.com> wrote:
> The lock space issue was an issue I needed to clear but I don't believe it
> has resolved the problem.  I shutdown agent and broker on all hosts and
> disconnected hosted-storage then enabled broker/agent on just one host and
> connected storage.  I started the VM and actually didn't get any errors in
> the logs barely at all which was good to see, however the VM is still not
> running:
>
> HOST3:
>
> Engine status  : {"reason": "failed liveliness check",
> "health": "bad", "vm": "up", "detail": "Up"}
>
> ==> /var/log/messages <==
> Jan 12 12:42:57 cultivar3 kernel: ovirtmgmt: port 2(vnet0) entered disabled
> state
> Jan 12 12:42:57 cultivar3 kernel: device vnet0 entered promiscuous mode
> Jan 12 12:42:57 cultivar3 kernel: ovirtmgmt: port 2(vnet0) entered blocking
> state
> Jan 12 12:42:57 cultivar3 kernel: ovirtmgmt: port 2(vnet0) entered
> forwarding state
> Jan 12 12:42:57 cultivar3 lldpad: recvfrom(Event interface): No buffer space
> available
> Jan 12 12:42:57 cultivar3 systemd-machined: New machine qemu-111-Cultivar.
> Jan 12 12:42:57 cultivar3 systemd: Started Virtual Machine
> qemu-111-Cultivar.
> Jan 12 12:42:57 cultivar3 systemd: Starting Virtual Machine
> qemu-111-Cultivar.
> Jan 12 12:42:57 cultivar3 kvm: 3 guests now active
> Jan 12 12:44:38 cultivar3 libvirtd: 2018-01-12 16:44:38.737+: 1535:
> error : qemuDomainAgentAvailable:6010 : Guest agent is not responding: QEMU
> guest agent is not connected
>
> Interestingly though, now I'm seeing this in the logs which may be a new
> clue:
>
>
> ==> /var/log/vdsm/vdsm.log <==
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/nfsSD.py", line 126,
> in findDomain
> return NfsStorageDomain(NfsStorageDomain.findDomainPath(sdUUID))
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/nfsSD.py", line 116,
> in findDomainPath
> raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist:
> (u'248f46f0-d793-4581-9810-c9d965e2f286',)
> jsonrpc/4::ERROR::2018-01-12
> 12:40:30,380::dispatcher::82::storage.Dispatcher::(wrapper) FINISH
> getStorageDomainInfo error=Storage domain does not exist:
> (u'248f46f0-d793-4581-9810-c9d965e2f286',)
> periodic/42::ERROR::2018-01-12 12:40:35,430::api::196::root::(_getHaInfo)
> failed to retrieve Hosted Engine HA score '[Errno 2] No such file or
> directory'Is the Hosted Engine setup finished?
> periodic/43::ERROR::2018-01-12 12:40:50,473::api::196::root::(_getHaInfo)
> failed to retrieve Hosted Engine HA score '[Errno 2] No such file or
> directory'Is the Hosted Engine setup finished?
> periodic/40::ERROR::2018-01-12 12:41:05,519::api::196::root::(_getHaInfo)
> failed to retrieve Hosted Engine HA score '[Errno 2] No such file or
> directory'Is the Hosted Engine setup finished?
> periodic/43::ERROR::2018-01-12 12:41:20,566::api::196::root::(_getHaInfo)
> failed to retrieve Hosted Engine HA score '[Errno 2] No such file or
> directory'Is the Hosted Engine setup finished?
>
> ==> /var/log/ovirt-hosted-engine-ha/broker.log <==
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",
> line 151, in get_raw_stats
> f = os.open(path, direct_flag | os.O_RDONLY | os.O_SYNC)
> OSError: [Errno 2] No such file or directory:
> '/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8'
> StatusStorageThread::ERROR::2018-01-12
> 12:32:06,049::status_broker::92::ovirt_hosted_engine_ha.broker.status_broker.StatusBroker.Update::(run)
> Failed to read state.
> Traceback (most recent call last):
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/status_broker.py",
> line 88, in run
> self._storage_broker.get_raw_stats()
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",
> line 162, in get_raw_stats
> .format(str(e)))
> RequestError: failed to read metadata: [Errno 2] No such file or directory:
> '/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/85

Re: [ovirt-users] unable to bring up hosted engine after botched 4.2 upgrade

2018-01-12 Thread Martin Sivak
> Can you please stop all hosted engine tooling (

On all hosts I should have added.

Martin

On Fri, Jan 12, 2018 at 3:22 PM, Martin Sivak <msi...@redhat.com> wrote:
>> RequestError: failed to read metadata: [Errno 2] No such file or directory:
>> '/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8'
>>
>>  ls -al
>> /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
>> -rw-rw. 1 vdsm kvm 1028096 Jan 12 09:59
>> /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
>>
>> Is this due to the symlink problem you guys are referring to that was
>> addressed in RC1 or something else?
>
> No, this file is the symlink. It should point to somewhere inside
> /rhev/. I see it is a regular file of about 1 MB (1028096 bytes) in
> your case. That is really interesting.
>
> Can you please stop all hosted engine tooling (ovirt-ha-agent,
> ovirt-ha-broker), move the file (metadata file is not important when
> services are stopped, but better safe than sorry) and restart all
> services again?
>
>> Could there possibly be a permissions
>> problem somewhere?
>
> Maybe, but the file itself looks out of the ordinary. I wonder how it got 
> there.
>
> Best regards
>
> Martin Sivak
>
> On Fri, Jan 12, 2018 at 3:09 PM, Jayme <jay...@gmail.com> wrote:
>> Thanks for the help thus far.  Storage could be related but all other VMs on
>> same storage are running ok.  The storage is mounted via NFS from within one
>> of the three hosts, I realize this is not ideal.  This was setup by a
>> previous admin more as a proof of concept and VMs were put on there that
>> should not have been placed in a proof of concept environment.. it was
>> intended to be rebuilt with proper storage down the road.
>>
>> So the storage is on HOST0 and the other hosts mount NFS
>>
>> cultivar0.grove.silverorange.com:/exports/data  4861742080
>> 1039352832 3822389248  22%
>> /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_data
>> cultivar0.grove.silverorange.com:/exports/iso   4861742080
>> 1039352832 3822389248  22%
>> /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_iso
>> cultivar0.grove.silverorange.com:/exports/import_export 4861742080
>> 1039352832 3822389248  22%
>> /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_import__export
>> cultivar0.grove.silverorange.com:/exports/hosted_engine 4861742080
>> 1039352832 3822389248  22%
>> /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_hosted__engine
>>
>> Like I said, the VM data storage itself seems to be working ok, as all other
>> VMs appear to be running.
>>
>> I'm curious why the broker log says this file is not found when it is
>> correct and I can see the file at that path:
>>
>> RequestError: failed to read metadata: [Errno 2] No such file or directory:
>> '/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8'
>>
>>  ls -al
>> /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
>> -rw-rw. 1 vdsm kvm 1028096 Jan 12 09:59
>> /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
>>
>> Is this due to the symlink problem you guys are referring to that was
>> addressed in RC1 or something else?  Could there possibly be a permissions
>> problem somewhere?
>>
>> Assuming that all three hosts have 4.2 rpms installed and the host engine
>> will not start is it safe for me to update hosts to 4.2 RC1 rpms?   Or
>> perhaps install that repo and *only* update the ovirt HA packages?
>> Assuming that I cannot yet apply the same updates to the inaccessible hosted
>> engine VM.
>>
>> I should also mention one more thing.  I originally upgraded the engine VM
>> first using new RPMS then engine-setup.  It failed due to not being in
>> global maintenance, so I set global maintenance and ran it again, which
>> appeared to complete as intended but never came back up after.  Just in case
>> this might have anything at all to do with what could have happened.
>>
>> Thanks very much again, I very much appreciate the help!
>>
>> - Jayme
>>
>> On Fri, Jan 12, 2018 at 8:44 AM, Simone Tiraboschi <stira...@redhat.com>
>&

Re: [ovirt-users] unable to bring up hosted engine after botched 4.2 upgrade

2018-01-12 Thread Martin Sivak
> RequestError: failed to read metadata: [Errno 2] No such file or directory:
> '/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8'
>
>  ls -al
> /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
> -rw-rw. 1 vdsm kvm 1028096 Jan 12 09:59
> /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
>
> Is this due to the symlink problem you guys are referring to that was
> addressed in RC1 or something else?

No, this file is the symlink. It should point to somewhere inside
/rhev/. I see it is a regular file of about 1 MB (1028096 bytes) in
your case. That is really interesting.

Can you please stop all hosted engine tooling (ovirt-ha-agent,
ovirt-ha-broker), move the file (metadata file is not important when
services are stopped, but better safe than sorry) and restart all
services again?
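
Whether the path is a real symlink or a plain file can be checked
before moving anything. A minimal sketch, assuming the link is expected
to resolve somewhere under /rhev/ as described above;
check_metadata_link is a hypothetical helper, and the expected_root
default comes from this message, not from the oVirt code:

```python
import os


def check_metadata_link(path, expected_root="/rhev/"):
    """Classify `path` as 'missing', 'regular-file', 'wrong-target' or 'ok'."""
    if not os.path.lexists(path):
        return "missing"
    if not os.path.islink(path):
        # A plain file where a symlink should be, as in the ls output above.
        return "regular-file"
    target = os.path.realpath(path)
    root = os.path.realpath(expected_root)
    # Compare resolved paths so relative link targets are handled too.
    if target == root or target.startswith(root.rstrip(os.sep) + os.sep):
        return "ok"
    return "wrong-target"
```

For the path from the broker log, a result of 'regular-file' would match
what the ls output shows, and 'ok' is what a healthy host should report.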

> Could there possibly be a permissions
> problem somewhere?

Maybe, but the file itself looks out of the ordinary. I wonder how it got there.

Best regards

Martin Sivak

On Fri, Jan 12, 2018 at 3:09 PM, Jayme <jay...@gmail.com> wrote:
> Thanks for the help thus far.  Storage could be related but all other VMs on
> same storage are running ok.  The storage is mounted via NFS from within one
> of the three hosts, I realize this is not ideal.  This was setup by a
> previous admin more as a proof of concept and VMs were put on there that
> should not have been placed in a proof of concept environment.. it was
> intended to be rebuilt with proper storage down the road.
>
> So the storage is on HOST0 and the other hosts mount NFS
>
> cultivar0.grove.silverorange.com:/exports/data  4861742080
> 1039352832 3822389248  22%
> /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_data
> cultivar0.grove.silverorange.com:/exports/iso   4861742080
> 1039352832 3822389248  22%
> /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_iso
> cultivar0.grove.silverorange.com:/exports/import_export 4861742080
> 1039352832 3822389248  22%
> /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_import__export
> cultivar0.grove.silverorange.com:/exports/hosted_engine 4861742080
> 1039352832 3822389248  22%
> /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_hosted__engine
>
> Like I said, the VM data storage itself seems to be working ok, as all other
> VMs appear to be running.
>
> I'm curious why the broker log says this file is not found when it is
> correct and I can see the file at that path:
>
> RequestError: failed to read metadata: [Errno 2] No such file or directory:
> '/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8'
>
>  ls -al
> /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
> -rw-rw. 1 vdsm kvm 1028096 Jan 12 09:59
> /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
>
> Is this due to the symlink problem you guys are referring to that was
> addressed in RC1 or something else?  Could there possibly be a permissions
> problem somewhere?
>
> Assuming that all three hosts have 4.2 rpms installed and the host engine
> will not start is it safe for me to update hosts to 4.2 RC1 rpms?   Or
> perhaps install that repo and *only* update the ovirt HA packages?
> Assuming that I cannot yet apply the same updates to the inaccessible hosted
> engine VM.
>
> I should also mention one more thing.  I originally upgraded the engine VM
> first using new RPMS then engine-setup.  It failed due to not being in
> global maintenance, so I set global maintenance and ran it again, which
> appeared to complete as intended but never came back up after.  Just in case
> this might have anything at all to do with what could have happened.
>
> Thanks very much again, I very much appreciate the help!
>
> - Jayme
>
> On Fri, Jan 12, 2018 at 8:44 AM, Simone Tiraboschi <stira...@redhat.com>
> wrote:
>>
>>
>>
>> On Fri, Jan 12, 2018 at 11:11 AM, Martin Sivak <msi...@redhat.com> wrote:
>>>
>>> Hi,
>>>
>>> the hosted engine agent issue might be fixed by restarting
>>> ovirt-ha-broker or updating to newest ovirt-hosted-engine-ha and
>>> -setup. We improved handling of the missing symlink.
>>
>>
>> Available just in oVirt 4.2.1 RC1
>>
>>>
>>>
>>> All the other issues seem to point to some storage p

Re: [ovirt-users] unable to bring up hosted engine after botched 4.2 upgrade

2018-01-12 Thread Martin Sivak
The blockIoTune error should be harmless. It is just the result of a
data check by another component (MOM) that encountered a VM that no
longer exists.

I thought we had squashed all the logs like that, though.

Martin

On Fri, Jan 12, 2018 at 3:12 PM, Jayme <jay...@gmail.com> wrote:
> One more thing to add, I've also been seeing a lot of this in the syslog as
> well:
>
> Jan 12 10:10:49 cultivar2 journal: vdsm jsonrpc.JsonRpcServer ERROR Internal
> server error#012Traceback (most recent call last):#012  File
> "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 606, in
> _handle_request#012res = method(**params)#012  File
> "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 197, in
> _dynamicMethod#012result = fn(*methodArgs)#012  File "", line 2,
> in getAllVmIoTunePolicies#012  File
> "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in
> method#012ret = func(*args, **kwargs)#012  File
> "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1354, in
> getAllVmIoTunePolicies#012io_tune_policies_dict =
> self._cif.getAllVmIoTunePolicies()#012  File
> "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 524, in
> getAllVmIoTunePolicies#012'current_values': v.getIoTune()}#012  File
> "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3481, in
> getIoTune#012result = self.getIoTuneResponse()#012  File
> "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3500, in
> getIoTuneResponse#012res = self._dom.blockIoTune(#012  File
> "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 47, in
> __getattr__#012% self.vmid)#012NotConnectedError: VM
> '4013c829-c9d7-4b72-90d5-6fe58137504c' was not defined yet or was undefined
>
> On Fri, Jan 12, 2018 at 10:09 AM, Jayme <jay...@gmail.com> wrote:
>>
>> Thanks for the help thus far.  Storage could be related but all other VMs
>> on same storage are running ok.  The storage is mounted via NFS from within
>> one of the three hosts, I realize this is not ideal.  This was setup by a
>> previous admin more as a proof of concept and VMs were put on there that
>> should not have been placed in a proof of concept environment.. it was
>> intended to be rebuilt with proper storage down the road.
>>
>> So the storage is on HOST0 and the other hosts mount NFS
>>
>> cultivar0.grove.silverorange.com:/exports/data  4861742080
>> 1039352832 3822389248  22%
>> /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_data
>> cultivar0.grove.silverorange.com:/exports/iso   4861742080
>> 1039352832 3822389248  22%
>> /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_iso
>> cultivar0.grove.silverorange.com:/exports/import_export 4861742080
>> 1039352832 3822389248  22%
>> /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_import__export
>> cultivar0.grove.silverorange.com:/exports/hosted_engine 4861742080
>> 1039352832 3822389248  22%
>> /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_hosted__engine
>>
>> Like I said, the VM data storage itself seems to be working ok, as all
>> other VMs appear to be running.
>>
>> I'm curious why the broker log says this file is not found when it is
>> correct and I can see the file at that path:
>>
>> RequestError: failed to read metadata: [Errno 2] No such file or
>> directory:
>> '/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8'
>>
>>  ls -al
>> /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
>> -rw-rw. 1 vdsm kvm 1028096 Jan 12 09:59
>> /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
>>
>> Is this due to the symlink problem you guys are referring to that was
>> addressed in RC1 or something else?  Could there possibly be a permissions
>> problem somewhere?
>>
>> Assuming that all three hosts have 4.2 rpms installed and the host engine
>> will not start is it safe for me to update hosts to 4.2 RC1 rpms?   Or
>> perhaps install that repo and *only* update the ovirt HA packages?
>> Assuming that I cannot yet apply the same updates to the inaccessible hosted
>> engine VM.
>>
>> I should also mention one more thing.  I originally upgraded the engine VM
>> first using new RPMS then engine-setup.  It failed due to not being in
>> global maintenance, so I set global maintenance and r

Re: [ovirt-users] Some major problems after 4.2 upgrade, could really use some assistance

2018-01-11 Thread Martin Sivak
Hi,

you hit one known issue we already have fixes for (4.1 hosts with 4.2
engine): 
https://gerrit.ovirt.org/#/q/status:open+project:ovirt-hosted-engine-ha+branch:v2.1.z+topic:ovf_42_for_41

You can try hotfixing it by upgrading hosted engine packages to 4.2 or
applying the patches manually and installing python-lxml.

I am not sure what happened to your other VM.

Best regards

Martin Sivak

On Thu, Jan 11, 2018 at 6:15 AM, Jayme <jay...@gmail.com> wrote:
> I performed Ovirt 4.2 upgrade on a 3 host cluster with NFS shared storage.
> The shared storage is mounted from one of the hosts.
>
> I upgraded the hosted engine first, downloading the 4.2 rpm, doing a yum
> update then engine setup which seemed to complete successfully, at the end
> it powered down the hosted VM but it never came back up.  I was unable to
> start it.
>
> I proceeded to upgrade the three hosts, ovirt 4.2 rpm and a full yum update.
> I also rebooted each of the three hosts.
>
> After some time the hosts did come back and almost all of the VMs are
> running again and seem to be working ok with the exception of two:
>
> 1. The hosted VM still will not start, I've tried everything I can think of.
>
> 2. A VM that I know existed is not running and does not appear to exist, I
> have no idea where it is or how to start it.
>
> 1. Hosted engine
>
> From one of the hosts I get a weird error trying to start it:
>
> # hosted-engine --vm-start
> Command VM.getStats with args {'vmID':
> '4013c829-c9d7-4b72-90d5-6fe58137504c'} failed:
> (code=1, message=Virtual machine does not exist: {'vmId':
> u'4013c829-c9d7-4b72-90d5-6fe58137504c'})
>
> From the two other hosts I do not get the same error as above, sometimes it
> appears to start but --vm-status shows errors such as:  Engine status
> : {"reason": "failed liveliness check", "health": "bad", "vm": "up",
> "detail": "Up"}
>
> Seeing these errors in syslog:
>
> Jan 11 01:06:30 host0 libvirtd: 2018-01-11 05:06:30.473+: 1910: error :
> qemuOpenFileAs:3183 : Failed to open file
> '/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/c2dde892-f978-4dfc-a421-c8e04cf387f9/23aa0a66-fa6c-4967-a1e5-fbe47c0cd705':
> No such file or directory
>
> Jan 11 01:06:30 host0 libvirtd: 2018-01-11 05:06:30.473+: 1910: error :
> qemuDomainStorageOpenStat:11492 : cannot stat file
> '/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/c2dde892-f978-4dfc-a421-c8e04cf387f9/23aa0a66-fa6c-4967-a1e5-fbe47c0cd705':
> Bad file descriptor
>
> 2. Missing VM.  virsh -r list on each host does not show the VM at all.  I
> know it existed and is important.  The log on one of the hosts even shows
> that it started it recently then stopped in 10 or so minutes later:
>
> Jan 10 18:47:17 host3 systemd-machined: New machine qemu-9-Berna.
> Jan 10 18:47:17 host3 systemd: Started Virtual Machine qemu-9-Berna.
> Jan 10 18:47:17 host3 systemd: Starting Virtual Machine qemu-9-Berna.
> Jan 10 18:54:45 host3 systemd-machined: Machine qemu-9-Berna terminated.
>
> How can I find out the status of the "Berna" VM and get it running again?
>
> Thanks so much!
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Reply: How power-saving schedule policy applied?

2018-01-09 Thread Martin Sivak
> Set ‘HostInReserve’ to 0. Bingo, cloudstack03 goes down as expected.
>
> But when I launch a new VM I get an error popup dialog which inform me that 
> there isn’t enough memory resource
> to start a new VM. Is it normal way that power saving mechanism works? Can’t 
> power saving policy scheduling module
> wake up a host to launch VM automatically?
>

We decided against this. Booting a host can take a long time, and it
takes some more time before our management agent reports in. We cannot
wait that long when starting a VM. That is why we keep the empty host in
reserve: the VM can be started immediately, and the additional host's
boot time does not matter too much.

> Also, Is there any timeline base power saving policy I can apply? Like from 
> 8’clock AM to 6’clock PM, 10 hosts up,
> 6’clock PM to next 8’clock AM, just keep 3 hosts alive.

We do not have a UI for this, but it is easily done using cron and the
REST API (either directly or using one of the SDKs).
We were even working on it for a while, but it was put on the back
burner since sysadmins know cron well and the UI would have to be
limited anyway.
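
The decision part of such a cron job can be sketched in a few lines of
Python. This is only an illustration under assumptions: the function
names, the 8:00 to 18:00 window and the 10/3 host counts are taken from
the question or made up here, and the actual REST API or SDK calls that
activate, deactivate or power-manage a host are deliberately left out.

```python
def target_hosts_up(hour, day_count=10, night_count=3, day_start=8, day_end=18):
    """Number of hosts that should be up at the given hour (0-23)."""
    return day_count if day_start <= hour < day_end else night_count


def plan(hosts_up, hosts_down, target):
    """Return (to_start, to_stop) so that `target` hosts end up running.

    hosts_up and hosts_down are lists of host names. Only hosts the caller
    considers safe to stop (for example, empty ones) should be offered.
    """
    if len(hosts_up) < target:
        # Not enough hosts running: start as many as are missing.
        return hosts_down[: target - len(hosts_up)], []
    # Enough or too many hosts running: stop everything past the target.
    return [], hosts_up[target:]
```

A crontab entry such as "0 8,18 * * * /usr/local/bin/host-schedule.py"
(a hypothetical script path) would then run the script at both
switch-over times and hand the resulting lists to whatever API client
is used.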

Best regards

--
Martin Sivak
SLA / oVirt

On Tue, Jan 9, 2018 at 8:55 AM, Alex Shen <a...@bill-jc.com> wrote:
>
> Hi Martin, Karli
>
>
>
> Thank a lot for your help.
>
>
>
> According to your instruction, I added two parameters into cluster 
> configuration, as below.
>
> 1)  enable power cycling mechanisms
>
> 2)  reserve one host standing by
>
> My test bed:
>
> 3 hosts which are real servers, cloudstack01/cloudstack03/cloudstack04.
>
> Heavy load on cloudstack01, 21 vms
>
> Light load on cloudstack03, 1 vms
>
> No thing on cloudstack04
>
>
>
> When I shut down the last vm on cloudstack03(because I can’t migrate it to 
> cloudstack01, I guess the load is already exceeding the threshold), the 
> cloudstack04 goes to ‘maintenance’ then goes to ‘down’ state. Yepp, POWER 
> SAVING.
>
>
>
>
>
> I tried further test:
>
> Set ‘HostInReserve’ to 0. Bingo, cloudstack03 goes down as expected.
>
> But when I launch a new VM I get an error popup dialog which inform me that 
> there isn’t enough memory resource to start a new VM. Is it normal way that 
> power saving mechanism works? Can’t power saving policy scheduling module 
> wake up a host to launch VM automatically?
>
>
>
> Also, Is there any timeline base power saving policy I can apply? Like from 
> 8’clock AM to 6’clock PM, 10 hosts up, 6’clock PM to next 8’clock AM, just 
> keep 3 hosts alive.
>
>
>
> Best regards
>
> Alex
>
>
>
>
>
> From: Martin Sivak [mailto:msi...@redhat.com]
> Sent: January 8, 2018 19:43
> To: Karli Sjöberg <ka...@inparadise.se>
> Cc: Alex Shen <a...@bill-jc.com>; users <users@ovirt.org>
> Subject: Re: [ovirt-users] How power-saving schedule policy applied?
>
>
>
> Hi,
>
>
>
> and here is a fresh screenshot just for you :)
>
>
>
> You need to edit the cluster, select scheduling policy tab and add two 
> parameters to your Power saving policy. One enables the power cycling 
> mechanisms and the second one (HostsInReserve) controls how many empty hosts 
> are allowed to stay up. When the host is not empty anymore a new one will be 
> started.
>
>
>
> Best regards
>
>
>
> --
>
> Martin Sivak
>
> SLA / oVirt
>
>
>
> On Mon, Jan 8, 2018 at 12:29 PM, Karli Sjöberg <ka...@inparadise.se> wrote:
>
>
>
>
>
> On 8 Jan 2018 12:07, Alex Shen <a...@bill-jc.com> wrote:
>
> Hi,
>
>
>
> I’m wondering how to apply power-saving schedule in ovirt cluster(version is 
> 4.2.0.2-1.el7.centos)? Is there any instruction or manual with UI snapshots 
> guidance? That will be appreciated.
>
>
>
> I’ve configed all host with ipmilan protocol and test ok. I suppose that 
> power saving policy should converge VMs into hosts one by one. Something like 
> that. From the beginning, all hosts are in ‘power off’ state, if a VM launch, 
> power saving scheduling module should power on one host, and assign VM into 
> that host. With many VMs activated, exceeding the threshold, power saving 
> scheduling module should turn on another host…
>
>
>
> Jepp, that's basically how it works:)
>
>
>
> /K
>
>
>
>
>
> If anyone had played power saving scheduling policy, please share me your 
> scenarios or some hints. Thanks a lot.
>
>
>
> Alex
>
>
>


Re: [ovirt-users] ovirt 4.1 hosted engine: Resize or Change "hosted-engine" Data Domain

2018-01-08 Thread Martin Sivak
Hi,

Simone can you please help here?

Martin

On Mon, Jan 8, 2018 at 2:44 PM, yayo (j) <jag...@gmail.com> wrote:
> Hi,
>
> Sorry for ask it again but the steps are not clear ...
>
> Thank you
>
> 2018-01-07 16:39 GMT+01:00 yayo (j) <jag...@gmail.com>:
>>
>> Hi,
>>
>> Sorry but I needs to migrate from one hosted-engine to another, so, where
>> I can restore backup? Before or after the autoimport triggered?
>>
>> * Create new hosted-engine lun
>> * Backup current hosted-engine
>> * From one node execute hosted-engine --deploy --he-remove-storage-vm
>> --he-remove-hosts
>> Right? And after that?
>>
>> Can you help me to better understand?
>> Thank you!
>>
>> On 03 Jan 2018 14:39, "Martin Sivak" <msi...@redhat.com> wrote:
>>
>> Hi,
>>
>> we do not have any nice procedure to do that. Moving hosted engine to
>> a different storage usually involves backup and restore of the engine
>> database. See for example here:
>> http://lists.ovirt.org/pipermail/users/2017-June/082466.html
>>
>> Best regards
>>
>> --
>> Martin Sivak
>> SLA / oVirt
>>
>> On Wed, Jan 3, 2018 at 12:20 PM, yayo (j) <jag...@gmail.com> wrote:
>> > hi at all,
>> >
>> > We have the "hosted engine" Data domain on a FC LUN too big for only the
>> > hosted-engine so we want to create another little FC LUN and move the
>> > hosted-engine vm on this new LUN and destroy old one ...
>> >
>> > Is there any official workflow or how to to do this operation? Or,
>> > someone
>> > can  guide me?
>> >
>> > Thank you!
>> >
>> >
>>
>>
>
>
>
> --
> Linux User: 369739 http://counter.li.org


Re: [ovirt-users] How power-saving schedule policy applied?

2018-01-08 Thread Martin Sivak
Hi,

and here is a fresh screenshot just for you :)

You need to edit the cluster, select the Scheduling Policy tab and add two
parameters to your power saving policy. One enables the power cycling
mechanism, and the second one (HostsInReserve) controls how many empty
hosts are allowed to stay up. When a reserve host is no longer empty, a
new one will be started.
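
The effect of HostsInReserve can be illustrated with a small Python
model. It is a simplified sketch of the behavior described in this
thread, not the engine's actual scheduling code; the function name
reserve_action and the vm_counts mapping are invented for the example.

```python
def reserve_action(vm_counts, hosts_in_reserve):
    """Decide what a power saving policy would do with empty hosts.

    vm_counts maps each running host name to its number of VMs.
    Returns ("start", n), ("stop", [host, ...]) or ("none", None).
    """
    empty = sorted(h for h, n in vm_counts.items() if n == 0)
    if len(empty) < hosts_in_reserve:
        # Too few empty hosts: bring up enough to refill the reserve.
        return ("start", hosts_in_reserve - len(empty))
    if len(empty) > hosts_in_reserve:
        # More empty hosts than allowed: the extras can be shut down.
        return ("stop", empty[hosts_in_reserve:])
    return ("none", None)
```

With a reserve of 1, the model leaves one empty host running and asks
for another host as soon as that one receives a VM. With a reserve of 0
every empty host is stopped, so a new VM has to fit on the hosts that
are still up.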

Best regards

--
Martin Sivak
SLA / oVirt

On Mon, Jan 8, 2018 at 12:29 PM, Karli Sjöberg <ka...@inparadise.se> wrote:

>
>
> On 8 Jan 2018 12:07, Alex Shen <a...@bill-jc.com> wrote:
>
> Hi,
>
>
>
> I’m wondering how to apply power-saving schedule in ovirt cluster(version
> is 4.2.0.2-1.el7.centos)? Is there any instruction or manual with UI
> snapshots guidance? That will be appreciated.
>
>
>
> I’ve configed all host with ipmilan protocol and test ok. I suppose that
> power saving policy should converge VMs into hosts one by one. Something
> like that. From the beginning, all hosts are in ‘power off’ state, if a VM
> launch, power saving scheduling module should power on one host, and assign
> VM into that host. With many VMs activated, exceeding the threshold, power
> saving scheduling module should turn on another host…
>
>
> Jepp, that's basically how it works:)
>
> /K
>
>
>
> If anyone had played power saving scheduling policy, please share me your
> scenarios or some hints. Thanks a lot.
>
>
>
> Alex
>
>


Re: [ovirt-users] ovirt 4.1 hosted engine: Resize or Change "hosted-engine" Data Domain

2018-01-03 Thread Martin Sivak
Hi,

we do not have any nice procedure to do that. Moving hosted engine to
a different storage usually involves backup and restore of the engine
database. See for example here:
http://lists.ovirt.org/pipermail/users/2017-June/082466.html

Best regards

--
Martin Sivak
SLA / oVirt

On Wed, Jan 3, 2018 at 12:20 PM, yayo (j) <jag...@gmail.com> wrote:
> hi at all,
>
> We have the "hosted engine" Data domain on a FC LUN too big for only the
> hosted-engine so we want to create another little FC LUN and move the
> hosted-engine vm on this new LUN and destroy old one ...
>
> Is there any official workflow or how to to do this operation? Or, someone
> can  guide me?
>
> Thank you!
>


Re: [ovirt-users] Turn on "Nested Virtualization" on the "hosted-engine"

2017-12-27 Thread Martin Sivak
Hi,

it might be a good idea to install vdsm-hook-nestedvt and
vdsm-hook-macspoof on the host as described for example here:
https://community.redhat.com/blog/2013/08/testing-ovirt-3-3-with-nested-kvm/

Those two hooks will configure the kvm module properly and allow
DHCP/PXE to be used from within the nested VM.
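
Whether the kvm module currently has nesting enabled can be read from
sysfs. The Python sketch below assumes an Intel host, where the
parameter file is /sys/module/kvm_intel/parameters/nested (on AMD the
module is kvm_amd); the helper name nested_enabled is made up for
illustration.

```python
def nested_enabled(param_path="/sys/module/kvm_intel/parameters/nested"):
    """Return True if the kvm module reports nested virtualization as on."""
    try:
        with open(param_path) as f:
            # Depending on the kernel, the file contains "Y"/"N" or "1"/"0".
            return f.read().strip() in ("Y", "y", "1")
    except OSError:
        # Module not loaded or parameter not present.
        return False
```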

Best regards

--
Martin Sivak
SLA / oVirt

On Wed, Dec 27, 2017 at 9:05 AM, Michal Skrivanek
<michal.skriva...@redhat.com> wrote:
>
>
> On 25 Dec 2017, at 13:03, Roman Drovalev <drova...@miac.kaluga.ru> wrote:
>
>>I'm not sure I understand the layering - hyper-V on oVirt or vice-versa?
>
> the layering: oVirt host -> Hyper-V_VM on oVirt -> VM on Hyper-V_VM
>
> CentOS based on linux kernel. I manual modified linux kernel parametrs.
>
>
> if you modified the kernel parameters already (which you did, as seen in
> screen shot your cmdline does have nested virt param) then there’s nothing
> else to do in the GUI, your host should support nested virtualization.
> I’m not sure why do you need SR-IOV but that’s not going to work on top of
> hyperv
>
>
> oVirt-engine based on oVirt API. How to manually change the parameters of
> ovirt-engine through API?
>
>>In any case, I assume an additional host is needed, the host hosting the
>> Hosted-Engine VM cannot be changed with those parameters when the HE is up.
>>Y.
>
> It's impossible, the second host is now - Hyper-V with working VM's.
>
> 25.12.2017 12:19, Yaniv Kaul
>
>
>
> On Mon, Dec 25, 2017 at 10:14 AM, Roman Drovalev <drova...@miac.kaluga.ru>
> wrote:
>>
>> Thanks for the quick response.
>>
>> >You probably need an additional host.
>>
>> I'm migrating from Hyper-V at the moment there is no way to add a second
>> host.
>>
>>
>> 25.12.2017 10:43, Yaniv Kaul wrote:
>> > What are you trying to achieve?
>>
>> "Nested Virtualization" . I need run on Hyper-V one VM on the my oVirt
>
>
> I'm not sure I understand the layering - hyper-V on oVirt or vice-versa?
> In any case, I assume an additional host is needed, the host hosting the
> Hosted-Engine VM cannot be changed with those parameters when the HE is up.
> Y.
>
>>
>>
>>
>>
>
>
>
> --
> Best regards, Roman Drovalev
> engineer, GBUZ "MIAC",
> Kaluga Oblast.
> tel. 4842 705 004
>


Re: [ovirt-users] Regarding Ovirt Installation

2017-12-21 Thread Martin Sivak
Hi,

one of the new features of oVirt 4.2 is support for a Replica 1
all-in-one setup using hosted engine and Gluster in hyper-converged mode.

So it should again be possible to use just a single host for
everything. I am not sure we have documentation ready for that,
though.

Best regards

Martin Sivak

On Thu, Dec 21, 2017 at 2:15 PM, Simone Tiraboschi <stira...@redhat.com> wrote:
>
>
> On Thu, Dec 21, 2017 at 1:59 PM, ruth john <gamerangerserve...@gmail.com>
> wrote:
>>
>> Sir, buying nfs storage would cost me a lot. Can't I use it directly on
>> the provided storage?
>
>
> We don't have anymore an all-in-one installation where the engine and vdsm
> runs altogether in the same machine; the proposed replacement is
> hosted-engine which doesn't work with local storage since it's supposed to
> be able to restart the engine VM somewhere else for HA reasons and the local
> storage is against that by definition.
> If you have three machines I'd suggest an hyperconverged gluster deployment
> with replica 3.
>
> If you want to try it on a single machine keep present that NFS in loop-back
> is discouraged so maybe you could try iSCSI or, maybe with a small hack,
> gluster in replica 1 in loopback
>
>
>>
>>
>>
>> On Dec 21, 2017 1:25 PM, "Simone Tiraboschi" <stira...@redhat.com> wrote:
>>
>>
>>
>> On Wed, Dec 20, 2017 at 10:45 PM, ruth john <gamerangerserve...@gmail.com>
>> wrote:
>>>
>>> I am delighted with the interface and other features of the Ovirt but was
>>> never able to install it properly, is that true OVirt doesn't support
>>> Hetzner Dedicated and OVH dedicated?
>>
>>
>> I personally know about a friend who is running it on an a couple of
>> dedicated OVH machines with NFS storage provided by OVH.
>> No idea about Hetzner.
>>
>>
>>>
>>> if not can anyone please help me to install atleast on one to make me
>>> understand where am i doing the mistake.
>>>
>>>
>>
>>
>
>


Re: [ovirt-users] HA Broker fails after 4.2 upgrade

2017-12-21 Thread Martin Sivak
By the way, although we lack the VDSM logs here, this seems to be the
same issue Jason Brooks just reported on this list. Hosted engine is
trying to get storage info from VDSM and gets an error instead.

--
Martin Sivak
SLA / oVirt

On Thu, Dec 21, 2017 at 9:02 AM, Simone Tiraboschi <stira...@redhat.com> wrote:
>
>
> On Thu, Dec 21, 2017 at 5:13 AM, Andy <farkey_2...@yahoo.com> wrote:
>>
>> Hello all,
>>
>> I just upgraded my OVIRT instance to 4.2, the engine completed
>> successfully, however after I upgraded the hosts the HA Broker will not
>> start.  The 2 hosts are running CentOS 7.4, running gluster and CTDB.  The
>> VIPS are up and can be reached from both hosts as well as I can mount the
>> gluster storage.
>>
>> The error from the agent.log:
>>
>> MainThread::INFO::2017-12-20
>> 21:02:19,219::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
>> ovirt-hosted-engine-ha agent 2.2.2 started
>> MainThread::INFO::2017-12-20
>> 21:02:19,346::hosted_engine::243::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname)
>> Found certificate common name: hm3svr01.hm3.loc
>> MainThread::INFO::2017-12-20
>> 21:02:20,478::hosted_engine::525::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
>> Initializing ha-broker connection
>> MainThread::INFO::2017-12-20
>> 21:02:20,482::brokerlink::77::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
>> Starting monitor ping, options {'addr': '192.168.3.1'}
>> MainThread::ERROR::2017-12-20
>> 21:02:20,483::hosted_engine::538::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
>> Failed to start necessary monitors
>> MainThread::ERROR::2017-12-20
>> 21:02:20,485::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
>> Traceback (most recent call last):
>>   File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 131, in _run_agent
>> return action(he)
>>   File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 55, in action_proper
>> return he.start_monitoring()
>>   File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 416, in start_monitoring
>> self._initialize_broker()
>>   File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 535, in _initialize_broker
>> m.get('options', {}))
>>   File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
>> line 83, in start_monitor
>> .format(type, options, e))
>> RequestError: Failed to start monitor ping, options {'addr':
>> '192.168.x.x'}: [Errno 2] No such file or directory
>
>
> This simply means that the broker is not ready.
>
>>
>>
>>
>> The broker.log:
>>
>> MainThread::INFO::2017-12-20
>> 23:06:19,405::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>> Finished loading submonitors
>> MainThread::INFO::2017-12-20
>> 23:06:20,324::storage_backends::346::ovirt_hosted_engine_ha.lib.storage_backends::(connect)
>> Connecting the storage
>> MainThread::INFO::2017-12-20
>> 23:06:20,325::storage_server::252::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
>> Connecting storage server
>> MainThread::INFO::2017-12-20
>> 23:06:20,849::storage_server::259::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
>> Connecting storage server
>> MainThread::WARNING::2017-12-20
>> 23:06:20,913::storage_broker::96::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__)
>> Can't connect vdsm storage: Connection to storage server failed
>> MainThread::INFO::2017-12-20
>> 23:06:22,087::broker::45::ovirt_hosted_engine_ha.broker.broker.Broker::(run)
>> ovirt-hosted-engine-ha broker 2.2.2 started
>> MainThread::INFO::2017-12-20
>> 23:06:22,088::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>> Searching for submonitors in
>> /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
>> MainThread::INFO::2017-12-20
>> 23:06:22,089::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>> Loaded submonitor cpu-load
>> MainThread::INFO::2017-12-20
>> 23:06:22,093::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>

Re: [ovirt-users] Debugging why hosted engine flips between EngineUp and EngineBadHealth

2017-12-13 Thread Martin Sivak
Hi,

I am afraid we do not have logs that go that deep into the stack. DNS
resolution issues will definitely affect both the notification system
(unless it uses a localhost SMTP server) and the engine status checks
(because we use the FQDN).
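
Since the hosted engine logs do not show resolver delays, one option is
to time the lookup of the engine FQDN from the host itself. The
following Python 3 sketch uses only the standard library. The function
name time_lookup and the injectable resolver argument are illustrative
choices, not part of any oVirt tool.

```python
import socket
import time


def time_lookup(fqdn, resolver=socket.getaddrinfo):
    """Resolve fqdn once and return (seconds_taken, succeeded)."""
    start = time.monotonic()
    try:
        resolver(fqdn, None)
        ok = True
    except socket.gaierror:
        # Name resolution failed (timeout, NXDOMAIN, no server reachable).
        ok = False
    return time.monotonic() - start, ok
```

Running this in a loop and logging any lookup that fails or takes
several seconds would give timestamps that can be compared with the
EngineUp to EngineBadHealth transitions.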

Best regards

Martin

On Wed, Dec 13, 2017 at 3:15 PM, Luca 'remix_tj' Lorenzetto <
lorenzetto.l...@gmail.com> wrote:

> Hello,
>
> Today i started troubleshooting more in depth on dns requests and exactly
> while i was looking at tcpdump an event of EngineUp -> EngineBadHealth
> happened.
>
> Looking at the dns requests i see this:
>
> [...]
> 14:30:35.909201 IP kvmhost01.intranet.company.it.55654 >
> dns.company.it.53: 34102+ A? engine01.intranet.company.it. (54)
> 14:30:35.909215 IP kvmhost01.intranet.company.it.55654 >
> dns.company.it.53: 6242+ ? engine01.intranet.company.it. (54)
> 14:30:40.914285 IP kvmhost01.intranet.company.it.55654 >
> dns.company.it.53: 34102+ A? engine01.intranet.company.it. (54)
> 14:30:40.914316 IP kvmhost01.intranet.company.it.55654 >
> dns.company.it.53: 6242+ ? engine01.intranet.company.it. (54)
> 14:30:45.918306 IP kvmhost01.intranet.company.it.54885 >
> dns.company.it.53: 60263+ A? engine01.intranet.company.it.
> intranet.company.it. (74)
> 14:30:45.918329 IP kvmhost01.intranet.company.it.54885 >
> dns.company.it.53: 18681+ ? engine01.intranet.company.it.
> intranet.company.it. (74)
> 14:30:50.920376 IP kvmhost01.intranet.company.it.54885 >
> dns.company.it.53: 60263+ A? engine01.intranet.company.it.
> intranet.company.it. (74)
> 14:30:50.920411 IP kvmhost01.intranet.company.it.54885 >
> dns.company.it.53: 18681+ ? engine01.intranet.company.it.
> intranet.company.it. (74)
> 14:30:56.044242 IP kvmhost01.intranet.company.it.58319 >
> dns.company.it.53: 28413+ A? engine01.intranet.company.it. (54)
> 14:30:56.044267 IP kvmhost01.intranet.company.it.58319 >
> dns.company.it.53: 29680+ ? engine01.intranet.company.it. (54)
> 14:31:01.049761 IP kvmhost01.intranet.company.it.58319 >
> dns.company.it.53: 28413+ A? engine01.intranet.company.it. (54)
> 14:31:01.049777 IP kvmhost01.intranet.company.it.58319 >
> dns.company.it.53: 29680+ ? engine01.intranet.company.it. (54)
> 14:31:06.052635 IP kvmhost01.intranet.company.it.58093 >
> dns.company.it.53: 24807+ A? engine01.intranet.company.it.
> intranet.company.it. (74)
> 14:31:06.052649 IP kvmhost01.intranet.company.it.58093 >
> dns.company.it.53: 53745+ ? engine01.intranet.company.it.
> intranet.company.it. (74)
> 14:31:11.057724 IP kvmhost01.intranet.company.it.58093 >
> dns.company.it.53: 24807+ A? engine01.intranet.company.it.
> intranet.company.it. (74)
> 14:31:11.057745 IP kvmhost01.intranet.company.it.58093 >
> dns.company.it.53: 53745+ ? engine01.intranet.company.it.
> intranet.company.it. (74)
> 14:31:16.175204 IP kvmhost01.intranet.company.it.44950 >
> dns.company.it.53: 63680+ A? engine01.intranet.company.it. (54)
> 14:31:16.175225 IP kvmhost01.intranet.company.it.44950 >
> dns.company.it.53: 15726+ ? engine01.intranet.company.it. (54)
> 14:31:19.670746 IP kvmhost01.intranet.company.it.54689 >
> dns.company.it.53: 40999+ A? kvmsvilca01.intranet.company.it. (49)
> 14:31:21.180295 IP kvmhost01.intranet.company.it.44950 >
> dns.company.it.53: 63680+ A? engine01.intranet.company.it. (54)
> 14:31:21.180337 IP kvmhost01.intranet.company.it.44950 >
> dns.company.it.53: 15726+ ? engine01.intranet.company.it. (54)
> 14:31:23.771959 IP kvmhost01.intranet.company.it.53741 >
> dns.company.it.53: 1707+ A? internalmx.intranet.company.it. (48)
> [...]
>
> The last DNS request succeeds and gets the MX address, and immediately
> afterwards I get the email reporting the status change.
>
> This is clearly an issue with name resolution, but that's not evident
> from the broker.log file. The only message about it that I get is:
>
> Thread-16::DEBUG::2017-12-13 14:31:23,657::monitor::126::ovirt_hosted_engine_ha.broker.monitor.Monitor::(get_value) Submonitor engine-health id 139653412040592 current value: {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "up"}
> Thread-16::DEBUG::2017-12-13 14:31:23,657::listener::170::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Response: success {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "up"}
>
>
> But around those messages I see no sign of errors on DNS queries or
> similar. Do I need to check other log files?
>
> Luca
>
>
> On Mon, Dec 11, 2017 at 3:34 PM, Luca 'remix_tj' Lorenzetto <
> lorenzetto.l...@gmail.com> wrote:
>
>> Hi Martin, Hi all,
>>
>> *some minutes* have passed and I have the piece of log I'm looking at.
>>
>>
>>  broker.log-upbadup
>>
>>
>>
>
>
> --
> "It is absurd to employ men of excellent intelligence to perform
> calculations that could be entrusted to anyone if machines were
> used"
> Gottfried Wilhelm 

Re: [ovirt-users] 4-2rc hosted-engine don't boot error:cannot allocate kernel buffer

2017-12-11 Thread Martin Sivak
Hi,

we also have a proper fix now and will release it with the next RC build.

Best regards

--
Martin Sivak
SLA / oVirt

On Mon, Dec 11, 2017 at 12:10 PM, Maton, Brett <mat...@ltresources.co.uk>
wrote:

> Really short version (I can't find the link to the oVirt doc at the
> moment):
>
> Put the hosted engine into global maintenance and power off the VM
> (hosted-engine command).
>
> On one of your physical hosts, make a copy of the HE config and update
> the memory:
>
> cp /var/run/ovirt-hosted-engine-ha/vm.conf .
> vim vm.conf
>
> Then start the hosted engine with the new config:
>
>
> hosted-engine --vm-start --vm-conf=./vm.conf
>
>
>
> On 11 December 2017 at 10:36, Roberto Nunin <robnu...@gmail.com> wrote:
>
>>
>>
>> 2017-12-11 10:32 GMT+01:00 Maton, Brett <mat...@ltresources.co.uk>:
>>
>>> Hi Roberto can you check how much RAM is allocated to the HE VM ?
>>>
>>>
>>> virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf
>>>
>>> virsh # dominfo HostedEngine
>>>
>>>
>>> The last update I did seems to have changed the HE RAM from 4GB to 4MB!
>>>
>>
>>
>> Yes, you're right.
>>
>> virsh # dominfo HostedEngine
>> Id:             191
>> Name:           HostedEngine
>> UUID:           6831dd96-af48-4673-ac98-f1b9ba60754b
>> OS Type:        hvm
>> State:          running
>> CPU(s):         4
>> CPU time:       9053.7s
>> Max memory:     4096 KiB
>> Used memory:    4096 KiB
>> Persistent:     yes
>> Autostart:      disable
>> Managed save:   no
>> Security model: selinux
>> Security DOI:   0
>> Security label: system_u:system_r:svirt_t:s0:c201,c408 (enforcing)
>>
>>
>>>
>>> On 11 December 2017 at 09:08, Simone Tiraboschi <stira...@redhat.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Dec 11, 2017 at 9:47 AM, Roberto Nunin <robnu...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello all
>>>>>
>>>>> during the weekend, I re-tried deploying my 4.2_rc lab.
>>>>> Everything was fine, apart from the fact that hosts 2 and 3 weren't
>>>>> imported. I had to add them to the cluster manually, with the NEW
>>>>> function. After this, the Gluster volumes were added to the
>>>>> environment without problems.
>>>>>
>>>>> The subsequent engine deploy on nodes 2 and 3 ended with OK status.
>>>>>
>>>>> Migrating the HE from host 1 to host 2 worked fine, and so did
>>>>> migrating it from host 2 to host 3.
>>>>>
>>>>> After these two attempts, there was no way to migrate the HE back to
>>>>> any host. I tried setting maintenance mode to global and rebooting the
>>>>> HE, and now I'm in the same condition reported below, no longer able
>>>>> to boot the HE.
>>>>>
>>>>> Here's hosted-engine --vm-status:
>>>>>
>>>>> !! Cluster is in GLOBAL MAINTENANCE mode !!
>>>>>
>>>>>
>>>>>
>>>>> --== Host 1 status ==--
>>>>>
>>>>> conf_on_shared_storage             : True
>>>>> Status up-to-date                  : True
>>>>> Hostname                           : aps-te61-mng.example.com
>>>>> Host ID                            : 1
>>>>> Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
>>>>> Score                              : 3400
>>>>> stopped                            : False
>>>>> Local maintenance                  : False
>>>>> crc32                              : 7dfc420b
>>>>> local_conf_timestamp               : 181953
>>>>> Host timestamp                     : 181952
>>>>> Extra metadata (valid at timestamp):
>>>>> metadata_parse_version=1
>>>>> metadata_feature_version=1
>>>>> timestamp=181952 (Mon Dec 11 09:21:46 2017)
>>>>> host-id=1
>>>>> score=3400
>>>>> vm_conf_refresh_time=181953 (Mon Dec 11 09:21:47 2017)
>>>>> conf_on_shared_storage=True
>>>>> maintenance=False
>>>>> state=GlobalMaintenance
>>>>> stopped=False
>>>>>

Re: [ovirt-users] oVirt self-Hosted engine- Hosted Engine failover issue

2017-12-04 Thread Martin Sivak
Hi,

there are two major sources of information you need to debug this. Both
come from the hosts themselves:

- hosted-engine --vm-status - the current status of the cluster; it should
  be the same on both hosts, but can differ if something is wrong with
  the synchronization
- /var/log/ovirt-hosted-engine-ha/{agent.log,broker.log} - the log files

The usual reasons for migrating (well, stopping and starting) the VM are
problems pinging the configured gateway or a crash of the VM.
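
To compare what each host reports, the "Engine status" field of the
hosted-engine --vm-status output can be pulled out mechanically. The
following is only a minimal Python sketch and not part of oVirt. It assumes
the plain-text layout seen in status dumps posted on this list (a
"Hostname : ..." line followed later by "Engine status : {json}" on a
single line); the function name engine_status_by_host is made up for
illustration.

```python
import json


def engine_status_by_host(vm_status_text):
    """Map each hostname to its parsed "Engine status" value.

    Assumes (not verified against every version) that the status text
    uses "key : value" lines and that "Hostname" precedes "Engine status"
    within each host section.
    """
    statuses = {}
    hostname = None
    for line in vm_status_text.splitlines():
        # Split on the first colon only; the JSON value contains more colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        value = value.strip()
        if key == "Hostname":
            hostname = value
        elif key == "Engine status" and hostname is not None:
            try:
                statuses[hostname] = json.loads(value)
            except ValueError:
                # Keep the raw text if the value is not JSON.
                statuses[hostname] = value
    return statuses
```

Feeding it the saved output from both hosts and comparing the resulting
dictionaries is one way to spot the synchronization differences mentioned
above.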

Best regards

Martin Sivak

On Fri, Dec 1, 2017 at 4:36 AM, Terry hey <recreati...@gmail.com> wrote:
> Hello all,
>
> I created two hosts in the Default cluster. The Hosted Engine was installed
> on host1. Then I tried to live migrate the hosted engine from host1 to
> host2 and it was OK. However, the hosted engine suddenly shut down for a
> while. Then, when the hosted engine worked again, I found that it was
> located on host1.
>
> The problem is: since there are no VMs on these two hosts except the
> hosted engine VM, why would the hosted engine suddenly shut down and
> automatically migrate to host1?
>
> Since I am new to the oVirt self-hosted engine: if you want to read some
> logs, could you give me the log path?
>
> I really appreciate your help.
>
> Regards,
> Terry
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>


Re: [ovirt-users] Debugging why hosted engine flips between EngineUp and EngineBadHealth

2017-12-04 Thread Martin Sivak
Hi,

please attach the log. You can grep out the connected / disconnected lines.

Look for engine health monitor lines.
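
As a rough illustration of that filtering, here is a small Python sketch.
It is an assumption-laden helper, not an oVirt tool: the health markers
("engine-health", "engine_health") are taken from broker.log excerpts
posted in this thread, while the noise marker "connected" (which also
matches "disconnected") is only a guess at the wording of the connection
messages and may need adjusting.

```python
def health_monitor_lines(lines,
                         health_markers=("engine-health", "engine_health"),
                         noise_markers=("connected",)):
    """Return log lines that mention the engine health monitor.

    Lines containing any noise marker are dropped first, then only lines
    containing a health marker are kept. Matching is case-insensitive.
    Both marker tuples are assumptions and can be overridden.
    """
    selected = []
    for line in lines:
        lowered = line.lower()
        if any(marker in lowered for marker in noise_markers):
            continue
        if any(marker in lowered for marker in health_markers):
            selected.append(line.rstrip("\n"))
    return selected
```

It can be applied to an open file object for
/var/log/ovirt-hosted-engine-ha/broker.log, since iterating over a file
yields its lines.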

Martin

On Sat, Dec 2, 2017 at 5:10 PM, Luca 'remix_tj' Lorenzetto
<lorenzetto.l...@gmail.com> wrote:
> Hello,
>
> I had several switches between EngineUp and EngineBadHealth today with
> broker.log at DEBUG level. Where should I start to identify the root
> cause? The log is somewhat chatty at this level.
>
> Luca
>
> On Fri, Dec 1, 2017 at 1:24 PM, Martin Sivak <msi...@redhat.com> wrote:
>> Hi,
>>
>>> [logger_root]
>>> level=INFO
>>
>>> [handler_logfile]
>>> level=DEBUG
>>
>>> Seems already set. The file broker.log already contains DEBUG messages,
>>> but syslog does not (and this is good). What about logger_root?
>>
>> Yeah, I think you should change that one as well to get full debug
>> logging. The handler level does nothing if the messages never reach the
>> handler, and a root logger at level=INFO, as in the configuration you
>> have, should not let DEBUG messages through.
>>
>> Best regards
>>
>> Martin
>>
>>> Luca
>>>
>>> On Fri, Dec 1, 2017 at 12:29 PM, Martin Sivak <msi...@redhat.com> wrote:
>>>> Hi,
>>>>
>>>> can you please enable DEBUG logging and then attach broker.log once the
>>>> problem reproduces? See /etc/ovirt-hosted-engine-ha/broker-log.conf for
>>>> the place to set it (do not forget to restart ovirt-ha-agent and
>>>> ovirt-ha-broker afterwards).
>>>>
>>>> Name resolution issues might indeed be the cause, because the
>>>> broker queries a health endpoint over HTTP. If notifications
>>>> failed because of an unresolvable name, there is a high chance
>>>> the same happens to the health request every now and then.
>>>>
>>>> Best regards
>>>>
>>>> Martin Sivak
>>>>
>>>> On Fri, Dec 1, 2017 at 10:50 AM, Luca 'remix_tj' Lorenzetto
>>>> <lorenzetto.l...@gmail.com> wrote:
>>>>> Hi all,
>>>>>
>>>>> For some days now my hosted-engine environments (one RHEV 4.0.7, one
>>>>> oVirt 4.1.7) have kept sending mails about changes between EngineUp and
>>>>> EngineBadHealth.
>>>>>
>>>>> This is pretty annoying and I'm not able to find the root cause.
>>>>>
>>>>> The only issue I've seen on the hosts is this error about sending
>>>>> mails, which appears at random times.
>>>>>
>>>>> Thread-1::ERROR::2017-12-01 03:05:05,084::notifications::39::ovirt_hosted_engine_ha.broker.notifications.Notifications::(send_email) [Errno -2] Name or service not known
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/notifications.py", line 26, in send_email
>>>>>     timeout=float(cfg["smtp-timeout"]))
>>>>>   File "/usr/lib64/python2.7/smtplib.py", line 255, in __init__
>>>>>     (code, msg) = self.connect(host, port)
>>>>>   File "/usr/lib64/python2.7/smtplib.py", line 315, in connect
>>>>>     self.sock = self._get_socket(host, port, self.timeout)
>>>>>   File "/usr/lib64/python2.7/smtplib.py", line 290, in _get_socket
>>>>>     return socket.create_connection((host, port), timeout)
>>>>>   File "/usr/lib64/python2.7/socket.py", line 553, in create_connection
>>>>>     for res in getaddrinfo(host, port, 0, SOCK_STREAM):
>>>>> gaierror: [Errno -2] Name or service not known
>>>>> Thread-6::WARNING::2017-12-01 03:05:05,427::engine_health::130::engine_health.CpuLoadNoEngine::(action) bad health status: Hosted Engine is not up!
>>>>>
>>>>> There are no errors in the engine logs and all the API queries done by
>>>>> ovirt-hosted-engine-ha return HTTP code 200.
>>>>>
>>>>> I suspect the switch between EngineUp and EngineBadHealth status could
>>>>> be due to DNS resolution issues, but there is no clear message in
>>>>> the log showing this, which doesn't help our netadmins to take
>>>>> traces.
>>>>>
>>>>> Is there a way to increase the verbosity of broker.log and agent.log?
>>>>>
>>>>> Luca
>>>

Re: [ovirt-users] Debugging why hosted engine flips between EngineUp and EngineBadHealth

2017-12-01 Thread Martin Sivak
Hi,

> [logger_root]
> level=INFO

> [handler_logfile]
> level=DEBUG

> Seems already set. The file broker.log already contains DEBUG messages,
> but syslog does not (and this is good). What about logger_root?

Yeah, I think you should change that one as well to get full debug
logging. The handler level does nothing if the messages never reach the
handler, and a root logger at level=INFO, as in the configuration you
have, should not let DEBUG messages through.
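
The interaction between the two levels can be reproduced with Python's
standard logging module. This is only a sketch under the assumption that
broker-log.conf is consumed by that module (the [logger_root] and
[handler_logfile] section names suggest so); the logger name "demo_root"
and the helper emitted_levels are made up for the demonstration.

```python
import io
import logging


def emitted_levels(logger_level):
    """Return the level names that reach a handler set to DEBUG."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setLevel(logging.DEBUG)          # like [handler_logfile] level=DEBUG
    handler.setFormatter(logging.Formatter("%(levelname)s"))

    # Standalone logger standing in for the root logger (hypothetical name).
    logger = logging.Logger("demo_root")
    logger.setLevel(logger_level)            # like [logger_root] level=...
    logger.addHandler(handler)

    logger.debug("debug message")
    logger.info("info message")
    return stream.getvalue().split()


print(emitted_levels(logging.INFO))   # ['INFO']
print(emitted_levels(logging.DEBUG))  # ['DEBUG', 'INFO']
```

With the logger at INFO the DEBUG record is discarded before any handler
sees it, so the handler's own DEBUG level never comes into play.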

Best regards

Martin

> Luca
>
> On Fri, Dec 1, 2017 at 12:29 PM, Martin Sivak <msi...@redhat.com> wrote:
>> Hi,
>>
>> can you please enable DEBUG logging and then attach broker.log once the
>> problem reproduces? See /etc/ovirt-hosted-engine-ha/broker-log.conf for
>> the place to set it (do not forget to restart ovirt-ha-agent and
>> ovirt-ha-broker afterwards).
>>
>> Name resolution issues might indeed be the cause, because the
>> broker queries a health endpoint over HTTP. If notifications
>> failed because of an unresolvable name, there is a high chance
>> the same happens to the health request every now and then.
>>
>> Best regards
>>
>> Martin Sivak
>>
>> On Fri, Dec 1, 2017 at 10:50 AM, Luca 'remix_tj' Lorenzetto
>> <lorenzetto.l...@gmail.com> wrote:
>>> Hi all,
>>>
>>> For some days now my hosted-engine environments (one RHEV 4.0.7, one
>>> oVirt 4.1.7) have kept sending mails about changes between EngineUp and
>>> EngineBadHealth.
>>>
>>> This is pretty annoying and I'm not able to find the root cause.
>>>
>>> The only issue I've seen on the hosts is this error about sending
>>> mails, which appears at random times.
>>>
>>> Thread-1::ERROR::2017-12-01 03:05:05,084::notifications::39::ovirt_hosted_engine_ha.broker.notifications.Notifications::(send_email) [Errno -2] Name or service not known
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/notifications.py", line 26, in send_email
>>>     timeout=float(cfg["smtp-timeout"]))
>>>   File "/usr/lib64/python2.7/smtplib.py", line 255, in __init__
>>>     (code, msg) = self.connect(host, port)
>>>   File "/usr/lib64/python2.7/smtplib.py", line 315, in connect
>>>     self.sock = self._get_socket(host, port, self.timeout)
>>>   File "/usr/lib64/python2.7/smtplib.py", line 290, in _get_socket
>>>     return socket.create_connection((host, port), timeout)
>>>   File "/usr/lib64/python2.7/socket.py", line 553, in create_connection
>>>     for res in getaddrinfo(host, port, 0, SOCK_STREAM):
>>> gaierror: [Errno -2] Name or service not known
>>> Thread-6::WARNING::2017-12-01 03:05:05,427::engine_health::130::engine_health.CpuLoadNoEngine::(action) bad health status: Hosted Engine is not up!
>>>
>>> There are no errors in the engine logs and all the API queries done by
>>> ovirt-hosted-engine-ha return HTTP code 200.
>>>
>>> I suspect the switch between EngineUp and EngineBadHealth status could
>>> be due to DNS resolution issues, but there is no clear message in
>>> the log showing this, which doesn't help our netadmins to take
>>> traces.
>>>
>>> Is there a way to increase the verbosity of broker.log and agent.log?
>>>
>>> Luca
>>>
>>> --
>>> "It is absurd to employ men of excellent intelligence to perform
>>> calculations that could be entrusted to anyone if machines were
>>> used"
>>> Gottfried Wilhelm von Leibnitz, Philosopher and Mathematician (1646-1716)
>>>
>>> "The Internet is the largest library in the world.
>>> But the problem is that the books are all scattered on the floor"
>>> John Allen Paulos, Mathematician (1945-living)
>>>
>>> Luca 'remix_tj' Lorenzetto, http://www.remixtj.net , 
>>> <lorenzetto.l...@gmail.com>
>
>
>


Re: [ovirt-users] Debugging why hosted engine flips between EngineUp and EngineBadHealth

2017-12-01 Thread Martin Sivak
Hi,

can you please enable DEBUG logging and then attach broker.log once the
problem reproduces? See /etc/ovirt-hosted-engine-ha/broker-log.conf for
the place to set it (do not forget to restart ovirt-ha-agent and
ovirt-ha-broker afterwards).

Name resolution issues might indeed be the cause, because the
broker queries a health endpoint over HTTP. If notifications
failed because of an unresolvable name, there is a high chance
the same happens to the health request every now and then.
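
One way to check for intermittent resolution failures independently of the
broker is to resolve the name repeatedly from the host and record each
outcome. The sketch below is an illustration only, not something the HA
services ship. It uses the same socket.getaddrinfo call that appears in the
traceback; the function name probe and its attempts, delay and resolver
parameters are hypothetical (resolver exists only so the lookup can be
replaced in tests).

```python
import socket
import time


def probe(hostname, attempts=3, delay=1.0, resolver=socket.getaddrinfo):
    """Resolve hostname several times and return (ok, detail) per attempt.

    detail is a sorted list of addresses on success, or the error text
    (for example "[Errno -2] Name or service not known") on failure.
    """
    results = []
    for attempt in range(attempts):
        try:
            infos = resolver(hostname, 80, 0, socket.SOCK_STREAM)
            addresses = sorted({info[4][0] for info in infos})
            results.append((True, addresses))
        except socket.gaierror as exc:
            results.append((False, str(exc)))
        if attempt < attempts - 1:
            time.sleep(delay)
    return results
```

Running something like probe("engine.example.com", attempts=60, delay=5)
on the host during a period of flapping (the hostname here is a
placeholder) would show whether lookups fail at the same moments the
status changes.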

Best regards

Martin Sivak

On Fri, Dec 1, 2017 at 10:50 AM, Luca 'remix_tj' Lorenzetto
<lorenzetto.l...@gmail.com> wrote:
> Hi all,
>
> For some days now my hosted-engine environments (one RHEV 4.0.7, one
> oVirt 4.1.7) have kept sending mails about changes between EngineUp and
> EngineBadHealth.
>
> This is pretty annoying and I'm not able to find the root cause.
>
> The only issue I've seen on the hosts is this error about sending
> mails, which appears at random times.
>
> Thread-1::ERROR::2017-12-01 03:05:05,084::notifications::39::ovirt_hosted_engine_ha.broker.notifications.Notifications::(send_email) [Errno -2] Name or service not known
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/notifications.py", line 26, in send_email
>     timeout=float(cfg["smtp-timeout"]))
>   File "/usr/lib64/python2.7/smtplib.py", line 255, in __init__
>     (code, msg) = self.connect(host, port)
>   File "/usr/lib64/python2.7/smtplib.py", line 315, in connect
>     self.sock = self._get_socket(host, port, self.timeout)
>   File "/usr/lib64/python2.7/smtplib.py", line 290, in _get_socket
>     return socket.create_connection((host, port), timeout)
>   File "/usr/lib64/python2.7/socket.py", line 553, in create_connection
>     for res in getaddrinfo(host, port, 0, SOCK_STREAM):
> gaierror: [Errno -2] Name or service not known
> Thread-6::WARNING::2017-12-01 03:05:05,427::engine_health::130::engine_health.CpuLoadNoEngine::(action) bad health status: Hosted Engine is not up!
>
> There are no errors in the engine logs and all the API queries done by
> ovirt-hosted-engine-ha return HTTP code 200.
>
> I suspect the switch between EngineUp and EngineBadHealth status could
> be due to DNS resolution issues, but there is no clear message in
> the log showing this, which doesn't help our netadmins to take
> traces.
>
> Is there a way to increase the verbosity of broker.log and agent.log?
>
> Luca
>


Re: [ovirt-users] Time management in self hosted engine vm

2017-11-28 Thread Martin Sivak
Hi,

- I would not recommend changing the hw clock timezone
- The OS timezone can be freely changed afaik (using any tool possible)
- I do not see how enabling time sync would break anything either

My development machines run the engine in the CET timezone and the hosts
in UTC. I have never seen any issues.

Martin Sivak

On Tue, Nov 28, 2017 at 4:43 PM, Gianluca Cecchi
<gianluca.cec...@gmail.com> wrote:
> Hello,
> I have a self hosted engine HCI environment that was born in 4.0.5 through
> the ansible/gdeploy mechanism and then gradually updated to 4.1.7
> Now the engine and host are at CentOS 7.4 level. Inside the gdeploy jobs the
> engine vm was installed through the appliance at the 4.0.5 time.
> I see in web admin gui that hw clock of engine vm is configured as UTC (so
> far so good) and the same is inside the OS. And I would like to change it to
> UTC+1 like the 3 hosts
> I also see that neither chronyd nor ntpd are installed on engine vm.
> Can I fix the situation the normal os level way:
>
> - timedatectl set-timezone "Europe/Rome"
>
> - install and configure chronyd
> or better not to configure time sync in engine vm?
>
> Thanks,
> Gianluca
>

