[ovirt-users] ILO2 Fencing

2018-03-28 Thread TomK

Hey Guys,

I've tested my ILO2 fence from the ovirt engine CLI and that works:

fence_ilo2 -a 192.168.0.37 -l  --password="" --ssl-insecure --tls1.0 -v -o status


The UI gives me:

Test failed: Failed to run fence status-check on host 
'ph-host01.my.dom'.  No other host was available to serve as proxy for 
the operation.


Going to add a second host in a bit, but is there any way to get this 
working with just one host?  I'm only adding the one host to oVirt for a 
POC we are doing at the moment, but the UI forces me to adjust Power 
Management settings before proceeding.


Also:

2018-03-28 02:04:15,183-04 WARN 
[org.ovirt.engine.core.bll.network.NetworkConfigurator] 
(EE-ManagedThreadFactory-engine-Thread-335) [2d691be9] Failed to find a 
valid interface for the management network of host ph-host01.my.dom. If 
the interface br0 is a bridge, it should be torn-down manually.
2018-03-28 02:04:15,184-04 ERROR 
[org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] 
(EE-ManagedThreadFactory-engine-Thread-335) [2d691be9] Exception: 
org.ovirt.engine.core.bll.network.NetworkConfigurator$NetworkConfiguratorException: 
Interface br0 is invalid for management network



I have these defined as follows, but it's not clear what it is expecting:

[root@ph-host01 ~]# ip a
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
   valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
   valid_lft forever preferred_lft forever
2: eth0:  mtu 1500 qdisc mq master bond0 state UP qlen 1000
link/ether 78:e7:d1:8c:b1:ba brd ff:ff:ff:ff:ff:ff
3: eth1:  mtu 1500 qdisc mq master bond0 state DOWN qlen 1000
link/ether 78:e7:d1:8c:b1:ba brd ff:ff:ff:ff:ff:ff
4: eth2:  mtu 1500 qdisc mq master bond0 state DOWN qlen 1000
link/ether 78:e7:d1:8c:b1:ba brd ff:ff:ff:ff:ff:ff
5: eth3:  mtu 1500 qdisc mq master bond0 state DOWN qlen 1000
link/ether 78:e7:d1:8c:b1:ba brd ff:ff:ff:ff:ff:ff
21: bond0:  mtu 1500 qdisc noqueue master br0 state UP qlen 1000
link/ether 78:e7:d1:8c:b1:ba brd ff:ff:ff:ff:ff:ff
inet6 fe80::7ae7:d1ff:fe8c:b1ba/64 scope link
   valid_lft forever preferred_lft forever
23: ;vdsmdummy;:  mtu 1500 qdisc noop state DOWN qlen 1000
link/ether fe:69:c7:50:0d:dd brd ff:ff:ff:ff:ff:ff
24: br0:  mtu 1500 qdisc noqueue state UP qlen 1000
link/ether 78:e7:d1:8c:b1:ba brd ff:ff:ff:ff:ff:ff
inet 192.168.0.39/23 brd 192.168.1.255 scope global br0
   valid_lft forever preferred_lft forever
inet6 fe80::7ae7:d1ff:fe8c:b1ba/64 scope link
   valid_lft forever preferred_lft forever
[root@ph-host01 ~]# cd /etc/sysconfig/network-scripts/
[root@ph-host01 network-scripts]# cat ifcfg-br0
DEVICE=br0
TYPE=Bridge
BOOTPROTO=none
IPADDR=192.168.0.39
NETMASK=255.255.254.0
GATEWAY=192.168.0.1
ONBOOT=yes
DELAY=0
USERCTL=no
DEFROUTE=yes
NM_CONTROLLED=no
DOMAIN="my.dom nix.my.dom"
SEARCH="my.dom nix.my.dom"
HOSTNAME=ph-host01.my.dom
DNS1=192.168.0.224
DNS2=192.168.0.44
DNS3=192.168.0.45
ZONE=public
[root@ph-host01 network-scripts]# cat ifcfg-bond0
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
NM_CONTROLLED=no
BONDING_OPTS="miimon=100 mode=2"
BRIDGE=br0
#
#
# IPADDR=192.168.0.39
# NETMASK=255.255.254.0
# GATEWAY=192.168.0.1
# DNS1=192.168.0.1
[root@ph-host01 network-scripts]#
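
The NetworkConfigurator error above is the engine refusing to take over a pre-existing br0 bridge; the log's own advice is to tear it down manually before installing the host. A minimal sketch of what that could look like, assuming host deploy is then left to create its own ovirtmgmt bridge on bond0 (this is an illustration, not a verified procedure; back up the original files first):

```
# /etc/sysconfig/network-scripts/ifcfg-bond0 -- addressing moved back
# from br0 (matching the commented-out lines already in the file);
# BRIDGE=br0 removed.  Also delete ifcfg-br0, then restart networking.
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
NM_CONTROLLED=no
BONDING_OPTS="miimon=100 mode=2"
IPADDR=192.168.0.39
NETMASK=255.255.254.0
GATEWAY=192.168.0.1
```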


--
Cheers,
Tom K.
-

Living on earth is expensive, but it includes a free trip around the sun.

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Host affinity rule

2018-03-28 Thread Colin Coe
Hi all

I suspect one of our hypervisors is faulty but at this stage I can't prove
it.

We're running RHV 4.1.7 (about to upgrade to v4.1.10 in a few days).

I'm planning on creating a negative host affinity rule to prevent all
currently existing VMs from running on the suspect host.  Afterwards I'll create a
couple of test VMs and put them in a positive host affinity rule so they
only run on the suspect host.

There are about 150 existing VMs; are there any known problems with host
affinity rules when putting 150 or so VMs in the group?

This is production so I need to be careful.

Thanks
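
For what it's worth, an affinity group with ~150 VM members is just a list of member references on the API side. A sketch of the REST payload such a rule could use follows; the element names mirror the ovirtsdk4 types (AffinityGroup, hosts_rule), but treat the exact schema as an assumption to be checked against the RHV 4.1 API reference, and the IDs are placeholders:

```python
# Build the XML body for POST .../clusters/{id}/affinitygroups creating a
# hard negative VM-to-host rule (keep VMs OFF the listed host).
import xml.etree.ElementTree as ET

def affinity_group_xml(name, vm_ids, host_id):
    group = ET.Element("affinity_group")
    ET.SubElement(group, "name").text = name
    rule = ET.SubElement(group, "hosts_rule")
    ET.SubElement(rule, "enabled").text = "true"
    ET.SubElement(rule, "enforcing").text = "true"   # hard rule
    ET.SubElement(rule, "positive").text = "false"   # negative affinity
    hosts = ET.SubElement(group, "hosts")
    ET.SubElement(hosts, "host", id=host_id)
    vms = ET.SubElement(group, "vms")
    for vm_id in vm_ids:          # 150 VMs is simply 150 child elements
        ET.SubElement(vms, "vm", id=vm_id)
    return ET.tostring(group, encoding="unicode")

body = affinity_group_xml("avoid-suspect-host", ["vm-1", "vm-2"], "host-1")
print(body)
```

The same structure can be created from the Admin Portal; the point is only that group size itself adds no structural complexity.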
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Cache to NFS

2018-03-28 Thread Marcelo Leandro
Hello,
I have one server configured with RAID 6 (36 TB HDD) and I would like to
improve its performance. I read about lvmcache with an SSD, and would like
to know whether it is advisable to configure it together with NFS, and how
to calculate the necessary SSD size.

Many thanks.
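
Not an authoritative recipe, but a rough sizing sketch under common lvmcache rules of thumb (cache roughly 1% of the origin LV, cache-pool metadata roughly 1/1000 of the cache LV with an 8 MiB floor). None of this is oVirt- or NFS-specific, and the VG/LV names in the comments are made up:

```python
# Rule-of-thumb lvmcache sizing for a 36 TB origin LV.
def lvmcache_sizes(origin_gib):
    cache_gib = max(origin_gib // 100, 1)        # cache ~1% of the origin
    meta_mib = max(cache_gib * 1024 // 1000, 8)  # metadata ~0.1% of cache, >= 8 MiB
    return cache_gib, meta_mib

cache_gib, meta_mib = lvmcache_sizes(36 * 1024)  # 36 TB origin in GiB
print(f"cache: {cache_gib} GiB, metadata: {meta_mib} MiB")

# The conversion itself would look roughly like (hypothetical names):
#   lvcreate -L <cache_gib>G -n lv_cache vg_data /dev/sdX   # SSD PV
#   lvconvert --type cache --cachepool vg_data/lv_cache vg_data/lv_data
```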
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Hosted engine VDSM issue with sanlock

2018-03-28 Thread Jamie Lawrence
I still can't resolve this issue.

I have a host that is stuck in a cycle; it will be marked non-responsive, then 
come back up, ending with a "finished activation" message in the GUI. Then it 
repeats.

The root cause seems to be sanlock.  I'm just unclear on why it started or how 
to resolve it. The only "approved" knob I'm aware of is 
--reinitialize-lockspace and the manual equivalent, neither of which fixes 
anything.

Anyone have a guess?

-j

- - - vdsm.log - - - -

2018-03-28 10:38:22,207-0700 INFO  (monitor/b41eb20) [storage.SANLock] 
Acquiring host id for domain b41eb20a-eafb-481b-9a50-a135cf42b15e (id=1, 
async=True) (clusterlock:284)
2018-03-28 10:38:22,208-0700 ERROR (monitor/b41eb20) [storage.Monitor] Error 
acquiring host id 1 for domain b41eb20a-eafb-481b-9a50-a135cf42b15e 
(monitor:568)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 565, in 
_acquireHostId
self.domain.acquireHostId(self.hostId, async=True)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 828, in 
acquireHostId
self._manifest.acquireHostId(hostId, async)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 453, in 
acquireHostId
self._domainLock.acquireHostId(hostId, async)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 
315, in acquireHostId
raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: 
(u'b41eb20a-eafb-481b-9a50-a135cf42b15e', SanlockException(22, 'Sanlock 
lockspace add failure', 'Invalid argument'))
2018-03-28 10:38:23,078-0700 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call 
Host.ping2 succeeded in 0.00 seconds (__init__:573)
2018-03-28 10:38:23,085-0700 INFO  (jsonrpc/6) [vdsm.api] START 
repoStats(domains=[u'b41eb20a-eafb-481b-9a50-a135cf42b15e']) from=::1,54450, 
task_id=186d7e8b-7b4e-485d-a9e0-c0cb46eed621 (api:46)
2018-03-28 10:38:23,085-0700 INFO  (jsonrpc/6) [vdsm.api] FINISH repoStats 
return={u'b41eb20a-eafb-481b-9a50-a135cf42b15e': {'code': 0, 'actual': True, 
'version': 4, 'acquired': False, 'delay': '0.000812547', 'lastCheck': '0.4', 
'valid': True}} from=::1,54450, task_id=186d7e8b-7b4e-485d-a9e0-c0cb46eed621 
(api:52)
2018-03-28 10:38:23,086-0700 INFO  (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC call 
Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:573)
2018-03-28 10:38:23,092-0700 WARN  (vdsm.Scheduler) [Executor] Worker blocked: 
 at 0x1d44150> 
timeout=15, duration=150 at 0x7f076c05fb90> task#=83985 at 0x7f082c08e510>, 
traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
  self.run()
File: "/usr/lib64/python2.7/threading.py", line 765, in run
  self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, 
in run
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
  self._execute_task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in 
_execute_task
  task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
  self._callable()
File: "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 213, in 
__call__
  self._func()
File: "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 578, in 
__call__
  stats = hostapi.get_stats(self._cif, self._samples.stats())
File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 77, in get_stats
  ret['haStats'] = _getHaInfo()
File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 182, in 
_getHaInfo
  stats = instance.get_all_stats()
File: 
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", 
line 93, in get_all_stats
  stats = broker.get_stats_from_storage()
File: 
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", 
line 135, in get_stats_from_storage
  result = self._proxy.get_stats()
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
  return self.__send(self.__name, args)
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1587, in __request
  verbose=self.__verbose
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
  return self.single_request(host, handler, request_body, verbose)
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1303, in single_request
  response = h.getresponse(buffering=True)
File: "/usr/lib64/python2.7/httplib.py", line 1089, in getresponse
  response.begin()
File: "/usr/lib64/python2.7/httplib.py", line 444, in begin
  version, status, reason = self._read_status()
File: "/usr/lib64/python2.7/httplib.py", line 400, in _read_status
  line = self.fp.readline(_MAXLINE + 1)
File: "/usr/lib64/python2.7/socket.py", line 476, in readline
  data = self._sock.recv(self._rbufsize) (executor:363)
2018-03-28 10:38:23,274-0700 INFO  (jsonrpc/3) 
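
One small, checkable detail in the traceback above: the 22 in SanlockException(22, ..., 'Invalid argument') is a plain errno. Decoding it confirms EINVAL; the reading that EINVAL on add_lockspace points at stale or mismatched lockspace data on the storage, rather than connectivity, is an interpretation, not something the log proves:

```python
# Decode the numeric error carried by SanlockException(22, ...).
import errno
import os

code = 22  # from: SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument')
print(errno.errorcode[code], "-", os.strerror(code))

# On the host itself, sanlock's own view can be inspected with:
#   sanlock client status      # lockspaces/resources the daemon holds
#   sanlock client log_dump    # recent sanlock daemon log
```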

[ovirt-users] [ANN] oVirt 4.2.2 GA Release is now available

2018-03-28 Thread Lev Veyde
The oVirt Project is pleased to announce the availability of the oVirt 4.2.2 GA
release, as of March 28th, 2018.

This update is a GA release of the second in a series of stabilization
updates to the 4.2
series.

This release is available now for:
* Red Hat Enterprise Linux 7.4 or later
* CentOS Linux (or similar) 7.4 or later

This release supports Hypervisor Hosts running:
* Red Hat Enterprise Linux 7.4 or later
* CentOS Linux (or similar) 7.4 or later
* oVirt Node 4.2

See the release notes [1] for installation / upgrade instructions and
a list of new features and bugs fixed.

Notes:
- oVirt Appliance is available
- oVirt Node is available [2]

Additional Resources:
* Read more about the oVirt 4.2.2 release highlights:
http://www.ovirt.org/release/4.2.2/

* Get more oVirt Project updates on Twitter: https://twitter.com/ovirt
* Check out the latest project news on the oVirt blog:
http://www.ovirt.org/blog/

[1] http://www.ovirt.org/release/4.2.2/

[2] http://resources.ovirt.org/pub/ovirt-4.2/iso/


-- 

Lev Veyde

Software Engineer, RHCE | RHCVA | MCITP

Red Hat Israel



l...@redhat.com | lve...@redhat.com

TRIED. TESTED. TRUSTED. 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] How it is oVirt used in your Department?

2018-03-28 Thread Fedele Stabile Nuovo Server
My question is mainly addressed to those of you who use oVirt for more
than just running services on virtual machines.
What is your experience, and what did you build?
Has anyone virtualized an HPC cluster?
What is, for you, the advantage of virtualizing a cluster?
Or, for a classroom with PCs or Raspberry Pis, is it better to use LTSP
or PiNet, or to virtualize desktops?

I would like to have a lot of feedback to start a discussion about the
best way to use oVirt in different contexts

Fedele Stabile
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Recovering oVirt-Engine with a backup before upgrading to 4.2

2018-03-28 Thread Sven Achtelik


> -Ursprüngliche Nachricht-
> Von: Yedidyah Bar David [mailto:d...@redhat.com]
> Gesendet: Mittwoch, 28. März 2018 10:06
> An: Sven Achtelik
> Cc: users@ovirt.org
> Betreff: Re: [ovirt-users] Recovering oVirt-Engine with a backup before
> upgrading to 4.2
> 
> On Tue, Mar 27, 2018 at 9:14 PM, Sven Achtelik 
> wrote:
> > Hi All,
> >
> >
> >
> > I’m still facing issues with my HE engine. Here are the steps that I
> > took to end up in this situation:
> >
> >
> >
> > - Update Engine from 4.1.7 to 4.1.9
> >
> > o   That worked as expected
> >
> > - Automatic Backup of Engine DB in the night
> >
> > - Upgraded Engine from 4.1.9 to 4.2.1
> >
> > o   That worked fine
> >
> > - Noticed Issues with the HA support for HE
> >
> > o   Cause was not having the latest ovirt-ha agent/broker version on hosts
> >
> > - After updating the first host with the latest packages for the
> > Agent/Broker engine was started twice
> >
> > o   As a result the Engine VM Disk was corrupted and there is no Backup of
> > the Disk
> >
> > o   There is also no Backup of the Engine DB with version 4.2
> >
> > - VM disk was repaired with fsck.ext4, but DB is corrupt
> >
> > o   Can’t restore the Engine DB because the Backup DB from Engine V 4.1
> >
> > - Rolled back all changes on Engine VM to 4.1.9 and imported Backup
> >
> > o   Checked for HA VMs to set as disabled and started the Engine
> >
> > - Login is fine but the Engine is having trouble picking up any
> > information from the Hosts
> >
> > o   No information on running VMs or hosts status
> >
> > - Final Situation
> >
> > o   2 Hosts have VMs still running and I can’t stop those
> >
> > o   I still have the image of my corrupted Engine VM (v4.2)
> >
> >
> >
> > Since there were no major changes after upgrading from 4.1 to 4.2,
> > would it be possible to manually restore the 4.1 DB to the 4.2 Engine
> > VM to get this up and running again or are there modifications made to the
> > DB on upgrading that are relevant for this ?
> 
> engine-backup requires restoring to the same version used to take the backup,
> with a single exception - on 4.0, it can restore 3.6.
> 
> It's very easy to patch it to allow also 4.1->4.2, search inside it for
> "VALID_BACKUP_RESTORE_PAIRS". However, I do not think anyone ever
> tested this, so no idea what might break. In 3.6->4.0 days, we did have to fix
> a few other things, notably apache httpd and iptables->firewalld:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1318580
> 
> > All my work on rolling back to 4.1.9 with the DB restore failed as the
> > Engine is not capable of picking up information from the hosts.
> 
> No idea why, but not sure it's related to your restore flow.
> 
> > Lessons learned is to always make a copy/snapshot of the engine VM
> > disk before upgrading anything.
> 
> If it's a hosted-engine, this isn't supported - see my reply on the list ~ 1 
> hour
> ago...
> 
> > What are my options on getting
> > back to a working environment ? Any help or hint is greatly appreciated.
> 
> Restore again with either methods - what you tried, or patching engine-
> backup and restore directly into 4.2 - and if the engine fails to talk to the 
> hosts,
> try to debug/fix this.
> 
> If you suspect corruption more severe than just the db, you can install a fresh
> engine machine from scratch and restore to it. If it's a hosted-engine, you'll
> need to deploy hosted-engine from scratch, check docs about hosted-engine
> backup/restore.

I read through those documents and it seems that I would need an extra 
Host/Hardware which I don't have. 
https://ovirt.org/documentation/self-hosted/chap-Backing_up_and_Restoring_an_EL-Based_Self-Hosted_Environment/

So how would I be able to get a new setup working when I would like to use the 
Engine-VM-Image ? At this point it sounds like I would have to manually 
reinstall the machine that is left over and running. I'm lost at this point. 
> 
> Best regards,
> --
> Didi
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Deploy Self-Hosted Engine in a Active Host

2018-03-28 Thread FERNANDO FREDIANI
Hello

As I mentioned in another thread I am migrating a 'Bare-metal' oVirt-Engine
to a Self-Hosted Engine.
For that I am following this documentation:
https://ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_Metal_to_an_EL-Based_Self-Hosted_Environment/

However, one thing caught my attention that I wanted to clarify: must the Host
that will deploy the Self-Hosted Engine be in Maintenance mode, and
therefore have no other VMs running?

I have a Node which is currently part of a Cluster and wish to deploy the
Self-Hosted Engine to it. Must I put it into Maintenance mode first,
or can I just run 'hosted-engine --deploy'?

Note: this Self-Hosted Engine will manage the existing cluster where this
Node exists. I guess that is not an issue at all, and is part of what the
Self-Hosted Engine is intended for.

Thanks
Fernando
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Any monitoring tool provided?

2018-03-28 Thread Peter Hudec
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

I have a working proof-of-concept. There could still be bugs in the
templates and not all parameters are monitored, but HOST and VM
discovery is working.

I will try to share it on github. At this moment I have some
performance issues, since it's using the zabbix-agent which needs to
fork a new process for each query ;(

Peter

On 26/03/2018 08:13, Peter Hudec wrote:
> Yes, template and python
> 
> at this moment I understand the oVirt API. I needed to write my own small 
> SDK, since the oVirt SDK is using SSO for login, which means it does 
> some additional requests for the login process. There is no option to 
> use Basic Auth. The SESSION reuse could be useful, but I do not
> want to add more
> 
> First  I need to understand some basics from Zabbix Discovery
> Rules / Host Prototypes. I would like to have VM as separate hosts
> in Zabbix, like VMWare does. The other stuff is quite easy.
> 
> There is a plugin for nagios/icinga if someone is using that monitoring 
> tool: https://github.com/ovido/check_rhev3
> 
> On 26/03/2018 08:02, Alex K wrote:
>> Hi Peter,
> 
>> This is interesting. Is it going to be a template with an 
>> external python script?
> 
>> Alex
> 
>> On Mon, Mar 26, 2018, 08:50 Peter Hudec > > wrote:
> 
>> Hi Terry,
> 
>> I started to work on ZABBIX integration based on oVirt API. 
>> Basically it should be like VmWare integration in ZABBIX with 
>> full hosts/vms discovery and statistic gathering.
> 
>> The API provides a statistics service for each NIC and VM, as well as
>> CPU and MEM utilization.
> 
>> There is also a solution based on reading data from VDSM into
>> prometheus:
>> http://rmohr.github.io/virtualization/2016/04/12/monitor-your-ovirt-datacenter-with-prometheus
> 
> 
> 
>> Peter
> 
>> On 22/03/2018 04:41, Terry hey wrote:
>>> Dear all,
> 
>>> Now, we can just read how much storage is used and the CPU usage on
>>> the oVirt dashboard. But is there any monitoring tool for
>>> monitoring virtual machines over time? If yes, could you guys
>>> give me the procedure?
> 
> 
> 
>>> Regards Terry
> 
> 
>>> ___ Users mailing 
>>> list Users@ovirt.org 
>> http://lists.ovirt.org/mailman/listinfo/users
> 
> 
> 
>> ___ Users mailing 
>> list Users@ovirt.org  
>> http://lists.ovirt.org/mailman/listinfo/users
> 
> 
> 

- -- 
*Peter Hudec*
Infraštruktúrny architekt
phu...@cnc.sk 

*CNC, a.s.*
Borská 6, 841 04 Bratislava
Recepcia: +421 2  35 000 100

Mobil:+421 905 997 203
*www.cnc.sk* 

-BEGIN PGP SIGNATURE-

iQJCBAEBCAAsFiEEqSUbhuEwhryifNeVQnvVWOJ35BAFAlq7qPsOHHBodWRlY0Bj
bmMuc2sACgkQQnvVWOJ35BDXzw/8ChApssWNkM0HiixYESQP+lgxJeHqHYgvBbrQ
DTfiOfTXrWDLIXn7LQdtt7IH4LtTDEwLcGBFSQCTUuX7W6y6Uj5y9pkcGLrtFYuP
g1yBEPuqO3RB2QoR6FLlEyfqfDpnIWiRbtFFpK4P6UmRNQX637GKcluMN8EXeujY
w/S+0JoV9ANEnDgsyCQvJ1f89D4KTiD9eTv0zijl7abRew8ioMVAmxt2YBFQf1KC
rZQ4h7qbymYNDWRv/n4qx3StBN8e0crty73glfWbHrCuw0/lfSMgELWelvvSR1YE
1oEMmRaqQr7poxtXTGdtkXRkvxil+Or/IQ6jibFjMt9rmmkJ4jgMkGdSkYSHlCJC
G2pjlN0nlghOmj9QDaX54EAXUeybVmoD8yGVlmREEl8jPYVxAmfasJ7zJ+mIBTU4
21Z7/yhFHINi6pez+3t/42BA11XtfTiUx5GVzj5M+Ky6bbkVOF5H+ndJzRA92UHj
lZOyoFP5cg9cFkdmAGep4pE9BdHEIFKnS7m0vVM0IwiKRQMprvAYyBuj23V+wtTc
FXauv7+xpyFiH/0IqFCeHn9DCIXnEQlGfwDkCt4PYS+p0Jfm2hlI3fvGqeHZngYz
NElJT59Sxc8kWayZrbr5uaw6csTwcXXhVcALaWB/q6DsqR2KyPWwbs3CtfH5OGWh
39Jz7aw=
=NcUr
-END PGP SIGNATURE-
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt Node Resize tool for local storage

2018-03-28 Thread Christopher Cox

On 03/28/2018 01:37 AM, Pavol Brilla wrote:

Hi

AFAIK ext4 does not support online shrinking of the filesystem;
to shrink the storage you would need to unmount the filesystem,
so it is not possible to do with the VM online.


Correct.  Just saying it's not possible at all with XFS, be that online or 
offline.

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Snapshot of the Self-Hosted Engine

2018-03-28 Thread FERNANDO FREDIANI
Hello Sven and all.

Yes, the storage does have a snapshot function and it could possibly be
used, but I was wondering about an even easier way, through the oVirt Node
CLI or something similar, that can use the qcow2 image snapshot to do that
with the Self-Hosted Engine in Global Maintenance.

I used to run the oVirt Engine in a Libvirt KVM Virtual Machine in a
separate Host and it has always been extremely handy to have this feature.
There have been times where the upgrade was not successful, and just
turning off the VM and starting it from a snapshot saved my day.

Regards
Fernando

2018-03-27 14:14 GMT-03:00 Sven Achtelik :

> Hi Fernando,
>
>
>
> depending on where you’re having your storage you could set everything to
> global maintenance, stop the vm and copy the disk image. Or if your storage
> systeme is able to do snapshots you could use that function once the engine
> is stopped. It’s the easiest way I can think of right now. What kind of
> storage are you using ?
>
>
>
> Sven
>
>
>
> *Von:* users-boun...@ovirt.org [mailto:users-boun...@ovirt.org] *Im
> Auftrag von *FERNANDO FREDIANI
> *Gesendet:* Dienstag, 27. März 2018 15:24
> *An:* users
> *Betreff:* [ovirt-users] Snapshot of the Self-Hosted Engine
>
>
>
> Hello
>
> Is it possible to snapshot the Self-Hosted Engine before an Upgrade ? If
> so how ?
>
> Thanks
>
> Fernando
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Snapshot of the Self-Hosted Engine

2018-03-28 Thread Simone Tiraboschi
On Wed, Mar 28, 2018 at 9:13 AM, Yedidyah Bar David  wrote:

> On Tue, Mar 27, 2018 at 4:23 PM, FERNANDO FREDIANI
>  wrote:
> > Hello
> >
> > Is it possible to snapshot the Self-Hosted Engine before an Upgrade ? If
> so
> > how ?
>
> I do not think so - I do not think anything changed since this:
>
> http://lists.ovirt.org/pipermail/users/2016-November/044103.html
>
> I agree it sounds like a useful thing to have. Not sure how hard it
> can be to implement it. Feel free to open an RFE bz.
>

We are using a disk lease to prevent split brains, AFAIK snapshots are not
compatible with disk leases.


>
> Basically, we'll have to:
>
> 1. Make sure everything continues to work sensibly - engine/vdsm do
> the right things, ha agent works as expected, etc.
>
> 2. Provide means to start the vm from a snapshot, and/or revert to a
> snapshot. This is going to be quite ugly, because it will have to
> duplicate in ovirt-hosted-engine-setup/-ha functionality that already
> exists in the engine, because at that point the engine is not
> available to assist with this.
>
> Best regards,
> --
> Didi
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Recovering oVirt-Engine with a backup before upgrading to 4.2

2018-03-28 Thread Sven Achtelik


> -Ursprüngliche Nachricht-
> Von: Yedidyah Bar David [mailto:d...@redhat.com]
> Gesendet: Mittwoch, 28. März 2018 10:06
> An: Sven Achtelik
> Cc: users@ovirt.org
> Betreff: Re: [ovirt-users] Recovering oVirt-Engine with a backup before
> upgrading to 4.2
> 
> On Tue, Mar 27, 2018 at 9:14 PM, Sven Achtelik 
> wrote:
> > Hi All,
> >
> >
> >
> > I’m still facing issues with my HE engine. Here are the steps that I
> > took to end up in this situation:
> >
> >
> >
> > - Update Engine from 4.1.7 to 4.1.9
> >
> > o   That worked as expected
> >
> > - Automatic Backup of Engine DB in the night
> >
> > - Upgraded Engine from 4.1.9 to 4.2.1
> >
> > o   That worked fine
> >
> > - Noticed Issues with the HA support for HE
> >
> > o   Cause was not having the latest ovirt-ha agent/broker version on hosts
> >
> > - After updating the first host with the latest packages for the
> > Agent/Broker engine was started twice
> >
> > o   As a result the Engine VM Disk was corrupted and there is no Backup of
> > the Disk
> >
> > o   There is also no Backup of the Engine DB with version 4.2
> >
> > - VM disk was repaired with fsck.ext4, but DB is corrupt
> >
> > o   Can’t restore the Engine DB because the Backup DB from Engine V 4.1
> >
> > - Rolled back all changes on Engine VM to 4.1.9 and imported Backup
> >
> > o   Checked for HA VMs to set as disabled and started the Engine
> >
> > - Login is fine but the Engine is having trouble picking up any
> > information from the Hosts
> >
> > o   No information on running VMs or hosts status
> >
> > - Final Situation
> >
> > o   2 Hosts have VMs still running and I can’t stop those
> >
> > o   I still have the image of my corrupted Engine VM (v4.2)
> >
> >
> >
> > Since there were no major changes after upgrading from 4.1 to 4.2,
> > would it be possible to manually restore the 4.1 DB to the 4.2 Engine
> > VM to get this up and running again or are there modifications made to the
> > DB on upgrading that are relevant for this ?
> 
> engine-backup requires restoring to the same version used to take the backup,
> with a single exception - on 4.0, it can restore 3.6.
> 
> It's very easy to patch it to allow also 4.1->4.2, search inside it for
> "VALID_BACKUP_RESTORE_PAIRS". However, I do not think anyone ever
> tested this, so no idea what might break. In 3.6->4.0 days, we did have to fix
> a few
> other things, notably apache httpd and iptables->firewalld:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1318580
> 
> > All my work on rolling back to 4.1.9 with the DB restore failed as the
> > Engine is not capable of picking up information from the hosts.
> 
> No idea why, but not sure it's related to your restore flow.
> 
> > Lessons learned is to always make a copy/snapshot of the engine VM
> > disk before upgrading anything.
> 
> If it's a hosted-engine, this isn't supported - see my reply on the list ~ 1 
> hour
> ago...
> 
> > What are my options on getting
> > back to a working environment ? Any help or hint is greatly appreciated.
> 
> Restore again with either methods - what you tried, or patching engine-
> backup and restore directly into 4.2 - and if the engine fails to talk to the 
> hosts,
> try to debug/fix this.
> 
> If you suspect corruption more severe than just the db, you can install a fresh
> engine machine from scratch and restore to it. If it's a hosted-engine, you'll
> need to deploy hosted-engine from scratch, check docs about hosted-engine
> backup/restore.

Will the setup of hosted-engine from scratch require a new storage domain for 
the new engine, or can I use the one that is already there? What about the VMs 
running on my hosts, will they be affected by that? It might be best to start 
with a fresh VM.

> 
> Best regards,
> --
> Didi
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Recovering oVirt-Engine with a backup before upgrading to 4.2

2018-03-28 Thread Yedidyah Bar David
On Tue, Mar 27, 2018 at 9:14 PM, Sven Achtelik  wrote:
> Hi All,
>
>
>
> I’m still facing issues with my HE engine. Here are the steps that I took to
> end up in this situation:
>
>
>
> - Update Engine from 4.1.7 to 4.1.9
>
> o   That worked as expected
>
> - Automatic Backup of Engine DB in the night
>
> - Upgraded Engine from 4.1.9 to 4.2.1
>
> o   That worked fine
>
> - Noticed Issues with the HA support for HE
>
> o   Cause was not having the latest ovirt-ha agent/broker version on hosts
>
> - After updating the first host with the latest packages for the
> Agent/Broker engine was started twice
>
> o   As a result the Engine VM Disk was corrupted and there is no Backup of
> the Disk
>
> o   There is also no Backup of the Engine DB with version 4.2
>
> - VM disk was repaired with fsck.ext4, but DB is corrupt
>
> o   Can’t restore the Engine DB because the Backup DB from Engine V 4.1
>
> - Rolled back all changes on Engine VM to 4.1.9 and imported Backup
>
> o   Checked for HA VMs to set as disabled and started the Engine
>
> - Login is fine but the Engine is having trouble picking up any
> information from the Hosts
>
> o   No information on running VMs or hosts status
>
> - Final Situation
>
> o   2 Hosts have VMs still running and I can’t stop those
>
> o   I still have the image of my corrupted Engine VM (v4.2)
>
>
>
> Since there were no major changes after upgrading from 4.1 to 4.2, would it
> be possible to manually restore the 4.1 DB to the 4.2 Engine VM to get this up
> and running again or are there modifications made to the DB on upgrading
> that are relevant for this ?

engine-backup requires restoring to the same version used to take the backup,
with a single exception - on 4.0, it can restore 3.6.

It's very easy to patch it to allow also 4.1->4.2, search inside it for
"VALID_BACKUP_RESTORE_PAIRS". However, I do not think anyone ever tested
this, so no idea what might break. In 3.6->4.0 days, we did have to fix a few
other things, notably apache httpd and iptables->firewalld:

https://bugzilla.redhat.com/show_bug.cgi?id=1318580

> All my work on rolling back to 4.1.9 with the
> DB restore failed as the Engine is not capable of picking up information
> from the hosts.

No idea why, but not sure it's related to your restore flow.

> Lessons learned is to always make a copy/snapshot of the
> engine VM disk before upgrading anything.

If it's a hosted-engine, this isn't supported - see my reply on the
list ~ 1 hour ago...

> What are my options on getting
> back to a working environment ? Any help or hint is greatly appreciated.

Restore again with either methods - what you tried, or patching engine-backup
and restore directly into 4.2 - and if the engine fails to talk to the hosts,
try to debug/fix this.

If you suspect corruption more severe than just the db, you can install a
fresh engine machine from scratch and restore to it. If it's a hosted-engine,
you'll need to deploy hosted-engine from scratch, check docs about hosted-engine
backup/restore.

Best regards,
-- 
Didi
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Snapshot of the Self-Hosted Engine

2018-03-28 Thread Yedidyah Bar David
On Tue, Mar 27, 2018 at 4:23 PM, FERNANDO FREDIANI
 wrote:
> Hello
>
> Is it possible to snapshot the Self-Hosted Engine before an Upgrade ? If so
> how ?

I do not think so - I do not think anything changed since this:

http://lists.ovirt.org/pipermail/users/2016-November/044103.html

I agree it sounds like a useful thing to have. Not sure how hard it
can be to implement it. Feel free to open an RFE bz.

Basically, we'll have to:

1. Make sure everything continues to work sensibly - engine/vdsm do
the right things, ha agent works as expected, etc.

2. Provide means to start the vm from a snapshot, and/or revert to a
snapshot. This is going to be quite ugly, because it will have to
duplicate in ovirt-hosted-engine-setup/-ha functionality that already
exists in the engine, because at that point the engine is not
available to assist with this.

Best regards,
-- 
Didi
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] ovirt snapshot issue

2018-03-28 Thread Yedidyah Bar David
On Tue, Mar 27, 2018 at 3:38 PM, Sandro Bonazzola 
wrote:

>
>
> 2018-03-27 14:34 GMT+02:00 Alex K :
>
>> Hi All,
>>
>> Any idea on the below?
>>
>> I am using oVirt Guest Tools 4.2-1.el7.centos for the VM.
>> The Windows 2016 server VM (which is the one with the relatively big
>> disks: 500 GB) is consistently rendered unresponsive when trying to take
>> a snapshot.
>> I may provide any other additional logs if needed.
>>
>
> Adding some people to the thread
>

Adding more people for this part.


>
>
>
>>
>> Alex
>>
>> On Sun, Mar 25, 2018 at 7:30 PM, Alex K  wrote:
>>
>>> Hi folks,
>>>
>>> I am facing frequently the following issue:
>>>
>>> On some large VMs (Windows 2016 with two disk drives, 60GB and 500GB)
>>> when attempting to create a snapshot of the VM, the VM becomes
>>> unresponsive.
>>>
>>> The errors that I managed to collect were:
>>>
>>> vdsm error at host hosting the VM:
>>> 2018-03-25 14:40:13,442+ WARN  (vdsm.Scheduler) [Executor] Worker
>>> blocked: >> {u'frozen': False, u'vmID': u'a5c761a2-41cd-40c2-b65f-f3819293e8a4',
>>> u'snapDrives': [{u'baseVolumeID': u'2a33e585-ece8-4f4d-b45d-5ecc9239200e',
>>> u'domainID': u'888e3aae-f49f-42f7-a7fa-76700befabea', u'volumeID':
>>> u'e9a01ebd-83dd-40c3-8c83-5302b0d15e04', u'imageID':
>>> u'c75b8e93-3067-4472-bf24-dafada224e4d'}, {u'baseVolumeID':
>>> u'3fb2278c-1b0d-4677-a529-99084e4b08af', u'domainID':
>>> u'888e3aae-f49f-42f7-a7fa-76700befabea', u'volumeID':
>>> u'78e6b6b1-2406-4393-8d92-831a6d4f1337', u'imageID':
>>> u'd4223744-bf5d-427b-bec2-f14b9bc2ef81'}]}, 'jsonrpc': '2.0', 'method':
>>> u'VM.snapshot', 'id': u'89555c87-9701-4260-9952-789965261e65'} at
>>> 0x7fca4004cc90> timeout=60, duration=60 at 0x39d8210> task#=155842 at
>>> 0x2240e10> (executor:351)
>>> 2018-03-25 14:40:15,261+ INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer]
>>> RPC call VM.getStats failed (error 1) in 0.01 seconds (__init__:539)
>>> 2018-03-25 14:40:17,471+ WARN  (jsonrpc/5) [virt.vm]
>>> (vmId='a5c761a2-41cd-40c2-b65f-f3819293e8a4') monitor became
>>> unresponsive (command timeout, age=67.910001) (vm:5132)
>>>
>>> engine.log:
>>> 2018-03-25 14:40:19,875Z WARN  [org.ovirt.engine.core.dal.dbb
>>> roker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler2)
>>> [1d737df7] EVENT_ID: VM_NOT_RESPONDING(126), Correlation ID: null, Call
>>> Stack: null, Custom ID: null, Custom Event ID: -1, Message: VM Data-Server
>>> is not responding.
>>>
>>> 2018-03-25 14:42:13,708Z ERROR [org.ovirt.engine.core.dal.dbb
>>> roker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler5)
>>> [17789048-009a-454b-b8ad-2c72c7cd37aa] EVENT_ID:
>>> VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack:
>>> null, Custom ID: null, Custom Event ID: -1, Message: VDSM v1.cluster
>>> command SnapshotVDS failed: Message timeout which can be caused by
>>> communication issues
>>> 2018-03-25 14:42:13,708Z ERROR [org.ovirt.engine.core.vdsbrok
>>> er.vdsbroker.SnapshotVDSCommand] (DefaultQuartzScheduler5)
>>> [17789048-009a-454b-b8ad-2c72c7cd37aa] Command
>>> 'SnapshotVDSCommand(HostName = v1.cluster, 
>>> SnapshotVDSCommandParameters:{runAsync='true',
>>> hostId='a713d988-ee03-4ff0-a0cd-dc4cde1507f4',
>>> vmId='a5c761a2-41cd-40c2-b65f-f3819293e8a4'})' execution failed:
>>> VDSGenericException: VDSNetworkException: Message timeout which can be
>>> caused by communication issues
>>> 2018-03-25 14:42:13,708Z WARN  [org.ovirt.engine.core.bll.sna
>>> pshots.CreateAllSnapshotsFromVmCommand] (DefaultQuartzScheduler5)
>>> [17789048-009a-454b-b8ad-2c72c7cd37aa] Could not perform live snapshot
>>> due to error, VM will still be configured to the new created snapshot:
>>> EngineException: org.ovirt.engine.core.vdsbroke
>>> r.vdsbroker.VDSNetworkException: VDSGenericException:
>>> VDSNetworkException: Message timeout which can be caused by communication
>>> issues (Failed with error VDS_NETWORK_ERROR and code 5022)
>>> 2018-03-25 14:42:13,708Z WARN  [org.ovirt.engine.core.vdsbroker.VdsManager]
>>> (org.ovirt.thread.pool-6-thread-15) [17789048-009a-454b-b8ad-2c72c7cd37aa]
>>> Host 'v1.cluster' is not responding. It will stay in Connecting state for a
>>> grace period of 61 seconds and after that an attempt to fence the host will
>>> be issued.
>>> 2018-03-25 14:42:13,725Z WARN  [org.ovirt.engine.core.dal.dbb
>>> roker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-15)
>>> [17789048-009a-454b-b8ad-2c72c7cd37aa] EVENT_ID:
>>> VDS_HOST_NOT_RESPONDING_CONNECTING(9,008), Correlation ID: null, Call
>>> Stack: null, Custom ID: null, Custom Event ID: -1, Message: Host v1.cluster
>>> is not responding. It will stay in Connecting state for a grace period of
>>> 61 seconds and after that an attempt to fence the host will be issued.
>>> 2018-03-25 14:42:13,751Z WARN  [org.ovirt.engine.core.dal.dbb
>>> roker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler5)
>>> 

Re: [ovirt-users] Which hardware are you using for oVirt

2018-03-28 Thread Andy Michielsen
Hello Christopher,

Thank you very much for sharing.

It started out just for fun but now people at work are looking at me to have an 
environment to do testing, simulate problems they have encountered, etc.

And more and more of them see the benefits of this. At work we are running 
VMware, but that was far too expensive to use for these tests. As I suspected 
in the beginning, I knew I had to be able to expand, so whenever an old server 
was decommissioned from production I converted it to a node. I now have 4 in 
use and demand keeps growing.

So now I want to ask my boss to invest in new hardware, as people are now 
asking me why I do not have proper backups, and even why they cannot use the 
VMs when I perform administrative tasks or upgrades.

So that’s why I’m very interested in what others are using.

Kind regards.

> On 26 Mar 2018, at 18:03, Christopher Cox  wrote:
> 
>> On 03/24/2018 03:33 AM, Andy Michielsen wrote:
>> Hi all,
>> Not sure if this is the place to be asking this but I was wondering which 
>> hardware you all are using and why in order for me to see what I would be 
>> needing.
>> I would like to set up a HA cluster consisting off 3 hosts to be able to run 
>> 30 vm’s.
>> The engine, I can run on an other server. The hosts can be fitted with the 
>> storage and share the space through glusterfs. I would think I will be 
>> needing at least 3 nic’s but would be able to install ovn. (Are 1gb nic’s 
>> sufficient ?)
> 
> Just because you asked, but not because this is helpful to you
> 
> But first, a comment on "3 hosts to be able to run 30 VMs".  The SPM node 
> shouldn't run a lot of VMs.  There is a setting (the name slips my mind) 
> on the engine to give it a "virtual set" of VMs in order to keep VMs off of 
> it.
> 
> With that said, CPU wise, it doesn't require a lot to run 30 VM's.  The 
> costly thing is memory (in general).  So while a cheap set of 3 machines 
> might handle the CPU requirements of 30 VM's, those cheap machines might not 
> be able to give you the memory you need (depends).  You might be fine.  I 
> mean, there are cheap desktop like machines that do 64G (and sometimes more). 
>  Just something to keep in mind.  Memory and storage will be the most costly 
> items.  It's simple math.  Linux hosts, of course, don't necessarily need 
> much memory (or storage).  But Windows...
> 
> 1Gbit NIC's are "ok", but again, depends on storage.  Glusterfs is no speed 
> demon.  But you might not need "fast" storage.
> 
> Lastly, your setup is just for "fun", right?  Otherwise, read on.
> 
> 
> Running oVirt 3.6 (this is a production setup)
> 
> ovirt engine (manager):
> Dell PowerEdge 430, 32G
> 
> ovirt cluster nodes:
> Dell m1000e 1.1 backplane Blade Enclosure
> 9 x M630 Blades (2xE5-2669v3, 384GB), 4 iSCSI paths, 4 bonded LAN, all 10GbE, 
> CentOS 7.2
> 4 x MXL 10/40GbE (2x40Gbit LAN, 2x40Gbit iSCSI SAN to the S4810's)
> 
> 120 VM's, CentOS 6, CentOS 7, Windows 10 Ent., Windows Server 2012
> We've run on as few as 3 nodes.
> 
> Network, SAN and Storage (for ovirt Domains):
> 2 x S4810 (part is used for SAN, part for LAN)
> Equallogic dual controller (note: passive/active) PS6610S (84 x 4TB 7.2K SAS)
> Equallogic dual controller (note: passive/active) PS6610X (84 x 1TB 10K SAS)
> 
> ISO and Export Domains are handled by:
> Dell PE R620, 32G, 2x10Gbit LAN, 2x10Gbit iSCSI to the SAN (above), CentOS 
> 7.4, NFS
> 
> What I like:
> * Easy setup.
> * Relatively good network and storage.
> 
> What I don't like:
> * 2 "effective" networks, LAN and iSCSI.  All networking uses the same 
> effective path.  Would be nice to have more physical isolation for mgmt vs 
> motion vs VMs.  QoS is provided in oVirt, but still, would be nice to have 
> the full pathways.
> * Storage doesn't use active/active controllers, so controller failover is 
> VERY slow.
> * We have a fast storage system, and somewhat slower storage system (matter 
> of IOPS),  neither is SSD, so there isn't a huge difference.  No real 
> redundancy or flexibility.
> * vdsm can no longer respond fast enough for the amount of disks defined (in 
> the event of a new Storage Domain add).  We have raised vdsTimeout, but have 
> not tested yet.
> 
> I inherited the "style" above.  My recommendation of where to start for a 
> reasonable production instance, minimum (assumes the S4810's above, not 
> priced here):
> 
> 1 x ovirt manager/engine, approx $1500
> 4 x Dell R620, 2xE5-2660, 768G, 6x10GbE (LAN, Storage, Motion), approx $42K
> 3 x Nexsan 18P 108TB, approx $96K
> 
> While significantly cheaper (by 6 figures), it provides active/active 
> controllers, storage reliability and flexibility and better network pathways. 
>  Why 4 x nodes?  Need at least N+1 for reliability.  The extra 4th node is 
> merely capacity.  Why 3 x storage?  Need at least N+1 for reliability.
> 
> Obviously, you'll still want to back things up and test the ability to 
> restore components like the ovirt 

Re: [ovirt-users] oVirt Node Resize tool for local storage

2018-03-28 Thread Pavol Brilla
Hi

AFAIK ext4 does not support online shrinking of a filesystem;
to shrink the storage you would need to unmount the filesystem,
so this is not possible to do while the VMs are online.
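
For illustration, an offline shrink on plain LVM + ext4 typically has to
follow the order below (filesystem first, then the LV). The device names and
sizes here are hypothetical, and getting the sizes wrong destroys data, so
treat this strictly as a sketch and back everything up first:

```shell
# Sketch only - assumes an LV /dev/vg_data/lv_images (hypothetical name)
# mounted at /data, being shrunk to 400G. Back up before attempting.
umount /data
e2fsck -f /dev/vg_data/lv_images          # filesystem must be clean first
resize2fs /dev/vg_data/lv_images 400G     # shrink the ext4 filesystem FIRST
lvreduce -L 400G /dev/vg_data/lv_images   # THEN shrink the LV to match
e2fsck -f /dev/vg_data/lv_images          # re-check before remounting
mount /dev/vg_data/lv_images /data
```

The ordering matters: shrinking the LV below the filesystem size, or skipping
the fsck, will corrupt the filesystem.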

On Wed, Mar 28, 2018 at 2:30 AM, Matt Simonsen  wrote:

> Hello,
>
> We have a development box with local storage, running ovirt Node 4.1
>
> It appears that using the admin interface on port 9090 I can resize a live
> partition to a smaller size.
>
> Our storage is a separate LVM partition, ext4 formatted.
>
> My question is, both theoretically and practically, if anyone has feedback
> on:
>
>
> #1: Does this work (ie- will it shrink the filesystem then shrink the LV)?
>
> #2: May we do this with VMs running?
>
>
> Thanks
>
> Matt
>
>



-- 

PAVOL BRILLA

RHV QUALITY ENGINEER, CLOUD

Red Hat Czech Republic, Brno 

TRIED. TESTED. TRUSTED. 


Re: [ovirt-users] Ping::(action) Failed to ping x.x.x.x, (4 out of 5)

2018-03-28 Thread Yedidyah Bar David
On Tue, Mar 27, 2018 at 9:45 AM, i...@linuxfabrik.ch
 wrote:
> Hi all,
>
> we randomly and constantly have this message in our /var/log/ovirt-
> hosted-engine-ha/broker.log:
>
> /var/log/ovirt-hosted-engine-ha/broker.log:Thread-1::WARNING::2018-03-
> 27 08:17:25,891::ping::63::ping.Ping::(action) Failed to ping x.x.x.x,
> (4 out of 5)
>
> The pinged device is a switch (not a gateway). We know that a switch
> might drop ICMP packets if it needs to. The interesting thing is that
> when it fails, it always fails at "4 out of 5", but in the end (5 out
> of 5) it always succeeds.
>
> Is there a way to increase the amount of pings or to have another way
> instead of ping?

I now looked at the source, and I do not think there is a way. It might
be useful to add one, though - and it might not be too hard - perhaps
something like this (didn't test):

https://gerrit.ovirt.org/89528

That said, I think it's about time we change the text(s) to clarify
that we use this address to test network connectivity with ping -
so it does not really need to be a gateway, but it does need to reliably
reply to pings. If we do have places where we use it as an actual gateway,
we should simply ask two questions, or at least allow overriding it
in the config file.
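
Until something like that lands, the tolerant behaviour can at least be
approximated from the shell when checking the address by hand - i.e. succeed
if any one of several pings gets a reply, rather than requiring each attempt
to succeed. This is a generic sketch, not broker configuration, and the
target address is a placeholder:

```shell
# Hypothetical check: succeed if ANY of 10 single pings gets a reply.
target=192.0.2.1   # placeholder - replace with the monitored address
for i in $(seq 1 10); do
    if ping -c 1 -W 2 "$target" > /dev/null 2>&1; then
        echo "reachable after $i attempt(s)"
        exit 0
    fi
done
echo "unreachable after 10 attempts"
exit 1
```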

Best regards,
-- 
Didi