Hi,

We have managed to stabilize the DNS update in our network. The current
situation is as follows.
I have 3 hosts that can run the engine (hosted-engine).
They were all in the 10.8.236.x network. Now I have moved one of them to
the 10.16.248.x network.

If I boot the engine on one of the hosts that is in 10.8.236.x, the
engine comes up with status "good". I can access the engine UI and I can
see all my hosts, even the one in the 10.16.248.x network.

But if I boot the engine on the hosted-engine host that was switched to
10.16.248.x, the engine boots and I can ssh to it, but the status is always
"fail for liveliness check".
The main difference is that when I boot on the host that is in the
10.16.248.x network, the engine gets an address in the 248.x network.

On the engine I have this in
/var/log/ovirt-engine-dwh/ovirt-engine-dwhd.log:
019-07-23 09:05:30|MFzehi|YYTDiS|jTq2w8|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
The engine.log seems okay.

So I need to understand what this "liveliness check" does (or tries to do)
so I can investigate why the engine status never becomes good.
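
One way to test this by hand is to request the engine's health page from
the host that is running the engine VM. The Python sketch below rests on an
assumption that is not confirmed anywhere in this thread: that the
liveliness check amounts to an HTTP GET of /ovirt-engine/services/health on
the engine FQDN and looks for "DB Up" in the reply. The function names are
hypothetical.

```python
import re
import urllib.request

# Assumption: path and "DB Up" marker are recalled from the hosted-engine
# tooling; neither is confirmed in this thread.
HEALTH_PATH = "/ovirt-engine/services/health"
DB_UP_RE = re.compile(r"DB Up")


def health_url(engine_fqdn):
    """Build the health page URL for the given engine FQDN."""
    return "http://{}{}".format(engine_fqdn, HEALTH_PATH)


def looks_alive(body):
    """Return True if the health page body reports the database as up."""
    return bool(DB_UP_RE.search(body))


def probe(engine_fqdn, timeout=10):
    """Fetch the health page and return (alive, detail).

    URL errors, timeouts and socket errors are reported in the detail
    string instead of being raised.
    """
    try:
        with urllib.request.urlopen(health_url(engine_fqdn),
                                    timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except OSError as exc:  # URLError and socket timeouts subclass OSError
        return False, "request failed: {}".format(exc)
    return looks_alive(body), body.strip()
```

Calling probe() with the engine FQDN from a host in each network would show
whether the request itself fails (for example because the name still
resolves to the old address) or whether the page answers without "DB Up".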

The initial deployment was done in the 10.8.236.x network. Maybe it has
something to do with that.

Thanks & Regards

Carl

On Thu, Jul 18, 2019 at 8:53 AM Miguel Duarte de Mora Barroso <
[email protected]> wrote:

> On Thu, Jul 18, 2019 at 2:50 PM Miguel Duarte de Mora Barroso
> <[email protected]> wrote:
> >
> > On Thu, Jul 18, 2019 at 1:57 PM carl langlois <[email protected]>
> wrote:
> > >
> > > Hi Miguel,
> > >
> > > I have managed to change the config for the ovn-controller
> > > with these commands:
> > >  ovs-vsctl set Open_vSwitch . external-ids:ovn-remote=ssl:10.16.248.74:6642
> > >  ovs-vsctl set Open_vSwitch . external-ids:ovn-encap-ip=10.16.248.65
> > > and restarting the services.
> >
> > Yes, that's what the script is supposed to do, check [0].
> >
> > Not sure why running vdsm-tool didn't work for you.
> >
> > >
> > > But even with this I still get the "fail for liveliness check" when
> > > starting the oVirt engine. One thing I noticed with our new network is
> > > that reverse DNS (IP -> hostname) does not work. Forward lookup works
> > > fine. I am checking with our IT why it is not working.
> >
> > Do you guys use OVN? If not, you could disable the provider, install
> > the hosted-engine VM, then, if needed, re-add / re-activate it .
>
> I'm assuming it fails for the same reason you've stated initially  -
> i.e. ovn-controller is involved; if it is not, disregard this msg :)
> >
> > [0] -
> https://github.com/oVirt/ovirt-provider-ovn/blob/master/driver/scripts/setup_ovn_controller.sh#L24
> >
> > >
> > > Regards.
> > > Carl
> > >
> > > On Thu, Jul 18, 2019 at 4:03 AM Miguel Duarte de Mora Barroso <
> [email protected]> wrote:
> > >>
> > >> On Wed, Jul 17, 2019 at 7:07 PM carl langlois <[email protected]>
> wrote:
> > >> >
> > >> > Hi
> > >> > Here is the output of the command
> > >> >
> > >> > [root@ovhost1 ~]# vdsm-tool --vvverbose ovn-config 10.16.248.74 ovirtmgmt
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,581::cmdutils::150::root::(exec_cmd) lshw -json -disable usb -disable pcmcia -disable isapnp -disable ide -disable scsi -disable dmi -disable memory -disable cpuinfo (cwd None)
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,738::cmdutils::158::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,741::routes::109::root::(get_gateway) The gateway 10.16.248.1 is duplicated for the device ovirtmgmt
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,742::routes::109::root::(get_gateway) The gateway 10.16.248.1 is duplicated for the device ovirtmgmt
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,742::cmdutils::150::root::(exec_cmd) /sbin/tc qdisc show (cwd None)
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,744::cmdutils::158::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,745::cmdutils::150::root::(exec_cmd) /sbin/tc class show dev enp2s0f1 classid 0:1388 (cwd None)
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,747::cmdutils::158::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,766::cmdutils::150::root::(exec_cmd) /usr/share/openvswitch/scripts/ovs-ctl status (cwd None)
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,777::cmdutils::158::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,778::vsctl::67::root::(commit) Executing commands: /usr/bin/ovs-vsctl --timeout=5 --oneline --format=json -- list Bridge -- list Port -- list Interface
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,778::cmdutils::150::root::(exec_cmd) /usr/bin/ovs-vsctl --timeout=5 --oneline --format=json -- list Bridge -- list Port -- list Interface (cwd None)
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,799::cmdutils::158::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0
> > >> > netlink/events::DEBUG::2019-07-17 13:02:52,802::concurrent::192::root::(run) START thread <Thread(netlink/events, started daemon 140299323660032)> (func=<bound method Monitor._scan of <vdsm.network.netlink.monitor.Monitor object at 0x7f99fb618c90>>, args=(), kwargs={})
> > >> > netlink/events::DEBUG::2019-07-17 13:02:54,805::concurrent::195::root::(run) FINISH thread <Thread(netlink/events, started daemon 140299323660032)>
> > >> > Using default PKI files
> > >> >
> > >> > I do not see any indication of the config??
> > >>
> > >> And afterwards when you execute "ovs-vsctl list Open_vSwitch" does it
> > >> reflect the updated value ?
> > >>
> > >> This command would have to be performed in the node where hosted
> > >> engine will be hosted - not sure if it's possible to determine before
> > >> hand which one it will be. If not, you should run it in all the nodes
> > >> in the cluster, to be sure.
> > >>
> > >> >
> > >> > Regards
> > >> > Carl
> > >> >
> > >> > On Wed, Jul 17, 2019 at 11:40 AM carl langlois <
> [email protected]> wrote:
> > >> >>
> > >> >> Hi
> > >> >>
> > >> >> I have opened a bug:
> > >> >> https://bugzilla.redhat.com/show_bug.cgi?id=1730776
> > >> >>
> > >> >> I tried the command "vdsm-tool ovn-config 10.16.248.74 ovirtmgmt"
> > >> >> on one of the hosts but nothing changed. After a restart of the
> > >> >> ovn-controller I still get
> > >> >>
> > >> >> 2019-07-17T15:38:52.572Z|00033|reconnect|INFO|ssl:10.8.236.244:6642: waiting 8 seconds before reconnect
> > >> >> 2019-07-17T15:39:00.578Z|00034|reconnect|INFO|ssl:10.8.236.244:6642: connecting...
> > >> >> 2019-07-17T15:39:05.720Z|00035|fatal_signal|WARN|terminating with signal 15 (Terminated)
> > >> >> 2019-07-17T15:39:05.863Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovn-controller.log
> > >> >> 2019-07-17T15:39:05.864Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
> > >> >> 2019-07-17T15:39:05.864Z|00003|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
> > >> >> 2019-07-17T15:39:05.865Z|00004|reconnect|INFO|ssl:10.8.236.244:6642: connecting...
> > >> >> 2019-07-17T15:39:06.865Z|00005|reconnect|INFO|ssl:10.8.236.244:6642: connection attempt timed out
> > >> >> 2019-07-17T15:39:06.865Z|00006|reconnect|INFO|ssl:10.8.236.244:6642: waiting 1 seconds before reconnect
> > >> >> 2019-07-17T15:39:07.867Z|00007|reconnect|INFO|ssl:10.8.236.244:6642: connecting...
> > >> >> 2019-07-17T15:39:08.867Z|00008|reconnect|INFO|ssl:10.8.236.244:6642: connection attempt timed out
> > >> >> 2019-07-17T15:39:08.868Z|00009|reconnect|INFO|ssl:10.8.236.244:6642: waiting 2 seconds before reconnect
> > >> >> 2019-07-17T15:39:10.870Z|00010|reconnect|INFO|ssl:10.8.236.244:6642: connecting...
> > >> >> 2019-07-17T15:39:12.872Z|00011|reconnect|INFO|ssl:10.8.236.244:6642: connection attempt timed out
> > >> >> 2019-07-17T15:39:12.872Z|00012|reconnect|INFO|ssl:10.8.236.244:6642: waiting 4 seconds before reconnect
> > >> >>
> > >> >>
> > >> >>
> > >> >> On Wed, Jul 17, 2019 at 10:56 AM Miguel Duarte de Mora Barroso <
> [email protected]> wrote:
> > >> >>>
> > >> >>> On Wed, Jul 17, 2019 at 3:01 PM carl langlois <
> [email protected]> wrote:
> > >> >>> >
> > >> >>> > Hi Miguel
> > >> >>> >
> > >> >>> > If I do "ovs-vsctl list Open_vSwitch" I get
> > >> >>> >
> > >> >>> > uuid               : ce94c4b1-7eb2-42e3-8bfd-96e1dec40dea
> > >> >>> > bridges             : [9b0738ee-594d-4a87-8967-049a8b1a5774]
> > >> >>> > cur_cfg             : 1
> > >> >>> > datapath_types      : [netdev, system]
> > >> >>> > db_version          : "7.14.0"
> > >> >>> > external_ids        : {hostname="ovhost2", ovn-bridge-mappings="", ovn-encap-ip="10.8.236.150", ovn-encap-type=geneve, ovn-remote="ssl:10.8.236.244:6642", system-id="7c39d07b-1d54-417b-bf56-7a0f1a07f832"}
> > >> >>> > iface_types         : [geneve, gre, internal, lisp, patch, stt, system, tap, vxlan]
> > >> >>> > manager_options     : []
> > >> >>> > next_cfg            : 1
> > >> >>> > other_config        : {}
> > >> >>> > ovs_version         : "2.7.3"
> > >> >>> > ssl                 : []
> > >> >>> > statistics          : {}
> > >> >>> > system_type         : centos
> > >> >>> > system_version      : "7"
> > >> >>> >
> > >> >>> > I can see two addresses that are on the old network..
> > >> >>>
> > >> >>> Yes, those are it.
> > >> >>>
> > >> >>> Use the tool I mentioned to update that to the correct addresses on
> > >> >>> the network, and re-try.
> > >> >>>
> > >> >>> vdsm-tool ovn-config <engine_ip_on_net> <name of the management network>
> > >> >>>
> > >> >>> > Regards
> > >> >>> > Carl
> > >> >>> >
> > >> >>> >
> > >> >>> > On Wed, Jul 17, 2019 at 8:21 AM carl langlois <
> [email protected]> wrote:
> > >> >>> >>
> > >> >>> >> Hi Miguel,
> > >> >>> >>
> > >> >>> >> I will surely open a bug. Is there a specific oVirt component to
> > >> >>> >> select when opening the bug?
> > >> >>>
> > >> >>> ovirt-engine
> > >> >>>
> > >> >>> >>
> > >> >>> >> When you say that the hosted-engine should have triggered the
> > >> >>> >> update, do you mean it was supposed to trigger the update and it
> > >> >>> >> did not work, or is something missing?
> > >> >>>
> > >> >>> I sincerely do not know. @Dominik Holler, could you shed some light
> > >> >>> into this ?
> > >> >>>
> > >> >>> >> Could i have missed a step when switching the network?
> > >> >>> >>
> > >> >>> >> Also, if I try "ovs-vsctl list ." the list command requires a table
> > >> >>> >> name. Not sure what table to use?
> > >> >>> >>
> > >> >>> >> Regards
> > >> >>> >> Carl
> > >> >>> >>
> > >> >>> >>
> > >> >>> >>
> > >> >>> >> On Wed, Jul 17, 2019 at 4:21 AM Miguel Duarte de Mora Barroso <
> [email protected]> wrote:
> > >> >>> >>>
> > >> >>> >>> On Tue, Jul 16, 2019 at 8:48 PM carl langlois <
> [email protected]> wrote:
> > >> >>> >>> >
> > >> >>> >>> > Hi
> > >> >>> >>> >
> > >> >>> >>> > We are in the process of changing our network. Our current
> > >> >>> >>> > network uses 10.8.236.x and we will change to 10.16.248.x.
> > >> >>> >>> > We have an HA oVirt cluster (around 10 nodes) currently
> > >> >>> >>> > configured on 10.8.236.x. So my question is: is it possible
> > >> >>> >>> > to relocate the oVirt cluster to 10.16.248.x? We have tried
> > >> >>> >>> > to move everything to the new network without success. All
> > >> >>> >>> > the nodes seem to boot up properly, and our gluster storage
> > >> >>> >>> > also works properly.
> > >> >>> >>> > When we try to start the hosted-engine it comes up but fails
> > >> >>> >>> > the liveliness check. We have noticed in
> > >> >>> >>> > /var/log/openvswitch/ovn-controller.log that it is trying to
> > >> >>> >>> > connect to the old IP address of the hosted-engine VM.
> > >> >>> >>> > 2019-07-16T18:41:29.483Z|01992|reconnect|INFO|ssl:10.8.236.244:6642: waiting 8 seconds before reconnect
> > >> >>> >>> > 2019-07-16T18:41:37.489Z|01993|reconnect|INFO|ssl:10.8.236.244:6642: connecting...
> > >> >>> >>> > 2019-07-16T18:41:45.497Z|01994|reconnect|INFO|ssl:10.8.236.244:6642: connection attempt timed out
> > >> >>> >>> >
> > >> >>> >>> > So my question is: where does the 10.8.236.244 come from?
> > >> >>> >>>
> > >> >>> >>> Looks like the ovn controllers were not updated during the
> > >> >>> >>> network change.
> > >> >>> >>>
> > >> >>> >>> The wrong IP is configured within openvswitch, you can see it in
> > >> >>> >>> the (offending) nodes through "ovs-vsctl list . ". It'll be a key
> > >> >>> >>> in the 'external_ids' column called 'ovn-remote' .
> > >> >>> >>>
> > >> >>> >>> This is not the solution, but a work-around; you could try to
> > >> >>> >>> configure the ovn controllers via:
> > >> >>> >>> vdsm-tool ovn-config <engine_ip_on_net> <name of the management network>
> > >> >>> >>>
> > >> >>> >>> Despite the provided work-around, I really think the hosted engine
> > >> >>> >>> should have triggered the ansible role that in turn triggers this
> > >> >>> >>> reconfiguration.
> > >> >>> >>>
> > >> >>> >>> Would you open a bug with this information ?
> > >> >>> >>>
> > >> >>> >>>
> > >> >>> >>> >
> > >> >>> >>> > The routing table for one of our hosts looks like this:
> > >> >>> >>> >
> > >> >>> >>> > Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> > >> >>> >>> > default         gateway         0.0.0.0         UG    0      0        0 ovirtmgmt
> > >> >>> >>> > 10.16.248.0     0.0.0.0         255.255.255.0   U     0      0        0 ovirtmgmt
> > >> >>> >>> > link-local      0.0.0.0         255.255.0.0     U     1002   0        0 eno1
> > >> >>> >>> > link-local      0.0.0.0         255.255.0.0     U     1003   0        0 eno2
> > >> >>> >>> > link-local      0.0.0.0         255.255.0.0     U     1025   0        0 ovirtmgmt
> > >> >>> >>> >
> > >> >>> >>> > Any help would be really appreciated.
> > >> >>> >>> >
> > >> >>> >>> > Regards
> > >> >>> >>> > Carl
> > >> >>> >>> >
> > >> >>> >>> >
> > >> >>> >>> >
> > >> >>> >>> >
> > >> >>> >>> > _______________________________________________
> > >> >>> >>> > Users mailing list -- [email protected]
> > >> >>> >>> > To unsubscribe send an email to [email protected]
> > >> >>> >>> > Privacy Statement:
> https://www.ovirt.org/site/privacy-policy/
> > >> >>> >>> > oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> > >> >>> >>> > List Archives:
> https://lists.ovirt.org/archives/list/[email protected]/message/DBQUWEPPDK2JDFU4HOGNURK7AB3FDINC/
>
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/UB72PHIP2FO3EC3M3NRKDGOL6SA3MAE5/
