Hi,

We have managed to stabilize the DNS updates in our network. The current situation is as follows: I have 3 hosts that can run the engine (hosted-engine). They were all in 10.8.236.x. I have now moved one of them to 10.16.248.x.

If I boot the engine on one of the hosts in 10.8.236.x, the engine comes up with status "good". I can access the engine UI and see all my hosts, even the one in the 10.16.248.x network.

But if I boot the engine on the hosted-engine host that was switched to 10.16.248.x, the engine boots and I can ssh to it, but the status is always "fail for liveliness check".

The main difference is that when I boot on the host in the 10.16.248.x network, the engine gets an address in the 248.x network.

On the engine I have this in /var/log/ovirt-engine-dwh/ovirt-engine-dwhd.log:

2019-07-23 09:05:30|MFzehi|YYTDiS|jTq2w8|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704

The engine.log seems okay.

So I need to understand what this "liveliness check" does (or tries to do) so I can investigate why the engine status does not become good.

The initial deployment was done in the 10.8.236.x network. Maybe it has something to do with that.

Thanks & Regards
Carl

On Thu, Jul 18, 2019 at 8:53 AM Miguel Duarte de Mora Barroso <[email protected]> wrote:

> On Thu, Jul 18, 2019 at 2:50 PM Miguel Duarte de Mora Barroso <[email protected]> wrote:
> >
> > On Thu, Jul 18, 2019 at 1:57 PM carl langlois <[email protected]> wrote:
> > >
> > > Hi Miguel,
> > >
> > > I have managed to change the config for the ovn-controller
> > > with these commands
> > > ovs-vsctl set Open_vSwitch . external-ids:ovn-remote=ssl:10.16.248.74:6642
> > > ovs-vsctl set Open_vSwitch . external-ids:ovn-encap-ip=10.16.248.65
> > > and restarting the services
> >
> > Yes, that's what the script is supposed to do, check [0].
> >
> > Not sure why running vdsm-tool didn't work for you.
> >
> > >
> > > But even with this I still have the "fail for liveliness check" when starting the ovirt engine.
> > > But one thing i notice with our new network is that the reverse DNS does not work (IP -> hostname). The forward is working fine. I am trying to see with our IT why it is not working.
> >
> > Do you guys use OVN? If not, you could disable the provider, install
> > the hosted-engine VM, then, if needed, re-add / re-activate it.
>
> I'm assuming it fails for the same reason you've stated initially -
> i.e. ovn-controller is involved; if it is not, disregard this msg :)
> >
> > [0] - https://github.com/oVirt/ovirt-provider-ovn/blob/master/driver/scripts/setup_ovn_controller.sh#L24
> >
> > >
> > > Regards.
> > > Carl
> > >
> > > On Thu, Jul 18, 2019 at 4:03 AM Miguel Duarte de Mora Barroso <[email protected]> wrote:
> > >>
> > >> On Wed, Jul 17, 2019 at 7:07 PM carl langlois <[email protected]> wrote:
> > >> >
> > >> > Hi
> > >> > Here is the output of the command
> > >> >
> > >> > [root@ovhost1 ~]# vdsm-tool --vvverbose ovn-config 10.16.248.74 ovirtmgmt
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,581::cmdutils::150::root::(exec_cmd) lshw -json -disable usb -disable pcmcia -disable isapnp -disable ide -disable scsi -disable dmi -disable memory -disable cpuinfo (cwd None)
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,738::cmdutils::158::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,741::routes::109::root::(get_gateway) The gateway 10.16.248.1 is duplicated for the device ovirtmgmt
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,742::routes::109::root::(get_gateway) The gateway 10.16.248.1 is duplicated for the device ovirtmgmt
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,742::cmdutils::150::root::(exec_cmd) /sbin/tc qdisc show (cwd None)
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,744::cmdutils::158::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,745::cmdutils::150::root::(exec_cmd) /sbin/tc class show dev enp2s0f1 classid 0:1388 (cwd None)
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,747::cmdutils::158::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,766::cmdutils::150::root::(exec_cmd) /usr/share/openvswitch/scripts/ovs-ctl status (cwd None)
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,777::cmdutils::158::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,778::vsctl::67::root::(commit) Executing commands: /usr/bin/ovs-vsctl --timeout=5 --oneline --format=json -- list Bridge -- list Port -- list Interface
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,778::cmdutils::150::root::(exec_cmd) /usr/bin/ovs-vsctl --timeout=5 --oneline --format=json -- list Bridge -- list Port -- list Interface (cwd None)
> > >> > MainThread::DEBUG::2019-07-17 13:02:52,799::cmdutils::158::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0
> > >> > netlink/events::DEBUG::2019-07-17 13:02:52,802::concurrent::192::root::(run) START thread <Thread(netlink/events, started daemon 140299323660032)> (func=<bound method Monitor._scan of <vdsm.network.netlink.monitor.Monitor object at 0x7f99fb618c90>>, args=(), kwargs={})
> > >> > netlink/events::DEBUG::2019-07-17 13:02:54,805::concurrent::195::root::(run) FINISH thread <Thread(netlink/events, started daemon 140299323660032)>
> > >> > Using default PKI files
> > >> >
> > >> > I do not see any indication of the config??
> > >>
> > >> And afterwards when you execute "ovs-vsctl list Open_vSwitch" does it
> > >> reflect the updated value ?
> > >>
> > >> This command would have to be performed in the node where hosted
> > >> engine will be hosted - not sure if it's possible to determine before
> > >> hand which one it will be. If not, you should run it in all the nodes
> > >> in the cluster, to be sure.
> > >>
> > >> >
> > >> > Regards
> > >> > Carl
> > >> >
> > >> > On Wed, Jul 17, 2019 at 11:40 AM carl langlois <[email protected]> wrote:
> > >> >>
> > >> >> Hi
> > >> >>
> > >> >> I have opened a bug: https://bugzilla.redhat.com/show_bug.cgi?id=1730776
> > >> >>
> > >> >> I have tried this command "vdsm-tool ovn-config 10.16.248.74 ovirtmgmt" on one of the hosts but nothing changed. After a restart of the ovn-controller I still get
> > >> >>
> > >> >> 2019-07-17T15:38:52.572Z|00033|reconnect|INFO|ssl:10.8.236.244:6642: waiting 8 seconds before reconnect
> > >> >> 2019-07-17T15:39:00.578Z|00034|reconnect|INFO|ssl:10.8.236.244:6642: connecting...
> > >> >> 2019-07-17T15:39:05.720Z|00035|fatal_signal|WARN|terminating with signal 15 (Terminated)
> > >> >> 2019-07-17T15:39:05.863Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovn-controller.log
> > >> >> 2019-07-17T15:39:05.864Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
> > >> >> 2019-07-17T15:39:05.864Z|00003|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
> > >> >> 2019-07-17T15:39:05.865Z|00004|reconnect|INFO|ssl:10.8.236.244:6642: connecting...
> > >> >> 2019-07-17T15:39:06.865Z|00005|reconnect|INFO|ssl:10.8.236.244:6642: connection attempt timed out
> > >> >> 2019-07-17T15:39:06.865Z|00006|reconnect|INFO|ssl:10.8.236.244:6642: waiting 1 seconds before reconnect
> > >> >> 2019-07-17T15:39:07.867Z|00007|reconnect|INFO|ssl:10.8.236.244:6642: connecting...
> > >> >> 2019-07-17T15:39:08.867Z|00008|reconnect|INFO|ssl:10.8.236.244:6642: connection attempt timed out
> > >> >> 2019-07-17T15:39:08.868Z|00009|reconnect|INFO|ssl:10.8.236.244:6642: waiting 2 seconds before reconnect
> > >> >> 2019-07-17T15:39:10.870Z|00010|reconnect|INFO|ssl:10.8.236.244:6642: connecting...
> > >> >> 2019-07-17T15:39:12.872Z|00011|reconnect|INFO|ssl:10.8.236.244:6642: connection attempt timed out
> > >> >> 2019-07-17T15:39:12.872Z|00012|reconnect|INFO|ssl:10.8.236.244:6642: waiting 4 seconds before reconnect
> > >> >>
> > >> >>
> > >> >> On Wed, Jul 17, 2019 at 10:56 AM Miguel Duarte de Mora Barroso <[email protected]> wrote:
> > >> >>>
> > >> >>> On Wed, Jul 17, 2019 at 3:01 PM carl langlois <[email protected]> wrote:
> > >> >>> >
> > >> >>> > Hi Miguel
> > >> >>> >
> > >> >>> > if i do ovs-vsctl list Open_vSwitch i get
> > >> >>> >
> > >> >>> > uuid            : ce94c4b1-7eb2-42e3-8bfd-96e1dec40dea
> > >> >>> > bridges         : [9b0738ee-594d-4a87-8967-049a8b1a5774]
> > >> >>> > cur_cfg         : 1
> > >> >>> > datapath_types  : [netdev, system]
> > >> >>> > db_version      : "7.14.0"
> > >> >>> > external_ids    : {hostname="ovhost2", ovn-bridge-mappings="", ovn-encap-ip="10.8.236.150", ovn-encap-type=geneve, ovn-remote="ssl:10.8.236.244:6642", system-id="7c39d07b-1d54-417b-bf56-7a0f1a07f832"}
> > >> >>> > iface_types     : [geneve, gre, internal, lisp, patch, stt, system, tap, vxlan]
> > >> >>> > manager_options : []
> > >> >>> > next_cfg        : 1
> > >> >>> > other_config    : {}
> > >> >>> > ovs_version     : "2.7.3"
> > >> >>> > ssl             : []
> > >> >>> > statistics      : {}
> > >> >>> > system_type     : centos
> > >> >>> > system_version  : "7"
> > >> >>> >
> > >> >>> > I can see two addresses that are on the old network..
> > >> >>>
> > >> >>> Yes, those are it.
> > >> >>>
> > >> >>> Use the tool I mentioned to update that to the correct addresses on
> > >> >>> the network, and re-try.
> > >> >>>
> > >> >>> vdsm-tool ovn-config <engine_ip_on_net> <name of the management network>
> > >> >>>
> > >> >>> > Regards
> > >> >>> > Carl
> > >> >>> >
> > >> >>> >
> > >> >>> > On Wed, Jul 17, 2019 at 8:21 AM carl langlois <[email protected]> wrote:
> > >> >>> >>
> > >> >>> >> Hi Miguel,
> > >> >>> >>
> > >> >>> >> I will surely open a bug. Is there a specific oVirt component to select when opening the bug?
> > >> >>>
> > >> >>> ovirt-engine
> > >> >>>
> > >> >>> >>
> > >> >>> >> When you say that the hosted-engine should have triggered the update, do you mean it was supposed to trigger the update and did not work, or is something missing?
> > >> >>>
> > >> >>> I sincerely do not know. @Dominik Holler, could you shed some light into this ?
> > >> >>>
> > >> >>> >> Could I have missed a step when switching the network?
> > >> >>> >>
> > >> >>> >> Also, if I try to do "ovs-vsctl list ." the list command requires a table name. Not sure what table to use?
> > >> >>> >>
> > >> >>> >> Regards
> > >> >>> >> Carl
> > >> >>> >>
> > >> >>> >>
> > >> >>> >> On Wed, Jul 17, 2019 at 4:21 AM Miguel Duarte de Mora Barroso <[email protected]> wrote:
> > >> >>> >>>
> > >> >>> >>> On Tue, Jul 16, 2019 at 8:48 PM carl langlois <[email protected]> wrote:
> > >> >>> >>> >
> > >> >>> >>> > Hi
> > >> >>> >>> >
> > >> >>> >>> > We are in the process of changing our network connection. Our current network uses 10.8.236.x and we will change to 10.16.248.x. We have an HA oVirt cluster (around 10 nodes) currently configured on 10.8.236.x. So my question is: is it possible to relocate the oVirt cluster to 10.16.248.x? We have tried to move everything to the new network without success. All the nodes seem to boot up properly, and our gluster storage also works properly.
> > >> >>> >>> > When we try to start the hosted-engine it goes up but fails the liveliness check.
> > >> >>> >>> > We have noticed in /var/log/openvswitch/ovn-controller.log that it is trying to connect to the old IP address of the hosted-engine VM.
> > >> >>> >>> > 2019-07-16T18:41:29.483Z|01992|reconnect|INFO|ssl:10.8.236.244:6642: waiting 8 seconds before reconnect
> > >> >>> >>> > 2019-07-16T18:41:37.489Z|01993|reconnect|INFO|ssl:10.8.236.244:6642: connecting...
> > >> >>> >>> > 2019-07-16T18:41:45.497Z|01994|reconnect|INFO|ssl:10.8.236.244:6642: connection attempt timed out
> > >> >>> >>> >
> > >> >>> >>> > So my question is: where does 10.8.236.244 come from?
> > >> >>> >>>
> > >> >>> >>> Looks like the ovn controllers were not updated during the network change.
> > >> >>> >>>
> > >> >>> >>> The wrong IP is configured within openvswitch, you can see it in the
> > >> >>> >>> (offending) nodes through "ovs-vsctl list . ". It'll be a key in the
> > >> >>> >>> 'external_ids' column called 'ovn-remote'.
> > >> >>> >>>
> > >> >>> >>> This is not the solution, but a work-around; you could try to
> > >> >>> >>> configure the ovn controllers via:
> > >> >>> >>> vdsm-tool ovn-config <engine_ip_on_net> <name of the management network>
> > >> >>> >>>
> > >> >>> >>> Despite the provided work-around, I really think the hosted engine
> > >> >>> >>> should have triggered the ansible role that in turn triggers this
> > >> >>> >>> reconfiguration.
> > >> >>> >>>
> > >> >>> >>> Would you open a bug with this information ?
> > >> >>> >>>
> > >> >>> >>> >
> > >> >>> >>> > The routing table for one of our host look like this
> > >> >>> >>> >
> > >> >>> >>> > Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> > >> >>> >>> > default         gateway         0.0.0.0         UG    0      0        0 ovirtmgmt
> > >> >>> >>> > 10.16.248.0     0.0.0.0         255.255.255.0   U     0      0        0 ovirtmgmt
> > >> >>> >>> > link-local      0.0.0.0         255.255.0.0     U     1002   0        0 eno1
> > >> >>> >>> > link-local      0.0.0.0         255.255.0.0     U     1003   0        0 eno2
> > >> >>> >>> > link-local      0.0.0.0         255.255.0.0     U     1025   0        0 ovirtmgmt
> > >> >>> >>> >
> > >> >>> >>> > Any help would be really appreciated.
> > >> >>> >>> >
> > >> >>> >>> > Regards
> > >> >>> >>> > Carl
> > >> >>> >>> >
> > >> >>> >>> > _______________________________________________
> > >> >>> >>> > Users mailing list -- [email protected]
> > >> >>> >>> > To unsubscribe send an email to [email protected]
> > >> >>> >>> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> > >> >>> >>> > oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> > >> >>> >>> > List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/DBQUWEPPDK2JDFU4HOGNURK7AB3FDINC/
>
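
On the liveliness question at the top of this thread: as far as I know, the hosted-engine HA agent marks the engine "good" by fetching the engine's health page over HTTP, using the engine FQDN, and looking for a "DB Up" marker in the reply. The URL path and the marker in the sketch below are assumptions from memory and are not confirmed anywhere in this thread; `probe_engine` and `body_reports_healthy` are hypothetical helper names. A minimal Python sketch to repeat that probe by hand:

```python
import re
import urllib.request

# Assumption: path of the engine health page used by the liveliness check.
HEALTH_URL = "http://{fqdn}/ovirt-engine/services/health"
# Assumption: marker expected in the body when the engine and its DB are up.
DB_UP_RE = re.compile(r"DB Up")


def body_reports_healthy(body: str) -> bool:
    """Return True if the health page body contains the 'DB Up' marker."""
    return DB_UP_RE.search(body) is not None


def probe_engine(fqdn: str, timeout: float = 10.0) -> bool:
    """Fetch the health page by FQDN; DNS, socket and HTTP errors count as failure."""
    url = HEALTH_URL.format(fqdn=fqdn)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
    except OSError as exc:  # URLError and socket errors derive from OSError
        print(f"probe of {url} failed: {exc}")
        return False
    return body_reports_healthy(body)
```

Calling `probe_engine()` with the engine's FQDN from the host in 10.16.248.x and again from a host in 10.8.236.x should show whether the two hosts see the engine differently. If the check really goes through the FQDN, a name that does not resolve on the new network, or that still resolves to the old 10.8.236.x address, would make the probe fail even though ssh by IP works.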
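
To find hosts whose OVN settings still point at the old network, the `external_ids` line printed by "ovs-vsctl list Open_vSwitch" can be checked against the new subnet. The following is a minimal Python sketch, not part of any oVirt tool. The helper names are hypothetical, the sample line is the one posted earlier in this thread, and the /24 prefix is taken from the 255.255.255.0 genmask in the routing table above.

```python
import ipaddress
import re

# key=value pairs inside the external_ids map; values may be quoted or bare.
PAIR_RE = re.compile(r'([\w-]+)=("([^"]*)"|[^,}\s]+)')
IPV4_RE = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")

# Sample line copied from the ovs-vsctl output posted in this thread.
SAMPLE = ('external_ids : {hostname="ovhost2", ovn-bridge-mappings="", '
          'ovn-encap-ip="10.8.236.150", ovn-encap-type=geneve, '
          'ovn-remote="ssl:10.8.236.244:6642", '
          'system-id="7c39d07b-1d54-417b-bf56-7a0f1a07f832"}')


def parse_external_ids(line: str) -> dict:
    """Turn the external_ids line into a plain dict of strings."""
    ids = {}
    for key, raw, quoted in PAIR_RE.findall(line):
        # Quoted values lose their quotes; bare values are kept as-is.
        ids[key] = quoted if raw.startswith('"') else raw
    return ids


def stale_addresses(ids: dict, new_net: str) -> dict:
    """Return the OVN keys whose IPv4 address lies outside new_net."""
    net = ipaddress.ip_network(new_net)
    stale = {}
    for key in ("ovn-remote", "ovn-encap-ip"):
        match = IPV4_RE.search(ids.get(key, ""))
        if match and ipaddress.ip_address(match.group()) not in net:
            stale[key] = match.group()
    return stale


print(stale_addresses(parse_external_ids(SAMPLE), "10.16.248.0/24"))
```

For the sample line both `ovn-remote` and `ovn-encap-ip` are reported, which matches the "two addresses that are on the old network" observation above. After the two "ovs-vsctl set" commands mentioned in the thread have been applied on a host, the same check should return an empty dict for that host.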

