Hi Everyone,

  Thanks for the help, answers below.

On 2017-04-10 05:27 AM, Sandro Bonazzola wrote:
Adding Simone and Martin, replying inline.

On Mon, Apr 10, 2017 at 10:16 AM, Ondrej Svoboda <osvob...@redhat.com> wrote:

    Hello Charles,

    First, can you give us more information regarding the duplicated
    IPv6 addresses? Since you are going to reinstall the hosted
    engine, could you make sure that NetworkManager is disabled before
    adding the second vNIC (and perhaps even disable IPv6 and reboot
    as well, so we have a solid base and see what makes the difference)?

I disabled NetworkManager on the hosts (systemctl disable NetworkManager ; service NetworkManager stop) before doing the oVirt setup and rebooted to make sure that it didn't come back up. Or are you referring to the hosted engine VM? I just removed and re-added the eth1 NIC in the hosted engine, and this is what showed up in dmesg:

[Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: [1af4:1000] type 00 class 0x020000
[Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: reg 0x10: [io 0x0000-0x001f]
[Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: reg 0x14: [mem 0x00000000-0x00000fff]
[Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
[Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: reg 0x30: [mem 0x00000000-0x0003ffff pref]
[Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: BAR 6: assigned [mem 0xc0000000-0xc003ffff pref]
[Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: BAR 4: assigned [mem 0xc0040000-0xc0043fff 64bit pref]
[Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: BAR 1: assigned [mem 0xc0044000-0xc0044fff]
[Mon Apr 10 06:46:43 2017] pci 0000:00:08.0: BAR 0: assigned [io 0x1000-0x101f]
[Mon Apr 10 06:46:43 2017] virtio-pci 0000:00:08.0: enabling device (0000 -> 0003)
[Mon Apr 10 06:46:43 2017] virtio-pci 0000:00:08.0: irq 35 for MSI/MSI-X
[Mon Apr 10 06:46:43 2017] virtio-pci 0000:00:08.0: irq 36 for MSI/MSI-X
[Mon Apr 10 06:46:43 2017] virtio-pci 0000:00:08.0: irq 37 for MSI/MSI-X
[Mon Apr 10 06:46:43 2017] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[Mon Apr 10 06:46:43 2017] IPv6: eth1: IPv6 duplicate address fe80::21a:4aff:fe16:151 detected!
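
For anyone else chasing something similar: I believe the failed duplicate-address detection is also visible on the interface itself, since the kernel flags such addresses as "dadfailed" in the iproute2 output. A quick way to check (eth1 being my affected device):

# addresses that failed DAD show the "dadfailed" flag
ip -6 addr show dev eth1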

Then when the network dropped I started getting these:

[Mon Apr 10 06:48:00 2017] IPv6: eth1: IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151 detected!
[Mon Apr 10 06:48:00 2017] IPv6: eth1: IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151 detected!
[Mon Apr 10 06:49:51 2017] IPv6: eth1: IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151 detected!
[Mon Apr 10 06:51:40 2017] IPv6: eth1: IPv6 duplicate address 2001:410:e000:902:21a:4aff:fe16:151 detected!

The network on eth1 would go down for a few seconds and then come back up, but networking stays solid on eth0. I disabled NetworkManager on the HE VM as well to see if that makes a difference, and I also disabled IPv6 with sysctl to see if that helps. I'll install an Ubuntu VM on the cluster later today and see if it has a similar issue.
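
In case the exact settings matter to anyone, the sysctl change I made was along these lines (from memory, so treat it as a sketch; the interface names are from my setup):

# /etc/sysctl.d/99-disable-ipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.eth0.disable_ipv6 = 1
net.ipv6.conf.eth1.disable_ipv6 = 1

# load it without a reboot
sysctl -p /etc/sysctl.d/99-disable-ipv6.conf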



    What kind of documentation did you follow to install the hosted
    engine? Was it this page?
    https://www.ovirt.org/documentation/how-to/hosted-engine/ If so,
    could you file a bug against VDSM networking and attach
    /var/log/vdsm/vdsm.log and supervdsm.log, and make sure they
    include the time period from adding the second vNIC to rebooting?

    Second, the vNIC going missing after reboot also looks like a bug
    to me. Even though eth1 does not exist in the VM, can you see it
    defined for the VM in the engine web GUI?


If the HE VM configuration wasn't flushed to the OVF_STORE yet, it makes sense that it disappeared on restart.

The docs I used were https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html/self-hosted_engine_guide/chap-deploying_self-hosted_engine#Deploying_Self-Hosted_Engine_on_RHEL which someone on the list pointed me to last week as being more up-to-date than what was on the website. (The docs on the website don't seem to mention that you need to put the HE on its own datastore, and they look to be geared more towards a bare-metal engine than the self-hosted VM option.)

When I went back into the GUI and looked at the hosted engine config, the second NIC was listed there, but it wasn't showing up in lspci on the VM. I removed the NIC in the GUI and re-added it, and the device appeared again on the VM. What is the proper way to "save" the state of the VM so that the OVF_STORE gets updated? When I do anything on the HE VM that I want to test I just type "reboot", but that powers down the VM. I then log in to my host and run "hosted-engine --vm-start", which restarts it, but of course the last time I did that it restarted without the second NIC.
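
I'm guessing the safer way to restart it is to put the cluster into global maintenance first, something like the sequence below (please correct me if this is wrong), so the HA agents don't immediately restart the VM out from under me:

hosted-engine --set-maintenance --mode=global
hosted-engine --vm-shutdown
hosted-engine --vm-start
hosted-engine --set-maintenance --mode=none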


    The steps you took to install the hosted engine with regards to
    networking look good to me, but I believe Sandro (CC'ed) would be
    able to give more advice.

    Sandro, since we want to configure bonding, would you recommend
    installing the engine physically first, moving it to a VM according
    to the following method, and only then reconfiguring networking?
    https://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_Metal_to_an_EL-Based_Self-Hosted_Environment/



I don't see why a direct HE deployment couldn't be done. Simone, Martin, can you help here?



    Thank you,
    Ondra

    On Mon, Apr 10, 2017 at 8:51 AM, Charles Tassell
    <ctass...@gmail.com> wrote:

        Hi Everyone,

          Okay, I'm again having problems getting basic networking set
        up with oVirt 4.1.  Here is my situation.  I have two servers
        I want to use to create an oVirt cluster, with two different
        networks.  My "public" network is a 1G link on device em1
        connected to my Internet feed, and my "storage" network is a
        10G link on device p5p1 connected to my file server.  Since I
        need to connect to my storage network in order to do the
        install, I selected p5p1 as the ovirtmgmt interface when
        installing the hosted engine.  That worked fine and I got
        everything installed, so I used some ssh-proxy magic to
        connect to the web console and completed the install (set up a
        storage domain, created a new network vmNet for VM networking,
        and added em1 to it).
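
        (The "ssh-proxy magic", for what it's worth, was just a local
        port forward; something along these lines, with the host and
        engine names as placeholders:

        ssh -L 8443:engine.example.com:443 root@host1.example.com

        and then browsing to https://localhost:8443/ from my
        workstation.)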

          The problem was that when I added a second network device to
        the HostedEngine VM (so that I can connect to it from my
        public network) it would intermittently go down.  I did some
        digging and found some IPv6 errors in dmesg (IPv6: eth1: IPv6
        duplicate address 2001:410:e000:902:21a:4aff:fe16:151
        detected!) so I disabled IPv6 on both eth0 and eth1 in the
        HostedEngine and rebooted it.  But when I restarted the VM,
        the eth1 device was missing.

          So, my question is: Can I add a second NIC to the
        HostedEngine VM and make it stick, or will it be deleted
        whenever the engine VM is restarted?

When you change something in the HE VM using the web UI, it also has to be saved to the OVF_STORE to make it permanent across reboots.
Martin, can you please elaborate here?
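
One thing that might help while testing: if I remember correctly, the periodic OVF_STORE update interval is controlled by the OvfUpdateIntervalInMinutes key, so you can check it (and temporarily lower it) with engine-config on the engine VM:

engine-config -g OvfUpdateIntervalInMinutes
engine-config -s OvfUpdateIntervalInMinutes=5
# engine-config changes require an engine restart
systemctl restart ovirt-engine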


        Is there a better way to do what I'm trying to do?  I.e.,
        should I set up ovirtmgmt on the public em1 interface, and
        then create the "storage" network after the fact for
        connecting to the datastores and such?  Is that even possible,
        or required?  I was thinking it would be better for migrations
        and other management functions to happen on the faster 10G
        network, but if the HostedEngine doesn't need to be able to
        connect to the storage network maybe it's not worth the effort?

          Eventually I want to set up LACP on the storage network, but
        I had to wipe the servers and reinstall from scratch the last
        time I tried that.  I was thinking it was because I set up the
        bonding before installing oVirt, so I didn't do that this
        time.  (I've put a sketch of what I think the bonding config
        would look like after my ifcfg files below.)

          Here are my /etc/sysconfig/network-scripts/ifcfg-* files in
        case I did something wrong there (I'm more familiar with
        Debian/Ubuntu network setup than CentOS):

        ifcfg-eth0: (ovirtmgmt aka storage)
        ----------------
        BROADCAST=192.168.130.255
        NETMASK=255.255.255.0
        BOOTPROTO=static
        DEVICE=eth0
        IPADDR=192.168.130.179
        ONBOOT=yes
        DOMAIN=public.net
        ZONE=public
        IPV6INIT=no


        ifcfg-eth1: (vmNet aka Internet)
        ----------------
        BROADCAST=192.168.1.255
        NETMASK=255.255.255.0
        BOOTPROTO=static
        DEVICE=eth1
        IPADDR=192.168.1.179
        GATEWAY=192.168.1.254
        ONBOOT=yes
        DNS1=192.168.1.1
        DNS2=192.168.1.2
        DOMAIN=public.net
        ZONE=public
        IPV6INIT=no
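
        For when I try LACP again, I assume the host-side config would
        look something like this (an untested sketch: p5p2 is a guess
        at the second 10G port's name, the IP is a placeholder for the
        host's storage address, and the switch ports would need to be
        configured for 802.3ad as well):

        ifcfg-bond0: (storage, LACP)
        ----------------
        DEVICE=bond0
        BONDING_OPTS="mode=802.3ad miimon=100"
        BOOTPROTO=static
        # host's storage-network IP (placeholder)
        IPADDR=192.168.130.x
        NETMASK=255.255.255.0
        ONBOOT=yes
        IPV6INIT=no

        ifcfg-p5p1: (and the same for p5p2)
        ----------------
        DEVICE=p5p1
        MASTER=bond0
        SLAVE=yes
        BOOTPROTO=none
        ONBOOT=yes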






--

SANDRO BONAZZOLA

ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D

Red Hat EMEA <https://www.redhat.com/>



_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
