Jonathan, I'd suggest trying the following:

1. Add "dns=none" to the [main] section of /etc/NetworkManager/NetworkManager.conf
2. Restart NetworkManager
3. Edit /etc/resolv.conf manually: set a proper nameserver and remove the 99-origin-dns.sh comment line
4. Restart NetworkManager again
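The steps above can be sketched as shell commands. The demonstration below applies the two file edits (steps 1 and 3) to local copies so the transformation is visible without touching /etc; on the real host you would run the same sed commands as root against /etc/NetworkManager/NetworkManager.conf and /etc/resolv.conf, with `systemctl restart NetworkManager` between steps. The nameserver address 192.168.1.10 is a placeholder, and the marker-comment wording is approximate.

```shell
# Step 1: add "dns=none" under [main] so NetworkManager stops
# rewriting /etc/resolv.conf (demonstrated on a local copy).
cp_conf=$(mktemp)
printf '[main]\nplugins=ifcfg-rh\n' > "$cp_conf"
sed -i '/^\[main\]/a dns=none' "$cp_conf"

# Step 3: drop the 99-origin-dns.sh marker comment and the public DNS
# server, then set the cluster's internal nameserver (placeholder
# address -- use the bastion's real one).
cp_resolv=$(mktemp)
printf '# Generated by 99-origin-dns.sh (wording approximate)\nnameserver 172.18.9.19\n' > "$cp_resolv"
sed -i '/99-origin-dns.sh/d; /^nameserver/d' "$cp_resolv"
echo 'nameserver 192.168.1.10' >> "$cp_resolv"
```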
From: <[email protected]> on behalf of Jonathan Lee <[email protected]> Date: Saturday, 12 May 2018 at 01:42 To: "[email protected]" <[email protected]> Subject: Enable traffic through an additional NIC The Origin documentation suggests that by default, OpenShift listens to traffic on ports 80 and 443 over all host network interfaces. Although not explicitly stated anywhere I can find, this suggests to me that the default network plugins treat all interfaces uniformly. However, I have encountered some odd network behavior on OpenShift master nodes with multiple NICs available on their virtual hosts. Because there are so many different ways to configure networking, I need help determining whether I have a misconfiguration or if I have encountered a bug. I have a private cloud that uses a Microsoft Hypervisor for each of my VMs. It has been configured with two virtual networks: a completely "isolated" network and a "public" network capable of communicating directly with other computers in my physical network and with the internet. I created a few VMs for my OpenShift cluster, each with a NIC pointing to the isolated network, and installed the minimal server image of CentOS 7.4 on each of them (with 1 additional VM running the Atomic Host variant). I then created an additional VM with 2 NICs, where 1 NIC was on the same isolated network, and the other NIC was on my "public" network, allowing SSH access from my physical workstation. This VM was used to host DNS services to the isolated network and was the point from which I executed Ansible scripts to install OpenShift Origin onto the isolated VMs. So I can SSH from my workstation into the bastion host, from which point I can SSH into the isolated VMs. 
[workstation] --- // (gateway) // ---- ("public" vLAN) ---- [bastion host] ---- ("isolated" vLAN) --- [OpenShift nodes]

where the subnet used for the public vLAN is 172.18.8.0/23, and the subnet used by the OpenShift cluster on the private vLAN is 192.168.1.0/24.

I am using the custom network configuration in Origin, so, for example, iptables is being used (not firewalld). The only non-standard configuration was manually setting --bip for the Docker daemon to 192.168.200.1/24, although there wasn't any risk of address overlap.

After running applications successfully on OpenShift Origin 3.7 in isolation with 4 nodes (including 1 master+etcd), I decided to open the OpenShift master to network traffic on the "public" vLAN, so I added a second network interface (eth1) to the OpenShift master node. NetworkManager shows the connection is up, and it successfully received a DHCP assignment. I am able to communicate with other VMs on that same subnet, but unlike all the other VMs NOT running OpenShift, this VM seems to allow traffic ONLY within the subnet on that public NIC. In other words, it's as if the gateway were misconfigured; but since I used DHCP and can see the correct route in the ip route output, that doesn't seem to be the case.

As if that weren't odd enough, my oc commands stopped working from the master node. Within my private vLAN, the master node's FQDN was openshift.private.net.
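For reference, the claim that the three subnets in play (public 172.18.8.0/23, cluster 192.168.1.0/24, and the Docker --bip bridge 192.168.200.1/24) don't overlap can be checked mechanically. A small pure-shell sketch (the helper function names are my own, not from any tool):

```shell
# Convert a dotted-quad address to a 32-bit integer.
ip2int() { IFS=. read -r a b c d <<< "$1"; echo $(( (a<<24)|(b<<16)|(c<<8)|d )); }

# overlap NET1/PREFIX1 NET2/PREFIX2 -> prints "yes" or "no".
# Two subnets overlap iff they agree on the shorter of the two prefixes.
overlap() {
  local n1=${1%/*} p1=${1#*/} n2=${2%/*} p2=${2#*/}
  local p=$(( p1 < p2 ? p1 : p2 ))
  local mask=$(( 0xFFFFFFFF << (32 - p) & 0xFFFFFFFF ))
  [ $(( $(ip2int "$n1") & mask )) -eq $(( $(ip2int "$n2") & mask )) ] \
    && echo yes || echo no
}

overlap 172.18.8.0/23 192.168.1.0/24     # prints "no"
overlap 192.168.1.0/24 192.168.200.0/24  # prints "no"
```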
Because the public vLAN's DNS server is unfamiliar with the "private.net" subdomain, if I simply run oc login from the master node, the response is:

Unable to connect to the server: dial tcp: lookup openshift.private.net on 172.18.9.19:53: no such host

FWIW, I have PEERDNS=no in the public interface configuration, yet 99-origin-dns.sh appears to have applied the public interface's DNS configuration to /etc/resolv.conf. So is there a requirement to manually specify the DNS server during Origin installation?

To test the theory that this is simply a DNS lookup issue, I manually edited the /etc/resolv.conf file. Now the oc commands yield a different error:

The connection to the server openshift.private.net:8443 was refused - did you specify the right host or port?

Since this is now beyond my basic networking expertise, I tried another cluster, where ALL VM interfaces were on the public vLAN. Communication between the VMs and the other devices on my network was successful, even with two public NICs on the master node. So I deployed OpenShift Origin 3.6, and suddenly I could only communicate over eth0. Otherwise, everything worked until I rebooted the master node. At that point, internal name resolution began failing, so I looked at the ip route output and found that the order in which the devices appear changed from the previous boot. Only when eth0 was first in the list did internal name resolution work.

So before I start banging my head against a wall, is there a variable I should set explicitly in my inventory file prior to deploying Origin to prevent OpenShift from breaking communication over additional NICs?
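On the last question: openshift-ansible does expose per-host inventory variables for pinning a node to a particular IP/hostname, which may be worth trying before redeploying. A hedged sketch (variable names are from the 3.6/3.7 openshift-ansible inventory documentation; all addresses below are placeholders for your environment):

```ini
# Hypothetical inventory snippet: pin the master to its isolated-vLAN
# address so the SDN and internal name resolution keep using eth0
# regardless of device ordering after a reboot.
[masters]
openshift.private.net openshift_ip=192.168.1.10 openshift_hostname=openshift.private.net openshift_public_ip=172.18.8.10

[OSEv3:vars]
# Force the node's DNS to the internal server instead of whatever
# 99-origin-dns.sh picks up from DHCP on the public interface.
openshift_dns_ip=192.168.1.5
```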
_______________________________________________
users mailing list
[email protected]
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
