Managed to solve this by adding my DNS server entry to the interface configuration file used by NetworkManager, on all DNS client hosts.

So, keep this in /etc/NetworkManager/NetworkManager.conf:
.....
[main]
dns=none

and add DNS1 for my DNS server in /etc/sysconfig/network-scripts/ifcfg-xxxx:
PEERDNS=YES # as required by the OpenShift install procedure for dnsmasq
DNS1=192.168.150.5 # my internal DNS server

This way /etc/resolv.conf remains untouched by NetworkManager, which still gets the DNS servers via DHCP but also adds mine.
nmcli conn show "System eth0" | grep IP4
shows the DNS servers obtained via DHCP plus my 192.168.150.5
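
The same check can be scripted. What follows is only a minimal sketch: it assumes the nmcli output lists DNS servers on lines of the form "IP4.DNS[n]:  <address>", and the function names list_ip4_dns and has_dns_server are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of NetworkManager or openshift-ansible.

# Print the DNS server addresses found in `nmcli conn show` output (stdin).
# Assumes lines of the form "IP4.DNS[1]:    192.168.150.5".
list_ip4_dns() {
    awk '/^IP4\.DNS\[[0-9]+\]:/ { print $2 }'
}

# Succeed if the address given as $1 is among the DNS servers on stdin.
has_dns_server() {
    list_ip4_dns | grep -qxF -- "$1"
}
```

On a client host this could be used as: nmcli conn show "System eth0" | has_dns_server 192.168.150.5 && echo "internal DNS present"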

On 09.05.2018 19:30, Dan Pungă wrote:
Hello all!

Let me first start by specifying that my problem isn't specifically OpenShift related, but I'm trying this mailing list in hopes that someone faced my problem before and somehow managed to solve it.

I'm running an OpenShift-Origin installation using the openshift-ansible playbooks, and it all goes fine until Ansible tries to install and configure the nodes.

Some background: all hosts are running updated CentOS 7 and are part of a private network with IPs allocated through DHCP. I'm running a separate host that is configured as a dns server (as required by the install procedure) and all the other hosts are configured to use this dns-server host for name resolution. In order to achieve this I had to disable the NetworkManager service's ability to configure DNS. This was done by specifying in /etc/NetworkManager/NetworkManager.conf "dns=none" under the main section. This option/configuration prevents the overwrite of /etc/resolv.conf by the NetworkManager service.

The OpenShift installation runs fine up to a task in the "Install nodes" playbook/batch where it tries starting and enabling the origin-node services. Curiously enough, this task fails for only 1 node, while the other 3 seem to pass it; at a later point, however, where the task is to restart the origin-node service, the remaining 3 fail as well.

By inspecting the journalctl logs for origin-node, I've found that there was no connectivity to a host on the network ....dial tcp: lookup lb.oshift-pinfold.intra on 192.168.150.16:53: no such host.... In fact there's no connectivity to the entire network and /etc/resolv.conf has been rewritten.

After some research into what was going on, I found that the OpenShift installer copies and runs a script, /etc/NetworkManager/dispatcher.d/99-origin-dns.sh, that overwrites /etc/resolv.conf. I'm not really experienced in how this works, but I'm guessing that the intended behaviour is to pass name resolution to the dnsmasq service. I've found that the script also generates /etc/origin/node/resolv.conf and /etc/dnsmasq.d/origin-upstream-dns.conf, which seem to copy the nameservers found in /etc/resolv.conf at first run. However, if I edit /etc/resolv.conf by hand to restore the initial configuration and then run systemctl restart NetworkManager, my internal nameserver is disregarded.
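
To illustrate the copying described above, here is a minimal sketch of turning the "nameserver" lines of a resolv.conf into dnsmasq "server=" lines. It is not the actual content of 99-origin-dns.sh, whose logic isn't shown in this message; the function name resolv_to_dnsmasq is an assumption made for the example.

```shell
#!/usr/bin/env bash
# Illustration only: NOT the logic of 99-origin-dns.sh.

# Read resolv.conf-style content on stdin and print one dnsmasq
# "server=<address>" line per "nameserver <address>" entry.
# Other lines (search, comments, options) are ignored.
resolv_to_dnsmasq() {
    awk '$1 == "nameserver" && $2 != "" { print "server=" $2 }'
}
```

Fed the starting /etc/resolv.conf shown further down, this would print a server= line for 192.168.150.5 as well as for 8.8.8.8 and 8.8.4.4, which is the result one would expect in /etc/dnsmasq.d/origin-upstream-dns.conf if the nameservers were copied straight from that file.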

I'm thinking that the NetworkManager service somehow overwrites the /etc/resolv.conf file prior to invoking the /etc/NetworkManager/dispatcher.d/99-origin-dns.sh script. I've tried manually editing /etc/origin/node/resolv.conf and /etc/dnsmasq.d/origin-upstream-dns.conf and adding the dns server without restarting NetworkManager service. This way name resolution is functioning and I'm also able to start the origin-node service, but I'm afraid this is not suited for the automated installation process.

Any help/hints are much appreciated!

Dan Pungă

==========================

actual behaviour:
### the starting contents of /etc/resolv.conf (with my internal dns server configured as 192.168.150.5)
cat /etc/resolv.conf
search openstacklocal
search oshift-pinfold.intra
nameserver 192.168.150.5
nameserver 8.8.8.8
nameserver 8.8.4.4

###output of initial contents of configuration files
cat /etc/dnsmasq.d/origin-upstream-dns.conf
server=8.8.8.8
server=8.8.4.4
cat /etc/origin/node/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4

### empty the configuration files written by /etc/NetworkManager/dispatcher.d/99-origin-dns.sh (just to prove the point)
> /etc/dnsmasq.d/origin-upstream-dns.conf
> /etc/origin/node/resolv.conf

### restart NetworkManager
systemctl restart NetworkManager

###results...
cat /etc/resolv.conf
# nameserver updated by /etc/NetworkManager/dispatcher.d/99-origin-dns.sh
search cluster.local openstacklocal
search cluster.local oshift-pinfold.intra
nameserver 192.168.150.22

cat /etc/origin/node/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4

cat /etc/dnsmasq.d/origin-upstream-dns.conf
server=8.8.8.8
server=8.8.4.4

So the /etc/NetworkManager/dispatcher.d/99-origin-dns.sh script should write the nameservers found in /etc/resolv.conf (when it's not watermarked) to the config files shown here. But it doesn't write the nameserver for my internal DNS. The 8.8.8.8 and 8.8.4.4 entries could be confusing, but if I create an /etc/resolv.conf with some bogus nameservers, the result is precisely the same. I don't know where it finds those 8.8 nameservers. My guess, as I mentioned in the first part of the message, is that there's some config elsewhere with the 8.8.. entries, that the NetworkManager service uses it to overwrite the /etc/resolv.conf file, and that it is this modified version that the /etc/NetworkManager/dispatcher.d/99-origin-dns.sh script finds and works with.
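
For reference, the "watermarked" test mentioned above could look like the sketch below. It assumes the watermark is exactly the comment line seen in the rewritten /etc/resolv.conf; how the real script performs this check is not shown here, and the name is_watermarked is hypothetical.

```shell
#!/usr/bin/env bash
# Illustration only: the real script's check may differ.

# Watermark line copied from the rewritten /etc/resolv.conf shown above.
WATERMARK='# nameserver updated by /etc/NetworkManager/dispatcher.d/99-origin-dns.sh'

# Read resolv.conf-style content on stdin; succeed if the watermark
# appears as a whole line.
is_watermarked() {
    grep -qxF -- "$WATERMARK"
}
```

Running is_watermarked < /etc/resolv.conf before and after systemctl restart NetworkManager would show whether the file the script starts from is the hand-edited one or one already rewritten.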

_______________________________________________
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
