Hello all!
Let me start by saying that my problem isn't strictly OpenShift-related,
but I'm trying this mailing list in the hope that someone has faced it
before and managed to solve it.
I'm running an OpenShift Origin installation using the openshift-ansible
playbooks, and it all goes fine until Ansible tries to install and
configure the nodes.
Some background: all hosts are running updated CentOS 7 and are part of
a private network with IPs allocated through DHCP. I'm running a
separate host that is configured as a DNS server (as required by the
install procedure) and all the other hosts are configured to use this
DNS server for name resolution. To achieve this I had to disable the
NetworkManager service's ability to configure DNS, by setting
"dns=none" under the main section of
/etc/NetworkManager/NetworkManager.conf. This option prevents the
NetworkManager service from overwriting /etc/resolv.conf.
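To double-check that the option really sits in the main section (and not
in some other section of the file), a small helper like the one below
could be used. This is only a sketch: the function name nm_main_dns is
made up for illustration, and it assumes a plain INI-style layout.

```shell
#!/bin/sh
# Hypothetical helper, not part of NetworkManager or openshift-ansible.

# nm_main_dns CONF_FILE
# Print the value of the "dns" key found in the [main] section of
# CONF_FILE. Prints nothing if the key is absent from that section.
nm_main_dns() {
    awk '
        # A section header: remember whether it is [main].
        /^[ \t]*\[/ { in_main = ($0 ~ /^[ \t]*\[main\][ \t]*$/); next }
        # A "dns = value" line inside [main]: keep the trimmed value.
        in_main && /^[ \t]*dns[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")
            sub(/[ \t]*$/, "")
            value = $0
        }
        END { if (value != "") print value }
    ' "$1"
}
```

Running "nm_main_dns /etc/NetworkManager/NetworkManager.conf" on a host
configured as described above should print "none".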
The OpenShift installation runs fine up to a task in the "Install nodes"
playbook/batch where it tries starting and enabling the origin-node
services. Curiously enough, this task fails for only one node, while the
other three seem to pass it. At a later point, where the task is to
restart the origin-node service, the remaining three fail as well.
By inspecting the journalctl logs for origin-node, I've found that there
was no connectivity to a host on the network
....dial tcp: lookup lb.oshift-pinfold.intra on 192.168.150.16:53: no
such host....
In fact there's no connectivity to the entire network and
/etc/resolv.conf has been rewritten.
After some research into what was going on, I've found out that there's
a script copied and run by the OpenShift installer,
/etc/NetworkManager/dispatcher.d/99-origin-dns.sh, that overwrites
/etc/resolv.conf. I'm not really experienced in how this works, but I'm
guessing that the intended behaviour is to pass name resolution to the
dnsmasq service. I've found that the script also generates
/etc/origin/node/resolv.conf and /etc/dnsmasq.d/origin-upstream-dns.conf,
which seem to copy the nameservers found in /etc/resolv.conf at first run.
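To make that guess concrete, this is roughly the behaviour I would
expect. It is only a sketch and NOT the real 99-origin-dns.sh: the
function name and the watermark test (based on the comment line the
script leaves in /etc/resolv.conf) are assumptions.

```shell
#!/bin/sh
# Sketch only, NOT the real 99-origin-dns.sh. The function name and the
# exact watermark test are assumptions made for illustration.

# resolv_to_dnsmasq RESOLV_CONF
# Print one dnsmasq "server=" line per nameserver in RESOLV_CONF, unless
# the file already carries the "nameserver updated by" watermark.
resolv_to_dnsmasq() {
    if grep -q '^# nameserver updated by ' "$1"; then
        return 1    # already rewritten: the original nameservers are gone
    fi
    awk '$1 == "nameserver" && NF >= 2 { print "server=" $2 }' "$1"
}
```

With an unwatermarked /etc/resolv.conf that lists 192.168.150.5 first,
"resolv_to_dnsmasq /etc/resolv.conf" would print server=192.168.150.5
as its first line, which is what I expected to end up in
/etc/dnsmasq.d/origin-upstream-dns.conf.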
However, if I edit /etc/resolv.conf by hand to restore the initial
configuration and then run systemctl restart NetworkManager, my
internal nameserver is disregarded.
I'm thinking that the NetworkManager service somehow overwrites the
/etc/resolv.conf file prior to invoking the
/etc/NetworkManager/dispatcher.d/99-origin-dns.sh script.
I've tried manually editing /etc/origin/node/resolv.conf and
/etc/dnsmasq.d/origin-upstream-dns.conf and adding the dns server
without restarting NetworkManager service. This way name resolution is
functioning and I'm also able to start the origin-node service, but I'm
afraid this is not suited for the automated installation process.
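For reference, the manual workaround amounts to something like the
following. It is a sketch under assumptions: the function names are
made up, and placing the internal server first in each file is my own
choice.

```shell
#!/bin/sh
# Sketch of the manual workaround; both function names are hypothetical.

# add_upstream_dns SERVER_IP NODE_RESOLV DNSMASQ_CONF
# Put SERVER_IP at the top of both files, unless it is already listed.
add_upstream_dns() {
    server_ip=$1
    node_resolv=$2
    dnsmasq_conf=$3

    prepend_line "nameserver $server_ip" "$node_resolv"
    prepend_line "server=$server_ip" "$dnsmasq_conf"
}

# prepend_line LINE FILE
# Insert LINE as the first line of FILE if no identical line exists yet.
prepend_line() {
    line=$1
    file=$2
    [ -f "$file" ] || : > "$file"
    if ! grep -qxF -- "$line" "$file"; then
        { printf '%s\n' "$line"; cat "$file"; } > "$file.tmp" &&
            mv "$file.tmp" "$file"
    fi
}
```

On a node this would be called as:

    add_upstream_dns 192.168.150.5 /etc/origin/node/resolv.conf \
        /etc/dnsmasq.d/origin-upstream-dns.conf

Since the dispatcher script regenerates both files, the change is lost
the next time NetworkManager is restarted, which is why it does not fit
an automated install.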
Any help/hints are much appreciated!
Dan Pungă
==========================
actual behaviour:
### the starting contents of /etc/resolv.conf (with my internal dns
server configured as 192.168.150.5)
cat /etc/resolv.conf
search openstacklocal
search oshift-pinfold.intra
nameserver 192.168.150.5
nameserver 8.8.8.8
nameserver 8.8.4.4
###output of initial contents of configuration files
cat /etc/dnsmasq.d/origin-upstream-dns.conf
server=8.8.8.8
server=8.8.4.4
cat /etc/origin/node/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4
### empty the configuration files written by
/etc/NetworkManager/dispatcher.d/99-origin-dns.sh (just to prove the point)
> /etc/dnsmasq.d/origin-upstream-dns.conf
> /etc/origin/node/resolv.conf
### restart NetworkManager
systemctl restart NetworkManager
###results...
cat /etc/resolv.conf
# nameserver updated by /etc/NetworkManager/dispatcher.d/99-origin-dns.sh
search cluster.local openstacklocal
search cluster.local oshift-pinfold.intra
nameserver 192.168.150.22
cat /etc/origin/node/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4
cat /etc/dnsmasq.d/origin-upstream-dns.conf
server=8.8.8.8
server=8.8.4.4
So the /etc/NetworkManager/dispatcher.d/99-origin-dns.sh script should
write the nameservers found in /etc/resolv.conf (when it's not
watermarked) to the config files shown here. But it doesn't write the
nameserver for my internal DNS. The 8.8.8.8 and 8.8.4.4 entries could be
confusing, but if I create an /etc/resolv.conf with some bogus
nameservers, the result is precisely the same. I don't know where it
finds those two nameservers. My guess, as I mentioned in the first part
of the message, is that there's some config elsewhere that contains
them, that the NetworkManager service uses it to overwrite
/etc/resolv.conf, and that it is this modified version that the
/etc/NetworkManager/dispatcher.d/99-origin-dns.sh script finds and works
with.
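One way to test that guess would be to compare the DNS servers that
NetworkManager itself reports (for example, ones received over DHCP)
with those in /etc/resolv.conf. The sketch below assumes the usual
"IP4.DNS[n]:" lines in the output of "nmcli device show"; both function
names are made up for illustration.

```shell
#!/bin/sh
# Diagnostic sketch; both function names are hypothetical.

# nm_dns_servers
# Read "nmcli device show" output on stdin and print the IPv4 DNS
# servers NetworkManager reports, one per line.
nm_dns_servers() {
    awk '/^IP4\.DNS\[[0-9]+\]:/ { print $2 }'
}

# resolv_nameservers FILE
# Print the nameservers listed in a resolv.conf-style FILE.
resolv_nameservers() {
    awk '$1 == "nameserver" && NF >= 2 { print $2 }' "$1"
}
```

Usage on a node:

    nmcli device show | nm_dns_servers
    resolv_nameservers /etc/resolv.conf

If the first command lists 8.8.8.8 and 8.8.4.4 but not 192.168.150.5,
those servers most likely come from the DHCP lease or the connection
profile rather than from /etc/resolv.conf. NetworkManager also hands
its own list of DNS servers to dispatcher scripts through the
IP4_NAMESERVERS environment variable; whether 99-origin-dns.sh reads
that variable would need checking in the script itself.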
_______________________________________________
users mailing list
[email protected]
http://lists.openshift.redhat.com/openshiftmm/listinfo/users