"DNS not working" is likely a symptom. First, (by `kubectl exec`ing around), verify that pods on multiple nodes can talk to each other, and that they can talk to the internet (`ping 8.8.8.8; host google.com 8.8.8.8`).
Another common problem is that the kube-dns pods inherit the DNS settings of the host. Check whether /etc/resolv.conf inside them is sensible (it should point at some DNS resolver, and that resolver should be reachable from the pod too).

/MR

On Fri, Feb 24, 2017 at 7:37 PM thomas rogers <[email protected]> wrote:

> Hello,
>
> I am using ansible to deploy Kubernetes to bare metal using the playbooks
> from https://github.com/kubernetes/contrib/tree/master/ansible and am
> having problems with DNS resolution from pods not living on the master node.
>
> Here are some details on the setup:
>
> Docker: Docker version 1.13.1, build 092cba3
> Flanneld: Flanneld version 0.5.5
> Kubernetes: Kubernetes v1.4.5
> OS: Linux version 4.4.0-59-generic (buildd@lgw01-11) (gcc version 5.4.0
> 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)) #80-Ubuntu SMP
> Fri Jan 6 17:47:47 UTC 2017
>
> Testing, using the instructions from https://kubernetes.io/docs/admin/dns/
> on a single node (running on the master) yields:
>
> Server: 10.254.0.10
> Address 1: 10.254.0.10 kube-dns.kube-system.svc.cluster.local
>
> Name: kubernetes.default
> Address 1: 10.254.0.1 kubernetes.default.svc.cluster.local
>
> while testing on a different node yields:
>
> Server: 10.254.0.10
> Address 1: 10.254.0.10
>
> nslookup: can't resolve 'kubernetes.default'
>
> I have also used the vagrant scripts provided by the repository, using
> ubuntu16 with libvirt, and see behaviour consistent with that of the
> physical machines.
>
> I should also note that I was able to test the vagrant script using
> centos7 (which installed kubernetes 1.4.0), which actually appeared to
> work; however, I wasn't able to adapt that to the ubuntu16 case.
> Some information from the VM setup:
>
> Logs from kube-dns kubedns:
>
> I0224 19:28:08.359739 1 server.go:94] Using https://10.254.0.1:443 for kubernetes master, kubernetes API: <nil>
> I0224 19:28:08.365076 1 server.go:99] v1.5.0-alpha.0.1651+7dcae5edd84f06-dirty
> I0224 19:28:08.365111 1 server.go:101] FLAG: --alsologtostderr="false"
> I0224 19:28:08.365174 1 server.go:101] FLAG: --dns-port="10053"
> I0224 19:28:08.365195 1 server.go:101] FLAG: --domain="cluster.local."
> I0224 19:28:08.365217 1 server.go:101] FLAG: --federations=""
> I0224 19:28:08.365224 1 server.go:101] FLAG: --healthz-port="8081"
> I0224 19:28:08.365229 1 server.go:101] FLAG: --kube-master-url=""
> I0224 19:28:08.365256 1 server.go:101] FLAG: --kubecfg-file=""
> I0224 19:28:08.365262 1 server.go:101] FLAG: --log-backtrace-at=":0"
> I0224 19:28:08.365269 1 server.go:101] FLAG: --log-dir=""
> I0224 19:28:08.365275 1 server.go:101] FLAG: --log-flush-frequency="5s"
> I0224 19:28:08.365281 1 server.go:101] FLAG: --logtostderr="true"
> I0224 19:28:08.365285 1 server.go:101] FLAG: --stderrthreshold="2"
> I0224 19:28:08.365290 1 server.go:101] FLAG: --v="0"
> I0224 19:28:08.365294 1 server.go:101] FLAG: --version="false"
> I0224 19:28:08.365299 1 server.go:101] FLAG: --vmodule=""
> I0224 19:28:08.365389 1 server.go:138] Starting SkyDNS server. Listening on port:10053
> I0224 19:28:08.365615 1 server.go:145] skydns: metrics enabled on : /metrics:
> I0224 19:28:08.365684 1 dns.go:166] Waiting for service: default/kubernetes
> I0224 19:28:08.366448 1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
> I0224 19:28:08.366517 1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
> I0224 19:28:38.366301 1 dns.go:172] Ignoring error while waiting for service default/kubernetes: Get https://10.254.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.254.0.1:443: i/o timeout. Sleeping 1s before retrying.
> E0224 19:28:38.367639 1 reflector.go:214] pkg/dns/dns.go:154: Failed to list *api.Endpoints: Get https://10.254.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.254.0.1:443: i/o timeout
> E0224 19:28:38.368457 1 reflector.go:214] pkg/dns/dns.go:155: Failed to list *api.Service: Get https://10.254.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.254.0.1:443: i/o timeout
> I0224 19:29:09.367199 1 dns.go:172] Ignoring error while waiting for service default/kubernetes: Get https://10.254.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.254.0.1:443: i/o timeout. Sleeping 1s before retrying.
> E0224 19:29:09.368291 1 reflector.go:214] pkg/dns/dns.go:154: Failed to list *api.Endpoints: Get https://10.254.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.254.0.1:443: i/o timeout
> E0224 19:29:09.369085 1 reflector.go:214] pkg/dns/dns.go:155: Failed to list *api.Service: Get https://10.254.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.254.0.1:443: i/o timeout
> I0224 19:29:14.031575 1 server.go:133] Received signal: terminated, will exit when the grace period ends
> I0224 19:29:40.368220 1 dns.go:172] Ignoring error while waiting for service default/kubernetes: Get https://10.254.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.254.0.1:443: i/o timeout. Sleeping 1s before retrying.
> E0224 19:29:40.369086 1 reflector.go:214] pkg/dns/dns.go:154: Failed to list *api.Endpoints: Get https://10.254.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.254.0.1:443: i/o timeout
> E0224 19:29:40.369799 1 reflector.go:214] pkg/dns/dns.go:155: Failed to list *api.Service: Get https://10.254.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.254.0.1:443: i/o timeout
>
> Logs from kube-dns dnsmasq:
>
> dnsmasq[1]: started, version 2.76 cachesize 1000
> dnsmasq[1]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
> dnsmasq[1]: using nameserver 127.0.0.1#10053
> dnsmasq[1]: read /etc/hosts - 7 addresses
>
> Logs from kube-dns healthz:
>
> 2017/02/24 19:14:33 Healthz probe on /healthz-kubedns error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local', at 2017-02-24 19:14:33.130111978 +0000 UTC, error exit status 1
> ...
> 2017/02/24 19:30:03 Healthz probe on /healthz-dnsmasq error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local', at 2017-02-24 19:29:43.134054709 +0000 UTC, error exit status 1
>
> Thanks,
> Thomas
>
> --
> You received this message because you are subscribed to the Google Groups "Kubernetes user discussion and Q&A" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/kubernetes-users.
> For more options, visit https://groups.google.com/d/optout.
