I see the flannel masquerade rule for inbound traffic (-A POSTROUTING ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE) but not the one for outbound traffic (I would expect -A POSTROUTING -s 10.244.0.0/16 ! -d 10.244.0.0/16 -j MASQUERADE).
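
A quick way to compare against your own nodes is to list the POSTROUTING rules directly. This is just a rough check, assuming the 10.244.0.0/16 pod CIDR used in your setup; the manual rule at the end is a temporary test, not something flannel documents as a fix:

    $ sudo iptables -t nat -S POSTROUTING | grep 10.244.0.0/16
    # With --ip-masq, the flannel v0.7.0 log further down reports installing
    # these three rules:
    #   -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
    #   -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE
    #   ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE

    # If the pod-to-external MASQUERADE rule is missing, it can be added by hand
    # purely as an experiment (flannel normally manages these rules itself):
    $ sudo iptables -t nat -A POSTROUTING -s 10.244.0.0/16 ! -d 10.244.0.0/16 -j MASQUERADE

If adding that rule makes egress from pods start working, the masquerade path is the culprit; if not, the problem more likely lies in forwarding or the vxlan device.
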
On Wed, Apr 5, 2017 at 3:16 AM, <jimmycua...@gmail.com> wrote:
> Hello all,
>
> I'm having an unusual problem with running Kubernetes on a cluster of four
> Raspberry Pi 3s: all outgoing networking connections from inside pods are
> failing. My hunch is that the cause of the problem is something related to
> the overlay network (I'm using Flannel) but I am really not sure. All of the
> relevant details I can think of follow. If anyone has an idea what the
> problem might be or how I can debug it further, I'd be grateful!
>
> The cluster is running on four brand new Raspberry Pi 3 Model B machines
> connected to my home network using Ethernet. Network requests work as
> expected from the host machines.
>
> The servers are all flashed with Hypriot OS v1.4.0
> (https://github.com/hypriot/image-builder-rpi/releases/tag/v1.4.0) with
> Docker manually downgraded to v1.12.6, which is known to work with Kubernetes
> 1.6. Kubernetes is the only thing installed on these servers.
>
> Kubernetes 1.6.1 is installed with kubeadm 1.6.1 following the getting
> started guide exactly (https://kubernetes.io/docs/getting-started-guides/kubeadm/).
> Specifically, the kubeadm command I start with is:
> `kubeadm init --apiserver-cert-extra-sans example.com --pod-network-cidr 10.244.0.0/16`
> (where example.com is the public DNS record for my home network).
>
> RBAC roles are created for Flannel with `kubectl apply -f flannel-rbac.yml`,
> where the contents of the file are:
>
> ---
> kind: ClusterRole
> apiVersion: rbac.authorization.k8s.io/v1beta1
> metadata:
>   name: flannel
> rules:
>   - apiGroups:
>       - ""
>     resources:
>       - pods
>     verbs:
>       - get
>   - apiGroups:
>       - ""
>     resources:
>       - nodes
>     verbs:
>       - list
>       - update
>       - watch
> ---
> kind: ClusterRoleBinding
> apiVersion: rbac.authorization.k8s.io/v1beta1
> metadata:
>   name: flannel
> roleRef:
>   apiGroup: rbac.authorization.k8s.io
>   kind: ClusterRole
>   name: flannel
> subjects:
>   - kind: ServiceAccount
>     name: flannel
>     namespace: kube-system
>
> Flannel is deployed with `kubectl apply -f flannel.yml`, where the contents
> of the file are:
>
> ---
> apiVersion: v1
> kind: ServiceAccount
> metadata:
>   name: flannel
>   namespace: kube-system
> ---
> kind: ConfigMap
> apiVersion: v1
> metadata:
>   name: kube-flannel-cfg
>   namespace: kube-system
>   labels:
>     tier: node
>     app: flannel
> data:
>   cni-conf.json: |
>     {
>       "name": "cbr0",
>       "type": "flannel",
>       "delegate": {
>         "isDefaultGateway": true
>       }
>     }
>   net-conf.json: |
>     {
>       "Network": "10.244.0.0/16",
>       "Backend": {
>         "Type": "vxlan"
>       }
>     }
> ---
> apiVersion: extensions/v1beta1
> kind: DaemonSet
> metadata:
>   name: kube-flannel-ds
>   namespace: kube-system
>   labels:
>     tier: node
>     app: flannel
> spec:
>   template:
>     metadata:
>       labels:
>         tier: node
>         app: flannel
>     spec:
>       hostNetwork: true
>       nodeSelector:
>         beta.kubernetes.io/arch: arm
>       tolerations:
>         - key: node-role.kubernetes.io/master
>           effect: NoSchedule
>       serviceAccountName: flannel
>       containers:
>         - name: kube-flannel
>           image: quay.io/coreos/flannel:v0.7.0-arm
>           command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr" ]
>           securityContext:
>             privileged: true
>           env:
>             - name: POD_NAME
>               valueFrom:
>                 fieldRef:
>                   fieldPath: metadata.name
>             - name: POD_NAMESPACE
>               valueFrom:
>                 fieldRef:
>                   fieldPath: metadata.namespace
>           volumeMounts:
>             - name: run
>               mountPath: /run
>             - name: flannel-cfg
>               mountPath: /etc/kube-flannel/
>         - name: install-cni
>           image: quay.io/coreos/flannel:v0.7.0-arm
>           command: [ "/bin/sh", "-c", "set -e -x; cp -f /etc/kube-flannel/cni-conf.json /etc/cni/net.d/10-flannel.conf; while true; do sleep 3600; done" ]
>           volumeMounts:
>             - name: cni
>               mountPath: /etc/cni/net.d
>             - name: flannel-cfg
>               mountPath: /etc/kube-flannel/
>       volumes:
>         - name: run
>           hostPath:
>             path: /run
>         - name: cni
>           hostPath:
>             path: /etc/cni/net.d
>         - name: flannel-cfg
>           configMap:
>             name: kube-flannel-cfg
>
> All Kubernetes nodes are online (kube-01 is the master):
>
> $ kubectl get nodes
> NAME      STATUS    AGE       VERSION
> kube-01   Ready     1d        v1.6.1
> kube-02   Ready     1d        v1.6.1
> kube-03   Ready     1d        v1.6.1
> kube-04   Ready     1d        v1.6.1
>
> Here are the details of the kube-02 node, just as an example to show the
> node details:
>
> $ kubectl describe node kube-02
> Name:            kube-02
> Role:
> Labels:          beta.kubernetes.io/arch=arm
>                  beta.kubernetes.io/os=linux
>                  ingress-controller=traefik
>                  kubernetes.io/hostname=kube-02
> Annotations:     flannel.alpha.coreos.com/backend-data={"VtepMAC":"7a:ce:5a:3b:78:80"}
>                  flannel.alpha.coreos.com/backend-type=vxlan
>                  flannel.alpha.coreos.com/kube-subnet-manager=true
>                  flannel.alpha.coreos.com/public-ip=10.0.1.102
>                  node.alpha.kubernetes.io/ttl=0
>                  volumes.kubernetes.io/controller-managed-attach-detach=true
> Taints:          <none>
> CreationTimestamp:  Mon, 03 Apr 2017 22:46:36 -0700
> Phase:
> Conditions:
>   Type            Status  LastHeartbeatTime                 LastTransitionTime                Reason                      Message
>   ----            ------  -----------------                 ------------------                ------                      -------
>   OutOfDisk       False   Wed, 05 Apr 2017 02:35:43 -0700   Mon, 03 Apr 2017 22:46:36 -0700   KubeletHasSufficientDisk    kubelet has sufficient disk space available
>   MemoryPressure  False   Wed, 05 Apr 2017 02:35:43 -0700   Mon, 03 Apr 2017 22:46:36 -0700   KubeletHasSufficientMemory  kubelet has sufficient memory available
>   DiskPressure    False   Wed, 05 Apr 2017 02:35:43 -0700   Mon, 03 Apr 2017 22:46:36 -0700   KubeletHasNoDiskPressure    kubelet has no disk pressure
>   Ready           True    Wed, 05 Apr 2017 02:35:43 -0700   Mon, 03 Apr 2017 22:47:38 -0700   KubeletReady                kubelet is posting ready status
> Addresses:       10.0.1.102,10.0.1.102,kube-02
> Capacity:
>  cpu:     4
>  memory:  882632Ki
>  pods:    110
> Allocatable:
>  cpu:     4
>  memory:  780232Ki
>  pods:    110
> System Info:
>  Machine ID:                 9989a26f06984d6dbadc01770f018e3b
>  System UUID:                9989a26f06984d6dbadc01770f018e3b
>  Boot ID:                    4a400ae5-aaee-4c25-9125-4e0df445e064
>  Kernel Version:             4.4.50-hypriotos-v7+
>  OS Image:                   Raspbian GNU/Linux 8 (jessie)
>  Operating System:           linux
>  Architecture:               arm
>  Container Runtime Version:  docker://1.12.6
>  Kubelet Version:            v1.6.1
>  Kube-Proxy Version:         v1.6.1
> PodCIDR:      10.244.1.0/24
> ExternalID:   kube-02
> Non-terminated Pods:  (2 in total)
>   Namespace    Name                    CPU Requests  CPU Limits  Memory Requests  Memory Limits
>   ---------    ----                    ------------  ----------  ---------------  -------------
>   kube-system  kube-flannel-ds-p5l6q   0 (0%)        0 (0%)      0 (0%)           0 (0%)
>   kube-system  kube-proxy-z9dpz        0 (0%)        0 (0%)      0 (0%)           0 (0%)
> Allocated resources:
>   (Total limits may be over 100 percent, i.e., overcommitted.)
>   CPU Requests  CPU Limits  Memory Requests  Memory Limits
>   ------------  ----------  ---------------  -------------
>   0 (0%)        0 (0%)      0 (0%)           0 (0%)
> Events:  <none>
>
> All pods, including kube-dns, are running as expected:
>
> $ kubectl get pods --all-namespaces
> NAMESPACE     NAME                              READY     STATUS    RESTARTS   AGE
> kube-system   etcd-kube-01                      1/1       Running   0          1d
> kube-system   kube-apiserver-kube-01            1/1       Running   0          1d
> kube-system   kube-controller-manager-kube-01   1/1       Running   0          1d
> kube-system   kube-dns-279829092-wf67d          3/3       Running   0          1d
> kube-system   kube-flannel-ds-g3dwn             2/2       Running   0          1d
> kube-system   kube-flannel-ds-p5l6q             2/2       Running   2          1d
> kube-system   kube-flannel-ds-sk2ln             2/2       Running   0          1d
> kube-system   kube-flannel-ds-x5t2h             2/2       Running   3          1d
> kube-system   kube-proxy-3c8s6                  1/1       Running   0          1d
> kube-system   kube-proxy-kh0fh                  1/1       Running   0          1d
> kube-system   kube-proxy-pgcz6                  1/1       Running   0          1d
> kube-system   kube-proxy-z9dpz                  1/1       Running   0          1d
> kube-system   kube-scheduler-kube-01            1/1       Running   0          1d
>
> Services for the API server and DNS exist, as expected:
>
> $ kubectl get svc --all-namespaces
> NAMESPACE     NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
> default       kubernetes   10.96.0.1    <none>        443/TCP         1d
> kube-system   kube-dns     10.96.0.10   <none>        53/UDP,53/TCP   1d
>
> And endpoints for those services exist, as expected:
>
> $ kubectl get endpoints --all-namespaces
> NAMESPACE     NAME                      ENDPOINTS                     AGE
> default       kubernetes                10.0.1.101:6443               1d
> kube-system   kube-controller-manager   <none>                        1d
> kube-system   kube-dns                  10.244.0.2:53,10.244.0.2:53   1d
> kube-system   kube-scheduler            <none>                        1d
>
> Note that the API server is running on the host network, as this is how
> kubeadm sets up its static pod, while kube-dns is running on the overlay
> network.
>
> Initially, I tried deploying a few other applications, including the
> Kubernetes Dashboard and Traefik (used as an ingress controller), but they
> produced errors in their logs about not being able to contact the API
> server, which was my first clue that something was wrong. Eventually, I
> reduced the problem to the following failing test case. The Docker image is
> https://hub.docker.com/r/jimmycuadra/rpi-debug/, which is just an ARM build
> of Alpine Linux with `dig` and `curl` installed in addition to the stock
> `nslookup`.
>
> $ kubectl run debug --image jimmycuadra/rpi-debug --generator run-pod/v1 -o yaml --save-config --rm -it /bin/ash
> If you don't see a command prompt, try pressing enter.
> / # ifconfig
> eth0      Link encap:Ethernet  HWaddr 0A:58:0A:F4:02:05
>           inet addr:10.244.2.5  Bcast:0.0.0.0  Mask:255.255.255.0
>           inet6 addr: fe80::c49f:43ff:fece:b3c3/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
>           RX packets:18 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:3323 (3.2 KiB)  TX bytes:578 (578.0 B)
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:65536  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1
>           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
> / # route -n
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 0.0.0.0         10.244.3.1      0.0.0.0         UG    0      0        0 eth0
> 10.244.0.0      10.244.3.1      255.255.0.0     UG    0      0        0 eth0
> 10.244.3.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0
> / # cat /etc/resolv.conf
> nameserver 10.96.0.10
> search default.svc.cluster.local svc.cluster.local cluster.local webpass.net
> options ndots:5
> / # cat /etc/hosts
> # Kubernetes-managed hosts file.
> 127.0.0.1       localhost
> ::1     localhost ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> fe00::0 ip6-mcastprefix
> fe00::1 ip6-allnodes
> fe00::2 ip6-allrouters
> 10.244.2.4      debug
> / # nslookup kubernetes
> ;; connection timed out; no servers could be reached
>
> / # nslookup kubernetes.default.svc.cluster.local
> ;; connection timed out; no servers could be reached
>
> / # nslookup google.com
> ;; connection timed out; no servers could be reached
>
> / # curl -i --connect-timeout 15 -H "Host: www.google.com" https://216.58.192.14/
> curl: (28) Connection timed out after 15001 milliseconds
> / # curl -i --connect-timeout 15 -H "Host: kubernetes" https://10.0.1.101:6443/
> curl: (28) Connection timed out after 15001 milliseconds
> / # apk update
> fetch http://nl.alpinelinux.org/alpine/edge/main/armhf/APKINDEX.tar.gz
> ERROR: http://nl.alpinelinux.org/alpine/edge/main: temporary error (try again later)
> v3.5.0-3172-gb55f907b71 [http://nl.alpinelinux.org/alpine/edge/main]
> 1 errors; 5526 distinct packages available
>
> As you can see from the above session, the kube-dns DNS server is in
> /etc/resolv.conf as expected (10.96.0.10), but nslookup fails for the
> kubernetes name, both relative and fully qualified, as does nslookup on
> google.com. I also tried using the IP of Google and of the Kubernetes node
> running the API server manually, but no outgoing connections work. Even
> Alpine Linux's package manager, apk, cannot make an outgoing connection.
>
> Trying the same steps using the "Default" DNS policy for the pod reveals
> that DNS resolution and outgoing connections to the Internet still fail:
>
> $ kubectl run debug --image jimmycuadra/rpi-debug --generator run-pod/v1 -o yaml --overrides '{"spec":{"dnsPolicy":"Default"}}' --save-config --rm -it /bin/ash
> If you don't see a command prompt, try pressing enter.
> / # ifconfig
> eth0      Link encap:Ethernet  HWaddr 0A:58:0A:F4:01:05
>           inet addr:10.244.1.5  Bcast:0.0.0.0  Mask:255.255.255.0
>           inet6 addr: fe80::34fc:c5ff:fef6:134/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
>           RX packets:18 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:3323 (3.2 KiB)  TX bytes:578 (578.0 B)
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:65536  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1
>           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
> / # route -n
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 0.0.0.0         10.244.1.1      0.0.0.0         UG    0      0        0 eth0
> 10.244.0.0      10.244.1.1      255.255.0.0     UG    0      0        0 eth0
> 10.244.1.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0
> / # cat /etc/resolv.conf
> nameserver 10.0.1.1
> search webpass.net
> / # cat /etc/hosts
> # Kubernetes-managed hosts file.
> 127.0.0.1       localhost
> ::1     localhost ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> fe00::0 ip6-mcastprefix
> fe00::1 ip6-allnodes
> fe00::2 ip6-allrouters
> 10.244.3.6      debug
> / # nslookup google.com
> ;; connection timed out; no servers could be reached
>
> / # curl -i --connect-timeout 15 -H "Host: www.google.com" https://216.58.192.14/
> curl: (28) Connection timed out after 15000 milliseconds
> / # apk update
> fetch http://nl.alpinelinux.org/alpine/edge/main/armhf/APKINDEX.tar.gz
> ERROR: http://nl.alpinelinux.org/alpine/edge/main: temporary error (try again later)
> v3.5.0-3172-gb55f907b71 [http://nl.alpinelinux.org/alpine/edge/main]
> 1 errors; 5526 distinct packages available
>
> You can see that Flannel is operating, because this debug pod is given an IP
> within the pod network's CIDR (as kube-dns was):
>
> $ kubectl describe pod debug
> Name:         debug
> Namespace:    default
> Node:         kube-03/10.0.1.103
> Start Time:   Wed, 05 Apr 2017 02:51:46 -0700
> Labels:       <none>
> Annotations:  kubectl.kubernetes.io/last-applied-configuration={"kind":"Pod","apiVersion":"v1","metadata":{"name":"debug","creationTimestamp":null},"spec":{"containers":[{"name":"debug","image":"jimmycuadra/rpi-deb...
> Status:       Running
> IP:           10.244.3.6
> Controllers:  <none>
> Containers:
>   debug:
>     Container ID:  docker://8c24be5df5b1f526b901b912c654b63705122b64c194a9556d8453573755c752
>     Image:         jimmycuadra/rpi-debug
>     Image ID:      docker-pullable://jimmycuadra/rpi-debug@sha256:144cb3c504e691882034340890d58eac6ac7c11af482a645623c1cb33271ca5e
>     Port:
>     Args:
>       /bin/ash
>     State:          Running
>       Started:      Wed, 05 Apr 2017 02:51:50 -0700
>     Ready:          True
>     Restart Count:  0
>     Environment:    <none>
>     Mounts:
>       /var/run/secrets/kubernetes.io/serviceaccount from default-token-09gfc (ro)
> Conditions:
>   Type           Status
>   Initialized    True
>   Ready          True
>   PodScheduled   True
> Volumes:
>   default-token-09gfc:
>     Type:        Secret (a volume populated by a Secret)
>     SecretName:  default-token-09gfc
>     Optional:    false
> QoS Class:       BestEffort
> Node-Selectors:  <none>
> Tolerations:     node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
>                  node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
> Events:
>   FirstSeen  LastSeen  Count  From               SubObjectPath           Type    Reason     Message
>   ---------  --------  -----  ----               -------------           ------  ------     -------
>   2m         2m        1      default-scheduler                          Normal  Scheduled  Successfully assigned debug to kube-03
>   2m         2m        1      kubelet, kube-03   spec.containers{debug}  Normal  Pulled     Container image "jimmycuadra/rpi-debug" already present on machine
>   2m         2m        1      kubelet, kube-03   spec.containers{debug}  Normal  Created    Created container with id 8c24be5df5b1f526b901b912c654b63705122b64c194a9556d8453573755c752
>   2m         2m        1      kubelet, kube-03   spec.containers{debug}  Normal  Started    Started container with id 8c24be5df5b1f526b901b912c654b63705122b64c194a9556d8453573755c752
>
> Here is the beginning of the logs for kube-dns:
>
> $ kubectl logs kube-dns-279829092-wf67d -c kubedns -n kube-system
> I0404 05:46:45.782718       1 dns.go:49] version: v1.5.2-beta.0+$Format:%h$
> I0404 05:46:45.793351       1 server.go:70] Using configuration read from directory: /kube-dns-config%!(EXTRA time.Duration=10s)
> I0404 05:46:45.793794       1 server.go:112] FLAG: --alsologtostderr="false"
> I0404 05:46:45.793942       1 server.go:112] FLAG: --config-dir="/kube-dns-config"
> I0404 05:46:45.794033       1 server.go:112] FLAG: --config-map=""
> I0404 05:46:45.794093       1 server.go:112] FLAG: --config-map-namespace="kube-system"
> I0404 05:46:45.794159       1 server.go:112] FLAG: --config-period="10s"
> I0404 05:46:45.794247       1 server.go:112] FLAG: --dns-bind-address="0.0.0.0"
> I0404 05:46:45.794311       1 server.go:112] FLAG: --dns-port="10053"
> I0404 05:46:45.794427       1 server.go:112] FLAG: --domain="cluster.local."
> I0404 05:46:45.794509       1 server.go:112] FLAG: --federations=""
> I0404 05:46:45.794582       1 server.go:112] FLAG: --healthz-port="8081"
> I0404 05:46:45.794647       1 server.go:112] FLAG: --initial-sync-timeout="1m0s"
> I0404 05:46:45.794722       1 server.go:112] FLAG: --kube-master-url=""
> I0404 05:46:45.794795       1 server.go:112] FLAG: --kubecfg-file=""
> I0404 05:46:45.794853       1 server.go:112] FLAG: --log-backtrace-at=":0"
> I0404 05:46:45.794933       1 server.go:112] FLAG: --log-dir=""
> I0404 05:46:45.795003       1 server.go:112] FLAG: --log-flush-frequency="5s"
> I0404 05:46:45.795073       1 server.go:112] FLAG: --logtostderr="true"
> I0404 05:46:45.795144       1 server.go:112] FLAG: --nameservers=""
> I0404 05:46:45.795202       1 server.go:112] FLAG: --stderrthreshold="2"
> I0404 05:46:45.795264       1 server.go:112] FLAG: --v="2"
> I0404 05:46:45.795324       1 server.go:112] FLAG: --version="false"
> I0404 05:46:45.795407       1 server.go:112] FLAG: --vmodule=""
> I0404 05:46:45.795793       1 server.go:175] Starting SkyDNS server (0.0.0.0:10053)
> I0404 05:46:45.800841       1 server.go:197] Skydns metrics enabled (/metrics:10055)
> I0404 05:46:45.800982       1 dns.go:147] Starting endpointsController
> I0404 05:46:45.801050       1 dns.go:150] Starting serviceController
> I0404 05:46:45.802186       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
> I0404 05:46:45.802431       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
> I0404 05:46:46.194772       1 dns.go:264] New service: kubernetes
> I0404 05:46:46.199497       1 dns.go:462] Added SRV record &{Host:kubernetes.default.svc.cluster.local. Port:443 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
> I0404 05:46:46.201053       1 dns.go:264] New service: kube-dns
> I0404 05:46:46.201745       1 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
> I0404 05:46:46.202287       1 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
> I0404 05:46:46.302608       1 dns.go:171] Initialized services and endpoints from apiserver
> I0404 05:46:46.302733       1 server.go:128] Setting up Healthz Handler (/readiness)
> I0404 05:46:46.302843       1 server.go:133] Setting up cache handler (/cache)
> I0404 05:46:46.302935       1 server.go:119] Status HTTP port 8081
> I0404 05:51:45.802627       1 dns.go:264] New service: kubernetes
> I0404 05:51:45.803656       1 dns.go:462] Added SRV record &{Host:kubernetes.default.svc.cluster.local. Port:443 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
> I0404 05:51:45.804266       1 dns.go:264] New service: kube-dns
> I0404 05:51:45.804771       1 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
> I0404 05:51:45.805283       1 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
> I0404 05:54:12.745272       1 dns.go:264] New service: kubernetes-dashboard
> I0404 05:56:45.805684       1 dns.go:264] New service: kubernetes
> I0404 05:56:45.809947       1 dns.go:462] Added SRV record &{Host:kubernetes.default.svc.cluster.local. Port:443 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
> I0404 05:56:45.811538       1 dns.go:264] New service: kube-dns
> I0404 05:56:45.812488       1 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
> I0404 05:56:45.813454       1 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
> I0404 05:56:45.814443       1 dns.go:264] New service: kubernetes-dashboard
> I0404 06:01:45.806051       1 dns.go:264] New service: kube-dns
> I0404 06:01:45.806895       1 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
> I0404 06:01:45.807408       1 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
> I0404 06:01:45.807884       1 dns.go:264] New service: kubernetes-dashboard
> I0404 06:01:45.808341       1 dns.go:264] New service: kubernetes
> I0404 06:01:45.808752       1 dns.go:462] Added SRV record &{Host:kubernetes.default.svc.cluster.local. Port:443 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
>
> I don't see any errors in any of it, just an endless stream of it finding
> "kubernetes" and "kube-dns" as "new services" and adding SRV records for
> them.
>
> Here are the logs for Flannel on a node where the Flannel pod never
> restarted:
>
> $ kubectl logs kube-flannel-ds-g3dwn -c kube-flannel -n kube-system
> I0404 05:46:05.193078       1 kube.go:109] Waiting 10m0s for node controller to sync
> I0404 05:46:05.193340       1 kube.go:289] starting kube subnet manager
> I0404 05:46:06.194279       1 kube.go:116] Node controller sync successful
> I0404 05:46:06.194463       1 main.go:132] Installing signal handlers
> I0404 05:46:06.196013       1 manager.go:136] Determining IP address of default interface
> I0404 05:46:06.199502       1 manager.go:149] Using interface with name eth0 and address 10.0.1.101
> I0404 05:46:06.199681       1 manager.go:166] Defaulting external address to interface address (10.0.1.101)
> I0404 05:46:06.631802       1 ipmasq.go:47] Adding iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
> I0404 05:46:06.665265       1 ipmasq.go:47] Adding iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE
> I0404 05:46:06.700650       1 ipmasq.go:47] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE
> I0404 05:46:06.720807       1 manager.go:250] Lease acquired: 10.244.0.0/24
> I0404 05:46:06.722263       1 network.go:58] Watching for L3 misses
> I0404 05:46:06.722473       1 network.go:66] Watching for new subnet leases
> I0405 04:46:06.678418       1 network.go:160] Lease renewed, new expiration: 2017-04-06 04:46:06.652848051 +0000 UTC
>
> Here are logs from a failed Flannel pod on one of the nodes where it's
> restarted a few times:
>
> $ kubectl logs kube-flannel-ds-x5t2h -c kube-flannel -n kube-system -p
> E0404 05:50:02.782218       1 main.go:127] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-x5t2h': Get https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-x5t2h: dial tcp 10.96.0.1:443: i/o timeout
>
> Here are the iptables rules that appear identically on all four servers:
>
> $ sudo iptables-save
> # Generated by iptables-save v1.4.21 on Wed Apr 5 10:01:19 2017
> *nat
> :PREROUTING ACCEPT [3:372]
> :INPUT ACCEPT [3:372]
> :OUTPUT ACCEPT [26:1659]
> :POSTROUTING ACCEPT [26:1659]
> :DOCKER - [0:0]
> :KUBE-MARK-DROP - [0:0]
> :KUBE-MARK-MASQ - [0:0]
> :KUBE-NODEPORTS - [0:0]
> :KUBE-POSTROUTING - [0:0]
> :KUBE-SEP-HHOMLR7ARJQ6WUFK - [0:0]
> :KUBE-SEP-IT2ZTR26TO4XFPTO - [0:0]
> :KUBE-SEP-YIL6JZP7A3QYXJU2 - [0:0]
> :KUBE-SERVICES - [0:0]
> :KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
> :KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
> :KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
> -A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
> -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
> -A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
> -A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
> -A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
> -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
> -A POSTROUTING -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
> -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE
> -A POSTROUTING ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE
> -A DOCKER -i docker0 -j RETURN
> -A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
> -A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
> -A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
> -A KUBE-SEP-HHOMLR7ARJQ6WUFK -s 10.0.1.101/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
> -A KUBE-SEP-HHOMLR7ARJQ6WUFK -p tcp -m comment --comment "default/kubernetes:https" -m recent --set --name KUBE-SEP-HHOMLR7ARJQ6WUFK --mask 255.255.255.255 --rsource -m tcp -j DNAT --to-destination 10.0.1.101:6443
> -A KUBE-SEP-IT2ZTR26TO4XFPTO -s 10.244.0.2/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
> -A KUBE-SEP-IT2ZTR26TO4XFPTO -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.244.0.2:53
> -A KUBE-SEP-YIL6JZP7A3QYXJU2 -s 10.244.0.2/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
> -A KUBE-SEP-YIL6JZP7A3QYXJU2 -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.244.0.2:53
> -A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
> -A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
> -A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
> -A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
> -A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
> -A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
> -A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
> -A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-SEP-IT2ZTR26TO4XFPTO
> -A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m recent --rcheck --seconds 10800 --reap --name KUBE-SEP-HHOMLR7ARJQ6WUFK --mask 255.255.255.255 --rsource -j KUBE-SEP-HHOMLR7ARJQ6WUFK
> -A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-HHOMLR7ARJQ6WUFK
> -A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -j KUBE-SEP-YIL6JZP7A3QYXJU2
> COMMIT
> # Completed on Wed Apr 5 10:01:19 2017
> # Generated by iptables-save v1.4.21 on Wed Apr 5 10:01:19 2017
> *filter
> :INPUT ACCEPT [1943:614999]
> :FORWARD DROP [0:0]
> :OUTPUT ACCEPT [1949:861554]
> :DOCKER - [0:0]
> :DOCKER-ISOLATION - [0:0]
> :KUBE-FIREWALL - [0:0]
> :KUBE-SERVICES - [0:0]
> -A INPUT -j KUBE-FIREWALL
> -A FORWARD -j DOCKER-ISOLATION
> -A FORWARD -o docker0 -j DOCKER
> -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
> -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
> -A FORWARD -i docker0 -o docker0 -j ACCEPT
> -A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
> -A OUTPUT -j KUBE-FIREWALL
> -A DOCKER-ISOLATION -j RETURN
> -A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
> COMMIT
> # Completed on Wed Apr 5 10:01:19 2017
>
> Here is the output of ifconfig on the server running the Kubernetes master
> components (kube-01):
>
> $ ifconfig
> cni0      Link encap:Ethernet  HWaddr 0a:58:0a:f4:00:01
>           inet addr:10.244.0.1  Bcast:0.0.0.0  Mask:255.255.255.0
>           inet6 addr: fe80::807b:3bff:fedf:ff7d/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
>           RX packets:322236 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:331776 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:74133093 (70.6 MiB)  TX bytes:73272040 (69.8 MiB)
>
> docker0   Link encap:Ethernet  HWaddr 02:42:43:63:54:be
>           inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
>           UP BROADCAST MULTICAST  MTU:1500  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
>
> eth0      Link encap:Ethernet  HWaddr b8:27:eb:fa:0d:18
>           inet addr:10.0.1.101  Bcast:10.0.1.255  Mask:255.255.255.0
>           inet6 addr: fe80::ba27:ebff:fefa:d18/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:1594829 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:1234243 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:482745836 (460.3 MiB)  TX bytes:943355891 (899.6 MiB)
>
> flannel.1 Link encap:Ethernet  HWaddr 7a:54:f6:da:6b:a0
>           inet addr:10.244.0.0  Bcast:0.0.0.0  Mask:255.255.255.255
>           inet6 addr: fe80::7854:f6ff:feda:6ba0/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
>           RX packets:3 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:38 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:204 (204.0 B)  TX bytes:0 (0.0 B)
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:65536  Metric:1
>           RX packets:5523912 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:5523912 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1
>           RX bytes:2042135076 (1.9 GiB)  TX bytes:2042135076 (1.9 GiB)
>
> vethbe064275 Link encap:Ethernet  HWaddr 1e:4f:ea:70:9f:e1
>           inet6 addr: fe80::1c4f:eaff:fe70:9fe1/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
>           RX packets:322237 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:331794 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:78644487 (75.0 MiB)  TX bytes:73275343 (69.8 MiB)
>
> And here it is on the worker node kube-02:
>
> $ ifconfig
> cni0      Link encap:Ethernet  HWaddr 0a:58:0a:f4:01:01
>           inet addr:10.244.1.1  Bcast:0.0.0.0  Mask:255.255.255.0
>           inet6 addr: fe80::383a:41ff:fea4:f113/64 Scope:Link
>           UP BROADCAST MULTICAST  MTU:1500  Metric:1
>           RX packets:125 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:51 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:7794 (7.6 KiB)  TX bytes:7391 (7.2 KiB)
>
> docker0   Link encap:Ethernet  HWaddr 02:42:ad:1b:1e:a3
>           inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
>           UP BROADCAST MULTICAST  MTU:1500  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
>
> eth0      Link encap:Ethernet  HWaddr b8:27:eb:bb:ff:69
>           inet addr:10.0.1.102  Bcast:10.0.1.255  Mask:255.255.255.0
>           inet6 addr: fe80::ba27:ebff:febb:ff69/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:750764 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:442199 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:597869801 (570.1 MiB)  TX bytes:42574858 (40.6 MiB)
>
> flannel.1 Link encap:Ethernet  HWaddr 7a:ce:5a:3b:78:80
>           inet addr:10.244.1.0  Bcast:0.0.0.0  Mask:255.255.255.255
>           inet6 addr: fe80::78ce:5aff:fe3b:7880/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:38 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:65536  Metric:1
>           RX packets:4 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1
>           RX bytes:240 (240.0 B)  TX bytes:240 (240.0 B)
>
> Again, if anyone has made it this far, please let me know if you have any
> ideas, or if there are other commands I can show the output of to help
> narrow it down!
>
> Thanks very much,
> Jimmy