I might have found something...it could be a Vagrant issue
Vagrant uses two network interfaces: one for its own SSH access, while the
other one carries the IP configured in the Vagrantfile.
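For context, the relevant part of my Vagrantfile looks roughly like this (a sketch, not my exact file; Vagrant adds the NAT interface as eth0 on its own, only the private_network line is mine):

```ruby
# Sketch of the master VM definition. Vagrant always attaches a NAT
# interface as eth0 for its own SSH access; the private_network line
# below becomes eth1 with the static IP.
Vagrant.configure("2") do |config|
  config.vm.define "master" do |master|
    master.vm.hostname = "master.vnet.de"
    master.vm.network "private_network", ip: "192.168.60.150"
  end
end
```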
Here's a log from the etcd pod:
...
2018-09-02 17:15:43.896539 I | etcdserver: published {Name:master.vnet.de
ClientURLs:[https://192.168.121.202:2379]} to cluster 6d42105e200fef69
2018-09-02 17:15:43.896651 I | embed: ready to serve client requests
2018-09-02 17:15:43.897149 I | embed: serving client requests on
192.168.121.202:2379
The interesting part is that it is serving on 192.168.121.202, but the IP
that should be used is 192.168.60.150.
[vagrant@master ~]$ ip ad
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group
default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP
group default qlen 1000
link/ether 52:54:00:87:13:01 brd ff:ff:ff:ff:ff:ff
inet 192.168.121.202/24 brd 192.168.121.255 scope global noprefixroute
dynamic eth0
valid_lft 3387sec preferred_lft 3387sec
inet6 fe80::5054:ff:fe87:1301/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP
group default qlen 1000
link/ether 5c:a1:ab:1e:00:02 brd ff:ff:ff:ff:ff:ff
inet 192.168.60.150/24 brd 192.168.60.255 scope global noprefixroute eth1
valid_lft forever preferred_lft forever
inet6 fe80::5ea1:abff:fe1e:2/64 scope link
valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state
DOWN group default
link/ether 02:42:8b:fa:b7:b0 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 scope global docker0
valid_lft forever preferred_lft forever
Is there any way I can configure my inventory to use a dedicated
network interface (eth1 in my Vagrant case)?
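From the inventory example shipped with openshift-ansible, the per-host variable `openshift_ip` looks like it is meant for exactly this. Would something like the following (an untested sketch, host names and addresses from my setup) pin the services to eth1?

```ini
# Untested sketch: openshift_ip is the per-host variable from
# openshift-ansible's hosts.example; it should force the internal
# address onto eth1 instead of Vagrant's NAT interface (eth0).
[masters]
master.vnet.de openshift_ip=192.168.60.150

[etcd]
master.vnet.de openshift_ip=192.168.60.150

[nodes]
master.vnet.de openshift_ip=192.168.60.150
infra.vnet.de  openshift_ip=192.168.60.151
app1.vnet.de   openshift_ip=192.168.60.152
app2.vnet.de   openshift_ip=192.168.60.153
```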
On Friday, 31 August 2018 at 21:15:12 CEST, you wrote:
> The dependency chain for the control plane is node, then etcd, then api,
> then controllers. From your previous post it looks like there's no
> apiserver running; I'd look into what's wrong there.
>
> Check `master-logs api api`; if that doesn't provide any hints, then check
> the logs for the node service, but I can't think of anything that would
> fail there yet still result in successfully starting the controller pods.
> The apiserver and controller pods use the same image. Each pod has two
> containers; the k8s_POD containers are rarely interesting.
>
> On Thu, Aug 30, 2018 at 2:37 PM Marc Schlegel <[email protected]> wrote:
>
> > Thanks for the link. It looks like the api pod is not coming up at all!
> >
> > Log from k8s_controllers_master-controllers-*
> >
> > [vagrant@master ~]$ sudo docker logs
> > k8s_controllers_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_1
> > E0830 18:28:05.787358 1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:594:
> > Failed to list *v1.Pod: Get
> > https://master.vnet.de:8443/api/v1/pods?fieldSelector=spec.schedulerName%3Ddefault-scheduler%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:05.788589 1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.ReplicationController: Get
> > https://master.vnet.de:8443/api/v1/replicationcontrollers?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:05.804239 1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.Node: Get
> > https://master.vnet.de:8443/api/v1/nodes?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:05.806879 1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1beta1.StatefulSet: Get
> > https://master.vnet.de:8443/apis/apps/v1beta1/statefulsets?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:05.808195 1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1beta1.PodDisruptionBudget: Get
> > https://master.vnet.de:8443/apis/policy/v1beta1/poddisruptionbudgets?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.673507 1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.PersistentVolume: Get
> > https://master.vnet.de:8443/api/v1/persistentvolumes?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.770141 1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1beta1.ReplicaSet: Get
> > https://master.vnet.de:8443/apis/extensions/v1beta1/replicasets?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.773878 1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.Service: Get
> > https://master.vnet.de:8443/api/v1/services?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.778204 1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.StorageClass: Get
> > https://master.vnet.de:8443/apis/storage.k8s.io/v1/storageclasses?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.784874 1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.PersistentVolumeClaim: Get
> > https://master.vnet.de:8443/api/v1/persistentvolumeclaims?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> >
> > The log is full of those. Since it is all about the api, I tried to get
> > the logs from k8s_POD_master-api-master.vnet.de_kube-system_*, which is
> > completely empty :-/
> >
> > [vagrant@master ~]$ sudo docker logs
> > k8s_POD_master-api-master.vnet.de_kube-system_86017803919d833e39cb3d694c249997_1
> > [vagrant@master ~]$
> >
> > Is there any special prerequisite about the api-pod?
> >
> > regards
> > Marc
> >
> >
> > > Marc,
> > >
> > > could you please look over the issue [1], pull the master pod logs, and
> > > see if you bumped into the same issue mentioned by the other folks?
> > > Also make sure your openshift-ansible release is the latest one.
> > >
> > > Dani
> > >
> > > [1] https://github.com/openshift/openshift-ansible/issues/9575
> > >
> > > On Wed, Aug 29, 2018 at 7:36 PM Marc Schlegel <[email protected]> wrote:
> > >
> > > > Hello everyone
> > > >
> > > > I am having trouble getting a working Origin 3.10 installation using
> > > > the openshift-ansible installer. My install always fails because the
> > > > control plane pods are not available. I've checked out the release-3.10
> > > > branch from openshift-ansible and configured the inventory accordingly.
> > > >
> > > >
> > > > TASK [openshift_control_plane : Start and enable self-hosting node]
> > > > ******************
> > > > changed: [master]
> > > > TASK [openshift_control_plane : Get node logs]
> > > > *******************************
> > > > skipping: [master]
> > > > TASK [openshift_control_plane : debug]
> > > > ******************************************
> > > > skipping: [master]
> > > > TASK [openshift_control_plane : fail]
> > > > *********************************************
> > > > skipping: [master]
> > > > TASK [openshift_control_plane : Wait for control plane pods to appear]
> > > > ***************
> > > >
> > > > failed: [master] (item=etcd) => {"attempts": 60, "changed": false,
> > > > "item": "etcd", "msg": {"cmd": "/bin/oc get pod
> > > > master-etcd-master.vnet.de -o json -n kube-system", "results": [{}],
> > > > "returncode": 1, "stderr": "The connection to the server
> > > > master.vnet.de:8443 was refused - did you specify the right host or
> > > > port?\n", "stdout": ""}}
> > > >
> > > > TASK [openshift_control_plane : Report control plane errors]
> > > > *************************
> > > > fatal: [master]: FAILED! => {"changed": false, "msg": "Control plane
> > > > pods didn't come up"}
> > > >
> > > >
> > > > I am using Vagrant to set up a local domain (vnet.de), which also
> > > > includes a dnsmasq node so I have full control over DNS. The following
> > > > VMs are running, and DNS and SSH work as expected:
> > > >
> > > > Hostname        IP
> > > > domain.vnet.de  192.168.60.100
> > > > master.vnet.de  192.168.60.150 (DNS also works for openshift.vnet.de,
> > > >                 which is configured as
> > > >                 openshift_master_cluster_public_hostname; also runs etcd)
> > > > infra.vnet.de   192.168.60.151 (the openshift_master_default_subdomain
> > > >                 wildcard points to this node)
> > > > app1.vnet.de    192.168.60.152
> > > > app2.vnet.de    192.168.60.153
> > > >
> > > >
> > > > When connecting to the master node I can see that several Docker
> > > > containers are up and running:
> > > >
> > > > [vagrant@master ~]$ sudo docker ps
> > > > CONTAINER ID  IMAGE                                   COMMAND                 CREATED         STATUS         PORTS  NAMES
> > > > 9a0844123909  ff5dd2137a4f                            "/bin/sh -c '#!/bi..."  19 minutes ago  Up 19 minutes         k8s_etcd_master-etcd-master.vnet.de_kube-system_a2c858fccd481c334a9af7413728e203_0
> > > > 41d803023b72  f216d84cdf54                            "/bin/bash -c '#!/..."  19 minutes ago  Up 19 minutes         k8s_controllers_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_0
> > > > 044c9d12588c  docker.io/openshift/origin-pod:v3.10.0  "/usr/bin/pod"          19 minutes ago  Up 19 minutes         k8s_POD_master-api-master.vnet.de_kube-system_86017803919d833e39cb3d694c249997_0
> > > > 10a197e394b3  docker.io/openshift/origin-pod:v3.10.0  "/usr/bin/pod"          19 minutes ago  Up 19 minutes         k8s_POD_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_0
> > > > 20f4f86bdd07  docker.io/openshift/origin-pod:v3.10.0  "/usr/bin/pod"          19 minutes ago  Up 19 minutes         k8s_POD_master-etcd-master.vnet.de_kube-system_a2c858fccd481c334a9af7413728e203_0
> > > >
> > > > However, port 8443 is not open on the master node, so it is no wonder
> > > > the Ansible installer complains.
> > > >
> > > > The machines run a plain CentOS 7.5, and I've run
> > > > openshift-ansible/playbooks/prerequisites.yml first, then
> > > > openshift-ansible/playbooks/deploy_cluster.yml.
> > > > I've double-checked the installation documentation and my Vagrant
> > > > config... all looks correct.
> > > >
> > > > Any ideas/advice?
> > > > regards
> > > > Marc
> > > >
> > > >
> > >
> >
> >
> >
> >
>
_______________________________________________
users mailing list
[email protected]
http://lists.openshift.redhat.com/openshiftmm/listinfo/users