Re: openshift-ansible release-3.10 - Install fails with control plane pods

2018-10-09 Thread Marc Schlegel
Hello everyone

I was finally able to resolve the issue with the control plane.

The problem was caused by the master pod, which could not connect to the 
etcd pod because the hostname always resolved to 127.0.0.1 instead of the local 
cluster IP. This was due to the Vagrant box I used, and was resolved by 
making sure that /etc/hosts only contained the 127.0.0.1 localhost entry.
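
That check is easy to script. A minimal sketch (the hostname `master.vnet.de` is the one from the logs below; the helper itself is mine, not part of the installer):

```python
import socket

def resolves_to_loopback(hostname):
    """Return True if hostname resolves to a 127.x loopback address.

    On the broken Vagrant box the master's own hostname resolved to
    127.0.0.1 through /etc/hosts, so the master pod could never reach
    the etcd pod's cluster IP.
    """
    try:
        return socket.gethostbyname(hostname).startswith("127.")
    except socket.gaierror:
        return False

# On the broken box this returned True for "master.vnet.de";
# after cleaning up /etc/hosts it resolves to the cluster IP instead.
print(resolves_to_loopback("localhost"))
```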

Now the installer gets past the control-plane-check.

Unfortunately the next issue arises when the installer waits for the "catalog 
api server". The command "curl -k 
https://apiserver.kube-service-catalog.svc/healthz" cannot connect because the 
installer only adds "cluster.local" to resolv.conf.
Either the installer should make sure that any ".svc" service name gets resolved 
as well (my current workaround: adding server=/svc/172.30.0.1 to 
/etc/dnsmasq.d/origin-upstream-dns.conf), or all services should get hostnames 
ending in "cluster.local"
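
For reference, the workaround can be staged like this (shown against a temp file so it is safe to dry-run anywhere; on the node the real file is /etc/dnsmasq.d/origin-upstream-dns.conf, and 172.30.0.1 is the default kube service VIP in my setup):

```shell
# Stage the dnsmasq rule that forwards every *.svc lookup to the
# cluster DNS at 172.30.0.1 (adjust to your service network).
conf="$(mktemp)"   # on a real node: /etc/dnsmasq.d/origin-upstream-dns.conf
echo 'server=/svc/172.30.0.1' >> "$conf"

# Sanity-check that the rule landed in the file.
grep '^server=/svc/172.30.0.1$' "$conf"
# On the node, follow up with: systemctl restart dnsmasq
```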


Am Freitag, 31. August 2018, 21:15:12 CEST schrieben Sie:
> The dependency chain for control plane is node then etcd then api then
> controllers. From your previous post it looks like there's no apiserver
> running. I'd look into what's wrong there.
> 
> Check `master-logs api api` if that doesn't provide you any hints then
> check the logs for the node service but I can't think of anything that
> would fail there yet result in successfully starting the controller pods.
> The apiserver and controller pods use the same image. Each pod will have
> two containers, the k8s_POD containers are rarely interesting.
> 
> On Thu, Aug 30, 2018 at 2:37 PM Marc Schlegel  wrote:
> 
> > Thanks for the link. It looks like the api-pod is not getting up at all!
> >
> > Log from k8s_controllers_master-controllers-*
> >
> > [vagrant@master ~]$ sudo docker logs
> > k8s_controllers_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_1
> > E0830 18:28:05.787358   1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:594:
> > Failed to list *v1.Pod: Get
> > https://master.vnet.de:8443/api/v1/pods?fieldSelector=spec.schedulerName%3Ddefault-scheduler%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:05.788589   1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.ReplicationController: Get
> > https://master.vnet.de:8443/api/v1/replicationcontrollers?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:05.804239   1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.Node: Get
> > https://master.vnet.de:8443/api/v1/nodes?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:05.806879   1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1beta1.StatefulSet: Get
> > https://master.vnet.de:8443/apis/apps/v1beta1/statefulsets?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:05.808195   1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1beta1.PodDisruptionBudget: Get
> > https://master.vnet.de:8443/apis/policy/v1beta1/poddisruptionbudgets?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.673507   1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.PersistentVolume: Get
> > https://master.vnet.de:8443/api/v1/persistentvolumes?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.770141   1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1beta1.ReplicaSet: Get
> > https://master.vnet.de:8443/apis/extensions/v1beta1/replicasets?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.773878   1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.Service: Get
> > https://master.vnet.de:8443/api/v1/services?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.778204   1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.StorageClass: Get
> > https://master.vnet.de:8443/apis/storage.k8s.io/v1/storageclasses?limit=500&resourceVersion=0:
> > dial tcp 127.0.0.1:8443: getsockopt: connection refused
> > E0830 18:28:06.784874   1 reflector.go:205]
> > github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87:
> > Failed to list *v1.PersistentVolumeClaim: Get
> > 

Re: openshift-ansible release-3.10 - Install fails with control plane pods

2018-09-02 Thread Marc Schlegel
Well, I found two options for the inventory.

openshift_ip:

# host group for masters
[masters]
master openshift_ip=192.168.60.150
# host group for etcd
[etcd]
master openshift_ip=192.168.60.150
# host group for nodes, includes region info
[nodes]
master openshift_node_group_name='node-config-master' openshift_ip=192.168.60.150
infra openshift_node_group_name='node-config-infra' openshift_ip=192.168.60.151
app1 openshift_node_group_name='node-config-compute' openshift_ip=192.168.60.152
app2 openshift_node_group_name='node-config-compute' openshift_ip=192.168.60.153


and flannel

openshift_use_openshift_sdn=false 
openshift_use_flannel=true 
flannel_interface=eth1
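
Combined, the relevant inventory pieces look roughly like this (a sketch only; the host names, IPs, and options are the ones above, while the [OSEv3] skeleton is my assumption about a minimal Vagrant setup):

```ini
[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]
# pin all components to the host-only network instead of Vagrant's NAT eth0
openshift_use_openshift_sdn=false
openshift_use_flannel=true
flannel_interface=eth1

[masters]
master openshift_ip=192.168.60.150

[etcd]
master openshift_ip=192.168.60.150

[nodes]
master openshift_node_group_name='node-config-master' openshift_ip=192.168.60.150
infra openshift_node_group_name='node-config-infra' openshift_ip=192.168.60.151
app1 openshift_node_group_name='node-config-compute' openshift_ip=192.168.60.152
app2 openshift_node_group_name='node-config-compute' openshift_ip=192.168.60.153
```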


The etcd logs look good now, but the problem still seems to be that no SSL 
port is open.

Here are some lines I pulled from journalctl on the master:

Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.720037 6300 
certificate_manager.go:216] Certificate rotation is enabled.
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.720453 6300 
manager.go:154] cAdvisor running in container: "/sys/fs/cgroup/cpu,cpuacct"
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.738257 6300 
certificate_manager.go:287] Rotating certificates
Sep 02 19:17:38 master.vnet.de origin-node[6300]: E0902 19:17:38.752531 6300 
certificate_manager.go:299] Failed while requesting a signed certificate from 
the master: cannot create certificate signing request: Post 
https://master.vnet.de:8443/apis/certificates.k8s.io/v
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.778490 6300 
fs.go:142] Filesystem UUIDs: map[570897ca-e759-4c81-90cf-389da6eee4cc:/dev/vda2 
b60e9498-0baa-4d9f-90aa-069048217fee:/dev/dm-0 
c39c5bed-f37c-4263-bee8-aeb6a6659d7b:/dev/dm-1]
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.778506 6300 
fs.go:143] Filesystem partitions: map[tmpfs:{mountpoint:/dev/shm major:0 
minor:19 fsType:tmpfs blockSize:0} 
/dev/mapper/VolGroup00-LogVol00:{mountpoint:/var/lib/docker/overlay2 major:253 
minor
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.780130 6300 
manager.go:227] Machine: {NumCores:1 CpuFrequency:2808000 
MemoryCapacity:3974230016 HugePages:[{PageSize:1048576 NumPages:0} 
{PageSize:2048 NumPages:0}] MachineID:6c1357b9e4a54b929e1d09cacf37e
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.783655 6300 
manager.go:233] Version: {KernelVersion:3.10.0-862.2.3.el7.x86_64 
ContainerOsVersion:CentOS Linux 7 (Core) DockerVersion:1.13.1 
DockerAPIVersion:1.26 CadvisorVersion: CadvisorRevision:}
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.784251 6300 
server.go:621] --cgroups-per-qos enabled, but --cgroup-root was not specified.  
defaulting to /
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.784524 6300 
container_manager_linux.go:242] container manager verified user specified 
cgroup-root exists: /
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.784533 6300 
container_manager_linux.go:247] Creating Container Manager object based on Node 
Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: 
ContainerRuntime:docker CgroupsPerQOS:true C
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.784609 6300 
container_manager_linux.go:266] Creating device plugin manager: true
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.784616 6300 
manager.go:102] Creating Device Plugin manager at 
/var/lib/kubelet/device-plugins/kubelet.sock
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.784714 6300 
state_mem.go:36] [cpumanager] initializing new in-memory state store
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.784944 6300 
state_file.go:82] [cpumanager] state file: created new state file 
"/var/lib/origin/openshift.local.volumes/cpu_manager_state"
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.784988 6300 
server.go:895] Using root directory: /var/lib/origin/openshift.local.volumes
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.785013 6300 
kubelet.go:273] Adding pod path: /etc/origin/node/pods
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.785046 6300 
file.go:52] Watching path "/etc/origin/node/pods"
Sep 02 19:17:38 master.vnet.de origin-node[6300]: I0902 19:17:38.785054 6300 
kubelet.go:298] Watching apiserver
Sep 02 19:17:38 master.vnet.de origin-node[6300]: E0902 19:17:38.796651 6300 
reflector.go:205] 
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:461:
 Failed to list *v1.Node: Get 
https://master.vnet.de:8443/api/v1/nodes?fieldSelector=metadata.
Sep 02 19:17:38 master.vnet.de origin-node[6300]: E0902 19:17:38.796695 6300 
reflector.go:205] 
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:452:
 Failed to list *v1.Service: Get 

Re: openshift-ansible release-3.10 - Install fails with control plane pods

2018-09-02 Thread Marc Schlegel
I might have found something... it could be a Vagrant issue.

Vagrant uses two network interfaces: one for its own SSH access, and the other 
uses the IP configured in the Vagrantfile.
Here's a log from the etcd pod:

...
2018-09-02 17:15:43.896539 I | etcdserver: published {Name:master.vnet.de 
ClientURLs:[https://192.168.121.202:2379]} to cluster 6d42105e200fef69
2018-09-02 17:15:43.896651 I | embed: ready to serve client requests
2018-09-02 17:15:43.897149 I | embed: serving client requests on 
192.168.121.202:2379


The interesting part is that it is serving on 192.168.121.202, but the IP 
which should be used is 192.168.60.150.

[vagrant@master ~]$ ip ad 
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group 
default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
   valid_lft forever preferred_lft forever
inet6 ::1/128 scope host 
   valid_lft forever preferred_lft forever
2: eth0:  mtu 1500 qdisc pfifo_fast state UP 
group default qlen 1000
link/ether 52:54:00:87:13:01 brd ff:ff:ff:ff:ff:ff
inet 192.168.121.202/24 brd 192.168.121.255 scope global noprefixroute 
dynamic eth0
   valid_lft 3387sec preferred_lft 3387sec
inet6 fe80::5054:ff:fe87:1301/64 scope link 
   valid_lft forever preferred_lft forever
3: eth1:  mtu 1500 qdisc pfifo_fast state UP 
group default qlen 1000
link/ether 5c:a1:ab:1e:00:02 brd ff:ff:ff:ff:ff:ff
inet 192.168.60.150/24 brd 192.168.60.255 scope global noprefixroute eth1
   valid_lft forever preferred_lft forever
inet6 fe80::5ea1:abff:fe1e:2/64 scope link 
   valid_lft forever preferred_lft forever
4: docker0:  mtu 1500 qdisc noqueue state 
DOWN group default 
link/ether 02:42:8b:fa:b7:b0 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 scope global docker0
   valid_lft forever preferred_lft forever
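
To pull just that mapping out of the `ip` output above, a small helper (my own sketch using iproute2/awk, not part of the installer):

```shell
# Print the interface that owns a given IPv4 address, if any. Useful to
# confirm that the IP in the inventory (192.168.60.150 here) lives on
# eth1 and not on Vagrant's NAT interface eth0.
iface_for_ip() {
  ip -o -4 addr show | awk -v ip="$1" 'index($4, ip "/") == 1 { print $2 }'
}

iface_for_ip 192.168.60.150   # on the master this should print: eth1
iface_for_ip 127.0.0.1        # sanity check: lo on any normal Linux box
```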


Is there any way I can configure my inventory to use a dedicated 
network interface (eth1 in my Vagrant case)?



Am Freitag, 31. August 2018, 21:15:12 CEST schrieben Sie:
> The dependency chain for control plane is node then etcd then api then
> controllers. From your previous post it looks like there's no apiserver
> running. I'd look into what's wrong there.
> 
> Check `master-logs api api` if that doesn't provide you any hints then
> check the logs for the node service but I can't think of anything that
> would fail there yet result in successfully starting the controller pods.
> The apiserver and controller pods use the same image. Each pod will have
> two containers, the k8s_POD containers are rarely interesting.
> 
> On Thu, Aug 30, 2018 at 2:37 PM Marc Schlegel  wrote:
> 
> > Thanks for the link. It looks like the api-pod is not getting up at all!
> >
> > Log from k8s_controllers_master-controllers-*
> >
> > [vagrant@master ~]$ sudo docker logs
> > k8s_controllers_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_1
> > [... same connection-refused log as quoted earlier in the thread; snipped ...]

Re: openshift-ansible release-3.10 - Install fails with control plane pods

2018-09-02 Thread klaasdemter

Hi,
I hit this issue reproducibly after uninstalling a (failed/completed) 
installation and then reinstalling. It is, however, solved by rebooting 
all involved nodes/masters, so I did not investigate further.


Greetings
Klaas

On 31.08.2018 21:26, Marc Schlegel wrote:

Sure, see attached.

Before each attempt I pull the latest release-3.10 branch for openshift-ansible.

@Scott Dodson: I am going to investigate again using your suggestions.


Marc,

Is it possible to share  your ansible inventory file to review your
openshift installation? I know there are some changes in 3.10 installation
and might reflect in the inventory.

On Thu, Aug 30, 2018 at 3:37 PM Marc Schlegel  wrote:


Thanks for the link. It looks like the api-pod is not getting up at all!

Log from k8s_controllers_master-controllers-*

[vagrant@master ~]$ sudo docker logs
k8s_controllers_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_1
[... same connection-refused log as quoted earlier in the thread; snipped ...]

The log is full of those. Since it is all about the api, I tried to get the
logs from k8s_POD_master-api-master.vnet.de_kube-system_* which is
completely empty :-/

[vagrant@master ~]$ sudo docker logs
k8s_POD_master-api-master.vnet.de_kube-system_86017803919d833e39cb3d694c249997_1
[vagrant@master ~]$

Is there any special prerequisite about the api-pod?

regards
Marc



Marc,

could you please look over the issue [1] and pull the master pod logs and
see if you bumped into same issue mentioned by the other folks?
Also make sure the openshift-ansible release is the latest one.

Dani

[1] https://github.com/openshift/openshift-ansible/issues/9575

On Wed, Aug 29, 2018 at 7:36 PM Marc Schlegel 

wrote:

Hello everyone

I am having trouble getting a working Origin 3.10 installation using the 
openshift-ansible installer. My install always fails because the control 
plane pods are not available. I've checked out the release-3.10 branch from 
openshift-ansible and configured the inventory accordingly.


TASK [openshift_control_plane : Start and 

Re: openshift-ansible release-3.10 - Install fails with control plane pods

2018-08-31 Thread Marc Schlegel
Sure, see attached. 

Before each attempt I pull the latest release-3.10 branch for openshift-ansible.

@Scott Dodson: I am going to investigate again using your suggestions.

> Marc,
> 
> Is it possible to share  your ansible inventory file to review your
> openshift installation? I know there are some changes in 3.10 installation
> and might reflect in the inventory.
> 
> On Thu, Aug 30, 2018 at 3:37 PM Marc Schlegel  wrote:
> 
> > Thanks for the link. It looks like the api-pod is not getting up at all!
> >
> > Log from k8s_controllers_master-controllers-*
> >
> > [vagrant@master ~]$ sudo docker logs
> > k8s_controllers_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_1
> > [... same connection-refused log as quoted earlier in the thread; snipped ...]
> >
> > The log is full with those. Since it is all about api, I tried to get the
> > logs from k8s_POD_master-api-master.vnet.de_kube-system_* which is
> > completely empty :-/
> >
> > [vagrant@master ~]$ sudo docker logs
> > k8s_POD_master-api-master.vnet.de_kube-system_86017803919d833e39cb3d694c249997_1
> > [vagrant@master ~]$
> >
> > Is there any special prerequisite about the api-pod?
> >
> > regards
> > Marc
> >
> >
> > > Marc,
> > >
> > > could you please look over the issue [1] and pull the master pod logs and
> > > see if you bumped into same issue mentioned by the other folks?
> > > Also make sure the openshift-ansible release is the latest one.
> > >
> > > Dani
> > >
> > > [1] https://github.com/openshift/openshift-ansible/issues/9575
> > >
> > > On Wed, Aug 29, 2018 at 7:36 PM Marc Schlegel 
> > wrote:
> > >
> > > > Hello everyone
> > > >
> > > > I am having trouble getting a working Origin 3.10 installation using the
> > > > openshift-ansible installer. My install always fails because the control
> > > > plane pods are not available. I've checked out the 

Re: openshift-ansible release-3.10 - Install fails with control plane pods

2018-08-31 Thread Scott Dodson
The dependency chain for control plane is node then etcd then api then
controllers. From your previous post it looks like there's no apiserver
running. I'd look into what's wrong there.

Check `master-logs api api`. If that doesn't provide any hints, then check
the logs for the node service, but I can't think of anything that would
fail there yet result in successfully starting the controller pods.
The apiserver and controller pods use the same image. Each pod will have
two containers; the k8s_POD containers are rarely interesting.

On Thu, Aug 30, 2018 at 2:37 PM Marc Schlegel  wrote:

> Thanks for the link. It looks like the api-pod is not getting up at all!
>
> Log from k8s_controllers_master-controllers-*
>
> [vagrant@master ~]$ sudo docker logs
> k8s_controllers_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_1
> [... same connection-refused log as quoted earlier in the thread; snipped ...]
>
> The log is full with those. Since it is all about api, I tried to get the
> logs from k8s_POD_master-api-master.vnet.de_kube-system_* which is
> completely empty :-/
>
> [vagrant@master ~]$ sudo docker logs
> k8s_POD_master-api-master.vnet.de_kube-system_86017803919d833e39cb3d694c249997_1
> [vagrant@master ~]$
>
> Is there any special prerequisite about the api-pod?
>
> regards
> Marc
>
>
> > Marc,
> >
> > could you please look over the issue [1] and pull the master pod logs and
> > see if you bumped into same issue mentioned by the other folks?
> > Also make sure the openshift-ansible release is the latest one.
> >
> > Dani
> >
> > [1] https://github.com/openshift/openshift-ansible/issues/9575
> >
> > On Wed, Aug 29, 2018 at 7:36 PM Marc Schlegel 
> wrote:
> >
> > > Hello everyone
> > >
> > > I am having trouble getting a working Origin 3.10 installation using the
> > > openshift-ansible installer. My install always fails because the control
> > > plane pods are not available. I've checked out the release-3.10 branch 

Re: openshift-ansible release-3.10 - Install fails with control plane pods

2018-08-31 Thread Ricardo Martinelli de Oliveira
Marc,

Is it possible to share your Ansible inventory file so we can review your
OpenShift installation? I know there are some changes in the 3.10 installation
that might be reflected in the inventory.

On Thu, Aug 30, 2018 at 3:37 PM Marc Schlegel  wrote:

> Thanks for the link. It looks like the api-pod is not getting up at all!
>
> Log from k8s_controllers_master-controllers-*
>
> [vagrant@master ~]$ sudo docker logs
> k8s_controllers_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_1
> [... same connection-refused log as quoted earlier in the thread; snipped ...]
>
> The log is full of those. Since it is all about the API, I tried to get the
> logs from k8s_POD_master-api-master.vnet.de_kube-system_*, which is
> completely empty :-/
>
> [vagrant@master ~]$ sudo docker logs
> k8s_POD_master-api-master.vnet.de_kube-system_86017803919d833e39cb3d694c249997_1
> [vagrant@master ~]$
>
> Is there any special prerequisite for the api-pod?
>
> regards
> Marc
>
>
> > Marc,
> >
> > could you please look over the issue [1], pull the master pod logs, and
> > see if you bumped into the same issue mentioned by the other folks?
> > Also make sure the openshift-ansible release is the latest one.
> >
> > Dani
> >
> > [1] https://github.com/openshift/openshift-ansible/issues/9575
> >
> > On Wed, Aug 29, 2018 at 7:36 PM Marc Schlegel 
> wrote:
> >
> > > Hello everyone
> > >
> > > I am having trouble getting a working Origin 3.10 installation using the
> > > openshift-ansible installer. My install always fails because the control
> > > plane pods are not available. I've checked out the release-3.10 branch from
> > > openshift-ansible and configured the inventory accordingly.
> > >
> > >
> > > TASK [openshift_control_plane : Start and enable self-hosting node]
> > > **
> > > changed: [master]
> > > TASK [openshift_control_plane : Get node logs]
> > > ***
> > > skipping: [master]
> > > TASK 

Re: openshift-ansible release-3.10 - Install fails with control plane pods

2018-08-30 Thread Marc Schlegel
Thanks for the link. It looks like the api-pod is not getting up at all!

Log from k8s_controllers_master-controllers-*

[vagrant@master ~]$ sudo docker logs k8s_controllers_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_1
E0830 18:28:05.787358   1 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:594: Failed to list *v1.Pod: Get https://master.vnet.de:8443/api/v1/pods?fieldSelector=spec.schedulerName%3Ddefault-scheduler%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0830 18:28:05.788589   1 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87: Failed to list *v1.ReplicationController: Get https://master.vnet.de:8443/api/v1/replicationcontrollers?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0830 18:28:05.804239   1 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87: Failed to list *v1.Node: Get https://master.vnet.de:8443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0830 18:28:05.806879   1 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87: Failed to list *v1beta1.StatefulSet: Get https://master.vnet.de:8443/apis/apps/v1beta1/statefulsets?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0830 18:28:05.808195   1 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87: Failed to list *v1beta1.PodDisruptionBudget: Get https://master.vnet.de:8443/apis/policy/v1beta1/poddisruptionbudgets?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0830 18:28:06.673507   1 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87: Failed to list *v1.PersistentVolume: Get https://master.vnet.de:8443/api/v1/persistentvolumes?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0830 18:28:06.770141   1 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87: Failed to list *v1beta1.ReplicaSet: Get https://master.vnet.de:8443/apis/extensions/v1beta1/replicasets?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0830 18:28:06.773878   1 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87: Failed to list *v1.Service: Get https://master.vnet.de:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0830 18:28:06.778204   1 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87: Failed to list *v1.StorageClass: Get https://master.vnet.de:8443/apis/storage.k8s.io/v1/storageclasses?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0830 18:28:06.784874   1 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:87: Failed to list *v1.PersistentVolumeClaim: Get https://master.vnet.de:8443/api/v1/persistentvolumeclaims?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused

The log is full of those. Since it is all about the API, I tried to get the
logs from k8s_POD_master-api-master.vnet.de_kube-system_*, which is completely
empty :-/

[vagrant@master ~]$ sudo docker logs 
k8s_POD_master-api-master.vnet.de_kube-system_86017803919d833e39cb3d694c249997_1
[vagrant@master ~]$ 

Is there any special prerequisite for the api-pod?
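Given that every error above shows master.vnet.de resolving to 127.0.0.1, one thing worth ruling out is a bad /etc/hosts entry; many Vagrant boxes map the VM's own hostname to loopback. A self-contained sketch of the check (the hosts file and its entry are simulated here, not read from a live system):

```shell
# Simulate the loopback check against a throwaway hosts file. On a real
# master you would grep /etc/hosts directly. The entry below is the kind
# a Vagrant box typically writes, and it is exactly what makes the
# apiserver URL https://master.vnet.de:8443 dial 127.0.0.1.
hosts_file=$(mktemp)
printf '127.0.0.1 localhost master.vnet.de\n' > "$hosts_file"

if grep -Eq '^127\.0\.0\.1.*master\.vnet\.de' "$hosts_file"; then
  result="master.vnet.de is pinned to loopback - remove it from /etc/hosts"
else
  result="hosts file looks clean"
fi
echo "$result"
rm -f "$hosts_file"
```

If the entry is present, keeping only the plain `127.0.0.1 localhost` line and letting the DNS server hand out the cluster IPs should let the master reach its own apiserver.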

regards
Marc


> Marc,
> 
> could you please look over the issue [1], pull the master pod logs, and
> see if you bumped into the same issue mentioned by the other folks?
> Also make sure the openshift-ansible release is the latest one.
> 
> Dani
> 
> [1] https://github.com/openshift/openshift-ansible/issues/9575
> 
> On Wed, Aug 29, 2018 at 7:36 PM Marc Schlegel  wrote:
> 
> > Hello everyone
> >
> > I am having trouble getting a working Origin 3.10 installation using the
> > openshift-ansible installer. My install always fails because the control
> > plane pods are not available. I've checked out the release-3.10 branch from
> > openshift-ansible and configured the inventory accordingly.
> >
> >
> > TASK [openshift_control_plane : Start and enable self-hosting node]
> > **
> > changed: [master]
> > TASK [openshift_control_plane : Get node logs]
> > ***
> > skipping: [master]
> > TASK [openshift_control_plane : debug]
> > **
> > skipping: [master]
> > TASK [openshift_control_plane : fail]
> > *
> > skipping: [master]
> > TASK [openshift_control_plane : Wait for control plane pods to appear]
> > ***
> >
> > failed: [master] (item=etcd) => {"attempts": 60, "changed": false, "item":
> > "etcd", 

Re: openshift-ansible release-3.10 - Install fails with control plane pods

2018-08-30 Thread Daniel Comnea
Marc,

could you please look over the issue [1], pull the master pod logs, and see if
you bumped into the same issue mentioned by the other folks?
Also make sure the openshift-ansible release is the latest one.

Dani

[1] https://github.com/openshift/openshift-ansible/issues/9575

On Wed, Aug 29, 2018 at 7:36 PM Marc Schlegel  wrote:

> Hello everyone
>
> I am having trouble getting a working Origin 3.10 installation using the
> openshift-ansible installer. My install always fails because the control
> plane pods are not available. I've checked out the release-3.10 branch from
> openshift-ansible and configured the inventory accordingly.
>
>
> TASK [openshift_control_plane : Start and enable self-hosting node]
> **
> changed: [master]
> TASK [openshift_control_plane : Get node logs]
> ***
> skipping: [master]
> TASK [openshift_control_plane : debug]
> **
> skipping: [master]
> TASK [openshift_control_plane : fail]
> *
> skipping: [master]
> TASK [openshift_control_plane : Wait for control plane pods to appear]
> ***
>
> failed: [master] (item=etcd) => {"attempts": 60, "changed": false, "item":
> "etcd", "msg": {"cmd": "/bin/oc get pod master-etcd-master.vnet.de -o
> json -n kube-system", "results": [{}], "returncode": 1, "stderr": "The
> connection to the server master.vnet.de:8443 was refused - did you
> specify the right host or port?\n", "stdout": ""}}
>
> TASK [openshift_control_plane : Report control plane errors]
> *
> fatal: [master]: FAILED! => {"changed": false, "msg": "Control plane pods
> didn't come up"}
>
>
> I am using Vagrant to set up a local domain (vnet.de) which also includes
> a dnsmasq node to have full control over the DNS. The following VMs are
> running, and DNS and SSH work as expected:
>
> Hostname        IP
> domain.vnet.de  192.168.60.100
> master.vnet.de  192.168.60.150 (DNS also works for openshift.vnet.de, which
> is configured as openshift_master_cluster_public_hostname); also runs etcd
> infra.vnet.de   192.168.60.151 (openshift_master_default_subdomain wildcard
> points to this node)
> app1.vnet.de    192.168.60.152
> app2.vnet.de    192.168.60.153
>
>
> When connecting to the master-node I can see that several docker-instances
> are up and running
>
> [vagrant@master ~]$ sudo docker ps
> CONTAINER IDIMAGECOMMAND
> CREATED STATUS  PORTS
>  NAMES
>
> 9a0844123909ff5dd2137a4f "/bin/sh -c
> '#!/bi..."   19 minutes ago  Up 19 minutes
>  
> k8s_etcd_master-etcd-master.vnet.de_kube-system_a2c858fccd481c334a9af7413728e203_0
>
> 41d803023b72f216d84cdf54 "/bin/bash -c
> '#!/..."   19 minutes ago  Up 19 minutes
>  
> k8s_controllers_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_0
>
> 044c9d12588cdocker.io/openshift/origin-pod:v3.10.0
>  "/usr/bin/pod"   19 minutes ago  Up 19 minutes
>
>  
> k8s_POD_master-api-master.vnet.de_kube-system_86017803919d833e39cb3d694c249997_0
>
> 10a197e394b3docker.io/openshift/origin-pod:v3.10.0
>  "/usr/bin/pod"   19 minutes ago  Up 19 minutes
>
>  
> k8s_POD_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_0
>
> 20f4f86bdd07docker.io/openshift/origin-pod:v3.10.0
>  "/usr/bin/pod"   19 minutes ago  Up 19 minutes
>
>  
> k8s_POD_master-etcd-master.vnet.de_kube-system_a2c858fccd481c334a9af7413728e203_0
>
>
> However, there is no port 8443 open on the master node. No wonder the
> ansible-installer complains.
>
> The machines are running a plain CentOS 7.5, and I've run
> openshift-ansible/playbooks/prerequisites.yml first and then
> openshift-ansible/playbooks/deploy_cluster.yml.
> I've double-checked the installation documentation and my Vagrant
> config...all looks correct.
>
> Any ideas/advice?
> regards
> Marc
>
>


openshift-ansible release-3.10 - Install fails with control plane pods

2018-08-29 Thread Marc Schlegel
Hello everyone

I am having trouble getting a working Origin 3.10 installation using the
openshift-ansible installer. My install always fails because the control plane
pods are not available. I've checked out the release-3.10 branch from
openshift-ansible and configured the inventory accordingly.


TASK [openshift_control_plane : Start and enable self-hosting node] 
**
changed: [master]
TASK [openshift_control_plane : Get node logs] ***
skipping: [master]
TASK [openshift_control_plane : debug] 
**
skipping: [master]
TASK [openshift_control_plane : fail] 
*
skipping: [master]
TASK [openshift_control_plane : Wait for control plane pods to appear] 
***

failed: [master] (item=etcd) => {"attempts": 60, "changed": false, "item": 
"etcd", "msg": {"cmd": "/bin/oc get pod master-etcd-master.vnet.de -o json -n 
kube-system", "results": [{}], "returncode": 1, "stderr": "The connection to 
the server master.vnet.de:8443 was refused - did you specify the right host or 
port?\n", "stdout": ""}}  

TASK [openshift_control_plane : Report control plane errors] 
*
fatal: [master]: FAILED! => {"changed": false, "msg": "Control plane pods 
didn't come up"}


I am using Vagrant to set up a local domain (vnet.de) which also includes a
dnsmasq node to have full control over the DNS. The following VMs are running,
and DNS and SSH work as expected:

Hostname        IP
domain.vnet.de  192.168.60.100
master.vnet.de  192.168.60.150 (DNS also works for openshift.vnet.de, which is
configured as openshift_master_cluster_public_hostname); also runs etcd
infra.vnet.de   192.168.60.151 (openshift_master_default_subdomain wildcard
points to this node)
app1.vnet.de    192.168.60.152
app2.vnet.de    192.168.60.153
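A dnsmasq configuration on the domain VM matching that table could be sketched as follows (written to a throwaway path here; the wildcard subdomain name apps.vnet.de is an assumption, since the actual openshift_master_default_subdomain value isn't shown):

```shell
# Hypothetical dnsmasq config for the domain VM. In dnsmasq,
# address=/name/ip also matches every subdomain of "name", which is
# what provides the router wildcard.
cat > vnet.conf <<'EOF'
address=/master.vnet.de/192.168.60.150
address=/infra.vnet.de/192.168.60.151
address=/app1.vnet.de/192.168.60.152
address=/app2.vnet.de/192.168.60.153
# wildcard for the default subdomain: *.apps.vnet.de -> infra node
address=/apps.vnet.de/192.168.60.151
EOF
```

On a real box the file would go under /etc/dnsmasq.d/ followed by a dnsmasq restart.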


When connecting to the master-node I can see that several docker-instances are 
up and running

[vagrant@master ~]$ sudo docker ps
CONTAINER IDIMAGECOMMAND
  CREATED STATUS  PORTS   NAMES 


9a0844123909ff5dd2137a4f "/bin/sh -c 
'#!/bi..."   19 minutes ago  Up 19 minutes   
k8s_etcd_master-etcd-master.vnet.de_kube-system_a2c858fccd481c334a9af7413728e203_0

41d803023b72f216d84cdf54 "/bin/bash -c 
'#!/..."   19 minutes ago  Up 19 minutes   
k8s_controllers_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_0
  
044c9d12588cdocker.io/openshift/origin-pod:v3.10.0   "/usr/bin/pod" 
  19 minutes ago  Up 19 minutes   
k8s_POD_master-api-master.vnet.de_kube-system_86017803919d833e39cb3d694c249997_0
  
10a197e394b3docker.io/openshift/origin-pod:v3.10.0   "/usr/bin/pod" 
  19 minutes ago  Up 19 minutes   
k8s_POD_master-controllers-master.vnet.de_kube-system_a3c3ca56f69ed817bad799176cba5ce8_0
  
20f4f86bdd07docker.io/openshift/origin-pod:v3.10.0   "/usr/bin/pod" 
  19 minutes ago  Up 19 minutes   
k8s_POD_master-etcd-master.vnet.de_kube-system_a2c858fccd481c334a9af7413728e203_0
 

However, there is no port 8443 open on the master node. No wonder the
ansible-installer complains.
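The installer's wait task boils down to repeatedly probing that port. A pure-bash equivalent, handy for checking by hand (hostname and port are from this thread; the example probes loopback so it is self-contained):

```shell
# Minimal TCP port probe using bash's /dev/tcp, roughly what the
# installer's "wait for control plane pods" retry loop amounts to.
port_state() {
  # Opening fd 3 on /dev/tcp/<host>/<port> succeeds only if something listens.
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
    echo open
  else
    echo closed
  fi
}

# On the failing master, "port_state master.vnet.de 8443" prints "closed".
result=$(port_state 127.0.0.1 8443)
echo "8443 on 127.0.0.1: $result"
```

Requires bash (the /dev/tcp pseudo-device is a bash feature, not POSIX sh).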

The machines are running a plain CentOS 7.5, and I've run
openshift-ansible/playbooks/prerequisites.yml first and then
openshift-ansible/playbooks/deploy_cluster.yml.
I've double-checked the installation documentation and my Vagrant config...all
looks correct.

Any ideas/advice?
regards
Marc


___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users