Hi Dan,

Which openshift-ansible release tag did you use?

Cheers,
Dani

On Mon, Jun 3, 2019 at 4:18 PM Punga Dan <dan.pu...@gmail.com> wrote:

> Thank you very much for the extensive response, Samuel!
>
> I've found that I do have a DNS misconfiguration, so the CSR error from
> the title is not caused by anything in the OpenShift installer procedure.
>
> Somehow (and I haven't yet found the reason, but I'm still looking for it)
> dnsmasq fills the upstream DNS configuration with some public nameservers
> and not my "internal" DNS.
> So after the openshift-ansible playbook installs dnsmasq and calls the
> /etc/NetworkManager/dispatcher.d/99-origin-dns.sh script (which restarts
> NetworkManager), all nodes end up with "bad" upstream nameservers (in the
> /etc/dnsmasq.d/origin-upstream-dns.conf and /etc/origin/node/resolv.conf
> files).
> Even if the /etc/resolv.conf file for each host has the right nameserver
> and search domain, dnsmasq populates the OKD-related conf files above with
> a different nameserver.
>
> I think this is related to dnsmasq/NetworkManager-specific configuration.
> I will have to look into it and figure out what's not going as expected
> and why. I believe these are served by the DHCP server, but I'm still
> looking for a way to address this.
>
> Anyway, thanks again for the input, it put me on the right track! :)
>
> Dan
>
> On Sun, Jun 2, 2019 at 22:04, Samuel Martín Moro <faus...@gmail.com> wrote:
>
>> Hi,
>>
>>
>> This is quite puzzling. Could you share your inventory with us? Make
>> sure to obfuscate any sensitive data (ldap/htpasswd credentials among
>> others).
>> I'm mostly interested in potential openshift_node_groups edits, although
>> something else might come up.
>>
>>
>> At first glance, you are right, it sounds like a firewalling issue.
>> Yet from your description, you did open all required ports.
>> I could suggest you check back on these, make sure your data is accurate
>> - although I would assume it is.
>> Also: if using Cri-O as a runtime, note that you would be missing port
>> 10010, which should be opened on all nodes. Yet I don't think that one
>> would be related to node registration against your master API.
>>
>> Another explanation could be related to DNS (can your infra/compute nodes
>> properly resolve your master's name? The contrary would be unusual, but
>> it could still explain what's going on).
>>
>> As a general rule, at that stage, I would restart the origin-node service
>> on those hosts that fail to register, keeping an eye on /var/log/messages
>> (or journalctl -f).
>> If that doesn't help, I might raise log levels in
>> /etc/sysconfig/origin-node (there's a variable which defaults to 2, you
>> can change it to 99; beware that it would give you a lot of logs and could
>> saturate your disks at some point, so don't keep it like this over a long
>> period).
>>
>> Dealing with large volumes of logs, note that openshift services tend to
>> store messages with a prefix based on severity: you might be able to
>> "| grep -E 'E[0-9][0-9]'" to focus on error messages, or W[0-9][0-9] for
>> warnings, ...
>>
>> Your issue being potentially related to firewalling, I might also use
>> tcpdump to look into what's being exchanged between nodes.
>> Look for any packets with a SYN flag ("[S]") that would not be followed
>> by a SYN-ACK ("[S.]").
>>
>>
>> Let us know how that goes,
>>
>>
>> Good luck.
>> Failing during the "Approve node certificate" steps is relatively common,
>> and could have several causes, from node groups configuration, to DNS,
>> firewalls, broken TCP handshake, MTU not allowing for certificates to go
>> through, ... we'll want to dig deeper, to elucidate that issue.
>>
>>
>> Regards.
>>
>> On Sat, Jun 1, 2019 at 12:19 PM Punga Dan <dan.pu...@gmail.com> wrote:
>>
>>> Hello all!
>>>
>>> I'm hitting a problem when trying to install OKD 3.11 on one master, 2
>>> infra and 2 compute nodes. The hosts are VMs that run CentOS 7.
>>> I've gone through the issues related to this subject:
>>> https://access.redhat.com/solutions/3680401 which suggests naming the
>>> hosts as FQDN. Tried it, with the same problem appearing for the same
>>> set of hosts (all except the master).
>>>
>>> In my case the error is only for the 2 infra nodes and 2 compute nodes,
>>> so not for the master as well.
>>>
>>> oc get nodes gives me just the master node, but I guess this is the case
>>> as the other OKD nodes stand to be created by the process that fails. Am
>>> I wrong?
>>>
>>> oc get csr gives me a result of 3 csrs:
>>> [root@master ~]# oc get csr
>>> NAME        AGE   REQUESTOR            CONDITION
>>> csr-4xjjb   24m   system:admin         Approved,Issued
>>> csr-b6x45   24m   system:admin         Approved,Issued
>>> csr-hgmpf   20m   system:node:master   Approved,Issued
>>>
>>> Here I believe I have 2 csrs for system:admin because I ran
>>> the playbooks/openshift-node/join.yml a second time.
>>>
>>> The bootstrapping certificates on the master look fine(??)
>>> [root@master ~]# ll /etc/origin/node/certificates/
>>> total 20
>>> -rw-------. 1 root root 2830 iun  1 11:30 kubelet-client-2019-06-01-11-30-04.pem
>>> -rw-------. 1 root root 1135 iun  1 11:31 kubelet-client-2019-06-01-11-31-23.pem
>>> lrwxrwxrwx. 1 root root   68 iun  1 11:31 kubelet-client-current.pem -> /etc/origin/node/certificates/kubelet-client-2019-06-01-11-31-23.pem
>>> -rw-------. 1 root root 1179 iun  1 11:35 kubelet-server-2019-06-01-11-35-42.pem
>>> lrwxrwxrwx. 1 root root   68 iun  1 11:35 kubelet-server-current.pem -> /etc/origin/node/certificates/kubelet-server-2019-06-01-11-35-42.pem
>>>
>>> I've rechecked the open ports thinking the issue lies in some
>>> network-related config.
>>> - all hosts have the node related ports opened: 53/udp, 10250/tcp,
>>>   4789/udp
>>> - master(with etcd): 8053/udp+tcp, 2049/udp+tcp, 8443/tcp, 8444/tcp,
>>>   4789/udp, 53/udp
>>> - infra has on top of the node ones, the ports related to router/routes
>>>   and logging components which it will host
>>>
>>> The chosen SDN
>>> is os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant' with no
>>> extra config in the inventory file. (Do I need any?)
>>>
>>>
>>> Any hints about where and what to check would be much appreciated!
>>>
>>> Best regards,
>>> Dan Pungă
>>> _______________________________________________
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>>
>>
>> --
>> Samuel Martín Moro
>> {EPITECH.} 2011
>>
>> "Nobody wants to say how this works.
>>  Maybe nobody knows ..."
>>                       Xorg.conf(5)
>>
> _______________________________________________
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
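
For anyone landing on this thread with the same symptom: the upstream DNS
mismatch described above could be checked on each node with something like
the rough sketch below. It assumes that the generated
/etc/dnsmasq.d/origin-upstream-dns.conf holds plain "server=<ip>" lines and
that /etc/origin/node/resolv.conf holds the usual "nameserver <ip>" lines.
The function names and the 10.0.0.2 address used in the example are made up
for illustration.

```shell
#!/bin/sh
# Sketch only: the helper names are hypothetical, and the "server=<ip>"
# line format of the dnsmasq file is an assumption, not taken from the thread.

# Print the upstream servers found in a dnsmasq conf file,
# one per line ("server=<ip>" lines only).
dnsmasq_upstreams() {
    sed -n 's/^server=//p' "$1"
}

# Print the nameservers found in a resolv.conf-style file, one per line.
resolv_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# Succeed only if the expected DNS server is in the list printed by the
# given helper, e.g.: has_server 10.0.0.2 dnsmasq_upstreams <file>
has_server() {
    expected=$1
    shift
    "$@" | grep -Fxq "$expected"
}
```

With the internal DNS server at, say, 10.0.0.2, running
"has_server 10.0.0.2 dnsmasq_upstreams /etc/dnsmasq.d/origin-upstream-dns.conf"
and "has_server 10.0.0.2 resolv_nameservers /etc/origin/node/resolv.conf"
on a node should both succeed. A non-zero exit status on either would point
to the kind of "bad" upstream nameservers Dan describes.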