Hi Jason,

Kindest thanks for trying to help.

In order:

1) Indeed, the "lb" host is configured (via dnsmasq) as a DNS forwarder, has the correct "/etc/hosts" (which is propagated to all the other hosts in the cluster), and all hosts have an entry pointing to it in "/etc/resolv.conf".

2) I'm a bit puzzled wrt "system:node" vs "system:anonymous". I've just tested the corresponding curl call on another system where everything works as expected (at least so far...), and the response I get back from a GET to "/api/v1/namespaces" still refers to "system:anonymous", and not "system:node".

Also, to make things even weirder, if I copy the node "kubeconfig" into ".kube/config", I am identified accordingly (i.e. as "system:node") when doing an "oc whoami".

3) Thanks for pointing out that specifying "HTTP_PROXY" / "HTTPS_PROXY" and, respectively, "NO_PROXY" is not yet possible via the Ansible installer.

My remaining question is: is there any way to debug the authentication process, i.e. why the "oc login" with the "htpasswd" back end doesn't work?

Thanks again,

/Florian

On Thu, Feb 25, 2016 at 10:30 PM, Jason DeTiberus <[email protected]> wrote:

>
> On Thu, Feb 25, 2016 at 10:54 AM, Florian Daniel Otel <
> [email protected]> wrote:
>
>>
>> Hello all,
>>
>> I have the following problems:
>>
>> I have a multimaster OSE setup consisting of the following:
>> - A LB with "native" HA
>> - Three masters (doubling as "etcd" nodes)
>> - Two nodes
>>
>> All the hosts are themselves OpenStack instances (hence the ".novalocal"
>> suffix). DNS is via an "/etc/hosts" propagated across, with the "lb" host
>> doubling as DNS forwarder (via dnsmasq). All Internet access is via an http
>> / https proxy.
>>
>
> So, if I'm understanding this correctly, then the lb host is correctly
> resolving the dns for all of the *.novalocal addresses that are in use by
> the cluster and all of the hosts are pre-configured to use the lb host as
> the dns resolver prior to running the installation?
> If not, then there will
> definitely be issues, since /etc/hosts is not used by deployed containers.
>
>>
>> After many attempts we finally get a setup that is somewhat working (see
>> P.S. for why "somehow"). Attached is the "/etc/ansible/hosts" file.
>> Installation is from the main "openshift-ansible" repo (
>> https://github.com/openshift/openshift-ansible)
>>
>> My problem:
>>
>> After installation, on one master I created two users in
>> "/etc/origin/htpasswd". After creation I have propagated the file to all
>> the other masters. UNIX permissions to the file on all masters are "0600"
>>
>> However, doing an "oc login" returns a "401 Unauthorized", and I cannot
>> find what the issue is, or how to debug it (no trace for it in the
>> "atomic-openshift-master-api" or "atomic-openshift-master-controllers"
>> logs).
>>
>
>>
>> [root@az1node01 ~]# oc login
>> Authentication required for https://az1lb01.mydomain.novalocal:8443
>> (openshift)
>> Username: reguser
>> Password:
>> Login failed (401 Unauthorized)
>> Unauthorized
>>
>> The puzzling thing is that using the "system:node" certificates and keys
>> work (in the sense I am identified as "system:anonymous"):
>>
>
> Something is definitely not right here, the user for the system:node certs
> should be identified as the system:node user and not anonymous. I suspect
> that there is a larger issue at play here.
>
> It looks like the initial cluster creation may have had issues... The
> atomic-openshift-master-api logs should provide more insight into what may
> have gone wrong.
>
>>
>> curl -v --cacert /etc/origin/node/ca.crt --cert
>> "/etc/origin/node/system:node:az1node01.mydomain.novalocal.crt" --key
>> "/etc/origin/node/system:node:az1node01.mydomain.novalocal.key"
>> https://az1lb01.mydomain.novalocal:8443/api/v1/namespaces
>> * About to connect() to az1lb01.mydomain.novalocal port 8443 (#0)
>> * Trying 10.0.0.31...
>> * Connected to az1lb01.mydomain.novalocal (10.0.0.31) port 8443 (#0)
>> * Initializing NSS with certpath: sql:/etc/pki/nssdb
>> * CAfile: /etc/origin/node/ca.crt
>>   CApath: none
>> * NSS: client certificate not found: /etc/origin/node/system
>> * SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
>> * Server certificate:
>> * subject: CN=10.0.0.24
>> * start date: Feb 24 19:40:56 2016 GMT
>> * expire date: Feb 23 19:40:57 2018 GMT
>> * common name: 10.0.0.24
>> * issuer: CN=openshift-signer@1456342841
>> > GET /api/v1/namespaces HTTP/1.1
>> > User-Agent: curl/7.29.0
>> > Host: az1lb01.mydomain.novalocal:8443
>> > Accept: */*
>> >
>> < HTTP/1.1 403 Forbidden
>> < Cache-Control: no-store
>> < Content-Type: application/json
>> < Date: Thu, 25 Feb 2016 14:42:41 GMT
>> < Content-Length: 255
>> <
>> {
>>   "kind": "Status",
>>   "apiVersion": "v1",
>>   "metadata": {},
>>   "status": "Failure",
>>   "message": "User \"system:anonymous\" cannot list all namespaces in the
>> cluster",
>>   "reason": "Forbidden",
>>   "details": {
>>     "kind": "namespaces"
>>   },
>>   "code": 403
>> }
>> * Connection #0 to host az1lb01.mydomain.novalocal left intact
>>
>> Attached is also the master configuration file for one master.
>>
>> My questions:
>>
>> - I had many issues in getting the installation working, mostly due to
>> the Ansible installer reading the OpenStack instance metadata, and
>> inconsistencies btw. that and the "hostname".
>>
>> Is there any particular repo / branch of the installer that is known to
>> work in this particular setup ? Any particular settings I should use in the
>> Ansible hosts file ?
>>
>> I suspect the certificate issues I'm encountering is because of that
>> (in combination with the proxy) but I'm not sure.
>>
>> - Operating behind an HTTP / HTTPS proxy: Even before starting the
>> Ansible installer, Docker was (properly) configured to the HTTP / HTTPS
>> proxy settings, and working correctly.
>> However, for the installer itself I
>> found no way to express the "HTTP_PROXY" "HTTPS_PROXY" and, particularly,
>> the "NO_PROXY" settings. For that I'm relying on exported environment
>> variables in the shell. Is there a "proper" way to do this via the
>> installer itself.
>>
>
> There is an openshift-ansible PR to expose this directly (
> https://github.com/openshift/openshift-ansible/pull/1385)
>
>>
>> Post installer I have manually added those settings into
>> "/etc/sysconfig/atomic-openshift-master",
>> "/etc/sysconfig/atomic-openshift-master-controllers",
>> "/etc/sysconfig/atomic-openshift-master-api" and, respectively for the
>> nodes, "/etc/sysconfig/atomic-openshift-node", but don't know how to do
>> this via the installer itself.
>>
>> - Is there an issue with the masters doubling as "etcd" nodes ?
>>
>
> No, there should not be any issues with co-locating the etcd service
> alongside the masters.
>
>>
>> The most frustrating part is that I have this very setup working
>> perfectly fine in a public cloud environment (namely on GCE) , but with the
>> (three) "etcd" hosts distinct from the masters (i.e. total of 9 hosts
>> instead of 6), and with unproxied Internet access.... However, that
>> installation is from a different repo branch (namely from "
>> https://github.com/detiber/openshift-ansible" from the "gceFixes" branch
>> )
>>
>
> I *believe* all of the fixes from gceFixes have been merged into master at
> this point.
>
>>
>> Thanks a lot for the help,
>>
>> Florian
>>
>> P.S.
>> The weirdest case wrt certificates is when trying to check the
>> "etcd" cluster:
>>
>> [root@az1master01 ~]# etcdctl --debug -C
>> https://az1master01.mydomain.novalocal:2379,
>> https://az3master02.mydomain.novalocal:2379,
>> https://az3master03.mydomain.novalocal:2379 --ca-file
>> /etc/origin/master/ca.crt --cert-file
>> /etc/origin/master/master.etcd-client.crt --key-file
>> /etc/origin/master/master.etcd-client.key cluster-health
>> Cluster-Endpoints: https://az3master02.mydomain.novalocal:2379,
>> https://az1master01.mydomain.novalocal:2379,
>> https://az3master03.mydomain.novalocal:2379
>> cURL Command: curl -X GET
>> https://az3master02.mydomain.novalocal:2379/v2/members
>> cURL Command: curl -X GET
>> https://az1master01.mydomain.novalocal:2379/v2/members
>> cURL Command: curl -X GET
>> https://az3master03.mydomain.novalocal:2379/v2/members
>> cluster may be unhealthy: failed to list members
>> Error: client: etcd cluster is unavailable or misconfigured
>> error #0: x509: certificate signed by unknown authority
>> error #1: x509: certificate signed by unknown authority
>> error #2: x509: certificate signed by unknown authority
>>
>
> You need to use the etcd ca cert here: etcdctl --debug -C
> https://az1master01.mydomain.novalocal:2379,
> https://az3master02.mydomain.novalocal:2379,
> https://az3master03.mydomain.novalocal:2379 --ca-file
> /etc/origin/master/master.etcd-ca.crt --cert-file
> /etc/origin/master/master.etcd-client.crt --key-file
> /etc/origin/master/master.etcd-client.key cluster-health
>
>>
>> Attempting doing a direct curl to the "etcd"
>>
>> [root@az1master01 ~]# curl -v --cacert /etc/origin/master/ca.crt
>> --cert /etc/origin/master/master.etcd-client.crt --key
>> /etc/origin/master/master.etcd-client.key
>> https://az1master01.mydomain.novalocal:2379/v2/members
>> * About to connect() to az1master01.mydomain.novalocal port 2379 (#0)
>> * Trying 10.0.0.22...
>> * Connected to az1master01.mydomain.novalocal (10.0.0.22) port 2379 (#0)
>> * Initializing NSS with certpath: sql:/etc/pki/nssdb
>> * CAfile: /etc/origin/master/ca.crt
>>   CApath: none
>> * Server certificate:
>> * subject: CN=az1master01.mydomain.novalocal
>> * start date: Feb 24 19:38:07 2016 GMT
>> * expire date: Feb 23 19:38:07 2017 GMT
>> * common name: az1master01.mydomain.novalocal
>> * issuer: CN=etcd-signer@1456342665
>> * NSS error -8179 (SEC_ERROR_UNKNOWN_ISSUER)
>> * Peer's Certificate issuer is not recognized.
>> * Closing connection 0
>> curl: (60) Peer's Certificate issuer is not recognized.
>> More details here: http://curl.haxx.se/docs/sslcerts.html
>>
>> curl performs SSL certificate verification by default, using a "bundle"
>> of Certificate Authority (CA) public keys (CA certs). If the default
>> bundle file isn't adequate, you can specify an alternate file
>> using the --cacert option.
>> If this HTTPS server uses a certificate signed by a CA represented in
>> the bundle, the certificate verification probably failed due to a
>> problem with the certificate (it might be expired, or the name might
>> not match the domain name in the URL).
>> If you'd like to turn off curl's verification of the certificate, use
>> the -k (or --insecure) option.
>> [root@az1master01 ~]#
>>
>> _______________________________________________
>> users mailing list
>> [email protected]
>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>
>
> --
> Jason DeTiberus
>
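
P.S. A possible explanation for the "system:anonymous" result in the curl call quoted above, offered as a guess rather than a confirmed diagnosis: curl's "--cert" option takes "<certificate[:password]>", so an unescaped ":" in the file name is read as the start of a password. That would match the "NSS: client certificate not found: /etc/origin/node/system" line in the trace, where the path is cut at the first colon and no client certificate is presented. Per the curl man page, a ":" in the name can be preceded by "\" to avoid this. Below is a minimal sketch of that escaping; the helper name "escape_cert_path" is made up for illustration, and the curl call itself is left as a comment because it only makes sense on a node of this cluster.

```shell
#!/bin/sh
# Hypothetical helper: prefix every ':' with '\' so that curl's --cert
# option does not treat the remainder of the path as a password.
escape_cert_path() {
    printf '%s\n' "$1" | sed 's/:/\\:/g'
}

# Certificate path taken from the curl call quoted in this thread.
cert='/etc/origin/node/system:node:az1node01.mydomain.novalocal.crt'
escaped=$(escape_cert_path "$cert")

# Show the escaped form that would be passed to --cert.
printf '%s\n' "$escaped"

# On the node itself the call would then look roughly like this
# (not executed here; --key takes a plain file name):
#   curl -v --cacert /etc/origin/node/ca.crt \
#        --cert "$escaped" \
#        --key "/etc/origin/node/system:node:az1node01.mydomain.novalocal.key" \
#        https://az1lb01.mydomain.novalocal:8443/api/v1/namespaces
```

If the trace then no longer shows "client certificate not found", the request should be attributed to the node identity instead of "system:anonymous"; if it still does, the cause lies elsewhere.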
