On Thu, Feb 25, 2016 at 5:03 PM, Florian Daniel Otel <[email protected]> wrote:

> Hi Jason,
>
> Kindest thanks for trying to help.
>
> In order
>
> 1) Indeed, the "lb" host is configured (via dnsmasq) as a DNS forwarder,
> has the correct "/etc/hosts" (which is propagated to all the other hosts in
> the cluster), and all hosts have an entry pointing to it in the
> "/etc/resolv.conf"
>
> 2) A bit puzzled wrt "system:node" vs "system:anonymous"....
>
> I've just tested the corresponding curl call on another system where
> everything works as expected (at least so far...) and the response I get
> back from a GET to "/api/v1/namespaces" still refers to "system:anonymous",
> and not "system:node"
>
> Also, to make things even more weird, if I copy the node "kubeconfig" in
> the ".kube/config" I am identified accordingly (i.e. as "system:node") when
> doing an "oc whoami"
>

I'm probably missing something with the way that the node identifies itself
when using client certificate authentication; I'm seeing the same behavior
on a system I have that is functioning as expected.

>
> 3) Thanks for pointing out that specifying "HTTP_PROXY" / "HTTPS_PROXY"
> and resp "NO_PROXY" is not yet possible via the Ansible installer.
>
> My remaining question is: Is there any way to debug the authentication
> process / why the "oc login" with "htpasswd" back end doesn't work ?
>

You will most likely need to increase the logging level to see
authentication logs for the api service. In
/etc/sysconfig/atomic-openshift-master-api, increasing the loglevel to 4
should provide output around the authentication failure.

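A minimal sketch of raising the log level, assuming the sysconfig file carries a line of the form "OPTIONS=--loglevel=2 ..." (that layout is an assumption, not something shown in this thread). The helper name set_loglevel is hypothetical; it only rewrites the number after --loglevel= and leaves every other line alone.

```shell
#!/bin/sh
# Sketch only. Assumption: the sysconfig file contains "--loglevel=<N>"
# somewhere on its OPTIONS line. set_loglevel is a hypothetical helper.

set_loglevel() {
    # $1 = path to the sysconfig file, $2 = desired log level (integer)
    file=$1
    level=$2
    tmp=$(mktemp) || return 1
    # Rewrite only the numeric value that follows --loglevel=
    if sed "s/--loglevel=[0-9][0-9]*/--loglevel=${level}/" "$file" > "$tmp"; then
        # Write back through the existing file so owner and mode are kept
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Possible usage on each master, as root:
#   set_loglevel /etc/sysconfig/atomic-openshift-master-api 4
#   systemctl restart atomic-openshift-master-api
#   journalctl -u atomic-openshift-master-api -f
```

After the restart, retrying the failing "oc login" while following the journal should show what the API service logs for that request at the higher level.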
>
> Thanks again,
>
> /Florian
>
>
> On Thu, Feb 25, 2016 at 10:30 PM, Jason DeTiberus <[email protected]> wrote:
>
>>
>> On Thu, Feb 25, 2016 at 10:54 AM, Florian Daniel Otel <[email protected]> wrote:
>>
>>>
>>> Hello all,
>>>
>>> I have the following problems:
>>>
>>> I have a multimaster OSE setup consisting of the following:
>>> - A LB with "native" HA
>>> - Three masters (doubling as "etcd" nodes)
>>> - Two nodes
>>>
>>> All the hosts are themselves OpenStack instances (hence the ".novalocal"
>>> suffix). DNS is via an "/etc/hosts" propagated across, with the "lb" host
>>> doubling as DNS forwarder (via dnsmasq). All Internet access is via an http
>>> / https proxy.
>>>
>>
>> So, if I'm understanding this correctly, then the lb host is correctly
>> resolving the dns for all of the *.novalocal addresses that are in use by
>> the cluster and all of the hosts are pre-configured to use the lb host as
>> the dns resolver prior to running the installation? If not, then there will
>> definitely be issues, since /etc/hosts is not used by deployed containers.
>>
>>>
>>> After many attempts we finally get a setup that is somewhat working (see
>>> P.S. for why "somehow"). Attached is the "/etc/ansible/hosts" file.
>>> Installation is from the main "openshift-ansible" repo (
>>> https://github.com/openshift/openshift-ansible)
>>>
>>> My problem:
>>>
>>> After installation, on one master I created two users in
>>> "/etc/origin/htpasswd". After creation I have propagated the file to all
>>> the other masters. UNIX permissions to the file on all masters are "0600"
>>>
>>> However, doing an "oc login" returns a "401 Unauthorized", and I cannot
>>> find what the issue is, or how to debug it (no trace for it in the
>>> "atomic-openshift-master-api" or "atomic-openshift-master-controllers"
>>> logs).
>>>
>>> [root@az1node01 ~]# oc login
>>> Authentication required for https://az1lb01.mydomain.novalocal:8443
>>> (openshift)
>>> Username: reguser
>>> Password:
>>> Login failed (401 Unauthorized)
>>> Unauthorized
>>>
>>> The puzzling thing is that using the "system:node" certificates and keys
>>> work (in the sense I am identified as "system:anonymous"):
>>>
>>
>> Something is definitely not right here, the user for the system:node
>> certs should be identified as the system:node user and not anonymous. I
>> suspect that there is a larger issue at play here.
>>
>> It looks like the initial cluster creation may have had issues... The
>> atomic-openshift-master-api logs should provide more insight into what may
>> have gone wrong.
>>
>>>
>>> curl -v --cacert /etc/origin/node/ca.crt --cert
>>> "/etc/origin/node/system:node:az1node01.mydomain.novalocal.crt" --key
>>> "/etc/origin/node/system:node:az1node01.mydomain.novalocal.key"
>>> https://az1lb01.mydomain.novalocal:8443/api/v1/namespaces
>>> * About to connect() to az1lb01.mydomain.novalocal port 8443 (#0)
>>> * Trying 10.0.0.31...
>>> * Connected to az1lb01.mydomain.novalocal (10.0.0.31) port 8443 (#0)
>>> * Initializing NSS with certpath: sql:/etc/pki/nssdb
>>> * CAfile: /etc/origin/node/ca.crt
>>> CApath: none
>>> * NSS: client certificate not found: /etc/origin/node/system
>>> * SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
>>> * Server certificate:
>>> * subject: CN=10.0.0.24
>>> * start date: Feb 24 19:40:56 2016 GMT
>>> * expire date: Feb 23 19:40:57 2018 GMT
>>> * common name: 10.0.0.24
>>> * issuer: CN=openshift-signer@1456342841
>>> > GET /api/v1/namespaces HTTP/1.1
>>> > User-Agent: curl/7.29.0
>>> > Host: az1lb01.mydomain.novalocal:8443
>>> > Accept: */*
>>> >
>>> < HTTP/1.1 403 Forbidden
>>> < Cache-Control: no-store
>>> < Content-Type: application/json
>>> < Date: Thu, 25 Feb 2016 14:42:41 GMT
>>> < Content-Length: 255
>>> <
>>> {
>>>   "kind": "Status",
>>>   "apiVersion": "v1",
>>>   "metadata": {},
>>>   "status": "Failure",
>>>   "message": "User \"system:anonymous\" cannot list all namespaces in the cluster",
>>>   "reason": "Forbidden",
>>>   "details": {
>>>     "kind": "namespaces"
>>>   },
>>>   "code": 403
>>> }
>>> * Connection #0 to host az1lb01.mydomain.novalocal left intact
>>>
>>> Attached is also the master configuration file for one master.
>>>
>>>
>>> My questions:
>>>
>>> - I had many issues in getting the installation working, mostly due to
>>> the Ansible installer reading the OpenStack instance metadata, and
>>> inconsistencies btw. that and the "hostname".
>>>
>>> Is there any particular repo / branch of the installer that is known
>>> to work in this particular setup ? Any particular settings I should use in
>>> the Ansible hosts file ?
>>>
>>> I suspect the certificate issues I'm encountering is because of that
>>> (in combination with the proxy) but I'm not sure.
>>>
>>> - Operating behind an HTTP / HTTPS proxy: Even before starting the
>>> Ansible installer, Docker was (properly) configured to the HTTP / HTTPS
>>> proxy settings, and working correctly. However, for the installer itself I
>>> found no way to express the "HTTP_PROXY" "HTTPS_PROXY" and, particularly,
>>> the "NO_PROXY" settings. For that I'm relying on exported environment
>>> variables in the shell. Is there a "proper" way to do this via the
>>> installer itself.
>>>
>>
>> There is an openshift-ansible PR to expose this directly (
>> https://github.com/openshift/openshift-ansible/pull/1385)
>>
>>>
>>> Post installer I have manually added those settings into
>>> "/etc/sysconfig/atomic-openshift-master",
>>> "/etc/sysconfig/atomic-openshift-master-controllers",
>>> "/etc/sysconfig/atomic-openshift-master-api" and, respectively for the
>>> nodes, "/etc/sysconfig/atomic-openshift-node", but don't know how to do
>>> this via the installer itself.
>>>
>>>
>>> - Is there an issue with the masters doubling as "etcd" nodes ?
>>>
>>
>> No, there should not be any issues with co-locating the etcd service
>> alongside the masters.
>>
>>>
>>> The most frustrating part is that I have this very setup working
>>> perfectly fine in a public cloud environment (namely on GCE) , but with the
>>> (three) "etcd" hosts distinct from the masters (i.e. total of 9 hosts
>>> instead of 6), and with unproxied Internet access.... However, that
>>> installation is from a different repo branch (namely from "
>>> https://github.com/detiber/openshift-ansible" from the "gceFixes"
>>> branch )
>>>
>>
>> I *believe* all of the fixes from gceFixes have been merged into master
>> at this point.
>>
>>>
>>> Thanks a lot for the help,
>>>
>>> Florian
>>>
>>> P.S. The weirdest case wrt certificates is when trying to check the
>>> "etcd" cluster:
>>>
>>>
>>> [root@az1master01 ~]# etcdctl --debug -C
>>> https://az1master01.mydomain.novalocal:2379,
>>> https://az3master02.mydomain.novalocal:2379,
>>> https://az3master03.mydomain.novalocal:2379 --ca-file
>>> /etc/origin/master/ca.crt --cert-file
>>> /etc/origin/master/master.etcd-client.crt --key-file
>>> /etc/origin/master/master.etcd-client.key cluster-health
>>> Cluster-Endpoints: https://az3master02.mydomain.novalocal:2379,
>>> https://az1master01.mydomain.novalocal:2379,
>>> https://az3master03.mydomain.novalocal:2379
>>> cURL Command: curl -X GET
>>> https://az3master02.mydomain.novalocal:2379/v2/members
>>> cURL Command: curl -X GET
>>> https://az1master01.mydomain.novalocal:2379/v2/members
>>> cURL Command: curl -X GET
>>> https://az3master03.mydomain.novalocal:2379/v2/members
>>> cluster may be unhealthy: failed to list members
>>> Error: client: etcd cluster is unavailable or misconfigured
>>> error #0: x509: certificate signed by unknown authority
>>> error #1: x509: certificate signed by unknown authority
>>> error #2: x509: certificate signed by unknown authority
>>>
>>
>> You need to use the etcd ca cert here: etcdctl --debug -C
>> https://az1master01.mydomain.novalocal:2379,
>> https://az3master02.mydomain.novalocal:2379,
>> https://az3master03.mydomain.novalocal:2379 --ca-file
>> /etc/origin/master/master.etcd-ca.crt --cert-file
>> /etc/origin/master/master.etcd-client.crt --key-file
>> /etc/origin/master/master.etcd-client.key cluster-health
>>
>>>
>>> Attempting doing a direct curl to the "etcd"
>>>
>>> [root@az1master01 ~]# curl -v --cacert /etc/origin/master/ca.crt
>>> --cert /etc/origin/master/master.etcd-client.crt --key
>>> /etc/origin/master/master.etcd-client.key
>>> https://az1master01.mydomain.novalocal:2379/v2/members
>>> * About to connect() to az1master01.mydomain.novalocal port 2379 (#0)
>>> * Trying 10.0.0.22...
>>> * Connected to az1master01.mydomain.novalocal (10.0.0.22) port 2379 (#0)
>>> * Initializing NSS with certpath: sql:/etc/pki/nssdb
>>> * CAfile: /etc/origin/master/ca.crt
>>> CApath: none
>>> * Server certificate:
>>> * subject: CN=az1master01.mydomain.novalocal
>>> * start date: Feb 24 19:38:07 2016 GMT
>>> * expire date: Feb 23 19:38:07 2017 GMT
>>> * common name: az1master01.mydomain.novalocal
>>> * issuer: CN=etcd-signer@1456342665
>>> * NSS error -8179 (SEC_ERROR_UNKNOWN_ISSUER)
>>> * Peer's Certificate issuer is not recognized.
>>> * Closing connection 0
>>> curl: (60) Peer's Certificate issuer is not recognized.
>>> More details here: http://curl.haxx.se/docs/sslcerts.html
>>>
>>> curl performs SSL certificate verification by default, using a "bundle"
>>> of Certificate Authority (CA) public keys (CA certs). If the default
>>> bundle file isn't adequate, you can specify an alternate file
>>> using the --cacert option.
>>> If this HTTPS server uses a certificate signed by a CA represented in
>>> the bundle, the certificate verification probably failed due to a
>>> problem with the certificate (it might be expired, or the name might
>>> not match the domain name in the URL).
>>> If you'd like to turn off curl's verification of the certificate, use
>>> the -k (or --insecure) option.
>>> [root@az1master01 ~]#
>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>>
>> --
>> Jason DeTiberus
>>
>

--
Jason DeTiberus
