Hello all,

I am having problems with a multimaster OSE setup consisting of the following:
- A LB with "native" HA
- Three masters (doubling as "etcd" nodes)
- Two nodes


All the hosts are themselves OpenStack instances (hence the ".novalocal"
suffix). DNS is via an "/etc/hosts" propagated across, with the "lb" host
doubling as DNS forwarder (via dnsmasq). All Internet access is via an http
/ https proxy.

After many attempts we finally got a setup that is somewhat working (see
P.S. for why "somewhat"). Attached is the "/etc/ansible/hosts" file.
Installation is from the main "openshift-ansible" repo
(https://github.com/openshift/openshift-ansible).

My problem:

After installation, on one master I created two users in
"/etc/origin/htpasswd" and then propagated the file to all the other
masters. The UNIX permissions of the file on all masters are "0600".
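
To make the above concrete, here is a minimal sketch of the create-and-propagate
steps. It assumes the "htpasswd" utility (from httpd-tools) and root SSH access
between the masters. The master host names are the ones used elsewhere in this
mail; the helper names ("run", "add_user", "propagate_htpasswd"), the "DRY_RUN"
switch and the password are purely illustrative.

```shell
#!/bin/sh
# Sketch only: helper names, DRY_RUN and the password are illustrative.
HTPASSWD_FILE=/etc/origin/htpasswd
# The other two masters (host names as used in this mail).
OTHER_MASTERS="az3master02.mydomain.novalocal az3master03.mydomain.novalocal"

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Add (or update) one user in the htpasswd file; -b takes the password
# from the command line.
add_user() {
    run htpasswd -b "$HTPASSWD_FILE" "$1" "$2"
}

# Copy the file to the other masters and restrict it to 0600 there.
propagate_htpasswd() {
    for host in $OTHER_MASTERS; do
        run scp -p "$HTPASSWD_FILE" "root@${host}:${HTPASSWD_FILE}"
        run ssh "root@${host}" chmod 0600 "$HTPASSWD_FILE"
    done
}

# Example (on az1master01):
#   DRY_RUN=1; add_user reguser 'changeme'; propagate_htpasswd
```

Setting DRY_RUN=1 first only prints the commands, which makes it easy to check
the host list before anything is copied.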

However, doing an "oc login" returns a "401 Unauthorized", and I cannot
find what the issue is, or how to debug it (no trace for it in the
"atomic-openshift-master-api" or "atomic-openshift-master-controllers"
logs).


[root@az1node01 ~]# oc login
Authentication required for https://az1lb01.mydomain.novalocal:8443
(openshift)
Username: reguser
Password:
Login failed (401 Unauthorized)
Unauthorized


The puzzling thing is that using the "system:node" certificate and key
works (in the sense that I am identified as "system:anonymous"):


curl -v --cacert /etc/origin/node/ca.crt \
  --cert "/etc/origin/node/system:node:az1node01.mydomain.novalocal.crt" \
  --key "/etc/origin/node/system:node:az1node01.mydomain.novalocal.key" \
  https://az1lb01.mydomain.novalocal:8443/api/v1/namespaces
* About to connect() to az1lb01.mydomain.novalocal port 8443 (#0)
*   Trying 10.0.0.31...
* Connected to az1lb01.mydomain.novalocal (10.0.0.31) port 8443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/origin/node/ca.crt
  CApath: none
* NSS: client certificate not found: /etc/origin/node/system
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
*       subject: CN=10.0.0.24
*       start date: Feb 24 19:40:56 2016 GMT
*       expire date: Feb 23 19:40:57 2018 GMT
*       common name: 10.0.0.24
*       issuer: CN=openshift-signer@1456342841
> GET /api/v1/namespaces HTTP/1.1
> User-Agent: curl/7.29.0
> Host: az1lb01.mydomain.novalocal:8443
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Cache-Control: no-store
< Content-Type: application/json
< Date: Thu, 25 Feb 2016 14:42:41 GMT
< Content-Length: 255
<
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "User \"system:anonymous\" cannot list all namespaces in the cluster",
  "reason": "Forbidden",
  "details": {
    "kind": "namespaces"
  },
  "code": 403
}
* Connection #0 to host az1lb01.mydomain.novalocal left intact

Attached is also the master configuration file for one master.


My questions:

- I had many issues in getting the installation working, mostly due to the
Ansible installer reading the OpenStack instance metadata, and
inconsistencies between that and the "hostname".

  Is there any particular repo / branch of the installer that is known to
work in this particular setup? Are there any particular settings I should
use in the Ansible hosts file?

  I suspect the certificate issues I'm encountering are because of that (in
combination with the proxy), but I'm not sure.

- Operating behind an HTTP / HTTPS proxy: Even before starting the Ansible
installer, Docker was (properly) configured with the HTTP / HTTPS proxy
settings, and working correctly. However, for the installer itself I found
no way to express the "HTTP_PROXY", "HTTPS_PROXY" and, particularly, the
"NO_PROXY" settings. For that I'm relying on exported environment
variables in the shell. Is there a "proper" way to do this via the
installer itself?

  After the installer ran, I manually added those settings to
"/etc/sysconfig/atomic-openshift-master",
"/etc/sysconfig/atomic-openshift-master-controllers",
"/etc/sysconfig/atomic-openshift-master-api" and, for the nodes,
"/etc/sysconfig/atomic-openshift-node", but I don't know how to do
this via the installer itself.


- Is there an issue with the masters doubling as "etcd" nodes?
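
To illustrate the proxy point above: the lines added by hand to the
"/etc/sysconfig/atomic-openshift-*" files are of the kind sketched below. The
proxy host and port are placeholders, not the real ones, and the exact
"NO_PROXY" list is an assumption about what has to bypass the proxy in this
setup.

```shell
# Sketch of the hand-added proxy lines; sysconfig files use plain
# KEY=value syntax. proxy.example.com:3128 is a placeholder.
HTTP_PROXY=http://proxy.example.com:3128
HTTPS_PROXY=http://proxy.example.com:3128
# Assumed bypass list: the cluster's own domain, the LB, and localhost.
NO_PROXY=.mydomain.novalocal,az1lb01.mydomain.novalocal,localhost,127.0.0.1
```

The same three variables are what I currently export in the shell before
running the installer.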


The most frustrating part is that I have this very setup working perfectly
fine in a public cloud environment (namely on GCE), but with the (three)
"etcd" hosts distinct from the masters (i.e. a total of 9 hosts instead of
6), and with unproxied Internet access. However, that installation is
from a different repo and branch (namely from
https://github.com/detiber/openshift-ansible, the "gceFixes" branch).


Thanks a lot for the help,

Florian

P.S. The weirdest case with regard to certificates is when trying to check
the "etcd" cluster:


[root@az1master01 ~]# etcdctl --debug \
  -C https://az1master01.mydomain.novalocal:2379,https://az3master02.mydomain.novalocal:2379,https://az3master03.mydomain.novalocal:2379 \
  --ca-file /etc/origin/master/ca.crt \
  --cert-file /etc/origin/master/master.etcd-client.crt \
  --key-file /etc/origin/master/master.etcd-client.key \
  cluster-health
Cluster-Endpoints: https://az3master02.mydomain.novalocal:2379,
https://az1master01.mydomain.novalocal:2379,
https://az3master03.mydomain.novalocal:2379
cURL Command: curl -X GET
https://az3master02.mydomain.novalocal:2379/v2/members
cURL Command: curl -X GET
https://az1master01.mydomain.novalocal:2379/v2/members
cURL Command: curl -X GET
https://az3master03.mydomain.novalocal:2379/v2/members
cluster may be unhealthy: failed to list members
Error:  client: etcd cluster is unavailable or misconfigured
error #0: x509: certificate signed by unknown authority
error #1: x509: certificate signed by unknown authority
error #2: x509: certificate signed by unknown authority



Attempting a direct curl to "etcd":

[root@az1master01 ~]# curl -v --cacert /etc/origin/master/ca.crt \
  --cert /etc/origin/master/master.etcd-client.crt \
  --key /etc/origin/master/master.etcd-client.key \
  https://az1master01.mydomain.novalocal:2379/v2/members
* About to connect() to az1master01.mydomain.novalocal port 2379 (#0)
*   Trying 10.0.0.22...
* Connected to az1master01.mydomain.novalocal (10.0.0.22) port 2379 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/origin/master/ca.crt
  CApath: none
* Server certificate:
* subject: CN=az1master01.mydomain.novalocal
* start date: Feb 24 19:38:07 2016 GMT
* expire date: Feb 23 19:38:07 2017 GMT
* common name: az1master01.mydomain.novalocal
* issuer: CN=etcd-signer@1456342665
* NSS error -8179 (SEC_ERROR_UNKNOWN_ISSUER)
* Peer's Certificate issuer is not recognized.
* Closing connection 0
curl: (60) Peer's Certificate issuer is not recognized.
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.
[root@az1master01 ~]#

Attachment: ansible-hosts-mydomain
Description: Binary data

Attachment: az1master01--master-config.yaml
Description: Binary data
