> On 06 Sep 2016, at 18:00, Clayton Coleman <[email protected]> wrote:
>
> What auth mechanism backs your "admin" user?
.htpasswd

Thanks for the follow-up,

Candide

> On Sep 6, 2016, at 10:19 AM, Candide Kemmler <[email protected]> wrote:
>
>> Yes, that seems to be OK, although I'm not sure I know exactly what the
>> "root cluster cert" is, so I checked all the following:
>>
>> [root@paas master]# openssl x509 -enddate -noout -in cloudapps.router.pem
>> notAfter=Apr 21 16:38:31 2018 GMT
>> [root@paas master]# openssl x509 -enddate -noout -in ca.crt
>> notAfter=Apr 20 16:31:56 2021 GMT
>> [root@paas master]# openssl x509 -enddate -noout -in master.server.crt
>> notAfter=Apr 21 16:32:00 2018 GMT
>> [root@paas master]# openssl x509 -enddate -noout -in etcd.server.crt
>> notAfter=Apr 21 16:32:01 2018 GMT
>> [root@paas master]# openssl x509 -enddate -noout -in admin.crt
>> notAfter=Apr 21 16:31:58 2018 GMT
>> [root@paas master]# openssl x509 -enddate -noout -in ca-bundle.crt
>> notAfter=Apr 20 16:31:56 2021 GMT
>> [root@paas master]# openssl x509 -enddate -noout -in openshift-master.crt
>> notAfter=Apr 21 16:31:57 2018 GMT
>> [root@paas master]# openssl x509 -enddate -noout -in openshift-registry.crt
>> notAfter=Apr 21 16:32:00 2018 GMT
>>
>>> On 06 Sep 2016, at 15:04, Clayton Coleman <[email protected]> wrote:
>>>
>>> Were you able to check the expiration date on your admin root cluster cert
>>> and verify it has not expired?
>>>
>>> On Sep 6, 2016, at 5:19 AM, Candide Kemmler <[email protected]> wrote:
>>>
>>>> Hi Clayton,
>>>>
>>>> Thanks! Here's the result of running `sudo oadm diagnostics`. I'm
>>>> particularly bothered by the "the server has asked for the client to
>>>> provide credentials" message, as I'm seeing it when I try to execute
>>>> the Ansible scripts as well. Do you know how to solve it?
>>>>
>>>> Any other ideas on things I should focus on?
>>>>
>>>> Regards,
>>>>
>>>> Candide
>>>>
>>>>
>>>> [Note] Determining if client configuration exists for client/cluster diagnostics
>>>> Info:  Successfully read a client config file at '/root/.kube/config'
>>>> [Note] Could not configure a client, so client diagnostics are limited to
>>>>        testing configuration and connection
>>>> Info:  Using context for cluster-admin access:
>>>>        'default/paas-intrinsic-world:8443/system:admin'
>>>> [Note] Performing systemd discovery
>>>>
>>>> [Note] Running diagnostic: ConfigContexts[logging/paas-intrinsic-world:8443/admin]
>>>>        Description: Validate client config context is complete and has connectivity
>>>>
>>>> ERROR: [DCli0014 from diagnostic ConfigContexts@openshift/origin/pkg/diagnostics/client/config_contexts.go:285]
>>>>        For client config context 'logging/paas-intrinsic-world:8443/admin':
>>>>        The server URL is 'https://paas.intrinsic.world:8443'
>>>>        The user authentication is 'admin/paas-intrinsic-world:8443'
>>>>        The current project is 'logging'
>>>>        (*errors.StatusError) the server has asked for the client to provide credentials
>>>>
>>>>        This means that when we tried to make a request to the master API
>>>>        server, the request required credentials that were not presented. This
>>>>        can happen with an expired or invalid authentication token. Try logging
>>>>        in with this user again.
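The diagnostic itself suggests the likely fix: the cached token for 'admin' has expired or been invalidated, and logging in again mints a new one. A minimal check, reusing the server URL from the output above (the htpasswd provider will prompt for the password):

    oc login https://paas.intrinsic.world:8443 -u admin
    oc whoami -t    # prints the bearer token now cached in ~/.kube/config

If `oc login` succeeds but the diagnostic error persists, the problem is elsewhere than the token.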
>>>> [Note] Running diagnostic: ConfigContexts[logging/paas-intrinsic-world:8443/system:admin]
>>>>        Description: Validate client config context is complete and has connectivity
>>>>
>>>> Info:  For client config context 'logging/paas-intrinsic-world:8443/system:admin':
>>>>        The server URL is 'https://paas.intrinsic.world:8443'
>>>>        The user authentication is 'system:admin/paas-intrinsic-world:8443'
>>>>        The current project is 'logging'
>>>>        Successfully requested project list; has access to project(s):
>>>>          [openshift-infra dev ieml-demo logging management-infra misc openshift p2p default ieml-dev ...]
>>>>
>>>> [Note] Running diagnostic: ClusterRegistry
>>>>        Description: Check that there is a working Docker registry
>>>>
>>>> WARN:  [DClu1009 from diagnostic ClusterRegistry@openshift/origin/pkg/diagnostics/cluster/registry.go:217]
>>>>        The "docker-registry-1-8w93s" pod for the "docker-registry" service is not running.
>>>>        This may be transient, a scheduling error, or something else.
>>>>
>>>> ERROR: [DClu1001 from diagnostic ClusterRegistry@openshift/origin/pkg/diagnostics/cluster/registry.go:173]
>>>>        The "docker-registry" service exists but no pods currently running, so it
>>>>        is not available. Builds and deployments that use the registry will fail.
>>>>
>>>> [Note] Running diagnostic: ClusterRoleBindings
>>>>        Description: Check that the default ClusterRoleBindings are present
>>>>        and contain the expected subjects
>>>>
>>>> Info:  clusterrolebinding/cluster-admins has more subjects than expected.
>>>>        Use the `oadm policy reconcile-cluster-role-bindings` command to
>>>>        update the role binding to remove extra subjects.
>>>> Info:  clusterrolebinding/cluster-admins has extra subject {User admin }.
>>>>
>>>> Info:  clusterrolebinding/cluster-readers has more subjects than expected.
>>>>        Use the `oadm policy reconcile-cluster-role-bindings` command to
>>>>        update the role binding to remove extra subjects.
>>>> Info:  clusterrolebinding/cluster-readers has extra subject {ServiceAccount management-infra management-admin }.
>>>> Info:  clusterrolebinding/cluster-readers has extra subject {ServiceAccount logging aggregated-logging-fluentd }.
>>>>
>>>> [Note] Running diagnostic: ClusterRoles
>>>>        Description: Check that the default ClusterRoles are present and
>>>>        contain the expected permissions
>>>>
>>>> [Note] Running diagnostic: ClusterRouterName
>>>>        Description: Check there is a working router
>>>>
>>>> ERROR: [DClu2007 from diagnostic ClusterRouter@openshift/origin/pkg/diagnostics/cluster/router.go:156]
>>>>        The "router" DeploymentConfig exists but has no running pods, so it
>>>>        is not available. Apps will not be externally accessible via the router.
>>>>
>>>> [Note] Running diagnostic: MasterNode
>>>>        Description: Check if master is also running node (for Open vSwitch)
>>>>
>>>> Info:  Found a node with same IP as master: paas.intrinsic.world
>>>>
>>>> [Note] Running diagnostic: NodeDefinitions
>>>>        Description: Check node records on master
>>>>
>>>> WARN:  [DClu0003 from diagnostic NodeDefinition@openshift/origin/pkg/diagnostics/cluster/node_definitions.go:112]
>>>>        Node paas.intrinsic.world is ready but is marked Unschedulable.
>>>>        This is usually set manually for administrative reasons.
>>>>        An administrator can mark the node schedulable with:
>>>>          oadm manage-node paas.intrinsic.world --schedulable=true
>>>>
>>>>        While in this state, pods should not be scheduled to deploy on the node.
>>>>        Existing pods will continue to run until completed or evacuated (see
>>>>        other options for 'oadm manage-node').
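That Unschedulable warning is worth dwelling on: if the registry and router pods are meant to land on this node, its unschedulable state would explain both the "no pods currently running" errors above and pods stuck in Pending generally. A quick check, using the node name exactly as it appears in the diagnostic:

    oadm manage-node paas.intrinsic.world --schedulable=true
    oc get nodes    # the node should no longer show SchedulingDisabled

This is the same command the diagnostic suggests; the `oc get nodes` call just confirms the change took effect.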
>>>> [Note] Running diagnostic: AnalyzeLogs
>>>>        Description: Check for recent problems in systemd service logs
>>>>
>>>> Info:  Checking journalctl logs for 'origin-master' service
>>>> Info:  Checking journalctl logs for 'origin-node' service
>>>> Info:  Checking journalctl logs for 'docker' service
>>>>
>>>> [Note] Running diagnostic: MasterConfigCheck
>>>>        Description: Check the master config file
>>>>
>>>> Info:  Found a master config file: /etc/origin/master/master-config.yaml
>>>>
>>>> WARN:  [DH0005 from diagnostic MasterConfigCheck@openshift/origin/pkg/diagnostics/host/check_master_config.go:58]
>>>>        Validation of master config file '/etc/origin/master/master-config.yaml' warned:
>>>>        assetConfig.loggingPublicURL: Invalid value: "": required to view
>>>>        aggregated container logs in the console
>>>>        assetConfig.metricsPublicURL: Invalid value: "": required to view
>>>>        cluster metrics in the console
>>>>
>>>> [Note] Running diagnostic: NodeConfigCheck
>>>>        Description: Check the node config file
>>>>
>>>> Info:  Found a node config file: /etc/origin/node/node-config.yaml
>>>>
>>>> [Note] Running diagnostic: UnitStatus
>>>>        Description: Check status for related systemd units
>>>>
>>>> [Note] Summary of diagnostics execution (version v1.1.6):
>>>> [Note] Warnings seen: 3
>>>> [Note] Errors seen: 4
>>>>
>>>>> On 05 Sep 2016, at 18:46, Clayton Coleman <[email protected]> wrote:
>>>>>
>>>>> Did you change the IP of your master, or otherwise delete / alter the
>>>>> openshift-infra namespace? Or have your client certificates expired
>>>>> (is this cluster 1 year old)?
>>>>>
>>>>> Before deleting, try two things:
>>>>>
>>>>>   oadm diagnostics
>>>>>
>>>>> from the master (to see if it identifies anything).
>>>>>
>>>>> Also check your certificate expiration.
>>>>>
>>>>>> On Sep 5, 2016, at 5:00 AM, Candide Kemmler <[email protected]> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have a development server setup made up of two nodes (1 master, 1
>>>>>> slave) running a bunch of different projects and environments, which just
>>>>>> crashed badly on me.
>>>>>>
>>>>>> Symptoms: all containers in all projects are in a Pending state
>>>>>> (orange circle). When I try to `delete all`, things get removed, but
>>>>>> pods hang in a 'Terminating' state. `oc describe` gives me uninteresting
>>>>>> information that I already know (basically that pods are Pending), and
>>>>>> `oc logs` tells me that it "could not find the requested resource".
>>>>>>
>>>>>> I tried `sudo systemctl restart origin-master`, as it seems to have
>>>>>> produced good results in the past, but that didn't help this time. I
>>>>>> also tried that in combination with a full system reboot.
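Since AnalyzeLogs above checked the origin-master, origin-node, and docker units without flagging anything, it can still be worth eyeballing recent entries directly after a restart like this. A quick look at the same units the diagnostic covered:

    journalctl -u origin-master --since "1 hour ago" | tail -n 50
    journalctl -u origin-node --since "1 hour ago" | tail -n 50

An authentication or certificate problem on the master tends to show up here as repeated TLS handshake or 'x509' errors.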
>>>>>> Finally, I tried running the Ansible scripts in hopes of updating Origin
>>>>>> to the latest version (it's still running 1.1.6), but I got the following
>>>>>> error log:
>>>>>>
>>>>>> failed: [paas.intrinsic.world] => {"changed": false, "cmd": ["oc",
>>>>>> "create", "-n", "openshift", "-f",
>>>>>> "/usr/share/openshift/examples/image-streams/image-streams-centos7.json"],
>>>>>> "delta": "0:00:00.180874", "end": "2016-09-05 07:20:12.050123",
>>>>>> "failed": true, "failed_when_result": true, "rc": 1, "start":
>>>>>> "2016-09-05 07:20:11.869249", "stdout_lines": [], "warnings": []}
>>>>>> stderr: unable to connect to a server to handle "imagestreamlists": the
>>>>>> server has asked for the client to provide credentials
>>>>>>
>>>>>> FATAL: all hosts have already failed -- aborting
>>>>>>
>>>>>> PLAY RECAP ********************************************************************
>>>>>>            to retry, use: --limit @/Users/candide/config.retry
>>>>>>
>>>>>> apps.intrinsic.world : ok=48   changed=0   unreachable=0   failed=0
>>>>>> localhost            : ok=15   changed=0   unreachable=0   failed=0
>>>>>> paas.intrinsic.world : ok=207  changed=0   unreachable=0   failed=1
>>>>>>
>>>>>> My last option is to reinstall everything from scratch, but before I do
>>>>>> that I wanted to know if you had other ideas on how to get on top of
>>>>>> things again.
>>>>>>
>>>>>> Candide
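One check worth making before a reinstall: the diagnostics above show that the system:admin context works (the project list succeeded) while the htpasswd-backed 'admin' context fails, which points at a stale login token rather than broken cluster certificates. A way to confirm, bypassing tokens entirely with the cluster-admin kubeconfig (the path below is the default location on an Origin master; adjust if yours differs):

    oc --config=/etc/origin/master/admin.kubeconfig get nodes

If that works, the API server itself is healthy, and refreshing the 'admin' login with `oc login` should clear both the diagnostics error and the Ansible failure.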
_______________________________________________
users mailing list
[email protected]
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
