What auth mechanism backs your "admin" user?
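One way to answer that from the master itself is to look at the identityProviders stanza of the master config and then simply log in again, since a token issued to 'admin' can expire independently of anything else. A minimal sketch, assuming the master-config.yaml path and API URL that appear later in this thread:

    # Show which identity provider(s) back regular users such as 'admin'
    grep -A 10 identityProviders /etc/origin/master/master-config.yaml

    # Log in again to replace a possibly expired token, then print the new one
    oc login https://paas.intrinsic.world:8443 -u admin
    oc whoami -t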
On Sep 6, 2016, at 10:19 AM, Candide Kemmler <[email protected]> wrote:

Yes, that seems to be OK..., although I'm not sure I know exactly what the "root cluster cert" is, so I checked all of the following:

[root@paas master]# openssl x509 -enddate -noout -in cloudapps.router.pem
notAfter=Apr 21 16:38:31 2018 GMT
[root@paas master]# openssl x509 -enddate -noout -in ca.crt
notAfter=Apr 20 16:31:56 2021 GMT
[root@paas master]# openssl x509 -enddate -noout -in master.server.crt
notAfter=Apr 21 16:32:00 2018 GMT
[root@paas master]# openssl x509 -enddate -noout -in etcd.server.crt
notAfter=Apr 21 16:32:01 2018 GMT
[root@paas master]# openssl x509 -enddate -noout -in admin.crt
notAfter=Apr 21 16:31:58 2018 GMT
[root@paas master]# openssl x509 -enddate -noout -in ca-bundle.crt
notAfter=Apr 20 16:31:56 2021 GMT
[root@paas master]# openssl x509 -enddate -noout -in openshift-master.crt
notAfter=Apr 21 16:31:57 2018 GMT
[root@paas master]# openssl x509 -enddate -noout -in openshift-registry.crt
notAfter=Apr 21 16:32:00 2018 GMT
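None of the certificates listed above had expired as of the September 2016 date on this thread, so the issue is more likely the token or the client certificate embedded inside a kubeconfig, which can expire independently of the .crt files on disk. A rough way to sweep everything in one pass, assuming the default /etc/origin/master layout already referenced in the thread:

    # Check every certificate under the master config directory in one pass
    for f in /etc/origin/master/*.crt; do
      printf '%s: ' "$f"
      openssl x509 -enddate -noout -in "$f"
    done

    # The client cert inside admin.kubeconfig is stored base64-encoded;
    # pull it out and run it through the same openssl check
    grep client-certificate-data /etc/origin/master/admin.kubeconfig \
      | head -n 1 | awk '{print $2}' | base64 -d | openssl x509 -enddate -noout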
On 06 Sep 2016, at 15:04, Clayton Coleman <[email protected]> wrote:

Were you able to check the expiration date on your admin root cluster cert and verify it has not expired?

On Sep 6, 2016, at 5:19 AM, Candide Kemmler <[email protected]> wrote:

Hi Clayton,

Thanks! Here's the result of running `sudo oadm diagnostics`. I'm particularly bothered by the "the server has asked for the client to provide credentials" message, as I'm seeing this one when I try to execute the ansible scripts as well. Do you know how to solve it? Any other ideas on things I should focus on?

Regards,

Candide

[Note] Determining if client configuration exists for client/cluster diagnostics
Info:  Successfully read a client config file at '/root/.kube/config'
[Note] Could not configure a client, so client diagnostics are limited to testing configuration and connection
Info:  Using context for cluster-admin access: 'default/paas-intrinsic-world:8443/system:admin'
[Note] Performing systemd discovery

[Note] Running diagnostic: ConfigContexts[logging/paas-intrinsic-world:8443/admin]
       Description: Validate client config context is complete and has connectivity

ERROR: [DCli0014 from diagnostic ConfigContexts@openshift/origin/pkg/diagnostics/client/config_contexts.go:285]
       For client config context 'logging/paas-intrinsic-world:8443/admin':
       The server URL is 'https://paas.intrinsic.world:8443'
       The user authentication is 'admin/paas-intrinsic-world:8443'
       The current project is 'logging'
       (*errors.StatusError) the server has asked for the client to provide credentials
       This means that when we tried to make a request to the master API server, the request required
       credentials that were not presented. This can happen with an expired or invalid authentication
       token. Try logging in with this user again.

[Note] Running diagnostic: ConfigContexts[logging/paas-intrinsic-world:8443/system:admin]
       Description: Validate client config context is complete and has connectivity

Info:  For client config context 'logging/paas-intrinsic-world:8443/system:admin':
       The server URL is 'https://paas.intrinsic.world:8443'
       The user authentication is 'system:admin/paas-intrinsic-world:8443'
       The current project is 'logging'
       Successfully requested project list; has access to project(s):
         [openshift-infra dev ieml-demo logging management-infra misc openshift p2p default ieml-dev ...]

[Note] Running diagnostic: ClusterRegistry
       Description: Check that there is a working Docker registry

WARN:  [DClu1009 from diagnostic ClusterRegistry@openshift/origin/pkg/diagnostics/cluster/registry.go:217]
       The "docker-registry-1-8w93s" pod for the "docker-registry" service is not running.
       This may be transient, a scheduling error, or something else.

ERROR: [DClu1001 from diagnostic ClusterRegistry@openshift/origin/pkg/diagnostics/cluster/registry.go:173]
       The "docker-registry" service exists but no pods currently running, so it
       is not available. Builds and deployments that use the registry will fail.

[Note] Running diagnostic: ClusterRoleBindings
       Description: Check that the default ClusterRoleBindings are present and contain the expected subjects

Info:  clusterrolebinding/cluster-admins has more subjects than expected.
       Use the `oadm policy reconcile-cluster-role-bindings` command to update the role binding to remove extra subjects.
Info:  clusterrolebinding/cluster-admins has extra subject {User admin }.
Info:  clusterrolebinding/cluster-readers has more subjects than expected.
       Use the `oadm policy reconcile-cluster-role-bindings` command to update the role binding to remove extra subjects.
Info:  clusterrolebinding/cluster-readers has extra subject {ServiceAccount management-infra management-admin }.
Info:  clusterrolebinding/cluster-readers has extra subject {ServiceAccount logging aggregated-logging-fluentd }.

[Note] Running diagnostic: ClusterRoles
       Description: Check that the default ClusterRoles are present and contain the expected permissions

[Note] Running diagnostic: ClusterRouterName
       Description: Check there is a working router

ERROR: [DClu2007 from diagnostic ClusterRouter@openshift/origin/pkg/diagnostics/cluster/router.go:156]
       The "router" DeploymentConfig exists but has no running pods, so it
       is not available. Apps will not be externally accessible via the router.

[Note] Running diagnostic: MasterNode
       Description: Check if master is also running node (for Open vSwitch)

Info:  Found a node with same IP as master: paas.intrinsic.world

[Note] Running diagnostic: NodeDefinitions
       Description: Check node records on master

WARN:  [DClu0003 from diagnostic NodeDefinition@openshift/origin/pkg/diagnostics/cluster/node_definitions.go:112]
       Node paas.intrinsic.world is ready but is marked Unschedulable.
       This is usually set manually for administrative reasons.
       An administrator can mark the node schedulable with:
         oadm manage-node paas.intrinsic.world --schedulable=true
       While in this state, pods should not be scheduled to deploy on the node.
       Existing pods will continue to run until completed or evacuated (see other options for 'oadm manage-node').

[Note] Running diagnostic: AnalyzeLogs
       Description: Check for recent problems in systemd service logs

Info:  Checking journalctl logs for 'origin-master' service
Info:  Checking journalctl logs for 'origin-node' service
Info:  Checking journalctl logs for 'docker' service

[Note] Running diagnostic: MasterConfigCheck
       Description: Check the master config file

Info:  Found a master config file: /etc/origin/master/master-config.yaml

WARN:  [DH0005 from diagnostic MasterConfigCheck@openshift/origin/pkg/diagnostics/host/check_master_config.go:58]
       Validation of master config file '/etc/origin/master/master-config.yaml' warned:
       assetConfig.loggingPublicURL: Invalid value: "": required to view aggregated container logs in the console
       assetConfig.metricsPublicURL: Invalid value: "": required to view cluster metrics in the console

[Note] Running diagnostic: NodeConfigCheck
       Description: Check the node config file

Info:  Found a node config file: /etc/origin/node/node-config.yaml

[Note] Running diagnostic: UnitStatus
       Description: Check status for related systemd units

[Note] Summary of diagnostics execution (version v1.1.6):
[Note] Warnings seen: 3
[Note] Errors seen: 4
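The registry and router errors and the Unschedulable warning are consistent with each other: if the only schedulable node is the one that was cordoned, nothing can be placed anywhere, which would also explain every pod sitting in Pending. A possible recovery sequence, sketched under the assumption that the registry and router live in the default project (the usual layout for an ansible-installed cluster):

    # Let the combined master/node accept pods again
    oadm manage-node paas.intrinsic.world --schedulable=true

    # Kick off fresh deployments of the registry and router
    oc deploy docker-registry --latest -n default
    oc deploy router --latest -n default

    # Watch the pods come back up
    oc get pods -n default -w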
On 05 Sep 2016, at 18:46, Clayton Coleman <[email protected]> wrote:

Did you change the IP of your master, or otherwise delete / alter the openshift-infra namespace? Or have your client certificates expired (is this cluster 1 year old)?

Before deleting, try two things:

  oadm diagnostics

from the master (to see if it identifies anything). Also check your certificate expiration.

On Sep 5, 2016, at 5:00 AM, Candide Kemmler <[email protected]> wrote:

Hi,

I have a development server setup made up of two nodes (1 master - 1 slave) running a bunch of different projects and environments, which just crashed badly on me.

Symptoms are: all containers in all projects are in Pending state (orange circle); when I try to `delete all`, things get removed but pods hang in a 'terminating' state. oc describe gives me uninteresting information that I already know (basically that pods are Pending) and oc logs tells me that it "could not find the requested resource".

I tried to `sudo systemctl restart origin-master` as it seems to have produced good results in the past, but that didn't help this time. I also tried that in combination with a full system reboot.

Finally, I tried running the ansible scripts in hopes of updating origin to the latest version (it's still running 1.1.6), but I got the following error log:

failed: [paas.intrinsic.world] => {"changed": false, "cmd": ["oc", "create", "-n", "openshift", "-f", "/usr/share/openshift/examples/image-streams/image-streams-centos7.json"], "delta": "0:00:00.180874", "end": "2016-09-05 07:20:12.050123", "failed": true, "failed_when_result": true, "rc": 1, "start": "2016-09-05 07:20:11.869249", "stdout_lines": [], "warnings": []}
stderr: unable to connect to a server to handle "imagestreamlists": the server has asked for the client to provide credentials

FATAL: all hosts have already failed -- aborting

PLAY RECAP ********************************************************************
           to retry, use: --limit @/Users/candide/config.retry

apps.intrinsic.world       : ok=48   changed=0   unreachable=0   failed=0
localhost                  : ok=15   changed=0   unreachable=0   failed=0
paas.intrinsic.world       : ok=207  changed=0   unreachable=0   failed=1

My last option is to reinstall everything from scratch, but before I do this I wanted to know if you guys had other ideas on how to get on top of things again.
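The ansible failure is the same "the server has asked for the client to provide credentials" error that the diagnostics reported for the 'admin' context, which suggests the playbook is running oc against a kubeconfig whose token is no longer accepted. One way to confirm, sketched on the assumption that the installer-generated admin.kubeconfig is still in its default location, is to authenticate with the system:admin client certificate (the context that did work in the diagnostics) and rerun the failing step by hand:

    # Authenticate with the system:admin client certificate instead of the stale token
    export KUBECONFIG=/etc/origin/master/admin.kubeconfig
    oc whoami    # should print system:admin

    # Re-run the command that failed inside the playbook
    oc create -n openshift \
      -f /usr/share/openshift/examples/image-streams/image-streams-centos7.json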
Candide
_______________________________________________
users mailing list
[email protected]
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
