Hi,
I have a development cluster made up of two nodes (1 master, 1 slave) running
a number of different projects and environments, and it just crashed badly
on me.
Symptoms: all containers in all projects are stuck in the Pending state (orange
circle). When I try to `delete all`, things get removed but the pods hang in a
'Terminating' state. `oc describe` only tells me what I already know (basically
that the pods are Pending), and `oc logs` fails with "could not find the
requested resource".
I tried `sudo systemctl restart origin-master`, since that has produced good
results in the past, but it didn't help this time. I also tried it in
combination with a full system reboot.
Finally, I tried running the Ansible scripts in the hope of updating Origin to
the latest version (it's still running 1.1.6), but I got the following error:
failed: [paas.intrinsic.world] => {"changed": false, "cmd": ["oc", "create",
"-n", "openshift", "-f",
"/usr/share/openshift/examples/image-streams/image-streams-centos7.json"],
"delta": "0:00:00.180874", "end": "2016-09-05 07:20:12.050123", "failed": true,
"failed_when_result": true, "rc": 1, "start": "2016-09-05 07:20:11.869249",
"stdout_lines": [], "warnings": []}
stderr: unable to connect to a server to handle "imagestreamlists": the server
has asked for the client to provide credentials
FATAL: all hosts have already failed -- aborting
PLAY RECAP ********************************************************************
to retry, use: --limit @/Users/candide/config.retry
apps.intrinsic.world : ok=48 changed=0 unreachable=0 failed=0
localhost : ok=15 changed=0 unreachable=0 failed=0
paas.intrinsic.world : ok=207 changed=0 unreachable=0 failed=1
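To make the recap easier to read at a glance, here is a minimal Python sketch that extracts the per-host counters from the three recap lines above and lists the hosts that reported a failure. The function names `parse_recap` and `failed_hosts` are hypothetical helpers written for this post, not part of Ansible, and the regular expression assumes the plain `host : key=value ...` layout shown in this log.

```python
import re

# Recap lines copied verbatim from the log above.
RECAP = """\
apps.intrinsic.world : ok=48 changed=0 unreachable=0 failed=0
localhost : ok=15 changed=0 unreachable=0 failed=0
paas.intrinsic.world : ok=207 changed=0 unreachable=0 failed=1
"""

# Assumed layout: "<host> : key=int key=int ..." on a single line.
LINE_RE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(text):
    """Return {host: {counter_name: int}} for every line matching the layout."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip headers, retry hints, and anything else
        pairs = (pair.split("=") for pair in match.group("counters").split())
        results[match.group("host")] = {name: int(value) for name, value in pairs}
    return results


def failed_hosts(recap):
    """Return the sorted hosts with a non-zero failed or unreachable counter."""
    return sorted(
        host
        for host, counters in recap.items()
        if counters.get("failed", 0) or counters.get("unreachable", 0)
    )


if __name__ == "__main__":
    print(failed_hosts(parse_recap(RECAP)))
```

Run against the recap above, this singles out the master (paas.intrinsic.world) as the only host with a failed task, which matches the `oc create` credentials error in the log.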
My last option is to reinstall everything from scratch, but before I do that I
wanted to ask whether you have other ideas on how to get things working again.
Candide