You have two master nodes? You'd rather go with one or three. With two masters, your etcd quorum is still two, so if you lose a master, etcd loses quorum and the API becomes unavailable.
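To put numbers on it: quorum is floor(n/2) + 1. With 2 members you need both healthy (zero failures tolerated); with 3 you need 2 (one failure tolerated). If you ever want to verify member health yourself, here's a minimal sketch, assuming the etcd v3 API and that you can run etcdctl on (or rsh into) a master; the endpoint and certificate paths below are assumptions and will differ on your install:

    # quorum = floor(n/2) + 1  ->  n=2 tolerates 0 failures, n=3 tolerates 1
    etcdctl --endpoints=https://127.0.0.1:2379 \
            --cacert=/etc/etcd/ca.crt \
            --cert=/etc/etcd/peer.crt --key=/etc/etcd/peer.key \
            member list -w table
    etcdctl <same TLS flags as above> endpoint health --cluster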
Now ... a c5.xlarge should be fine. I'm not sure why your console doesn't show everything (I'm not familiar with that dashboard yet); as a wild guess, there may be some delay collecting metrics, though you should see something eventually.

If you run "oc describe node <node-name>", you should see the reservations (requests & limits) for that node, which might help you figure out what's eating up your resources. Depending on which OpenShift components you deployed, you may already be using quite a lot, especially if you don't have infra nodes and did deploy EFK and/or Hawkular/Cassandra. Prometheus can use some resources as well. Meanwhile, Istio itself can ship with more or fewer components, ...

If using EFK: you may be able to lower the resource requests/limits for Elasticsearch.
If using Hawkular: same remark regarding Cassandra.

Hard to say without seeing it, but you can probably free up some resources here and there (a rough inspection recipe is sketched in the P.S. at the end of this mail).

Good luck,
Regards.

On Wed, Oct 9, 2019 at 1:47 PM Just Marvin <marvin.the.cynical.ro...@gmail.com> wrote:

> Samuel,
>
>     So it is CPU. But I destroyed the cluster, gave it machines with
> twice as much memory, retried, and got the same problem:
>
> Events:
>   Type     Reason            Age                  From               Message
>   ----     ------            ----                 ----               -------
>   Warning  FailedScheduling  16s (x4 over 3m12s)  default-scheduler  0/4 nodes are available: 2 Insufficient cpu, 2 node(s) had taints that the pod didn't tolerate.
>
>     I'm guessing that the two nodes with taints are the two masters,
> but the other two are c5.xlarge machines. But here is maybe a relevant
> observation. As soon as I log into the cluster, I see this on my main
> dashboard.
>
> [image: image.png]
>
>     Is there perhaps a problem with the CPU resource monitoring that is
> causing my problems?
>
> Regards,
> Marvin
>
> On Sun, Oct 6, 2019 at 3:52 PM Samuel Martín Moro <faus...@gmail.com>
> wrote:
>
>> You can use "oc describe pod <pod-name>" to figure out what's going on
>> with your pod. It could be that you're out of CPU/memory.
>>
>> Regards.
>>
>> On Sun, Oct 6, 2019 at 9:27 PM Just Marvin <
>> marvin.the.cynical.ro...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> [zaphod@oc3027208274 ocp4.2-aws]$ oc get pods
>>> NAME                              READY   STATUS    RESTARTS   AGE
>>> istio-citadel-7cb44f4bb-tccql     1/1     Running   0          9m35s
>>> istio-galley-75599dbc67-b4mgx     1/1     Running   0          8m41s
>>> istio-policy-56476c984b-c7t8j     0/2     Pending   0          8m23s
>>> istio-telemetry-d5bbd7d7b-v8kjq   0/2     Pending   0          8m24s
>>> jaeger-5d9dfdfb67-mv8mp           2/2     Running   0          8m45s
>>> prometheus-685bdbdc45-hmb9f       2/2     Running   0          9m17s
>>> [zaphod@oc3027208274 ocp4.2-aws]$
>>>
>>>     The pods in Pending state don't seem to be moving forward, and the
>>> operator logs aren't showing anything informative about why this might
>>> be. Is this normal? If there is a problem, how would I figure out the
>>> cause?
>>>
>>> Regards,
>>> Marvin
>>> _______________________________________________
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>>
>>
>> --
>> Samuel Martín Moro
>> {EPITECH.} 2011
>>
>> "Nobody wants to say how this works.
>>  Maybe nobody knows ..."
>>                      Xorg.conf(5)
>>

--
Samuel Martín Moro
{EPITECH.} 2011

"Nobody wants to say how this works.
 Maybe nobody knows ..."
                     Xorg.conf(5)
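P.S.: here's the kind of thing I'd run to see where the CPU requests are going. Treat it as a sketch rather than an exact recipe: the "Allocated resources" grep matches the describe output of recent oc/kubectl clients, and the exact output layout varies by version.

    # per-node totals: allocatable CPU vs. the sum of requests already booked
    oc describe node <node-name> | grep -A 8 'Allocated resources'

    # live usage, if the metrics pipeline is up
    oc adm top nodes

    # CPU requests of every pod, cluster-wide
    oc get pods --all-namespaces \
        -o custom-columns='NS:.metadata.namespace,POD:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu'

Once you know which pods book the most CPU (Elasticsearch, Cassandra, ...), lowering their requests should let the Pending Istio pods schedule.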