You can hit the master Prometheus endpoint to see what is going on (or run
Prometheus from the release-3.6 branch in examples/prometheus):

    oc get --raw /metrics

Run as an admin, that will dump the apiserver's Prometheus metrics for that
server.
You can look at (going from memory here) go_memstats_heap_inuse_bytes to
see exactly how much memory Go has allocated, and at
apiserver_request_count to see whether large numbers of requests are being
made by any particular client (the README in the prometheus directory has
more queries for visualizing this from the Prometheus dashboard).
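For a quick heap reading without the dashboard, you can grep the dump and
convert to MiB with awk. The echo below stands in for live output (the real
pipeline would start with `oc get --raw /metrics`), since the actual value
depends on your cluster:

```shell
# Convert go_memstats_heap_inuse_bytes (reported in bytes) to MiB.
# In practice the input comes from:
#   oc get --raw /metrics | grep go_memstats_heap_inuse_bytes
echo 'go_memstats_heap_inuse_bytes 1.57286e+08' |
  awk '/^go_memstats_heap_inuse_bytes/ { printf "%.0f MiB\n", $2 / 1048576 }'
```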
Re: unusual behavior: at a reasonable log level the master should not be
generating a lot of log output.
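A rough way to gauge log volume is to count recent lines from the master
service with journalctl. The unit name here is an assumption (it varies by
install; OCP 3.x typically uses atomic-openshift-master, Origin uses
origin-master):

```shell
# Count log lines from the master service over the last hour.
# Unit name is an assumption -- check `systemctl list-units | grep master`
# on your host and substitute the real one.
journalctl -u atomic-openshift-master --since "1 hour ago" | wc -l
```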
Generally, for a cluster that small you should be using about 100-200M of
memory on masters, and the same on nodes. If the heap in use reported above
is bigger than that, you might have a rogue client creating things.
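If objects have piled up because prune has never been run, the cleanup looks
roughly like this (a sketch based on the 3.6 admin guide; each command is a
dry run until you add --confirm, and `oc adm prune images` may need extra
flags such as a registry URL depending on your setup):

```shell
# Dry-run prune of old completed/failed objects (3.6 syntax; run as a
# cluster admin). Review the output, then re-run each with --confirm.
oc adm prune deployments
oc adm prune builds
oc adm prune images
```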
On Oct 21, 2017, at 5:53 AM, Joel Pearson <[email protected]>
wrote:
Hi Clayton,
We’re running 3.6.1 I believe. It was installed a few weeks ago using
openshift-ansible on the release-3.6 branch.
We’re running 11 namespaces, 2 nodes, 7 pods, so it’s pretty minimal.
I’ve never run the pruning described here:
https://docs.openshift.com/container-platform/3.6/admin_guide/pruning_resources.html
Is there some log that would help highlight exactly what the issue is?
Thanks,
Joel
On Sat, 21 Oct 2017 at 2:23 pm, Clayton Coleman <[email protected]> wrote:
> What version are you running? How many nodes, pods, and namespaces?
> Excessive memory use can be caused by not running prune or having an
> automated process that creates lots of an object. Excessive CPU use can be
> caused by an errant client or component stuck in a hot loop repeatedly
> taking the same action.
>
>
>
> On Oct 21, 2017, at 1:55 AM, Joel Pearson <[email protected]>
> wrote:
>
> Hi,
>
> I've got a brand new OpenShift cluster running on OpenStack and I'm
> finding that the single master that I have is struggling big time: it seems
> to consume tons of virtual memory, then starts swapping and slows right
> down.
>
> It is running with 16GB of memory, 40GB disk and 2 CPUs.
>
> The cluster is fairly idle, so I don't know why the master gets this way.
> Restarting the master solves the problem for a while, for example, I
> restarted it at 10pm last night, and when I checked again this morning it
> was in the same situation.
>
> Would having multiple masters alleviate this problem?
>
> Here is a snapshot of top:
>
> <openshift master swapping.png>
>
>
> Any advice? I'm happy to build the cluster with multiple masters if it
> will help.
>
>
> --
> Kind Regards,
>
> Joel Pearson
> Agile Digital | Senior Software Consultant
>
> Love Your Software™ | ABN 98 106 361 273
>
> _______________________________________________
> users mailing list
> [email protected]
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
--
Kind Regards,
Joel Pearson
Agile Digital | Senior Software Consultant
Love Your Software™ | ABN 98 106 361 273
p: 1300 858 277 | m: 0405 417 843 <0405417843> | w: agiledigital.com.au