On Wed, Jun 4, 2014 at 5:27 PM, vlasmarias <vlasmar...@contigo.com> wrote:

> For the past few days, we've been seeing unexpected extremely high CPU
> spikes in our system. We observed the following: the 'free' memory would
> go down to lower than 300 MB; at that point, 'cached' slowly starts to go
> down, and then CPU starts to go way up.
>
> It's almost as if the OS was not releasing 'cached' memory fast enough for
> Postgres. Is that analysis correct? Is there a way to fix this?
>

This sounds like a kernel problem, probably either the zone reclaim issue
(the vm.zone_reclaim_mode setting on NUMA hardware) or the transparent huge
pages issue.

I don't know the exact details off the top of my head, but both have been
discussed a lot on both this list and the pgsql-hackers list.
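
As a starting point, here is a minimal sketch for checking how those two
kernel settings are currently configured. The paths are assumptions about a
typical Linux layout (the second transparent huge pages path is the one used
by some RHEL 6 era kernels), and the helper names are made up for
illustration. It only reads the values; it does not change anything.

```python
from pathlib import Path

# Assumed locations of the two settings; adjust for your distribution.
CANDIDATES = {
    "zone_reclaim_mode": ["/proc/sys/vm/zone_reclaim_mode"],
    "transparent_hugepage": [
        "/sys/kernel/mm/transparent_hugepage/enabled",
        "/sys/kernel/mm/redhat_transparent_hugepage/enabled",
    ],
}


def read_first(paths):
    """Return the stripped contents of the first readable path, else None."""
    for p in paths:
        try:
            return Path(p).read_text().strip()
        except OSError:
            continue
    return None


def active_choice(text):
    """Pull the bracketed value out of e.g. '[always] madvise never'.

    Values without brackets (such as '0') are returned unchanged.
    """
    if text and "[" in text and "]" in text:
        return text[text.index("[") + 1 : text.index("]")]
    return text


def report():
    """Map each setting name to its current value, or None if unreadable."""
    return {
        name: active_choice(read_first(paths))
        for name, paths in CANDIDATES.items()
    }


if __name__ == "__main__":
    for name, value in report().items():
        print(f"{name}: {value if value is not None else 'not available'}")
```

A nonzero zone_reclaim_mode, or transparent huge pages set to "always", would
be worth comparing against what the earlier list threads recommend.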

>
> Here's the session:
>
>  04:58:37 load average: 2.37, free: 532, cached: 22852
>  04:58:57 load average: 1.91, free: 451, cached: 22859
>  04:59:17 load average: 1.82, free: 469, cached: 22866
>  04:59:57 load average: 1.57, free: 387, cached: 22884
>

What tool is that?  I'm not familiar with this output format.

>  max_connections              | 500
>

While this is probably fundamentally a kernel problem, you are not doing
yourself any favors by allowing 500 connections to a machine with 24 cores.
High numbers of connections can trigger poor kernel behavior.

Cheers,

Jeff
