Re: Autoanalyze CPU usage

2017-12-20 Thread Nikolay Samokhvalov
On Tue, Dec 19, 2017 at 7:47 PM, Habib Nahas  wrote:

> Hi,
>
> We operate an RDS postgres 9.5 instance and have periodic CPU spikes to
> 100%. These spikes appear to be due to autoanalyze kicking in on our
> larger tables.
>

How did you draw that conclusion? How did you determine that autoanalyze is
the cause of the CPU spikes?


Re: Autoanalyze CPU usage

2017-12-19 Thread Laurenz Albe
Habib Nahas wrote:
> The CPU spike occurred between 13:05 - 13:15. last_autoanalyze for the table
> shows a time of 12:49; last_autovacuum does not show any activity around
> this time for any table. Checkpoint logs are also normal around this time.
> I'd like to understand if there are any other sources of activity I
> should be checking for that would account for the spike.

last_autoanalyze is set after autoanalyze is done, so that would suggest
that autoanalyze is not the problem.

It can be tough to figure out where the activity is coming from unless
you can catch it in the act.  You could log all statements (though the
log volume may be prohibitive and can cripple performance), you could log
only long-running statements in the hope that those are at fault, or you
could log connections and disconnections and hope to find the problem
that way.  Logging from your applications may help too.
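The logging options described above correspond to these server settings
(a sketch for postgresql.conf; on RDS they would instead be set in the
instance's parameter group):

```
# Log every statement -- very high volume, use with care
log_statement = 'all'

# Alternatively, log only statements running longer than 1s (value in ms)
log_min_duration_statement = 1000

# Log session starts and ends
log_connections = on
log_disconnections = on
```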

Yours,
Laurenz Albe



Re: Autoanalyze CPU usage

2017-12-19 Thread Justin Pryzby
On Tue, Dec 19, 2017 at 02:37:18PM -0800, Habib Nahas wrote:
> As it happens our larger tables operate as a business log and are also
> insert only.
> 
> - There is no partitioning at this time since we expect to have an
> automated process to delete rows older than a certain date.

This is a primary use case for partitioning: bulk DROP rather than DELETE.
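For illustration, with declarative partitioning (PostgreSQL 10+; on 9.5
this would be inheritance-based instead), dropping an expired partition is
a cheap metadata operation compared to a bulk DELETE plus vacuum. Table
and partition names here are hypothetical:

```sql
-- Range-partition the log table by timestamp (names are illustrative)
CREATE TABLE business_log (
    id         bigint,
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

CREATE TABLE business_log_2017_12 PARTITION OF business_log
    FOR VALUES FROM ('2017-12-01') TO ('2018-01-01');

-- Retention: drop the whole expired partition instead of DELETE
DROP TABLE business_log_2017_11;
```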

> - Analyzing during off-hours sounds like a good idea; if there is no other
> way to determine the effect on the db we may end up doing that.

You can also implement a manual analyze job and hope to avoid autoanalyze.
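Such a job could run ANALYZE on the large tables during a quiet window and,
combined with a raised per-table threshold, keep autoanalyze from firing at
peak times. A sketch, with an illustrative table name:

```sql
-- Run from cron during off-hours, e.g.:
--   psql -d mydb -c "ANALYZE business_log;"
ANALYZE business_log;

-- Optionally raise the per-table scale factor so autoanalyze
-- rarely triggers between manual runs
ALTER TABLE business_log
    SET (autovacuum_analyze_scale_factor = 0.2);
```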

> - We have an open schema and heavily depend on jsonb, so I'm not sure if
> increasing the statistics target will be helpful.

If an increased stats target isn't useful for those columns, I would
recommend decreasing it.

-- 
Justin Pryzby
System Administrator
Telsasoft
+1-952-707-8581



Re: Autoanalyze CPU usage

2017-12-19 Thread Habib Nahas
The autoanalyze factor is set to 0.05 for the db, and we have not changed
the default statistics target.
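Whether the 0.05 factor is set globally or as a per-table override can be
checked with a read-only query against the catalog (safe on RDS):

```sql
-- Per-table autovacuum/autoanalyze overrides, if any
SELECT relname, reloptions
FROM pg_class
WHERE reloptions IS NOT NULL;

-- Current global statistics target
SHOW default_statistics_target;
```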

The CPU spike occurred between 13:05 - 13:15. last_autoanalyze for the
table shows a time of 12:49; last_autovacuum does not show any activity
around this time for any table. Checkpoint logs are also normal around this
time. I'd like to understand if there are any other sources of activity I
should be checking for that would account for the spike.
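For reference, the timestamps mentioned above can be read from the
statistics views:

```sql
SELECT relname,
       last_autovacuum,
       last_autoanalyze,
       n_live_tup,
       n_mod_since_analyze   -- rows modified since the last analyze
FROM pg_stat_user_tables
ORDER BY last_autoanalyze DESC NULLS LAST;
```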

User workload is throttled to avoid excess load on the db, so a query is
unlikely to have caused the spike. But we can dig deeper if other causes
are ruled out.

Thanks

On Tue, Dec 19, 2017 at 2:03 PM, Tomas Vondra 
wrote:

>
>
> On 12/19/2017 05:47 PM, Habib Nahas wrote:
> > Hi,
> >
> > We operate an RDS postgres 9.5 instance and have periodic CPU spikes to
> > 100%. These spikes appear to be due to autoanalyze kicking in on our
> > larger tables.
> >
> > Our largest table has 75 million rows and the autoanalyze scale factor
> > is set to 0.05.
> >
> > The documentation I've read suggests that analyze always operates on
> > the entire table and is not incremental. Given that, are there ways to
> > control the cost (especially CPU) of the autoanalyze operation? Would a
> > more aggressive autoanalyze scale factor (0.01) help? With the current
> > scale factor we see an autoanalyze once a week, and query performance
> > has been acceptable so far, which could imply that the scale factor
> > could be increased if necessary.
> >
>
> No, reducing the scale factor to 0.01 will not help at all; it will
> actually make the issue worse. The only thing autoanalyze does is
> run ANALYZE, which *always* collects a fixed-size sample. Making it
> more frequent will not reduce the amount of work done on each run.
>
> So the first question is whether you are using the default scale factor
> (0.1) or have actually reduced it to 0.05.
>
> The other question is why it's so CPU-intensive. Are you using the
> default statistics_target value (100), or have you increased that too?
>
> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>