Re: AutoVacuum and growing transaction XID's

2020-05-08 Thread Michael Lewis
autovacuum_naptime being only 5 seconds seems too frequent. A lock_timeout
might be 1-5 seconds depending on your system. Usually it is better for DDL
to fail quickly and be retried a little later than to wait on a lock for
minutes and have all reads queue up behind the DDL.
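
As a rough sketch (the timeout value and table/column names here are only
placeholders, not recommendations):

    SET lock_timeout = '3s';     -- fail fast instead of queueing readers behind the DDL
    ALTER TABLE some_table ADD COLUMN new_col integer;  -- aborts with a lock timeout error if the lock isn't granted in time
    RESET lock_timeout;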

Given you have autovacuum_vacuum_cost_limit set to unlimited (seems very
odd), I'm not sure a manual vacuum freeze command on the tables with high
age would perform differently. Still, issuing a vacuum freeze and then
killing the autovacuum process might be worth trying.
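
A sketch of that sequence, assuming a hypothetical table name (in 9.6 the
autovacuum workers show up in pg_stat_activity with a query that starts with
'autovacuum:'):

    -- session 1: start the manual freeze; it queues behind the running autovacuum
    VACUUM (FREEZE, VERBOSE) public.some_big_table;

    -- session 2: terminate the autovacuum worker on that table so the manual vacuum can take over
    SELECT pg_terminate_backend(pid)
      FROM pg_stat_activity
     WHERE query LIKE 'autovacuum: VACUUM public.some_big_table%';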


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-08 Thread Jeff Janes
On Thu, May 7, 2020 at 5:17 PM Avinash Kumar 
wrote:

> Hi,
>
> On Thu, May 7, 2020 at 6:08 PM Rory Campbell-Lange <
> r...@campbell-lange.net> wrote:
>
>> One of our clusters has well over 500 databases fronted by pg_bouncer.
>>
>> We get excellent connection "flattening" using pg_bouncer with
>> per-database connection spikes dealt with through a reserve pool.
>>
> What if each client establishes at least 4 connections during peak? And if
> you reserve 2 or 4 connections per DB, then you are creating 1000 or more
> reserved connections with 500 DBs in a cluster.
>

Does every database spike at the same time?


>
>> The nice thing about separate databases is that it is easy to scale
>> horizontally.
>>
> Agreed. But what about autovacuum? Workers shift from DB to DB, and 500
> databases in a cluster means you may have to have a lot of manual vacuuming
> in place as well.
>

Why would having different schemas in different DBs change your manual
vacuuming needs?  If anything, having separate DBs will make autovacuuming
more efficient, as it keeps the statistics collector's stats files smaller.

Cheers,

Jeff

>


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-08 Thread Jeff Janes
On Thu, May 7, 2020 at 4:05 PM samhitha g 
wrote:

> Hi experts,
>
> Our application serves multiple tenants. Each tenant has a schema with a
> few hundred tables and a few functions.
> We have 2000 clients, so we have to create 2000 schemas in a single
> database.
>
> While doing this, I observed that the catalog tables pg_attribute,
> pg_class, pg_depend grow huge in count and size.
>

Please attach numbers to "huge".  We don't know what "huge" means to you.
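
For instance, something along these lines would put numbers on it (standard
catalogs, nothing installation-specific assumed):

    SELECT count(*) AS pg_class_rows FROM pg_class;
    SELECT count(*) AS pg_attribute_rows FROM pg_attribute;
    SELECT pg_size_pretty(pg_total_relation_size('pg_class'))     AS pg_class_size,
           pg_size_pretty(pg_total_relation_size('pg_attribute')) AS pg_attribute_size,
           pg_size_pretty(pg_total_relation_size('pg_depend'))    AS pg_depend_size;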

"2000  * a few hundred" tables is certainly getting to the point where it
makes sense to be concerned.  But my concern would be more about backup and
recovery, version upgrades, pg_dump, etc. not about daily operations.

Cheers,

Jeff

>


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-08 Thread Avinash Kumar
Hi,

On Fri, May 8, 2020 at 3:53 AM Laurenz Albe 
wrote:

> On Fri, 2020-05-08 at 03:47 -0300, Avinash Kumar wrote:
> > > Just set "autovacuum_max_workers" higher.
> >
> > No, that wouldn't help. If you just increase autovacuum_max_workers, the
> > total cost limit of autovacuum_vacuum_cost_limit (or vacuum_cost_limit) is
> > shared by so many workers and it further delays autovacuum per each worker.
> > Instead you need to increase autovacuum_vacuum_cost_limit as well when you
> > increase the number of workers.
>
> True, I should have mentioned that.
>
> > But, if you do that and also increase workers, well, you would easily
> > reach the limitations of the disk. I am not sure it is anywhere advised to
> > have 20 autovacuum_max_workers unless I have a disk with lots of IOPS and
> > with very tiny tables across all the databases.
>
> Sure, if you have a high database load, you will at some point exceed the
> limits of the machine, which is not surprising.  What I am trying to say is
> that you have to ramp up the resources for autovacuum together with
> increasing the overall workload.
> You should consider autovacuum as part of that workload.
>
> If your machine cannot cope with the workload any more, you have to scale,
> which is easily done by adding more machines if you have many databases.
>
Agreed. Getting back to the original question asked by Sammy, I think it is
still bad to create 2000 databases for storing 2000 clients (schemas) in a
multi-tenant setup.

>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>

-- 
Regards,
Avinash Vallarapu


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-08 Thread Laurenz Albe
On Fri, 2020-05-08 at 03:47 -0300, Avinash Kumar wrote:
> > Just set "autovacuum_max_workers" higher.
> 
> No, that wouldn't help. If you just increase autovacuum_max_workers, the 
> total cost limit of
> autovacuum_vacuum_cost_limit (or vacuum_cost_limit) is shared by so many 
> workers and it
> further delays autovacuum per each worker. Instead you need to increase 
> autovacuum_vacuum_cost_limit
> as well when you increase the number of workers.

True, I should have mentioned that.

> But, if you do that and also increase workers, well, you would easily reach 
> the limitations
> of the disk. I am not sure it is anywhere advised to have 20 
> autovacuum_max_workers unless
> i have a disk with lots of IOPS and with very tiny tables across all the 
> databases.

Sure, if you have a high database load, you will at some point exceed the
limits of the machine, which is not surprising.  What I am trying to say is
that you have to ramp up the resources for autovacuum together with
increasing the overall workload.
You should consider autovacuum as part of that workload.

If your machine cannot cope with the workload any more, you have to scale,
which is easily done by adding more machines if you have many databases.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-08 Thread Avinash Kumar
Hi,

On Fri, May 8, 2020 at 3:31 AM Laurenz Albe 
wrote:

> On Thu, 2020-05-07 at 18:17 -0300, Avinash Kumar wrote:
> > > The nice thing about separate databases is that it is easy to scale
> > > horizontally.
> >
> > Agreed. But what about autovacuum? Workers shift from DB to DB, and 500
> > databases in a cluster means you may have to have a lot of manual
> > vacuuming in place as well.
>
> Just set "autovacuum_max_workers" higher.
>
No, that wouldn't help. If you just increase autovacuum_max_workers, the
total cost limit of autovacuum_vacuum_cost_limit (or vacuum_cost_limit) is
shared by all those workers, so each individual worker is throttled even
more. Instead, you need to increase autovacuum_vacuum_cost_limit as well when
you increase the number of workers. But if you do that and also increase the
workers, well, you would easily reach the limits of the disk. I am not sure
it is anywhere advised to have 20 autovacuum_max_workers unless I have a disk
with lots of IOPS and very tiny tables across all the databases.
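
(As a sketch of what raising both together could look like; the numbers are
purely illustrative:)

    ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;  -- scaled up so each worker keeps a useful I/O budget
    ALTER SYSTEM SET autovacuum_max_workers = 6;           -- takes effect only after a server restart
    SELECT pg_reload_conf();                               -- the cost limit is picked up on reload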

>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>

-- 
Regards,
Avinash Vallarapu


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-08 Thread Laurenz Albe
On Thu, 2020-05-07 at 18:17 -0300, Avinash Kumar wrote:
> > The nice thing about separate databases is that it is easy to scale
> > horizontally.
> 
> Agreed. But what about autovacuum? Workers shift from DB to DB, and 500
> databases in a cluster means you may have to have a lot of manual vacuuming
> in place as well.

Just set "autovacuum_max_workers" higher.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: AutoVacuum and growing transaction XID's

2020-05-08 Thread github kran
On Thu, May 7, 2020 at 11:04 PM David Rowley  wrote:

> On Fri, 8 May 2020 at 13:51, github kran  wrote:
> > I can't either DROP or ALTER any other tables (remove inheritance for any
> > of the old tables that are no longer being written to). None of the ALTER
> > TABLE or DROP TABLE DDLs are getting executed even though I waited for
> > several minutes, so I have terminated those queries as I didn't have luck.
>
> The auto-vacuum freeze holds a ShareUpdateExclusiveLock on the table
> being vacuumed. If you try any DDL that requires an
> AccessExclusiveLock, it'll have to wait until the vacuum has
> completed. If you leave the DDL running then all accesses to the table
> will be queued behind the ungranted AccessExclusiveLock.  It's likely
> a good idea to always run DDL with a fairly short lock_timeout, just
> in case this happens.
>
How large a value can I assign to lock_timeout so that I don't get into
trouble testing my DDL commands, and without impacting other sessions?
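
(For illustration only, roughly what David describes; the 2-second value and
the table names are placeholders, and the right value depends on how long
your readers can tolerate queueing behind the DDL:)

    BEGIN;
    SET LOCAL lock_timeout = '2s';   -- applies only to this transaction
    ALTER TABLE old_child_table NO INHERIT parent_table;  -- fails fast instead of blocking readers if the lock isn't granted in time
    COMMIT;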

>
> > 3) Can I increase the autovacuum_freeze_max_age on the tables on the
> > production system?
>


>
> Yes, but you cannot increase the per-table setting above the global
> setting. Changing the global setting requires a restart.
>
How can I change the global setting of autovacuum_freeze_max_age?
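
(For what it's worth, the global value can be changed in postgresql.conf or
with ALTER SYSTEM; the number below is only a placeholder, and this
particular setting needs a server restart to take effect:)

    ALTER SYSTEM SET autovacuum_freeze_max_age = 400000000;  -- placeholder value; written to postgresql.auto.conf
    -- then restart the server, e.g.: pg_ctl restart -D /path/to/data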


> David
>


Re: AutoVacuum and growing transaction XID's

2020-05-08 Thread github kran
Thanks David for your replies.

On Thu, May 7, 2020 at 11:01 PM David Rowley  wrote:

> On Fri, 8 May 2020 at 09:18, github kran  wrote:
> > 1) We haven't changed anything related to autovacuum except a work_mem
> > parameter, which was increased to 4 GB and which I believe is not related
> > to autovacuum
>
> You might want to look into increasing vacuum_cost_limit to something
> well above 200 or dropping autovacuum_vacuum_cost_delay down from 20
> to something much lower. However, you say you've not changed the
> autovacuum settings, but you've also said:
>
> > 1) I see there are 8 vacuum workers (not sure what changed) running in
> > the background, and the concern I have is all of these vacuum processes
> > are running with wraparound and while they are running
>

   - Yes, I said it was originally 3, but I noticed the work_mem parameter
   was changed a few weeks back to 4 GB, and from that day onwards there is
   an increasing trend in MaxUsedTransactionIds, from 200 million to 347
   million (it's been growing day by day for the last 2-3 weeks).
   - Do you think there could be a formula for how the number of workers
   could have increased based on this increase in work_mem controlled by the
   database?


> The default is 3, so if you have 8 then the settings are non-standard.
>
> It might be good to supply the output of:
>
> SELECT name,setting from pg_Settings where name like '%vacuum%';
>
   Output of the vacuum settings query:

name                                 setting    min_val  max_val     boot_val   reset_val
autovacuum                           on         null     null        on         on
autovacuum_analyze_scale_factor      0.02       0        100         0.1        0.02
autovacuum_analyze_threshold         50         0        2147483647  50         50
autovacuum_freeze_max_age            200000000  100000   2000000000  200000000  200000000
autovacuum_max_workers               8          1        262143      3          8
autovacuum_multixact_freeze_max_age  400000000  10000    2000000000  400000000  400000000
autovacuum_naptime                   5          1        2147483     60         5
autovacuum_vacuum_cost_delay         5          -1       100         20         5
autovacuum_vacuum_cost_limit         -1         -1       10000       -1         -1
autovacuum_vacuum_scale_factor       0.05       0        100         0.2        0.05
autovacuum_vacuum_threshold          50         0        2147483647  50         50
autovacuum_work_mem                  -1         -1       2147483647  -1         -1


>
> You should know that the default speed that autovacuum runs at is
> quite slow in 9.6. If you end up with all your autovacuum workers tied
> up with anti-wraparound vacuums then other tables are likely to get
> neglected and that could lead to stale stats or bloated tables. Best
> to aim to get auto-vacuum running faster or aim to perform some manual
> vacuums of tables that are over their max freeze age during an
> off-peak period to make use of the lower load during those times.
> Start with tables in pg_class with the largest age(relfrozenxid).
> You'll still likely want to look at the speed autovacuum runs at
> either way.
>
> Please be aware that the first time a new cluster crosses the
> autovacuum_freeze_max_age threshold can be a bit of a pain point as it
> can mean that many tables require auto-vacuum activity all at once.
> The impact of this is compounded if you have many tables that never
> receive an UPDATE/DELETE as auto-vacuum, in 9.6, does not visit those
> tables for any other reason. After the first time, the relfrozenxids
> of tables tend to be more staggered so their vacuum freeze
> requirements are also more staggered and that tends to cause fewer
> problems.
>
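
(A sketch of the kind of query David mentions for finding which tables to
start with, listing ordinary tables by the age of relfrozenxid:)

    SELECT c.oid::regclass AS table_name,
           age(c.relfrozenxid) AS xid_age
      FROM pg_class c
     WHERE c.relkind = 'r'
     ORDER BY age(c.relfrozenxid) DESC
     LIMIT 20;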

   The current situation is that autovacuum kicked in on 8 tables, each of
them tied to one worker, and it is running very slowly in 9.6 as you
mentioned. I observed that the VACUUM on those 8 tables has been running for
the last 15 hrs, while other processes have been running for 1 hr+ and others
for a few minutes on different tables.

   Finally, I will wait for your reply to see what can be done about this
VACUUM and the growing TXID values.

   - Do you think I should consider changing work_mem back to the 4 MB it
   was originally?
   - Can I apply your recommendations on a production instance directly, or
   do you prefer that I apply them in another environment first before
   applying them on Prod?
   - Also, like I said, I want to clean up a few unused tables or run a
   manual VACUUM, but the current system doesn't allow me to do it
   considering these factors.
   - I will try to run VACUUM manually during off-peak hrs. Can I stop the
   manual VACUUM process if it takes more than 10 minutes, or what is the
   allowed time in minutes I can have it running?

David
>