Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-08 Thread Jeff Janes
On Thu, May 7, 2020 at 5:17 PM Avinash Kumar 
wrote:

> Hi,
>
> On Thu, May 7, 2020 at 6:08 PM Rory Campbell-Lange <
> r...@campbell-lange.net> wrote:
>
>> One of our clusters has well over 500 databases fronted by pg_bouncer.
>>
>> We get excellent connection "flattening" using pg_bouncer with
>> per-database connection spikes dealt with through a reserve pool.
>>
> What if you see at least 4 connections being established by each client
> during peak ? And if you serve 4 or 2  connections per each DB, then you
> are creating 1000 or more reserved connections with 500 DBs in a cluster.
>

Does every database spike at the same time?


>
>> The nice thing about separate databases is that it is easy to scale
>> horizontally.
>>
> Agreed. But, how about autovacuum ? Workers shift from DB to DB and 500
> clusters means you may have to have a lot of manual vacuuming in place as
> well.
>

Why would having difference schemas in different DBs change your manual
vacuuming needs?  And if anything, having separate DBs will make
autovacuuming more efficient, as it keeps the statistics collectors stats
files smaller.

Cheers,

Jeff

>


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-08 Thread Jeff Janes
On Thu, May 7, 2020 at 4:05 PM samhitha g 
wrote:

> Hi experts,
>
> Our application serves multiple tenants. Each tenant has the schema with a
> few hundreds of tables and few functions.
> We have 2000 clients so we have to create 2000 schemas in a single
> database.
>
> While doing this, i observed that the catalog tables pg_attribute,
> pg_class, pg_depend grow huge in count and size.
>

Please attach numbers to "huge".  We don't know what "huge" means to you.

"2000  * a few hundred" tables is certainly getting to the point where it
makes sense to be concerned.  But my concern would be more about backup and
recovery, version upgrades, pg_dump, etc. not about daily operations.

Cheers,

Jeff

>


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-08 Thread Avinash Kumar
Hi,

On Fri, May 8, 2020 at 3:53 AM Laurenz Albe 
wrote:

> On Fri, 2020-05-08 at 03:47 -0300, Avinash Kumar wrote:
> > > Just set "autovacuum_max_workers" higher.
> >
> > No, that wouldn't help. If you just increase autovacuum_max_workers, the
> total cost limit of
> > autovacuum_vacuum_cost_limit (or vacuum_cost_limit) is shared by so many
> workers and it
> > further delays autovacuum per each worker. Instead you need to increase
> autovacuum_vacuum_cost_limit
> > as well when you increase the number of workers.
>
> True, I should have mentioned that.
>
> > But, if you do that and also increase workers, well, you would easily
> reach the limitations
> > of the disk. I am not sure it is anywhere advised to have 20
> autovacuum_max_workers unless
> > i have a disk with lots of IOPS and with very tiny tables across all the
> databases.
>
> Sure, if you have a high database load, you will at some point exceed the
> limits of
> the machine, which is not surprising.  What I am trying to say is that you
> have to ramp
> up the resources for autovacuum together with increasing the overall
> workload.
> You should consider autovacuum as part of that workload.
>
> If your machine cannot cope with the workload any more, you have to scale,
> which
> is easily done by adding more machines if you have many databases.
>
Agreed. Getting back to the original question asked by Sammy, i think it is
still bad to create 2000 databases for storing 2000 clients/(schemas) for a
multi-tenant setup.

>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>

-- 
Regards,
Avinash Vallarapu


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-08 Thread Laurenz Albe
On Fri, 2020-05-08 at 03:47 -0300, Avinash Kumar wrote:
> > Just set "autovacuum_max_workers" higher.
> 
> No, that wouldn't help. If you just increase autovacuum_max_workers, the 
> total cost limit of
> autovacuum_vacuum_cost_limit (or vacuum_cost_limit) is shared by so many 
> workers and it
> further delays autovacuum per each worker. Instead you need to increase 
> autovacuum_vacuum_cost_limit
> as well when you increase the number of workers.

True, I should have mentioned that.

> But, if you do that and also increase workers, well, you would easily reach 
> the limitations
> of the disk. I am not sure it is anywhere advised to have 20 
> autovacuum_max_workers unless
> i have a disk with lots of IOPS and with very tiny tables across all the 
> databases.

Sure, if you have a high database load, you will at some point exceed the 
limits of
the machine, which is not surprising.  What I am trying to say is that you have 
to ramp
up the resources for autovacuum together with increasing the overall workload.
You should consider autovacuum as part of that workload.

If your machine cannot cope with the workload any more, you have to scale, which
is easily done by adding more machines if you have many databases.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-08 Thread Avinash Kumar
Hi,

On Fri, May 8, 2020 at 3:31 AM Laurenz Albe 
wrote:

> On Thu, 2020-05-07 at 18:17 -0300, Avinash Kumar wrote:
> > > The nice thing about separate databases is that it is easy to scale
> > > horizontally.
> >
> > Agreed. But, how about autovacuum ? Workers shift from DB to DB and 500
> clusters
> > means you may have to have a lot of manual vacuuming in place as well.
>
> Just set "autovacuum_max_workers" higher.
>
No, that wouldn't help. If you just increase autovacuum_max_workers, the
total cost limit of autovacuum_vacuum_cost_limit (or vacuum_cost_limit) is
shared by so many workers and it further delays autovacuum per each worker.
Instead you need to increase autovacuum_vacuum_cost_limit as well when you
increase the number of workers. But, if you do that and also increase
workers, well, you would easily reach the limitations of the disk. I am not
sure it is anywhere advised to have 20 autovacuum_max_workers unless i have
a disk with lots of IOPS and with very tiny tables across all the
databases.

>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>

-- 
Regards,
Avinash Vallarapu


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-08 Thread Laurenz Albe
On Thu, 2020-05-07 at 18:17 -0300, Avinash Kumar wrote:
> > The nice thing about separate databases is that it is easy to scale
> > horizontally.
> 
> Agreed. But, how about autovacuum ? Workers shift from DB to DB and 500 
> clusters
> means you may have to have a lot of manual vacuuming in place as well.

Just set "autovacuum_max_workers" higher.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-07 Thread Avinash Kumar
Hi,

On Thu, May 7, 2020 at 6:08 PM Rory Campbell-Lange 
wrote:

> On 07/05/20, Avinash Kumar (avinash.vallar...@gmail.com) wrote:
> > >> Our application serves multiple tenants. Each tenant has the schema
> > >> with a few hundreds of tables and few functions.
> > >> We have 2000 clients so we have to create 2000 schemas in a single
> > >> database.
>
> > > That is one option but I wouldn't say you must.  If you cannot get
> > > individual tables to be multi-tenant you are probably better off
> having one
> > > database per client on a shared cluster - at least given the size of
> the
> > > schema and number of clients.
> > >
> > I am working on a similar problem.
> > 1 database per each client may be a killer when you have a connection
> > pooler that creates a pool for a unique combination of (user,database).
>
> One of our clusters has well over 500 databases fronted by pg_bouncer.
>
> We get excellent connection "flattening" using pg_bouncer with
> per-database connection spikes dealt with through a reserve pool.
>
What if you see at least 4 connections being established by each client
during peak ? And if you serve 4 or 2  connections per each DB, then you
are creating 1000 or more reserved connections with 500 DBs in a cluster.

>
> The nice thing about separate databases is that it is easy to scale
> horizontally.
>
Agreed. But, how about autovacuum ? Workers shift from DB to DB and 500
clusters means you may have to have a lot of manual vacuuming in place as
well.

>
> Rory
>


-- 
Regards,
Avinash Vallarapu
+1-902-221-5976


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-07 Thread Avinash Kumar
Hi,

On Thu, May 7, 2020 at 5:18 PM David G. Johnston 
wrote:

> On Thu, May 7, 2020 at 1:05 PM samhitha g 
> wrote:
>
>> Our application serves multiple tenants. Each tenant has the schema
>> with a few hundreds of tables and few functions.
>> We have 2000 clients so we have to create 2000 schemas in a single
>> database.
>>
>
> That is one option but I wouldn't say you must.  If you cannot get
> individual tables to be multi-tenant you are probably better off having one
> database per client on a shared cluster - at least given the size of the
> schema and number of clients.
>
I am working on a similar problem.
1 database per each client may be a killer when you have a connection
pooler that creates a pool for a unique combination of (user,database).

>
> David J.
>
>

-- 
Regards,
Avinash Vallarapu
+1-902-221-5976


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-07 Thread David G. Johnston
On Thu, May 7, 2020 at 1:05 PM samhitha g 
wrote:

> Our application serves multiple tenants. Each tenant has the schema with a
> few hundreds of tables and few functions.
> We have 2000 clients so we have to create 2000 schemas in a single
> database.
>

That is one option but I wouldn't say you must.  If you cannot get
individual tables to be multi-tenant you are probably better off having one
database per client on a shared cluster - at least given the size of the
schema and number of clients.

David J.