2017-06-28 10:43 GMT+02:00 Pritam Baral <pri...@pritambaral.com>: > > > On Wednesday 28 June 2017 02:00 PM, Ulf Lohbrügge wrote: > > Nope, I didn't try that yet. But I don't have the impression that > reindexing the indexes in information_schema will help. The table > information_schema.tables consists of the following indexes: > > > > "pg_class_oid_index" UNIQUE, btree (oid) > > "pg_class_relname_nsp_index" UNIQUE, btree (relname, relnamespace) > > "pg_class_tblspc_relfilenode_index" btree (reltablespace, > relfilenode) > > information_schema.tables is not a table, it's a view; at least on 9.5[0]. > These indexes you list are actually indexes on the pg_catalog.pg_class > table. >
Yes, it's a view. \d+ information_schema.tables gives: View definition: SELECT current_database()::information_schema.sql_identifier AS table_catalog, nc.nspname::information_schema.sql_identifier AS table_schema, c.relname::information_schema.sql_identifier AS table_name, CASE WHEN nc.oid = pg_my_temp_schema() THEN 'LOCAL TEMPORARY'::text WHEN c.relkind = 'r'::"char" THEN 'BASE TABLE'::text WHEN c.relkind = 'v'::"char" THEN 'VIEW'::text WHEN c.relkind = 'f'::"char" THEN 'FOREIGN TABLE'::text ELSE NULL::text END::information_schema.character_data AS table_type, NULL::character varying::information_schema.sql_identifier AS self_referencing_column_name, NULL::character varying::information_schema.character_data AS reference_generation, CASE WHEN t.typname IS NOT NULL THEN current_database() ELSE NULL::name END::information_schema.sql_identifier AS user_defined_type_catalog, nt.nspname::information_schema.sql_identifier AS user_defined_type_schema, t.typname::information_schema.sql_identifier AS user_defined_type_name, CASE WHEN c.relkind = 'r'::"char" OR (c.relkind = ANY (ARRAY['v'::"char", 'f'::"char"])) AND (pg_relation_is_updatable(c.oid::regclass, false) & 8) = 8 THEN 'YES'::text ELSE 'NO'::text END::information_schema.yes_or_no AS is_insertable_into, CASE WHEN t.typname IS NOT NULL THEN 'YES'::text ELSE 'NO'::text END::information_schema.yes_or_no AS is_typed, NULL::character varying::information_schema.character_data AS commit_action FROM pg_namespace nc JOIN pg_class c ON nc.oid = c.relnamespace LEFT JOIN (pg_type t JOIN pg_namespace nt ON t.typnamespace = nt.oid) ON c.reloftype = t.oid WHERE (c.relkind = ANY (ARRAY['r'::"char", 'v'::"char", 'f'::"char"])) AND NOT pg_is_other_temp_schema(nc.oid) AND (pg_has_role(c.relowner, 'USAGE'::text) OR has_table_privilege(c.oid, 'SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR has_any_column_privilege(c.oid, 'SELECT, INSERT, UPDATE, REFERENCES'::text)); > > > > > The costly sequence scan in question on pg_class happens with the > following WHERE clause: > > > > WHERE (c.relkind = ANY (ARRAY['r'::"char", 'v'::"char", 'f'::"char"])) > AND NOT pg_is_other_temp_schema(nc.oid) AND (pg_has_role(c.relowner, > 'USAGE'::text) OR has_table_privilege(c.oid, 'SELECT, INSERT, UPDATE, > DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR > has_any_column_privilege(c.oid, > 'SELECT, INSERT, UPDATE, REFERENCES'::text)); > > This is not the bottleneck WHERE clause the query plan from your first > mail shows. That one is: > > ((relkind = ANY ('{r,v,f}'::"char"[])) AND (((relname)::information_ > schema.sql_identifier)::text = 'bar'::text) AND (pg_has_role(relowner, > 'USAGE'::text) OR has_table_privilege(oid, 'SELECT, INSERT, UPDATE, DELETE, > TRUNCATE, REFERENCES, TRIGGER'::text) OR has_any_column_privilege(oid, > 'SELECT, INSERT, UPDATE, REFERENCES'::text))) > The part you copied is from the EXPLAIN ANALYZE output. The WHERE clause I posted earlier (or see view definition) above does unfortunately not contain the relname. > > I can say with certainty that an index on pg_catalog.pg_class.relname is > going to speed this up. Postgres doesn't allow modifying system catalogs, > but the `REINDEX SYSTEM <dbname>;` command should rebuild the system > indexes and pg_catalog.pg_class.relname should be included in them (I > tested on 9.6). > > Do try that once. If you still see sequential scans, check what indexes > are present on pg_catalog.pg_class. > I just fired a 'REINDEX SYSTEM <dbname>;' but the output of EXPLAIN ANALYZE is unchanged and the query duration did not change. Best Regards, Ulf > > > > > > Besides pg_class_oid_index none of the referenced columns is indexed. I > tried to add an index on relowner but didn't succeed because the column is > used in the function call pg_has_role and the query is still forced to do a > sequence scan. > > > > Regards, > > Ulf > > > > 2017-06-28 3:31 GMT+02:00 Pritam Baral <pri...@pritambaral.com <mailto: > pri...@pritambaral.com>>: > > > > On Wednesday 28 June 2017 05:27 AM, Ulf Lohbrügge wrote: > > > Hi all, > > > > > > we use schemata to separate our customers in a multi-tenant setup > (9.5.7, Debian stable). Each tenant is managed in his own schema with all > the tables that only he can access. All tables in all schemata are the same > in terms of their DDL: Every tenant uses e.g. his own table 'address'. We > currently manage around 1200 schemata (i.e. tenants) on one cluster. Every > schema consists currently of ~200 tables - so we end up with ~240000 tables > plus constraints, indexes, sequences et al. > > > > > > Our current approach is quite nice in terms of data privacy > because every tenant is isolated from all other tenants. A tenant uses his > own user that gives him only access to the corresponding schema. > Performance is great for us - we didn't expect Postgres to scale so well! > > > > > > But performance is pretty bad when we query things in the > information_schema: > > > > > > SELECT > > > * > > > FROM information_schema.tables > > > WHERE table_schema = 'foo' > > > AND table_name = 'bar';`` > > > > > > Above query results in a large sequence scan with a filter that > removes 1305161 rows: > > > > > > > > QUERY PLAN > > > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------------------------- > > > Nested Loop Left Join (cost=0.70..101170.18 rows=3 width=265) > (actual time=383.505..383.505 rows=0 loops=1) > > > -> Nested Loop (cost=0.00..101144.65 rows=3 width=141) > (actual time=383.504..383.504 rows=0 loops=1) > > > Join Filter: (nc.oid = c.relnamespace) > > > -> Seq Scan on pg_class c (cost=0.00..101023.01 > rows=867 width=77) (actual time=383.502..383.502 rows=0 loops=1) > > > Filter: ((relkind = ANY ('{r,v,f}'::"char"[])) AND > (((relname)::information_schema.sql_identifier)::text = 'bar'::text) AND > (pg_has_role(relowner, 'USAGE'::text) OR has_table_privilege(oid, 'SELECT, > INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER'::text) OR > has_any_column_privilege(oid, 'SELECT, INSERT, UPDATE, REFERENCES'::text))) > > > Rows Removed by Filter: 1305161 > > > -> Materialize (cost=0.00..56.62 rows=5 width=68) > (never executed) > > > -> Seq Scan on pg_namespace nc (cost=0.00..56.60 > rows=5 width=68) (never executed) > > > Filter: ((NOT pg_is_other_temp_schema(oid)) > AND (((nspname)::information_schema.sql_identifier)::text = 'foo'::text)) > > > -> Nested Loop (cost=0.70..8.43 rows=1 width=132) (never > executed) > > > -> Index Scan using pg_type_oid_index on pg_type t > (cost=0.42..8.12 rows=1 width=72) (never executed) > > > Index Cond: (c.reloftype = oid) > > > -> Index Scan using pg_namespace_oid_index on > pg_namespace nt (cost=0.28..0.30 rows=1 width=68) (never executed) > > > Index Cond: (oid = t.typnamespace) > > > Planning time: 0.624 ms > > > Execution time: 383.784 ms > > > (16 rows) > > > > > > We noticed the degraded performance first when using the psql cli. > Pressing tab after beginning a WHERE clause results in a query against the > information_schema which is pretty slow and ends in "lag" when trying to > enter queries. > > > > > > We also use Flyway (https://flywaydb.org/) to handle our database > migrations. Unfortunately Flyway is querying the information_schema to > check if specific tables exist (I guess this is one of the reasons > information_schema exists) and therefore vastly slows down the migration of > our tenants. Our last migration run on all tenants (schemata) almost took > 2h because the above query is executed multiple times per tenant. The > migration run consisted of multiple sql files to be executed and triggered > more than 10 queries on information_schema per tenant. > > > > > > I don't think that Flyway is to blame because querying the > information_schema should be a fast operation (and was fast for us when we > had less schemata). I tried to speedup querying pg_class by adding indexes > (after enabling allow_system_table_mods) but didn't succeed. The function > call 'pg_has_role' is probably not easy to optimize. > > > > > > Postgres is really doing a great job to handle those many schemata > and tables but doesn't scale well when querying information_schema. I > actually don't want to change my current multi-tenant setup (one schema per > tenant) as it is working great but the slow information_schema is killing > our deployments. > > > > > > Are there any other options besides switching from > one-schema-per-tenant-approach? Any help is greatly appreciated! > > > > Have you tried a `REINDEX SYSTEM <dbname>`? > > > > > > > > Regards, > > > Ulf > > > > -- > > #!/usr/bin/env regards > > Chhatoi Pritam Baral > > > > > > > > -- > > Sent via pgsql-performance mailing list ( > pgsql-performance@postgresql.org <mailto:pgsql-performance@postgresql.org > >) > > To make changes to your subscription: > > http://www.postgresql.org/mailpref/pgsql-performance < > http://www.postgresql.org/mailpref/pgsql-performance> > > > > > > [0]: https://www.postgresql.org/docs/9.5/static/infoschema-tables.html > > -- > #!/usr/bin/env regards > Chhatoi Pritam Baral > >