Hello! I have written a Python program to benchmark view efficiency,
because views play a role in our platform and we noticed that their
performance is lower than expected.
Basically, the benchmark creates a table and one view per inserted
row, roughly like this (psycopg2 is shown just to make it concrete):

import psycopg2

conn = psycopg2.connect("dbname=test")  # connection details are illustrative
conn.autocommit = True                  # commit each statement as it runs
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS foobar ( id int, text varchar(40) );")
for i in range(1200300):
    cur.execute("INSERT INTO foobar (id, text) VALUES (%s, 'some string');", (i,))
    cur.execute(f"CREATE VIEW foobar_{i} AS SELECT * FROM foobar WHERE id = {i};")

Couldn't be any simpler. Postgres 13.1 running in Docker, on Ubuntu
20. However, I noticed that certain commands are strangely slow:
- Dumping with pg_dump to tar format took 13 minutes. The same table
without the views dumps in less than 1 second.
- Restoring with pg_restore took 147 minutes. The same table without
the views restores in about half a second.
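
For reference, the timings are from plain pg_dump/pg_restore runs;
something along these lines reproduces them (the database and file
names are placeholders, and wrapping the calls in Python's subprocess
is just for convenience):

import subprocess, time

# Time a tar-format dump and a restore of it; "test", "test_restore"
# and "foobar.tar" are placeholder names.
for cmd in (["pg_dump", "-F", "t", "-f", "foobar.tar", "test"],
            ["pg_restore", "-d", "test_restore", "foobar.tar"]):
    t0 = time.time()
    subprocess.run(cmd, check=True)
    print(cmd[0], "took", round(time.time() - t0, 1), "seconds")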

In another situation (not observed by me first-hand), dumping a
real-world database with not only 1.2M empty views but also gigabytes
of data in its rows ran for many, many hours and ultimately had to be
aborted.


What's even stranger is the drop performance: DROP TABLE foobar
CASCADE;. First of all, I had to raise the lock limit for it to get
anywhere at all; otherwise it quickly bailed out complaining about too
little shared memory:
    alter system set max_locks_per_transaction = 40000;
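
For what it's worth, this setting only takes effect after a server
restart, and the shared lock table can then track roughly
max_locks_per_transaction * (max_connections + max_prepared_transactions)
objects; a quick back-of-the-envelope check (assuming the defaults for
the other two parameters):

# Rough capacity of the shared lock table (assumes Postgres defaults
# for max_connections and max_prepared_transactions).
max_locks_per_transaction = 40000
max_connections = 100
max_prepared_transactions = 0
print(max_locks_per_transaction * (max_connections + max_prepared_transactions))
# -> 4000000 lockable objects, comfortably above the ~1.2M views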

But even after that, the drop ran for almost 7 hours and then crashed:

2022-09-13 23:16:31.113 UTC [1] LOG:  server process (PID 404) was
terminated by signal 9: Killed
2022-09-13 23:16:31.113 UTC [1] DETAIL:  Failed process was running:
drop table foobar cascade;
2022-09-13 23:16:31.115 UTC [1] LOG:  terminating any other active
server processes
2022-09-13 23:16:31.115 UTC [1247] WARNING:  terminating connection
because of crash of another server process
2022-09-13 23:16:31.115 UTC [1247] DETAIL:  The postmaster has
commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2022-09-13 23:16:31.117 UTC [97] HINT:  In a moment you should be able
to reconnect to the database and repeat your command.
2022-09-13 23:16:31.136 UTC [1248] FATAL:  the database system is in
recovery mode
2022-09-13 23:16:31.147 UTC [1249] FATAL:  the database system is in
recovery mode
2022-09-13 23:16:31.192 UTC [1] LOG:  all server processes terminated;
reinitializing
2022-09-13 23:16:31.819 UTC [1250] LOG:  database system was
interrupted; last known up at 2022-09-13 23:15:47 UTC
2022-09-13 23:16:34.959 UTC [1250] LOG:  database system was not
properly shut down; automatic recovery in progress
2022-09-13 23:16:34.965 UTC [1250] LOG:  redo starts at 2/3A3FEEC8
2022-09-13 23:16:36.421 UTC [1250] LOG:  invalid record length at
2/5F355008: wanted 24, got 0
2022-09-13 23:16:36.421 UTC [1250] LOG:  redo done at 2/5F354FD0
2022-09-13 23:16:37.166 UTC [1] LOG:  database system is ready to
accept connections

After upgrading Postgres to 14.5, it crashed in a slightly different
way:

2022-09-15 19:20:26.000 UTC [67] LOG:  checkpoints are occurring too
frequently (23 seconds apart)
2022-09-15 19:20:26.000 UTC [67] HINT:  Consider increasing the
configuration parameter "max_wal_size".
2022-09-15 19:20:39.058 UTC [1] LOG:  server process (PID 223) was
terminated by signal 9: Killed
2022-09-15 19:20:39.058 UTC [1] DETAIL:  Failed process was running:
drop table foobar cascade;


Without the views, the table can be dropped in 20 ms.

There must be something inherently slow in the way that Postgres
manages views. I know that under the hood a view is essentially a
relation plus a rewrite rule and dependency entries pointing at the
parent table, so this setup is comparable to having about a million
tables. They are not that lightweight, are they?
The issue is probably made worse by atomicity: dropping the foobar
table with CASCADE has to drop all the views first, in a single
transaction. But why would handling them be so slow?
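
One way to see how much catalog bookkeeping DROP ... CASCADE and
pg_dump have to walk is to count the dependency entries hanging off
the table; a small sketch (reusing the cursor from the benchmark,
which is an assumption on my part):

# Each view adds rows to pg_class, pg_rewrite and pg_depend; counting
# the direct dependencies recorded against the foobar table.
cur.execute("""
    SELECT count(*) FROM pg_depend
    WHERE refclassid = 'pg_class'::regclass
      AND refobjid = 'foobar'::regclass;
""")
print(cur.fetchone()[0])  # expect several entries per view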


Assuming the above is true, I'm wondering whether there's a way to
improve the performance of Postgres commands, most importantly backup
and restore, when there are so many views. Dropping the table is less
important, but it would be good to have that working too, e.g. by
first dropping the views in batches (my idea, still untested; see the
sketch after this paragraph).
But backups and restores have to be faster and reliable in order to
implement one feature in our platform.
Perhaps adding an index on the views, so Postgres can quickly assess
how many there are and lock them, disabling something, tweaking some
performance option... I'm just throwing out ideas.
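
To make the batching idea concrete, here is an untested sketch of what
I have in mind (it assumes psycopg2 and the foobar_{i} naming from the
benchmark; the batch size is arbitrary):

import psycopg2

# Drop the views in chunks, committing after each chunk so no single
# transaction has to hold locks on all 1.2M views at once; drop the
# now view-free table at the end.
conn = psycopg2.connect("dbname=test")
cur = conn.cursor()
batch, total = 5000, 1200300
for start in range(0, total, batch):
    names = ", ".join(f"foobar_{i}" for i in range(start, min(start + batch, total)))
    cur.execute(f"DROP VIEW IF EXISTS {names};")
    conn.commit()
cur.execute("DROP TABLE foobar;")
conn.commit()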

Please advise. Or maybe there's no hope of making this behave better :)

Regards,
Hubert
