Re: [PERFORM] gin performance issue.
Marc Mamin writes:
> http://explain.depesz.com/s/wKv7
> Postgres Version 9.3.10 (Linux)
>
> Hello,
> this is a large daily table that only gets bulk inserts (200-400/day) with
> no updates.
> After rebuilding the whole table, the Bitmap Index Scan on
> r_20160204_ix_toprid falls under 1 second (from 800).
> Fastupdate is using the default, but autovacuum is disabled on that table,
> which contains 30 million rows.

Pre-9.5, it's a pretty bad idea to disable autovacuum on a table with a GIN
index, because then the "pending list" only gets flushed when it exceeds
work_mem.  (Obviously, using a large work_mem setting makes this worse.)

			regards, tom lane

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
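If upgrading past 9.5 isn't an option, the pending list can be kept in check at the SQL level. A sketch of two workarounds; the table name r_20160204 is a guess from the index name and is not stated in the thread:

```sql
-- Option 1: disable the pending list entirely, so inserts update the
-- index directly (slower per insert, but no pending-list buildup):
ALTER INDEX r_20160204_ix_toprid SET (fastupdate = off);

-- Option 2: keep fastupdate on, but flush the pending list after each
-- bulk load with a manual VACUUM, since autovacuum is disabled here:
VACUUM r_20160204;
```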
Re: [PERFORM] bad COPY performance with NOTIFY in a trigger
patch submitted on -hackers list.
http://www.postgresql.org/message-id/CAP_rwwn2z0gPOn8GuQ3qDVS5+HgEcG2EzEOyiJtcA=vpdeh...@mail.gmail.com

results after the patch:

trigger = BEGIN RETURN NULL; END
rows=4
   228ms COPY test.tab FROM '/tmp/test.dat'
   205ms COPY test.tab FROM '/tmp/test.dat'
rows=8
   494ms COPY test.tab FROM '/tmp/test.dat'
   395ms COPY test.tab FROM '/tmp/test.dat'
rows=12
   678ms COPY test.tab FROM '/tmp/test.dat'
   652ms COPY test.tab FROM '/tmp/test.dat'
rows=16
   956ms COPY test.tab FROM '/tmp/test.dat'
   822ms COPY test.tab FROM '/tmp/test.dat'
rows=20
  1184ms COPY test.tab FROM '/tmp/test.dat'
  1072ms COPY test.tab FROM '/tmp/test.dat'

trigger = BEGIN PERFORM pg_notify('test',NEW.id::text); RETURN NULL; END
rows=4
   440ms COPY test.tab FROM '/tmp/test.dat'
   406ms COPY test.tab FROM '/tmp/test.dat'
rows=8
   887ms COPY test.tab FROM '/tmp/test.dat'
   769ms COPY test.tab FROM '/tmp/test.dat'
rows=12
  1346ms COPY test.tab FROM '/tmp/test.dat'
  1171ms COPY test.tab FROM '/tmp/test.dat'
rows=16
  1710ms COPY test.tab FROM '/tmp/test.dat'
  1709ms COPY test.tab FROM '/tmp/test.dat'
rows=20
  2189ms COPY test.tab FROM '/tmp/test.dat'
  2206ms COPY test.tab FROM '/tmp/test.dat'

On Fri, Feb 5, 2016 at 1:45 PM, Filip Rembiałkowski <
filip.rembialkow...@gmail.com> wrote:

> On Thu, Feb 4, 2016 at 11:41 PM, Tom Lane wrote:
>
>> Filip Rembiałkowski writes:
>> > A table has a trigger.
>> > The trigger sends a NOTIFY.
>> > Test with COPY FROM shows non-linear correlation between number of
>> > inserted rows and COPY duration.
>>
>> No surprise, see AsyncExistsPendingNotify.  You would have a lot of other
>> performance issues with sending hundreds of thousands of distinct notify
>> events from one transaction anyway, so I can't get terribly excited about
>> this.
>
> What kind of issues? Do you mean problems in postgres or problems in the
> client?
>
> Is there an additional non-linear cost on COMMIT (extra to the cost I
> already showed)?
>
> The 8GB internal queue (referenced in a Note at
> http://www.postgresql.org/docs/current/static/sql-notify.html) should be
> able to hold ~ 1E8 such notifications (assuming one notification fits in
> 80 bytes).
>
> On the client side, this seems legit - the LISTENing daemon will collect
> these notifications and process them in turn.
> There might be no LISTENer running at all.
>
> Still, the main problem I have with this approach is the quadratic cost on
> big insert transactions.
> I wonder if this behavior is possible to change in future postgres
> versions. And how much programming work it would require.
>
> Is duplicate-elimination a fundamental, non-negotiable requirement?
>
> Thank you,
> Filip
Re: [PERFORM] View containing a recursive function
On Mon, 1 Feb 2016 at 10:45 Tom Lane wrote:

> Mathieu De Zutter writes:
> Assuming you have a reasonably late-model PG, you could rewrite the
> view with a lateral function call:
>
> CREATE OR REPLACE VIEW covering_works_r AS
>   SELECT
>     w.id AS work_id,
>     fn.f AS covering_work_id
>   FROM work w, fn_covering_works(w.id) AS fn(f);
>
> which puts the SRF into FROM where the planner can deal with it much
> better.

Thanks a lot. That fixes it!

> Another problem is that you let the function default to being VOLATILE,
> which would have disabled view flattening even if this didn't.  I see
> no reason for this function not to be marked STABLE.

By marking it STABLE, my row estimate of 1 gets ignored - I guess because of
the inlining. The number of results is usually just 1, though it can go up
to 10 in exceptional cases. Even that is a lot better than the planner's
inexplicable estimate of 101 when marked STABLE, which often leads to a hash
join instead of a nested loop in complex queries:

->  Recursive Union  (cost=0.00..795.53 rows=101 width=4) (actual time=0.001..0.009 rows=1 loops=4)
      ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=4)
      ->  Nested Loop  (cost=0.29..79.35 rows=10 width=4) (actual time=0.005..0.005 rows=0 loops=5)
            ->  WorkTable Scan on func f  (cost=0.00..0.20 rows=10 width=4) (actual time=0.000..0.000 rows=1 loops=5)
            ->  Index Scan using adaptation_adapted_idx on adaptation ad  (cost=0.29..7.91 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=5)
                  Index Cond: (adapted_id = f.work_id)

Thanks again,
Mathieu
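For reference, the row estimate can also be attached to the function itself. A sketch, assuming a single-integer-argument signature for fn_covering_works (the signature is not shown in the thread):

```sql
-- STABLE allows the planner to flatten/inline the call; ROWS sets the
-- planner's estimate for non-inlined calls. Note: once a SQL-language
-- function is inlined, the ROWS setting is ignored, which matches the
-- behavior described above.
ALTER FUNCTION fn_covering_works(integer) STABLE ROWS 10;
```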
Re: [PERFORM] bad COPY performance with NOTIFY in a trigger
Tom Lane writes:
> No surprise, see AsyncExistsPendingNotify.  You would have a lot of other
> performance issues with sending hundreds of thousands of distinct notify
> events from one transaction anyway, so I can't get terribly excited about
> this.

@Filip: you probably want a per-statement trigger rather than a per-row
trigger: insert all rows with COPY, then send one notification. You have to
mark the new rows somehow yourself; unfortunately, PostgreSQL gives a
statement-level trigger no way to identify them.
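A sketch of that per-statement approach, reusing the test.tab names from the thread (the trigger-function name is made up; since a statement trigger here cannot see the inserted rows, the payload has to be generic):

```sql
-- Fires once per COPY/INSERT statement, not once per row, so the
-- pending-notification list stays tiny regardless of row count.
CREATE OR REPLACE FUNCTION test.notify_bulk_load() RETURNS trigger AS $$
BEGIN
  PERFORM pg_notify('test', 'bulk load finished');
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tab_notify_stmt
  AFTER INSERT ON test.tab
  FOR EACH STATEMENT
  EXECUTE PROCEDURE test.notify_bulk_load();
```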
Re: [PERFORM] bad COPY performance with NOTIFY in a trigger
On Fri, Feb 5, 2016 at 9:33 AM, Filip Rembiałkowski wrote:
> patch submitted on -hackers list.
> http://www.postgresql.org/message-id/CAP_rwwn2z0gPOn8GuQ3qDVS5+HgEcG2EzEOyiJtcA=vpdeh...@mail.gmail.com
>
> results after the patch:
>
> trigger = BEGIN RETURN NULL; END
> rows=4
>    228ms COPY test.tab FROM '/tmp/test.dat'
>    205ms COPY test.tab FROM '/tmp/test.dat'
> rows=8
>    494ms COPY test.tab FROM '/tmp/test.dat'
>    395ms COPY test.tab FROM '/tmp/test.dat'
> rows=12
>    678ms COPY test.tab FROM '/tmp/test.dat'
>    652ms COPY test.tab FROM '/tmp/test.dat'
> rows=16
>    956ms COPY test.tab FROM '/tmp/test.dat'
>    822ms COPY test.tab FROM '/tmp/test.dat'
> rows=20
>   1184ms COPY test.tab FROM '/tmp/test.dat'
>   1072ms COPY test.tab FROM '/tmp/test.dat'
>
> trigger = BEGIN PERFORM pg_notify('test',NEW.id::text); RETURN NULL; END
> rows=4
>    440ms COPY test.tab FROM '/tmp/test.dat'
>    406ms COPY test.tab FROM '/tmp/test.dat'
> rows=8
>    887ms COPY test.tab FROM '/tmp/test.dat'
>    769ms COPY test.tab FROM '/tmp/test.dat'
> rows=12
>   1346ms COPY test.tab FROM '/tmp/test.dat'
>   1171ms COPY test.tab FROM '/tmp/test.dat'
> rows=16
>   1710ms COPY test.tab FROM '/tmp/test.dat'
>   1709ms COPY test.tab FROM '/tmp/test.dat'
> rows=20
>   2189ms COPY test.tab FROM '/tmp/test.dat'
>   2206ms COPY test.tab FROM '/tmp/test.dat'

I'm not so sure that this is a great idea. Generally, we tend to discourage
GUCs that control behavior at the SQL level. Are you 100% certain that there
is no path to optimizing this case without changing behavior?

merlin
Re: [PERFORM] bad COPY performance with NOTIFY in a trigger
On Thu, Feb 4, 2016 at 11:41 PM, Tom Lane wrote:

> Filip Rembiałkowski writes:
> > A table has a trigger.
> > The trigger sends a NOTIFY.
> > Test with COPY FROM shows non-linear correlation between number of
> > inserted rows and COPY duration.
>
> No surprise, see AsyncExistsPendingNotify.  You would have a lot of other
> performance issues with sending hundreds of thousands of distinct notify
> events from one transaction anyway, so I can't get terribly excited about
> this.

What kind of issues? Do you mean problems in postgres or problems in the
client?

Is there an additional non-linear cost on COMMIT (extra to the cost I
already showed)?

The 8GB internal queue (referenced in a Note at
http://www.postgresql.org/docs/current/static/sql-notify.html) should be
able to hold ~ 1E8 such notifications (assuming one notification fits in 80
bytes).

On the client side, this seems legit - the LISTENing daemon will collect
these notifications and process them in turn.
There might be no LISTENer running at all.

Still, the main problem I have with this approach is the quadratic cost on
big insert transactions.
I wonder if this behavior is possible to change in future postgres
versions. And how much programming work it would require.

Is duplicate-elimination a fundamental, non-negotiable requirement?

Thank you,
Filip
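On the duplicate-elimination question: AsyncExistsPendingNotify scans the transaction's pending-notification list for an identical channel/payload pair on every NOTIFY, which is O(n) per call and O(n^2) for n per-row notifications - the quadratic cost measured above. A small illustration of the behavior that scan pays for:

```sql
BEGIN;
NOTIFY test, 'same payload';
NOTIFY test, 'same payload';  -- collapsed: a listener gets this only once
COMMIT;
```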
[PERFORM] gin performance issue.
http://explain.depesz.com/s/wKv7
Postgres Version 9.3.10 (Linux)

Hello,
this is a large daily table that only gets bulk inserts (200-400/day) with
no updates.
After rebuilding the whole table, the Bitmap Index Scan on
r_20160204_ix_toprid falls under 1 second (from 800).

Fastupdate is using the default, but autovacuum is disabled on that table,
which contains 30 million rows.
Another specificity is that the cardinality of the indexed column is very
high. The average count per distinct value is only 2.7.

I'm not sure what the problem is. Does the missing vacuum affect the gin
index's sanity beyond not cleaning the pending list?
As I understand it, this list will be merged into the index automatically
when it gets full, independently of the vacuum setting.

Can it be an index bloat issue?
And last but not least, can I reduce the problem by configuration?

regards,

Marc Mamin
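One way to check whether the pending list is in fact the culprit, assuming the pgstattuple contrib module is installed and its version provides pgstatginindex():

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- pending_pages / pending_tuples show how much of the index sits in the
-- pending list, which every index search has to scan linearly:
SELECT version, pending_pages, pending_tuples
FROM pgstatginindex('r_20160204_ix_toprid');
```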