Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Hi,

> thanks for your follow-up questions.
> - postgres version is 9.1.13
> - the number of rows (in this latest instance) is 28,474,842
> - I've clustered and vacuum-full-ed and analyzed this table frequently, attempting to troubleshoot this. (Running vacuum full on the whole catalog seems a little excessive, and unlikely to help.)

Hi,
I mean the pg_* tables. When working with temp objects and a high number of classes, regular vacuum may not clean them efficiently. It is not a bad idea to run a vacuum full verbose manually on the largest of those from time to time, to verify that they don't grow out of control. This normally requires only a few seconds. The verbose output of vacuum full sometimes returns interesting information...

For the ANALYZE performance, I guess that these are the most relevant ones: pg_statistic, pg_class, pg_attribute, pg_index, pg_constraint.

regards,
Marc Mamin

> - no other processes are likely to be interfering; nothing other than PostgreSQL runs on this machine (except for normal OS processes and the New Relic server monitoring service); concurrent activity in PostgreSQL is low-level and unrelated, and this effect is observed systematically whenever this kind of operation is performed on this table
> - no override for this table; the system default_statistics_target is 100 (the default)
> - yes, an ANALYZE is performed on the temp table after the COPY and before the INSERT
> - no index on the temp table (but I'm scanning the whole thing anyway). There are indexes on f_foo as detailed in my original post, and I expect the PK to make the WHERE NOT EXISTS filtering efficient (as it filters on exactly all columns of the PK) ... but even if it didn't, I would expect that to only slow down the actual insert execution, not the EXPLAIN.
>
> Cheers,
> Gulli
>
> On Wed, Mar 4, 2015 at 8:10 PM, Marc Mamin m.ma...@intershop.de wrote:
>> Hi, we are seeing cases of EXPLAIN INSERT INTO foo SELECT ... taking over an hour, with disk I/O utilization (percent of time the device is busy) at 100% the whole time, although I/O bandwidth is not saturated. This is on PostgreSQL 9.1.13. What could cause this? Note that there is no ANALYZE. Is it possible that the SELECT is actually executed in planning the INSERT?
>>
>> When executing the INSERT itself (not EXPLAIN) immediately afterwards, that logs a temporary file message, but the EXPLAIN invocation does not (though the disk I/O suggests that a large on-disk sort is occurring):
>>
>> LOG: temporary file: path base/pgsql_tmp/pgsql_tmp6016.0, size 744103936
>> STATEMENT: INSERT INTO f_foo SELECT [...]
>>
>> During that actual execution, there's lower disk I/O utilization (though a higher MB/s rate). Charts of disk I/O utilization and rate are at http://postimg.org/image/628h6jmn3/ ... the solid 100% span is the EXPLAIN statement, ending at 6:13:30pm, followed by the INSERT statement ending at 6:32pm. Metrics are collected by New Relic; their definition of I/O utilization is at https://discuss.newrelic.com/t/disk-i-o-metrics/2900
>>
>> Here's the EXPLAIN statement:
>>
>> LOG: duration: 3928778.823 ms statement:
>> EXPLAIN INSERT INTO f_foo
>> SELECT t_foo.fk_d1, t_foo.fk_d2, t_foo.fk_d3, t_foo.fk_d4, t_foo.fk_d5, t_foo.fk_d6, t_foo.value
>> FROM t_foo
>> WHERE NOT (EXISTS (SELECT * FROM f_foo WHERE f_foo.fk_d2 = t_foo.fk_d2 AND f_foo.fk_d3 = t_foo.fk_d3 AND f_foo.fk_d4 = t_foo.fk_d4 AND f_foo.fk_d5 = t_foo.fk_d5 AND f_foo.fk_d6 = t_foo.fk_d6 AND f_foo.fk_d1 = t_foo.fk_d1))
>>
>> (where t_foo is a temp table previously populated using COPY, and the NOT EXISTS subquery refers to the same table we are inserting into)
>>
>> Here's the EXPLAIN output:
>>
>> Insert on f_foo  (cost=8098210.50..9354519.69 rows=1 width=16)
>>   ->  Merge Anti Join  (cost=8098210.50..9354519.69 rows=1 width=16)
>>         Merge Cond: ((t_foo.fk_d2 = public.f_foo.fk_d2) AND (t_foo.fk_d3 = public.f_foo.fk_d3) AND (t_foo.fk_d4 = public.f_foo.fk_d4) AND (t_foo.fk_d5 = public.f_foo.fk_d5) AND (t_foo.fk_d6 = public.f_foo.fk_d6) AND (t_foo.fk_d1 = public.f_foo.fk_d1))
>>         ->  Sort  (cost=3981372.25..4052850.70 rows=28591380 width=16)
>>               Sort Key: t_foo.fk_d2, t_foo.fk_d3, t_foo.fk_d4, t_foo.fk_d5, t_foo.fk_d6, t_foo.fk_d1
>>               ->  Seq Scan on t_foo  (cost=0.00..440461.80 rows=28591380 width=16)
>>         ->  Sort  (cost=4116838.25..4188025.36 rows=28474842 width=12)
>>               Sort Key: public.f_foo.fk_d2, public.f_foo.fk_d3, public.f_foo.fk_d4, public.f_foo.fk_d5,
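Marc's advice to keep an eye on the largest catalog tables can be sketched as a query; this is not from the thread, just a generic sketch against the standard system catalogs (verify the names on your version before relying on it):

```sql
-- Sketch: list the ten largest pg_catalog tables by total size,
-- to decide which ones are worth a manual VACUUM FULL VERBOSE.
SELECT c.relname,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'pg_catalog'
  AND c.relkind = 'r'
ORDER BY pg_total_relation_size(c.oid) DESC
LIMIT 10;

-- Then, for the bloated ones, e.g.:
VACUUM FULL VERBOSE pg_statistic;
```

Heavy temp-table churn mostly inflates pg_class, pg_attribute and pg_statistic, so those usually top the list.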
Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Hi,

On 5.3.2015 16:01, Gunnlaugur Thor Briem wrote:
> Hi, thanks for your follow-up questions.
> - postgres version is 9.1.13
> - the number of rows (in this latest instance) is 28,474,842
> - I've clustered and vacuum-full-ed and analyzed this table frequently, attempting to troubleshoot this. (Running vacuum full on the whole catalog seems a little excessive, and unlikely to help.)
> - no other processes are likely to be interfering; nothing other than PostgreSQL runs on this machine (except for normal OS processes and the New Relic server monitoring service); concurrent activity in PostgreSQL is low-level and unrelated, and this effect is observed systematically whenever this kind of operation is performed on this table
> - no override for this table; the system default_statistics_target is 100 (the default)
> - yes, an ANALYZE is performed on the temp table after the COPY and before the INSERT
> - no index on the temp table (but I'm scanning the whole thing anyway). There are indexes on f_foo as detailed in my original post, and I expect the PK to make the WHERE NOT EXISTS filtering efficient (as it filters on exactly all columns of the PK) ... but even if it didn't, I would expect that to only slow down the actual insert execution, not the EXPLAIN.

The only thing I can think of is some sort of memory exhaustion, resulting in swapping out large amounts of memory. That'd explain the I/O load. Can you run something like vmstat to see if this really is swap? The fact that a plain INSERT does not do that contradicts this, though, as the server should be able to plan either both queries (plain and EXPLAIN) or neither of them.

Can you prepare a self-contained test case, i.e. a script that demonstrates the issue? I tried to reproduce the issue using the information provided so far, but unsuccessfully :-(

Even if you could reproduce the problem only on another machine (because of keeping the data internal), running it on a server with debug symbols and seeing where most of the time is spent (e.g. using 'perf top') would be useful.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
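A self-contained test case of the kind Tomas asks for might start from a sketch like this. The schema is taken from the thread, but the PK column order, the key distributions, and the use of generate_series instead of COPY are guesses:

```sql
-- Hypothetical reproduction script; scale the row count to taste.
CREATE TABLE f_foo (
    fk_d1 smallint NOT NULL, fk_d2 smallint NOT NULL,
    fk_d3 smallint NOT NULL, fk_d4 smallint NOT NULL,
    fk_d5 smallint NOT NULL, fk_d6 smallint NOT NULL,
    value integer,
    PRIMARY KEY (fk_d2, fk_d3, fk_d4, fk_d5, fk_d6, fk_d1)  -- order assumed
);

CREATE TEMP TABLE t_foo (LIKE f_foo INCLUDING DEFAULTS);

-- The thread uses COPY; random keys via generate_series stand in here.
INSERT INTO t_foo
SELECT (random()*100)::smallint, (random()*100)::smallint,
       (random()*100)::smallint, (random()*100)::smallint,
       (random()*100)::smallint, (random()*100)::smallint, i
FROM generate_series(1, 28000000) AS s(i);

-- In the reported case f_foo already held ~28.5M rows; seed it from a
-- first batch so the anti-join has something to merge against:
INSERT INTO f_foo
SELECT DISTINCT ON (fk_d2, fk_d3, fk_d4, fk_d5, fk_d6, fk_d1) * FROM t_foo;

ANALYZE t_foo;

\timing on
EXPLAIN INSERT INTO f_foo
SELECT * FROM t_foo
WHERE NOT EXISTS (
    SELECT 1 FROM f_foo
    WHERE f_foo.fk_d1 = t_foo.fk_d1 AND f_foo.fk_d2 = t_foo.fk_d2
      AND f_foo.fk_d3 = t_foo.fk_d3 AND f_foo.fk_d4 = t_foo.fk_d4
      AND f_foo.fk_d5 = t_foo.fk_d5 AND f_foo.fk_d6 = t_foo.fk_d6
);
```

If the EXPLAIN alone drives sustained disk I/O on such a script, the problem reproduces; if not, the trigger is likely in the real data or catalog state.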
Re: [PERFORM] PG 9.3 materialized view VS Views, indexes, shared memory
On 2/20/15 12:09 PM, Nicolas Paris wrote:
> Well it seems that the max query size for CREATE INDEX is 8160 characters in my 9.3 postgresql version. Then the only solution I see is to add a new boolean field, huge_table.view1, and change the predicate to WHERE view1=1. But I may have 800 views... adding 800 new indexed fields to the huge table is actually not a good idea. Too bad. Any idea to solve that partial view limitation?

If you have that many different views I doubt you want that many indexes anyway. Have you tried just hitting the base table and indexes directly, either through plain views or just direct SQL?

Also, how frequently does data change in the huge table? This sounds like a case where the visibility map could make a huge difference.

By the way, if all the Mat Views are in one schema that's already in the search path, a very easy way to test this would be to create an equivalent set of regular views in a different schema (which you can probably do programmatically via pg_get_viewdef()) and then change the search_path to put the new schema before the old.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
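Jim's pg_get_viewdef() suggestion might be scripted roughly as follows; the schema names here ("matviews", "plainviews") are made up for illustration, and the generated DDL should be reviewed before running:

```sql
-- Sketch: emit CREATE VIEW statements that mirror each materialized
-- view in schema "matviews" as a plain view in schema "plainviews".
SELECT format('CREATE VIEW plainviews.%I AS %s',
              c.relname,
              pg_get_viewdef(c.oid))
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'matviews'
  AND c.relkind = 'm';   -- 'm' = materialized view

-- After executing the generated DDL, prefer the plain views:
SET search_path = plainviews, matviews, public;
```

Queries written against the unqualified view names then resolve to the plain views, letting you A/B the two approaches without touching application SQL.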
Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Tomas Vondra tomas.von...@2ndquadrant.com writes:
> On 5.3.2015 16:01, Gunnlaugur Thor Briem wrote:
>> - postgres version is 9.1.13
> The only thing I can think of is some sort of memory exhaustion, resulting in swapping out large amounts of memory.

I'm wondering about the issue addressed by commit fccebe421 (Use SnapshotDirty rather than an active snapshot to probe index endpoints). Now, that was allegedly fixed in 9.1.13 ... but if the OP were confused and this server were running, say, 9.1.12, that could be a viable explanation. Another possibly viable explanation for seeing the issue in 9.1.13 would be if I fat-fingered the back-patch somehow :-(.

In any case, I concur with your advice:
> Even if you could reproduce the problem only on another machine (because of keeping the data internal), running it on a server with debug symbols and seeing where most of the time is spent (e.g. using 'perf top') would be useful.
Without some more-precise idea of where the time is going, we're really just guessing as to the cause.

regards, tom lane
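Tom's 9.1.12-vs-9.1.13 doubt is cheap to rule out from the affected session itself; these are standard queries, nothing thread-specific:

```sql
-- Confirm the exact server version this session is connected to
-- (a stale connection pool can hide a minor-version mismatch).
SELECT version();
SHOW server_version;
```

If this reports 9.1.12 or earlier, the fccebe421 index-endpoint-probe fix is absent and would be a plausible explanation.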
Re: [PERFORM] slow server : s_lock and _bt_checkkeys on perf top
On 3/4/15 1:31 PM, Soni M wrote:
> Hello All. The master db size is 1.5 TB, all postgres 9.1.13 installed from the RHEL package. It has a streaming replica and a slony replica to other servers. Server performance is slower than usual; before that, a big query got cancelled, and then performance got slow. No sign of IO wait.
>
> On sar, %user and %system dominate the cpu usage:
> 01:25:04 PM  CPU  %user  %nice  %system  %iowait  %steal  %idle
> Average:     all  51.91   0.00    12.03     0.66    0.00  35.39
>
> On perf top, I saw these almost always at the top:
> 18.93%  postgres  [.] s_lock
> 10.72%  postgres  [.] _bt_checkkeys

_bt_checkkeys is the function that compares a row in a btree index to a condition. s_lock is a spinlock; the high CPU usage in there indicates there's heavy lock contention somewhere.

Is there one PG process that's clearly using more CPU than the others? What else is running in the database? Are there any unusual data types involved?

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
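Jim's first question ("is one PG process clearly using more CPU?") can be answered by mapping the busy OS pid from top/perf back to its query. A sketch for 9.1, where the process-id column is still called procpid (renamed to pid in 9.2) and the query column is current_query:

```sql
-- Look up what a CPU-hungry backend (pid taken from `top`) is running.
-- 12345 is a hypothetical pid, not one from the thread.
SELECT procpid, usename, waiting, query_start, current_query
FROM pg_stat_activity
WHERE procpid = 12345;
```

If the same backend and same query keep showing up, its plan (and the indexes _bt_checkkeys is grinding through) is the place to look next.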
Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
> What could cause this? Note that there is no ANALYZE.

Can you capture pstack and/or a perf report while the EXPLAIN hangs? I think it should shed light on what PostgreSQL is busy doing.

Vladimir
Re: [PERFORM] Reverse Key Index
On 26.02.2015 13:37, Heikki Linnakangas wrote:
> On 02/26/2015 12:31 AM, Josh Berkus wrote:
>> On 02/14/2015 10:35 AM, Sven R. Kunze wrote:
>>> Thanks for the immediate reply. I understand the use case is quite limited. On the other hand, I see potential when it comes to applications which use PostgreSQL. There, programmers would have to change a lot of code to tweak existing (and more importantly working) queries to hash/reverse an id column first. Using ORMs would make this change even more painful and maybe even impossible. When reading https://richardfoote.wordpress.com/2008/01/14/introduction-to-reverse-key-indexes-part-i/ carefully, it also seems to work with index scans partially, in the case of equality comparisons.
>> Seems like a good use for SP-GiST. Go for it!
> A b-tree opclass that just compares from right to left would work just as well, and perform better. - Heikki

Thanks for the hint. That also sounds easy to implement.

Regards,

--
Sven R. Kunze
TBZ-PARIV GmbH, Bernsdorfer Str. 210-212, 09126 Chemnitz
Tel: +49 (0)371 33714721, Fax: +49 (0)371 5347920
e-mail: srku...@tbz-pariv.de
web: www.tbz-pariv.de
Geschäftsführer: Dr. Reiner Wohlgemuth
Sitz der Gesellschaft: Chemnitz
Registergericht: Chemnitz HRB 8543
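Short of writing the custom opclass Heikki describes, the reverse-key effect can be approximated today with an expression index; this is a generic sketch (table and index names made up), not something proposed in the thread:

```sql
-- Approximate a reverse-key index with an expression index on reverse():
-- like Oracle's reverse key indexes, it spreads inserts of sequential
-- keys across the index, at the cost of losing range scans on the key.
CREATE TABLE events (id text PRIMARY KEY, payload text);
CREATE INDEX events_id_rev_idx ON events (reverse(id));

-- An equality probe that can use the reversed index:
SELECT * FROM events WHERE reverse(id) = reverse('abc123');
```

Application queries must be rewritten to compare reverse(id), which is exactly the ORM pain point Sven raises; a right-to-left opclass would avoid that rewrite.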
Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Hi, thanks for your follow-up questions.

- postgres version is 9.1.13
- the number of rows (in this latest instance) is 28,474,842
- I've clustered and vacuum-full-ed and analyzed this table frequently, attempting to troubleshoot this. (Running vacuum full on the whole catalog seems a little excessive, and unlikely to help.)
- no other processes are likely to be interfering; nothing other than PostgreSQL runs on this machine (except for normal OS processes and the New Relic server monitoring service); concurrent activity in PostgreSQL is low-level and unrelated, and this effect is observed systematically whenever this kind of operation is performed on this table
- no override for this table; the system default_statistics_target is 100 (the default)
- yes, an ANALYZE is performed on the temp table after the COPY and before the INSERT
- no index on the temp table (but I'm scanning the whole thing anyway). There are indexes on f_foo as detailed in my original post, and I expect the PK to make the WHERE NOT EXISTS filtering efficient (as it filters on exactly all columns of the PK) ... but even if it didn't, I would expect that to only slow down the actual insert execution, not the EXPLAIN.

Cheers,
Gulli

On Wed, Mar 4, 2015 at 8:10 PM, Marc Mamin m.ma...@intershop.de wrote:
> Hi, we are seeing cases of EXPLAIN INSERT INTO foo SELECT ... taking over an hour, with disk I/O utilization (percent of time the device is busy) at 100% the whole time, although I/O bandwidth is not saturated. This is on PostgreSQL 9.1.13. What could cause this? Note that there is no ANALYZE. Is it possible that the SELECT is actually executed in planning the INSERT?
>
> When executing the INSERT itself (not EXPLAIN) immediately afterwards, that logs a temporary file message, but the EXPLAIN invocation does not (though the disk I/O suggests that a large on-disk sort is occurring):
>
> LOG: temporary file: path base/pgsql_tmp/pgsql_tmp6016.0, size 744103936
> STATEMENT: INSERT INTO f_foo SELECT [...]
>
> During that actual execution, there's lower disk I/O utilization (though a higher MB/s rate). Charts of disk I/O utilization and rate are at http://postimg.org/image/628h6jmn3/ ... the solid 100% span is the EXPLAIN statement, ending at 6:13:30pm, followed by the INSERT statement ending at 6:32pm. Metrics are collected by New Relic; their definition of I/O utilization is at https://discuss.newrelic.com/t/disk-i-o-metrics/2900
>
> Here's the EXPLAIN statement:
>
> LOG: duration: 3928778.823 ms statement:
> EXPLAIN INSERT INTO f_foo
> SELECT t_foo.fk_d1, t_foo.fk_d2, t_foo.fk_d3, t_foo.fk_d4, t_foo.fk_d5, t_foo.fk_d6, t_foo.value
> FROM t_foo
> WHERE NOT (EXISTS (SELECT * FROM f_foo WHERE f_foo.fk_d2 = t_foo.fk_d2 AND f_foo.fk_d3 = t_foo.fk_d3 AND f_foo.fk_d4 = t_foo.fk_d4 AND f_foo.fk_d5 = t_foo.fk_d5 AND f_foo.fk_d6 = t_foo.fk_d6 AND f_foo.fk_d1 = t_foo.fk_d1))
>
> (where t_foo is a temp table previously populated using COPY, and the NOT EXISTS subquery refers to the same table we are inserting into)
>
> Here's the EXPLAIN output:
>
> Insert on f_foo  (cost=8098210.50..9354519.69 rows=1 width=16)
>   ->  Merge Anti Join  (cost=8098210.50..9354519.69 rows=1 width=16)
>         Merge Cond: ((t_foo.fk_d2 = public.f_foo.fk_d2) AND (t_foo.fk_d3 = public.f_foo.fk_d3) AND (t_foo.fk_d4 = public.f_foo.fk_d4) AND (t_foo.fk_d5 = public.f_foo.fk_d5) AND (t_foo.fk_d6 = public.f_foo.fk_d6) AND (t_foo.fk_d1 = public.f_foo.fk_d1))
>         ->  Sort  (cost=3981372.25..4052850.70 rows=28591380 width=16)
>               Sort Key: t_foo.fk_d2, t_foo.fk_d3, t_foo.fk_d4, t_foo.fk_d5, t_foo.fk_d6, t_foo.fk_d1
>               ->  Seq Scan on t_foo  (cost=0.00..440461.80 rows=28591380 width=16)
>         ->  Sort  (cost=4116838.25..4188025.36 rows=28474842 width=12)
>               Sort Key: public.f_foo.fk_d2, public.f_foo.fk_d3, public.f_foo.fk_d4, public.f_foo.fk_d5, public.f_foo.fk_d6, public.f_foo.fk_d1
>               ->  Seq Scan on f_foo  (cost=0.00..591199.42 rows=28474842 width=12)
>
> The INSERT is indeed rather large (which is why we're issuing an EXPLAIN ahead of it to log the plan). So its long execution time is expected. But I want to understand why the EXPLAIN takes even longer.
>
> The table looks like this:
>
> \d f_foo
>       Table "public.f_foo"
>  Column |   Type   | Modifiers
> --------+----------+-----------
>  fk_d1  | smallint | not null
>  fk_d2  | smallint | not null
>  fk_d3  | smallint | not null
>  fk_d4  | smallint | not null
>  fk_d5  | smallint | not null
>  fk_d6  | smallint | not null
>  value  | integer  |
> Indexes: