Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT

2015-03-05 Thread Marc Mamin


Hi, thanks for your follow-up questions.
- postgres version is 9.1.13
- the number of rows (in this latest instance) is 28,474,842
- I've clustered and vacuum-full-ed and analyzed this table frequently, 
attempting to troubleshoot this. (Running vacuum full on the whole catalog 
seems a little excessive, and unlikely to help.)

Hi,

I mean the pg_* catalog tables. When working with temp objects and a high 
number of relations, regular vacuum may not clean them efficiently.
It is not a bad idea to run a manual VACUUM FULL VERBOSE on the largest of 
those from time to time, to verify that they don't grow out of control.
This normally takes only a few seconds.
The verbose output of vacuum full sometimes returns interesting information...
For the ANALYZE performance, I guess that these are the most relevant ones:
 pg_statistic;
 pg_class;
 pg_attribute;
 pg_index;
 pg_constraint;
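A sketch of the check suggested above (the catalog table names are those 
listed; this needs superuser rights, and VACUUM FULL takes an exclusive 
lock, so run it in a quiet moment):

```sql
-- Show how large the relevant catalog tables have grown
SELECT c.relname,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'pg_catalog'
  AND c.relname IN ('pg_statistic', 'pg_class', 'pg_attribute',
                    'pg_index', 'pg_constraint')
ORDER BY pg_total_relation_size(c.oid) DESC;

-- Then rewrite the largest ones manually, watching the verbose output
VACUUM FULL VERBOSE pg_statistic;
VACUUM FULL VERBOSE pg_attribute;
```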

regards,

Marc Mamin



- no other processes are likely to be interfering; nothing other than 
PostgreSQL runs on this machine (except for normal OS processes and New Relic 
server monitoring service); concurrent activity in PostgreSQL is low-level and 
unrelated, and this effect is observed systematically whenever this kind of 
operation is performed on this table
- no override for this table; the system default_statistics_target is 100 (the 
default)
- yes, an ANALYZE is performed on the temp table after the COPY and before the 
INSERT
- no index on the temp table (but I'm scanning the whole thing anyway). There 
are indexes on f_foo as detailed in my original post, and I expect the PK to 
make the WHERE NOT EXISTS filtering efficient (as it filters on exactly all 
columns of the PK) ... but even if it didn't, I would expect that to only slow 
down the actual insert execution, not the EXPLAIN.
Cheers,
Gulli
On Wed, Mar 4, 2015 at 8:10 PM, Marc Mamin m.ma...@intershop.de wrote:

Hi,
we are seeing cases of EXPLAIN INSERT INTO foo SELECT ... taking over an 
 hour, with disk I/O utilization (percent of time device is busy) at 100% the 
 whole time, although I/O bandwidth is not saturated. This is on PostgreSQL 
 9.1.13.
What could cause this? Note that there is no ANALYZE. Is it possible that 
 the SELECT is actually executed, in planning the INSERT?
When executing the INSERT itself (not EXPLAIN) immediately afterwards, 
 that logs a temporary file message, but the EXPLAIN invocation does not 
 (though the disk I/O suggests that a large on-disk sort is occurring):
LOG:  temporary file: path base/pgsql_tmp/pgsql_tmp6016.0, size 
 744103936
STATEMENT:  INSERT INTO f_foo SELECT
[...]
During that actual execution, there's a lower disk I/O utilization 
 (though a higher MB/s rate).
Charts of disk I/O utilization and rate are at 
 http://postimg.org/image/628h6jmn3/ ... the solid 100% span is the EXPLAIN 
 statement, ending at 6:13:30pm, followed by the INSERT statement ending at 
 6:32pm. Metrics are collected by New Relic; their definition of I/O 
 utilization is at https://discuss.newrelic.com/t/disk-i-o-metrics/2900
Here's the EXPLAIN statement:
LOG:  duration: 3928778.823 ms  statement: EXPLAIN INSERT INTO f_foo 
 SELECT
   t_foo.fk_d1,
   t_foo.fk_d2,
   t_foo.fk_d3,
   t_foo.fk_d4,
   t_foo.fk_d5,
   t_foo.fk_d6,
   t_foo.value
FROM t_foo
WHERE NOT (EXISTS
 (SELECT *
  FROM f_foo
  WHERE f_foo.fk_d2 = t_foo.fk_d2
AND f_foo.fk_d3 = t_foo.fk_d3
AND f_foo.fk_d4 = t_foo.fk_d4
AND f_foo.fk_d5 = t_foo.fk_d5
AND f_foo.fk_d6 = t_foo.fk_d6
AND f_foo.fk_d1 = t_foo.fk_d1))
(where t_foo is a temp table previously populated using COPY, and the NOT 
 EXISTS subquery refers to the same table we are inserting into)
Here's the EXPLAIN output:
Insert on f_foo  (cost=8098210.50..9354519.69 rows=1 width=16)
  ->  Merge Anti Join  (cost=8098210.50..9354519.69 rows=1 width=16)
        Merge Cond: ((t_foo.fk_d2 = public.f_foo.fk_d2) AND
                     (t_foo.fk_d3 = public.f_foo.fk_d3) AND
                     (t_foo.fk_d4 = public.f_foo.fk_d4) AND
                     (t_foo.fk_d5 = public.f_foo.fk_d5) AND
                     (t_foo.fk_d6 = public.f_foo.fk_d6) AND
                     (t_foo.fk_d1 = public.f_foo.fk_d1))
        ->  Sort  (cost=3981372.25..4052850.70 rows=28591380 width=16)
              Sort Key: t_foo.fk_d2, t_foo.fk_d3, t_foo.fk_d4, t_foo.fk_d5,
                        t_foo.fk_d6, t_foo.fk_d1
              ->  Seq Scan on t_foo  (cost=0.00..440461.80 rows=28591380 width=16)
        ->  Sort  (cost=4116838.25..4188025.36 rows=28474842 width=12)
              Sort Key: public.f_foo.fk_d2, public.f_foo.fk_d3,
                        public.f_foo.fk_d4, public.f_foo.fk_d5,


Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT

2015-03-05 Thread Tomas Vondra
Hi,

On 5.3.2015 16:01, Gunnlaugur Thor Briem wrote:
 Hi, thanks for your follow-up questions.
 
 - postgres version is 9.1.13
 - the number of rows (in this latest instance) is 28,474,842
 - I've clustered and vacuum-full-ed and analyzed this table frequently,
 attempting to troubleshoot this. (Running vacuum full on the whole
 catalog seems a little excessive, and unlikely to help.)
 - no other processes are likely to be interfering; nothing other than
 PostgreSQL runs on this machine (except for normal OS processes and New
 Relic server monitoring service); concurrent activity in PostgreSQL is
 low-level and unrelated, and this effect is observed systematically
 whenever this kind of operation is performed on this table
 - no override for this table; the system default_statistics_target is
 100 (the default)
 - yes, an ANALYZE is performed on the temp table after the COPY and
 before the INSERT
 - no index on the temp table (but I'm scanning the whole thing anyway).
 There are indexes on f_foo as detailed in my original post, and I expect
 the PK to make the WHERE NOT EXISTS filtering efficient (as it filters
 on exactly all columns of the PK) ... but even if it didn't, I would
 expect that to only slow down the actual insert execution, not the EXPLAIN.

The only thing I can think of is some sort of memory exhaustion,
resulting in swapping out large amounts of memory. That'd explain the
I/O load. Can you run something like vmstat to see if this really is swap?

The fact that the plain INSERT does not behave this way contradicts that,
though, as the planner should be able to plan either both statements (plain
and EXPLAIN), or neither of them.

Can you prepare a self-contained test case? I.e. a script that
demonstrates the issue? I tried to reproduce the issue using the
information provided so far, but unsuccessfully :-(

Even if you can only reproduce the problem on another machine (e.g. because
the data has to stay internal), running it on a server with debug symbols
and seeing where most of the time is spent (e.g. using 'perf top') would be
useful.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PG 9.3 materialized view VS Views, indexes, shared memory

2015-03-05 Thread Jim Nasby

On 2/20/15 12:09 PM, Nicolas Paris wrote:

Well, it seems that the max query size for CREATE INDEX is 8160 characters in
my PostgreSQL 9.3 version.
Then the only solution I see is to add a new boolean field, huge_table.view1,
and change the predicate to WHERE view1 = 1.
But I may have 800 views... adding 800 new indexed fields to the huge
table is really not a good idea. Too bad.

Any idea how to get around that partial-index limitation?


If you have that many different views I doubt you want that many indexes 
anyway.


Have you tried just hitting the base table and indexes directly, either 
through plain views or just direct SQL?


Also, how frequently does data change in the huge table? This sounds 
like a case where the visibility map could make a huge difference.
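
A quick way to gauge that is to look at relallvisible (available in 9.2 and
later; 'huge_table' is a placeholder name):

```sql
-- Fraction of the table's pages that are all-visible according to the
-- visibility map; low values mean index-only scans won't help much
SELECT relname, relpages, relallvisible,
       round(100.0 * relallvisible / GREATEST(relpages, 1), 1) AS pct_all_visible
FROM pg_class
WHERE relname = 'huge_table';
```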


By the way, if all the Mat Views are in one schema that's already in the 
search path, a very easy way to test this would be to create an 
equivalent set of regular views in a different schema (which you can 
probably do programmatically via pg_get_viewdef()) and then change the 
search_path to put the new schema before the old.
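
A rough sketch of that test, with placeholder schema names ('plain' for the
new schema, 'mat' for the existing one):

```sql
CREATE SCHEMA plain;

-- Generate one CREATE VIEW statement per materialized view;
-- run the generated statements afterwards (or wrap this in a DO block)
SELECT format('CREATE VIEW plain.%I AS %s',
              matviewname,
              pg_get_viewdef(format('%I.%I', schemaname, matviewname)::regclass))
FROM pg_matviews
WHERE schemaname = 'mat';

-- Then make the plain views shadow the matviews for this session
SET search_path = plain, mat, public;
```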

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT

2015-03-05 Thread Tom Lane
Tomas Vondra tomas.von...@2ndquadrant.com writes:
 On 5.3.2015 16:01, Gunnlaugur Thor Briem wrote:
 - postgres version is 9.1.13

 The only thing I can think of is some sort of memory exhaustion,
 resulting in swapping out large amounts of memory.

I'm wondering about the issue addressed by commit fccebe421 (Use
SnapshotDirty rather than an active snapshot to probe index endpoints).
Now, that was allegedly fixed in 9.1.13 ... but if the OP were confused
and this server were running, say, 9.1.12, that could be a viable
explanation.  Another possibly viable explanation for seeing the issue
in 9.1.13 would be if I fat-fingered the back-patch somehow :-(.

In any case, I concur with your advice:

 Even if you can only reproduce the problem on another machine (e.g. because
 the data has to stay internal), running it on a server with debug symbols
 and seeing where most of the time is spent (e.g. using 'perf top') would be
 useful.

Without some more-precise idea of where the time is going, we're really
just guessing as to the cause.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] slow server : s_lock and _bt_checkkeys on perf top

2015-03-05 Thread Jim Nasby

On 3/4/15 1:31 PM, Soni M wrote:

Hello All,

Master db size 1.5 TB
All postgres 9.1.13 installed from RHEL package.
It has streaming replica and slony replica to another servers.

Server performance is slower than usual; before that, a big query got
cancelled, and then performance got slow.

No sign of IO wait.

In sar, %user and %system dominate the CPU usage:

01:25:04 PM  CPU  %user  %nice  %system  %iowait  %steal  %idle
Average:     all  51.91   0.00    12.03     0.66    0.00  35.39

In perf top, I see
  18.93%  postgres  [.] s_lock
  10.72%  postgres  [.] _bt_checkkeys
almost always at the top.


_bt_checkkeys is the function that compares a row in a btree index to a 
condition. s_lock is a spinlock; the high CPU usage in there indicates 
there's heavy lock contention somewhere.
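
Spinlock contention itself does not show up in the statistics views, but
heavyweight-lock waiters do, which can help narrow things down. A sketch for
9.1 (these column names changed in 9.2):

```sql
-- Sessions currently waiting on a lock, and what they are running
SELECT procpid, waiting, current_query
FROM pg_stat_activity
WHERE waiting;

-- The ungranted lock requests they are blocked on
SELECT locktype, relation::regclass, mode, pid
FROM pg_locks
WHERE NOT granted;
```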


Is there one PG process that's clearly using more CPU than the others? 
What else is running in the database? Are there any unusual data types 
involved?

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT

2015-03-05 Thread Vladimir Sitnikov
What could cause this? Note that there is no ANALYZE.

Can you capture a pstack and/or perf report while the EXPLAIN hangs?
I think that would shed light on what PostgreSQL is doing.

Vladimir


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Reverse Key Index

2015-03-05 Thread Sven R. Kunze

On 26.02.2015 13:37, Heikki Linnakangas wrote:

On 02/26/2015 12:31 AM, Josh Berkus wrote:

On 02/14/2015 10:35 AM, Sven R. Kunze wrote:

Thanks for the immediate reply.

I understand the use case is quite limited.

On the other hand, I see potential when it comes to applications which
use PostgreSQL. There, programmers would have to change a lot of code to
tweak existing (and more importantly working) queries to hash/reverse an
id column first. Using ORMs would make this change even more painful and
maybe even impossible.

When reading
https://richardfoote.wordpress.com/2008/01/14/introduction-to-reverse-key-indexes-part-i/
carefully, it seems reverse key indexes can also partially support index
scans, in the case of equality comparisons.

Seems like a good use for SP-GiST.  Go for it!


A b-tree opclass that just compares from right-to-left would work just
as well, and perform better.

- Heikki




Thanks for the hint. That also sounds easy to implement.
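
Short of writing an opclass, a commonly used approximation is an expression
index on the reversed key; this only helps equality lookups, and the table
and column names here are placeholders:

```sql
-- reverse() is available since 9.1; for non-text keys, cast to text first.
-- Like a reverse key index, this scatters sequential key values across
-- the index, spreading out insert hotspots.
CREATE INDEX foo_id_rev_idx ON foo (reverse(id));

-- Equality lookups can use the index if the constant is reversed too
SELECT * FROM foo WHERE reverse(id) = reverse('abc123');
```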

Regards,

--
Sven R. Kunze
TBZ-PARIV GmbH, Bernsdorfer Str. 210-212, 09126 Chemnitz
Tel: +49 (0)371 33714721, Fax: +49 (0)371 5347920
e-mail: srku...@tbz-pariv.de
web: www.tbz-pariv.de

Managing director (Geschäftsführer): Dr. Reiner Wohlgemuth
Registered office: Chemnitz
Commercial register: Chemnitz HRB 8543



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT

2015-03-05 Thread Gunnlaugur Thor Briem
Hi, thanks for your follow-up questions.

- postgres version is 9.1.13
- the number of rows (in this latest instance) is 28,474,842
- I've clustered and vacuum-full-ed and analyzed this table frequently,
attempting to troubleshoot this. (Running vacuum full on the whole catalog
seems a little excessive, and unlikely to help.)
- no other processes are likely to be interfering; nothing other than
PostgreSQL runs on this machine (except for normal OS processes and New
Relic server monitoring service); concurrent activity in PostgreSQL is
low-level and unrelated, and this effect is observed systematically
whenever this kind of operation is performed on this table
- no override for this table; the system default_statistics_target is 100
(the default)
- yes, an ANALYZE is performed on the temp table after the COPY and before
the INSERT
- no index on the temp table (but I'm scanning the whole thing anyway).
There are indexes on f_foo as detailed in my original post, and I expect
the PK to make the WHERE NOT EXISTS filtering efficient (as it filters on
exactly all columns of the PK) ... but even if it didn't, I would expect
that to only slow down the actual insert execution, not the EXPLAIN.
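
The sequence described in these answers, reconstructed as a sketch (the
column list is taken from the \d f_foo output quoted below; the COPY source
path is a placeholder):

```sql
CREATE TEMP TABLE t_foo (
    fk_d1 smallint, fk_d2 smallint, fk_d3 smallint,
    fk_d4 smallint, fk_d5 smallint, fk_d6 smallint,
    value integer
);

COPY t_foo FROM '/path/to/staging_data.csv' WITH (FORMAT csv);

ANALYZE t_foo;  -- stats are fresh before planning

-- This EXPLAIN (no ANALYZE) is the statement that takes over an hour
EXPLAIN INSERT INTO f_foo
SELECT fk_d1, fk_d2, fk_d3, fk_d4, fk_d5, fk_d6, value
FROM t_foo
WHERE NOT EXISTS
  (SELECT 1 FROM f_foo
   WHERE f_foo.fk_d1 = t_foo.fk_d1 AND f_foo.fk_d2 = t_foo.fk_d2
     AND f_foo.fk_d3 = t_foo.fk_d3 AND f_foo.fk_d4 = t_foo.fk_d4
     AND f_foo.fk_d5 = t_foo.fk_d5 AND f_foo.fk_d6 = t_foo.fk_d6);
```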

Cheers,

Gulli


On Wed, Mar 4, 2015 at 8:10 PM, Marc Mamin m.ma...@intershop.de wrote:

  Hi,
 we are seeing cases of EXPLAIN INSERT INTO foo SELECT ... taking over an
 hour, with disk I/O utilization (percent of time device is busy) at 100%
 the whole time, although I/O bandwidth is not saturated. This is on
 PostgreSQL 9.1.13.
 What could cause this? Note that there is no ANALYZE. Is it possible that
 the SELECT is actually executed, in planning the INSERT?
 When executing the INSERT itself (not EXPLAIN) immediately afterwards,
 that logs a temporary file message, but the EXPLAIN invocation does not
 (though the disk I/O suggests that a large on-disk sort is occurring):
 LOG:  temporary file: path base/pgsql_tmp/pgsql_tmp6016.0, size
 744103936
 STATEMENT:  INSERT INTO f_foo SELECT
 [...]
 During that actual execution, there's a lower disk I/O utilization
 (though a higher MB/s rate).
 Charts of disk I/O utilization and rate are at
 http://postimg.org/image/628h6jmn3/ ... the solid 100% span is the
 EXPLAIN statement, ending at 6:13:30pm, followed by the INSERT statement
 ending at 6:32pm. Metrics are collected by New Relic; their definition of
 I/O utilization is at https://discuss.newrelic.com/t/disk-i-o-metrics/2900
 Here's the EXPLAIN statement:
 LOG:  duration: 3928778.823 ms  statement: EXPLAIN INSERT INTO f_foo
 SELECT
t_foo.fk_d1,
t_foo.fk_d2,
t_foo.fk_d3,
t_foo.fk_d4,
t_foo.fk_d5,
t_foo.fk_d6,
t_foo.value
 FROM t_foo
 WHERE NOT (EXISTS
  (SELECT *
   FROM f_foo
   WHERE f_foo.fk_d2 = t_foo.fk_d2
 AND f_foo.fk_d3 = t_foo.fk_d3
 AND f_foo.fk_d4 = t_foo.fk_d4
 AND f_foo.fk_d5 = t_foo.fk_d5
 AND f_foo.fk_d6 = t_foo.fk_d6
 AND f_foo.fk_d1 = t_foo.fk_d1))
 (where t_foo is a temp table previously populated using COPY, and the NOT
 EXISTS subquery refers to the same table we are inserting into)
 Here's the EXPLAIN output:
 Insert on f_foo  (cost=8098210.50..9354519.69 rows=1 width=16)
   ->  Merge Anti Join  (cost=8098210.50..9354519.69 rows=1 width=16)
         Merge Cond: ((t_foo.fk_d2 = public.f_foo.fk_d2) AND
                      (t_foo.fk_d3 = public.f_foo.fk_d3) AND
                      (t_foo.fk_d4 = public.f_foo.fk_d4) AND
                      (t_foo.fk_d5 = public.f_foo.fk_d5) AND
                      (t_foo.fk_d6 = public.f_foo.fk_d6) AND
                      (t_foo.fk_d1 = public.f_foo.fk_d1))
         ->  Sort  (cost=3981372.25..4052850.70 rows=28591380 width=16)
               Sort Key: t_foo.fk_d2, t_foo.fk_d3, t_foo.fk_d4, t_foo.fk_d5,
                         t_foo.fk_d6, t_foo.fk_d1
               ->  Seq Scan on t_foo  (cost=0.00..440461.80 rows=28591380 width=16)
         ->  Sort  (cost=4116838.25..4188025.36 rows=28474842 width=12)
               Sort Key: public.f_foo.fk_d2, public.f_foo.fk_d3,
                         public.f_foo.fk_d4, public.f_foo.fk_d5,
                         public.f_foo.fk_d6, public.f_foo.fk_d1
               ->  Seq Scan on f_foo  (cost=0.00..591199.42 rows=28474842 width=12)
 The INSERT is indeed rather large (which is why we're issuing an EXPLAIN
 ahead of it to log the plan). So its long execution time is expected. But I
 want to understand why the EXPLAIN takes even longer.
 The table looks like this:
 \d f_foo
      Table "public.f_foo"
  Column |   Type   | Modifiers
 --------+----------+-----------
  fk_d1  | smallint | not null
  fk_d2  | smallint | not null
  fk_d3  | smallint | not null
  fk_d4  | smallint | not null
  fk_d5  | smallint | not null
  fk_d6  | smallint | not null
  value  | integer  |
 Indexes: