Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Hi,

> thanks for your follow-up questions.
> - postgres version is 9.1.13
> - the number of rows (in this latest instance) is 28,474,842
> - I've clustered and vacuum-full-ed and analyzed this table frequently, attempting to troubleshoot this. (Running vacuum full on the whole catalog seems a little excessive, and unlikely to help.)

Hi,
I mean the pg_* tables. When working with temp objects and a high number of classes, regular vacuum may not clean them efficiently. It is not a bad idea to run a vacuum full verbose manually on the largest of those from time to time, to verify that they don't grow out of control. This normally requires only a few seconds. The verbose output of vacuum full sometimes returns interesting information...

For the ANALYZE performance, I guess that these are the most relevant ones: pg_statistic, pg_class, pg_attribute, pg_index, pg_constraint.

regards,
Marc Mamin

> - no other processes are likely to be interfering; nothing other than PostgreSQL runs on this machine (except for normal OS processes and the New Relic server monitoring service); concurrent activity in PostgreSQL is low-level and unrelated, and this effect is observed systematically whenever this kind of operation is performed on this table
> - no override for this table; the system default_statistics_target is 100 (the default)
> - yes, an ANALYZE is performed on the temp table after the COPY and before the INSERT
> - no index on the temp table (but I'm scanning the whole thing anyway). There are indexes on f_foo as detailed in my original post, and I expect the PK to make the WHERE NOT EXISTS filtering efficient (as it filters on exactly all columns of the PK) ... but even if it didn't, I would expect that to only slow down the actual insert execution, not the EXPLAIN.
>
> Cheers,
> Gulli
>
> On Wed, Mar 4, 2015 at 8:10 PM, Marc Mamin m.ma...@intershop.de wrote:
>> Hi, we are seeing cases of EXPLAIN INSERT INTO foo SELECT ... taking over an hour, with disk I/O utilization (percent of time the device is busy) at 100% the whole time, although I/O bandwidth is not saturated. This is on PostgreSQL 9.1.13. What could cause this? Note that there is no ANALYZE. Is it possible that the SELECT is actually executed in planning the INSERT?
>>
>> When executing the INSERT itself (not EXPLAIN) immediately afterwards, that logs a temporary file message, but the EXPLAIN invocation does not (though the disk I/O suggests that a large on-disk sort is occurring):
>>
>> LOG: temporary file: path base/pgsql_tmp/pgsql_tmp6016.0, size 744103936
>> STATEMENT: INSERT INTO f_foo SELECT [...]
>>
>> During that actual execution, there's lower disk I/O utilization (though a higher MB/s rate). Charts of disk I/O utilization and rate are at http://postimg.org/image/628h6jmn3/ ... the solid 100% span is the EXPLAIN statement, ending at 6:13:30pm, followed by the INSERT statement ending at 6:32pm. Metrics are collected by New Relic; their definition of I/O utilization is at https://discuss.newrelic.com/t/disk-i-o-metrics/2900
>>
>> Here's the EXPLAIN statement:
>>
>> LOG: duration: 3928778.823 ms statement:
>> EXPLAIN INSERT INTO f_foo
>> SELECT t_foo.fk_d1, t_foo.fk_d2, t_foo.fk_d3, t_foo.fk_d4, t_foo.fk_d5, t_foo.fk_d6, t_foo.value
>> FROM t_foo
>> WHERE NOT (EXISTS (SELECT * FROM f_foo WHERE f_foo.fk_d2 = t_foo.fk_d2 AND f_foo.fk_d3 = t_foo.fk_d3 AND f_foo.fk_d4 = t_foo.fk_d4 AND f_foo.fk_d5 = t_foo.fk_d5 AND f_foo.fk_d6 = t_foo.fk_d6 AND f_foo.fk_d1 = t_foo.fk_d1))
>>
>> (where t_foo is a temp table previously populated using COPY, and the NOT EXISTS subquery refers to the same table we are inserting into)
>>
>> Here's the EXPLAIN output:
>>
>> Insert on f_foo  (cost=8098210.50..9354519.69 rows=1 width=16)
>>   ->  Merge Anti Join  (cost=8098210.50..9354519.69 rows=1 width=16)
>>         Merge Cond: ((t_foo.fk_d2 = public.f_foo.fk_d2) AND (t_foo.fk_d3 = public.f_foo.fk_d3) AND (t_foo.fk_d4 = public.f_foo.fk_d4) AND (t_foo.fk_d5 = public.f_foo.fk_d5) AND (t_foo.fk_d6 = public.f_foo.fk_d6) AND (t_foo.fk_d1 = public.f_foo.fk_d1))
>>         ->  Sort  (cost=3981372.25..4052850.70 rows=28591380 width=16)
>>               Sort Key: t_foo.fk_d2, t_foo.fk_d3, t_foo.fk_d4, t_foo.fk_d5, t_foo.fk_d6, t_foo.fk_d1
>>               ->  Seq Scan on t_foo  (cost=0.00..440461.80 rows=28591380 width=16)
>>         ->  Sort  (cost=4116838.25..4188025.36 rows=28474842 width=12)
>>               Sort Key: public.f_foo.fk_d2, public.f_foo.fk_d3, public.f_foo.fk_d4, public.f_foo.fk_d5,
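Marc's advice to keep an eye on the largest catalog tables can be sketched as a query; this is not from the thread, just a generic sketch against the standard system catalogs (verify the names on your version before relying on it):

```sql
-- Sketch: list the ten largest pg_catalog tables by total size,
-- to decide which ones are worth a manual VACUUM FULL VERBOSE.
SELECT c.relname,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'pg_catalog'
  AND c.relkind = 'r'
ORDER BY pg_total_relation_size(c.oid) DESC
LIMIT 10;

-- Then, for the bloated ones, e.g.:
VACUUM FULL VERBOSE pg_statistic;
```

Heavy temp-table churn mostly inflates pg_class, pg_attribute and pg_statistic, so those usually top the list.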
Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Hi,

On 5.3.2015 16:01, Gunnlaugur Thor Briem wrote:
> Hi, thanks for your follow-up questions.
> - postgres version is 9.1.13
> - the number of rows (in this latest instance) is 28,474,842
> - I've clustered and vacuum-full-ed and analyzed this table frequently, attempting to troubleshoot this. (Running vacuum full on the whole catalog seems a little excessive, and unlikely to help.)
> - no other processes are likely to be interfering; nothing other than PostgreSQL runs on this machine (except for normal OS processes and the New Relic server monitoring service); concurrent activity in PostgreSQL is low-level and unrelated, and this effect is observed systematically whenever this kind of operation is performed on this table
> - no override for this table; the system default_statistics_target is 100 (the default)
> - yes, an ANALYZE is performed on the temp table after the COPY and before the INSERT
> - no index on the temp table (but I'm scanning the whole thing anyway). There are indexes on f_foo as detailed in my original post, and I expect the PK to make the WHERE NOT EXISTS filtering efficient (as it filters on exactly all columns of the PK) ... but even if it didn't, I would expect that to only slow down the actual insert execution, not the EXPLAIN.

The only thing I can think of is some sort of memory exhaustion, resulting in swapping out large amounts of memory. That'd explain the I/O load. Can you run something like vmstat to see if this really is swap? The fact that a plain INSERT does not do that contradicts this, though, as the server should be able to plan either both queries (plain and EXPLAIN) or neither of them.

Can you prepare a self-contained test case, i.e. a script that demonstrates the issue? I tried to reproduce the issue using the information provided so far, but unsuccessfully :-(

Even if you could reproduce the problem only on another machine (because of keeping the data internal), running it on a server with debug symbols and seeing where most of the time is spent (e.g. using 'perf top') would be useful.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
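A self-contained test case of the kind Tomas asks for might start from a sketch like this. The schema is taken from the thread, but the PK column order, the key distributions, and the use of generate_series instead of COPY are guesses:

```sql
-- Hypothetical reproduction script; scale the row count to taste.
CREATE TABLE f_foo (
    fk_d1 smallint NOT NULL, fk_d2 smallint NOT NULL,
    fk_d3 smallint NOT NULL, fk_d4 smallint NOT NULL,
    fk_d5 smallint NOT NULL, fk_d6 smallint NOT NULL,
    value integer,
    PRIMARY KEY (fk_d2, fk_d3, fk_d4, fk_d5, fk_d6, fk_d1)  -- order assumed
);

CREATE TEMP TABLE t_foo (LIKE f_foo INCLUDING DEFAULTS);

-- The thread uses COPY; random keys via generate_series stand in here.
INSERT INTO t_foo
SELECT (random()*100)::smallint, (random()*100)::smallint,
       (random()*100)::smallint, (random()*100)::smallint,
       (random()*100)::smallint, (random()*100)::smallint, i
FROM generate_series(1, 28000000) AS s(i);

-- In the reported case f_foo already held ~28.5M rows; seed it from a
-- first batch so the anti-join has something to merge against:
INSERT INTO f_foo
SELECT DISTINCT ON (fk_d2, fk_d3, fk_d4, fk_d5, fk_d6, fk_d1) * FROM t_foo;

ANALYZE t_foo;

\timing on
EXPLAIN INSERT INTO f_foo
SELECT * FROM t_foo
WHERE NOT EXISTS (
    SELECT 1 FROM f_foo
    WHERE f_foo.fk_d1 = t_foo.fk_d1 AND f_foo.fk_d2 = t_foo.fk_d2
      AND f_foo.fk_d3 = t_foo.fk_d3 AND f_foo.fk_d4 = t_foo.fk_d4
      AND f_foo.fk_d5 = t_foo.fk_d5 AND f_foo.fk_d6 = t_foo.fk_d6
);
```

If the EXPLAIN alone drives sustained disk I/O on such a script, the problem reproduces; if not, the trigger is likely in the real data or catalog state.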
Re: [PERFORM] PG 9.3 materialized view VS Views, indexes, shared memory
On 2/20/15 12:09 PM, Nicolas Paris wrote:
> Well it seems that the max query size for CREATE INDEX is 8160 characters in my 9.3 postgresql version. Then the only solution I see is to add a new boolean field, huge_table.view1, and change the predicate to WHERE view1=1. But I may have 800 views... adding 800 new indexed fields to the huge table is actually not a good idea. Too bad. Any idea to solve that partial view limitation?

If you have that many different views I doubt you want that many indexes anyway. Have you tried just hitting the base table and indexes directly, either through plain views or just direct SQL?

Also, how frequently does data change in the huge table? This sounds like a case where the visibility map could make a huge difference.

By the way, if all the Mat Views are in one schema that's already in the search path, a very easy way to test this would be to create an equivalent set of regular views in a different schema (which you can probably do programmatically via pg_get_viewdef()) and then change the search_path to put the new schema before the old.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
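Jim's pg_get_viewdef() suggestion might be scripted roughly as follows; the schema names here ("matviews", "plainviews") are made up for illustration, and the generated DDL should be reviewed before running:

```sql
-- Sketch: emit CREATE VIEW statements that mirror each materialized
-- view in schema "matviews" as a plain view in schema "plainviews".
SELECT format('CREATE VIEW plainviews.%I AS %s',
              c.relname,
              pg_get_viewdef(c.oid))
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'matviews'
  AND c.relkind = 'm';   -- 'm' = materialized view

-- After executing the generated DDL, prefer the plain views:
SET search_path = plainviews, matviews, public;
```

Queries written against the unqualified view names then resolve to the plain views, letting you A/B the two approaches without touching application SQL.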
Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Tomas Vondra tomas.von...@2ndquadrant.com writes:
> On 5.3.2015 16:01, Gunnlaugur Thor Briem wrote:
>> - postgres version is 9.1.13
> The only thing I can think of is some sort of memory exhaustion, resulting in swapping out large amounts of memory.

I'm wondering about the issue addressed by commit fccebe421 (Use SnapshotDirty rather than an active snapshot to probe index endpoints). Now, that was allegedly fixed in 9.1.13 ... but if the OP were confused and this server were running, say, 9.1.12, that could be a viable explanation. Another possibly viable explanation for seeing the issue in 9.1.13 would be if I fat-fingered the back-patch somehow :-(.

In any case, I concur with your advice:
> Even if you could reproduce the problem only on another machine (because of keeping the data internal), running it on a server with debug symbols and seeing where most of the time is spent (e.g. using 'perf top') would be useful.
Without some more-precise idea of where the time is going, we're really just guessing as to the cause.

regards, tom lane
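Tom's 9.1.12-vs-9.1.13 doubt is cheap to rule out from the affected session itself; these are standard queries, nothing thread-specific:

```sql
-- Confirm the exact server version this session is connected to
-- (a stale connection pool can hide a minor-version mismatch).
SELECT version();
SHOW server_version;
```

If this reports 9.1.12 or earlier, the fccebe421 index-endpoint-probe fix is absent and would be a plausible explanation.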
Re: [PERFORM] slow server : s_lock and _bt_checkkeys on perf top
On 3/4/15 1:31 PM, Soni M wrote:
> Hello All. The master db size is 1.5 TB, all postgres 9.1.13 installed from the RHEL package. It has a streaming replica and a slony replica to other servers. Server performance is slower than usual; before that, a big query got cancelled, and then performance got slow. No sign of IO wait.
>
> On sar, %user and %system dominate the cpu usage:
> 01:25:04 PM  CPU  %user  %nice  %system  %iowait  %steal  %idle
> Average:     all  51.91   0.00    12.03     0.66    0.00  35.39
>
> On perf top, I saw these almost always at the top:
> 18.93%  postgres  [.] s_lock
> 10.72%  postgres  [.] _bt_checkkeys

_bt_checkkeys is the function that compares a row in a btree index to a condition. s_lock is a spinlock; the high CPU usage in there indicates there's heavy lock contention somewhere.

Is there one PG process that's clearly using more CPU than the others? What else is running in the database? Are there any unusual data types involved?

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
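Jim's first question ("is one PG process clearly using more CPU?") can be answered by mapping the busy OS pid from top/perf back to its query. A sketch for 9.1, where the process-id column is still called procpid (renamed to pid in 9.2) and the query column is current_query:

```sql
-- Look up what a CPU-hungry backend (pid taken from `top`) is running.
-- 12345 is a hypothetical pid, not one from the thread.
SELECT procpid, usename, waiting, query_start, current_query
FROM pg_stat_activity
WHERE procpid = 12345;
```

If the same backend and same query keep showing up, its plan (and the indexes _bt_checkkeys is grinding through) is the place to look next.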
Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
> What could cause this? Note that there is no ANALYZE.

Can you capture pstack and/or a perf report while the EXPLAIN hangs? I think it should shed light on what PostgreSQL is busy doing.

Vladimir
Re: [PERFORM] Reverse Key Index
On 26.02.2015 13:37, Heikki Linnakangas wrote:
> On 02/26/2015 12:31 AM, Josh Berkus wrote:
>> On 02/14/2015 10:35 AM, Sven R. Kunze wrote:
>>> Thanks for the immediate reply. I understand the use case is quite limited. On the other hand, I see potential when it comes to applications which use PostgreSQL. There, programmers would have to change a lot of code to tweak existing (and more importantly working) queries to hash/reverse an id column first. Using ORMs would make this change even more painful and maybe even impossible. When reading https://richardfoote.wordpress.com/2008/01/14/introduction-to-reverse-key-indexes-part-i/ carefully, it also seems to work with index scans partially, in the case of equality comparisons.
>> Seems like a good use for SP-GiST. Go for it!
> A b-tree opclass that just compares from right to left would work just as well, and perform better. - Heikki

Thanks for the hint. That also sounds easy to implement.

Regards,

--
Sven R. Kunze
TBZ-PARIV GmbH, Bernsdorfer Str. 210-212, 09126 Chemnitz
Tel: +49 (0)371 33714721, Fax: +49 (0)371 5347920
e-mail: srku...@tbz-pariv.de
web: www.tbz-pariv.de
Geschäftsführer: Dr. Reiner Wohlgemuth
Sitz der Gesellschaft: Chemnitz
Registergericht: Chemnitz HRB 8543
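Short of writing the custom opclass Heikki describes, the reverse-key effect can be approximated today with an expression index; this is a generic sketch (table and index names made up), not something proposed in the thread:

```sql
-- Approximate a reverse-key index with an expression index on reverse():
-- like Oracle's reverse key indexes, it spreads inserts of sequential
-- keys across the index, at the cost of losing range scans on the key.
CREATE TABLE events (id text PRIMARY KEY, payload text);
CREATE INDEX events_id_rev_idx ON events (reverse(id));

-- An equality probe that can use the reversed index:
SELECT * FROM events WHERE reverse(id) = reverse('abc123');
```

Application queries must be rewritten to compare reverse(id), which is exactly the ORM pain point Sven raises; a right-to-left opclass would avoid that rewrite.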
Re: [PERFORM] EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Hi, thanks for your follow-up questions.

- postgres version is 9.1.13
- the number of rows (in this latest instance) is 28,474,842
- I've clustered and vacuum-full-ed and analyzed this table frequently, attempting to troubleshoot this. (Running vacuum full on the whole catalog seems a little excessive, and unlikely to help.)
- no other processes are likely to be interfering; nothing other than PostgreSQL runs on this machine (except for normal OS processes and the New Relic server monitoring service); concurrent activity in PostgreSQL is low-level and unrelated, and this effect is observed systematically whenever this kind of operation is performed on this table
- no override for this table; the system default_statistics_target is 100 (the default)
- yes, an ANALYZE is performed on the temp table after the COPY and before the INSERT
- no index on the temp table (but I'm scanning the whole thing anyway). There are indexes on f_foo as detailed in my original post, and I expect the PK to make the WHERE NOT EXISTS filtering efficient (as it filters on exactly all columns of the PK) ... but even if it didn't, I would expect that to only slow down the actual insert execution, not the EXPLAIN.

Cheers,
Gulli

On Wed, Mar 4, 2015 at 8:10 PM, Marc Mamin m.ma...@intershop.de wrote:
> Hi, we are seeing cases of EXPLAIN INSERT INTO foo SELECT ... taking over an hour, with disk I/O utilization (percent of time the device is busy) at 100% the whole time, although I/O bandwidth is not saturated. This is on PostgreSQL 9.1.13. What could cause this? Note that there is no ANALYZE. Is it possible that the SELECT is actually executed in planning the INSERT?
>
> When executing the INSERT itself (not EXPLAIN) immediately afterwards, that logs a temporary file message, but the EXPLAIN invocation does not (though the disk I/O suggests that a large on-disk sort is occurring):
>
> LOG: temporary file: path base/pgsql_tmp/pgsql_tmp6016.0, size 744103936
> STATEMENT: INSERT INTO f_foo SELECT [...]
>
> During that actual execution, there's lower disk I/O utilization (though a higher MB/s rate). Charts of disk I/O utilization and rate are at http://postimg.org/image/628h6jmn3/ ... the solid 100% span is the EXPLAIN statement, ending at 6:13:30pm, followed by the INSERT statement ending at 6:32pm. Metrics are collected by New Relic; their definition of I/O utilization is at https://discuss.newrelic.com/t/disk-i-o-metrics/2900
>
> Here's the EXPLAIN statement:
>
> LOG: duration: 3928778.823 ms statement:
> EXPLAIN INSERT INTO f_foo
> SELECT t_foo.fk_d1, t_foo.fk_d2, t_foo.fk_d3, t_foo.fk_d4, t_foo.fk_d5, t_foo.fk_d6, t_foo.value
> FROM t_foo
> WHERE NOT (EXISTS (SELECT * FROM f_foo WHERE f_foo.fk_d2 = t_foo.fk_d2 AND f_foo.fk_d3 = t_foo.fk_d3 AND f_foo.fk_d4 = t_foo.fk_d4 AND f_foo.fk_d5 = t_foo.fk_d5 AND f_foo.fk_d6 = t_foo.fk_d6 AND f_foo.fk_d1 = t_foo.fk_d1))
>
> (where t_foo is a temp table previously populated using COPY, and the NOT EXISTS subquery refers to the same table we are inserting into)
>
> Here's the EXPLAIN output:
>
> Insert on f_foo  (cost=8098210.50..9354519.69 rows=1 width=16)
>   ->  Merge Anti Join  (cost=8098210.50..9354519.69 rows=1 width=16)
>         Merge Cond: ((t_foo.fk_d2 = public.f_foo.fk_d2) AND (t_foo.fk_d3 = public.f_foo.fk_d3) AND (t_foo.fk_d4 = public.f_foo.fk_d4) AND (t_foo.fk_d5 = public.f_foo.fk_d5) AND (t_foo.fk_d6 = public.f_foo.fk_d6) AND (t_foo.fk_d1 = public.f_foo.fk_d1))
>         ->  Sort  (cost=3981372.25..4052850.70 rows=28591380 width=16)
>               Sort Key: t_foo.fk_d2, t_foo.fk_d3, t_foo.fk_d4, t_foo.fk_d5, t_foo.fk_d6, t_foo.fk_d1
>               ->  Seq Scan on t_foo  (cost=0.00..440461.80 rows=28591380 width=16)
>         ->  Sort  (cost=4116838.25..4188025.36 rows=28474842 width=12)
>               Sort Key: public.f_foo.fk_d2, public.f_foo.fk_d3, public.f_foo.fk_d4, public.f_foo.fk_d5, public.f_foo.fk_d6, public.f_foo.fk_d1
>               ->  Seq Scan on f_foo  (cost=0.00..591199.42 rows=28474842 width=12)
>
> The INSERT is indeed rather large (which is why we're issuing an EXPLAIN ahead of it to log the plan). So its long execution time is expected. But I want to understand why the EXPLAIN takes even longer.
>
> The table looks like this:
>
> \d f_foo
>       Table "public.f_foo"
>  Column |   Type   | Modifiers
> --------+----------+-----------
>  fk_d1  | smallint | not null
>  fk_d2  | smallint | not null
>  fk_d3  | smallint | not null
>  fk_d4  | smallint | not null
>  fk_d5  | smallint | not null
>  fk_d6  | smallint | not null
>  value  | integer  |
> Indexes: