Re: [PERFORM] High CPU usage after partitioning

2013-01-22 Thread rudi
On Tue, Jan 22, 2013 at 1:41 AM, Andrew Dunstan and...@dunslane.net wrote:


 On 01/21/2013 10:05 AM, rudi wrote:

 Hello,

 I'm running postgresl 9.0. After partitioning a big table, CPU usage
 raised from average 5-10% to average 70-80%.

 - the table contains about 20.000.000 rows
 - partitions are selected using a trigger, based on an indexed field, a
 date (IF date_taken = x AND date_taken  y)
 - I created 5 partitions, the 2012 one now contains most of the rows. The
 2013 partition is the live partition, mostly insert, a few select based
 on the above indexed field. The 2013, 2014, 2015 partitions are empty
 - constraint execution is on.
 I have 2 weeks CPU usage reports and the pattern definately changed after
 I made the partitions. Any idea?



 Well, the first question that comes to my mind is whether it's the inserts
 that are causing the load or the reads. If it's the inserts then you should
 show us the whole trigger. Does it by any chance use 'execute'?


I think I found the culprit. The insert trigger doesn't seem to be an
issue. It is a trivial IF-ELSE and inserts seems fast.

IF (NEW.date_taken  DATE '2013-01-01') THEN
INSERT INTO sb_logs_2012 VALUES (NEW.*);
ELSIF (NEW.date_taken = DATE '2013-01-01' AND NEW.date_taken  DATE
'2014-01-01') THEN
INSERT INTO sb_logs_2013 VALUES (NEW.*);
[...]
END IF;

Every query has been carefully optimized, child tables are indexed. The
table(s) has a UNIQUE index on (date_taken, device_id) and date_taken
is the partitioning column (one partition per year).
There are few well known access path to this table: INSERTs (40-50.000 each
day), SELECTs on a specific device_id AND on a specific day.

BUT, I discovered an access path used by a process every few secs. to get
the last log for a given device, and this query became really slow after
partitioning:

Result  (cost=341156.04..341182.90 rows=4 width=86) (actual
time=1132.326..1132.329 rows=1 loops=1)
  InitPlan 1 (returns $0)
-  Aggregate  (cost=341156.03..341156.04 rows=1 width=8) (actual
time=1132.295..1132.296 rows=1 loops=1)
  -  Append  (cost=0.00..341112.60 rows=17371 width=8) (actual
time=45.600..1110.057 rows=19016 loops=1)
-  Seq Scan on sb_logs  (cost=0.00..0.00 rows=1 width=8)
(actual time=0.000..0.000 rows=0 loops=1)
  Filter: (device_id = 901)
-  Index Scan using
sb_logs_2012_on_date_taken_and_device_id on sb_logs_2012 sb_logs
 (cost=0.00..319430.51 rows=16003 width=8) (actual time=45.599..1060.143
rows=17817 loops=1)
  Index Cond: (device_id = 901)
-  Index Scan using
sb_logs_2013_on_date_taken_and_device_id on sb_logs_2013 sb_logs
 (cost=0.00..21663.39 rows=1363 width=8) (actual time=0.022..47.661
rows=1199 loops=1)
  Index Cond: (device_id = 901)
-  Bitmap Heap Scan on sb_logs_2014 sb_logs
 (cost=10.25..18.71 rows=4 width=8) (actual time=0.011..0.011 rows=0
loops=1)
  Recheck Cond: (device_id = 901)
  -  Bitmap Index Scan on
sb_logs_2014_on_date_taken_and_device_id  (cost=0.00..10.25 rows=4 width=0)
(actual time=0.008..0.008 rows=0 loops=1)
Index Cond: (device_id = 901)
  -  Append  (cost=0.00..26.86 rows=4 width=86) (actual
time=1132.325..1132.328 rows=1 loops=1)
-  Seq Scan on sb_logs  (cost=0.00..0.00 rows=1 width=90) (actual
time=0.002..0.002 rows=0 loops=1)
  Filter: ((device_id = 901) AND (date_taken = $0))
-  Index Scan using sb_logs_2012_on_date_taken_and_device_id on
sb_logs_2012 sb_logs  (cost=0.00..10.20 rows=1 width=90) (actual
time=1132.314..1132.314 rows=0 loops=1)
  Index Cond: ((date_taken = $0) AND (device_id = 901))
-  Index Scan using sb_logs_2013_on_date_taken_and_device_id on
sb_logs_2013 sb_logs  (cost=0.00..8.39 rows=1 width=91) (actual
time=0.007..0.008 rows=1 loops=1)
  Index Cond: ((date_taken = $0) AND (device_id = 901))
-  Index Scan using sb_logs_2014_on_date_taken_and_device_id on
sb_logs_2014 sb_logs  (cost=0.00..8.27 rows=1 width=72) (actual
time=0.002..0.002 rows=0 loops=1)
  Index Cond: ((date_taken = $0) AND (device_id = 901))
Total runtime: 1132.436 ms

I must find a better way to get that information, but I wonder if there
could be a better plan. The same query over a table with the same structure
but not partitioned gives far better plan:

Index Scan using index_iv_logs_on_date_taken_and_device_id on iv_logs
 (cost=12.35..21.88 rows=1 width=157) (actual time=0.065..0.066 rows=1
loops=1)
  Index Cond: ((date_taken = $1) AND (device_id = 1475))
  InitPlan 2 (returns $1)
-  Result  (cost=12.34..12.35 rows=1 width=0) (actual
time=0.059..0.059 rows=1 loops=1)
  InitPlan 1 (returns $0)
-  Limit  (cost=0.00..12.34 rows=1 width=8) (actual
time=0.055..0.056 rows=1 loops=1)
  -  Index Scan Backward using

Re: [PERFORM] High CPU usage after partitioning

2013-01-22 Thread rudi
On Tue, Jan 22, 2013 at 3:04 PM, Merlin Moncure mmonc...@gmail.com wrote:

 let's see the query -- it's probably written in such a way so as to
 not be able to be optimized through CE.


The query is pretty simple and standard, the behaviour (and the plan) is
totally different when it comes to a partitioned table.

Partioned table query = explain analyze SELECT  sb_logs.* FROM sb_logs
 WHERE (device_id = 901 AND date_taken = (SELECT MAX(date_taken) FROM
sb_logs WHERE device_id = 901));
Plain table query = explain analyze SELECT  iv_logs.* FROM iv_logs
 WHERE (device_id = 1475 AND date_taken = (SELECT MAX(date_taken) FROM
iv_logs WHERE device_id = 1475));

sb_logs and iv_logs have identical index structure and similar cardinality
(about ~12.000.000 rows the first, ~9.000.000 rows the second).

sb_logs PLAN:
  InitPlan 1 (returns $0)
-  Aggregate  (cost=339424.47..339424.48 rows=1 width=8) (actual
time=597.742..597.742 rows=1 loops=1)
  -  Append  (cost=0.00..339381.68 rows=17114 width=8) (actual
time=42.791..594.001 rows=19024 loops=1)
-  Seq Scan on sb_logs  (cost=0.00..0.00 rows=1 width=8)
(actual time=0.000..0.000 rows=0 loops=1)
  Filter: (device_id = 901)
-  Index Scan using
sb_logs_2012_on_date_taken_and_device_id on sb_logs_2012 sb_logs
 (cost=0.00..319430.51 rows=16003 width=8) (actual time=42.789..559.165
rows=17817 loops=1)
  Index Cond: (device_id = 901)
-  Index Scan using
sb_logs_2013_on_date_taken_and_device_id on sb_logs_2013 sb_logs
 (cost=0.00..19932.46 rows=1106 width=8) (actual time=0.037..31.699
rows=1207 loops=1)
  Index Cond: (device_id = 901)
-  Bitmap Heap Scan on sb_logs_2014 sb_logs
 (cost=10.25..18.71 rows=4 width=8) (actual time=0.012..0.012 rows=0
loops=1)
  Recheck Cond: (device_id = 901)
  -  Bitmap Index Scan on
sb_logs_2014_on_date_taken_and_device_id  (cost=0.00..10.25 rows=4 width=0)
(actual time=0.010..0.010 rows=0 loops=1)
Index Cond: (device_id = 901)
  -  Append  (cost=0.00..26.86 rows=4 width=86) (actual
time=597.808..597.811 rows=1 loops=1)
-  Seq Scan on sb_logs  (cost=0.00..0.00 rows=1 width=90) (actual
time=0.022..0.022 rows=0 loops=1)
  Filter: ((device_id = 901) AND (date_taken = $0))
-  Index Scan using sb_logs_2012_on_date_taken_and_device_id on
sb_logs_2012 sb_logs  (cost=0.00..10.20 rows=1 width=90) (actual
time=597.773..597.773 rows=0 loops=1)
  Index Cond: ((date_taken = $0) AND (device_id = 901))
-  Index Scan using sb_logs_2013_on_date_taken_and_device_id on
sb_logs_2013 sb_logs  (cost=0.00..8.39 rows=1 width=91) (actual
time=0.011..0.011 rows=1 loops=1)
  Index Cond: ((date_taken = $0) AND (device_id = 901))
-  Index Scan using sb_logs_2014_on_date_taken_and_device_id on
sb_logs_2014 sb_logs  (cost=0.00..8.27 rows=1 width=72) (actual
time=0.003..0.003 rows=0 loops=1)
  Index Cond: ((date_taken = $0) AND (device_id = 901))
Total runtime: 598.049 ms

iv_logs PLAN:

Index Scan using index_iv_logs_on_date_taken_and_device_id on iv_logs
 (cost=12.35..21.88 rows=1 width=157) (actual time=0.060..0.060 rows=1
loops=1)
  Index Cond: ((date_taken = $1) AND (device_id = 1475))
  InitPlan 2 (returns $1)
-  Result  (cost=12.34..12.35 rows=1 width=0) (actual
time=0.053..0.053 rows=1 loops=1)
  InitPlan 1 (returns $0)
-  Limit  (cost=0.00..12.34 rows=1 width=8) (actual
time=0.050..0.051 rows=1 loops=1)
  -  Index Scan Backward using
index_iv_logs_on_date_taken_and_device_id on iv_logs  (cost=0.00..261151.32
rows=21163 width=8) (actual time=0.046..0.046 rows=1 loops=1)
Index Cond: ((date_taken IS NOT NULL) AND
(device_id = 1475))
Total runtime: 0.101 ms


-- 
rd

This is the way the world ends.
Not with a bang, but a whimper.


Re: [PERFORM] High CPU usage after partitioning

2013-01-22 Thread Andrew Dunstan


On 01/22/2013 09:21 AM, rudi wrote:
On Tue, Jan 22, 2013 at 3:04 PM, Merlin Moncure mmonc...@gmail.com 
mailto:mmonc...@gmail.com wrote:


let's see the query -- it's probably written in such a way so as to
not be able to be optimized through CE.


The query is pretty simple and standard, the behaviour (and the plan) 
is totally different when it comes to a partitioned table.


Partioned table query = explain analyze SELECT  sb_logs.* FROM 
sb_logs  WHERE (device_id = 901 AND date_taken = (SELECT 
MAX(date_taken) FROM sb_logs WHERE device_id = 901));




And there you have it. Constraint exclusion does not work in cases like 
this. It only works with static expressions (such as a literal date in 
this case).


cheers

andrew


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] High CPU usage after partitioning

2013-01-22 Thread Igor Neyman
In PG 9.2 I’m getting “Index Only Scan Backward” for every partition in the 
first part of execution plan, when looking for MAX in partitioned table on a 
similar query:

-  Index Only Scan Backward using pk_cycle_200610 on 
gp_cycle_200610 gp_cycle  (cost=0.00..8.34 rows=5 width=8) (actual 
time=0.021..0.021 rows=1 loops=1)
  Index Cond: (cycle_date_time IS NOT NULL)
  Heap Fetches: 0

May be you should upgrade to 9.2.

Regards,
Igor Neyman


From: rudi [mailto:rudol...@gmail.com]
Sent: Tuesday, January 22, 2013 10:08 AM
To: pgsql-performance@postgresql.org
Subject: Re: High CPU usage after partitioning

On Tue, Jan 22, 2013 at 3:46 PM, Andrew Dunstan 
and...@dunslane.netmailto:and...@dunslane.net wrote:
The query is pretty simple and standard, the behaviour (and the plan) is 
totally different when it comes to a partitioned table.

Partioned table query = explain analyze SELECT  sb_logs.* FROM sb_logs  
WHERE (device_id = 901 AND date_taken = (SELECT MAX(date_taken) FROM sb_logs 
WHERE device_id = 901));

And there you have it. Constraint exclusion does not work in cases like this. 
It only works with static expressions (such as a literal date in this case).

Ok, but I would have expected same plant repeated 4 times. When the table is 
not partitioned, the plan is defintely smarter: it knows that index is reversed 
and looks for max with an index scan backward). When the table is partitioned, 
it scan forward and I guess it will always do a full index scan.



--
rd

This is the way the world ends.
Not with a bang, but a whimper.


Re: [PERFORM] High CPU usage after partitioning

2013-01-22 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 01/22/2013 09:21 AM, rudi wrote:
 The query is pretty simple and standard, the behaviour (and the plan) 
 is totally different when it comes to a partitioned table.
 
 Partioned table query = explain analyze SELECT  sb_logs.* FROM 
 sb_logs  WHERE (device_id = 901 AND date_taken = (SELECT 
 MAX(date_taken) FROM sb_logs WHERE device_id = 901));

 And there you have it. Constraint exclusion does not work in cases like 
 this. It only works with static expressions (such as a literal date in 
 this case).

This isn't about constraint exclusion I think.  The main problem is in
the sub-select: 9.0 isn't able to index-optimize a MAX() across a
partitioned table, for lack of MergeAppend, so you end up scanning lots
of rows there.  9.1 or 9.2 should be better.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Effect of the WindowAgg on the Nested Loop

2013-01-22 Thread Виктор Егоров
Greetings.

I've been playing with a small query that I've been asked to optimize
and noticed a strange (for me) effect.
Query uses this table:

   Table clc06_tiles
   Column   | Type  |
Modifiers
+---+---
 geometry   | geometry  |
 code_06| character varying(3)  |
 gid| bigint| not null default
nextval('clc06_tiles_gid_seq'::regclass)
Indexes:
clc06_tiles_pkey PRIMARY KEY, btree (gid)
i_clc06_tiles_geometry gist (geometry)
Check constraints:
enforce_dims_geom CHECK (st_ndims(geometry) = 2)
enforce_geotype_geom CHECK (geometrytype(geometry) =
'MULTIPOLYGON'::text OR geometrytype(geometry) = 'POLYGON'::text OR
geometry IS NULL)
enforce_srid_geom CHECK (st_srid(geometry) = 3035)

and this function:
CREATE OR REPLACE FUNCTION my_trans(x1 float8, y1 float8, x2 float8,
y2 float8) RETURNS geometry AS $my_trans$
SELECT st_Transform(
st_GeomFromText('LINESTRING('||x1::text||' '||y1::text||
 ', '||x2::text||' '||y2::text||')',4326),3035);
$my_trans$ LANGUAGE sql IMMUTABLE STRICT;

and these constants:
\set x1 4.56
\set y1 52.54
\set x2 5.08
\set y2 53.34


Original query looks like this ( http://explain.depesz.com/s/pzv ):

SELECT n, i.*, st_NumGeometries(i.geom)
  FROM (
SELECT a.code_06 as code_06,
   st_Multi(st_Intersection(a.geometry,
my_trans(:x1,:y1,:x2,:y2))) as geom
  FROM clc06_tiles a
 WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i
  JOIN generate_series(1,10) n ON n = st_NumGeometries(i.geom);


After a while I added row_number() to the inner part (
http://explain.depesz.com/s/hfs ):

SELECT n, i.*, st_NumGeometries(i.geom)
  FROM (
SELECT row_number() OVER () AS rn, a.code_06 as code_06,
   st_Multi(st_Intersection(a.geometry,
my_trans(:x1,:y1,:x2,:y2))) as geom
  FROM clc06_tiles a
 WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i
  JOIN generate_series(1,10) n ON n = st_NumGeometries(i.geom);


It was really surprising to see a side effect of 8x performance boost.
The only difference I can see is an extra WindowAgg step in the second variant.



Could you kindly explain how WindowAgg node affects the overall
performance, please?



PostgreSQL 9.2.1 on x86_64-unknown-linux-gnu, compiled by gcc
(Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit
 archive_command| (disabled) | configuration file
 bgwriter_delay | 100ms  | configuration file
 bgwriter_lru_maxpages  | 200| configuration file
 checkpoint_segments| 30 | configuration file
 default_text_search_config | pg_catalog.english | configuration file
 effective_cache_size   | 3GB| configuration file
 listen_addresses   | *  | configuration file
 log_checkpoints| on | configuration file
 log_connections| on | configuration file
 log_destination| csvlog | configuration file
 log_disconnections | on | configuration file
 log_lock_waits | on | configuration file
 log_min_duration_statement | 100ms  | configuration file
 log_rotation_age   | 1d | configuration file
 log_temp_files | 20MB   | configuration file
 log_timezone   | UTC| configuration file
 logging_collector  | on | configuration file
 maintenance_work_mem   | 64MB   | configuration file
 max_connections| 100| configuration file
 max_stack_depth| 2MB| environment variable
 max_wal_senders| 2  | configuration file
 port   | 5432   | configuration file
 shared_buffers | 768MB  | configuration file
 temp_buffers   | 32MB   | configuration file
 TimeZone   | UTC| configuration file
 wal_level  | hot_standby| configuration file
 work_mem   | 8MB| configuration file


-- 
Victor Y. Yegorov


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance