[PERFORM] Slow delete times??
Please tell me if this timing makes sense to you for a Celeron 433 w/ RAM=256MB dedicated testing server. I expected some slowness, but not this high. db_epsilon=# \d t_active_subjects Table public.t_active_subjects Column | Type | Modifiers +--+ id | integer | not null default nextval('public.t_active_subjects_id_seq'::text) old_id | integer | ext_subject| integer | not null ext_group | integer | final_grade| integer | type | character(1) | ree| date | borrado| boolean | ext_active_student | integer | sum_presences | integer | sum_hours | integer | Indexes: t_active_subjects_pkey primary key btree (id), i_t_active_subjects__ext_active_student btree (ext_active_student), i_t_active_subjects__ext_group btree (ext_group), i_t_active_subjects__ext_subject btree (ext_subject), i_t_active_subjects__old_id btree (old_id) Foreign Key constraints: $4 FOREIGN KEY (ext_group) REFERENCES t_groups(id) ON UPDATE NO ACTION ON DELETE NO ACTION, $3 FOREIGN KEY (ext_subject) REFERENCES t_subjects(id) ON UPDATE NO ACTION ON DELETE NO ACTION db_epsilon=# EXPLAIN DELETE FROM t_active_subjects; QUERY PLAN - Seq Scan on t_active_subjects (cost=0.00..3391.73 rows=52373 width=6) (1 row) db_epsilon=# EXPLAIN ANALYZE DELETE FROM t_active_subjects; QUERY PLAN Seq Scan on t_active_subjects (cost=0.00..3391.73 rows=52373 width=6) (actual time=0.11..4651.82 rows=73700 loops=1) Total runtime: 3504528.15 msec (2 rows) db_epsilon=# SELECT version(); version - PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2 20020903 (Red Hat Linux 8.0 3.2-7) (1 row) [EMAIL PROTECTED] data]# cat postgresql.conf | grep -v \# | grep \= tcpip_socket = true fsync = false LC_MESSAGES = 'en_US.UTF-8' LC_MONETARY = 'en_US.UTF-8' LC_NUMERIC = 'en_US.UTF-8' LC_TIME = 'en_US.UTF-8' Okay, some details: * The query takes to run about 3,504.52815 sec for 52,373 rows, which averages about 15 deletes per second. * Each ext_* field is a foreign key to another table's pk. * This is a dedicated testing server with 256 MB RAM, and is a Celeron 433 MHz. It still has enough disk space, I think: about 200 MB. * Disk is 4 MB. I guess it must be about what, 4500 RPM? * fsync is disabled. I don't know what other info to provide... Thanks in advance. -- Octavio Alvarez Piza. E-mail: [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
[PERFORM] Seq scan on zero-parameters function
Hi! I'd like to know if this is expected behavior. These are two couples of queries. In each couple, the first one has a WHERE field = function() condition, just like the second one, but in the form WHERE field = (SELECT function()). In my opinion, both should have the same execution plan, as the function has no parameters and, therefore, is constant. I'm concerned about this, because the second form looks like a workaround. *** TESTED IN: PostgreSQL 7.4.1 on i686-pc-cygwin *** pgdb=# explain analyze select count(*) from t_students where period = (select current_period_id()); QUERY PLAN --- Aggregate (cost=127.84..127.84 rows=1 width=0) (actual time=1.000..1.000 rows=1 loops=1) InitPlan - Result (cost=0.00..0.01 rows=1 width=0) (actual time=1.000..1.000 rows=1 loops=1) - Index Scan using i_t_students__period on t_students (cost=0.00..127.71 rows=44 width=0) (actual time=1.000..1.000 rows=21 loop=1) Index Cond: (period = $0) Total runtime: 1.000 ms (6 rows) pgdb=# explain analyze select count(*) from t_students where period = (select current_period_id()); QUERY PLAN --- Aggregate (cost=127.84..127.84 rows=1 width=0) (actual time=1.000..1.000 rows=1 loops=1) InitPlan - Result (cost=0.00..0.01 rows=1 width=0) (actual time=1.000..1.000 rows=1 loops=1) - Index Scan using i_t_students__period on t_students (cost=0.00..127.71 rows=44 width=0) (actual time=1.000..1.000 rows=21 loop=1) Index Cond: (period = $0) Total runtime: 1.000 ms (6 rows) pgdb=# select version(); version --- PostgreSQL 7.4.1 on i686-pc-cygwin, compiled by GCC gcc (GCC) 3.3.1 (cygming special) (1 row) pgdb=# *** TESTED IN: PostgreSQL 7.3.4 on i386-redhat-linux-gnu *** pgdb=# explain analyze select count(*) from t_students where period = current_period_id(); QUERY PLAN --- Aggregate (cost=182.32..182.32 rows=1 width=0) (actual time=49077.38..49077.38 rows=1 loops=1) - Seq Scan on t_students (cost=0.00..182.22 rows=43 width=0) (actual time=17993.89..49077.13 rows=21 loops=1) Filter: (period = current_period_id()) Total runtime: 49077.61 msec (4 rows) pgdb=# explain analyze select count(*) from t_students where period = (select current_period_id()); QUERY PLAN -- Aggregate (cost=125.19..125.19 rows=1 width=0) (actual time=131.59..131.60 rows=1 loops=1) InitPlan - Result (cost=0.00..0.01 rows=1 width=0) (actual time=41.05..41.06 rows=1 loops=1) - Index Scan using i_t_students__period on t_students (cost=0.00..125.08 rows=43 width=0) (actual time=131.28..131.48 rows=21 loops=1) Index Cond: (period = $0) Total runtime: 131.95 msec (6 rows) pgdb=# select version(); version - PostgreSQL 7.3.4 on i386-redhat-linux-gnu, compiled by GCC 2.96 (1 row) -- Octavio Alvarez. E-mail: [EMAIL PROTECTED] Agradezco que sus correos sean enviados siempre a esta dirección. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Seq scan on zero-parameters function
Tomasz Myrta said: Dnia 2004-02-06 08:19, U¿ytkownik Octavio Alvarez napisa³: In each couple, the first one has a WHERE field = function() condition, just like the second one, but in the form WHERE field = (SELECT function()). In my opinion, both should have the same execution plan, as the function has no parameters and, therefore, is constant. Nope. What would you say about function without params returning timeofday()? Is it constant? No... :-P ;-) If you are sure, that your function returns constant value - declare it as IMMUTABLE. (look at CREATE FUNCTION documentation) Thanks for the hint. In fact, my current_period_id() is based on time, but it should be constant along the query execution. I mean, I don't want some records filtered with some values and other with other values... I'll have an uncongruent recordset. Say SELECT [field-list] FROM [complex-join] WHERE sec = datepart('second', now()); Now suppose the query takes always more than 1 second because of the complex-join or whatever reason: I will naver have a congruent recordset. IMMUTABLE wouldn't help here, only wrapping the function in a subquery. Is this expected behavior? Is this standards compliant (if it can be qualified as such)? Octavio. -- Octavio Alvarez. E-mail: [EMAIL PROTECTED] Agradezco que sus correos sean enviados siempre a esta dirección. ---(end of broadcast)--- TIP 8: explain analyze is your friend
[PERFORM] Sorting when LEFT JOINING to 2 same tables, even aliased.
) (actual time=0.000..2.000 rows=377 loops=1) - Materialize (cost=8.77..12.54 rows=377 width=22) (actual time=0.000..0.163 rows=377 loops=3473) - Seq Scan on t_materias t_materias__equivalentes (cost=0.00..8.77 rows=377 width=22) (actual time=0.000..1.000 rows=377 loops=1) - Hash (cost=2.07..2.07 rows=107 width=13) (actual time=0.000..0.000 rows=0 loops=1) - Seq Scan on t_grupos (cost=0.00..2.07 rows=107 width=13) (actual time=0.000..0.000 rows=107 loops=1) Total runtime: 18787.000 ms SELECT count(*) FROM t_materias_en_tira; count --- 41059 (1 row) SELECT count(*) FROM t_materias; count --- 377 (1 row) SELECT version();; version --- PostgreSQL 7.4.1 on i686-pc-cygwin, compiled by GCC gcc (GCC) 3.3.1 (cygming special) (1 row) -- Octavio Alvarez. E-mail: [EMAIL PROTECTED] Agradezco que sus correos sean enviados siempre a esta dirección. -- Octavio Alvarez. E-mail: [EMAIL PROTECTED] Agradezco que sus correos sean enviados siempre a esta dirección. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[PERFORM] Slow queries from information schema
I'm aware you already know that information_schema is slow [1] [2], so I just want to expose/document another case and tests I did. I'm using the following view to check what tables depend on what other tables. CREATE VIEW raw_relation_tree AS SELECT tc_p.table_catalog AS parent_catalog, tc_p.table_schema AS parent_schema, tc_p.table_name AS parent_table, tc_c.table_catalog AS child_catalog, tc_c.table_schema AS child_schema, tc_c.table_name AS child_table FROM information_schema.referential_constraints AS rc NATURAL JOIN information_schema.table_constraints AS tc_c LEFT JOIN information_schema.table_constraints AS tc_p ON rc.unique_constraint_catalog = tc_p.constraint_catalog AND rc.unique_constraint_schema = tc_p.constraint_schema AND rc.unique_constraint_name = tc_p.constraint_name ; test=# select count(*) from raw_relation_tree; count --- 11 (1 row) An EXPLAIN ANALYZE for a simple SELECT on each of the FROM tables give: referential_constraints: ~9ms. table_constraints: ~24ms. The result, on the above view: ~80ms. Fair enough. But if I apply a condition: SELECT * FROM ___pgnui_relation_tree.raw_relation_tree WHERE parent_schema child_schema; it takes ~2 seconds (!) to complete. I tried using an alternate table_constraints definition by creating my own view and changing UNION to UNION ALL (as per [2]) The results were: table_constraints using UNION ALL has the same number of rows as the UNION version. table_constraints now take about 4 ms (as expected). VIEW raw_relation_tree is now 110 ms. VIEW raw_relation_tree WHERE parent_schema child_schema: 3.3 sec. EXPLAIN results are way too long to post here. If it is ok, I'll gladly post them. Using 8.3.6. [1] http://archives.postgresql.org/pgsql-bugs/2008-12/msg00144.php [2] http://archives.postgresql.org/pgsql-performance/2008-05/msg00062.php -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Slow queries from information schema
On Sat, 2009-02-14 at 15:02 -0500, Tom Lane wrote: Octavio Alvarez alvar...@alvarezp.ods.org writes: The result, on the above view: ~80ms. Fair enough. But if I apply a condition: SELECT * FROM ___pgnui_relation_tree.raw_relation_tree WHERE parent_schema child_schema; it takes ~2 seconds (!) to complete. I'm not sure I'm seeing the exact same case as you, but what I see here is that 8.3 puts the join condition involving _pg_keysequal() at the top of the tree where it will be executed quite a lot of times (way more than the planner expects, because of bad rowcount estimates below) ... and _pg_keysequal() is implemented in a depressingly inefficient way. CVS HEAD seems to avoid this trap in the same case, but I'm not entirely convinced whether it's getting better rowcount estimates or just got lucky. Anyway it seems to help a great deal if you use a less sucky definition of the function, such as create or replace function information_schema._pg_keysequal(smallint[], smallint[]) RETURNS boolean LANGUAGE sql STRICT IMMUTABLE AS 'select $1 @ $2 and $2 @ $1'; Wow! Just tried it with the UNION (the original) version of information_schema.table_constraints and it drastically reduced the total runtime to 309 ms! I also tested it with UNION ALL and it took 1.6 sec. (and yet, 50% of the previous time with UNION ALL). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Two equivalent WITH RECURSIVE queries, one of them slow.
Hello. I have a tree-like table with a three-field PK (name, date, id) and one parent field. It has 5k to 6k records as of now, but it will hold about 1 million records. I am trying the following WITH RECURSIVE query: WITH RECURSIVE t AS ( SELECT par.id AS tid, par.name, par.date, par.id, par.text, par.h_title, par.h_name, par.parent FROM _books.par UNION SELECT t.tid AS pid, p.name, p.date, p.id, p.text, p.h_title, p.h_name, p.parent FROM t, _books.par p WHERE p.name = t.name AND p.date = t.date AND t.id = p.parent ) SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent FROM t WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340'; ... which takes 2547.503 ms However, if I try the same query but adding the same WHERE clause to the non-recursive term, I get much better results. WITH RECURSIVE t AS ( SELECT par.id AS tid, par.name, par.date, par.id, par.text, par.h_title, par.h_name, par.parent FROM _books.par WHERE name = 'cfx' AND date = '2009-08-19' AND par.id = '28340' UNION SELECT t.tid AS pid, p.name, p.date, p.id, p.text, p.h_title, p.h_name, p.parent FROM t, _books.par p WHERE p.name = t.name AND p.date = t.date AND t.id = p.parent ) SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent FROM t WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340'; ... which takes 0.221 ms I am being forced to use the slow query because I want to define it as a view, leaving the WHERE clause to the application. I fail to see where the two queries might be different, or, what cases the slow one considers that the fast one doesn't, as to get a clue on how to workaround this. I have taken the EXPLAIN ANALYZE output for both queries. It looks like the slow one is processing all records (read: not adding the WHERE clause to the non-recursive term). QUERY PLAN -- CTE Scan on t (cost=96653.20..96820.57 rows=1 width=144) (actual time=32.931..2541.792 rows=1 loops=1) Filter: (((name)::text = 'cfx'::text) AND (date = '2009-08-19'::date) AND (tid = 28340)) CTE t - Recursive Union (cost=0.00..96653.20 rows=6086 width=212) (actual time=0.017..2442.655 rows=33191 loops=1) - Seq Scan on par (cost=0.00..237.96 rows=5996 width=208) (actual time=0.011..5.591 rows=5996 loops=1) - Merge Join (cost=8909.74..9629.35 rows=9 width=212) (actual time=225.979..254.727 rows=3022 loops=9) Merge Cond: (((t.name)::text = (p.name)::text) AND (t.date = p.date) AND (t.id = p.parent)) - Sort (cost=7700.54..7850.44 rows=59960 width=44) (actual time=58.163..59.596 rows=3685 loops=9) Sort name: t.name, t.date, t.id Sort Method: quicksort Memory: 17kB - WorkTable Scan on t (cost=0.00..1199.20 rows=59960 width=44) (actual time=0.027..3.486 rows=3688 loops=9) - Materialize (cost=1209.20..1284.15 rows=5996 width=208) (actual time=163.062..177.415 rows=5810 loops=9) - Sort (cost=1209.20..1224.19 rows=5996 width=208) (actual time=163.054..172.543 rows=5810 loops=9) Sort name: p.name, p.date, p.parent Sort Method: external merge Disk: 1304kB - Seq Scan on par p (cost=0.00..237.96 rows=5996 width=208) (actual time=0.015..3.330 rows=5996 loops=9) Total runtime: 2547.503 ms (17 rows) QUERY PLAN -- CTE Scan on t (cost=927.80..928.10 rows=1 width=144) (actual time=0.036..0.132 rows=1 loops=1) Filter: (((name)::text = 'cfx'::text) AND (date = '2009-08-19'::date) AND (tid = 28340)) CTE t - Recursive Union (cost=0.00..927.80 rows=11 width=212) (actual time=0.030..0.124 rows=1 loops=1) - Index Scan using par_id on par (cost=0.00..8.27 rows=1 width=208) (actual time=0.024..0.026 rows=1 loops=1) Index Cond: (id = 28340) Filter: (((name)::text = 'cfx'::text) AND (date = '2009-08-19'::date)) - Nested Loop (cost=0.00..91.93 rows=1 width=212) (actual time=0.091..0.091 rows=0 loops=1) Join Filter: (((t.name)::text = (p.name)::text) AND (t.date = p.date)) - WorkTable Scan on t (cost=0.00..0.20 rows=10 width=44) (actual