[PERFORM] Slow delete times??

2004-01-23 Thread Octavio Alvarez
Please tell me if this timing makes sense to you for a Celeron 433 w/
RAM=256MB dedicated testing server. I expected some slowness, but not this
high.

db_epsilon=# \d t_active_subjects
 Table public.t_active_subjects
 Column | Type |   Modifiers
+--+
 id | integer  | not null default
nextval('public.t_active_subjects_id_seq'::text)
 old_id | integer  |
 ext_subject| integer  | not null
 ext_group  | integer  |
 final_grade| integer  |
 type   | character(1) |
 ree| date |
 borrado| boolean  |
 ext_active_student | integer  |
 sum_presences  | integer  |
 sum_hours  | integer  |
Indexes: t_active_subjects_pkey primary key btree (id),
 i_t_active_subjects__ext_active_student btree (ext_active_student),
 i_t_active_subjects__ext_group btree (ext_group),
 i_t_active_subjects__ext_subject btree (ext_subject),
 i_t_active_subjects__old_id btree (old_id)
Foreign Key constraints: $4 FOREIGN KEY (ext_group) REFERENCES
t_groups(id) ON UPDATE NO ACTION ON DELETE NO ACTION,
 $3 FOREIGN KEY (ext_subject) REFERENCES
t_subjects(id) ON UPDATE NO ACTION ON DELETE NO
ACTION

db_epsilon=# EXPLAIN DELETE FROM t_active_subjects;
   QUERY PLAN
-
 Seq Scan on t_active_subjects  (cost=0.00..3391.73 rows=52373 width=6)
(1 row)

db_epsilon=# EXPLAIN ANALYZE DELETE FROM t_active_subjects;
   QUERY PLAN

 Seq Scan on t_active_subjects  (cost=0.00..3391.73 rows=52373 width=6)
(actual time=0.11..4651.82 rows=73700 loops=1)
 Total runtime: 3504528.15 msec
(2 rows)

db_epsilon=# SELECT version();
 version
-
 PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2
20020903 (Red Hat Linux 8.0 3.2-7)
(1 row)

[EMAIL PROTECTED] data]# cat postgresql.conf | grep -v \# | grep \=
tcpip_socket = true
fsync = false
LC_MESSAGES = 'en_US.UTF-8'
LC_MONETARY = 'en_US.UTF-8'
LC_NUMERIC = 'en_US.UTF-8'
LC_TIME = 'en_US.UTF-8'

Okay, some details:
 * The query takes to run about 3,504.52815 sec for 52,373 rows, which
averages about 15 deletes per second.
 * Each ext_* field is a foreign key to another table's pk.
 * This is a dedicated testing server with 256 MB RAM, and is a Celeron
433 MHz. It still has enough disk space, I think: about 200 MB.
 * Disk is 4 MB. I guess it must be about what, 4500 RPM?
 * fsync is disabled.

I don't know what other info to provide...

Thanks in advance.

--
Octavio Alvarez Piza.
E-mail: [EMAIL PROTECTED]

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


[PERFORM] Seq scan on zero-parameters function

2004-02-06 Thread Octavio Alvarez
Hi!

   I'd like to know if this is expected behavior. These are two couples of
queries. In each couple, the first one has a WHERE field = function()
condition, just like the second one, but in the form WHERE field =
(SELECT function()). In my opinion, both should have the same execution
plan, as the function has no parameters and, therefore, is constant.

   I'm concerned about this, because the second form looks like a workaround.

*** TESTED IN: PostgreSQL 7.4.1 on i686-pc-cygwin ***

pgdb=# explain analyze select count(*) from t_students where period =
(select current_period_id());
 QUERY
PLAN
---
 Aggregate  (cost=127.84..127.84 rows=1 width=0) (actual time=1.000..1.000
rows=1 loops=1)
   InitPlan
 -  Result  (cost=0.00..0.01 rows=1 width=0) (actual
time=1.000..1.000 rows=1 loops=1)
   -  Index Scan using i_t_students__period on t_students 
(cost=0.00..127.71 rows=44 width=0) (actual time=1.000..1.000 rows=21
loop=1)
 Index Cond: (period = $0)
 Total runtime: 1.000 ms
(6 rows)

pgdb=# explain analyze select count(*) from t_students where period =
(select current_period_id());
 QUERY
PLAN
---
 Aggregate  (cost=127.84..127.84 rows=1 width=0) (actual time=1.000..1.000
rows=1 loops=1)
   InitPlan
 -  Result  (cost=0.00..0.01 rows=1 width=0) (actual
time=1.000..1.000 rows=1 loops=1)
   -  Index Scan using i_t_students__period on t_students 
(cost=0.00..127.71 rows=44 width=0) (actual time=1.000..1.000 rows=21
loop=1)
 Index Cond: (period = $0)
 Total runtime: 1.000 ms
(6 rows)

pgdb=# select version();
version
---
 PostgreSQL 7.4.1 on i686-pc-cygwin, compiled by GCC gcc (GCC) 3.3.1
(cygming special)
(1 row)

pgdb=#

*** TESTED IN: PostgreSQL 7.3.4 on i386-redhat-linux-gnu ***

pgdb=# explain analyze select count(*) from t_students where period =
current_period_id();
  QUERY PLAN
---
 Aggregate  (cost=182.32..182.32 rows=1 width=0) (actual
time=49077.38..49077.38 rows=1 loops=1)
   -  Seq Scan on t_students  (cost=0.00..182.22 rows=43 width=0) (actual
time=17993.89..49077.13 rows=21 loops=1)
 Filter: (period = current_period_id())
 Total runtime: 49077.61 msec
(4 rows)

pgdb=# explain analyze select count(*) from t_students where period =
(select current_period_id());
  QUERY
PLAN
--
 Aggregate  (cost=125.19..125.19 rows=1 width=0) (actual
time=131.59..131.60 rows=1 loops=1)
   InitPlan
 -  Result  (cost=0.00..0.01 rows=1 width=0) (actual
time=41.05..41.06 rows=1 loops=1)
   -  Index Scan using i_t_students__period on t_students 
(cost=0.00..125.08 rows=43 width=0) (actual time=131.28..131.48 rows=21
loops=1)
 Index Cond: (period = $0)
 Total runtime: 131.95 msec
(6 rows)

pgdb=# select version();
 version
-
 PostgreSQL 7.3.4 on i386-redhat-linux-gnu, compiled by GCC 2.96
(1 row)




-- 
Octavio Alvarez.
E-mail: [EMAIL PROTECTED]

Agradezco que sus correos sean enviados siempre a esta dirección.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Seq scan on zero-parameters function

2004-02-06 Thread Octavio Alvarez

Tomasz Myrta said:
 Dnia 2004-02-06 08:19, U¿ytkownik Octavio Alvarez napisa³:
 In each couple, the first one has a WHERE field = function()
 condition, just like the second one, but in the form WHERE field =
 (SELECT function()). In my opinion, both should have the same execution
 plan, as the function has no parameters and, therefore, is constant.

 Nope.

 What would you say about function without params returning timeofday()?
 Is it constant?

No... :-P ;-)

 If you are sure, that your function returns constant value - declare it
 as IMMUTABLE. (look at CREATE FUNCTION documentation)

Thanks for the hint.

In fact, my current_period_id() is based on time, but it should be
constant along the query execution. I mean, I don't want some records
filtered with some values and other with other values... I'll have an
uncongruent recordset.

Say SELECT [field-list] FROM [complex-join] WHERE sec = datepart('second',
now()); Now suppose the query takes always more than 1 second because of
the complex-join or whatever reason: I will naver have a congruent
recordset.

IMMUTABLE wouldn't help here, only wrapping the function in a subquery. Is
this expected behavior? Is this standards compliant (if it can be
qualified as such)?

Octavio.

-- 
Octavio Alvarez.
E-mail: [EMAIL PROTECTED]

Agradezco que sus correos sean enviados siempre a esta dirección.

---(end of broadcast)---
TIP 8: explain analyze is your friend


[PERFORM] Sorting when LEFT JOINING to 2 same tables, even aliased.

2004-03-11 Thread Octavio Alvarez
) (actual time=0.000..2.000 rows=377 loops=1)
 -  Materialize  (cost=8.77..12.54 rows=377 width=22) (actual
time=0.000..0.163 rows=377 loops=3473)
   -  Seq Scan on t_materias t_materias__equivalentes
(cost=0.00..8.77 rows=377 width=22) (actual
time=0.000..1.000 rows=377 loops=1)
   -  Hash  (cost=2.07..2.07 rows=107 width=13) (actual time=0.000..0.000
rows=0 loops=1)
 -  Seq Scan on t_grupos  (cost=0.00..2.07 rows=107 width=13)
(actual time=0.000..0.000 rows=107 loops=1)

 Total runtime: 18787.000 ms

SELECT count(*) FROM t_materias_en_tira;
 count
---
 41059
(1 row)

SELECT count(*) FROM t_materias;
 count
---
   377
(1 row)

SELECT version();;
version
---
 PostgreSQL 7.4.1 on i686-pc-cygwin, compiled by GCC gcc (GCC) 3.3.1
(cygming special)
(1 row)


-- 
Octavio Alvarez.
E-mail: [EMAIL PROTECTED]

Agradezco que sus correos sean enviados siempre a esta dirección.



-- 
Octavio Alvarez.
E-mail: [EMAIL PROTECTED]

Agradezco que sus correos sean enviados siempre a esta dirección.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] Slow queries from information schema

2009-02-14 Thread Octavio Alvarez
I'm aware you already know that information_schema is slow [1] [2], so I
just want to expose/document another case and tests I did.

I'm using the following view to check what tables depend on what other
tables.

CREATE VIEW raw_relation_tree AS
SELECT
   tc_p.table_catalog AS parent_catalog,
   tc_p.table_schema AS parent_schema,
   tc_p.table_name AS parent_table,
   tc_c.table_catalog AS child_catalog,
   tc_c.table_schema AS child_schema,
   tc_c.table_name AS child_table
FROM
   information_schema.referential_constraints AS rc
   NATURAL JOIN information_schema.table_constraints AS tc_c
   LEFT JOIN information_schema.table_constraints AS tc_p ON
  rc.unique_constraint_catalog = tc_p.constraint_catalog AND
  rc.unique_constraint_schema = tc_p.constraint_schema AND
  rc.unique_constraint_name = tc_p.constraint_name
;

test=# select count(*) from raw_relation_tree;
count 
---
11
(1 row)

An EXPLAIN ANALYZE for a simple SELECT on each of the FROM tables give:
referential_constraints: ~9ms.
table_constraints: ~24ms.

The result, on the above view: ~80ms. Fair enough. But if I apply a
condition:

SELECT * FROM ___pgnui_relation_tree.raw_relation_tree WHERE
parent_schema  child_schema;

it takes ~2 seconds (!) to complete.

I tried using an alternate table_constraints definition by creating my
own view and changing UNION to UNION ALL (as per [2]) The results were:

table_constraints using UNION ALL has the same number of rows as the
UNION version.

table_constraints now take about 4 ms (as expected).
VIEW raw_relation_tree is now 110 ms.
VIEW raw_relation_tree WHERE parent_schema  child_schema: 3.3 sec.

EXPLAIN results are way too long to post here. If it is ok, I'll gladly
post them.

Using 8.3.6.

[1] http://archives.postgresql.org/pgsql-bugs/2008-12/msg00144.php
[2]
http://archives.postgresql.org/pgsql-performance/2008-05/msg00062.php



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow queries from information schema

2009-02-14 Thread Octavio Alvarez
On Sat, 2009-02-14 at 15:02 -0500, Tom Lane wrote:
 Octavio Alvarez alvar...@alvarezp.ods.org writes:
  The result, on the above view: ~80ms. Fair enough. But if I apply a
  condition:
  SELECT * FROM ___pgnui_relation_tree.raw_relation_tree WHERE
  parent_schema  child_schema;
  it takes ~2 seconds (!) to complete.
 
 I'm not sure I'm seeing the exact same case as you, but what I see here
 is that 8.3 puts the join condition involving _pg_keysequal() at the
 top of the tree where it will be executed quite a lot of times (way
 more than the planner expects, because of bad rowcount estimates below)
 ... and _pg_keysequal() is implemented in a depressingly inefficient way.
 
 CVS HEAD seems to avoid this trap in the same case, but I'm not entirely
 convinced whether it's getting better rowcount estimates or just got
 lucky.
 
 Anyway it seems to help a great deal if you use a less sucky definition
 of the function, such as
 
 create or replace function information_schema._pg_keysequal(smallint[], 
 smallint[]) RETURNS boolean
 LANGUAGE sql STRICT IMMUTABLE AS
 'select $1 @ $2 and $2 @ $1';

Wow! Just tried it with the UNION (the original) version of
information_schema.table_constraints and it drastically reduced the
total runtime to 309 ms!

I also tested it with UNION ALL and it took 1.6 sec. (and yet, 50% of
the previous time with UNION ALL).




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Two equivalent WITH RECURSIVE queries, one of them slow.

2010-07-05 Thread Octavio Alvarez

Hello.

I have a tree-like table with a three-field PK (name, date, id) and one  
parent field.
It has 5k to 6k records as of now, but it will hold about 1 million  
records.


I am trying the following WITH RECURSIVE query:

WITH RECURSIVE t AS (
 SELECT par.id AS tid, par.name, par.date, par.id,  
par.text, par.h_title, par.h_name, par.parent

   FROM _books.par
UNION
 SELECT t.tid AS pid, p.name, p.date, p.id, p.text,  
p.h_title, p.h_name, p.parent

   FROM t, _books.par p
  WHERE p.name = t.name AND p.date = t.date AND t.id =  
p.parent

)
 SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent
   FROM t WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340';

... which takes 2547.503 ms

However, if I try the same query but adding the same WHERE clause to the
non-recursive term, I get much better results.


WITH RECURSIVE t AS (
 SELECT par.id AS tid, par.name, par.date, par.id,  
par.text, par.h_title, par.h_name, par.parent
   FROM _books.par WHERE name = 'cfx' AND date =  
'2009-08-19' AND par.id = '28340'

UNION
 SELECT t.tid AS pid, p.name, p.date, p.id, p.text,  
p.h_title, p.h_name, p.parent

   FROM t, _books.par p
  WHERE p.name = t.name AND p.date = t.date AND t.id =  
p.parent

)
 SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent
   FROM t WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340';

... which takes 0.221 ms

I am being forced to use the slow query because I want to define it as a
view, leaving the WHERE clause to the application.

I fail to see where the two queries might be different, or, what cases the
slow one considers that the fast one doesn't, as to get a clue on how to
workaround this.

I have taken the EXPLAIN ANALYZE output for both queries. It looks like
the slow one is processing all records (read: not adding the WHERE clause
to the non-recursive term).


  QUERY  
PLAN

--
 CTE Scan on t  (cost=96653.20..96820.57 rows=1 width=144) (actual  
time=32.931..2541.792 rows=1 loops=1)
   Filter: (((name)::text = 'cfx'::text) AND (date = '2009-08-19'::date)  
AND (tid = 28340))

   CTE t
 -  Recursive Union  (cost=0.00..96653.20 rows=6086 width=212)  
(actual time=0.017..2442.655 rows=33191 loops=1)
   -  Seq Scan on par  (cost=0.00..237.96 rows=5996 width=208)  
(actual time=0.011..5.591 rows=5996 loops=1)
   -  Merge Join  (cost=8909.74..9629.35 rows=9 width=212)  
(actual time=225.979..254.727 rows=3022 loops=9)
 Merge Cond: (((t.name)::text = (p.name)::text) AND  
(t.date = p.date) AND (t.id = p.parent))
 -  Sort  (cost=7700.54..7850.44 rows=59960 width=44)  
(actual time=58.163..59.596 rows=3685 loops=9)

   Sort name: t.name, t.date, t.id
   Sort Method:  quicksort  Memory: 17kB
   -  WorkTable Scan on t  (cost=0.00..1199.20  
rows=59960 width=44) (actual time=0.027..3.486 rows=3688 loops=9)
 -  Materialize  (cost=1209.20..1284.15 rows=5996  
width=208) (actual time=163.062..177.415 rows=5810 loops=9)
   -  Sort  (cost=1209.20..1224.19 rows=5996  
width=208) (actual time=163.054..172.543 rows=5810 loops=9)

 Sort name: p.name, p.date, p.parent
 Sort Method:  external merge  Disk: 1304kB
 -  Seq Scan on par p  (cost=0.00..237.96  
rows=5996 width=208) (actual time=0.015..3.330 rows=5996 loops=9)

 Total runtime: 2547.503 ms
(17 rows)



QUERY  
PLAN

--
 CTE Scan on t  (cost=927.80..928.10 rows=1 width=144) (actual  
time=0.036..0.132 rows=1 loops=1)
   Filter: (((name)::text = 'cfx'::text) AND (date = '2009-08-19'::date)  
AND (tid = 28340))

   CTE t
 -  Recursive Union  (cost=0.00..927.80 rows=11 width=212) (actual  
time=0.030..0.124 rows=1 loops=1)
   -  Index Scan using par_id on par  (cost=0.00..8.27 rows=1  
width=208) (actual time=0.024..0.026 rows=1 loops=1)

 Index Cond: (id = 28340)
 Filter: (((name)::text = 'cfx'::text) AND (date =  
'2009-08-19'::date))
   -  Nested Loop  (cost=0.00..91.93 rows=1 width=212) (actual  
time=0.091..0.091 rows=0 loops=1)
 Join Filter: (((t.name)::text = (p.name)::text) AND  
(t.date = p.date))
 -  WorkTable Scan on t  (cost=0.00..0.20 rows=10  
width=44) (actual