Re: [PERFORM] Very slow queries - please help

2005-12-12 Thread Bealach-na Bo
Thanks very  much - there are a lot of good articles there... Reading as 
fast as I can :)


Best,

Bealach



From: Thomas F. O'Connell [EMAIL PROTECTED]
To: Bealach-na Bo [EMAIL PROTECTED]
CC: PgSQL - Performance pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very slow queries - please help
Date: Sun, 4 Dec 2005 00:40:01 -0600


On Nov 24, 2005, at 12:14 PM, Bealach-na Bo wrote:


The consensus seems to be that I need more indexes and I also need to
look into the NOT IN statement as a possible bottleneck. I've
introduced the indexes which has led to a DRAMATIC change in response
time. Now I have to experiment with INNER JOIN - OUTER JOIN
variations, SET ENABLE_SEQSCAN=OFF.

Forgive me for not mentioning each person individually and by name.
You have all contributed to confirming what I had suspected (and
hoped): that *I* have a lot to learn!

I'm attaching table descriptions, the first few lines of top output
while the queries were running, index lists, sample queries and
EXPLAIN ANALYSE output BEFORE and AFTER the introduction of the
indexes. As I said, DRAMATIC :) I notice that the CPU usage does not
vary very much, it's nearly 100% anyway, but the memory usage drops
markedly, which is another very nice result of the index introduction.

Any more comments and tips would be very welcome.


You might find the following resources from techdocs instructive:

http://techdocs.postgresql.org/redir.php?link=/techdocs/ 
pgsqladventuresep2.php


http://techdocs.postgresql.org/redir.php?link=/techdocs/ 
pgsqladventuresep3.php


These documents provide some guidance into the process of index  selection. 
It seems like you could still stand to benefit from more  indexes based on 
your queries, table definitions, and current indexes.


--
Thomas F. O'Connell
Database Architecture and Programming
Co-Founder
Sitening, LLC

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005 (cell)
615-469-5150 (office)
615-469-5151 (fax)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Very slow queries - please help

2005-12-03 Thread Thomas F. O'Connell


On Nov 24, 2005, at 12:14 PM, Bealach-na Bo wrote:


The consensus seems to be that I need more indexes and I also need to
look into the NOT IN statement as a possible bottleneck. I've
introduced the indexes which has led to a DRAMATIC change in response
time. Now I have to experiment with INNER JOIN - OUTER JOIN
variations, SET ENABLE_SEQSCAN=OFF.

Forgive me for not mentioning each person individually and by name.
You have all contributed to confirming what I had suspected (and
hoped): that *I* have a lot to learn!

I'm attaching table descriptions, the first few lines of top output
while the queries were running, index lists, sample queries and
EXPLAIN ANALYSE output BEFORE and AFTER the introduction of the
indexes. As I said, DRAMATIC :) I notice that the CPU usage does not
vary very much, it's nearly 100% anyway, but the memory usage drops
markedly, which is another very nice result of the index introduction.

Any more comments and tips would be very welcome.


You might find the following resources from techdocs instructive:

http://techdocs.postgresql.org/redir.php?link=/techdocs/ 
pgsqladventuresep2.php


http://techdocs.postgresql.org/redir.php?link=/techdocs/ 
pgsqladventuresep3.php


These documents provide some guidance into the process of index  
selection. It seems like you could still stand to benefit from more  
indexes based on your queries, table definitions, and current indexes.


--
Thomas F. O'Connell
Database Architecture and Programming
Co-Founder
Sitening, LLC

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005 (cell)
615-469-5150 (office)
615-469-5151 (fax)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] Very slow queries - please help.

2005-11-24 Thread Bealach-na Bo

Hi Folks,

I'm new to Postgresql.

I'm having great difficulties getting the performance I had hoped for
from Postgresql 8.0. The typical query below takes ~20 minutes !!

I hope an expert out there will tell me what I'm doing wrong - I hope
*I* am doing something wrong.

Hardware

Single processor, Intel Xeon 3.06 GHz machine running Red Hat
Ent. 4. with 1.5 GB of RAM.

The machine is dedicated to running Postgresql 8.0 and Apache/mod_perl
etc. The database is being accessed for report generation via a web
form. The web server talks to Pg over TCP/IP (I know, that I don't
need to do this if they are all on the same machine, but I have good
reasons for this and don't suspect that this is where my problems are
- I have the same poor performance when running from psql on the
server.)

Database

Very simple, not fully normalized set of two tables. The first table,
very small (2000 lines of 4 cols with very few chars and integers in
in col). The other quite a bit larger (50 lines with 15
cols. with the largest fields ~ 256 chars)

Typical query


SELECT n.name
FROM node n
WHERE n.name
LIKE '56x%'
AND n.type='H'
AND n.usage='TEST'
AND n.node_id
NOT IN
(select n.node_id
FROM job_log j
INNER JOIN node n
ON j.node_id = n.node_id
WHERE n.name
LIKE '56x%'
AND n.type='H'
AND n.usage='TEST'
AND j.job_name = 'COPY FILES'
AND j.job_start = '2005-11-14 00:00:00'
AND (j.job_stop = '2005-11-22 09:31:10' OR j.job_stop IS NULL))
ORDER BY n.name


The node table is the small table and the job_log table is the large
table.


I've tried all the basic things that I found in the documentation like
VACUUM ANALYZE, EXPLAIN etc., but I suspect there is something
terribly wrong with what I'm doing and these measures will not shave
off 19 min and 50 seconds off the query time.

Any help and comments would be very much appreciated.


Bealach



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Very slow queries - please help.

2005-11-24 Thread Claus Guttesen
 Typical query
 

 SELECT n.name
 FROM node n
 WHERE n.name
 LIKE '56x%'
 AND n.type='H'
 AND n.usage='TEST'
 AND n.node_id
 NOT IN
 (select n.node_id
 FROM job_log j
 INNER JOIN node n
 ON j.node_id = n.node_id
 WHERE n.name
 LIKE '56x%'
 AND n.type='H'
 AND n.usage='TEST'
 AND j.job_name = 'COPY FILES'
 AND j.job_start = '2005-11-14 00:00:00'
 AND (j.job_stop = '2005-11-22 09:31:10' OR j.job_stop IS NULL))
 ORDER BY n.name

Do you have any indexes?

regards
Claus

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Very slow queries - please help.

2005-11-24 Thread Bealach-na Bo


Hi,

Thanks for your comments. I've explicitly made any indexes, but the
default ones are:



[EMAIL PROTECTED] \di
  List of relations
Schema  |  Name   | Type  |  Owner  |  Table
-+-+---+-+-
user | job_log_id_pkey | index | user | job_log
user | node_id_pkey| index | user | node
user | node_name_key   | index | user | node
(3 rows)



I'm also sending the EXPLAIN outputs.





 explain SELECT n.name,n.type,
n.usage, j.status,
j.job_start,j.job_stop,
j.nfiles_in_job,j.job_name
 FROM job_log j
 INNER JOIN node n
 ON j.node_id = n.node_id
 WHERE n.name
 LIKE '56x%'
 AND n.type = 'K'
 AND n.usage = 'LIVE'
 AND j.job_name = 'COPY FILES'
 AND j.job_start = '2005-11-14 00:00:00'
 AND (j.job_stop = '2005-11-14 05:00:00' OR j.job_stop IS NULL)
 ORDER BY n.name;



 QUERY PLAN

--
Nested Loop  (cost=0.00..75753.31 rows=1 width=461)
  Join Filter: (inner.node_id = outer.node_id)
  -  Index Scan using node_name_key on node n  (cost=0.00..307.75 rows=1 
width=181)
Filter: ((name ~~ '56x%'::text) AND (type = 'K'::bpchar) AND 
(usage = 'LIVE'::bpchar))

  -  Seq Scan on job_log j  (cost=0.00..75445.54 rows=1 width=288)
Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start = 
'2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop = 
'2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL)))

(6 rows)


 explain SELECT n.name, n.type, n.usage
 FROM node n
 WHERE n.name
 LIKE '56x%'
 AND n.type  = 'K'
 AND n.usage = 'LIVE'
 AND n.node_id
 NOT IN
 (SELECT n.node_id
 FROM job_log j
 INNER JOIN node n
 ON j.node_id = n.node_id
 WHERE n.name
 LIKE '56x%'
 AND n.type  = 'K'
 AND n.usage = 'LIVE'
 AND j.job_name = 'COPY FILES'
 AND j.job_start = '2005-11-14 00:00:00'
 AND (j.job_stop = '2005-11-14 05:00:00' OR j.job_stop IS NULL))
 ORDER BY n.name;







 QUERY PLAN

--
Index Scan using node_name_key on node n  (cost=75451.55..75764.94 rows=1 
width=177)
  Filter: ((name ~~ '56x%'::text) AND (type = 'K'::bpchar) AND (usage = 
'LIVE'::bpchar) AND (NOT (hashed subplan)))

  SubPlan
-  Nested Loop  (cost=0.00..75451.54 rows=1 width=4)
  -  Seq Scan on job_log j  (cost=0.00..75445.54 rows=1 width=4)
Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start 
= '2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop = 
'2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL)))
  -  Index Scan using node_id_pkey on node n  (cost=0.00..5.99 
rows=1 width=4)

Index Cond: (outer.node_id = n.node_id)
Filter: ((name ~~ '56x%'::text) AND (type = 'K'::bpchar) 
AND (usage = 'LIVE'::bpchar))



Yours,

Bealach



From: Claus Guttesen [EMAIL PROTECTED]
To: Bealach-na Bo [EMAIL PROTECTED]
CC: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very slow queries - please help.
Date: Thu, 24 Nov 2005 14:23:38 +0100

 Typical query
 

 SELECT n.name
 FROM node n
 WHERE n.name
 LIKE '56x%'
 AND n.type='H'
 AND n.usage='TEST'
 AND n.node_id
 NOT IN
 (select n.node_id
 FROM job_log j
 INNER JOIN node n
 ON j.node_id = n.node_id
 WHERE n.name
 LIKE '56x%'
 AND n.type='H'
 AND n.usage='TEST'
 AND j.job_name = 'COPY FILES'
 AND j.job_start = '2005-11-14 00:00:00'
 AND (j.job_stop = '2005-11-22 09:31:10' OR j.job_stop IS NULL))
 ORDER BY n.name

Do you have any indexes?

regards
Claus




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Very slow queries - please help.

2005-11-24 Thread Guillaume Smet

Hi,


I'm also sending the EXPLAIN outputs.


Please provide EXPLAIN ANALYZE outputs instead of EXPLAIN. You will have 
more information.


Indexes on your tables are obviously missing. You should try to add:

CREATE INDEX idx_node_filter ON node(name, type, usage);
CREATE INDEX idx_job_log_filter ON job_log(job_name, job_start, job_stop);

I'm not so sure it's a good idea to add job_stop in this index as you 
have an IS NULL in your query so I'm not sure it can be used. You should 
try it anyway and remove it if not needed.


I added all your search fields in the indexes but it depends a lot on 
the selectivity of your conditions. I don't know your data but I think 
you understand the idea.


HTH

--
Guillaume

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Very slow queries - please help.

2005-11-24 Thread Tom Lane
Bealach-na Bo [EMAIL PROTECTED] writes:
 I'm having great difficulties getting the performance I had hoped for
 from Postgresql 8.0. The typical query below takes ~20 minutes !!

You need to show us the table definition (including indexes) and the
EXPLAIN ANALYZE results for the query.

It seems likely that the NOT IN is the source of your problems,
but it's hard to be sure without EXPLAIN results.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Very slow queries - please help

2005-11-24 Thread Bealach-na Bo

OK.

The consensus seems to be that I need more indexes and I also need to
look into the NOT IN statement as a possible bottleneck. I've
introduced the indexes which has led to a DRAMATIC change in response
time. Now I have to experiment with INNER JOIN - OUTER JOIN
variations, SET ENABLE_SEQSCAN=OFF.

Forgive me for not mentioning each person individually and by name.
You have all contributed to confirming what I had suspected (and
hoped): that *I* have a lot to learn!

I'm attaching table descriptions, the first few lines of top output
while the queries were running, index lists, sample queries and
EXPLAIN ANALYSE output BEFORE and AFTER the introduction of the
indexes. As I said, DRAMATIC :) I notice that the CPU usage does not
vary very much, it's nearly 100% anyway, but the memory usage drops
markedly, which is another very nice result of the index introduction.

Any more comments and tips would be very welcome.

Thank you all for your input.

Bealach.




[EMAIL PROTECTED] \d job_log
Table blouser.job_log
Column |Type |Modifiers
+-+--
job_log_id | integer | not null default 
nextval('job_log_id_seq'::text)

first_registry | timestamp without time zone |
blogger_name   | character(50)   |
node_id| integer |
job_type   | character(50)   |
job_name   | character(256)  |
job_start  | timestamp without time zone |
job_timeout| interval|
job_stop   | timestamp without time zone |
nfiles_in_job  | integer |
status | integer |
error_code | smallint|
Indexes:
   job_log_id_pkey PRIMARY KEY, btree (job_log_id)
Check constraints:
   job_log_status_check CHECK (status = 0 OR status = 1 OR status = 8 OR 
status = 9)

Foreign-key constraints:
   legal_node FOREIGN KEY (node_id) REFERENCES node(node_id)





[EMAIL PROTECTED] \d node
 Table blouser.node
Column  | Type  |   Modifiers
-+---+---
node_id | integer   | not null default nextval('node_id_seq'::text)
name| character(50) |
type| character(1)  |
usage   | character(4)  |
Indexes:
   node_id_pkey PRIMARY KEY, btree (node_id)
   node_name_key UNIQUE, btree (name)
Check constraints:
   node_type_check CHECK (type = 'B'::bpchar OR type = 'K'::bpchar OR 
type = 'C'::bpchar OR type = 'T'::bpchar OR type = 'R'::bpchar)
   node_usage_check CHECK (usage = 'TEST'::bpchar OR usage = 
'LIVE'::bpchar)



#before new indexes were created


Tasks: 114 total,   2 running, 112 sleeping,   0 stopped,   0 zombie
Cpu(s): 25.7% us, 24.5% sy,  0.0% ni, 49.4% id,  0.3% wa,  0.0% hi,  0.0% si
Mem:   1554788k total,  1513576k used,41212k free,31968k buffers
Swap:  1020024k total,27916k used,   992108k free,   708728k cached

 PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
25883 postgres  25   0 20528  12m  11m R 99.7  0.8   4:54.91 postmaster





[EMAIL PROTECTED] \di
  List of relations
Schema  |  Name   | Type  |  Owner  |  Table
-+-+---+-+-
blouser | job_log_id_pkey | index | blouser | job_log
blouser | node_id_pkey| index | blouser | node
blouser | node_name_key   | index | blouser | node
(3 rows)


 EXPLAIN ANALYSE SELECT n.name,n.type,
n.usage, j.status,
j.job_start,j.job_stop,
j.nfiles_in_job,j.job_name
 FROM job_log j
 INNER JOIN node n
 ON j.node_id = n.node_id
 WHERE n.name
 LIKE '711%'
 AND n.type = 'K'
 AND n.usage = 'LIVE'
 AND j.job_name = 'COPY FILES'
 AND j.job_start = '2005-11-14 00:00:00'
 AND (j.job_stop = '2005-11-14 05:00:00' OR j.job_stop IS NULL)
 ORDER BY n.name;


 QUERY PLAN

---
Nested Loop  (cost=0.00..75753.31 rows=1 width=461) (actual 
time=270486.692..291662.350 rows=3 loops=1)

  Join Filter: (inner.node_id = outer.node_id)
  -  Index Scan using node_name_key on node n  (cost=0.00..307.75 rows=1 
width=181) (actual time=0.135..11.034 rows=208 loops=1)
Filter: ((name ~~ '711%'::text) AND (type = 'K'::bpchar) AND 
(usage = 'LIVE'::bpchar))
  -  Seq Scan on job_log j  (cost=0.00..75445.54 rows=1 width=288) (actual 
time=273.374..1402.089 rows=22 loops=208)
Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start = 
'2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop = 
'2005-11-14 

[PERFORM] Very slow queries - please help

2005-11-24 Thread Bealach-na Bo

A quick note to say that I'm very grateful for Tom Lane's input also.
Tom, I did put you on the list of recipients for my last posting to
pgsql-performance, but got:


cut here
This is an automatically generated Delivery Status Notification.

Delivery to the following recipients failed.

  [EMAIL PROTECTED]



Many regards,

Bealach



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly