Re: [PERFORM] Tsearch2 really slower than ilike ?

2004-11-18 Thread Hervé Piedvache
Oleg,

On Wednesday 17 November 2004 at 18:23, Oleg Bartunov wrote:
  Sorry but when I do your request I get :
  # select id_site from site where idx_site_name @@  'livejourn';
  ERROR:  type   does not exist

 no idea :) btw, what version of postgresql and OS are you running?
 Could you try a minimal test - check the sql commands from the tsearch2 sources,
 and some basic queries from the tsearch2 documentation and tutorials.

 btw, your query should look like
 select id_site from site_rss where idx_site_name @@ 'livejourn';
  

 How did you run your queries at all ? I mean your first message about
 poor tsearch2 performance.

I don't know what happened yesterday ... it's running now ...

You sent me :
zz=# explain analyze select id_site from site_rss where idx_site_name 
@@  'livejourn';
                                                             QUERY PLAN 
---
  Index Scan using ix_idx_site_name on site_rss  (cost=0.00..733.62 rows=184 
width=4) (actual time=0.339..39.183 rows=1737 loops=1)
    Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery)
    Filter: (idx_site_name @@ '\'livejourn\''::tsquery)
  Total runtime: 40.997 ms
(4 rows)

It's really fast ! So, I don't understand your problem. 
I ran the query on my desktop machine, nothing special.


I get this :
   QUERY PLAN
-
 Index Scan using ix_idx_site_name on site_rss s  (cost=0.00..574.19 rows=187 
width=24) (actual time=105.097..7157.277 rows=388 loops=1)
   Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery)
   Filter: (idx_site_name @@ '\'livejourn\''::tsquery)
 Total runtime: 7158.576 ms
(4 rows)

With the ilike I get :
 QUERY PLAN

 Seq Scan on site_rss s  (cost=0.00..8360.23 rows=1 width=24) (actual 
time=8.195..879.440 rows=404 loops=1)
   Filter: (site_name ~~* '%livejourn%'::text)
 Total runtime: 882.600 ms
(3 rows)
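
[For anyone reproducing the comparison above, a minimal sketch of the tsearch2 setup these plans imply. The table, column and index names come from the thread; the to_tsvector()/to_tsquery() calls and the use of the default configuration are assumptions.]

  -- sketch only: assumes the tsearch2 contrib module is already installed in this database
  ALTER TABLE site_rss ADD COLUMN idx_site_name tsvector;
  UPDATE site_rss SET idx_site_name = to_tsvector(site_name);
  CREATE INDEX ix_idx_site_name ON site_rss USING gist (idx_site_name);
  VACUUM ANALYZE site_rss;

  -- both forms should use the GiST index; the bare-string form relies on the
  -- implicit cast to tsquery seen in the plans above
  EXPLAIN ANALYZE SELECT id_site FROM site_rss WHERE idx_site_name @@ to_tsquery('livejourn');
  EXPLAIN ANALYZE SELECT id_site FROM site_rss WHERE idx_site_name @@ 'livejourn';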

I don't know what your desktop is ... but I'm using PostgreSQL 7.4.6 on
Debian Woody, on a dual-PIII 933 MHz PC with 1 GB of memory ... the server is
dedicated to this database ... !!

I have no idea !

Regards,

-- 
Hervé Piedvache

Elma Ingénierie Informatique
6 rue du Faubourg Saint-Honoré
F-75008 - Paris - France
Pho. 33-144949901
Fax. 33-144949902



Re: [PERFORM] Tsearch2 really slower than ilike ?

2004-11-18 Thread Oleg Bartunov
Have you run 'vacuum analyze' ?
my desktop is very simple PIII, 512 MB RAM.
Oleg

On Thu, 18 Nov 2004, Hervé Piedvache wrote:
Oleg,
On Wednesday 17 November 2004 at 18:23, Oleg Bartunov wrote:
Sorry but when I do your request I get :
# select id_site from site where idx_site_name @@ 'livejourn';
ERROR: type  does not exist
no idea :) btw, what version of postgresql and OS you're running.
Could you try minimal test - check sql commands from tsearch2 sources,
some basic queries from tsearch2 documentation, tutorials.
btw, your query should look like
select id_site from site_rss where idx_site_name @@ 'livejourn';
 
How did you run your queries at all ? I mean your first message about
poor tsearch2 performance.
I don't know what happened yesterday ... it's running now ...
You sent me :
zz=# explain analyze select id_site from site_rss where idx_site_name
@@ 'livejourn';
  QUERY PLAN
---
 Index Scan using ix_idx_site_name on site_rss (cost=0.00..733.62 rows=184
width=4) (actual time=0.339..39.183 rows=1737 loops=1)
  Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery)
  Filter: (idx_site_name @@ '\'livejourn\''::tsquery)
 Total runtime: 40.997 ms
(4 rows)
It's really fast ! So, I don't understand your problem.
I ran the query on my desktop machine, nothing special.

I get this :
  QUERY PLAN
-
Index Scan using ix_idx_site_name on site_rss s  (cost=0.00..574.19 rows=187
width=24) (actual time=105.097..7157.277 rows=388 loops=1)
  Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery)
  Filter: (idx_site_name @@ '\'livejourn\''::tsquery)
Total runtime: 7158.576 ms
(4 rows)
With the ilike I get :
QUERY PLAN

Seq Scan on site_rss s  (cost=0.00..8360.23 rows=1 width=24) (actual
time=8.195..879.440 rows=404 loops=1)
  Filter: (site_name ~~* '%livejourn%'::text)
Total runtime: 882.600 ms
(3 rows)
I don't know what your desktop is ... but I'm using PostgreSQL 7.4.6 on
Debian Woody, on a dual-PIII 933 MHz PC with 1 GB of memory ... the server is
dedicated to this database ... !!
I have no idea !
Regards,

Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: [PERFORM] Tsearch2 really slower than ilike ?

2004-11-18 Thread Hervé Piedvache
On Thursday 18 November 2004 at 10:37, Oleg Bartunov wrote:
 Have you run 'vacuum analyze' ?

Yep, every night a VACUUM FULL VERBOSE ANALYZE; of the whole database!

 my desktop is very simple PIII, 512 MB RAM.
  Oleg

YOU send strange characters ! ;o)

 On Thu, 18 Nov 2004, Hervé Piedvache wrote:
  Oleg,
 
  On Wednesday 17 November 2004 at 18:23, Oleg Bartunov wrote:
  Sorry but when I do your request I get :
  # select id_site from site where idx_site_name @@  'livejourn';
  ERROR:  type   does not exist
 
  no idea :) btw, what version of postgresql and OS you're running.
  Could you try minimal test - check sql commands from tsearch2 sources,
  some basic queries from tsearch2 documentation, tutorials.
 
  btw, your query should look like
  select id_site from site_rss where idx_site_name @@ 'livejourn';
   
 
  How did you run your queries at all ? I mean your first message about
  poor tsearch2 performance.
 
  I don't know what happened yesterday ... it's running now ...
 
  You sent me :
  zz=# explain analyze select id_site from site_rss where idx_site_name
  @@  'livejourn';
                                                               QUERY PLAN
  ---------------------------------------------------------------------------------------------------------------------------------
   Index Scan using ix_idx_site_name on site_rss  (cost=0.00..733.62 rows=184 width=4) (actual time=0.339..39.183 rows=1737 loops=1)
     Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery)
     Filter: (idx_site_name @@ '\'livejourn\''::tsquery)
   Total runtime: 40.997 ms
  (4 rows)
 
  It's really fast ! So, I don't understand your problem.
  I ran the query on my desktop machine, nothing special.
 
  I get this :
                                                               QUERY PLAN
  ---------------------------------------------------------------------------------------------------------------------------------
   Index Scan using ix_idx_site_name on site_rss s  (cost=0.00..574.19 rows=187 width=24) (actual time=105.097..7157.277 rows=388 loops=1)
     Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery)
     Filter: (idx_site_name @@ '\'livejourn\''::tsquery)
   Total runtime: 7158.576 ms
  (4 rows)
 
  With the ilike I get :
                                      QUERY PLAN
  ----------------------------------------------------------------------------------------------------------
   Seq Scan on site_rss s  (cost=0.00..8360.23 rows=1 width=24) (actual time=8.195..879.440 rows=404 loops=1)
     Filter: (site_name ~~* '%livejourn%'::text)
   Total runtime: 882.600 ms
  (3 rows)
 
  I don't know what your desktop is ... but I'm using PostgreSQL 7.4.6 on
  Debian Woody, on a dual-PIII 933 MHz PC with 1 GB of memory ... the server
  is dedicated to this database ... !!
 
  I have no idea !
 
  Regards,

   Regards,
   Oleg
 _
 Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
 Sternberg Astronomical Institute, Moscow University (Russia)
 Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
 phone: +007(095)939-16-83, +007(095)939-23-83
 ---(end of broadcast)---
 TIP 8: explain analyze is your friend

-- 
Hervé Piedvache

Elma Ingénierie Informatique
6 rue du Faubourg Saint-Honoré
F-75008 - Paris - France
Pho. 33-144949901
Fax. 33-144949902



Re: [PERFORM] Tsearch2 really slower than ilike ?

2004-11-18 Thread Oleg Bartunov
Blin !
what's happening with my terminal when I read messages from this guy ?
I don't even know how to call him - I see just Herv?
Oleg

On Thu, 18 Nov 2004, Hervé Piedvache wrote:
On Thursday 18 November 2004 at 10:37, Oleg Bartunov wrote:
Have you run 'vacuum analyze' ?
Yep, every night a VACUUM FULL VERBOSE ANALYZE; of the whole database!
my desktop is very simple PIII, 512 MB RAM.
Oleg

YOU send strange characters ! ;o)
On Thu, 18 Nov 2004, Hervé Piedvache wrote:
Oleg,
On Wednesday 17 November 2004 at 18:23, Oleg Bartunov wrote:
Sorry but when I do your request I get :
# select id_site from site where idx_site_name @@ 'livejourn';
ERROR: type  does not exist
no idea :) btw, what version of postgresql and OS you're running.
Could you try minimal test - check sql commands from tsearch2 sources,
some basic queries from tsearch2 documentation, tutorials.
btw, your query should look like
select id_site from site_rss where idx_site_name @@ 'livejourn';
 
How did you run your queries at all ? I mean your first message about
poor tsearch2 performance.
I don't know what happened yesterday ... it's running now ...
You sent me :
zz=# explain analyze select id_site from site_rss where idx_site_name
@@ 'livejourn';
                                                             QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Index Scan using ix_idx_site_name on site_rss  (cost=0.00..733.62 rows=184 width=4) (actual time=0.339..39.183 rows=1737 loops=1)
   Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery)
   Filter: (idx_site_name @@ '\'livejourn\''::tsquery)
 Total runtime: 40.997 ms
(4 rows)
It's really fast ! So, I don't understand your problem.
I ran the query on my desktop machine, nothing special.
I get this :
                                                             QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Index Scan using ix_idx_site_name on site_rss s  (cost=0.00..574.19 rows=187 width=24) (actual time=105.097..7157.277 rows=388 loops=1)
   Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery)
   Filter: (idx_site_name @@ '\'livejourn\''::tsquery)
 Total runtime: 7158.576 ms
(4 rows)
With the ilike I get :
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on site_rss s  (cost=0.00..8360.23 rows=1 width=24) (actual time=8.195..879.440 rows=404 loops=1)
   Filter: (site_name ~~* '%livejourn%'::text)
 Total runtime: 882.600 ms
(3 rows)
I don't know what your desktop is ... but I'm using PostgreSQL 7.4.6 on
Debian Woody, on a dual-PIII 933 MHz PC with 1 GB of memory ... the server
is dedicated to this database ... !!
I have no idea !
Regards,
Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


[PERFORM] Query Performance and IOWait

2004-11-18 Thread Andrew Janian
Hello All,

I have a setup with a Dell Poweredge 2650 with Red Hat and Postgres 7.4.5 with 
a database with about 27GB of data.  The table in question has about 35 million 
rows.

I am running the following query:

SELECT *
FROM mb_fix_message
WHERE msg_client_order_id IN (
SELECT msg_client_order_id
FROM mb_fix_message
WHERE msg_log_time >= '2004-06-01'
AND msg_log_time < '2004-06-01 13:30:00.000'
AND msg_message_type IN ('D','G')
AND mb_ord_type = '1'
)
AND msg_log_time > '2004-06-01'
AND msg_log_time < '2004-06-01 23:59:59.999'
AND msg_message_type = '8'
AND (mb_raw_text LIKE '%39=1%' OR mb_raw_text LIKE '%39=2%');

with the following plan:




 QUERY PLAN
Nested Loop IN Join  (cost=0.00..34047.29 rows=1 width=526)
  ->  Index Scan using mfi_log_time on mb_fix_message  (cost=0.00..22231.31 rows=2539 width=526)
        Index Cond: ((msg_log_time > '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 23:59:59.999'::timestamp without time zone))
        Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text)))
  ->  Index Scan using mfi_client_ordid on mb_fix_message  (cost=0.00..445.56 rows=1 width=18)
        Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text)
        Filter: ((msg_log_time >= '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 13:30:00'::timestamp without time zone) AND (((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text))

While running, this query produces 100% iowait usage on its processor and takes
an ungodly amount of time (about an hour).

The postgres settings are as follows:

shared_buffers = 32768  # min 16, at least max_connections*2, 8KB each
sort_mem = 262144   # min 64, size in KB

And the /etc/sysctl.conf has:
kernel.shmall = 274235392
kernel.shmmax = 274235392

The system has 4GB of RAM.

I am pretty sure of these settings, but only from my reading of the docs and 
others' recommendations online.

Thanks,

Andrew Janian
OMS Development
Scottrade Financial Services
(314) 965-1555 x 1513
Cell: (314) 369-2083



Re: [PERFORM] Query Performance and IOWait

2004-11-18 Thread Richard_D_Levine
Andrew,

It seems that you could combine the subquery's WHERE clause with the main
query's to produce a simpler query, i.e. one without a subquery.

Rick
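
[For illustration, one way to phrase that flattening: join against a de-duplicated derived table instead of using IN. This is a hypothetical sketch, not Rick's exact suggestion; it keeps the semantics of the original IN query (the comparison operators are filled in from the date ranges in the original post). As Andrew notes in his follow-up, simply merging the two WHERE clauses would return different rows.]

  SELECT m.*
  FROM mb_fix_message m
  JOIN (SELECT DISTINCT msg_client_order_id
          FROM mb_fix_message
         WHERE msg_log_time >= '2004-06-01'
           AND msg_log_time < '2004-06-01 13:30:00.000'
           AND msg_message_type IN ('D','G')
           AND mb_ord_type = '1') o
    ON m.msg_client_order_id = o.msg_client_order_id
  WHERE m.msg_log_time > '2004-06-01'
    AND m.msg_log_time < '2004-06-01 23:59:59.999'
    AND m.msg_message_type = '8'
    AND (m.mb_raw_text LIKE '%39=1%' OR m.mb_raw_text LIKE '%39=2%');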





 
From: Andrew Janian [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
Subject: [PERFORM] Query Performance and IOWait
Date: 11/18/2004 08:42 AM




Hello All,

I have a setup with a Dell Poweredge 2650 with Red Hat and Postgres 7.4.5
with a database with about 27GB of data.  The table in question has about
35 million rows.

I am running the following query:

SELECT *
FROM mb_fix_message
WHERE msg_client_order_id IN (
 SELECT msg_client_order_id
 FROM mb_fix_message
 WHERE msg_log_time >= '2004-06-01'
 AND msg_log_time < '2004-06-01 13:30:00.000'
 AND msg_message_type IN ('D','G')
 AND mb_ord_type = '1'
 )
 AND msg_log_time > '2004-06-01'
 AND msg_log_time < '2004-06-01 23:59:59.999'
 AND msg_message_type = '8'
 AND (mb_raw_text LIKE '%39=1%' OR mb_raw_text LIKE '%39=2%');

with the following plan:

QUERY PLAN
Nested Loop IN Join  (cost=0.00..34047.29 rows=1 width=526)
  ->  Index Scan using mfi_log_time on mb_fix_message  (cost=0.00..22231.31 rows=2539 width=526)
        Index Cond: ((msg_log_time > '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 23:59:59.999'::timestamp without time zone))
        Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text)))
  ->  Index Scan using mfi_client_ordid on mb_fix_message  (cost=0.00..445.56 rows=1 width=18)
        Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text)
        Filter: ((msg_log_time >= '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 13:30:00'::timestamp without time zone) AND (((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text))

While running, this query produces 100% iowait usage on its processor and
takes an ungodly amount of time (about an hour).

The postgres settings are as follows:

shared_buffers = 32768  # min 16, at least max_connections*2, 8KB
each
sort_mem = 262144   # min 64, size in KB

And the /etc/sysctl.conf has:
kernel.shmall = 274235392
kernel.shmmax = 274235392

The system has 4GB of RAM.

I am pretty sure of these settings, but only from my reading of the docs
and others' recommendations online.

Thanks,

Andrew Janian
OMS Development
Scottrade Financial Services
(314) 965-1555 x 1513
Cell: (314) 369-2083







Re: [PERFORM] Query Performance and IOWait

2004-11-18 Thread Andrew Janian
Actually, unfortunately, that won't work.  The subquery gets a list of message 
IDs and then the outer query gets the responses to those messages.

Also, I dumped this data and imported it all into MS SQL Server and then ran it
there.  The query ran in 2s.

Andrew

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
Sent: Thursday, November 18, 2004 7:57 AM
To: Andrew Janian
Cc: [EMAIL PROTECTED];
[EMAIL PROTECTED]
Subject: Re: [PERFORM] Query Performance and IOWait


Andrew,

It seems that you could combine the subquery's WHERE clause with the main
query's to produce a simpler query, i.e. one without a subquery.

Rick





 
From: Andrew Janian [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
Subject: [PERFORM] Query Performance and IOWait
Date: 11/18/2004 08:42 AM




Hello All,

I have a setup with a Dell Poweredge 2650 with Red Hat and Postgres 7.4.5
with a database with about 27GB of data.  The table in question has about
35 million rows.

I am running the following query:

SELECT *
FROM mb_fix_message
WHERE msg_client_order_id IN (
 SELECT msg_client_order_id
 FROM mb_fix_message
 WHERE msg_log_time >= '2004-06-01'
 AND msg_log_time < '2004-06-01 13:30:00.000'
 AND msg_message_type IN ('D','G')
 AND mb_ord_type = '1'
 )
 AND msg_log_time > '2004-06-01'
 AND msg_log_time < '2004-06-01 23:59:59.999'
 AND msg_message_type = '8'
 AND (mb_raw_text LIKE '%39=1%' OR mb_raw_text LIKE '%39=2%');

with the following plan:

QUERY PLAN
Nested Loop IN Join  (cost=0.00..34047.29 rows=1 width=526)
  ->  Index Scan using mfi_log_time on mb_fix_message  (cost=0.00..22231.31 rows=2539 width=526)
        Index Cond: ((msg_log_time > '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 23:59:59.999'::timestamp without time zone))
        Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text)))
  ->  Index Scan using mfi_client_ordid on mb_fix_message  (cost=0.00..445.56 rows=1 width=18)
        Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text)
        Filter: ((msg_log_time >= '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 13:30:00'::timestamp without time zone) AND (((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text))

While running, this query produces 100% iowait usage on its processor and
takes an ungodly amount of time (about an hour).

The postgres settings are as follows:

shared_buffers = 32768  # min 16, at least max_connections*2, 8KB
each
sort_mem = 262144   # min 64, size in KB

And the /etc/sysctl.conf has:
kernel.shmall = 274235392
kernel.shmmax = 274235392

The system has 4GB of RAM.

I am pretty sure of these settings, but only from my reading of the docs
and others' recommendations online.

Thanks,

Andrew Janian
OMS Development
Scottrade Financial Services
(314) 965-1555 x 1513
Cell: (314) 369-2083







Re: [PERFORM] Query Performance and IOWait

2004-11-18 Thread Woody Woodring
 
Andrew,

What version of Red Hat are you running?  We have found that running the Enterprise
Update 3 kernel kills our Dell boxes with IOWait, on both NFS and local disk
traffic.  The Update 2 kernel does not seem to have the issue, and we are in the
process of trying the Update 4 beta to see if it is better.

Woody

iGLASS Networks
www.iglass.net

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Andrew Janian
Sent: Thursday, November 18, 2004 9:02 AM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [PERFORM] Query Performance and IOWait

Actually, unfortunately, that won't work.  The subquery gets a list of
message IDs and then the outer query gets the responses to those messages.

Also, I dumped this data and imported it all to ms sql server and then ran
it there.  The query ran in 2s.

Andrew

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
Sent: Thursday, November 18, 2004 7:57 AM
To: Andrew Janian
Cc: [EMAIL PROTECTED];
[EMAIL PROTECTED]
Subject: Re: [PERFORM] Query Performance and IOWait


Andrew,

It seems that you could combine the subquery's WHERE clause with the main
query's to produce a simpler query, i.e. one without a subquery.

Rick




 

From: Andrew Janian [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
Subject: [PERFORM] Query Performance and IOWait
Date: 11/18/2004 08:42 AM





Hello All,

I have a setup with a Dell Poweredge 2650 with Red Hat and Postgres 7.4.5
with a database with about 27GB of data.  The table in question has about
35 million rows.

I am running the following query:

SELECT *
FROM mb_fix_message
WHERE msg_client_order_id IN (
 SELECT msg_client_order_id
 FROM mb_fix_message
 WHERE msg_log_time >= '2004-06-01'
 AND msg_log_time < '2004-06-01 13:30:00.000'
 AND msg_message_type IN ('D','G')
 AND mb_ord_type = '1'
 )
 AND msg_log_time > '2004-06-01'
 AND msg_log_time < '2004-06-01 23:59:59.999'
 AND msg_message_type = '8'
 AND (mb_raw_text LIKE '%39=1%' OR mb_raw_text LIKE '%39=2%');

with the following plan:

QUERY PLAN
Nested Loop IN Join  (cost=0.00..34047.29 rows=1 width=526)
  ->  Index Scan using mfi_log_time on mb_fix_message  (cost=0.00..22231.31 rows=2539 width=526)
        Index Cond: ((msg_log_time > '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 23:59:59.999'::timestamp without time zone))
        Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text)))
  ->  Index Scan using mfi_client_ordid on mb_fix_message  (cost=0.00..445.56 rows=1 width=18)
        Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text)
        Filter: ((msg_log_time >= '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 13:30:00'::timestamp without time zone) AND (((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text))

While running, this query produces 100% iowait usage on its processor and
takes an ungodly amount of time (about an hour).

The postgres settings are as follows:

shared_buffers = 32768  # min 16, at least max_connections*2, 8KB
each
sort_mem = 262144   # min 64, size in KB

And the /etc/sysctl.conf has:
kernel.shmall = 274235392
kernel.shmmax = 274235392

The system has 4GB of RAM.

I am pretty sure of these settings, but only from my reading of the docs and
others' recommendations online.

Thanks,

Andrew Janian
OMS Development
Scottrade Financial Services
(314) 965-1555 x 1513
Cell: (314) 369-2083










Re: [PERFORM] Query Performance and IOWait

2004-11-18 Thread Andrew Janian
I have run ANALYZE right before running this query.

I will run EXPLAIN ANALYZE when I can.  I started running the query when I sent
the first email and it is still running.  Looks like it will take longer than an hour.

I will post the results of EXPLAIN ANALYZE in a few hours when I get them.

Thanks for all your help,

Andrew

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 18, 2004 9:40 AM
To: Andrew Janian
Cc: [EMAIL PROTECTED]
Subject: Re: [PERFORM] Query Performance and IOWait 


Andrew Janian [EMAIL PROTECTED] writes:
   
   
   
QUERY PLAN
 Nested Loop IN Join  (cost=0.00..34047.29 rows=1 width=526)
   ->  Index Scan using mfi_log_time on mb_fix_message  (cost=0.00..22231.31 rows=2539 width=526)
         Index Cond: ((msg_log_time > '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 23:59:59.999'::timestamp without time zone))
         Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text)))
   ->  Index Scan using mfi_client_ordid on mb_fix_message  (cost=0.00..445.56 rows=1 width=18)
         Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text)
         Filter: ((msg_log_time >= '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 13:30:00'::timestamp without time zone) AND (((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text))

 While running, this query produces 100% iowait usage on its processor and
 takes an ungodly amount of time (about an hour).

This plan looks fairly reasonable if the rowcount estimates are
accurate.  Have you ANALYZEd the table lately?  You might need to
bump up the statistics target for the msg_log_time column to improve
the quality of the estimates.  It would be useful to see EXPLAIN
ANALYZE results too (yes I know it'll take you an hour to get them...)

regards, tom lane
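
[A sketch of what Tom's suggestion could look like in practice; the table and column names are from the thread, and the target value of 200 is an arbitrary example (the 7.4 default is 10).]

  ALTER TABLE mb_fix_message ALTER COLUMN msg_log_time SET STATISTICS 200;
  ANALYZE mb_fix_message;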



Re: [PERFORM] query plan question

2004-11-18 Thread David Parker
What I think is happening with the missing pg_statistic entries:

The install of our application involves a lot of data importing (via
JDBC) in one large transaction, which can take up to 30 minutes. (I
realize I left out this key piece of info in my original post...)

The pg_autovacuum logic is relying on data from pg_stat_all_tables to
make the decision about running analyze. As far as I can tell, the data
in this view gets updated outside of the transaction, because I saw the
numbers growing while I was importing. I saw pg_autovacuum log messages
for running analyze on several tables, but no statistics data showed up
for these, I assume because the actual data in the table wasn't yet
visible to pg_autovacuum because the import transaction had not finished
yet.

When the import finished, not all of the tables affected by the import
were re-visited because they had not bumped up over the threshold again,
even though the analyze run for those tables had not generated any stats
because of the still-open transaction.

Am I making the correct assumptions about the way the various pieces
work? Does this scenario make sense?

It's easy enough for us to kick off a vacuum/analyze at the end of a
long import - but this mysterious behavior was bugging me!

Thanks.

- DAP 
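
[A sketch of the post-import step David describes, issued after the bulk import transaction commits so the new rows are visible to the statistics collector; the table names are the ones mentioned in the thread, quoted where they collide with keywords.]

  ANALYZE "schema";
  ANALYZE "usage";
  ANALYZE usageparameter;
  ANALYZE flow;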

-Original Message-
From: Matthew T. O'Connor [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, November 17, 2004 2:02 PM
To: David Parker
Cc: Tom Lane; Jeff; Russell Smith; [EMAIL PROTECTED]
Subject: Re: [PERFORM] query plan question

Well, based on the autovacuum log that you attached, all of those tables
are insert-only (at least during the time period included in the log).
Is that correct?  If so, autovacuum will never do a vacuum
(unless required by xid wraparound issues) on those tables.
So this doesn't appear to be an autovacuum problem.  I'm not
sure about the missing pg_statistic entries; anyone else care
to field that one?

Matthew


David Parker wrote:

Thanks. The tables I'm concerned with are named: 'schema', 'usage', 
'usageparameter', and 'flow'. It looks like autovacuum is performing
analyzes:

% grep Performing:  logs/.db.tazz.vacuum.log
[2004-11-17 12:05:58 PM] Performing: ANALYZE 
public.scriptlibrary_library
[2004-11-17 12:15:59 PM] Performing: ANALYZE 
public.scriptlibraryparm
[2004-11-17 12:15:59 PM] Performing: ANALYZE public.usageparameter
[2004-11-17 12:21:00 PM] Performing: ANALYZE public.usageproperty
[2004-11-17 12:21:00 PM] Performing: ANALYZE public.route
[2004-11-17 12:21:00 PM] Performing: ANALYZE public.usageparameter
[2004-11-17 12:21:00 PM] Performing: ANALYZE 
public.scriptlibrary_library
[2004-11-17 12:26:01 PM] Performing: ANALYZE public.usage
[2004-11-17 12:26:01 PM] Performing: ANALYZE public.usageparameter
[2004-11-17 12:31:04 PM] Performing: ANALYZE public.usageproperty
[2004-11-17 12:36:04 PM] Performing: ANALYZE public.route
[2004-11-17 12:36:04 PM] Performing: ANALYZE public.service_usage
[2004-11-17 12:36:04 PM] Performing: ANALYZE public.usageparameter

But when I run the following:

select * from pg_statistic where starelid in (select oid from 
pg_class 
where relname in
('schema','usageparameter','flow','usage'))

it returns no records. Shouldn't it? It doesn't appear to be doing a 
vacuum anywhere, which makes sense because none of these tables have 
over the default threshold of 1000. Are there statistics 
which only get 
generated by vacuum?

I've attached a gzip of the pg_autovacuum log file, with -d 3.

Thanks again.

- DAP


  

-Original Message-
From: Matthew T. O'Connor [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 17, 2004 11:41 AM
To: David Parker
Cc: Tom Lane; Jeff; Russell Smith; [EMAIL PROTECTED]
Subject: Re: [PERFORM] query plan question

David Parker wrote:



We're using postgresql 7.4.5. I've only recently put pg_autovacuum in
place as part of our installation, and I'm basically taking the
defaults. I doubt it's a problem with autovacuum itself, but rather
with my configuration of it. I have some reading to do, so any pointers
to existing autovacuum threads would be greatly appreciated!

Well, the first thing to do is increase the verbosity of the
pg_autovacuum logging output.  If you use -d2 or higher, pg_autovacuum
will print out a lot of detail on what it thinks the thresholds are and
why it is or isn't performing vacuums and analyzes.  Attach some of the
log and I'll take a look at it.








Re: [PERFORM] Query Performance and IOWait

2004-11-18 Thread Joshua D. Drake
Hello,
What is your statistics target?
What is your effective_cache_size?
Have you tried running the query as a cursor?
Sincerely,
Joshua D. Drake
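
[For readers following along, a sketch of checking those settings and of fetching the result through a cursor; the values and the predicate are placeholders, not recommendations.]

  SHOW effective_cache_size;           -- the 7.4 default is only 1000 (8 KB pages)
  SET effective_cache_size = 262144;   -- hypothetical: roughly 2 GB for a 4 GB box dedicated to Postgres

  BEGIN;
  DECLARE msgs CURSOR FOR
      SELECT * FROM mb_fix_message WHERE msg_message_type = '8';  -- placeholder predicate
  FETCH 1000 FROM msgs;
  CLOSE msgs;
  COMMIT;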

Andrew Janian wrote:
I have run ANALYZE right before running this query.
I will run EXPLAIN ANALYZE when I can.  I started running the query when I sent
the first email and it is still running.  Looks like it will take longer than an hour.
I will post the results of EXPLAIN ANALYZE in a few hours when I get them.
Thanks for all your help,
Andrew
-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 18, 2004 9:40 AM
To: Andrew Janian
Cc: [EMAIL PROTECTED]
Subject: Re: [PERFORM] Query Performance and IOWait 

Andrew Janian [EMAIL PROTECTED] writes:



QUERY PLAN
Nested Loop IN Join  (cost=0.00..34047.29 rows=1 width=526)
  ->  Index Scan using mfi_log_time on mb_fix_message  (cost=0.00..22231.31 rows=2539 width=526)
        Index Cond: ((msg_log_time > '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 23:59:59.999'::timestamp without time zone))
        Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text)))
  ->  Index Scan using mfi_client_ordid on mb_fix_message  (cost=0.00..445.56 rows=1 width=18)
        Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text)
        Filter: ((msg_log_time >= '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 13:30:00'::timestamp without time zone) AND (((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text))

While running, this query produces 100% iowait usage on its processor and takes an ungodly amount of time (about an hour).

This plan looks fairly reasonable if the rowcount estimates are
accurate.  Have you ANALYZEd the table lately?  You might need to
bump up the statistics target for the msg_log_time column to improve
the quality of the estimates.  It would be useful to see EXPLAIN
ANALYZE results too (yes I know it'll take you an hour to get them...)
regards, tom lane

--
Command Prompt, Inc., home of PostgreSQL Replication, and plPHP.
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL




[PERFORM] Interaction between Free Space Map and alternate location for a database

2004-11-18 Thread Dave Cramer
Can someone explain how the free space map deals with alternate database 
locations?

Given that the free space map is global, and it is ostensibly managing 
free disk space, how does it deal with tuples across disk locations ?

Dave
--
Dave Cramer
http://www.postgresintl.com
519 939 0336
ICQ#14675561


Re: [PERFORM] Query Performance and IOWait

2004-11-18 Thread Dave Cramer
Andrew,
Dells aren't well known for their disk performance; apparently most of
the PERC controllers sold with Dells are actually Adaptec controllers.
Also, apparently they do not come with the battery required to use the
battery-backed write cache (in fact, according to some, Dell won't
even sell the battery to you). And Dell's monitoring software is quite
a memory hog.

Have you looked at top ?, and also hdparm -Tt /dev/sd?
Dave
Andrew Janian wrote:
Hello All,
I have a setup with a Dell Poweredge 2650 with Red Hat and Postgres 7.4.5 with 
a database with about 27GB of data.  The table in question has about 35 million 
rows.
I am running the following query:
SELECT *
FROM mb_fix_message
WHERE msg_client_order_id IN (
SELECT msg_client_order_id
FROM mb_fix_message
WHERE msg_log_time >= '2004-06-01'
AND msg_log_time < '2004-06-01 13:30:00.000'
AND msg_message_type IN ('D','G')
AND mb_ord_type = '1'
)
AND msg_log_time > '2004-06-01'
AND msg_log_time < '2004-06-01 23:59:59.999'
AND msg_message_type = '8'
AND (mb_raw_text LIKE '%39=1%' OR mb_raw_text LIKE '%39=2%');
with the following plan:



QUERY PLAN
Nested Loop IN Join  (cost=0.00..34047.29 rows=1 width=526)
  ->  Index Scan using mfi_log_time on mb_fix_message  (cost=0.00..22231.31 rows=2539 width=526)
        Index Cond: ((msg_log_time > '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 23:59:59.999'::timestamp without time zone))
        Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text)))
  ->  Index Scan using mfi_client_ordid on mb_fix_message  (cost=0.00..445.56 rows=1 width=18)
        Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text)
        Filter: ((msg_log_time >= '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time < '2004-06-01 13:30:00'::timestamp without time zone) AND (((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text))
While running, this query produces 100% iowait usage on its processor and takes
an ungodly amount of time (about an hour).
The postgres settings are as follows:
shared_buffers = 32768  # min 16, at least max_connections*2, 8KB each
sort_mem = 262144   # min 64, size in KB
And the /etc/sysctl.conf has:
kernel.shmall = 274235392
kernel.shmmax = 274235392
The system has 4GB of RAM.
I am pretty sure of these settings, but only from my reading of the docs and 
others' recommendations online.
Thanks,
Andrew Janian
OMS Development
Scottrade Financial Services
(314) 965-1555 x 1513
Cell: (314) 369-2083
 

--
Dave Cramer
http://www.postgresintl.com
519 939 0336
ICQ#14675561


Re: [PERFORM] Interaction between Free Space Map and alternate location for a database

2004-11-18 Thread Josh Berkus
Dave,

 Given that the free space map is global, and it is ostensibly managing
 free disk space, how does it deal with tuples across disk locations ?

Are you talking Tablespaces?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Query Performance and IOWait

2004-11-18 Thread Josh Berkus
Woody,

 What version of Redhat are you running?   We have found running Enterprise
 Update 3 kernel kills our Dell boxes with IOWait, both NFS and local disk
 traffic.  Update 2 kernel does not seem to have the issue, and we are in
 the process of trying Update 4 beta to see if it is better.

This is interesting; do you have more to say about it?   I've been having some 
mysterious issues with RHES that I've not been able to pin down.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Interaction between Free Space Map and alternate location for a database

2004-11-18 Thread Tom Lane
Dave Cramer [EMAIL PROTECTED] writes:
 Can someone explain how the free space map deals with alternate database 
 locations?

It doesn't really care.  It identifies tables by database OID+table OID,
and where they happen to sit physically doesn't matter.

regards, tom lane



Re: [PERFORM] Interaction between Free Space Map and alternate location

2004-11-18 Thread Dave Cramer




No, have a look at the create database command;
there is a clause 'with location' that allows you to set up a separate
location for the db.

Dave

Josh Berkus wrote:

 Dave,

  Given that the free space map is global, and it is ostensibly managing
  free disk space, how does it deal with tuples across disk locations ?

 Are you talking Tablespaces?


-- 
Dave Cramer
http://www.postgresintl.com
519 939 0336
ICQ#14675561





Re: [PERFORM] Query Performance and IOWait

2004-11-18 Thread Woody Woodring
From our experience it is not just a postgres issue, but all IO with the
Update 3 kernel.

We have a box with Update 3 that queries a remote postgres database (running
RH7.3, RH3 Update 2) and writes to a file on an NFS server.  The Update 3
box does half the work with 2-3 times the load of our Update 1 and 2 boxes.
Looking at top, the box is always above 90% IO wait on the CPU.  When we
downgrade the kernel to Update 2 it seems to fix the issue.

We have several Update 3 boxes that run postgres locally and they all struggle
compared to the Update 2 boxes.

We have tried Fedora Core 3 without much more success, and we are going
to try the Update 4 beta kernel next week to see if it is any better.

There are several threads on the Taroon mailing list discussing the issue.

Woody

-Original Message-
From: Josh Berkus [mailto:[EMAIL PROTECTED] 
Sent: Thursday, November 18, 2004 1:34 PM
To: [EMAIL PROTECTED]
Cc: Woody Woodring; 'Andrew Janian'
Subject: Re: [PERFORM] Query Performance and IOWait

Woody,

 What version of Redhat are you running?   We have found running 
 Enterprise Update 3 kernel kills our Dell boxes with IOWait, both NFS 
 and local disk traffic.  Update 2 kernel does not seem to have the 
 issue, and we are in the process of trying Update 4 beta to see if it is
better.

This is interesting; do you have more to say about it?   I've been having
some 
mysterious issues with RHES that I've not been able to pin down.

--
Josh Berkus
Aglio Database Solutions
San Francisco





Re: [PERFORM] Interaction between Free Space Map and alternate location

2004-11-18 Thread Dave Cramer




Ok, so the "global" part of the fsm is just that it is in shared memory.
If certain databases have more free space they will simply take up more
of the fsm. There is no cross-database movement of tuples.
(I realized this when I tried to form my next question.)

Dave
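
[For completeness, a postgresql.conf sketch of the free space map settings being discussed; the figures are illustrative only, and per Tom's note the map covers every relation the server tracks, regardless of which location its files live in.]

  # postgresql.conf (values are placeholders)
  max_fsm_pages = 200000       # pages with reclaimable space, cluster-wide
  max_fsm_relations = 1000     # tables + indexes across all databases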

Tom Lane wrote:

 Dave Cramer [EMAIL PROTECTED] writes:

  Can someone explain how the free space map deals with alternate database
  locations?

 It doesn't really care.  It identifies tables by database OID+table OID,
 and where they happen to sit physically doesn't matter.

 			regards, tom lane



-- 
Dave Cramer
http://www.postgresintl.com
519 939 0336
ICQ#14675561





Re: [PERFORM] memcached and PostgreSQL

2004-11-18 Thread Sean Chittenden
So what does memcached offer pgsql users? It would still seem to offer
the benefit of a multi-machine cache.
Ack, I totally missed this thread.  Sorry for jumping in late.
Basically, memcached and pgmemcache offer a more technically correct
way of implementing query caching.  MySQL's query caching is a
disaster, IMHO.  memcached offloads this work from the database and
puts it elsewhere in a more optimized form.  The problem with memcached
by itself is that you're relying on the application to invalidate the
cache.  How many different places have to be kept in sync?  Using
memcached, in its current form, means relying on the application to be
developed correctly with centralized libraries and database access
routines.  Bah, that's a cluster f#$@ waiting to happen.

pgmemcache fixes that though so that you don't have to worry about 
invalidating the cache in every application/routine.  Instead you just 
centralize that logic in the database and automatically invalidate via 
triggers.  It's working out very well for me.
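
[A hypothetical sketch of the trigger-centralized invalidation Sean describes. memcache_delete() is taken from pgmemcache's interface as I understand it; the table, key scheme, and function name are invented for illustration, and the pre-8.0 quoted function body matches the era.]

  CREATE OR REPLACE FUNCTION users_cache_invalidate() RETURNS trigger AS '
  BEGIN
      -- drop the cached copy; the next reader repopulates it from the database
      PERFORM memcache_delete(''user:'' || OLD.user_id::text);
      RETURN NULL;  -- return value is ignored for AFTER triggers
  END;
  ' LANGUAGE plpgsql;

  CREATE TRIGGER users_cache_invalidate
      AFTER UPDATE OR DELETE ON users
      FOR EACH ROW EXECUTE PROCEDURE users_cache_invalidate();

[An INSERT/UPDATE trigger could instead push the new value into the cache, but delete-and-repopulate keeps the trigger simple.]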

I'd be interested in success stories, fwiw.  In the next week or so 
I'll probably stick this on pgfoundry and build a proper make/release 
structure.  -sc

--
Sean Chittenden