Re: [PERFORM] Tsearch2 really slower than ilike ?
Oleg, Le Mercredi 17 Novembre 2004 18:23, Oleg Bartunov a écrit : Sorry but when I do your request I get : # select id_site from site where idx_site_name @@ 'livejourn'; ERROR: type does not exist no idea :) btw, what version of postgresql and OS you're running. Could you try minimal test - check sql commands from tsearch2 sources, some basic queries from tsearch2 documentation, tutorials. btw, your query should looks like select id_site from site_rss where idx_site_name @@ 'livejourn'; How did you run your queries at all ? I mean your first message about poor tsearch2 performance. I don't know what happend yesterday ... it's running now ... You sent me : zz=# explain analyze select id_site from site_rss where idx_site_name @@ 'livejourn'; QUERY PLAN --- Index Scan using ix_idx_site_name on site_rss (cost=0.00..733.62 rows=184 width=4) (actual time=0.339..39.183 rows=1737 loops=1) Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery) Filter: (idx_site_name @@ '\'livejourn\''::tsquery) Total runtime: 40.997 ms (4 rows) It's really fast ! So, I don't understand your problem. I run query on my desktop machine, nothing special. I get this : QUERY PLAN - Index Scan using ix_idx_site_name on site_rss s (cost=0.00..574.19 rows=187 width=24) (actual time=105.097..7157.277 rows=388 loops=1) Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery) Filter: (idx_site_name @@ '\'livejourn\''::tsquery) Total runtime: 7158.576 ms (4 rows) With the ilike I get : QUERY PLAN Seq Scan on site_rss s (cost=0.00..8360.23 rows=1 width=24) (actual time=8.195..879.440 rows=404 loops=1) Filter: (site_name ~~* '%livejourn%'::text) Total runtime: 882.600 ms (3 rows) I don't know what is your desktop ... but I'm using PostgreSQL 7.4.6, on Debian Woody with a PC Bi-PIII 933 Mhz and 1 Gb of memory ... the server is dedicated to this database ... !! I have no idea ! Regards, -- Hervé Piedvache Elma Ingénierie Informatique 6 rue du Faubourg Saint-Honoré F-75008 - Paris - France Pho. 33-144949901 Fax. 33-144949902 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Tsearch2 really slower than ilike ?
Have you run 'vacuum analyze' ? 1;2c1;2c1;2c 1;2c1;2c1;2cmy desktop is very simple PIII, 512 Mb RAM. 1;2c1;2c1;2cOleg 1;2c1;2c1;2c 1;2c1;2c1;2cOn Thu, 18 Nov 2004, [iso-8859-15] Herv? Piedvache wrote: Oleg, Le Mercredi 17 Novembre 2004 18:23, Oleg Bartunov a ?crit : Sorry but when I do your request I get : # select id_site from site where idx_site_name @@ 'livejourn'; ERROR: type does not exist no idea :) btw, what version of postgresql and OS you're running. Could you try minimal test - check sql commands from tsearch2 sources, some basic queries from tsearch2 documentation, tutorials. btw, your query should looks like select id_site from site_rss where idx_site_name @@ 'livejourn'; How did you run your queries at all ? I mean your first message about poor tsearch2 performance. I don't know what happend yesterday ... it's running now ... You sent me : zz=# explain analyze select id_site from site_rss where idx_site_name @@ 'livejourn'; QUERY PLAN --- Index Scan using ix_idx_site_name on site_rss (cost=0.00..733.62 rows=184 width=4) (actual time=0.339..39.183 rows=1737 loops=1) Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery) Filter: (idx_site_name @@ '\'livejourn\''::tsquery) Total runtime: 40.997 ms (4 rows) It's really fast ! So, I don't understand your problem. I run query on my desktop machine, nothing special. I get this : QUERY PLAN - Index Scan using ix_idx_site_name on site_rss s (cost=0.00..574.19 rows=187 width=24) (actual time=105.097..7157.277 rows=388 loops=1) Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery) Filter: (idx_site_name @@ '\'livejourn\''::tsquery) Total runtime: 7158.576 ms (4 rows) With the ilike I get : QUERY PLAN Seq Scan on site_rss s (cost=0.00..8360.23 rows=1 width=24) (actual time=8.195..879.440 rows=404 loops=1) Filter: (site_name ~~* '%livejourn%'::text) Total runtime: 882.600 ms (3 rows) I don't know what is your desktop ... but I'm using PostgreSQL 7.4.6, on Debian Woody with a PC Bi-PIII 933 Mhz and 1 Gb of memory ... the server is dedicated to this database ... !! I have no idea ! Regards, Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Tsearch2 really slower than ilike ?
Le Jeudi 18 Novembre 2004 10:37, Oleg Bartunov a écrit : Have you run 'vacuum analyze' ? Yep every night VACUUM FULL VERBOSE ANALYZE; of all the database ! 1;2c1;2c1;2c 1;2c1;2c1;2cmy desktop is very simple PIII, 512 Mb RAM. 1;2c1;2c1;2c Oleg 1;2c1;2c1;2c YOU send strange caracters ! ;o) 1;2c1;2c1;2cOn Thu, 18 Nov 2004, [iso-8859-15] Herv? Piedvache wrote: Oleg, Le Mercredi 17 Novembre 2004 18:23, Oleg Bartunov a ?crit : Sorry but when I do your request I get : # select id_site from site where idx_site_name @@ 'livejourn'; ERROR: type does not exist no idea :) btw, what version of postgresql and OS you're running. Could you try minimal test - check sql commands from tsearch2 sources, some basic queries from tsearch2 documentation, tutorials. btw, your query should looks like select id_site from site_rss where idx_site_name @@ 'livejourn'; How did you run your queries at all ? I mean your first message about poor tsearch2 performance. I don't know what happend yesterday ... it's running now ... You sent me : zz=# explain analyze select id_site from site_rss where idx_site_name @@ 'livejourn'; QUERY PLAN - -- Index Scan using ix_idx_site_name on site_rss (cost=0.00..733.62 rows=184 width=4) (actual time=0.339..39.183 rows=1737 loops=1) Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery) Filter: (idx_site_name @@ '\'livejourn\''::tsquery) Total runtime: 40.997 ms (4 rows) It's really fast ! So, I don't understand your problem. I run query on my desktop machine, nothing special. I get this : QUERY PLAN - Index Scan using ix_idx_site_name on site_rss s (cost=0.00..574.19 rows=187 width=24) (actual time=105.097..7157.277 rows=388 loops=1) Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery) Filter: (idx_site_name @@ '\'livejourn\''::tsquery) Total runtime: 7158.576 ms (4 rows) With the ilike I get : QUERY PLAN - --- Seq Scan on site_rss s (cost=0.00..8360.23 rows=1 width=24) (actual time=8.195..879.440 rows=404 loops=1) Filter: (site_name ~~* '%livejourn%'::text) Total runtime: 882.600 ms (3 rows) I don't know what is your desktop ... but I'm using PostgreSQL 7.4.6, on Debian Woody with a PC Bi-PIII 933 Mhz and 1 Gb of memory ... the server is dedicated to this database ... !! I have no idea ! Regards, Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 8: explain analyze is your friend -- Hervé Piedvache Elma Ingénierie Informatique 6 rue du Faubourg Saint-Honoré F-75008 - Paris - France Pho. 33-144949901 Fax. 33-144949902 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Tsearch2 really slower than ilike ?
1;2c1;2c1;2cBlin ! what's happenning with my terminal when I read messagess from this guy ? I don't even know how to call him - I see just Herv? Oleg 1;2c1;2c1;2c1;2c 1;2cOn Thu, 18 Nov 2004, [iso-8859-15] Herv? Piedvache wrote: Le Jeudi 18 Novembre 2004 10:37, Oleg Bartunov a ?crit : Have you run 'vacuum analyze' ? Yep every night VACUUM FULL VERBOSE ANALYZE; of all the database ! 1;2c1;2c1;2c 1;2c1;2c1;2cmy desktop is very simple PIII, 512 Mb RAM. 1;2c1;2c11;2c1;2c1;2c;2c Oleg1;2c1;2c1;2c 11;2c1;2c1;2c;2c1;2c1;2c YOU send strange caracters ! ;o) 1;2c1;2c1;2cOn Thu, 18 Nov 2004, [iso-8859-15] Herv? Piedvache wrote: Oleg, Le Mercredi 17 Novembre 2004 18:23, Oleg Bartunov a ?crit : Sorry but when I do your request I get : # select id_site from site where idx_site_name @@ 'livejourn'; ERROR: type d1;2c1;2c1;2c1;2coes not exist no idea :) btw, what version of postgresql and OS you're running. Could you try minimal test - check sql commands from tsearch2 sources, some basic queries from tsearch2 documentation, tutorials. btw, your query should looks like select id_site from site_rss where idx_site_name @@ 'livejourn'; How did you run your queries at all ? I mean your first message about poor tsearch2 performance. I don't know what happend yesterday ... it's running now ... You sent me : zz=# explain analyze select id_site from site_rss where idx_site_name @@ 'livejourn'; QUERY PLAN - -- Index Scan using ix_idx_site_name on site_rss (cost=0.00..733.62 rows=184 width=4) (actual time=0.339..39.183 rows=1737 loops=1) Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery) Filter: (idx_site_name @@ '\'livejourn\''::tsquery) Total runtime: 40.997 ms (4 rows) It's really fast ! So, I don't understand your problem. I run query on my desktop machine, nothing special. I get this : QUERY PLAN - Index Scan using ix_idx_site_name on site_rss s (cost=0.00..574.19 rows=187 width=24) (actual time=105.097..7157.277 rows=388 loops=1) Index Cond: (idx_site_name @@ '\'livejourn\''::tsquery) Filter: (idx_site_name @@ '\'livejourn\''::tsquery) Total runtime: 7158.576 ms (4 rows) With the ilike I get : QUERY PLAN - --- Seq Scan on site_rss s (cost=0.00..8360.23 rows=1 width=24) (actual time=8.195..879.440 rows=404 loops=1) Filter: (site_name ~~* '%livejourn%'::text) Total runtime: 882.600 ms (3 rows) I don't know what is your desktop ... but I'm using PostgreSQL 7.4.6, on Debian Woody with a PC Bi-PIII 933 Mhz and 1 Gb of memory ... the server is dedicated to this database ... !! I have no idea ! Regards, Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 8: explain analyze is your friend Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] Query Performance and IOWait
Hello All, I have a setup with a Dell Poweredge 2650 with Red Hat and Postgres 7.4.5 with a database with about 27GB of data. The table in question has about 35 million rows. I am running the following query: SELECT * FROM mb_fix_message WHERE msg_client_order_id IN ( SELECT msg_client_order_id FROM mb_fix_message WHERE msg_log_time = '2004-06-01' AND msg_log_time '2004-06-01 13:30:00.000' AND msg_message_type IN ('D','G') AND mb_ord_type = '1' ) AND msg_log_time '2004-06-01' AND msg_log_time '2004-06-01 23:59:59.999' AND msg_message_type = '8' AND (mb_raw_text LIKE '%39=1%' OR mb_raw_text LIKE '%39=2%'); with the following plan: QUERY PLAN Nested Loop IN Join (cost=0.00..34047.29 rows=1 width=526) - Index Scan using mfi_log_time on mb_fix_message (cost=0.00..22231.31 rows=2539 width=526) Index Cond: ((msg_log_time '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 23:59:59.999'::timestamp without time zone)) Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text))) - Index Scan using mfi_client_ordid on mb_fix_message (cost=0.00..445.56 rows=1 width=18) Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text) Filter: ((msg_log_time = '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 13:30:00'::timestamp without time zone) AND ((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text)) While running, this query produces 100% iowait usage on its processor and takes a ungodly amount of time (about an hour). The postgres settings are as follows: shared_buffers = 32768 # min 16, at least max_connections*2, 8KB each sort_mem = 262144 # min 64, size in KB And the /etc/sysctl.conf has: kernel.shmall = 274235392 kernel.shmmax = 274235392 The system has 4GB of RAM. I am pretty sure of these settings, but only from my reading of the docs and others' recommendations online. Thanks, Andrew Janian OMS Development Scottrade Financial Services (314) 965-1555 x 1513 Cell: (314) 369-2083 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Query Performance and IOWait
Andrew, It seems that you could combine the subquery's WHERE clause with the main query's to produce a simpler query, i.e. one without a subquery. Rick Andrew Janian [EMAIL PROTECTED]To: [EMAIL PROTECTED] Sent by: cc: [EMAIL PROTECTED]Subject: [PERFORM] Query Performance and IOWait tgresql.org 11/18/2004 08:42 AM Hello All, I have a setup with a Dell Poweredge 2650 with Red Hat and Postgres 7.4.5 with a database with about 27GB of data. The table in question has about 35 million rows. I am running the following query: SELECT * FROM mb_fix_message WHERE msg_client_order_id IN ( SELECT msg_client_order_id FROM mb_fix_message WHERE msg_log_time = '2004-06-01' AND msg_log_time '2004-06-01 13:30:00.000' AND msg_message_type IN ('D','G') AND mb_ord_type = '1' ) AND msg_log_time '2004-06-01' AND msg_log_time '2004-06-01 23:59:59.999' AND msg_message_type = '8' AND (mb_raw_text LIKE '%39=1%' OR mb_raw_text LIKE '%39=2%'); with the following plan: QUERY PLAN Nested Loop IN Join (cost=0.00..34047.29 rows=1 width=526) - Index Scan using mfi_log_time on mb_fix_message (cost=0.00..22231.31 rows=2539 width=526) Index Cond: ((msg_log_time '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 23:59:59.999'::timestamp without time zone)) Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text))) - Index Scan using mfi_client_ordid on mb_fix_message (cost=0.00..445.56 rows=1 width=18) Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text) Filter: ((msg_log_time = '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 13:30:00'::timestamp without time zone) AND ((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text)) While running, this query produces 100% iowait usage on its processor and takes a ungodly amount of time (about an hour). The postgres settings are as follows: shared_buffers = 32768 # min 16, at least max_connections*2, 8KB each sort_mem = 262144 # min 64, size in KB And the /etc/sysctl.conf has: kernel.shmall = 274235392 kernel.shmmax = 274235392 The system has 4GB of RAM. I am pretty sure of these settings, but only from my reading of the docs and others' recommendations online. Thanks, Andrew Janian OMS Development Scottrade Financial Services (314) 965-1555 x 1513 Cell: (314) 369-2083 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] Query Performance and IOWait
Actually, unfortunately, that won't work. The subquery gets a list of message IDs and then the outer query gets the responses to those messages. Also, I dumped this data and imported it all to ms sql server and then ran it there. The query ran in 2s. Andrew -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Thursday, November 18, 2004 7:57 AM To: Andrew Janian Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [PERFORM] Query Performance and IOWait Andrew, It seems that you could combine the subquery's WHERE clause with the main query's to produce a simpler query, i.e. one without a subquery. Rick Andrew Janian [EMAIL PROTECTED]To: [EMAIL PROTECTED] Sent by: cc: [EMAIL PROTECTED]Subject: [PERFORM] Query Performance and IOWait tgresql.org 11/18/2004 08:42 AM Hello All, I have a setup with a Dell Poweredge 2650 with Red Hat and Postgres 7.4.5 with a database with about 27GB of data. The table in question has about 35 million rows. I am running the following query: SELECT * FROM mb_fix_message WHERE msg_client_order_id IN ( SELECT msg_client_order_id FROM mb_fix_message WHERE msg_log_time = '2004-06-01' AND msg_log_time '2004-06-01 13:30:00.000' AND msg_message_type IN ('D','G') AND mb_ord_type = '1' ) AND msg_log_time '2004-06-01' AND msg_log_time '2004-06-01 23:59:59.999' AND msg_message_type = '8' AND (mb_raw_text LIKE '%39=1%' OR mb_raw_text LIKE '%39=2%'); with the following plan: QUERY PLAN Nested Loop IN Join (cost=0.00..34047.29 rows=1 width=526) - Index Scan using mfi_log_time on mb_fix_message (cost=0.00..22231.31 rows=2539 width=526) Index Cond: ((msg_log_time '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 23:59:59.999'::timestamp without time zone)) Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text))) - Index Scan using mfi_client_ordid on mb_fix_message (cost=0.00..445.56 rows=1 width=18) Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text) Filter: ((msg_log_time = '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 13:30:00'::timestamp without time zone) AND ((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text)) While running, this query produces 100% iowait usage on its processor and takes a ungodly amount of time (about an hour). The postgres settings are as follows: shared_buffers = 32768 # min 16, at least max_connections*2, 8KB each sort_mem = 262144 # min 64, size in KB And the /etc/sysctl.conf has: kernel.shmall = 274235392 kernel.shmmax = 274235392 The system has 4GB of RAM. I am pretty sure of these settings, but only from my reading of the docs and others' recommendations online. Thanks, Andrew Janian OMS Development Scottrade Financial Services (314) 965-1555 x 1513 Cell: (314) 369-2083 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Query Performance and IOWait
Andrew, What version of Redhat are you running? We have found running Enterprise Update 3 kernel kills our Dell boxes with IOWait, both NFS and local disk traffic. Update 2 kernel does not seem to have the issue, and we are in the process of trying Update 4 beta to see if it is better. Woody iGLASS Networks www.iglass.net -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andrew Janian Sent: Thursday, November 18, 2004 9:02 AM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [PERFORM] Query Performance and IOWait Actually, unfortunately, that won't work. The subquery gets a list of message IDs and then the outer query gets the responses to those messages. Also, I dumped this data and imported it all to ms sql server and then ran it there. The query ran in 2s. Andrew -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Thursday, November 18, 2004 7:57 AM To: Andrew Janian Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [PERFORM] Query Performance and IOWait Andrew, It seems that you could combine the subquery's WHERE clause with the main query's to produce a simpler query, i.e. one without a subquery. Rick Andrew Janian [EMAIL PROTECTED]To: [EMAIL PROTECTED] Sent by: cc: [EMAIL PROTECTED]Subject: [PERFORM] Query Performance and IOWait tgresql.org 11/18/2004 08:42 AM Hello All, I have a setup with a Dell Poweredge 2650 with Red Hat and Postgres 7.4.5 with a database with about 27GB of data. The table in question has about 35 million rows. I am running the following query: SELECT * FROM mb_fix_message WHERE msg_client_order_id IN ( SELECT msg_client_order_id FROM mb_fix_message WHERE msg_log_time = '2004-06-01' AND msg_log_time '2004-06-01 13:30:00.000' AND msg_message_type IN ('D','G') AND mb_ord_type = '1' ) AND msg_log_time '2004-06-01' AND msg_log_time '2004-06-01 23:59:59.999' AND msg_message_type = '8' AND (mb_raw_text LIKE '%39=1%' OR mb_raw_text LIKE '%39=2%'); with the following plan: QUERY PLAN Nested Loop IN Join (cost=0.00..34047.29 rows=1 width=526) - Index Scan using mfi_log_time on mb_fix_message (cost=0.00..22231.31 rows=2539 width=526) Index Cond: ((msg_log_time '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 23:59:59.999'::timestamp without time zone)) Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text))) - Index Scan using mfi_client_ordid on mb_fix_message (cost=0.00..445.56 rows=1 width=18) Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text) Filter: ((msg_log_time = '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 13:30:00'::timestamp without time zone) AND ((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text)) While running, this query produces 100% iowait usage on its processor and takes a ungodly amount of time (about an hour). The postgres settings are as follows: shared_buffers = 32768 # min 16, at least max_connections*2, 8KB each sort_mem = 262144 # min 64, size in KB And the /etc/sysctl.conf has: kernel.shmall = 274235392 kernel.shmmax = 274235392 The system has 4GB of RAM. I am pretty sure of these settings, but only from my reading of the docs and others' recommendations online. Thanks, Andrew Janian OMS Development Scottrade Financial Services (314) 965-1555 x 1513 Cell: (314) 369-2083 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Query Performance and IOWait
I have run ANALYZE right before running this query. I will run EXPLAIN ANALYZE when I can. I started running the query when I sent the first email and it is still running. Looke like it longer than an hour. I will post the results of EXPLAIN ANALYZE in a few hours when I get them. Thanks for all your help, Andrew -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Thursday, November 18, 2004 9:40 AM To: Andrew Janian Cc: [EMAIL PROTECTED] Subject: Re: [PERFORM] Query Performance and IOWait Andrew Janian [EMAIL PROTECTED] writes: QUERY PLAN Nested Loop IN Join (cost=0.00..34047.29 rows=1 width=526) - Index Scan using mfi_log_time on mb_fix_message (cost=0.00..22231.31 rows=2539 width=526) Index Cond: ((msg_log_time '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 23:59:59.999'::timestamp without time zone)) Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text))) - Index Scan using mfi_client_ordid on mb_fix_message (cost=0.00..445.56 rows=1 width=18) Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text) Filter: ((msg_log_time = '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 13:30:00'::timestamp without time zone) AND ((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text)) While running, this query produces 100% iowait usage on its processor and takes a ungodly amount of time (about an hour). This plan looks fairly reasonable if the rowcount estimates are accurate. Have you ANALYZEd the table lately? You might need to bump up the statistics target for the msg_log_time column to improve the quality of the estimates. It would be useful to see EXPLAIN ANALYZE results too (yes I know it'll take you an hour to get them...) regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] query plan question
What I think is happening with the missing pg_statistic entries: The install of our application involves a lot of data importing (via JDBC) in one large transaction, which can take up to 30 minutes. (I realize I left out this key piece of info in my original post...) The pg_autovacuum logic is relying on data from pg_stat_all_tables to make the decision about running analyze. As far as I can tell, the data in this view gets updated outside of the transaction, because I saw the numbers growing while I was importing. I saw pg_autovacuum log messages for running analyze on several tables, but no statistics data showed up for these, I assume because the actual data in the table wasn't yet visible to pg_autovacuum because the import transaction had not finished yet. When the import finished, not all of the tables affected by the import were re-visited because they had not bumped up over the threshold again, even though the analyze run for those tables had not generated any stats because of the still-open transaction. Am I making the correct assumptions about the way the various pieces work? Does this scenario make sense? It's easy enough for us to kick off a vacuum/analyze at the end of a long import - but this mysterious behavior was bugging me! Thanks. - DAP -Original Message- From: Matthew T. O'Connor [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 17, 2004 2:02 PM To: David Parker Cc: Tom Lane; Jeff; Russell Smith; [EMAIL PROTECTED] Subject: Re: [PERFORM] query plan question Well based on the autovacuum log that you attached, all of those tables are insert only (at least during the time period included in the log. Is that correct? If so, autovacuum will never do a vacuum (unless required by xid wraparound issues) on those tables. So this doesn't appear to be an autovacuum problem. I'm not sure about the missing pg_statistic entries anyone else care to field that one? Matthew David Parker wrote: Thanks. The tables I'm concerned with are named: 'schema', 'usage', 'usageparameter', and 'flow'. It looks like autovacuum is performing analyzes: % grep Performing: logs/.db.tazz.vacuum.log [2004-11-17 12:05:58 PM] Performing: ANALYZE public.scriptlibrary_library [2004-11-17 12:15:59 PM] Performing: ANALYZE public.scriptlibraryparm [2004-11-17 12:15:59 PM] Performing: ANALYZE public.usageparameter [2004-11-17 12:21:00 PM] Performing: ANALYZE public.usageproperty [2004-11-17 12:21:00 PM] Performing: ANALYZE public.route [2004-11-17 12:21:00 PM] Performing: ANALYZE public.usageparameter [2004-11-17 12:21:00 PM] Performing: ANALYZE public.scriptlibrary_library [2004-11-17 12:26:01 PM] Performing: ANALYZE public.usage [2004-11-17 12:26:01 PM] Performing: ANALYZE public.usageparameter [2004-11-17 12:31:04 PM] Performing: ANALYZE public.usageproperty [2004-11-17 12:36:04 PM] Performing: ANALYZE public.route [2004-11-17 12:36:04 PM] Performing: ANALYZE public.service_usage [2004-11-17 12:36:04 PM] Performing: ANALYZE public.usageparameter But when I run the following: select * from pg_statistic where starelid in (select oid from pg_class where relname in ('schema','usageparameter','flow','usage')) it returns no records. Shouldn't it? It doesn't appear to be doing a vacuum anywhere, which makes sense because none of these tables have over the default threshold of 1000. Are there statistics which only get generated by vacuum? I've attached a gzip of the pg_autovacuum log file, with -d 3. Thanks again. - DAP -Original Message- From: Matthew T. O'Connor [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 17, 2004 11:41 AM To: David Parker Cc: Tom Lane; Jeff; Russell Smith; [EMAIL PROTECTED] Subject: Re: [PERFORM] query plan question David Parker wrote: We're using postgresql 7.4.5. I've only recently put pg_autovacuum in place as part of our installation, and I'm basically taking the defaults. I doubt it's a problem with autovacuum itself, but rather with my configuration of it. I have some reading to do, so any pointers to existing autovacuum threads would be greatly appreciated! Well the first thing to do is increase the verbosity of the pg_autovacuum logging output. If you use -d2 or higher, pg_autovacuum will print out a lot of detail on what it thinks the thresholds are and why it is or isn't performing vacuums and analyzes. Attach some of the log and I'll take a look at it. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Query Performance and IOWait
Hello, What is your statistics target? What is your effective_cache_size? Have you tried running the query as a cursor? Sincerely, Joshua D. Drake Andrew Janian wrote: I have run ANALYZE right before running this query. I will run EXPLAIN ANALYZE when I can. I started running the query when I sent the first email and it is still running. Looke like it longer than an hour. I will post the results of EXPLAIN ANALYZE in a few hours when I get them. Thanks for all your help, Andrew -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Thursday, November 18, 2004 9:40 AM To: Andrew Janian Cc: [EMAIL PROTECTED] Subject: Re: [PERFORM] Query Performance and IOWait Andrew Janian [EMAIL PROTECTED] writes: QUERY PLAN Nested Loop IN Join (cost=0.00..34047.29 rows=1 width=526) - Index Scan using mfi_log_time on mb_fix_message (cost=0.00..22231.31 rows=2539 width=526) Index Cond: ((msg_log_time '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 23:59:59.999'::timestamp without time zone)) Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text))) - Index Scan using mfi_client_ordid on mb_fix_message (cost=0.00..445.56 rows=1 width=18) Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text) Filter: ((msg_log_time = '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 13:30:00'::timestamp without time zone) AND ((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text)) While running, this query produces 100% iowait usage on its processor and takes a ungodly amount of time (about an hour). This plan looks fairly reasonable if the rowcount estimates are accurate. Have you ANALYZEd the table lately? You might need to bump up the statistics target for the msg_log_time column to improve the quality of the estimates. It would be useful to see EXPLAIN ANALYZE results too (yes I know it'll take you an hour to get them...) regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org -- Command Prompt, Inc., home of PostgreSQL Replication, and plPHP. Postgresql support, programming shared hosting and dedicated hosting. +1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL begin:vcard fn:Joshua D. Drake n:Drake;Joshua D. org:Command Prompt, Inc. adr:;;PO Box 215;Cascade Locks;Oregon;97014;USA email;internet:[EMAIL PROTECTED] title:Consultant tel;work:503-667-4564 tel;fax:503-210-0334 note:Command Prompt, Inc. is the largest and oldest US based commercial PostgreSQL support provider. We provide the only commercially viable integrated PostgreSQL replication solution, but also custom programming, and support. We authored the book Practical PostgreSQL, the procedural language plPHP, and adding trigger capability to plPerl. x-mozilla-html:FALSE url:http://www.commandprompt.com/ version:2.1 end:vcard ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] Interaction between Free Space Map an alternate location for a database
Can someone explain how the free space map deals with alternate database locations? Given that the free space map is global, and it is ostensibly managing free disk space, how does it deal with tuples across disk locations ? Dave -- Dave Cramer http://www.postgresintl.com 519 939 0336 ICQ#14675561 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Query Performance and IOWait
Andrew, Dell's aren't well known for their disk performance, apparently most of the perc controllers sold with dell's are actually adaptec controllers. Also apparently they do not come with the battery required to use the battery backed up write cache ( In fact according to some Dell won't even sell the battery to you). Also Dell's monitoring software is quite a memory hog. Have you looked at top ?, and also hdparm -Tt /dev/sd? Dave Andrew Janian wrote: Hello All, I have a setup with a Dell Poweredge 2650 with Red Hat and Postgres 7.4.5 with a database with about 27GB of data. The table in question has about 35 million rows. I am running the following query: SELECT * FROM mb_fix_message WHERE msg_client_order_id IN ( SELECT msg_client_order_id FROM mb_fix_message WHERE msg_log_time = '2004-06-01' AND msg_log_time '2004-06-01 13:30:00.000' AND msg_message_type IN ('D','G') AND mb_ord_type = '1' ) AND msg_log_time '2004-06-01' AND msg_log_time '2004-06-01 23:59:59.999' AND msg_message_type = '8' AND (mb_raw_text LIKE '%39=1%' OR mb_raw_text LIKE '%39=2%'); with the following plan: QUERY PLAN Nested Loop IN Join (cost=0.00..34047.29 rows=1 width=526) - Index Scan using mfi_log_time on mb_fix_message (cost=0.00..22231.31 rows=2539 width=526) Index Cond: ((msg_log_time '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 23:59:59.999'::timestamp without time zone)) Filter: (((msg_message_type)::text = '8'::text) AND (((mb_raw_text)::text ~~ '%39=1%'::text) OR ((mb_raw_text)::text ~~ '%39=2%'::text))) - Index Scan using mfi_client_ordid on mb_fix_message (cost=0.00..445.56 rows=1 width=18) Index Cond: ((outer.msg_client_order_id)::text = (mb_fix_message.msg_client_order_id)::text) Filter: ((msg_log_time = '2004-06-01 00:00:00'::timestamp without time zone) AND (msg_log_time '2004-06-01 13:30:00'::timestamp without time zone) AND ((msg_message_type)::text = 'D'::text) OR ((msg_message_type)::text = 'G'::text)) AND ((mb_ord_type)::text = '1'::text)) While running, this query produces 100% iowait usage on its processor and takes a ungodly amount of time (about an hour). The postgres settings are as follows: shared_buffers = 32768 # min 16, at least max_connections*2, 8KB each sort_mem = 262144 # min 64, size in KB And the /etc/sysctl.conf has: kernel.shmall = 274235392 kernel.shmmax = 274235392 The system has 4GB of RAM. I am pretty sure of these settings, but only from my reading of the docs and others' recommendations online. Thanks, Andrew Janian OMS Development Scottrade Financial Services (314) 965-1555 x 1513 Cell: (314) 369-2083 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings -- Dave Cramer http://www.postgresintl.com 519 939 0336 ICQ#14675561 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] Interaction between Free Space Map an alternate location for a database
Dave, Given that the free space map is global, and it is ostensibly managing free disk space, how does it deal with tuples across disk locations ? Are you talking Tablespaces? -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Query Performance and IOWait
Woody, What version of Redhat are you running? We have found running Enterprise Update 3 kernel kills our Dell boxes with IOWait, both NFS and local disk traffic. Update 2 kernel does not seem to have the issue, and we are in the process of trying Update 4 beta to see if it is better. This is interesting; do you have more to say about it? I've been having some mysterious issues with RHES that I've not been able to pin down. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Interaction between Free Space Map an alternate location for a database
Dave Cramer [EMAIL PROTECTED] writes: Can someone explain how the free space map deals with alternate database locations? It doesn't really care. It identifies tables by database OID+table OID, and where they happen to sit physically doesn't matter. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Interaction between Free Space Map an alternate location
No, have a look at the create database command there is a clause 'with location' that allows you to set up a separate location for the db Dave Josh Berkus wrote: Dave, Given that the free space map is global, and it is ostensibly managing free disk space, how does it deal with tuples across disk locations ? Are you talking Tablespaces? -- Dave Cramer http://www.postgresintl.com 519 939 0336 ICQ#14675561
Re: [PERFORM] Query Performance and IOWait
From our experience it is not just a postgres issue, but all IO with the Update 3 kernel. We have a box with Update 3 that queries a remote postgres database(Running RH7.3, RH3 Update2) and writes to a file on an NFS server. The update 3 box does half the work with 2-3 times the load as our update 1 and 2 boxes. Looking at top the box is always above 90% IO Wait on the CPU. When we downgrade the kernel to Update 2 it seems to fix the issue. We several Update 3 boxes that run postgres locally and they all struggle compared to the Update 2 boxes We have tried the Fedora Core 3 with not much more success and we are going to try the Update 4 beta kernel next week to see if it is any better. There are several threads on the Taroon mailing list discussing the issue. Woody -Original Message- From: Josh Berkus [mailto:[EMAIL PROTECTED] Sent: Thursday, November 18, 2004 1:34 PM To: [EMAIL PROTECTED] Cc: Woody Woodring; 'Andrew Janian' Subject: Re: [PERFORM] Query Performance and IOWait Woody, What version of Redhat are you running? We have found running Enterprise Update 3 kernel kills our Dell boxes with IOWait, both NFS and local disk traffic. Update 2 kernel does not seem to have the issue, and we are in the process of trying Update 4 beta to see if it is better. This is interesting; do you have more to say about it? I've been having some mysterious issues with RHES that I've not been able to pin down. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Interaction between Free Space Map an alternate location
Ok, so the global part of the fsm is just that it is in shared memory. If certain databases have more free space they will simply take up more of the fsm. There is no cross database movement of tuples. ( I realized this when I tried to form my next question) Dave Tom Lane wrote: Dave Cramer [EMAIL PROTECTED] writes: Can someone explain how the free space map deals with alternate database locations? It doesn't really care. It identifies tables by database OID+table OID, and where they happen to sit physically doesn't matter. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings -- Dave Cramer http://www.postgresintl.com 519 939 0336 ICQ#14675561
Re: [PERFORM] memcached and PostgreSQL
So What does memcached offer pgsql users? It would still seem to offer the benefit of a multi-machined cache. Ack, I totally missed this thread. Sorry for jumping in late. Basically, memcached and pgmemcache offer a more technically correct way of implementing query caching. MySQL's query caching is a disaster, IMHO. memcached alleviates this load from the database and puts it elsewhere in a more optimized form. The problem with memcached by itself is that you're relying on the application to invalidate the cache. How many different places have to be kept in sync? Using memcached, in its current form, makes relying on the application to be developed correctly with centralized libraries and database access routines. Bah, that's a cluster f#$@ waiting to happen. pgmemcache fixes that though so that you don't have to worry about invalidating the cache in every application/routine. Instead you just centralize that logic in the database and automatically invalidate via triggers. It's working out very well for me. I'd be interested in success stories, fwiw. In the next week or so I'll probably stick this on pgfoundry and build a proper make/release structure. -sc -- Sean Chittenden ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match