Re: [PERFORM] VERY slow queries at random
On 07.06.2007, at 22:42, Greg Smith wrote: On Thu, 7 Jun 2007, Gunther Mayer wrote: wal checkpoint config is on pg defaults everywhere, all relevant config options are commented out. I'm no expert in wal stuff but I don't see how that could cause the problem? Checkpoints are very resource intensive and can cause other processes (including your selects) to hang for a considerable period of time while they are processing. With the default parameters, they can happen very frequently. Normally checkpoint_segments and checkpoint_timeout are increased in order to keep this from happening. This would normally be an issue only if you're writing a substantial amount of data to your tables. If there are a lot of writes going on, you might get some improvement by adjusting those parameters upward; the defaults are pretty low. Make sure you read http://www.postgresql.org/docs/8.2/static/wal-configuration.html first so you know what you're playing with, there are some recovery implications invoved. I remember us having problems with 8.0 background writer, you might want to try turning it off. Not sure if it behaves as badly in 8.2. increasing wal buffers might be a good idea also. Kristo ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] VERY slow queries at random
could be that the checkpoints are done too seldom. what is your wal checkpoint config? Kristo On 07.06.2007, at 0:27, Scott Marlowe wrote: Gunther Mayer wrote: Hi there, We run a small ISP with a FreeBSD/freeradius/postgresql 8.2.4 backend and 200+ users. Authentication happens via UAM/hotspot and I see a lot of authorisation and accounting packets that are handled via PL/PGSQL functions directly in the database. Everything seems to work 100% except that a few times a day I see Jun 6 10:41:31 caligula postgres[57347]: [4-1] radiususer: LOG: duration: 19929.291 ms statement: SELECT fn_accounting_start(...) in my logs. I'm logging slow queries with log_min_duration_statement = 500 in my postgresql.conf. Sometimes another query runs equally slow or even slower (I've seen 139 seconds!!!) a few minutes before or after as well, but then everything is back to normal. Even though I haven't yet indexed my data I know that the system is performant because my largest table (the accounting one) only has 5000+ rows, the entire database is only a few MB's and I have plenty of memory (2GB), shared_buffers = 100MB and max_fsm_pages = 179200. Also from briefly enabling log_parser_stats = on log_planner_stats = on log_executor_stats = on I saw that most queries are 100% satisfied from cache so the disk doesn't even get hit. Finally, the problem seems unrelated to load because it happens at 4am just as likely as at peak traffic time. What the heck could cause such erratic behaviour? I suspect some type of resource problem but what and how could I dig deeper? Maybe your hard drive is set to spin down after a certain period of idle, and since most all your data is coming from memory, then it might be that on the rare occasion when it needs to hit the drive it's not spun up anymore. Maybe some other process is cranking up (cron jobs???) that are chewing up all your I/O bandwidth? Hard to say. Anything in the system logs that would give you a hint? Try correlating them by the time of the slow pgsql queries. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] VERY slow queries at random
Andrew Sullivan wrote: On Wed, Jun 06, 2007 at 09:20:54PM +0200, Gunther Mayer wrote: What the heck could cause such erratic behaviour? I suspect some type of resource problem but what and how could I dig deeper? Is something (perhaps implicitly) locking the table? That will cause this. There are a whole bunch of update queries that fire all the time but afaik none of them ever lock the entire table. To the best of my knowledge UPDATE ... WHERE ... only locks those rows that it actually operates on, in my case this is always a single row. No explicit locking is done anywhere, but perhaps you're right and it is a locking issue. Question is, how do I find out about locks at the time when I only get told about the slow query *after* it has completed and postgres has told me so by logging a slow query entry in my logs? Gunther ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] VERY slow queries at random
Scott Marlowe wrote: Gunther Mayer wrote: Hi there, We run a small ISP with a FreeBSD/freeradius/postgresql 8.2.4 backend and 200+ users. Authentication happens via UAM/hotspot and I see a lot of authorisation and accounting packets that are handled via PL/PGSQL functions directly in the database. Everything seems to work 100% except that a few times a day I see Jun 6 10:41:31 caligula postgres[57347]: [4-1] radiususer: LOG: duration: 19929.291 ms statement: SELECT fn_accounting_start(...) in my logs. I'm logging slow queries with log_min_duration_statement = 500 in my postgresql.conf. Sometimes another query runs equally slow or even slower (I've seen 139 seconds!!!) a few minutes before or after as well, but then everything is back to normal. Even though I haven't yet indexed my data I know that the system is performant because my largest table (the accounting one) only has 5000+ rows, the entire database is only a few MB's and I have plenty of memory (2GB), shared_buffers = 100MB and max_fsm_pages = 179200. Also from briefly enabling log_parser_stats = on log_planner_stats = on log_executor_stats = on I saw that most queries are 100% satisfied from cache so the disk doesn't even get hit. Finally, the problem seems unrelated to load because it happens at 4am just as likely as at peak traffic time. What the heck could cause such erratic behaviour? I suspect some type of resource problem but what and how could I dig deeper? Maybe your hard drive is set to spin down after a certain period of idle, and since most all your data is coming from memory, then it might be that on the rare occasion when it needs to hit the drive it's not spun up anymore. I doubt that as a serious amount of logging is taking place on the box all the time which goes straight to disk. Also, no disk in the world would take more than a minute to spin up... Maybe some other process is cranking up (cron jobs???) that are chewing up all your I/O bandwidth? Hmm, I investigated that too but if that was the case the queries would run slow always at the same time of the day. Hard to say. Anything in the system logs that would give you a hint? Try correlating them by the time of the slow pgsql queries. Nothing relevant in the system logs at the time of the slow query appearing. I have in the mean time tweaked syslog-ng.conf such that as soon as it detects a duration: greater than 500ms log message it spawns top and top -m io and redirects the output to file. At least in that way I can check what's keeping the system busy immediately *after* a slow query has occured. Of course now Murphy's law has it that since I've done that (30 hours ago) not a single slow query has fired, but hey, I'll look at the results once I have them. On another note, autovacuum couldn't cause such issues, could it? I do have autovacuum enabled (autovacuum=on as well as stats_start_collector=on, stats_block_level = on and stats_row_level = on), is there any possibility that autovacuum is not as resource friendly as advertised? Gunther ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] VERY slow queries at random
On Thu, Jun 07, 2007 at 04:22:47PM +0200, Gunther Mayer wrote: There are a whole bunch of update queries that fire all the time but afaik none of them ever lock the entire table. To the best of my knowledge UPDATE ... WHERE ... only locks those rows that it actually operates on, in my case this is always a single row. Well that shouldn't be biting you, then (you're not in SERIALIZABLE mode, right?). The other obvious bit would be checkpoint storms. What's your bgwriter config like? Question is, how do I find out about locks at the time when I only get told about the slow query *after* it has completed and postgres has told me so by logging a slow query entry in my logs? You can't :( A -- Andrew Sullivan | [EMAIL PROTECTED] This work was visionary and imaginative, and goes to show that visionary and imaginative work need not end up well. --Dennis Ritchie ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] VERY slow queries at random
Gunther Mayer wrote: On another note, autovacuum couldn't cause such issues, could it? I do have autovacuum enabled (autovacuum=on as well as stats_start_collector=on, stats_block_level = on and stats_row_level = on), is there any possibility that autovacuum is not as resource friendly as advertised? Hmm. I am not sure where did you read that but I don't think it has ever been stated that autovacuum is resource friendly in the default configuration (I, for one, have never tried, intended or wanted to state that). I suggest tuning the autovacuum_vacuum_cost_delay parameters if you want it to interfere less with your regular operation. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] VERY slow queries at random
On Thu, 7 Jun 2007, Gunther Mayer wrote: wal checkpoint config is on pg defaults everywhere, all relevant config options are commented out. I'm no expert in wal stuff but I don't see how that could cause the problem? Checkpoints are very resource intensive and can cause other processes (including your selects) to hang for a considerable period of time while they are processing. With the default parameters, they can happen very frequently. Normally checkpoint_segments and checkpoint_timeout are increased in order to keep this from happening. This would normally be an issue only if you're writing a substantial amount of data to your tables. If there are a lot of writes going on, you might get some improvement by adjusting those parameters upward; the defaults are pretty low. Make sure you read http://www.postgresql.org/docs/8.2/static/wal-configuration.html first so you know what you're playing with, there are some recovery implications invoved. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] VERY slow queries at random
Hi there, We run a small ISP with a FreeBSD/freeradius/postgresql 8.2.4 backend and 200+ users. Authentication happens via UAM/hotspot and I see a lot of authorisation and accounting packets that are handled via PL/PGSQL functions directly in the database. Everything seems to work 100% except that a few times a day I see Jun 6 10:41:31 caligula postgres[57347]: [4-1] radiususer: LOG: duration: 19929.291 ms statement: SELECT fn_accounting_start(...) in my logs. I'm logging slow queries with log_min_duration_statement = 500 in my postgresql.conf. Sometimes another query runs equally slow or even slower (I've seen 139 seconds!!!) a few minutes before or after as well, but then everything is back to normal. Even though I haven't yet indexed my data I know that the system is performant because my largest table (the accounting one) only has 5000+ rows, the entire database is only a few MB's and I have plenty of memory (2GB), shared_buffers = 100MB and max_fsm_pages = 179200. Also from briefly enabling log_parser_stats = on log_planner_stats = on log_executor_stats = on I saw that most queries are 100% satisfied from cache so the disk doesn't even get hit. Finally, the problem seems unrelated to load because it happens at 4am just as likely as at peak traffic time. What the heck could cause such erratic behaviour? I suspect some type of resource problem but what and how could I dig deeper? Gunther ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] VERY slow queries at random
On Wed, Jun 06, 2007 at 09:20:54PM +0200, Gunther Mayer wrote: What the heck could cause such erratic behaviour? I suspect some type of resource problem but what and how could I dig deeper? Is something (perhaps implicitly) locking the table? That will cause this. A -- Andrew Sullivan | [EMAIL PROTECTED] The year's penultimate month is not in truth a good way of saying November. --H.W. Fowler ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] VERY slow queries at random
Gunther Mayer wrote: Hi there, We run a small ISP with a FreeBSD/freeradius/postgresql 8.2.4 backend and 200+ users. Authentication happens via UAM/hotspot and I see a lot of authorisation and accounting packets that are handled via PL/PGSQL functions directly in the database. Everything seems to work 100% except that a few times a day I see Jun 6 10:41:31 caligula postgres[57347]: [4-1] radiususer: LOG: duration: 19929.291 ms statement: SELECT fn_accounting_start(...) in my logs. I'm logging slow queries with log_min_duration_statement = 500 in my postgresql.conf. Sometimes another query runs equally slow or even slower (I've seen 139 seconds!!!) a few minutes before or after as well, but then everything is back to normal. Even though I haven't yet indexed my data I know that the system is performant because my largest table (the accounting one) only has 5000+ rows, the entire database is only a few MB's and I have plenty of memory (2GB), shared_buffers = 100MB and max_fsm_pages = 179200. Also from briefly enabling log_parser_stats = on log_planner_stats = on log_executor_stats = on I saw that most queries are 100% satisfied from cache so the disk doesn't even get hit. Finally, the problem seems unrelated to load because it happens at 4am just as likely as at peak traffic time. What the heck could cause such erratic behaviour? I suspect some type of resource problem but what and how could I dig deeper? Maybe your hard drive is set to spin down after a certain period of idle, and since most all your data is coming from memory, then it might be that on the rare occasion when it needs to hit the drive it's not spun up anymore. Maybe some other process is cranking up (cron jobs???) that are chewing up all your I/O bandwidth? Hard to say. Anything in the system logs that would give you a hint? Try correlating them by the time of the slow pgsql queries. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Very slow queries
From: Chad Wagner [EMAIL PROTECTED] To: Sidar López Cruz [EMAIL PROTECTED] CC: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Very slow queries Date: Tue, 30 Jan 2007 17:37:17 -0500 On 1/30/07, Sidar López Cruz [EMAIL PROTECTED] wrote: query: Delete From ceroriesgo.salarios Where numero_patrono Not In (Select numero_patrono From ceroriesgo.patronos) Seq Scan on salarios (cost=51021.78..298803854359.95 rows=14240077 width=6) Filter: (NOT (subplan)) SubPlan - Materialize (cost=51021.78..69422.58 rows=1032980 width=25) - Seq Scan on patronos (cost=0.00..41917.80 rows=1032980 width=25) How many rows exist in salarios, but not in patronos? How many rows are there in salarios? Rows: Patronos: 1032980 Salarios: 28480200 What does the explain look like for: delete from ceroriesgo.salarios s where not exists (select 1 from ceroriesgo.patronos where numero_patrono = s.numero_patrono); Also, is this not a case for a foreign key with a cascade delete? No, this is not cascade delete case because I need to delete from salarios not from patronos. http://www.postgresql.org/docs/8.2/static/ddl-constraints.html -- Chad http://www.postgresqlforums.com/ _ Charla con tus amigos en línea mediante MSN Messenger: http://messenger.latam.msn.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Very slow queries
How many rows were delete last time you ran the query? Chad's query looks good but here is another variation that may help. Delete From ceroriesgo.salarios Where numero_patrono In (Select ceroriesgo.salarios.numero_patrono From ceroriesgo.salarios Left Join ceroriesgo.patronos Using (numero_patrono) Where ceroriesgo.patronos.numero_patrono Is Null) Hope that Helps, Ted Sidar López Cruz wrote: From: Chad Wagner [EMAIL PROTECTED] To: Sidar López Cruz [EMAIL PROTECTED] CC: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Very slow queries Date: Tue, 30 Jan 2007 17:37:17 -0500 On 1/30/07, Sidar López Cruz [EMAIL PROTECTED] wrote: query: Delete From ceroriesgo.salarios Where numero_patrono Not In (Select numero_patrono From ceroriesgo.patronos) Seq Scan on salarios (cost=51021.78..298803854359.95 rows=14240077 width=6) Filter: (NOT (subplan)) SubPlan - Materialize (cost=51021.78..69422.58 rows=1032980 width=25) - Seq Scan on patronos (cost=0.00..41917.80 rows=1032980 width=25) How many rows exist in salarios, but not in patronos? How many rows are there in salarios? Rows: Patronos: 1032980 Salarios: 28480200 What does the explain look like for: delete from ceroriesgo.salarios s where not exists (select 1 from ceroriesgo.patronos where numero_patrono = s.numero_patrono); Also, is this not a case for a foreign key with a cascade delete? No, this is not cascade delete case because I need to delete from salarios not from patronos. http://www.postgresql.org/docs/8.2/static/ddl-constraints.html -- Chad http://www.postgresqlforums.com/ _ Charla con tus amigos en línea mediante MSN Messenger: http://messenger.latam.msn.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend -- *Edward Allen* Software Engineer Black Duck Software, Inc. [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] T +1.781.891.5100 x133 F +1.781.891.5145 http://www.blackducksoftware.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Very slow queries
How many rows were delete last time you ran the query? I never delete any rows, the tables was inserted with copy command, then I create index and I need to delete these records on ceroriesgo.salarios to create the foreign key restriction on it. Chad's query looks good but here is another variation that may help. Delete From ceroriesgo.salarios Where numero_patrono In (Select ceroriesgo.salarios.numero_patrono From ceroriesgo.salarios Left Join ceroriesgo.patronos Using (numero_patrono) Where ceroriesgo.patronos.numero_patrono Is Null) Hope that Helps, Ted Sidar López Cruz wrote: From: Chad Wagner [EMAIL PROTECTED] To: Sidar López Cruz [EMAIL PROTECTED] CC: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Very slow queries Date: Tue, 30 Jan 2007 17:37:17 -0500 On 1/30/07, Sidar López Cruz [EMAIL PROTECTED] wrote: query: Delete From ceroriesgo.salarios Where numero_patrono Not In (Select numero_patrono From ceroriesgo.patronos) Seq Scan on salarios (cost=51021.78..298803854359.95 rows=14240077 width=6) Filter: (NOT (subplan)) SubPlan - Materialize (cost=51021.78..69422.58 rows=1032980 width=25) - Seq Scan on patronos (cost=0.00..41917.80 rows=1032980 width=25) How many rows exist in salarios, but not in patronos? How many rows are there in salarios? Rows: Patronos: 1032980 Salarios: 28480200 What does the explain look like for: delete from ceroriesgo.salarios s where not exists (select 1 from ceroriesgo.patronos where numero_patrono = s.numero_patrono); Also, is this not a case for a foreign key with a cascade delete? No, this is not cascade delete case because I need to delete from salarios not from patronos. http://www.postgresql.org/docs/8.2/static/ddl-constraints.html -- Chad http://www.postgresqlforums.com/ _ Charla con tus amigos en línea mediante MSN Messenger: http://messenger.latam.msn.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend -- *Edward Allen* Software Engineer Black Duck Software, Inc. [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] T +1.781.891.5100 x133 F +1.781.891.5145 http://www.blackducksoftware.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq _ Las mejores tiendas, los precios mas bajos, entregas en todo el mundo, YupiMSN Compras: http://latam.msn.com/compras/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Very slow queries
From: Ted Allen [EMAIL PROTECTED] To: Sidar López Cruz [EMAIL PROTECTED] CC: pgsql-performance@postgresql.org, [EMAIL PROTECTED] Subject: Re: [PERFORM] Very slow queries Date: Wed, 31 Jan 2007 09:32:43 -0500 How many rows were delete last time you ran the query? Chad's query looks good but here is another variation that may help. Delete From ceroriesgo.salarios Where numero_patrono In (Select ceroriesgo.salarios.numero_patrono From ceroriesgo.salarios Left Join ceroriesgo.patronos Using (numero_patrono) Where ceroriesgo.patronos.numero_patrono Is Null) Executing these query take: Query returned successfully: 290 rows affected, 2542387 ms execution time. I think that's too many time Hope that Helps, Ted Sidar López Cruz wrote: From: Chad Wagner [EMAIL PROTECTED] To: Sidar López Cruz [EMAIL PROTECTED] CC: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Very slow queries Date: Tue, 30 Jan 2007 17:37:17 -0500 On 1/30/07, Sidar López Cruz [EMAIL PROTECTED] wrote: query: Delete From ceroriesgo.salarios Where numero_patrono Not In (Select numero_patrono From ceroriesgo.patronos) Seq Scan on salarios (cost=51021.78..298803854359.95 rows=14240077 width=6) Filter: (NOT (subplan)) SubPlan - Materialize (cost=51021.78..69422.58 rows=1032980 width=25) - Seq Scan on patronos (cost=0.00..41917.80 rows=1032980 width=25) How many rows exist in salarios, but not in patronos? How many rows are there in salarios? Rows: Patronos: 1032980 Salarios: 28480200 What does the explain look like for: delete from ceroriesgo.salarios s where not exists (select 1 from ceroriesgo.patronos where numero_patrono = s.numero_patrono); Also, is this not a case for a foreign key with a cascade delete? No, this is not cascade delete case because I need to delete from salarios not from patronos. http://www.postgresql.org/docs/8.2/static/ddl-constraints.html -- Chad http://www.postgresqlforums.com/ _ Charla con tus amigos en línea mediante MSN Messenger: http://messenger.latam.msn.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend -- *Edward Allen* Software Engineer Black Duck Software, Inc. [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] T +1.781.891.5100 x133 F +1.781.891.5145 http://www.blackducksoftware.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq _ MSN Amor: busca tu ½ naranja http://latam.msn.com/amor/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Very slow queries
On 1/31/07, Sidar López Cruz [EMAIL PROTECTED] wrote: Executing these query take: Query returned successfully: 290 rows affected, 2542387 ms execution time. I think that's too many time I would post the plans that you are getting, otherwise just mentioning the execution time is not very helpful. Also, yet another syntax is the UPDATE foo... FROM tab1, tab2... syntax. http://www.postgresql.org/docs/8.2/static/sql-update.html In any case, I thought you mentioned this was a one off query? -- Chad http://www.postgresqlforums.com/
Re: [PERFORM] Very slow queries
From: Ted Allen [EMAIL PROTECTED] To: Sidar López Cruz [EMAIL PROTECTED] CC: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Very slow queries Date: Tue, 30 Jan 2007 16:14:38 -0500 What indexes do those tables have? Any? Yes: TABLE ceroriesgo.patronos ADD CONSTRAINT patronos_pkey PRIMARY KEY(numero_patrono); INDEX salarios_numero_patrono_idx ON ceroriesgo.salarios USING btree (numero_patrono); Sidar López Cruz wrote: Check this: query: Delete From ceroriesgo.salarios Where numero_patrono Not In (Select numero_patrono From ceroriesgo.patronos) Seq Scan on salarios (cost=51021.78..298803854359.95 rows=14240077 width=6) Filter: (NOT (subplan)) SubPlan - Materialize (cost=51021.78..69422.58 rows=1032980 width=25) - Seq Scan on patronos (cost=0.00..41917.80 rows=1032980 width=25) These query took a day to finish, how or who can improove better performance of my PostgreSQL. _ Charla con tus amigos en línea mediante MSN Messenger: http://messenger.latam.msn.com/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org -- *Edward Allen* Software Engineer Black Duck Software, Inc. [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] T +1.781.891.5100 x133 F +1.781.891.5145 http://www.blackducksoftware.com _ Charla con tus amigos en línea mediante MSN Messenger: http://messenger.latam.msn.com/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Very slow queries
What indexes do those tables have? Any? Sidar López Cruz wrote: Check this: query: Delete From ceroriesgo.salarios Where numero_patrono Not In (Select numero_patrono From ceroriesgo.patronos) Seq Scan on salarios (cost=51021.78..298803854359.95 rows=14240077 width=6) Filter: (NOT (subplan)) SubPlan - Materialize (cost=51021.78..69422.58 rows=1032980 width=25) - Seq Scan on patronos (cost=0.00..41917.80 rows=1032980 width=25) These query took a day to finish, how or who can improove better performance of my PostgreSQL. _ Charla con tus amigos en línea mediante MSN Messenger: http://messenger.latam.msn.com/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org -- *Edward Allen* Software Engineer Black Duck Software, Inc. [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] T +1.781.891.5100 x133 F +1.781.891.5145 http://www.blackducksoftware.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Very slow queries
On 1/30/07, Sidar López Cruz [EMAIL PROTECTED] wrote: query: Delete From ceroriesgo.salarios Where numero_patrono Not In (Select numero_patrono From ceroriesgo.patronos) Seq Scan on salarios (cost=51021.78..298803854359.95 rows=14240077 width=6) Filter: (NOT (subplan)) SubPlan - Materialize (cost=51021.78..69422.58 rows=1032980 width=25) - Seq Scan on patronos (cost=0.00..41917.80 rows=1032980 width=25) How many rows exist in salarios, but not in patronos? How many rows are there in salarios? What does the explain look like for: delete from ceroriesgo.salarios s where not exists (select 1 from ceroriesgo.patronos where numero_patrono = s.numero_patrono); Also, is this not a case for a foreign key with a cascade delete? http://www.postgresql.org/docs/8.2/static/ddl-constraints.html -- Chad http://www.postgresqlforums.com/
Re: [PERFORM] Very slow queries - please help
Thanks very much - there are a lot of good articles there... Reading as fast as I can :) Best, Bealach From: Thomas F. O'Connell [EMAIL PROTECTED] To: Bealach-na Bo [EMAIL PROTECTED] CC: PgSQL - Performance pgsql-performance@postgresql.org Subject: Re: [PERFORM] Very slow queries - please help Date: Sun, 4 Dec 2005 00:40:01 -0600 On Nov 24, 2005, at 12:14 PM, Bealach-na Bo wrote: The consensus seems to be that I need more indexes and I also need to look into the NOT IN statement as a possible bottleneck. I've introduced the indexes which has led to a DRAMATIC change in response time. Now I have to experiment with INNER JOIN - OUTER JOIN variations, SET ENABLE_SEQSCAN=OFF. Forgive me for not mentioning each person individually and by name. You have all contributed to confirming what I had suspected (and hoped): that *I* have a lot to learn! I'm attaching table descriptions, the first few lines of top output while the queries were running, index lists, sample queries and EXPLAIN ANALYSE output BEFORE and AFTER the introduction of the indexes. As I said, DRAMATIC :) I notice that the CPU usage does not vary very much, it's nearly 100% anyway, but the memory usage drops markedly, which is another very nice result of the index introduction. Any more comments and tips would be very welcome. You might find the following resources from techdocs instructive: http://techdocs.postgresql.org/redir.php?link=/techdocs/ pgsqladventuresep2.php http://techdocs.postgresql.org/redir.php?link=/techdocs/ pgsqladventuresep3.php These documents provide some guidance into the process of index selection. It seems like you could still stand to benefit from more indexes based on your queries, table definitions, and current indexes. -- Thomas F. O'Connell Database Architecture and Programming Co-Founder Sitening, LLC http://www.sitening.com/ 110 30th Avenue North, Suite 6 Nashville, TN 37203-6320 615-260-0005 (cell) 615-469-5150 (office) 615-469-5151 (fax) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Very slow queries - please help
On Nov 24, 2005, at 12:14 PM, Bealach-na Bo wrote: The consensus seems to be that I need more indexes and I also need to look into the NOT IN statement as a possible bottleneck. I've introduced the indexes which has led to a DRAMATIC change in response time. Now I have to experiment with INNER JOIN - OUTER JOIN variations, SET ENABLE_SEQSCAN=OFF. Forgive me for not mentioning each person individually and by name. You have all contributed to confirming what I had suspected (and hoped): that *I* have a lot to learn! I'm attaching table descriptions, the first few lines of top output while the queries were running, index lists, sample queries and EXPLAIN ANALYSE output BEFORE and AFTER the introduction of the indexes. As I said, DRAMATIC :) I notice that the CPU usage does not vary very much, it's nearly 100% anyway, but the memory usage drops markedly, which is another very nice result of the index introduction. Any more comments and tips would be very welcome. You might find the following resources from techdocs instructive: http://techdocs.postgresql.org/redir.php?link=/techdocs/ pgsqladventuresep2.php http://techdocs.postgresql.org/redir.php?link=/techdocs/ pgsqladventuresep3.php These documents provide some guidance into the process of index selection. It seems like you could still stand to benefit from more indexes based on your queries, table definitions, and current indexes. -- Thomas F. O'Connell Database Architecture and Programming Co-Founder Sitening, LLC http://www.sitening.com/ 110 30th Avenue North, Suite 6 Nashville, TN 37203-6320 615-260-0005 (cell) 615-469-5150 (office) 615-469-5151 (fax) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] Very slow queries - please help.
Hi Folks, I'm new to Postgresql. I'm having great difficulties getting the performance I had hoped for from Postgresql 8.0. The typical query below takes ~20 minutes !! I hope an expert out there will tell me what I'm doing wrong - I hope *I* am doing something wrong. Hardware Single processor, Intel Xeon 3.06 GHz machine running Red Hat Ent. 4. with 1.5 GB of RAM. The machine is dedicated to running Postgresql 8.0 and Apache/mod_perl etc. The database is being accessed for report generation via a web form. The web server talks to Pg over TCP/IP (I know, that I don't need to do this if they are all on the same machine, but I have good reasons for this and don't suspect that this is where my problems are - I have the same poor performance when running from psql on the server.) Database Very simple, not fully normalized set of two tables. The first table, very small (2000 lines of 4 cols with very few chars and integers in in col). The other quite a bit larger (50 lines with 15 cols. with the largest fields ~ 256 chars) Typical query SELECT n.name FROM node n WHERE n.name LIKE '56x%' AND n.type='H' AND n.usage='TEST' AND n.node_id NOT IN (select n.node_id FROM job_log j INNER JOIN node n ON j.node_id = n.node_id WHERE n.name LIKE '56x%' AND n.type='H' AND n.usage='TEST' AND j.job_name = 'COPY FILES' AND j.job_start = '2005-11-14 00:00:00' AND (j.job_stop = '2005-11-22 09:31:10' OR j.job_stop IS NULL)) ORDER BY n.name The node table is the small table and the job_log table is the large table. I've tried all the basic things that I found in the documentation like VACUUM ANALYZE, EXPLAIN etc., but I suspect there is something terribly wrong with what I'm doing and these measures will not shave off 19 min and 50 seconds off the query time. Any help and comments would be very much appreciated. Bealach ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Very slow queries - please help.
Typical query SELECT n.name FROM node n WHERE n.name LIKE '56x%' AND n.type='H' AND n.usage='TEST' AND n.node_id NOT IN (select n.node_id FROM job_log j INNER JOIN node n ON j.node_id = n.node_id WHERE n.name LIKE '56x%' AND n.type='H' AND n.usage='TEST' AND j.job_name = 'COPY FILES' AND j.job_start = '2005-11-14 00:00:00' AND (j.job_stop = '2005-11-22 09:31:10' OR j.job_stop IS NULL)) ORDER BY n.name Do you have any indexes? regards Claus ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Very slow queries - please help.
Hi, Thanks for your comments. I've explicitly made any indexes, but the default ones are: [EMAIL PROTECTED] \di List of relations Schema | Name | Type | Owner | Table -+-+---+-+- user | job_log_id_pkey | index | user | job_log user | node_id_pkey| index | user | node user | node_name_key | index | user | node (3 rows) I'm also sending the EXPLAIN outputs. explain SELECT n.name,n.type, n.usage, j.status, j.job_start,j.job_stop, j.nfiles_in_job,j.job_name FROM job_log j INNER JOIN node n ON j.node_id = n.node_id WHERE n.name LIKE '56x%' AND n.type = 'K' AND n.usage = 'LIVE' AND j.job_name = 'COPY FILES' AND j.job_start = '2005-11-14 00:00:00' AND (j.job_stop = '2005-11-14 05:00:00' OR j.job_stop IS NULL) ORDER BY n.name; QUERY PLAN -- Nested Loop (cost=0.00..75753.31 rows=1 width=461) Join Filter: (inner.node_id = outer.node_id) - Index Scan using node_name_key on node n (cost=0.00..307.75 rows=1 width=181) Filter: ((name ~~ '56x%'::text) AND (type = 'K'::bpchar) AND (usage = 'LIVE'::bpchar)) - Seq Scan on job_log j (cost=0.00..75445.54 rows=1 width=288) Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start = '2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop = '2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL))) (6 rows) explain SELECT n.name, n.type, n.usage FROM node n WHERE n.name LIKE '56x%' AND n.type = 'K' AND n.usage = 'LIVE' AND n.node_id NOT IN (SELECT n.node_id FROM job_log j INNER JOIN node n ON j.node_id = n.node_id WHERE n.name LIKE '56x%' AND n.type = 'K' AND n.usage = 'LIVE' AND j.job_name = 'COPY FILES' AND j.job_start = '2005-11-14 00:00:00' AND (j.job_stop = '2005-11-14 05:00:00' OR j.job_stop IS NULL)) ORDER BY n.name; QUERY PLAN -- Index Scan using node_name_key on node n (cost=75451.55..75764.94 rows=1 width=177) Filter: ((name ~~ '56x%'::text) AND (type = 'K'::bpchar) AND (usage = 'LIVE'::bpchar) AND (NOT (hashed subplan))) SubPlan - Nested Loop (cost=0.00..75451.54 rows=1 width=4) - Seq Scan on job_log j (cost=0.00..75445.54 rows=1 width=4) Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start = '2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop = '2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL))) - Index Scan using node_id_pkey on node n (cost=0.00..5.99 rows=1 width=4) Index Cond: (outer.node_id = n.node_id) Filter: ((name ~~ '56x%'::text) AND (type = 'K'::bpchar) AND (usage = 'LIVE'::bpchar)) Yours, Bealach From: Claus Guttesen [EMAIL PROTECTED] To: Bealach-na Bo [EMAIL PROTECTED] CC: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Very slow queries - please help. Date: Thu, 24 Nov 2005 14:23:38 +0100 Typical query SELECT n.name FROM node n WHERE n.name LIKE '56x%' AND n.type='H' AND n.usage='TEST' AND n.node_id NOT IN (select n.node_id FROM job_log j INNER JOIN node n ON j.node_id = n.node_id WHERE n.name LIKE '56x%' AND n.type='H' AND n.usage='TEST' AND j.job_name = 'COPY FILES' AND j.job_start = '2005-11-14 00:00:00' AND (j.job_stop = '2005-11-22 09:31:10' OR j.job_stop IS NULL)) ORDER BY n.name Do you have any indexes? regards Claus ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Very slow queries - please help.
Hi, I'm also sending the EXPLAIN outputs. Please provide EXPLAIN ANALYZE outputs instead of EXPLAIN. You will have more information. Indexes on your tables are obviously missing. You should try to add: CREATE INDEX idx_node_filter ON node(name, type, usage); CREATE INDEX idx_job_log_filter ON job_log(job_name, job_start, job_stop); I'm not so sure it's a good idea to add job_stop in this index as you have an IS NULL in your query so I'm not sure it can be used. You should try it anyway and remove it if not needed. I added all your search fields in the indexes but it depends a lot on the selectivity of your conditions. I don't know your data but I think you understand the idea. HTH -- Guillaume ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Very slow queries - please help.
Bealach-na Bo [EMAIL PROTECTED] writes: I'm having great difficulties getting the performance I had hoped for from Postgresql 8.0. The typical query below takes ~20 minutes !! You need to show us the table definition (including indexes) and the EXPLAIN ANALYZE results for the query. It seems likely that the NOT IN is the source of your problems, but it's hard to be sure without EXPLAIN results. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Very slow queries - please help
OK. The consensus seems to be that I need more indexes and I also need to look into the NOT IN statement as a possible bottleneck. I've introduced the indexes which has led to a DRAMATIC change in response time. Now I have to experiment with INNER JOIN - OUTER JOIN variations, SET ENABLE_SEQSCAN=OFF. Forgive me for not mentioning each person individually and by name. You have all contributed to confirming what I had suspected (and hoped): that *I* have a lot to learn! I'm attaching table descriptions, the first few lines of top output while the queries were running, index lists, sample queries and EXPLAIN ANALYSE output BEFORE and AFTER the introduction of the indexes. As I said, DRAMATIC :) I notice that the CPU usage does not vary very much, it's nearly 100% anyway, but the memory usage drops markedly, which is another very nice result of the index introduction. Any more comments and tips would be very welcome. Thank you all for your input. Bealach. [EMAIL PROTECTED] \d job_log Table blouser.job_log Column |Type |Modifiers +-+-- job_log_id | integer | not null default nextval('job_log_id_seq'::text) first_registry | timestamp without time zone | blogger_name | character(50) | node_id| integer | job_type | character(50) | job_name | character(256) | job_start | timestamp without time zone | job_timeout| interval| job_stop | timestamp without time zone | nfiles_in_job | integer | status | integer | error_code | smallint| Indexes: job_log_id_pkey PRIMARY KEY, btree (job_log_id) Check constraints: job_log_status_check CHECK (status = 0 OR status = 1 OR status = 8 OR status = 9) Foreign-key constraints: legal_node FOREIGN KEY (node_id) REFERENCES node(node_id) [EMAIL PROTECTED] \d node Table blouser.node Column | Type | Modifiers -+---+--- node_id | integer | not null default nextval('node_id_seq'::text) name| character(50) | type| character(1) | usage | character(4) | Indexes: node_id_pkey PRIMARY KEY, btree (node_id) node_name_key UNIQUE, btree (name) Check constraints: node_type_check CHECK (type = 'B'::bpchar OR type = 'K'::bpchar OR type = 'C'::bpchar OR type = 'T'::bpchar OR type = 'R'::bpchar) node_usage_check CHECK (usage = 'TEST'::bpchar OR usage = 'LIVE'::bpchar) #before new indexes were created Tasks: 114 total, 2 running, 112 sleeping, 0 stopped, 0 zombie Cpu(s): 25.7% us, 24.5% sy, 0.0% ni, 49.4% id, 0.3% wa, 0.0% hi, 0.0% si Mem: 1554788k total, 1513576k used,41212k free,31968k buffers Swap: 1020024k total,27916k used, 992108k free, 708728k cached PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND 25883 postgres 25 0 20528 12m 11m R 99.7 0.8 4:54.91 postmaster [EMAIL PROTECTED] \di List of relations Schema | Name | Type | Owner | Table -+-+---+-+- blouser | job_log_id_pkey | index | blouser | job_log blouser | node_id_pkey| index | blouser | node blouser | node_name_key | index | blouser | node (3 rows) EXPLAIN ANALYSE SELECT n.name,n.type, n.usage, j.status, j.job_start,j.job_stop, j.nfiles_in_job,j.job_name FROM job_log j INNER JOIN node n ON j.node_id = n.node_id WHERE n.name LIKE '711%' AND n.type = 'K' AND n.usage = 'LIVE' AND j.job_name = 'COPY FILES' AND j.job_start = '2005-11-14 00:00:00' AND (j.job_stop = '2005-11-14 05:00:00' OR j.job_stop IS NULL) ORDER BY n.name; QUERY PLAN --- Nested Loop (cost=0.00..75753.31 rows=1 width=461) (actual time=270486.692..291662.350 rows=3 loops=1) Join Filter: (inner.node_id = outer.node_id) - Index Scan using node_name_key on node n (cost=0.00..307.75 rows=1 width=181) (actual time=0.135..11.034 rows=208 loops=1) Filter: ((name ~~ '711%'::text) AND (type = 'K'::bpchar) AND (usage = 'LIVE'::bpchar)) - Seq Scan on job_log j (cost=0.00..75445.54 rows=1 width=288) (actual time=273.374..1402.089 rows=22 loops=208) Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start = '2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop = '2005-11-14
[PERFORM] Very slow queries - please help
A quick note to say that I'm very grateful for Tom Lane's input also. Tom, I did put you on the list of recipients for my last posting to pgsql-performance, but got: cut here This is an automatically generated Delivery Status Notification. Delivery to the following recipients failed. [EMAIL PROTECTED] Many regards, Bealach ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly