Re: [PERFORM] VERY slow queries at random

2007-06-08 Thread Kristo Kaiv


On 07.06.2007, at 22:42, Greg Smith wrote:


On Thu, 7 Jun 2007, Gunther Mayer wrote:

wal checkpoint config is on pg defaults everywhere, all relevant  
config options are commented out. I'm no expert in wal stuff but I  
don't see how that could cause the problem?


Checkpoints are very resource intensive and can cause other  
processes (including your selects) to hang for a considerable  
period of time while they are processing.  With the default  
parameters, they can happen very frequently.  Normally  
checkpoint_segments and checkpoint_timeout are increased in order  
to keep this from happening.


This would normally be an issue only if you're writing a  
substantial amount of data to your tables.  If there are a lot of  
writes going on, you might get some improvement by adjusting those  
parameters upward; the defaults are pretty low.  Make sure you read  
http://www.postgresql.org/docs/8.2/static/wal-configuration.html  
first so you know what you're playing with, there are some recovery  
implications invoved.


I remember us having problems with 8.0 background writer, you might  
want to try turning it off. Not sure if it behaves as badly in 8.2.

increasing wal buffers might be a good idea also.

Kristo


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Kristo Kaiv

could be that the checkpoints are done too seldom.
what is your wal checkpoint config?

Kristo
On 07.06.2007, at 0:27, Scott Marlowe wrote:


Gunther Mayer wrote:

Hi there,

We run a small ISP with a FreeBSD/freeradius/postgresql 8.2.4 backend
and 200+ users. Authentication happens via UAM/hotspot and I see a  
lot

of authorisation and accounting packets that are handled via PL/PGSQL
functions directly in the database.

Everything seems to work 100% except that a few times a day I see

Jun  6 10:41:31 caligula postgres[57347]: [4-1] radiususer: LOG:
duration: 19929.291 ms  statement: SELECT fn_accounting_start(...)

in my logs. I'm logging slow queries with  
log_min_duration_statement =
500 in my postgresql.conf. Sometimes another query runs equally  
slow or
even slower (I've seen 139 seconds!!!) a few minutes before or  
after as

well, but then everything is back to normal.

Even though I haven't yet indexed my data I know that the system is
performant because my largest table (the accounting one) only has  
5000+
rows, the entire database is only a few MB's and I have plenty of  
memory

(2GB), shared_buffers = 100MB and max_fsm_pages = 179200.  Also from
briefly enabling

log_parser_stats = on
log_planner_stats = on
log_executor_stats = on

I saw that most queries are 100% satisfied from cache so the disk
doesn't even get hit. Finally, the problem seems unrelated to load
because it happens at 4am just as likely as at peak traffic time.

What the heck could cause such erratic behaviour? I suspect some  
type of

resource problem but what and how could I dig deeper?


Maybe your hard drive is set to spin down after a certain period of  
idle, and since most all your data is coming from memory, then it  
might be that on the rare occasion when it needs to hit the drive  
it's not spun up anymore.


Maybe some other process is cranking up (cron jobs???) that are  
chewing up all your I/O bandwidth?


Hard to say.  Anything in the system logs that would give you a  
hint?  Try correlating them by the time of the slow pgsql queries.



---(end of  
broadcast)---

TIP 2: Don't 'kill -9' the postmaster



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Gunther Mayer

Andrew Sullivan wrote:

On Wed, Jun 06, 2007 at 09:20:54PM +0200, Gunther Mayer wrote:
  

What the heck could cause such erratic behaviour? I suspect some type of
resource problem but what and how could I dig deeper?



Is something (perhaps implicitly) locking the table?  That will cause
this.
  
There are a whole bunch of update queries that fire all the time but 
afaik none of them ever lock the entire table. To the best of my 
knowledge UPDATE ... WHERE ... only locks those rows that it actually 
operates on, in my case this is always a single row. No explicit locking 
is done anywhere, but perhaps you're right and it is a locking issue. 
Question is, how do I find out about locks at the time when I only get 
told about the slow query *after* it has completed and postgres has told 
me so by logging a slow query entry in my logs?


Gunther

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Gunther Mayer

Scott Marlowe wrote:

Gunther Mayer wrote:

Hi there,

We run a small ISP with a FreeBSD/freeradius/postgresql 8.2.4 backend
and 200+ users. Authentication happens via UAM/hotspot and I see a lot
of authorisation and accounting packets that are handled via PL/PGSQL
functions directly in the database.

Everything seems to work 100% except that a few times a day I see

Jun  6 10:41:31 caligula postgres[57347]: [4-1] radiususer: LOG:
duration: 19929.291 ms  statement: SELECT fn_accounting_start(...)

in my logs. I'm logging slow queries with log_min_duration_statement =
500 in my postgresql.conf. Sometimes another query runs equally slow or
even slower (I've seen 139 seconds!!!) a few minutes before or after as
well, but then everything is back to normal.

Even though I haven't yet indexed my data I know that the system is
performant because my largest table (the accounting one) only has 5000+
rows, the entire database is only a few MB's and I have plenty of memory
(2GB), shared_buffers = 100MB and max_fsm_pages = 179200.  Also from
briefly enabling

log_parser_stats = on
log_planner_stats = on
log_executor_stats = on

I saw that most queries are 100% satisfied from cache so the disk
doesn't even get hit. Finally, the problem seems unrelated to load
because it happens at 4am just as likely as at peak traffic time.

What the heck could cause such erratic behaviour? I suspect some type of
resource problem but what and how could I dig deeper? 


Maybe your hard drive is set to spin down after a certain period of 
idle, and since most all your data is coming from memory, then it 
might be that on the rare occasion when it needs to hit the drive it's 
not spun up anymore.
I doubt that as a serious amount of logging is taking place on the box 
all the time which goes straight to disk. Also, no disk in the world 
would take more than a minute to spin up...
Maybe some other process is cranking up (cron jobs???) that are 
chewing up all your I/O bandwidth?
Hmm, I investigated that too but if that was the case the queries would 
run slow always at the same time of the day.
Hard to say.  Anything in the system logs that would give you a hint?  
Try correlating them by the time of the slow pgsql queries.
Nothing relevant in the system logs at the time of the slow query 
appearing. I have in the mean time tweaked syslog-ng.conf such that as 
soon as it detects a duration: greater than 500ms log message it 
spawns top and top -m io and redirects the output to file. At least in 
that way I can check what's keeping the system busy immediately *after* 
a slow query has occured. Of course now Murphy's law has it that since 
I've done that (30 hours ago) not a single slow query has fired, but 
hey, I'll look at the results once I have them.


On another note, autovacuum couldn't cause such issues, could it? I do 
have autovacuum enabled (autovacuum=on as well as 
stats_start_collector=on, stats_block_level = on and stats_row_level = 
on), is there any possibility that autovacuum is not as resource 
friendly as advertised?


Gunther

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Andrew Sullivan
On Thu, Jun 07, 2007 at 04:22:47PM +0200, Gunther Mayer wrote:
 There are a whole bunch of update queries that fire all the time but 
 afaik none of them ever lock the entire table. To the best of my 
 knowledge UPDATE ... WHERE ... only locks those rows that it actually 
 operates on, in my case this is always a single row.

Well that shouldn't be biting you, then (you're not in SERIALIZABLE
mode, right?).  The other obvious bit would be checkpoint storms. 
What's your bgwriter config like?

 Question is, how do I find out about locks at the time when I only get 
 told about the slow query *after* it has completed and postgres has told 
 me so by logging a slow query entry in my logs?

You can't :(

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well. 
--Dennis Ritchie

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Alvaro Herrera
Gunther Mayer wrote:

 On another note, autovacuum couldn't cause such issues, could it? I do 
 have autovacuum enabled (autovacuum=on as well as 
 stats_start_collector=on, stats_block_level = on and stats_row_level = 
 on), is there any possibility that autovacuum is not as resource 
 friendly as advertised?

Hmm.  I am not sure where did you read that but I don't think it has
ever been stated that autovacuum is resource friendly in the default
configuration (I, for one, have never tried, intended or wanted to state
that).  I suggest tuning the autovacuum_vacuum_cost_delay parameters if
you want it to interfere less with your regular operation.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Greg Smith

On Thu, 7 Jun 2007, Gunther Mayer wrote:

wal checkpoint config is on pg defaults everywhere, all relevant config 
options are commented out. I'm no expert in wal stuff but I don't see how 
that could cause the problem?


Checkpoints are very resource intensive and can cause other processes 
(including your selects) to hang for a considerable period of time while 
they are processing.  With the default parameters, they can happen very 
frequently.  Normally checkpoint_segments and checkpoint_timeout are 
increased in order to keep this from happening.


This would normally be an issue only if you're writing a substantial 
amount of data to your tables.  If there are a lot of writes going on, you 
might get some improvement by adjusting those parameters upward; the 
defaults are pretty low.  Make sure you read 
http://www.postgresql.org/docs/8.2/static/wal-configuration.html first so 
you know what you're playing with, there are some recovery implications 
invoved.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[PERFORM] VERY slow queries at random

2007-06-06 Thread Gunther Mayer

Hi there,

We run a small ISP with a FreeBSD/freeradius/postgresql 8.2.4 backend
and 200+ users. Authentication happens via UAM/hotspot and I see a lot
of authorisation and accounting packets that are handled via PL/PGSQL
functions directly in the database.

Everything seems to work 100% except that a few times a day I see

Jun  6 10:41:31 caligula postgres[57347]: [4-1] radiususer: LOG:
duration: 19929.291 ms  statement: SELECT fn_accounting_start(...)

in my logs. I'm logging slow queries with log_min_duration_statement =
500 in my postgresql.conf. Sometimes another query runs equally slow or
even slower (I've seen 139 seconds!!!) a few minutes before or after as
well, but then everything is back to normal.

Even though I haven't yet indexed my data I know that the system is
performant because my largest table (the accounting one) only has 5000+
rows, the entire database is only a few MB's and I have plenty of memory
(2GB), shared_buffers = 100MB and max_fsm_pages = 179200.  Also from
briefly enabling

log_parser_stats = on
log_planner_stats = on
log_executor_stats = on

I saw that most queries are 100% satisfied from cache so the disk
doesn't even get hit. Finally, the problem seems unrelated to load
because it happens at 4am just as likely as at peak traffic time.

What the heck could cause such erratic behaviour? I suspect some type of
resource problem but what and how could I dig deeper?

Gunther


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] VERY slow queries at random

2007-06-06 Thread Andrew Sullivan
On Wed, Jun 06, 2007 at 09:20:54PM +0200, Gunther Mayer wrote:
 
 What the heck could cause such erratic behaviour? I suspect some type of
 resource problem but what and how could I dig deeper?

Is something (perhaps implicitly) locking the table?  That will cause
this.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The year's penultimate month is not in truth a good way of saying
November.
--H.W. Fowler

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] VERY slow queries at random

2007-06-06 Thread Scott Marlowe

Gunther Mayer wrote:

Hi there,

We run a small ISP with a FreeBSD/freeradius/postgresql 8.2.4 backend
and 200+ users. Authentication happens via UAM/hotspot and I see a lot
of authorisation and accounting packets that are handled via PL/PGSQL
functions directly in the database.

Everything seems to work 100% except that a few times a day I see

Jun  6 10:41:31 caligula postgres[57347]: [4-1] radiususer: LOG:
duration: 19929.291 ms  statement: SELECT fn_accounting_start(...)

in my logs. I'm logging slow queries with log_min_duration_statement =
500 in my postgresql.conf. Sometimes another query runs equally slow or
even slower (I've seen 139 seconds!!!) a few minutes before or after as
well, but then everything is back to normal.

Even though I haven't yet indexed my data I know that the system is
performant because my largest table (the accounting one) only has 5000+
rows, the entire database is only a few MB's and I have plenty of memory
(2GB), shared_buffers = 100MB and max_fsm_pages = 179200.  Also from
briefly enabling

log_parser_stats = on
log_planner_stats = on
log_executor_stats = on

I saw that most queries are 100% satisfied from cache so the disk
doesn't even get hit. Finally, the problem seems unrelated to load
because it happens at 4am just as likely as at peak traffic time.

What the heck could cause such erratic behaviour? I suspect some type of
resource problem but what and how could I dig deeper? 


Maybe your hard drive is set to spin down after a certain period of 
idle, and since most all your data is coming from memory, then it might 
be that on the rare occasion when it needs to hit the drive it's not 
spun up anymore.


Maybe some other process is cranking up (cron jobs???) that are chewing 
up all your I/O bandwidth?


Hard to say.  Anything in the system logs that would give you a hint?  
Try correlating them by the time of the slow pgsql queries.



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Very slow queries

2007-01-31 Thread Sidar López Cruz





From: Chad Wagner [EMAIL PROTECTED]
To: Sidar López Cruz [EMAIL PROTECTED]
CC: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very slow queries
Date: Tue, 30 Jan 2007 17:37:17 -0500

On 1/30/07, Sidar López Cruz [EMAIL PROTECTED] wrote:


query: Delete From ceroriesgo.salarios Where numero_patrono Not In (Select
numero_patrono From ceroriesgo.patronos)

Seq Scan on salarios  (cost=51021.78..298803854359.95 rows=14240077
width=6)
  Filter: (NOT (subplan))
  SubPlan
-  Materialize  (cost=51021.78..69422.58 rows=1032980 width=25)
  -  Seq Scan on patronos  (cost=0.00..41917.80 rows=1032980
width=25)



How many rows exist in salarios, but not in patronos?  How many rows are
there in salarios?


Rows:
Patronos: 1032980
Salarios:  28480200



What does the explain look like for:

delete
from ceroriesgo.salarios s
where not exists (select 1
   from ceroriesgo.patronos
 where numero_patrono = s.numero_patrono);

Also, is this not a case for a foreign key with a cascade delete?


No, this is not cascade delete case because I need to delete from salarios 
not from patronos.




http://www.postgresql.org/docs/8.2/static/ddl-constraints.html


--
Chad
http://www.postgresqlforums.com/


_
Charla con tus amigos en línea mediante MSN Messenger: 
http://messenger.latam.msn.com/



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Very slow queries

2007-01-31 Thread Ted Allen

How many rows were delete last time you ran the query?

Chad's query looks good but here is another variation that may help. 


Delete From ceroriesgo.salarios Where numero_patrono In (Select
ceroriesgo.salarios.numero_patrono From ceroriesgo.salarios Left Join 
ceroriesgo.patronos Using (numero_patrono) Where  
ceroriesgo.patronos.numero_patrono Is Null)


Hope that Helps,
Ted

Sidar López Cruz wrote:





From: Chad Wagner [EMAIL PROTECTED]
To: Sidar López Cruz [EMAIL PROTECTED]
CC: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very slow queries
Date: Tue, 30 Jan 2007 17:37:17 -0500

On 1/30/07, Sidar López Cruz [EMAIL PROTECTED] wrote:


query: Delete From ceroriesgo.salarios Where numero_patrono Not In 
(Select

numero_patrono From ceroriesgo.patronos)

Seq Scan on salarios  (cost=51021.78..298803854359.95 rows=14240077
width=6)
  Filter: (NOT (subplan))
  SubPlan
-  Materialize  (cost=51021.78..69422.58 rows=1032980 width=25)
  -  Seq Scan on patronos  (cost=0.00..41917.80 rows=1032980
width=25)



How many rows exist in salarios, but not in patronos?  How many rows are
there in salarios?


Rows:
Patronos: 1032980
Salarios:  28480200



What does the explain look like for:

delete
from ceroriesgo.salarios s
where not exists (select 1
   from ceroriesgo.patronos
 where numero_patrono = s.numero_patrono);

Also, is this not a case for a foreign key with a cascade delete?


No, this is not cascade delete case because I need to delete from 
salarios not from patronos.




http://www.postgresql.org/docs/8.2/static/ddl-constraints.html


--
Chad
http://www.postgresqlforums.com/


_
Charla con tus amigos en línea mediante MSN Messenger: 
http://messenger.latam.msn.com/



---(end of broadcast)---
TIP 6: explain analyze is your friend




--

*Edward Allen*
Software Engineer
Black Duck Software, Inc.

[EMAIL PROTECTED] mailto:[EMAIL PROTECTED]
T +1.781.891.5100 x133
F +1.781.891.5145
http://www.blackducksoftware.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Very slow queries

2007-01-31 Thread Sidar López Cruz

How many rows were delete last time you ran the query?


I never delete any rows, the tables was inserted with copy command, then I 
create index and I need to delete these records on ceroriesgo.salarios to 
create the foreign key restriction on it.





Chad's query looks good but here is another variation that may help.

Delete From ceroriesgo.salarios Where numero_patrono In (Select
ceroriesgo.salarios.numero_patrono From ceroriesgo.salarios Left Join 
ceroriesgo.patronos Using (numero_patrono) Where  
ceroriesgo.patronos.numero_patrono Is Null)


Hope that Helps,
Ted

Sidar López Cruz wrote:





From: Chad Wagner [EMAIL PROTECTED]
To: Sidar López Cruz [EMAIL PROTECTED]
CC: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very slow queries
Date: Tue, 30 Jan 2007 17:37:17 -0500

On 1/30/07, Sidar López Cruz [EMAIL PROTECTED] wrote:


query: Delete From ceroriesgo.salarios Where numero_patrono Not In 
(Select

numero_patrono From ceroriesgo.patronos)

Seq Scan on salarios  (cost=51021.78..298803854359.95 rows=14240077
width=6)
  Filter: (NOT (subplan))
  SubPlan
-  Materialize  (cost=51021.78..69422.58 rows=1032980 width=25)
  -  Seq Scan on patronos  (cost=0.00..41917.80 rows=1032980
width=25)



How many rows exist in salarios, but not in patronos?  How many rows are
there in salarios?


Rows:
Patronos: 1032980
Salarios:  28480200



What does the explain look like for:

delete
from ceroriesgo.salarios s
where not exists (select 1
   from ceroriesgo.patronos
 where numero_patrono = s.numero_patrono);

Also, is this not a case for a foreign key with a cascade delete?


No, this is not cascade delete case because I need to delete from salarios 
not from patronos.




http://www.postgresql.org/docs/8.2/static/ddl-constraints.html


--
Chad
http://www.postgresqlforums.com/


_
Charla con tus amigos en línea mediante MSN Messenger: 
http://messenger.latam.msn.com/



---(end of broadcast)---
TIP 6: explain analyze is your friend




--

*Edward Allen*
Software Engineer
Black Duck Software, Inc.

[EMAIL PROTECTED] mailto:[EMAIL PROTECTED]
T +1.781.891.5100 x133
F +1.781.891.5145
http://www.blackducksoftware.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


_
Las mejores tiendas, los precios mas bajos, entregas en todo el mundo, 
YupiMSN Compras: http://latam.msn.com/compras/



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Very slow queries

2007-01-31 Thread Sidar López Cruz

From: Ted Allen [EMAIL PROTECTED]
To: Sidar López Cruz [EMAIL PROTECTED]
CC: pgsql-performance@postgresql.org,  [EMAIL PROTECTED]
Subject: Re: [PERFORM] Very slow queries
Date: Wed, 31 Jan 2007 09:32:43 -0500

How many rows were delete last time you ran the query?

Chad's query looks good but here is another variation that may help.

Delete From ceroriesgo.salarios Where numero_patrono In (Select
ceroriesgo.salarios.numero_patrono From ceroriesgo.salarios Left Join 
ceroriesgo.patronos Using (numero_patrono) Where  
ceroriesgo.patronos.numero_patrono Is Null)




Executing these query take:

Query returned successfully: 290 rows affected, 2542387 ms execution time.


I think that's too many time






Hope that Helps,
Ted

Sidar López Cruz wrote:





From: Chad Wagner [EMAIL PROTECTED]
To: Sidar López Cruz [EMAIL PROTECTED]
CC: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very slow queries
Date: Tue, 30 Jan 2007 17:37:17 -0500

On 1/30/07, Sidar López Cruz [EMAIL PROTECTED] wrote:


query: Delete From ceroriesgo.salarios Where numero_patrono Not In 
(Select

numero_patrono From ceroriesgo.patronos)

Seq Scan on salarios  (cost=51021.78..298803854359.95 rows=14240077
width=6)
  Filter: (NOT (subplan))
  SubPlan
-  Materialize  (cost=51021.78..69422.58 rows=1032980 width=25)
  -  Seq Scan on patronos  (cost=0.00..41917.80 rows=1032980
width=25)



How many rows exist in salarios, but not in patronos?  How many rows are
there in salarios?


Rows:
Patronos: 1032980
Salarios:  28480200



What does the explain look like for:

delete
from ceroriesgo.salarios s
where not exists (select 1
   from ceroriesgo.patronos
 where numero_patrono = s.numero_patrono);

Also, is this not a case for a foreign key with a cascade delete?


No, this is not cascade delete case because I need to delete from salarios 
not from patronos.




http://www.postgresql.org/docs/8.2/static/ddl-constraints.html


--
Chad
http://www.postgresqlforums.com/


_
Charla con tus amigos en línea mediante MSN Messenger: 
http://messenger.latam.msn.com/



---(end of broadcast)---
TIP 6: explain analyze is your friend




--

*Edward Allen*
Software Engineer
Black Duck Software, Inc.

[EMAIL PROTECTED] mailto:[EMAIL PROTECTED]
T +1.781.891.5100 x133
F +1.781.891.5145
http://www.blackducksoftware.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


_
MSN Amor: busca tu ½ naranja http://latam.msn.com/amor/


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Very slow queries

2007-01-31 Thread Chad Wagner

On 1/31/07, Sidar López Cruz [EMAIL PROTECTED] wrote:


Executing these query take:
Query returned successfully: 290 rows affected, 2542387 ms execution time.
I think that's too many time



I would post the plans that you are getting, otherwise just mentioning the
execution time is not very helpful.  Also, yet another syntax is the UPDATE
foo... FROM tab1, tab2... syntax.

http://www.postgresql.org/docs/8.2/static/sql-update.html


In any case, I thought you mentioned this was a one off query?



--
Chad
http://www.postgresqlforums.com/


Re: [PERFORM] Very slow queries

2007-01-30 Thread Sidar López Cruz





From: Ted Allen [EMAIL PROTECTED]
To: Sidar López Cruz [EMAIL PROTECTED]
CC: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very slow queries
Date: Tue, 30 Jan 2007 16:14:38 -0500





What indexes do those tables have?  Any?


Yes:
TABLE ceroriesgo.patronos ADD CONSTRAINT patronos_pkey PRIMARY 
KEY(numero_patrono);


INDEX salarios_numero_patrono_idx  ON ceroriesgo.salarios
 USING btree (numero_patrono);





Sidar López Cruz wrote:

Check this:

query: Delete From ceroriesgo.salarios Where numero_patrono Not In (Select 
numero_patrono From ceroriesgo.patronos)


Seq Scan on salarios  (cost=51021.78..298803854359.95 rows=14240077 
width=6)

 Filter: (NOT (subplan))
 SubPlan
   -  Materialize  (cost=51021.78..69422.58 rows=1032980 width=25)
 -  Seq Scan on patronos  (cost=0.00..41917.80 rows=1032980 
width=25)



These query took a day to finish, how or who can improove better 
performance of my PostgreSQL.


_
Charla con tus amigos en línea mediante MSN Messenger: 
http://messenger.latam.msn.com/



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org




--

*Edward Allen*
Software Engineer
Black Duck Software, Inc.

[EMAIL PROTECTED] mailto:[EMAIL PROTECTED]
T +1.781.891.5100 x133
F +1.781.891.5145
http://www.blackducksoftware.com



_
Charla con tus amigos en línea mediante MSN Messenger: 
http://messenger.latam.msn.com/



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] Very slow queries

2007-01-30 Thread Ted Allen

What indexes do those tables have?  Any?

Sidar López Cruz wrote:

Check this:

query: Delete From ceroriesgo.salarios Where numero_patrono Not In 
(Select numero_patrono From ceroriesgo.patronos)


Seq Scan on salarios  (cost=51021.78..298803854359.95 rows=14240077 
width=6)

 Filter: (NOT (subplan))
 SubPlan
   -  Materialize  (cost=51021.78..69422.58 rows=1032980 width=25)
 -  Seq Scan on patronos  (cost=0.00..41917.80 rows=1032980 
width=25)



These query took a day to finish, how or who can improove better 
performance of my PostgreSQL.


_
Charla con tus amigos en línea mediante MSN Messenger: 
http://messenger.latam.msn.com/



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org




--

*Edward Allen*
Software Engineer
Black Duck Software, Inc.

[EMAIL PROTECTED] mailto:[EMAIL PROTECTED]
T +1.781.891.5100 x133
F +1.781.891.5145
http://www.blackducksoftware.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Very slow queries

2007-01-30 Thread Chad Wagner

On 1/30/07, Sidar López Cruz [EMAIL PROTECTED] wrote:


query: Delete From ceroriesgo.salarios Where numero_patrono Not In (Select
numero_patrono From ceroriesgo.patronos)

Seq Scan on salarios  (cost=51021.78..298803854359.95 rows=14240077
width=6)
  Filter: (NOT (subplan))
  SubPlan
-  Materialize  (cost=51021.78..69422.58 rows=1032980 width=25)
  -  Seq Scan on patronos  (cost=0.00..41917.80 rows=1032980
width=25)



How many rows exist in salarios, but not in patronos?  How many rows are
there in salarios?

What does the explain look like for:

delete
from ceroriesgo.salarios s
where not exists (select 1
   from ceroriesgo.patronos
 where numero_patrono = s.numero_patrono);

Also, is this not a case for a foreign key with a cascade delete?

http://www.postgresql.org/docs/8.2/static/ddl-constraints.html


--
Chad
http://www.postgresqlforums.com/


Re: [PERFORM] Very slow queries - please help

2005-12-12 Thread Bealach-na Bo
Thanks very  much - there are a lot of good articles there... Reading as 
fast as I can :)


Best,

Bealach



From: Thomas F. O'Connell [EMAIL PROTECTED]
To: Bealach-na Bo [EMAIL PROTECTED]
CC: PgSQL - Performance pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very slow queries - please help
Date: Sun, 4 Dec 2005 00:40:01 -0600


On Nov 24, 2005, at 12:14 PM, Bealach-na Bo wrote:


The consensus seems to be that I need more indexes and I also need to
look into the NOT IN statement as a possible bottleneck. I've
introduced the indexes which has led to a DRAMATIC change in response
time. Now I have to experiment with INNER JOIN - OUTER JOIN
variations, SET ENABLE_SEQSCAN=OFF.

Forgive me for not mentioning each person individually and by name.
You have all contributed to confirming what I had suspected (and
hoped): that *I* have a lot to learn!

I'm attaching table descriptions, the first few lines of top output
while the queries were running, index lists, sample queries and
EXPLAIN ANALYSE output BEFORE and AFTER the introduction of the
indexes. As I said, DRAMATIC :) I notice that the CPU usage does not
vary very much, it's nearly 100% anyway, but the memory usage drops
markedly, which is another very nice result of the index introduction.

Any more comments and tips would be very welcome.


You might find the following resources from techdocs instructive:

http://techdocs.postgresql.org/redir.php?link=/techdocs/ 
pgsqladventuresep2.php


http://techdocs.postgresql.org/redir.php?link=/techdocs/ 
pgsqladventuresep3.php


These documents provide some guidance into the process of index  selection. 
It seems like you could still stand to benefit from more  indexes based on 
your queries, table definitions, and current indexes.


--
Thomas F. O'Connell
Database Architecture and Programming
Co-Founder
Sitening, LLC

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005 (cell)
615-469-5150 (office)
615-469-5151 (fax)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Very slow queries - please help

2005-12-03 Thread Thomas F. O'Connell


On Nov 24, 2005, at 12:14 PM, Bealach-na Bo wrote:


The consensus seems to be that I need more indexes and I also need to
look into the NOT IN statement as a possible bottleneck. I've
introduced the indexes which has led to a DRAMATIC change in response
time. Now I have to experiment with INNER JOIN - OUTER JOIN
variations, SET ENABLE_SEQSCAN=OFF.

Forgive me for not mentioning each person individually and by name.
You have all contributed to confirming what I had suspected (and
hoped): that *I* have a lot to learn!

I'm attaching table descriptions, the first few lines of top output
while the queries were running, index lists, sample queries and
EXPLAIN ANALYSE output BEFORE and AFTER the introduction of the
indexes. As I said, DRAMATIC :) I notice that the CPU usage does not
vary very much, it's nearly 100% anyway, but the memory usage drops
markedly, which is another very nice result of the index introduction.

Any more comments and tips would be very welcome.


You might find the following resources from techdocs instructive:

http://techdocs.postgresql.org/redir.php?link=/techdocs/ 
pgsqladventuresep2.php


http://techdocs.postgresql.org/redir.php?link=/techdocs/ 
pgsqladventuresep3.php


These documents provide some guidance into the process of index  
selection. It seems like you could still stand to benefit from more  
indexes based on your queries, table definitions, and current indexes.


--
Thomas F. O'Connell
Database Architecture and Programming
Co-Founder
Sitening, LLC

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005 (cell)
615-469-5150 (office)
615-469-5151 (fax)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] Very slow queries - please help.

2005-11-24 Thread Bealach-na Bo

Hi Folks,

I'm new to Postgresql.

I'm having great difficulties getting the performance I had hoped for
from Postgresql 8.0. The typical query below takes ~20 minutes !!

I hope an expert out there will tell me what I'm doing wrong - I hope
*I* am doing something wrong.

Hardware

Single processor, Intel Xeon 3.06 GHz machine running Red Hat
Ent. 4. with 1.5 GB of RAM.

The machine is dedicated to running Postgresql 8.0 and Apache/mod_perl
etc. The database is being accessed for report generation via a web
form. The web server talks to Pg over TCP/IP (I know, that I don't
need to do this if they are all on the same machine, but I have good
reasons for this and don't suspect that this is where my problems are
- I have the same poor performance when running from psql on the
server.)

Database

Very simple, not fully normalized set of two tables. The first table,
very small (2000 lines of 4 cols with very few chars and integers in
in col). The other quite a bit larger (50 lines with 15
cols. with the largest fields ~ 256 chars)

Typical query


SELECT n.name
FROM node n
WHERE n.name
LIKE '56x%'
AND n.type='H'
AND n.usage='TEST'
AND n.node_id
NOT IN
(select n.node_id
FROM job_log j
INNER JOIN node n
ON j.node_id = n.node_id
WHERE n.name
LIKE '56x%'
AND n.type='H'
AND n.usage='TEST'
AND j.job_name = 'COPY FILES'
AND j.job_start = '2005-11-14 00:00:00'
AND (j.job_stop = '2005-11-22 09:31:10' OR j.job_stop IS NULL))
ORDER BY n.name


The node table is the small table and the job_log table is the large
table.


I've tried all the basic things that I found in the documentation like
VACUUM ANALYZE, EXPLAIN etc., but I suspect there is something
terribly wrong with what I'm doing and these measures will not shave
off 19 min and 50 seconds off the query time.

Any help and comments would be very much appreciated.


Bealach



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Very slow queries - please help.

2005-11-24 Thread Claus Guttesen
 Typical query
 

 SELECT n.name
 FROM node n
 WHERE n.name
 LIKE '56x%'
 AND n.type='H'
 AND n.usage='TEST'
 AND n.node_id
 NOT IN
 (select n.node_id
 FROM job_log j
 INNER JOIN node n
 ON j.node_id = n.node_id
 WHERE n.name
 LIKE '56x%'
 AND n.type='H'
 AND n.usage='TEST'
 AND j.job_name = 'COPY FILES'
 AND j.job_start = '2005-11-14 00:00:00'
 AND (j.job_stop = '2005-11-22 09:31:10' OR j.job_stop IS NULL))
 ORDER BY n.name

Do you have any indexes?

regards
Claus

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Very slow queries - please help.

2005-11-24 Thread Bealach-na Bo


Hi,

Thanks for your comments. I've explicitly made any indexes, but the
default ones are:



[EMAIL PROTECTED] \di
  List of relations
Schema  |  Name   | Type  |  Owner  |  Table
-+-+---+-+-
user | job_log_id_pkey | index | user | job_log
user | node_id_pkey| index | user | node
user | node_name_key   | index | user | node
(3 rows)



I'm also sending the EXPLAIN outputs.





 explain SELECT n.name,n.type,
n.usage, j.status,
j.job_start,j.job_stop,
j.nfiles_in_job,j.job_name
 FROM job_log j
 INNER JOIN node n
 ON j.node_id = n.node_id
 WHERE n.name
 LIKE '56x%'
 AND n.type = 'K'
 AND n.usage = 'LIVE'
 AND j.job_name = 'COPY FILES'
 AND j.job_start = '2005-11-14 00:00:00'
 AND (j.job_stop = '2005-11-14 05:00:00' OR j.job_stop IS NULL)
 ORDER BY n.name;



 QUERY PLAN

--
Nested Loop  (cost=0.00..75753.31 rows=1 width=461)
  Join Filter: (inner.node_id = outer.node_id)
  -  Index Scan using node_name_key on node n  (cost=0.00..307.75 rows=1 
width=181)
Filter: ((name ~~ '56x%'::text) AND (type = 'K'::bpchar) AND 
(usage = 'LIVE'::bpchar))

  -  Seq Scan on job_log j  (cost=0.00..75445.54 rows=1 width=288)
Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start = 
'2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop = 
'2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL)))

(6 rows)


 explain SELECT n.name, n.type, n.usage
 FROM node n
 WHERE n.name
 LIKE '56x%'
 AND n.type  = 'K'
 AND n.usage = 'LIVE'
 AND n.node_id
 NOT IN
 (SELECT n.node_id
 FROM job_log j
 INNER JOIN node n
 ON j.node_id = n.node_id
 WHERE n.name
 LIKE '56x%'
 AND n.type  = 'K'
 AND n.usage = 'LIVE'
 AND j.job_name = 'COPY FILES'
 AND j.job_start = '2005-11-14 00:00:00'
 AND (j.job_stop = '2005-11-14 05:00:00' OR j.job_stop IS NULL))
 ORDER BY n.name;







 QUERY PLAN

--
Index Scan using node_name_key on node n  (cost=75451.55..75764.94 rows=1 
width=177)
  Filter: ((name ~~ '56x%'::text) AND (type = 'K'::bpchar) AND (usage = 
'LIVE'::bpchar) AND (NOT (hashed subplan)))

  SubPlan
-  Nested Loop  (cost=0.00..75451.54 rows=1 width=4)
  -  Seq Scan on job_log j  (cost=0.00..75445.54 rows=1 width=4)
Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start 
= '2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop = 
'2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL)))
  -  Index Scan using node_id_pkey on node n  (cost=0.00..5.99 
rows=1 width=4)

Index Cond: (outer.node_id = n.node_id)
Filter: ((name ~~ '56x%'::text) AND (type = 'K'::bpchar) 
AND (usage = 'LIVE'::bpchar))



Yours,

Bealach



From: Claus Guttesen [EMAIL PROTECTED]
To: Bealach-na Bo [EMAIL PROTECTED]
CC: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very slow queries - please help.
Date: Thu, 24 Nov 2005 14:23:38 +0100

 Typical query
 

 SELECT n.name
 FROM node n
 WHERE n.name
 LIKE '56x%'
 AND n.type='H'
 AND n.usage='TEST'
 AND n.node_id
 NOT IN
 (select n.node_id
 FROM job_log j
 INNER JOIN node n
 ON j.node_id = n.node_id
 WHERE n.name
 LIKE '56x%'
 AND n.type='H'
 AND n.usage='TEST'
 AND j.job_name = 'COPY FILES'
 AND j.job_start = '2005-11-14 00:00:00'
 AND (j.job_stop = '2005-11-22 09:31:10' OR j.job_stop IS NULL))
 ORDER BY n.name

Do you have any indexes?

regards
Claus




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Very slow queries - please help.

2005-11-24 Thread Guillaume Smet

Hi,


I'm also sending the EXPLAIN outputs.


Please provide EXPLAIN ANALYZE outputs instead of EXPLAIN. You will have 
more information.


Indexes on your tables are obviously missing. You should try to add:

CREATE INDEX idx_node_filter ON node(name, type, usage);
CREATE INDEX idx_job_log_filter ON job_log(job_name, job_start, job_stop);

I'm not so sure it's a good idea to add job_stop in this index as you 
have an IS NULL in your query so I'm not sure it can be used. You should 
try it anyway and remove it if not needed.


I added all your search fields in the indexes but it depends a lot on 
the selectivity of your conditions. I don't know your data but I think 
you understand the idea.


HTH

--
Guillaume

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Very slow queries - please help.

2005-11-24 Thread Tom Lane
Bealach-na Bo [EMAIL PROTECTED] writes:
 I'm having great difficulties getting the performance I had hoped for
 from Postgresql 8.0. The typical query below takes ~20 minutes !!

You need to show us the table definition (including indexes) and the
EXPLAIN ANALYZE results for the query.

It seems likely that the NOT IN is the source of your problems,
but it's hard to be sure without EXPLAIN results.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Very slow queries - please help

2005-11-24 Thread Bealach-na Bo

OK.

The consensus seems to be that I need more indexes and I also need to
look into the NOT IN statement as a possible bottleneck. I've
introduced the indexes which has led to a DRAMATIC change in response
time. Now I have to experiment with INNER JOIN - OUTER JOIN
variations, SET ENABLE_SEQSCAN=OFF.

Forgive me for not mentioning each person individually and by name.
You have all contributed to confirming what I had suspected (and
hoped): that *I* have a lot to learn!

I'm attaching table descriptions, the first few lines of top output
while the queries were running, index lists, sample queries and
EXPLAIN ANALYSE output BEFORE and AFTER the introduction of the
indexes. As I said, DRAMATIC :) I notice that the CPU usage does not
vary very much, it's nearly 100% anyway, but the memory usage drops
markedly, which is another very nice result of the index introduction.

Any more comments and tips would be very welcome.

Thank you all for your input.

Bealach.




[EMAIL PROTECTED] \d job_log
Table blouser.job_log
Column |Type |Modifiers
+-+--
job_log_id | integer | not null default 
nextval('job_log_id_seq'::text)

first_registry | timestamp without time zone |
blogger_name   | character(50)   |
node_id| integer |
job_type   | character(50)   |
job_name   | character(256)  |
job_start  | timestamp without time zone |
job_timeout| interval|
job_stop   | timestamp without time zone |
nfiles_in_job  | integer |
status | integer |
error_code | smallint|
Indexes:
   job_log_id_pkey PRIMARY KEY, btree (job_log_id)
Check constraints:
   job_log_status_check CHECK (status = 0 OR status = 1 OR status = 8 OR 
status = 9)

Foreign-key constraints:
   legal_node FOREIGN KEY (node_id) REFERENCES node(node_id)





[EMAIL PROTECTED] \d node
 Table blouser.node
Column  | Type  |   Modifiers
-+---+---
node_id | integer   | not null default nextval('node_id_seq'::text)
name| character(50) |
type| character(1)  |
usage   | character(4)  |
Indexes:
   node_id_pkey PRIMARY KEY, btree (node_id)
   node_name_key UNIQUE, btree (name)
Check constraints:
   node_type_check CHECK (type = 'B'::bpchar OR type = 'K'::bpchar OR 
type = 'C'::bpchar OR type = 'T'::bpchar OR type = 'R'::bpchar)
   node_usage_check CHECK (usage = 'TEST'::bpchar OR usage = 
'LIVE'::bpchar)



#before new indexes were created


Tasks: 114 total,   2 running, 112 sleeping,   0 stopped,   0 zombie
Cpu(s): 25.7% us, 24.5% sy,  0.0% ni, 49.4% id,  0.3% wa,  0.0% hi,  0.0% si
Mem:   1554788k total,  1513576k used,41212k free,31968k buffers
Swap:  1020024k total,27916k used,   992108k free,   708728k cached

 PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
25883 postgres  25   0 20528  12m  11m R 99.7  0.8   4:54.91 postmaster





[EMAIL PROTECTED] \di
  List of relations
Schema  |  Name   | Type  |  Owner  |  Table
-+-+---+-+-
blouser | job_log_id_pkey | index | blouser | job_log
blouser | node_id_pkey| index | blouser | node
blouser | node_name_key   | index | blouser | node
(3 rows)


 EXPLAIN ANALYSE SELECT n.name,n.type,
n.usage, j.status,
j.job_start,j.job_stop,
j.nfiles_in_job,j.job_name
 FROM job_log j
 INNER JOIN node n
 ON j.node_id = n.node_id
 WHERE n.name
 LIKE '711%'
 AND n.type = 'K'
 AND n.usage = 'LIVE'
 AND j.job_name = 'COPY FILES'
 AND j.job_start = '2005-11-14 00:00:00'
 AND (j.job_stop = '2005-11-14 05:00:00' OR j.job_stop IS NULL)
 ORDER BY n.name;


 QUERY PLAN

---
Nested Loop  (cost=0.00..75753.31 rows=1 width=461) (actual 
time=270486.692..291662.350 rows=3 loops=1)

  Join Filter: (inner.node_id = outer.node_id)
  -  Index Scan using node_name_key on node n  (cost=0.00..307.75 rows=1 
width=181) (actual time=0.135..11.034 rows=208 loops=1)
Filter: ((name ~~ '711%'::text) AND (type = 'K'::bpchar) AND 
(usage = 'LIVE'::bpchar))
  -  Seq Scan on job_log j  (cost=0.00..75445.54 rows=1 width=288) (actual 
time=273.374..1402.089 rows=22 loops=208)
Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start = 
'2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop = 
'2005-11-14 

[PERFORM] Very slow queries - please help

2005-11-24 Thread Bealach-na Bo

A quick note to say that I'm very grateful for Tom Lane's input also.
Tom, I did put you on the list of recipients for my last posting to
pgsql-performance, but got:


cut here
This is an automatically generated Delivery Status Notification.

Delivery to the following recipients failed.

  [EMAIL PROTECTED]



Many regards,

Bealach



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly