Re: [PERFORM] cost-based vacuum

2005-07-12 Thread Simon Riggs
On Mon, 2005-07-11 at 15:51 +0100, Simon Riggs wrote:
 On Mon, 2005-07-11 at 09:07 -0400, Ian Westmacott wrote:
  On Mon, 2005-07-11 at 07:31, Simon Riggs wrote:
   The ANALYZE commands hold read locks on the tables you wish to write to.
   If you slow them down, you merely slow down your write transactions
   also, and then the read transactions that wait behind them. Every time
   the ANALYZE sleeps it wakes up the other transactions, which then
   realise they can't move because of locks and then wake up the ANALYZEs
   for another shot. The end result is that you introduce more context-
   switching, without any chance of doing more useful work while the
   ANALYZEs sleep.
  
  Let me make sure I understand.  ANALYZE acquires a read
  lock on the table, that it holds until the operation is
  complete (including any sleeps).  That read lock blocks
  the extension of that table via COPY.  Is that right?
  
  According to the 8.0 docs, ANALYZE acquires an ACCESS SHARE
  lock on the table, and that conflicts only with ACCESS
  EXCLUSIVE.  That's why I didn't think I had a lock issue,
  since I think COPY only needs ROW EXCLUSIVE.  Or perhaps
  the transaction needs something more?
 
 The docs are correct, but don't show catalog and buffer locks.
 
 ...but on further reading of the code there are no catalog locks or
 buffer locks held across the sleep points. So, my explanation doesn't
 work as an explanation for the sleep/no sleep difference you have
 observed.

I've been through all the code now and can't find any resource that is
held across a delay point. Nor any reason to believe that the vacuum
cost accounting would slow anything down.

Since vacuum_cost_delay is a userset parameter, you should be able to
SET this solely for the analyze_thread. That way we will know with more
certainty that it is the analyze_thread that is interfering.
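
For example, since SET is per-session, only the analyze_thread's
connection needs it (table name below is a placeholder):

```sql
-- In the analyze_thread's session only; every other backend keeps
-- the server-wide default from postgresql.conf.
SET vacuum_cost_delay = 0;      -- or whatever value is under test
ANALYZE some_table;             -- 'some_table' is hypothetical
RESET vacuum_cost_delay;        -- restore the session default
```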

What is your default_statistics_target?
Do you have other stats targets set?

How long does ANALYZE take to run, with/without the vacuum_cost_delay?

Thanks,

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] join and query planner

2005-07-12 Thread Bruno Wolff III
On Wed, Jul 06, 2005 at 18:54:02 -0300,
  Dario Pudlo [EMAIL PROTECTED] wrote:
 (first at all, sorry for my english)
 Hi.
- Does left join restrict the order in which the planner must join
 tables? I've read about join, but I'm not sure about left join...

The left join operator is not associative, so in general the planner doesn't
have much flexibility to reorder left (or right) joins.



[PERFORM] Projecting currentdb to more users

2005-07-12 Thread Yves Vindevogel
Hi,

We have a couple of databases that are identical (one for each customer).
They are all relatively small, ranging from 100k records to 1m records.
There's only one main table with some smaller tables, a lot of indexes and some functions.

I would like to make an estimate of the performance, the disk space and
other related things, when we have a database of, for instance, 10 million
or 100 million records.

Is there any math to be done on that?

Met vriendelijke groeten,
Bien à vous,
Kind regards,

Yves Vindevogel
Implements




Mail: [EMAIL PROTECTED]  - Mobile: +32 (478) 80 82 91

Kempische Steenweg 206 - 3500 Hasselt - Tel-Fax: +32 (11) 43 55 76

Web: http://www.implements.be
First they ignore you.  Then they laugh at you.  Then they fight you.  Then you win.
-- Mahatma Gandhi


Re: [PERFORM] cost-based vacuum

2005-07-12 Thread Ian Westmacott
On Tue, 2005-07-12 at 03:45, Simon Riggs wrote:
 Since vacuum_cost_delay is a userset parameter, you should be able to
 SET this solely for the analyze_thread. That way we will know with more
 certainty that it is the analyze_thread that is interfering.

That is what I have been doing.  In fact, I have eliminated
the reader_thread and analyze_thread.  I just have the
writer_thread running, and a psql connection with which I
perform ANALYZE, for various vacuum_cost_* parameters.
(I'm trying to extract a reproducible experiment)

It appears not to matter whether it is one of the tables
being written to that is ANALYZEd.  I can ANALYZE an old,
quiescent table, or a system table and see this effect.

 What is your default_statistics_target?

All other configs are default; default_statistics_target=10.

 Do you have other stats targets set?

No.  The only thing slightly out of the ordinary with the
tables is that they are created WITHOUT OIDS.  Some indexes,
but no primary keys.  All columns NOT NULL.

 How long does ANALYZE take to run, with/without the vacuum_cost_delay?

Well, on one table with about 50K rows, it takes about 1/4s
to ANALYZE with vacuum_cost_delay=0, and about 15s with
vacuum_cost_delay=1000.

Other things of note:

- VACUUM has the same effect.  If I VACUUM or ANALYZE the
  whole DB, the CPU spikes reset between tables.
- vmstat reports blocks written drops as the CPU rises.
  Don't know if it is cause or effect yet.  On a small test
  system, I'm writing about 1.5MB/s.  After about 20s
  of cost-based ANALYZE, this drops under 0.5MB/s.
- this is a dual Xeon.  I have tried both with and without
  hyperthreading.  I haven't tried to reproduce it
  elsewhere yet, but will.
- Looking at oprofile reports for 10-minute runs of a
  database-wide VACUUM with vacuum_cost_delay=0 and 1000,
  shows the latter spending a lot of time in LWLockAcquire
  and LWLockRelease (20% each vs. 2%).


Thanks,

--Ian





Re: [PERFORM] Projecting currentdb to more users

2005-07-12 Thread Matthew Nuzum
On 7/12/05, Yves Vindevogel [EMAIL PROTECTED] wrote:
 Hi,
 
 We have a couple of database that are identical (one for each customer).
 They are all relatively small, ranging from 100k records to 1m records.
 There's only one main table with some smaller tables, a lot of indexes
 and some functions.
 
 I would like to make an estimation of the performance, the diskspace
 and other related things,
 when we have database of for instance 10 million records or 100 million
 records.
 
 Is there any math to be done on that ?

It's pretty easy to make a database run fast with only a few thousand
records, or even a million records; however, things start to slow down
non-linearly when the database grows too big to fit in RAM.
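
For the disk-space part of the question, though, a rough extrapolation
from the planner statistics is possible (table names here are
hypothetical; assumes the default 8 KB block size):

```sql
-- relpages is the on-disk size in 8 KB blocks as of the last
-- VACUUM/ANALYZE; multiply out, then scale by the expected row growth.
SELECT relname, relpages * 8192 AS approx_bytes
FROM pg_class
WHERE relname IN ('main_table', 'main_table_pkey');
-- To first order, 10x the rows means roughly 10x the heap bytes;
-- b-tree indexes tend to grow slightly faster than linearly.
```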

I'm not a guru, but my attempts to do this have not been very accurate.

Maybe (just maybe) you could get an idea by disabling the OS cache on
the file system(s) holding the database and then somehow fragmenting
the drive severely (maybe by putting each table in its own disk
partition?!?) and measuring performance.

On the positive side, there are a lot of wise people on this list who
have plenty of experience optimizing slow queries on big databases.  So
queries that run in 20 ms now but slow down to 7 seconds when your
tables grow will likely benefit from optimization.
-- 
Matthew Nuzum
www.bearfruit.org



Re: [PERFORM] Projecting currentdb to more users

2005-07-12 Thread Mohan, Ross
From AMD's suit against Intel. Perhaps relevant to some PG/AMD issues. 

...125. Intel has designed its compiler purposely to degrade performance
when a program is run on an AMD platform. To achieve this, Intel designed
the compiler to compile code along several alternate code paths. Some
paths are executed when the program runs on an Intel platform and others
are executed when the program is operated on a computer with an AMD
microprocessor. (The choice of code path is determined when the program
is started, using a feature known as CPUID which identifies the
computer's microprocessor.) By design, the code paths were not created
equally. If the program detects a Genuine Intel microprocessor, it
executes a fully optimized code path and operates with the maximum
efficiency. However, if the program detects an Authentic AMD
microprocessor, it executes a different code path that will degrade the
program's performance or cause it to crash...




[PERFORM] General DB Tuning

2005-07-12 Thread Brent Henry
Help!  After recently migrating to Postgres 8, I've
discovered to my horror that I can't determine which
queries are poorly performing anymore because the
logging has drastically changed and no longer shows
durations for anything done through JDBC.

So I'm desperately trying to do performance tuning on
my servers and have no way to sort out which
statements are the slowest.

Does anyone have any suggestions?  How do you
determine what queries are behaving badly when you
can't get durations out of the logs?

I have a perl script that analyzes the output from
Postgres 7 logs and it works great!  But it relies on
the duration being there.

I did some searches on postgresql.org mailing lists
and have seen a few people discussing this problem,
but no one seems to be too worried about it.  Is there
a simple work-around?

Sincerely,

Brent






Re: [PERFORM] General DB Tuning

2005-07-12 Thread Tom Arthurs
I have this in my postgresql.conf file and it works fine (set the min to 
whatever you want to log)

log_min_duration_statement = 3000 # -1 is disabled, in milliseconds.

Another setting that might get what you want:

#log_duration = false

uncomment and change to true.

From the docs: 
(http://www.postgresql.org/docs/8.0/interactive/runtime-config.html)


 Causes the duration of every completed statement which satisfies 
log_statement to be logged. When using this option, if you are not using 
syslog, it is recommended that you log the PID or session ID using 
log_line_prefix so that you can link the statement to the duration using 
the process ID or session ID. The default is off. Only superusers can 
change this setting.
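
A matching postgresql.conf fragment (the prefix format here is just one
possibility) might look like:

```
# Tag every log line with the backend PID and a timestamp so a
# statement line can be matched to its duration line.
log_line_prefix = '%p %t '            # %p = process ID, %t = timestamp
log_min_duration_statement = 3000     # log statements taking over 3s
```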


Brent Henry wrote:

 Help!  After recently migrating to Postgres 8, I've
 discovered to my horror that I can't determine which
 queries are poorly performing anymore because the
 logging has drastically changed and no longer shows
 durations for anything done through JDBC.
 [snip]


Re: [PERFORM] General DB Tuning

2005-07-12 Thread Brent Henry
Yes, that is exactly what I want to use!

Unfortunately, it doesn't work if you access postgres
through a JDBC connection.  I don't know why.  I found
a posting from back in February which talks about this
a little:

http://archives.postgresql.org/pgsql-admin/2005-02/msg00055.php

But I can't find anywhere where someone has fixed it. 
Am I the only one accessing postgres through JDBC?

-Brent


--- Tom Arthurs [EMAIL PROTECTED] wrote:

 I have this in my postgresql.conf file and it works fine
 (set the min to whatever you want to log)

 log_min_duration_statement = 3000 # -1 is disabled, in milliseconds.
 [snip]


Re: [PERFORM] General DB Tuning

2005-07-12 Thread Tom Arthurs
we are using jdbc -- the log_min_duration_statement = 3000 statement
works fine for me.  Looks like there's no other workaround for the
bug(?).  Not sure, since I have no interest in logging a million
statements a day; I only want to see the poorly performing hits.


Brent Henry wrote:

Yes, that is exactly what I want to use!

Unfortunately, it doesn't work if you access postgres
through a JDBC connection.  I don't know why.  I found
a posting from back in February which talks about this
a little:

http://archives.postgresql.org/pgsql-admin/2005-02/msg00055.php

But I can't find anywhere where someone has fixed it. 
Am I the only one accessing postgres through JDBC?


-Brent




Re: [PERFORM] General DB Tuning

2005-07-12 Thread Dennis

Tom Arthurs wrote:

we are using jdbc -- the log_min_duration_statement = 3000  
statement works fine for me.  Looks like there's no other work around 
for the bug(?).  Not sure since I have no interest in logging a 
million statements a day, I only want to see the poorly performing hits. 



Doesn't it depend on what jdbc driver you are using?

Dennis



Re: [PERFORM] General DB Tuning

2005-07-12 Thread Christopher Kings-Lynne
we are using jdbc -- the log_min_duration_statement = 3000  
statement works fine for me.  Looks like there's no other work around 
for the bug(?).  Not sure since I have no interest in logging a 
million statements a day, I only want to see the poorly performing hits. 


Doesn't it depend on what jdbc driver you are using?


It depends if he's using new-protocol prepared queries which don't get 
logged properly.  Wasn't that fixed for 8.1 or something?


Chris




Re: [PERFORM] General DB Tuning

2005-07-12 Thread Tom Arthurs

hmm, yea maybe -- we are using the 7.4 driver with 8.0.x db.

Dennis wrote:

Tom Arthurs wrote:

we are using jdbc -- the log_min_duration_statement = 3000  
statement works fine for me.  Looks like there's no other work around 
for the bug(?).  Not sure since I have no interest in logging a 
million statements a day, I only want to see the poorly performing hits. 




Doesn't it depend on what jdbc driver you are using?

Dennis



Re: [PERFORM] General DB Tuning

2005-07-12 Thread Brent Henry
We are running Postgres 8.0.2 with the 8.0.2 jdbc
driver.  And yes we are using prepared statements. 
I've spent hours trying to get the
'log_min_duration_statement' and 'log_duration'
options to work with no luck.  I never get any
duration from the statement.  I also never see 'begin'
or 'commit' in the log so I can't tell how long my
batch commands are taking to commit to the DB.

Is there a different kind of 'prepared' statements
that we should be using in the driver to get logging
to work properly?  What is the 'new' protocol?

Tom, what version are you using?  Are you using
prepared statements in JDBC?

-Brent


--- Christopher Kings-Lynne [EMAIL PROTECTED] wrote:

 [snip]
 It depends if he's using new-protocol prepared queries which
 don't get logged properly.  Wasn't that fixed for 8.1 or
 something?

 Chris







Re: [PERFORM] General DB Tuning

2005-07-12 Thread Christopher Kings-Lynne

Is there a different kind of 'prepared' statements
that we should be using in the driver to get logging
to work properly?  What is the 'new' protocol?


The 8.0.2 jdbc driver uses real prepared statements instead of faked 
ones.  The problem is that the new protocol (which the 8.0.2 driver 
uses) has a bug where protocol-prepared queries don't get logged properly.


I don't know if it's been fixed...

Chris


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] General DB Tuning

2005-07-12 Thread Tom Arthurs

Here's the answer for you from the jdbc list:


Alvin Hung wrote:



Currently, 8.0.2 / JDBC 8.0-310, log_min_duration_statement does not
work with JDBC.  Nothing will get logged.  This makes it very
difficult to tune a java application.  Can you tell me when will this
be fixed?  Thanks.



This is a server limitation: it does not handle logging of the V3
extended query protocol very well. There's gradual progress being made
on it; you might want to search the pgsql-hackers and pgsql-patches
archives for details.

==


We are using prepared statements, but we are using the 7.4 driver with 
the 8.0.3 server.  I think it comes down to locally (on the client) 
prepared statements vs using server-side prepared statements.  I never 
got past this issue (changing the code is on our todo list, but pretty 
far down it), so I never noticed the logging issues.


I had a problem with prepared statements with the 8.x drivers -- here's 
what I got from the jdbc list when I asked the question:



1.  What changed between the driver versions that generate this error?



The driver started to use server-side prepared statements for
parameterization of queries (i.e. the driver translates ? to $n in the
main query string, and sends the actual parameter values out-of-band
from the query itself). One side effect of this is that parameters are
more strongly typed than in the 7.4.x versions, where the driver would do
literal parameter substitution into the query string before sending it
to the backend. Also, you can use parameters in fewer places (they must
fit the backend's idea of where parameterizable expressions are allowed)
-- e.g. see the recent thread about ORDER BY ? changing behaviour with
the newer driver.
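
As a sketch of that translation (table and values invented here), the
SQL-level equivalent of what the 8.0 driver sends for a JDBC
PreparedStatement is roughly:

```sql
-- Java side: SELECT * FROM orders WHERE id = ? AND status = ?
-- Approximate backend equivalent of the V3 Parse/Bind/Execute messages:
PREPARE s1 (integer, text) AS
  SELECT * FROM orders WHERE id = $1 AND status = $2;
EXECUTE s1 (42, 'shipped');
DEALLOCATE s1;
```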



2.  What is the downside of continuing to use the 7.x version of the
driver -- or are there better alternatives (patch, new version, etc).  I
am using build 311 of the driver.



Most active development happens on the 8.0 version; 7.4.x is maintained
for bugfixes but that's about it, so you won't get the benefit of any
performance improvements or added features that go into 8.0. Also, the
7.4.x driver won't necessarily work with servers >= 8.0.

In the longer term, the 7.4.x version will eventually become unmaintained.


So for the short term, you could downgrade your driver.


Brent Henry wrote:

 We are running Postgres 8.0.2 with the 8.0.2 jdbc
 driver.  And yes we are using prepared statements.
 I've spent hours trying to get the
 'log_min_duration_statement' and 'log_duration'
 options to work with no luck.  I never get any
 duration from the statement.  I also never see 'begin'
 or 'commit' in the log so I can't tell how long my
 batch commands are taking to commit to the DB.

 Is there a different kind of 'prepared' statements
 that we should be using in the driver to get logging
 to work properly?  What is the 'new' protocol?

 Tom, what version are you using?  Are you using
 prepared statements in JDBC?

 -Brent




Re: [PERFORM] Projecting currentdb to more users

2005-07-12 Thread Jean-Max Reymond
2005/7/12, Mohan, Ross [EMAIL PROTECTED]:
 From AMD's suit against Intel. Perhaps relevant to some PG/AMD issues.

Postgres is compiled with the GNU compiler, isn't it?
I don't know how much Postgres can benefit from an optimized Intel compiler.

-- 
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com
