Re: [PERFORM] Tuning queries on large database

2004-08-06 Thread Gaetano Mendola
Valerie Schneider DSI/DEV wrote:
Hi,
I have some performance problems on a PG database, and I don't
know how to improve things. I have two questions: one about the storage
of data, one about tuning queries. If possible!
My job is to compare Oracle and Postgres. All our operational databases
have been running under Oracle for about fifteen years. Now I am trying to replace
Oracle with Postgres.
Show us the EXPLAIN ANALYZE output for your queries.
Regards
Gaetano Mendola
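
[For readers following along: EXPLAIN ANALYZE executes the query and prints the plan
the planner chose, with estimated versus actual row counts and timings for each node.
A minimal sketch of what is being asked for, assuming the "data" table and a date
column "dat" from Valerie's later message:

   EXPLAIN ANALYZE
   SELECT *
   FROM data
   WHERE dat BETWEEN '2004-01-01' AND '2004-01-31';

Comparing estimated and actual rows per node is usually the quickest way to spot a
missing index or a bad planner estimate.]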




Re: [PERFORM] Temporary tables

2004-08-06 Thread Gaetano Mendola
G u i d o B a r o s i o wrote:
The box:
Linux 2.4.24-ck1 
8  Intel(R) Xeon(TM) MP CPU 2.80GHz
4 gb RAM.
Postgresql 7.4.2

The problem: 
Short on disk space. (waiting for a new hard drive)

The real problem:
Developers usually write queries involving the creation of temporary tables. 
I have seen this behavior too, until I explained that this is valid SQL:
select T.* from ( select * from table t where a = 5 ) AS T join foo using ( bar );
Show us a typical function that uses temporary tables.

Regards
Gaetano Mendola
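
[For contrast, here is a hedged sketch of the two styles being discussed; the table
and column names are only illustrative, not from the poster's schema:

   -- temporary-table style: extra catalog churn and on-disk files per call
   CREATE TEMP TABLE tmp_t AS SELECT * FROM t WHERE a = 5;
   SELECT tmp_t.* FROM tmp_t JOIN foo USING (bar);
   DROP TABLE tmp_t;

   -- equivalent single statement with a derived table, as shown above
   SELECT T.* FROM (SELECT * FROM t WHERE a = 5) AS T JOIN foo USING (bar);

The inline form also avoids materializing the intermediate result on disk, which
matters on a box that is already short of space.]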




Re: [PERFORM] [GENERAL] Tuning queries on large database

2004-08-06 Thread Gaetano Mendola
Valerie Schneider DSI/DEV wrote:
 Hi,
 I've decreased sort_mem to 5000 instead of 50000.
 I recreated my table using integer and real types instead of
 numeric: the result is much improved for disk space:

  schema | relfilenode | table |   index    |  reltuples  |   size
 --------+-------------+-------+------------+-------------+----------
  public |   253442696 | data  |            | 1.25113e+08 | 29760016
  public |   378639579 | data  | i_data_dat | 1.25113e+08 |  2744400
  public |   378555698 | data  | pk_data    | 1.25113e+08 |  3295584

 so it takes about 28 GB instead of 68 GB!

 For my different queries it's better, but still slower than Oracle:

         oracle   PG yesterday (numeric)   PG today (integer/real)
 Q1         1s                1s                      1s
 Q2         3s                8s                      4s
 Q3         8s             1m20s                     27s
 Q4        28s            17m20s                   6m47s
Are you using the same disks for Oracle and PG?
Could you post your actual postgresql.conf?
Try also to mount your partition with the option noatime,
and try again.
Regards
Gaetano Mendola
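
[For reference, noatime is set where the data partition is mounted; a hedged example
/etc/fstab line (Linux-style; the device, mount point and filesystem are invented,
not Valerie's actual layout):

   /dev/sdb1   /var/lib/pgsql   ext3   defaults,noatime   1 2

A remount (or reboot) picks it up, and the kernel then stops rewriting the
access-time stamp every time a data file is read.]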



Re: [PERFORM] Performance Bottleneck

2004-08-06 Thread Gaetano Mendola
Martin Foster wrote:
Gaetano Mendola wrote:
Martin Foster wrote:
I run a Perl/CGI driven website that makes extensive use of 
PostgreSQL (7.4.3) for everything from user information to formatting 
and display of specific sections of the site.   The server itself is 
a dual processor AMD Opteron 1.4Ghz w/ 2GB RAM and 2 x 120GB hard 
drives mirrored for redundancy, running under FreeBSD 5.2.1 (AMD64).

Recently loads on the site have increased during peak hours to the 
point of showing considerable loss in performance.   This can be 
observed when connections move from 120 concurrent connections to 
PostgreSQL to roughly 175 or more.   Essentially, the machine seems 
to struggle to keep up with continual requests and slows down 
accordingly as resources are tied up.

Code changes have been made to the scripts to essentially back off in 
high-load working environments, which has worked to an extent. 
However, as loads continue to increase, the database itself is not 
taking well to the increased traffic.

Having taken a look at 'Tuning PostgreSQL for Performance' 
(http://www.varlena.com/GeneralBits/Tidbits/perf.html), I used it as 
best I could to choose my settings.   However, even with 
statistics disabled and every setting tweaked, things still continue 
to deteriorate.

Is there anything anyone can recommend in order to give the system the 
necessary speed boost?   It would seem to me that a modest dataset of 
roughly a gig, combined with that type of hardware, should be able to 
handle substantially more load than it is.  Can anyone provide 
me with clues as to where to look?   Would disabling 'fsync' provide 
more performance if I accept that information may be lost in case of 
a crash?

If anyone needs access to logs, settings, et cetera, please ask; I 
simply wish to test the waters first on what is needed. Thanks!

Tell us about your typical queries and show us your configuration file.
Is the access read-only, or do you have concurrent writers
and readers? During peak hours are your processors at 100%?
What do vmstat and iostat say?
Maybe you are not using indexes somewhere, or maybe you are but the
planner is not using them... In two words, we need more information
in order to help you.

Regards
Gaetano Mendola

I included all the files as attachments, which will hopefully cut down 
on any replied-to emails.   As for things like connection pooling, the 
web server makes use of Apache::DBI to pool the connections for the Perl 
scripts being driven on that server.   For the sake of being thorough, 
a quick 'apachectl status' was thrown in while the database was under a 
good load.
Let's start from your postgres configuration:
shared_buffers = 8192 This is really too small for your configuration
sort_mem = 2048
wal_buffers = 128 This is really too small for your configuration
effective_cache_size = 16000
Change these values to:
shared_buffers = 50000
sort_mem = 16084
wal_buffers = 1500
effective_cache_size = 32000
To bump up the shm usage you have to configure your OS in order to be
allowed to use that amount of SHM.
These are the numbers that feel right for your HW; the second step now is to
analyze your queries
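
[The OS side of that remark usually means raising the kernel's shared memory limits.
A hedged example for Martin's FreeBSD box; the values are only sized to fit the
settings above, they are not from the thread:

   # /etc/sysctl.conf -- illustrative values only
   # largest single shared memory segment, in bytes (~512 MB)
   kern.ipc.shmmax=536870912
   # total shared memory, in 4 kB pages
   kern.ipc.shmall=131072

On Linux the equivalent knobs are kernel.shmmax and kernel.shmall. shared_buffers =
50000 is roughly 400 MB of shared memory at the default 8 kB block size, so the
ceiling has to be at least that high.]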
The queries themselves are simple, normally drawing information from one 
table with few conditions or, in the most complex cases, using joins on 
two tables or subqueries.   These behave very well and always have; the 
problem is that these queries take place in rather large amounts due to 
the dumb nature of the scripts themselves.
Show us the EXPLAIN ANALYZE for those queries and how many rows the tables
contain; the table schema could also be useful.

regards
Gaetano Mendola






Re: [PERFORM] The black art of postgresql.conf tweaking

2004-08-06 Thread Gaetano Mendola
Paul Serby wrote:
Can anyone give a good reference site/book for getting the most out of 
your postgres server?

All I can find is contradictory theories on how to work out your settings.
This is what I followed to set up our db server that serves our web 
applications:

http://www.phpbuilder.com/columns/smith20010821.php3?page=2
We have a Dell Poweredge with the following spec.
CPU: Intel(R) Xeon(TM) CPU 3.06GHz (512 KB Cache)
CPU: Intel(R) Xeon(TM) CPU 3.06GHz (512 KB Cache)
CPU: Intel(R) Xeon(TM) CPU 3.06GHz (512 KB Cache)
CPU: Intel(R) Xeon(TM) CPU 3.06GHz (512 KB Cache)
Physical Memory: 2077264 kB
Swap Memory: 2048244 kB
Apache on the Web server can take up to 300 connections and PHP is using pg_pconnect.

Postgres is set with the following.
max_connections = 300
shared_buffers = 38400
sort_mem = 12000
But Apache is still maxing out the non-super user connection limit.
Tell us the value of MaxClients in your Apache configuration.

Regards
Gaetano Mendola
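
[The reason the question matters: with pg_pconnect every Apache child holds on to its
own persistent backend, so the number of children Apache may spawn has to stay below
what PostgreSQL will accept, after allowing for the couple of slots reserved for
superusers (assuming a release with superuser_reserved_connections). A hedged sketch
of the relationship; the values are illustrative, not Paul's actual settings:

   # httpd.conf
   MaxClients        250

   # postgresql.conf
   max_connections = 300
   superuser_reserved_connections = 2

If MaxClients can exceed max_connections minus the reserved slots, the extra children
will run into exactly the non-superuser connection limit described above.]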



Re: [PERFORM] The black art of postgresql.conf tweaking

2004-08-06 Thread Josh Berkus
Paul,

  Physical Memory: 2077264 kB

  sort_mem = 12000

Hmmm.  Someone may already have mentioned this, but that looks problematic.  
You're allowing up to 12MB per sort, and up to 300 connections.  Even if each 
concurrent connection averages only one sort (and they can use more) that's 
3600MB ... roughly 1.5 times your *total* RAM, leaving out RAM for Apache, 
postmaster, shared buffers, etc.

I strongly suggest that you either decrease your total connections or your 
sort_mem, or both.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Performance Bottleneck

2004-08-06 Thread Mike Benoit
On Wed, 2004-08-04 at 17:25 +0200, Gaetano Mendola wrote:

  The queries themselves are simple, normally drawing information from one 
  table with few conditions or in the most complex cases using joins on 
  two table or sub queries.   These behave very well and always have, the 
  problem is that these queries take place in rather large amounts due to 
  the dumb nature of the scripts themselves.
 
 Show us the explain analyze on that queries, how many rows the tables are
 containing, the table schema could be also usefull.
 

If the queries themselves are optimized as much as they can be, and as
you say, it's just the sheer amount of similar queries hitting the
database, you could try using prepared queries for the ones that are most
often executed, to eliminate some of the overhead. 

I've had relatively good success with this in the past, and it doesn't
take very much code modification.

-- 
Mike Benoit [EMAIL PROTECTED]
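
[For anyone who wants to see the mechanism Mike describes from plain SQL, PostgreSQL
(7.3 and later) exposes it through PREPARE/EXECUTE. The statement name and parameter
values below are invented, and the column list is trimmed from the query Martin posts
further down:

   PREPARE recent_posts (integer, text) AS
     SELECT PostIDNumber, PuppeteerLogin, PuppetName
     FROM Post
     WHERE PostIDNumber > $1
       AND (PostTo = 'all' OR PostTo = $2)
     ORDER BY PostIDNumber LIMIT 100;

   EXECUTE recent_posts(12345, 'someuser');

The statement is parsed and planned once per session, so the per-call overhead Mike
mentions goes away; with Apache::DBI keeping connections alive, the prepared statement
survives across requests on the same child, provided the script avoids re-preparing
the same name.]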




Re: [PERFORM] The black art of postgresql.conf tweaking

2004-08-06 Thread Gaetano Mendola
Josh Berkus wrote:
| Paul,
|
|
|Physical Memory: 2077264 kB
|
|
|sort_mem = 12000
|
|
| Hmmm.  Someone may already have mentioned this, but that looks problematic.
| You're allowing up to 12MB per sort, and up to 300 connections.  Even if each
| concurrent connection averages only one sort (and they can use more) that's
| 3600MB ... roughly 1.5 times your *total* RAM, leaving out RAM for Apache,
| postmaster, shared buffers, etc.
|
| I strongly suggest that you either decrease your total connections or your
| sort_mem, or both.
Of course you are speaking about the worst case. In scenarios like
this I apply the 80/20 rule: 80% of connections will perform a sort, and 20%
of those will allocate memory for the sort operation in the same time window:
300 --80%--> 240 --20%--> 48
48 * 12MB = 576 MB
That seems reasonable given the total amount of memory available.
Am I too optimistic?

Regards
Gaetano Mendola










Re: [PERFORM] Performance Bottleneck

2004-08-06 Thread Martin Foster
Gaetano Mendola wrote:

Let start from your postgres configuration:
shared_buffers = 8192 This is really too small for your 
configuration
sort_mem = 2048

wal_buffers = 128 This is really too small for your configuration
effective_cache_size = 16000
change this values in:
shared_buffers = 50000
sort_mem = 16084
wal_buffers = 1500
effective_cache_size = 32000
to bump up the shm usage you have to configure your OS in order to be
allowed to use that ammount of SHM.
This are the numbers that I feel good for your HW, the second step now is
analyze your queries
These changes have yielded some visible improvements, with load averages 
rarely going over anything noticeable.   However, I do have a 
question on the matter: why do these values seem to be far higher than 
what a frequently pointed-to document would indicate as necessary?

http://www.varlena.com/GeneralBits/Tidbits/perf.html
I am simply curious, as this shows that my understanding of 
PostgreSQL is clearly lacking when it comes to tweaking for the hardware.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]



Re: [PERFORM] The black art of postgresql.conf tweaking

2004-08-06 Thread Josh Berkus
Gaetano,

 Of course your are speaking about the worst case, I aplly in scenarios 
like
 this on the rule 80/20: 80% of connection will perform a sort and 20% will 
allocate
 memory for the sort operation in the same window time:

Well, I suppose it depends on how aggressive your connection pooling is.   If 
you minimize idle connections, then 300 connections can mean 200 concurrent 
queries.  And since Paul *is* having problems, this is worth looking into.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [PERFORM] Performance Bottleneck

2004-08-06 Thread Martin Foster
Mike Benoit wrote:
On Wed, 2004-08-04 at 17:25 +0200, Gaetano Mendola wrote:

The queries themselves are simple, normally drawing information from one 
table with few conditions or in the most complex cases using joins on 
two table or sub queries.   These behave very well and always have, the 
problem is that these queries take place in rather large amounts due to 
the dumb nature of the scripts themselves.
Show us the explain analyze on that queries, how many rows the tables are
containing, the table schema could be also usefull.

If the queries themselves are optimized as much as they can be, and as
you say, its just the sheer amount of similar queries hitting the
database, you could try using prepared queries for ones that are most
often executed to eliminate some of the overhead. 

I've had relatively good success with this in the past, and it doesn't
take very much code modification.
One of the biggest problems is most probably related to the indexes. 
Since logging the information needed to see which queries are used and 
which are not carries a slight performance penalty, I cannot really 
make use of it for now.

However, I am curious how one would go about preparing a query.   Is this 
similar to the DBI::Prepare statement with placeholders, simply 
changing the values passed on execute?  Or is this something database 
level, such as a view et cetera?

SELECT
 Post.PostIDNumber,
 Post.$format,
 Post.PuppeteerLogin,
 Post.PuppetName,
 Post.PostCmd,
 Post.PostClass
FROM Post
WHERE Post.PostIDNumber > ?::INT
  AND (Post.PostTo='all' OR Post.PostTo=?)
  AND (NOT EXISTS (SELECT PuppetIgnore.PuppetLogin
FROM PuppetIgnore
WHERE PuppetIgnore.PuppetIgnore='global'
  AND PuppetIgnore.PuppeteerLogin=?
  AND PuppetIgnore.PuppetLogin=Post.PuppeteerLogin)
   OR Post.PuppeteerLogin IS NULL)
  AND (NOT EXISTS (SELECT PuppetIgnore.PuppetName
FROM PuppetIgnore
WHERE PuppetIgnore.PuppetIgnore='single'
  AND PuppetIgnore.PuppeteerLogin=?
  AND PuppetIgnore.PuppetName=Post.PuppetName)
   OR Post.PuppetName IS NULL)
ORDER BY Post.PostIDNumber LIMIT 100
The range is determined from the previous run or through the query listed 
below.   It was determined that using INT was far faster than limiting 
by timestamp.

SELECT MIN(PostIDNumber)
FROM Post
WHERE RealmName=?
  AND PostClass IN ('general','play')
  AND PostTo='all'
The above simply provides a starting point, nothing more.   Once posts 
are pulled, the script will throw in the last pulled number so as to start 
from a fresh point.

Under MySQL, time was stored as an INT, which may have helped it handle 
timestamps more efficiently.   It also made use of three or more 
queries, where two were done to generate an IN statement for the query 
actually running at the time.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]


Re: [PERFORM] Performance Bottleneck

2004-08-06 Thread Gaetano Mendola
Martin Foster wrote:
Gaetano Mendola wrote:

Let start from your postgres configuration:
shared_buffers = 8192 This is really too small for your 
configuration
sort_mem = 2048

wal_buffers = 128 This is really too small for your 
configuration

effective_cache_size = 16000
change this values in:
shared_buffers = 50000
sort_mem = 16084
wal_buffers = 1500
effective_cache_size = 32000
to bump up the shm usage you have to configure your OS in order to be
allowed to use that ammount of SHM.
This are the numbers that I feel good for your HW, the second step now is
analyze your queries
These changes have yielded some visible improvements, with load averages 
rarely going over the anything noticeable.   However, I do have a 
question on the matter, why do these values seem to be far higher then 
what a frequently pointed to document would indicate as necessary?

http://www.varlena.com/GeneralBits/Tidbits/perf.html
I am simply curious, as this clearly shows that my understanding of 
PostgreSQL is clearly lacking when it comes to tweaking for the hardware.
Unfortunately there is no tuning wizard for postgres, so each one of
us has his own school. The values I gave you are oversized to be sure
to achieve improvements. Now you can start to decrease these values
(starting from wal_buffers) in order to find the good compromise
with your HW.

Regards
Gaetano Mendola







Re: [PERFORM] The black art of postgresql.conf tweaking

2004-08-06 Thread Gaetano Mendola
Josh Berkus wrote:
| Gaetano,
|
|
|Of course your are speaking about the worst case, I aplly in scenarios
|
| like
|
|this on the rule 80/20: 80% of connection will perform a sort and 20% will
|
| allocate
|
|memory for the sort operation in the same window time:
|
|
| Well, I suppose it depends on how aggresive your connection pooling is.   If
| you minimize idle connections, then 300 connections can mean 200 concurrent
| queries.  And since Paul *is* having problems, this is worth looking into.
With 4 CPUs (like Paul has) there is plenty of room to have 200
concurrent connections running, but I don't believe that all 200 together are
allocating space for sorts. I have not seen the code, but I'm quite confident
that the memory for a sort is released as soon as the sort operation is over,
not at the end of the connection.

Regards
Gaetano Mendola



Re: [PERFORM] Performance Bottleneck

2004-08-06 Thread Tom Lane
Martin Foster [EMAIL PROTECTED] writes:
 Gaetano Mendola wrote:
 change this values in:
 shared_buffers = 50000
 sort_mem = 16084
 
 wal_buffers = 1500

This value of wal_buffers is simply ridiculous.

There isn't any reason to set wal_buffers higher than the amount of
WAL log data that will be generated by a single transaction, because
whatever is in the buffers will be flushed at transaction commit.
If you are mainly dealing with heavy concurrency then it's the mean time
between transaction commits that matters, and that's even less than the
average transaction length.

Even if you are mainly interested in the performance of large updating
transactions that are not concurrent with anything else (bulk data load,
perhaps), I'm not sure that I see any value in setting wal_buffers so
high.  The data will have to go to disk before commit in any case, and
buffering so much of it just means that you are going to have a serious
spike in disk traffic right before commit.  It's almost certainly better
to keep wal_buffers conservatively small and let the data trickle out as
the transaction proceeds.  I don't actually think there is anything very
wrong with the default value (8) ... perhaps it is too small, but it's
not two orders of magnitude too small.

In 8.0, the presence of the background writer may make it useful to run
with wal_buffers somewhat higher than before, but I still doubt that
order-of-a-thousand buffers would be useful.  The RAM would almost
certainly be better spent on general-purpose disk buffers or kernel
cache.

Note though that this is just informed opinion, as I've never done or
seen any benchmarks that examine the results of changing wal_buffers
while holding other things constant.  Has anyone tried it?

regards, tom lane
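
[To put those sizes in perspective, assuming the default 8 kB block size: the default
of 8 wal_buffers is about 64 kB of WAL buffering, while the suggested 1500 is roughly
12 MB that would sit in memory until commit forces it out, which is the pre-commit
spike in disk traffic described above.]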



[PERFORM] Question about Generating Possible Plans by the planner/optimizer

2004-08-06 Thread sandra ruiz
In the documentation about the planner it says:
"It first combines all possible ways of scanning and joining the relations 
that appear in a query."

I would like to know if there is a time limit for doing that, or if it just scans 
ALL the possibilities until it finishes, no matter how long it takes.

thanks in advance.


Re: [PERFORM] Performance Bottleneck

2004-08-06 Thread Scott Marlowe
On Fri, 2004-08-06 at 17:24, Gaetano Mendola wrote:
 Martin Foster wrote:
 
  Gaetano Mendola wrote:
  
 
 
  Let start from your postgres configuration:
 
  shared_buffers = 8192 This is really too small for your 
  configuration
  sort_mem = 2048
 
  wal_buffers = 128 This is really too small for your 
  configuration
 
  effective_cache_size = 16000
 
  change this values in:
 
  shared_buffers = 50000
  sort_mem = 16084
 
  wal_buffers = 1500
 
  effective_cache_size = 32000
 
 
  to bump up the shm usage you have to configure your OS in order to be
  allowed to use that ammount of SHM.
 
  This are the numbers that I feel good for your HW, the second step now is
  analyze your queries
 
  
  These changes have yielded some visible improvements, with load averages 
  rarely going over the anything noticeable.   However, I do have a 
  question on the matter, why do these values seem to be far higher then 
  what a frequently pointed to document would indicate as necessary?
  
  http://www.varlena.com/GeneralBits/Tidbits/perf.html
  
  I am simply curious, as this clearly shows that my understanding of 
  PostgreSQL is clearly lacking when it comes to tweaking for the hardware.
 
 Unfortunately there is no a wizard tuning for postgres so each one of
 us have a own school. The data I gave you are oversized to be sure
 to achieve improvements. Now you can start to decrease these values
 ( starting from the wal_buffers ) in order to find the good compromise
 with your HW.

FYI, my school of tuning is to change one thing at a time by some
reasonable percentage (shared_buffers from 1000 to 2000, say) and measure the
change under simulated load.  Make another change, test it, chart the
shape of the change line.  It should look something like this for most
folks:

shared_buffers | q/s (more is better)
100 | 20
200 | 45
400 | 80
1000 | 100
... levels out here...
8000 | 110
10000 | 108
20000 | 40
30000 | 20

Note it going back down as we exceed our memory and start swapping
shared_buffers.  Where that happens on your machine is determined by
many things like your machine's memory, memory bandwidth, type of load,
etc... but it will happen on most machines and when it does, it often
happens at the worst times, under heavy parallel load.

Unless testing shows it's faster, 10000 or 25% of mem (whichever is
less) is usually a pretty good setting for shared_buffers.  Large data
sets may require more than 10000, but going over 25% on machines with
large memory is usually a mistake, especially on servers that do anything
other than just PostgreSQL.
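
[As a worked example of that ceiling, again assuming 8 kB blocks: 10000 buffers is
about 80 MB, while 25% of the 2 GB in the machines discussed here is about 512 MB, or
roughly 65000 buffers; the 50000 suggested earlier in the thread (~400 MB) sits just
under it, which is why creeping up toward such a value while measuring is safer than
jumping straight there.]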

You're absolutely right about one thing, there's no automatic wizard for
tuning this stuff.




Re: [PERFORM] Question about Generating Possible Plans by the planner/optimizer

2004-08-06 Thread Steinar H. Gunderson
On Fri, Aug 06, 2004 at 07:28:38PM -0500, sandra ruiz wrote:
 in the docummentation about the planer it says:
 
 It first combines all possible ways of scanning and joining the relations 
 that appear in a query
 
 I would like to know if there's a time limit to do that or if it just scans 
 ALL the posibilities until it finishes..no matter the time it takes..

Depends; if you join a lot of tables, it stops doing an exhaustive search and
goes for genetic optimization instead:

  http://www.postgresql.org/docs/7.4/static/geqo.html
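
[The point where the planner gives up on the exhaustive search is itself a setting; a
hedged sketch of the relevant postgresql.conf lines (geqo_threshold = 11 is believed
to be the 7.4 default, and both parameters exist in that release):

   geqo = true
   geqo_threshold = 11

Below the threshold the planner really does enumerate every join order, so there is
no time limit as such, only planning time that grows quickly with the number of
relations; at or above the threshold the genetic optimizer searches a sample of plans
instead.]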

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Performance Bottleneck

2004-08-06 Thread Martin Foster
Scott Marlowe wrote:
On Fri, 2004-08-06 at 17:24, Gaetano Mendola wrote:
Martin Foster wrote:

Gaetano Mendola wrote:

Let start from your postgres configuration:
shared_buffers = 8192 This is really too small for your 
configuration
sort_mem = 2048

wal_buffers = 128 This is really too small for your 
configuration

effective_cache_size = 16000
change this values in:
shared_buffers = 50000
sort_mem = 16084
wal_buffers = 1500
effective_cache_size = 32000
to bump up the shm usage you have to configure your OS in order to be
allowed to use that ammount of SHM.
This are the numbers that I feel good for your HW, the second step now is
analyze your queries
These changes have yielded some visible improvements, with load averages 
rarely going over the anything noticeable.   However, I do have a 
question on the matter, why do these values seem to be far higher then 
what a frequently pointed to document would indicate as necessary?

http://www.varlena.com/GeneralBits/Tidbits/perf.html
I am simply curious, as this clearly shows that my understanding of 
PostgreSQL is clearly lacking when it comes to tweaking for the hardware.
Unfortunately there is no a wizard tuning for postgres so each one of
us have a own school. The data I gave you are oversized to be sure
to achieve improvements. Now you can start to decrease these values
( starting from the wal_buffers ) in order to find the good compromise
with your HW.

FYI, my school of tuning is to change one thing at a time some
reasonable percentage (shared_buffers from 1000 to 2000) and measure the
change under simulated load.  Make another change, test it, chart the
shape of the change line.  It should look something like this for most
folks:
shared_buffers | q/s (more is better)
100 | 20
200 | 45
400 | 80
1000 | 100
... levels out here...
8000 | 110
10000 | 108
20000 | 40
30000 | 20
Note it going back down as we exceed our memory and start swapping
shared_buffers.  Where that happens on your machine is determined by
many things like your machine's memory, memory bandwidth, type of load,
etc... but it will happen on most machines and when it does, it often
happens at the worst times, under heavy parallel load.
Unless testing shows it's faster, 10000 or 25% of mem (whichever is
less) is usually a pretty good setting for shared_buffers.  Large data
sets may require more than 10000, but going over 25% on machines with
large memory is usually a mistake, especially servers that do anything
other than just PostgreSQL.
You're absolutely right about one thing, there's no automatic wizard for
tuning this stuff.
Which rather points out the crux of the problem.  This is a live system, 
meaning changes made need to be as informed as possible, and that 
changing values for the sake of testing can lead to potential problems 
in service.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]


Re: [PERFORM] Performance Bottleneck

2004-08-06 Thread Scott Marlowe
On Fri, 2004-08-06 at 22:02, Martin Foster wrote:
 Scott Marlowe wrote:
 
  On Fri, 2004-08-06 at 17:24, Gaetano Mendola wrote:
  
 Martin Foster wrote:
 
 
 Gaetano Mendola wrote:
 
 
 
 Let start from your postgres configuration:
 
 shared_buffers = 8192 This is really too small for your 
 configuration
 sort_mem = 2048
 
 wal_buffers = 128 This is really too small for your 
 configuration
 
 effective_cache_size = 16000
 
 change this values in:
 
  shared_buffers = 50000
 sort_mem = 16084
 
 wal_buffers = 1500
 
 effective_cache_size = 32000
 
 
 to bump up the shm usage you have to configure your OS in order to be
 allowed to use that ammount of SHM.
 
 This are the numbers that I feel good for your HW, the second step now is
 analyze your queries
 
 
 These changes have yielded some visible improvements, with load averages 
 rarely going over the anything noticeable.   However, I do have a 
 question on the matter, why do these values seem to be far higher then 
 what a frequently pointed to document would indicate as necessary?
 
 http://www.varlena.com/GeneralBits/Tidbits/perf.html
 
 I am simply curious, as this clearly shows that my understanding of 
 PostgreSQL is clearly lacking when it comes to tweaking for the hardware.
 
 Unfortunately there is no a wizard tuning for postgres so each one of
 us have a own school. The data I gave you are oversized to be sure
 to achieve improvements. Now you can start to decrease these values
 ( starting from the wal_buffers ) in order to find the good compromise
 with your HW.
  
  
  FYI, my school of tuning is to change one thing at a time some
  reasonable percentage (shared_buffers from 1000 to 2000) and measure the
  change under simulated load.  Make another change, test it, chart the
  shape of the change line.  It should look something like this for most
  folks:
  
  shared_buffers | q/s (more is better)
  100 | 20
  200 | 45
  400 | 80
  1000 | 100
  ... levels out here...
  8000 | 110
  10000 | 108
  20000 | 40
  30000 | 20
  
  Note it going back down as we exceed our memory and start swapping
  shared_buffers.  Where that happens on your machine is determined by
  many things like your machine's memory, memory bandwidth, type of load,
  etc... but it will happen on most machines and when it does, it often
  happens at the worst times, under heavy parallel load.
  
  Unless testing shows it's faster, 10000 or 25% of mem (whichever is
  less) is usually a pretty good setting for shared_buffers.  Large data
  sets may require more than 10000, but going over 25% on machines with
  large memory is usually a mistake, especially servers that do anything
  other than just PostgreSQL.
  
  You're absolutely right about one thing, there's no automatic wizard for
  tuning this stuff.
  
 
 Which rather points out the crux of the problem.  This is a live system, 
 meaning changes made need to be as informed as possible, and that 
 changing values for the sake of testing can lead to potential problems 
 in service.

But if you make those changes slowly, as I was showing, you should see
the small deleterious effects long before they become
catastrophic.  To just jump shared_buffers to 50000 is not a good idea,
especially if the sweet spot is likely lower than that.




Re: [PERFORM] Performance Bottleneck

2004-08-06 Thread Martin Foster
Scott Marlowe wrote:
On Fri, 2004-08-06 at 22:02, Martin Foster wrote:
Scott Marlowe wrote:

On Fri, 2004-08-06 at 17:24, Gaetano Mendola wrote:

Martin Foster wrote:

Gaetano Mendola wrote:

Let start from your postgres configuration:
shared_buffers = 8192 This is really too small for your 
configuration
sort_mem = 2048

wal_buffers = 128 This is really too small for your 
configuration

effective_cache_size = 16000
change this values in:
shared_buffers = 50000
sort_mem = 16084
wal_buffers = 1500
effective_cache_size = 32000
to bump up the shm usage you have to configure your OS in order to be
allowed to use that ammount of SHM.
This are the numbers that I feel good for your HW, the second step now is
analyze your queries
These changes have yielded some visible improvements, with load averages 
rarely going over the anything noticeable.   However, I do have a 
question on the matter, why do these values seem to be far higher then 
what a frequently pointed to document would indicate as necessary?

http://www.varlena.com/GeneralBits/Tidbits/perf.html
I am simply curious, as this clearly shows that my understanding of 
PostgreSQL is clearly lacking when it comes to tweaking for the hardware.
Unfortunately there is no a wizard tuning for postgres so each one of
us have a own school. The data I gave you are oversized to be sure
to achieve improvements. Now you can start to decrease these values
( starting from the wal_buffers ) in order to find the good compromise
with your HW.

FYI, my school of tuning is to change one thing at a time some
reasonable percentage (shared_buffers from 1000 to 2000) and measure the
change under simulated load.  Make another change, test it, chart the
shape of the change line.  It should look something like this for most
folks:
shared_buffers | q/s (more is better)
100 | 20
200 | 45
400 | 80
1000 | 100
... levels out here...
8000 | 110
10000 | 108
20000 | 40
30000 | 20
Note it going back down as we exceed our memory and start swapping
shared_buffers.  Where that happens on your machine is determined by
many things like your machine's memory, memory bandwidth, type of load,
etc... but it will happen on most machines and when it does, it often
happens at the worst times, under heavy parallel load.
Unless testing shows it's faster, 10000 or 25% of mem (whichever is
less) is usually a pretty good setting for shared_buffers.  Large data
sets may require more than 10000, but going over 25% on machines with
large memory is usually a mistake, especially servers that do anything
other than just PostgreSQL.
You're absolutely right about one thing, there's no automatic wizard for
tuning this stuff.
Which rather points out the crux of the problem.  This is a live system, 
meaning changes made need to be as informed as possible, and that 
changing values for the sake of testing can lead to potential problems 
in service.

But if you make those changes slowly, as I was showing, you should see
the small deleterious effects like I was showing long before they become
catastrophic.  To just jump shared_buffers to 50000 is not a good idea,
especially if the sweet spot is likely lower than that.  

While I agree, there are also issues with the fact that getting 
consistent results from this site is very difficult, since 
it is based on the whims of users visiting one of three sites hosted on 
the same hardware.

Now that being said, having wal_buffers at 8 certainly would not be a 
good idea, since the database logs themselves were warning of excessive 
writes in that region.   I am not hoping for a perfect intermix ratio 
that will solve all my problems.

But a good idea of a baseline that will let me carry a fair load would 
certainly be a good option.   Right now, the load being handled is not 
much more than what a single processor system did with half the memory. 
Certainly this architecture should be able to take more of a beating 
than this?

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]
