Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-08-01 Thread Luke Lonergan
Marc,

You should expect that for the kind of OLAP workload you describe in steps 2
and 3 you will have exactly one CPU working for you in Postgres.

If you want to accelerate the speed of this processing by a factor of 100 or
more on this machine, you should try Greenplum DB which is Postgres 8.2
compatible.  Based on the overall setup you describe, you may have a hybrid
installation with GPDB doing the reporting / OLAP workload and the other
Postgres databases handling the customer workloads.

- Luke


On 7/24/07 7:38 AM, Marc Mamin [EMAIL PROTECTED] wrote:

  
 Hello,
 
 thank you for all your comments and recommendations.
 
 I'm aware that the conditions for this benchmark are not ideal, mostly
 due to the lack of time to prepare it. We will also need an additional
 benchmark on a less powerful - more realistic - server to better
 understand the scalability of our application.
 
 
 Our application is based on java and generates dynamic reports from
 log file content. Dynamic means here that a report is calculated
 from the postgres data the first time it is requested (it will then be
 cached). Java is used to drive the data preparation and to
 handle/generate the report requests.
 
 This is much more an OLAP system than an OLTP one, at least as far as
 our performance concerns go.
 
 
 
 
 Data preparation:
 
 1) parsing the log files with a heavy use of perl (regular expressions)
 to generate csv files. Prepared statements also maintain reference
 tables in the DB. Postgres performance is not an issue for this first
 step.
 
 2) loading the csv files with COPY. As around 70% of the data to load
 goes into a single daily table, we don't allow concurrent jobs for this
 step. We have between a few and a few hundred files to load into a
 single table; they are processed one after the other. A primary key is
 always defined; when the required indexes are already built
 and the new data exceeds a given size, we load into a shadow
 table instead (without the indexes), build the indexes after the import
 and then replace the live table with the shadow one.
 For example, we have a table of 13 GB + 11 GB of indexes (5 of them).
 
 Performances :
 
 a) is there an ideal size to aim for with our csv files (100 x 10 MB,
 or rather 1 x 1 GB)?
 b) maintenance_work_mem: I'll use around 1 GB as recommended by
 Stefan
 
 3) Data aggregation. This is the heaviest part for Postgres. On our
 current system some queries need over one hour, with phases of around
 100% cpu use alternating with times of heavy i/o load when temporary
 results are written to/read from disk (pgsql_tmp). During the
 aggregation, other postgres activities are low (at least they should be)
 as this takes place at night. Currently we have a locking mechanism
 to avoid having more than one such query running concurrently. This
 may be too strict for the benchmark server but better reflects our current
 hardware capabilities.
 
 Performances: here we should favor a single huge transaction and
 assume a low probability of another transaction requiring large
 sort space at the same time. Considering this, is it reasonable to set
 work_mem to 3 GB? (I guess I should raise this parameter dynamically
 before running the aggregation queries.)
 
 4) Queries (report generation)
 
 Only a few queries perform unsatisfactorily; these are the ones requiring
 large sort operations. The data are structured in different aggregation
 levels (minutes, hours, days) with logical time-based partitions in order
 to limit the amount of data to compute for a given report. Moreover, we can
 scale our infrastructure by using separate or dedicated Postgres servers
 for different customers. Smaller customers may share the same instance,
 each of them having its own schema (the lock mechanism for large
 aggregations applies to a whole Postgres instance, not to a single
 customer). The benchmark will help us plan such a distribution.
 
 During the benchmark, we will probably not have more than 50 non-idle
 connections simultaneously. It is a bit too early for us to fine-tune
 this part. The benchmark will mainly focus on steps 1 to 3.
 
 During the benchmark, the DB will reach a size of about 400 GB,
 simulating 3 different customers, with the data split roughly equally
 across 3 schemas.
 
 
 
 I will post our configuration(s) later on.
 
 
 
 Thanks again for all your valuable input.
 
 Marc Mamin
 





Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-31 Thread Dimitri
Hi Luke,

On the same page of the Solaris internals wiki you may find links to a
study with the db_STRESS benchmark (done on UFS and ZFS with PostgreSQL,
MySQL and Oracle - well, the Oracle results were removed, but at least I
can say it went through the same tuning as PgSQL). The tests were done on
Sol10u3 (you will find all the other platform details in the report
document)...

Also, while the block size adjustment is more or less self-explanatory
(don't read 32K when you need only 8K - with a huge data volume you'll
simply waste your cache; and if you're doing a full scan, just leave the
prefetch algorithm to work for you), the ARC (cache) limitation probably
needs more explanation. Well, I cannot even say there is any real problem
with it - its implementation is just too aggressive :)) If all your running
programs fit into 1GB of RAM you may leave the ARC size at its default
(which leaves 1GB of system RAM free). Otherwise, you should limit the ARC
to keep your workload running comfortably: the ARC allocates memory very
quickly, and every time your program needs more RAM it ends up competing
with the ARC... In my tests I observed short workload freezes during such
periods and I did not like that too much :)) especially with high
connection numbers :))

well, we could spend hours discussing this :)  (sorry to be short, I have
very limited mail access at the moment)...

However, ZFS is improving all the time and works better and better
with every Solaris release, so all of the current tuning will probably be
different or obsolete by the end of this year :))

BTW, I forgot to mention: you'll need Solaris 10u4, or at least 10u3 with
all recent patches applied, to run the M8000 at full power.

Best regards!
-Dimitri


On 7/30/07, Luke Lonergan [EMAIL PROTECTED] wrote:
 Hi Dimitri,

 Can you post some experimental evidence that these settings matter?

 At this point we have several hundred terabytes of PG databases running on
 ZFS, all of them setting speed records for data warehouses.

 We did testing on these settings last year on S10U2, perhaps things have
 changed since then.

 - Luke

 Msg is shrt cuz m on ma treo

  -Original Message-
 From: Dimitri [mailto:[EMAIL PROTECTED]
 Sent: Monday, July 30, 2007 05:26 PM Eastern Standard Time
 To:   Luke Lonergan
 Cc:   Josh Berkus; pgsql-performance@postgresql.org; Marc Mamin
 Subject:  Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

 Luke,

 ZFS tuning does not come from general suggestions, but from real
 practice...

 So,
   - limiting the ARC is a MUST for the moment to keep your database running
 comfortably (especially for DWH!)
   - the 8K blocksize is chosen so that exactly one page is read when PG asks
 to read one page - don't mix it up with prefetch! When prefetch is detected,
 ZFS will read the next blocks without any demand from PG; but otherwise,
 why read more pages each time PG asks for only one?...
   - prefetch is of course not needed for OLTP, but it helps on OLAP/DWH, agreed :)

 Rgds,
 -Dimitri


 On 7/22/07, Luke Lonergan [EMAIL PROTECTED] wrote:
  Josh,
 
  On 7/20/07 4:26 PM, Josh Berkus [EMAIL PROTECTED] wrote:
 
   There are some specific tuning parameters you need for ZFS or
 performance
   is going to suck.
  
   http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
   (scroll down to PostgreSQL)
   http://www.sun.com/servers/coolthreads/tnb/applications_postgresql.jsp
   http://bugs.opensolaris.org/view_bug.do?bug_id=6437054
  
   You also don't say anything about what kind of workload you're running.
 
 
  I think we're assuming that the workload is OLTP when putting these tuning
  guidelines forward.  Note that the ZFS tuning guidance referred to in this
  bug article recommends turning vdev prefetching off for random I/O
  (databases).  This is exactly the opposite of what we should do for OLAP
  workloads.
 
  Also, the lore that setting recordsize on ZFS is mandatory for good
 database
  performance is similarly not appropriate for OLAP work.
 
  If the workload is OLAP / Data Warehousing, I'd suggest ignoring all of the
  tuning information from Sun that refers generically to databases.  The
  untuned ZFS performance should be far better in those cases.  Specifically,
  these three should be ignored:
  - (ignore this) limit ARC memory use
  - (ignore this) set recordsize to 8K
  - (ignore this) turn off vdev prefetch
 
  - Luke
 
 
 
 




Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-30 Thread Dimitri
Luke,

ZFS tuning does not come from general suggestions, but from real
practice...

So,
  - limiting the ARC is a MUST for the moment to keep your database running
comfortably (especially for DWH!)
  - the 8K blocksize is chosen so that exactly one page is read when PG asks
to read one page - don't mix it up with prefetch! When prefetch is detected,
ZFS will read the next blocks without any demand from PG; but otherwise,
why read more pages each time PG asks for only one?...
  - prefetch is of course not needed for OLTP, but it helps on OLAP/DWH, agreed :)

Rgds,
-Dimitri


On 7/22/07, Luke Lonergan [EMAIL PROTECTED] wrote:
 Josh,

 On 7/20/07 4:26 PM, Josh Berkus [EMAIL PROTECTED] wrote:

  There are some specific tuning parameters you need for ZFS or performance
  is going to suck.
 
  http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
  (scroll down to PostgreSQL)
  http://www.sun.com/servers/coolthreads/tnb/applications_postgresql.jsp
  http://bugs.opensolaris.org/view_bug.do?bug_id=6437054
 
  You also don't say anything about what kind of workload you're running.


 I think we're assuming that the workload is OLTP when putting these tuning
 guidelines forward.  Note that the ZFS tuning guidance referred to in this
 bug article recommends turning vdev prefetching off for random I/O
 (databases).  This is exactly the opposite of what we should do for OLAP
 workloads.

 Also, the lore that setting recordsize on ZFS is mandatory for good database
 performance is similarly not appropriate for OLAP work.

 If the workload is OLAP / Data Warehousing, I'd suggest ignoring all of the
 tuning information from Sun that refers generically to databases.  The
 untuned ZFS performance should be far better in those cases.  Specifically,
 these three should be ignored:
 - (ignore this) limit ARC memory use
 - (ignore this) set recordsize to 8K
 - (ignore this) turn off vdev prefetch

 - Luke







Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-30 Thread Luke Lonergan
Hi Dimitri,

Can you post some experimental evidence that these settings matter?

At this point we have several hundred terabytes of PG databases running on ZFS, 
all of them setting speed records for data warehouses.

We did testing on these settings last year on S10U2, perhaps things have 
changed since then.

- Luke

Msg is shrt cuz m on ma treo

 -Original Message-
From:   Dimitri [mailto:[EMAIL PROTECTED]
Sent:   Monday, July 30, 2007 05:26 PM Eastern Standard Time
To: Luke Lonergan
Cc: Josh Berkus; pgsql-performance@postgresql.org; Marc Mamin
Subject: Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

Luke,

ZFS tuning does not come from general suggestions, but from real
practice...

So,
  - limiting the ARC is a MUST for the moment to keep your database running
comfortably (especially for DWH!)
  - the 8K blocksize is chosen so that exactly one page is read when PG asks
to read one page - don't mix it up with prefetch! When prefetch is detected,
ZFS will read the next blocks without any demand from PG; but otherwise,
why read more pages each time PG asks for only one?...
  - prefetch is of course not needed for OLTP, but it helps on OLAP/DWH, agreed :)

Rgds,
-Dimitri


On 7/22/07, Luke Lonergan [EMAIL PROTECTED] wrote:
 Josh,

 On 7/20/07 4:26 PM, Josh Berkus [EMAIL PROTECTED] wrote:

  There are some specific tuning parameters you need for ZFS or performance
  is going to suck.
 
  http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
  (scroll down to PostgreSQL)
  http://www.sun.com/servers/coolthreads/tnb/applications_postgresql.jsp
  http://bugs.opensolaris.org/view_bug.do?bug_id=6437054
 
  You also don't say anything about what kind of workload you're running.


 I think we're assuming that the workload is OLTP when putting these tuning
 guidelines forward.  Note that the ZFS tuning guidance referred to in this
 bug article recommends turning vdev prefetching off for random I/O
 (databases).  This is exactly the opposite of what we should do for OLAP
 workloads.

 Also, the lore that setting recordsize on ZFS is mandatory for good database
 performance is similarly not appropriate for OLAP work.

 If the workload is OLAP / Data Warehousing, I'd suggest ignoring all of the
 tuning information from Sun that refers generically to databases.  The
 untuned ZFS performance should be far better in those cases.  Specifically,
 these three should be ignored:
 - (ignore this) limit ARC memory use
 - (ignore this) set recordsize to 8K
 - (ignore this) turn off vdev prefetch

 - Luke






Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-24 Thread Marc Mamin
 
Hello,

thank you for all your comments and recommendations.

I'm aware that the conditions for this benchmark are not ideal, mostly
due to the lack of time to prepare it. We will also need an additional
benchmark on a less powerful - more realistic - server to better
understand the scalability of our application.


Our application is based on java and generates dynamic reports from
log file content. Dynamic means here that a report is calculated
from the postgres data the first time it is requested (it will then be
cached). Java is used to drive the data preparation and to
handle/generate the report requests.

This is much more an OLAP system than an OLTP one, at least as far as
our performance concerns go.




Data preparation:

1) parsing the log files with a heavy use of perl (regular expressions)
to generate csv files. Prepared statements also maintain reference
tables in the DB. Postgres performance is not an issue for this first
step.

2) loading the csv files with COPY. As around 70% of the data to load
goes into a single daily table, we don't allow concurrent jobs for this
step. We have between a few and a few hundred files to load into a
single table; they are processed one after the other. A primary key is
always defined; when the required indexes are already built
and the new data exceeds a given size, we load into a shadow
table instead (without the indexes), build the indexes after the import
and then replace the live table with the shadow one.
For example, we have a table of 13 GB + 11 GB of indexes (5 of them).
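
A rough SQL sketch of that shadow-table load (table, column and file names
here are invented for illustration, not our real schema):

BEGIN;
CREATE TABLE daily_log_shadow (LIKE daily_log INCLUDING DEFAULTS);
COPY daily_log_shadow FROM '/data/csv/daily_log_part_001.csv' WITH CSV;
-- ... one COPY per csv file ...
ALTER TABLE daily_log_shadow ADD PRIMARY KEY (id);
CREATE INDEX daily_log_shadow_log_time_idx ON daily_log_shadow (log_time);
ALTER TABLE daily_log RENAME TO daily_log_old;
ALTER TABLE daily_log_shadow RENAME TO daily_log;
COMMIT;
DROP TABLE daily_log_old;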

Performances :

a) is there an ideal size to aim for with our csv files (100 x 10 MB,
or rather 1 x 1 GB)?
b) maintenance_work_mem: I'll use around 1 GB as recommended by
Stefan

3) Data aggregation. This is the heaviest part for Postgres. On our
current system some queries need over one hour, with phases of around
100% cpu use alternating with times of heavy i/o load when temporary
results are written to/read from disk (pgsql_tmp). During the
aggregation, other postgres activities are low (at least they should be)
as this takes place at night. Currently we have a locking mechanism
to avoid having more than one such query running concurrently. This
may be too strict for the benchmark server but better reflects our current
hardware capabilities.

Performances: here we should favor a single huge transaction and
assume a low probability of another transaction requiring large
sort space at the same time. Considering this, is it reasonable to set
work_mem to 3 GB? (I guess I should raise this parameter dynamically
before running the aggregation queries.)
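
When raising work_mem per session rather than globally, it would look
something like this (the 3 GB value is the one in question above; the
aggregation query itself is only a placeholder):

-- in the session that runs the nightly aggregation only:
SET work_mem = '3GB';
-- run the large aggregation query/queries here, e.g.
-- INSERT INTO report_hourly SELECT ... GROUP BY ...;
RESET work_mem;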

4) Queries (report generation)

Only a few queries perform unsatisfactorily; these are the ones requiring
large sort operations. The data are structured in different aggregation
levels (minutes, hours, days) with logical time-based partitions in order
to limit the amount of data to compute for a given report. Moreover, we can
scale our infrastructure by using separate or dedicated Postgres servers
for different customers. Smaller customers may share the same instance,
each of them having its own schema (the lock mechanism for large
aggregations applies to a whole Postgres instance, not to a single
customer). The benchmark will help us plan such a distribution.
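
For completeness, the schema-per-customer layout boils down to something
like this (the customer names are invented):

CREATE SCHEMA customer_a;
CREATE SCHEMA customer_b;
-- each connection then works inside its own customer's schema:
SET search_path TO customer_a;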

During the benchmark, we will probably not have more than 50 non-idle
connections simultaneously. It is a bit too early for us to fine-tune
this part. The benchmark will mainly focus on steps 1 to 3.

During the benchmark, the DB will reach a size of about 400 GB,
simulating 3 different customers, with the data split roughly equally
across 3 schemas.



I will post our configuration(s) later on.



Thanks again for all your valuable input.

Marc Mamin



Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-22 Thread Luke Lonergan
Josh,

On 7/20/07 4:26 PM, Josh Berkus [EMAIL PROTECTED] wrote:

 There are some specific tuning parameters you need for ZFS or performance
 is going to suck.
 
 http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
 (scroll down to PostgreSQL)
 http://www.sun.com/servers/coolthreads/tnb/applications_postgresql.jsp
 http://bugs.opensolaris.org/view_bug.do?bug_id=6437054
 
 You also don't say anything about what kind of workload you're running.


I think we're assuming that the workload is OLTP when putting these tuning
guidelines forward.  Note that the ZFS tuning guidance referred to in this
bug article recommends turning vdev prefetching off for random I/O
(databases).  This is exactly the opposite of what we should do for OLAP
workloads.

Also, the lore that setting recordsize on ZFS is mandatory for good database
performance is similarly not appropriate for OLAP work.

If the workload is OLAP / Data Warehousing, I'd suggest ignoring all of the
tuning information from Sun that refers generically to databases.  The
untuned ZFS performance should be far better in those cases.  Specifically,
these three should be ignored:
- (ignore this) limit ARC memory use
- (ignore this) set recordsize to 8K
- (ignore this) turn off vdev prefetch

- Luke





Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-20 Thread Josh Berkus
Marc,

 Server Specifications:
 --

 Sun SPARC Enterprise M8000 Server:

 http://www.sun.com/servers/highend/m8000/specs.xml

 File system:

 http://en.wikipedia.org/wiki/ZFS

There are some specific tuning parameters you need for ZFS or performance 
is going to suck.

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
(scroll down to PostgreSQL)
http://www.sun.com/servers/coolthreads/tnb/applications_postgresql.jsp
http://bugs.opensolaris.org/view_bug.do?bug_id=6437054

You also don't say anything about what kind of workload you're running.


-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-20 Thread Gavin M. Roy

Having done something similar recently, I would recommend that you look at
adding connection pooling, using pgBouncer in transaction-pooling mode, between
your benchmark app and PgSQL.  In our application we have about 2000 clients
funneling down to 30 backends and are able to sustain a large
transactions-per-second volume.  This has been the #1 key to success for us
in running on monster hardware.
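
A minimal pgbouncer.ini sketch of that kind of funneling (the database name,
port and pool sizes are illustrative, not Gavin's actual settings):

[databases]
reports = host=127.0.0.1 port=5432 dbname=reports

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = trust
pool_mode = transaction
max_client_conn = 2000
default_pool_size = 30

The application then connects to port 6432 instead of 5432, and PostgreSQL
only ever sees default_pool_size backends per database/user pair.
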
Regards,

Gavin

On 7/17/07, Marc Mamin [EMAIL PROTECTED] wrote:



Postgres configuration for 64 CPUs, 128 GB RAM...

Hello,

We have the opportunity to benchmark our application on a large server. I
have to prepare the Postgres configuration and I'd appreciate some comments
on it as I am not experienced with servers of such a scale. Moreover the
configuration should be fail-proof as I won't be able to attend the tests.

Our application (java + perl) and Postgres will run on the same server,
whereas the application activity is low when Postgres has large transactions
to process.

There is a large gap between our current production server (Linux, 4GB
RAM, 4 cpus) and the benchmark server; one of the targets of this benchmark
is to verify the scalability of our application.

And you have no reason to be envious as the server doesn't belong to us :-)

Thanks for your comments,

Marc Mamin




Postgres version: 8.2.1


Server Specifications:
--

Sun SPARC Enterprise M8000 Server:

http://www.sun.com/servers/highend/m8000/specs.xml

File system:

http://en.wikipedia.org/wiki/ZFS


Planned configuration:


# we don't expect more than 150 parallel connections,
# but I suspect a leak in our application that lets some idle connections
stay open

max_connections=2000

ssl = off

#maximum allowed
shared_buffers= 262143

# on our current best production server with 4GB RAM (not dedicated to
Postgres), work_mem is set to 600 MB
# this limitation is probably the bottleneck for our application as the
files in pgsql_tmp grow up to 15 GB
# during large aggregations (we have a locking mechanism to avoid
parallel processing of such transactions)
work_mem = 31457280  # (30 GB)

# index creation time is also an issue for us; the process is locking
other large processes too.
# our largest table so far is 13 GB + 11 GB indexes
maintenance_work_mem = 31457280  # (30 GB)

# more than the max number of tables +indexes expected during the
benchmark
max_fsm_relations = 10

max_fsm_pages = 180

# don't know if I should modify this.
# seems to be sufficient on our production servers
max_stack_depth = 2MB

# vacuum will be done by hand between each test session
autovacuum = off


# required to analyse the benchmark
log_min_duration_statement = 1000

max_prepared_transactions = 100

# seems to be required to drop schema/roles containing large number of
objects
max_locks_per_transaction = 128



# I use the default for the bgwriter as I couldn't find recommendations on
those

#bgwriter_delay = 200ms # 10-10000ms between rounds
#bgwriter_lru_percent = 1.0 # 0-100% of LRU buffers
scanned/round
#bgwriter_lru_maxpages = 5  # 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333   # 0-100% of all buffers
scanned/round
#bgwriter_all_maxpages = 5  # 0-1000 buffers max written/round

#WAL

fsync = on

#use default
#wal_sync_method

# we are using 32 on our production system
wal_buffers=64

# we didn't do any testing with this parameter until now, but this
shouldn't be a relevant
# point as our performance focus is on large transactions
commit_delay = 0

#CHECKPOINT

# xlog will be  on a separate disk
checkpoint_segments=256

checkpoint_timeout = 5min



Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-17 Thread Stefan Kaltenbrunner

Marc Mamin wrote:


Postgres configuration for 64 CPUs, 128 GB RAM...


there are probably not that many installations out there that are this
large - comments below




Hello,

We have the opportunity to benchmark our application on a large server. I
have to prepare the Postgres configuration and I'd appreciate some 
comments on it as I am not experienced with servers of such a scale. 
Moreover the configuration should be fail-proof as I won't be able to 
attend the tests.


Our application (java + perl) and Postgres will run on the same server, 
whereas the application activity is low when Postgres has large 
transactions to process.


There is a large gap between our current production server (Linux, 4GB
RAM, 4 cpus) and the benchmark server; one of the targets of this
benchmark is to verify the scalability of our application.




[...]

Postgres version: 8.2.1


upgrade to 8.2.4


File system:

http://en.wikipedia.org/wiki/ZFS


way more important is what kind of disk-IO subsystem you have attached ...





Planned configuration:


# we don't expect more than 150 parallel connections,
# but I suspect a leak in our application that lets some idle connections
stay open


max_connections=2000

ssl = off

#maximum allowed
shared_buffers= 262143


this is probably on the lower side for a 128GB box



# on our current best production server with 4GB RAM (not dedicated to 
Postgres), work_mem is set to 600 MB
 # this limitation is probably the bottleneck for our application as the
 files in pgsql_tmp grow up to 15 GB
 # during large aggregations (we have a locking mechanism to avoid
 parallel processing of such transactions)

work_mem = 31457280  # (30 GB)


this is simply ridiculous - work_mem is PER SORT - so if your query
requires 8 sorts it will feel free to use 8x30GB, and that needs to be
multiplied by the number of concurrent connections.
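
To put illustrative numbers on that (made-up figures, just to show the
multiplication): one reporting query with 4 sort/hash steps at
work_mem = 30 GB may grab 4 x 30 GB = 120 GB, and five such sessions at once
could in theory ask for 5 x 120 GB = 600 GB - several times the RAM in this
box.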




# index creation time is also an issue for us; the process is locking 
other large processes too.

# our largest table so far is 13 GB + 11 GB indexes
maintenance_work_mem = 31457280  # (30 GB)


this is ridiculous too - testing has shown that there is not much point 
in going beyond 1GB or so




# more than the max number of tables +indexes expected during the benchmark
max_fsm_relations = 10

max_fsm_pages = 180


this is probably way too low for a database the size of yours - watch the
output of VACUUM VERBOSE on a database-wide vacuum for some stats on that.




# don't know if I should modify this.
# seems to be sufficient on our production servers
max_stack_depth = 2MB

# vacuum will be done by hand between each test session
autovacuum = off



# required to analyse the benchmark
log_min_duration_statement = 1000


 max_prepared_transactions = 100


# seems to be required to drop schema/roles containing large number of 
objects

max_locks_per_transaction = 128




# I use the default for the bgwriter as I couldn't find recommendations on
those


#bgwriter_delay = 200ms # 10-10000ms between rounds
#bgwriter_lru_percent = 1.0 # 0-100% of LRU buffers 
scanned/round

#bgwriter_lru_maxpages = 5  # 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333   # 0-100% of all buffers 
scanned/round

#bgwriter_all_maxpages = 5  # 0-1000 buffers max written/round


#WAL

fsync = on

#use default
#wal_sync_method

# we are using 32 on our production system
wal_buffers=64


values up to 512 or so have been reported to help on systems with very 
high concurrency



what is missing here is your settings for:

effective_cache_size

and

random_page_cost
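
For example (illustrative values only - they depend on how much of the
128 GB really remains for OS/ZFS caching and on the I/O subsystem):

effective_cache_size = 13107200   # ~100 GB expressed in 8 kB pages
random_page_cost = 2.0            # below the default 4.0 if random I/O is cheap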



Stefan



Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-17 Thread Jim C. Nasby
On Tue, Jul 17, 2007 at 04:10:30PM +0200, Marc Mamin wrote:
 shared_buffers= 262143
 
You should at least try some runs with this set far, far larger. At
least 10% of memory, but it'd be nice to see what happens with this set
to 50% or higher as well (though don't set it larger than the database
since it'd be a waste).
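
For scale (simple arithmetic, not values from this thread): shared_buffers
counts 8 kB buffers, so 262143 is only about 2 GB; 10% of this 128 GB box is
~12.8 GB, i.e. roughly 1.7 million buffers, and 50% would be around 8 million.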

How big is the database, anyway?

 # on our current best production server with 4GB RAM (not dedicated to
 Postgres), work_mem is set to 600 MB
 # this limitation is probably the bottleneck for our application as the
 files in pgsql_tmp grow up to 15 GB
 # during large aggregations (we have a locking mechanism to avoid
 parallel processing of such transactions)

Keep in mind that a good filesystem will be caching most of pgsql_tmp if
it can.

 max_prepared_transactions = 100
 
Are you using 2PC? If not, there's no reason to touch this (you could
just set it to 0).

 # I use the default for the bgwriter as I couldnt find recommendation on
 those
 
 #bgwriter_delay = 200ms # 10-10000ms between rounds
 #bgwriter_lru_percent = 1.0 # 0-100% of LRU buffers
 scanned/round
 #bgwriter_lru_maxpages = 5  # 0-1000 buffers max
 written/round
 #bgwriter_all_percent = 0.333   # 0-100% of all buffers
 scanned/round
 #bgwriter_all_maxpages = 5  # 0-1000 buffers max
 written/round
 
You'll probably want to increase both maxpages parameters substantially,
assuming that you've got good IO hardware.
 
 #CHECKPOINT
 
 # xlog will be  on a separate disk
 checkpoint_segments=256
 
 checkpoint_timeout = 5min

The further apart your checkpoints, the better. Might want to look at 10
minutes. I'd also set checkpoint_warning to just a bit below
checkpoint_timeout and watch for warnings to make sure you're not
checkpointing a lot more frequently than you're expecting.
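
A sketch of what that could look like (the 10-minute figure is just the
example above; checkpoint_warning merely has to sit a bit below the timeout):

checkpoint_timeout = 10min
checkpoint_warning = 9min
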
-- 
Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-17 Thread Gregory Stark

Marc Mamin [EMAIL PROTECTED] writes:

 We have the opportunity to benchmark our application on a large server. I
 have to prepare the Postgres configuration and I'd appreciate some
 comments on it as I am not experienced with servers of such a scale.
 Moreover the configuration should be fail-proof as I won't be able to
 attend the tests. 

I really think that's a recipe for disaster. Even on a regular machine you
need to treat tuning as an on-going feedback process. There's no such thing as
a fail-proof configuration since every application is different.

On an exotic machine like this you're going to run into unique problems that
nobody here can anticipate with certainty.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-17 Thread Greg Smith

On Tue, 17 Jul 2007, Marc Mamin wrote:

Moreover the configuration should be fail-proof as I won't be able to 
attend the tests.


This is unreasonable.  The idea that you'll get a magic perfect 
configuration in one shot suggests a fundamental misunderstanding of how 
work like this is done.  If there's any way you could adjust things so 
that, say, you were allowed to give at least 4 different tuning setups and 
you got a report back with each of the results for them, that would let 
you design a much better test set.



Postgres version: 8.2.1


This has already been mentioned, but it really is critical for your type 
of test to run 8.2.4 instead so I wanted to emphasize it.  There is a 
major scalability bug in 8.2.1.  I'm going to ignore the other things that 
other people have already commented on (all the suggestions Stefan and
Jim already made are good ones you should heed) and try to fill in the 
remaining gaps instead.



# I use the default for the bgwriter as I couldnt find recommendation on
those


The defaults are so small that it will barely do anything on a server of 
your size.  Tuning it properly so that it's effective but doesn't waste a 
lot of resources is tricky, which is why you haven't found such 
recommendations--they're fairly specific to what you're doing and require 
some testing to get right.  If you want to see an example from a big 
server, look at


http://www.spec.org/jAppServer2004/results/res2007q3/jAppServer2004-20070606-00065.html#DBDatabase_SW_Config0

That's tuned for a very specific benchmark though.  Here's a fairly 
generic set of parameters that would be much more aggressive than the 
defaults, while not going so far as to waste too many resources if the 
writer is just getting in the way on your server:


bgwriter_delay = 200ms
bgwriter_lru_percent = 3.0
bgwriter_lru_maxpages = 500
bgwriter_all_percent = 1.0
bgwriter_all_maxpages = 250


#WAL
fsync = on
#use default
#wal_sync_method


I'd expect wal_sync_method=open_datasync would outperform the default, but
you'd really want to test both ways here to be sure.  The fact that the 
Sun results I referenced above use the default of fdatasync makes me 
hesitate to recommend that change too strongly, as I haven't worked with 
this particular piece of hardware. See 
http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm for more 
information about this parameter.
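
If you do try it, it is a one-line change to compare against the default in
your benchmark runs:

wal_sync_method = open_datasync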


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



[PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-17 Thread Marc Mamin

Postgres configuration for 64 CPUs, 128 GB RAM...

Hello,

We have the opportunity to benchmark our application on a large server. I
have to prepare the Postgres configuration and I'd appreciate some
comments on it as I am not experienced with servers of such a scale.
Moreover the configuration should be fail-proof as I won't be able to
attend the tests. 

Our application (java + perl) and Postgres will run on the same server,
whereas the application activity is low when Postgres has large
transactions to process.

There is a large gap between our current production server (Linux, 4GB
RAM, 4 cpus) and the benchmark server; one of the targets of this
benchmark is to verify the scalability of our application.


And you have no reason to be envious as the server doesn't belong to us :-)


Thanks for your comments,

Marc Mamin





Postgres version: 8.2.1



Server Specifications:
--

Sun SPARC Enterprise M8000 Server:

http://www.sun.com/servers/highend/m8000/specs.xml

File system:

http://en.wikipedia.org/wiki/ZFS



Planned configuration:


# we don't expect more than 150 parallel connections, 
# but I suspect a leak in our application that lets some idle connections
stay open

max_connections=2000

ssl = off 

#maximum allowed
shared_buffers= 262143

# on our current best production server with 4GB RAM (not dedicated to
Postgres), work_mem is set to 600 MB
# this limitation is probably the bottleneck for our application as the
files in pgsql_tmp grow up to 15 GB
# during large aggregations (we have a locking mechanism to avoid
parallel processing of such transactions)
work_mem = 31457280  # (30 GB)

# index creation time is also an issue for us; the process is locking
other large processes too.
# our largest table so far is 13 GB + 11 GB indexes
maintenance_work_mem = 31457280  # (30 GB)

# more than the max number of tables +indexes expected during the
benchmark
max_fsm_relations = 10

max_fsm_pages = 180

# don't know if I should modify this.
# seems to be sufficient on our production servers
max_stack_depth = 2MB

# vacuum will be done by hand between each test session
autovacuum = off 



# required to analyse the benchmark
log_min_duration_statement = 1000


max_prepared_transactions = 100


# seems to be required to drop schema/roles containing large number of
objects
max_locks_per_transaction = 128 




# I use the default for the bgwriter as I couldn't find recommendations on
those

#bgwriter_delay = 200ms # 10-10000ms between rounds
#bgwriter_lru_percent = 1.0 # 0-100% of LRU buffers
scanned/round
#bgwriter_lru_maxpages = 5  # 0-1000 buffers max
written/round
#bgwriter_all_percent = 0.333   # 0-100% of all buffers
scanned/round
#bgwriter_all_maxpages = 5  # 0-1000 buffers max
written/round


#WAL

fsync = on

#use default
#wal_sync_method

# we are using 32 on our production system
wal_buffers=64


# we didn't do any testing with this parameter until now, but this
shouldn't be a relevant
# point as our performance focus is on large transactions
commit_delay = 0 

#CHECKPOINT

# xlog will be  on a separate disk
checkpoint_segments=256

checkpoint_timeout = 5min


Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-17 Thread PFC
On Tue, 17 Jul 2007, Marc Mamin wrote:

We have the opportunity to benchmark our application on a large server. I
have to prepare the Postgres configuration and I'd appreciate some
comments on it as I am not experienced with servers of such a scale.
Moreover the configuration should be fail-proof as I won't be able to
attend the tests.

Our application (java + perl) and Postgres will run on the same server,
whereas the application activity is low when Postgres has large
transactions to process.


Please, can you be more specific about your application :

- what does it do ?
- what kind of workload does it generate ?
	[ie: many concurrent small queries (website) ; few huge queries,  
reporting, warehousing, all of the above, something else ?]

- percentage and size of update queries ?
	- how many concurrent threads / connections / clients do you serve on a  
busy day ?
	(I don't mean online users on a website, but ACTIVE concurrent database  
connections)


	I assume you find your current server is too slow or foresee it will  
become too slow soon and want to upgrade, so :


	- what makes the current server's performance inadequate ? is it IO, CPU,  
RAM, a mix ? which proportions in the mix ?


	This is very important. If you go to the dealer and ask "I need a better
vehicle", he'll sell you a Porsche. But if you say "I need a better vehicle
to carry two tons of cinderblocks", he'll sell you something else, I guess.
Same with database servers. You could need some humongous CPU power, but
you might just as well not. Depends.



There is a large gap between our current production server (Linux, 4GB
RAM, 4 cpus) and the benchmark server; one of the targets of this
benchmark is to verify the scalability of our application.


	Define scalability. (no this isn't a joke, I mean, you know your  
application, how would you like it to scale ? How do you think it will  
scale ? Why ? What did you do so it would scale well ? etc.)




