Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Marc,

You should expect that for the kind of OLAP workload you describe in steps 2 and 3, you will have exactly one CPU working for you in Postgres. If you want to accelerate this processing by a factor of 100 or more on this machine, you should try Greenplum DB, which is Postgres 8.2 compatible. Based on the overall setup you describe, you could run a hybrid installation, with GPDB doing the reporting / OLAP workload and the other Postgres databases handling the customer workloads.

- Luke

On 7/24/07 7:38 AM, Marc Mamin [EMAIL PROTECTED] wrote:

Hello,

Thank you for all your comments and recommendations.

I'm aware that the conditions for this benchmark are not ideal, mostly due to the lack of time to prepare it. We will also need an additional benchmark on a less powerful (more realistic) server to better understand the scalability of our application.

Our application is based on Java and generates dynamic reports from log file content. Dynamic means here that a report will be calculated from the Postgres data the first time it is requested (it will then be cached). Java is used to drive the data preparation and to handle/generate the report requests. For our performance concerns this is much more an OLAP system than an OLTP one.

Data preparation:

1) Parsing the log files, with heavy use of Perl (regular expressions), to generate CSV files. Prepared statements are also used to maintain reference tables in the DB. Postgres performance is not an issue for this first step.

2) Loading the CSV files with COPY. As around 70% of the data to load goes into a single daily table, we don't allow concurrent jobs for this step. We have between a few and a few hundred files to load into a single table; they are processed one after the other. A primary key is always defined. For the case where the required indexes are already built and the new data is above a given size, we use a shadow table instead (without the indexes), build the indexes after the import, and then replace the live table with the shadow one. For example, we have a table of 13 GB plus 11 GB of indexes (5 of them).

Performance:
a) Is there an ideal size to consider for our CSV files (100 x 10 MB, or better 1 x 1 GB)?
b) maintenance_work_mem: I'll use around 1 GB, as recommended by Stefan.

3) Data aggregation. This is the heaviest part for Postgres. On our current system some queries need above one hour, with phases of around 100% CPU use alternating with times of heavy I/O load when temporary results are written to and read from disk (pgsql_tmp). During the aggregation, other Postgres activity is low (at least it should be), as this takes place at night. Currently we have a locking mechanism to avoid having more than one such query running concurrently. This may be too strict for the benchmark server, but it better reflects our current hardware capabilities.

Performance: Here we should favor a single huge transaction and assume a low probability of another transaction requiring large sort space. Considering this, is it reasonable to set work_mem to 3 GB? (I guess I should raise this parameter dynamically before running the aggregation queries.)

4) Queries (report generation). We have only a few queries which are not satisfying; these require large sort operations. The data is structured in different aggregation levels (minutes, hours, days), with logical time-based partitions, in order to limit the data size to compute for a given report.
Moreover, we can scale our infrastructure by using different or dedicated Postgres servers for different customers. Smaller customers may share the same instance, each of them having its own schema. (The lock mechanism for large aggregations applies to a whole Postgres instance, not to a single customer.) The benchmark will help us to plan such a distribution.

During the benchmark, we will probably not have more than 50 non-idle connections simultaneously. It is a bit too early for us to fine-tune this part; the benchmark will mainly focus on steps 1 to 3.

During the benchmark, the DB will reach a size of about 400 GB, simulating 3 different customers, with the data split roughly equally across 3 schemas.

I will post our configuration(s) later on.

Thanks again for all your valuable input.

Marc Mamin
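A minimal SQL sketch of the shadow-table load pattern Marc describes in step 2; the table, column, and index names here are hypothetical, and the RENAME swap inside the transaction makes the rebuilt table visible atomically:

BEGIN;
-- copy the structure of the live table, but none of its indexes
CREATE TABLE daily_log_shadow (LIKE daily_log);
COPY daily_log_shadow FROM '/data/csv/part_0001.csv' WITH CSV;
-- ... one COPY per remaining CSV file ...
-- build the primary key and indexes only after all the data is in
ALTER TABLE daily_log_shadow ADD PRIMARY KEY (log_id);
CREATE INDEX daily_log_shadow_ts_idx ON daily_log_shadow (log_ts);
ALTER TABLE daily_log RENAME TO daily_log_old;
ALTER TABLE daily_log_shadow RENAME TO daily_log;
COMMIT;
DROP TABLE daily_log_old;

Building the indexes after the bulk load is what saves the time here; per-row index maintenance during COPY is far slower than one sort-based index build at the end.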
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Hi Luke,

On the same page of the Solaris internals wiki you may find links to a study using the db_STRESS benchmark, done on UFS and ZFS with PostgreSQL, MySQL and Oracle (well, the Oracle results were removed, but at least I may say it got the same tuning as PgSQL). The tests were done on Sol10u3 (you may find all the other platform details in the report document)...

Also, while the block size adjustment is more or less self-explanatory (don't read 32K if you need only 8K: with a huge data volume you'll simply waste your cache; and if you're doing full scans, leave the prefetch algorithm to work for you), the ARC (cache) limitation probably needs more explanation. Well, I cannot even say there is any problem with it; its implementation is just too aggressive :)) If all your running programs fit into 1 GB of RAM, you may leave the ARC size at its default (it leaves 1 GB of system RAM free). Otherwise, you should limit the ARC to keep your workload executing comfortably: the ARC allocates memory very quickly, and every time your program needs more RAM it enters into contention with the ARC... In my tests I observed short workload freezes during such periods, and I did not like that too much :)) especially with high connection numbers :))

Well, we could spend hours discussing this :) (sorry to be short, I have very limited mail access at the moment)... However, ZFS is improving all the time and works better and better with every Solaris release, so probably all current tuning will be different or obsolete by the end of this year :))

BTW, I forgot to mention: you'll need Solaris 10u4, or at least 10u3 with all recent patches applied, to run the M8000 at full power.

Best regards!
-Dimitri

On 7/30/07, Luke Lonergan [EMAIL PROTECTED] wrote:

Hi Dimitri,

Can you post some experimental evidence that these settings matter? At this point we have several hundred terabytes of PG databases running on ZFS, all of them setting speed records for data warehouses.

We did testing on these settings last year on S10U2; perhaps things have changed since then.

- Luke

Msg is shrt cuz m on ma treo

-----Original Message-----
From: Dimitri [mailto:[EMAIL PROTECTED]
Sent: Monday, July 30, 2007 05:26 PM Eastern Standard Time
To: Luke Lonergan
Cc: Josh Berkus; pgsql-performance@postgresql.org; Marc Mamin
Subject: Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

Luke,

The ZFS tuning does not come from general suggestions, but from real practice... So:

- limiting the ARC is a MUST for the moment to keep your database running comfortably (especially DWH!)
- the 8K blocksize is chosen to read exactly one page when PG asks for one page; don't mix this up with prefetch! When prefetch is detected, ZFS will read the next blocks without any demand from PG, but otherwise why would you read more pages each time PG asks for only one?...
- prefetch is of course not needed for OLTP, but it helps on OLAP/DWH, agreed :)

Rgds,
-Dimitri

On 7/22/07, Luke Lonergan [EMAIL PROTECTED] wrote:

Josh,

On 7/20/07 4:26 PM, Josh Berkus [EMAIL PROTECTED] wrote:

> There are some specific tuning parameters you need for ZFS or performance is going to suck.
> http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide (scroll down to PostgreSQL)
> http://www.sun.com/servers/coolthreads/tnb/applications_postgresql.jsp
> http://bugs.opensolaris.org/view_bug.do?bug_id=6437054
> You also don't say anything about what kind of workload you're running.

I think we're assuming that the workload is OLTP when putting these tuning guidelines forward.
Note that the ZFS tuning guidance referred to in this bug article recommends turning vdev prefetching off for random I/O (databases). This is exactly the opposite of what we should do for OLAP workloads. Also, the lore that setting recordsize on ZFS is mandatory for good database performance is similarly not appropriate for OLAP work.

If the workload is OLAP / data warehousing, I'd suggest ignoring all of the tuning information from Sun that refers generically to "databases". The untuned ZFS performance should be far better in those cases. Specifically, these three should be ignored:

- (ignore this) limit ARC memory use
- (ignore this) set recordsize to 8K
- (ignore this) turn off vdev prefetch

- Luke
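For reference, the ARC cap Dimitri argues for was set through /etc/system on Solaris 10 of that era. A sketch, assuming a 4 GB cap is wanted; the exact tunable names varied between Solaris releases, so check the ZFS tuning guide for the running kernel, and note that a reboot is needed for /etc/system changes to take effect:

* /etc/system fragment (Solaris 10): cap the ZFS ARC at 4 GB (0x100000000 bytes)
set zfs:zfs_arc_max = 0x100000000
* optionally disable file-level prefetch for purely random I/O
set zfs:zfs_prefetch_disable = 1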
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Luke,

The ZFS tuning does not come from general suggestions, but from real practice... So:

- limiting the ARC is a MUST for the moment to keep your database running comfortably (especially DWH!)
- the 8K blocksize is chosen to read exactly one page when PG asks for one page; don't mix this up with prefetch! When prefetch is detected, ZFS will read the next blocks without any demand from PG, but otherwise why would you read more pages each time PG asks for only one?...
- prefetch is of course not needed for OLTP, but it helps on OLAP/DWH, agreed :)

Rgds,
-Dimitri

On 7/22/07, Luke Lonergan [EMAIL PROTECTED] wrote:

Josh,

On 7/20/07 4:26 PM, Josh Berkus [EMAIL PROTECTED] wrote:

> There are some specific tuning parameters you need for ZFS or performance is going to suck.
> http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide (scroll down to PostgreSQL)
> http://www.sun.com/servers/coolthreads/tnb/applications_postgresql.jsp
> http://bugs.opensolaris.org/view_bug.do?bug_id=6437054
> You also don't say anything about what kind of workload you're running.

I think we're assuming that the workload is OLTP when putting these tuning guidelines forward. Note that the ZFS tuning guidance referred to in this bug article recommends turning vdev prefetching off for random I/O (databases). This is exactly the opposite of what we should do for OLAP workloads. Also, the lore that setting recordsize on ZFS is mandatory for good database performance is similarly not appropriate for OLAP work.

If the workload is OLAP / data warehousing, I'd suggest ignoring all of the tuning information from Sun that refers generically to "databases". The untuned ZFS performance should be far better in those cases. Specifically, these three should be ignored:

- (ignore this) limit ARC memory use
- (ignore this) set recordsize to 8K
- (ignore this) turn off vdev prefetch

- Luke
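The 8K blocksize Dimitri refers to is the ZFS recordsize property. It only applies to files created after it is set, so it must be changed before the data is loaded; a sketch, assuming a hypothetical dataset tank/pgdata holding the cluster's data directory:

# match the ZFS record size to the 8K Postgres page size
zfs set recordsize=8K tank/pgdata
# verify the setting
zfs get recordsize tank/pgdata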
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Hi Dimitri,

Can you post some experimental evidence that these settings matter? At this point we have several hundred terabytes of PG databases running on ZFS, all of them setting speed records for data warehouses.

We did testing on these settings last year on S10U2; perhaps things have changed since then.

- Luke

Msg is shrt cuz m on ma treo

-----Original Message-----
From: Dimitri [mailto:[EMAIL PROTECTED]
Sent: Monday, July 30, 2007 05:26 PM Eastern Standard Time
To: Luke Lonergan
Cc: Josh Berkus; pgsql-performance@postgresql.org; Marc Mamin
Subject: Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

Luke,

The ZFS tuning does not come from general suggestions, but from real practice... So:

- limiting the ARC is a MUST for the moment to keep your database running comfortably (especially DWH!)
- the 8K blocksize is chosen to read exactly one page when PG asks for one page; don't mix this up with prefetch! When prefetch is detected, ZFS will read the next blocks without any demand from PG, but otherwise why would you read more pages each time PG asks for only one?...
- prefetch is of course not needed for OLTP, but it helps on OLAP/DWH, agreed :)

Rgds,
-Dimitri

On 7/22/07, Luke Lonergan [EMAIL PROTECTED] wrote:

Josh,

On 7/20/07 4:26 PM, Josh Berkus [EMAIL PROTECTED] wrote:

> There are some specific tuning parameters you need for ZFS or performance is going to suck.
> http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide (scroll down to PostgreSQL)
> http://www.sun.com/servers/coolthreads/tnb/applications_postgresql.jsp
> http://bugs.opensolaris.org/view_bug.do?bug_id=6437054
> You also don't say anything about what kind of workload you're running.

I think we're assuming that the workload is OLTP when putting these tuning guidelines forward. Note that the ZFS tuning guidance referred to in this bug article recommends turning vdev prefetching off for random I/O (databases). This is exactly the opposite of what we should do for OLAP workloads. Also, the lore that setting recordsize on ZFS is mandatory for good database performance is similarly not appropriate for OLAP work.

If the workload is OLAP / data warehousing, I'd suggest ignoring all of the tuning information from Sun that refers generically to "databases". The untuned ZFS performance should be far better in those cases. Specifically, these three should be ignored:

- (ignore this) limit ARC memory use
- (ignore this) set recordsize to 8K
- (ignore this) turn off vdev prefetch

- Luke
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Hello,

Thank you for all your comments and recommendations.

I'm aware that the conditions for this benchmark are not ideal, mostly due to the lack of time to prepare it. We will also need an additional benchmark on a less powerful (more realistic) server to better understand the scalability of our application.

Our application is based on Java and generates dynamic reports from log file content. Dynamic means here that a report will be calculated from the Postgres data the first time it is requested (it will then be cached). Java is used to drive the data preparation and to handle/generate the report requests. For our performance concerns this is much more an OLAP system than an OLTP one.

Data preparation:

1) Parsing the log files, with heavy use of Perl (regular expressions), to generate CSV files. Prepared statements are also used to maintain reference tables in the DB. Postgres performance is not an issue for this first step.

2) Loading the CSV files with COPY. As around 70% of the data to load goes into a single daily table, we don't allow concurrent jobs for this step. We have between a few and a few hundred files to load into a single table; they are processed one after the other. A primary key is always defined. For the case where the required indexes are already built and the new data is above a given size, we use a shadow table instead (without the indexes), build the indexes after the import, and then replace the live table with the shadow one. For example, we have a table of 13 GB plus 11 GB of indexes (5 of them).

Performance:
a) Is there an ideal size to consider for our CSV files (100 x 10 MB, or better 1 x 1 GB)?
b) maintenance_work_mem: I'll use around 1 GB, as recommended by Stefan.

3) Data aggregation. This is the heaviest part for Postgres. On our current system some queries need above one hour, with phases of around 100% CPU use alternating with times of heavy I/O load when temporary results are written to and read from disk (pgsql_tmp). During the aggregation, other Postgres activity is low (at least it should be), as this takes place at night. Currently we have a locking mechanism to avoid having more than one such query running concurrently. This may be too strict for the benchmark server, but it better reflects our current hardware capabilities.

Performance: Here we should favor a single huge transaction and assume a low probability of another transaction requiring large sort space. Considering this, is it reasonable to set work_mem to 3 GB? (I guess I should raise this parameter dynamically before running the aggregation queries.)

4) Queries (report generation). We have only a few queries which are not satisfying; these require large sort operations. The data is structured in different aggregation levels (minutes, hours, days), with logical time-based partitions, in order to limit the data size to compute for a given report.

Moreover, we can scale our infrastructure by using different or dedicated Postgres servers for different customers. Smaller customers may share the same instance, each of them having its own schema. (The lock mechanism for large aggregations applies to a whole Postgres instance, not to a single customer.) The benchmark will help us to plan such a distribution.

During the benchmark, we will probably not have more than 50 non-idle connections simultaneously. It is a bit too early for us to fine-tune this part.
The benchmark will mainly focus on steps 1 to 3. During the benchmark, the DB will reach a size of about 400 GB, simulating 3 different customers, with the data split roughly equally across 3 schemas.

I will post our configuration(s) later on.

Thanks again for all your valuable input.

Marc Mamin
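On raising work_mem dynamically before the aggregation queries: work_mem can be set per session, so one reading of Marc's plan is to keep a modest value in postgresql.conf and raise it only inside the single nightly aggregation session. A sketch with a hypothetical aggregation statement; note Stefan's warning later in this thread that the value applies per sort operation, not per query:

-- inside the one nightly aggregation session only
-- (the '3GB' unit syntax needs 8.2 or later; equivalently 3145728, in kB)
SET work_mem = '3GB';
INSERT INTO agg_hourly
    SELECT date_trunc('hour', log_ts), count(*)
    FROM daily_log
    GROUP BY 1;
RESET work_mem;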
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Josh,

On 7/20/07 4:26 PM, Josh Berkus [EMAIL PROTECTED] wrote:

> There are some specific tuning parameters you need for ZFS or performance is going to suck.
> http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide (scroll down to PostgreSQL)
> http://www.sun.com/servers/coolthreads/tnb/applications_postgresql.jsp
> http://bugs.opensolaris.org/view_bug.do?bug_id=6437054
> You also don't say anything about what kind of workload you're running.

I think we're assuming that the workload is OLTP when putting these tuning guidelines forward. Note that the ZFS tuning guidance referred to in this bug article recommends turning vdev prefetching off for random I/O (databases). This is exactly the opposite of what we should do for OLAP workloads. Also, the lore that setting recordsize on ZFS is mandatory for good database performance is similarly not appropriate for OLAP work.

If the workload is OLAP / data warehousing, I'd suggest ignoring all of the tuning information from Sun that refers generically to "databases". The untuned ZFS performance should be far better in those cases. Specifically, these three should be ignored:

- (ignore this) limit ARC memory use
- (ignore this) set recordsize to 8K
- (ignore this) turn off vdev prefetch

- Luke
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Marc,

> Server Specifications:
> --
> Sun SPARC Enterprise M8000 Server:
> http://www.sun.com/servers/highend/m8000/specs.xml
> File system:
> http://en.wikipedia.org/wiki/ZFS

There are some specific tuning parameters you need for ZFS or performance is going to suck.

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide (scroll down to PostgreSQL)
http://www.sun.com/servers/coolthreads/tnb/applications_postgresql.jsp
http://bugs.opensolaris.org/view_bug.do?bug_id=6437054

You also don't say anything about what kind of workload you're running.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Having done something similar recently, I would recommend that you look at adding connection pooling, using pgBouncer in transaction-pooling mode, between your benchmark app and PgSQL. In our application we have about 2000 clients funneling down to 30 backends and are able to sustain a large transactions-per-second volume. This has been the #1 key to success for us in running on monster hardware.

Regards,
Gavin

On 7/17/07, Marc Mamin [EMAIL PROTECTED] wrote:

Postgres configuration for 64 CPUs, 128 GB RAM...

Hello,

We have the opportunity to benchmark our application on a large server. I have to prepare the Postgres configuration and I'd appreciate some comments on it, as I am not experienced with servers of such a scale. Moreover, the configuration should be fail-proof, as I won't be able to attend the tests.

Our application (Java + Perl) and Postgres will run on the same server; the application activity is low when Postgres has large transactions to process.

There is a large gap between our current production server (Linux, 4 GB RAM, 4 CPUs) and the benchmark server; one of the targets of this benchmark is to verify the scalability of our application.

And you have no reason to be envious, as the server doesn't belong to us :-)

Thanks for your comments,

Marc Mamin

Postgres version: 8.2.1

Server Specifications:
--
Sun SPARC Enterprise M8000 Server:
http://www.sun.com/servers/highend/m8000/specs.xml
File system:
http://en.wikipedia.org/wiki/ZFS

Planned configuration:

# we don't expect more than 150 parallel connections,
# but I suspect a leak in our application that lets some idle connections open
max_connections = 2000
ssl = off

# maximum allowed
shared_buffers = 262143

# on our current best production server with 4 GB RAM (not dedicated to Postgres), work_mem is set to 600 MB
# this limitation is probably the bottleneck for our application, as the files in pgsql_tmp grow up to 15 GB
# during large aggregations (we have a locking mechanism to avoid parallel processing of such transactions)
work_mem = 31457280  # (30 GB)

# index creation time is also an issue for us; the process locks other large processes too.
# our largest table so far is 13 GB + 11 GB indexes
maintenance_work_mem = 31457280  # (30 GB)

# more than the max number of tables + indexes expected during the benchmark
max_fsm_relations = 10
max_fsm_pages = 180

# don't know if I should modify this.
# seems to be sufficient on our production servers
max_stack_depth = 2MB

# vacuum will be done by hand between each test session
autovacuum = off

# required to analyse the benchmark
log_min_duration_statement = 1000

max_prepared_transactions = 100

# seems to be required to drop schemas/roles containing a large number of objects
max_locks_per_transaction = 128

# I use the defaults for the bgwriter as I couldn't find recommendations on those
#bgwriter_delay = 200ms           # 10-10000ms between rounds
#bgwriter_lru_percent = 1.0       # 0-100% of LRU buffers scanned/round
#bgwriter_lru_maxpages = 5        # 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333     # 0-100% of all buffers scanned/round
#bgwriter_all_maxpages = 5        # 0-1000 buffers max written/round

# WAL
fsync = on
# use default
#wal_sync_method

# we are using 32 on our production system
wal_buffers = 64

# we didn't do any testing with this parameter until now, but it shouldn't be a relevant
# point, as our performance focus is on large transactions
commit_delay = 0

# CHECKPOINT
# xlog will be on a separate disk
checkpoint_segments = 256
checkpoint_timeout = 5min
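A minimal pgbouncer.ini in the spirit of Gavin's 2000-clients-to-30-backends setup; all names, addresses, and sizes below are illustrative, not taken from his installation:

[databases]
bench = host=127.0.0.1 port=5432 dbname=bench

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 2000    ; clients funnel in here...
default_pool_size = 30    ; ...and share this many server connections

The application then connects to port 6432 instead of 5432. Transaction pooling means a server connection is held only for the duration of a transaction, which is what lets a few backends serve thousands of mostly idle clients; it is not suitable for applications that rely on session state such as session-level prepared statements.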
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Marc Mamin wrote:
> Postgres configuration for 64 CPUs, 128 GB RAM...

there are probably not that many installations out there that are that large - comments below

> Hello,
>
> We have the opportunity to benchmark our application on a large server. I have to prepare the Postgres configuration and I'd appreciate some comments on it, as I am not experienced with servers of such a scale. Moreover, the configuration should be fail-proof, as I won't be able to attend the tests.
>
> Our application (Java + Perl) and Postgres will run on the same server; the application activity is low when Postgres has large transactions to process.
>
> There is a large gap between our current production server (Linux, 4 GB RAM, 4 CPUs) and the benchmark server; one of the targets of this benchmark is to verify the scalability of our application.

[...]

> Postgres version: 8.2.1

upgrade to 8.2.4

> File system:
> http://en.wikipedia.org/wiki/ZFS

way more important is what kind of disk-I/O subsystem you have attached ...

> Planned configuration:
>
> # we don't expect more than 150 parallel connections,
> # but I suspect a leak in our application that lets some idle connections open
> max_connections = 2000
> ssl = off
>
> # maximum allowed
> shared_buffers = 262143

this is probably on the lower side for a 128 GB box

> # on our current best production server with 4 GB RAM (not dedicated to Postgres), work_mem is set to 600 MB
> # this limitation is probably the bottleneck for our application, as the files in pgsql_tmp grow up to 15 GB
> # during large aggregations (we have a locking mechanism to avoid parallel processing of such transactions)
> work_mem = 31457280  # (30 GB)

this is simply ridiculous - work_mem is PER SORT - so if your query requires 8 sorts it will feel free to use 8 x 30 GB, and it needs to be multiplied by the number of concurrent connections.

> # index creation time is also an issue for us; the process locks other large processes too.
> # our largest table so far is 13 GB + 11 GB indexes
> maintenance_work_mem = 31457280  # (30 GB)

this is ridiculous too - testing has shown that there is not much point in going beyond 1 GB or so

> # more than the max number of tables + indexes expected during the benchmark
> max_fsm_relations = 10
> max_fsm_pages = 180

this is probably way too low for a database the size of yours - watch the output of VACUUM VERBOSE on a database-wide vacuum for some stats on that.

> # don't know if I should modify this.
> # seems to be sufficient on our production servers
> max_stack_depth = 2MB
>
> # vacuum will be done by hand between each test session
> autovacuum = off
>
> # required to analyse the benchmark
> log_min_duration_statement = 1000
>
> max_prepared_transactions = 100
>
> # seems to be required to drop schemas/roles containing a large number of objects
> max_locks_per_transaction = 128
>
> # I use the defaults for the bgwriter as I couldn't find recommendations on those
> #bgwriter_delay = 200ms           # 10-10000ms between rounds
> #bgwriter_lru_percent = 1.0       # 0-100% of LRU buffers scanned/round
> #bgwriter_lru_maxpages = 5        # 0-1000 buffers max written/round
> #bgwriter_all_percent = 0.333     # 0-100% of all buffers scanned/round
> #bgwriter_all_maxpages = 5        # 0-1000 buffers max written/round
>
> # WAL
> fsync = on
> # use default
> #wal_sync_method
>
> # we are using 32 on our production system
> wal_buffers = 64

values up to 512 or so have been reported to help on systems with very high concurrency

what is missing here are your settings for effective_cache_size and random_page_cost

Stefan
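To act on Stefan's FSM advice: a database-wide VACUUM VERBOSE ends with a free-space-map summary that says directly whether max_fsm_pages and max_fsm_relations are large enough. A sketch of what to run and the kind of output to look for; the numbers below are invented and the exact wording varies a little between releases:

VACUUM VERBOSE;
-- the tail of the output looks roughly like:
--   INFO:  free space map contains 182000 pages in 420 relations
--   DETAIL:  ... 1650000 page slots are required to track all free space.
--            Current limits are: ... page slots, ... relations, ...
-- if the "required" figure exceeds the current max_fsm_pages limit, raise it.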
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
On Tue, Jul 17, 2007 at 04:10:30PM +0200, Marc Mamin wrote:
> shared_buffers = 262143

You should at least try some runs with this set far, far larger. At least 10% of memory, but it'd be nice to see what happens with this set to 50% or higher as well (though don't set it larger than the database, since that'd be a waste). How big is the database, anyway?

> # on our current best production server with 4 GB RAM (not dedicated to Postgres), work_mem is set to 600 MB
> # this limitation is probably the bottleneck for our application, as the files in pgsql_tmp grow up to 15 GB
> # during large aggregations (we have a locking mechanism to avoid parallel processing of such transactions)

Keep in mind that a good filesystem will be caching most of pgsql_tmp if it can.

> max_prepared_transactions = 100

Are you using 2PC? If not, there's no reason to touch this (you could just set it to 0).

> # I use the defaults for the bgwriter as I couldn't find recommendations on those
> #bgwriter_delay = 200ms           # 10-10000ms between rounds
> #bgwriter_lru_percent = 1.0       # 0-100% of LRU buffers scanned/round
> #bgwriter_lru_maxpages = 5        # 0-1000 buffers max written/round
> #bgwriter_all_percent = 0.333     # 0-100% of all buffers scanned/round
> #bgwriter_all_maxpages = 5        # 0-1000 buffers max written/round

You'll probably want to increase both maxpages parameters substantially, assuming that you've got good I/O hardware.

> # CHECKPOINT
> # xlog will be on a separate disk
> checkpoint_segments = 256
> checkpoint_timeout = 5min

The further apart your checkpoints, the better. You might want to look at 10 minutes. I'd also set checkpoint_warning to just a bit below checkpoint_timeout and watch for warnings, to make sure you're not checkpointing a lot more frequently than you're expecting.

--
Jim Nasby [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
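Jim's checkpoint suggestion expressed as postgresql.conf lines; the exact values are only an example of "warning just below timeout":

checkpoint_segments = 256
checkpoint_timeout = 10min
checkpoint_warning = 9min    # log a warning whenever checkpoints come faster than this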
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Marc Mamin [EMAIL PROTECTED] writes:
> We have the opportunity to benchmark our application on a large server. I have to prepare the Postgres configuration and I'd appreciate some comments on it, as I am not experienced with servers of such a scale. Moreover, the configuration should be fail-proof, as I won't be able to attend the tests.

I really think that's a recipe for disaster. Even on a regular machine you need to treat tuning as an ongoing feedback process. There's no such thing as a fail-proof configuration, since every application is different. On an exotic machine like this you're going to run into unique problems that nobody here can anticipate with certainty.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
On Tue, 17 Jul 2007, Marc Mamin wrote:
> Moreover, the configuration should be fail-proof, as I won't be able to attend the tests.

This is unreasonable. The idea that you'll get a magic perfect configuration in one shot suggests a fundamental misunderstanding of how work like this is done. If there's any way you could adjust things so that, say, you were allowed to give at least 4 different tuning setups and you got a report back with the results for each of them, that would let you design a much better test set.

> Postgres version: 8.2.1

This has already been mentioned, but it really is critical for your type of test to run 8.2.4 instead, so I wanted to emphasize it. There is a major scalability bug in 8.2.1.

I'm going to ignore the things that other people have already commented on (all the suggestions Stefan and Jim already made are good ones you should heed) and try to fill in the remaining gaps instead.

> # I use the defaults for the bgwriter as I couldn't find recommendations on those

The defaults are so small that it will barely do anything on a server of your size. Tuning it properly, so that it's effective but doesn't waste a lot of resources, is tricky, which is why you haven't found such recommendations--they're fairly specific to what you're doing and require some testing to get right. If you want to see an example from a big server, look at

http://www.spec.org/jAppServer2004/results/res2007q3/jAppServer2004-20070606-00065.html#DBDatabase_SW_Config0

That's tuned for a very specific benchmark, though. Here's a fairly generic set of parameters that would be much more aggressive than the defaults, while not going so far as to waste too many resources if the writer is just getting in the way on your server:

bgwriter_delay = 200ms
bgwriter_lru_percent = 3.0
bgwriter_lru_maxpages = 500
bgwriter_all_percent = 1.0
bgwriter_all_maxpages = 250

> # WAL
> fsync = on
> # use default
> #wal_sync_method

I'd expect wal_sync_method = open_datasync would outperform the default, but you'd really want to test both ways here to be sure. The fact that the Sun results I referenced above use the default of fdatasync makes me hesitate to recommend that change too strongly, as I haven't worked with this particular piece of hardware. See http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm for more information about this parameter.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
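The wal_sync_method comparison Greg suggests amounts to two otherwise-identical benchmark runs with one postgresql.conf line changed between them; a template, keeping in mind that the platform default and the behavior of each method vary by OS:

# run 1: the platform default (fdatasync in the Sun results above)
#wal_sync_method = fdatasync
# run 2: the alternative worth testing
wal_sync_method = open_datasync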
[PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Postgres configuration for 64 CPUs, 128 GB RAM...

Hello,

We have the opportunity to benchmark our application on a large server. I have to prepare the Postgres configuration and I'd appreciate some comments on it, as I am not experienced with servers of such a scale. Moreover, the configuration should be fail-proof, as I won't be able to attend the tests.

Our application (Java + Perl) and Postgres will run on the same server; the application activity is low when Postgres has large transactions to process.

There is a large gap between our current production server (Linux, 4 GB RAM, 4 CPUs) and the benchmark server; one of the targets of this benchmark is to verify the scalability of our application.

And you have no reason to be envious, as the server doesn't belong to us :-)

Thanks for your comments,

Marc Mamin

Postgres version: 8.2.1

Server Specifications:
--
Sun SPARC Enterprise M8000 Server:
http://www.sun.com/servers/highend/m8000/specs.xml
File system:
http://en.wikipedia.org/wiki/ZFS

Planned configuration:

# we don't expect more than 150 parallel connections,
# but I suspect a leak in our application that lets some idle connections open
max_connections = 2000
ssl = off

# maximum allowed
shared_buffers = 262143

# on our current best production server with 4 GB RAM (not dedicated to Postgres), work_mem is set to 600 MB
# this limitation is probably the bottleneck for our application, as the files in pgsql_tmp grow up to 15 GB
# during large aggregations (we have a locking mechanism to avoid parallel processing of such transactions)
work_mem = 31457280  # (30 GB)

# index creation time is also an issue for us; the process locks other large processes too.
# our largest table so far is 13 GB + 11 GB indexes
maintenance_work_mem = 31457280  # (30 GB)

# more than the max number of tables + indexes expected during the benchmark
max_fsm_relations = 10
max_fsm_pages = 180

# don't know if I should modify this.
# seems to be sufficient on our production servers
max_stack_depth = 2MB

# vacuum will be done by hand between each test session
autovacuum = off

# required to analyse the benchmark
log_min_duration_statement = 1000

max_prepared_transactions = 100

# seems to be required to drop schemas/roles containing a large number of objects
max_locks_per_transaction = 128

# I use the defaults for the bgwriter as I couldn't find recommendations on those
#bgwriter_delay = 200ms           # 10-10000ms between rounds
#bgwriter_lru_percent = 1.0       # 0-100% of LRU buffers scanned/round
#bgwriter_lru_maxpages = 5        # 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333     # 0-100% of all buffers scanned/round
#bgwriter_all_maxpages = 5        # 0-1000 buffers max written/round

# WAL
fsync = on
# use default
#wal_sync_method

# we are using 32 on our production system
wal_buffers = 64

# we didn't do any testing with this parameter until now, but it shouldn't be a relevant
# point, as our performance focus is on large transactions
commit_delay = 0

# CHECKPOINT
# xlog will be on a separate disk
checkpoint_segments = 256
checkpoint_timeout = 5min
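For anyone checking these numbers: in 8.2, shared_buffers given without a unit is counted in 8 kB pages, and work_mem/maintenance_work_mem in kB, so the planned values work out roughly as below, which is why the replies in this thread call shared_buffers low for this box and the 30 GB settings dangerous:

shared_buffers       = 262143 x 8 kB  ~  2 GB   (out of 128 GB RAM)
work_mem             = 31457280 kB    =  30 GB  (per sort, per connection)
maintenance_work_mem = 31457280 kB    =  30 GB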
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
On Tue, 17 Jul 2007, Marc Mamin wrote:
> We have the opportunity to benchmark our application on a large server. I have to prepare the Postgres configuration and I'd appreciate some comments on it, as I am not experienced with servers of such a scale. Moreover, the configuration should be fail-proof, as I won't be able to attend the tests.
>
> Our application (Java + Perl) and Postgres will run on the same server; the application activity is low when Postgres has large transactions to process.

Please, can you be more specific about your application:

- What does it do?
- What kind of workload does it generate? [i.e.: many concurrent small queries (website); few huge queries; reporting; warehousing; all of the above; something else?]
- What percentage and size of update queries?
- How many concurrent threads / connections / clients do you serve on a busy day? (I don't mean online users on a website, but ACTIVE concurrent database connections.)

I assume you find your current server's performance inadequate, or foresee that it will become too slow soon and want to upgrade, so:

- What makes the current server's performance inadequate? Is it I/O, CPU, RAM, or a mix? In which proportions?

This is very important. If you go to the dealer and ask for "a better vehicle", he'll sell you a Porsche. But if you say "I need a better vehicle to carry two tons of cinderblocks", he'll sell you something else, I guess. Same with database servers: you could need some humongous CPU power, but you might as well not. It depends.

> There is a large gap between our current production server (Linux, 4 GB RAM, 4 CPUs) and the benchmark server; one of the targets of this benchmark is to verify the scalability of our application.

Define "scalability". (No, this isn't a joke. I mean: you know your application; how would you like it to scale? How do you think it will scale? Why? What did you do so it would scale well? etc.)