Re: [PERFORM] Recommended Initial Settings
Campbell, Lance wrote:
> I would like to get someone's recommendations on the best initial
> settings for a dedicated PostgreSQL server. I do realize that there
> are a lot of factors that influence how one should configure a
> database. I am just looking for a good starting point. Ideally I
> would like the database to reside as much as possible in memory with
> no disk access. The current database size of my 7.x version of
> PostgreSQL generates a 6 Gig file when doing a database dump.

Your operating-system should be doing the caching for you.

> Dedicated PostgreSQL 8.2 Server
> Redhat Linux 4.x AS 64 bit version (EM64T)
> 4 Intel Xeon Processors

If these are older Xeons, check the mailing list archives for "xeon
context switch".

> 20 Gig Memory
> Current PostgreSQL database is 6 Gig file when doing a database dump

OK, so it's plausible the whole thing will fit in RAM (as a
rule-of-thumb I assume headers, indexes etc. triple or quadruple the
size). To know better, check the actual disk usage of $PGDATA.

> /etc/sysctl.conf file settings:
>
> # 11 Gig
> kernel.shmmax = 11811160064

Hmm - that's a lot of shared RAM. See shared_buffers below.

> kernel.sem = 250 32000 100 128
> net.ipv4.ip_local_port_range = 1024 65000
> net.core.rmem_default = 262144
> net.core.rmem_max = 262144
> net.core.wmem_default = 262144
> net.core.wmem_max = 262144
>
> postgresql.conf file settings (if not listed then I used the
> defaults):
>
> max_connections = 300

How many connections do you expect typically/peak? It doesn't cost much
to have max_connections set high, but your workload is the most
important thing missing from your question.

> shared_buffers = 10240MB

For 7.x that's probably way too big, but 8.x organises its buffers
better. I'd still be tempted to start at 1 or 2GB and work up - see
where it stops buying you an improvement.

> work_mem = 10MB

If you have large queries doing big sorts, I'd increase this. Don't
forget it's per-sort, so if you have about 300 connections live at any
one time that could be 300*10MB*N if they're all doing something
complicated. If you only have one connection live, you can increase
this quite substantially.

> effective_cache_size = 512MB

This isn't setting PG's memory usage, it's telling PG how much data
your operating-system is caching. Check "free" and see what it says.
For you, I'd expect 10GB+.

> maintenance_work_mem = 100MB

This is for admin-related tasks, so you could probably increase it.

Workload, workload, workload - we need to know what you're doing with
it. One connection summarising the entire database will want larger
numbers than 100 connections running many small queries.

HTH
--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
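The advice above could be collected into a starting-point
postgresql.conf fragment along these lines. The numbers are illustrative
assumptions for a dedicated 20GB 8.2 box, not recommendations from the
thread - measure and tune from here:

```ini
# Hypothetical starting point distilled from the advice above.
max_connections = 300          # only if you really expect ~300 at peak
shared_buffers = 2048MB        # start at 1-2GB, increase until it stops helping
work_mem = 10MB                # per sort: worst case is roughly
                               # connections * work_mem * sorts-per-query,
                               # so raise this with care at 300 connections
maintenance_work_mem = 512MB   # admin tasks (VACUUM, CREATE INDEX)
effective_cache_size = 12GB    # roughly what "free" reports as cached,
                               # not memory PG allocates itself
```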
Re: [PERFORM] Recommended Initial Settings
Richard,
Thanks for your reply. You said:

"Your operating-system should be doing the caching for you."

My understanding is that as long as Linux has memory available it will
cache files. From your comment I get the impression that since Linux
would be caching the data files for the postgres database, it would be
redundant to have a large shared_buffers. Did I understand you
correctly?

Thanks,

Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu

-----Original Message-----
From: Richard Huxton [mailto:[EMAIL PROTECTED]]
Sent: Friday, February 23, 2007 10:29 AM
To: Campbell, Lance
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Recommended Initial Settings

[original message quoted in full - snipped]
Re: [PERFORM] Recommended Initial Settings
Campbell, Lance wrote:
> Richard,
> Thanks for your reply. You said:
>
> "Your operating-system should be doing the caching for you."
>
> My understanding is that as long as Linux has memory available it
> will cache files. From your comment I get the impression that since
> Linux would be caching the data files for the postgres database, it
> would be redundant to have a large shared_buffers. Did I understand
> you correctly?

That's right - PG works with the OS. This means it *might* not be a big
advantage to have a large shared_buffers. On older versions of PG, the
buffer management code wasn't great with large shared_buffers values
either.

--
Richard Huxton
Archonet Ltd
Re: [PERFORM] Recommended Initial Settings
If you're doing much updating at all you'll also want to bump up
checkpoint_segments. I like setting checkpoint_warning just a bit under
checkpoint_timeout as a way to monitor how often you're checkpointing
because you ran out of segments.

With a large shared_buffers you'll likely need to make the bgwriter
more aggressive as well (increase the max_pages numbers), though how
important that is depends on how much updating you're doing. If you see
periodic spikes in I/O corresponding to checkpoints, that's an
indication the bgwriter isn't doing a good enough job.

If everything ends up in memory, it might be good to decrease
random_page_cost to 1 or something close to it, though the database
should just rely on effective_cache_size to figure out that everything
is in memory.

If you're on pre-8.2, you'll want to cut all the autovacuum parameters
in half, if you're using it.
--
Jim Nasby                 [EMAIL PROTECTED]
EnterpriseDB              http://enterprisedb.com
512.569.9461 (cell)
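A sketch of these checkpoint/bgwriter suggestions as an 8.2-era
postgresql.conf fragment. The specific values are assumptions for
illustration, not tested recommendations from the thread:

```ini
# Hypothetical sketch for an update-heavy 8.2 server.
checkpoint_segments = 16        # default is 3; raise if updating heavily
checkpoint_timeout = 300        # seconds (the default)
checkpoint_warning = 290        # just under checkpoint_timeout, so any
                                # warning logged means a checkpoint was
                                # forced by running out of segments
bgwriter_lru_maxpages = 100     # make the bgwriter more aggressive with
bgwriter_all_maxpages = 30      # a large shared_buffers (both knobs
                                # exist only in pre-8.3 releases)
#random_page_cost = 1.5         # only if the database truly fits in RAM
```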
Re: [PERFORM] Recommended Initial Settings
In response to Campbell, Lance [EMAIL PROTECTED]:
> Richard,
> Thanks for your reply. You said:
>
> "Your operating-system should be doing the caching for you."
>
> My understanding is that as long as Linux has memory available it
> will cache files. From your comment I get the impression that since
> Linux would be caching the data files for the postgres database, it
> would be redundant to have a large shared_buffers. Did I understand
> you correctly?

Keep in mind that keeping the data in the kernel's buffer requires
Postgres to make a syscall to read a file, which the kernel then
realizes is cached in memory. The kernel then has to make that data
available to the Postgres (userland) process. If the data is in
Postgres' buffers, Postgres can fetch it directly, thus avoiding the
overhead of the syscalls and the kernel activity. You still have to
make SysV shared-memory calls, though.

So it depends on which is able to manage the memory better. Is the
kernel so much more efficient that it makes up for the overhead of the
syscalls? My understanding is that in recent versions of Postgres this
is not the case, and large shared_buffers improve performance. I've yet
to do any in-depth testing on this, though.

--
Bill Moran
Collaborative Fusion Inc.