I recently built a rather powerful machine to be used in a heavily accessed database.. the machine is a dual AthlonMP 2800+, 2GB of PC2100 ECC, and a 4x18GB RAID-0 using 15k rpm fujitsu MAS drives on a 4ch u160 ICP-Vortex card with 256MB of cache.
The box runs FreeBSD, tracking RELENG_4 (-STABLE) and PostGreSQL 7.3.4 from ports (7.3.4_1)
There are a few databases running on the machine, but for now, the one that is the most performance sensitive is also arguably the worst designed. The access pattern on a day to day basis looks basically like this:
1. ~75k rows aggregate are inserted into two different tables, 70/30 split between two tables. The 70% going to the smaller table (containing just two integers) and the 30% going into a larger table containing a rather largish (~4KB) text field and more integer types; no searching of any kind is done on this text field, it appears in no where clauses, and is not indexed.
2. As these rows are inserted, other processes see them and for each row:
a. A new row containing just one field is inserted, that row being an FK into the 30% table mentioned above.
b. A row in a 3rd table is updated; this table never gets deleted from, and rarely sees inserts, it's just a status table, but it has nearly a million rows. The updated row is an integer.
c. The 30% table itself is updated.
3. When these processes finish their processing, the rows in both the 70/30 tables and the table from 2a are deleted; The 2b table has a row again updated.
There is only one process that does all the inserting, from a web backend. Steps 2 and 3 are done by several other backend processes on different machines, "fighting" to pick up the newly inserted rows and process them. Not the most efficient design, but modifying the current code isn't an option; rest assured that this is being redesigned and new code is being written, but the developer who wrote the original left us with his spaghetti-python mess and no longer works for us.
I run a 'vacuum analyze verbose' on the database in question every hour, and a reindex on every table in the database every six hours, 'vacuum full' is run manually as required perhaps anywhere from once a week to once a month. I realize the analyze may not be running often enough and the reindex more often than need be, but I don't think these are adversely affecting performance very much; degredation over time does not appear to be an issue.
So on with the question. Given the above machine with the above database and access pattern, I've configured the system with the following options. I'm just wondering what some of you more experierenced pg tuners have to say. I can provide more information such as ipcs, vmstat, iostat, etc output on request but I figure this message is getting long enough already..
Thanks for any input. Kernel and postgres information follows.
Related kernel configuration options:
... cpu I686_CPU maxusers 256 ... options MAXDSIZ="(1024UL*1024*1024)" options MAXSSIZ="(512UL*1024*1024)" options DFLDSIZ="(512UL*1024*1024)" ... options SYSVSHM #SYSV-style shared memory options SYSVMSG #SYSV-style message queues options SYSVSEM #SYSV-style semaphores options SHMMAXPGS=65536 options SHMMAX="(SHMMAXPGS*PAGE_SIZE+1)" options SHMSEG=256 options SEMMNI=384 options SEMMNS=768 options SEMMNU=384 options SEMMAP=384 ...
relevant postgresql.conf options:
max_connections = 128 shared_buffers = 20000 max_fsm_relations = 10000 max_fsm_pages = 2000000 max_locks_per_transaction = 64 wal_buffers = 128 sort_mem = 262144 # we have some large queries running at times vacuum_mem = 131072 checkpoint_segments = 16 checkpoint_timeout = 300 commit_delay = 1000 commit_siblings = 32 fsync = true wal_fsync_method = fsync effective_cache_size = 49152 # 384MB, this could probably be higher random_page_cost = 1.7 cpu_tuble_cost = 0.005 cpu_index_tuple_cost = 0.0005 cpu_operator_cost = 0.0012 geqo_threshold = 20 stats_start_collector = true stats_reset_on_server_start = off stats_command_string = true stats_row_level = true stats_block_level = true
---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend