I've been having some performance issues with a DB I use. I'm trying to come up with some performance recommendations to send to the "adminstrator".
Hardware: CPU0: Pentium III (Coppermine) 1000MHz (256k cache) CPU1: Pentium III (Coppermine) 1000MHz (256k cache) Memory: 3863468 kB (4 GB) OS: Red Hat Linux release 7.2 (Enigma) Kernel: 2.4.9-31smp I/O I believe is a 3-disk raid 5.
/proc/sys/kernel/shmmax and /proc/sys/kernel/shmall were set to 2G
Postgres version: 7.3.4
I know its a bit dated, and upgrades are planned, but several months out. Load average seems to hover between 1.0 and 5.0-ish during peak hours. CPU seems to be the limiting factor but I'm not positive (cpu utilization seems to be 40-50%). We have 2 of those set up as the back end to 3 web-servers each... supposedly load-balanced, but one of the 2 dbs consistently has higher load. We have a home-grown replication system that keeps them in sync with each other... peer to peer (master/master).
The DB schema is, well to put it nicely... not exactly normalized. No constraints to speak of except for the requisite not-nulls on the primary keys (many of which are compound). Keys are mostly varchar(256) fields.
Ok for what I'm uncertain of...
According to http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
Its more of a staging area and more isn't necessarily better. That psql relies on the OS to cache data for later use.
But according to http://www.ca.postgresql.org/docs/momjian/hw_performance/node3.html its where psql caches previous data for queries because the OS cache is slower, and should be as big as possible without causing swap.
Those seem to be conflicting statements. In our case, the "administrator" kept increasing this until performance seemed to increase, which means its now 250000 (x 8k is 2G).
Is this just a staging area for data waiting to move to the OS cache, or is this really the area that psql caches its data?
Again, according to the Varlena guide this tells psql how much system memory is available for it to do its work in.
until recently, this was set at the default value of 1000. It was just recently increased to 180000 (1.5G)
according to http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html it should be about 25% of memory?
Finally sort_mem: Was until recently left at the default of 1000. Is now 16000.
Increasing the effective cache and sort mem didn't seem to make much of a difference. I'm guessing the eff cache was probably raised a bit too much, and shared_buffers is way to high.
What can I do to help determine what the proper settings should be and/or look at other possible choke points. What should I look for in iostat, mpstat, or vmstat as red flags that cpu, memory, or i/o bound?
DB maintenance wise, I don't believe they were running vacuum full until I told them a few months ago that regular vacuum analyze no longer cleans out dead tuples. Now normal vac is run daily, vac full weekly (supposedly). How can I tell from the output of vacuum if the vac fulls aren't being done, or not done often enough? Or from the system tables, what can I read?
Is there anywhere else I can look for possible clues? I have access to the DB super-user, but not the system root/user.
Thank you for your time. Please let me know any help or suggestions you may have. Unfortunately upgrading postgres, OS, kernel, or re-writing schema is most likely not an option.
---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives?