>>>>> "AL" == Allen Landsidel <[EMAIL PROTECTED]> writes:
AL> I recently built a rather powerful machine to be used in a heavily
AL> accessed database.. the machine is a dual AthlonMP 2800+, 2GB of
AL> PC2100 ECC, and a 4x18GB RAID-0 using 15k rpm fujitsu MAS drives on a
AL> 4ch u160 ICP-Vortex card with 256MB of cache.
The only recommendation I'd make is to switch from RAID0 to RAID10, unless you can afford the downtime (and loss of data) when one of your drives takes a vacation.
Also, is your RAID card's cache battery backed up? If not, you lose the ability to use write-back caching, and that costs *dearly* in performance.
I'm planning to move it to -10 or -5 (or even -50) once we have more money to spend on drives. As it is right now though, I couldn't spare the space.. The box this was moved from was a 2x1000 P3 with a single u160 drive.. Battery backup is something I really should have gotten on the memory, but I spaced out when placing the order; it'll come in the future.
I'm kind of "living on the edge" here with regard to no BBU on the RAID and using RAID-0, I know.. but it's for a short time, and I don't think in the scheme of things this is any more failure-prone than the crummy setup it was on before. Back up and back up often, I know that mantra very well and live by it. :)
AL> The box runs FreeBSD, tracking RELENG_4 (-STABLE) and PostGreSQL 7.3.4
AL> from ports (7.3.4_1)
An excellent choice. :-)
I recognize you from those lists.. didn't notice the Ph.D. before though.. but yes, I'm a huge FreeBSD fan.. I didn't need anyone to talk me into that particular choice. ;)
AL> I run a 'vacuum analyze verbose' on the database in question every
AL> hour, and a reindex on every table in the database every six hours,
AL> 'vacuum full' is run manually as required perhaps anywhere from once a
AL> week to once a month. I realize the analyze may not be running often
AL> enough and the reindex more often than need be, but I don't think
AL> these are adversely affecting performance very much; degradation over
AL> time does not appear to be an issue.
Personally, I don't think you need to reindex that much. And I don't think you need to vacuum full *ever* if you vacuum often like you do. Perhaps reducing the vacuum frequency may let you reach a steady state of disk usage?
Well I had the vacuums running every 15 minutes for a while.. via a simple cron script I wrote just to make sure no more than one vacuum ran at once, and to 'nice' the job.. but performance on the db does seem to suffer a bit during vacuums. The performance doesn't degrade noticeably after only an hour without a vacuum though, so I'd like to make the state of degraded performance more periodic -- not the general rule during 24/7 operation.
I'll monkey around more with running the vacuum more often and see if the performance hit was more imagined than real.
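For what it's worth, the locking part of a wrapper like that can be sketched roughly as below; the lock path, database name, vacuum command, and nice level are all assumptions for illustration, not details from the original script.

```shell
#!/bin/sh
# Hypothetical sketch of a cron wrapper that runs at most one niced VACUUM
# at a time. LOCKDIR, VACUUM_CMD, and the nice level are illustrative.
run_vacuum() {
    lockdir="${LOCKDIR:-/tmp/pg_vacuum.lock}"
    # mkdir is atomic, so two overlapping cron runs can't both take the lock
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "skipped"          # a vacuum is already in progress
        return 0
    fi
    # run the job at reduced scheduling priority, then release the lock
    nice -n 10 sh -c "${VACUUM_CMD:-psql -d mydb -c 'VACUUM ANALYZE;'}"
    rmdir "$lockdir"
    echo "done"
}
```

Cron would then call run_vacuum every 15 minutes; an overlapping run simply exits instead of piling a second vacuum on top of the first.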
Depending on how many concurrent actions you process, perhaps you can use a temporary table for each, so you don't have to delete many rows when you're done.
I'd love to but unfortunately the daemons that use the database are a mess, more or less 'unsupported' at this point.. thankfully they're being replaced along with a lot of performance-hurting SQL.
On my busy tables, I vacuum every 6 hours. The vacuum analyze is run on the entire DB nightly. Every month or so, I reindex the most often updated tables that show index bloat. Watch for bloat by monitoring the size of your indexes:
SELECT relname,relpages FROM pg_class WHERE relname LIKE 'some_table%' ORDER BY relname;
Thanks for that tidbit.. maybe I'll cron something else to grab the values once a day or so and archive them in another table for history.. make my life easier. ;)
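A cron job along these lines would do it; the history table and its columns here are invented for illustration, not anything from an existing schema.

```sql
-- Hypothetical history table for daily relpages samples.
CREATE TABLE relpages_history (
    sampled_on  date    DEFAULT current_date,
    relname     name,
    relpages    integer
);

-- Run once a day from cron (e.g. via psql -c) to archive the current sizes.
INSERT INTO relpages_history (relname, relpages)
SELECT relname, relpages
FROM pg_class
WHERE relname LIKE 'some_table%';
```

A plain SELECT ordered by sampled_on then shows the growth trend over time.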
AL> Related kernel configuration options:
AL> ...
AL> cpu             I686_CPU
AL> maxusers        256
let the system autoconfigure maxusers...
Are you sure about this? I have always understood that explicitly setting this value was the best thing to do if you knew the maximum number of users you would encounter, as the kernel doesn't have to 'guess' at structure sizes and the like, or grow them later..
AL> ...
AL> options         MAXDSIZ="(1024UL*1024*1024)"
AL> options         MAXSSIZ="(512UL*1024*1024)"
AL> options         DFLDSIZ="(512UL*1024*1024)"
above are ok at defaults.
These are related to something else.. a Linux developer on the system is used to the way Linux will always grant you access to all the memory on a machine and just kill a random process to free memory if you allocate more than is free.. ;)
He didn't know processes were getting killed, but the defaults turned out to be not high enough. This will get turned back down to default once he's done migrating everything into the new database and his app no longer needs to run there. I just mentioned them in case they could adversely affect performance as-is.
AL> options SHMMAXPGS=65536
perhaps bump this and increase your shared buffers. I find that if you do lots of writes, having a few more shared buffers helps.
Any ideas how much of a bump, or does that depend entirely on me and I should just play with it? Would doubling it be too much of a bump?
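As a rough illustration of what doubling it might look like (the exact figures below are guesses for the sake of example, not tested recommendations): SHMMAXPGS of 131072 raises SHMMAX to 512MB, leaving room for, say, 16384 shared buffers in postgresql.conf.

```
# kernel config -- double the original value (illustrative, not a tested figure)
options         SHMMAXPGS=131072

# postgresql.conf -- 16384 buffers x 8KB = 128MB, comfortably under SHMMAX
shared_buffers = 16384
```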
AL> options SHMMAX="(SHMMAXPGS*PAGE_SIZE+1)"
you don't need to explicitly set this... it is automatically set based on the above setting.
I'm an explicit kind of guy. ;)
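For anyone following along, here is what that line actually expands to, assuming the i386 PAGE_SIZE of 4096 bytes:

```shell
# SHMMAX as the kernel config computes it; the "+1" mirrors the config line.
SHMMAXPGS=65536
PAGE_SIZE=4096
echo $((SHMMAXPGS * PAGE_SIZE + 1))   # 268435457 bytes, just over 256 MB
```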
AL> relevant postgresql.conf options:
AL> max_fsm_pages = 2000000
this may be overkill. I currently run with 1000000
At only 6 bytes each I thought 12M wasn't too much to spare for the sake of making sure there is enough room there for everything.. I am watching my file sizes and vacuum numbers to try and tune this value but it's an arduous process.
AL> effective_cache_size = 49152 # 384MB, this could probably be higher
the current recommendation for freebsd is to set this to:
`sysctl -n vfs.hibufspace` / 8192
where 8192 is the blocksize used by postgres.
That comes out as 25520.. I have it at 384MB because I wanted to take the 256MB on the RAID controller into account as well.
I'm not entirely certain how much of that 256MB is available, and for what kind of cache.. I know the i960 based controllers all need to set aside at least 16MB for their "OS" and it isn't used for cache, not sure about ARM based cards like the ICP.. but I don't think assuming 128MB is too much of a stretch, or even 192MB.
You may also want to increase the max buffer space used by FreeBSD, which apparently is capped at 200M (I think) by default. I'll have to look up how to bump that, as most likely you have plenty of RAM sitting around unused. What does "top" say about that when you're busy?
Yes that hibufspace value comes out to 200MB.. (199.375 really, odd)
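Plugging the numbers from this box back into the recommended formula (the hibufspace value below is the one reported in this thread, not a live sysctl reading):

```shell
# effective_cache_size = vfs.hibufspace / postgres block size
hibufspace=209059840   # bytes; what `sysctl -n vfs.hibufspace` reports here
blocksize=8192         # block size used by postgres
echo $((hibufspace / blocksize))     # 25520 pages
echo $((hibufspace / 1024 / 1024))   # 199 MB (199.375 exactly)
```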
top usually shows me running with that same value.. 199MB.. and most of the time, with maybe 1.2GB free in the Inact area..
I'll see if sysctl lets me write this value, or if it's a kernel config option I missed, unless you have remembered between then and now. I'd really like to have this higher, say around 512MB.. more if I can spare it after watching for a bit.
Given this and the above about the controller's onboard cache (not to mention the per-drive cache), do you think I'll still need to lower effective_cache_size?