My name is Alex Turner and I work for a small tech company in Pottstown, PA. We run
PostgreSQL on a number of systems for a variety of applications, and it has been a
joy to deal with all around, working fast and reliably for over two years.
We recently upgraded from Red Hat 7.2 to Red Hat 9.0, and we are running PostgreSQL 7.3.2
on our ProLiant ML370 (RAID 1 2x18GB 10k, RAID 5 3x36GB 10k, 2x866MHz PIII, 2GB RAM).
We seem to have had a serious performance drop after the upgrade. The database holds
property listings and is updated on a daily basis, and when I say updated I mean that I
insert/update the whole data download, because the data provider doesn't tell us what
changed, just gives us a complete dump. The integrity of the dump isn't great, so I
can't process it as a COPY or as one block transaction, because some of the data is often
bad. Each and every row is a separate insert or update.
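For reference, since we're on 7.3 (no ON CONFLICT in that era), the per-row logic is essentially an UPDATE followed by an INSERT when nothing matched. Table and column names below are simplified illustrations, not our real schema:

```sql
-- Per-row upsert sketch (hypothetical names).  The application runs
-- the UPDATE first and checks the affected-row count; if it is 0,
-- the row is new and gets INSERTed instead.
UPDATE properties
   SET price = 250000, bedrooms = 3
 WHERE property_id = 'MLS-123';

-- only if the UPDATE touched 0 rows:
INSERT INTO properties (property_id, price, bedrooms)
VALUES ('MLS-123', 250000, 3);
```

Each statement runs as its own transaction, so one bad row fails on its own without aborting the rest of the load.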
Data insert performance used to degrade in a linear fashion as time progressed, I'm
guessing as the transaction logs filled up. About once every six weeks I would dump
the database, destroy and recreate the db, and reload the dump. This 'reset' the whole
thing and brought insert/vacuum times back down. Since the upgrade, performance has
degraded very rapidly over the first week, and then more slowly later, but enough that
we now have to reload the db every 2-3 weeks. The insert procedure fires a stored
procedure that updates a timestamp on the record so that we can figure out which
records have been touched and which have not, and so determine which properties have
been removed from the feed because their records were not touched in the last run.
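For reference, the trigger is along these lines (simplified; our real function does a bit more, and the names here are illustrative):

```sql
-- Hypothetical version of the timestamp trigger (7.3 syntax, quoted
-- function body).  Every insert/update stamps last_seen, so rows whose
-- last_seen predates the latest load are the properties that dropped
-- out of the feed.
CREATE FUNCTION touch_last_seen() RETURNS TRIGGER AS '
BEGIN
    NEW.last_seen := now();
    RETURN NEW;
END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER property_touch
    BEFORE INSERT OR UPDATE ON properties
    FOR EACH ROW EXECUTE PROCEDURE touch_last_seen();
```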
I have noticed that whilst inserts seem to be slower than before, the vacuum full
doesn't seem to take as long overall.
postgresql.conf is pretty virgin, and we run postmaster with -B512 -N256 -i.
/var/lib/pgsql/data is a symlink to /eda/data, /eda being the mount point for the RAID array.
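For what it's worth, my understanding is that those flags map onto postgresql.conf as below, and these are the kinds of starting values I've been considering for a 2GB box. They are guesses, not tested recommendations:

```
# equivalents of -B512 -N256 -i
shared_buffers = 512           # 8kB pages (= 4MB) -- probably too low
max_connections = 256
tcpip_socket = true

# candidate 7.3-era tuning (untested starting points)
sort_mem = 8192                # kB per sort/merge step
vacuum_mem = 32768             # kB, should speed up vacuum
effective_cache_size = 100000  # 8kB pages, ~800MB of OS cache
checkpoint_segments = 8        # fewer checkpoint pauses during the load
```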
The database isn't huge, storing about 30,000 properties, and the largest table is 2.1
million rows, for property features. The dump file is only 221MB. Alas, I did not
design the schema, but I have made several 'tweaks' to it that greatly improved read
performance, allowing us to be the fastest provider in the tri-state area.
Unfortunately the job starts at 01:05 (that's the earliest the dump is available) and
runs until completion, finishing with a vacuum full. The vacuum full locks areas of
the database long enough that our service is temporarily down. At the worst point,
the vacuum full was starting after 09:00, which our customers didn't appreciate.
I'm wondering if there is anything I can do with postgres to alleviate this problem.
Either upgrading to 7.3.4 (although I tried 7.3.3 for another app, and we had to roll
back to 7.3.2 because of performance problems), or working with the postgresql.conf to
enhance performance. I really don't want to roll back the OS version if possible, but
I'm not ruling it out at this point, as that seems to be the biggest thing that has
changed. All the drive lights are showing green, so I don't believe the array is
running in degraded mode. I keep logs of all the insert jobs, and plotting average
insert times on a graph revealed that this started at the time of the upgrade.
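One thing I'm considering is replacing the nightly vacuum full with plain vacuums, which as I understand it reclaim dead space for reuse without taking the exclusive lock, possibly along with raising max_fsm_pages so the free-space map can keep track of it all:

```sql
-- Sketch: plain VACUUM ANALYZE instead of VACUUM FULL at the end of
-- the load.  It doesn't shrink the files on disk, but it marks dead
-- tuples reusable and refreshes planner statistics without locking
-- out readers.  Table names are illustrative.
VACUUM ANALYZE properties;
VACUUM ANALYZE property_features;
```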
Any help/suggestions would be greatly appreciated,
P.S. Sorry this is so long, but I wanted to include as much info as possible.