Vivek Khera writes:
> Also, how close are you to the capacity of your disk bandwidth? I
> don't see that in your numbers. I know in freebsd I can run "systat
> -vmstat" and it gives me a percentage of utilization that lets me know
> when I'm near the capacity.
The vacuum totally consumes the system. It's in a constant "D". As
near as I can tell, it's hitting all blocks in the database.
The problem is interactive performance when vacuum is in a D state.
Even with just two processes doing "stuff" (vacuum and a select, let's
say), the select is very slow.
My understanding of the problem is that if a query hits the disk hard
(and many of my queries do) and vacuum is hitting the disk hard, they
contend for the same resource and nobody wins. The query optimizer
has lots of problems with my queries and ends up doing silly sorts.
As a simple example, one query goes like this:
select avg(f1) from t1 group by f2;
This results in a plan like:
Aggregate (cost=171672.95..180304.41 rows=115086 width=32)
-> Group (cost=171672.95..177427.26 rows=1150862 width=32)
-> Sort (cost=171672.95..174550.10 rows=1150862 width=32)
Sort Key: f2
-> Seq Scan on t1 (cost=0.00..39773.62 rows=1150862 width=32)
This is of course stupid, because it sorts a 1M rows, which probably
means it has to hit disk (sort_mem can only be so large). Turns out
there are only about 20 different values of f2, so it would be much
better to aggregate without sorting. This is the type of query which
runs while vacuum runs and I'm sure the two are just plain
incompatible. vacuum is read intensive and this query is write
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster