>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of Gregory Stark
>Sent: Monday, 22 January 2007 19:41
>To: Bruce Momjian
>Cc: Heikki Linnakangas; Russell Smith; Darcy Buskermolen;
>Simon Riggs; Alvaro Herrera; Matthew T. O'Connor; Pavan
>Deolasee; Christopher Browne; email@example.com;
>Subject: Re: [HACKERS] [GENERAL] Autovacuum Improvements
>"Bruce Momjian" <[EMAIL PROTECTED]> writes:
>> Yep, agreed on the random I/O issue. The larger question is if you
>> have a huge table, do you care to reclaim 3% of the table rather
>> than just vacuum it when it gets to 10% dirty? I realize the vacuum
>> is going to take a lot of time, but vacuuming to reclaim 3%
>> seems like it is going to be more expensive than just vacuuming the
>> 10% once. And vacuuming to reclaim 1% ten times seems even more
>> expensive. The partial vacuum idea is starting to look like
>> a loser to me again.
>Well the answer is of course "that depends".
>If you maintain the dead space at a steady state averaging
>1.5% instead of 5% your table is 3.33% smaller on average. If
>this is a DSS system that will translate into running your
>queries 3.33% faster. It will take a lot of vacuums before
>they hurt more than a 3%+ performance drop.
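Just to make the arithmetic above concrete, here is a small sketch (my own model, not from the thread) of how much smaller a table is when steady-state dead space drops from an average of 5% to 1.5%. The model assumes dead space is measured as a fraction of total table size, so total size is live_bytes / (1 - dead_fraction); the exact percentage depends on how you define the fractions, but it lands in the same ballpark as the 3.33% quoted above:

```python
# Toy model: table size as a function of steady-state dead-space fraction.
# All numbers are hypothetical, for illustration only.

def table_size(live_bytes, dead_fraction):
    """Total on-disk size when dead_fraction of the table is dead space."""
    return live_bytes / (1.0 - dead_fraction)

live = 1_000_000_000  # 1 GB of live tuples (hypothetical)

size_at_5pct = table_size(live, 0.05)     # vacuum when 5% dead on average
size_at_1_5pct = table_size(live, 0.015)  # keep it at 1.5% dead on average

shrink = 1.0 - size_at_1_5pct / size_at_5pct
print(f"table is {shrink:.2%} smaller")   # ~3.6%, close to the quoted 3.33%
```

For a pure table-scan workload, scan time is roughly proportional to table size, so the same percentage carries straight through to query time.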
Good, this means a DSS system will mostly do table scans (right?). So
you should probably watch the 'table scans' statistic and see rows
fetched approaching the end of the universe (at least compared to index
scans).
>If it's an OLTP system then it's harder to figure. A 3.33%
>increase in data density will translate to a higher cache hit
>rate, but how much higher depends on a lot of factors. In our
>experiments we actually got a bigger boost in these kinds of
>situations than I expected (I expected something comparable to
>the 3.33% improvement). So it could be even more than 3.33%.
>But like I said, it depends.
>If you already have the whole database cached you won't see any
>improvement. If you are right on the cusp you could see a huge benefit.
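The "cusp" effect can be sketched with a deliberately crude model (mine, not from the thread): assume uniformly random page access, so the cache hit rate is min(1, cache_size / data_size). Then a ~3.33% denser table can take the miss rate from a few percent all the way to zero:

```python
# Toy cache model: uniformly random access over the working set.
# cache and working-set sizes below are hypothetical.

def miss_rate(data_gb, cache_gb):
    """Fraction of reads that miss a cache of cache_gb over data_gb of data."""
    return max(0.0, 1.0 - cache_gb / data_gb)

cache = 10.0                        # GB of cache (hypothetical)
bloated = 10.3                      # working set with more dead space
compact = bloated * (1 - 0.0333)    # ~3.33% smaller after denser vacuuming

print(f"bloated miss rate: {miss_rate(bloated, cache):.1%}")  # ~2.9%
print(f"compact miss rate: {miss_rate(compact, cache):.1%}")  # 0.0% -- it fits
```

Going from ~3% of reads hitting disk to none of them is far more than a 3.33% improvement, which matches Greg's point that the benefit near the cusp can be disproportionate.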
These tables have high insert, update and delete rates, and probably a
lot of index scans? I believe the workload from table scans should be
close to zero.
Are you willing to share some of this measured data? I'm quite
interested in such figures.
>It sounds like you're underestimating the performance drain
>10% wasted space has. If we found out that one routine was
>unnecessarily taking 10% of the cpu time it would be an
>obvious focus of attention. 10% wasted space is going to work
>out to about 10% of the i/o time.
>It also sounds like we're still focused on the performance
>impact in absolute terms. I'm much more interested in changing
>the performance characteristics so they're predictable and
>scalable. It doesn't matter much if your 1kb table is 100%
>slower than necessary but it does matter if your 1TB table
>needs 1,000x as much vacuuming as your 1GB table even if it's
>getting the same update traffic.
Or rather, the vacuuming should pay for itself.
A nice metric might be: cost_of_not_vacuuming / cost_of_vacuuming.
Obviously, the higher the better.
- Joris Dobbelsteen
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?