>-----Original Message-----
>From: [EMAIL PROTECTED] On Behalf Of Gregory Stark
>Sent: Monday, 22 January 2007 19:41
>To: Bruce Momjian
>Cc: Heikki Linnakangas; Russell Smith; Darcy Buskermolen; Simon Riggs;
>Alvaro Herrera; Matthew T. O'Connor; Pavan Deolasee; Christopher Browne;
>pgsql-general@postgresql.org
>Subject: Re: [HACKERS] [GENERAL] Autovacuum Improvements
>
>"Bruce Momjian" <[EMAIL PROTECTED]> writes:
>> Yep, agreed on the random I/O issue.  The larger question is: if you
>> have a huge table, do you care to reclaim 3% of the table size, rather
>> than just vacuum it when it gets to 10% dirty?  I realize the vacuum
>> is going to take a lot of time, but vacuuming to reclaim 3% three
>> times seems like it is going to be more expensive than just vacuuming
>> the 10% once.  And vacuuming to reclaim 1% ten times seems even more
>> expensive.  The partial vacuum idea is starting to look like a loser
>> to me again.
>Well, the answer is of course "that depends".
>If you maintain the dead space at a steady state averaging 1.5%
>instead of 5%, your table is 3.33% smaller on average. If this is a
>DSS system, that will translate into running your queries 3.33%
>faster. It will take a lot of vacuums before they hurt more than a
>3%+ performance drop.
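(Checking the arithmetic on that 3.33%: if dead space is counted
relative to the live data, the table shrinks from 1.05x to 1.015x the
live size, and 1 - (1.015 / 1.05) ~ 3.33%, so the figure holds.)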

Good, this means a DSS system will mostly do table scans (right?). So
you should probably see the 'table scans' statistic, and the rows
fetched by them, approaching the end of the universe (at least
compared to index scans).
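
Something like the query below against the standard statistics views
should show that scan pattern (a sketch; it assumes the statistics
collector is enabled and uses the pg_stat_user_tables columns):

  -- Compare sequential-scan vs. index-scan activity per table
  SELECT relname,
         seq_scan,      -- sequential scans initiated on the table
         seq_tup_read,  -- live rows read by those sequential scans
         idx_scan,      -- index scans initiated on the table
         idx_tup_fetch  -- live rows fetched by index scans
  FROM pg_stat_user_tables
  ORDER BY seq_scan DESC;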

>If it's an OLTP system then it's harder to figure. A 3.33% increase
>in data density will translate to a higher cache hit rate, but how
>much higher depends on a lot of factors. In our experiments we
>actually got a bigger boost in these kinds of situations than I
>expected (I expected something comparable to the 3.33% improvement).
>So it could be even more than 3.33%. But like I said, it depends.
>If you already have the whole database cached you won't see any
>improvement. If you are right on the cusp you could see a huge
>benefit.

These tables have high insert, update and delete rates, and probably a
lot of index scans? I believe the table-scan workload should be (close
to) none.
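
For what it's worth, the churn that generates the dead tuples can be
watched from the same view (again a sketch against
pg_stat_user_tables):

  -- Updates and deletes are what leave dead tuples behind for vacuum
  SELECT relname,
         n_tup_ins,  -- rows inserted
         n_tup_upd,  -- rows updated
         n_tup_del   -- rows deleted
  FROM pg_stat_user_tables
  ORDER BY n_tup_upd + n_tup_del DESC;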

Are you willing to share some of this measured data? I'm quite
interested in such figures.

>It sounds like you're underestimating the performance drain that 10%
>wasted space has. If we found out that one routine was unnecessarily
>taking 10% of the CPU time, it would be an obvious focus of
>attention. 10% wasted space is going to work out to about 10% of the
>I/O time.
>It also sounds like we're still focused on the performance impact in
>absolute terms. I'm much more interested in changing the performance
>characteristics so they're predictable and scalable. It doesn't
>matter much if your 1kB table is 100% slower than necessary, but it
>does matter if your 1TB table needs 1,000x as much vacuuming as your
>1GB table even if it's getting the same update traffic.

Or rather, the vacuuming should pay for itself.
A nice metric might be: cost_of_not_vacuuming / cost_of_vacuuming.
Obviously, the higher the better.
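
To make that a bit more concrete (a rough sketch with hypothetical
quantities, not measured numbers):

  cost_of_not_vacuuming ~ extra_pages_read_per_scan * scans_between_vacuums
  cost_of_vacuuming     ~ pages_read_and_written_by_one_vacuum_pass

At 10% dead space every scan reads roughly 10% more pages, so on a
heavily scanned table the numerator grows quickly and the ratio ends
up well above 1; on a rarely scanned table it may never get there.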

- Joris Dobbelsteen
