There is a process in Oracle that essentially lets you do the equivalent of a CLUSTER in Postgres, but without locking the table, so updates can continue throughout. It requires a bit of manual setup and fiddling (at least in Oracle 10g), but it would probably scratch a lot of people's itches in this area. Of course, it's not trivial at all to implement :-(
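If memory serves, the machinery is Oracle's online table redefinition, driven by the DBMS_REDEFINITION package. A rough sketch of the 10g-style flow - the "app"/"orders" names are made up, and I'm leaving out the dependent-object (index/trigger/grant) copying step for brevity:

-- Check the table qualifies, build an interim table in the desired
-- layout, then copy/sync/swap while updates continue on app.orders.
BEGIN
  DBMS_REDEFINITION.CAN_REDEF_TABLE('APP', 'ORDERS');
END;
/

CREATE TABLE app.orders_interim AS
  SELECT * FROM app.orders WHERE 1 = 0;

BEGIN
  DBMS_REDEFINITION.START_REDEF_TABLE('APP', 'ORDERS', 'ORDERS_INTERIM');
  -- apply the changes made to app.orders since the copy started
  DBMS_REDEFINITION.SYNC_INTERIM_TABLE('APP', 'ORDERS', 'ORDERS_INTERIM');
  -- a short lock is taken only while the two tables swap names
  DBMS_REDEFINITION.FINISH_REDEF_TABLE('APP', 'ORDERS', 'ORDERS_INTERIM');
END;
/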
The Oracle equivalent of "too many dead rows" is "too many chained rows", and that's where I've seen it used.

Cheers

Dave

2011/2/3 Robert Haas <robertmh...@gmail.com>:
> 2011/1/30 Віталій Тимчишин <tiv...@gmail.com>:
> > I was thinking a table file could be deleted if it has not a single
> > live row, and that this could be done by vacuum. In that case, vacuum
> > on a table that was fully updated recently could be almost as good as
> > CLUSTER: any scan would skip such non-existent files really fast, and
> > almost no disk space would be wasted.
>
> VACUUM actually already does something along these lines. If there
> are one or more entirely-free pages at the end of a table, VACUUM
> will truncate them away. In the degenerate case where ALL pages are
> entirely free, this zeroes out the file.
>
> The problem is that this rarely accomplishes much. Consider a table
> with 1,000,000 pages, 50% of which contain live rows. On average, how
> many pages will this algorithm truncate away? Answer: if the pages
> containing live rows are randomly distributed, approximately one.
> (Proof: there is a 50% chance that the last page contains live rows.
> If so, we can't truncate anything. If not, we can truncate one page,
> and maybe more. The chance that the next page is also free is 499,999
> in 999,999, or roughly one half, so we have an almost-25% chance of
> being able to truncate at least two pages. And so on. So we expect to
> truncate roughly 1/2 + 1/4 + 1/8 + 1/16 + 1/32 + ... = 1 page.)
>
> Your idea of having a set of heaps rather than a single heap is an
> interesting one, but it's pretty much catering to the very specific
> case of a full-table update. I think the code changes needed would be
> far too invasive to seriously contemplate doing it just for that one
> case - although it is an important case that I would like to see us
> improve. Tom Lane previously objected to the idea of on-line table
> compaction on the grounds that people's apps might break if CTIDs
> changed under them, but I think a brawl between all the people who
> want on-line table compaction and all the people who want to avoid
> unexpected CTID changes would be pretty short. A bigger problem - or
> at least another problem - is that moving tuples this way is
> cumbersome and expensive: you have to move some tuples (inserting new
> index entries for them), vacuum away the old index entries (requiring
> a full scan of every index), and then repeat as many times as
> necessary to shrink the table. That is not exactly a smooth
> maintenance procedure, or one that can be done without significant
> disruption, but AFAIK nobody has come up with a better idea yet.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
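To spell out the arithmetic in Robert's proof sketch: it's just the tail-sum formula for expectations, E[X] = \sum_{k \ge 1} P(X \ge k), with each successive factor approximated by 1/2:

E[pages truncated] = \sum_{k \ge 1} P(last k pages are all free)
                   \approx \sum_{k \ge 1} (1/2)^k
                   = 1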
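The truncation behaviour itself is easy to watch; the table name "t" here is just for illustration:

-- build a throwaway heap
CREATE TABLE t (id int, filler text);
INSERT INTO t SELECT g, repeat('x', 500) FROM generate_series(1, 100000) g;
SELECT pg_relation_size('t');   -- several thousand pages

-- the degenerate all-free case: every page becomes reclaimable
DELETE FROM t;
VACUUM t;
SELECT pg_relation_size('t');   -- 0: the whole file was truncated away

-- delete only every other row instead and VACUUM truncates almost
-- nothing, because live rows remain scattered up to the last page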
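And for completeness, one pass of the move-and-vacuum loop Robert describes might look like the sketch below. The block cutoff 90000 is made up, range comparison on ctid needs a reasonably recent PostgreSQL, and a new row version is placed on the same page when there is room, so in practice you must keep re-updating until the tail tuples actually migrate:

-- force new row versions for the tuples in the tail pages; when they
-- move off their page, new index entries are created for them
UPDATE t SET id = id WHERE ctid >= '(90000,0)'::tid;

-- vacuum away the dead tail tuples and their old index entries (a
-- full scan of every index), then truncate the now-empty tail
VACUUM t;

-- repeat with a lower cutoff until the table stops shrinking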