03.02.11 20:42, Robert Haas написав(ла):
2011/1/30 Віталій Тимчишин<tiv...@gmail.com>:
I was thinking if a table file could be deleted if it has no single live
row. And if this could be done by vacuum. In this case vacuum on table that
was fully updated recently could be almost as good as cluster - any scan
would skip such non-existing files really fast. Also almost no disk space
would be wasted.
VACUUM actually already does something along these lines.  If there
are 1 or any larger number of entirely-free pages at the end of a
table, VACUUM will truncate them away.  In the degenerate case where
ALL pages are entirely-free, this results in zeroing out the file.

The problem with this is that it rarely does much.  Consider a table
with 1,000,000 pages, 50% of which contain live rows.  On average, how
many pages will this algorithm truncate away?  Answer: if the pages
containing live rows are randomly distributed, approximately one.
Yes, but take into account operations on a (by different reasons) clustered tables, like removing archived data (yes I know, this is best done with partitioning, but one must still go to a point when he will decide to use partitioning :) ).
Your idea of having a set of heaps rather than a single heap is an
interesting one, but it's pretty much catering to the very specific
case of a full-table update.  I think the code changes needed would be
far too invasive to seriously contemplate doing it just for that one
case - although it is an important case that I would like to see us
improve.
Why do you expect such a invasive code changes? I know little about postgresql code layering, but what I propose (with changing delete to truncate) is:
1) Leave tuple addressing as it is now
2) Allow truncated files, treating non-existing part as if it contained not used tuples
3) Make vacuum truncate file if it has not used tuples at the end.

The only (relatively) tricky thing I can see is synchronizing truncation with parallel ongoing scan.

Best regards, Vitalii Tymchyshyn



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to