On 1/8/14, 1:43 AM, Heikki Linnakangas wrote:

I've wanted the cluster case for a long time. I also see the use for the RECENT 
scenario, especially if we had CLUSTER CONCURRENT that let you shrink the head 
of the table as needed.

I suspect the in-memory case would only be useful if it could look into the OS 
cache as well, at least until we can recommend you give Postgres 90% of memory 
instead of 25%. Even then, I'm not sure how useful it would ultimately be...

>> * PACK
>> We want the table to be smaller, so rather than run a VACUUM FULL we
>> want to force the table to choose an NBT at start of table, even at
>> the expense of concurrency. By avoiding putting new data at the top of
>> the table we allow the possibility that VACUUM will shrink table size.
>> This is the same as the current behavior except we always reset the FSM
>> pointer to zero and re-seek from there. This takes some time to have an
>> effect, but is much less invasive than VACUUM FULL.
>
> We already reset the FSM pointer to zero on vacuum. Would the above actually
> make any difference in practice?

What if your first request is for a large chunk of free space? You could skip a 
lot of blocks, even if the FSM is bucketized.

But there's probably a more important point to this one: for you to have any 
chance of packing you MUST get everything out of the tail of the table. 
Resetting to zero on every request is one possible way to do that, though it 
might be better to do something like reset only once the pointer goes past 
block X. The other thing you'd want is a way to force tuples off the last X 
pages. Due to a lack of ctid operators that was already hard, and HOT makes 
it even harder. (BTW, related to this, you'd ideally want HOT to continue to 
operate on the front of the table, but not the back.)
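
To make the two reset policies concrete, here's a toy sketch. It is NOT how 
the real FSM works: the flat list of free bytes per block, the first-fit scan 
with no wrap-around, and the names ToyFSM, place and reset_past are all made 
up for illustration.

```python
from typing import List, Optional


class ToyFSM:
    """Toy free-space map: free[i] is the free bytes on block i.

    Hypothetical model for illustration only, not PostgreSQL's FSM.
    """

    def __init__(self, free: List[int], reset_past: Optional[int] = None):
        self.free = list(free)
        self.next_block = 0           # where the next search starts
        # None: never reset the pointer.
        # 0:    reset on every request.
        # X:    reset only once the pointer has reached block X.
        self.reset_past = reset_past

    def request(self, need: int) -> Optional[int]:
        """Return the first block at or after the pointer with `need` free bytes."""
        if self.reset_past is not None and self.next_block >= self.reset_past:
            self.next_block = 0
        for blk in range(self.next_block, len(self.free)):
            if self.free[blk] >= need:
                self.free[blk] -= need
                self.next_block = blk
                return blk
        return None  # a real table would be extended here


def place(free: List[int], requests: List[int],
          reset_past: Optional[int] = None) -> List[Optional[int]]:
    """Run a sequence of space requests and report the block chosen for each."""
    fsm = ToyFSM(free, reset_past)
    return [fsm.request(need) for need in requests]


if __name__ == "__main__":
    free = [100, 100, 100, 4000, 100, 100, 4000]
    # One large request first, then small ones.
    print(place(free, [2000, 50, 50]))                  # pointer never reset
    print(place(free, [2000, 50, 50], reset_past=0))    # reset on every request
    print(place(free, [2000, 50, 3000, 50], reset_past=5))  # reset past block 5
```

In this model the large first request drags the pointer past the mostly full 
blocks at the front, and without a reset the small requests that follow never 
go back to them. Neither policy does anything about tuples already sitting in 
the tail, which is the harder half of the problem.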

All that said, I've definitely wanted the ability to shrink tables in the past, 
though TBH I've wanted that more for indexes.

Ultimately, what I really want on this front is:

PACK TABLE blah BACKGROUND;
CLUSTER TABLE blah BACKGROUND;
REINDEX INDEX blah BACKGROUND;

where BACKGROUND would respect a throttle setting. (While I'm dreaming, it'd be 
nice to have DATABASE/TABLESPACE/SCHEMA alternate specifications too...)
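
By "throttle" I mean something in the spirit of the existing cost-based 
vacuum delay: accumulate a cost as you work and nap when you hit a limit. A 
rough sketch of that loop follows; run_throttled, work, cost_limit and 
delay_s are hypothetical names, not an existing API or setting.

```python
import time
from typing import Callable, Iterable


def run_throttled(pages: Iterable[int],
                  work: Callable[[int], int],
                  cost_limit: int,
                  delay_s: float,
                  sleep: Callable[[float], None] = time.sleep) -> int:
    """Process pages, napping for delay_s whenever accumulated cost hits cost_limit.

    `work(page)` does the real job for one page and returns its cost.
    Returns the number of naps taken. Hypothetical sketch only.
    """
    balance = 0
    naps = 0
    for page in pages:
        balance += work(page)
        if balance >= cost_limit:
            sleep(delay_s)   # back off so foreground queries get the I/O
            naps += 1
            balance = 0
    return naps


if __name__ == "__main__":
    # Ten pages at a flat cost of 10 each, napping every 30 units of cost.
    naps = run_throttled(range(10), lambda page: 10, cost_limit=30, delay_s=0.01)
    print("naps taken:", naps)
```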
--
Jim C. Nasby, Data Architect                       j...@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net

