On Thu, 2007-02-01 at 14:38 +0200, Hannu Krosing wrote: > Ühel kenal päeval, N, 2007-02-01 kell 13:24, kirjutas Gavin Sherry: > > > A different approach discussed earlier involves greatly restricting the > > way in which the table is used. This table could only be written to if an > > exclusive lock is held; on error or ABORT, the table is truncated. > > > > The problem is that a lot of this looks like a hack and I haven't seen a > > very clean approach which has gone beyond basic brain dump. > > A more radical variation of the "restricted-use archive table" approach > is storing all tuple visibility info in a separate file. > > At first it seems to just add overhead, but for lots (most ? ) usecases > the separately stored visibility should be highly compressible, so for > example for bulk-loaded tables you could end up with one bit per page > saying that all tuples on this page are visible. > > Also this could be used to speed up vacuums, as only the visibility > table needs to be scanned duting phase 1 of vacuum, and so tables with > localised/moving hotspots can be vacuumed withoutd scanning lots of > static data. > > Also, storing the whole visibility info, but in a separate heap, lifts > all restrictions of the "restricted-use archive table" variant. > > And the compression of visibility info (mostly replacing per-tuple info > with per-page info) can be carried out by a separate vacuum-like > process. > > And it has many of the benefits of static/RO tables, like space saving > and index-only queries. Index-only will of course need to get the > visibility info from visibility heap, but if it is mostly heavily > compressed, it will be a lot cheaper than random access to data heap.
I like that idea, as a non-default option, since in-line visibility is important for OLTP applications. This idea is sufficiently flexible to allow a range of use cases without requiring a complete removal of functionality. Read-mostly is an important use case for very large databases. This is essentially the same thing as PhantomCommandId, just for the whole tuple header. It's something that would go on pg_class easily, just as WITH/WITHOUT OIDS has done. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly