On Fri, May 25, 2012 at 5:57 PM, Jim Nasby <j...@nasby.net> wrote:
> It occurred to me that having a metapage with information useful to recovery
> operations in *every segment* would be useful; it certainly seems worth the
> extra block. It then occurred to me that we've basically been stuck with 2
> places to store relation data; either at the relation level in pg_class or
> on each page. Sometimes neither one is a good fit.

AFAICS, having metadata in every segment is mostly helpful only for
recovering from the situation where files have become disassociated
from their filenames, i.e. database -> lost+found. From the viewpoint
of virtually the entire server, the block number space is just a
continuous sequence that starts at 0 and counts up forever (or,
anyway, until 2^32-1). While it wouldn't be impossible to let
knowledge of segment boundaries percolate up to other parts of the
server, it would basically involve drilling a fairly arbitrary hole
through an abstraction boundary that has been intact for a very long
time, and it's not clear that there's anything magical about 1GB.

Notwithstanding the foregoing...

> ISTM that a lot of problems we've faced in the past few years are because
> there's not a good abstraction between a (mostly) linear tuplespace and the
> physical storage that goes underneath it.

...I agree with this. I'm not sure exactly what the replacement model
would look like, but it's definitely worth some thought - e.g. perhaps
there ought to be another mapping layer between logical block numbers
and files on disk, so that we can effectively delete blocks out of the
middle of a relation without requiring any special OS support, and so
that we can multiplex many small relation forks onto a single physical
file to minimize inode consumption.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
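
For reference, the segment arithmetic behind the 1GB boundary mentioned above can be sketched roughly as follows. This is only an illustration, assuming a default build (8kB blocks, 1GB segment files named relfilenode, relfilenode.1, and so on); the function name locate_block and the sample relfilenode are hypothetical and this is not the server's actual storage-manager code.

```python
# Illustrative sketch only; names are hypothetical, not PostgreSQL source.
BLCKSZ = 8192          # page size in bytes (assumed default build)
RELSEG_SIZE = 131072   # blocks per segment file: 131072 * 8192 bytes = 1GB


def locate_block(relfilenode: str, blkno: int):
    """Map a logical block number to (segment file name, byte offset)."""
    # Upper bound taken from the message: block numbers count up to 2^32-1.
    if not 0 <= blkno <= 2**32 - 1:
        raise ValueError("block number out of range")
    # The rest of the server sees one flat sequence of blocks; only this
    # step knows that the sequence is cut into fixed-size files.
    segno, blk_in_seg = divmod(blkno, RELSEG_SIZE)
    name = relfilenode if segno == 0 else f"{relfilenode}.{segno}"
    return name, blk_in_seg * BLCKSZ


if __name__ == "__main__":
    print(locate_block("16384", 0))
    print(locate_block("16384", 131072))
```

The point of the sketch is that the split into files is pure arithmetic on the block number, which is why nothing above this layer needs to know where a segment ends.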