The backend code for large objects goes to some lengths to be intelligent about sparsely-written blobs: if you seek out to the middle of nowhere and write a few bytes, you don't end up allocating space in pg_largeobject for all the byte positions you skipped over. However, pg_dump knows nothing of this. If you pg_dump a sparsely-stored large object, it will tediously transfer all those nonexistent zeros from server to client, and write them into the resulting archive. And then when you restore, the blob isn't sparse anymore ... those zeroes become non-virtual on the database side too.
Admittedly, this is no different than what happens when you try to back up a sparsely-stored Unix file, at least with simpler backup tools. But it seems to me we should try a bit harder. There are a couple of stumbling blocks to making that happen: * How should pg_dump find out where there are holes? It would be easy if it were to look into pg_largeobject, but that would destroy the ability to use pg_dump as non-superuser. I think we'd really have to provide some API to read from a blob in a sparse-storage-aware manner. The first idea that comes to mind is some way to tell lo_read to stop reading when it hits a gap (instead of manufacturing zeroes) and then a new "whence" option for lo_lseek that tells it to seek to the next non-dummy data in the blob. * How do we get pg_dump to make use of the knowledge once it's got it? The current code in that area is a masterpiece of ugly unreadability; it's near impossible to tell what connects to what else, and there are assorted magic switches that completely change the behavior of major interface functions. I'd kind of want to rewrite the whole mess before trying to change its behavior. I'm also pretty certain that we'd need an archive format change, though we've certainly done those before so that isn't a fatal objection. I'm not planning to do anything about this myself, but if someone is looking for a project, here's one. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers