Can't that be solved by just creating the permanent relation in a new
relfilenode? That's equivalent to a rewrite, yes, but we need to do that
for anything but wal_level=minimal anyway.

Maybe I'm missing something, but doesn't this actually involve writing the data 
twice? Once into WAL and again into the relation itself?

Yes. But as I said, that's unavoidable for anything but

Ideally, you would *only* write the data to WAL, when you do ALTER TABLE ...
SET LOGGED. There's no fundamental reason you need to rewrite the
heap, too.

As another point: What's the advantage of that? The amount of writes
will be the same, no? It doesn't seem to be all that interesting that
a second filenode exists temporarily?

Surely it's cheaper to read the whole relation and copy it to just WAL than to read the whole relation and write it to both the WAL and another file.
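
To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. It assumes the WAL volume is roughly equal to the relation size (ignoring record headers and any other overhead), and io_cost is a made-up helper for illustration, not anything that exists in PostgreSQL.

```python
def io_cost(relation_bytes, rewrite_heap):
    """Return (bytes_read, bytes_written) for a simplified model of SET LOGGED.

    Assumption: WAL-logging the relation writes about as many bytes as the
    relation itself holds. Real WAL has per-record overhead that is ignored.
    """
    bytes_read = relation_bytes            # the relation is read once either way
    wal_bytes = relation_bytes             # data copied into WAL
    heap_bytes = relation_bytes if rewrite_heap else 0  # new relfilenode, if any
    return bytes_read, wal_bytes + heap_bytes


size = 10 * 1024**3  # hypothetical 10 GiB unlogged table

with_rewrite = io_cost(size, rewrite_heap=True)   # WAL plus a new relfilenode
wal_only = io_cost(size, rewrite_heap=False)      # WAL only, heap left in place

print("rewrite :", with_rewrite)
print("WAL only:", wal_only)
```

Under those assumptions the reads are identical and the WAL-only approach writes half as much.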

(Maybe it's not worth the trouble to avoid it - but that depends on whether we come up with a good design.)

- Heikki

