Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> On Sun, 8 May 2005, Tom Lane wrote:
>> While your original patch is buggy, it's at least fixable and has
>> localized, limited impact. I don't think these schemes are safe
>> at all --- they put a great deal more weight on the semantics of
>> the filesystem than I care to do.
> I'm going to try this some more, because I feel that a scheme like this
> that doesn't rely on scanning pg_class and the file system would in fact
> be safer.

I think this proposal is getting more and more invasive and expensive,
and it's all to solve a problem that we do not even know is worth
spending any time on. I *really* think this is the wrong direction
to take. Aside from the required effort and risk of breaking things,
the original patch incurred cost only during crash recovery; this is
pushing costs into the normal code paths.

> Delay the actual file creation until it's first written to. The write
> needs to be WAL logged anyway, so we would just piggyback on that.

This is a bad idea since by then it's (potentially) too late to roll
back the creating transaction if the creation fails. Consider for
instance a tablespace directory that's mispermissioned read-only, or
some such. I'd rather have the CREATE TABLE fail than a later INSERT.
(Admittedly, we can't completely guarantee that an INSERT won't hit
some kind of filesystem-level problem, but it's still something to
try to avoid.)

Also, the "first write" actually comes from mdextend, which is not a
WAL-logged operation AFAIR. Some rethinking of that would be necessary
before this would have any chance of working.

			regards, tom lane
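
To make the "fail at CREATE TABLE rather than at a later INSERT" point
concrete, here is a toy sketch in Python. It is not PostgreSQL code: the
Catalog class, its method names, and the "missing_tablespace" directory
are all hypothetical, and a missing directory stands in for the
mispermissioned one mentioned above. It only contrasts when a filesystem
problem surfaces under eager versus delayed file creation.

```python
import os
import tempfile


class Catalog:
    """Toy stand-in for a catalog of tables; hypothetical, not PostgreSQL."""

    def __init__(self):
        self.tables = {}

    def create_table_eager(self, name, directory):
        # Create the backing file as part of the CREATE step. If this
        # raises, nothing has been registered, so the CREATE itself fails.
        path = os.path.join(directory, name)
        with open(path, "xb"):
            pass
        self.tables[name] = path

    def create_table_lazy(self, name, directory):
        # Only record the intended path; the file appears on first write.
        self.tables[name] = os.path.join(directory, name)

    def insert(self, name, data):
        # With lazy creation, a bad directory is first noticed here.
        with open(self.tables[name], "ab") as f:
            f.write(data)


def demo():
    results = {}
    with tempfile.TemporaryDirectory() as root:
        # Assumption: a directory that was never created plays the role
        # of the unusable tablespace directory.
        bad_dir = os.path.join(root, "missing_tablespace")

        eager = Catalog()
        try:
            eager.create_table_eager("t", bad_dir)
        except OSError:
            results["eager_create_failed"] = True
        results["eager_table_registered"] = "t" in eager.tables

        lazy = Catalog()
        lazy.create_table_lazy("t", bad_dir)
        results["lazy_table_registered"] = "t" in lazy.tables
        try:
            lazy.insert("t", b"row")
        except OSError:
            results["lazy_insert_failed"] = True
    return results


if __name__ == "__main__":
    print(demo())
```

In the eager variant the error is raised while the table is being
created, so there is nothing left to undo. In the lazy variant the table
is already registered by the time the first write fails, which is the
situation the reply above argues against.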