Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> On Sun, 8 May 2005, Tom Lane wrote:
>> While your original patch is buggy, it's at least fixable and has
>> localized, limited impact.  I don't think these schemes are safe
>> at all --- they put a great deal more weight on the semantics of
>> the filesystem than I care to do.

> I'm going to try this some more, because I feel that a scheme like this 
> that doesn't rely on scanning pg_class and the file system would in fact 
> be safer.

I think this proposal is getting more and more invasive and expensive,
and it's all to solve a problem that we do not even know is worth
spending any time on.  I *really* think this is the wrong direction
to take.  Aside from the required effort and risk of breaking things,
the original patch incurred cost only during crash recovery; this is
pushing costs into the normal code paths.

> Delay the actual file creation until it's first written to. The write 
> needs to be WAL logged anyway, so we would just piggyback on that.

This is a bad idea since by then it's (potentially) too late to roll
back the creating transaction if the creation fails.  Consider for
instance a tablespace directory that's mispermissioned read-only, or
some such.  I'd rather have the CREATE TABLE fail than a later INSERT.
(Admittedly, we can't completely guarantee that an INSERT won't hit
some kind of filesystem-level problem, but it's still something to
try to avoid.)
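
To make the difference concrete, here is a toy sketch in Python.  It is
not PostgreSQL code: the Relation class, the first_failure helper, and
the file name 16384 are all made up for illustration, and a nonexistent
directory stands in for a broken tablespace.  It only shows which
statement ends up reporting the filesystem error under each scheme.

```python
import os
import tempfile


class Relation:
    """Toy stand-in for a table's storage file (hypothetical)."""

    def __init__(self, path, eager):
        self.path = path
        self.created = False
        if eager:
            # Eager scheme: a failure surfaces here, inside CREATE TABLE.
            self._create_file()

    def _create_file(self):
        # O_EXCL: create a new file, fail if it already exists.
        fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        os.close(fd)
        self.created = True

    def insert(self, data):
        if not self.created:
            # Deferred scheme: a failure surfaces at the first write.
            self._create_file()
        with open(self.path, "ab") as f:
            f.write(data)


def first_failure(eager, tablespace_dir):
    """Return the statement that reports the filesystem error, or None."""
    path = os.path.join(tablespace_dir, "16384")
    try:
        rel = Relation(path, eager)
    except OSError:
        return "CREATE TABLE"
    try:
        rel.insert(b"row")
    except OSError:
        return "INSERT"
    return None


# A directory that does not exist stands in for an unusable tablespace.
broken = os.path.join(tempfile.mkdtemp(), "missing_tablespace")
print(first_failure(True, broken))    # eager creation
print(first_failure(False, broken))   # creation deferred to first write
```

With eager creation the error is reported by the statement that created
the table, while it can still be rolled back cleanly; with deferred
creation the same error is not seen until some later INSERT.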

Also, the "first write" actually comes from mdextend, which is not a
WAL-logged operation AFAIR.  Some rethinking of that would be necessary
before this would have any chance of working.

                        regards, tom lane
