Re: Checking for missing heap/index files

2022-10-19 Thread Robert Haas
On Tue, Oct 18, 2022 at 5:44 PM Tom Lane wrote:
> My concern about that is that it implies touching a whole lot of
> places, and if you miss even one then you've lost whatever guarantee
> you thought you were getting. More, there's no easy way to find
> all the relevant places (some will be in

Re: Checking for missing heap/index files

2022-10-18 Thread Tom Lane
Robert Haas writes:
> On Tue, Oct 18, 2022 at 3:59 PM Tom Lane wrote:
>> Isn't it already the case (or could be made so) that relation file
>> removal happens only in the checkpointer?
> I believe that individual backends directly remove all relation forks
> other than the main fork and all

Re: Checking for missing heap/index files

2022-10-18 Thread Robert Haas
On Tue, Oct 18, 2022 at 3:59 PM Tom Lane wrote:
> Isn't it already the case (or could be made so) that relation file
> removal happens only in the checkpointer? I wonder if we could
> get to a situation where we can interlock file removal just by
> commanding the checkpointer to not do it for

Re: Checking for missing heap/index files

2022-10-18 Thread Tom Lane
Robert Haas writes:
> On Tue, Oct 18, 2022 at 2:37 PM Stephen Frost wrote:
>> I don't see it as likely to be acceptable, but arranging to not add or
>> remove files while the scan is happening would presumably eliminate the
>> risk entirely. We've not seen this issue recur in the expire command

Re: Checking for missing heap/index files

2022-10-18 Thread Robert Haas
On Tue, Oct 18, 2022 at 2:37 PM Stephen Frost wrote:
> While I don't think it's really something that should be happening, it's
> definitely something that's been seen with some networked filesystems,
> as reported.

Do you have clear and convincing evidence of this happening on anything other

Re: Checking for missing heap/index files

2022-10-18 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Tue, Oct 18, 2022 at 12:59 PM Tom Lane wrote:
> > There is no text suggesting that it's okay to miss, or to double-return,
> > an entry that is present throughout the scan. So I'd interpret the case
> > you're worried about as

Re: Checking for missing heap/index files

2022-10-18 Thread Robert Haas
On Tue, Oct 18, 2022 at 12:59 PM Tom Lane wrote:
> There is no text suggesting that it's okay to miss, or to double-return,
> an entry that is present throughout the scan. So I'd interpret the case
> you're worried about as "forbidden by POSIX". Of course, it's known that
> NFS fails to provide

Re: Checking for missing heap/index files

2022-10-18 Thread Tom Lane
Robert Haas writes:
> I'd be really interested in knowing whether this happens on a
> mainstream, non-networked filesystem. It's not an irrelevant concern
> even if it happens only on networked filesystems, but a lot more
> people will be at risk if it also happens on ext4 or xfs. It does seem
>

Re: Checking for missing heap/index files

2022-10-18 Thread Robert Haas
On Fri, Jun 17, 2022 at 6:31 PM Stephen Frost wrote:
>> Hmm, this sounds pretty bad, and I agree that a workaround should be put
>> in place. But where is pg_basebackup looping around readdir()? I
>> couldn't find it. There's a call to readdir() in FindStreamingStart(),
>> but that doesn't

Re: Checking for missing heap/index files

2022-06-17 Thread Stephen Frost
Greetings,

On Fri, Jun 17, 2022 at 14:32 Alvaro Herrera wrote:
> On 2022-Jun-09, Stephen Frost wrote:
>
> > TL;DR: if you're removing files from a directory that you've got an
> > active readdir() running through, you might not actually get all of the
> > *existing* files. Given that PG is

Re: Checking for missing heap/index files

2022-06-17 Thread Alvaro Herrera
On 2022-Jun-09, Stephen Frost wrote:
> TL;DR: if you're removing files from a directory that you've got an
> active readdir() running through, you might not actually get all of the
> *existing* files. Given that PG is happy to remove files from PGDATA
> while a backup is running, in theory this
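
To make the hazard in the quoted TL;DR concrete: a single directory scan taken while files are being removed may not be trustworthy. The following is a minimal sketch of one possible mitigation, which is to rescan until two consecutive listings agree. It is not something proposed in this thread, and it does not guarantee completeness (two scans could in principle miss the same entry). The function name `stable_listing` and its parameters are hypothetical, introduced only for illustration.

```python
import os


def stable_listing(path, max_attempts=5, list_dir=os.listdir):
    """List `path` repeatedly until two consecutive scans return the same names.

    `list_dir` is injectable so the behaviour can be exercised without a
    real directory; it defaults to os.listdir.
    """
    previous = set(list_dir(path))
    # One scan has already been made, so allow max_attempts - 1 more.
    for _ in range(max_attempts - 1):
        current = set(list_dir(path))
        if current == previous:
            return current
        previous = current
    raise RuntimeError(
        f"directory {path!r} kept changing across {max_attempts} scans"
    )
```

A caller would treat the `RuntimeError` as "the directory never settled" and decide whether to retry later or fall back to some form of interlock against file removal.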

Re: Checking for missing heap/index files

2022-06-13 Thread Peter Geoghegan
On Mon, Jun 13, 2022 at 4:15 PM Bruce Momjian wrote:
> I agree --- it would be nice, but might be hard to justify the
> engineering and overhead involved. I guess I was just checking that I
> wasn't missing something obvious.

I suspect that the cost of being sloppy about this kind of thing

Re: Checking for missing heap/index files

2022-06-13 Thread Bruce Momjian
On Mon, Jun 13, 2022 at 04:06:12PM -0400, Robert Haas wrote:
> One idea might be for each heap table to have a metapage and store the
> length - or an upper bound on the length - in the metapage. That'd
> probably be cheaper than updating pg_class, but might still be
> expensive in some scenarios,

Re: Checking for missing heap/index files

2022-06-13 Thread Robert Haas
On Wed, Jun 8, 2022 at 8:46 AM Bruce Momjian wrote:
> We currently can check for missing heap/index files by comparing
> pg_class with the database directory files. However, I am not clear if
> this is safe during concurrent DDL. I assume we create the file before
> the update to pg_class is

Re: Checking for missing heap/index files

2022-06-09 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> We currently can check for missing heap/index files by comparing
> pg_class with the database directory files. However, I am not clear if
> this is safe during concurrent DDL. I assume we create the file before
> the update to pg_class is

Re: Checking for missing heap/index files

2022-06-09 Thread Peter Geoghegan
On Thu, Jun 9, 2022 at 11:46 AM Bruce Momjian wrote:
> I don't have a need for it --- I was just wondering why we have
> something that checks the relation contents, but not the file existence?

We do this for B-tree indexes within amcheck. They must always have storage, if only to store the

Re: Checking for missing heap/index files

2022-06-09 Thread Bruce Momjian
On Thu, Jun 9, 2022 at 09:46:51AM -0700, Mark Dilger wrote:
>
> > On Jun 8, 2022, at 5:45 AM, Bruce Momjian wrote:
> >
> > Is this something anyone has even needed or had requested?
>
> I might have put this in amcheck's verify_heapam() had there been an
> interface for it. I vaguely recall

Re: Checking for missing heap/index files

2022-06-09 Thread Mark Dilger
> On Jun 8, 2022, at 5:45 AM, Bruce Momjian wrote:
>
> Is this something anyone has even needed or had requested?

I might have put this in amcheck's verify_heapam() had there been an interface for it. I vaguely recall wanting something like this, yes. As it stands, verify_heapam() may

Checking for missing heap/index files

2022-06-08 Thread Bruce Momjian
We currently can check for missing heap/index files by comparing pg_class with the database directory files. However, I am not clear if this is safe during concurrent DDL. I assume we create the file before the update to pg_class is visible, but do we always delete the file after the update to
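
As a rough illustration of the comparison described in this opening message, the sketch below takes a list of relfilenode numbers (as a catalog query might supply) and a directory listing, and reports which relfilenodes have no main-fork file. It assumes the usual on-disk naming, where a main fork is `<relfilenode>` with optional `.N` segment suffixes and other forks carry `_fsm`, `_vm`, or `_init`. The function names and sample numbers are hypothetical, and the sketch does nothing about the concurrent-DDL question raised above: both inputs are treated as fixed snapshots.

```python
import re

# Matches main-fork files such as "16384" or the segment "16384.1".
# Fork files ("16384_fsm", "16384_vm", "16384_init") are deliberately skipped.
_MAIN_FORK = re.compile(r"^(\d+)(?:\.\d+)?$")


def relfilenodes_on_disk(filenames):
    """Return the relfilenode numbers that have at least one main-fork file."""
    found = set()
    for name in filenames:
        match = _MAIN_FORK.match(name)
        if match:
            found.add(int(match.group(1)))
    return found


def missing_relfilenodes(expected, filenames):
    """Relfilenodes expected by the catalog that have no main-fork file."""
    return sorted(set(expected) - relfilenodes_on_disk(filenames))


# Hypothetical inputs: `expected` would come from a catalog query and
# `listing` from os.listdir() on the database directory.
expected = [16384, 16390, 16402]
listing = ["16384", "16384.1", "16384_fsm", "16402", "PG_VERSION"]
print(missing_relfilenodes(expected, listing))  # prints [16390]
```

Relations without storage (views, for example) would need to be excluded from `expected` before calling this, since they have no file to find.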