On Fri, May 3, 2019 at 2:15 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > Thomas Munro <thomas.mu...@gmail.com> writes: > > A while back I posted a patch[1] to change the order of resowner > > cleanup so that DSM handles are released last. That's useful for the > > error cleanup path on Windows, when a SharedFileSet is cleaned up (a > > mechanism that's used by parallel CREATE INDEX and parallel hash join, > > for spilling files to disk under a temporary directory, with automatic > > cleanup). > > I guess what I'm wondering is if there are any potential negative > consequences, ie code that won't work if we change the order like this. > I'm finding it hard to visualize what that would be, but then again > this failure mode wasn't obvious either.
I can't think of anything in core. The trouble here is that we're talking about hypothetical out-of-tree code that could want to plug in detach hooks to do anything at all, so it's hard to say. One idea that occurred to me is that if someone comes up with a genuine need to run arbitrary callbacks before locks are released (for example), we could provide a way to be called in all three phases and receive the phase, though admittedly in this case FileClose() is in the same phase as I'm proposing to put dsm_detach(), so there is an ordering requirement that might require more fine grained phases. I don't know. > > I suppose we probably should make the change to 12 though: then owners > > of extensions that use DSM detach hooks (if there any such extensions) > > will have a bit of time to get used to the new order during the beta > > period. I'll need to find someone to test this with a fault injection > > scenario on Windows before committing it, but wanted to sound out the > > list for any objections to this late change? > > Since we haven't started beta yet, I don't see a reason not to change > it. Worst case is that it causes problems and we revert it. > > I concur with not back-patching, in any case. Here's a way to produce an error which might produce the log message on Windows. Does anyone want to try it? postgres=# create table foo as select generate_series(1, 10000000)::int i; SELECT 10000000 postgres=# set synchronize_seqscans = off; SET postgres=# create index on foo ((1 / (5000000 - i))); psql: ERROR: division by zero postgres=# create index on foo ((1 / (5000000 - i))); psql: ERROR: division by zero postgres=# create index on foo ((1 / (5000000 - i))); psql: ERROR: division by zero CONTEXT: parallel worker (If you don't turn sync scan off, it starts scanning from where it left off last time and then fails immediately, which may interfere with the experiment if you run it more than once, I'm not sure). If it does produce the log message, then the attached patch should make it go away. -- Thomas Munro https://enterprisedb.com
0001-Detach-from-DSM-segments-last-in-resowner.c.patch
Description: Binary data