Re: [HACKERS] [BUGS] Breakage with VACUUM ANALYSE + partitions

Andres Freund Mon, 11 Apr 2016 10:19:37 -0700

On 2016-04-11 13:04:48 -0400, Robert Haas wrote:
> You're right, but I think that's more because I didn't say it
> correctly than because you haven't done something novel.


Could be.


> DROP and
> relation truncation know about shared buffers, and they go clear
> blocks that might be affected from it as part of the truncate
> operation, which means that no other backend will see them after they
> are gone.  The lock makes sure that no other references can be added
> while we're busy removing any that are already there.  So I think that
> there is currently an invariant that any block we are attempting to
> access should actually still exist.

Note that we're not actually accessing any blocks, we're just opening a
segment to get the associated file descriptor.
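
For readers less familiar with the segmented storage layout, here is a
rough sketch of the arithmetic that maps a block number to a segment
file. It is not the md.c code: the helper names are hypothetical, and
RELSEG_SIZE is assumed to be the default of 131072 blocks (1 GB segments
of 8 kB blocks). The last helper shows why a shrunken relation's final
segment can hold fewer than RELSEG_SIZE blocks.

```c
#include <stdint.h>

/* Assumed default: 1 GB segment files made of 8 kB blocks. */
#define RELSEG_SIZE 131072u

/* Hypothetical helper: which segment file holds this block? */
static uint32_t
block_to_segno(uint32_t blkno)
{
    return blkno / RELSEG_SIZE;
}

/* Hypothetical helper: offset of the block within its segment file. */
static uint32_t
block_in_segment(uint32_t blkno)
{
    return blkno % RELSEG_SIZE;
}

/*
 * Hypothetical helper: how many blocks segment 'segno' holds for a
 * relation of 'nblocks' blocks in total.  Every segment before the last
 * one is full; the last one holds the remainder and may be shorter than
 * RELSEG_SIZE (or empty); anything beyond that does not exist.
 */
static uint32_t
segment_nblocks(uint32_t nblocks, uint32_t segno)
{
    uint32_t full_segments = nblocks / RELSEG_SIZE;

    if (segno < full_segments)
        return RELSEG_SIZE;
    if (segno == full_segments)
        return nblocks % RELSEG_SIZE;
    return 0;
}
```

A backend that still holds a descriptor for a segment opened before a
truncation can thus be looking at a file that is shorter than it was
when it was opened.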


> It sounds like these references are sticking around in backend-private
> memory, which means they are neither protected by locks nor able to be
> cleared out on drop or truncate.  I think that's a new thing, and a
> bit scary.

True. But how else would you batch flush requests in a sorted manner
without re-opening file descriptors? And that's
pretty essential for performance.

I can think of a number of relatively easy ways to address this:
1) Just zap (or issue?) all pending flush requests when getting an
   smgrinval/smgrclosenode
2) Do 1), but filter for the closed relnode
3) Actually handle the case of the last open segment not being
   RELSEG_SIZE properly in _mdfd_getseg() - mdnblocks() does so.

I'm kind of inclined to do both 3) and 1).
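
To make 1) and 2) concrete, here is a minimal standalone sketch. It is
not the actual writeback code: PendingFlush, PendingFlushQueue and both
functions are hypothetical stand-ins, and a relation is identified by a
bare relnode number for brevity. It only shows the "zap" variant, i.e.
dropping queued requests rather than issuing them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 32          /* hypothetical queue capacity */

/* Hypothetical stand-in for one queued writeback hint. */
typedef struct PendingFlush
{
    uint32_t    relnode;        /* relation the request targets */
    uint32_t    blocknum;       /* block to hint to the OS */
} PendingFlush;

/* Hypothetical backend-private queue of pending hints. */
typedef struct PendingFlushQueue
{
    size_t       count;
    PendingFlush items[MAX_PENDING];
} PendingFlushQueue;

/* Option 1): forget everything queued.  Returns the number dropped. */
static size_t
pending_zap_all(PendingFlushQueue *q)
{
    size_t      dropped = q->count;

    q->count = 0;
    return dropped;
}

/*
 * Option 2): drop only the requests for the closed relnode, keeping the
 * remaining entries in their original order.  Returns the number dropped.
 */
static size_t
pending_zap_relnode(PendingFlushQueue *q, uint32_t relnode)
{
    size_t      kept = 0;
    size_t      dropped;

    assert(q->count <= MAX_PENDING);

    for (size_t i = 0; i < q->count; i++)
    {
        if (q->items[i].relnode != relnode)
            q->items[kept++] = q->items[i];
    }

    dropped = q->count - kept;
    q->count = kept;
    return dropped;
}
```

Since the queued entries are only hints to the OS, dropping too many of
them costs some writeback efficiency but not correctness, which is what
makes the blunt variant 1) acceptable.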

> The possibly-saving grace here, I suppose, is that the references
> we're worried about are just being used to issue hints to the
> operating system.

Indeed.

> So I guess if we sent a hint on a wrong block or
> skip sending a hint altogether because of some failure, no harm done,
> as long as we don't error out.

Which the writeback code is careful not to do; afaics it's just the
"already open segment" issue causing problems here.

- Andres

