On Wed, Oct 19, 2011 at 6:04 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > ProcessStandbyHSFeedbackMessage has a race condition: it thinks it can > call GetOldestXmin and then the result will politely hold still while > it considers what to do next. But in fact, whoever has the oldest xmin > could exit their transaction, allowing the global minimum to advance. > If a VACUUM process then inspects the ProcArray, it could compute an > oldest xmin that is newer than the value that > ProcessStandbyHSFeedbackMessage installs just afterwards. So much > for keeping the data the standby wanted. > > AFAICS we have to do all the logic about choosing the new value for > MyProc->xmin while holding ProcArrayLock, which IMO means that it should > go into a function in procarray.c. The fact that walsender.c is taking > ProcArrayLock and whacking MyProc->xmin around is already a horrid > violation of modularity, even if it weren't incorrect. > > Also, it seems like using GetOldestXmin is quite wrong here anyway; we > don't really want a result modified by vacuum_defer_cleanup_age do we? > It looks to me like that would result in vacuum_defer_cleanup_age being > applied twice: the walsender process sets its xmin the defer age into > the past, and then subsequent calls of GetOldestXmin compute a result > that is the defer age further back. IOW this is an independent > mechanism that also results in the computed global xmin going backwards. > > (Now that we have a standby feedback mechanism, I'm a bit tempted to > propose getting rid of vacuum_defer_cleanup_age altogether, rather than > hacking things to avoid the above.)
curious: are these bugs in production, and what would be the user visible symptoms of seeing them in the wild? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers