Hi,
I am reviewing this one year old change again before backporting it to 9.1.3 for
production use.
ATM, I believe the code is correct, but I don't want to miss the change to spot
possible errors, so please let me dump my brain on some points:
- IIUC, SIGetDataEntries() can return 0 when in fact there _are_ messages
because stateP->hasMessages could come from a stale cache (iow there is no
read-membar used and because we return before acquiring SInvalReadLock (which
the patch is all about in the first place), we don't get an implicit
read-membar from a lock op any more).
What I can't judge on: Would this cause any harm? What are the consequences
of SIGetDataEntries returning 0 after another process has posted a message
(looking at global temporal ordering)?
I don't quite understand the significance of the respective comment in the
code that the incoherence should be acceptable because the cached read can't
migrate to before a previous lock acquisition (which itself is clear).
AcceptInvalidationMessages has a comment that it should be the first thing
to do in a transaction, and I am not sure if all the consumers have a
read-membar equivalent operation in place.
How bad would a missed cache invalidation be? Should we have a read-membar
in SIGetDataEntries just to be safe?
Other notes on points which appear correct to me (really more a note to myself):
- stateP->hasMessages = false in SIGetDataEntries is membar'ed by
SpinLockAcquire(&vsegP->msgnumLock), so it shouldn't happen that
clearing hasMessages moves behind reading msgnumLock
(in which case we could loose the hasMessages flag)
- but it can happen that hasMessages gets set when in fact there is
nothing to read (which is fine because we then check maxMsgNum)
Nils
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers