Hi all
Andres and I were going over a patch yesterday and found an unexpected
bug in tqual.c while attempting to trigger a hypothesized bug in that patch.
A SELECT ... FOR SHARE will incorrectly block on another open
transaction that ran SELECT ... FOR SHARE and still holds the locks if
it has to follow a ctid chain from the current snapshot through a
committed update to a share-locked tuple.
This also affects uniqueness checks in btrees, where it can cause
unnecessary waiting. It's also an issue with FOR KEY UPDATE, in that it
can cause an update to block when it doesn't have to.
The attached test case runs under isolationtester to exersise the
problem. I've tested it against 9.2, 9.3, and HEAD, but Andres looked
over the code and says the underlying bug goes back to the commit of
MVCC, it's just become easier to trigger. To reliably test this with
isolationtester I had to horribly abuse pg_advisory_lock(...) so that I
could force the first SELECT ... FOR UPDATE to acquire its snapshot
before the UPDATE runs.
A backtrace of the point where it's incorrectly blocked is attached.
What's happening is that the test for TransactionIdIsInProgress
unconditionally sets snapshot->xmax, even if xmax was only set for
locking purposes. This will cause the caller to wait for the xid in xmax
when it doesn't need to.
It should be testing HEAP_XMAX_IS_LOCKED_ONLY and only setting
snapshot->xmax if the tuple is really being deleted by an open tx.
Note that HeapTupleSatisfiesDirty tests the infomask for
HEAP_XMAX_IS_MULTI and handles multixacts separately, and in that case
it _does_ already test HEAP_XMAX_IS_LOCKED_ONLY.
When you run the attached test case it should block indefinitely. After
applying the attached patch it'll return as expected.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
# SELECT ... FOR UPDATE SKIP LOCKED
#
# Sanity checks for SELECT ... FOR UPDATE SKIP LOCKED.
setup
{
DROP TABLE IF EXISTS skip_locked;
CREATE TABLE skip_locked (
id integer PRIMARY KEY,
value integer not null
);
INSERT INTO skip_locked
SELECT x,x FROM generate_series(1,10) x;
}
teardown
{
DROP TABLE skip_locked;
}
# The session that does the SKIP LOCKED queries
session "sl1"
step "sl1_begin" {
BEGIN ISOLATION LEVEL READ COMMITTED;
}
step "sl1_selectblocking" {
SELECT xmin, xmax, ctid, * FROM skip_locked WHERE pg_advisory_lock(0) is not null FOR SHARE NOWAIT;
}
teardown { COMMIT; }
# A session that's used for an UPDATE of the rows to be locked, for when we're testing ctid
# chain following.
session "upd"
step "upd_getlock" {
SELECT pg_advisory_lock(0);
}
step "upd_doupdate" {
BEGIN ISOLATION LEVEL READ COMMITTED;
UPDATE skip_locked SET value = value WHERE id % 2 = 0;
COMMIT;
}
step "upd_releaselock" {
SELECT pg_advisory_unlock(0);
}
# A session that acquires locks that sl1 is supposed to skip
session "lk1"
step "lk1_doforshare" {
BEGIN ISOLATION LEVEL READ COMMITTED;
SELECT id FROM skip_locked WHERE id % 2 = 0 FOR SHARE;
}
teardown {
COMMIT;
}
permutation "upd_getlock" "sl1_begin" "sl1_selectblocking" "upd_doupdate" "lk1_doforshare" "upd_releaselock"
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
new file mode 100644
index da2ce18..0c72115
*** a/src/backend/utils/time/tqual.c
--- b/src/backend/utils/time/tqual.c
*************** HeapTupleSatisfiesDirty(HeapTuple htup,
*** 1012,1018 ****
if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmax(tuple)))
{
! snapshot->xmax = HeapTupleHeaderGetRawXmax(tuple);
return true;
}
--- 1012,1019 ----
if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmax(tuple)))
{
! if (!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))
! snapshot->xmax = HeapTupleHeaderGetRawXmax(tuple);
return true;
}
#0 0x0000003e46af6937 in semop () at ../sysdeps/unix/syscall-template.S:81
#1 0x0000000000600697 in PGSemaphoreLock (sema=0x7fe7027674d0,
interruptOK=interruptOK@entry=1 '\001') at pg_sema.c:415
#2 0x000000000063e5c6 in ProcSleep (locallock=locallock@entry=0x1e30560,
lockMethodTable=lockMethodTable@entry=0x85e4c0 <default_lockmethod>) at
proc.c:1092
#3 0x00000000006398f9 in WaitOnLock (locallock=locallock@entry=0x1e30560,
owner=owner@entry=0x1ef6e28) at lock.c:1586
#4 0x000000000063abeb in LockAcquireExtended
(locktag=locktag@entry=0x7fff26f0d680, lockmode=lockmode@entry=5,
sessionLock=sessionLock@entry=0 '\000', dontWait=dontWait@entry=0 '\000',
reportMemoryError=reportMemoryError@entry=1 '\001') at lock.c:957
#5 0x000000000063afb1 in LockAcquire (locktag=locktag@entry=0x7fff26f0d680,
lockmode=lockmode@entry=5, sessionLock=sessionLock@entry=0 '\000',
dontWait=dontWait@entry=0 '\000') at lock.c:662
#6 0x0000000000639350 in XactLockTableWait (xid=124785) at lmgr.c:495
#7 0x000000000057c0e8 in EvalPlanQualFetch (estate=estate@entry=0x1f13e60,
relation=0x7fe70018f1d8, lockmode=lockmode@entry=0,
tid=tid@entry=0x7fff26f0d7c0, priorXmax=<optimized out>) at execMain.c:1835
#8 0x000000000059063d in ExecLockRows (node=node@entry=0x1f14070) at
nodeLockRows.c:147
#9 0x000000000057ce48 in ExecProcNode (node=node@entry=0x1f14070) at
execProcnode.c:495
#10 0x000000000057a8e0 in ExecutePlan (dest=0x1f0f908, direction=<optimized
out>, numberTuples=0, sendTuples=1 '\001', operation=CMD_SELECT,
planstate=0x1f14070, estate=0x1f13e60) at execMain.c:1395
#11 standard_ExecutorRun (queryDesc=0x1e34030, direction=<optimized out>,
count=0) at execMain.c:303
#12 0x000000000064eb50 in PortalRunSelect (portal=portal@entry=0x1e31000,
forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807,
dest=dest@entry=0x1f0f908) at pquery.c:944
#13 0x000000000064fe8b in PortalRun (portal=0x1e31000,
count=9223372036854775807, isTopLevel=<optimized out>, dest=0x1f0f908,
altdest=0x1f0f908, completionTag=0x7fff26f0dcd0 "") at pquery.c:788
#14 0x000000000064be96 in PostgresMain (argc=<optimized out>, argv=<optimized
out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:1046
#15 0x000000000060d7be in BackendRun (port=0x1e352a0) at postmaster.c:3614
#16 BackendStartup (port=0x1e352a0) at postmaster.c:3304
#17 ServerLoop () at postmaster.c:1367
#18 0x000000000060e42c in PostmasterMain (argc=argc@entry=5,
argv=argv@entry=0x1e0f320) at postmaster.c:1127
#19 0x000000000045b7c9 in main (argc=5, argv=0x1e0f320) at main.c:199
#0 0x0000003e46af6937 in semop () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1 0x0000000000600697 in PGSemaphoreLock (sema=0x7fe7027674d0,
interruptOK=interruptOK@entry=1 '\001') at pg_sema.c:415
errStatus = <optimized out>
sops = {sem_num = 3, sem_op = -1, sem_flg = 0}
__func__ = "PGSemaphoreLock"
#2 0x000000000063e5c6 in ProcSleep (locallock=locallock@entry=0x1e30560,
lockMethodTable=lockMethodTable@entry=0x85e4c0 <default_lockmethod>) at
proc.c:1092
lockmode = 5
lock = 0x7fe7024933f8
proclock = <optimized out>
hashcode = <optimized out>
partitionLock = 53
waitQueue = 0x7fe702493420
myHeldLocks = <optimized out>
early_deadlock = <optimized out>
allow_autovacuum_cancel = 1 '\001'
myWaitStatus = <optimized out>
proc = <optimized out>
i = <optimized out>
__func__ = "ProcSleep"
#3 0x00000000006398f9 in WaitOnLock (locallock=locallock@entry=0x1e30560,
owner=owner@entry=0x1ef6e28) at lock.c:1586
save_exception_stack = 0x7fff26f0d970
save_context_stack = 0x0
local_sigjmp_buf = {{__jmpbuf = {32468520, 934449783848447161, 53,
140630153119160, 140733846705792, 5, 934449784732921017, -934048294823514951},
__mask_was_saved = 0, __saved_mask = {__val = {0, 0, 140630152524792,
2790716164, 140630155490496, 5, 8774864, 5, 6529124, 47, 0, 140630152524792,
140630155490496, 456, 0, 31655264}}}}
lockmethodid = <optimized out>
lockMethodTable = 0x85e4c0 <default_lockmethod>
new_status = 0x1f16e60 "SELECT"
#4 0x000000000063abeb in LockAcquireExtended
(locktag=locktag@entry=0x7fff26f0d680, lockmode=lockmode@entry=5,
sessionLock=sessionLock@entry=0 '\000', dontWait=dontWait@entry=0 '\000',
reportMemoryError=reportMemoryError@entry=1 '\001') at lock.c:957
lockmethodid = <optimized out>
lockMethodTable = 0x85e4c0 <default_lockmethod>
localtag = {lock = {locktag_field1 = 124785, locktag_field2 = 0,
locktag_field3 = 0, locktag_field4 = 0, locktag_type = 4 '\004',
locktag_lockmethodid = 1 '\001'}, mode = 5}
locallock = <optimized out>
lock = 0x7fe7024933f8
proclock = 0x7fe7025245b8
found = 0 '\000'
owner = 0x1ef6e28
hashcode = 2790716164
partitionLock = 53
status = <optimized out>
log_lock = 0 '\000'
__func__ = "LockAcquireExtended"
#5 0x000000000063afb1 in LockAcquire (locktag=locktag@entry=0x7fff26f0d680,
lockmode=lockmode@entry=5, sessionLock=sessionLock@entry=0 '\000',
dontWait=dontWait@entry=0 '\000') at lock.c:662
No locals.
#6 0x0000000000639350 in XactLockTableWait (xid=124785) at lmgr.c:495
tag = {locktag_field1 = 124785, locktag_field2 = 0, locktag_field3 = 0,
locktag_field4 = 0, locktag_type = 4 '\004', locktag_lockmethodid = 1 '\001'}
#7 0x000000000057c0e8 in EvalPlanQualFetch (estate=estate@entry=0x1f13e60,
relation=0x7fe70018f1d8, lockmode=lockmode@entry=0,
tid=tid@entry=0x7fff26f0d7c0, priorXmax=<optimized out>) at execMain.c:1835
test = <optimized out>
update_ctid = {ip_blkid = {bi_hi = 55232, bi_lo = 9968}, ip_posid =
32767}
update_xmax = 0
buffer = 417
copyTuple = 0x0
tuple = {t_len = 32, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 0},
ip_posid = 11}, t_tableOid = 365619, t_data = 0x7fe7006fec20}
SnapshotDirty = {satisfies = 0x734d90 <HeapTupleSatisfiesDirty>, xmin =
0, xmax = 124785, xip = 0x0, xcnt = 0, subxcnt = 0, subxip = 0x0, suboverflowed
= 96 '`', takenDuringRecovery = 66 'B', copied = -15 '\361', curcid = 0,
active_count = 1, regd_count = 0}
__func__ = "EvalPlanQualFetch"
#8 0x000000000059063d in ExecLockRows (node=node@entry=0x1f14070) at
nodeLockRows.c:147
aerm = <optimized out>
isNull = 0 '\000'
buffer = 417
erm = 0x1f13ff0
tuple = {t_len = 32, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 0},
ip_posid = 2}, t_tableOid = 365619, t_data = 0x7fe7006fed40}
update_xmax = 124784
lockmode = LockTupleShared
copyTuple = <optimized out>
datum = <optimized out>
update_ctid = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 11}
test = <optimized out>
slot = 0x1f15060
estate = 0x1f13e60
outerPlan = 0x1f14260
epq_started = 0 '\000'
lc = 0x1f162e8
__func__ = "ExecLockRows"
#9 0x000000000057ce48 in ExecProcNode (node=node@entry=0x1f14070) at
execProcnode.c:495
result = <optimized out>
__func__ = "ExecProcNode"
#10 0x000000000057a8e0 in ExecutePlan (dest=0x1f0f908, direction=<optimized
out>, numberTuples=0, sendTuples=1 '\001', operation=CMD_SELECT,
planstate=0x1f14070, estate=0x1f13e60) at execMain.c:1395
slot = <optimized out>
current_tuple_count = 1
#11 standard_ExecutorRun (queryDesc=0x1e34030, direction=<optimized out>,
count=0) at execMain.c:303
estate = 0x1f13e60
operation = CMD_SELECT
dest = 0x1f0f908
sendTuples = <optimized out>
oldcontext = 0x1ef6c88
#12 0x000000000064eb50 in PortalRunSelect (portal=portal@entry=0x1e31000,
forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807,
dest=dest@entry=0x1f0f908) at pquery.c:944
queryDesc = 0x1e34030
direction = <optimized out>
nprocessed = <optimized out>
__func__ = "PortalRunSelect"
#13 0x000000000064fe8b in PortalRun (portal=0x1e31000,
count=9223372036854775807, isTopLevel=<optimized out>, dest=0x1f0f908,
altdest=0x1f0f908, completionTag=0x7fff26f0dcd0 "") at pquery.c:788
save_exception_stack = 0x7fff26f0dbe0
save_context_stack = 0x0
local_sigjmp_buf = {{__jmpbuf = {32233432, -934044991753147207,
32233944, 32569608, 2, 32176976, 934449784546274489, -934048285353825095},
__mask_was_saved = 0, __saved_mask = {__val = {7438292, 32176976, 32158096,
144, 31657984, 8956638, 80, 31524864, 32233944, 8709199, 2, 32176976, 7512933,
32233432, 2, 31657984}}}}
result = <optimized out>
nprocessed = <optimized out>
saveTopTransactionResourceOwner = 0x1eafc60
saveTopTransactionContext = 0x1eafb50
saveActivePortal = 0x0
saveResourceOwner = 0x1eafc60
savePortalContext = 0x0
saveMemoryContext = 0x1eafb50
__func__ = "PortalRun"
#14 0x000000000064be96 in PostgresMain (argc=<optimized out>, argv=<optimized
out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:1046
firstchar = 32233432
input_message = {data = 0x1ebc6a0 "\n\tSELECT xmin, xmax, ctid, * FROM
skip_locked WHERE pg_advisory_lock(0) is not null FOR SHARE NOWAIT;\n", len =
103, maxlen = 1024, cursor = 103}
local_sigjmp_buf = {{__jmpbuf = {31527232, -934045008961327943, 1,
11808576, 0, 1, 934449784523205817, -934048287395402567}, __mask_was_saved = 1,
__saved_mask = {__val = {0, 11808576, 0, 60, 267473953272, 2, 206158430256,
140733846707520, 140733846707312, 0, 11808576, 0, 8192, 32, 31527232, 0}}}}
send_ready_for_query = 0 '\000'
__func__ = "PostgresMain"
#15 0x000000000060d7be in BackendRun (port=0x1e352a0) at postmaster.c:3614
ac = 1
secs = 428732844
usecs = 503164
i = <optimized out>
av = 0x1e11178
maxac = <optimized out>
#16 BackendStartup (port=0x1e352a0) at postmaster.c:3304
bn = 0x1e35260
pid = 0
#17 ServerLoop () at postmaster.c:1367
rmask = {fds_bits = {32, 0 <repeats 15 times>}}
selres = <optimized out>
readmask = {fds_bits = {56, 0 <repeats 15 times>}}
now = <optimized out>
last_touch_time = 1375414727
__func__ = "ServerLoop"
#18 0x000000000060e42c in PostmasterMain (argc=argc@entry=5,
argv=argv@entry=0x1e0f320) at postmaster.c:1127
opt = <optimized out>
status = <optimized out>
userDoption = <optimized out>
listen_addr_saved = <optimized out>
i = <optimized out>
__func__ = "PostmasterMain"
#19 0x000000000045b7c9 in main (argc=5, argv=0x1e0f320) at main.c:199
No locals.
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers