Tom Lane wrote:
"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
I've spent the last few days factoring out that work, and turning it into
a general solution. The result is this patch, which basically does the following
.) It defines a special TemporaryTransactionId that is used as an xact's xid
until the xact calls GetCurrentTransactionId / GetTopTransactionId.
[ squint... ] Why do you need that? The behavior we use with
subtransactions is just to leave the XID as InvalidOid until there's
a reason to assign it, and I think we should do the same with top-level
XIDs. Having an extra TemporaryTransactionId seems ugly, mainly because
it's not clear how XID comparison should handle it.
I invented the TemporaryTransactionId because it seemed worthwhile to
distinguish between backends which do not currently run a transaction
(xid == InvalidTransactionId), and such which run a transaction that
doesn't yet have an xid assiged (xid == TemporaryTransactionId). Currently,
the TemporaryTransactionId is treated to be later than any other xid
value. I'm not wedded to those TemporaryTransactionIds though - they just
seemed like a good idea when I started with my readonly-queries on PITR-slaves
work, and it allows for a few more assertions.
To leave XID at 0, you will need to teach GetSnapshotData and maybe
some other places that a proc could be advertising nonzero xmin even
when its XID is still 0. This does not seem like a big problem though.
Yeah - TemporaryTransactionId removed the need for a few special cases
in that area - but at the cost of having the distinguish between
TransactionIdIsValid and TransactionIdIsPermanent (meaning valid && !temporary).
.) Each transaction get an "rid" (ResourceOwnerId) assigned when it starts, and
obtains a lock on that rid, similar to how the xid is locked. This can
be used to wait for a transaction's toplevel resource owner to release all
it's locks, and serves as a unique identifier for a running transaction.
This seems like inventing a concept we could do without, also overhead
we could do without (assigning globally unique RIDs would require as
much mechanism and contention as XID assignment does). What about
locking the backend's PID instead? I do not see that long-term
uniqueness is needed, we just want to be able to wait for the backend's
current transaction to end.
If you do think it's important to distinguish the other guy's current
and next transaction (which maybe it is), then possibly we could lock a
combination of the PID and a *local*, per-backend transaction counter
(there should be plenty of room in LOCKTAG for this). This counter
value would have to be advertised in PGPROC, but there wouldn't be any
contention involved to assign a new value.
I wanted some (single) value that would fit into some standard C datatype.
Since I guess using "long long" in postgres code code is a bad idea
(It's not supported on all 32-bit plattforms I think), I wanted to come
up with some 32-bit identifier. If the PID were guaranteed to be 16-bit
we could use the other 16 bits as a counter - but all modern Unixen have
moved past a limit of 65535 processes I fear...
While writing this mail I realized that my RID generation algorithm -
while being quite lightweight I think - has a small race condition.
The algorithm is
for(;;) {
rid = ShmemVariableCache->nextRid++ ;
if (ResourceOwnerIdIsValid(rid) && ResourceOwnerLockTableInsert(rid))
break ;
}
I just realized that if two backend manage to obtain the same rid, and
one than is paged out long enough for the other to lock the rid, run
it's transaction and commit, then the second backend will get the same
rid :-(. So it's back to the drawing board anyway...
It's slightly annoying that this scheme involves taking two lmgr locks
per read-write transaction. I wonder whether we couldn't dispense with
the notion of locking one's XID per se. This would mean that where we
try to wait for another transaction by XID, we have to trawl through
the ProcArray to find that XID and see what PID/localID it maps to;
but if we're in that path we're already going to be blocking, so more
cycles there might be a good tradeoff for fewer cycles in transaction
start.
Yeah - I do not really like that dual-locking thing either. But it makes
prepared transaction handling much easier - if we were to only lock the
RID, we'd have to store the rid<->xid mapping for prepared transactions
somewhere *and* guarantee that we won't assign that RID to another transaction -
even after a server restart...
1) The second waiting phase of concurrent index builds fail to wait for xacts
that haven't been assigned an XID when the reference shapshot was taken.
The "rid" doesn't help here, because it's not currently store in the
snapshot.
We'd probably want to whack that around so that it pays attention to the
xmins advertised by other backends, rather than their XIDs. This is
just a handwave though.
It's even simpler - In that section of the code, the XID is used just as a
unique identifier for currently running transactions - and talking a snapshot
convenientl creates just such a list of XIDs. The XID isn't used in any
comparisions. So I think I'd be sufficient to scan the proc array after
obtaining the reference snapshot, and just manually build a list of RIDs
instead of XIDs. It wouldn't matter that we build the list slightly later than
the snapshot - the only effect of this would be that we might end up waiting
for a transaction unnecessarily.
However, before putting more work into that part of the code, I'll check
the latest version of HOT. I seem to remember that there was this idea
of storing some kind of xmin with an index that could be used to decide
if an index was usable for a certain snapshot.
greetings, Florian Pflug
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match