Hello hackers,
While working on two-phase-related issues, I found a few things in two-phase
commit that could be optimized.
1. The current implementation decouples PREPARE and COMMIT/ABORT PREPARE a lot.
This is flexible, but if PREPARE and COMMIT/ABORT mostly happen on the same
backend, we could use a cache mechanism to speed things up, e.g.
a. FinishPreparedTransaction()->LockGXact(gid, user)
       for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
           find the gxact that matches gid
   For this we can cache the gxact during PREPARE and use it as a fast path,
   i.e. if the cached gxact matches gid we do not need to walk through the
   gxact array. By the way, if the gxact array is large, the linear search
   will be a separate performance issue (use a shared-memory hash table if
   needed?).
b. FinishPreparedTransaction() reads the PREPARE information from either the
   state file (written during checkpoint) or the WAL. We could cache the
   content during PREPARE, i.e. in EndPrepare(), so that
   FinishPreparedTransaction() can avoid reading the state file or the WAL.
It is possible that some databases built on Postgres two-phase commit might
not want the cache, e.g. if the PREPARE backend is always different from the
COMMIT/ABORT PREPARE backend (I do not know of any database designed like
this, though), but gxact caching has almost no overhead, and for b we could
use an ifdef to guard the code that copies the PREPARE WAL data.
The two optimizations are easy and small. I've verified them on Greenplum
Database (based on Postgres 12).
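
To make the fast path in 1a concrete, here is a minimal standalone sketch. It
is not the actual twophase.c code: DemoGXact, DemoTwoPhaseState,
demo_cache_gxact() and demo_find_gxact() are hypothetical stand-ins, and the
locking and ownership checks a real LockGXact() performs are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define GIDSIZE 200             /* same limit PostgreSQL uses for a gid */

/* Hypothetical, simplified stand-in for a gxact entry. */
typedef struct DemoGXact
{
    bool        valid;
    char        gid[GIDSIZE];
} DemoGXact;

/* Hypothetical stand-in for the shared array of prepared transactions. */
typedef struct DemoTwoPhaseState
{
    int         numPrepXacts;
    DemoGXact  *prepXacts[64];
} DemoTwoPhaseState;

/* Backend-local cache, remembered when this backend runs PREPARE. */
static DemoGXact *cached_gxact = NULL;

static void
demo_cache_gxact(DemoGXact *gxact)
{
    cached_gxact = gxact;
}

/*
 * Look up a gxact by gid.  Try the cached entry first; fall back to the
 * linear scan only on a miss.  *used_cache reports which path was taken.
 */
static DemoGXact *
demo_find_gxact(DemoTwoPhaseState *state, const char *gid, bool *used_cache)
{
    int         i;

    *used_cache = false;

    /* Fast path: the transaction was prepared by this same backend. */
    if (cached_gxact != NULL && cached_gxact->valid &&
        strcmp(cached_gxact->gid, gid) == 0)
    {
        *used_cache = true;
        return cached_gxact;
    }

    /* Slow path: walk the whole array, as the current code does. */
    for (i = 0; i < state->numPrepXacts; i++)
    {
        DemoGXact  *gxact = state->prepXacts[i];

        if (gxact->valid && strcmp(gxact->gid, gid) == 0)
            return gxact;
    }
    return NULL;
}
```

In this sketch the cache is only a hint: a miss costs one extra strcmp before
the usual scan, which is why the overhead should be negligible when PREPARE
and COMMIT/ABORT PREPARE run on different backends.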
2. WAL content duplication between PREPARE and COMMIT/ABORT PREPARE
See the COMMIT PREPARE function call below. Those hdr->* fields already exist
in the PREPARE WAL record, so we do not need them again in the COMMIT PREPARE
WAL record. During recovery, we could load this information (for both COMMIT
and ABORT) into memory and use the corresponding data in COMMIT/ABORT PREPARE
redo.
    RecordTransactionCommitPrepared(xid,
                                    hdr->nsubxacts, children,
                                    hdr->ncommitrels, commitrels,
                                    hdr->ninvalmsgs, invalmsgs,
                                    hdr->initfileinval, gid);
One drawback is that this might involve a non-trivial change.
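
A rough sketch of the recovery-side idea, under simplifying assumptions: a
small fixed-size table keyed by xid that PREPARE redo fills and COMMIT/ABORT
PREPARE redo consumes. All names here (DemoPreparedInfo,
demo_remember_prepare(), demo_take_prepare()) are hypothetical, and only the
counters are kept, not the arrays they describe.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t DemoXid;       /* 0 is treated as "slot is free" */

/* Hypothetical subset of what the PREPARE record already carries. */
typedef struct DemoPreparedInfo
{
    DemoXid     xid;
    int         nsubxacts;
    int         ncommitrels;
    int         ninvalmsgs;
} DemoPreparedInfo;

#define DEMO_MAX_PREPARED 16

static DemoPreparedInfo demo_prepared[DEMO_MAX_PREPARED];

/* PREPARE redo: remember the data.  Returns 1 on success, 0 if full. */
static int
demo_remember_prepare(const DemoPreparedInfo *info)
{
    int         i;

    for (i = 0; i < DEMO_MAX_PREPARED; i++)
    {
        if (demo_prepared[i].xid == 0)
        {
            demo_prepared[i] = *info;
            return 1;
        }
    }
    return 0;
}

/*
 * COMMIT/ABORT PREPARE redo: fetch the remembered data by xid and free the
 * slot.  Returns 1 if found, 0 otherwise.
 */
static int
demo_take_prepare(DemoXid xid, DemoPreparedInfo *out)
{
    int         i;

    for (i = 0; i < DEMO_MAX_PREPARED; i++)
    {
        if (demo_prepared[i].xid == xid)
        {
            *out = demo_prepared[i];
            demo_prepared[i].xid = 0;
            return 1;
        }
    }
    return 0;
}
```

With something like this, the COMMIT PREPARE record would only need to carry
the xid; the rest comes from what PREPARE redo remembered.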
3. About gid: currently gid is defined as a char[]. I'm wondering if we should
define an opaque type and let databases implement their own gid types using
callbacks. Typically, if I want to use a 64-bit distributed xid as gid, the
current code is not that performance- and storage-friendly (e.g. it still
needs strcmp to find the gxact in LockGXact). We could provide char[] as the
default implementation. gid is not widely used, so the change seems to be
small (interfaces for copy, comparison, conversion from string to the
internal gid type for the PREPARE statement, etc.).
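
One possible shape for such a callback interface, as a standalone sketch. The
DemoGid union, the DemoGidOps table and both implementations are hypothetical
names for illustration only; nothing like this exists in Postgres today.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_GID_MAX 200

/* Hypothetical opaque gid: either the default text form or a 64-bit id. */
typedef union DemoGid
{
    char        text[DEMO_GID_MAX];
    uint64_t    dxid;
} DemoGid;

/* Hypothetical callback table a database would supply for its gid type. */
typedef struct DemoGidOps
{
    bool        (*from_string) (const char *src, DemoGid *dst);
    bool        (*equal) (const DemoGid *a, const DemoGid *b);
    void        (*copy) (DemoGid *dst, const DemoGid *src);
} DemoGidOps;

/* Default implementation: gid stays a NUL-terminated string. */
static bool
text_from_string(const char *src, DemoGid *dst)
{
    if (strlen(src) >= DEMO_GID_MAX)
        return false;
    strcpy(dst->text, src);
    return true;
}

static bool
text_equal(const DemoGid *a, const DemoGid *b)
{
    return strcmp(a->text, b->text) == 0;
}

static void
text_copy(DemoGid *dst, const DemoGid *src)
{
    strcpy(dst->text, src->text);
}

/* Alternative implementation: gid is a 64-bit distributed xid. */
static bool
dxid_from_string(const char *src, DemoGid *dst)
{
    char       *end;
    unsigned long long v;

    errno = 0;
    v = strtoull(src, &end, 10);
    if (errno != 0 || end == src || *end != '\0')
        return false;
    dst->dxid = (uint64_t) v;
    return true;
}

static bool
dxid_equal(const DemoGid *a, const DemoGid *b)
{
    return a->dxid == b->dxid;  /* one integer compare instead of strcmp */
}

static void
dxid_copy(DemoGid *dst, const DemoGid *src)
{
    dst->dxid = src->dxid;
}

static const DemoGidOps demo_text_gid_ops = {
    text_from_string, text_equal, text_copy
};
static const DemoGidOps demo_dxid_gid_ops = {
    dxid_from_string, dxid_equal, dxid_copy
};
```

The PREPARE statement would call from_string once on the user-supplied
string, and lookups such as the one in LockGXact would then go through equal.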
Any thoughts?
Regards,
Paul