On 19/09/10 21:57, I wrote:
Putting that aside for now, we have one very serious problem with this
algorithm:

> While they [SIREAD locks] are associated with a transaction, they must
> survive a successful COMMIT of that transaction, and remain until all
> overlapping transactions complete.

Long-running transactions are already nasty because they prevent VACUUM
from cleaning up old tuple versions, but this escalates the problem to a
whole new level. If you have one old transaction sitting idle, every
transaction that follows consumes a little bit of shared memory, until
that old transaction commits. Eventually you will run out of shared
memory, and will not be able to start new transactions anymore.

Is there anything we can do about that? Just a thought, but could you
somehow coalesce the information about multiple already-committed
transactions to keep down the shared memory usage? For example, if you
have this:

1. Transaction <slow> begins
2. 100 other transactions begin and commit

Could you somehow group together the 100 committed transactions and
represent them with just one SERIALIZABLEXACT struct?

Ok, I think I've come up with a scheme that puts an upper bound on the amount of shared memory used, wrt. number of transactions. You can still run out of shared memory if you lock a lot of objects, but that doesn't worry me as much.

When a transaction commits, its predicate locks must be held, but it's not important anymore *who* holds them, as long as they're held for long enough.

Let's move the finishedBefore field from SERIALIZABLEXACT to PREDICATELOCK. When a transaction commits, set the finishedBefore field in all the PREDICATELOCKs it holds, and then release the SERIALIZABLEXACT struct. The predicate locks stay without an associated SERIALIZABLEXACT entry until finishedBefore expires.
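
To make the commit-time hand-off concrete, here is a minimal sketch in C. It is not the actual PostgreSQL code: SketchXact, SketchLock, sketch_commit and the nextHeld link are hypothetical stand-ins for SERIALIZABLEXACT, PREDICATELOCK and whatever list structure really connects them, and I'm assuming finishedBefore is a transaction ID with 0 meaning "owner still running".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t XactId;              /* hypothetical stand-in for TransactionId */
#define INVALID_XACT ((XactId) 0)     /* assumption: 0 means "not set" */

typedef struct SketchXact SketchXact;

/* Hypothetical stand-in for PREDICATELOCK, now carrying finishedBefore. */
typedef struct SketchLock
{
    int         target;               /* simplified lock target */
    SketchXact *owner;                /* NULL once the owner has committed */
    XactId      finishedBefore;       /* INVALID_XACT while owner is running */
    struct SketchLock *nextHeld;      /* next lock held by the same owner */
} SketchLock;

/* Hypothetical stand-in for SERIALIZABLEXACT. */
struct SketchXact
{
    SketchLock *held;                 /* head of the list of locks held */
    bool        inUse;                /* false once the slot can be recycled */
};

/*
 * At commit: stamp every held lock with finishedBefore, detach it from
 * the transaction, then release the transaction slot.  The locks live on
 * without an owner until finishedBefore expires.
 */
static void
sketch_commit(SketchXact *xact, XactId finishedBefore)
{
    assert(finishedBefore != INVALID_XACT);

    for (SketchLock *lock = xact->held; lock != NULL;)
    {
        SketchLock *next = lock->nextHeld;

        lock->finishedBefore = finishedBefore;
        lock->owner = NULL;
        lock->nextHeld = NULL;
        lock = next;
    }
    xact->held = NULL;
    xact->inUse = false;
}
```

After sketch_commit() returns, the per-transaction struct is free for reuse, which is where the upper bound on per-transaction shared memory would come from.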

Whenever there are two predicate locks on the same target that both belong to already-committed transactions, the one with the smaller finishedBefore can be dropped, because the one with the higher finishedBefore value covers it already.
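
A rough sketch of that coalescing rule, again in C and again with made-up names (OrphanLock, coalesce_orphans). It assumes the ownerless locks sit in a plain array and that finishedBefore values can be compared with a simple ">", ignoring transaction ID wraparound, which real code would have to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t XactId;      /* hypothetical stand-in for TransactionId */

/* Hypothetical: a predicate lock whose owner has already committed. */
typedef struct OrphanLock
{
    int     target;           /* simplified lock target */
    XactId  finishedBefore;   /* lock must be kept until this expires */
} OrphanLock;

/*
 * Coalesce ownerless locks in place so that each target keeps a single
 * entry carrying the highest finishedBefore seen for it.  Returns the
 * number of entries left at the front of the array.
 *
 * Assumption: plain ">" comparison, no wraparound handling.
 */
static size_t
coalesce_orphans(OrphanLock *locks, size_t n)
{
    size_t kept = 0;

    assert(locks != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        size_t j;

        for (j = 0; j < kept; j++)
        {
            if (locks[j].target == locks[i].target)
            {
                /* Same target: the higher finishedBefore covers the lower. */
                if (locks[i].finishedBefore > locks[j].finishedBefore)
                    locks[j].finishedBefore = locks[i].finishedBefore;
                break;
            }
        }
        if (j == kept)
            locks[kept++] = locks[i];   /* first lock seen on this target */
    }
    return kept;
}
```

The quadratic scan is only for brevity; with the locks hashed by target, the check would presumably happen when a committed transaction's lock is added to a target that already has an ownerless lock.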

There. That was surprisingly simple, I must be missing something.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)