
We are currently working on transactional SQL, and distributed deadlocks are
a serious problem for us. It looks like the current deadlock detection
mechanism has several deficiencies:
1) It transfers keys! A no-go for SQL, as we may have millions of keys.
2) By default we wait for a minute. Way too much IMO.

What if we change it as follows:
1) While obtaining a lock, collect the XIDs of all preceding transactions in
the current transaction object. This way we will always have the list of TXes
we are waiting for.
2) Define a TX deadlock coordinator node
3) Periodically (e.g. once per second), iterate over active transactions
and detect the ones that have been waiting for a lock for too long (e.g. >2-3
sec). Timeouts could be adaptive, depending on the workload and the
false-positive alarm rate.
4) Send info about those long-waiting TXes to the coordinator in the form
Map[XID -> List<XID>]
5) Rebuild the global wait-for graph on the coordinator and search for deadlocks
6) Choose the victim and send the problematic wait-for graph to it
7) The victim collects the necessary info (e.g. keys, SQL statements, thread
IDs, cache IDs, etc.) and throws an exception.
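
To make steps 4-6 more concrete, here is a rough Python sketch of the
coordinator side. It is only an illustration, not actual Ignite code: the
function names, the use of plain strings as XIDs, and the victim rule (pick
the largest XID, assuming XIDs sort by start order) are all assumptions made
for the example.

```python
from typing import Dict, List, Optional

# Assumption: an XID is modeled as a plain string for illustration only.
XID = str


def merge_reports(reports: List[Dict[XID, List[XID]]]) -> Dict[XID, List[XID]]:
    """Merge per-node Map[XID -> List<XID>] reports into one wait-for graph."""
    graph: Dict[XID, List[XID]] = {}
    for report in reports:
        for waiter, blockers in report.items():
            edges = graph.setdefault(waiter, [])
            for blocker in blockers:
                if blocker not in edges:  # skip duplicate edges
                    edges.append(blocker)
    return graph


def find_cycle(graph: Dict[XID, List[XID]]) -> Optional[List[XID]]:
    """Depth-first search; return one cycle as a list of XIDs, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, fully explored
    color: Dict[XID, int] = {}
    path: List[XID] = []

    def visit(tx: XID) -> Optional[List[XID]]:
        color[tx] = GREY
        path.append(tx)
        for blocker in graph.get(tx, []):
            state = color.get(blocker, WHITE)
            if state == GREY:
                # Back edge to a TX on the current path: that is a deadlock.
                return path[path.index(blocker):]
            if state == WHITE:
                found = visit(blocker)
                if found is not None:
                    return found
        path.pop()
        color[tx] = BLACK
        return None

    for tx in list(graph):
        if color.get(tx, WHITE) == WHITE:
            cycle = visit(tx)
            if cycle is not None:
                return cycle
    return None


def choose_victim(cycle: List[XID]) -> XID:
    """Hypothetical rule: abort the TX with the largest XID in the cycle."""
    return max(cycle)


# Three nodes report their long-waiting transactions to the coordinator.
reports = [
    {"tx1": ["tx2"]},
    {"tx2": ["tx3"], "tx4": ["tx1"]},
    {"tx3": ["tx1"]},
]
graph = merge_reports(reports)
cycle = find_cycle(graph)
if cycle is not None:
    print("deadlock:", cycle, "victim:", choose_victim(cycle))
```

Note that only XIDs travel over the wire here; keys and SQL statements are
gathered by the victim afterwards (step 7). The recursive search is fine for
a sketch, but a real implementation would probably want an iterative one so
that long wait chains cannot exhaust the stack.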

Advantages:
1) We ignore short transactions. So if there are tons of short TXes in a
typical OLTP workload, we will never touch many of them
2) Only a minimal set of data is sent between nodes, so we can exchange data
often without losing performance.
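
A rough Python sketch of the node-side filter behind these two points. Again,
this is an illustration only: the Tx class, the hook names and the fixed
2-second threshold are assumptions for the example, not existing Ignite APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class Tx:
    """Hypothetical local transaction record (not an actual Ignite class)."""
    xid: str
    waits_for: List[str] = field(default_factory=list)  # XIDs ahead of us
    wait_started: Optional[float] = None  # timestamp in seconds, None if not waiting


def on_lock_wait(tx: Tx, preceding_xids: Iterable[str], now: float) -> None:
    """Record the XIDs of the transactions ahead of us when we start waiting."""
    tx.waits_for = list(preceding_xids)
    if tx.wait_started is None:
        tx.wait_started = now


def on_lock_acquired(tx: Tx) -> None:
    """Clear the wait state once the lock has been obtained."""
    tx.waits_for = []
    tx.wait_started = None


def build_report(active: Iterable[Tx], now: float,
                 threshold_sec: float = 2.0) -> Dict[str, List[str]]:
    """Build Map[XID -> List<XID>] for TXes waiting longer than the threshold."""
    return {
        tx.xid: list(tx.waits_for)
        for tx in active
        if tx.wait_started is not None and now - tx.wait_started > threshold_sec
    }


# tx1 has been waiting for 0.5 sec, tx2 for 5 sec, tx3 is not waiting at all.
short_tx, long_tx, idle_tx = Tx("tx1"), Tx("tx2"), Tx("tx3")
on_lock_wait(short_tx, ["tx9"], now=9.5)
on_lock_wait(long_tx, ["tx1", "tx3"], now=5.0)
print(build_report([short_tx, long_tx, idle_tx], now=10.0))
```

Only tx2 ends up in the report, and the report contains nothing but XIDs, so
its size does not depend on how many keys the transactions touched.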


