[
https://issues.apache.org/jira/browse/IGNITE-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220558#comment-15220558
]
Andrey Gura edited comment on IGNITE-2854 at 4/1/16 7:04 PM:
-------------------------------------------------------------
Algorithm is changed to limit the amount of requested information on a per-key basis:
# When a {{GridDhtLockFuture}} times out, we run deadlock detection. As input we
have the near transaction ID and the pending keys that weren't locked by this
transaction.
# The deadlock detector maps the pending keys to primary nodes (on the first
step it is always the current node). As a result, the deadlock detector has a
set of candidates represented by pairs {{UUID -> List<IgniteTxKey>}}.
# For each candidate (if any exists) the deadlock detector sends a request to
the node identified by its {{UUID}}. The request contains the keys from the
candidate pair. If there is no candidate, the process finishes.
# The selected candidate is removed from the candidate set; the node and all its
keys are marked as processed.
# The node processes the request and returns all MVCC candidates that hold or
are waiting for the *passed keys*, plus all other keys involved in the
transactions associated with the found MVCC candidates.
# The deadlock detector builds a wait-for-graph (or updates it) and tries to
find a cycle in it, using the input transaction ID as the first vertex of the
graph.
# If a cycle is found, deadlock detection stops (a deadlock is found).
# If no cycle is found, the deadlock detector maps the obtained keys to primary
nodes and near nodes, and the candidate set is updated.
# The process continues from step 3.
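The cycle search in steps 6-7 can be sketched as a depth-first walk of the wait-for-graph starting from the timed-out transaction. The following is an illustrative stand-alone model only: it uses plain {{String}} transaction IDs in place of Ignite's internal transaction version objects, and is not the actual detector code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class WaitForGraph {
    // Edges: transaction -> transactions it waits for (hypothetical String IDs).
    private final Map<String, Set<String>> edges = new HashMap<>();

    void addEdge(String waiter, String holder) {
        edges.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
    }

    /** DFS from the timed-out transaction; true iff it sits on a cycle. */
    boolean hasCycleFrom(String start) {
        return dfs(start, start, new HashSet<>());
    }

    private boolean dfs(String current, String start, Set<String> visited) {
        for (String next : edges.getOrDefault(current, Collections.emptySet())) {
            if (next.equals(start))
                return true;                      // back at the first vertex: deadlock
            if (visited.add(next) && dfs(next, start, visited))
                return true;
        }
        return false;
    }

    public static void main(String[] args) {
        WaitForGraph g = new WaitForGraph();
        g.addEdge("tx1", "tx2");   // tx1 waits for a key held by tx2
        g.addEdge("tx2", "tx3");
        g.addEdge("tx3", "tx1");   // closes the cycle
        System.out.println(g.hasCycleFrom("tx1"));  // true
        System.out.println(g.hasCycleFrom("tx4"));  // false: tx4 has no edges
    }
}
```

Because the search is rooted at the input transaction ID, it reports only cycles that actually involve the timed-out transaction, which matches the "at most one deadlock per timed-out transaction" property below.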
Properties of this implementation:
* At most one deadlock will be found for a given timed-out transaction.
* A deadlock that caused a user transaction timeout will always be detected, if
one exists (step 6).
* Detection finishes as soon as possible, because a deadlock can be found after
each update of the wait-for-graph.
* Detection minimizes network utilisation (step 5).
The implementation requires test coverage for different cases:
* Different nodes starting the deadlocked transactions (all from one node
(client/server), all from different nodes (client/server), mixed)
* Different nodes starting the transaction with a timeout (server/client near
node, server/client non-near node)
* More than one cycle (waiting for each other or independent)
* Transitive transactions waiting for each other and eventually waiting for the
deadlocked transaction.
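The simplest case the tests must exercise, a two-transaction key-ordering deadlock, can be reproduced in miniature with plain JDK locks standing in for pessimistic cache locks. This is a hypothetical sketch using no Ignite APIs; the {{tryLock}} timeout plays the role of the lock future timeout that triggers detection.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class KeyOrderDeadlock {
    public static void main(String[] args) throws InterruptedException {
        ReentrantLock keyA = new ReentrantLock();  // stands in for a lock on key A
        ReentrantLock keyB = new ReentrantLock();  // stands in for a lock on key B
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        AtomicInteger timeouts = new AtomicInteger();

        // Two "transactions" acquiring the same keys in opposite order.
        Thread tx1 = new Thread(() -> acquire(keyA, keyB, bothHoldFirst, timeouts));
        Thread tx2 = new Thread(() -> acquire(keyB, keyA, bothHoldFirst, timeouts));
        tx1.start(); tx2.start();
        tx1.join(); tx2.join();

        // Both time out waiting on each other's key: a wait-for cycle of length 2.
        System.out.println("timed-out lock attempts: " + timeouts.get());
    }

    static void acquire(ReentrantLock first, ReentrantLock second,
                        CountDownLatch bothHoldFirst, AtomicInteger timeouts) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();               // ensure the opposite order is in flight
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                timeouts.incrementAndGet();      // here Ignite would start detection
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }
}
```

The latch guarantees both threads hold their first key before attempting the second, so both attempts time out deterministically rather than racing past each other.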
Problems to be solved:
* Deadlock detector behaviour in case of topology changes and transaction
remapping.
* Deadlock detector behaviour in case of a remote request failure.
> Need to implement deadlock detection
> ------------------------------------
>
> Key: IGNITE-2854
> URL: https://issues.apache.org/jira/browse/IGNITE-2854
> Project: Ignite
> Issue Type: New Feature
> Components: cache
> Affects Versions: 1.5.0.final
> Reporter: Valentin Kulichenko
> Assignee: Andrey Gura
> Fix For: 1.6
>
>
> Currently, if a transactional deadlock occurs, there is no easy way to find
> out which locks were reordered.
> We need to add a mechanism that will collect information about awaiting
> candidates, analyze it, and show the guilty keys. Most likely this should be
> implemented with the help of a custom discovery message.
> In addition, we should automatically execute this mechanism if a transaction
> times out and add the information to the timeout exception.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)