1) The essential problem lies in *IgniteTxAdapter#threadId*. Thread id is set when the transaction is created and is afterwards transferred between nodes by GridDistributedTx requests/responses when we perform put/get operations. When we suspend and resume a transaction, the thread id changes locally, but not on remote nodes.
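To make the mismatch concrete, here is a minimal sketch in plain Java. It is a toy model, not Ignite code: the Candidate class, the ownsByThreadId/ownsByVersion helpers and the UUID standing in for the tx version are all hypothetical names chosen for illustration. A candidate is recorded on one thread and then checked from a different thread, as would happen after resume() in another thread:

```java
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Toy model (assumption, not Ignite code) of a lock candidate kept by a remote node. */
public class ThreadIdSketch {
    /** Hypothetical stand-in for a lock candidate. */
    static final class Candidate {
        final long threadId;  // thread id captured when the tx was created
        final UUID txVersion; // stand-in for the unique tx version

        Candidate(long threadId, UUID txVersion) {
            this.threadId = threadId;
            this.txVersion = txVersion;
        }
    }

    /** Ownership check keyed by thread id. */
    static boolean ownsByThreadId(Candidate c, long currentThreadId) {
        return c.threadId == currentThreadId;
    }

    /** Ownership check keyed by tx version. */
    static boolean ownsByVersion(Candidate c, UUID txVersion) {
        return c.txVersion.equals(txVersion);
    }

    /**
     * Creates a candidate on the calling thread, then evaluates both checks
     * on another thread, imitating a tx resumed elsewhere.
     * Returns {ownedByThreadId, ownedByVersion}.
     */
    public static boolean[] resumeOnOtherThread() throws Exception {
        UUID ver = UUID.randomUUID();
        Candidate remote = new Candidate(Thread.currentThread().getId(), ver);

        ExecutorService exec = Executors.newSingleThreadExecutor();
        try {
            return exec.submit(() -> new boolean[] {
                ownsByThreadId(remote, Thread.currentThread().getId()),
                ownsByVersion(remote, ver)
            }).get();
        }
        finally {
            exec.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] r = resumeOnOtherThread();
        System.out.println("owned by thread id: " + r[0]);
        System.out.println("owned by version:   " + r[1]);
    }
}
```

In this model the thread id check fails on the resuming thread, because the stored id is stale, while the version check still succeeds.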
Actually, we don't need the local tx thread id on remote nodes. This brings us to the solution: remove thread id from prepare/finish requests/responses and don't use it remotely, but preserve it in lock requests, because cache lock uses thread id on remote nodes. These changes let us avoid sending the tx thread id to remote nodes, which eliminates the core problem.

Thread id is moved from the global IgniteTxAdapter to GridNearTxLocal, since only the *near local* transaction needs it. Now only candidates created by a near local tx have thread id defined (i.e. remote candidates have thread id undefined). Thread id in GridCacheTxFinishSync is replaced with a version check (so we don't need to store thread id or send it along with the finish response). When a pessimistic tx is resumed, its candidates are updated with the new thread id. The lock check in IgniteTxAdapter#ownsLock is now done against the version (not thread id, because the version is unique).

2) No new messages are needed for near lock requests, because thread id is preserved for cache lock requests. The other solution would still need to replace thread id with a version check in TxFinishSync. It would also need to change the cache lock code (replace thread id with a new tx counter), and it would still need to remove thread id from requests/responses.

If my design is OK, I will do benchmarking.

Tue, 13 Feb 2018 at 18:22, Nikolay Izhikov <nizhi...@apache.org>:

> Hello, Alexey.
>
> Could you please write a little more about your implementation:
>
> 1. Design of the implementation.
>
> 2. Advantages of your implementation compared with other approaches?
>
> 3. Transaction performance penalties/improvements?
>
> On Tue, 13/02/2018 at 14:17 +0000, ALEKSEY KUZNETSOV wrote:
> > Hi, Igniters!
> >
> > Currently we have context switching implemented for optimistic
> > transactions.
> >
> > The goal of the current ticket is to support transaction suspend()/resume()
> > operations for pessimistic transactions.
> >
> > The essential problem with them lies in *IgniteTxAdapter#threadId*.
> > Thread id is set when the transaction is created and is afterwards
> > transferred between nodes by GridDistributedTx requests/responses when
> > we perform put/get operations.
> > When we suspend and resume a transaction, the thread id changes locally,
> > but not on remote nodes.
> >
> > In the ticket I decided to partly remove thread id from the source and
> > introduced an *undefined* value for it, where its value must be ignored.
> > Another solution can be to replace thread id usage with some new global
> > transaction id counter.
> >
> > The former solution has advantages:
> > compatibility is preserved, step-by-step clear implementation, minimal
> > changes to explicit cache lock work (it still creates candidates with
> > not-null thread id) as opposed to the last solution.
> >
> > There are 3 possible solutions to the "thread id on remote nodes" issue:
> > 1) Change thread id on remote nodes once suspend()/resume() is called.
> > 2) Get rid of sending thread id to remote nodes.
> > 3) Don’t remove the field, just put -1 (undefined) in it.
> >
> > The last option was chosen, because it preserves compatibility in a
> > cluster with nodes of older versions.
> > Note that outside a transaction, when an explicit cache lock is
> > created, thread id is still set to a not-null value in the lock request
> > (i.e. GridNearLockRequest).
> >
> > Thread id is moved from the global IgniteTxAdapter to GridNearTxLocal,
> > since only the *near local* transaction needs it.
> > For instance, when a local candidate (either near local or dht local) is
> > created for GridNearTxLocal. Note that remote candidates are created with
> > thread id undefined, because it is useless for non-local candidates.
> > In IgniteTxAdapter#ownsLock thread id is replaced with a tx version check.
> > We can do this because near transactions have unique versions to check
> > against.
> >
> > In the tx synchronizer GridCacheTxFinishSync, thread id is replaced with
> > the tx version, so we don't need to store it or send it in
> > GridFinishResponse messages.
> > As a consequence, thread id is also removed from grid near finish/prepare
> > requests/responses.
> >
> > Also, thread id information is removed from deadlock messages (in
> > TxDeadlock, TxDeadlockDetection).
> >
> > Please, review it:
> >
> > ticket https://issues.apache.org/jira/browse/IGNITE-5714
> > pull request https://github.com/apache/ignite/pull/2789
> > review https://reviews.ignite.apache.org/ignite/review/IGNT-CR-364
> >
> >  : https://issues.apache.org/jira/browse/IGNITE-5712.

--
*Best Regards,*
*Kuznetsov Aleksey*