iAmClever opened a new issue, #6679:
URL: https://github.com/apache/incubator-seata/issues/6679

   ### Ⅰ. Issue Description
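   In TCC mode, with the MySQL transaction isolation level set to RR (REPEATABLE READ), if the prepare phase hangs and the rollback phase also hangs, the exception `Deadlock found when trying to get lock; try restarting transaction` can occur.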
   
   ### Ⅱ. Describe what happened
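   In TCC mode, with the MySQL transaction isolation level set to RR, suppose the prepare phase hangs and the rollback phase also hangs. The rollback method `org.apache.seata.rm.fence.SpringFenceHandler#rollbackFence` is retried, so once the rollback hang clears (while the prepare hang has not yet cleared, or prepare runs one step behind rollback), several requests may execute the rollback method at the same time. Each request opens its own local transaction, and each local transaction runs one `select ... for update` query. Because the prepare phase is still hanging, the `tcc_fence_log` table does not yet contain a fence record for this branch transaction. Since that record does not exist, the `select ... for update` query degrades from a row lock to a gap lock. Different transactions can hold a gap lock on the same range at the same time, so none of the rollback requests is blocked and they all go on to execute the `insert`. When inserting, each of them has to wait for the others' gap locks, and a deadlock results.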
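
The lock interaction described above can be illustrated with a small toy model. This is only a sketch, not InnoDB or Seata code: the class `GapLockDeadlockSketch` and its methods are hypothetical names, and the model assumes a single index gap in which gap locks are compatible with each other while an insert has to wait for every other transaction's gap lock.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of one index gap; NOT InnoDB's real lock manager. */
public class GapLockDeadlockSketch {

    public static final class Gap {
        // Transactions holding a gap lock here. Gap locks never conflict with each other.
        private final Set<String> gapHolders = new LinkedHashSet<>();
        // Wait-for graph: blocked transaction -> transactions it is waiting on.
        private final Map<String, Set<String>> waitsFor = new HashMap<>();

        /** Models SELECT ... FOR UPDATE on a missing row: always granted at once. */
        public boolean lockGap(String tx) {
            gapHolders.add(tx);
            return true;
        }

        /** Models INSERT into the gap: blocked by every OTHER holder's gap lock. */
        public boolean tryInsert(String tx) {
            Set<String> blockers = new LinkedHashSet<>(gapHolders);
            blockers.remove(tx);
            if (blockers.isEmpty()) {
                return true; // nobody else holds the gap, the insert proceeds
            }
            waitsFor.put(tx, blockers);
            return false; // the insert has to wait
        }

        /** True if tx can reach itself in the wait-for graph, i.e. it is deadlocked. */
        public boolean isDeadlocked(String tx) {
            Deque<String> stack = new ArrayDeque<>(edges(tx));
            Set<String> seen = new HashSet<>();
            while (!stack.isEmpty()) {
                String current = stack.pop();
                if (current.equals(tx)) {
                    return true;
                }
                if (seen.add(current)) {
                    stack.addAll(edges(current));
                }
            }
            return false;
        }

        private Set<String> edges(String tx) {
            Set<String> out = waitsFor.get(tx);
            return out == null ? Collections.<String>emptySet() : out;
        }
    }

    public static void main(String[] args) {
        Gap gap = new Gap();
        // Two concurrent rollback retries; the fence row does not exist yet.
        gap.lockGap("rollback-1");
        gap.lockGap("rollback-2");
        System.out.println("rollback-1 insert proceeds: " + gap.tryInsert("rollback-1"));
        System.out.println("rollback-2 insert proceeds: " + gap.tryInsert("rollback-2"));
        System.out.println("rollback-1 deadlocked: " + gap.isDeadlocked("rollback-1"));
    }
}
```

In this model both inserts block and each transaction waits on the other, which is the cycle that MySQL reports as a deadlock. With a single rollback transaction the insert proceeds immediately.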
   
   ```
   2024-07-15 21:57:09.942 ERROR 86408 --- [h_RMROLE_1_1_24] io.seata.rm.AbstractResourceManager      : rollback TCC resource error, resourceId: updateInventoryAcquire, xid: 10.244.137.109:8091:8944687993233979087.
   
   io.seata.common.exception.StoreException: Deadlock found when trying to get lock; try restarting transaction
        at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:133)
        at io.seata.rm.tcc.TCCFenceHandler.insertTCCFenceLog(TCCFenceHandler.java:235)
        at io.seata.rm.tcc.TCCFenceHandler.lambda$rollbackFence$2(TCCFenceHandler.java:193)
        at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
        at io.seata.rm.tcc.TCCFenceHandler.rollbackFence(TCCFenceHandler.java:187)
        at io.seata.rm.tcc.TCCResourceManager.branchRollback(TCCResourceManager.java:164)
        at io.seata.rm.AbstractRMHandler.doBranchRollback(AbstractRMHandler.java:125)
        at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:67)
        at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:63)
        at io.seata.core.exception.AbstractExceptionHandler.exceptionHandleTemplate(AbstractExceptionHandler.java:131)
        at io.seata.rm.AbstractRMHandler.handle(AbstractRMHandler.java:63)
        at io.seata.rm.DefaultRMHandler.handle(DefaultRMHandler.java:68)
        at io.seata.core.protocol.transaction.BranchRollbackRequest.handle(BranchRollbackRequest.java:35)
        at io.seata.rm.AbstractRMHandler.onRequest(AbstractRMHandler.java:150)
        at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.handleBranchRollback(RmBranchRollbackProcessor.java:63)
        at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.process(RmBranchRollbackProcessor.java:58)
        at io.seata.core.rpc.netty.AbstractNettyRemoting.lambda$processMessage$2(AbstractNettyRemoting.java:281)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)
   Caused by: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
        at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:123)
        at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:916)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1061)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1009)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeLargeUpdate(ClientPreparedStatement.java:1320)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdate(ClientPreparedStatement.java:994)
        at com.alibaba.druid.pool.DruidPooledPreparedStatement.executeUpdate(DruidPooledPreparedStatement.java:255)
        at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:128)
        ... 20 common frames omitted
   
   2024-07-15 21:57:09.942 ERROR 86408 --- [h_RMROLE_1_3_24] io.seata.rm.AbstractResourceManager      : rollback TCC resource error, resourceId: updateInventoryAcquire, xid: 10.244.137.109:8091:8944687993233979087.
   
   io.seata.common.exception.StoreException: Deadlock found when trying to get lock; try restarting transaction
        at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:133)
        at io.seata.rm.tcc.TCCFenceHandler.insertTCCFenceLog(TCCFenceHandler.java:235)
        at io.seata.rm.tcc.TCCFenceHandler.lambda$rollbackFence$2(TCCFenceHandler.java:193)
        at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
        at io.seata.rm.tcc.TCCFenceHandler.rollbackFence(TCCFenceHandler.java:187)
        at io.seata.rm.tcc.TCCResourceManager.branchRollback(TCCResourceManager.java:164)
        at io.seata.rm.AbstractRMHandler.doBranchRollback(AbstractRMHandler.java:125)
        at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:67)
        at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:63)
        at io.seata.core.exception.AbstractExceptionHandler.exceptionHandleTemplate(AbstractExceptionHandler.java:131)
        at io.seata.rm.AbstractRMHandler.handle(AbstractRMHandler.java:63)
        at io.seata.rm.DefaultRMHandler.handle(DefaultRMHandler.java:68)
        at io.seata.core.protocol.transaction.BranchRollbackRequest.handle(BranchRollbackRequest.java:35)
        at io.seata.rm.AbstractRMHandler.onRequest(AbstractRMHandler.java:150)
        at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.handleBranchRollback(RmBranchRollbackProcessor.java:63)
        at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.process(RmBranchRollbackProcessor.java:58)
        at io.seata.core.rpc.netty.AbstractNettyRemoting.lambda$processMessage$2(AbstractNettyRemoting.java:281)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)
   Caused by: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
        at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:123)
        at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:916)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1061)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1009)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeLargeUpdate(ClientPreparedStatement.java:1320)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdate(ClientPreparedStatement.java:994)
        at com.alibaba.druid.pool.DruidPooledPreparedStatement.executeUpdate(DruidPooledPreparedStatement.java:255)
        at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:128)
        ... 20 common frames omitted
   
   2024-07-15 21:57:09.943  INFO 86408 --- [h_RMROLE_1_3_24] io.seata.rm.AbstractRMHandler            : Branch Rollbacked result: PhaseTwo_RollbackFailed_Retryable
   2024-07-15 21:57:09.943  INFO 86408 --- [h_RMROLE_1_1_24] io.seata.rm.AbstractRMHandler            : Branch Rollbacked result: PhaseTwo_RollbackFailed_Retryable
   2024-07-15 21:57:09.952 ERROR 86408 --- [h_RMROLE_1_5_24] io.seata.rm.AbstractResourceManager      : rollback TCC resource error, resourceId: updateInventoryAcquire, xid: 10.244.137.109:8091:8944687993233979087.
   
   io.seata.common.exception.StoreException: Deadlock found when trying to get lock; try restarting transaction
        at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:133)
        at io.seata.rm.tcc.TCCFenceHandler.insertTCCFenceLog(TCCFenceHandler.java:235)
        at io.seata.rm.tcc.TCCFenceHandler.lambda$rollbackFence$2(TCCFenceHandler.java:193)
        at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
        at io.seata.rm.tcc.TCCFenceHandler.rollbackFence(TCCFenceHandler.java:187)
        at io.seata.rm.tcc.TCCResourceManager.branchRollback(TCCResourceManager.java:164)
        at io.seata.rm.AbstractRMHandler.doBranchRollback(AbstractRMHandler.java:125)
        at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:67)
        at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:63)
        at io.seata.core.exception.AbstractExceptionHandler.exceptionHandleTemplate(AbstractExceptionHandler.java:131)
        at io.seata.rm.AbstractRMHandler.handle(AbstractRMHandler.java:63)
        at io.seata.rm.DefaultRMHandler.handle(DefaultRMHandler.java:68)
        at io.seata.core.protocol.transaction.BranchRollbackRequest.handle(BranchRollbackRequest.java:35)
        at io.seata.rm.AbstractRMHandler.onRequest(AbstractRMHandler.java:150)
        at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.handleBranchRollback(RmBranchRollbackProcessor.java:63)
        at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.process(RmBranchRollbackProcessor.java:58)
        at io.seata.core.rpc.netty.AbstractNettyRemoting.lambda$processMessage$2(AbstractNettyRemoting.java:281)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)
   Caused by: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
        at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:123)
        at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:916)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1061)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1009)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeLargeUpdate(ClientPreparedStatement.java:1320)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdate(ClientPreparedStatement.java:994)
        at com.alibaba.druid.pool.DruidPooledPreparedStatement.executeUpdate(DruidPooledPreparedStatement.java:255)
        at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:128)
        ... 20 common frames omitted
   ```
   
   ### Ⅲ. Describe what you expected to happen
   When multiple rollback requests execute concurrently and conflict, a `duplicate key exception` should be raised instead of a deadlock.
   
   
   ### Ⅳ. Anything else we need to know?
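   I have four ideas (options):
   1. In the rollback method, swap the order of the `select ... for update` and `insert` operations: run the insert first and the `for update` query second, which avoids the deadlock caused by gap locks. (However, the insert of the tcc fence record normally happens in the prepare phase; cases where a prepare hang pushes that insert into the rollback method are rare. Swapping the two operations would make every rollback execute two SQL operations, which lowers performance, so this option is not recommended.)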
   
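   2. Use middleware such as Redis to implement a distributed lock, and require the `prepareFence`, `commitFence`, and `rollbackFence` operations of `org.apache.seata.rm.fence.SpringFenceHandler` to acquire that lock before they run. This also avoids the deadlock. (However, each of the three operations would then need two network I/O operations, which also lowers performance, so this option is not recommended either.)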
   
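   3. The deadlock is caused by gap locks under the RR level. If the transaction isolation level is lowered to RC (READ COMMITTED), the locking read no longer takes a gap lock, so the deadlock cannot occur. This leads to the following option: the DB operations of `prepareFence`, `commitFence`, and `rollbackFence` in `org.apache.seata.rm.fence.SpringFenceHandler` are executed through `org.springframework.transaction.support.TransactionTemplate#execute`, so only these three methods need to change. While they run, temporarily switch the isolation level of the `TransactionTemplate` to RC, and switch back to the default isolation level once they finish.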
   
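   4. (Recommended) The deadlock is caused by gap locks under the RR level. If the transaction isolation level is lowered to RC, the locking read no longer takes a gap lock, so the deadlock cannot occur. This leads to the following option: add an isolation level property `isolationLevel` to the `io.seata.rm.tcc.config.TCCFenceConfig` configuration, so that users can customize the tccFence transaction isolation level through `seata.tcc.fence.isolationLevel`. In `org.apache.seata.rm.fence.SpringFenceConfig#afterPropertiesSet`, check the property: if the user has not set a custom isolation level, keep the default one; if the user has set one, replace the isolation level of the `TransactionTemplate` with the custom value. This extension point lets users resolve the deadlock under the RR level.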
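
A minimal sketch of how option 4 might resolve the configured value is shown below. It is an illustration under stated assumptions, not Seata code: the class `FenceIsolationLevelSketch` and the method `resolve` are hypothetical, and the accepted names are an assumption about what the proposed `seata.tcc.fence.isolationLevel` property could take. The returned integers are the `java.sql.Connection` isolation constants, which Spring's `TransactionDefinition` isolation constants share, with `-1` standing for the default.

```java
import java.sql.Connection;
import java.util.Locale;

/** Hypothetical helper: maps a configured isolation level name to its integer constant. */
public class FenceIsolationLevelSketch {

    /** "Use the data source default"; same value as Spring's TransactionDefinition.ISOLATION_DEFAULT. */
    public static final int ISOLATION_DEFAULT = -1;

    /**
     * @param configured value of the proposed property, or null/blank when the user set nothing
     * @return the isolation level to apply, e.g. via TransactionTemplate#setIsolationLevel(int)
     */
    public static int resolve(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            // No custom level: keep the default isolation level.
            return ISOLATION_DEFAULT;
        }
        switch (configured.trim().toUpperCase(Locale.ROOT)) {
            case "READ_UNCOMMITTED":
                return Connection.TRANSACTION_READ_UNCOMMITTED;
            case "READ_COMMITTED":
                return Connection.TRANSACTION_READ_COMMITTED;
            case "REPEATABLE_READ":
                return Connection.TRANSACTION_REPEATABLE_READ;
            case "SERIALIZABLE":
                return Connection.TRANSACTION_SERIALIZABLE;
            default:
                throw new IllegalArgumentException("Unknown isolation level: " + configured);
        }
    }

    public static void main(String[] args) {
        System.out.println("unset          -> " + resolve(null));
        System.out.println("READ_COMMITTED -> " + resolve("READ_COMMITTED"));
    }
}
```

Failing fast on an unknown name is a design choice of this sketch; falling back to the default level would be another reasonable behavior.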
   
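   With the change from option 3 or option 4 applied, when the prepare phase hangs and the rollback phase also hangs, the error is as follows and the deadlock is avoided: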
   ```
   2024-07-15 23:28:58.090 ERROR 59548 --- [h_RMROLE_1_4_24] io.seata.rm.AbstractResourceManager      : rollback TCC resource error, resourceId: updateInventoryAcquire, xid: 10.244.137.109:8091:8944687993233979184.
   
   io.seata.rm.tcc.exception.TCCFenceException: Insert tcc fence record duplicate key exception. xid= 10.244.137.109:8091:8944687993233979184, branchId= 8944687993233979186
        at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:130)
        at io.seata.rm.tcc.TCCFenceHandler.insertTCCFenceLog(TCCFenceHandler.java:235)
        at io.seata.rm.tcc.TCCFenceHandler.lambda$rollbackFence$2(TCCFenceHandler.java:193)
        at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
        at io.seata.rm.tcc.TCCFenceHandler.rollbackFence(TCCFenceHandler.java:187)
        at io.seata.rm.tcc.TCCResourceManager.branchRollback(TCCResourceManager.java:164)
        at io.seata.rm.AbstractRMHandler.doBranchRollback(AbstractRMHandler.java:125)
        at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:67)
        at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:63)
        at io.seata.core.exception.AbstractExceptionHandler.exceptionHandleTemplate(AbstractExceptionHandler.java:131)
        at io.seata.rm.AbstractRMHandler.handle(AbstractRMHandler.java:63)
        at io.seata.rm.DefaultRMHandler.handle(DefaultRMHandler.java:68)
        at io.seata.core.protocol.transaction.BranchRollbackRequest.handle(BranchRollbackRequest.java:35)
        at io.seata.rm.AbstractRMHandler.onRequest(AbstractRMHandler.java:150)
        at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.handleBranchRollback(RmBranchRollbackProcessor.java:63)
        at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.process(RmBranchRollbackProcessor.java:58)
        at io.seata.core.rpc.netty.AbstractNettyRemoting.lambda$processMessage$2(AbstractNettyRemoting.java:281)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)
   
   2024-07-15 23:28:58.090 ERROR 59548 --- [h_RMROLE_1_2_24] io.seata.rm.AbstractResourceManager      : rollback TCC resource error, resourceId: updateInventoryAcquire, xid: 10.244.137.109:8091:8944687993233979184.
   
   io.seata.rm.tcc.exception.TCCFenceException: Insert tcc fence record duplicate key exception. xid= 10.244.137.109:8091:8944687993233979184, branchId= 8944687993233979186
        at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:130)
        at io.seata.rm.tcc.TCCFenceHandler.insertTCCFenceLog(TCCFenceHandler.java:235)
        at io.seata.rm.tcc.TCCFenceHandler.lambda$rollbackFence$2(TCCFenceHandler.java:193)
        at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
        at io.seata.rm.tcc.TCCFenceHandler.rollbackFence(TCCFenceHandler.java:187)
        at io.seata.rm.tcc.TCCResourceManager.branchRollback(TCCResourceManager.java:164)
        at io.seata.rm.AbstractRMHandler.doBranchRollback(AbstractRMHandler.java:125)
        at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:67)
        at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:63)
        at io.seata.core.exception.AbstractExceptionHandler.exceptionHandleTemplate(AbstractExceptionHandler.java:131)
        at io.seata.rm.AbstractRMHandler.handle(AbstractRMHandler.java:63)
        at io.seata.rm.DefaultRMHandler.handle(DefaultRMHandler.java:68)
        at io.seata.core.protocol.transaction.BranchRollbackRequest.handle(BranchRollbackRequest.java:35)
        at io.seata.rm.AbstractRMHandler.onRequest(AbstractRMHandler.java:150)
        at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.handleBranchRollback(RmBranchRollbackProcessor.java:63)
        at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.process(RmBranchRollbackProcessor.java:58)
        at io.seata.core.rpc.netty.AbstractNettyRemoting.lambda$processMessage$2(AbstractNettyRemoting.java:281)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)
   ```
   
   
   
   If this is confirmed to be a bug or an optimization, I can try to submit a PR with the change.
   
   ### Ⅵ. Environment:
   JDK version(e.g. java -version): 11
   Seata client/server version: 1.8.0
   Database version: 8.0.29


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@seata.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

