[ 
https://issues.apache.org/jira/browse/IGNITE-25538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Plekhanov updated IGNITE-25538:
---------------------------------------
    Ignite Flags: Release Notes Required  (was: Docs Required, Release Notes Required)

> ROLLED_BACK transactions are not removed from active transactions list 
> -----------------------------------------------------------------------
>
>                 Key: IGNITE-25538
>                 URL: https://issues.apache.org/jira/browse/IGNITE-25538
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Mikhail Petrov
>            Assignee: Aleksey Plekhanov
>            Priority: Minor
>              Labels: ise
>             Fix For: 2.18
>
>
> User can observe the following output of `control.sh tx` command:
> {code:java}
> Matching transactions:
> TcpDiscoveryNode [id=34fd49ed-c325-4a93-a32c-3726c1c19130, 
> addrs=[10.19.138.119], order=3, ver=16.1.3#20241226-sha1:900bfa69, 
> isClient=false, consistentId=epk_rb_si_pplad-pprbrbepk0071.ca.sbrf.ru]
> Tx: [xid=0a2e8e50791-00000000-156e-2f01-0000-000000000013, 
> label=UcpSearchServiceDecorator.searchByClientId, state=ROLLED_BACK, 
> startTime=2025-05-26 23:53:58.515, duration=224437 sec, 
> isolation=READ_COMMITTED, concurrency=PESSIMISTIC, topVer=N/A, timeout=0 sec, 
> size=0, dhtNodes=[], 
> nearXid=0a2e8e50791-00000000-156e-2f01-0000-000000000013, 
> parentNodeIds=[86cc9e5e]]
> Tx: [xid=087e3040791-00000000-156e-2f01-0000-000000000030, 
> label=bs-ucp-4g-update-service, state=ROLLED_BACK, startTime=2025-05-25 
> 23:45:45.961, duration=311329 sec, isolation=READ_COMMITTED, 
> concurrency=PESSIMISTIC, topVer=N/A, timeout=0 sec, size=0, dhtNodes=[], 
> nearXid=087e3040791-00000000-156e-2f01-0000-000000000030, 
> parentNodeIds=[60400a24]]
> Tx: [xid=0e60d620791-00000000-156e-2f01-0000-000000000035, 
> label=CloudClientSearchService.byCriteria, state=ROLLED_BACK, 
> startTime=2025-05-24 23:49:05.016, duration=397530 sec, 
> isolation=READ_COMMITTED, concurrency=PESSIMISTIC, topVer=N/A, timeout=0 sec, 
> size=0, dhtNodes=[], 
> nearXid=0e60d620791-00000000-156e-2f01-0000-000000000035, 
> parentNodeIds=[448e854c]]
> TcpDiscoveryNode [id=9f11128e-c5a2-4700-af6b-c4777edfa31b, 
> addrs=[10.19.138.75], order=54, ver=16.1.3#20241226-sha1:900bfa69, 
> isClient=false, consistentId=epk_rb_si_pplad-pprbrbepk0025.ca.sbrf.ru]
> Command [TX] finished with code: 0
> {code}
> From the user's perspective, this output can be interpreted as a bunch of 
> LRTs (long-running transactions). Moreover, these transactions cannot be 
> killed through the `control.sh --kill` command and remain in the active 
> transactions list until the node is rebooted.
> It is worth mentioning that the described problem does not reproduce for every 
> rolled-back transaction, only for some, under certain conditions.
> Reproducer:
> 1. Start a server node.
> 2. Start a tx through a thin client with a timeout.
> 3. Inject a sleep into IgniteTxManager#onCreated after the isCompleted check, 
> with a value greater than the tx timeout. Such a delay can occur in practice if 
> the thread that started the transaction is switched out by the scheduler.
> 4. Wait for the tx to complete with a timeout error.
> As a result, the transaction is rolled back by the timeout worker and is then 
> stored in the active transactions map by the IgniteTxManager#onCreated method.
> The "hanging" transactions in the ROLLED_BACK state described above do not hold 
> any data key locks and do not affect PME in any way. 
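
The reproducer describes a check-then-act race: the completion check passes, the timeout worker rolls the transaction back, and only then is the transaction put into the active map. The sketch below is a toy model of that ordering, not Ignite's actual code. The names Tx, register, rollbackByTimeout and leakedAfterRace are hypothetical, and the Runnable stands in for the injected sleep. The re-check after the put is only one possible mitigation and is not necessarily the fix applied for this issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy model of the check-then-act race; NOT Ignite's actual code. */
class TxRegistrationRaceSketch {
    /** Hypothetical stand-in for a transaction with a completion flag. */
    static final class Tx {
        final String xid;
        private final AtomicBoolean completed = new AtomicBoolean();

        Tx(String xid) { this.xid = xid; }

        boolean isCompleted() { return completed.get(); }
    }

    /** Marks the tx completed and removes it from the map, as a timeout worker might. */
    static void rollbackByTimeout(Map<String, Tx> active, Tx tx) {
        tx.completed.set(true);
        active.remove(tx.xid); // No-op if the tx has not been registered yet.
    }

    /**
     * Registers the tx in the active map unless it is already completed.
     * The pause runs between the check and the put, modelling the injected sleep.
     */
    static void register(Map<String, Tx> active, Tx tx, Runnable pause, boolean recheck) {
        if (tx.isCompleted())
            return;

        pause.run(); // The thread is "descheduled" here; the timeout may fire.

        active.put(tx.xid, tx);

        // Assumed mitigation: re-check after the put and undo a stale registration.
        if (recheck && tx.isCompleted())
            active.remove(tx.xid, tx);
    }

    /** Runs the race once and returns how many entries remain in the active map. */
    static int leakedAfterRace(boolean recheck) {
        Map<String, Tx> active = new ConcurrentHashMap<>();
        Tx tx = new Tx("tx-1");

        register(active, tx, () -> rollbackByTimeout(active, tx), recheck);

        return active.size();
    }

    public static void main(String[] args) {
        System.out.println("leaked without re-check: " + leakedAfterRace(false)); // 1
        System.out.println("leaked with re-check: " + leakedAfterRace(true));     // 0
    }
}
```

Without the re-check, the completed transaction stays in the map because the worker's removal ran before the put, which matches the ROLLED_BACK entries shown in the `control.sh tx` output above.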



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
