[ 
https://issues.apache.org/jira/browse/IGNITE-22005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikita Sivkov updated IGNITE-22005:
-----------------------------------
    Description: 
*Steps to reproduce:*

Perform a long (about 2 hours) load test with a balance transfer scenario (see 
scenario pseudo code in attachments).

*Expected result:*

No errors happen.

*Actual result:*

Get error in server logs - {{Failed to process replica request}}
{code:java}
2024-04-05 17:50:55:802 +0300 
[WARNING][%poc-tester-SERVER-192.168.1.97-id-0%JRaft-AppendEntries-Processor-2][NodeImpl]
 Node <193_part_15/poc-tester-SERVER-192.168.1.97-id-0> is not in active state, 
currTerm=2.
2024-04-05 17:50:55:805 +0300 
[WARNING][%poc-tester-SERVER-192.168.1.97-id-0%Raft-Group-Client-19][ReplicaManager]
 Failed to process replica request [request=TxFinishReplicaRequestImpl 
[commit=false, commitTimestampLong=0, 
enlistmentConsistencyToken=112218720633356321, groupId=123_part_21, 
groups=HashMap {141_part_13=poc-tester-SERVER-192.168.1.27-id-0, 
139_part_9=poc-tester-SERVER-192.168.1.97-id-0, 
193_part_3=poc-tester-SERVER-192.168.1.27-id-0, 
19_part_23=poc-tester-SERVER-192.168.1.27-id-0, 
117_part_17=poc-tester-SERVER-192.168.1.18-id-0, 
45_part_9=poc-tester-SERVER-192.168.1.18-id-0, 
39_part_3=poc-tester-SERVER-192.168.1.18-id-0, 
77_part_4=poc-tester-SERVER-192.168.1.18-id-0, 
105_part_4=poc-tester-SERVER-192.168.1.18-id-0, 
123_part_21=poc-tester-SERVER-192.168.1.97-id-0, 
103_part_9=poc-tester-SERVER-192.168.1.18-id-0, 
161_part_15=poc-tester-SERVER-192.168.1.27-id-0, 
103_part_22=poc-tester-SERVER-192.168.1.27-id-0, 
89_part_10=poc-tester-SERVER-192.168.1.18-id-0, 
39_part_19=poc-tester-SERVER-192.168.1.27-id-0, 
149_part_13=poc-tester-SERVER-192.168.1.27-id-0, 
97_part_24=poc-tester-SERVER-192.168.1.97-id-0, 
83_part_9=poc-tester-SERVER-192.168.1.27-id-0, 
209_part_10=poc-tester-SERVER-192.168.1.27-id-0, 
185_part_5=poc-tester-SERVER-192.168.1.18-id-0, 
117_part_9=poc-tester-SERVER-192.168.1.27-id-0, 
105_part_22=poc-tester-SERVER-192.168.1.18-id-0}, 
timestampLong=112219170129903617, txId=018eaebd-88ba-0001-606d-622500000001]].
java.util.concurrent.CompletionException: 
org.apache.ignite.tx.TransactionException: IGN-TX-7 
TraceId:cb1577e6-ec35-47f0-ab7d-56a0687344ed 
java.util.concurrent.TimeoutException
    at 
java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
    at 
java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
    at 
java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:932)
    at 
java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
    at 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
    at 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
    at 
org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$applyCmdWithRetryOnSafeTimeReorderException$126(PartitionReplicaListener.java:2806)
    at 
java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
    at 
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
    at 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
    at 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
    at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:550)
    at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$handleErrorResponse$44(RaftGroupServiceImpl.java:653)
    at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at 
java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.ignite.tx.TransactionException: IGN-TX-7 
TraceId:cb1577e6-ec35-47f0-ab7d-56a0687344ed 
java.util.concurrent.TimeoutException
    at 
org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$finishTransaction$70(PartitionReplicaListener.java:1867)
    at 
java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
    ... 16 more
Caused by: java.util.concurrent.CompletionException: 
java.util.concurrent.TimeoutException
    at 
java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
    at 
java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
    at 
java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632)
    ... 10 more
Caused by: java.util.concurrent.TimeoutException
    ... 8 more
2024-04-05 17:50:55:807 +0300 
[WARNING][%poc-tester-SERVER-192.168.1.97-id-0%JRaft-Request-Processor-42][NodeImpl]
 Node <35_part_20/poc-tester-SERVER-192.168.1.97-id-0> is not in active state, 
currTerm=4.{code}

  was:
*Steps to reproduce:*

Perform a long (about 2 hours) load test with a balance transfer scenario (see 
scenario pseudo code in attachments).

*Expected result:*

No errors happen.

*Actual result:*

Get error in server logs - {{Failed to process delayed response}}
{code:java}
2024-04-05 17:50:50:776 +0300 
[WARNING][%poc-tester-SERVER-192.168.1.97-id-0%JRaft-Request-Processor-1][NodeImpl]
 Node <27_part_23/poc-tester-SERVER-192.168.1.97-id-0> is not in active state, 
currTerm=2.
2024-04-05 17:50:50:778 +0300 
[WARNING][%poc-tester-SERVER-192.168.1.97-id-0%Raft-Group-Client-5][ReplicaManager]
 Failed to process delayed response 
[request=ReadWriteSingleRowReplicaRequestImpl 
[commitPartitionId=TablePartitionIdMessageImpl [partitionId=21, tableId=123], 
coordinatorId=3de6f999-7ab9-4405-aff0-ee0c7e4886ce, 
enlistmentConsistencyToken=112218720633356321, full=false, groupId=123_part_21, 
requestType=RW_UPSERT, schemaVersion=1, timestampLong=112219169796915211, 
transactionId=018eaebd-88ba-0001-606d-622500000001]]
java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
    at 
java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
    at 
java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
    at 
java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632)
    at 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
    at 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
    at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:550)
    at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$handleErrorResponse$44(RaftGroupServiceImpl.java:653)
    at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at 
java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.util.concurrent.TimeoutException
    ... 8 more
2024-04-05 17:50:50:780 +0300 
[WARNING][%poc-tester-SERVER-192.168.1.97-id-0%JRaft-Request-Processor-27][NodeImpl]
 Node <99_part_6/poc-tester-SERVER-192.168.1.97-id-0> is not in active state, 
currTerm=3. {code}


> 'Failed to process replica request' error under load with balance transfer 
> scenario
> -----------------------------------------------------------------------------------
>
>                 Key: IGNITE-22005
>                 URL: https://issues.apache.org/jira/browse/IGNITE-22005
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 3.0.0-beta2
>         Environment: Cluster of 3 nodes
>            Reporter: Nikita Sivkov
>            Priority: Major
>              Labels: ignite-3
>         Attachments: transfer_ign3.yaml
>
>
> *Steps to reproduce:*
> Perform a long (about 2 hours) load test with a balance transfer scenario 
> (see scenario pseudo code in attachments).
> *Expected result:*
> No errors happen.
> *Actual result:*
> Get error in server logs - {{Failed to process replica request}}
> {code:java}
> 2024-04-05 17:50:55:802 +0300 
> [WARNING][%poc-tester-SERVER-192.168.1.97-id-0%JRaft-AppendEntries-Processor-2][NodeImpl]
>  Node <193_part_15/poc-tester-SERVER-192.168.1.97-id-0> is not in active 
> state, currTerm=2.
> 2024-04-05 17:50:55:805 +0300 
> [WARNING][%poc-tester-SERVER-192.168.1.97-id-0%Raft-Group-Client-19][ReplicaManager]
>  Failed to process replica request [request=TxFinishReplicaRequestImpl 
> [commit=false, commitTimestampLong=0, 
> enlistmentConsistencyToken=112218720633356321, groupId=123_part_21, 
> groups=HashMap {141_part_13=poc-tester-SERVER-192.168.1.27-id-0, 
> 139_part_9=poc-tester-SERVER-192.168.1.97-id-0, 
> 193_part_3=poc-tester-SERVER-192.168.1.27-id-0, 
> 19_part_23=poc-tester-SERVER-192.168.1.27-id-0, 
> 117_part_17=poc-tester-SERVER-192.168.1.18-id-0, 
> 45_part_9=poc-tester-SERVER-192.168.1.18-id-0, 
> 39_part_3=poc-tester-SERVER-192.168.1.18-id-0, 
> 77_part_4=poc-tester-SERVER-192.168.1.18-id-0, 
> 105_part_4=poc-tester-SERVER-192.168.1.18-id-0, 
> 123_part_21=poc-tester-SERVER-192.168.1.97-id-0, 
> 103_part_9=poc-tester-SERVER-192.168.1.18-id-0, 
> 161_part_15=poc-tester-SERVER-192.168.1.27-id-0, 
> 103_part_22=poc-tester-SERVER-192.168.1.27-id-0, 
> 89_part_10=poc-tester-SERVER-192.168.1.18-id-0, 
> 39_part_19=poc-tester-SERVER-192.168.1.27-id-0, 
> 149_part_13=poc-tester-SERVER-192.168.1.27-id-0, 
> 97_part_24=poc-tester-SERVER-192.168.1.97-id-0, 
> 83_part_9=poc-tester-SERVER-192.168.1.27-id-0, 
> 209_part_10=poc-tester-SERVER-192.168.1.27-id-0, 
> 185_part_5=poc-tester-SERVER-192.168.1.18-id-0, 
> 117_part_9=poc-tester-SERVER-192.168.1.27-id-0, 
> 105_part_22=poc-tester-SERVER-192.168.1.18-id-0}, 
> timestampLong=112219170129903617, txId=018eaebd-88ba-0001-606d-622500000001]].
> java.util.concurrent.CompletionException: 
> org.apache.ignite.tx.TransactionException: IGN-TX-7 
> TraceId:cb1577e6-ec35-47f0-ab7d-56a0687344ed 
> java.util.concurrent.TimeoutException
>     at 
> java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
>     at 
> java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:932)
>     at 
> java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
>     at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>     at 
> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>     at 
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$applyCmdWithRetryOnSafeTimeReorderException$126(PartitionReplicaListener.java:2806)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
>     at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
>     at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>     at 
> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>     at 
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:550)
>     at 
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$handleErrorResponse$44(RaftGroupServiceImpl.java:653)
>     at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: org.apache.ignite.tx.TransactionException: IGN-TX-7 
> TraceId:cb1577e6-ec35-47f0-ab7d-56a0687344ed 
> java.util.concurrent.TimeoutException
>     at 
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$finishTransaction$70(PartitionReplicaListener.java:1867)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
>     ... 16 more
> Caused by: java.util.concurrent.CompletionException: 
> java.util.concurrent.TimeoutException
>     at 
> java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
>     at 
> java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
>     at 
> java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632)
>     ... 10 more
> Caused by: java.util.concurrent.TimeoutException
>     ... 8 more
> 2024-04-05 17:50:55:807 +0300 
> [WARNING][%poc-tester-SERVER-192.168.1.97-id-0%JRaft-Request-Processor-42][NodeImpl]
>  Node <35_part_20/poc-tester-SERVER-192.168.1.97-id-0> is not in active 
> state, currTerm=4.{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to