[
https://issues.apache.org/jira/browse/IGNITE-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vyacheslav Koptilin updated IGNITE-24342:
-----------------------------------------
Labels: ignite-3 (was: )
> [Flaky] Cannot reliably start 3-nodes cluster on a single Windows machine
> -------------------------------------------------------------------------
>
> Key: IGNITE-24342
> URL: https://issues.apache.org/jira/browse/IGNITE-24342
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 3.0
> Environment: A single Windows 10 machine with 32 GB of RAM
> Reporter: Andrey Khitrin
> Priority: Major
> Labels: ignite-3
> Attachments: logs.tgz
>
>
> This issue does not reproduce on every run, but it occurs often enough to
> observe.
> How to reproduce:
> # Try to start 3 Apache Ignite nodes with a static `nodeFinder` on a single
> machine (configs are attached):
> {code:java}
> nodeFinder {
>   netClusterNodes=[
>     "127.0.0.1:3344",
>     "127.0.0.1:3345",
>     "127.0.0.1:3346"
>   ]
>   type=STATIC
> }
> {code}
> Expected result: all nodes are up.
> Actual result: 2 of 3 nodes terminate with thread dumps, and the cluster
> cannot be initialized.
> Key exceptions in logs:
> # "IllegalStateException: cannot send more responses than requests" (see
> attachment)
> # Various RAFT-related and timeout errors:
> {code:java}
> 2025-01-28 06:03:05:471 -0600
> [ERROR][%TablesAmountCapacityMultiNodeTest_cluster_1%JRaft-Response-Processor-8][AbstractClientService]
> Fail to connect TablesAmountCapacityMultiNodeTest_cluster_0, exception:
> java.util.concurrent.TimeoutException.
> 2025-01-28 06:03:05:815 -0600
> [INFO][%TablesAmountCapacityMultiNodeTest_cluster_1%JRaft-Request-Processor-24][NodeImpl]
> Node <cmg_group/TablesAmountCapacityMultiNodeTest_cluster_1> ignore
> PreVoteRequest from TablesAmountCapacityMultiNodeTest_cluster_0, term=2,
> currTerm=1, because the leader TablesAmountCapacityMultiNodeTest_cluster_1's
> lease is still valid.
> 2025-01-28 06:03:05:815 -0600
> [ERROR][%TablesAmountCapacityMultiNodeTest_cluster_1%JRaft-Response-Processor-8][ReplicatorGroupImpl]
> Fail to check replicator connection to
> peer=TablesAmountCapacityMultiNodeTest_cluster_0, replicatorType=Follower.
> 2025-01-28 06:03:05:836 -0600
> [ERROR][%TablesAmountCapacityMultiNodeTest_cluster_1%JRaft-Response-Processor-8][NodeImpl]
> Fail to add a replicator, peer=TablesAmountCapacityMultiNodeTest_cluster_0.
> {code}
> # Thread dumps in logs for 2 of 3 nodes (see attachment)
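
For triaging the attached logs, a small script can group entries by level and logger so that the RAFT and timeout errors stand out per node. The following is only a sketch: it assumes entries follow the `timestamp [LEVEL][%node%thread][Logger] message` layout visible in the excerpt above, and the function names (`parse_entries`, `count_by_level_and_logger`) and the `SAMPLE` constant are hypothetical helpers, not part of Ignite. Continuation lines (wrapped text or stack traces) are folded into the preceding entry's message.

```python
import re
from collections import Counter

# An entry starts with a timestamp such as "2025-01-28 06:03:05:471 -0600".
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}:\d{3} [+-]\d{4}")

# Full entry, after wrapped lines have been joined with single spaces.
# Assumed layout: timestamp [LEVEL][%node%thread][Logger] message
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}:\d{3} [+-]\d{4})\s*"
    r"\[(?P<level>[A-Z]+)\]"
    r"\[%(?P<node>[^%]+)%(?P<thread>[^\]]+)\]"
    r"\[(?P<logger>[^\]]+)\]\s*"
    r"(?P<message>.*)$"
)


def parse_entries(text):
    """Split raw log text into entries and return one dict per parsed entry."""
    chunks, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # A new timestamp closes the entry collected so far.
        if ENTRY_START.match(line) and current:
            chunks.append(" ".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append(" ".join(current))

    parsed = []
    for chunk in chunks:
        match = ENTRY.match(chunk)
        if match:  # text that does not fit the assumed layout is skipped
            parsed.append(match.groupdict())
    return parsed


def count_by_level_and_logger(entries):
    """Count entries per (level, logger) pair."""
    return Counter((e["level"], e["logger"]) for e in entries)


# Two entries copied from the issue description, wrapped as in the email.
SAMPLE = """\
2025-01-28 06:03:05:471 -0600
[ERROR][%TablesAmountCapacityMultiNodeTest_cluster_1%JRaft-Response-Processor-8][AbstractClientService]
Fail to connect TablesAmountCapacityMultiNodeTest_cluster_0, exception:
java.util.concurrent.TimeoutException.
2025-01-28 06:03:05:815 -0600
[INFO][%TablesAmountCapacityMultiNodeTest_cluster_1%JRaft-Request-Processor-24][NodeImpl]
Node <cmg_group/TablesAmountCapacityMultiNodeTest_cluster_1> ignore
PreVoteRequest from TablesAmountCapacityMultiNodeTest_cluster_0, term=2,
currTerm=1, because the leader TablesAmountCapacityMultiNodeTest_cluster_1's
lease is still valid.
"""

if __name__ == "__main__":
    for (level, logger), count in sorted(count_by_level_and_logger(parse_entries(SAMPLE)).items()):
        print(level, logger, count)
```

Running the same functions over each node's log file from the attachment, and filtering on `level == "ERROR"`, would give a quick per-node view of which components fail first.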