[ https://issues.apache.org/jira/browse/IGNITE-22187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vladimir Pligin reassigned IGNITE-22187:
----------------------------------------
Assignee: Aleksandr Polovtsev
> Cluster of 2 or 3 nodes doesn't work if one node is down
> --------------------------------------------------------
>
> Key: IGNITE-22187
> URL: https://issues.apache.org/jira/browse/IGNITE-22187
> Project: Ignite
> Issue Type: Bug
> Components: general, jdbc, networking, persistence
> Affects Versions: 3.0.0-beta1
> Environment: A 2-node or 3-node cluster running locally.
> Reporter: Igor
> Assignee: Aleksandr Polovtsev
> Priority: Major
> Labels: ignite-3
>
> *Steps to reproduce:*
> # Create a zone with the number of replicas equal to the number of nodes (2 or 3, respectively).
> # Create 10 tables inside the zone.
> # Insert 100 rows into every table.
> # Wait until the local state is "HEALTHY" for every table, partition, and node.
> # Wait until the global state is "AVAILABLE" for every table, partition, and node.
> # Kill the first node with kill -9.
> # Assert that the local state is "HEALTHY" for every table, partition, and node.
> # Wait until the global state for every table, partition, and node is "READ_ONLY" for the 2-node cluster or "DEGRADED" for the 3-node cluster.
> # Execute a SELECT query over JDBC, connecting to the second node (which is alive).
> *Expected:*
> Data is returned.
> *Actual:*
> At step 7, the REST API returns an error:
> {code:java}
> {"title":"Internal Server Error","status":500,"code":"IGN-RECOVERY-3","type":null,"detail":"io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /172.120.6.2:3344","node":null,"traceId":"2acb52fc-3275-411b-a4de-45f14873f15c","invalidParams":null}{code}
> The server logs show continuous errors:
> {code:java}
> 2024-05-08 10:37:19:796 +0200 [ERROR][%ClusterFailover3NodesTest_cluster_1%JRaft-StepDownTimer-9][AbstractClientService] Fail to connect ClusterFailover3NodesTest_cluster_0, exception: java.net.ConnectException.
> 2024-05-08 10:37:19:796 +0200 [ERROR][%ClusterFailover3NodesTest_cluster_1%JRaft-StepDownTimer-9][ReplicatorGroupImpl] Fail to check replicator connection to peer=ClusterFailover3NodesTest_cluster_0, replicatorType=Follower.
> 2024-05-08 10:37:19:796 +0200 [ERROR][%ClusterFailover3NodesTest_cluster_1%JRaft-StepDownTimer-12][AbstractClientService] Fail to connect ClusterFailover3NodesTest_cluster_0, exception: java.net.ConnectException.
> 2024-05-08 10:37:19:796 +0200 [ERROR][%ClusterFailover3NodesTest_cluster_1%JRaft-StepDownTimer-12][ReplicatorGroupImpl] Fail to check replicator connection to peer=ClusterFailover3NodesTest_cluster_0, replicatorType=Follower. {code}
> If steps 7 and 8 are skipped, the following exception occurs at step 9:
> {code:java}
> java.sql.SQLException: Unable to send fragment [targetNode=ClusterFailover3NodesTest_cluster_0, fragmentId=1, cause=io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: /192.168.100.5:3344]
>     at org.apache.ignite.internal.jdbc.proto.IgniteQueryErrorCode.createJdbcSqlException(IgniteQueryErrorCode.java:57)
>     at org.apache.ignite.internal.jdbc.JdbcStatement.execute0(JdbcStatement.java:154)
>     at org.apache.ignite.internal.jdbc.JdbcStatement.executeQuery(JdbcStatement.java:111)
>     at org.gridgain.ai3tests.tests.teststeps.JdbcSteps.executeQuery(JdbcSteps.java:91)
>     at org.gridgain.ai3tests.tests.failover.ClusterFailoverTestBase.getActualResult(ClusterFailoverTestBase.java:336)
>     at org.gridgain.ai3tests.tests.failover.ClusterFailoverTestBase.assertDataIsFilledWithoutErrors(ClusterFailoverTestBase.java:154)
>     at org.gridgain.ai3tests.tests.failover.ClusterFailover3NodesTest.singleKillAndCheckOtherNodeWorks(ClusterFailover3NodesTest.java:96)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:834) {code}
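
Note on the expected states in step 8: the report expects "READ_ONLY" for 2 nodes and "DEGRADED" for 3 nodes after one node is killed. One plausible reading, assuming the global state is derived from whether a strict majority of replicas is still alive, is sketched below. The class, the method names, and the "UNAVAILABLE" state are hypothetical and used only for illustration; they are not taken from the report or from the Ignite API.

```java
/** Hypothetical helper illustrating majority arithmetic; not part of the Ignite API. */
public final class PartitionQuorumSketch {
    /** AVAILABLE, DEGRADED and READ_ONLY appear in the report; UNAVAILABLE is an assumed name. */
    public enum GlobalState { AVAILABLE, DEGRADED, READ_ONLY, UNAVAILABLE }

    /** Smallest number of replicas that forms a strict majority. */
    public static int majority(int replicas) {
        return replicas / 2 + 1;
    }

    /** Assumed mapping from the number of live replicas to a global partition state. */
    public static GlobalState expectedState(int replicas, int alive) {
        if (replicas <= 0 || alive < 0 || alive > replicas) {
            throw new IllegalArgumentException("invalid replicas/alive: " + replicas + "/" + alive);
        }
        if (alive == replicas) {
            return GlobalState.AVAILABLE;   // every replica is up
        }
        if (alive >= majority(replicas)) {
            return GlobalState.DEGRADED;    // some replicas lost, majority kept
        }
        if (alive > 0) {
            return GlobalState.READ_ONLY;   // majority lost, at least one replica left
        }
        return GlobalState.UNAVAILABLE;     // no replica left
    }

    public static void main(String[] args) {
        // One node killed in a 2-node and in a 3-node cluster, as in step 6.
        for (int replicas : new int[] {2, 3}) {
            System.out.println(replicas + " replicas, one node killed -> "
                    + expectedState(replicas, replicas - 1));
        }
    }
}
```

Under this assumption, 2 replicas with one node down have 1 live replica against a majority of 2 (READ_ONLY), while 3 replicas with one node down have 2 live replicas against a majority of 2 (DEGRADED), which matches the states listed in step 8.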
--
This message was sent by Atlassian Jira
(v8.20.10#820010)