[
https://issues.apache.org/jira/browse/IGNITE-24016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrey Khitrin updated IGNITE-24016:
------------------------------------
Affects Version/s: 3.0
> [aimem] Restart node with open transaction leads to `Failed to get the
> primary replica` after node is started
> -------------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-24016
> URL: https://issues.apache.org/jira/browse/IGNITE-24016
> Project: Ignite
> Issue Type: Bug
> Components: persistence
> Affects Versions: 3.0, 3.0.0-beta1
> Environment: 3 nodes (each node is CMG, each node "-Xms512m",
> "-Xmx1536m"), each on separate host. Each host vCPU: 4, Memory: 32GB.
> Reporter: Igor
> Priority: Major
> Labels: ignite-3
> Attachments: cluster logs.zip
>
>
> *Steps to reproduce:*
> # Start 3 nodes (each node is CMG, each node "-Xms512m", "-Xmx1536m"),
> each on separate host. Each host vCPU: 4, Memory: 32GB.
> # Create 1 table
> ## Execute query: create zone if not exists "cluster_failover_3" with
> replicas=3, data_nodes_auto_adjust_scale_up=10,
> data_nodes_auto_adjust_scale_down=10, storage_profiles='default_aimem'
> ## Execute query: create TABLE failoverTest00(k1 INTEGER not null, k2
> INTEGER not null, v1 VARCHAR(100), v2 VARCHAR(255), v3 TIMESTAMP not null,
> primary key (k1, k2)) ZONE "cluster_failover_3"
> # Fill the table with 1000 rows
> # Wait until the local state of all partitions of all tables is "HEALTHY"
> # Wait until the global state of all partitions of all tables is "AVAILABLE"
> # Assert the table has been filled, with expected row count of '1000' and
> no errors in logs
> # Assert that the Ignite log contains no errors or exceptions
> # Kill the node and start it again with open transactions
> ## Begin IgniteSql transaction and execute insert
> ## Begin KeyValueView transaction and execute insert
> ## Begin JDBC transaction and execute insert
> ## Kill node and start it again
> # Wait until the node is ready after restart
> # Assert physical and logical topologies are correct
> # Assert the table has been filled.
> *Expected:*
> All data present and consistent.
> *Actual:*
> Exception on step 11:
> {code:java}
> java.sql.SQLException: java.sql.SQLException: Failed to get the primary replica [tablePartitionId=17_part_2]
> java.sql.SQLException: Failed to get the primary replica [tablePartitionId=17_part_2]
>     at org.apache.ignite.internal.jdbc.proto.IgniteQueryErrorCode.createJdbcSqlException(IgniteQueryErrorCode.java:57)
>     at org.apache.ignite.internal.jdbc.JdbcStatement.execute0(JdbcStatement.java:160)
>     at org.apache.ignite.internal.jdbc.JdbcStatement.executeQuery(JdbcStatement.java:115)
>     at org.gridgain.ai3tests.tests.teststeps.JdbcSteps.executeQuery(JdbcSteps.java:113)
>     at org.gridgain.ai3tests.tests.failover.ClusterFailoverTestBase.tryGetActualResult(ClusterFailoverTestBase.java:340)
>     at org.gridgain.ai3tests.tests.failover.ClusterFailoverTestBase.lambda$getActualResult$7(ClusterFailoverTestBase.java:319)
>     at org.gridgain.ai3tests.core.utils.RetryUtils.retryOnAllowedException(RetryUtils.java:61)
>     at org.gridgain.ai3tests.core.utils.RetryUtils.retryOnAllowedException(RetryUtils.java:36)
>     at org.gridgain.ai3tests.tests.failover.ClusterFailoverTestBase.getActualResult(ClusterFailoverTestBase.java:318)
>     at org.gridgain.ai3tests.tests.failover.ClusterFailoverTestBase.assertDataIsFilledWithoutErrors(ClusterFailoverTestBase.java:179)
>     at org.gridgain.ai3tests.tests.failover.ClusterFailover3NodesTest.singleKillAndRestartNodeWhenDataIsLoadedWithOpenTransactions(ClusterFailover3NodesTest.java:126)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>     at io.qameta.allure.junit5.AllureJunit5.interceptTestTemplateMethod(AllureJunit5.java:59)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>     at java.base/java.lang.Thread.run(Thread.java:842)
> {code}
> [^cluster logs.zip]
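
For anyone trying to reproduce this without the original test suite, the table setup and row-count check from the steps above could be sketched over plain JDBC roughly as follows. This is a minimal sketch, not the reporter's test code: the class and method names are hypothetical, the inserted values and the `select count(*)` check are assumptions (the report only states the row count), and the caller is assumed to supply an already opened `java.sql.Connection` to the cluster. Only the two DDL statements are taken from the report.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.Timestamp;

/** Hypothetical helper: names are illustrative and do not come from the test suite. */
final class FailoverReproSketch {
    static final String ZONE = "cluster_failover_3";
    static final String TABLE = "failoverTest00";
    static final int ROWS = 1000;

    /** Zone DDL as given in the report. */
    static String createZoneSql() {
        return "create zone if not exists \"" + ZONE + "\" with replicas=3, "
                + "data_nodes_auto_adjust_scale_up=10, data_nodes_auto_adjust_scale_down=10, "
                + "storage_profiles='default_aimem'";
    }

    /** Table DDL as given in the report. */
    static String createTableSql() {
        return "create TABLE " + TABLE + "(k1 INTEGER not null, k2 INTEGER not null, "
                + "v1 VARCHAR(100), v2 VARCHAR(255), v3 TIMESTAMP not null, "
                + "primary key (k1, k2)) ZONE \"" + ZONE + "\"";
    }

    /** Parameterized insert; the column list follows the table DDL. */
    static String insertSql() {
        return "insert into " + TABLE + " (k1, k2, v1, v2, v3) values (?, ?, ?, ?, ?)";
    }

    /** Creates the zone and table, then inserts ROWS rows with made-up values (assumption). */
    static void createAndFill(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(createZoneSql());
            st.executeUpdate(createTableSql());
        }
        try (PreparedStatement ps = conn.prepareStatement(insertSql())) {
            for (int i = 0; i < ROWS; i++) {
                ps.setInt(1, i);
                ps.setInt(2, i);
                ps.setString(3, "v1_" + i);
                ps.setString(4, "v2_" + i);
                ps.setTimestamp(5, new Timestamp(System.currentTimeMillis()));
                ps.executeUpdate();
            }
        }
    }

    /** Row-count check; after the restart this is the kind of query that failed in the report. */
    static long countRows(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("select count(*) from " + TABLE)) {
            rs.next();
            return rs.getLong(1);
        }
    }
}
```

Calling `createAndFill(conn)` before the kill and `countRows(conn)` after the node is back would cover the table setup, fill, and final assertion steps; the open-transaction inserts and the node kill itself are outside this sketch.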