[ https://issues.apache.org/jira/browse/IGNITE-20834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrey Khitrin updated IGNITE-20834:
------------------------------------
    Description:
How to reproduce:

1. Start a 1-node cluster.
2. Create several simple tables (usually 5-10 is enough to reproduce):
{code:sql}
create table failoverTest00(k1 INTEGER not null, k2 INTEGER not null, v1 VARCHAR(100), v2 VARCHAR(255), v3 TIMESTAMP not null, primary key (k1, k2));
create table failoverTest01(k1 INTEGER not null, k2 INTEGER not null, v1 VARCHAR(100), v2 VARCHAR(255), v3 TIMESTAMP not null, primary key (k1, k2));
...
{code}
3. Fill every table with 1000 rows.
4. Ensure that every table contains 1000 rows:
{code:sql}
SELECT COUNT(*) FROM failoverTest00;
...
{code}
5. Restart the node (kill the Java process and start the node again).
6. Check all tables again.

Expected behavior: after restart, all tables still contain the same data as before.

Actual behavior: SQL queries cannot be performed after restart; they hang for a long time. The Ignite log is overwhelmed with "Primary replica expired" messages.

This bug was first observed soon after fixes in https://issues.apache.org/jira/browse/IGNITE-20116.

  was:
How to reproduce:

1. Start a 1-node cluster.
2. Create several simple tables (usually 5 is enough to reproduce):
{code:sql}
create table failoverTest00(k1 INTEGER not null, k2 INTEGER not null, v1 VARCHAR(100), v2 VARCHAR(255), v3 TIMESTAMP not null, primary key (k1, k2));
create table failoverTest01(k1 INTEGER not null, k2 INTEGER not null, v1 VARCHAR(100), v2 VARCHAR(255), v3 TIMESTAMP not null, primary key (k1, k2));
...
{code}
3. Fill every table with 1000 rows.
4. Ensure that every table contains 1000 rows:
{code:sql}
SELECT COUNT(*) FROM failoverTest00;
...
{code}
5. Restart the node (kill the Java process and start the node again).
6. Check all tables again.

Expected behavior: after restart, all tables still contain the same data as before.

Actual behavior: for some tables, 1 or 2 rows may be missing if we're fast enough on steps 3-4-5. Some tables contain 1000 rows, some contain 999 or 998.
This bug was first observed only near Sep 15, 2023. Most probably, it was introduced somewhere near that date. It is probably another face of IGNITE-20425 (I'm not sure though).

No errors were observed in the logs.

*UPD*: The problem is caused by https://issues.apache.org/jira/browse/IGNITE-20116; the current issue will be solved once https://issues.apache.org/jira/browse/IGNITE-20116 is done.


> SQL query may hang forever after node restart
> ---------------------------------------------
>
>                 Key: IGNITE-20834
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20834
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Andrey Khitrin
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-beta2
>
>
> How to reproduce:
> 1. Start a 1-node cluster.
> 2. Create several simple tables (usually 5-10 is enough to reproduce):
> {code:sql}
> create table failoverTest00(k1 INTEGER not null, k2 INTEGER not null, v1
> VARCHAR(100), v2 VARCHAR(255), v3 TIMESTAMP not null, primary key (k1, k2));
> create table failoverTest01(k1 INTEGER not null, k2 INTEGER not null, v1
> VARCHAR(100), v2 VARCHAR(255), v3 TIMESTAMP not null, primary key (k1, k2));
> ...
> {code}
> 3. Fill every table with 1000 rows.
> 4. Ensure that every table contains 1000 rows:
> {code:sql}
> SELECT COUNT(*) FROM failoverTest00;
> ...
> {code}
> 5. Restart the node (kill the Java process and start the node again).
> 6. Check all tables again.
> Expected behavior: after restart, all tables still contain the same data as
> before.
> Actual behavior: SQL queries cannot be performed after restart; they hang for
> a long time. The Ignite log is overwhelmed with "Primary replica expired" messages.
> This bug was first observed soon after fixes in
> https://issues.apache.org/jira/browse/IGNITE-20116.
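
For anyone trying to reproduce this, steps 2-4 can be scripted. The following is only a minimal sketch that generates the SQL statements as text; it does not connect to a cluster. The table DDL and the COUNT query are taken from the description, while the row contents (the values for k1, k2, v1, v2, v3), the base timestamp, and all function names are assumptions made for illustration, since the report does not say how the tables were filled.

```python
from datetime import datetime, timedelta
from typing import List

TABLE_COUNT = 5        # the report says 5-10 tables is usually enough
ROWS_PER_TABLE = 1000  # step 3: fill every table with 1000 rows


def create_table_sql(index: int) -> str:
    """DDL from the report, with the table number as a two-digit suffix."""
    return (
        f"create table failoverTest{index:02d}("
        "k1 INTEGER not null, k2 INTEGER not null, "
        "v1 VARCHAR(100), v2 VARCHAR(255), v3 TIMESTAMP not null, "
        "primary key (k1, k2));"
    )


def insert_sql(index: int, row: int, base: datetime) -> str:
    """One INSERT per row; the column values are illustrative assumptions."""
    ts = (base + timedelta(seconds=row)).strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"INSERT INTO failoverTest{index:02d}(k1, k2, v1, v2, v3) "
        f"VALUES ({row}, {row}, 'name{row}', 'value{row}', TIMESTAMP '{ts}');"
    )


def count_sql(index: int) -> str:
    """Row-count check from step 4 (and step 6 after the restart)."""
    return f"SELECT COUNT(*) FROM failoverTest{index:02d};"


def build_script(table_count: int = TABLE_COUNT,
                 rows: int = ROWS_PER_TABLE) -> List[str]:
    """Return all statements for steps 2-4 in execution order."""
    base = datetime(2023, 1, 1)  # arbitrary fixed base timestamp (assumption)
    statements = [create_table_sql(t) for t in range(table_count)]
    for t in range(table_count):
        statements.extend(insert_sql(t, r, base) for r in range(rows))
    statements.extend(count_sql(t) for t in range(table_count))
    return statements
```

Printing "\n".join(build_script()) gives a script that can be fed to any SQL client connected to the node. After step 5 (kill and restart), only the count_sql statements need to be rerun for step 6.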