[ https://issues.apache.org/jira/browse/IGNITE-17737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nikita Amelchev updated IGNITE-17737:
-------------------------------------
Description:
Cluster snapshots may be inconsistent under load.
Reproducer:
One thread generates transactional load by calling cache#put on a transactional cache.
Another thread periodically takes snapshots and checks them.
Reproducer attached (flaky, please run it several times).
Example of a failure:
{noformat}
[2022-09-21T19:35:51,158][WARN ][async-runnable-runner-1][] The check procedure has failed, conflict partitions has been found: [counterConflicts=1, hashConflicts=1]
[2022-09-21T19:35:51,158][WARN ][async-runnable-runner-1][] Update counter conflicts:
[2022-09-21T19:35:51,158][WARN ][async-runnable-runner-1][] Conflict partition: PartitionKeyV2 [grpId=1544803905, grpName=default, partId=432]
[2022-09-21T19:35:51,159][WARN ][async-runnable-runner-1][] Partition instances: [PartitionHashRecordV2 [isPrimary=false, consistentId=snapshot.SnapshotTest2, updateCntr=21, partitionState=OWNING, size=19, partHash=1245894112], PartitionHashRecordV2 [isPrimary=false, consistentId=snapshot.SnapshotTest0, updateCntr=22, partitionState=OWNING, size=20, partHash=1705601802], PartitionHashRecordV2 [isPrimary=false, consistentId=snapshot.SnapshotTest1, updateCntr=21, partitionState=OWNING, size=19, partHash=1245894112]]
[2022-09-21T19:35:51,159][WARN ][async-runnable-runner-1][]
[2022-09-21T19:35:51,159][WARN ][async-runnable-runner-1][] Hash conflicts:
[2022-09-21T19:35:51,159][WARN ][async-runnable-runner-1][] Conflict partition: PartitionKeyV2 [grpId=1544803905, grpName=default, partId=432]
[2022-09-21T19:35:51,159][WARN ][async-runnable-runner-1][] Partition instances: [PartitionHashRecordV2 [isPrimary=false, consistentId=snapshot.SnapshotTest2, updateCntr=21, partitionState=OWNING, size=19, partHash=1245894112], PartitionHashRecordV2 [isPrimary=false, consistentId=snapshot.SnapshotTest0, updateCntr=22, partitionState=OWNING, size=20, partHash=1705601802], PartitionHashRecordV2 [isPrimary=false, consistentId=snapshot.SnapshotTest1, updateCntr=21, partitionState=OWNING, size=19, partHash=1245894112]]
[2022-09-21T19:35:51,159][WARN ][async-runnable-runner-1][]
{noformat}
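For reading the log above: the three copies of partition 432 disagree, with snapshot.SnapshotTest0 reporting updateCntr=22 and size=20 while the other two copies report updateCntr=21 and size=19. The sketch below is only a simplified model of that comparison, fed with the values from the log. The class and method names (PartitionConflictCheck, hasCounterConflict, hasHashConflict, aheadNodes) are hypothetical and are not taken from Ignite or from the attached SnapshotTest.java.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Simplified, hypothetical model of a cross-copy partition check; not Ignite's implementation. */
public class PartitionConflictCheck {
    /** One copy of a partition; field names mirror the fields printed in the log. */
    public static final class PartitionCopy {
        final String consistentId;
        final long updateCntr;
        final long size;
        final int partHash;

        PartitionCopy(String consistentId, long updateCntr, long size, int partHash) {
            this.consistentId = consistentId;
            this.updateCntr = updateCntr;
            this.size = size;
            this.partHash = partHash;
        }
    }

    /** True if the copies do not all report the same update counter. */
    public static boolean hasCounterConflict(List<PartitionCopy> copies) {
        return copies.stream().mapToLong(c -> c.updateCntr).distinct().count() > 1;
    }

    /** True if the copies disagree on content hash or on size. */
    public static boolean hasHashConflict(List<PartitionCopy> copies) {
        return copies.stream().mapToInt(c -> c.partHash).distinct().count() > 1
            || copies.stream().mapToLong(c -> c.size).distinct().count() > 1;
    }

    /** Consistent IDs of the copies that hold the highest update counter. */
    public static List<String> aheadNodes(List<PartitionCopy> copies) {
        long max = copies.stream().mapToLong(c -> c.updateCntr).max().orElse(0L);

        return copies.stream()
            .filter(c -> c.updateCntr == max)
            .map(c -> c.consistentId)
            .collect(Collectors.toList());
    }

    /** The three copies of partition 432, with values copied from the log. */
    public static List<PartitionCopy> copiesFromLog() {
        return Arrays.asList(
            new PartitionCopy("snapshot.SnapshotTest2", 21, 19, 1245894112),
            new PartitionCopy("snapshot.SnapshotTest0", 22, 20, 1705601802),
            new PartitionCopy("snapshot.SnapshotTest1", 21, 19, 1245894112));
    }

    public static void main(String[] args) {
        List<PartitionCopy> copies = copiesFromLog();

        System.out.println("counterConflict=" + hasCounterConflict(copies));
        System.out.println("hashConflict=" + hasHashConflict(copies));
        System.out.println("ahead=" + aheadNodes(copies));
    }
}
```

In this model both conflicts come from the same copy: SnapshotTest0 is one update ahead of the other two, which is consistent with a single put being captured in one node's snapshot but not in the others. That reading is an interpretation of the log, not a confirmed root cause.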
was:
Cluster snapshots may be inconsistent under load.
Reproducer:
One thread does a transactional load: cache#put into transactional cache.
Another thread does periodic snapshots and checks them.
Reproducer attached (flaky, please repeat several times).
> Cluster snapshots may be inconsistent under load.
> --------------------------------------------------
>
> Key: IGNITE-17737
> URL: https://issues.apache.org/jira/browse/IGNITE-17737
> Project: Ignite
> Issue Type: Bug
> Reporter: Nikita Amelchev
> Assignee: Nikita Amelchev
> Priority: Major
> Labels: ise
> Attachments: SnapshotTest.java
>