[
https://issues.apache.org/jira/browse/IGNITE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818130#comment-17818130
]
Roman Puchkovskiy edited comment on IGNITE-20870 at 2/17/24 7:54 AM:
---------------------------------------------------------------------
For our Raft snapshot mechanics to work correctly, the following invariants
must be maintained:
# All updates to storages must be made via SnapshotAwarePartitionDataStorage
# ... while holding PartitionSnapshotsReadLock
# After executing a 'command', lastUpdateIndex+Term must be updated according
to the 'command'
# ... this index+term update must be made under the same
PartitionSnapshotsReadLock mentioned in item 2
That's why in PartitionListener we take PartitionSnapshotsReadLock explicitly
and not inside storage methods (also because execution of one command actually
does a lot more now than just calling one storage method: it might also do
storage cleanup [it's about write intent cleanup, not something like GC]). It
doesn't seem possible to move acquisition of the lock inside storage methods.
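To illustrate the four invariants, here is a minimal sketch of the locking pattern. It is not the actual Ignite code: StorageSketch, ListenerSketch, Cut, put(), lastApplied() and cutForSnapshot() are hypothetical stand-ins, and the sketch assumes that a snapshot cut takes the write side of the same read-write lock and that commands are applied by a single thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Immutable result of a snapshot cut: data plus the index+term it corresponds to. */
final class Cut {
    final long index;
    final long term;
    final Map<String, String> rows;

    Cut(long index, long term, Map<String, String> rows) {
        this.index = index;
        this.term = term;
        this.rows = rows;
    }
}

/** Hypothetical stand-in for SnapshotAwarePartitionDataStorage; not the Ignite API. */
class StorageSketch {
    private final ReentrantReadWriteLock snapshotsLock = new ReentrantReadWriteLock();
    private final Map<String, String> rows = new ConcurrentHashMap<>();
    // Assumption: written only by the single command-applying thread under the read lock.
    private volatile long lastAppliedIndex;
    private volatile long lastAppliedTerm;

    void acquirePartitionSnapshotsReadLock() {
        snapshotsLock.readLock().lock();
    }

    void releasePartitionSnapshotsReadLock() {
        snapshotsLock.readLock().unlock();
    }

    /** Invariants 1 and 2: every update goes through this storage while the read lock is held. */
    void put(String key, String value) {
        requireReadLock();
        rows.put(key, value);
    }

    /** Invariants 3 and 4: index+term are advanced under the same read lock as the update. */
    void lastApplied(long index, long term) {
        requireReadLock();
        lastAppliedIndex = index;
        lastAppliedTerm = term;
    }

    /** Assumed snapshot side: the write lock excludes all in-flight commands during the cut. */
    Cut cutForSnapshot() {
        snapshotsLock.writeLock().lock();
        try {
            return new Cut(lastAppliedIndex, lastAppliedTerm, new HashMap<>(rows));
        } finally {
            snapshotsLock.writeLock().unlock();
        }
    }

    private void requireReadLock() {
        if (snapshotsLock.getReadHoldCount() == 0) {
            throw new IllegalStateException("PartitionSnapshotsReadLock is not held");
        }
    }
}

/** Hypothetical stand-in for PartitionListener: one lock scope per command. */
class ListenerSketch {
    private final StorageSketch storage;

    ListenerSketch(StorageSketch storage) {
        this.storage = storage;
    }

    void onCommand(long index, long term, String key, String value) {
        storage.acquirePartitionSnapshotsReadLock();
        try {
            // A real command may call several storage methods here (e.g. write intent cleanup).
            storage.put(key, value);
            storage.lastApplied(index, term);
        } finally {
            storage.releasePartitionSnapshotsReadLock();
        }
    }
}

class SnapshotLockDemo {
    public static void main(String[] args) {
        StorageSketch storage = new StorageSketch();
        new ListenerSketch(storage).onCommand(7L, 2L, "k", "v");
        Cut cut = storage.cutForSnapshot();
        System.out.println("cut at index=" + cut.index + ", term=" + cut.term + ", rows=" + cut.rows);
    }
}
```

Because the lock scope spans the whole command, a cut can never observe the data of a command without its index+term, which is why the acquisition sits in the listener rather than inside individual storage methods. A caller that has no index to pass to lastApplied(), like PartitionReplicaListener, has no place in this scheme as sketched.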
There is a much worse problem: PartitionReplicaListener operates outside of
Raft, so it does not know any index when executing its 'commands'. Right now,
I don't have any ideas on how to deal with this, as the Raft snapshot
mechanics clearly require a Raft index to make snapshot cuts. Maybe try to use
HLC (hybrid logical clock)?
> Partition replica listener skips taking snapshot lock
> -----------------------------------------------------
>
> Key: IGNITE-20870
> URL: https://issues.apache.org/jira/browse/IGNITE-20870
> Project: Ignite
> Issue Type: Bug
> Reporter: Ivan Bessonov
> Priority: Major
> Labels: ignite-3
>
> See `SnapshotAwarePartitionDataStorage#acquirePartitionSnapshotsReadLock`.
> Right now, the data being sent by the snapshot reader might be
> inconsistent, because reads are not synchronized with writes.
> There's a proposal to hide this lock somewhere inside data storage instance,
> if possible. Anyway, data consistency must be fixed.