[
https://issues.apache.org/jira/browse/IGNITE-16304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Bessonov resolved IGNITE-16304.
------------------------------------
Resolution: Duplicate
> [POC] In-Memory storage integration
> -----------------------------------
>
> Key: IGNITE-16304
> URL: https://issues.apache.org/jira/browse/IGNITE-16304
> Project: Ignite
> Issue Type: Task
> Components: persistence
> Affects Versions: 3.0.0-alpha3
> Reporter: Ivan Bessonov
> Priority: Major
> Labels: iep-74, ignite-3
>
> Goals
> We need an in-memory store, similar to the one in Ignite 2. This store must
> reuse the common replication infrastructure, in other words, be integrated
> into the raft STM and support transactions.
> The raft protocol implies some persistent state: metadata, logs, snapshot.
> The simplest solution is to write the raft persistent state to disk (this is
> already implemented for
> org.apache.ignite.internal.storage.basic.ConcurrentHashMapPartitionStorage).
> The drawback is that this is not a fully in-memory solution and does not
> differ much from a database cache.
> We can go the pure in-memory way - keep all raft state in a volatile store.
> h3. Raft metadata
> Must not be persisted for a pure in-memory cluster, because the state is
> always lost on restart.
> Note: a node must always be removed from the raft group when it’s removed
> from baseline by auto adjust and should join as new (in-memory always works
> with auto-adjust similarly to Ignite 2). *Out of scope.*
> h3. Log store
> Has a working in-memory implementation (currently used in tests):
> org.apache.ignite.raft.jraft.storage.impl.LocalLogStorage
> Note: generally speaking, the log is only required for "historical rebalancing"
> after the snapshot rebalance. It won't be needed at all once it is possible
> to apply snapshot and concurrent updates at the same time, for example when a
> solution like mvcc is implemented.
> h3. Snapshots
> Can be implemented over any kv store extended with some kind of Copy-On-Write
> support. Not implemented currently. More details below.
> h3. COW buffer
> To create an in-memory snapshot, the snapshot data is written to a separate
> in-memory buffer. The buffer is populated from the state machine update
> thread either by the update operations or by a snapshot advance mini-task
> which is submitted to the state machine update thread as needed.
> To maintain a snapshot, the state machine needs to keep a snapshot iterator
> boundary key. If a key being updated is smaller than or equal to the boundary
> key, no additional action is needed because the snapshot iterator has already
> processed this key. If a key being updated is larger than the boundary key,
> the old version of the key is eagerly put to the snapshot buffer and the key
> is marked with the snapshot ID (so that the key is skipped during further
> iteration). The snapshot advance mini-task iterates over the next batch of
> keys starting from the boundary key and puts to the snapshot buffer only the
> keys that are not yet marked with the snapshot ID.
> This approach has similar memory requirements to the first alternative, but
> does not require modifying the storage tree so that it can store multiple
> versions of the same key. In addition, it allows for transparent offloading
> of the snapshot buffer to disk, which can reduce memory requirements. It is
> also simpler to implement because the code is essentially single-threaded and
> only requires synchronization for the in-memory buffer.
> The downside is that snapshot advance tasks will increase tail latency of
> state machine update operations.
> Can be implemented on top of any kv store.
> Note: we should consider the possibility of streaming the snapshot instead of
> storing it in memory until it is completed.
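The COW buffer scheme in the quoted description can be sketched roughly as
follows. This is only an illustration under stated assumptions, not Ignite
code: the class CowSnapshotSketch and all of its methods are hypothetical, a
TreeMap stands in for "any kv store", a plain set of keys stands in for
"marked with snapshot ID", and every call is assumed to run on the single
state machine update thread, so buffer synchronization is left out.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch; every name here is illustrative, not Ignite API. */
class CowSnapshotSketch {
    /** Live data; a sorted map stands in for "any kv store". */
    private final TreeMap<String, String> store = new TreeMap<>();
    /** Snapshot buffer: versions captured for the snapshot in progress. */
    private final TreeMap<String, String> buffer = new TreeMap<>();
    /** Assumption: a key set replaces the per-key "snapshot ID" mark. */
    private final Set<String> marked = new HashSet<>();
    /** Last key processed by the snapshot iterator; null before the first batch. */
    private String boundary;
    private boolean active;

    public void startSnapshot() {
        buffer.clear();
        marked.clear();
        boundary = null;
        active = true;
    }

    public void put(String key, String value) {
        preserveOldVersion(key);
        store.put(key, value);
    }

    public void remove(String key) {
        preserveOldVersion(key);
        store.remove(key);
    }

    /** Update path: copy the old version only for keys beyond the boundary. */
    private void preserveOldVersion(String key) {
        boolean beyondBoundary = boundary == null || key.compareTo(boundary) > 0;
        // marked.add() returns false if the key was already handled once.
        if (active && beyondBoundary && marked.add(key)) {
            String old = store.get(key);
            if (old != null) { // a key created after the snapshot start has no old version
                buffer.put(key, old);
            }
        }
    }

    /** Snapshot advance mini-task: process one batch; returns true when done. */
    public boolean advance(int batchSize) {
        if (!active) {
            return true;
        }
        NavigableMap<String, String> rest =
                boundary == null ? store : store.tailMap(boundary, false);
        int processed = 0;
        for (Map.Entry<String, String> e : rest.entrySet()) {
            if (processed++ == batchSize) {
                return false; // more keys remain for a later mini-task
            }
            if (!marked.contains(e.getKey())) {
                buffer.put(e.getKey(), e.getValue());
            }
            boundary = e.getKey();
        }
        active = false;
        marked.clear();
        return true;
    }

    /** Copy of the buffer; complete once advance() has returned true. */
    public Map<String, String> snapshot() {
        return new TreeMap<>(buffer);
    }
}
```

In this sketch, updates interleaved between advance() calls do not change the
result: the buffer ends up holding the store contents as of startSnapshot().
A real implementation would also need the buffer synchronization and disk
offloading mentioned in the description, which are omitted here.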
--
This message was sent by Atlassian Jira
(v8.20.1#820001)