[ https://issues.apache.org/jira/browse/IGNITE-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897261#comment-16897261 ]
Semen Boikov commented on IGNITE-11704:
---------------------------------------
Implemented initial support for tombstones; the changes are in branch ignite-11704
(tombstones are created for tx caches while rebalance is in progress):
* to avoid changing the data storage format, tombstones are stored as regular rows
whose value is a CacheObject containing a marshalled null (a 1-byte array with the
binary marshaller). Nulls are never stored in a cache as regular values, so it is
always possible to distinguish tombstones from regular values
* tombstones are created for cache removes while the partition is MOVING, and are
asynchronously removed when the partition becomes OWNING
* tombstone cleanup should not consume too many resources; in this regard it
is similar to partition eviction, so I reused PartitionsEvictManager for
tombstone cleanup tasks
* added a special cache store iteration mode for tombstone cleanup
(RowData.TOMBSTONES) that reads as little data as possible for non-tombstone
entries
* added a per-cache metric: the number of tombstone entries
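
To illustrate the points above, here is a minimal sketch of the idea, not the code from the branch. It assumes a toy one-partition store; the class TombstonePartitionSketch, its methods, and the tag bytes are all hypothetical, and cleanup runs synchronously here, whereas the real cleanup is asynchronous and goes through PartitionsEvictManager.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Toy model of one partition; all names here are hypothetical, not Ignite classes. */
final class TombstonePartitionSketch {
    enum State { MOVING, OWNING }

    // Assumed type tags for the toy marshaller; the real binary marshaller's bytes may differ.
    private static final byte NULL_TAG = 0;
    private static final byte STRING_TAG = 1;

    private final Map<String, byte[]> rows = new HashMap<>();
    private State state = State.MOVING;

    /** Marshals a value: null becomes a 1-byte array, a real value is a tag byte plus its UTF-8 body. */
    static byte[] marshal(String val) {
        if (val == null)
            return new byte[] {NULL_TAG};
        byte[] body = val.getBytes(StandardCharsets.UTF_8);
        byte[] res = new byte[body.length + 1];
        res[0] = STRING_TAG;
        System.arraycopy(body, 0, res, 1, body.length);
        return res;
    }

    /** A row is a tombstone iff it is exactly the marshalled null. */
    static boolean isTombstone(byte[] row) {
        return row.length == 1 && row[0] == NULL_TAG;
    }

    /** User-visible put: nulls are rejected, so a user value can never look like a tombstone. */
    void put(String key, String val) {
        if (val == null)
            throw new IllegalArgumentException("null values are not allowed");
        rows.put(key, marshal(val));
    }

    /** While MOVING a remove leaves a tombstone row; once OWNING the row is deleted directly. */
    void remove(String key) {
        if (state == State.MOVING)
            rows.put(key, marshal(null));
        else
            rows.remove(key);
    }

    /** Tombstones read as absent keys. */
    String get(String key) {
        byte[] row = rows.get(key);
        if (row == null || isTombstone(row))
            return null;
        return new String(row, 1, row.length - 1, StandardCharsets.UTF_8);
    }

    /** Stand-in for the per-cache tombstone metric. */
    long tombstoneCount() {
        return rows.values().stream().filter(TombstonePartitionSketch::isTombstone).count();
    }

    /** Switches the partition to OWNING and drops all tombstones (synchronously in this sketch). */
    void markOwning() {
        state = State.OWNING;
        rows.values().removeIf(TombstonePartitionSketch::isTombstone);
    }

    int rowCount() {
        return rows.size();
    }
}
```

Because the row format is unchanged, the only thing that marks a tombstone in this sketch is its value bytes, which is why put() has to reject nulls.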
Further tasks (I am going to add separate JIRAs for these):
* when persistence is enabled, we can write tombstones on an incomplete baseline,
then include tombstones in rebalance when the node returns; in this case full
partition cleanup on the joining node won't be needed
* for ATOMIC caches, 'deferredDelete' is used to prevent a race on backups between
concurrent remove and put, so atomic caches need some other logic to
determine when it is safe to remove tombstones
> Write tombstones during rebalance to get rid of deferred delete buffer
> ----------------------------------------------------------------------
>
> Key: IGNITE-11704
> URL: https://issues.apache.org/jira/browse/IGNITE-11704
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexey Goncharuk
> Assignee: Semen Boikov
> Priority: Major
> Labels: rebalance
>
> Currently Ignite relies on the deferred delete buffer to handle
> write-remove conflicts during rebalance. Given the limited size of the buffer,
> this approach is fundamentally flawed, especially when persistence is
> enabled.
> I suggest extending the data storage logic to be able to store key
> tombstones, that is, to keep the version for deleted entries. The tombstones will be
> stored while rebalance is in progress and should be cleaned up when rebalance
> is completed.
> Later this approach may be used to implement fast partition rebalance based
> on Merkle trees (in this case, tombstones should be written on an incomplete
> baseline).
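
The write-remove conflict described above can be sketched roughly as follows. This is an illustration under assumptions, not Ignite code: the class VersionedStoreSketch is hypothetical, versions are modelled as plain long counters, and a null value inside a row stands for a tombstone.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical versioned store; a Row with val == null models a tombstone. */
final class VersionedStoreSketch {
    static final class Row {
        final long ver;
        final String val;

        Row(long ver, String val) {
            this.ver = ver;
            this.val = val;
        }
    }

    private final Map<String, Row> rows = new HashMap<>();

    /** Applies a write only if it is strictly newer than the stored row; returns true if applied. */
    boolean apply(String key, String val, long ver) {
        Row cur = rows.get(key);
        if (cur != null && cur.ver >= ver)
            return false; // Stale update, e.g. an old entry supplied by rebalance.
        rows.put(key, new Row(ver, val));
        return true;
    }

    /** A remove keeps the version by writing a tombstone instead of deleting the row. */
    boolean remove(String key, long ver) {
        return apply(key, null, ver);
    }

    /** Returns the live value, or null if the key is absent or tombstoned. */
    String get(String key) {
        Row r = rows.get(key);
        return r == null ? null : r.val;
    }
}
```

In this model, if a remove with version 5 is applied first and rebalance later supplies the same key with version 3, the stale entry is rejected because the tombstone still carries version 5. Without the tombstone the store would have nothing to compare against and the removed entry would reappear.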