[ 
https://issues.apache.org/jira/browse/IGNITE-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897261#comment-16897261
 ] 

Semen Boikov commented on IGNITE-11704:
---------------------------------------

Implemented initial support for tombstones; the changes are in branch ignite-11704 
(tombstones are created for tx caches while rebalance is in progress):
 * to avoid changing the data storage format, tombstones are stored as regular rows 
whose value is a CacheObject containing a marshaled null (a 1-byte array with the 
binary marshaller). Null values are not allowed in a cache, so it is always possible 
to distinguish tombstones from regular values
 * tombstones are created for cache removes while the partition is MOVING, and 
are asynchronously removed once the partition becomes OWNING
 * tombstone cleanup should not consume too many resources; in this regard it 
is similar to partition eviction, so I reused PartitionsEvictManager for 
tombstone cleanup tasks
 * added a special cache store iteration mode for tombstone cleanup 
(RowData.TOMBSTONES) to read as little data as possible for non-tombstone 
entries
 * added a per-cache metric: the number of tombstone entries
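
To illustrate the points above, here is a minimal sketch of the tombstone lifecycle for one partition. It is a simplified in-memory model, not the code from the branch: the class TombstoneSketch, its methods, and the marker byte are hypothetical (the real marker is whatever the binary marshaller writes for null), and cleanup runs synchronously here, whereas in the branch it is asynchronous and goes through PartitionsEvictManager.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Simplified model of a single partition; not Ignite's actual classes. */
class TombstoneSketch {
    /** Hypothetical stand-in for the 1-byte marshaled null written by the binary marshaller. */
    static final byte[] TOMBSTONE = {0};

    enum State { MOVING, OWNING }

    State state = State.MOVING;

    /** Key -> marshaled value; a tombstone is just a regular row with the marker value. */
    final Map<String, byte[]> rows = new HashMap<>();

    static boolean isTombstone(byte[] val) {
        return Arrays.equals(val, TOMBSTONE);
    }

    /** User values are never null, so they can never collide with the marker. */
    void put(String key, byte[] val) {
        if (val == null || isTombstone(val))
            throw new IllegalArgumentException("null values are not allowed");

        rows.put(key, val);
    }

    /** While the partition is MOVING a remove leaves a tombstone; otherwise the row is deleted. */
    void remove(String key) {
        if (state == State.MOVING)
            rows.put(key, TOMBSTONE.clone());
        else
            rows.remove(key);
    }

    /** Tombstones are invisible to readers. */
    byte[] get(String key) {
        byte[] val = rows.get(key);

        return val == null || isTombstone(val) ? null : val;
    }

    /** Analogue of the per-cache metric: number of tombstone entries. */
    long tombstoneCount() {
        return rows.values().stream().filter(TombstoneSketch::isTombstone).count();
    }

    /** Partition became OWNING: drop all tombstones (synchronously in this sketch). */
    void onOwning() {
        state = State.OWNING;
        rows.values().removeIf(TombstoneSketch::isTombstone);
    }
}
```

In this model a remove during MOVING keeps the row in place (so rows.size() does not shrink) while get() returns null for it; after onOwning() the tombstone rows are gone and later removes delete rows directly.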

Further tasks (I am going to file separate JIRAs for these):
 * when persistence is enabled, tombstones can be written on an incomplete baseline 
and then included in rebalance when the node returns; in this case full 
partition cleanup on the joining node won't be needed
 * for ATOMIC caches, 'deferredDelete' is used to prevent a race on backups between 
a concurrent remove and put, so atomic caches need some other logic to 
determine when it is safe to remove tombstones
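
For reference, the remove-vs-put race mentioned in the last item can be sketched as follows. This is only an illustration under simplifying assumptions: versions are plain longs, a tombstone is modeled as a row with a null value (not the marshaled-null encoding described earlier), and VersionedBackupSketch with its methods is a hypothetical name, not an Ignite class.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified backup-side store that keeps a version for every row, including removed ones. */
class VersionedBackupSketch {
    /** A row; val == null marks a tombstone that still remembers the version of the remove. */
    static final class Row {
        final long ver;
        final String val;

        Row(long ver, String val) {
            this.ver = ver;
            this.val = val;
        }
    }

    final Map<String, Row> rows = new HashMap<>();

    /**
     * Applies an update (val == null means remove) only if it is newer than the stored row.
     * Returns true if the update was applied.
     */
    boolean apply(String key, long ver, String val) {
        Row cur = rows.get(key);

        // A stale update must not overwrite a newer row or tombstone.
        if (cur != null && cur.ver >= ver)
            return false;

        rows.put(key, new Row(ver, val));

        return true;
    }

    String get(String key) {
        Row r = rows.get(key);

        return r == null ? null : r.val;
    }
}
```

If a remove with version 2 reaches the backup before a put with version 1, the tombstone left by the remove causes the late put to be rejected. If the tombstone were dropped too early, the same put would be applied and the removed entry would reappear, which is why atomic caches need a rule for when tombstone removal is safe.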

> Write tombstones during rebalance to get rid of deferred delete buffer
> ----------------------------------------------------------------------
>
>                 Key: IGNITE-11704
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11704
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexey Goncharuk
>            Assignee: Semen Boikov
>            Priority: Major
>              Labels: rebalance
>
> Currently Ignite relies on a deferred delete buffer in order to handle 
> write-remove conflicts during rebalance. Given the limited size of the buffer, 
> this approach is fundamentally flawed, especially when persistence is 
> enabled.
> I suggest extending the logic of data storage to be able to store key 
> tombstones, i.e. to keep the version for deleted entries. The tombstones will be 
> stored while rebalance is in progress and should be cleaned up when rebalance 
> is completed.
> Later this approach may be used to implement fast partition rebalance based 
> on Merkle trees (in this case, tombstones should be written on an incomplete 
> baseline).



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
