[ 
https://issues.apache.org/jira/browse/IGNITE-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900252#comment-16900252
 ] 

Pavel Kovalenko commented on IGNITE-11704:
------------------------------------------

[~sboikov]
Thank you for the contribution.
I have a couple of questions and suggestions regarding the change:
1) I think we should keep 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManager#partitionIterator
 with its default behaviour (withTombstones=false). That would avoid having to 
change existing code that relies on the previous partitionIterator behaviour.
2) Unnecessary brackets in 
org/apache/ignite/internal/processors/cache/GridCacheMapEntry.java:1715
3) Why is the double-check in 
org/apache/ignite/internal/processors/cache/GridCacheMapEntry.java:1723 
needed?
4) Broken javadoc in 
org/apache/ignite/internal/processors/cache/GridCacheMapEntry.java:5859
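
To illustrate point 1, here is a minimal sketch of what I mean. It is not the 
actual IgniteCacheOffheapManager code: RowSketch and OffheapManagerSketch are 
hypothetical stand-ins, and only the method name partitionIterator and the 
withTombstones flag come from the change. The idea is that the old signature 
stays and delegates with withTombstones=false, so existing callers do not 
change.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a stored row; not Ignite's real row type.
final class RowSketch {
    final String key;
    final boolean tombstone;

    RowSketch(String key, boolean tombstone) {
        this.key = key;
        this.tombstone = tombstone;
    }
}

// Hypothetical stand-in for the offheap manager.
final class OffheapManagerSketch {
    private final Map<Integer, List<RowSketch>> parts = new HashMap<>();

    void add(int part, RowSketch row) {
        parts.computeIfAbsent(part, p -> new ArrayList<>()).add(row);
    }

    // Old signature is kept: callers get the previous behaviour (no tombstones).
    Iterator<RowSketch> partitionIterator(int part) {
        return partitionIterator(part, false);
    }

    // New overload: only callers that need tombstones opt in explicitly.
    Iterator<RowSketch> partitionIterator(int part, boolean withTombstones) {
        List<RowSketch> res = new ArrayList<>();

        for (RowSketch row : parts.getOrDefault(part, new ArrayList<>())) {
            if (withTombstones || !row.tombstone)
                res.add(row);
        }

        return res.iterator();
    }
}
```

With this shape only the rebalance-related callers need to pass 
withTombstones=true.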

My main concern about the change is that tombstones can remain in a partition 
forever if the partition is CASed to the OWNING state and the node shuts down 
immediately afterwards. In that case, after the node comes back, it will never 
clear the tombstones. I think the "tombstoneCreated" flag should be reflected 
in the partition meta information and saved during a checkpoint. The same 
information should be added to the appropriate WAL delta record. During 
recovery, we can then notice that the partition has tombstones and run the 
cleaning process. Also, judging by the code, this flag is never reset.
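
A rough sketch of the flow I have in mind, under heavy assumptions: 
PartitionSketch and PartitionMetaSketch are hypothetical classes, the 
checkpoint is reduced to copying the flag into a meta object, and the WAL 
delta record is left out entirely. Only the "tombstoneCreated" name is taken 
from the change. It shows the flag surviving a restart, triggering the 
cleanup on recovery, and being reset afterwards.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical persisted partition meta; stands in for the real meta page.
final class PartitionMetaSketch {
    final boolean tombstoneCreated;

    PartitionMetaSketch(boolean tombstoneCreated) {
        this.tombstoneCreated = tombstoneCreated;
    }
}

// Hypothetical partition: key -> true if the row is a tombstone.
final class PartitionSketch {
    private final Map<String, Boolean> rows = new HashMap<>();

    // In-memory flag; lost on restart unless it is persisted.
    private boolean tombstoneCreated;

    void put(String key) {
        rows.put(key, Boolean.FALSE);
    }

    void remove(String key) {
        rows.put(key, Boolean.TRUE); // Keep a tombstone instead of deleting.
        tombstoneCreated = true;
    }

    boolean tombstoneCreated() {
        return tombstoneCreated;
    }

    int tombstoneCount() {
        int cnt = 0;

        for (boolean tombstone : rows.values()) {
            if (tombstone)
                cnt++;
        }

        return cnt;
    }

    // Checkpoint stand-in: the flag is saved together with the partition meta.
    PartitionMetaSketch checkpointMeta() {
        return new PartitionMetaSketch(tombstoneCreated);
    }

    // Restart stand-in: rows survive on disk, in-memory state does not.
    static PartitionSketch restart(PartitionSketch old, PartitionMetaSketch meta) {
        PartitionSketch part = new PartitionSketch();

        part.rows.putAll(old.rows);
        part.tombstoneCreated = meta.tombstoneCreated; // Restored from meta.

        if (part.tombstoneCreated)
            part.clearTombstones(); // Recovery notices tombstones and cleans up.

        return part;
    }

    void clearTombstones() {
        rows.values().removeIf(Boolean::booleanValue);

        tombstoneCreated = false; // Reset the flag once cleanup is done.
    }
}
```

In the real code the cleanup would of course have to respect the partition 
state and rebalance completion; the sketch only shows where the flag is 
saved, restored and reset.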

> Write tombstones during rebalance to get rid of deferred delete buffer
> ----------------------------------------------------------------------
>
>                 Key: IGNITE-11704
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11704
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexey Goncharuk
>            Priority: Major
>              Labels: rebalance
>
> Currently Ignite relies on the deferred delete buffer in order to handle 
> write-remove conflicts during rebalance. Given the limited size of the 
> buffer, this approach is fundamentally flawed, especially when persistence 
> is enabled.
> I suggest extending the logic of data storage to be able to store key 
> tombstones, that is, to keep the version for deleted entries. The tombstones 
> will be stored when rebalance is in progress and should be cleaned up when 
> rebalance is completed.
> Later this approach may be used to implement fast partition rebalance based 
> on merkle trees (in this case, tombstones should be written on an incomplete 
> baseline).



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
