[ 
https://issues.apache.org/jira/browse/IGNITE-8610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503419#comment-16503419
 ] 

Alexey Goncharuk commented on IGNITE-8610:
------------------------------------------

[~Jokser], a few comments:
* In {{GridDhtPreloader}} you've added the following code:
{code}
        if (!assignments.isEmpty() && grp.persistenceEnabled()) {
            ctx.database().checkpointReadLock();

            try {
                ((GridCacheDatabaseSharedManager)ctx.database()).lastCheckpointInapplicableForWalRebalance(grp.groupId());
            }
            finally {
                ctx.database().checkpointReadUnlock();
            }
        }
{code}
I suggest introducing such a method in the DatabaseSharedManager with an empty 
default implementation, while the persistence-enabled implementation acquires 
the checkpoint read lock and does the necessary work. This will hide both the 
type cast and the {{if (persistenceEnabled())}} check.
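
A minimal sketch of this suggestion, not the actual Ignite code. The class names {{BaseDatabaseManager}}, {{PersistentDatabaseManager}} and {{PreloaderSketch}}, the hook name {{onRebalanceAssignmentsGenerated}}, and the {{ReentrantReadWriteLock}} standing in for the checkpoint lock are all hypothetical and serve only to illustrate the shape of the refactoring:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the base database manager: the hook is a no-op by default. */
class BaseDatabaseManager {
    /** Called once rebalance assignments exist; nothing to do without persistence. */
    void onRebalanceAssignmentsGenerated(int grpId) {
        // No-op: an in-memory configuration has no checkpoints or WAL history.
    }
}

/** Hypothetical persistence-enabled manager: it takes the checkpoint read lock itself. */
class PersistentDatabaseManager extends BaseDatabaseManager {
    /** Stand-in for the real checkpoint lock (an assumption of this sketch). */
    private final ReentrantReadWriteLock checkpointLock = new ReentrantReadWriteLock();

    /** Records the last group passed to the hook, only so the sketch is observable. */
    int lastMarkedGrpId = -1;

    @Override void onRebalanceAssignmentsGenerated(int grpId) {
        checkpointLock.readLock().lock();

        try {
            // Placeholder for lastCheckpointInapplicableForWalRebalance(grpId).
            lastMarkedGrpId = grpId;
        }
        finally {
            checkpointLock.readLock().unlock();
        }
    }
}

/** Hypothetical caller: no cast and no persistence check remain at the call site. */
class PreloaderSketch {
    static void afterAssignments(BaseDatabaseManager db, boolean assignmentsEmpty, int grpId) {
        if (!assignmentsEmpty)
            db.onRebalanceAssignmentsGenerated(grpId);
    }
}
```

With this shape the caller only checks whether assignments exist. The locking and the persistence-specific work live in the subclass.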

* You've added a synchronous wait for partition re-creation in 
{{generateAssignments}}, which runs in the exchange thread. Let's use our 
generic timed spin-wait there and log a warning if the wait takes too long.
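
A timed wait with periodic warnings could look roughly like the sketch below. This is not Ignite's own utility: the class {{TimedWaitSketch}}, the method {{awaitWithWarnings}}, the step parameter and the warning text are hypothetical, and a plain {{java.util.concurrent.Future}} stands in for whatever future represents partition re-creation:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

/** Hypothetical helper: waits in bounded steps and warns after every step that times out. */
class TimedWaitSketch {
    static <T> T awaitWithWarnings(Future<T> fut, long stepMs, Consumer<String> warn) {
        long start = System.nanoTime();

        while (true) {
            try {
                // Bounded wait, so the calling thread regains control periodically.
                return fut.get(stepMs, TimeUnit.MILLISECONDS);
            }
            catch (TimeoutException ignored) {
                long waitedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

                warn.accept("Still waiting for partition re-creation [waitedMs=" + waitedMs + ']');
            }
            catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();

                throw new IllegalStateException("Interrupted while waiting", e);
            }
            catch (ExecutionException e) {
                throw new IllegalStateException("Awaited operation failed", e.getCause());
            }
        }
    }
}
```

The exchange thread still blocks, but a long wait becomes visible in the logs instead of looking like a silent hang.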

> Searching checkpoint / WAL history for rebalancing is not properly working in 
> case of local/global WAL disabling
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-8610
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8610
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>    Affects Versions: 2.5
>            Reporter: Pavel Kovalenko
>            Assignee: Pavel Kovalenko
>            Priority: Major
>             Fix For: 2.6
>
>
> After the implementation of IGNITE-6411 and IGNITE-8087, we can face a 
> situation where WAL was temporarily disabled and then enabled again after 
> some checkpoint. In this case we can't treat that checkpoint as a start point 
> for rebalancing, because the WAL history after such a checkpoint may contain gaps.
> We should rework our checkpoint / WAL history search mechanism to ignore 
> such checkpoints.
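
The idea in the description can be illustrated with a small sketch. It assumes each checkpoint in the history carries a flag saying the WAL history after it may contain gaps. It further assumes that any checkpoint older than a flagged one is unusable as well, since the same gap lies after it. {{CheckpointEntry}}, {{CheckpointHistorySketch}} and {{earliestUsable}} are hypothetical names, not Ignite classes:

```java
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical history entry: checkpoint id plus a "WAL may have gaps after me" flag. */
class CheckpointEntry {
    final long id;
    final boolean inapplicableForWalRebalance;

    CheckpointEntry(long id, boolean inapplicableForWalRebalance) {
        this.id = id;
        this.inapplicableForWalRebalance = inapplicableForWalRebalance;
    }
}

class CheckpointHistorySketch {
    /**
     * Returns the id of the oldest checkpoint that can serve as a rebalance start point.
     * The history must be ordered from oldest to newest.
     */
    static OptionalLong earliestUsable(List<CheckpointEntry> history) {
        OptionalLong res = OptionalLong.empty();

        // Walk from newest to oldest and stop at the first flagged checkpoint:
        // neither it nor anything older has a gap-free WAL tail.
        for (int i = history.size() - 1; i >= 0; i--) {
            CheckpointEntry cp = history.get(i);

            if (cp.inapplicableForWalRebalance)
                break;

            res = OptionalLong.of(cp.id);
        }

        return res;
    }
}
```

An empty result means no checkpoint qualifies, so WAL-based rebalancing cannot be used for that group in this sketch.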



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
