[
https://issues.apache.org/jira/browse/IGNITE-8610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503419#comment-16503419
]
Alexey Goncharuk commented on IGNITE-8610:
------------------------------------------
[~Jokser], a few comments:
* In {{GridDhtPreloader}} you've added the following code:
{code}
if (!assignments.isEmpty() && grp.persistenceEnabled()) {
    ctx.database().checkpointReadLock();
    try {
        ((GridCacheDatabaseSharedManager)ctx.database())
            .lastCheckpointInapplicableForWalRebalance(grp.groupId());
    }
    finally {
        ctx.database().checkpointReadUnlock();
    }
}
{code}
I suggest adding such a method to the DatabaseSharedManager with an empty
default implementation, while the persistence-enabled implementation acquires
the checkpoint read lock and does the necessary work. This will hide both the
{{instanceof}} and the {{if (persistenceEnabled())}} check.
* You've added a synchronous wait for partition re-creation in
{{generateAssignments}}, which runs in the exchange thread. Let's use our
generic timed-spin-wait there and log a warning if the wait takes too long.
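To make both points concrete, here is a rough sketch rather than the actual patch. All class names ({{DbManagerSketch}}, {{PersistentDbManagerSketch}}, {{TimedSpinWaitSketch}}), the {{awaitWithWarning}} helper and the set used for bookkeeping are hypothetical stand-ins, and a plain {{ReentrantReadWriteLock}} is assumed in place of the real checkpoint lock. The first part shows a no-op default that the persistence-enabled subclass overrides, so the caller needs neither a cast nor a persistence check. The second part shows a polling wait that warns each time another threshold interval passes.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

/** Hypothetical stand-in for the shared database manager (not the real Ignite class). */
class DbManagerSketch {
    /** No-op by default: without persistence there is no checkpoint history to invalidate. */
    void lastCheckpointInapplicableForWalRebalance(int grpId) {
        // Intentionally empty.
    }
}

/** Hypothetical persistence-enabled implementation that owns the locking. */
class PersistentDbManagerSketch extends DbManagerSketch {
    /** Assumption: a plain read-write lock stands in for the checkpoint lock. */
    private final ReentrantReadWriteLock checkpointLock = new ReentrantReadWriteLock();

    /** Placeholder bookkeeping: groups whose last checkpoint must be ignored. */
    final Set<Integer> inapplicableGrps = ConcurrentHashMap.newKeySet();

    @Override void lastCheckpointInapplicableForWalRebalance(int grpId) {
        checkpointLock.readLock().lock();
        try {
            inapplicableGrps.add(grpId);
        }
        finally {
            checkpointLock.readLock().unlock();
        }
    }
}

/** Hypothetical timed spin-wait helper. */
class TimedSpinWaitSketch {
    /**
     * Polls {@code cond} until it returns true. Calls {@code warn} every time
     * another {@code warnAfterMs} milliseconds have elapsed.
     *
     * @return Number of warnings issued.
     */
    static int awaitWithWarning(BooleanSupplier cond, long warnAfterMs, long pollMs, Consumer<String> warn) {
        long start = System.nanoTime();
        long nextWarnMs = warnAfterMs;
        int warnings = 0;

        while (!cond.getAsBoolean()) {
            // parkNanos returns immediately for an interrupted thread, so fail fast instead of busy-spinning.
            if (Thread.currentThread().isInterrupted())
                throw new IllegalStateException("Interrupted while waiting");

            long waitedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

            if (waitedMs >= nextWarnMs) {
                warn.accept("Wait is taking too long [waitedMs=" + waitedMs + ']');
                warnings++;
                nextWarnMs += warnAfterMs;
            }

            LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(pollMs));
        }

        return warnings;
    }
}
```

With something like this, the call site in the preloader would reduce to a single unconditional call on the manager when assignments are non-empty, and the wait in {{generateAssignments}} would pass the "partition re-created" check as the condition.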
> Searching checkpoint / WAL history for rebalancing is not properly working in
> case of local/global WAL disabling
> ----------------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-8610
> URL: https://issues.apache.org/jira/browse/IGNITE-8610
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: 2.5
> Reporter: Pavel Kovalenko
> Assignee: Pavel Kovalenko
> Priority: Major
> Fix For: 2.6
>
>
> After the implementation of IGNITE-6411 and IGNITE-8087, we can face a
> situation where, after some checkpoint, WAL was temporarily disabled and then
> enabled again. In this case we can't treat that checkpoint as a starting point
> for rebalancing, because the WAL history after such a checkpoint may contain gaps.
> We should rework our checkpoint / WAL history search mechanism and ignore
> such checkpoints.