[
https://issues.apache.org/jira/browse/IGNITE-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262787#comment-17262787
]
Ilya Kasnacheev commented on IGNITE-13976:
------------------------------------------
[~av] can you please review the PR and hint at what the root cause of the
problem might be? The main reproducer is WalDisableTest (a runnable program
with a main() method).
I have devised a sequential test (see the PR) and made it pass with a hack,
but the hack does not address the root cause. I have also tried patching in
another place, but that did not fix the reproducer either.
I don't understand why there is anything in
org.apache.ignite.internal.processors.cache.CacheGroupDescriptor#walChangeReqs
on a fresh node if a WAL change is a PME and is purely sequential. I also don't
understand why the WAL status is not propagated from the cache information to
the cache group.
> WAL disable/enable with node restarts results in mismatching state, data loss
> -----------------------------------------------------------------------------
>
> Key: IGNITE-13976
> URL: https://issues.apache.org/jira/browse/IGNITE-13976
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: 2.9.1
> Reporter: Ilya Kasnacheev
> Assignee: Ilya Kasnacheev
> Priority: Major
>
> If you try to enable or disable WAL on an unstable topology, you will end up
> in a state where the WAL status is undefined: nodes might have different WAL
> statuses, and the only way to fix it is to restart the cluster, which leads to
> data loss because Ignite removes data on restart if WAL is disabled.
> See the reproducer in PR.