[jira] [Commented] (IGNITE-19111) Storage corruption if pages changed after last checkpoint during deactivation

Ignite TC Bot (Jira) Fri, 31 Mar 2023 00:19:07 -0700


    [ 
https://issues.apache.org/jira/browse/IGNITE-19111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17707161#comment-17707161
 ]


Ignite TC Bot commented on IGNITE-19111:
----------------------------------------

{panel:title=Branch: [pull/10615/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/10615/head] Base: [master] : New Tests (2)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#00008b}Disk Page Compressions 1{color} [[tests 1|https://ci2.ignite.apache.org/viewLog.html?buildId=7116301]]
* {color:#013220}IgnitePdsCompressionTestSuite: IgnitePdsCheckpointAfterDeactivateTest.testCpAfterClusterDeactivate - PASSED{color}

{color:#00008b}PDS 1{color} [[tests 1|https://ci2.ignite.apache.org/viewLog.html?buildId=7116255]]
* {color:#013220}IgnitePdsTestSuite: IgnitePdsCheckpointAfterDeactivateTest.testCpAfterClusterDeactivate - PASSED{color}

{panel}
[TeamCity *--> Run :: All* Results|https://ci2.ignite.apache.org/viewLog.html?buildId=7116304&buildTypeId=IgniteTests24Java8_RunAll]

> Storage corruption if pages changed after last checkpoint during deactivation
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-19111
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19111
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Aleksey Plekhanov
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>              Labels: ise
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> During cluster deactivation we force a checkpoint (with the "caches stop" reason) 
> and remove the checkpoint listeners before the caches are actually stopped. But if there 
> is any activity on data pages on the node after that checkpoint, but before the 
> caches stop and the next checkpoint is started, the storage can be corrupted.
> Reproducer:
> {code:java}
>     /** {@inheritDoc} */
>     @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
>         return super.getConfiguration(igniteInstanceName)
>             .setDataStorageConfiguration(new DataStorageConfiguration()
>                 .setDefaultDataRegionConfiguration(new DataRegionConfiguration().setPersistenceEnabled(true))
>                 .setCheckpointFrequency(1_000L))
>             .setFailureHandler(new StopNodeFailureHandler());
>     }
>     /** */
>     @Test
>     public void testCpAfterClusterDeactivate() throws Exception {
>         IgniteEx ignite0 = startGrid(0);
>         IgniteEx ignite1 = startGrid(1);
>         ignite0.cluster().state(ClusterState.ACTIVE);
>         ignite0.getOrCreateCache(new CacheConfiguration<>(DEFAULT_CACHE_NAME).setBackups(1)
>             .setAffinity(new RendezvousAffinityFunction(false, 10)));
>         try (IgniteDataStreamer<Integer, Integer> streamer = ignite0.dataStreamer(DEFAULT_CACHE_NAME)) {
>             for (int i = 0; i < 100_000; i++)
>                 streamer.addData(i, i);
>         }
>         stopGrid(0);
>         try (IgniteDataStreamer<Integer, Integer> streamer = ignite1.dataStreamer(DEFAULT_CACHE_NAME)) {
>             streamer.allowOverwrite(true);
>             for (int i = 0; i < 100_000; i++)
>                 streamer.addData(i, i + 1);
>         }
>         ignite0 = startGrid(0);
>         ((GridCacheDatabaseSharedManager)ignite0.context().cache().context().database())
>             .addCheckpointListener(new CheckpointListener() {
>             @Override public void onMarkCheckpointBegin(Context ctx) {
>                 // No-op.
>             }
>             @Override public void onCheckpointBegin(Context ctx) {
>                 if ("caches stop".equals(ctx.progress().reason()))
>                     doSleep(1_000L);
>             }
>             @Override public void beforeCheckpointBegin(Context ctx) {
>                 // No-op.
>             }
>         });
>         ignite0.cluster().state(ClusterState.INACTIVE);
>         doSleep(2_000L);
>         ignite0.cluster().state(ClusterState.ACTIVE);
>         IgniteCache<Integer, Integer> cache = ignite0.cache(DEFAULT_CACHE_NAME);
>         for (int i = 0; i < 100_000; i++)
>             assertEquals((Integer)(i + 1), cache.get(i));
>     }
> {code}
> This reproducer shuts down the node with some probability (about 1 run in 5 on my 
> laptop) on activation or on the last check, with {{CorruptedTreeException}}.
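
The ordering problem described in the issue can be illustrated with a toy model. This is only a sketch and is not Ignite code: ToyStore, ToyCheckpointListener and the row-count "metadata" are hypothetical names invented for the illustration. It also assumes that checkpoint listeners are what persist metadata alongside the data pages, which the issue text does not state. Under that assumption, a page change made after the listeners are removed reaches disk through a later checkpoint without the matching metadata.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only; none of these types are Ignite classes. */
public class DeactivationRaceSketch {
    /** Hypothetical listener that may persist metadata when a checkpoint begins. */
    interface ToyCheckpointListener {
        void onCheckpointBegin(ToyStore store);
    }

    /** Hypothetical store: in-memory pages plus an on-disk copy written only by checkpoints. */
    static final class ToyStore {
        final Map<Integer, Integer> memPages = new HashMap<>();
        final Map<Integer, Integer> diskPages = new HashMap<>();
        final List<ToyCheckpointListener> listeners = new ArrayList<>();

        /** Metadata that reaches disk only through a listener (an assumption of this sketch). */
        int diskRowCount;

        void put(int pageId, int val) {
            memPages.put(pageId, val);
        }

        /** Notifies the registered listeners, then flushes all in-memory pages to disk. */
        void checkpoint() {
            for (ToyCheckpointListener lsnr : listeners)
                lsnr.onCheckpointBegin(this);

            diskPages.putAll(memPages);
        }

        /** On-disk state is consistent when the persisted row count matches the persisted pages. */
        boolean diskConsistent() {
            return diskRowCount == diskPages.size();
        }
    }

    /** Runs the deactivation sequence and reports whether on-disk data and metadata still agree. */
    public static boolean deactivate(boolean writeAfterFinalCheckpoint) {
        ToyStore store = new ToyStore();
        store.listeners.add(s -> s.diskRowCount = s.memPages.size());

        store.put(1, 10);
        store.checkpoint();      // Forced "caches stop" checkpoint, listeners still registered.
        store.listeners.clear(); // Listeners are removed before the caches actually stop.

        if (writeAfterFinalCheckpoint)
            store.put(2, 20);    // Late page change.

        store.checkpoint();      // Next checkpoint runs without listeners.

        return store.diskConsistent();
    }

    public static void main(String[] args) {
        System.out.println("no late write, consistent: " + deactivate(false));
        System.out.println("late write, consistent: " + deactivate(true));
    }
}
```

In this model the state stays consistent as long as nothing touches pages between listener removal and the next checkpoint, which matches the issue's observation that the failure is intermittent and depends on timing.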



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
