[ https://issues.apache.org/jira/browse/IGNITE-19111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17707161#comment-17707161 ]
Ignite TC Bot commented on IGNITE-19111:
----------------------------------------

{panel:title=Branch: [pull/10615/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/10615/head] Base: [master] : New Tests (2)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#00008b}Disk Page Compressions 1{color} [[tests 1|https://ci2.ignite.apache.org/viewLog.html?buildId=7116301]]
* {color:#013220}IgnitePdsCompressionTestSuite: IgnitePdsCheckpointAfterDeactivateTest.testCpAfterClusterDeactivate - PASSED{color}

{color:#00008b}PDS 1{color} [[tests 1|https://ci2.ignite.apache.org/viewLog.html?buildId=7116255]]
* {color:#013220}IgnitePdsTestSuite: IgnitePdsCheckpointAfterDeactivateTest.testCpAfterClusterDeactivate - PASSED{color}

{panel}
[TeamCity *--> Run :: All* Results|https://ci2.ignite.apache.org/viewLog.html?buildId=7116304&buildTypeId=IgniteTests24Java8_RunAll]

> Storage corruption if pages changed after last checkpoint during deactivation
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-19111
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19111
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Aleksey Plekhanov
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>              Labels: ise
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> During cluster deactivation we force a checkpoint (with the "caches stop"
> reason) and remove the checkpoint listeners before the caches are actually
> stopped. If there is any activity on data pages on the node after that
> checkpoint, but before the caches stop and the next checkpoint starts, the
> storage can be corrupted.
> Reproducer:
> {code:java}
> /** {@inheritDoc} */
> @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
>     return super.getConfiguration(igniteInstanceName)
>         .setDataStorageConfiguration(new DataStorageConfiguration()
>             .setDefaultDataRegionConfiguration(new DataRegionConfiguration().setPersistenceEnabled(true))
>             .setCheckpointFrequency(1_000L))
>         .setFailureHandler(new StopNodeFailureHandler());
> }
>
> /** */
> @Test
> public void testCpAfterClusterDeactivate() throws Exception {
>     IgniteEx ignite0 = startGrid(0);
>     IgniteEx ignite1 = startGrid(1);
>
>     ignite0.cluster().state(ClusterState.ACTIVE);
>
>     ignite0.getOrCreateCache(new CacheConfiguration<>(DEFAULT_CACHE_NAME).setBackups(1)
>         .setAffinity(new RendezvousAffinityFunction(false, 10)));
>
>     // Load the initial data while both nodes are online.
>     try (IgniteDataStreamer<Integer, Integer> streamer = ignite0.dataStreamer(DEFAULT_CACHE_NAME)) {
>         for (int i = 0; i < 100_000; i++)
>             streamer.addData(i, i);
>     }
>
>     // Stop node 0 and overwrite every entry on node 1, leaving node 0's copy stale.
>     stopGrid(0);
>
>     try (IgniteDataStreamer<Integer, Integer> streamer = ignite1.dataStreamer(DEFAULT_CACHE_NAME)) {
>         streamer.allowOverwrite(true);
>
>         for (int i = 0; i < 100_000; i++)
>             streamer.addData(i, i + 1);
>     }
>
>     ignite0 = startGrid(0);
>
>     // Delay the forced "caches stop" checkpoint that runs during deactivation.
>     ((GridCacheDatabaseSharedManager)ignite0.context().cache().context().database()).addCheckpointListener(new CheckpointListener() {
>         @Override public void onMarkCheckpointBegin(Context ctx) {
>             // No-op.
>         }
>
>         @Override public void onCheckpointBegin(Context ctx) {
>             if ("caches stop".equals(ctx.progress().reason()))
>                 doSleep(1_000L);
>         }
>
>         @Override public void beforeCheckpointBegin(Context ctx) {
>             // No-op.
>         }
>     });
>
>     ignite0.cluster().state(ClusterState.INACTIVE);
>
>     doSleep(2_000L);
>
>     ignite0.cluster().state(ClusterState.ACTIVE);
>
>     IgniteCache<Integer, Integer> cache = ignite0.cache(DEFAULT_CACHE_NAME);
>
>     // Every key must hold the overwritten value.
>     for (int i = 0; i < 100_000; i++)
>         assertEquals((Integer)(i + 1), cache.get(i));
> }
> {code}
> This reproducer shuts the node down with {{CorruptedTreeException}} with some
> probability (about 1 in 5 on my laptop), either on activation or during the
> final check.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
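
The window described in the issue (a forced checkpoint, listeners removed, then more page changes before the next checkpoint) can be illustrated with a toy model. This is a sketch under stated assumptions, not Ignite code: the class `CheckpointWindowSketch`, its fields, and the idea that a listener's job is to copy an on-heap row counter into a meta page are all hypothetical simplifications chosen for illustration. The `listenerRemovedEarly` flag only contrasts the two orderings and is not meant to describe the change made in the pull request.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model only: none of these types or fields exist in Apache Ignite. */
final class CheckpointWindowSketch {
    /** Hypothetical page id reserved for metadata; data pages use other ids. */
    static final int META_PAGE = 0;

    private final Map<Integer, Integer> memory = new HashMap<>(); // in-memory pages
    private final Map<Integer, Integer> disk = new HashMap<>();   // checkpointed pages
    private final Set<Integer> dirty = new HashSet<>();           // pages changed since last checkpoint
    private final List<Runnable> listeners = new ArrayList<>();   // stand-in for checkpoint listeners
    private int rowCnt;                                           // on-heap metadata, not yet in a page

    /** Writes a data page and updates the on-heap counter for new pages. */
    void putRow(int pageId, int val) {
        if (memory.put(pageId, val) == null)
            rowCnt++;

        dirty.add(pageId);
    }

    /** What the hypothetical listener does: copy on-heap metadata into the meta page. */
    void saveMeta() {
        memory.put(META_PAGE, rowCnt);
        dirty.add(META_PAGE);
    }

    /** Runs the registered listeners, then flushes every dirty page to "disk". */
    void checkpoint() {
        for (Runnable lsnr : listeners)
            lsnr.run();

        for (int pageId : dirty)
            disk.put(pageId, memory.get(pageId));

        dirty.clear();
    }

    /** The on-disk state is consistent if the meta page matches the number of data pages. */
    boolean diskConsistent() {
        int dataPages = disk.size() - (disk.containsKey(META_PAGE) ? 1 : 0);

        return disk.getOrDefault(META_PAGE, 0) == dataPages;
    }

    /** Replays the sequence from the issue description on the toy store. */
    static boolean consistentAfterDeactivation(boolean listenerRemovedEarly) {
        CheckpointWindowSketch store = new CheckpointWindowSketch();
        Runnable metaLsnr = store::saveMeta;

        store.listeners.add(metaLsnr);

        for (int pageId = 1; pageId <= 3; pageId++)
            store.putRow(pageId, pageId);

        store.checkpoint();                   // forced "caches stop" checkpoint

        if (listenerRemovedEarly)
            store.listeners.remove(metaLsnr); // listeners removed before the caches stop

        store.putRow(4, 4);                   // late activity on a data page
        store.checkpoint();                   // next periodic checkpoint

        return store.diskConsistent();
    }

    public static void main(String[] args) {
        System.out.println("listener removed early -> consistent: " + consistentAfterDeactivation(true));
        System.out.println("listener kept          -> consistent: " + consistentAfterDeactivation(false));
    }
}
```

In this model the second checkpoint persists the late data page either way, but the meta page is refreshed only while the listener is still registered, so the early-removal run leaves a meta page that disagrees with the data pages on disk.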