Hi,

Before I get to the errors, here is our production configuration:
We are using Ignite as a persistent store in PARTITIONED mode with 6 data
nodes, each running on a different server. Atomicity mode is ATOMIC,
rebalance mode is ASYNC, and CacheWriteSynchronizationMode is FULL_SYNC.
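
For reference, the cache settings described above correspond roughly to the
sketch below. The class name, cache name, and key/value types are
placeholders for illustration, not our real ones, and it needs the Ignite
jars on the classpath.

```java
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheRebalanceMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class CacheConfigSketch {
    // "exampleCache" and the <Object, Object> types are hypothetical placeholders.
    static CacheConfiguration<Object, Object> cacheConfig() {
        CacheConfiguration<Object, Object> ccfg = new CacheConfiguration<>("exampleCache");

        ccfg.setCacheMode(CacheMode.PARTITIONED);               // partitioned across the data nodes
        ccfg.setAtomicityMode(CacheAtomicityMode.ATOMIC);       // atomicity mode ATOMIC
        ccfg.setRebalanceMode(CacheRebalanceMode.ASYNC);        // rebalance mode ASYNC
        ccfg.setWriteSynchronizationMode(
            CacheWriteSynchronizationMode.FULL_SYNC);           // wait for all participating nodes

        return ccfg;
    }
}
```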

We got the following errors when we upgraded our production environment from
Ignite 2.1 to Ignite 2.3.

Out of the six nodes, we got a page corruption error on only one node (though
all the nodes share the same set of ). Below is the stack trace for the error:

java.lang.IllegalStateException: Failed to get page IO instance (page content is corrupted)
        at org.apache.ignite.internal.processors.cache.persistence.tree.io.IOVersions.forVersion(IOVersions.java:83)
        at org.apache.ignite.internal.processors.cache.persistence.tree.io.IOVersions.forPage(IOVersions.java:95)
        at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.init(PagesList.java:219)
        at org.apache.ignite.internal.processors.cache.persistence.freelist.FreeListImpl.<init>(FreeListImpl.java:358)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore$1.<init>(GridCacheOffheapManager.java:923)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.init0(GridCacheOffheapManager.java:915)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.initialUpdateCounter(GridCacheOffheapManager.java:1202)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.onPartitionInitialCounterUpdated(GridCacheOffheapManager.java:471)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restorePartitionState(GridCacheDatabaseSharedManager.java:1700)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.applyLastUpdates(GridCacheDatabaseSharedManager.java:1657)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreState(GridCacheDatabaseSharedManager.java:1072)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.beforeExchange(GridCacheDatabaseSharedManager.java:863)
        at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1019)
        at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:651)
        at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2279)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:748)


Thinking that a page was corrupt, we cleaned the various Ignite directories
(IgnitePersistentStore, WalStore, WalArchive, Work) on that machine. After
the cleanup, when we started the node, we got a different error:
org.apache.ignite.internal.IgniteKernal%eb4de377-2601-47c6-9b13-fd6ee02cb0fe - Got exception while starting (will rollback startup routine).
class org.apache.ignite.IgniteCheckedException: Memory configuration mismatch (fix configuration or set -DIGNITE_SKIP_CONFIGURATION_CONSISTENCY_CHECK=true system property) [rmtNodeId=47e76683-39ec-4d22-b9c8-eadd6dad7fdc, locPageSize = 4096, rmtPageSize = 2048]
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.checkMemoryConfiguration(GridCacheProcessor.java:3088)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.checkConsistency(GridCacheProcessor.java:825)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onKernalStart(GridCacheProcessor.java:763)
        at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1060)
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1909)
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1652)
        at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1080)
        at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:998)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:884)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:783)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:653)
        at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:622)
        at org.apache.ignite.Ignition.start(Ignition.java:347)
        at com.msci.datagrid.AlcyoneDataGridServer.main(AlcyoneDataGridServer.java:63)


We know that the default page size changed from 2 KB to 4 KB in Ignite 2.3.
However, since there was no way to migrate the existing persisted data and we
did not want to lose it, we did not clean the data persisted with Ignite 2.1.
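
To make the page size point concrete, here is a minimal sketch of how a page
size can be pinned explicitly with the DataStorageConfiguration API that
ships in 2.3, so that a node does not fall back to the 4 KB default. The
class and method names are hypothetical, it needs the Ignite jars on the
classpath, and whether a 2.3 node configured this way can safely read page
files written by 2.1 is an open assumption, not something confirmed here.

```java
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class PageSizeConfigSketch {
    // 2 KB, the page size the existing 2.1 data was written with.
    static final int LEGACY_PAGE_SIZE = 2 * 1024;

    static IgniteConfiguration withLegacyPageSize() {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();

        // Pin the page size instead of relying on the version default.
        storageCfg.setPageSize(LEGACY_PAGE_SIZE);

        // Enable native persistence for the default data region.
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);
        return cfg;
    }
}
```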

We also did not set the -DIGNITE_SKIP_CONFIGURATION_CONSISTENCY_CHECK system
property, since we did not encounter any error of this sort while upgrading
from 2.1 to 2.3 in our UAT environment.

One way to avoid this error would be to set the
-DIGNITE_SKIP_CONFIGURATION_CONSISTENCY_CHECK=true system property in
production now. Do you see any issue with this?
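
For clarity, the property can be passed on the JVM command line as
-DIGNITE_SKIP_CONFIGURATION_CONSISTENCY_CHECK=true, or set programmatically
before the node starts, as in the sketch below. The class and method names
are hypothetical, and the sketch assumes Ignite reads the value as an
ordinary JVM system property at startup.

```java
public class SkipConsistencyCheckSketch {
    // Property name taken from the error message, without the -D prefix.
    static final String SKIP_CHECK_PROP = "IGNITE_SKIP_CONFIGURATION_CONSISTENCY_CHECK";

    // Must run before the node is started so the check sees the value.
    static void enableSkipCheck() {
        System.setProperty(SKIP_CHECK_PROP, "true");
    }

    public static void main(String[] args) {
        enableSkipCheck();
        System.out.println(SKIP_CHECK_PROP + "=" + System.getProperty(SKIP_CHECK_PROP));
        // The Ignition.start(...) call of the real server would follow here.
    }
}
```

Either way the property only suppresses the startup check; it does not
change the page size any node actually uses.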

Thanks,








