[ https://issues.apache.org/jira/browse/IGNITE-11743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Chugunov updated IGNITE-11743:
-------------------------------------
    Description: 
When an existing cache is stopped (e.g. via a call to
Ignite#destroyCache(String name)), the action is distributed across the cluster
by the discovery mechanism and is processed from the *disco-notifier-worker*
thread.
At the same time, a joining node prepares to start caches from the
*exchange-worker* thread.

If a cache stop request arrives at the new node in the middle of cache start
preparation, it may cause an exception in FilePageStoreManager like the one
below and crash the node.

A test reproducing the issue is attached.

{noformat}
class org.apache.ignite.IgniteCheckedException: Failed to get page store for the given cache ID (cache has not been started): -1422502786
        at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.getStore(FilePageStoreManager.java:1132)
        at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:482)
        at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:469)
        at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:854)
        at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:681)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.getOrAllocateCacheMetas(GridCacheOffheapManager.java:869)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.initDataStructures(GridCacheOffheapManager.java:128)
        at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.start(IgniteCacheOffheapManagerImpl.java:193)
        at org.apache.ignite.internal.processors.cache.CacheGroupContext.start(CacheGroupContext.java:1043)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCacheGroup(GridCacheProcessor.java:2829)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.getOrCreateCacheGroupContext(GridCacheProcessor.java:2557)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheContext(GridCacheProcessor.java:2387)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$null$6a5b31b9$1(GridCacheProcessor.java:2209)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$prepareStartCaches$5(GridCacheProcessor.java:2130)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$prepareStartCaches$926b6886$1(GridCacheProcessor.java:2206)
        at org.apache.ignite.internal.util.IgniteUtils.lambda$null$1(IgniteUtils.java:10874)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
{noformat}
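
The interleaving described above can be modelled in isolation. The sketch below
is not Ignite code: PageStores, reproduce and the latch names are hypothetical
stand-ins, and the latches force an ordering that in a real cluster depends on
timing. It only illustrates one plausible way a stop request processed on one
thread can remove state that the start-preparation thread is about to read.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class CacheStartStopRaceSketch {
    /** Hypothetical stand-in for a per-cache page store registry. */
    static final class PageStores {
        private final Map<Integer, String> stores = new ConcurrentHashMap<>();

        void initialize(int cacheId) {
            stores.put(cacheId, "store-" + cacheId);
        }

        void shutdown(int cacheId) {
            stores.remove(cacheId);
        }

        String getStore(int cacheId) {
            String store = stores.get(cacheId);
            if (store == null)
                throw new IllegalStateException(
                    "Failed to get page store for the given cache ID: " + cacheId);
            return store;
        }
    }

    /** Forces the bad interleaving; returns the failure message, or null if none. */
    static String reproduce(int cacheId) throws InterruptedException {
        PageStores pageStores = new PageStores();
        CountDownLatch storeInitialized = new CountDownLatch(1);
        CountDownLatch storeRemoved = new CountDownLatch(1);
        AtomicReference<String> failure = new AtomicReference<>();

        // Models the joining node preparing to start the cache.
        Thread starter = new Thread(() -> {
            try {
                pageStores.initialize(cacheId);
                storeInitialized.countDown();
                storeRemoved.await();          // the stop request lands here
                pageStores.getStore(cacheId);  // state is gone, so this throws
            } catch (IllegalStateException e) {
                failure.set(e.getMessage());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "exchange-worker");

        // Models the cache stop request delivered through discovery.
        Thread stopper = new Thread(() -> {
            try {
                storeInitialized.await();
                pageStores.shutdown(cacheId);
                storeRemoved.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "disco-notifier-worker");

        starter.start();
        stopper.start();
        starter.join();
        stopper.join();
        return failure.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(reproduce(-1422502786));
    }
}
```

In this model the read fails on every run because the latches pin the
ordering; without them the failure would be intermittent, which matches the
"may lead to" wording of the report.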

  was:
When an existing cache is stopped (e.g. via a call to
Ignite#destroyCache(String name)), the action is distributed across the cluster
by the discovery mechanism and is processed from the *disco-notifier-worker*
thread.
At the same time, a joining node prepares to start caches from *exchange-thread*.

If a cache stop request arrives at the new node in the middle of cache start
preparation, it may cause an exception in FilePageStoreManager like the one
below and crash the node.

A test reproducing the issue is attached.

{noformat}
class org.apache.ignite.IgniteCheckedException: Failed to get page store for the given cache ID (cache has not been started): -1422502786
        at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.getStore(FilePageStoreManager.java:1132)
        at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:482)
        at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:469)
        at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:854)
        at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:681)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.getOrAllocateCacheMetas(GridCacheOffheapManager.java:869)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.initDataStructures(GridCacheOffheapManager.java:128)
        at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.start(IgniteCacheOffheapManagerImpl.java:193)
        at org.apache.ignite.internal.processors.cache.CacheGroupContext.start(CacheGroupContext.java:1043)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCacheGroup(GridCacheProcessor.java:2829)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.getOrCreateCacheGroupContext(GridCacheProcessor.java:2557)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheContext(GridCacheProcessor.java:2387)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$null$6a5b31b9$1(GridCacheProcessor.java:2209)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$prepareStartCaches$5(GridCacheProcessor.java:2130)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$prepareStartCaches$926b6886$1(GridCacheProcessor.java:2206)
        at org.apache.ignite.internal.util.IgniteUtils.lambda$null$1(IgniteUtils.java:10874)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
{noformat}


> Stopping caches concurrently with node join may lead to crash of the node
> -------------------------------------------------------------------------
>
>                 Key: IGNITE-11743
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11743
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.7
>            Reporter: Sergey Chugunov
>            Assignee: Sergey Chugunov
>            Priority: Major
>             Fix For: 2.8
>
>         Attachments: IgnitePdsNodeRestartCacheCreateTest.java
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
