Hi,

It looks like I found the issue:

https://issues.apache.org/jira/browse/IGNITE-8917

When you use *put* or *removeAll* on a cache with persistence enabled and the data doesn't fit into the data region, an IgniteOutOfMemoryException is thrown. The data streamer doesn't look to be affected by this ticket.
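
For reference, this is roughly the failing scenario - just an untested sketch; the config file path, cache name and value size are placeholders, not from your setup:

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class PutOomRepro {
    public static void main(String[] args) {
        // "persistent-config.xml" is a placeholder for a config with a small
        // persistent data region.
        try (Ignite ignite = Ignition.start("persistent-config.xml")) {
            ignite.cluster().active(true); // persistence requires activation

            IgniteCache<Integer, byte[]> cache = ignite.getOrCreateCache("myCache");

            // Keep putting entries until the data no longer fits into the
            // region; per IGNITE-8917 this can end with
            // IgniteOutOfMemoryException ("Failed to find a page for eviction").
            for (int i = 0; i < 1_000_000; i++)
                cache.put(i, new byte[1024]);
        }
    }
}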

The workaround is pretty simple - use a bigger data region.
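
In Java the data region configuration can look something like this (just a sketch: the region name is taken from your log, the sizes are only examples):

import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

IgniteConfiguration cfg = new IgniteConfiguration();

// Example sizes only - adjust to your data volume.
DataRegionConfiguration region = new DataRegionConfiguration()
    .setName("customformulacalcrts")
    .setInitialSize(512L * 1024 * 1024)   // 512 MB
    .setMaxSize(1024L * 1024 * 1024)      // 1 GB
    .setPersistenceEnabled(true);

// If customformulacalcrts is not your default region, use
// setDataRegionConfigurations(region) instead and point the cache at it
// via CacheConfiguration.setDataRegionName("customformulacalcrts").
cfg.setDataStorageConfiguration(
    new DataStorageConfiguration().setDefaultDataRegionConfiguration(region));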

BR,
Andrei

9/24/2020 11:44 PM, Mitchell Rathbun (BLOOMBERG/ 731 LEX) wrote:
I tried doubling it from 200 MB to 400 MB. My initial test worked, but as I increased the number of fields that I was writing per entry, the same issue occurred again. So increasing the region size seems to just increase the capacity of what can be written, not actually prevent the exception from occurring. I guess my main question is: why is it possible for Ignite to get an OOME when persistence and eviction are enabled? It seems like if there are a lot of writes, performance should degrade as the in-memory cache evicts entries, but no exceptions should occur. The error is always "Failed to find a page for eviction", which doesn't really make sense when eviction is enabled. What are the internal structures that Ignite holds in off-heap memory? Also, why isn't this an issue when using IgniteDataStreamer if the problem has to do with space for internal structures? Wouldn't Ignite need the same internal structures in either case?
From: user@ignite.apache.org At: 09/24/20 11:42:21
To: user@ignite.apache.org
Subject: Re: OutOfMemoryException with Persistence and Eviction Enabled

    Hi,

    Did you try to increase the DataRegion size a little bit? It looks
    like 190 MB isn't enough for some internal structures that Ignite
    stores in OFF-HEAP in addition to the data. I suggest you increase
    the data region size to, for example, 512 MB - 1024 MB and take a
    look at how it works. If you still see the issue then I guess we
    should create a ticket:
    1) Collect the logs
    2) Provide a Java code example
    3) Provide the configuration of the nodes
    After that, we can take a deeper look into it and, if it's an
    issue, file a JIRA.

    BR,
    Andrei

    9/23/2020 7:36 PM, Mitchell Rathbun (BLOOMBERG/ 731 LEX) wrote:
    Here is the exception:
    Sep 22, 2020 7:58:22 PM java.util.logging.LogManager$RootLogger log
    SEVERE: Critical system error detected. Will be handled
    accordingly to configured handler
    [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
    super=AbstractFailureHandler
    [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED,
    SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext
    [type=CRITICAL_ERROR, err=class
    o.a.i.i.mem.IgniteOutOfMemoryException: Out of memory in data
    region [name=customformulacalcrts, initSize=190.7 MiB,
    maxSize=190.7 MiB, persistenceEnabled=true] Try the following:
    ^-- Increase maximum off-heap memory size
    (DataRegionConfiguration.maxSize)
    ^-- Enable Ignite persistence
    (DataRegionConfiguration.persistenceEnabled)
    ^-- Enable eviction or expiration policies]]
    class org.apache.ignite.internal.mem.IgniteOutOfMemoryException:
    Out of memory in data region [name=customformulacalcrts,
    initSize=190.7 MiB, maxSize=190.7 MiB, persistenceEnabled=true]
    Try the following:
    ^-- Increase maximum off-heap memory size
    (DataRegionConfiguration.maxSize)
    ^-- Enable Ignite persistence
    (DataRegionConfiguration.persistenceEnabled)
    ^-- Enable eviction or expiration policies
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.allocatePage(PageMemoryImpl.java:607)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.allocateDataPage(AbstractFreeList.java:464)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.insertDataRow(AbstractFreeList.java:491)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.CacheFreeListImpl.insertDataRow(CacheFreeListImpl.java:59)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.CacheFreeListImpl.insertDataRow(CacheFreeListImpl.java:35)
    at org.apache.ignite.internal.processors.cache.persistence.RowStore.addRow(RowStore.java:103)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.createRow(IgniteCacheOffheapManagerImpl.java:1691)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.createRow(GridCacheOffheapManager.java:1910)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$UpdateClosure.call(GridCacheMapEntry.java:5701)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$UpdateClosure.call(GridCacheMapEntry.java:5643)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.invokeClosure(BPlusTree.java:3719)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.access$5900(BPlusTree.java:3613)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1895)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1872)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1779)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1638)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1621)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:1935)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:428)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:4248)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:4226)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdateLocal(GridCacheMapEntry.java:2106)
    at org.apache.ignite.internal.processors.cache.local.atomic.GridLocalAtomicCache.updateAllInternal(GridLocalAtomicCache.java:929)
    at org.apache.ignite.internal.processors.cache.local.atomic.GridLocalAtomicCache.access$100(GridLocalAtomicCache.java:86)
    at org.apache.ignite.internal.processors.cache.local.atomic.GridLocalAtomicCache$6.call(GridLocalAtomicCache.java:776)
    at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6817)
    at org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:967)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    Caused by: class
    org.apache.ignite.internal.mem.IgniteOutOfMemoryException: Failed
    to find a page for eviction [segmentCapacity=6032, loaded=2365,
    maxDirtyPages=1774, dirtyPages=2365, cpPages=0,
    pinnedInSegment=0, failedToPrepare=2366]
    Out of memory in data region [name=customformulacalcrts,
    initSize=190.7 MiB, maxSize=190.7 MiB, persistenceEnabled=true]
    Try the following:
    ^-- Increase maximum off-heap memory size
    (DataRegionConfiguration.maxSize)
    ^-- Enable Ignite persistence
    (DataRegionConfiguration.persistenceEnabled)
    ^-- Enable eviction or expiration policies
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl$Segment.tryToFindSequentially(PageMemoryImpl.java:2427)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl$Segment.removePageForReplacement(PageMemoryImpl.java:2321)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl$Segment.access$900(PageMemoryImpl.java:1930)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.allocatePage(PageMemoryImpl.java:552)
    ... 30 more
    Sep 22, 2020 7:58:23 PM java.util.logging.LogManager$RootLogger log
    SEVERE: JVM will be halted immediately due to the failure:
    [failureCtx=FailureContext [type=CRITICAL_ERROR, err=class
    o.a.i.i.mem.IgniteOutOfMemoryException: Out of memory in data
    region [name=customformulacalcrts, initSize=190.7 MiB,
    maxSize=190.7 MiB, persistenceEnabled=true] Try the following:
    ^-- Increase maximum off-heap memory size
    (DataRegionConfiguration.maxSize)
    ^-- Enable Ignite persistence
    (DataRegionConfiguration.persistenceEnabled)
    ^-- Enable eviction or expiration policies]]
    From: user@ignite.apache.org At: 09/23/20 12:34:36
    To: user@ignite.apache.org
    Subject: OutOfMemoryException with Persistence and Eviction Enabled

        We currently have a cache that is structured with a key of
        record type and a value that is a map from field id to field.
        So to update this cache, which has persistence enabled, we
        need to atomically load the value map for a key, add to that
        map, and write the map back to the cache. This can be done
        using invokeAll and a CacheEntryProcessor. However, when I
        test with a higher load (100k records with 50 fields each), I
        run into an OOM exception that I will post below. The cause
        of the exception is reported to be the failure to find a page
        to evict. However, even when I set the DataRegion's eviction
        threshold to .5 and the page eviction mode to RANDOM_2_LRU, I
        am still getting the same error. I have 2 main questions from
        this:
        1. Why is it failing to evict a page even with a lower
        threshold and eviction enabled? Is it failing to reach the
        threshold somehow? Are non-data pages like metadata and index
        pages taken into account when determining if the threshold
        has been reached?
        2. We don't have this issue when using IgniteDataStreamer to
        write large amounts of data to the cache; we just can't get
        transactional support at the same time. Why is this OOME an
        issue with regular cache puts but not with
        IgniteDataStreamer? I would think that any issues with
        checkpointing and eviction would also occur with
        IgniteDataStreamer.
