Hi,
It looks like I found the issue:
https://issues.apache.org/jira/browse/IGNITE-8917
When you use *put* or *removeAll* on a persistent cache with more data
than the data region size, an IgniteOutOfMemoryException is thrown. The
data streamer does not appear to be affected by this ticket.
The workaround is pretty simple - use a bigger data region.
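For reference, a larger data region can be configured along these lines (a minimal sketch; the region name is taken from the log below, and the sizes are illustrative values, not a recommendation):

```java
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

// Sketch: enlarge the persistent data region so page replacement has
// enough headroom. Region name and sizes are illustrative.
DataRegionConfiguration regionCfg = new DataRegionConfiguration()
    .setName("customformulacalcrts")
    .setPersistenceEnabled(true)
    .setInitialSize(256L * 1024 * 1024)   // 256 MiB
    .setMaxSize(1024L * 1024 * 1024);     // 1 GiB

DataStorageConfiguration storageCfg = new DataStorageConfiguration()
    .setDataRegionConfigurations(regionCfg);

IgniteConfiguration igniteCfg = new IgniteConfiguration()
    .setDataStorageConfiguration(storageCfg);
```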
BR,
Andrei
9/24/2020 11:44 PM, Mitchell Rathbun (BLOOMBERG/ 731 LEX) wrote:
I tried doubling it from 200 MB to 400 MB. My initial test worked, but
as I increased the number of fields that I was writing per entry, the
same issue occurred again. So increasing the region size seems to just
raise the capacity of what can be written, not actually prevent the
exception from occurring. My main question is why it is possible for
Ignite to get an OOME when persistence and eviction are enabled. It
seems like, with a lot of writes, performance should degrade as the
in-memory cache evicts entries, but no exceptions should occur. The
error is always "Failed to find a page for eviction", which doesn't
really make sense when eviction is enabled. What are the internal
structures that Ignite holds in off-heap memory?
Also, why isn't this an issue when using IgniteDataStreamer if the
issue has to do with space for internal structures? Wouldn't Ignite
need the same internal structures for either case?
From: user@ignite.apache.org At: 09/24/20 11:42:21
To: user@ignite.apache.org <mailto:user@ignite.apache.org> Subject:
Re: OutOfMemoryException with Persistence and Eviction Enabled
Hi,
Did you try to increase the DataRegion size a little bit? It looks
like 190 MB isn't enough for the internal structures that Ignite
stores in off-heap memory in addition to the data. I suggest you
increase the data region size to, for example, 512 MB - 1024 MB and
see how it works.
If you still see the issue, then we should create a ticket:
1) Collect the logs
2) Provide a Java code example
3) Provide the configuration of the nodes
After that, we can take a deeper look and, if it is an issue, file a
JIRA ticket. BR, Andrei
9/23/2020 7:36 PM, Mitchell Rathbun (BLOOMBERG/ 731 LEX) wrote:
Here is the exception:
Sep 22, 2020 7:58:22 PM java.util.logging.LogManager$RootLogger log
SEVERE: Critical system error detected. Will be handled
accordingly to configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler
[ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED,
SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext
[type=CRITICAL_ERROR, err=class
o.a.i.i.mem.IgniteOutOfMemoryException: Out of memory in data
region [name=customformulacalcrts, initSize=190.7 MiB,
maxSize=190.7 MiB, persistenceEnabled=true] Try the following:
^-- Increase maximum off-heap memory size
(DataRegionConfiguration.maxSize)
^-- Enable Ignite persistence
(DataRegionConfiguration.persistenceEnabled)
^-- Enable eviction or expiration policies]]
class org.apache.ignite.internal.mem.IgniteOutOfMemoryException:
Out of memory in data region [name=customformulacalcrts,
initSize=190.7 MiB, maxSize=190.7 MiB, persistenceEnabled=true]
Try the following:
^-- Increase maximum off-heap memory size
(DataRegionConfiguration.maxSize)
^-- Enable Ignite persistence
(DataRegionConfiguration.persistenceEnabled)
^-- Enable eviction or expiration policies
at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.allocatePage(PageMemoryImpl.java:607)
at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.allocateDataPage(AbstractFreeList.java:464)
at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.insertDataRow(AbstractFreeList.java:491)
at org.apache.ignite.internal.processors.cache.persistence.freelist.CacheFreeListImpl.insertDataRow(CacheFreeListImpl.java:59)
at org.apache.ignite.internal.processors.cache.persistence.freelist.CacheFreeListImpl.insertDataRow(CacheFreeListImpl.java:35)
at org.apache.ignite.internal.processors.cache.persistence.RowStore.addRow(RowStore.java:103)
at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.createRow(IgniteCacheOffheapManagerImpl.java:1691)
at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.createRow(GridCacheOffheapManager.java:1910)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$UpdateClosure.call(GridCacheMapEntry.java:5701)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$UpdateClosure.call(GridCacheMapEntry.java:5643)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.invokeClosure(BPlusTree.java:3719)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.access$5900(BPlusTree.java:3613)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1895)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1872)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1779)
at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1638)
at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1621)
at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:1935)
at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:428)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:4248)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:4226)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdateLocal(GridCacheMapEntry.java:2106)
at org.apache.ignite.internal.processors.cache.local.atomic.GridLocalAtomicCache.updateAllInternal(GridLocalAtomicCache.java:929)
at org.apache.ignite.internal.processors.cache.local.atomic.GridLocalAtomicCache.access$100(GridLocalAtomicCache.java:86)
at org.apache.ignite.internal.processors.cache.local.atomic.GridLocalAtomicCache$6.call(GridLocalAtomicCache.java:776)
at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6817)
at org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:967)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: class org.apache.ignite.internal.mem.IgniteOutOfMemoryException: Failed to find a page for eviction [segmentCapacity=6032, loaded=2365, maxDirtyPages=1774, dirtyPages=2365, cpPages=0, pinnedInSegment=0, failedToPrepare=2366]
Out of memory in data region [name=customformulacalcrts,
initSize=190.7 MiB, maxSize=190.7 MiB, persistenceEnabled=true]
Try the following:
^-- Increase maximum off-heap memory size
(DataRegionConfiguration.maxSize)
^-- Enable Ignite persistence
(DataRegionConfiguration.persistenceEnabled)
^-- Enable eviction or expiration policies
at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl$Segment.tryToFindSequentially(PageMemoryImpl.java:2427)
at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl$Segment.removePageForReplacement(PageMemoryImpl.java:2321)
at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl$Segment.access$900(PageMemoryImpl.java:1930)
at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.allocatePage(PageMemoryImpl.java:552)
... 30 more
Sep 22, 2020 7:58:23 PM java.util.logging.LogManager$RootLogger log
SEVERE: JVM will be halted immediately due to the failure:
[failureCtx=FailureContext [type=CRITICAL_ERROR, err=class
o.a.i.i.mem.IgniteOutOfMemoryException: Out of memory in data
region [name=customformulacalcrts, initSize=190.7 MiB,
maxSize=190.7 MiB, persistenceEnabled=true] Try the following:
^-- Increase maximum off-heap memory size
(DataRegionConfiguration.maxSize)
^-- Enable Ignite persistence
(DataRegionConfiguration.persistenceEnabled)
^-- Enable eviction or expiration policies]]
From: user@ignite.apache.org At: 09/23/20 12:34:36
To: user@ignite.apache.org Subject: OutOfMemoryException with
Persistence and Eviction Enabled
We currently have a cache that is structured with a key of
record type and a value that is a map from field id to field.
So to update this cache, which has persistence enabled, we
need to atomically load the value map for a key, add to that
map, and write the map back to the cache. This can be done
using invokeAll and a CacheEntryProcessor. However, when I
test with a higher load (100k records with 50 fields each), I
run into an OOM exception that I will post below. The cause
of the exception is reported to be the failure to find a page
to evict. However, even when I set the DataRegion's eviction
threshold to 0.5 and the page eviction mode to RANDOM_2_LRU, I
am still getting the same error. I have two main questions from
this:
1. Why is it failing to evict a page even with a lower
threshold and eviction enabled? Is it failing to reach the
threshold somehow? Are non-data pages like metadata and index
pages taken into account when determining if the threshold
has been reached?
2. We don't have this issue when using IgniteDataStreamer to
write large amounts of data to the cache; we just can't get
transactional support at the same time. Why is this OOME an
issue with regular cache puts but not with
IgniteDataStreamer? I would think that any issues with
checkpointing and eviction would also occur with
IgniteDataStreamer.
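The read-modify-write update described above (load the field map for a key, merge in the new fields, write the map back) can be sketched in plain Java; in the actual cache this logic would run inside a CacheEntryProcessor passed to invokeAll. All names and types here are illustrative, not taken from the original code:

```java
import java.util.HashMap;
import java.util.Map;

public class FieldMergeSketch {

    // Hypothetical merge step that an Ignite CacheEntryProcessor would
    // perform inside invokeAll: take the entry's existing field map
    // (null if the key is absent), overlay the incoming fields, and
    // return the merged map to be written back via entry.setValue(...).
    static Map<Integer, String> mergeFields(Map<Integer, String> existing,
                                            Map<Integer, String> updates) {
        Map<Integer, String> merged =
            existing == null ? new HashMap<>() : new HashMap<>(existing);
        merged.putAll(updates);
        return merged;
    }

    public static void main(String[] args) {
        Map<Integer, String> current = new HashMap<>();
        current.put(1, "price");
        Map<Integer, String> updates = new HashMap<>();
        updates.put(2, "volume");
        // Merging one existing field with one new field yields two fields.
        System.out.println(mergeFields(current, updates).size()); // prints 2
    }
}
```

Running this per-key inside an entry processor keeps the load-merge-store step atomic on the server side, which is what distinguishes it from a bulk IgniteDataStreamer load.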