Hi dmagda,

I am trying to drop a table that has around 10 million records, and I am seeing "*Out of memory in data region*" error messages in the Ignite logs; the Ignite node (an Ignite pod on Kubernetes) then restarts. I have configured 3 GB for the default data region, 7 GB for the JVM, and 15 GB in total for the Ignite container, with native persistence enabled. Earlier I was under the impression that the restart was caused by "*SYSTEM_WORKER_BLOCKED*" errors, but I have now realized that "*SYSTEM_WORKER_BLOCKED*" is on the ignored failure types list and the actual cause is a "*CRITICAL_ERROR*" due to "*Out of memory in data region*".
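
As a quick sanity check on the counters reported in that exception (the full message is quoted further down), here is a minimal sketch in plain Python. It is not Ignite code: the helper name `parse_counters` is hypothetical, and the interpretation in the comments (with native persistence, a dirty page cannot be replaced until a checkpoint has written it to disk) is an assumption about how page replacement behaves, not something the log itself states.

```python
import re

# Counters copied verbatim from the IgniteOutOfMemoryException message.
LOG_FRAGMENT = (
    "[segmentCapacity=971652, loaded=381157, maxDirtyPages=285868, "
    "dirtyPages=381157, cpPages=0, pinnedInSegment=3, failedToPrepare=381155]"
)


def parse_counters(fragment):
    """Return every name=integer pair found in the fragment as a dict."""
    return {name: int(value) for name, value in re.findall(r"(\w+)=(\d+)", fragment)}


counters = parse_counters(LOG_FRAGMENT)

# If every loaded page is dirty, then (under the assumption stated above)
# there is no clean page left that could be chosen for replacement.
all_loaded_dirty = counters["dirtyPages"] == counters["loaded"]

# How far the dirty-page count is above the reported maxDirtyPages value.
over_dirty_limit = counters["dirtyPages"] - counters["maxDirtyPages"]

print(f"all loaded pages dirty: {all_loaded_dirty}")
print(f"dirty pages above maxDirtyPages: {over_dirty_limit}")
```

If that reading is right, the drop is dirtying pages faster than checkpoints can flush them, which would also fit the checkpoint read lock timeout seen during the delete.
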
These are the error messages in the logs:

[2019-09-17T08:25:35,054][ERROR][sys-#773][] *JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.mem.IgniteOutOfMemoryException: Failed to find a page for eviction* [segmentCapacity=971652, loaded=381157, maxDirtyPages=285868, dirtyPages=381157, cpPages=0, pinnedInSegment=3, failedToPrepare=381155] *Out of memory in data region* [name=Default_Region, initSize=500.0 MiB, maxSize=3.0 GiB, persistenceEnabled=true] Try the following:
  ^-- Increase maximum off-heap memory size (DataRegionConfiguration.maxSize)
  ^-- Enable Ignite persistence (DataRegionConfiguration.persistenceEnabled)
  ^-- Enable eviction or expiration policies]]

Could you please help me understand why the *drop table* operation causes "*Out of memory in data region*", and how I can avoid it?

We have a use case where an application inserts records into many tables in Ignite simultaneously for some time period, and other applications query that period's data and update a dashboard. We need to delete the records inserted in the previous time period before inserting new records. Even during a *delete from table* operation, I have seen:

Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext [*type=CRITICAL_ERROR*, err=class o.a.i.IgniteException: *Checkpoint read lock acquisition has been timed out*.]]
class org.apache.ignite.IgniteException: Checkpoint read lock acquisition has been timed out.

On Mon, Apr 29, 2019 at 12:17 PM Denis Magda <[email protected]> wrote:

> Hi Shiva,
>
> That was designed to prevent global cluster performance degradation or
> other outages. Have you tried to apply my recommendation of turning off the
> failure handler for these system threads?
>
> -
> Denis
>
>
> On Sun, Apr 28, 2019 at 10:28 AM shivakumar <[email protected]>
> wrote:
>
>> Hi Denis,
>>
>> Is there any specific reason for the blocking of the critical thread, such as
>> the CPU or the heap being full?
>> We are hitting this issue again and again.
>> Is there any other way to drop tables/caches?
>> This looks like a critical issue.
>>
>> regards,
>> shiva
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
