Hi dmagda,

I am trying to drop a table that has around 10 million records, and I am
seeing "*Out of memory in data region*" error messages in the Ignite logs,
after which the Ignite node [an Ignite pod on Kubernetes] restarts.
I have configured 3 GB for the default data region, 7 GB for the JVM heap,
and 15 GB total for the Ignite container, with native persistence enabled.
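
For reference, the relevant part of my node configuration looks roughly like
the Java sketch below (the sizes match what I described above and what the
log reports for Default_Region; the exact wiring in my deployment may differ
slightly):

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class NodeConfig {
    public static void main(String[] args) {
        // Default data region: 500 MiB initial, 3 GiB max, native persistence on,
        // matching the region reported in the log below.
        DataRegionConfiguration defaultRegion = new DataRegionConfiguration()
            .setName("Default_Region")
            .setInitialSize(500L * 1024 * 1024)
            .setMaxSize(3L * 1024 * 1024 * 1024)
            .setPersistenceEnabled(true);

        DataStorageConfiguration storageCfg = new DataStorageConfiguration()
            .setDefaultDataRegionConfiguration(defaultRegion);

        // The JVM heap (7 GB) is set separately via JVM options on the container.
        IgniteConfiguration cfg = new IgniteConfiguration()
            .setDataStorageConfiguration(storageCfg);

        Ignition.start(cfg);
    }
}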
Earlier I was under the impression that the restart was caused by
"*SYSTEM_WORKER_BLOCKED*" errors, but now I realize that
"*SYSTEM_WORKER_BLOCKED*" is on the ignored failure list and the actual
cause is "*CRITICAL_ERROR*" due to "*Out of memory in data region*".

These are the error messages in the logs:

""[2019-09-17T08:25:35,054][ERROR][sys-#773][] *JVM will be halted
immediately due to the failure: [failureCtx=FailureContext
[type=CRITICAL_ERROR, err=class o.a.i.i.mem.IgniteOutOfMemoryException:
Failed to find a page for eviction* [segmentCapacity=971652, loaded=381157,
maxDirtyPages=285868, dirtyPages=381157, cpPages=0, pinnedInSegment=3,
failedToPrepare=381155]
*Out of memory in data region* [name=Default_Region, initSize=500.0 MiB,
maxSize=3.0 GiB, persistenceEnabled=true] Try the following:
  ^-- Increase maximum off-heap memory size
(DataRegionConfiguration.maxSize)
  ^-- Enable Ignite persistence (DataRegionConfiguration.persistenceEnabled)
  ^-- Enable eviction or expiration policies]]

Could you please help me understand why the *drop table operation* is
causing "*Out of memory in data region*", and how I can avoid it?

We have a use case where an application inserts records into many tables in
Ignite simultaneously for some time period, while other applications query
that time period's data and update a dashboard. We need to delete the
records inserted in the previous time period before inserting new records.
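
To make the pattern concrete, the per-period cleanup is essentially the
following (a sketch with a made-up table name and timestamp column; the real
schema has many such tables):

import org.apache.ignite.Ignite;
import org.apache.ignite.cache.query.SqlFieldsQuery;

public class PeriodCleanup {
    // Removes the rows of the previous time period before the next insert
    // cycle starts. Table, column, and cache names are only illustrative.
    public static void deletePreviousPeriod(Ignite ignite, String cacheName,
                                            long periodStartMillis, long periodEndMillis) {
        SqlFieldsQuery delete = new SqlFieldsQuery(
            "DELETE FROM METRICS WHERE EVENT_TS >= ? AND EVENT_TS < ?")
            .setArgs(periodStartMillis, periodEndMillis);

        ignite.cache(cacheName).query(delete).getAll();
    }
}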

Even during a *delete from table* operation, I have seen:

"Critical system error detected. Will be handled accordingly to
configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false,
timeout=0, super=AbstractFailureHandler
[ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]],
failureCtx=FailureContext [*type=CRITICAL_ERROR*, err=class
o.a.i.IgniteException: *Checkpoint read lock acquisition has been
timed out*.]] class org.apache.ignite.IgniteException: Checkpoint read
lock acquisition has been timed out.



On Mon, Apr 29, 2019 at 12:17 PM Denis Magda <[email protected]> wrote:

> Hi Shiva,
>
> That was designed to prevent global cluster performance degradation or
> other outages. Have you tried to apply my recommendation of turning off the
> failure handler for these system threads?
>
> -
> Denis
>
>
> On Sun, Apr 28, 2019 at 10:28 AM shivakumar <[email protected]>
> wrote:
>
>> HI Denis,
>>
>> Is there any specific reason for the blocking of the critical thread,
>> like the CPU or heap being full?
>> We are hitting this issue again and again.
>> Is there any other way to drop tables/caches?
>> This looks like a critical issue.
>>
>> regards,
>> shiva
>>
>>
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>
>
