Re: nodes are restarting when i try to drop a table created with persistence enabled

Denis Mekhanikov Wed, 25 Sep 2019 00:43:16 -0700

I think the issue is that Ignite can't recover from
IgniteOutOfMemory, even by removing data.
Shiva, did IgniteOutOfMemory occur for the first time when you did the
DROP TABLE, or before that?


Denis
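
As a rough aid for the question above, the counters in the exception quoted further down the thread can be read mechanically. The sketch below uses only the Java standard library. `EvictionCounters` and its methods are hypothetical helpers, not part of Ignite, and taking the field names at face value (`loaded` as pages currently held in the segment, `dirtyPages` as pages not yet written by a checkpoint) is an assumption, not a statement about Ignite internals.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not an Ignite class): extracts name=integer counters from a log message. */
final class EvictionCounters {
    /** Counter list copied from the exception message quoted in this thread. */
    static final String SAMPLE =
        "Failed to find a page for eviction [segmentCapacity=971652, loaded=381157, "
        + "maxDirtyPages=285868, dirtyPages=381157, cpPages=0, pinnedInSegment=3, "
        + "failedToPrepare=381155]";

    private static final Pattern COUNTER = Pattern.compile("(\\w+)=(\\d+)");

    /** Returns every name=integer pair found in the message, in order of appearance. */
    static Map<String, Long> parse(String msg) {
        Map<String, Long> counters = new LinkedHashMap<>();
        Matcher m = COUNTER.matcher(msg);
        while (m.find())
            counters.put(m.group(1), Long.parseLong(m.group(2)));
        return counters;
    }

    /** Assumed meaning: loaded pages that are not dirty, i.e. candidates for eviction. */
    static long cleanPages(Map<String, Long> c) {
        return c.get("loaded") - c.get("dirtyPages");
    }

    /** How far the dirty page count is above the reported maxDirtyPages value. */
    static long dirtyOverLimit(Map<String, Long> c) {
        return c.get("dirtyPages") - c.get("maxDirtyPages");
    }

    public static void main(String[] args) {
        Map<String, Long> c = parse(SAMPLE);
        System.out.println("clean pages      = " + cleanPages(c));
        System.out.println("dirty over limit = " + dirtyOverLimit(c));
    }
}
```

Under that reading, the sample message shows no clean pages at all (`loaded` equals `dirtyPages`) and a dirty page count 95289 above `maxDirtyPages`, which would be consistent with eviction finding nothing it could evict.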

On Wed, Sep 25, 2019 at 02:30, Denis Magda <[email protected]> wrote:
>
> Shiva,
>
> Does this issue still exist? Ignite Dev, how do we debug this sort of thing?
>
> -
> Denis
>
>
> On Tue, Sep 17, 2019 at 7:22 AM Shiva Kumar <[email protected]> wrote:
>>
>> Hi dmagda,
>>
>> I am trying to drop a table that has around 10 million records. I am 
>> seeing "Out of memory in data region" error messages in the Ignite logs, and 
>> the Ignite node [Ignite pod on Kubernetes] is restarting.
>> I have configured 3GB for the default data region, 7GB for the JVM, and 15GB 
>> in total for the Ignite container, and I have enabled native persistence.
>> Earlier I was under the impression that the restart was caused by 
>> "SYSTEM_WORKER_BLOCKED" errors, but I now realize that 
>> "SYSTEM_WORKER_BLOCKED" has been added to the ignored failure list and the 
>> actual cause is "CRITICAL_ERROR" due to "Out of memory in data region".
>>
>> These are the error messages in the logs:
>>
>> [2019-09-17T08:25:35,054][ERROR][sys-#773][] JVM will be halted 
>> immediately due to the failure: [failureCtx=FailureContext 
>> [type=CRITICAL_ERROR, err=class o.a.i.i.mem.IgniteOutOfMemoryException: 
>> Failed to find a page for eviction [segmentCapacity=971652, loaded=381157, 
>> maxDirtyPages=285868, dirtyPages=381157, cpPages=0, pinnedInSegment=3, 
>> failedToPrepare=381155]
>> Out of memory in data region [name=Default_Region, initSize=500.0 MiB, 
>> maxSize=3.0 GiB, persistenceEnabled=true] Try the following:
>>   ^-- Increase maximum off-heap memory size (DataRegionConfiguration.maxSize)
>>   ^-- Enable Ignite persistence (DataRegionConfiguration.persistenceEnabled)
>>   ^-- Enable eviction or expiration policies]]
>>
>> Could you please help me understand why the DROP TABLE operation causes 
>> "Out of memory in data region", and how I can avoid it?
>>
>> We have a use case where an application inserts records into many tables in 
>> Ignite simultaneously for some time period, and other applications query 
>> that period's data and update the dashboard. We need to delete 
>> the records inserted in the previous time period before inserting new 
>> records.
>>
>> Even during a DELETE FROM table operation, I have seen:
>>
>> "Critical system error detected. Will be handled accordingly to configured 
>> handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
>> super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], 
>> failureCtx=FailureContext [type=CRITICAL_ERROR, err=class 
>> o.a.i.IgniteException: Checkpoint read lock acquisition has been timed 
>> out.]] class org.apache.ignite.IgniteException: Checkpoint read lock 
>> acquisition has been timed out."
>>
>>
>>
>> On Mon, Apr 29, 2019 at 12:17 PM Denis Magda <[email protected]> wrote:
>>>
>>> Hi Shiva,
>>>
>>> That was designed to prevent global cluster performance degradation or 
>>> other outages. Have you tried to apply my recommendation of turning off the 
>>> failure handler for these system threads?
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Sun, Apr 28, 2019 at 10:28 AM shivakumar <[email protected]> 
>>> wrote:
>>>>
>>>> Hi Denis,
>>>>
>>>> Is there any specific reason for the blocking of the critical thread, such
>>>> as the CPU or the heap being full?
>>>> We keep hitting this issue.
>>>> Is there any other way to drop tables/caches?
>>>> This looks like a critical issue.
>>>>
>>>> regards,
>>>> shiva
>>>>
>>>>
>>>>
>>>> --
>>>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
