Re: nodes are restarting when i try to drop a table created with persistence enabled

Denis Magda Fri, 27 Sep 2019 14:50:17 -0700

Ivan, Igor, Andrey, as SQL experts,

Does this sound like a known limitation or issue? If not, what do we need
to reproduce the scenario: heap dumps?


-
Denis


On Thu, Sep 26, 2019 at 2:12 AM Shiva Kumar <[email protected]>
wrote:

> Hi dmagda,
>
> When I insert many records (~10 or 20 million) into the same table and then
> try to drop the table or delete records from it, the nodes restart. The
> restarts happen in the middle of the drop or delete operation.
> According to the logs, the cause of the restarts looks like an OOM in the
> data region.
>
> regards,
> shiva
>
> On Wed, Sep 25, 2019 at 1:12 PM Denis Mekhanikov <[email protected]>
> wrote:
>
>> I think the issue is that Ignite can't recover from
>> IgniteOutOfMemory, even by removing data.
>> Shiva, did IgniteOutOfMemory occur for the first time when you did the
>> DROP TABLE, or before that?
>>
>> Denis
>>
>> On Wed, Sep 25, 2019 at 02:30, Denis Magda <[email protected]> wrote:
>> >
>> > Shiva,
>> >
>> > Does this issue still exist? Ignite Dev, how do we debug this sort of
>> thing?
>> >
>> > -
>> > Denis
>> >
>> >
>> > On Tue, Sep 17, 2019 at 7:22 AM Shiva Kumar <[email protected]>
>> wrote:
>> >>
>> >> Hi dmagda,
>> >>
>> >> I am trying to drop a table that has around 10 million records, and
>> I am seeing "Out of memory in data region" error messages in the Ignite logs,
>> and the Ignite node [an Ignite pod on Kubernetes] is restarting.
>> >> I have configured 3GB for the default data region, 7GB for the JVM, and a
>> total of 15GB for the Ignite container, and I have enabled native persistence.
>> >> Earlier I was under the impression that the restart was caused by
>> "SYSTEM_WORKER_BLOCKED" errors, but I have now realized that
>> "SYSTEM_WORKER_BLOCKED" is on the ignored failure list and the actual
>> cause is a "CRITICAL_ERROR" due to "Out of memory in data region".
>> >>
>> >> These are the error messages in the logs:
>> >>
>> >> ""[2019-09-17T08:25:35,054][ERROR][sys-#773][] JVM will be halted
>> immediately due to the failure: [failureCtx=FailureContext
>> [type=CRITICAL_ERROR, err=class o.a.i.i.mem.IgniteOutOfMemoryException:
>> Failed to find a page for eviction [segmentCapacity=971652, loaded=381157,
>> maxDirtyPages=285868, dirtyPages=381157, cpPages=0, pinnedInSegment=3,
>> failedToPrepare=381155]
>> >> Out of memory in data region [name=Default_Region, initSize=500.0 MiB,
>> maxSize=3.0 GiB, persistenceEnabled=true] Try the following:
>> >>   ^-- Increase maximum off-heap memory size
>> (DataRegionConfiguration.maxSize)
>> >>   ^-- Enable Ignite persistence
>> (DataRegionConfiguration.persistenceEnabled)
>> >>   ^-- Enable eviction or expiration policies]]
>> >>
>> >> Could you please help me understand why the drop table operation causes
>> "Out of memory in data region" and how I can avoid it?
>> >>
>> >> We have a use case where an application inserts records into many tables in
>> Ignite simultaneously for some time period, and other applications run
>> queries on that period's data and update the dashboard. We need to delete
>> the records inserted in the previous time period before inserting new
>> records.
>> >>
>> >> Even during a delete-from-table operation, I have seen:
>> >>
>> >> "Critical system error detected. Will be handled accordingly to
>> configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false,
>> timeout=0, super=AbstractFailureHandler
>> [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext
>> [type=CRITICAL_ERROR, err=class o.a.i.IgniteException: Checkpoint read lock
>> acquisition has been timed out.]] class org.apache.ignite.IgniteException:
>> Checkpoint read lock acquisition has been timed out.|
>> >>
>> >>
>> >>
>> >> On Mon, Apr 29, 2019 at 12:17 PM Denis Magda <[email protected]>
>> wrote:
>> >>>
>> >>> Hi Shiva,
>> >>>
>> >>> That was designed to prevent global cluster performance degradation
>> or other outages. Have you tried applying my recommendation of turning off
>> the failure handler for these system threads?
>> >>>
>> >>> -
>> >>> Denis
>> >>>
>> >>>
>> >>> On Sun, Apr 28, 2019 at 10:28 AM shivakumar <[email protected]>
>> wrote:
>> >>>>
>> >>>> Hi Denis,
>> >>>>
>> >>>> Is there a specific reason the critical thread is blocked,
>> such as the CPU
>> >>>> or the heap being full?
>> >>>> We are hitting this issue again and again.
>> >>>> Is there any other way to drop tables/caches?
>> >>>> This looks like a critical issue.
>> >>>>
>> >>>> regards,
>> >>>> shiva
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>
>
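
The counters in the IgniteOutOfMemoryException quoted above can be checked against each other with a few lines of Java. The following is only a sketch: the class and method names (EvictionLogCheck, parseFields, allLoadedPagesDirty, dirtyPagesOverLimit) are hypothetical and not part of Ignite, and the only input is the counter list copied from the log in this thread. The reading that no clean page was left to evict is an interpretation of those numbers, not something confirmed by anyone in the thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an Ignite API: compares the counters from the quoted log line.
class EvictionLogCheck {
    // Counter list copied verbatim from the exception message quoted in this thread.
    static final String SAMPLE =
        "segmentCapacity=971652, loaded=381157, maxDirtyPages=285868, "
        + "dirtyPages=381157, cpPages=0, pinnedInSegment=3, failedToPrepare=381155";

    // Extracts every name=<integer> pair from a log fragment, in order of appearance.
    static Map<String, Long> parseFields(String text) {
        Map<String, Long> fields = new LinkedHashMap<>();
        Matcher m = Pattern.compile("(\\w+)=(\\d+)").matcher(text);
        while (m.find())
            fields.put(m.group(1), Long.parseLong(m.group(2)));
        return fields;
    }

    // True when the dirty page count equals the loaded page count.
    static boolean allLoadedPagesDirty(Map<String, Long> f) {
        return f.get("dirtyPages").longValue() == f.get("loaded").longValue();
    }

    // Number of dirty pages above the reported maxDirtyPages value.
    static long dirtyPagesOverLimit(Map<String, Long> f) {
        return f.get("dirtyPages") - f.get("maxDirtyPages");
    }

    public static void main(String[] args) {
        Map<String, Long> f = parseFields(SAMPLE);
        System.out.println("all loaded pages dirty: " + allLoadedPagesDirty(f));
        System.out.println("dirty pages over limit: " + dirtyPagesOverLimit(f));
        System.out.println("loaded minus failedToPrepare: "
            + (f.get("loaded") - f.get("failedToPrepare")));
    }
}
```

For the quoted log line, dirtyPages equals loaded (381157), which is 95289 pages above maxDirtyPages, and failedToPrepare is only 2 short of loaded, while cpPages is 0.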
