Hello!

This looks like PDS corruption to me. Could you share the persistence
files from the problematic node? I assume it fails every time on
restart?

Regards,
-- 
Ilya Kasnacheev


On Thu, May 20, 2021 at 12:52, Lo, Marcus <marcus...@citi.com> wrote:

> Hi,
>
>
>
> We have a 4-node Ignite cluster. After running the cluster for 1
> day, we encountered the following error at almost the same time on nodes
> #2, #3, and #4:
>
>
>
> Critical system error detected. Will be handled accordingly to configured
> handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [
> SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
> failureCtx=FailureContext [type=CRITICAL_ERROR, err=class
> o.a.i.IgniteCheckedException: Maximum number of retries 1000 reached for
> Put operation (the tree may be corrupted). Increase
> IGNITE_BPLUS_TREE_LOCK_RETRIES system property if you regularly see this
> message (current value is 1000).]]
> org.apache.ignite.IgniteCheckedException: Maximum number of retries 1000
> reached for Put operation (the tree may be corrupted). Increase
> IGNITE_BPLUS_TREE_LOCK_RETRIES system property if you regularly see this
> message (current value is 1000).
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Get.checkLockRetry(BPlusTree.java:3109) [ignite-core-2.10.0.jar:2.10.0]
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Put.checkLockRetry(BPlusTree.java:3906) [ignite-core-2.10.0.jar:2.10.0]
>
>
>
> I tried increasing IGNITE_BPLUS_TREE_LOCK_RETRIES to 100,000 and restarted
> the nodes, but it didn’t help: the node hit the same error
> straight away.
>
>
>
> Can you please shed some light on how to resolve the issue? Thanks.
>
>
>
> I have also attached the logs for your reference:
>
> ignite-node-[1,2,3,4].log: the full log files for all nodes
>
> ignite-restart.log: the log for node 2 when it crashed
>
>
>
> Regards,
>
> Marcus
>
>
>
