Re: Node requires maintenance, non-empty set of maintainance tasks is found - node is not coming up

Gianluca Bonetti Wed, 29 May 2024 04:43:01 -0700

Hello Naveen

Apache Ignite 2.13 is more than 2 years old, 25 months to be exact.
Three bugfix releases have been rolled out since then, up to 2.16.

It seems you are restarting your cluster on a regular basis, so you should
upgrade to 2.16 as soon as possible.
Otherwise it will be very difficult for people on a community-based
mailing list, working on volunteer time, to work out a solution for a
2-year-old version.

Besides that, you are not providing much information about your cluster
setup: how many nodes, what infrastructure, how many caches, overall data
size.
One can only guess that you have more than 1 node running, with at least 1
cache and a non-empty dataset. :)

This document from GridGain may be helpful. I don't see an equivalent for
Ignite, but it may still be worth checking out.
https://www.gridgain.com/docs/latest/perf-troubleshooting-guide/maintenance-mode

On the other hand, you should also check your failing node.
If it is always the same node that fails, there is probably a root cause
outside Ignite.
Indeed, if the configuration is the same across all nodes and just this
one fails, you should also consider network issues (check connectivity and
network latency between nodes) and hardware-related issues (faulty disks,
faulty memory).
In the end, one option might be to replace the faulty machine with a brand
new one.
In cloud environments this is actually quite cheap and easy to do.
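
For the connectivity and latency check mentioned above, here is a rough
sketch of how one might probe the nodes from the failing machine. It only
measures TCP connect time, so treat it as a first sanity check rather than
a proper diagnosis. It assumes the discovery port is Ignite's default,
47500; the function names are made up for this example, and you will need
to supply your own host names.

```python
import socket
import time

# Assumption: 47500 is the default Ignite discovery port. Change it if
# your discovery SPI is configured with a different port.
DEFAULT_DISCOVERY_PORT = 47500


def tcp_connect_latency(host, port=DEFAULT_DISCOVERY_PORT, timeout=3.0):
    """Return the TCP connect time in milliseconds, or None if unreachable."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return None


def check_nodes(hosts, port=DEFAULT_DISCOVERY_PORT, attempts=5):
    """Probe each host several times; count failures and keep the worst latency."""
    report = {}
    for host in hosts:
        samples = [tcp_connect_latency(host, port) for _ in range(attempts)]
        ok = [s for s in samples if s is not None]
        report[host] = {
            "failed": attempts - len(ok),
            "max_ms": max(ok) if ok else None,
        }
    return report


# Example call (hypothetical host names, replace with your node addresses):
# print(check_nodes(["ignite-node-1", "ignite-node-2"]))
```

Running it from each node against all the others should show whether the
failing node stands out with refused connections or unusually high connect
times. If it does not, the disks and memory are the next thing to look at.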

Cheers
Gianluca

On Wed, 29 May 2024 at 08:43, Naveen Kumar <[email protected]> wrote:

> Hello All
>
> We are using Ignite 2.13.0
>
> After a cluster restart, one of the node is not coming up and in node logs
> are seeing this error - Node requires maintenance, non-empty set of
> maintainance  tasks is found - node is not coming up
>
> we are getting errors like time out is reached before computation is
> completed error in other nodes as well.
>
> I could see that, we have control.sh script to backup and clean up the
> corrupted files, but when I run the command, it fails.
>
> I have removed the node from baseline and tried to run as well, still its
> failing
>
> what could be the solution for this, cluster is functioning, however there
> are requests failing
>
> Is there anyway we can start ignite node in  maintenance mode and try
> running clean corrupted commands
>
> Thanks
> Naveen
>
>
>
