Guys, thread-blocking monitoring is not the problem here. Please look at the error message: "failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteCheckedException: Node is stopping: grid-2]]". A critical worker was terminated unexpectedly, so the problem is not related to any timeouts. It's a bug that should be investigated.
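
To make the point concrete, here is a minimal sketch in plain Java (no Ignite classes) that pulls the failure type and the ignored failure types out of that log line and checks whether the reported type is in the ignored set. The class name `FailureLogCheck`, its methods, and the regular expressions are made up for illustration and assume only the log layout quoted in this thread; they are not part of Ignite.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Ignite: inspects a failure-handler log line.
final class FailureLogCheck {
    // The log line quoted in this thread, joined back into one string.
    static final String SAMPLE =
        "Critical system error detected. Will be handled accordingly to configured "
        + "handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, "
        + "super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet "
        + "[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], "
        + "failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class "
        + "o.a.i.IgniteCheckedException: Node is stopping: grid-2]]";

    // Assumed layout: "failureCtx=FailureContext [type=NAME, ..."
    private static final Pattern TYPE =
        Pattern.compile("failureCtx=FailureContext \\[type=([A-Z_]+)");

    // Assumed layout: "ignoredFailureTypes=SomeSet [A, B]"
    private static final Pattern IGNORED =
        Pattern.compile("ignoredFailureTypes=\\w+ \\[([A-Z_, ]*)\\]");

    // Returns the reported failure type, or null if the line has none.
    static String failureType(String logLine) {
        Matcher m = TYPE.matcher(logLine);
        return m.find() ? m.group(1) : null;
    }

    // Returns the failure types the handler is configured to ignore.
    static Set<String> ignoredTypes(String logLine) {
        Set<String> res = new LinkedHashSet<>();
        Matcher m = IGNORED.matcher(logLine);
        if (m.find()) {
            for (String s : m.group(1).split(",")) {
                String name = s.trim();
                if (!name.isEmpty())
                    res.add(name);
            }
        }
        return res;
    }

    // True only if a failure type was found and it is in the ignored set.
    static boolean isIgnored(String logLine) {
        String type = failureType(logLine);
        return type != null && ignoredTypes(logLine).contains(type);
    }

    public static void main(String[] args) {
        System.out.println("type    = " + failureType(SAMPLE));
        System.out.println("ignored = " + ignoredTypes(SAMPLE));
        System.out.println("type is in ignored set: " + isIgnored(SAMPLE));
    }
}
```

For the quoted message the reported type is SYSTEM_WORKER_TERMINATION, which is not one of the two ignored, timeout-related types (SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT), so the handler was not triggered by a blocked-worker or timeout check.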

On Thu, Dec 27, 2018 at 9:27 PM Denis Magda <dma...@apache.org> wrote:
>
> Folks,
>
> What are the current timeouts? We need to know the probability of failures
> in a dev environment. This affects usability.
>
> --
> Denis
>
> On Thu, Dec 27, 2018 at 4:59 AM Alexey Goncharuk <alexey.goncha...@gmail.com>
> wrote:
>
> > Nikolay,
> >
> > Yes, the fix is already in master. Looks like I was wrong: in your case the
> > failure handler is triggered by 'Node is stopping: grid-2'. Can you please
> > share the full trace?
> >
> > On Thu, Dec 27, 2018 at 12:41, Nikolay Izhikov <nizhi...@apache.org> wrote:
> >
> > > Alexey
> > >
> > > Is the fix for this issue already in master?
> > > I run the tests on current master.
> > >
> > > > Should we somehow announce it on the user-list or highlight on readme.io?
> > >
> > > I don't think our users will be happy to be stuck with this behavior in
> > > production.
> > >
> > > Do I understand you correctly:
> > > If someone uses the 2.7 release and the Ignite process slows down for a few
> > > seconds for any reason (low-end hardware, VM pause, other processes grabbing
> > > the resources), then the Ignite node will be stopped?
> > >
> > > > This is the issue I mentioned in "Critical worker threads liveness checking
> > > > drawbacks" topic
> > >
> > > Thanks for the link, I will check it out.
> > >
> > > On Thu, Dec 27, 2018 at 12:24, Alexey Goncharuk <alexey.goncha...@gmail.com>
> > > wrote:
> > >
> > > > Hi Nikolay,
> > > >
> > > > This is the issue I mentioned in the "Critical worker threads liveness
> > > > checking drawbacks" topic, which I was expecting to be included in Ignite
> > > > 2.7, but it was not. To work around the issue, you should set
> > > > DataStorageConfiguration#setCheckpointReadLockTimeout to 0.
> > > >
> > > > Should we somehow announce it on the user-list or highlight on readme.io?
> > > >
> > > > On Thu, Dec 27, 2018 at 11:57, Nikolay Izhikov <nizhi...@apache.org> wrote:
> > > >
> > > > > Hello, Igniters.
> > > > >
> > > > > I ran into an issue with the critical system worker failure handler.
> > > > > I just run `IgniteDataFrameSuite` and it terminates on a random test.
> > > > > My laptop doesn't have bleeding-edge hardware, so tests can take a
> > > > > significant amount of time.
> > > > > Looks like our watchdog is too aggressive in a development environment.
> > > > >
> > > > > Can you please help me? What should I do to configure or turn off the
> > > > > watchdog?
> > > > > Should we relax it a little bit? At least for a test environment.
> > > > >
> > > > > The error output contains the following message:
> > > > >
> > > > > ```
> > > > > [2018-12-27 11:40:23,597][ERROR][exchange-worker-#5547%grid-2%][root]
> > > > > Critical system error detected. Will be handled accordingly to configured
> > > > > handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
> > > > > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
> > > > > [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
> > > > > failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class
> > > > > o.a.i.IgniteCheckedException: Node is stopping: grid-2]]
> > > > > class org.apache.ignite.IgniteCheckedException: Node is stopping: grid-2
> > > > > ```