Re: IEP-14: Ignite failures handling (Discussion)

Dmitry Pavlov Tue, 13 Mar 2018 05:17:53 -0700

Dmitriy, the alternative is "kill if standalone, stop if embedded".

The user will still be able to set something like
-DNODE_CRASH_ACTION="kill"
if ignite.sh is not used and the user accepts that the whole process
will be killed if the node crashes.


The default would be 'node stop', rather than hanging indefinitely.
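
To make the rule concrete, here is a rough sketch of how the default and the
override could be resolved. Everything in it is hypothetical: the class and
enum names, the "standalone" flag (assumed to be passed in by the launcher),
and the halt status are only for illustration and are not existing Ignite API.
The property name is the one from the example above.

```java
import java.util.Locale;

public class CrashActionSketch {
    /** Hypothetical actions, named only for this illustration. */
    enum CrashAction { STOP, KILL }

    /** Property name taken from the -DNODE_CRASH_ACTION example; not a real Ignite property. */
    static final String PROP = "NODE_CRASH_ACTION";

    /**
     * Resolves the action: an explicit override wins, otherwise
     * "kill if standalone, stop if embedded".
     */
    static CrashAction resolve(String override, boolean standalone) {
        if (override != null && !override.trim().isEmpty()) {
            switch (override.trim().toLowerCase(Locale.ROOT)) {
                case "kill": return CrashAction.KILL;
                case "stop": return CrashAction.STOP;
                default:
                    throw new IllegalArgumentException(
                        "Unknown " + PROP + " value: " + override);
            }
        }
        return standalone ? CrashAction.KILL : CrashAction.STOP;
    }

    /** Applies the action; stopNode is a placeholder for whatever stops the local node. */
    static void onCriticalFailure(CrashAction action, Runnable stopNode) {
        if (action == CrashAction.KILL)
            Runtime.getRuntime().halt(1); // hard halt, no shutdown hooks; status 1 is arbitrary
        else
            stopNode.run();
    }

    public static void main(String[] args) {
        // Assumption: the launcher passes "standalone" when started from a script.
        boolean standalone = args.length > 0 && "standalone".equals(args[0]);
        CrashAction action = resolve(System.getProperty(PROP), standalone);
        System.out.println("standalone=" + standalone + ", action=" + action);
    }
}
```

With this shape an embedded node only ever kills the process when the user
has asked for it explicitly.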

Sincerely,
Dmitriy Pavlov

Tue, 13 Mar 2018 at 14:53, Dmitriy Setrakyan <[email protected]>:

> Guys, I do not understand the alternative. If Ignite is frozen and causes
> the whole grid to freeze, how can we justify not killing it? Would users
> rather have their applications freeze?
>
> I would consider real-life use cases here. Can someone present a real-life
> example where keeping a frozen grid node around is better than killing the JVM?
>
> D.
>
> On Tue, Mar 13, 2018 at 6:16 AM, Alexey Goncharuk <
> [email protected]> wrote:
>
> > I also like "kill if standalone, stop if embedded" by default. A user can
> > change it to kill for embedded mode, but it will be a controlled, safe
> > choice.
> >
> > 2018-03-13 11:26 GMT+03:00 Vladimir Ozerov <[email protected]>:
> >
> > > +1 for "kill if standalone, stop if embedded". We should never kill the
> > > process in embedded mode because it might be disastrous for the user's
> > > application.
> > >
> > > On Tue, Mar 13, 2018 at 10:41 AM, Dmitry Pavlov <[email protected]>
> > > wrote:
> > >
> > > > Denis, Dmitriy, I am not sure I agree here. Please see a close analogue:
> > > > the JVM itself and its parameter ExitOnOutOfMemoryError, which is not
> > > > enabled by default.
> > > >
> > > > If a server node is started from the sh script, kill is OK for me, as the
> > > > process is controlled only by Ignite. It is sufficient to add an option
> > > > to override the default for the sh script.
> > > >
> > > > Users interested in this behaviour may also set this option to "kill".
> > > >
> > > > If a server node is started from Java, it should never kill the whole
> > > > process. This mode is not prohibited by the docs: users are allowed to
> > > > start several nodes in one process and run their own application logic
> > > > in the same process.
> > > >
> > > > Why should we kill running user code? It could be an unpleasant surprise
> > > > for the user.
> > > >
> > > >
> > > >
> > > > Tue, 13 Mar 2018 at 8:26, Dmitriy Setrakyan <[email protected]>:
> > > >
> > > > > On Tue, Mar 13, 2018 at 1:18 AM, Andrey Kornev <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > I believe the only reasonable way to handle a critical system failure
> > > > > > (as it is defined in the IEP) is a JVM halt (not a graceful
> > > > > > exit/shutdown!). The sooner, the better: lesser impact. There’s simply
> > > > > > no way to reason about the state of the system in a situation like
> > > > > > that, all bets are off. Any other policy would only confuse matters
> > > > > > and in all likelihood make things worse.
> > > > > >
> > > > > > In practice, SREs/Operations would very much rather have a process die
> > > > > > a quick clean death than let it run indefinitely and hope that it’ll
> > > > > > somehow recover by itself at some point in the future, potentially
> > > > > > degrading the overall system stability and availability all the while.
> > > > > >
> > > > >
> > > > > Completely agree.
> > > > >
> > > >
> > >
> >
>
