Thanks Andrey! I have added a few comments to the IEP-14 page.

D.

On Fri, Mar 16, 2018 at 6:44 AM, Andrey Gura <ag...@apache.org> wrote:

> Hi!
>
> Thank you all for your opinions and ideas!
>
> While reading the thread I made two important conclusions:
>
> 1. Proposed API should be changed because possible actions enumeration
> is bad idea. More clean and simple design should allow user provide
> failure handler implementation with custom logic of failure handling
> if needed.
>
> 2. Several failure handler implementations should be provided out-of
> box in order to provide simple way of changing default behaviour
> through configuration. The following implementations should be
> provided:
>
>      - NoOpFailureHandler - It's useful for tests and debugging.
>      - RestartProcessFailureHandler - Specific implementation that
> could be used only with ignite.(sh|bat).
>      - StopNodeFailureHandler - This implementation will stop Ignite
> node in case of critical error.
>      - StopNodeOrHaltFailureHandler(boolean tryStop, long timeout) -
> Default failure handler will try stop node if tryStop value is true.
> If node can't be stopped or tryStop value is false then JVM process
> will be terminated forcibly (Runtime.halt()). Default value for
> tryStop parameter is false. Of course we should limit time of node
> shutdown in order to prevent hangs.
>
> As for the default behavior, I agree with those who believe that most
> suitable default option is process termination (although I had a
> different opinion before) and most strong argument for this choice is
> impossibility of reasoning about system state in case of critical
> error.
> Also I believe that we can't choose solution that will be suitable for
> any community member and the best that we can do is provide simple way
> of changing this behavior.
>
> So, I think, default behavior discussion should be finished. I'll
> update IEP-14 [1] accordingly to my conclusions above. If you have any
> ideas or thoughts about this conclusions, please feel free to share.
>
> Thanks!
>
> [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-
> 14+Ignite+failures+handling
>
> On Fri, Mar 16, 2018 at 1:07 AM, Dmitriy Setrakyan
> <dsetrak...@apache.org> wrote:
> > On Thu, Mar 15, 2018 at 5:21 AM, Dmitry Pavlov <dpavlov....@gmail.com>
> > wrote:
> >
> >> Hi Dmitriy,
> >>
> >> It seems, here everyone agrees that killing the process will give a more
> >> guaranteed result. The question is that the majority in the community
> does
> >> not consider this to be acceptable in case Ignite as started as embedded
> >> lib (e.g. from Java, using Ignition.start())
> >>
> >> What can help to accept the community's opinion? Let's remember Apache
> >> principle: "community first".
> >>
> >
> > I am still confused about the problem the majority of the community is
> > trying to solve. If our priority is to keep the cluster in frozen state,
> > then what is the reason for this task altogether?
> >
> > The priority should be to keep the cluster operational, not frozen. The
> > only solution here is "kill" or "stop+kill". If the community does not
> > accept this option as a default, then I propose to drop this task
> > altogether, because we do not have to do anything to keep the cluster
> > frozen.
> >
> >
> >> If release 2.5 will show us it was inpractical, we will change default
> to
> >> kill even for library. What do you think?
> >>
> >
> > See above. I do not see a reason to continue with this task if the end
> > result is identical to what we have today.
> >
> > I want to give the community another chance to speak up and voice their
> > opinions again, having fully understood the context and the problem being
> > solved here.
> >
> > D.
>

Reply via email to