Nice. I think there is value in just logging this information and not
halting or stopping.
That way the feature can be used to determine usage patterns, spikes,
etc., and it would be possible to determine what the critical levels are.
This would allow a separation between getting information and doing
something about it.
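
To make the idea concrete, here is a rough sketch of what a log-only mode
next to shutdown and halt might look like. This is not the code from the PR:
the class name, the Policy enum and the shutdownAction callback are all
names I made up for illustration, and the tracking is deliberately
simplified.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not the CriticalAnalyzer from the PR.
final class CriticalAnalyzerSketch {

    // Assumed policy names; LOG is the "information only" mode.
    enum Policy { LOG, SHUTDOWN, HALT }

    private final long timeoutNanos;
    private final Policy policy;
    private final Runnable shutdownAction; // e.g. an orderly broker stop
    // pathID -> System.nanoTime() at entry. Simplification: one entry per
    // path, so a second thread entering the same path overwrites the first.
    private final Map<Integer, Long> entered = new ConcurrentHashMap<>();

    CriticalAnalyzerSketch(long timeoutMillis, Policy policy, Runnable shutdownAction) {
        this.timeoutNanos = timeoutMillis * 1_000_000L;
        this.policy = policy;
        this.shutdownAction = shutdownAction;
    }

    void enterCritical(int pathID) {
        entered.put(pathID, System.nanoTime());
    }

    void leaveCritical(int pathID) {
        entered.remove(pathID);
    }

    // Paths that have been held longer than the timeout as of nowNanos.
    List<Integer> expiredPaths(long nowNanos) {
        List<Integer> expired = new ArrayList<>();
        for (Map.Entry<Integer, Long> e : entered.entrySet()) {
            if (nowNanos - e.getValue() > timeoutNanos) {
                expired.add(e.getKey());
            }
        }
        return expired;
    }

    // One check pass; a scheduler would call this every check period.
    List<Integer> check() {
        List<Integer> expired = expiredPaths(System.nanoTime());
        for (Integer pathID : expired) {
            System.err.println("Critical path " + pathID + " exceeded the timeout");
        }
        if (!expired.isEmpty()) {
            switch (policy) {
                case SHUTDOWN:
                    shutdownAction.run();
                    break;
                case HALT:
                    Runtime.getRuntime().halt(1);
                    break;
                default:
                    break; // LOG: report only, take no action
            }
        }
        return expired;
    }
}
```

With Policy.LOG the check only reports the slow paths, so the same
enter/leave instrumentation can be used to gather data first and the
shutdown or halt behaviour can be switched on later.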

On Sun, 6 Aug 2017 at 05:58 Michael André Pearce <
michael.andre.pea...@me.com> wrote:

> Thanks Clebert, I have left my feedback directly on the PR.
>
> Cheers
> Mike
>
> Sent from my iPhone
>
> > On 5 Aug 2017, at 06:03, Clebert Suconic <clebert.suco...@gmail.com>
> wrote:
> >
> > PR sent. I would appreciate reviews.
> >
> > thanks
> >
> > On Fri, Aug 4, 2017 at 1:02 PM, Clebert Suconic
> > <clebert.suco...@gmail.com> wrote:
> >> I'm adding some logic to detect cases where the broker may become
> >> unresponsive.
> >>
> >> I'm adding a component called CriticalAnalyzer, which will inspect
> >> response times of certain operations and decide to take the broker
> >> down when bad things are happening.
> >>
> >>
> >> Around several critical operations on the broker, I'm adding this
> >> pattern:
> >>
> >>
> >> enterCritical(pathID);
> >> try {
> >>   synchronized (lock) {
> >>     // critical operation goes here
> >>   }
> >> } finally {
> >>   leaveCritical(pathID);
> >> }
> >>
> >> The CriticalAnalyzer will look at the times between enter and leave,
> >> and with a configured timeout, it will take the broker down.
> >>
> >>
> >>
> >> Now, when it comes to the configuration, I'm not finding a good
> >> nomenclature for this, and I'm asking for help:
> >>
> >> So far, I came up with these names:
> >>
> >> - analyze-critical : default true
> >>  is the critical analyzer on?
> >>
> >> - analyze-critical-timeout: default 120000 (milliseconds, 2 minutes)
> >>  The timeout used to
> >>
> >> - analyze-critical-check-period default 1/2 of analyze-critical-timeout
> >>
> >> - analyze-critical-halt-on-failure: default false
> >>  In case of an issue, a Runtime.halt() would be issued if true,
> >>  otherwise a shutdown.
> >>
> >> During deadlocks or IO issues, the most effective way would be
> >> actually the halt. We could even change the start scripts to restart
> >> the server in case of a returned value.
> >>
> >>
> >>
> >>
> >> Any input?
> >>
> >>
> >> I will send a Pull Request soon.
> >>
> >>
> >> --
> >> Clebert Suconic
> >
> >
> >
> > --
> > Clebert Suconic
>
