Re: Proposal to donate the system readiness check framework to Apache Felix

Neil Bartlett Tue, 17 Apr 2018 08:04:25 -0700

I agree with Richard's point about improving after contribution, so that
means +1 from me for the contribution of this project as it stands.


Regarding your points in relation to AEM, I think the overall concept can
be split into the following three concerns that can be decoupled:

1. A part that checks health of the system and reports true or false. Your
implementation would be a really flexible component driven by
configuration. In a simpler application the implementation would be just a
DS component with some mandatory service references.
2. A part that waits for the system to be reported as healthy, or shuts
down (with diagnostics) when not.
3. Root cause analysis, callable manually as a command, and by (2).
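
As a rough illustration of how these three parts could stay decoupled, here
is a minimal plain-Java sketch. Every name in it (ReadinessSketch,
awaitReady, the parameters) is hypothetical and not taken from the
system-readiness code. It also leaves out OSGi entirely: in a real system
part (1) would more likely be a service, for instance the DS component
mentioned above, rather than a lambda.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names come from the system-readiness code.
final class ReadinessSketch {

    /**
     * Part (2): poll the health check (part 1) until it reports true or the
     * timeout elapses. On timeout, print the root cause report (part 3) and
     * run the shutdown action. Returns true only if the system became healthy.
     */
    static boolean awaitReady(BooleanSupplier healthy,
                              Supplier<String> rootCause,
                              Duration timeout,
                              Duration pollInterval,
                              Runnable shutdown) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!healthy.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                System.err.println("Not ready after " + timeout + ": " + rootCause.get());
                shutdown.run();
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Treat interruption as "not ready" and keep the interrupt flag set.
                Thread.currentThread().interrupt();
                shutdown.run();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // Stand-in for part (1): becomes healthy about 50 ms after start.
        BooleanSupplier healthy =
                () -> System.nanoTime() - start > Duration.ofMillis(50).toNanos();
        // Stand-in for part (3): a fixed message instead of a real analysis.
        Supplier<String> rootCause = () -> "no analysis available in this sketch";

        boolean ready = awaitReady(healthy, rootCause,
                Duration.ofSeconds(1), Duration.ofMillis(10),
                () -> System.err.println("shutting down"));
        System.out.println("ready = " + ready);
    }
}
```

The waiting part only sees a boolean supplier and a string supplier, so the
health check and the root cause analysis can each be replaced without
touching it, and the analysis can still be called on its own as a command.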

Neil


On Tue, Apr 17, 2018 at 3:29 PM, Christian Schneider <ch...@die-schneider.net> wrote:

> The problem we face in our environment (AEM) is that the system is highly
> configurable.
> So the checks cannot be defined statically for all AEM based systems. This
> is why we came up with the
> check services that can each solve a part of the problem and then be
> combined to show the aggregated state.
>
> For a single-purpose application I agree with you that you can implement an
> application-specific check that covers all aspects of the application
> readiness.
>
> The root cause analysis is something we could split off at some point and
> implement in its own bundle. I think it is not yet covering all aspects but
> I am pretty sure we can evolve it into a good tool that really helps
> developers. I already implemented it in its own package with no deps on the
> other packages so it can be easily split off.
>
> Christian
>
>
> 2018-04-17 14:53 GMT+02:00 Neil Bartlett <njbartl...@gmail.com>:
>
> > I like the general idea but, like Guillaume, I feel maybe this should be
> > implemented at a lower level. The core `SystemReadinessMonitorImpl` and the
> > rootcause command are implemented as DS components and configured via
> > Config Admin... but what if SCR and/or ConfigAdmin are unavailable or not
> > working?
> >
> > I'm also not sure about the way in which checks are defined and extended.
> > Only the application knows what "should" be started, but this can be
> > defined at the application level using a DS component that has dependencies
> > on the necessary services, config etc. That DS component would provide a
> > SystemReady service when it has decided the system is ready.
> >
> > Thus I think your framework can be boiled down to something simpler:
> >
> > * An exported SystemReady service interface, which should appear within a
> > configurable timeout;
> > * The root cause analysis tool, which is something I have always wanted to
> > have and I hope your implementation works as well as described!
> >
> > Regards,
> > Neil
> >
> > On Tue, Apr 17, 2018 at 1:37 PM, Andrei Dulvac <dul...@apache.org> wrote:
> >
> > > Hi Guillaume.
> > >
> > > Thanks!
> > >
> > > There's the OOTB ServicesCheck check that can be configured with a list of
> > > services [0].
> > > We were thinking we could add the mandatory checks there through
> > > configuration.
> > >
> > > The point that the system can initially be green because no checks are
> > > present is VERY valid.
> > > We try to mitigate that with the ServicesCheck and by making sure the
> > > monitor waits for the framework to be up before reporting anything other
> > > than YELLOW.
> > >
> > > Hope I got the question and suggestion right :D
> > >
> > > - Andrei
> > >
> > >
> > > ---
> > > [0] https://github.com/dulvac/system-readiness/blob/master/src/main/java/org/apache/sling/systemreadiness/core/impl/ServicesCheck.java#L59
> > >
> > > On Tue, Apr 17, 2018 at 2:16 PM Guillaume Nodet <gno...@apache.org> wrote:
> > >
> > > > I like it a lot, the API is simple and extensible enough.  Really nice
> > > > work!
> > > > I'm just a bit nervous about having such a low-level component depend on
> > > > an external extender...
> > > >
> > > > I think it's missing one bit though: some kind of expectations. I.e. it
> > > > checks existing stuff, but it does not cover missing stuff.  I suppose it
> > > > could be implemented specifically using custom checks, but I think there
> > > > is still a hole, which is the fact that those custom checks may not be
> > > > available.  So I wonder if there should be an additional built-in check
> > > > that would grab a configuration entry, turn that into a list of mandatory
> > > > checks and be green if all those checks are available, yellow/red
> > > > otherwise.  This would ensure your container does not switch between
> > > > green/yellow while the container is booting/provisioning.
> > > >
> > > > 2018-04-17 10:02 GMT+02:00 Christian Schneider <ch...@die-schneider.net>:
> > > >
> > > > > Dear Felix community,
> > > > >
> > > > > during the last weeks Andrei Dulvac and I worked on a small framework to
> > > > > check if an OSGi based system is fully up.
> > > > >
> > > > > Our work originated in testing sling modules and whole sling instances. We
> > > > > soon found though that the concept is more general than sling and can be
> > > > > applied to any OSGi based system.
> > > > >
> > > > > The system readiness framework has a SystemReadinessMonitor service that
> > > > > reports the aggregated state of the system. It delegates to
> > > > > SystemReadinessCheck services that each check for a certain aspect. We
> > > > > implemented a first check based on a list of expected top level services.
> > > > > The system can be customised by adding specific checks for your
> > > > > application. For example we plan to add sling specific checks inside the
> > > > > sling project.
> > > > >
> > > > > In addition to simply detecting if the system is ready we also created a
> > > > > DS-based root cause analysis that can be very helpful to detect why a set of
> > > > > components does not come up as expected.
> > > > >
> > > > > We would like to donate this project to the Apache Felix project as it
> > > > > might get more attention there from people who are not related to sling. The
> > > > > project has been Apache licensed from the start and we already have basic
> > > > > documentation as well as good test coverage.
> > > > >
> > > > > We currently host it in this github repository:
> > > > > https://github.com/dulvac/system-readiness
> > > > >
> > > > > The packages are still mentioning sling but of course we would change this
> > > > > to felix if this community is interested in the donation.
> > > > >
> > > > > Best regards
> > > > >
> > > > > Christian and Andrei
> > > > >
> > > > >
> > > > > --
> > > > > --
> > > > > Christian Schneider
> > > > > http://www.liquid-reality.de
> > > > >
> > > > > Computer Scientist
> > > > > http://www.adobe.com
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > ------------------------
> > > > Guillaume Nodet
> > > >
> > >
> >
>
>
>
> --
> --
> Christian Schneider
> http://www.liquid-reality.de
>
> Computer Scientist
> http://www.adobe.com
>
