I agree with Richard's point about improving after contribution, so that means +1 from me for the contribution of this project as it stands.
Regarding your points in relation to AEM, I think the overall concept can be split into the following three concerns that can be decoupled:

1. A part that checks the health of the system and reports true or false. Your implementation would be a really flexible component driven by configuration. In a simpler application the implementation would be just a DS component with some mandatory service references.
2. A part that waits for the system to be reported as healthy, or shuts down (with diagnostics) when not.
3. Root cause analysis, callable manually as a command, and by (2).

Neil

On Tue, Apr 17, 2018 at 3:29 PM, Christian Schneider <ch...@die-schneider.net> wrote:

> The problem we face in our environment (AEM) is that the system is highly
> configurable.
> So the checks cannot be defined statically for all AEM based systems. This
> is why we came up with the check services that can each solve a part of
> the problem and then be combined to show the aggregated state.
>
> For a single purpose application I agree with you that you can implement an
> application specific check that covers all aspects of the application
> readiness.
>
> The root cause analysis is something we could split off at some point and
> implement in its own bundle. I think it is not yet covering all aspects but
> I am pretty sure we can evolve it into a good tool that really helps
> developers. I already implemented it in its own package with no deps to the
> other packages so it can be easily split off.
>
> Christian
>
>
> 2018-04-17 14:53 GMT+02:00 Neil Bartlett <njbartl...@gmail.com>:
>
> > I like the general idea but, like Guillaume, I feel maybe this should be
> > implemented at a lower level. The core `SystemReadinessMonitorImpl` and
> > the rootcause command are implemented as DS components and configured via
> > Config Admin... but what if SCR and/or ConfigAdmin are unavailable or not
> > working?
> >
> > I'm also not sure about the way in which checks are defined and extended.
> > Only the application knows what "should" be started, but this can be
> > defined at the application level using a DS component that has
> > dependencies on the necessary services, config etc. That DS component
> > would provide a SystemReady service when it has decided the system is
> > ready.
> >
> > Thus I think your framework can be boiled down to something simpler:
> >
> > * An exported SystemReady service interface, which should appear within a
> >   configurable timeout;
> > * The root cause analysis tool, which is something I have always wanted
> >   to have and I hope your implementation works as well as described!
> >
> > Regards,
> > Neil
> >
> > On Tue, Apr 17, 2018 at 1:37 PM, Andrei Dulvac <dul...@apache.org> wrote:
> >
> > > Hi Guillaume.
> > >
> > > Thanks!
> > >
> > > There's the OOTB ServicesCheck check that can be configured with a list
> > > of services [0].
> > > We were thinking we could add the mandatory checks there through
> > > configuration.
> > >
> > > The fact that the system can initially be green, because no checks are
> > > present, is VERY valid.
> > > We try to mitigate that with the ServicesCheck and by making sure the
> > > monitor waits for the framework to be up before reporting anything
> > > other than YELLOW.
> > >
> > > Hope I got the question and suggestion right :D
> > >
> > > - Andrei
> > >
> > >
> > > ---
> > > [0]
> > > https://github.com/dulvac/system-readiness/blob/master/src/main/java/org/apache/sling/systemreadiness/core/impl/ServicesCheck.java#L59
> > >
> > > On Tue, Apr 17, 2018 at 2:16 PM Guillaume Nodet <gno...@apache.org> wrote:
> > >
> > > > I like it a lot, the API is simple and extensible enough. Really nice
> > > > work!
> > > > I'm just a bit nervous about having such a low-level component depend
> > > > on an external extender...
> > > >
> > > > I think it's missing one bit though: some kind of expectations. I.e.
> > > > it checks existing stuff, but it does not cover missing stuff. I
> > > > suppose it could be implemented specifically using custom checks, but
> > > > I think there is still a hole, which is the fact that those custom
> > > > checks are not available. So I wonder if there should be an
> > > > additional built-in check that would grab a configuration entry, turn
> > > > that into a list of mandatory checks and be green if all those checks
> > > > are available, yellow/red otherwise. This would ensure your container
> > > > does not switch between green/yellow while the container is
> > > > booting/provisioning.
> > > >
> > > > 2018-04-17 10:02 GMT+02:00 Christian Schneider <ch...@die-schneider.net>:
> > > >
> > > > > Dear Felix community,
> > > > >
> > > > > during the last weeks Andrei Dulvac and I worked on a small
> > > > > framework to check if an OSGi based system is fully up.
> > > > >
> > > > > Our work originated in testing sling modules and whole sling
> > > > > instances. We soon found though that the concept is more general
> > > > > than sling and can be applied to any OSGi based system.
> > > > >
> > > > > The system readiness framework has a SystemReadinessMonitor service
> > > > > that reports the aggregated state of the system. It delegates to
> > > > > SystemReadinessCheck services that each check for a certain aspect.
> > > > > We implemented a first check based on a list of expected top level
> > > > > services.
> > > > > The system can be customised by adding specific checks for your
> > > > > application. For example we plan to add sling specific checks
> > > > > inside the sling project.
> > > > >
> > > > > In addition to simply detecting if the system is ready we also
> > > > > created a DS based root cause analysis that can be very helpful to
> > > > > detect why a set of components does not come up as expected.
> > > > >
> > > > > We would like to donate this project to the Apache Felix project as
> > > > > it might get more attention there by people that are not related to
> > > > > sling. The project is Apache licensed from the start and we already
> > > > > have basic documentation as well as good test coverage.
> > > > >
> > > > > We currently host it in this github repository:
> > > > > https://github.com/dulvac/system-readiness
> > > > >
> > > > > The packages are still mentioning sling but of course we would
> > > > > change this to felix if this community is interested in the
> > > > > donation.
> > > > >
> > > > > Best regards
> > > > >
> > > > > Christian and Andrei
> > > > >
> > > > >
> > > > > --
> > > > > --
> > > > > Christian Schneider
> > > > > http://www.liquid-reality.de
> > > > >
> > > > > Computer Scientist
> > > > > http://www.adobe.com
> > > > >
> > > >
> > > >
> > > > --
> > > > ------------------------
> > > > Guillaume Nodet
> > > >
> > >
> >
>
> --
> --
> Christian Schneider
> http://www.liquid-reality.de
>
> Computer Scientist
> http://www.adobe.com
>
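
For readers following the thread, here is a rough plain-Java sketch of two ideas discussed above: Guillaume's "mandatory checks" expectation feeding an aggregated state, and Neil's concern (2), waiting for a healthy report within a timeout. It deliberately uses no OSGi, DS or Config Admin APIs. Every name in it (`ReadinessSketch`, `State`, `aggregate`, `awaitReady`) is hypothetical and not taken from the system-readiness repository, and reporting YELLOW (rather than RED) for a missing mandatory check is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch; not the API of the system-readiness project.
public final class ReadinessSketch {

    /** States mentioned in the thread, ordered from best to worst. */
    enum State { GREEN, YELLOW, RED }

    /**
     * Aggregates the registered checks (name -> state supplier).
     * Assumption: a mandatory check that is not registered yet yields YELLOW,
     * so the system cannot report GREEN merely because no checks are present.
     */
    static State aggregate(Set<String> mandatory, Map<String, Supplier<State>> checks) {
        if (!checks.keySet().containsAll(mandatory)) {
            return State.YELLOW;
        }
        State worst = State.GREEN;
        for (Supplier<State> check : checks.values()) {
            State s = check.get();
            if (s.ordinal() > worst.ordinal()) {
                worst = s; // keep the worst state seen so far
            }
        }
        return worst;
    }

    /**
     * Polls the monitor until it reports GREEN or the timeout elapses.
     * Returns true if GREEN was seen in time; a caller could run diagnostics
     * and shut down on false.
     */
    static boolean awaitReady(Supplier<State> monitor, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (monitor.get() != State.GREEN) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> mandatory = Set.of("services");
        Map<String, Supplier<State>> checks = new HashMap<>();

        // No checks registered yet: the mandatory one is missing.
        System.out.println("before registration: " + aggregate(mandatory, checks));

        checks.put("services", () -> State.GREEN);
        System.out.println("after registration: " + aggregate(mandatory, checks));

        boolean ready = awaitReady(() -> aggregate(mandatory, checks), 1000, 50);
        System.out.println("ready within timeout: " + ready);
    }
}
```

With an empty mandatory set and no registered checks, `aggregate` returns GREEN, which is exactly the "initially green" situation Andrei acknowledges; the mandatory list is what closes that gap in this sketch.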