[jira] [Commented] (FELIX-5410) Web console plugin for troubleshooting wiring issues

2016-12-01 Thread Alexander Klimetschek (JIRA)

[ 
https://issues.apache.org/jira/browse/FELIX-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15711567#comment-15711567
 ] 

Alexander Klimetschek commented on FELIX-5410:
--

I turned this into a standalone plugin and am working on it at 
https://github.com/alexkli/osgi-troubleshoot

It is under the Apache License. I would still like to contribute it to Apache 
Felix when it's in a good state.

> Web console plugin for troubleshooting wiring issues
> 
>
> Key: FELIX-5410
> URL: https://issues.apache.org/jira/browse/FELIX-5410
> Project: Felix
>  Issue Type: New Feature
>  Components: Web Console
>Reporter: Alexander Klimetschek
> Attachments: FELIX-5410-with-services.patch, FELIX-5410.patch, 
> webconsole-troubleshoot-services.png, webconsole-troubleshoot.png
>
>
> h4. Feature
> Add a new view/plugin to the standard webconsole that helps to pinpoint 
> which bundles, services or components are the true source for inactive 
> bundles or services.
> * For *bundles* the underlying assumption would be a healthy system with all 
> bundles active, and thus any inactive can be shown and analyzed as being 
> problematic.
> * For *services/components* one can look at inactive _immediate_ services 
> that fail because of unsatisfied references. For others, the user might need 
> to enter the "problematic" service or component they expect to be running to 
> start the analysis.
> h4. Motivation
> In a larger OSGi application with many bundles and components, it can be 
> difficult to find out the root cause why certain bundles do not start or why 
> a service is not active, especially for folks new to OSGi or with limited 
> knowledge about the application. I have seen many people fail, and thus "not 
> like" OSGi because of such hurdles during development, where it is easy to 
> update one bundle but miss out on crucial dependencies.
> Figuring this out is possible through the current web console, but only for 
> experts, if you click through the bundle or service details. This is usually 
> tedious work, if for example a lower level bundle is the problem, and 200 
> others are not active because of it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FELIX-5410) Web console plugin for troubleshooting wiring issues

2016-11-30 Thread Alexander Klimetschek (JIRA)

[ 
https://issues.apache.org/jira/browse/FELIX-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709807#comment-15709807
 ] 

Alexander Klimetschek commented on FELIX-5410:
--

To track the *origin of dynamically registered services*, a 
[ServiceListener|https://osgi.org/javadoc/r6/core/org/osgi/framework/ServiceListener.html]
 could be used. It would track the (last) dynamic unregistration of services 
and inspect the stack which looks something like this:

{noformat}
listener: at org.apache.felix.scr.impl.BundleComponentActivator$ListenerInfo.serviceChanged(BundleComponentActivator.java:120)
          at org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:991)
          at org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:839)
          at org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:546)
          at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4557)
          at org.apache.felix.framework.Felix.access$000(Felix.java:106)
          at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:420)
          at org.apache.felix.framework.ServiceRegistry.unregisterService(ServiceRegistry.java:170)
          at org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:144)
origin:   at org.apache.jackrabbit.oak.plugins.metric.StatisticsProviderFactory.deactivate(StatisticsProviderFactory.java:103)
{noformat}

This would be stored in a map of service -> origin (class name).

In contrast, when SCR itself unregisters a service, the stack trace shows 
{{org.apache.felix.scr}} as the origin:

{noformat}
listener: at org.apache.felix.scr.impl.BundleComponentActivator$ListenerInfo.serviceChanged(BundleComponentActivator.java:120)
          at org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:991)
          at org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:839)
          at org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:546)
          at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4557)
          at org.apache.felix.framework.Felix.access$000(Felix.java:106)
          at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:420)
          at org.apache.felix.framework.ServiceRegistry.unregisterService(ServiceRegistry.java:170)
          at org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:144)
scr:      at org.apache.felix.scr.impl.manager.AbstractComponentManager$3.unregister(AbstractComponentManager.java:883)
          at org.apache.felix.scr.impl.manager.AbstractComponentManager$3.unregister(AbstractComponentManager.java:857)
          at org.apache.felix.scr.impl.manager.RegistrationManager.changeRegistration(RegistrationManager.java:140)
          at org.apache.felix.scr.impl.manager.AbstractComponentManager.unregisterService(AbstractComponentManager.java:925)
{noformat}

In that case the origin would probably be the service implementation itself 
(which might fail to start because of an exception in its activate).
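
To illustrate how the two traces above could be told apart, here is a minimal sketch in plain Java. It assumes the origin is the first frame below the {{org.apache.felix.framework}} frames, which matches the traces quoted here but is not guaranteed for other Felix versions. The class {{OriginFinder}} and its method names are hypothetical and not part of Felix or of the attached patches.

```java
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Felix. */
final class OriginFinder {
    // Assumed package prefixes, taken from the stack traces quoted above.
    static final String FRAMEWORK_PREFIX = "org.apache.felix.framework.";
    static final String SCR_PREFIX = "org.apache.felix.scr.";

    private OriginFinder() {
    }

    /**
     * Returns the class name of the first frame found below the Felix
     * framework frames, i.e. the caller that triggered the unregistration.
     * Frames above the framework frames (the listener itself) are skipped.
     */
    static Optional<String> findOrigin(StackTraceElement[] stack) {
        boolean seenFramework = false;
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().startsWith(FRAMEWORK_PREFIX)) {
                seenFramework = true;
            } else if (seenFramework) {
                return Optional.of(frame.getClassName());
            }
        }
        return Optional.empty();
    }

    /** True if the origin is SCR itself rather than application code. */
    static boolean isScrManaged(String originClass) {
        return originClass.startsWith(SCR_PREFIX);
    }
}
```

A listener could pass {{Thread.currentThread().getStackTrace()}} to {{findOrigin}} when it receives an unregistration event and store the result in the service -> origin map, keeping only origins for which {{isScrManaged}} returns false.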

While this is implementation-specific (bound to Felix and requires knowing its 
internal package names), that is acceptable for a troubleshooting tool. It can 
be adapted for newer Felix versions where things might change.

Once the origin class/package is known (e.g. 
{{org.apache.jackrabbit.oak.plugins.metric.StatisticsProviderFactory}} in this 
case), the tool can check the bundle's state at troubleshooting time and 
possibly grep the error log file for any messages from that class or package, 
then present them as hints.
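
The log lookup could be as simple as a substring filter. The following is a rough sketch only: it assumes the log lines are already in memory and that the logger name (class or package) appears verbatim in each line. {{LogHints}} and {{linesMentioning}} are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper for illustration; not part of Felix. */
final class LogHints {
    private LogHints() {
    }

    /**
     * Returns the log lines that mention the origin class or any class in
     * the same package. This is a plain substring match, so packages that
     * merely share the prefix will match as well.
     */
    static List<String> linesMentioning(List<String> logLines, String originClass) {
        int lastDot = originClass.lastIndexOf('.');
        // Fall back to the full name if the class is in the default package.
        String pkg = lastDot > 0 ? originClass.substring(0, lastDot) : originClass;
        return logLines.stream()
                .filter(line -> line.contains(pkg))
                .collect(Collectors.toList());
    }
}
```

Matching on the package rather than the exact class is deliberate here, since the failing code may log under a sibling class.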

In our Sling-based application, the JCR repository (database) is registered 
dynamically, and most of the application bundles depend on it directly or 
indirectly. Its startup can be prone to various low-level exceptions 
(persistence problems, configuration issues), which prevent the dynamic 
registration. However, the exception message easily gets lost in the error log, 
as there is usually a lot more going on when the repository restarts. A 
troubleshooting tool that can find this automatically (i.e. without knowing 
about the specific service names) would be useful.

The open question is whether capturing the stack trace on every service 
unregistration would be too costly. See 
http://stackoverflow.com/questions/2347828/how-expensive-is-thread-getstacktrace

[jira] [Commented] (FELIX-5410) Web console plugin for troubleshooting wiring issues

2016-11-30 Thread Alexander Klimetschek (JIRA)

[ 
https://issues.apache.org/jira/browse/FELIX-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709713#comment-15709713
 ] 

Alexander Klimetschek commented on FELIX-5410:
--

It should also provide an InventoryPrinter so a configuration dump of the 
system includes this potentially valuable troubleshooting information.



[jira] [Commented] (FELIX-5410) Web console plugin for troubleshooting wiring issues

2016-11-14 Thread Robert Munteanu (JIRA)

[ 
https://issues.apache.org/jira/browse/FELIX-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665128#comment-15665128
 ] 

Robert Munteanu commented on FELIX-5410:


Wow, this looks pretty nice :-)



[jira] [Commented] (FELIX-5410) Web console plugin for troubleshooting wiring issues

2016-11-14 Thread Alexander Klimetschek (JIRA)

[ 
https://issues.apache.org/jira/browse/FELIX-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665096#comment-15665096
 ] 

Alexander Klimetschek commented on FELIX-5410:
--

More ideas:
- Add a "try to start all bundles" button: Once some bundles are updated and 
all hard errors are resolved, other bundles in Installed that are just 
"waiting" for their dependencies to become active typically won't start 
automatically. You currently have to nudge them by manually clicking "Activate" 
on each of them (or by using "Refresh Packages" as a broader action). It would 
be nice if the Troubleshoot view showed this button once all issues are 
resolved.
- Table layout for the error list (like in bundles)?
