Can you open a jira issue to track this and attach the patch there?
Christian
On 20.03.2013 19:25, A. Rothman wrote:
[The message body disappeared for some reason, resending...]
Hi,
While trying to use DOSGi (1.4.0, in Karaf 2.3.1) I came across some
serious issues, namely when starting up a fresh Karaf instance
repeatedly, only about 50% of the time the service actually got
exported (as verified with a browser pointing to the appropriate WSDL
URL). This is just on a single instance on a local host, before trying
to do anything remotely. I started investigating into the matter,
which required quite a lot of debugging and code reviewing, and found
various issues with the (strangely named?) TopologyManagerExport class
which seems to be at the heart of the matter. This took several
iterations and includes also non-critical cleanups I did along the way
- I'm attaching the final version I arrived at which hopefully
addresses the issues, though due to the law of clustering of bugs, I
wouldn't be surprised if there are other new or old bugs still lurking
there... any feedback and additional code reviews would be greatly
appreciated.
Here is a summary of all the changes, big and small:
- fixed typos in docs and logs.
- fixed formatting to be more consistent and readable.
- added debug logs to
RemoteServiceAdminLifeCycleListener.added/removed callbacks (this
proved the serious tracker bug below, but is also useful for general
debugging).
- simplified shouldExportService returned boolean expression.
- replaced iterations on map keys followed by a get of the
corresponding value with an iteration on Map.Entry (prevents
unnecessary lookups).
- replaced map containsKey checks followed by get with a get followed
by null check (prevents unnecessary lookups, and theoretically race
conditions too).
- fixed tracker misuse: when tracker callback (added) is called, the
tracker itself does not yet reflect the new state, so accessing
services (or size) on the tracker from within the callback will fail
to include the new service. Although the actual service export
triggered by the callback is performed on a separate thread, this
still leaves a race condition between the two which sometimes causes
the export to fail (with "No RemoteServiceAdmin available!" log). To
fix this, the newly added RSA is passed down the method chain to the
doExportService method explicitly. doExportService then uses the
reference if it is given, or if null (as when called from elsewhere),
uses the tracker to get the service list as it did in the past.
- fixed exportService always overwriting a service's entry in
exportedServices with a new empty data structure - now it only does so
if it does not already exist.
- fixed exportedServices inner maps to be created as synchronized maps
- previously doExportService wrapped it in a synchronized map on each
invocation, which results effectively in no synchronization, since the
synchronized wrapper map is local only so each thread accessing the
map uses a different lock. Also fixed other accesses of the inner maps
to be synchronized (Collections.synchronizedMap requires external
synchronization on it when being iterated, as specified in the docs).
- fixed exports null check in doExportService (in the old code,
Collections.synchronizedMap never returns null; In the new code it
makes more sense.)
- removed redundant remoteServiceAdminTracker null check in
doExportService (it is already used in the constructor, so can't be
null at this point).
- added cast of null to String when calling getServiceReferences in
exportExistingServices, so that it will compile also with the OSGi
4.3.0 and later API (which added an overloaded method accepting a
Class). Does not affect previous versions.
- removed removeExportRegistration, removeExportReference and
remoteAdminEvent methods, all of which are both unused and have a
no-op implementation.
Feel free to pick out the changes u like...
In addition, I had a few related questions/thoughts, I hope someone
here can help with them:
1. Is DOSGi being updated to work with OSGi 4.3.0/Felix 4.x/Karaf
2.3.x etc.? It seems to be slightly outdated. At a glance, it looks
like making it compile against OSGi 4.3.0 requires just a few
backward-compatible syntax changes, and hopefully the semantics have
not changed too much.
2. The website says on the Karaf feature page: "CXF DOSGi does not
work with Karaf 2.3.0", however it does not mention whether this
applies to 2.3.1 as well (is it a DOSGi issue or Karaf issue that
prevents it from working?), nor does it say anything about what the
problem is, where it might be relevant, or link to any issue in JIRA -
could someone in the know please update the text to be a bit more
informative? Also, is this indeed just an issue with installation via
the feature, or with compatibility between the projects/versions in
general regardless of installation method?
3. After all the fixes above, the export success rate is definitely
better than 50%, but still isn't 100% - every once in a while the
export still fails. Looking through the logs, I found the message
"RemoteServiceAdmin Implementation is shutting down now" - during
Karaf startup! I'm still not sure what exactly is going on, but it
appears that the RSA is started, services exported properly etc., but
then it suddenly commits suicide a few moments later with no other
error or indication of who/what initiated the shutdown or why. The
initial theory is that the dsw Activator might be getting a
ConfigurationAdmin updated event with null configuration, which would
trigger a shutdown - but I haven't yet confirmed that this is the
case, nor why it would happen or what the proper behavior in such a
case should be.
I'm totally new to CXF and DOSGi (and pretty new to OSGi too), so
please forgive any silly mistakes or assumptions above... but I do
hope this will be useful in making it a better product and making it
work better out of the box :-)
Thanks to anyone who read this far!
Amichai
--
Christian Schneider
http://www.liquid-reality.de
Open Source Architect
http://www.talend.com