Amichai Rothman created DOSGI-161:
-------------------------------------

             Summary: services sometimes don't get exported
                 Key: DOSGI-161
                 URL: https://issues.apache.org/jira/browse/DOSGI-161
             Project: CXF Distributed OSGi
          Issue Type: Bug
          Components: DSW
    Affects Versions: 1.4
         Environment: Oracle JDK 1.7.0_17, Karaf 2.3.1, DOSGi 1.4.0 (installed 
from Karaf feature)
            Reporter: Amichai Rothman


When starting up a fresh Karaf instance repeatedly, a service that should be 
exported sometimes does not in fact get exported. The issue stems from some 
bugs in the TopologyManagerExport class, the main ones being wrong assumptions 
about the state of the Tracker within its callback method (during the callback 
its internal state is not yet updated, but the code assumes it is), 
unconditional overwriting of data structures (which may already exist) and 
synchronization issues.

During my investigations I came across various other small issues and 
enhancements in the class, so the provided patch includes the final result I 
arrived with including all changes. I've split it up into several consecutive 
patches in order to make the review easier. These are the changes:

Patch #1 - cleanup (no semantic changes):
- fixed typos in docs and logs.
- fixed formatting to be more consistent and readable.
- added debug logs to RemoteServiceAdminLifeCycleListener.added/removed 
callbacks (this proved the serious tracker bug below, but is also useful for 
general debugging).
- simplified shouldExportService returned boolean expression.
- removed removeExportRegistration, removeExportReference and remoteAdminEvent 
methods, all of which are both unused and have a no-op implementation.

Patch #2 (improved iterations):
- replaced iterations on map keys followed by a get of the corresponding value 
with an iteration on Map.Entry (prevents unnecessary lookups).
- replaced map containsKey checks followed by get with a get followed by null 
check (prevents unnecessary lookups, and theoretically race conditions too).
- fixed remoteServiceAdmin.exportService return value handling - its docs say 
the result can never be null, but the code assumes it might be. This is now 
replaced with an isEmpty() check instead, I hope there are no other 
consequences. The reason this is bundled with the iterations fixes is since it 
shows explicitly that map values cannot in fact be null, which proves the 
iteration fixes to be correct (i.e. if Map.get returns null, the key does not 
exist, so this is equivalent to calling Map.containsKey).

Patch #3 (tracker and sync issues):
- fixed tracker misuse: when tracker callback (added) is called, the tracker 
itself does not yet reflect the new state, so accessing services (or size) on 
the tracker from within the callback will fail to include the new service. 
Although the actual service export triggered by the callback is performed on a 
separate thread, this still leaves a race condition between the two which 
sometimes causes the export to fail (with "No RemoteServiceAdmin available!" 
log). To fix this, the newly added RSA is passed down the method chain to the 
doExportService method explicitly. doExportService then uses the reference if 
it is given, or if null (as when called from elsewhere), uses the tracker to 
get the service list as it did in the past.
- fixed exportService always overwriting a service's entry in exportedServices 
with a new empty data structure - now it only does so if it does not already 
exist.
- fixed exportedServices inner maps to be created as synchronized maps - 
previously doExportService wrapped it in a synchronized map on each invocation, 
which results effectively in no synchronization, since the synchronized wrapper 
map is local only so each thread accessing the map uses a different lock. Also 
fixed other accesses of the inner maps to be synchronized 
(Collections.synchronizedMap requires external synchronization on it when being 
iterated, as specified in the docs).
- fixed exports null check in doExportService (in the old code, 
Collections.synchronizedMap never returns null; In the new code it makes more 
sense in case of concurrent removal.)
- removed redundant remoteServiceAdminTracker null check in doExportService (it 
is already used in the constructor, so can't be null at this point).


Please review the changes to make sure nothing was missed (a cluster of issues 
in the same code section usually implies additional lurking bugs...), and feel 
free to drop any non-critical changes that you don't like - I find them all 
beneficial, but it can be a matter of taste.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to