Alin Brici created FELIX-5435:
---------------------------------
Summary: [DS] Service does not get loaded with updated properties
that have been modified on file system after felix framework restart
Key: FELIX-5435
URL: https://issues.apache.org/jira/browse/FELIX-5435
Project: Felix
Issue Type: Bug
Components: Declarative Services (SCR)
Affects Versions: scr-2.0.6, scr-2.0.4
Environment: Java 1.7; Using Felix framework version 5.4.0.
Reporter: Alin Brici
Priority: Blocker
In felix.scr 2.0.6, seeing an issue where a configuration for a particular
Service is not loaded.
We have one @Component(Componenent A) that offers one service, lets call this
Service A.
ComponenetA attributes are as follows: {noformat}(name="ServiceA", policy=
ConfigurationPolicy.REQUIRE, metatype=true, immediate=true) {noformat}
ServiceA has a configuration that could look like this:
{noformat}
{
"name" : "ServiceA",
"description" : "Displays which version it has of ServiceB",
"versionRequired" : "1"
}
{noformat}
We have a second @Component(Component B) that offers another service, lets call
that Service B.
The Component attributes for Component B are as follows:
{noformat}(name:"ServiceB", policy=ConfigurationPolicy.OPTIONAL, metatype=true,
immediate=true).{noformat}
(I understand with immediate=true and ConfiguraitonPolicy.OPTIONAL, Service B
will be initialized without any configuration when it first gets started by the
framework, however if there is configuration for it on the file system, Felix
DS will reload Service B with what is on file system when it scans the
directory for ServiceB configuraiton).
Service B, has some properties in the configuraiton for it, that are used by
ServiceA if ServiceB has been loaded with configuration. For example say we
have a configuration for ServiceB that looks like this.
{noformat}
{
"name" : "ServiceB",
"description" : "is used by ServiceA"
"version" : "1"
}
{noformat}
Service A has a reference to Service B, as so:
{noformat}
@Reference(policy = ReferencePolicy.DYNAMIC)
volatile ServiceB serviceB = null;
For example if we have something like this for ComponentA/ServiceA, I will keep
the code short to illustrate the problem.
@Component(name = "ServiceA",
policy = ConfigurationPolicy.REQUIRE,
metatype = true,
description = "ComponentA with ServiceA",
immediate = true)
@Service(value = {ServiceAInterface.class})
public class ServiceA implements ServiceAInterface {
private String version;
private String description;
private Json config = null;
// version requested by VersionB
private String versionRequested;
@Reference(policy = ReferencePolicy.DYNAMIC)
volatile ServiceB serviceB;
@Activate
protected void activate(ComponentContext context) {
this.config = getConfigFromComponentContext(context);
this.description = config.get("description");
this.versionRequested = config.get("versionRequested");
// get the version that ServiceA config requests
serviceB.getVersionAsStringAsync(versionRequested).thenOnResult(
this.version = versionFromServiceB;
);
}
// @Deactivate method would go here
@override
public String getVersion() {
if (version != null) {
return version;
} else {
return "No Version has been set yet!";
}
}
}
{noformat}
For the scenario where this problem exists we need a configuration for both
ServiceA and ServiceB. The configuration is stored on the file system in some
directory that felix watches to load changes as they occur. /tmp/felixConfig/
ServiceA.json ServiceB.json.
Start up the Felix OSGi environment with the configuration and what I see is
that it works as expected. ServiceA waits until it has its @Refernce to
ServiceB satisfied before it attempts to activate as expected. Then if I made a
call to ServiceA#getVersion("1"), and ServiceB is up and running with
configuration from the file that was found, ServiceA will be able to get the
version.
Here is where the problem is. I shutdown the OSGi Framework. Then I change the
configuration to both ServiceA and ServiceB, "versionRequired" : "2" and
"version" : "2", respectively. I then start the OSGi framework back up and now
ServiceA does not load correctly in the ComponentRegistry as "versionRequired"
: "2", but rather has the old version, "1". I see that felix.scr picked up the
changes and is attempting to notify all the
ConfigurationListener(org.osgi.service.cm.ConfigurationListener)
implementations.
The setup I am working with picks up a change and makes a call to
org.apache.felix.cm.impl.ConfigurationAdapter#update,
{noformat}
@Override
public void update( Dictionary<String, ?> properties ) throws IOException
{
delegatee.getConfigurationManager().log( LogService.LOG_DEBUG,
"update(properties={0})", new Object[]
{ properties } );
checkActive();
checkDeleted();
delegatee.update( properties );
}
{noformat}
{noformat}
** An Important NOTE **
I set a breakpoint at `delegatee.update(properties)`. When I look here at the
`configurations` inside the
ConfigurationAdapter->ConfigurationAdmin->configurations, the configuration for
ServiceA is not here! The configuration for ServiceB is in that list of
configurations. IF I look at a previous version of felix.scr, 2.0.2, I see
ServiceA in that list of configurations..it no longer is there, in version
2.0.4,2.0.6, and trunk and this will result in the problem because the revision
number will not be updated.
** **
{noformat}
These properties are the new properties for ServiceA, that has the the
versionRequested attribute in the ServiceA object updated to versionRequest =
2, because I changed it on the file system when the system was down and on
startup of the framework it pick up those changes on the file system and is
attempting to propagate this configuration to all necessary services that
require it.
Digging deeper in the call to delegatee.update( properties), is implemented in
org.apache.felix.cm.impl#update(Dictionary<String, ?> properties) as so,
{noformat}
* @see org.osgi.service.cm.Configuration#update(java.util.Dictionary)
*/
public void update( Dictionary<String, ?> properties ) throws IOException
{
PersistenceManager localPersistenceManager = getPersistenceManager();
if ( localPersistenceManager != null )
{
CaseInsensitiveDictionary newProperties = new
CaseInsensitiveDictionary( properties );
getConfigurationManager().log( LogService.LOG_DEBUG, "Updating
config {0} with {1}", new Object[]
{ getPidString(), newProperties } );
setAutoProperties( newProperties, true );
// persist new configuration
localPersistenceManager.store( getPidString(), newProperties );
// finally assign the configuration for use
configure( newProperties );
// if this is a factory configuration, update the factory with
// do this only after configuring with current properties such
// that a concurrently registered ManagedServiceFactory service
// does not receive a new/unusable configuration
updateFactory();
// update the service and fire an CM_UPDATED event
getConfigurationManager().updated( this, true );
}
}
{noformat}
The intresting thing here is that the reference now to 'this' ConfigurationImpl
seen on the code above `getConfigurationManager().updated( this, true );`
has the new revision, and properties of the ConfigurationImpl, but later on
when we dig deeper in the call to getConfigurationManager().updated( this, true
);
we will see that we cannot find the ConfigurationImpl(with the UPDATED revision
and properties) in the list of Configurations of the ConfirurationAdmin.
Following the getConfigurationManager().updated( this, true ); code brings us
to where the ConfigurationManager perpares the eventTread and the UpdateThread
as so:
org.apache.felix.cm.impl.ConfigurationManager
{noformat}
void updated( ConfigurationImpl config, boolean fireEvent )
{
if ( fireEvent )
{
fireConfigurationEvent( ConfigurationEvent.CM_UPDATED,
config.getPidString(), config.getFactoryPidString() );
}
updateThread.schedule( new UpdateConfiguration( config ) ); // this
will occur with the correct revision...
log( LogService.LOG_DEBUG, "UpdateConfiguration({0}) scheduled", new
Object[]
{ config.getPid() } );
}
{noformat}
One very important thing to note here is that the `ConfigurationImpl config`
passed into this updated(config, fireEvent) method has the correct `revision`
and `properties`.
That ConfigurationImpl is not passed into the `fireConfigurationEvent`, but we
later look it up in the ConfigAdmin...which we will see that it is not there,
and this is what I believe to be the root of the problem.
Before we get to the updateThread.schedule(), lets go ahead into the
fireConfiguraitonEvent() which creates a new FireConfigurationEvent object for
both asyncSender and for
syncSender, I did not have any syncSender so that could be ignored. This new
object gets scheduled in the `eventThread`. This thread will then loop through
each ConfigurationListener
and passes the ConfiguraitonEvent that it had created letting them know about
the changed.
Using from the sample I wrote above, ComponentA and all its properties are part
of this ConfigurationEvent. The listener that is having problems actually doing
anything with the ConfigurationEvent is the
org.apache.felix.scr.impl.manager.RegionConfigurationSupport#configurationEvent(ConfigurationEvent
event)
when we look at the code inside the implementation of that we see this line
that is returning
a ConfigurationInfo that has a revision that is revision=1, and the properties
are the updatedProperties{version=2,etc.} instead of returning
revision=2,updatedProperties{version=2,etc.}
around like 259
{noformat}
final ConfigurationInfo configInfo = getConfigurationInfo( pid, targetedPid, //
This IS where the config returns the wrong revision
componentHolder, bundleContext );
{noformat}
If we dig deeper to understand why the ConfigurationInfo comes back with the
correct updated properties but without updating the revision to 2 we will
inside
org.apache.felix.scr.impl.manager.RegionConfigurationSupport#getConfigurationInfo(final
TargetedPID pid, TargetedPID targetedPid, ComponentHolder<?> componentHolder,
final BundleContext bundleContext)
{noformat}
try
{//final ServiceReference caRef =
bundleContext.getServiceReference(ComponentRegistry.CONFIGURATION_ADMIN);
final ConfigurationAdmin ca = getConfigAdmin( bundleContext );
try // this looks interesting, could be getting the wrong config
admin? ?
{
Configuration[] configs = ca.listConfigurations( filter(
pid.getRawPid() ) ); <----- this ca.listConfigurations() is the problem, like I
mentioned earlier in the ** NOTE **
if ( configs != null && configs.length > 0 )
{
Configuration config = configs[0];
return new ConfigurationInfo( config.getProperties(),
config.getBundleLocation(),
config.getChangeCount() );
}
... the rest of the implementation
{noformat}
The problem I am seeing is the the `ConfigurationAdmin ca` does not have the
configuration in its cachedConfigs
{noformat}
// the cache of Configuration instances mapped by their PID
// have this always set to prevent NPE on bundle shutdown
private final HashMap<String, ConfigurationImpl> configurations = new
HashMap<String, ConfigurationImpl>();
{noformat}
when it makese the call to ca.listConfigurations..
{noformat}
// ensure the service.pid and returned a cached
config if available
ConfigurationImpl cfg = getCachedConfiguration( pid ); <-----
this returns null
if ( cfg == null )
{
cfg = new ConfigurationImpl( this, pmList[i], config );
}
{noformat}
it returns no configuraiton from the cache, so then we return a new
ConfigurationImpl which configures a NEW implementation of `ConfigurationImpl`
from what we stored in the PersistanceManager above
at org.apache.felix.cm.impl#update(Dictionary<String, ?> properties), but the
revision, because it's a NEW object, is set to 1. and when this goes back up
the stack and gets to
org.apache.felix.scr.impl.manager.ConfigurableComponentHolder#configurationUpdated()
The logical statement inside,
{noformat}
if (oldChangeCount != null && changeCount <= oldChangeCount &&
factoryPid.equals(oldTargetedPID)) {
return false;
}
{noformat}
is TRUE so the code immediately returns false and the Component(ComponentA) is
not reconfigured with the new properties.
This DOES NOT happen on version felix.scr 2.0.2, but tracing through that code
I see a lot of code that has been refactored. The RegionConfigurationSupport
has taken over what was the ConfigurationSupport. The revision changeCount has
been refactored, and the CA has usage has also been refactored.
It is broken in both 2.0.4, and 2.0.6.
I did a manual bisect until I was able to narrow it down, here is a snapshot of
when the behavior was introduced:
{noformat}
------------------------------------------------------------------------
r1747831 | djencks | 2016-06-10 18:14:36 -0700 (Fri, 10 Jun 2016) | 1 line
<------------- TEST HERE [ BROKEN ]
FELIX-5270 Don't set bundle location on configurations
------------------------------------------------------------------------
{noformat}
This only started occuring after the first commit to
{noformat}FELIX-5270{noformat}
Here a larger snapshot of the actual bisect steps I took:
{noformat}
FELIX-5270 log when bundle locations are inconsistent
------------------------------------------------------------------------
r1749929 | djencks | 2016-06-23 08:57:30 -0700 (Thu, 23 Jun 2016) | 1 line
<------------- TEST HERE [ BROKEN ] AT THIS REVISION
FELIX-5248 missing license header
------------------------------------------------------------------------
r1749927 | cziegeler | 2016-06-23 08:48:00 -0700 (Thu, 23 Jun 2016) | 1 line
Revert rev 1749869
------------------------------------------------------------------------
r1749869 | cziegeler | 2016-06-23 04:45:33 -0700 (Thu, 23 Jun 2016) | 1 line
add missing license header
------------------------------------------------------------------------
r1748287 | djencks | 2016-06-13 10:14:22 -0700 (Mon, 13 Jun 2016) | 1 line
FELIX-5248 test for complaint
------------------------------------------------------------------------
r1747831 | djencks | 2016-06-10 18:14:36 -0700 (Fri, 10 Jun 2016) | 1 line
<------------- TEST HERE [ BROKEN ] WINNER!!!
FELIX-5270 Don't set bundle location on configurations
------------------------------------------------------------------------
r1747329 | djencks | 2016-06-07 17:14:12 -0700 (Tue, 07 Jun 2016) | 1 line
<------------- TEST HERE [ WORKS ]
FELIX-5276 track service event before changing service properties
------------------------------------------------------------------------
r1746618 | djencks | 2016-06-02 12:36:26 -0700 (Thu, 02 Jun 2016) | 1 line
<------------- TEST HERE [ WORKS ]
FELIX-4417 Improve logging of circular references. Fix some problems
introduced with rev 1744827 when activate changes service properties.
------------------------------------------------------------------------
r1746617 | djencks | 2016-06-02 12:36:22 -0700 (Thu, 02 Jun 2016) | 1 line
FELIX-5264 Introduce a single State enum and use an atomic to track it, and use
some optimistic locking on state changes. This fixes the specific issue found
and should provide much easier diagnosis of any remaining or new problems.
------------------------------------------------------------------------
r1746471 | gnodet | 2016-06-01 07:36:32 -0700 (Wed, 01 Jun 2016) | 1 line
<-------- TEST HERE [ WORKS ]
[FELIX-5243] Make ComponentContextImpl#setImplementationAccessible public,
similar to setImplementationObject
------------------------------------------------------------------------
r1746470 | gnodet | 2016-06-01 07:20:03 -0700 (Wed, 01 Jun 2016) | 1 line
[FELIX-5243] Remove anonymous inner class, add a unit test to ensure package
consistency
{noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)