[
https://issues.apache.org/jira/browse/FELIX-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777358#comment-13777358
]
Pierre De Rop edited comment on FELIX-3910 at 9/25/13 12:25 PM:
----------------------------------------------------------------
I tried to follow your suggestion by reusing the same aspect locking mechanism
for non aspect callbacks.
The patch is not finished but I prefer to attach it now (see
FELIX-3910.patch.4) so you can take a look at.
patch description
=================
- if I understand correctly, the locking methods used by the aspect aware
methods allow to serialize the execution of tasks, and other concurrent tasks
are blocked until the currently executing task is done.
So, I have added the following method which I call from other methods needing
to be serialized:
private void executeExclusivelyOrBlock(Runnable task) {
enqueueCallback(task);
waitForCallbackLock(task);
try {
execute(task);
} finally {
releaseCallback(task);
}
}
- in the waitForCallbackLock, I did a minor optimization and replaced "indexOf"
usage by this:
private synchronized void waitForCallbackLock(Runnable runnable) {
while (m_injectionQueue.size() > 0 && m_injectionQueue.get(0) !=
runnable) {
try {
wait();
} catch (InterruptedException e) {
}
}
}
I think this might improve performance, especially when there is a high number
of dependency services.
But perhaps, this optimization might be done later ...
- while doing the patch, I came across another synchronization hole that I did
not notice and fix in the previous FELIX-3910.patch.3 patch: Indeed, there is a
subtle issue in the addedService(ServiceReference ref, Object service) and
removed(ServiceReference ref, Object service) methods, which I think have to be
entirely serialized (by reusing your guarded methods): The race condition
concerns the call to makeAvailable(), makeUnavailable() methods ...
So, in the FELIX-3910.patch.4: the addedService/removedService are serialized
using the call to executeExclusivelyOrBlock method. I also serialized the
modifiedService, but I am not sure for now if this is necessary:
public void addedService(final ServiceReference ref, final Object service) {
executeExclusivelyOrBlock(new Runnable() {
public void run() {
addedServiceExclusively(ref, service);
}
});
}
private void addedServiceExclusively(ServiceReference ref, Object service) {
// same code as in addedService, before the patch.
...
}
public void modifiedService(final ServiceReference ref, final Object
service) {
executeExclusivelyOrBlock(new Runnable() {
public void run() {
modifiedServiceExclusively(ref, service);
}
});
}
private void modifiedServiceExclusively(ServiceReference ref, Object
service) {
// same code as in modifiedService, before the patch.
...
}
public void removedService(final ServiceReference ref, final Object
service) {
executeExclusivelyOrBlock(new Runnable() {
public void run() {
removedServiceExclusively(ref, service);
}
});
}
private void removedServiceExclusively(ServiceReference ref, Object
service) {
// same code as in removedService, before the patch.
...
}
With these patches, the "RaceTest" test seems to pass seamlessly. I ran it on a
multi-core machine with 32 cores.
Comments/Questions:
==================
1) since the three addedService/modifiedService/removedService are now
serialized, I wonder if the locking blocks are still necessary in the
handleAspectAwareAdded methods ?
I think we shall remove the locking blocks from this method because we would
then have a deadlock, since the addedService/removedService are already holding
locks (using the executeExclusivelyOrBlock method).
Do you agree with this ? (I will provide FELIX-3910.patch.5 which does this, if
you are OK).
2) in the handleAspectAwareAdded method, I may miss something but I wonder why
the
tasks are executed while holding a lock on the "rankings" variable ?
if (callbackRunnable != null) {
waitForCallbackLock(callbackRunnable);
synchronized (rankings) {
releaseCallback(callbackRunnable);
execute(callbackRunnable);
}
}
Could it be possible to have a deadlock if the callback blocks on a
synchronization lock ?
In this case, then why not doing the following:
if (callbackRunnable != null) {
waitForCallbackLock(callbackRunnable);
try {
execute(callbackRunnable);
} finally {
releaseCallback(callbackRunnable);
}
}
3) I'm currently wondering if the invokeAdded(DependencyService service) and
invokeRemoved(DependencyService service) methods shall (or not) also also
serialized ? I'm not sure for now ... what do you think ?
4) I have to modify the "RaceTest" test in order to use aspects as well.
thanks again for your feedback; feel free to propose another patch if you think
I'm not on the right direction ...
/Pierre
was (Author: pderop):
I tried to follow your suggestion by reusing the same aspect locking
mechanism for non aspect callbacks.
The patch is not finished but I prefer to attach it now (see
FELIX-3910.patch.4) so you can take a look at.
patch description
=================
- if I understand correctly, the locking methods used by the aspect away
methods allow to serialize the execution of tasks, and other concurrent tasks
are blocked until the currently being executed task is done.
So, I have added the following method which I call from other methods needing
to be serialized:
private void executeExclusivelyOrBlock(Runnable task) {
enqueueCallback(task);
waitForCallbackLock(task);
try {
execute(task);
} finally {
releaseCallback(task);
}
}
- in the waitForCallbackLock, I did a minor optimization and replaced "indexOf"
usage by this:
private synchronized void waitForCallbackLock(Runnable runnable) {
while (m_injectionQueue.size() > 0 && m_injectionQueue.get(0) !=
runnable) {
try {
wait();
} catch (InterruptedException e) {
}
}
}
I think this might improve performance, especially when there is a high number
of dependency services.
But perhaps, this optimization might be done later ...
- while doing the patch, I came across another synchronization hole that I did
not notice and fix in the previous FELIX-3910.patch.3 patch: Indeed, there is a
subtle issue in the addedService(ServiceReference ref, Object service) and
removed(ServiceReference ref, Object service) methods, which I think have to be
entirely serialized (by reusing your guarded methods): The race condition
concerns the call to makeAvailable(), makeUnavailable() methods ...
So, in the FELIX-3910.patch.4: the addedService/removedService are serialized
using the call to executeExclusivelyOrBlock method. I also serialized the
modifiedService, but I am not sure for now if this is necessary:
public void addedService(final ServiceReference ref, final Object service) {
executeExclusivelyOrBlock(new Runnable() {
public void run() {
addedServiceExclusively(ref, service);
}
});
}
private void addedServiceExclusively(ServiceReference ref, Object service) {
// same code as in addedService, before the patch.
...
}
public void modifiedService(final ServiceReference ref, final Object
service) {
executeExclusivelyOrBlock(new Runnable() {
public void run() {
modifiedServiceExclusively(ref, service);
}
});
}
private void modifiedServiceExclusively(ServiceReference ref, Object
service) {
// same code as in modifiedService, before the patch.
...
}
public void removedService(final ServiceReference ref, final Object
service) {
executeExclusivelyOrBlock(new Runnable() {
public void run() {
removedServiceExclusively(ref, service);
}
});
}
private void removedServiceExclusively(ServiceReference ref, Object
service) {
// same code as in removedService, before the patch.
...
}
With these patches, the "RaceTest" test seems to pass seamlessly. I ran it on a
multi-core machine with 32 cores.
Comments/Questions:
==================
1) since the three addedService/modifiedService/removedService are now
serialized, I wonder if the locking blocks are still necessary in the
handleAspectAwareAdded methods ?
I think we shall remove the locking blocks from this method because we would
then have a deadlock, since the addedService/removedService are already holding
locks (using the executeExclusivelyOrBlock method).
Do you agree with this ? (I will provide FELIX-3910.patch.5 which does this, if
you are OK).
2) in the handleAspectAwareAdded method, I may miss something but I wonder why
the
tasks are executed while holding a lock on the "rankings" variable ?
if (callbackRunnable != null) {
waitForCallbackLock(callbackRunnable);
synchronized (rankings) {
releaseCallback(callbackRunnable);
execute(callbackRunnable);
}
}
Could it be possible to have a deadlock if the callback blocks on a
synchronization lock ?
In this case, then why not doing the following:
if (callbackRunnable != null) {
waitForCallbackLock(callbackRunnable);
try {
execute(callbackRunnable);
} finally {
releaseCallback(callbackRunnable);
}
}
3) I'm currently wondering if the invokeAdded(DependencyService service) and
invokeRemoved(DependencyService service) methods shall (or not) also also
serialized ?
I think this is not the case ?
4) I have to modify the "RaceTest" test in order to use aspects as well.
thanks again for your feedback; feel free to propose another patch if you think
I'm not on the right direction ...
/Pierre
> Race conditions in DependencyManager
> ------------------------------------
>
> Key: FELIX-3910
> URL: https://issues.apache.org/jira/browse/FELIX-3910
> Project: Felix
> Issue Type: Bug
> Components: Dependency Manager
> Affects Versions: dependencymanager-3.0.0
> Environment: jdk1.6, jdk1.7, linux fc16
> Reporter: Pierre De Rop
> Attachments: FELIX-3910-patch, FELIX-3910.patch.2,
> FELIX-3910.patch.3, FELIX-3910.patch.4
>
>
> In a multi threaded context, where dependencies are injected concurrently
> from different threads, we came across some exceptions which seem to take
> place from dependencymanager.
> I have tried to reproduce the problems using a paxexam test which I will
> commit.
> Not all exceptions are reproduced by the test case, but I think that the
> testcase really reproduces a problem.
> I also have a candidate patch, which I will submit to this jira issue.
> Here are the exceptions we have seen:
> first stacktrace seen:
> ===============
> ERROR: Bundle test.dm [21] EventDispatcher: Error during dispatch.
> (java.lang.NullPointerException)
> java.lang.NullPointerException
> at
> org.apache.felix.dm.impl.dependencies.ServiceDependencyImpl.addedService(ServiceDependencyImpl.java:481)
> at
> org.apache.felix.dm.tracker.ServiceTracker$Tracked.customizerAdded(ServiceTracker.java:1325)
> at
> org.apache.felix.dm.tracker.AbstractTracked.trackAdding(AbstractTracked.java:290)
> at
> org.apache.felix.dm.tracker.AbstractTracked.track(AbstractTracked.java:236)
> at
> org.apache.felix.dm.tracker.ServiceTracker$Tracked.serviceChangedHideAspects(ServiceTracker.java:1206)
> at
> org.apache.felix.dm.tracker.ServiceTracker$Tracked.serviceChanged(ServiceTracker.java:1101)
> at
> org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:932)
> at
> org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:793)
> at
> org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:543)
> at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4260)
> at org.apache.felix.framework.Felix.registerService(Felix.java:3275)
> at
> org.apache.felix.framework.BundleContextImpl.registerService(BundleContextImpl.java:346)
> at
> org.apache.felix.framework.BundleContextImpl.registerService(BundleContextImpl.java:320)
> at test.dm.race.RaceTest$RegistrationHelper$1.run(RaceTest.java:104)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> second exceptions:
> ==============
> ERROR: EventDispatcher: Error during dispatch.
> (java.lang.NullPointerException)
> java.lang.NullPointerException
> at
> org.apache.felix.dm.impl.ComponentImpl.invokeCallbackMethod(ComponentImpl.java:686)
> at
> org.apache.felix.dm.impl.dependencies.ServiceDependencyImpl.invoke(ServiceDependencyImpl.java:704)
> at
> org.apache.felix.dm.impl.dependencies.ServiceDependencyImpl.invokeRemoved(ServiceDependencyImpl.java:666)
> at
> org.apache.felix.dm.impl.dependencies.ServiceDependencyImpl.removedService(ServiceDependencyImpl.java:520)
> at
> org.apache.felix.dm.tracker.ServiceTracker$Tracked.customizerRemoved(ServiceTracker.java:1351)
> at
> org.apache.felix.dm.tracker.AbstractTracked.untrack(AbstractTracked.java:359)
> at
> org.apache.felix.dm.tracker.ServiceTracker$Tracked.serviceChangedHideAspects(ServiceTracker.java:1285)
> at
> org.apache.felix.dm.tracker.ServiceTracker$Tracked.serviceChanged(ServiceTracker.java:1101)
> at
> org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:878)
> at
> org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:732)
> at
> org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:662)
> at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:3587)
> at org.apache.felix.framework.Felix.access$000(Felix.java:40)
> at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:625)
> at
> org.apache.felix.framework.ServiceRegistry.unregisterService(ServiceRegistry.java:117)
> at
> org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:128)
> at org.apache.felix.dm.test.RaceTest$AFactory$2.run(RaceTest.java:151)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira