[
https://issues.apache.org/jira/browse/ARTEMIS-2347?focusedWorklogId=246332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-246332
]
ASF GitHub Bot logged work on ARTEMIS-2347:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 21/May/19 19:34
Start Date: 21/May/19 19:34
Worklog Time Spent: 10m
Work Description: clebertsuconic commented on pull request #2675:
ARTEMIS-2347 JournalStorageManager::stopReplication can deadlock while stopping
URL: https://github.com/apache/activemq-artemis/pull/2675#discussion_r286189481
##########
File path:
artemis-server/src/test/java/org/apache/activemq/artemis/core/persistence/impl/journal/JournalStorageManagerTest.java
##########
@@ -145,4 +166,73 @@ public void testFixJournalFileSize() {
Assert.assertEquals(4096, manager.fixJournalFileSize(4098, 4096));
Assert.assertEquals(8192, manager.fixJournalFileSize(8192, 4096));
}
+
+ @Test(timeout = 20_000)
Review comment:
This test is hard-coded to libaio.
Please either make it runnable on other platforms, or add an Assume call to
skip the test when libaio is not available:
```java
org.junit.Assume.assumeTrue("Test case needs AIO to run",
                            AIOSequentialFileFactory.isSupported());
```
It's probably best to move the new test into its own test class and add
this as a @BeforeClass method:
```java
@BeforeClass
public static void hasAIO() {
   // Skips (rather than fails) every test in the class when libaio cannot be loaded.
   org.junit.Assume.assumeTrue("Test case needs AIO to run",
                               AIOSequentialFileFactory.isSupported());
}
```
AIOImportExportTest is an example of a class that uses Assume this way.
You also have the option of making the test runnable with NIO. If that is not
possible, please add the Assume call here.
I ran this test on macOS and got this:
```
java.lang.UnsatisfiedLinkError: org.apache.activemq.artemis.nativo.jlibaio.LibaioContext.newContext(I)Ljava/nio/ByteBuffer;
    at org.apache.activemq.artemis.nativo.jlibaio.LibaioContext.newContext(Native Method)
    at org.apache.activemq.artemis.nativo.jlibaio.LibaioContext.<init>(LibaioContext.java:161)
    at org.apache.activemq.artemis.core.io.aio.AIOSequentialFileFactory.start(AIOSequentialFileFactory.java:260)
    at org.apache.activemq.artemis.core.journal.impl.JournalImpl.start(JournalImpl.java:2449)
    at org.apache.activemq.artemis.core.persistence.impl.journal.AbstractJournalStorageManager.start(AbstractJournalStorageManager.java:1538)
    at org.apache.activemq.artemis.core.persistence.impl.journal.JournalStorageManagerTest.testStopReplicationDoesNotDeadlockWhileStopping(JournalStorageManagerTest.java:179)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
```
I would like to keep the test suite sane, especially for anyone who ends up
running it on Windows or macOS.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Issue Time Tracking
-------------------
Worklog Id: (was: 246332)
Time Spent: 40m (was: 0.5h)
> JournalStorageManager::stopReplication can deadlock while stopping
> ------------------------------------------------------------------
>
> Key: ARTEMIS-2347
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2347
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: Broker
> Affects Versions: 2.8.1
> Reporter: Francesco Nigro
> Priority: Major
> Fix For: 2.9.0
>
> Attachments: deadlock_stacktrace.txt
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> JournalStorageManager::stopReplication needs to:
> # acquire the manager write lock
> # acquire (if any) large message intrinsic locks during
> performCachedLargeMessageDeletes
> # acquire the manager read lock on confirmPendingLargeMessage
> JournalStorageManager::stop needs to:
> # acquire (if any) large message intrinsic locks during
> performCachedLargeMessageDeletes
> # acquire the manager read lock on confirmPendingLargeMessage
> A racing call to JournalStorageManager::stopReplication while stopping could
> deadlock the broker:
> # JournalStorageManager::stop acquires a large message intrinsic lock
> during performCachedLargeMessageDeletes
> # JournalStorageManager::stopReplication acquires the manager write lock
> # JournalStorageManager::stop blocks waiting for the manager read lock,
> which is unavailable while the write lock is held, before releasing the
> lock on the large message
> # JournalStorageManager::stopReplication blocks on the large message
> intrinsic lock before releasing the manager write lock
> # deadlock: neither thread can proceed
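The interleaving described above is a lock-order inversion, and it can be reproduced in miniature with plain JDK locks. The sketch below is only an illustration, not the broker code: `LockOrderSketch`, `managerLock`, and `largeMessage` are hypothetical stand-ins, and the "stop" thread uses a bounded `tryLock` where the real code would wait indefinitely, so that the example terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockOrderSketch {

    /** Replays steps 1-4; returns true if the "stop" thread timed out waiting for the read lock. */
    public static boolean stopIsBlocked() {
        // Hypothetical stand-ins for the manager lock and one large message monitor.
        ReentrantReadWriteLock managerLock = new ReentrantReadWriteLock();
        Object largeMessage = new Object();
        CountDownLatch largeMessageHeld = new CountDownLatch(1);
        CountDownLatch writeLockHeld = new CountDownLatch(1);
        AtomicBoolean blocked = new AtomicBoolean(false);

        Thread stop = new Thread(() -> {
            synchronized (largeMessage) {                    // step 1: take the intrinsic lock
                largeMessageHeld.countDown();
                try {
                    writeLockHeld.await();                   // let step 2 happen first
                    // step 3: a plain lock() would wait forever here; the timeout
                    // exists only so this sketch can finish and report the result.
                    boolean acquired =
                        managerLock.readLock().tryLock(200, TimeUnit.MILLISECONDS);
                    if (acquired) {
                        managerLock.readLock().unlock();
                    } else {
                        blocked.set(true);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "stop");

        Thread stopReplication = new Thread(() -> {
            try {
                largeMessageHeld.await();                    // wait until step 1 is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            managerLock.writeLock().lock();                  // step 2: take the write lock
            try {
                writeLockHeld.countDown();
                synchronized (largeMessage) {                // step 4: waits until "stop" gives up
                    // the large message work would run here
                }
            } finally {
                managerLock.writeLock().unlock();
            }
        }, "stopReplication");

        stop.start();
        stopReplication.start();
        try {
            stop.join();
            stopReplication.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the sketch threads", e);
        }
        return blocked.get();
    }

    public static void main(String[] args) {
        System.out.println("stop blocked behind write lock: " + stopIsBlocked());
    }
}
```

The two threads take the same pair of locks in opposite order, which is what makes the cycle possible. Acquiring them in one consistent order on both paths, or releasing the first lock before requesting the second, would remove it; which of these fits the actual fix is for the pull request to decide.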
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)