[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16380586#comment-16380586 ] ASF GitHub Bot commented on ARTEMIS-1700: - Github user clebertsuconic commented on the issue: https://github.com/apache/activemq-artemis/pull/1899 @shoukunhuai please let me know if you still see any issues... Testsuites have improved after this commit.
> Server stopped responding and killed itself while exiting paging state
> ----------------------------------------------------------------------
>
> Key: ARTEMIS-1700
> URL: https://issues.apache.org/jira/browse/ARTEMIS-1700
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: Broker
> Affects Versions: 2.4.0
> Reporter: Qihong Xu
> Priority: Major
> Attachments: artemis.log
>
> We are currently experiencing this error while running a stress test on Artemis.
>
> Basic configuration:
> 1 broker, 1 topic, pub-sub mode.
> Journal type = MAPPED.
> Threadpool max size = 60.
>
> To test the throughput of Artemis we use 300 producers and 300 consumers. However, we found that sometimes, when Artemis exits the paging state, it stops responding and kills itself. This situation happened on some specific servers.
>
> Details can be found in the attached dump file.
>
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379301#comment-16379301 ] ASF GitHub Bot commented on ARTEMIS-1700: - Github user asfgit closed the pull request at: https://github.com/apache/activemq-artemis/pull/1904
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379300#comment-16379300 ] ASF subversion and git services commented on ARTEMIS-1700: -- Commit 7e06a2b1922f40447cd2eff4f7147e4b7056ae1e in activemq-artemis's branch refs/heads/master from Clebert Suconic [ https://git-wip-us.apache.org/repos/asf?p=activemq-artemis.git;h=7e06a2b ] ARTEMIS-1700 Using IOExecutors for more IO tasks
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378932#comment-16378932 ] ASF GitHub Bot commented on ARTEMIS-1700: - Github user clebertsuconic commented on the issue: https://github.com/apache/activemq-artemis/pull/1899 @shoukunhuai see #1904
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378931#comment-16378931 ] ASF GitHub Bot commented on ARTEMIS-1700: - GitHub user clebertsuconic opened a pull request: https://github.com/apache/activemq-artemis/pull/1904 ARTEMIS-1700 Using IOExecutors for more IO tasks You can merge this pull request into a Git repository by running: $ git pull https://github.com/clebertsuconic/activemq-artemis deadlock Alternatively you can review and apply these changes as the patch at: https://github.com/apache/activemq-artemis/pull/1904.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1904
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378700#comment-16378700 ] ASF GitHub Bot commented on ARTEMIS-1700: - Github user clebertsuconic commented on the issue: https://github.com/apache/activemq-artemis/pull/1899 You're right.. Man!!! you're good! :) I will send another PR!
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377931#comment-16377931 ] ASF GitHub Bot commented on ARTEMIS-1700: - Github user shoukunhuai commented on the issue: https://github.com/apache/activemq-artemis/pull/1899
So it was a mistake to use the global thread pool instead of the IO thread pool for the page cursor. But this does not fix our problem, as you can see:
```
"Thread-274672 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$5@4e91d63f)" Id=274703 TIMED_WAITING on java.util.concurrent.CountDownLatch$Sync@5c416651
    at sun.misc.Unsafe.park(Native Method)
    - waiting on java.util.concurrent.CountDownLatch$Sync@5c416651
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
    at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
    at org.apache.activemq.artemis.core.journal.impl.SimpleWaitIOCallback.waitCompletion(SimpleWaitIOCallback.java:73)
    at org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl.waitCompletion(OperationContextImpl.java:313)
    at org.apache.activemq.artemis.core.persistence.impl.journal.AbstractJournalStorageManager.waitOnOperations(AbstractJournalStorageManager.java:294)
    at org.apache.activemq.artemis.core.paging.cursor.impl.PageCursorProviderImpl.storeBookmark(PageCursorProviderImpl.java:539)
    at org.apache.activemq.artemis.core.paging.cursor.impl.PageCursorProviderImpl.cleanupComplete(PageCursorProviderImpl.java:431)
    at org.apache.activemq.artemis.core.paging.cursor.impl.PageCursorProviderImpl.cleanup(PageCursorProviderImpl.java:383)
    - locked org.apache.activemq.artemis.core.paging.cursor.impl.PageCursorProviderImpl@4f8d6d9a
    at org.apache.activemq.artemis.core.paging.cursor.impl.PageCursorProviderImpl$1.run(PageCursorProviderImpl.java:291)
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
    at org.apache.activemq.artemis.utils.actors.ProcessorBase$ExecutorTask.run(ProcessorBase.java:53)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

    Number of locked synchronizers = 2
    - java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@613d010
    - java.util.concurrent.ThreadPoolExecutor$Worker@4c03ae59
```
When exiting the paging state, we store a bookmark for each page subscription and wait until all callbacks are done. I believe this can happen even when running on an IO thread, as long as the singleThreadExecutor in AbstractJournalStorageManager uses a thread from the global server thread pool.
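The situation described in this comment, a pooled thread blocking on a latch whose callback must run on the same exhausted pool, can be reproduced with plain JDK executors. The following is a minimal sketch and not Artemis code: the class and method names are hypothetical, and a pool of size 1 stands in for a fully occupied server thread pool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; none of these names exist in Artemis.
class PoolStarvationSketch {

    // Runs a "cleanup" task on workPool that hands a completion callback to
    // callbackPool, then blocks until the callback has run or the timeout expires.
    // Returns true if the callback completed in time.
    static boolean cleanupCompletes(ExecutorService workPool, ExecutorService callbackPool) {
        try {
            Future<Boolean> result = workPool.submit(() -> {
                CountDownLatch done = new CountDownLatch(1);
                callbackPool.execute(done::countDown);
                return done.await(500, TimeUnit.MILLISECONDS);
            });
            return result.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // The waiting task and its callback compete for the same single worker thread.
    static boolean sharedPoolCase() {
        ExecutorService shared = Executors.newFixedThreadPool(1);
        try {
            return cleanupCompletes(shared, shared);
        } finally {
            shared.shutdownNow();
        }
    }

    // The callback runs on its own pool, so the waiting task is released.
    static boolean separatePoolCase() {
        ExecutorService work = Executors.newFixedThreadPool(1);
        ExecutorService callbacks = Executors.newFixedThreadPool(1);
        try {
            return cleanupCompletes(work, callbacks);
        } finally {
            work.shutdownNow();
            callbacks.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared pool completes: " + sharedPoolCase());
        System.out.println("separate pools complete: " + separatePoolCase());
    }
}
```

With a shared pool the wait can only end by timing out, because the only worker is the one doing the waiting; with a separate callback pool the latch is released promptly.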
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377709#comment-16377709 ] ASF GitHub Bot commented on ARTEMIS-1700: - Github user asfgit closed the pull request at: https://github.com/apache/activemq-artemis/pull/1894
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377710#comment-16377710 ] ASF GitHub Bot commented on ARTEMIS-1700: - Github user asfgit closed the pull request at: https://github.com/apache/activemq-artemis/pull/1899
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377707#comment-16377707 ] ASF subversion and git services commented on ARTEMIS-1700: -- Commit ecf4110b1b283291e9faced99d69823f8c92632a in activemq-artemis's branch refs/heads/master from Clebert Suconic [ https://git-wip-us.apache.org/repos/asf?p=activemq-artemis.git;h=ecf4110 ] ARTEMIS-1700 Fixed deadlock in paging state This closes #1894
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377703#comment-16377703 ] ASF GitHub Bot commented on ARTEMIS-1700: - Github user clebertsuconic commented on the issue: https://github.com/apache/activemq-artemis/pull/1899 it's ready to be merged! testsuite pass!
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377366#comment-16377366 ] ASF GitHub Bot commented on ARTEMIS-1700: - Github user clebertsuconic commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1899#discussion_r170695023
--- Diff: artemis-commons/src/main/java/org/apache/activemq/artemis/utils/actors/ArtemisExecutor.java ---
    @@ -50,6 +50,16 @@ default int shutdownNow(Consumer onPendingTask) {
           return 0;
        }
     
    +   default boolean flush(long timeout, TimeUnit unit) {
    +      CountDownLatch latch = new CountDownLatch(1);
    +      execute(latch::countDown);
--- End diff --
This won't be used at all. It's just the default implementation for tests. OrderedExecutor has an implementation.
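The diff excerpt above stops after the `execute` call, so the rest of the method is not visible in this thread. As a rough sketch only, a latch-based flush with a timeout could be completed as follows; `FlushableExecutor` is a hypothetical stand-in interface, and the body after `execute` is an assumption rather than the code from the pull request. The second helper illustrates the concern raised in review: when flush is called from the executor's own single thread, the latch task cannot run while that thread is waiting.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not the ArtemisExecutor interface itself.
class FlushSketch {

    interface FlushableExecutor extends Executor {
        // Queues a marker task and waits until it has run, i.e. until every
        // task submitted before it has been processed, or until the timeout.
        default boolean flush(long timeout, TimeUnit unit) {
            CountDownLatch latch = new CountDownLatch(1);
            execute(latch::countDown);
            try {
                return latch.await(timeout, unit); // assumed continuation
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    // Flush requested by a thread other than the executor's worker.
    static boolean flushFromOutside() {
        ExecutorService single = Executors.newSingleThreadExecutor();
        FlushableExecutor executor = single::execute;
        try {
            executor.execute(() -> { });
            return executor.flush(1, TimeUnit.SECONDS);
        } finally {
            single.shutdownNow();
        }
    }

    // Flush requested from inside a task running on the single worker thread:
    // the marker task is queued behind the caller and cannot run in time.
    static boolean flushFromInside() {
        ExecutorService single = Executors.newSingleThreadExecutor();
        FlushableExecutor executor = single::execute;
        try {
            Callable<Boolean> task = () -> executor.flush(200, TimeUnit.MILLISECONDS);
            return single.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            single.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("flush from outside: " + flushFromOutside());
        System.out.println("flush from inside: " + flushFromInside());
    }
}
```

The timeout is what keeps the inside case from hanging forever: the call returns false instead of blocking indefinitely.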
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377342#comment-16377342 ] ASF GitHub Bot commented on ARTEMIS-1700: - Github user franz1981 commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1899#discussion_r170691220
--- Diff: artemis-commons/src/main/java/org/apache/activemq/artemis/utils/actors/ArtemisExecutor.java ---
    @@ -50,6 +50,16 @@ default int shutdownNow(Consumer onPendingTask) {
           return 0;
        }
     
    +   default boolean flush(long timeout, TimeUnit unit) {
    +      CountDownLatch latch = new CountDownLatch(1);
    +      execute(latch::countDown);
--- End diff --
If the latch is submitted from within the same thread that is executing the tasks, there is no point in waiting, because the latch can never be triggered.
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377328#comment-16377328 ] ASF GitHub Bot commented on ARTEMIS-1700: - Github user clebertsuconic commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1894#discussion_r170689156
--- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java ---
    @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception {
     
           beforeStart();
     
    -      singleThreadExecutor = executorFactory.getExecutor();
    +      ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction() {
    +         @Override
    +         public ThreadFactory run() {
    +            return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader());
    +         }
    +      });
    +      singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory);
--- End diff --
if you could please comment on #1899
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377326#comment-16377326 ] ASF GitHub Bot commented on ARTEMIS-1700: - Github user clebertsuconic commented on the issue: https://github.com/apache/activemq-artemis/pull/1899 Please wait my ack before merging this.. I'm running the whole testsuite! open for discussion only now
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377323#comment-16377323 ] ASF GitHub Bot commented on ARTEMIS-1700: - GitHub user clebertsuconic opened a pull request: https://github.com/apache/activemq-artemis/pull/1899 ARTEMIS-1700 Fixed deadlock in paging state This closes #1894 You can merge this pull request into a Git repository by running: $ git pull https://github.com/clebertsuconic/activemq-artemis deadlock Alternatively you can review and apply these changes as the patch at: https://github.com/apache/activemq-artemis/pull/1899.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1899 commit a4c0e9bcdf072404c07816fd8fe3e8402a774382 Author: Clebert Suconic Date: 2018-02-26T18:13:50Z ARTEMIS-1700 Fixed deadlock in paging state This closes #1894
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377250#comment-16377250 ] ASF GitHub Bot commented on ARTEMIS-1700: - Github user clebertsuconic commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1894#discussion_r170676481
--- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java ---
    @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception {
     
           beforeStart();
     
    -      singleThreadExecutor = executorFactory.getExecutor();
    +      ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction() {
    +         @Override
    +         public ThreadFactory run() {
    +            return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader());
    +         }
    +      });
    +      singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory);
--- End diff --
@shoukunhuai there's another pool meant to be used for IO (the IO thread pool); this is using the wrong pool. Besides, we shouldn't wait forever to stop. I'm sending another PR.
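The diff under discussion replaces an executor borrowed from a shared pool with one that owns its thread, and the reply adds that shutdown should not wait forever. A minimal sketch of both ideas using only JDK classes follows; a plain `ThreadFactory` lambda is swapped in for `ActiveMQThreadFactory`, and the class name, thread-name prefix and timeout values are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not the Artemis storage manager.
class DedicatedExecutorSketch {

    // Creates named daemon threads so the executor's work is easy to spot in a thread dump.
    static ThreadFactory namedDaemonFactory(String prefix) {
        AtomicInteger counter = new AtomicInteger();
        return runnable -> {
            Thread thread = new Thread(runnable, prefix + "-" + counter.incrementAndGet());
            thread.setDaemon(true);
            return thread;
        };
    }

    // Stops the executor but gives up after the timeout instead of blocking forever.
    // Returns true if all tasks finished within the timeout.
    static boolean stopWithTimeout(ExecutorService executor, long timeout, TimeUnit unit) {
        executor.shutdown();
        try {
            if (executor.awaitTermination(timeout, unit)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        executor.shutdownNow();
        return false;
    }

    // Runs one task on a dedicated single thread and reports which thread ran it.
    static String runOnDedicatedThread() {
        ExecutorService executor =
            Executors.newSingleThreadExecutor(namedDaemonFactory("journal-sketch"));
        try {
            return executor.submit(() -> Thread.currentThread().getName()).get(1, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            stopWithTimeout(executor, 1, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnDedicatedThread());
    }
}
```

A dedicated thread cannot be starved by other users of a shared pool, at the cost of one extra thread per executor; whether that trade-off is acceptable is exactly what this review thread debates.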
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375504#comment-16375504 ] Francesco Nigro commented on ARTEMIS-1700: --
Thanks! Just a couple of tips on it. Considering how the TimedBuffer works, if you need scalability with a large number of writers on a blocking API like NIO/MAPPED, you need to:
- run ./artemis perf-journal --journal-type MAPPED --sync --verbose --sync-writes
- read the histogram and use at least the latencies starting from the 0.90 percentile, then add your network latency; the result is the optimal buffer-timeout.
As a simpler rule of thumb, I use about twice the default timeout proposed by Artemis: you will lose something in the single producer/consumer case while gaining more chances of batched writes with multiple producers/consumers. If, instead, you are just benchmarking and do not care about power failures but only about OS failures, try buffer-timeout = 0 and datasync = false in broker.xml with MAPPED.
On Sat, 24 Feb 2018 at 02:51, Qihong Xu (JIRA)
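The percentile-based tip above amounts to a small calculation. The sketch below works it through with invented numbers: the latency samples, the network latency and the use of nanoseconds are all assumptions for illustration, not output of the perf-journal tool.

```java
import java.util.Arrays;

// Hypothetical worked example of "percentile latency + network latency".
class BufferTimeoutSketch {

    // Nearest-rank percentile over a copy of the samples, using integer arithmetic.
    static long percentile(long[] samples, int percent) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (percent * sorted.length + 99) / 100; // ceil(percent * n / 100)
        return sorted[Math.max(rank, 1) - 1];
    }

    // Suggested timeout: the chosen write-latency percentile plus the network latency.
    static long suggestedTimeout(long[] writeLatencies, int percent, long networkLatency) {
        return percentile(writeLatencies, percent) + networkLatency;
    }

    public static void main(String[] args) {
        // Invented sync-write latencies, assumed to be in nanoseconds.
        long[] latencies = {
            100_000, 120_000, 150_000, 200_000, 250_000,
            300_000, 400_000, 500_000, 800_000, 2_000_000
        };
        long networkLatency = 200_000; // invented value
        // 90th percentile is 800_000, so this prints 1000000.
        System.out.println(suggestedTimeout(latencies, 90, networkLatency));
    }
}
```

Using a high percentile rather than the mean keeps occasional slow syncs from dominating, while still leaving the timeout long enough for most writes to be batched.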
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375260#comment-16375260 ]

ASF GitHub Bot commented on ARTEMIS-1700:
-----------------------------------------

Github user shoukunhuai commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1894#discussion_r170407785

    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java ---
    @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception {

           beforeStart();

    -      singleThreadExecutor = executorFactory.getExecutor();
    +      ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction<ThreadFactory>() {
    +         @Override
    +         public ThreadFactory run() {
    +            return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader());
    +         }
    +      });
    +      singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory);
    --- End diff --

    https://github.com/apache/activemq-artemis/blob/d6d895c558cc104475188d942473771418b5e3e6/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/group/impl/LocalGroupingHandler.java#L109-L117

    @clebertsuconic the comment there says we need an executor outside the pool.
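To make the "executor outside the pool" idea from the comment above concrete, here is a minimal sketch in plain JDK Java. It is not Artemis code: `DedicatedExecutorSketch`, `newDedicatedExecutor`, `callbackRunsWhilePoolIsSaturated` and the thread name are hypothetical, and a plain lambda stands in for `ActiveMQThreadFactory`. It only illustrates that a callback handed to a dedicated single-thread executor can still run while every thread of a fixed pool is blocked. Whether this is the right fix for the broker is disputed elsewhere in this thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

// Illustration only; none of these names exist in Artemis.
final class DedicatedExecutorSketch {

    // Single-thread executor with its own named daemon thread, independent of any pool.
    static ExecutorService newDedicatedExecutor(String threadName) {
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable, threadName);
            t.setDaemon(true);
            return t;
        };
        return Executors.newSingleThreadExecutor(factory);
    }

    // Blocks every thread of a fixed pool, then checks whether a callback
    // submitted to the dedicated executor still runs within waitMillis.
    static boolean callbackRunsWhilePoolIsSaturated(int poolSize, long waitMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        ExecutorService dedicated = newDedicatedExecutor("journal-callbacks-demo");
        CountDownLatch allBusy = new CountDownLatch(poolSize);
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch callbackRan = new CountDownLatch(1);
        try {
            for (int i = 0; i < poolSize; i++) {
                pool.execute(() -> {
                    allBusy.countDown();
                    try {
                        release.await(); // keep this pool thread occupied
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            allBusy.await(); // the shared pool now has no free thread
            dedicated.execute(callbackRan::countDown);
            return callbackRan.await(waitMillis, TimeUnit.MILLISECONDS);
        } finally {
            release.countDown();
            pool.shutdown();
            dedicated.shutdown();
        }
    }
}
```

Calling `DedicatedExecutorSketch.callbackRunsWhilePoolIsSaturated(4, 2000)` should report that the callback ran even though the four pool threads were all blocked. The cost is one extra thread per dedicated executor.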
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375225#comment-16375225 ]

Qihong Xu commented on ARTEMIS-1700:
------------------------------------

[~nigro@gmail.com] Yes, we just use the default setting here.
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375220#comment-16375220 ]

ASF GitHub Bot commented on ARTEMIS-1700:
-----------------------------------------

Github user qihongxu commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1894#discussion_r170405758

    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java ---
    @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception {

           beforeStart();

    -      singleThreadExecutor = executorFactory.getExecutor();
    +      ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction<ThreadFactory>() {
    +         @Override
    +         public ThreadFactory run() {
    +            return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader());
    +         }
    +      });
    +      singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory);
    --- End diff --

    [artemis.log](https://github.com/apache/activemq-artemis/files/1753723/artemis.log)
    Please see the thread dump in the attached file.
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375219#comment-16375219 ]

ASF GitHub Bot commented on ARTEMIS-1700:
-----------------------------------------

Github user shoukunhuai commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1894#discussion_r170405747

    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java ---
    @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception {

           beforeStart();

    -      singleThreadExecutor = executorFactory.getExecutor();
    +      ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction<ThreadFactory>() {
    +         @Override
    +         public ThreadFactory run() {
    +            return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader());
    +         }
    +      });
    +      singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory);
    --- End diff --

    We are running 2.4.0.
    See https://issues.apache.org/jira/browse/ARTEMIS-1700
    There is an artemis.log attached.
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375209#comment-16375209 ]

ASF GitHub Bot commented on ARTEMIS-1700:
-----------------------------------------

Github user clebertsuconic commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1894#discussion_r170405391

    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java ---
    @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception {

           beforeStart();

    -      singleThreadExecutor = executorFactory.getExecutor();
    +      ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction<ThreadFactory>() {
    +         @Override
    +         public ThreadFactory run() {
    +            return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader());
    +         }
    +      });
    +      singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory);
    --- End diff --

    Can I see a thread dump? Maybe there is a better fix.

    A test would be best, but a thread dump would be OK as long as you tell me the version (or git commit) it is associated with.
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375207#comment-16375207 ]

ASF GitHub Bot commented on ARTEMIS-1700:
-----------------------------------------

Github user shoukunhuai commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1894#discussion_r170405300

    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java ---
    @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception {

           beforeStart();

    -      singleThreadExecutor = executorFactory.getExecutor();
    +      ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction<ThreadFactory>() {
    +         @Override
    +         public ThreadFactory run() {
    +            return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader());
    +         }
    +      });
    +      singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory);
    --- End diff --

    We believe this happens when
    - the fixed thread pool is in use, which is the default, and
    - an address has many producers, more than the thread pool's size, and
    - the address is about to exit paging state.
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375205#comment-16375205 ]

ASF GitHub Bot commented on ARTEMIS-1700:
-----------------------------------------

Github user shoukunhuai commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1894#discussion_r170405082

    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java ---
    @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception {

           beforeStart();

    -      singleThreadExecutor = executorFactory.getExecutor();
    +      ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction<ThreadFactory>() {
    +         @Override
    +         public ThreadFactory run() {
    +            return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader());
    +         }
    +      });
    +      singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory);
    --- End diff --

    What if the pool is full?
    In our case the pool is a fixed pool of 60 threads. One of the threads is doing page cleanup and trying to exit paging state; it holds the lock in the paging store.
    The other 59 threads are all blocked on that lock, trying to page.

    During cleanup we need to store a bookmark in the journal for each page subscription, then wait until the store completes.
    In the log, stored equals storeLineUp, but there are still pending tasks (the ones that would count down the latch the cleanup thread is waiting on). Those tasks can never run because no pool thread is free, so the deadlock happens.

    ```
    16:44:28,930 AMQ222024: Could not complete operations on IO context OperationContextImpl [1251391301] [minimalStore=1, storeLineUp=2, stored=2, minimalReplicated=0, replicationLineUp=0, replicated=0, paged=0, minimalPage=0, pageLineUp=0, errorCode=-1, errorMessage=null, executorsPending=3, executor=OrderedExecutor(tasks=[org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl$1@4d09259, org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl$1@54b73dc4, org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl$1@640495d4])]
    ```
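The scenario described in this comment can be reproduced in miniature with the JDK alone. The sketch below is a toy model, not Artemis code: `PoolStarvationDemo`, `cleanupCompletes`, `pagingLock` and the latch names are all hypothetical stand-ins. One pool task takes a lock and then waits for a completion callback that it queues on the same fixed pool, while "producer" tasks occupy the remaining threads by blocking on that lock. A timeout is used so the demo reports starvation instead of hanging forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Toy model of the reported hang; none of these names are Artemis classes.
final class PoolStarvationDemo {

    /**
     * Returns true if the cleanup task saw its completion callback run within
     * waitMillis, false if the callback was starved of a pool thread.
     */
    static boolean cleanupCompletes(int poolSize, int producers, long waitMillis)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        ReentrantLock pagingLock = new ReentrantLock();
        CountDownLatch lockHeld = new CountDownLatch(1);
        // Producers that can actually get a thread while cleanup holds one.
        CountDownLatch producersStarted =
                new CountDownLatch(Math.min(producers, poolSize - 1));
        CountDownLatch ioCompleted = new CountDownLatch(1);
        try {
            Future<Boolean> cleanup = pool.submit(() -> {
                pagingLock.lock(); // "exit paging" holds the store lock
                try {
                    lockHeld.countDown();
                    producersStarted.await();
                    // The completion callback is queued on the SAME pool...
                    pool.execute(ioCompleted::countDown);
                    // ...and the lock holder waits for it.
                    return ioCompleted.await(waitMillis, TimeUnit.MILLISECONDS);
                } finally {
                    pagingLock.unlock();
                }
            });
            lockHeld.await();
            for (int i = 0; i < producers; i++) {
                pool.execute(() -> {
                    producersStarted.countDown();
                    pagingLock.lock(); // blocks while cleanup holds the lock
                    pagingLock.unlock();
                });
            }
            return cleanup.get();
        } finally {
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }
}
```

With a pool of 4 and 10 producers, `cleanupCompletes(4, 10, 300)` should return false, because the callback sits in the queue behind tasks that cannot finish until the lock is released. With only 2 producers a pool thread stays free and the call should return true. The queued callback loosely corresponds to the pending `OperationContextImpl$1` tasks in the log above, under the assumption that they are waiting for a free pool thread.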
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374923#comment-16374923 ]

ASF GitHub Bot commented on ARTEMIS-1700:
-----------------------------------------

Github user clebertsuconic commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1894#discussion_r170356212

    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java ---
    @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception {

           beforeStart();

    -      singleThreadExecutor = executorFactory.getExecutor();
    +      ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction<ThreadFactory>() {
    +         @Override
    +         public ThreadFactory run() {
    +            return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader());
    +         }
    +      });
    +      singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory);
    --- End diff --

    Can you please close your PR? The executor factory already provides the same thing you suggest here. It won't fix anything; it will just create more threads and more problems.
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374921#comment-16374921 ]

ASF GitHub Bot commented on ARTEMIS-1700:
-----------------------------------------

Github user clebertsuconic commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1894#discussion_r170355937

    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java ---
    @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception {

           beforeStart();

    -      singleThreadExecutor = executorFactory.getExecutor();
    +      ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction<ThreadFactory>() {
    +         @Override
    +         public ThreadFactory run() {
    +            return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader());
    +         }
    +      });
    +      singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory);
    --- End diff --

    Nope, that's wrong. executorFactory.getExecutor() returns a one-thread executor backed by the pool. It won't always run on the same thread, but it will always be the same context.

    This patch is not valid. In what situation do you see a deadlock? HornetQ might be different. I would need a test to be able to accept a patch here. We should always reuse the threads from the pool.
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374489#comment-16374489 ]

Francesco Nigro commented on ARTEMIS-1700:
------------------------------------------

Hi, I've noticed in the dump that there is a TimedBuffer running and you're using the MAPPED journal, hence I suppose that you have datasync = true. Is that correct?
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374475#comment-16374475 ]

ASF GitHub Bot commented on ARTEMIS-1700:
-----------------------------------------

Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1894

    @qihongxu Thanks for the PR! IMO it would be better to provide a test or a reproducer that is fixed by this change.
    Re the changes: the `ArtemisExecutor` is a `fake` executor that just provides a queue of `Runnable`s that can be drained exclusively by one consumer thread at a time, which makes it a single-threaded executor, just not always on the same thread.
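The kind of executor described in this comment can be sketched in a few lines of plain Java. This is an assumption-laden illustration, not the actual `ArtemisExecutor` or `OrderedExecutor` source: `SerialExecutorSketch` and its members are hypothetical names. Tasks go into a queue, and at most one borrowed pool thread drains that queue at any moment, so tasks run one at a time and in order, but not necessarily on the same thread.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of a serial executor layered on a shared pool.
final class SerialExecutorSketch implements Executor {
    private final Executor pool; // shared pool that lends threads
    private final Queue<Runnable> tasks = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean draining = new AtomicBoolean(false);

    SerialExecutorSketch(Executor pool) {
        this.pool = pool;
    }

    @Override
    public void execute(Runnable task) {
        tasks.add(task);
        scheduleDrain();
    }

    // Hand the queue to the pool only if nobody is draining it already.
    private void scheduleDrain() {
        if (!tasks.isEmpty() && draining.compareAndSet(false, true)) {
            pool.execute(this::drain);
        }
    }

    // Runs on whichever pool thread picked it up; executes tasks in FIFO order.
    private void drain() {
        try {
            Runnable next;
            while ((next = tasks.poll()) != null) {
                next.run();
            }
        } finally {
            draining.set(false);
            // Re-check: a task may have been added after the last poll.
            scheduleDrain();
        }
    }
}
```

One consequence of this design, if the real executor behaves similarly, is that queued tasks only make progress when the shared pool has a free thread to run the drain loop.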
[jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state
[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374280#comment-16374280 ]

ASF GitHub Bot commented on ARTEMIS-1700:
-----------------------------------------

GitHub user qihongxu opened a pull request:

    https://github.com/apache/activemq-artemis/pull/1894

    ARTEMIS-1700 Fixed deadlock in paging state

    JournalStorageManager is not actually using a `single` thread. We apply this patch to use a simple single-thread executor.
    We have seen similar threads on the Internet. This seems to be a problem left over from HornetQ.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/qihongxu/activemq-artemis ARTEMIS-1700

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/activemq-artemis/pull/1894.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1894

commit 53daaa2de6dc8407d6c263493b6a2f50b8f5668d
Author: 17060606 <17060606@...>
Date:   2018-02-23T12:00:00Z

    ARTEMIS-1700 Fixed deadlock in paging state

    JournalStorageManager is not indeed using a `single` thread. We apply this patch to use a simple single thread executor.
    We have seen similar threads on Internet. This seems to be a remaining problem from hornetQ.