[GitHub] activemq-artemis pull request #1894: ARTEMIS-1700 Fixed deadlock in paging s...
Github user asfgit closed the pull request at: https://github.com/apache/activemq-artemis/pull/1894 ---
[GitHub] activemq-artemis pull request #1894: ARTEMIS-1700 Fixed deadlock in paging s...
Github user clebertsuconic commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1894#discussion_r170689156 --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java --- @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception { beforeStart(); - singleThreadExecutor = executorFactory.getExecutor(); + ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction() { + @Override + public ThreadFactory run() { +return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader()); + } + }); + singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory); --- End diff -- if you could please comment on #1899 ---
[GitHub] activemq-artemis pull request #1894: ARTEMIS-1700 Fixed deadlock in paging s...
Github user clebertsuconic commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1894#discussion_r170676481 --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java --- @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception { beforeStart(); - singleThreadExecutor = executorFactory.getExecutor(); + ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction() { + @Override + public ThreadFactory run() { +return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader()); + } + }); + singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory); --- End diff -- @shoukunhuai there's another pool to be used with IO.. iothreadpool.. this is using the wrong pool.. besides we shouldn't wait forever to stop... I'm sending another PR. ---
[GitHub] activemq-artemis pull request #1894: ARTEMIS-1700 Fixed deadlock in paging s...
Github user shoukunhuai commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1894#discussion_r170407785 --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java --- @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception { beforeStart(); - singleThreadExecutor = executorFactory.getExecutor(); + ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction() { + @Override + public ThreadFactory run() { +return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader()); + } + }); + singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory); --- End diff -- https://github.com/apache/activemq-artemis/blob/d6d895c558cc104475188d942473771418b5e3e6/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/group/impl/LocalGroupingHandler.java#L109-L117 @clebertsuconic the comment said we need an executor out side the pool ---
[GitHub] activemq-artemis pull request #1894: ARTEMIS-1700 Fixed deadlock in paging s...
Github user qihongxu commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1894#discussion_r170405758 --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java --- @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception { beforeStart(); - singleThreadExecutor = executorFactory.getExecutor(); + ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction() { + @Override + public ThreadFactory run() { +return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader()); + } + }); + singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory); --- End diff -- [artemis.log](https://github.com/apache/activemq-artemis/files/1753723/artemis.log) Please see the thread dump file in attachment. ---
[GitHub] activemq-artemis pull request #1894: ARTEMIS-1700 Fixed deadlock in paging s...
Github user shoukunhuai commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1894#discussion_r170405747 --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java --- @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception { beforeStart(); - singleThreadExecutor = executorFactory.getExecutor(); + ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction() { + @Override + public ThreadFactory run() { +return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader()); + } + }); + singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory); --- End diff -- We are running 2.4.0 See https://issues.apache.org/jira/browse/ARTEMIS-1700 There is a artemis.log attached. ---
[GitHub] activemq-artemis pull request #1894: ARTEMIS-1700 Fixed deadlock in paging s...
Github user clebertsuconic commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1894#discussion_r170405391 --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java --- @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception { beforeStart(); - singleThreadExecutor = executorFactory.getExecutor(); + ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction() { + @Override + public ThreadFactory run() { +return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader()); + } + }); + singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory); --- End diff -- Can I see a thread dump ? Maybe there is a better fix. A test would be best. But a thread dump would be ok as long as you tell me the version (or got commit) it is associated. ---
[GitHub] activemq-artemis pull request #1894: ARTEMIS-1700 Fixed deadlock in paging s...
Github user shoukunhuai commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1894#discussion_r170405300 --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java --- @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception { beforeStart(); - singleThreadExecutor = executorFactory.getExecutor(); + ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction() { + @Override + public ThreadFactory run() { +return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader()); + } + }); + singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory); --- End diff -- We believe this happens when - using fixed thread pool which is the default, and - an address has many producers, more then thread pool's size, and - the address is about to exit paging state ---
[GitHub] activemq-artemis pull request #1894: ARTEMIS-1700 Fixed deadlock in paging s...
Github user shoukunhuai commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1894#discussion_r170405082 --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java --- @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception { beforeStart(); - singleThreadExecutor = executorFactory.getExecutor(); + ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction() { + @Override + public ThreadFactory run() { +return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader()); + } + }); + singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory); --- End diff -- What if the pool is full? In our case, the pool is a 60 thread fixed pool. One of the thread is doing page cleanup, and try to exit paging state, it holds the lock in paging store. All other 59 threads is blocked on the lock, trying to page. While cleanup, we need to store bookmark in journal for each page subscription, then wait until completed. In log, stored equals to storeLineUp, but there are pending tasks(there are going to count down latch cleanup thread is waiting on), the deadlock happened. ``` 16:44:28,930 AMQ222024: Could not complete operations on IO context OperationContextImpl [1251391301] [minimalStore=1, storeLineUp=2, stored=2, minimalReplicated=0, replicationLineUp=0, replicated=0, paged=0, minimalPage=0, pageLineUp=0, errorCode=-1, errorMessage=null, executorsPending=3, executor=OrderedExecutor(tasks=[org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl$1@4d09259, org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl$1@54b73dc4, org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl$1@640495d4])] ``` ---
[GitHub] activemq-artemis pull request #1894: ARTEMIS-1700 Fixed deadlock in paging s...
Github user clebertsuconic commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1894#discussion_r170356212 --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java --- @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception { beforeStart(); - singleThreadExecutor = executorFactory.getExecutor(); + ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction() { + @Override + public ThreadFactory run() { +return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader()); + } + }); + singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory); --- End diff -- Can you please close your PR? The executor factory is providing the same thing you suggested here. it won't fix anything. .just will create more threads and problems. ---
[GitHub] activemq-artemis pull request #1894: ARTEMIS-1700 Fixed deadlock in paging s...
Github user clebertsuconic commented on a diff in the pull request: https://github.com/apache/activemq-artemis/pull/1894#discussion_r170355937 --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java --- @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception { beforeStart(); - singleThreadExecutor = executorFactory.getExecutor(); + ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction() { + @Override + public ThreadFactory run() { +return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader()); + } + }); + singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory); --- End diff -- nope.. that's wrong... executorFactory.getExecutor() is returning on thread executor from the pool. it won't always be the same thread.. but it will always be the same context.. this patch is not valid. in what situation do you see a deadlock. hornetq it might be different.. I would need a test to be able to accept a patch here. we should reuse the thread from the pool always. ---
[GitHub] activemq-artemis pull request #1894: ARTEMIS-1700 Fixed deadlock in paging s...
GitHub user qihongxu opened a pull request: https://github.com/apache/activemq-artemis/pull/1894 ARTEMIS-1700 Fixed deadlock in paging state JournalStorageManager is not indeed using a `single` thread. We apply this patch to use a simple single thread executor. We have seen similar threads on Internet. This seems to be a remaining problem from hornetQ. You can merge this pull request into a Git repository by running: $ git pull https://github.com/qihongxu/activemq-artemis ARTEMIS-1700 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/activemq-artemis/pull/1894.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1894 commit 53daaa2de6dc8407d6c263493b6a2f50b8f5668d Author: 17060606 <17060606@...> Date: 2018-02-23T12:00:00Z ARTEMIS-1700 Fixed deadlock in paging state JournalStorageManager is not indeed using a `single` thread. We apply this patch to use a simple single thread executor. We have seen similar threads on Internet. This seems to be a remaining problem from hornetQ. ---