[jira] [Commented] (ARTEMIS-2811) Component org.apache.activemq.artemis.core.io.buffer.TimedBuffer is expired on path 0
[ https://issues.apache.org/jira/browse/ARTEMIS-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17142854#comment-17142854 ]

daves commented on ARTEMIS-2811:
--------------------------------

[~jbertram] Thanks for your comment. I will look at the documentation.

I think the problem with the Windows service is that the service does not run the Java process directly but uses a service wrapper called "artemis-service.exe". I've used such wrappers before. My favorite is nssm [https://nssm.cc/], which as far as I know detects when the "hosted" process has exited and is able to signal this exit to the service manager.

Sadly, the broker is hosted in an environment not controlled by us. I don't have any option to change the monitoring or to check whether everything is OK with the filesystem…

I know you can't do anything about that either, but maybe it would be an option to take a look at "artemis-service.exe" and check whether there is an option to detect a stopped Java process?

> Component org.apache.activemq.artemis.core.io.buffer.TimedBuffer is expired on path 0
> -------------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-2811
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2811
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>    Affects Versions: 2.11.0
>         Environment: * Windows Server 2016
>                      * Artemis running as Windows Service
>            Reporter: daves
>            Assignee: Justin Bertram
>            Priority: Major
>         Attachments: broker.xml
>
>
> We run Artemis 2.11.0 on Windows Server 2016 as Windows Service. Suddenly Artemis stopped working. The Artemis process stopped but the Windows Service/Service wrapper was still running. We monitor all Services if they are running, but since the Artemis-Service was still running our monitoring did not detect that Artemis was not running anymore.
>
> # Is it possible to kill the Windows-Service together with the Artemis process? (would be very nice for monitoring etc.)
> # Is there a fix for this issue maybe in a newer version?
>
> Please see below stacktrace for more details. Please let me know if you need any additional information.
>
> {code:java}
> 2020-06-18 11:06:20,865 WARN [org.apache.activemq.artemis.utils.critical.CriticalMeasure] Component org.apache.activemq.artemis.core.io.buffer.TimedBuffer is expired on path 0
> 2020-06-18 11:06:20,865 WARN [org.apache.activemq.artemis.utils.critical.CriticalMeasure] Component org.apache.activemq.artemis.core.io.buffer.TimedBuffer is expired on path 0
> 2020-06-18 11:06:20,865 ERROR [org.apache.activemq.artemis.core.server] AMQ224079: The process for the virtual machine will be killed, as component org.apache.activemq.artemis.core.io.buffer.TimedBuffer@37d4349f is not responsive
> 2020-06-18 11:06:21,146 WARN [org.apache.activemq.artemis.core.server] AMQ222199: Thread dump:
> ***
> Complete Thread dump
> "qtp140224-69441" Id=69441 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@5b811449
>     at sun.misc.Unsafe.park(Native Method)
>     - waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@5b811449
>     at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
>     at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:564)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:49)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:627)
>     at java.lang.Thread.run(Unknown Source)
> "Thread-11562 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@3cc1435c)" Id=69440 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1e83815d
>     at sun.misc.Unsafe.park(Native Method)
>     - waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1e83815d
>     at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
>     at java.util.concurrent.LinkedBlockingQueue.poll(Unknown Source)
>     at org.apache.activemq.artemis.utils.ActiveMQThreadPoolExecutor$ThreadPoolQueue.poll(ActiveMQThreadPoolExecutor.java:112)
>     at org.apache.activemq.artemis.utils.ActiveMQThreadPoolExecutor$ThreadPoolQueue.poll(ActiveMQThreadPoolExecutor.java:45)
>     at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>     at java.util.concurr
[jira] [Commented] (ARTEMIS-2811) Component org.apache.activemq.artemis.core.io.buffer.TimedBuffer is expired on path 0
[ https://issues.apache.org/jira/browse/ARTEMIS-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17141156#comment-17141156 ]

Justin Bertram commented on ARTEMIS-2811:
-----------------------------------------

Without more details about your configuration it's hard to say for sure, but based on the logging it appears there was an issue with your environment. Specifically, it looks like the broker was not able to complete a disk IO operation in the allotted time. This caused the "critical analyzer" to halt the JVM process. You can [read more about the critical analyzer in the documentation|http://activemq.apache.org/components/artemis/documentation/latest/critical-analysis.html].

By default this is the configuration for the critical analyzer in {{broker.xml}}:
{code:xml}
<critical-analyzer>true</critical-analyzer>
<critical-analyzer-timeout>120000</critical-analyzer-timeout>
<critical-analyzer-check-period>60000</critical-analyzer-check-period>
<critical-analyzer-policy>HALT</critical-analyzer-policy>
{code}
I assume this is what you're using. This configuration means that every 60 seconds the critical analyzer will run and check if any "critical" operations have exceeded 120 seconds. In your case the {{org.apache.activemq.artemis.core.io.buffer.TimedBuffer}} took too long on "path 0." Path {{0}} for the {{TimedBuffer}} is a flush operation to store data on the disk.
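To make the timing concrete, here is a minimal Java sketch of a periodic timeout check using the 120-second timeout and 60-second check period described above. It is an illustration only and not the Artemis critical analyzer implementation: the class {{CriticalCheckSketch}}, the nested {{Measure}} type, and the simulated clock are all hypothetical names introduced for this example.

```java
/** Illustrative sketch only; NOT the Artemis critical analyzer implementation. */
final class CriticalCheckSketch {

    /** Hypothetical tracker for one critical operation, e.g. a buffer flush. */
    static final class Measure {
        private static final long IDLE = -1L;      // marks "no operation in progress"
        private volatile long enteredAtMillis = IDLE;

        /** Records the time at which the operation started. */
        void enter(long nowMillis) {
            enteredAtMillis = nowMillis;
        }

        /** Records that the operation finished. */
        void leave() {
            enteredAtMillis = IDLE;
        }

        /** True if the operation is still running and has run longer than timeoutMillis. */
        boolean isExpired(long nowMillis, long timeoutMillis) {
            long entered = enteredAtMillis;
            return entered != IDLE && nowMillis - entered > timeoutMillis;
        }
    }

    public static void main(String[] args) {
        long timeoutMillis = 120_000;     // timeout quoted in the comment above
        long checkPeriodMillis = 60_000;  // check period quoted in the comment above

        Measure flush = new Measure();
        flush.enter(1_000);               // simulated: flush starts at t = 1 s and never returns

        // Simulated periodic checks; a real watchdog would run on a scheduler.
        for (long now = checkPeriodMillis; now <= 180_000; now += checkPeriodMillis) {
            System.out.println("t=" + now + " ms expired=" + flush.isExpired(now, timeoutMillis));
        }
    }
}
```

In this simulation the stuck operation is first reported at the check at 180 seconds, because the check at 120 seconds sees an elapsed time of only 119 seconds. With any scheme of this shape, detection can lag the timeout by up to one check period.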
In the thread dump we can see:
{noformat}
"activemq-buffer-timeout" Id=15 RUNNABLE (in native)
    at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
    at sun.nio.ch.FileDispatcherImpl.force(Unknown Source)
    at sun.nio.ch.FileChannelImpl.force(Unknown Source)
    at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.sync(NIOSequentialFile.java:262)
    at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.doInternalWrite(NIOSequentialFile.java:391)
    at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.internalWrite(NIOSequentialFile.java:359)
    at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.access$100(NIOSequentialFile.java:43)
    at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile$SyncLocalBufferObserver.flushBuffer(NIOSequentialFile.java:434)
    at org.apache.activemq.artemis.core.io.buffer.TimedBuffer.flushBatch(TimedBuffer.java:361)
    - locked org.apache.activemq.artemis.core.io.buffer.TimedBuffer@37d4349f
    at org.apache.activemq.artemis.core.io.buffer.TimedBuffer.flush(TimedBuffer.java:338)
    at org.apache.activemq.artemis.core.io.buffer.TimedBuffer$CheckTimer.run(TimedBuffer.java:473)
    at java.lang.Thread.run(Unknown Source)
{noformat}
Spending 120 seconds trying to flush a disk write indicates a problem with your storage. At this point the broker will invoke [{{Runtime.getRuntime().halt()}}|https://docs.oracle.com/javase/8/docs/api/java/lang/Runtime.html#halt-int-]. As the JavaDoc states, this method "Forcibly terminates the currently running Java virtual machine." It's not clear what else could be done to help Windows recognize that the broker is dead. I would expect the Windows Service to terminate when the JVM is halted.

Perhaps you could monitor something that's more directly related to broker operation (e.g. whether you can connect to the broker's port(s)).

In conclusion, I don't see anything wrong with the broker at this point. I recommend you investigate the performance of your storage as well as alternative monitoring strategies.
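As one possible way to monitor the broker's port rather than the Windows service state, the following Java sketch attempts a plain TCP connection and reports the result through its exit code. The class name {{BrokerPortProbe}} is hypothetical, and the default host {{localhost}} and port {{61616}} are assumptions that should be replaced with whatever acceptor is configured in your {{broker.xml}}. A successful connection only shows that something is listening on the port; it does not prove the broker is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks whether a TCP connection to host:port can be opened. */
final class BrokerPortProbe {

    /** Returns true if a TCP connection succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not resolvable: treat as "down".
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults; pass the real host and acceptor port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 61616;

        boolean up = isReachable(host, port, 2000);
        System.out.println(host + ":" + port + (up ? " reachable" : " NOT reachable"));

        // A non-zero exit code lets an external monitor or scheduled task raise an alert.
        System.exit(up ? 0 : 1);
    }
}
```

Run on a schedule by the monitoring system, a check like this would flag the situation described in this issue, where the service wrapper keeps running after the JVM has been halted.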