Jiri Patera created AMQ-4835:
--------------------------------
Summary: JournalPersistenceAdapter: Persistence Adaptor Task
iterates "forever"
Key: AMQ-4835
URL: https://issues.apache.org/jira/browse/AMQ-4835
Project: ActiveMQ
Issue Type: Bug
Components: Message Store
Affects Versions: 5.7.0
Environment: JDK 1.6.0_45
Reporter: Jiri Patera
This erroneous behavior happens after a simple asynchronous message
send/receive test has passed and is about to exit. It happens only in case the
Journal usage is enabled via the following configuration property:
{code}
jms-engine.useJournal=true
{code}
The test shutdown process than takes about 5-10 minutes to complete. The test
result is then "passed".
The thread waits here:
{code}
org.apache.activemq.thread.PooledTaskRunner.shutdown(long):101
{code}
And will be notified when all the 1000 for-cycle iterations pass here:
{code}
org.apache.activemq.thread.PooledTaskRunner.runTask():128
{code}
And a log:
{code}
13.10.15 12:04:36:951 DEBUG [main] [PooledConnectionFactory ] Stopping
the PooledConnectionFactory, number of connections in cache: 0
13.10.15 12:04:36:952 DEBUG [ActiveMQ VMTransport:
vm://JmsEngineActivemqBroker#1-1] [ManagementContext ]
Unregistering MBean
org.apache.activemq:BrokerName=JmsEngineActivemqBroker,Type=Producer,destinationType=Dynamic,clientId=ID_localhost-51012-1381831459465-2_1,producerId=ID_localhost-51012-1381831459465-3_1_1_1
13.10.15 12:04:36:955 DEBUG [ActiveMQ VMTransport:
vm://JmsEngineActivemqBroker#1-1] [ManagementContext ]
Unregistering MBean
org.apache.activemq:BrokerName=JmsEngineActivemqBroker,Type=Producer,destinationType=Dynamic,clientId=ID_localhost-51012-1381831459465-2_1,producerId=ID_localhost-51012-1381831459465-3_1_3_1
13.10.15 12:04:36:955 DEBUG [ActiveMQ VMTransport:
vm://JmsEngineActivemqBroker#1-1] [ManagementContext ]
Unregistering MBean
org.apache.activemq:BrokerName=JmsEngineActivemqBroker,Type=Subscription,persistentMode=Non-Durable,destinationType=Topic,destinationName=ActiveMQ.Advisory.TempQueue_ActiveMQ.Advisory.TempTopic,clientId=ID_localhost-51012-1381831459465-2_1,consumerId=ID_localhost-51012-1381831459465-3_1_-1_1
13.10.15 12:04:36:955 DEBUG [ActiveMQ VMTransport:
vm://JmsEngineActivemqBroker#1-1] [AbstractRegion ]
JmsEngineActivemqBroker removing consumer:
ID:localhost-51012-1381831459465-3:1:-1:1 for destination:
ActiveMQ.Advisory.TempQueue,ActiveMQ.Advisory.TempTopic
13.10.15 12:04:36:956 DEBUG [ActiveMQ VMTransport:
vm://JmsEngineActivemqBroker#1-1] [TransportConnection ] remove
connection id: ID:localhost-51012-1381831459465-3:1
13.10.15 12:04:36:963 DEBUG [ActiveMQ BrokerService[JmsEngineActivemqBroker]
Task-1] [TransportConnection ] Stopping connection:
vm://JmsEngineActivemqBroker#0
13.10.15 12:04:36:964 DEBUG [ActiveMQ BrokerService[JmsEngineActivemqBroker]
Task-1] [ThreadPoolUtils ] Forcing shutdown of ExecutorService:
java.util.concurrent.ThreadPoolExecutor@22f62eba
13.10.15 12:04:36:964 DEBUG [ActiveMQ BrokerService[JmsEngineActivemqBroker]
Task-1] [TransportConnection ] Stopped transport:
vm://JmsEngineActivemqBroker#0
13.10.15 12:04:36:964 DEBUG [ActiveMQ BrokerService[JmsEngineActivemqBroker]
Task-1] [TransportConnection ] Connection Stopped:
vm://JmsEngineActivemqBroker#0
13.10.15 12:04:36:965 DEBUG [main] [ThreadPoolUtils ] Shutdown of
ExecutorService: java.util.concurrent.ThreadPoolExecutor@30721965 is shutdown:
true and terminated: false took: 0.000 seconds.
13.10.15 12:04:36:965 DEBUG [main] [ThreadPoolUtils ] Shutdown of
ExecutorService: java.util.concurrent.ThreadPoolExecutor@d0cc53e is shutdown:
true and terminated: true took: 0.000 seconds.
13.10.15 12:04:36:965 DEBUG [main] [ThreadPoolUtils ] Forcing
shutdown of ExecutorService: java.util.concurrent.ThreadPoolExecutor@24d37b87
13.10.15 12:04:36:965 DEBUG [main] [VMTransportFactory ] Shutting
down VM connectors for broker: JmsEngineActivemqBroker
13.10.15 12:04:36:965 INFO [main] [TransportConnector ] Connector
vm://JmsEngineActivemqBroker Stopped
13.10.15 12:04:36:966 INFO [main] [BrokerService ] Apache
ActiveMQ 5.7.0 (JmsEngineActivemqBroker, ID:localhost-51012-1381831459465-0:1)
is shutting down
13.10.15 12:04:38:002 INFO [main] [TransportConnector ] Connector
tcp://localhost:0 Stopped
13.10.15 12:04:38:021 DEBUG [main] [ManagementContext ]
Unregistering MBean
org.apache.activemq:BrokerName=JmsEngineActivemqBroker,Type=Topic,Destination=ActiveMQ.Advisory.Connection
13.10.15 12:04:38:022 DEBUG [main] [ManagementContext ]
Unregistering MBean
org.apache.activemq:BrokerName=JmsEngineActivemqBroker,Type=Queue,Destination=requestQueue
13.10.15 12:04:38:022 DEBUG [main] [ManagementContext ]
Unregistering MBean
org.apache.activemq:BrokerName=JmsEngineActivemqBroker,Type=Topic,Destination=ActiveMQ.Advisory.Queue
13.10.15 12:04:38:023 DEBUG [main] [ManagementContext ]
Unregistering MBean
org.apache.activemq:BrokerName=JmsEngineActivemqBroker,Type=Topic,Destination=ActiveMQ.Advisory.Consumer.Queue.requestQueue
13.10.15 12:04:38:023 DEBUG [main] [ManagementContext ]
Unregistering MBean
org.apache.activemq:BrokerName=JmsEngineActivemqBroker,Type=Queue,Destination=replyQueue
13.10.15 12:04:38:024 DEBUG [main] [ManagementContext ]
Unregistering MBean
org.apache.activemq:BrokerName=JmsEngineActivemqBroker,Type=Topic,Destination=ActiveMQ.Advisory.Consumer.Queue.replyQueue
13.10.15 12:04:38:025 INFO [main] [PListStore ]
PListStore:[/.../jms-engine-activemq/target/jms-engine-activemq-data-custom/broker/JmsEngineActivemqBroker/tmp_storage]
stopped
13.10.15 12:04:38:025 DEBUG [main] [JournalPersistenceAdapter ] Waking for
checkpoint to complete.
13.10.15 12:04:38:855 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Marking journal at: 0:3122
13.10.15 12:04:38:855 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint done.
13.10.15 12:04:38:855 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint started.
13.10.15 12:04:40:593 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Marking journal at: 0:3122
13.10.15 12:04:40:593 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint done.
13.10.15 12:04:40:593 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint started.
13.10.15 12:04:42:693 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Marking journal at: 0:3122
13.10.15 12:04:42:694 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint done.
13.10.15 12:04:42:694 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint started.
13.10.15 12:04:44:622 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Marking journal at: 0:3122
13.10.15 12:04:44:622 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint done.
13.10.15 12:04:44:622 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint started.
13.10.15 12:04:46:468 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Marking journal at: 0:3122
13.10.15 12:04:46:469 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint done.
13.10.15 12:04:46:469 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint started.
13.10.15 12:04:48:244 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Marking journal at: 0:3122
13.10.15 12:04:48:244 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint done.
13.10.15 12:04:48:244 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint started.
13.10.15 12:04:50:027 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Marking journal at: 0:3122
13.10.15 12:04:50:027 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint done.
13.10.15 12:04:50:027 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint started.
13.10.15 12:04:51:900 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Marking journal at: 0:3122
13.10.15 12:04:51:900 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint done.
13.10.15 12:04:51:900 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint started.
13.10.15 12:04:54:010 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Marking journal at: 0:3122
13.10.15 12:04:54:010 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint done.
13.10.15 12:04:54:011 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint started.
13.10.15 12:04:55:991 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Marking journal at: 0:3122
13.10.15 12:04:55:991 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint done.
13.10.15 12:04:55:991 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint started.
13.10.15 12:04:57:926 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Marking journal at: 0:3122
13.10.15 12:04:57:926 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint done.
13.10.15 12:04:57:927 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint started.
13.10.15 12:04:59:680 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Marking journal at: 0:3122
13.10.15 12:04:59:681 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint done.
13.10.15 12:04:59:681 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint started.
13.10.15 12:05:01:958 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Marking journal at: 0:3122
13.10.15 12:05:01:959 DEBUG [Persistence Adaptor Task-1]
[JournalPersistenceAdapter ] Checkpoint done.
.
.
.
{code}
This iterating makes the ActiveMQ to finish with success after a couple of
minutes (after the 1000 iterations or sometimes sooner if the DB operations are
fast enough to get their elapsed time under the 500 ms). So it does not really
block the applicacion "forever".
The task.iterate() method should return the {{false}} value, but the call to
the following method sets the {{JournalPersistenceAdapter.fullCheckPoint}}
always to {{true}}. The reason seems to be connected with a time delay which is
considerably larger when using a remote DB server in contrast with a local one.
The following method sets the {{JournalPersistenceAdapter.fullCheckPoint}} to
{{true}}:
{code}
org.apache.activemq.store.journal.JournalPersistenceAdapter.checkpoint(boolean,
boolean):352
{code}
There is a hard-coded constant in the ActiveMQ code:
{code}
org.apache.activeio.journal.active.JournalImpl.OVERFLOW_RENOTIFICATION_DELAY =
500;
{code}
When a set of persistence operations takes longer than 500 ms (which in our
use-case with a remote DB server does, but with a local DB does not), the
boolean flag {{JournalPersistenceAdapter.fullCheckPoint}} is set to {{true}}
and the for-cycle goes to the next iteration (out of 1000).
The 1000 iterations is a default value for the {{TaskRunnerFactory}} set in the
{{org.apache.activemq.broker.BrokerService.getPersistenceTaskRunnerFactory()}}
getter method. Altering this {{TaskRunnerFactory.maxIterationsPerRun}} value to
some lower number when creating the {{BrokerService}} instance can be
considered a workaround (but I am not sure if it does not break something else?)
The problem can be simulated by placing a {{sleep(500)}} to the right place in
the code to make sure the persistence operations always take longer than 500 ms:
{code}
private void rolloverCheck() throws IOException {
// See if we need to issue an overflow notification.
if (eventListener != null && file.isPastHalfActive()
&& overflowNotificationTime + OVERFLOW_RENOTIFICATION_DELAY <
System.currentTimeMillis() && !doingNotification ) {
doingNotification = true;
try {
// We need to send an overflow notification to free up
// some logFiles.
Location safeSpot =
file.getFirstRecordLocationOfSecondActiveLogFile();
eventListener.overflowNotification(safeSpot);
overflowNotificationTime = System.currentTimeMillis();
try {
Thread.sleep(500);
} catch (InterruptedException e1) {
e1.printStackTrace();
}
} finally {
doingNotification = false;
}
}
...
{code}
What is strange is that if the sleep is placed in the beginning of the method
body (or setting the {{JournalImpl.OVERFLOW_RENOTIFICATION_DELAY}} to some
really low value), a NPE occurs:
{code}
Caused by: java.io.IOException: Write failed: java.lang.NullPointerException
at
org.apache.activeio.journal.active.JournalImpl.write(JournalImpl.java:197)
at
org.apache.activeio.journal.active.JournalImpl.write(JournalImpl.java:158)
at
org.apache.activemq.store.journal.JournalPersistenceAdapter.deleteAllMessages(JournalPersistenceAdapter.java:679)
at
org.apache.activemq.broker.BrokerService.deleteAllMessages(BrokerService.java:1500)
at
org.apache.activemq.broker.BrokerService.doStartPersistenceAdapter(BrokerService.java:621)
at
org.apache.activemq.broker.BrokerService.startPersistenceAdapter(BrokerService.java:612)
at
org.apache.activemq.broker.BrokerService.start(BrokerService.java:577)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeCustomInitMethod(AbstractAutowireCapableBeanFactory.java:1544)
at
org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1485)
at
org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1417)
... 52 more
Caused by: java.lang.NullPointerException
at
org.apache.activemq.store.journal.JournalPersistenceAdapter.checkpoint(JournalPersistenceAdapter.java:356)
at
org.apache.activemq.store.journal.JournalPersistenceAdapter.overflowNotification(JournalPersistenceAdapter.java:333)
at
org.apache.activeio.journal.active.JournalImpl.rolloverCheck(JournalImpl.java:284)
at
org.apache.activeio.journal.active.JournalImpl.write(JournalImpl.java:184)
... 65 more
{code}
--
This message was sent by Atlassian JIRA
(v6.1#6144)