Rob,
Yeah, in general, I've been deleting the entire local store
directory in
between tests, since we're just testing it is not critical that the
messages
are persisted. (with the message queue down of course)
Interestingly, I
restarted the broker to try to fix this problem. It marked the
store as bad
and regenerated the index from the journal (about an hour+ with
~1.5-2m
messages).
The consumers started working again at that point, for a while.
After it
reached ~7200 dispatched messages, it really blew up. I got this
in the
wrapper.log:
INFO | jvm 1 | 2007/11/01 20:30:09 |
java.lang.NullPointerException
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.kaha.MessageIdMarshaller.writePayload
(MessageIdMarshaller.java:38)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.kaha.MessageIdMarshaller.writePayload
(MessageIdMarshaller.java:29)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.kaha.impl.index.hash.HashEntry.write
(HashEntry.java:93)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.kaha.impl.index.hash.HashPage.write
(HashPage.java:123)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.kaha.impl.index.hash.HashIndex.writeFullPage
(HashIndex.java:350)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.kaha.impl.index.hash.HashPageInfo.end
(HashPageInfo.java:98)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.kaha.impl.index.hash.HashBin.end(HashBin.java:299)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.kaha.impl.index.hash.HashBin.put(HashBin.java:153)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.kaha.impl.index.hash.HashIndex.store
(HashIndex.java:241)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.kaha.impl.container.MapContainerImpl.put
(MapContainerImpl.java:303)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.store.kahadaptor.KahaReferenceStore.addMessageRefe
rence(KahaReferenceStore.java:113)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.store.amq.AMQMessageStore$4.execute
(AMQMessageStore.java:341)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.util.TransactionTemplate.run
(TransactionTemplate.java:44)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.store.amq.AMQMessageStore.doAsyncWrite
(AMQMessageStore.java:330)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.store.amq.AMQMessageStore.asyncWrite
(AMQMessageStore.java:300)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.store.amq.AMQMessageStore$1.iterate
(AMQMessageStore.java:92)
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.thread.DedicatedTaskRunner.runTask
(DedicatedTaskRunner.java:9
INFO | jvm 1 | 2007/11/01 20:30:09 | at
org.apache.activemq.thread.DedicatedTaskRunner$1.run
(DedicatedTaskRunner.java:36)
After that stack trace, I start getting a lot more
NullPointerExceptions in
the logs, however, they don't have a corresponding stack trace:
INFO | jvm 1 | 2007/11/01 20:30:45 | WARN AMQMessageStore
- Message could not be added to long term store: null
INFO | jvm 1 | 2007/11/01 20:30:45 |
java.lang.NullPointerException
INFO | jvm 1 | 2007/11/01 20:30:45 | WARN AMQMessageStore
- Message could not be added to long term store: null
INFO | jvm 1 | 2007/11/01 20:30:45 |
java.lang.NullPointerException
Then, a different kind of exception, with stack trace:
INFO | jvm 1 | 2007/11/01 20:31:20 | ERROR QueueStorePrefetch
- Failed to fill batch
INFO | jvm 1 | 2007/11/01 20:31:20 |
java.lang.NullPointerException
INFO | jvm 1 | 2007/11/01 20:31:20 | ERROR IG
- Failed to page in more queue messages
INFO | jvm 1 | 2007/11/01 20:31:20 | java.lang.RuntimeException:
java.lang.NullPointerException
INFO | jvm 1 | 2007/11/01 20:31:20 | at
org.apache.activemq.broker.region.cursors.QueueStorePrefetch.hasNext
(QueueStorePrefetch.java:112)
INFO | jvm 1 | 2007/11/01 20:31:20 | at
org.apache.activemq.broker.region.cursors.StoreQueueCursor.hasNext
(StoreQueueCursor.java:129)
INFO | jvm 1 | 2007/11/01 20:31:20 | at
org.apache.activemq.broker.region.Queue.buildList(Queue.java:1022)
INFO | jvm 1 | 2007/11/01 20:31:20 | at
org.apache.activemq.broker.region.Queue.doPageIn(Queue.java:1005)
INFO | jvm 1 | 2007/11/01 20:31:20 | at
org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java:
1068)
INFO | jvm 1 | 2007/11/01 20:31:20 | at
org.apache.activemq.broker.region.Queue.iterate(Queue.java:941)
INFO | jvm 1 | 2007/11/01 20:31:20 | at
org.apache.activemq.thread.DedicatedTaskRunner.runTask
(DedicatedTaskRunner.java:98)
INFO | jvm 1 | 2007/11/01 20:31:20 | at
org.apache.activemq.thread.DedicatedTaskRunner$1.run
(DedicatedTaskRunner.java:36)
INFO | jvm 1 | 2007/11/01 20:31:20 | Caused by:
java.lang.NullPointerException
A couple other stack traces that look a lot like that, and then
tons of:
INFO | jvm 1 | 2007/11/01 20:31:20 | ERROR QueueStorePrefetch
- Failed to fill batch
INFO | jvm 1 | 2007/11/01 20:31:20 |
java.lang.NullPointerException
INFO | jvm 1 | 2007/11/01 20:31:20 | ERROR QueueStorePrefetch
- Failed to fill batch
INFO | jvm 1 | 2007/11/01 20:31:20 |
java.lang.NullPointerException
INFO | jvm 1 | 2007/11/01 20:31:20 | ERROR QueueStorePrefetch
- Failed to fill batch
(Sorry, I haven't searched JIRA/Google for this one yet so this may
be a
known issue)
I reran most of the test with the journaledJDBC connector but it's not
nearly as fast as the local store so I'm hoping to get this
working. :-)
I've made the indexBinSize change you've recommended and I'll re-
run the
tests tomorrow. BTW thanks for putting together that amq-message-
store page
on the wiki. Thanks for the assistance!
-Andy
rajdavies wrote:
Hi andy,
did you start with a completely new AMQ Store data directory ? - You
could delete the references store only - e.g. the Recovery section in
http://activemq.apache.org/amq-message-store.html
Also - as you are storing a few million messages - you might want to
change the number of indexBinSize - the default number of has bins -
parameter to something large - like 16000 (the default is 1024).
I'l see if I can reproduce your latest problem with the
RecoveryListenerAdaptor.
cheers,
Rob
http://rajdavies.blogspot.com/
On Nov 2, 2007, at 2:16 AM, Andrew Miller wrote:
Thanks for the fix! We're running the test now... Parameters:
SVN version: 591163
Client jar: Updated to 5.0-20071028.173739
We haven't seen any of the original IndexOutOfBounds exception
messages,
even after about 1.2 million Enqueues. However, the consumers
appear to be
stuck and unable to consume. We had a RecoveryListenerAdapter
message
show up in the log, though I'm not sure this correlates with the
inability
of the consumers to consume these messages.
2007-11-01 18:10:42,645 [68.194.21:40095] ERROR
RecoveryListenerAdapter
- Message id ID:rlab-0007.xxxx.com-48174-119396578373
6-0:2:1:1:11 could not be recovered from the data store!
I even wrote a little perl client that does nothing but subscribe
to this
queue, consumes one message, and exits. The script just blocks and
then
times out, even though there are many messages in the queue:
rlab-0007:/opt/activemq/bin# ./query -QQueue=IG
ACTIVEMQ_HOME: /opt/activemq
ACTIVEMQ_BASE: /opt/activemq
Type = Queue
DispatchCount = 213
Destination = IG
MaxEnqueueTime = 1371
QueueSize = 926441
Name = IG
DequeueCount = 213
MemoryPercentageUsed = 0
ConsumerCount = 19
MemoryLimit = 30198988
EnqueueCount = 1190424
MinEnqueueTime = 1
AverageEnqueueTime = 196.80751173708921
Here's the one-off perl script and the output: (I assume there's no
reason I
would be unable to consume a message sent via JMS in stomp/perl?)
(redacting script since my previous mail on this subject got
rejected for
being too spammy - suffice to say it simply subscribes to the
queue in
question and does a stomp->receive_frame())
[EMAIL PROTECTED]:/opt/amiller$ ./c1.pl
connected
could not receive message after 20 seconds at ./c1.pl line 17,
<GEN0> line
4.
We also had this problem with the 20071031 snapshot, but at that
time I
thought it was because we were using the 4.1.1 client jars. BTW
even though
we didn't see this issue in 20071002, we needed to upgrade to solve
AMQ-1479.
Thanks again!
-Andy
rajdavies wrote:
There's a potential fix for this I've just added to trunk -
would it
be possible to build the latest and test ?
cheers,
Rob
http://rajdavies.blogspot.com/
On Nov 1, 2007, at 9:47 PM, Andrew Miller wrote:
We've been using ActiveMQ in an infrastructure refresh project,
and
have run
into a nasty looking bug, it looks like we are dropping
messages. We have
a few fast producers that will produce approximately 4-5 million
persistent
messages over the course of about 2.5 hours, while many slower
consumers
will consume these messages over the course of a day or so.
We are using Jencks/Spring for the producers and consumers, using
the
openwire protocol. For this test, we were using the AMQ 4.1.1
client jars,
if that matters. (We will try updating to the new AMQ5 client
jars
today)
About halfway through the production run, we started getting
IndexOutOfBounds exceptions in the log pretty frequently.
Later on
the
consumer tries to get these messages but appears to fail.
(exception messages trimmed)
--
View this message in context: http://www.nabble.com/
IndexOutOfBoundsException-in-AMQMessageStore-using-20071028.173739-
tf4734264s2354.html#a13541167
Sent from the ActiveMQ - Dev mailing list archive at Nabble.com.
--
View this message in context: http://www.nabble.com/
IndexOutOfBoundsException-in-AMQMessageStore-using-20071028.173739-
tf4734264s2354.html#a13543207
Sent from the ActiveMQ - Dev mailing list archive at Nabble.com.