How would a journal file go 'missing'?

We run AMQ with the file-based store, KahaDB. A few months ago we had an
incident in which some KahaDB journal files (*.log) were deleted or went
missing; I never found the root cause. As a result, consumers could not get
their messages, because the broker could not find the next message in the
queue, and every subsequent message got stuck behind it. At first we had no
choice but to delete all the KahaDB files, including the indexes, just to get
the broker working again. The downside, of course, is that we lost all the
messages that were in the store at the time. Here is the Java stack trace we
got when we hit the issue:

java.io.IOException: Could not locate data file /.../KahaDB/db-4875.log
        at org.apache.kahadb.journal.Journal.getDataFile(Journal.java:345)
        at org.apache.kahadb.journal.Journal.read(Journal.java:592)
        at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:786)
        at org.apache.activemq.store.kahadb.KahaDBStore.loadMessage(KahaDBStore.java:956)
        at org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore$5.execute(KahaDBStore.java:494)
        at org.apache.kahadb.page.Transaction.execute(Transaction.java:728)
        at org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.recoverNextMessages(KahaDBStore.java:485)
        at org.apache.activemq.store.ProxyMessageStore.recoverNextMessages(ProxyMessageStore.java:88)
        at org.apache.activemq.broker.region.cursors.QueueStorePrefetch.doFillBatch(QueueStorePrefetch.java:97)
        at org.apache.activemq.broker.region.cursors.AbstractStoreCursor.fillBatch(AbstractStoreCursor.java:262)
        at org.apache.activemq.broker.region.cursors.AbstractStoreCursor.reset(AbstractStoreCursor.java:110)
        at org.apache.activemq.broker.region.cursors.StoreQueueCursor.reset(StoreQueueCursor.java:157)
        at org.apache.activemq.broker.region.Queue.doPageIn(Queue.java:1678)
        at org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java:1898)
        at org.apache.activemq.broker.region.Queue.doBrowse(Queue.java:968)
        at org.apache.activemq.broker.region.Queue.expireMessages(Queue.java:772)
        at org.apache.activemq.broker.region.Queue.access$100(Queue.java:83)
        at org.apache.activemq.broker.region.Queue$2.run(Queue.java:123)
        at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)

This is how we addressed the issue, so that subsequent messages no longer get
stuck and we lose only the few messages that were in the offending journal
file(s). This is the Spring bean way; a similar setting is available in plain
AMQ configuration as well.

<bean ... class="org.apache.activemq.broker.BrokerService">
  ...
  <property name="persistenceAdapter">
    <bean class="org.apache.activemq.store.kahadb.KahaDBPersistenceAdapter">
      <property name="directory" value="${broker.dataDirectory}"/>
      <property name="ignoreMissingJournalfiles" value="true"/>
    </bean>
  </property>
  ...
</bean>
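
For anyone not using Spring beans, the following is a minimal sketch of how the
same setting might look in a plain activemq.xml. It is not taken from our
setup: the brokerName value and the ${activemq.data} placeholder are
assumptions based on the stock configuration file, so adjust them to your own
installation.

```xml
<!-- Sketch only: brokerName and ${activemq.data} are assumed values,
     not taken from the original setup. -->
<broker xmlns="http://activemq.apache.org/schema/core"
        brokerName="localhost"
        dataDirectory="${activemq.data}">
  <persistenceAdapter>
    <!-- Same flag as the Spring property above: the broker skips
         references to journal files it cannot find instead of failing. -->
    <kahaDB directory="${activemq.data}/kahadb"
            ignoreMissingJournalfiles="true"/>
  </persistenceAdapter>
</broker>
```

As with the Spring version, any messages that lived in the missing journal
file(s) are still lost; the flag only keeps the rest of the queue moving.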


Regards,

Barry


