Ken Barber created AMQ-4357:
-------------------------------
Summary: Corrupt KahaDB Index may cause EOFException on first
message
Key: AMQ-4357
URL: https://issues.apache.org/jira/browse/AMQ-4357
Project: ActiveMQ
Issue Type: Bug
Components: Message Store
Affects Versions: 5.8.0
Environment: Java 1.6.0 most releases, tested on lots of different
hardware, tested on Linux distros only.
Reporter: Ken Barber
Similar to AMQ-4339
If the index is corrupted, ActiveMQ may receive an EOFException when the first
message is published. The easiest way to replicate this for us has been to
purposely put junk into a journal as so:
truncate -s 32700 db.data
The exception in this case has been (this error is specific to AMQ 5.6.0, so
lines may mismatch on 5.8.0, but similar problem was confirmed on 5.8.0):
{code}
2013-03-01 16:05:06,451 ERROR [kahadb.MessageDatabase] KahaDB failed to store
to Journal
java.io.EOFException
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:399)
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
at org.apache.kahadb.page.PageFile.readPage(PageFile.java:867)
at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:446)
at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:437)
at
org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:434)
at org.apache.kahadb.page.Transaction.load(Transaction.java:410)
at org.apache.kahadb.page.Transaction.load(Transaction.java:367)
at org.apache.kahadb.index.BTreeIndex.loadNode(BTreeIndex.java:262)
at org.apache.kahadb.index.BTreeIndex.getRoot(BTreeIndex.java:174)
at org.apache.kahadb.index.BTreeIndex.put(BTreeIndex.java:189)
at
org.apache.activemq.store.kahadb.MessageDatabase.upadateIndex(MessageDatabase.java:1197)
at
org.apache.activemq.store.kahadb.MessageDatabase$14.execute(MessageDatabase.java:1027)
at org.apache.kahadb.page.Transaction.execute(Transaction.java:769)
at
org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:1025)
at
org.apache.activemq.store.kahadb.MessageDatabase$13.visit(MessageDatabase.java:972)
at
org.apache.activemq.store.kahadb.data.KahaAddMessageCommand.visit(KahaAddMessageCommand.java:241)
at
org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:969)
at
org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:881)
at
org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:863)
at
org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.addMessage(KahaDBStore.java:431)
at
org.apache.activemq.store.kahadb.KahaDBStore$StoreQueueTask.run(KahaDBStore.java:1177)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:680)
2013-03-01 16:05:06,743 WARN [server.AbstractHttpConnection] /v2/commands/
org.springframework.jms.UncategorizedJmsException: Uncategorized exception
occured during JMS processing; nested exception is javax.jms.JMSException:
java.io.EOFException
at
org.springframework.jms.support.JmsUtils.convertJmsAccessException(JmsUtils.java:316)
at
org.springframework.jms.support.JmsAccessor.convertJmsAccessException(JmsAccessor.java:168)
at
org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:469)
at org.springframework.jms.core.JmsTemplate.send(JmsTemplate.java:543)
at
org.springframework.jms.core.JmsTemplate.convertAndSend(JmsTemplate.java:653)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28)
at clamq.jms$jms_producer$reify__3138.publish(jms.clj:29)
at clamq.jms$jms_producer$reify__3138.publish(jms.clj:30)
at
clamq.protocol.producer$eval3094$fn__3095$G__3086__3103.invoke(producer.clj:3)
at
clamq.protocol.producer$eval3094$fn__3095$G__3085__3112.invoke(producer.clj:3)
at clojure.lang.AFn.applyToHelper(AFn.java:167)
at clojure.lang.AFn.applyTo(AFn.java:151)
at clojure.core$apply.invoke(core.clj:603)
at com.puppetlabs.mq$connect_and_publish_BANG_.doInvoke(mq.clj:136)
at clojure.lang.RestFn.invoke(RestFn.java:439)
at
com.puppetlabs.puppetdb.command$enqueue_raw_command_BANG_$fn__3754.invoke(command.clj:254)
at
com.puppetlabs.puppetdb.command$enqueue_raw_command_BANG_.invoke(command.clj:253)
at
com.puppetlabs.puppetdb.http.v1.command$enqueue_command.invoke(command.clj:22)
at
com.puppetlabs.middleware$verify_accepts_content_type$fn__4240.invoke(middleware.clj:67)
at
com.puppetlabs.middleware$verify_checksum$fn__4249.invoke(middleware.clj:102)
at
com.puppetlabs.middleware$verify_param_exists$fn__4244.invoke(middleware.clj:79)
at
com.puppetlabs.puppetdb.http.v1.command$command_app.invoke(command.clj:27)
at com.puppetlabs.puppetdb.http.v2$v2_app$fn__6640.invoke(v2.clj:12)
at net.cgrand.moustache$alter_request$fn__4450.invoke(moustache.clj:54)
at com.puppetlabs.puppetdb.http.v2$v2_app.invoke(v2.clj:12)
at
com.puppetlabs.puppetdb.http.server$routes$fn__7042.invoke(server.clj:27)
at net.cgrand.moustache$alter_request$fn__4450.invoke(moustache.clj:54)
at com.puppetlabs.puppetdb.http.server$routes.invoke(server.clj:27)
at
ring.middleware.resource$wrap_resource$fn__7022.invoke(resource.clj:14)
at ring.middleware.params$wrap_params$fn__4400.invoke(params.clj:55)
at
com.puppetlabs.middleware$wrap_with_authorization$fn__4226.invoke(middleware.clj:21)
at
com.puppetlabs.middleware$wrap_with_certificate_cn$fn__4230.invoke(middleware.clj:36)
at
com.puppetlabs.middleware$wrap_with_default_body$fn__4233.invoke(middleware.clj:43)
at
com.puppetlabs.middleware$wrap_with_metrics_STAR_$fn__4253$fn__4254.invoke(middleware.clj:119)
at
com.puppetlabs.middleware.proxy$java.lang.Object$Callable$f8c5758f.call(Unknown
Source)
at com.yammer.metrics.core.Timer.time(Timer.java:91)
at
com.puppetlabs.middleware$wrap_with_metrics_STAR_$fn__4253.invoke(middleware.clj:117)
at
com.puppetlabs.middleware$wrap_with_globals$fn__4236.invoke(middleware.clj:54)
at ring.adapter.jetty$proxy_handler$fn__4071.invoke(jetty.clj:18)
at
ring.adapter.jetty.proxy$org.eclipse.jetty.server.handler.AbstractHandler$0.handle(Unknown
Source)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
at org.eclipse.jetty.server.Server.handle(Server.java:349)
at
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:452)
at
org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:894)
at
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:948)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:857)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at
org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:76)
at
org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:609)
at
org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:45)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
at java.lang.Thread.run(Thread.java:680)
Caused by: javax.jms.JMSException: java.io.EOFException
at
org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:49)
at
org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1362)
at
org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1290)
at org.apache.activemq.ActiveMQSession.send(ActiveMQSession.java:1785)
at
org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:277)
at
org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:212)
at
org.apache.activemq.ActiveMQMessageProducerSupport.send(ActiveMQMessageProducerSupport.java:269)
at
org.springframework.jms.connection.CachedMessageProducer.send(CachedMessageProducer.java:117)
at org.springframework.jms.core.JmsTemplate.doSend(JmsTemplate.java:592)
at org.springframework.jms.core.JmsTemplate.doSend(JmsTemplate.java:569)
at
org.springframework.jms.core.JmsTemplate$4.doInJms(JmsTemplate.java:546)
at
org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:466)
... 55 more
Caused by: java.util.concurrent.ExecutionException: java.io.EOFException
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at org.apache.activemq.broker.region.Queue.doMessageSend(Queue.java:785)
at org.apache.activemq.broker.region.Queue.send(Queue.java:707)
at
org.apache.activemq.broker.region.AbstractRegion.send(AbstractRegion.java:407)
at
org.apache.activemq.broker.region.RegionBroker.send(RegionBroker.java:503)
at
org.apache.activemq.broker.jmx.ManagedRegionBroker.send(ManagedRegionBroker.java:305)
at org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:129)
at
org.apache.activemq.broker.scheduler.SchedulerBroker.send(SchedulerBroker.java:189)
at org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:129)
at
org.apache.activemq.broker.CompositeDestinationBroker.send(CompositeDestinationBroker.java:96)
at
org.apache.activemq.broker.TransactionBroker.send(TransactionBroker.java:306)
at
org.apache.activemq.broker.MutableBrokerFilter.send(MutableBrokerFilter.java:135)
at
org.apache.activemq.broker.TransportConnection.processMessage(TransportConnection.java:453)
at
org.apache.activemq.command.ActiveMQMessage.visit(ActiveMQMessage.java:681)
at
org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:292)
at
org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:150)
at
org.apache.activemq.transport.ResponseCorrelator.onCommand(ResponseCorrelator.java:116)
at
org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)
at
org.apache.activemq.transport.vm.VMTransport.iterate(VMTransport.java:231)
at
org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122)
at
org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
... 1 more
Caused by: java.io.EOFException
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:399)
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
at org.apache.kahadb.page.PageFile.readPage(PageFile.java:867)
at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:446)
at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:437)
at
org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:434)
at org.apache.kahadb.page.Transaction.load(Transaction.java:410)
at org.apache.kahadb.page.Transaction.load(Transaction.java:367)
at org.apache.kahadb.index.BTreeIndex.loadNode(BTreeIndex.java:262)
at org.apache.kahadb.index.BTreeIndex.getRoot(BTreeIndex.java:174)
at org.apache.kahadb.index.BTreeIndex.put(BTreeIndex.java:189)
at
org.apache.activemq.store.kahadb.MessageDatabase.upadateIndex(MessageDatabase.java:1197)
at
org.apache.activemq.store.kahadb.MessageDatabase$14.execute(MessageDatabase.java:1027)
at org.apache.kahadb.page.Transaction.execute(Transaction.java:769)
at
org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:1025)
at
org.apache.activemq.store.kahadb.MessageDatabase$13.visit(MessageDatabase.java:972)
at
org.apache.activemq.store.kahadb.data.KahaAddMessageCommand.visit(KahaAddMessageCommand.java:241)
at
org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:969)
at
org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:881)
at
org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:863)
at
org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.addMessage(KahaDBStore.java:431)
at
org.apache.activemq.store.kahadb.KahaDBStore$StoreQueueTask.run(KahaDBStore.java:1177)
... 3 more
{code}
At the moment the only work-around I have is to blank out the db.data and let
it get recreated from the journal.
I just want to stress that the corruption wasn't caused by KahaDB, at least we
haven't seen any cases. The corruptions were either caused by: disk filling up,
or bad copies when migrating directories etc.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira