Mark created AMQ-7041:
-------------------------

             Summary: Failed to load message under load in GlusterFS setup
                 Key: AMQ-7041
                 URL: https://issues.apache.org/jira/browse/AMQ-7041
             Project: ActiveMQ
          Issue Type: Bug
    Affects Versions: 5.15.2
         Environment: Two nodes master and slave.

GlusterFS is configured to replicate the journal across nodes. Only one node is 
active and writing.

 

Relevant from activemq.xml
{noformat}
<destinationPolicy>
            <policyMap>
                <policyEntries>

                    <policyEntry queue=">"
                                 memoryLimit="40mb"
                                 producerFlowControl="true"
                                 optimizedDispatch="true">
                    </policyEntry>

                    <policyEntry topic=">"
                                 memoryLimit="600mb"
                                 producerFlowControl="false"
                                 optimizedDispatch="true">
                    </policyEntry>


                </policyEntries>
            </policyMap>
        </destinationPolicy>


<persistenceAdapter>
            <kahaDB directory="${persistence.brokerData}"
                    enableJournalDiskSyncs="true"
                    ignoreMissingJournalfiles="true"
                    checkForCorruptJournalFiles="true"
                    checksumJournalFiles="true">
                <locker>
                    <shared-file-locker lockAcquireSleepInterval="3000"/>
                </locker>
            </kahaDB>
</persistenceAdapter>{noformat}
            Reporter: Mark


Hi,

 

I am trying to solve the issue that with ActiveMQ in GlusterFS setup. I tend to 
blame GlusterFS because without it, ActiveMQ performs well under the same load. 
However, I would like to understand why it might fail with GlusterFS or maybe 
some parts could be tuned to play nicely. Another point to mention is that 
GlusterFS doesn't report any errors whatsoever when this issue happens. Would 
be great to understand what the error is about.

There is not way to reproduce it with 100% probability, however there are steps 
that are likely to cause the issue:
 # add several (10+) consumers for a queue
 # add several (100+) async producers to the queue to produce very high load 
spike around 50-100k msg/s

 
{noformat}
2018-08-23 16:24:08,506 | ERROR | Failed to load message at: 181:32956327 | 
org.apache.activemq.store.kahadb.KahaDBStore | ActiveMQ BrokerService[AMQ01] 
Task-236
java.io.IOException: Unexpected error on journal read at: 181:32956327
        at 
org.apache.activemq.util.IOExceptionSupport.create(IOExceptionSupport.java:28)
        at 
org.apache.activemq.store.kahadb.KahaDBStore.loadMessage(KahaDBStore.java:1260)
        at 
org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore$4.execute(KahaDBStore.java:594)
        at 
org.apache.activemq.store.kahadb.disk.page.Transaction.execute(Transaction.java:779)
        at 
org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.recoverNextMessages(KahaDBStore.java:583)
        at 
org.apache.activemq.store.ProxyMessageStore.recoverNextMessages(ProxyMessageStore.java:110)
        at 
org.apache.activemq.broker.region.cursors.QueueStorePrefetch.doFillBatch(QueueStorePrefetch.java:127)
        at 
org.apache.activemq.broker.region.cursors.AbstractStoreCursor.fillBatch(AbstractStoreCursor.java:448)
        at 
org.apache.activemq.broker.region.cursors.AbstractStoreCursor.reset(AbstractStoreCursor.java:168)
        at 
org.apache.activemq.broker.region.cursors.StoreQueueCursor.reset(StoreQueueCursor.java:169)
        at 
org.apache.activemq.broker.region.Queue.doPageInForDispatch(Queue.java:1981)
        at 
org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java:2210)
        at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1644)
        at 
org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:133)
        at 
org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:48)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Could not load journal record, unexpected 
command type: KAHA_TRACE_COMMAND at location: 181:32956327
        at 
org.apache.activemq.store.kahadb.KahaDBStore.loadMessage(KahaDBStore.java:1252)
        ... 16 more
2018-08-23 16:24:08,509 | INFO  | Stopping BrokerService[AMQ01] due to 
exception, java.io.IOException: Unexpected error on journal read at: 
181:32956327 | org.apache.activemq.util.DefaultIOExceptionHandler | ActiveMQ 
BrokerService[AMQ01] Task-236
java.io.IOException: Unexpected error on journal read at: 181:32956327
        at 
org.apache.activemq.util.IOExceptionSupport.create(IOExceptionSupport.java:28)
        at 
org.apache.activemq.store.kahadb.KahaDBStore.loadMessage(KahaDBStore.java:1260)
        at 
org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore$4.execute(KahaDBStore.java:594)
        at 
org.apache.activemq.store.kahadb.disk.page.Transaction.execute(Transaction.java:779)
        at 
org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.recoverNextMessages(KahaDBStore.java:583)
        at 
org.apache.activemq.store.ProxyMessageStore.recoverNextMessages(ProxyMessageStore.java:110)
        at 
org.apache.activemq.broker.region.cursors.QueueStorePrefetch.doFillBatch(QueueStorePrefetch.java:127)
        at 
org.apache.activemq.broker.region.cursors.AbstractStoreCursor.fillBatch(AbstractStoreCursor.java:448)
        at 
org.apache.activemq.broker.region.cursors.AbstractStoreCursor.reset(AbstractStoreCursor.java:168)
        at 
org.apache.activemq.broker.region.cursors.StoreQueueCursor.reset(StoreQueueCursor.java:169)
        at 
org.apache.activemq.broker.region.Queue.doPageInForDispatch(Queue.java:1981)
        at 
org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java:2210)
        at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1644)
        at 
org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:133)
        at 
org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:48)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Could not load journal record, unexpected 
command type: KAHA_TRACE_COMMAND at location: 181:32956327
        at 
org.apache.activemq.store.kahadb.KahaDBStore.loadMessage(KahaDBStore.java:1252)
        ... 16 more
2018-08-23 16:24:08,511 | ERROR | 
org.apache.activemq.broker.region.cursors.QueueStorePrefetch@ef6b8e9:TEST,batchResetNeeded=false,size=654,cacheEnabled=false,maxBatchSize:200,hasSpace:true,pendingCachedIds.size:0,lastSyncCachedId:null,lastSyncCachedId-seq:null,lastAsyncCachedId:null,lastAsyncCachedId-seq:null,store=permits:10000,sd=nextSeq:19935,lastRet:MessageOrderCursor:[def:19165,
 low:19163, high:19163],pending:116 - Failed to fill batch | 
org.apache.activemq.broker.region.cursors.AbstractStoreCursor | ActiveMQ 
BrokerService[AMQ01] Task-236
org.apache.activemq.broker.SuppressReplyException: ShutdownBrokerInitiated
        at 
org.apache.activemq.util.DefaultIOExceptionHandler.handle(DefaultIOExceptionHandler.java:165)
        at 
org.apache.activemq.broker.BrokerService.handleIOException(BrokerService.java:2772)
        at 
org.apache.activemq.store.kahadb.KahaDBStore.loadMessage(KahaDBStore.java:1262)
        at 
org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore$4.execute(KahaDBStore.java:594)
        at 
org.apache.activemq.store.kahadb.disk.page.Transaction.execute(Transaction.java:779)
        at 
org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.recoverNextMessages(KahaDBStore.java:583)
        at 
org.apache.activemq.store.ProxyMessageStore.recoverNextMessages(ProxyMessageStore.java:110)
        at 
org.apache.activemq.broker.region.cursors.QueueStorePrefetch.doFillBatch(QueueStorePrefetch.java:127)
        at 
org.apache.activemq.broker.region.cursors.AbstractStoreCursor.fillBatch(AbstractStoreCursor.java:448)
        at 
org.apache.activemq.broker.region.cursors.AbstractStoreCursor.reset(AbstractStoreCursor.java:168)
        at 
org.apache.activemq.broker.region.cursors.StoreQueueCursor.reset(StoreQueueCursor.java:169)
        at 
org.apache.activemq.broker.region.Queue.doPageInForDispatch(Queue.java:1981)
        at 
org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java:2210)
        at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1644)
        at 
org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:133)
        at 
org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:48)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Unexpected error on journal read at: 
181:32956327
        at 
org.apache.activemq.util.IOExceptionSupport.create(IOExceptionSupport.java:28)
        at 
org.apache.activemq.store.kahadb.KahaDBStore.loadMessage(KahaDBStore.java:1260)
        ... 16 more
Caused by: java.io.IOException: Could not load journal record, unexpected 
command type: KAHA_TRACE_COMMAND at location: 181:32956327
        at 
org.apache.activemq.store.kahadb.KahaDBStore.loadMessage(KahaDBStore.java:1252)
        ... 16 more
2018-08-23 16:24:08,511 | INFO  | Apache ActiveMQ 5.15.2 (AMQ01, ***) is 
shutting down | org.apache.activemq.broker.BrokerService | IOExceptionHandler: 
stopping BrokerService[AMQ01]{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to