Ben D'Herville created AMQ-7108:
-----------------------------------

             Summary: Scheduled job scheduling occasionally fails with 
java.lang.ClassCastException: 
org.apache.activemq.store.kahadb.data.KahaTraceCommand cannot be cast to 
org.apache.activemq.store.kahadb.data.KahaAddScheduledJobCommand
                 Key: AMQ-7108
                 URL: https://issues.apache.org/jira/browse/AMQ-7108
             Project: ActiveMQ
          Issue Type: Bug
          Components: Job Scheduler, KahaDB
    Affects Versions: 5.11.1
         Environment: *OS*: Linux ip-10-88-2-47 3.13.0-125-generic #174-Ubuntu 
SMP Mon Jul 10 18:51:24 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

*AMQ*: 5.11.1

*glusterfs*: 3.8.15

*java*: 1.8.0_144-b01

 
            Reporter: Ben D'Herville


*Issue*

We have been running the stack with the same stated versions mentioned in this 
ticket's environment details for over a year.  Recently we have experienced 
ActiveMQ outages as a result of the exception below.  There has been a increase 
in load but not significant.

Scheduled messages are being used as a polling mechanism in a workflow system.  
A workflow will execute a step then schedule a small object message, containing 
just an identifier, to check for a result in a number of seconds.  The consumer 
of the message will schedule another message if the task is not complete yet.  
We also have a batch process that can start a number of workflows concurrently. 
 Usually in the order of 100's which means 100's of scheduled messages can be 
created concurrently.

The most recent failure occurred with a batch of only 64.

We haven't noticed any data loss as a result of this but it doesn't cause a 
complete outage of our production system.

*Recovery*

The only way we have been able to recover from this error is to restart the 
broker.  kahadb is configured to check for corrupt journal which would recover 
the journal on startup.  It is unclear if the journal is corrupted though.

*Reproduction*

We have tried to reproduce this issue using the ActiveMQ performance maven 
plugin but was unable to.  We had 10000 producers concurrently schedule 500 
messages each.  

 

Note there are two stack traces here.  The second one is possibly a side-effect 
of the first one.
2018-Nov-21 15:16:48 JobScheduler:JMS 
[org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl] ERROR: JMS Failed 
to schedule job
java.lang.ClassCastException: 
org.apache.activemq.store.kahadb.data.KahaTraceCommand cannot be cast to 
org.apache.activemq.store.kahadb.data.KahaAddScheduledJobCommand
        at 
org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl.getPayload(JobSchedulerStoreImpl.java:529)
        at 
org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl.fireJob(JobSchedulerImpl.java:789)
        at 
org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl.mainLoop(JobSchedulerImpl.java:720)
        at 
org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl.run(JobSchedulerImpl.java:673)
        at java.lang.Thread.run(Thread.java:748)

2018-Nov-21 15:24:45 ActiveMQ Transport: tcp:///10.88.2.113:54256@61616 
[org.apache.activemq.transaction.LocalTransaction] WARN: POST COMMIT 
FAILED:java.lang.NullPointerException
        at 
org.apache.activemq.store.kahadb.AbstractKahaDBStore.store(AbstractKahaDBStore.java:482)
        at 
org.apache.activemq.store.kahadb.AbstractKahaDBStore.store(AbstractKahaDBStore.java:394)
        at 
org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl.doSchedule(JobSchedulerImpl.java:264)
        at 
org.apache.activemq.store.kahadb.scheduler.JobSchedulerImpl.schedule(JobSchedulerImpl.java:99)
        at 
org.apache.activemq.broker.scheduler.SchedulerBroker.doSchedule(SchedulerBroker.java:197)
        at 
org.apache.activemq.broker.scheduler.SchedulerBroker.access$000(SchedulerBroker.java:48)
        at 
org.apache.activemq.broker.scheduler.SchedulerBroker$1.afterCommit(SchedulerBroker.java:162)
        at 
org.apache.activemq.transaction.Transaction.fireAfterCommit(Transaction.java:126)
        at 
org.apache.activemq.transaction.Transaction.doPostCommit(Transaction.java:195)
        at 
org.apache.activemq.transaction.Transaction$2.call(Transaction.java:55)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:990)
        at 
org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:957)
        at 
org.apache.activemq.store.kahadb.KahaDBTransactionStore.commit(KahaDBTransactionStore.java:298)
        at 
org.apache.activemq.transaction.LocalTransaction.commit(LocalTransaction.java:70)
        at 
org.apache.activemq.broker.TransactionBroker.commitTransaction(TransactionBroker.java:253)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to