[ https://issues.apache.org/jira/browse/ARTEMIS-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suman Moorthy updated ARTEMIS-3076:
-----------------------------------
    Description: 
I have Artemis (version 2.11.0) configured for HA (Master and Slave).

The Master node goes down for an unknown reason, and the log below is printed continuously.
*AMQ222154: Error checking DLQ: ActiveMQShutdownException[errorType=SHUTDOWN_ERROR message=Journal must be in state=LOADED, was [STOPPED]]*
{code:java}
2021-01-15 23:02:05,414 WARN  [org.apache.activemq.artemis.core.server] AMQ222154: Error checking DLQ: ActiveMQShutdownException[errorType=SHUTDOWN_ERROR message=Journal must be in state=LOADED, was [STOPPED]]
    at org.apache.activemq.artemis.core.journal.impl.JournalImpl.checkJournalIsLoaded(JournalImpl.java:1087) [artemis-journal-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.core.journal.impl.JournalImpl.appendUpdateRecord(JournalImpl.java:886) [artemis-journal-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.core.journal.Journal.appendUpdateRecord(Journal.java:98) [artemis-journal-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.core.persistence.impl.journal.AbstractJournalStorageManager.updateDeliveryCount(AbstractJournalStorageManager.java:756) [artemis-server-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.core.server.impl.QueueImpl.checkRedelivery(QueueImpl.java:3052) [artemis-server-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.core.server.impl.RefsOperation.rollbackRedelivery(RefsOperation.java:166) [artemis-server-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.core.server.impl.RefsOperation.afterRollback(RefsOperation.java:113) [artemis-server-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.core.transaction.impl.TransactionImpl.afterRollback(TransactionImpl.java:589) [artemis-server-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.core.transaction.impl.TransactionImpl.access$200(TransactionImpl.java:40) [artemis-server-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.core.transaction.impl.TransactionImpl$4.done(TransactionImpl.java:442) [artemis-server-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl$1.run(OperationContextImpl.java:244) [artemis-server-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42) [artemis-commons-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31) [artemis-commons-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66) [artemis-commons-2.11.0.jar:2.11.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_275]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_275]
    at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.11.0.jar:2.11.0]
{code}
 

The Slave comes up as expected, but throws an NPE:
{noformat}
2021-01-15 23:02:27,529 INFO  [org.apache.activemq.artemis.core.server] AMQ221010: Backup Server is now live
2021-01-15 23:02:27,545 ERROR [org.apache.activemq.artemis.core.server] AMQ224000: Failure in initialisation: java.lang.NullPointerException
    at org.apache.activemq.artemis.core.server.impl.SharedStoreBackupActivation$FailbackChecker.<init>(SharedStoreBackupActivation.java:193) [artemis-server-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.core.server.impl.SharedStoreBackupActivation.startFailbackChecker(SharedStoreBackupActivation.java:185) [artemis-server-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.core.server.impl.SharedStoreBackupActivation.run(SharedStoreBackupActivation.java:118) [artemis-server-2.11.0.jar:2.11.0]
    at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$ActivationThread.run(ActiveMQServerImpl.java:3863) [artemis-server-2.11.0.jar:2.11.0]
{noformat}
The Master attempts to start, but it doesn't progress beyond *"AMQ221034: Waiting indefinitely to obtain live lock"*.
The logs are stuck at this point even after multiple restarts.
{noformat}
2021-01-15 23:03:56,238 INFO  [org.apache.activemq.artemis.core.server] AMQ221006: Waiting to obtain live lock
2021-01-15 23:03:56,300 INFO  [org.apache.activemq.artemis.core.server] AMQ221013: Using NIO Journal
2021-01-15 23:03:56,581 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-server]. Adding protocol support for: CORE
2021-01-15 23:03:56,581 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-amqp-protocol]. Adding protocol support for: AMQP
2021-01-15 23:03:56,581 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-hornetq-protocol]. Adding protocol support for: HORNETQ
2021-01-15 23:03:56,581 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-mqtt-protocol]. Adding protocol support for: MQTT
2021-01-15 23:03:56,581 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-openwire-protocol]. Adding protocol support for: OPENWIRE
2021-01-15 23:03:56,581 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-stomp-protocol]. Adding protocol support for: STOMP
2021-01-15 23:03:56,644 WARN  [org.apache.activemq.artemis.core.server] AMQ222035: Directory \\test\data\paging\cd776bae-1a55-11eb-985d-0050569136c8 did not have an identification file address.txt
2021-01-15 23:03:56,644 WARN  [org.apache.activemq.artemis.core.server] AMQ222035: Directory \\test\data\paging\a84f1e4f-1f1a-11eb-a37f-0050569136c8 did not have an identification file address.txt
2021-01-15 23:03:56,644 WARN  [org.apache.activemq.artemis.core.server] AMQ222035: Directory \\test\data\paging\a87edff5-1f1a-11eb-a37f-0050569136c8 did not have an identification file address.txt
2021-01-15 23:03:56,988 INFO  [org.apache.activemq.artemis.core.server] AMQ221034: Waiting indefinitely to obtain live lock
{noformat}
 

Can you please advise on the issue here and the steps to recover?

Does the NPE during Slave start-up have any effect on the queues or on broker functioning?

Do I need to stop the Slave manually to get the Master to start successfully?



> Artemis Master node not starting after failover to Slave
> --------------------------------------------------------
>
>                 Key: ARTEMIS-3076
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3076
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>    Affects Versions: 2.11.0
>            Reporter: Suman Moorthy
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
