[ 
https://issues.apache.org/jira/browse/ARTEMIS-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17065455#comment-17065455
 ] 

Justin Bertram edited comment on ARTEMIS-2677 at 3/27/20, 2:14 PM:
-------------------------------------------------------------------

Attached broker.xml and broker-slave.xml.

I am running a master with 3GB RAM and a slave with 2GB RAM. In general, failover 
works.

The load we process on Artemis averages around 5-7M messages (text and binary) a 
day (a single text message is approximately 7KB). The largest messages are around 
1GB.

I tried to reproduce the issue again in AWS with the same configuration on 
another master/slave pair by shutting down the master. Failover worked fine: all 
consumers were successfully moved to the slave, producers got the updated 
topology, Artemis kept receiving messages, and no 
{{RejectedExecutionException}} occurred.

I will try reproducing it in isolation on one machine. Can you please guide me on 
the situations in which Artemis throws {{RejectedExecutionException}} (e.g., is 
it related to resources (memory, CPU, or slow NFS) on the machine at the point 
in time when this happened)?

Please let me know if you have any suggestions for configuration tuning or 
resource changes that could be applied to help reduce the possibility of such an 
exception.


was (Author: jigaronline):
Attached broker.xml and broker-slave.xml.

I am running a master with 3GB RAM and a slave with 2GB RAM. In general, failover 
works.

The load we process on Artemis averages around 5-7M messages (text and binary) a 
day (a single text message is approximately 7KB). The largest messages are around 
1GB. 

 

I tried to reproduce the issue again in AWS with the same configuration on 
another master/slave pair by shutting down the master. Failover worked fine: all 
consumers were successfully moved to the slave, producers got the updated 
topology, Artemis kept receiving messages, and no "RejectedExecutionException" occurred.

 

I will try reproducing it in isolation on one machine. Can you please guide me on 
the situations in which Artemis throws "RejectedExecutionException"?

E.g., is it related to resources (memory, CPU, or slow NFS) on the machine at 
the point in time when this happened?

Please let me know if you have any suggestions for configuration tuning or 
resource changes that could be applied to help reduce the possibility of such an 
exception.

> Artemis 2.11.0 RejectedExecutionException after successful failover
> -------------------------------------------------------------------
>
>                 Key: ARTEMIS-2677
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2677
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>         Environment: Environment: Artemis 2.11.0 (Master-Slave)
>  SharedStore, FilePing. AWS EFS/NFS.
>            Reporter: Jigar Shah
>            Priority: Major
>         Attachments: broker - slave.xml, broker.xml
>
>
> Observed an issue on master shutdown: the slave became active, and right after 
> becoming active it started printing "RejectedExecutionException". Also, 
> consumers were not transferred from the master to the slave, and the client 
> application stopped processing messages.
> Note: the RejectedExecutionException showed [Terminated, pool size = 0, active 
> threads = 0, queued tasks = 0, completed tasks = 59059]
> Following are the logs during failover from master to slave:
> Master1:
> {noformat}
> 2020-03-18 12:34:39,911 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222107: Cleared up resources for session 
> 26fb51af-690c-11ea-959a-12f8371a8293
> 2020-03-18 12:34:39,912 INFO [org.apache.activemq.artemis.core.server] 
> AMQ221029: stopped bridge 
> $.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9
> 2020-03-18 12:34:39,950 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222061: Client connection failed, clearing up resources for session 
> 2655265e-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222107: Cleared up resources for session 
> 2655265e-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222061: Client connection failed, clearing up resources for session 
> 2662e209-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222107: Cleared up resources for session 
> 2662e209-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222061: Client connection failed, clearing up resources for session 
> 2669bfdd-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222107: Cleared up resources for session 
> 2669bfdd-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,993 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222061: Client connection failed, clearing up resources for session 
> 26615b64-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,994 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222107: Cleared up resources for session 
> 26615b64-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,994 WARN [org.apache.activemq.artemis.core.client] 
> AMQ212037: Connection failure to artemis1.sl.idsk.com/10.110.0.17:61616 has 
> been detected: AMQ219015: The connection was disconnected because of server 
> shutdown [code=DISCONNECTED]
> 2020-03-18 12:34:40,123 FINE [org.jgroups.protocols.FILE_PING] remove 
> persistence-fs
> 2020-03-18 12:34:40,137 FINE [org.jgroups.protocols.FD_SOCK] 
> ip-10-110-0-17-18496: socket to ip-10-110-0-88-23224 was closed gracefully
> 2020-03-18 12:34:40,138 FINE [org.jgroups.protocols.TCP] closing sockets and 
> stopping threads
> 2020-03-18 12:34:40,169 WARN [org.apache.activemq.artemis.core.client] 
> AMQ212004: Failed to connect to server.
> 2020-03-18 12:34:41,336 INFO [io.hawt.web.AuthenticationFilter] Destroying 
> hawtio authentication filter
> 2020-03-18 12:34:41,341 INFO [io.hawt.HawtioContextListener] Destroying 
> hawtio services
> 2020-03-18 12:34:41,464 INFO 
> [org.apache.activemq.hawtio.plugin.PluginContextListener] Destroyed 
> artemis-plugin plugin
> 2020-03-18 12:34:41,472 INFO 
> [org.apache.activemq.hawtio.branding.PluginContextListener] Destroyed 
> activemq-branding plugin
> 2020-03-18 12:34:41,509 INFO [org.apache.activemq.artemis.core.server] 
> AMQ221002: Apache ActiveMQ Artemis Message Broker version 2.11.0 
> [19ade3bb-5f75-11ea-b327-1216d251b187] stopped, uptime 5 hours 2 minutes
> {noformat}
> Slave1:
> {noformat}
> 2020-03-18 12:35:01,613 INFO [org.apache.activemq.artemis.core.server] 
> AMQ221010: Backup Server is now live
> 2020-03-18 12:35:01,626 INFO [org.apache.activemq.artemis.core.server] 
> AMQ221027: Bridge ClusterConnectionBridge@5ced3917 
> [name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9, 
> queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9,
>  postOffice=PostOfficeImpl 
> [server=ActiveMQServerImpl::serverUUID=19ade3bb-5f75-11ea-b327-1216d251b187], 
> temp=false]@58d2a652 targetConnector=ServerLocatorImpl 
> (identity=(Cluster-connection-bridge::ClusterConnectionBridge@5ced3917 
> [name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9, 
> queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9,
>  postOffice=PostOfficeImpl 
> [server=ActiveMQServerImpl::serverUUID=19ade3bb-5f75-11ea-b327-1216d251b187], 
> temp=false]@58d2a652 targetConnector=ServerLocatorImpl 
> [initialConnectors=[TransportConfiguration(name=netty-connector, 
> factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
>  ?port=61616&host=artemis2-sl-idsk-com], 
> discoveryGroupConfiguration=null]]::ClusterConnectionImpl@1788190971[nodeUUID=19ade3bb-5f75-11ea-b327-1216d251b187,
>  connector=TransportConfiguration(name=netty-connector, 
> factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
>  ?port=61617&host=artemis2-sl-idsk-com, address=, 
> server=ActiveMQServerImpl::serverUUID=19ade3bb-5f75-11ea-b327-1216d251b187])) 
> [initialConnectors=[TransportConfiguration(name=netty-connector, 
> factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
>  ?port=61616&host=artemis2-sl-idsk-com], discoveryGroupConfiguration=null]] 
> is connected
> 2020-03-18 12:35:20,621 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222054: Error on executing IOCallback: 
> java.util.concurrent.RejectedExecutionException: Task 
> org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$25/191327468@479377d0
>  rejected from 
> org.apache.activemq.artemis.utils.ActiveMQThreadPoolExecutor@291e146[Terminated,
>  pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 59059]
>  at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
>  [rt.jar:1.8.0_212]
>  at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) 
> [rt.jar:1.8.0_212]
>  at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379) 
> [rt.jar:1.8.0_212]
>  at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase.onAddedTaskIfNotRunning(ProcessorBase.java:205)
>  [artemis-commons-2.11.0.jar:2.11.0]
>  at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase.task(ProcessorBase.java:193)
>  [artemis-commons-2.11.0.jar:2.11.0]
>  at 
> org.apache.activemq.artemis.utils.actors.OrderedExecutor.execute(OrderedExecutor.java:54)
>  [artemis-commons-2.11.0.jar:2.11.0]
>  at 
> org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl.execute(OperationContextImpl.java:238)
>  [artemis-server-2.11.0.jar:2.11.0]
>  at 
> org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl.checkTasks(OperationContextImpl.java:221)
>  [artemis-server-2.11.0.jar:2.11.0]
>  at 
> org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl.done(OperationContextImpl.java:197)
>  [artemis-server-2.11.0.jar:2.11.0]
>  at 
> org.apache.activemq.artemis.core.io.IOCallback.lambda$done$0(IOCallback.java:43)
>  [artemis-journal-2.11.0.jar:2.11.0]
>  at java.util.ArrayList.forEach(ArrayList.java:1257) [rt.jar:1.8.0_212]
>  at org.apache.activemq.artemis.core.io.IOCallback.done(IOCallback.java:41) 
> [artemis-journal-2.11.0.jar:2.11.0]
>  at 
> org.apache.activemq.artemis.core.io.DelegateCallback.done(DelegateCallback.java:41)
>  [artemis-journal-2.11.0.jar:2.11.0]
>  at 
> org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.doInternalWrite(NIOSequentialFile.java:395)
>  [artemis-journal-2.11.0.jar:2.11.0]
>  at 
> org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.internalWrite(NIOSequentialFile.java:359)
>  [artemis-journal-2.11.0.jar:2.11.0]
>  at 
> org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.access$100(NIOSequentialFile.java:43)
>  [artemis-journal-2.11.0.jar:2.11.0]
>  at 
> org.apache.activemq.artemis.core.io.nio.NIOSequentialFile$SyncLocalBufferObserver.flushBuffer(NIOSequentialFile.java:434)
>  [artemis-journal-2.11.0.jar:2.11.0]
>  at 
> org.apache.activemq.artemis.core.io.buffer.TimedBuffer.flushBatch(TimedBuffer.java:361)
>  [artemis-journal-2.11.0.jar:2.11.0]
>  at 
> org.apache.activemq.artemis.core.io.buffer.TimedBuffer$CheckTimer.run(TimedBuffer.java:457)
>  [artemis-journal-2.11.0.jar:2.11.0]
>  at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_212]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
