[jira] [Commented] (ARTEMIS-2250) Shared store lock is not monitored while running
[ https://issues.apache.org/jira/browse/ARTEMIS-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923654#comment-16923654 ]

Jigar Shah commented on ARTEMIS-2250:
-------------------------------------

Hello Bas,

We are running into a similar situation while using a Master-Slave setup on AWS/EFS: "Live gives up control and Backup gains control, and a situation arises where both Live and Backup are active at the same time, manipulating the journal and at times leaving the journals unrecoverable."

We have observed this with Artemis 2.6.3 and also with Artemis 2.7.0. In our QA environment it happens once or twice a week on average. We are also trying to find a way to reproduce it consistently on AWS/EFS, but are not there yet.

You mentioned in the comment above:

_"We were able to prevent the occurrence by tweaking EFS connection settings so it does not occur anymore in our setup."_

It would be very helpful if you could share the connection settings/mount parameters that work in your setup.

Many thanks

> Shared store lock is not monitored while running
> ------------------------------------------------
>
> Key: ARTEMIS-2250
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2250
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: Broker
> Affects Versions: 2.6.4
> Environment: AWS EFS (NFS)
> Reporter: Bas
> Priority: Major
>
> When using the shared store, the live server can lose the lock on the journal
> but does not notice it. This can happen when a shared file system is being
> used, as in AWS, where we use EFS.
> This can cause problems when the live server regains the network file system
> connection and just continues to process messages. At some point the live or
> the backup quits because it notices changes on the file system which it did
> not make itself.
> We were able to prevent the occurrence by tweaking EFS connection settings so
> it does not occur anymore in our setup.
> For Artemis, we would like to show our change; maybe someone can review the
> change and see if it can be improved and adapted in Artemis.
> Patch is here for master:
> https://github.com/emagiz/activemq-artemis/commit/788adfbd3e5a54c63eed0810b7377641684b6fe1.patch
> Pull request:
> https://github.com/apache/activemq-artemis/pull/2547

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
[jira] [Commented] (ARTEMIS-2677) Artemis 2.11.0 RejectedExecutionException after successful failover
[ https://issues.apache.org/jira/browse/ARTEMIS-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068429#comment-17068429 ]

Jigar Shah commented on ARTEMIS-2677:
-------------------------------------

We are trying to reproduce it, but have not been able to reproduce the situation so far. Were any configuration issues identified in the uploaded broker.xml files?

> Artemis 2.11.0 RejectedExecutionException after successful failover
> -------------------------------------------------------------------
>
> Key: ARTEMIS-2677
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2677
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Environment: Environment: Artemis 2.11.0 (Master-Slave)
> SharedStore, FilePing. AWS EFS/NFS.
> Reporter: Jigar Shah
> Priority: Major
> Attachments: broker - slave.xml, broker.xml
>
> Observed an issue: on master shutdown, the slave became active. Right after
> the slave became active, it started printing "RejectedExecutionException".
> Also, consumers were not transferred from master to slave, and the client
> application stopped processing messages.
> Note that the RejectedExecutionException had [Terminated, pool size = 0, active
> threads = 0, queued tasks = 0, completed tasks = 59059]
> Following are the logs during failover from master to slave:
> Master1:
> {noformat}
> 2020-03-18 12:34:39,911 WARN [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 26fb51af-690c-11ea-959a-12f8371a8293
> 2020-03-18 12:34:39,912 INFO [org.apache.activemq.artemis.core.server] AMQ221029: stopped bridge $.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9
> 2020-03-18 12:34:39,950 WARN [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 2655265e-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 2655265e-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 2662e209-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 2662e209-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 2669bfdd-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 2669bfdd-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,993 WARN [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 26615b64-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,994 WARN [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 26615b64-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,994 WARN [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to artemis1.sl.idsk.com/10.110.0.17:61616 has been detected: AMQ219015: The connection was disconnected because of server shutdown [code=DISCONNECTED]
> 2020-03-18 12:34:40,123 FINE [org.jgroups.protocols.FILE_PING] remove persistence-fs
> 2020-03-18 12:34:40,137 FINE [org.jgroups.protocols.FD_SOCK] ip-10-110-0-17-18496: socket to ip-10-110-0-88-23224 was closed gracefully
> 2020-03-18 12:34:40,138 FINE [org.jgroups.protocols.TCP] closing sockets and stopping threads
> 2020-03-18 12:34:40,169 WARN [org.apache.activemq.artemis.core.client] AMQ212004: Failed to connect to server.
> 2020-03-18 12:34:41,336 INFO [io.hawt.web.AuthenticationFilter] Destroying hawtio authentication filter
> 2020-03-18 12:34:41,341 INFO [io.hawt.HawtioContextListener] Destroying hawtio services
> 2020-03-18 12:34:41,464 INFO [org.apache.activemq.hawtio.plugin.PluginContextListener] Destroyed artemis-plugin plugin
> 2020-03-18 12:34:41,472 INFO [org.apache.activemq.hawtio.branding.PluginContextListener] Destroyed activemq-branding plugin
> 2020-03-18 12:34:41,509 INFO [org.apache.activemq.artemis.core.server] AMQ221002: Apache ActiveMQ Artemis Message Broker version 2.11.0 [19ade3bb-5f75-11ea-b327-1216d251b187] stopped, uptime 5 hours 2 minutes
> {noformat}
> Slave1:
> {noformat}
> 2020-03-18 12:35:01,613 INFO [org.apache.activemq.artemis.core.server] AMQ221010: Backup Server is now live
> 2020-03-18 12:35:01,626 INFO [org.apache.activemq.artemis.core.server] AMQ221027: Bridge ClusterConnectionBridge@5ced3917 [name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9, queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9, postOffice=PostOfficeImpl
[jira] [Created] (ARTEMIS-2677) Artemis 2.11.0 RejectedExecutionException after successful failover
Jigar Shah created ARTEMIS-2677:
--------------------------------

Summary: Artemis 2.11.0 RejectedExecutionException after successful failover
Key: ARTEMIS-2677
URL: https://issues.apache.org/jira/browse/ARTEMIS-2677
Project: ActiveMQ Artemis
Issue Type: Bug
Environment: *Environment: 2.11.0*
*SharedStore, FilePing. AWS EFS/NFS.*
Reporter: Jigar Shah

Observed an issue: on master shutdown, the slave became active. Right after the slave became active, it started printing "RejectedExecutionException". Also, consumers were not transferred from master to slave, and the client application stopped processing messages.

Note that the RejectedExecutionException had [Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 59059]

Following are the logs during failover from master to slave:

Master1:
2020-03-18 12:34:39,911 WARN [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 26fb51af-690c-11ea-959a-12f8371a8293
2020-03-18 12:34:39,912 INFO [org.apache.activemq.artemis.core.server] AMQ221029: stopped bridge $.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9
2020-03-18 12:34:39,950 WARN [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 2655265e-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 2655265e-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 2662e209-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 2662e209-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 2669bfdd-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 2669bfdd-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,993 WARN [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 26615b64-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,994 WARN [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 26615b64-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,994 WARN [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to artemis1.sl.idsk.com/10.110.0.17:61616 has been detected: AMQ219015: The connection was disconnected because of server shutdown [code=DISCONNECTED]
2020-03-18 12:34:40,123 FINE [org.jgroups.protocols.FILE_PING] remove persistence-fs
2020-03-18 12:34:40,137 FINE [org.jgroups.protocols.FD_SOCK] ip-10-110-0-17-18496: socket to ip-10-110-0-88-23224 was closed gracefully
2020-03-18 12:34:40,138 FINE [org.jgroups.protocols.TCP] closing sockets and stopping threads
2020-03-18 12:34:40,169 WARN [org.apache.activemq.artemis.core.client] AMQ212004: Failed to connect to server.
2020-03-18 12:34:41,336 INFO [io.hawt.web.AuthenticationFilter] Destroying hawtio authentication filter
2020-03-18 12:34:41,341 INFO [io.hawt.HawtioContextListener] Destroying hawtio services
2020-03-18 12:34:41,464 INFO [org.apache.activemq.hawtio.plugin.PluginContextListener] Destroyed artemis-plugin plugin
2020-03-18 12:34:41,472 INFO [org.apache.activemq.hawtio.branding.PluginContextListener] Destroyed activemq-branding plugin
2020-03-18 12:34:41,509 INFO [org.apache.activemq.artemis.core.server] AMQ221002: Apache ActiveMQ Artemis Message Broker version 2.11.0 [19ade3bb-5f75-11ea-b327-1216d251b187] stopped, uptime 5 hours 2 minutes

Slave1:
2020-03-18 12:35:01,613 INFO [org.apache.activemq.artemis.core.server] AMQ221010: Backup Server is now live
2020-03-18 12:35:01,626 INFO [org.apache.activemq.artemis.core.server] AMQ221027: Bridge ClusterConnectionBridge@5ced3917 [name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9, queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=19ade3bb-5f75-11ea-b327-1216d251b187], temp=false]@58d2a652 targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@5ced3917 [name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9, queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=19ade3bb-5f75-11ea-b327-1216d251b187], temp=false]@58d2a652 targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=netty-connector,
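Regarding the [Terminated, pool size = 0, active threads = 0, queued tasks = 0, ...] detail noted in the report above: that bracketed text matches the toString() output of java.util.concurrent.ThreadPoolExecutor, whose default rejection policy throws RejectedExecutionException when a task is submitted after the executor has been shut down. The sketch below reproduces a message of that shape in isolation, using only the JDK. The class name RejectedAfterShutdown and the no-op tasks are hypothetical, and the sketch makes no claim about which broker executor was involved or why it had terminated.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not Artemis code: shows how the JDK reports a
// task submitted to a thread pool that has already terminated.
class RejectedAfterShutdown {

    // Returns the exception message produced when a task is submitted to a
    // terminated pool, or null if the submission was unexpectedly accepted.
    static String rejectionMessage() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> { });          // one task completes normally
        pool.shutdown();                  // no new tasks are accepted from here on
        try {
            // Wait until the pool has fully terminated so its state is stable.
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            pool.execute(() -> { });      // submitted too late: rejected
            return null;
        } catch (RejectedExecutionException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectionMessage());
    }
}
```

Running main prints a message that ends with a bracketed state string of the same shape as the one in the report. Read that way, the reported exception is consistent with some component submitting work to an executor that had already been stopped; this is an interpretation of the message format, not a diagnosis of the issue.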
[jira] [Updated] (ARTEMIS-2677) Artemis 2.11.0 RejectedExecutionException after successful failover
[ https://issues.apache.org/jira/browse/ARTEMIS-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jigar Shah updated ARTEMIS-2677:
--------------------------------
Environment:
*Environment: Artemis 2.11.0 (Master-Slave)*
*SharedStore, FilePing. AWS EFS/NFS.*

was:
*Environment: 2.11.0*
*SharedStore, FilePing. AWS EFS/NFS.*

> Artemis 2.11.0 RejectedExecutionException after successful failover
> -------------------------------------------------------------------
>
> Key: ARTEMIS-2677
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2677
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Environment: *Environment: Artemis 2.11.0 (Master-Slave)*
> *SharedStore, FilePing. AWS EFS/NFS.*
> Reporter: Jigar Shah
> Priority: Major
[jira] [Updated] (ARTEMIS-2677) Artemis 2.11.0 RejectedExecutionException after successful failover
[ https://issues.apache.org/jira/browse/ARTEMIS-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jigar Shah updated ARTEMIS-2677:
--------------------------------
Attachment: broker.xml
broker - slave.xml

> Artemis 2.11.0 RejectedExecutionException after successful failover
> -------------------------------------------------------------------
>
> Key: ARTEMIS-2677
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2677
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Environment: Environment: Artemis 2.11.0 (Master-Slave)
> SharedStore, FilePing. AWS EFS/NFS.
> Reporter: Jigar Shah
> Priority: Major
> Attachments: broker - slave.xml, broker.xml
[jira] [Commented] (ARTEMIS-2677) Artemis 2.11.0 RejectedExecutionException after successful failover
[ https://issues.apache.org/jira/browse/ARTEMIS-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17065455#comment-17065455 ]

Jigar Shah commented on ARTEMIS-2677:
-------------------------------------

Attached broker.xml and broker-slave.xml.

I am running the master with 3 GB RAM and the slave with 2 GB RAM. In general, failover works. The load we process on Artemis averages around 5-7M messages (text and binary) a day, with a single text message being approximately 7 KB. Max messages around 1 GB.

I tried to reproduce the problem again with the same configuration in AWS, using another master/slave pair, by shutting down the master. Failover worked fine: all consumers were successfully moved to the slave, producers received the updated topology, Artemis kept receiving messages, and no "RejectedExecutionException" occurred. I will try to reproduce it in isolation on one machine.

Can you please advise in which situations Artemis throws "RejectedExecutionException"? For example, is it related to resources (memory, CPU, or slow NFS) on the machine at the point in time when this happened?

Please let me know if you have any suggestions for configuration tuning or resource changes that would help reduce the possibility of such an exception.

> Artemis 2.11.0 RejectedExecutionException after successful failover
> -------------------------------------------------------------------
>
> Key: ARTEMIS-2677
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2677
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Environment: Environment: Artemis 2.11.0 (Master-Slave)
> SharedStore, FilePing. AWS EFS/NFS.
> Reporter: Jigar Shah
> Priority: Major
> Attachments: broker - slave.xml, broker.xml