[jira] [Commented] (ARTEMIS-2250) Shared store lock is not monitored while running

2019-09-05 Thread Jigar Shah (Jira)


[ 
https://issues.apache.org/jira/browse/ARTEMIS-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923654#comment-16923654
 ] 

Jigar Shah commented on ARTEMIS-2250:
-

Hello Bas,

We are running into a similar situation while using a Master-Slave setup on 
AWS/EFS:

"Live gives up control and Backup gains control, and a situation arises where 
both Live and Backup are active at the same time, both manipulating the 
Journal, which at times leaves the Journal unrecoverable." We have observed 
this with Artemis 2.6.3 and also Artemis 2.7.0. In our QA environment it 
happens on average once or twice a week. We are also trying to find a way to 
reproduce it consistently on AWS/EFS but are not there yet.

_"We were able to prevent the occurrence by tweaking EFS connection settings so 
it does not occur anymore in our setup."_

You mentioned this above. It would be very helpful if you could share the 
connection settings/mount parameters that work in your setup.

 

Many Thanks

 

> Shared store lock is not monitored while running
> 
>
> Key: ARTEMIS-2250
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2250
> Project: ActiveMQ Artemis
>  Issue Type: Bug
>  Components: Broker
>Affects Versions: 2.6.4
> Environment: AWS EFS (NFS)
>Reporter: Bas
>Priority: Major
>
> When using the shared store, the live server can lose the lock on the journal 
> without noticing it. This can happen when a shared file system is being 
> used, as in AWS where we use EFS.
> This can cause problems when the live server regains the network file system 
> connection and just continues to process messages. At some point the live or 
> the backup quits because it notices changes on the file system which it did 
> not make itself.
> We were able to prevent the occurrence by tweaking EFS connection settings so 
> it does not occur anymore in our setup.
> For Artemis we would like to show our change; maybe someone can review the 
> change and see if it can be improved and adapted in Artemis.
> Patch is here for master:
> https://github.com/emagiz/activemq-artemis/commit/788adfbd3e5a54c63eed0810b7377641684b6fe1.patch
> Pull request: 
> https://github.com/apache/activemq-artemis/pull/2547
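
The idea of monitoring the lock while running can be sketched roughly as
follows. This is only a minimal Java illustration, not the change from the
linked patch or pull request, whose contents are not shown here. The class
name LockMonitorSketch, the monitor helper, the 50 ms polling period and the
temp file are all hypothetical. Note that FileLock.isValid() only reflects
state known to the local JVM (lock released or channel closed), so on NFS/EFS
it may not reveal a lock that was lost on the server side; a real check would
probably need something stronger.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class LockMonitorSketch {

    /** Polls the lock every periodMillis and runs onLost once it is no longer valid. */
    static ScheduledExecutorService monitor(FileLock lock, long periodMillis, Runnable onLost) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            // isValid() is a JVM-local view only (assumption: good enough for a demo).
            if (!lock.isValid()) {
                onLost.run();
                scheduler.shutdown(); // stops further periodic runs
            }
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    /** Demo: lock a temp file, release the lock, report whether the monitor noticed. */
    static boolean demoDetectsLoss() {
        try {
            Path file = Files.createTempFile("live", ".lock"); // hypothetical lock file
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
                FileLock lock = channel.tryLock();
                if (lock == null) {
                    return false; // another process holds the lock
                }
                CountDownLatch lost = new CountDownLatch(1);
                ScheduledExecutorService scheduler = monitor(lock, 50, lost::countDown);
                lock.release(); // simulate losing the lock
                boolean noticed = lost.await(5, TimeUnit.SECONDS);
                scheduler.shutdownNow();
                return noticed;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("lock loss detected: " + demoDetectsLoss());
    }
}
```

In a broker, the onLost callback would be the place to stop processing
messages instead of continuing to write to the journal.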



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (ARTEMIS-2677) Artemis 2.11.0 RejectedExecutionException after successful failover

2020-03-27 Thread Jigar Shah (Jira)


[ 
https://issues.apache.org/jira/browse/ARTEMIS-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068429#comment-17068429
 ] 

Jigar Shah commented on ARTEMIS-2677:
-

We are trying to reproduce it, but have not been able to so far. Were any 
configuration issues identified in the uploaded broker.xml files?

> Artemis 2.11.0 RejectedExecutionException after successful failover
> ---
>
> Key: ARTEMIS-2677
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2677
> Project: ActiveMQ Artemis
>  Issue Type: Bug
> Environment: Environment: Artemis 2.11.0 (Master-Slave)
>  SharedStore, FilePing. AWS EFS/NFS.
>Reporter: Jigar Shah
>Priority: Major
> Attachments: broker - slave.xml, broker.xml
>
>
> Observed an issue: on master shutdown, the slave became active. Right after 
> becoming active, the slave started printing "RejectedExecutionException". 
> Also, consumers were not transferred from master to slave, and the client 
> application stopped processing messages.
> Note RejectedExecutionException had [Terminated, pool size = 0, active 
> threads = 0, queued tasks = 0, completed tasks = 59059]
> Following are the logs during failover from master to slave:
> Master1:
> {noformat}
> 2020-03-18 12:34:39,911 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222107: Cleared up resources for session 
> 26fb51af-690c-11ea-959a-12f8371a8293
> 2020-03-18 12:34:39,912 INFO [org.apache.activemq.artemis.core.server] 
> AMQ221029: stopped bridge 
> $.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9
> 2020-03-18 12:34:39,950 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222061: Client connection failed, clearing up resources for session 
> 2655265e-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222107: Cleared up resources for session 
> 2655265e-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222061: Client connection failed, clearing up resources for session 
> 2662e209-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222107: Cleared up resources for session 
> 2662e209-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222061: Client connection failed, clearing up resources for session 
> 2669bfdd-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222107: Cleared up resources for session 
> 2669bfdd-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,993 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222061: Client connection failed, clearing up resources for session 
> 26615b64-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,994 WARN [org.apache.activemq.artemis.core.server] 
> AMQ222107: Cleared up resources for session 
> 26615b64-690c-11ea-9d7c-12f7d16c4f81
> 2020-03-18 12:34:39,994 WARN [org.apache.activemq.artemis.core.client] 
> AMQ212037: Connection failure to artemis1.sl.idsk.com/10.110.0.17:61616 has 
> been detected: AMQ219015: The connection was disconnected because of server 
> shutdown [code=DISCONNECTED]
> 2020-03-18 12:34:40,123 FINE [org.jgroups.protocols.FILE_PING] remove 
> persistence-fs
> 2020-03-18 12:34:40,137 FINE [org.jgroups.protocols.FD_SOCK] 
> ip-10-110-0-17-18496: socket to ip-10-110-0-88-23224 was closed gracefully
> 2020-03-18 12:34:40,138 FINE [org.jgroups.protocols.TCP] closing sockets and 
> stopping threads
> 2020-03-18 12:34:40,169 WARN [org.apache.activemq.artemis.core.client] 
> AMQ212004: Failed to connect to server.
> 2020-03-18 12:34:41,336 INFO [io.hawt.web.AuthenticationFilter] Destroying 
> hawtio authentication filter
> 2020-03-18 12:34:41,341 INFO [io.hawt.HawtioContextListener] Destroying 
> hawtio services
> 2020-03-18 12:34:41,464 INFO 
> [org.apache.activemq.hawtio.plugin.PluginContextListener] Destroyed 
> artemis-plugin plugin
> 2020-03-18 12:34:41,472 INFO 
> [org.apache.activemq.hawtio.branding.PluginContextListener] Destroyed 
> activemq-branding plugin
> 2020-03-18 12:34:41,509 INFO [org.apache.activemq.artemis.core.server] 
> AMQ221002: Apache ActiveMQ Artemis Message Broker version 2.11.0 
> [19ade3bb-5f75-11ea-b327-1216d251b187] stopped, uptime 5 hours 2 minutes
> {noformat}
> Slave1:
> {noformat}
> 2020-03-18 12:35:01,613 INFO [org.apache.activemq.artemis.core.server] 
> AMQ221010: Backup Server is now live
> 2020-03-18 12:35:01,626 INFO [org.apache.activemq.artemis.core.server] 
> AMQ221027: Bridge ClusterConnectionBridge@5ced3917 
> [name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9, 
> queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9,
>  postOffice=PostOfficeImpl 

[jira] [Created] (ARTEMIS-2677) Artemis 2.11.0 RejectedExecutionException after successful failover

2020-03-23 Thread Jigar Shah (Jira)
Jigar Shah created ARTEMIS-2677:
---

 Summary: Artemis 2.11.0 RejectedExecutionException after 
successful failover
 Key: ARTEMIS-2677
 URL: https://issues.apache.org/jira/browse/ARTEMIS-2677
 Project: ActiveMQ Artemis
  Issue Type: Bug
 Environment: *Environment: 2.11.0*
*SharedStore, FilePing. AWS EFS/NFS.*
Reporter: Jigar Shah


Observed an issue: on master shutdown, the slave became active. Right after 
becoming active, the slave started printing "RejectedExecutionException". 
Also, consumers were not transferred from master to slave, and the client 
application stopped processing messages.

Note RejectedExecutionException had [Terminated, pool size = 0, active threads 
= 0, queued tasks = 0, completed tasks = 59059]
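
For context, the bracketed state above matches the standard toString() output
of java.util.concurrent.ThreadPoolExecutor, which the default rejection policy
includes in the exception message when a task is submitted to an executor that
has already terminated. The sketch below reproduces that message shape with
plain JDK classes. It is an illustration only: the class name
RejectedAfterShutdown is hypothetical, and nothing here identifies which
executor inside Artemis was involved.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

public class RejectedAfterShutdown {

    /** Submits a task to an already terminated pool and returns the rejection message. */
    static String rejectionMessage() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> { });      // one task runs to completion
        pool.shutdown();              // no new tasks are accepted from here on
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        try {
            pool.execute(() -> { });  // late submission after termination
            return "accepted";
        } catch (RejectedExecutionException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectionMessage());
    }
}
```

The printed message ends with the pool state in the same
"[Terminated, pool size = 0, ...]" form, which suggests that in the report
above some component was still handing work to an executor that had already
been shut down.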

Following are the logs during failover from master to slave:


Master1:

2020-03-18 12:34:39,911 WARN [org.apache.activemq.artemis.core.server] 
AMQ222107: Cleared up resources for session 26fb51af-690c-11ea-959a-12f8371a8293
2020-03-18 12:34:39,912 INFO [org.apache.activemq.artemis.core.server] 
AMQ221029: stopped bridge 
$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9
2020-03-18 12:34:39,950 WARN [org.apache.activemq.artemis.core.server] 
AMQ222061: Client connection failed, clearing up resources for session 
2655265e-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
AMQ222107: Cleared up resources for session 2655265e-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
AMQ222061: Client connection failed, clearing up resources for session 
2662e209-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
AMQ222107: Cleared up resources for session 2662e209-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
AMQ222061: Client connection failed, clearing up resources for session 
2669bfdd-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,951 WARN [org.apache.activemq.artemis.core.server] 
AMQ222107: Cleared up resources for session 2669bfdd-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,993 WARN [org.apache.activemq.artemis.core.server] 
AMQ222061: Client connection failed, clearing up resources for session 
26615b64-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,994 WARN [org.apache.activemq.artemis.core.server] 
AMQ222107: Cleared up resources for session 26615b64-690c-11ea-9d7c-12f7d16c4f81
2020-03-18 12:34:39,994 WARN [org.apache.activemq.artemis.core.client] 
AMQ212037: Connection failure to artemis1.sl.idsk.com/10.110.0.17:61616 has 
been detected: AMQ219015: The connection was disconnected because of server 
shutdown [code=DISCONNECTED]
2020-03-18 12:34:40,123 FINE [org.jgroups.protocols.FILE_PING] remove 
persistence-fs
2020-03-18 12:34:40,137 FINE [org.jgroups.protocols.FD_SOCK] 
ip-10-110-0-17-18496: socket to ip-10-110-0-88-23224 was closed gracefully
2020-03-18 12:34:40,138 FINE [org.jgroups.protocols.TCP] closing sockets and 
stopping threads
2020-03-18 12:34:40,169 WARN [org.apache.activemq.artemis.core.client] 
AMQ212004: Failed to connect to server.
2020-03-18 12:34:41,336 INFO [io.hawt.web.AuthenticationFilter] Destroying 
hawtio authentication filter
2020-03-18 12:34:41,341 INFO [io.hawt.HawtioContextListener] Destroying hawtio 
services
2020-03-18 12:34:41,464 INFO 
[org.apache.activemq.hawtio.plugin.PluginContextListener] Destroyed 
artemis-plugin plugin
2020-03-18 12:34:41,472 INFO 
[org.apache.activemq.hawtio.branding.PluginContextListener] Destroyed 
activemq-branding plugin
2020-03-18 12:34:41,509 INFO [org.apache.activemq.artemis.core.server] 
AMQ221002: Apache ActiveMQ Artemis Message Broker version 2.11.0 
[19ade3bb-5f75-11ea-b327-1216d251b187] stopped, uptime 5 hours 2 minutes


Slave1:

2020-03-18 12:35:01,613 INFO [org.apache.activemq.artemis.core.server] 
AMQ221010: Backup Server is now live
2020-03-18 12:35:01,626 INFO [org.apache.activemq.artemis.core.server] 
AMQ221027: Bridge ClusterConnectionBridge@5ced3917 
[name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9, 
queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9,
 postOffice=PostOfficeImpl 
[server=ActiveMQServerImpl::serverUUID=19ade3bb-5f75-11ea-b327-1216d251b187], 
temp=false]@58d2a652 targetConnector=ServerLocatorImpl 
(identity=(Cluster-connection-bridge::ClusterConnectionBridge@5ced3917 
[name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9, 
queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.2c85cd9e-5f75-11ea-9be1-0af9119190b9, 
postOffice=PostOfficeImpl 
[server=ActiveMQServerImpl::serverUUID=19ade3bb-5f75-11ea-b327-1216d251b187], 
temp=false]@58d2a652 targetConnector=ServerLocatorImpl 
[initialConnectors=[TransportConfiguration(name=netty-connector, 

[jira] [Updated] (ARTEMIS-2677) Artemis 2.11.0 RejectedExecutionException after successful failover

2020-03-23 Thread Jigar Shah (Jira)


 [ 
https://issues.apache.org/jira/browse/ARTEMIS-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jigar Shah updated ARTEMIS-2677:

Environment: 
*Environment: Artemis 2.11.0 (Master-Slave)*
 *SharedStore, FilePing. AWS EFS/NFS.*

  was:
*Environment: 2.11.0*
*SharedStore, FilePing. AWS EFS/NFS.*


> Artemis 2.11.0 RejectedExecutionException after successful failover
> ---
>
> Key: ARTEMIS-2677
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2677
> Project: ActiveMQ Artemis
>  Issue Type: Bug
> Environment: *Environment: Artemis 2.11.0 (Master-Slave)*
>  *SharedStore, FilePing. AWS EFS/NFS.*
>Reporter: Jigar Shah
>Priority: Major
>

[jira] [Updated] (ARTEMIS-2677) Artemis 2.11.0 RejectedExecutionException after successful failover

2020-03-24 Thread Jigar Shah (Jira)


 [ 
https://issues.apache.org/jira/browse/ARTEMIS-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jigar Shah updated ARTEMIS-2677:

Attachment: broker.xml
broker - slave.xml

> Artemis 2.11.0 RejectedExecutionException after successful failover
> ---
>
> Key: ARTEMIS-2677
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2677
> Project: ActiveMQ Artemis
>  Issue Type: Bug
> Environment: Environment: Artemis 2.11.0 (Master-Slave)
>  SharedStore, FilePing. AWS EFS/NFS.
>Reporter: Jigar Shah
>Priority: Major
> Attachments: broker - slave.xml, broker.xml
>
>

[jira] [Commented] (ARTEMIS-2677) Artemis 2.11.0 RejectedExecutionException after successful failover

2020-03-24 Thread Jigar Shah (Jira)


[ 
https://issues.apache.org/jira/browse/ARTEMIS-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17065455#comment-17065455
 ] 

Jigar Shah commented on ARTEMIS-2677:
-

Attached broker.xml and broker-slave.xml.

I am running a master on 3GB RAM and a slave on 2GB RAM. In general, failover 
works.

The load we process on Artemis averages around 5-7M messages (text and binary) 
a day (a single text message is approximately 7 KB). Max messages around 1GB.

I tried to reproduce it again with the same configuration in AWS, using 
another master/slave pair, by shutting down the master. Failover worked fine: 
all consumers were successfully moved to the slave, producers got the updated 
topology, Artemis was receiving messages, and no "RejectedExecutionException" 
occurred.

I will try reproducing it in isolation on one machine. Can you please advise 
in which situations Artemis throws "RejectedExecutionException"?

For example, is it related to resources (memory, CPU, or slow NFS) on the 
machine at the point in time when this happened?

Please let me know if you have any suggestions for configuration tuning or 
resource changes that would help reduce the possibility of such an exception.


> Artemis 2.11.0 RejectedExecutionException after successful failover
> ---
>
> Key: ARTEMIS-2677
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2677
> Project: ActiveMQ Artemis
>  Issue Type: Bug
> Environment: Environment: Artemis 2.11.0 (Master-Slave)
>  SharedStore, FilePing. AWS EFS/NFS.
>Reporter: Jigar Shah
>Priority: Major
> Attachments: broker - slave.xml, broker.xml
>
>