[ https://issues.apache.org/jira/browse/ARTEMIS-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Justin Bertram updated ARTEMIS-3277:
------------------------------------
    Description:
We are running an Artemis-cluster with the following specs:
 * Artemis version: 2.15.0 (runs in a Docker-container)
 * Docker version: Docker version 20.10.3, build 48d30b5
 * Docker image: vromero/activemq-artemis:2.15.0 (JVM: OpenJDK Runtime Environment (build 1.8.0_232-b09))
 * OS-version: CentOS 7.9.2009 (x86_64)
 * Cluster-setup: 3 primary nodes with 3 secondary nodes (symmetric, every address/queue is on every primary node. We have a one-to-one relation between addresses and queues, every address is ANYCAST with a single queue with the same name)

Sometimes we are experiencing connection issues from the clients connecting to the cluster which results in messages not being consumed from or produced to the cluster. This seems to be specific to one single alternating master node at a time and a restart of the node seems to solve the problem. From the client we get this:
{noformat}
AMQ212037: Connection failure to X.X.X.X has been detected: AMQ219015: The connection was disconnected because of server shutdown [code=DISCONNECTED]{noformat}
The Artemis console however is available so from this perspective the node seems to be reachable and working correctly.

Server-side (Artemis) we get the following error messages which may be related:
{noformat}
Inconsistency during compacting: RollbackRecord ID = 82113064851 for an already rolled back transaction during compacting", "Error on reading compacting for JournalFileImpl: (activemq-data-######.amq id = #####, recordID = #####)", "Bridge Failed to ack", "Cannot find add info ######## on compactor or current records"{noformat}
We are also experiencing the following issues which may or may not be related:
 # initial distribution/Redistribution not working correctly
 # Reset of Address-setting to default values. We use specific settings regarding e.g. DLA, redistributionDelay etc.
but sometimes these settings seem to go back to their default values.

Is this problem reported more often or has this something to do with our setup of the cluster? If the latter is true is there a way to fix it?

  was:
We are running an Artemis-cluster with the following specs:
------
Artemis version: 2.15.0 (runs in a Docker-container)
Docker version: Docker version 20.10.3, build 48d30b5
Docker image: vromero/activemq-artemis:2.15.0 (JVM: OpenJDK Runtime Environment (build 1.8.0_232-b09))
OS-version: CentOS 7.9.2009 (x86_64)
Cluster-setup: 3 primary nodes with 3 secondary nodes (symmetric, every address/queue is on every primary node. We have a one-to-one relation between addresses and queues, every address is ANYCAST with a single queue with the same name)
------
Sometimes we are experiencing connection issues from the clients connecting to the cluster which results in messages not beeing consumed from or produced to the cluster. This seems to be specific to one single alternating master node at the time and a restart of the node seems to solve the problem.
From the client we get a "AMQ212037: Connection failure to ############################## has been detected: AMQ219015: The connection was disconnected because of server shutdown [code=DISCONNECTED]" error message.".
The Artemis-console however is available so from this perspective the node seems to be reachable and working correctly.
Server-side (Artemis) we get the follow error messages which maybe related
"Inconsistency during compacting: RollbackRecord ID = 82113064851 for an already rolled back transaction during compacting", "Error on reading compacting for JournalFileImpl: (activemq-data-######.amq id = #####, recordID = #####)", "Bridge Failed to ack", "Cannot find add info ######## on compactor or current records",
We are also experiencing the following issues which may or may not be related:
1) initial distribution/Redistribution not working correctly
2) Reset of Address-setting to default values.
We use specific settings regarding i.e. DLA, redistributionDelay etc. but sometimes these settings seem to go back to their default values.
Is this problem reported more often or has this something to do with our setup of the cluster ? If the latter is true is there a way to fix it ?


> Client connection to Artemis-cluster node lost
> ----------------------------------------------
>
>                 Key: ARTEMIS-3277
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3277
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>    Affects Versions: 2.15.0
>            Reporter: Jelmer Marinus
>            Priority: Major
>
> We are running an Artemis-cluster with the following specs:
>  * Artemis version: 2.15.0 (runs in a Docker-container)
>  * Docker version: Docker version 20.10.3, build 48d30b5
>  * Docker image: vromero/activemq-artemis:2.15.0 (JVM: OpenJDK Runtime
> Environment (build 1.8.0_232-b09))
>  * OS-version: CentOS 7.9.2009 (x86_64)
>  * Cluster-setup: 3 primary nodes with 3 secondary nodes (symmetric, every
> address/queue is on every primary node. We have a one-to-one relation between
> addresses and queues, every address is ANYCAST with a single queue with the
> same name)
> Sometimes we are experiencing connection issues from the clients connecting
> to the cluster which results in messages not being consumed from or produced
> to the cluster. This seems to be specific to one single alternating master
> node at a time and a restart of the node seems to solve the problem. From
> the client we get this:
> {noformat}
> AMQ212037: Connection failure to X.X.X.X has been detected: AMQ219015: The
> connection was disconnected because of server shutdown
> [code=DISCONNECTED]{noformat}
> The Artemis console however is available so from this perspective the node
> seems to be reachable and working correctly.
> Server-side (Artemis) we get the following error messages which may be related:
> {noformat}
> Inconsistency during compacting: RollbackRecord ID = 82113064851 for an
> already rolled back transaction during compacting", "Error on reading
> compacting for JournalFileImpl: (activemq-data-######.amq id = #####,
> recordID = #####)", "Bridge Failed to ack", "Cannot find add info ######## on
> compactor or current records"{noformat}
> We are also experiencing the following issues which may or may not be related:
>  # initial distribution/Redistribution not working correctly
>  # Reset of Address-setting to default values. We use specific settings
> regarding e.g. DLA, redistributionDelay etc. but sometimes these settings
> seem to go back to their default values.
> Is this problem reported more often or has this something to do with our
> setup of the cluster? If the latter is true is there a way to fix it?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)