[
https://issues.apache.org/jira/browse/ARTEMIS-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301905#comment-17301905
]
Clebert Suconic commented on ARTEMIS-2870:
------------------------------------------
I just debugged a similar issue on ARTEMIS 2.9.0, and the issue was fixed by
ARTEMIS-2875.
I'm setting this as fixed with a link to ARTEMIS-2875.
> CORE connection failure sometimes doesn't cleanup sessions
> ----------------------------------------------------------
>
> Key: ARTEMIS-2870
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2870
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: Broker
> Affects Versions: 2.10.1, 2.14.0, 2.15.0
> Reporter: Markus Meierhofer
> Priority: Blocker
> Attachments: all_connections_list.png, artemis.log, broker.log,
> broker.xml, connection_nonexistent.png, consumer_list_for_one_queue.png,
> duplicated consumers.png, multiple_consumers_per_queue.png,
> session_with_connection_id.png, three consumers per queue.png
>
>
> h3. Summary
> Since upgrading our deployed Artemis instances from version 2.6.4 to
> 2.10.1, we have noticed that a connection failure sometimes doesn't
> clean up its associated sessions, leading to "zombie"
> consumers and producers on queues.
>
> h3. The issue
> Our Artemis clients are connected to the broker via the provided JMS
> abstraction, using the default connection TTL of 60 seconds. We are using
> both JMS Topics and JMS Queues.
> As most of our clients are mobile and on WiFi, connection losses may occur
> frequently, depending on the quality of the network. When a client is
> disconnected for 60 seconds, the broker usually closes the connection and
> cleans up all the sessions connected to it. The mobile clients then
> reconnect when they are online again. What we have noticed is that after many
> connection failures, messages may be sent twice to the mobile clients.
> When analyzing the problem on the broker console, we found out that there
> were two consumers connected to each of the queues one mobile client usually
> consumes from. One of them belonged to the new connection of the mobile
> Client, which is fine.
> The other consumer belonged to a session whose connection had already failed
> and been closed by that time. When analyzing the logs, we saw that they
> contained a "Connection failure to ... has been detected" line for these
> connections, but no subsequent "clearing up resources for session ..." log
> lines.
>
> h3. Instance of the issue
>
> The broken session is "7a9292cb-xxx" in the picture. In the logs you can
> see that the connection failure was detected, but the session was never
> cleared by the broker (mind the timestamp).
> !duplicated consumers.png!
> {code:java}
> [WARN 2020-07-27 14:33:29,794 Thread-13
> org.apache.activemq.artemis.core.client]: AMQ212037: Connection failure to
> /10.255.0.2:54812 has been detected: syscall:read(..) failed: Connection
> reset by peer [code=GENERIC_EXCEPTION]
> [WARN 2020-07-29 09:31:30,828 Thread-20
> org.apache.activemq.artemis.core.client]: AMQ212037: Connection failure to
> /10.255.0.2:55994 has been detected: AMQ229014: Did not receive data from
> /10.255.0.2:55994 within the 60,000ms connection TTL. The connection will now
> be closed. [code=CONNECTION_TIMEDOUT]
> {code}
>
> Attached you can find the full [^artemis.log] and our [^broker.xml]
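
The log check described in the report (an AMQ212037 "Connection failure" entry with no "clearing up resources for session" entry after it) could be automated roughly as follows. This is only a sketch under assumptions: it expects one log entry per line (unwrapped, unlike the quoted excerpt), it matches the cleanup entries only by the phrase quoted in the report because their full format is not shown there, and the function name and the `lookahead` parameter are hypothetical. Since cleanup entries are not correlated with a specific connection here, a cleanup line belonging to a different connection can mask a real problem, so any hits are a starting point for manual inspection rather than proof.

```python
import re

# Matches the AMQ212037 entries quoted in the report and captures the remote address.
FAILURE_RE = re.compile(r"AMQ212037: Connection failure to (/\S+) has been detected")

# Assumption: cleanup entries contain this phrase, as quoted in the report.
# Their full format is not shown there, so only the phrase itself is matched.
CLEANUP_MARKER = "clearing up resources for session"


def failures_without_cleanup(lines, lookahead=20):
    """Return (line_number, remote_address) for failures with no cleanup entry nearby.

    lookahead is a hypothetical tuning knob: how many of the following log
    entries are scanned for the cleanup phrase before a failure is flagged.
    """
    lines = list(lines)
    suspicious = []
    for i, line in enumerate(lines):
        match = FAILURE_RE.search(line)
        if not match:
            continue
        # Only the entries directly after the failure are considered.
        window = lines[i + 1 : i + 1 + lookahead]
        if not any(CLEANUP_MARKER in entry.lower() for entry in window):
            suspicious.append((i + 1, match.group(1)))
    return suspicious
```

It can be run against a broker log with something like `failures_without_cleanup(open("artemis.log", encoding="utf-8"))`, where the file name is the attachment mentioned above and the encoding is an assumption.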