[
https://issues.apache.org/jira/browse/ARTEMIS-4305?focusedWorklogId=931156&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-931156
]
ASF GitHub Bot logged work on ARTEMIS-4305:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 21/Aug/24 14:45
Start Date: 21/Aug/24 14:45
Worklog Time Spent: 10m
Work Description: jbertram commented on PR #4899:
URL: https://github.com/apache/activemq-artemis/pull/4899#issuecomment-2302231138
> No, `MessageFlowRecordImpl` becoming bad means that the peer broker holds
> on to an instance of it which should have been discarded.

As far as I can tell the `AMQ222139` message is a direct symptom of the
problem you describe. I don't think that message would occur if the broker
wasn't holding an instance of `MessageFlowRecordImpl` which should have been
discarded. Therefore, I think we're talking about the same essential problem.

> We were already running that earlier. That resulted in other bugs with the
> topology.

If that's the case then I think that is the bug we should be attempting to
fix. For your particular use-case I think you should definitely be using `0`
for `reconnect-attempts`.

> We don't want to go back to a tried configuration which we know didn't
> work.

Your current configuration with `reconnect-attempts` > `0` _also_ doesn't
work, so I'm not clear what the difference is here. If neither configuration
works then you should use the configuration which is actually appropriate for
your use-case and fix whatever bugs (if any) exist for that configuration
rather than fixing a bug for a configuration that isn't recommended.

As it stands, the recommended configuration fixes the problem with the
test-case in this PR so this fix is not valid and won't be merged.

Issue Time Tracking
-------------------
Worklog Id: (was: 931156)
Time Spent: 2h 20m (was: 2h 10m)
> Zero persistence does not work in kubernetes
> --------------------------------------------
>
> Key: ARTEMIS-4305
> URL: https://issues.apache.org/jira/browse/ARTEMIS-4305
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Reporter: Ivan Iliev
> Priority: Major
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> In a cluster deployed in Kubernetes, when a node is destroyed the process is
> terminated and the network is shut down before the process has a chance to
> close its connections. A new node might then be brought up, reusing the old
> node's IP. If this happens before the connection TTL expires, then from
> Artemis' point of view it looks as if the connection came back. Yet it is
> actually not the same peer: it has a new node ID, etc. This confuses the
> cluster, because the old message flow record is now invalid.
> One way to fix it could be for the {{Ping}} messages, which are typically
> used to detect dead connections, to carry some sort of connection ID so that
> the receiver can check that the other side really is the peer it is supposed
> to be.
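
The connection-ID idea in the description above could look roughly like the
following Java sketch. It is illustrative only: `PeerIdentityCheck`, its
fields, and the notion of a ping carrying a node ID plus a connection ID are
assumptions made for this example, not part of the Artemis code base or its
wire protocol.

```java
import java.util.Objects;

/**
 * Hypothetical sketch, not Artemis code: remembers the identity of the peer
 * seen when the connection was established and compares it with the identity
 * carried by each later ping.
 */
final class PeerIdentityCheck {

    // Identity recorded when the connection was first established (assumed fields).
    private final String expectedNodeId;
    private final String expectedConnectionId;

    PeerIdentityCheck(String expectedNodeId, String expectedConnectionId) {
        this.expectedNodeId = Objects.requireNonNull(expectedNodeId);
        this.expectedConnectionId = Objects.requireNonNull(expectedConnectionId);
    }

    /**
     * Returns true only if the ping comes from the same node over the same
     * logical connection. A new node that reuses the old IP would present a
     * different node ID and connection ID, so it would not match.
     */
    boolean matches(String pingNodeId, String pingConnectionId) {
        return expectedNodeId.equals(pingNodeId)
            && expectedConnectionId.equals(pingConnectionId);
    }

    public static void main(String[] args) {
        PeerIdentityCheck check = new PeerIdentityCheck("node-a", "conn-1");

        // Ping from the original peer.
        System.out.println(check.matches("node-a", "conn-1"));

        // Ping from a replacement node that reused the old node's IP.
        System.out.println(check.matches("node-b", "conn-2"));
    }
}
```

On a mismatch, the receiver could treat the connection as dead and discard
the state tied to the old peer instead of resuming it.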