[ 
https://issues.apache.org/jira/browse/ARTEMIS-4305?focusedWorklogId=931156&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-931156
 ]

ASF GitHub Bot logged work on ARTEMIS-4305:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Aug/24 14:45
            Start Date: 21/Aug/24 14:45
    Worklog Time Spent: 10m 
      Work Description: jbertram commented on PR #4899:
URL: 
https://github.com/apache/activemq-artemis/pull/4899#issuecomment-2302231138

   > No, `MessageFlowRecordImpl` becoming bad means that the peer broker holds 
on to an instance of it which should have been discarded.
   
   As far as I can tell the `AMQ222139` message is a direct symptom of the 
problem you describe. I don't think that message would occur if the broker 
wasn't holding an instance of `MessageFlowRecordImpl` which should have been 
discarded. Therefore, I think we're talking about the same essential problem.
   
   > We were already running that earlier. That resulted in other bugs with the 
topology.
   
   If that's the case then I think that is the bug we should be attempting to 
fix. For your particular use-case I think you should definitely be using `0` 
for `reconnect-attempts`.
   
   > We don't want to go back to a tried configuration which we know didn't 
work.
   
   Your current configuration with `reconnect-attempts` > `0` _also_ doesn't 
work, so I'm not clear what the difference is here. If neither configuration 
works then you should use the configuration which is actually appropriate for 
your use-case and fix whatever bugs (if any) exist for that configuration 
rather than fixing a bug for a configuration that isn't recommended.
   
   As it stands, the recommended configuration fixes the problem with the 
test-case in this PR so this fix is not valid and won't be merged.
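
For reference, `reconnect-attempts` is set per cluster connection in 
`broker.xml`. The fragment below is a minimal sketch of the setting discussed 
above, not the reporter's actual configuration. The cluster connection name 
`my-cluster` and the connector name `netty-connector` are placeholders, and 
the other elements a real cluster connection needs (such as discovery or 
static connectors) are omitted.

```xml
<!-- Fragment of broker.xml; only reconnect-attempts is the setting under discussion. -->
<cluster-connections>
   <cluster-connection name="my-cluster">             <!-- placeholder name -->
      <connector-ref>netty-connector</connector-ref>   <!-- placeholder connector -->
      <!-- 0 = do not try to reconnect to a cluster node after its connection fails -->
      <reconnect-attempts>0</reconnect-attempts>
   </cluster-connection>
</cluster-connections>
```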




Issue Time Tracking
-------------------

    Worklog Id:     (was: 931156)
    Time Spent: 2h 20m  (was: 2h 10m)

> Zero persistence does not work in kubernetes
> --------------------------------------------
>
>                 Key: ARTEMIS-4305
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-4305
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>            Reporter: Ivan Iliev
>            Priority: Major
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> In a cluster deployed in Kubernetes, when a node is destroyed it terminates 
> the process and shuts down the network before the process has a chance to 
> close connections. Then a new node might be brought up, reusing the old 
> node’s IP. If this happens before the connection TTL expires, then from 
> Artemis’ point of view it looks as if the connection came back. Yet it is 
> actually not the same connection: the peer has a new node id, etc. This 
> messes things up with the cluster because the old message flow record is 
> invalid.
> One way to fix it could be if the {{Ping}} messages which are typically used 
> to detect dead connections could use some sort of connection id to verify 
> that the other side is really the one it is supposed to be.
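
The idea in the issue description (a ping that carries an identifier, so a 
reused IP cannot pass for the old peer) can be illustrated with a small 
standalone sketch. This is not Artemis code and does not reflect how the 
Artemis `Ping` packet is implemented. The class `PingIdentityCheck`, the 
nested `PeerRecord`, and the method `onPing` are all hypothetical names 
invented for this illustration.

```java
import java.util.Objects;

// Hypothetical sketch only: none of these types exist in Artemis.
public class PingIdentityCheck {

    /** What the local broker remembers about the peer behind one connection. */
    static final class PeerRecord {
        private final String expectedNodeId;
        private boolean valid = true;

        PeerRecord(String expectedNodeId) {
            this.expectedNodeId = Objects.requireNonNull(expectedNodeId);
        }

        /**
         * Handles a ping that carries the sender's node id (an assumed
         * extension of the ping, as proposed in the issue description).
         * Returns true while the record is still valid. A mismatching id
         * marks the record stale permanently, so it should be discarded.
         */
        boolean onPing(String senderNodeId) {
            if (!expectedNodeId.equals(senderNodeId)) {
                valid = false; // same address, different broker
            }
            return valid;
        }

        boolean isValid() {
            return valid;
        }
    }

    public static void main(String[] args) {
        PeerRecord record = new PeerRecord("node-a");

        // The original peer is still there: the record stays valid.
        System.out.println(record.onPing("node-a"));

        // A new broker came up on the old IP with a different node id:
        // the record is marked stale instead of being reused.
        System.out.println(record.onPing("node-b"));
    }
}
```

In this sketch a stale record never becomes valid again, which matches the 
description above: the old message flow record should be dropped and a new 
one created for the new peer.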



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
For further information, visit: https://activemq.apache.org/contact

