[
https://issues.apache.org/jira/browse/ARTEMIS-4305?focusedWorklogId=916585&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-916585
]
ASF GitHub Bot logged work on ARTEMIS-4305:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 26/Apr/24 08:24
Start Date: 26/Apr/24 08:24
Worklog Time Spent: 10m
Work Description: iiliev2 commented on PR #4899:
URL:
https://github.com/apache/activemq-artemis/pull/4899#issuecomment-2078890094
Initially I attempted what you suggest about lazily initializing the node id
like that, precisely because I wanted to keep the code changes to a minimum.
However, that ended up being much more complicated (rather than simpler),
because of the way `ClientSessionFactoryImpl` creates a new connection object
on re-connects. It is very hard to reason about, both when reading the code and
when debugging it at runtime. So instead of this, I had to fill in the
missing gaps to use the data that is already there anyway but wasn't being
propagated deep enough.
IMO, from a functional standpoint, adding the `TransportConfiguration` to the
connector (and connection) is the right thing to do here anyway. I assume that,
for historical reasons, those classes were working with a subset of the data, and
no one had a good reason to fix this until now. For example,
`NettyConnection#getConnectorConfig` was creating a bogus transport
configuration, even though a real configuration existed at creation time and
simply was not being passed in.
`Ping` is already the only `Packet` that is being modified. Why do you want
to use a raw `byte[]` rather than a `UUID`? IMO that would be more confusing - it
suggests that other kinds of data could be contained.
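To illustrate the idea being discussed, here is a minimal, hypothetical sketch (not actual Artemis code; all class and method names below are invented for illustration) of carrying the local node's `UUID` in a ping payload and checking it on the other side. If the id in an incoming ping no longer matches the id recorded when the connection was first established, the peer at that address has been replaced and the connection should be treated as dead rather than "came back":

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical sketch: encode a node id into a ping payload and verify it
// on receipt. A UUID is a fixed 16-byte value, which is why using the UUID
// type directly is less ambiguous than an arbitrary raw byte[].
public class PingNodeIdCheck {

    // Serialize a UUID into 16 bytes, as a ping payload might carry it.
    static byte[] encodeNodeId(UUID nodeId) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(nodeId.getMostSignificantBits());
        buf.putLong(nodeId.getLeastSignificantBits());
        return buf.array();
    }

    // Reconstruct the UUID from the 16-byte payload.
    static UUID decodeNodeId(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload);
        return new UUID(buf.getLong(), buf.getLong());
    }

    // True only if the ping came from the node we originally connected to.
    static boolean samePeer(UUID expected, byte[] pingPayload) {
        return expected.equals(decodeNodeId(pingPayload));
    }

    public static void main(String[] args) {
        UUID original = UUID.randomUUID();
        byte[] ping = encodeNodeId(original);
        System.out.println(samePeer(original, ping));          // same node
        System.out.println(samePeer(UUID.randomUUID(), ping)); // replaced node
    }
}
```

The point of the fixed-size `UUID` is exactly the one made above: a raw `byte[]` field suggests the payload could hold anything, whereas a `UUID` documents its single purpose.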
Issue Time Tracking
-------------------
Worklog Id: (was: 916585)
Time Spent: 0.5h (was: 20m)
> Zero persistence does not work in kubernetes
> --------------------------------------------
>
> Key: ARTEMIS-4305
> URL: https://issues.apache.org/jira/browse/ARTEMIS-4305
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Reporter: Ivan Iliev
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> In a cluster deployed in Kubernetes, when a node is destroyed, the process is
> terminated and the network is shut down before the process has a chance to
> close its connections. A new node might then be brought up, reusing the old
> node's IP. If this happens before the connection TTL expires, from Artemis'
> point of view it looks as if the connection came back. Yet it is actually not
> the same peer: the new one has a different node id, etc. This confuses the
> cluster, since the old message flow record is now invalid.
> One way to fix it could be for the {{Ping}} messages, which are typically used
> to detect dead connections, to carry some sort of connection id to verify that
> the other side really is the one it is supposed to be.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)