[jira] [Commented] (ARTEMIS-2916) Two servers becoming Live using JDBC Shared Store

Apache Dev (Jira) Mon, 28 Sep 2020 08:57:43 -0700


    [ 
https://issues.apache.org/jira/browse/ARTEMIS-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203323#comment-17203323
 ]


Apache Dev commented on ARTEMIS-2916:
-------------------------------------

Thanks for your support.

I just attached log snippets of the issue. Unfortunately they are not debug 
logs, because the issue has not been reproduced since.

A couple more details:
 * the broker is embedded
 * JGroups JDBC ping is used for discovery

The issue happened at [9/23/20 4:34:37:611 CEST]:
 * broker1: backup (slave) broker currently acting as Live
 * broker2: backup broker becoming Live at [9/23/20 4:34:37:611 CEST] while 
broker1 is still Live
 * broker3: idle backup broker

I tried to simulate scenarios where the Live broker loses its lock: the broker 
notices and shuts itself down. However, when the issue happened, broker1 did 
not notice that another broker had become Live.
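
As a minimal sketch of the behaviour expected in those simulations (this is 
not Artemis code: the class and method names are hypothetical, and an 
in-memory object stands in for the shared JDBC lock row), a Live broker that 
fails to renew its lease should demote itself on its next lock check:

```java
final class LeaseLockSketch {

    /** Hypothetical in-memory stand-in for the shared lock row. */
    static final class SharedLease {
        private String holder;        // id of the broker holding the lease
        private long expiresAtMillis; // after this time the lease can be taken over

        /** Acquire the lease only if it is free or has expired. */
        synchronized boolean tryAcquire(String brokerId, long now, long leaseMillis) {
            if (holder == null || now >= expiresAtMillis) {
                holder = brokerId;
                expiresAtMillis = now + leaseMillis;
                return true;
            }
            return false;
        }

        /** Renew only if the caller still holds an unexpired lease. */
        synchronized boolean renew(String brokerId, long now, long leaseMillis) {
            if (brokerId.equals(holder) && now < expiresAtMillis) {
                expiresAtMillis = now + leaseMillis;
                return true;
            }
            return false;
        }
    }

    /** Hypothetical broker that is either Live or backup. */
    static final class Broker {
        final String id;
        boolean live;

        Broker(String id) { this.id = id; }

        /** One lock check: a Live broker that cannot renew must stop being Live. */
        void tick(SharedLease lease, long now, long leaseMillis) {
            if (live) {
                if (!lease.renew(id, now, leaseMillis)) {
                    live = false; // lock lost: demote (a real broker would shut down)
                }
            } else {
                live = lease.tryAcquire(id, now, leaseMillis);
            }
        }
    }

    /** Replays a stalled-renewal scenario with a simulated clock. */
    static String runScenario() {
        final long leaseMillis = 10_000;
        SharedLease lease = new SharedLease();
        Broker broker1 = new Broker("broker1");
        Broker broker2 = new Broker("broker2");

        broker1.tick(lease, 0, leaseMillis);      // broker1 becomes Live
        broker2.tick(lease, 0, leaseMillis);      // broker2 stays backup
        broker1.tick(lease, 5_000, leaseMillis);  // broker1 renews, lease valid until 15 s
        broker2.tick(lease, 5_000, leaseMillis);  // still backup

        // broker1 stalls and misses its renewals; the lease expires at 15 s.
        broker2.tick(lease, 20_000, leaseMillis); // broker2 takes over
        boolean overlap = broker1.live && broker2.live; // both believe they are Live

        broker1.tick(lease, 21_000, leaseMillis); // broker1 checks again and demotes
        String liveNow = (broker1.live ? "broker1 " : "") + (broker2.live ? "broker2" : "");
        return "overlap=" + overlap + ", live=" + liveNow.trim();
    }

    public static void main(String[] args) {
        System.out.println(runScenario());
    }
}
```

In this sketch the two brokers are both Live only until broker1 runs its next 
lock check. In the reported incident that check apparently never demoted 
broker1, so the overlap persisted.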

 

> Two servers becoming Live using JDBC Shared Store
> -------------------------------------------------
>
>                 Key: ARTEMIS-2916
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2916
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.13.0
>            Reporter: Apache Dev
>            Priority: Critical
>         Attachments: logs.zip
>
>
> We have a scenario similar to the one described in ARTEMIS-2421, but using:
>  * Artemis 2.13.0
>  * JDBC Shared Store
>  * 1 Master currently down
>  * 3 Slave
>  ** 1 Live
>  ** 2 Backup
> All 3 slaves are configured with:
> {code:xml}
> <ha-policy>
>    <shared-store>
>       <slave>
>          <allow-failback>false</allow-failback>
>          <failover-on-shutdown>true</failover-on-shutdown>
>       </slave>
>    </shared-store>
> </ha-policy>
> {code}
>  
> After 2 days of activity with a single slave working as live, one of the 
> backup slaves suddenly became live too while the original live server was 
> still working. No warnings or errors were logged. The backup server just 
> started creating the configured addresses and queues and starting connectors, 
> then it logged "AMQ221010: Backup Server is now live". 
> In the meantime, the third slave broker started to log continuously:
> {noformat}
> AMQ212034: There are more than one servers on the network broadcasting the 
> same node id. You will see this message exactly once (per node) if a node is 
> restarted, in which case it can be safely ignored. But if it is logged 
> continuously it means you really do have more than one node on the same 
> network active concurrently with the same node id. This could occur if you 
> have a backup node active at the same time as its live node. nodeID=...
> {noformat} 
> The final scenario was:
>  * 1 Master down
>  * 3 Slave
>  ** 2 Live
>  ** 1 Backup 
> I see that ARTEMIS-2421 was fixed only for the filesystem use case. Should it 
> be fixed for JDBC too?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
