We'll take a look at the NFS configuration.  Why was the primary server 
completely down when it was isolated from the network?  I configured 
<network-check-list> and enabled <network-check-ping-command> and 
<network-check-ping6-command>, so the primary server knew that the network was 
unhealthy, as shown in the log below:
[org.apache.activemq.artemis.logs] AMQ201001: Network is unhealthy, stopping 
service ActiveMQServerImpl

However, when we re-enabled the network card, the primary server was 
completely down.  I had to start the primary server manually.
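
To make the behaviour I expected concrete, here is a rough sketch of a 
ping-based network check.  It is not Artemis code.  The function names, the 
rule that an empty list counts as healthy, the rule that one reachable address 
is enough, and the "start again when the network returns" step are all my 
assumptions for illustration.  The ping flags assume Linux iputils.

```python
import subprocess


def ping_once(address, timeout_s=1):
    """Probe one address with the system ping.

    Assumes Linux iputils flags: -c is the packet count, -W the timeout in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def network_is_healthy(addresses, probe=ping_once):
    """Assumed rule: healthy if the list is empty or any address answers."""
    if not addresses:
        return True
    return any(probe(address) for address in addresses)


def next_action(broker_running, healthy):
    """Decide what a hypothetical monitor does on one check cycle."""
    if broker_running and not healthy:
        return "stop"   # matches the "Network is unhealthy, stopping service" log
    if not broker_running and healthy:
        return "start"  # assumed: the service comes back once the network does
    return "none"
```

In my test the "stop" step happened, but the "start" step did not: the primary 
stayed down after the network card was re-enabled.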

Regards,
Rahman

-----Original Message-----
From: Justin Bertram <jbert...@apache.org> 
Sent: Monday, February 28, 2022 10:15 AM
To: users@activemq.apache.org
Subject: Re: [EXTERNAL] Re: Artemis file locking not released

The backup and the live do have a direct connection. This allows the backup to 
share its connection details with the live. The live then takes those details 
and passes them on to clients so that the clients will know where to connect in 
case the live fails.

However, if this connection breaks it is *not* possible for the backup to 
simply "unlock" the journal and take over. The only entities which can unlock 
the journal are the live broker (which created the lock in the first place) or 
NFS itself (e.g. in the case of some kind of connectivity failure). 
If the lock is not being released when the live broker's NFS connectivity fails 
then I would suggest you have a problem with your NFS configuration.
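
As a rough illustration of why the backup can only wait, here is a minimal 
sketch of two parties contending for one exclusive file lock. It is an analogy 
only, not the broker's implementation: the broker uses Java file locks, while 
this sketch uses flock() on a POSIX system so that both parties can run in one 
process. The helper names and the lock file name are hypothetical.

```python
import fcntl
import os
import tempfile
import time


def try_lock(path):
    """Try to take an exclusive, non-blocking lock; return the fd or None."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return fd
    except BlockingIOError:
        # Someone else holds the lock; the waiter cannot force it open.
        os.close(fd)
        return None


def release(fd):
    """Release the lock, as the holder would on a clean shutdown."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


def wait_for_lock(path, attempts, delay=0.1):
    """Backup-style loop: poll until the lock is free or attempts run out."""
    for _ in range(attempts):
        fd = try_lock(path)
        if fd is not None:
            return fd
        time.sleep(delay)
    return None


if __name__ == "__main__":
    lock_path = os.path.join(tempfile.mkdtemp(), "live.lock")  # hypothetical name
    live_fd = try_lock(lock_path)  # the "live" takes the lock first
    print("backup locked while live holds it:", wait_for_lock(lock_path, 3) is not None)
    release(live_fd)  # only the holder (or the file system) can do this
    backup_fd = wait_for_lock(lock_path, 3)
    print("backup locked after release:", backup_fd is not None)
    release(backup_fd)
```

The waiter has no call that breaks the holder's lock. It succeeds only after 
the holder releases it or the file system drops it, which is why a lock that 
survives an NFS connectivity failure points at the NFS side.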


Justin

On Mon, Feb 28, 2022 at 6:55 AM Gunawan, Rahman (GSFC-703.H)[Halvik Corp] 
<rahman.guna...@nasa.gov.invalid> wrote:

> The backup server knew that the primary server had a problem.  Below is 
> an excerpt from the backup server's log:
> ERROR [org.apache.activemq.artemis.core.client] AMQ214016: Failed to 
> create netty connection: java.net.UnknownHostException
>
> Thus, I'm thinking that if the Artemis primary server loses its connection 
> to NFS or the network, the backup server could detect this, unlock the file 
> and take over.
> Please let me know if you have suggestions.
> Thanks
>
> Regards,
> Rahman
>
> -----Original Message-----
> From: Clebert Suconic <clebert.suco...@gmail.com>
> Sent: Saturday, February 26, 2022 9:27 AM
> To: users@activemq.apache.org
> Subject: [EXTERNAL] Re: Artemis file locking not released
>
> Could it be some configuration of the remote file system attributes?
>
> On Fri, Feb 25, 2022 at 12:03 PM Gunawan, Rahman (GSFC-703.H)[Halvik 
> Corp] <rahman.guna...@nasa.gov.invalid> wrote:
>
> > I'm using Artemis 2.19.1.  I'm using the shared file configuration and 
> > testing a scenario where the primary Artemis server is isolated from 
> > the network by disabling the network card.  Because the primary 
> > server lost communication with NFS, the file is never unlocked and the 
> > backup server waits for the lock indefinitely.  When we re-enable the 
> > network card on the primary server, the primary server is completely 
> > down.  Below is the primary server log:
> > "Reference Handler" Id=2 WAITING on java.lang.ref.Reference$Lock@64b6b3fc
> >         at java.lang.Object.wait(Native Method)
> >         -  waiting on java.lang.ref.Reference$Lock@64b6b3fc
> >         at java.lang.Object.wait(Object.java:502)
> >         at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
> >         at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
> >
> >
> >
> > ===============================================================================
> > End Thread dump
> >
> > Is this a bug in the Artemis shared file configuration?
> >
> > Regards,
> > Rahman
> >
> --
> Clebert Suconic
>
