On Fri, Feb 9, 2024 at 4:45 PM Mark Thomas <ma...@apache.org> wrote:

> Using 0.0.0.0 as the address for the receiver is going to cause
> problems. I see similar issues with 11.0.x as 8.5.x. I haven't dug too
> deeply into things as a) I am short of time and b) I'm not convinced
> this should/could work anyway.
>
> What seems to happen is that the use of 0.0.0.0 confuses the cluster as
> to which node is which - I think because multiple nodes are using
> 0.0.0.0. That causes the failure of the initial state synchronisation.
>

Yes, this was indeed the problem. I chose 0.0.0.0 because binding to the
host's IP address threw the following error:

> 01-Mar-2024 22:30:32.315 SEVERE [main]
> org.apache.catalina.tribes.transport.nio.NioReceiver.start Unable to start
> cluster receiver
>  java.net.BindException: Cannot assign requested address

The full stack trace is available in my previous mail.

To identify the problem, I ran my application outside the container, where
I did not encounter the above error. This led me to investigate the Docker
side of things. By default, a Docker container is attached to a bridge
network and has its own network namespace, so the host's IP address is not
assigned to any interface inside the container. Binding to it from inside
the container is therefore not possible, even when the receiver port has
been correctly mapped.

I was able to get it to work by passing the --network=host flag to my
docker create command. This puts the container in the host's network
namespace, essentially de-containerizing its networking. Although this
works, it is not desirable: every port the container listens on is exposed
directly on the host, which increases the attack surface and makes
debugging harder.

0.0.0.0 is a natural choice and is used by a lot of applications running on
Docker; even the official Tomcat image on Docker Hub does so. I am no
expert on Docker or Tomcat, but I don't think this is ideal. Docker has
become so ubiquitous that I couldn't imagine deploying without it, but
using clustering makes me lose some of its benefits. I have not looked into
it, but this might also impact the BackupManager, because it also requires
a Receiver element.
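
For anyone who wants to reproduce the bind failure without starting Tomcat,
here is a minimal sketch in plain Java. The class and method names
(BindCheck, isAssignedLocally, tryBind) are made up for illustration and
are not part of Tomcat or Tribes. It only checks whether an address belongs
to an interface visible to the JVM and then attempts a bind on an ephemeral
port. I would expect the host's IP to fail both checks inside a
bridge-networked container, but I have only reasoned about that, not
measured it across setups.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NetworkInterface;
import java.net.ServerSocket;

// Hypothetical diagnostic helper; not part of Tomcat.
public class BindCheck {

    /** True if the address is a wildcard or is owned by an interface visible to this JVM. */
    static boolean isAssignedLocally(InetAddress addr) throws IOException {
        return addr.isAnyLocalAddress() || NetworkInterface.getByInetAddress(addr) != null;
    }

    /** Tries to bind an ephemeral port on the address. Returns null on success, else the error message. */
    static String tryBind(InetAddress addr) {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress(addr, 0));
            return null;
        } catch (IOException e) {
            // "Cannot assign requested address" typically surfaces here as a BindException.
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the address you plan to use for the Receiver, e.g. the host's IP.
        InetAddress addr = InetAddress.getByName(args.length > 0 ? args[0] : "127.0.0.1");
        System.out.println("assigned locally: " + isAssignedLocally(addr));
        String error = tryBind(addr);
        System.out.println(error == null ? "bind OK" : "bind failed: " + error);
    }
}
```

Running this once on the host and once inside the container, with the
host's IP as the argument, should show whether the address is visible in
each environment.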

On Mon, Feb 12, 2024 at 8:52 PM Christopher Schultz <
ch...@christopherschultz.net> wrote:

> If this is known to essentially always not-work... should we log
> something at startup?

I think this is the least that we could do, and I am willing to work on it.
However, I also think that this should be looked into more deeply to solve
the actual problem.
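
As a starting point for the startup warning, here is a rough sketch of the
check itself. It is not a patch against Tomcat: ReceiverAddressCheck and
wildcardWarning are hypothetical names, the message wording is only a
suggestion, and where such a check would live in the Tribes code is an open
question. The only real API it relies on is
InetAddress.isAnyLocalAddress().

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Optional;
import java.util.logging.Logger;

// Hypothetical sketch of a startup check; not existing Tomcat code.
public class ReceiverAddressCheck {

    private static final Logger log = Logger.getLogger(ReceiverAddressCheck.class.getName());

    /** Returns a warning message if the configured address is a wildcard such as 0.0.0.0. */
    static Optional<String> wildcardWarning(String address) {
        try {
            if (InetAddress.getByName(address).isAnyLocalAddress()) {
                return Optional.of("Cluster receiver address [" + address
                        + "] is a wildcard address; other nodes may be unable to tell"
                        + " members apart and state synchronisation may fail");
            }
        } catch (UnknownHostException e) {
            // Unresolvable names are left to the normal bind error path.
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String address = args.length > 0 ? args[0] : "0.0.0.0";
        wildcardWarning(address).ifPresent(msg -> log.warning(msg));
    }
}
```

Returning the message instead of logging inside the check keeps it easy to
test and leaves the choice of logger and log level to whoever integrates
it.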

I understand that this discussion might be better suited to the dev mailing
list. Please let me know if you think the above holds merit, and I will
move it there.

Sincerely,
Manak Bisht
