Hi Gabor,
Thank you for such a quick response, I appreciate it.
Actually, I'm not very good at all this security and networking stuff,
so I apologize in advance if I'm wrong in some statement.
> Does YARN containers share the host’s network in your case?
Yes, they do. As far as I can tell, they always do; the only exception
I know of is when YARN is configured to use Docker containers
<https://hadoop.apache.org/docs/r3.4.1/hadoop-yarn/hadoop-yarn-site/DockerContainers.html>,
which is definitely not my case. I have also done some testing on my
node, which has 2 IP addresses:
- With the default rest.bind-address (set by YARN to the Node Manager's
hostname), the port is open only on the IP address that the NM hostname
resolves to. On the other address (not sure where it comes from, this
is a VM) the port remains closed.
- With rest.bind-address set to 0.0.0.0, the port is open and accessible
via both IP addresses.
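
For reference, the difference between the two settings can be reproduced outside Flink with a minimal Java sketch. It assumes plain java.net sockets rather than Flink's actual REST server, and the class and method names (BindScopeDemo, listensOnAllInterfaces) are made up for illustration:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper for illustration only; not part of Flink or YARN.
final class BindScopeDemo {

    // Binds an ephemeral port (port 0) on the given address and reports
    // whether the resulting socket listens on all interfaces.
    static boolean listensOnAllInterfaces(String bindAddress) {
        try (ServerSocket server = new ServerSocket()) {
            server.bind(new InetSocketAddress(InetAddress.getByName(bindAddress), 0));
            // The wildcard address means "every interface of this host".
            return server.getInetAddress().isAnyLocalAddress();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("0.0.0.0   -> all interfaces: " + listensOnAllInterfaces("0.0.0.0"));
        System.out.println("127.0.0.1 -> all interfaces: " + listensOnAllInterfaces("127.0.0.1"));
    }
}
```

A socket bound to a specific address is reachable only through that address, which matches what I observed with the default rest.bind-address.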
> However if you have a single IP then using 0.0.0.0 and binding it
> to lo + eth0 is something what I wouldn't worry about.
I get the point and basically agree, but I'm not sure how future-proof
this approach is. How likely is a scenario in which the environment
changes (a single-IP node becomes multi-homed) while the configuration
does not (still listening on 0.0.0.0), so that it now leads to excessive
network exposure? Either way, that's not my case. I also think this is
not restricted to YARN: binding to all interfaces in a Standalone
deployment might be excessive as well.
> but you still have control on firewall, right?
Probably yes (speaking for an average user). A firewall would probably
cover the exposure caused by overly broad binding, but only at the
firewall level and not at the "core". It also adds a dependency on the
firewall. I'm not saying that's bad, only that a defense-in-depth
approach, with both restricted binding and a firewall, would be even
better than relying on the firewall alone.
I hope all of the above shows that even in a good enough environment
(number of IP addresses + firewall) it still makes sense to restrict
the binding. At least that's how I see it; please correct me if I'm
wrong.
> introduce reverse DNS lookup as a must have feature
Could we make it optional and disabled by default?
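
To make the idea concrete, here is a minimal sketch of what I mean. It mirrors the branch from RestServerEndpoint.java quoted below, but the boolean switch advertiseHostName and the class name are hypothetical; no such Flink option exists as far as I know:

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical sketch; "advertiseHostName" is an assumed opt-in switch,
// not an existing Flink configuration option.
final class AdvertisedAddressSketch {

    static String advertisedAddress(
            InetSocketAddress bindAddress, String restAddress, boolean advertiseHostName) {
        InetAddress address = bindAddress.getAddress();
        if (address.isAnyLocalAddress()) {
            // Wildcard bind: fall back to the configured rest.address.
            return restAddress;
        }
        // Disabled by default: keep advertising the IP address. When enabled,
        // getHostName() may perform a reverse lookup if the InetAddress was
        // created from an IP literal rather than from a host name.
        return advertiseHostName ? address.getHostName() : address.getHostAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // getByAddress(host, bytes) stores the given name without any DNS lookup;
        // the host name here is an example value.
        InetAddress jm = InetAddress.getByAddress(
                "jm.example.internal", new byte[] {(byte) 192, (byte) 168, 33, 11});
        InetSocketAddress bind = new InetSocketAddress(jm, 8081);
        System.out.println(advertisedAddress(bind, "rest.example.internal", false));
        System.out.println(advertisedAddress(bind, "rest.example.internal", true));
    }
}
```

With the switch off, the behavior stays exactly as it is today, so existing users would not see any additional DNS traffic.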
Thanks,
Yaroslav
On 2025/08/14 21:32:40 Gabor Somogyi wrote:
> Hi Yaroslav,
>
> First of all I would like to understand why you think binding to 0.0.0.0 is
> less secure in your case. Correct me if I'm wrong:
> Does YARN containers share the host’s network in your case? On a
> multi-homed node, 0.0.0.0 exposes on every host interface,
> which can be less secure than binding to a specific host IP. So this case
> pinning can matter.
>
> However if you have a single IP then using 0.0.0.0 and binding it to lo +
> eth0 is something what I wouldn't worry about.
> Like a "normal" kubernetes pod (default networking, single interface, no
> hostNetwork) has no such issue.
>
> As a general remark. Let's say you expose the REST endpoint on 2 IP
> addresses but you still have control on firewall, right?
>
> The main reason why I'm asking these questions is because using
> `getHostName` would introduce reverse DNS lookup as a must have feature.
> That could cause quite some turbulences at heavy users by additional
> traffic, PTR records can be wrong or spoofed, etc...
>
> BR,
> G
>
>
> On Thu, Aug 14, 2025 at 8:13 PM Yaroslav Chernysh <ya...@gmail.com>
> wrote:
>
> > Hi Flink community,
> >
> > Is there a particular reason to advertise Job Manager's REST endpoint
> > address in a form of IP address instead of hostname? More precisely, I'm
> > talking about this code block
> > <https://github.com/apache/flink/blob/release-2.0.0/flink-runtime/src/main/java/org/apache/flink/runtime/rest/RestServerEndpoint.java#L298-L304>
> > in RestServerEndpoint.java:
> >
> >     final InetSocketAddress bindAddress =
> >             (InetSocketAddress) serverChannel.localAddress();
> >     final String advertisedAddress;
> >     if (bindAddress.getAddress().isAnyLocalAddress()) {
> >         advertisedAddress = this.restAddress;
> >     } else {
> >         advertisedAddress = bindAddress.getAddress().getHostAddress();
> >     }
> >
> > That is (as far as I understood), if rest.bind-address is set to the
> > 0.0.0.0 wildcard (which means binding to all available interfaces), then
> > the advertised address will be the value of rest.address. Otherwise, an
> > address in a form of IP address of the specified rest.bind-address will be
> > used.
> > What if I want to bind the REST endpoint to some specific address (for
> > security reasons), but at the same time advertise it in the form of
> > hostname? Assuming that all the name resolution things work correctly.
> >
> > For me particularly, the problem this creates is with SSL. The certificate
> > I have for the Job Manager (REST connectivity) is created with a hostname
> > and not an IP address. I run Flink on YARN and this way the default value
> > for rest.bind-address is Node Manager's hostname (thus, not the 0.0.0.0
> > wildcard), and the same goes for rest.address. This way, the advertised
> > address is in the form of an IP address. I'd like to access Flink's UI via
> > the YARN Resource Manager proxy ("Tracking URL" in the application page)
> > that has the Job Manager's certificate in its truststore. However, due to
> > the Flink being advertised to Resource Manager with the IP address and the
> > certificate holds the hostname, the connection from Resource Manager to Job
> > Manager fails with:
> >
> > javax.net.ssl.SSLPeerUnverifiedException: Certificate for <192.168.33.11>
> > doesn't match any of the subject alternative names: []
> >
> > The only way I can fix this (without code changes) is by explicitly
> > setting rest.bind-address to 0.0.0.0, which is not secure, as far as I
> > understand (less secure than binding to a specific address).
> > However, if I substitute the getHostAddress() call in the code block above
> > with the getHostName(), the issue is gone.
> >
> > So, my question is: is there any particular reason not to
> > use getHostName() here (assuming hostname is available)?
> >
> > Thanks,
> > Yaroslav
> >
>