Hi Gabor,

I got your point on using `getHostName()`. Thank you for such a detailed explanation.

What do you think about advertising rest.address instead? On YARN (at least in my environment), it is already set by YARN to the NM hostname, so rDNS would be avoided.
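
To make the idea concrete, here is a minimal sketch (not Flink's actual code). The class and the `advertisedAddress` helper are hypothetical, and `nm-host.example.com` and port 8081 are placeholder values. It prefers the configured rest.address and falls back to the bound IP only when rest.address is empty, so neither branch triggers a reverse DNS lookup:

```java
import java.net.InetSocketAddress;

public class AdvertisedAddressSketch {

    // Hypothetical helper, not Flink's actual code: prefer the configured
    // rest.address and fall back to the bound IP only when it is missing.
    public static String advertisedAddress(InetSocketAddress bindAddress, String restAddress) {
        if (restAddress != null && !restAddress.isEmpty()) {
            // Taken verbatim from configuration, so no reverse DNS lookup happens.
            return restAddress;
        }
        // getHostAddress() returns the textual IP and never queries DNS.
        return bindAddress.getAddress().getHostAddress();
    }

    public static void main(String[] args) {
        // A literal IP is parsed directly, so constructing this address does no lookup.
        InetSocketAddress bound = new InetSocketAddress("192.168.33.11", 8081);

        // Placeholder hostname standing in for the value YARN sets as rest.address.
        System.out.println(advertisedAddress(bound, "nm-host.example.com"));
        // With no rest.address configured, the bound IP is advertised.
        System.out.println(advertisedAddress(bound, null));
    }
}
```

With a hostname in rest.address, the advertised value would match a certificate issued for that hostname, which is the case I care about.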

Thanks,

Yaroslav


On 2025/08/15 21:12:58 Gabor Somogyi wrote:
> Hi Yaroslav,
>
> Thanks for your efforts in finding out all the details.
>
> I think making `getHostName` possible with a config + some additional
> warnings in the documentation can be considered.
> You need to evaluate your security standards, but you win something on one
> side and introduce a new attack vector on the other side.
>
> I would write something similar in the documentation, and I also suggest
> you consider these for your own situation as well:
> - rDNS is not trustworthy for security decisions. Attackers with control
> over PTR (or via poisoning/misconfig) can return arbitrary names.
> MITRE tracks this as CWE-350 [1] (Reliance on Reverse DNS for Security). If
> you base TLS host checks on rDNS, it’s bypassable.
> - Slow or failing DNS causes blocking delays (seconds) in JVM lookups.
> OpenJDK issues document repeated timeouts and lack of
> effective caching paths for some rDNS calls. Putting rDNS in critical paths
> (TLS, handshake, request handling) can amplify random outages.
>
> All in all I'm not yet convinced that this issue appears in other trending
> environments like k8s.
> Taking this together with the mentioned risks, I personally wouldn't merge
> it to the main repo.
>
> BR,
> G
>
> [1] https://cwe.mitre.org/data/definitions/350.html
>
>
> On Fri, Aug 15, 2025 at 7:44 PM Yaroslav Chernysh <ya...@gmail.com>
> wrote:
>
> > Hi Gabor,
> >
> > Thank you for such a quick response, I appreciate it.
> >
> > Actually, I'm not very good at all this security and networking stuff, so
> > I apologize in advance if I'm wrong in some statement.
> >
> > > Do YARN containers share the host’s network in your case?
> >
> > Yes, they do. And as far as I have researched, they always do, unless
> > you have configured YARN to use Docker containers
> > <https://hadoop.apache.org/docs/r3.4.1/hadoop-yarn/hadoop-yarn-site/DockerContainers.html>,
> > which is definitely not my case. I have also done some testing on my node,
> > which has 2 IP addresses:
> >
> > - With default rest.bind-address (set by YARN to Node Manager's hostname),
> > the only IP address that opens a port is the one that NM hostname is
> > resolved to. The other one (not sure where it comes from, this is a VM)
> > remains closed
> >
> > - With rest.bind-address set to 0.0.0.0, the port is open and accessible
> > via both IP addresses
> >
> > > However if you have a single IP then using 0.0.0.0 and binding it to
> > > lo + eth0 is something I wouldn't worry about.
> >
> > I got the point and basically I agree here, but I'm not sure how
> > future-proof this approach is. How probable is a scenario in which the
> > environment (single IP node) is changed (to a multi-homed node), but
> > unchanged configuration (still listening on 0.0.0.0) now leads to an
> > excessive network exposure? Either way, that's not my case. And I think
> > this is not restricted to YARN either: binding to all interfaces in a
> > Standalone deployment might be excessive as well.
> >
> > > but you still have control on firewall, right?
> >
> > Probably yes (saying for an average user). This would probably cover the
> > excessive binding leak, however only at the firewall level and not at the
> > "core". This adds a dependency on firewall. I'm not saying it's bad, but
> > rather that using the defense-in-depth approach and doing both limited
> > binding and adding firewall would be even better than relying on firewall
> > only.
> >
> > I hope all the above proves the point that even with a good enough
> > environment (number of IP addresses + firewall) it still does make sense to
> > restrict the binding. At least that's how I see this, please correct me if
> > I'm wrong.
> >
> > > introduce reverse DNS lookup as a must have feature
> >
> > Could we make it optional and disabled by default?
> >
> > Thanks,
> >
> > Yaroslav
> >
> > On 2025/08/14 21:32:40 Gabor Somogyi wrote:
> > > Hi Yaroslav,
> > >
> > > First of all I would like to understand why you think binding to 0.0.0.0 is
> > > less secure in your case. Correct me if I'm wrong:
> > > Do YARN containers share the host’s network in your case? On a
> > > multi-homed node, 0.0.0.0 exposes on every host interface,
> > > which can be less secure than binding to a specific host IP. So in this case
> > > pinning can matter.
> > >
> > > However if you have a single IP then using 0.0.0.0 and binding it to lo +
> > > eth0 is something I wouldn't worry about.
> > > Like a "normal" kubernetes pod (default networking, single interface, no
> > > hostNetwork) has no such issue.
> > >
> > > As a general remark. Let's say you expose the REST endpoint on 2 IP
> > > addresses but you still have control on firewall, right?
> > >
> > > The main reason why I'm asking these questions is because using
> > > `getHostName` would introduce reverse DNS lookup as a must have feature.
> > > That could cause quite some turbulence for heavy users through additional
> > > traffic, PTR records can be wrong or spoofed, etc...
> > >
> > > BR,
> > > G
> > >
> > >
> > > On Thu, Aug 14, 2025 at 8:13 PM Yaroslav Chernysh <ya...@gmail.com>
> > > wrote:
> > >
> > > > Hi Flink community,
> > > >
> > > > Is there a particular reason to advertise Job Manager's REST endpoint
> > > > address in the form of an IP address instead of a hostname? More precisely,
> > > > I'm talking about this code block
> > > > <https://github.com/apache/flink/blob/release-2.0.0/flink-runtime/src/main/java/org/apache/flink/runtime/rest/RestServerEndpoint.java#L298-L304>
> > > > in RestServerEndpoint.java:
> > > >
> > > >     final InetSocketAddress bindAddress =
> > > >             (InetSocketAddress) serverChannel.localAddress();
> > > >     final String advertisedAddress;
> > > >     if (bindAddress.getAddress().isAnyLocalAddress()) {
> > > >         advertisedAddress = this.restAddress;
> > > >     } else {
> > > >         advertisedAddress = bindAddress.getAddress().getHostAddress();
> > > >     }
> > > >
> > > > That is (as far as I understood), if rest.bind-address is set to the
> > > > 0.0.0.0 wildcard (which means binding to all available interfaces), then
> > > > the advertised address will be the value of rest.address. Otherwise, the
> > > > specified rest.bind-address will be advertised in the form of an IP
> > > > address.
> > > > What if I want to bind the REST endpoint to some specific address (for
> > > > security reasons), but at the same time advertise it in the form of
> > > > hostname? Assuming that all the name resolution things work correctly.
> > > >
> > > > For me particularly, the problem this creates is with SSL. The certificate
> > > > I have for the Job Manager (REST connectivity) is created with a hostname
> > > > and not an IP address. I run Flink on YARN and this way the default value
> > > > for rest.bind-address is Node Manager's hostname (thus, not the 0.0.0.0
> > > > wildcard), and the same goes for rest.address. This way, the advertised
> > > > address is in the form of an IP address. I'd like to access Flink's UI via
> > > > the YARN Resource Manager proxy ("Tracking URL" in the application page)
> > > > that has the Job Manager's certificate in its truststore. However, because
> > > > Flink is advertised to Resource Manager with the IP address while the
> > > > certificate holds the hostname, the connection from Resource Manager to
> > > > Job Manager fails with:
> > > >
> > > > javax.net.ssl.SSLPeerUnverifiedException: Certificate for <192.168.33.11>
> > > > doesn't match any of the subject alternative names: []
> > > >
> > > > The only way I can fix this (without code changes) is by explicitly
> > > > setting rest.bind-address to 0.0.0.0, which is not secure, as far as I
> > > > understand (less secure than binding to a specific address).
> > > > However, if I substitute the getHostAddress() call in the code block above
> > > > with getHostName(), the issue is gone.
> > > >
> > > > So, my question is: is there any particular reason not to
> > > > use getHostName() here (assuming hostname is available)?
> > > >
> > > > Thanks,
> > > > Yaroslav
> > > >
> > >
> >
>
