Hi Gabor,
Thanks, let's add a new option then to make the advertised address
configurable, and document the default behavior. Would you mind filing a
ticket for that?
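
To make the proposal concrete, here is a minimal sketch of how such an option could slot into the selection logic quoted further down in this thread. The option name `rest.advertised-address`, the class name and the method signature are all hypothetical; nothing here is existing Flink API. When the option is unset, the behavior is meant to stay as it is today.

```java
import java.net.InetSocketAddress;
import java.util.Optional;

/** Sketch only: "rest.advertised-address" is a hypothetical option, not an existing Flink setting. */
final class AdvertisedAddressSketch {

    /**
     * Picks the address to advertise for the REST endpoint.
     *
     * @param bindAddress the socket address the server actually bound to
     * @param restAddress the value of rest.address
     * @param configuredAdvertisedAddress value of the hypothetical option, empty if unset
     */
    static String resolveAdvertisedAddress(
            InetSocketAddress bindAddress,
            String restAddress,
            Optional<String> configuredAdvertisedAddress) {
        // Assumed new behavior: an explicit value wins, so no reverse DNS lookup is needed.
        if (configuredAdvertisedAddress.isPresent()) {
            return configuredAdvertisedAddress.get();
        }
        // Default behavior, mirroring the snippet from RestServerEndpoint quoted in this thread.
        if (bindAddress.getAddress().isAnyLocalAddress()) {
            return restAddress;
        }
        return bindAddress.getAddress().getHostAddress();
    }

    public static void main(String[] args) {
        // An IP literal is parsed directly, so constructing this address does no DNS lookup.
        InetSocketAddress specific = new InetSocketAddress("192.168.33.11", 8081);
        String host = "nm-host.example.com"; // placeholder hostname for illustration

        // Option unset: the bound IP address is advertised, as today.
        System.out.println(resolveAdvertisedAddress(specific, host, Optional.empty()));
        // Option set: the configured hostname is advertised instead.
        System.out.println(resolveAdvertisedAddress(specific, host, Optional.of(host)));
    }
}
```

With the option set to the hostname that the certificate was issued for, the bind address can stay specific while the advertised address matches the certificate.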
Regards,
Yaroslav
On 2025/08/18 13:41:54 Gabor Somogyi wrote:
> Hi Yaroslav,
>
> Having a config option to advertise something else is what I can support.
> Needless to say, the current behavior would remain the default.
>
> G
>
>
> On Mon, Aug 18, 2025 at 3:28 PM Yaroslav Chernysh <ya...@gmail.com>
> wrote:
>
> > Hi Gabor,
> >
> > I got your point on using `getHostName()`. Thank you for such a
> > detailed explanation.
> >
> > What do you think about advertising rest.address instead? In the case of
> > YARN (at least in my environment), this is already set by YARN to the NM
> > hostname, so rDNS would be avoided.
> >
> > Thanks,
> >
> > Yaroslav
> >
> >
> > On 2025/08/15 21:12:58 Gabor Somogyi wrote:
> > > Hi Yaroslav,
> > >
> > > Thanks for your efforts in finding out all the details.
> > >
> > > I think making `getHostName` possible via a config option, plus some
> > > additional warnings in the documentation, can be considered.
> > > You need to evaluate your security standards, but you win something on
> > > one side and introduce a new attack vector on the other.
> > >
> > > I would write something similar in the documentation, and I also
> > > suggest you consider these for your own situation as well:
> > > - rDNS is not trustworthy for security decisions. Attackers with
> > > control over PTR records (or via poisoning/misconfiguration) can return
> > > arbitrary names. MITRE tracks this as CWE-350 [1] (Reliance on Reverse
> > > DNS Resolution for a Security-Critical Action). If you base TLS host
> > > checks on rDNS, it’s bypassable.
> > > - Slow or failing DNS causes blocking delays (seconds) in JVM lookups.
> > > OpenJDK issues document repeated timeouts and a lack of effective
> > > caching paths for some rDNS calls. Putting rDNS in critical paths
> > > (TLS, handshake, request handling) can amplify random outages.
> > >
> > > All in all, I'm not yet convinced that this issue appears in other
> > > trending environments like k8s.
> > > Taking this together with the mentioned risks, I personally wouldn't
> > > merge it to the main repo.
> > >
> > > BR,
> > > G
> > >
> > > [1] https://cwe.mitre.org/data/definitions/350.html
> > >
> > >
> > > On Fri, Aug 15, 2025 at 7:44 PM Yaroslav Chernysh <ya...@gmail.com>
> > > wrote:
> > >
> > > > Hi Gabor,
> > > >
> > > > Thank you for such a quick response, I appreciate it.
> > > >
> > > > Actually, I'm not very good at all this security and networking
> > > > stuff, so I apologize in advance if I'm wrong in some statement.
> > > >
> > > > > Do YARN containers share the host’s network in your case?
> > > >
> > > > Yes, they do. And as far as I have researched, they always do,
> > > > possibly except when you have configured YARN to use Docker containers
> > > > <https://hadoop.apache.org/docs/r3.4.1/hadoop-yarn/hadoop-yarn-site/DockerContainers.html>,
> > > > which is definitely not my case. I have also done some testing on my
> > > > node, which has 2 IP addresses:
> > > >
> > > > - With the default rest.bind-address (set by YARN to the Node Manager's
> > > > hostname), the only IP address that opens a port is the one that the NM
> > > > hostname resolves to. The other one (not sure where it comes from, this
> > > > is a VM) remains closed
> > > >
> > > > - With rest.bind-address set to 0.0.0.0, the port is open and
> > > > accessible via both IP addresses
> > > >
> > > > > However if you have a single IP then using 0.0.0.0 and binding it to
> > > > > lo + eth0 is something I wouldn't worry about.
> > > >
> > > > I got the point and basically I agree here, but I'm not sure how
> > > > future-proof this approach is. How probable is a scenario in which the
> > > > environment (single IP node) is changed (to a multi-homed node), but
> > > > the unchanged configuration (still listening on 0.0.0.0) now leads to
> > > > excessive network exposure? Either way, that's not my case. And I
> > > > think this is not restricted to YARN either: binding to all interfaces
> > > > in a Standalone deployment might be excessive as well.
> > > >
> > > > > but you still have control on firewall, right?
> > > >
> > > > Probably yes (speaking for an average user). This would probably cover
> > > > the excessive binding exposure, but only at the firewall level and not
> > > > at the "core". This adds a dependency on the firewall. I'm not saying
> > > > it's bad, but rather that using the defense-in-depth approach and doing
> > > > both limited binding and adding a firewall would be even better than
> > > > relying on the firewall only.
> > > >
> > > > I hope all the above shows that even with a good enough environment
> > > > (number of IP addresses + firewall) it still makes sense to restrict
> > > > the binding. At least that's how I see this, please correct me if
> > > > I'm wrong.
> > > >
> > > > > introduce reverse DNS lookup as a must have feature
> > > >
> > > > Could we make it optional and disabled by default?
> > > >
> > > > Thanks,
> > > >
> > > > Yaroslav
> > > >
> > > > On 2025/08/14 21:32:40 Gabor Somogyi wrote:
> > > > > Hi Yaroslav,
> > > > >
> > > > > First of all I would like to understand why you think binding to
> > > > > 0.0.0.0 is less secure in your case. Correct me if I'm wrong:
> > > > > Do YARN containers share the host’s network in your case? On a
> > > > > multi-homed node, 0.0.0.0 exposes the endpoint on every host interface,
> > > > > which can be less secure than binding to a specific host IP. So in
> > > > > this case pinning can matter.
> > > > >
> > > > > However, if you have a single IP, then using 0.0.0.0 and binding it
> > > > > to lo + eth0 is something I wouldn't worry about.
> > > > > For example, a "normal" Kubernetes pod (default networking, single
> > > > > interface, no hostNetwork) has no such issue.
> > > > >
> > > > > As a general remark: let's say you expose the REST endpoint on 2 IP
> > > > > addresses, but you still have control over the firewall, right?
> > > > >
> > > > > The main reason why I'm asking these questions is that using
> > > > > `getHostName` would introduce reverse DNS lookup as a must-have
> > > > > feature. That could cause quite some turbulence for heavy users
> > > > > through additional traffic, PTR records can be wrong or spoofed, etc.
> > > > >
> > > > > BR,
> > > > > G
> > > > >
> > > > >
> > > > > On Thu, Aug 14, 2025 at 8:13 PM Yaroslav Chernysh <ya...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Flink community,
> > > > > >
> > > > > > Is there a particular reason to advertise the Job Manager's REST
> > > > > > endpoint address in the form of an IP address instead of a hostname?
> > > > > > More precisely, I'm talking about this code block
> > > > > > <https://github.com/apache/flink/blob/release-2.0.0/flink-runtime/src/main/java/org/apache/flink/runtime/rest/RestServerEndpoint.java#L298-L304>
> > > > > > in RestServerEndpoint.java:
> > > > > >
> > > > > > final InetSocketAddress bindAddress =
> > > > > >         (InetSocketAddress) serverChannel.localAddress();
> > > > > > final String advertisedAddress;
> > > > > > if (bindAddress.getAddress().isAnyLocalAddress()) {
> > > > > >     advertisedAddress = this.restAddress;
> > > > > > } else {
> > > > > >     advertisedAddress = bindAddress.getAddress().getHostAddress();
> > > > > > }
> > > > > >
> > > > > > That is (as far as I understand), if rest.bind-address is set to the
> > > > > > 0.0.0.0 wildcard (which means binding to all available interfaces),
> > > > > > then the advertised address will be the value of rest.address.
> > > > > > Otherwise, the IP address that the specified rest.bind-address
> > > > > > resolves to will be used.
> > > > > > What if I want to bind the REST endpoint to some specific address (for
> > > > > > security reasons), but at the same time advertise it in the form of a
> > > > > > hostname? Assuming that all the name resolution things work correctly.
> > > > > >
> > > > > > For me particularly, the problem this creates is with SSL. The
> > > > > > certificate I have for the Job Manager (REST connectivity) is created
> > > > > > with a hostname and not an IP address. I run Flink on YARN, where the
> > > > > > default value for rest.bind-address is the Node Manager's hostname
> > > > > > (thus, not the 0.0.0.0 wildcard), and the same goes for rest.address.
> > > > > > As a result, the advertised address is in the form of an IP address.
> > > > > > I'd like to access Flink's UI via the YARN Resource Manager proxy
> > > > > > ("Tracking URL" in the application page) that has the Job Manager's
> > > > > > certificate in its truststore. However, because Flink is advertised
> > > > > > to the Resource Manager with the IP address while the certificate
> > > > > > holds the hostname, the connection from the Resource Manager to the
> > > > > > Job Manager fails with:
> > > > > >
> > > > > > javax.net.ssl.SSLPeerUnverifiedException: Certificate for
> > > > > > <192.168.33.11> doesn't match any of the subject alternative names: []
> > > > > >
> > > > > > The only way I can fix this (without code changes) is by explicitly
> > > > > > setting rest.bind-address to 0.0.0.0, which is not secure, as far as
> > > > > > I understand (less secure than binding to a specific address).
> > > > > > However, if I substitute the getHostAddress() call in the code block
> > > > > > above with getHostName(), the issue is gone.
> > > > > >
> > > > > > So, my question is: is there any particular reason not to
> > > > > > use getHostName() here (assuming hostname is available)?
> > > > > >
> > > > > > Thanks,
> > > > > > Yaroslav
> > > > > >
> > > > >
> > > >
> > >
> >
>