Hi David,
Thank you for your comments.
On 28.03.24 at 21:09, David Lloyd wrote:
I would like to propose a PR to extend the InetAddress API in JDK
23, namely to provide an interface for constructing InetAddress
objects from literal addresses in POSIX/BSD form (please see the
discussion [1]), for applications that need to mimic the behavior of
POSIX network APIs (|inet_addr|) used by standard network
utilities such as netcat/curl/wget and the majority of web
browsers. At present, there is no way to construct an
|InetAddress| object from such literal addresses because the new
API |InetAddress.ofLiteral()| and |Inet4Address.ofLiteral()| will
consume an octal address and successfully parse it as decimal,
ignoring the octal prefix. Hence, the resulting object will point
to a different IP address than it is expected to point to. There's
also no direct way to create an InetAddress from a literal address
with hexadecimal segments, although this can be the case in
certain systems.
Would this proposal be unique to IPv4 addresses, or is there an
equivalent for IPv6? (I would suspect that there isn't, given that the
parsing rules for IPv6 are a bit more well-defined...)
Yes, this is IPv4-specific.
It is suggested to add a new factory method such as
|.ofPosixLiteral()| to |Inet4Address| class to fill this gap. This
won't introduce ambiguity into the API and won't break the long
standing behavior. As a new method, it will not affect Java
utilities such as HttpClient, nor the existing Java applications.
At the same time, the new method will help deal with the confusion
between the BSD and Java standards.
I would suggest normatively calling this behavior "POSIX standard"
parsing (not BSD or POSIX/BSD), since it (at least nominally) comes
from a standards body [1]. Bear in mind that `inet_pton` follows
different rules though [2]. RFC 6943 [3] has a bit more to say about
so-called "loose" vs "strict" IP address parsing rules.
Thanks, I will add a note about it in the PR and update the motivation
text accordingly.
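
To make the "loose" rules concrete, here is a minimal sketch of
inet_addr()-style parsing as I understand it: each part may be decimal,
octal (leading 0) or hexadecimal (leading 0x), and with fewer than four
parts the last one fills the remaining bytes. The class and method names
(PosixIpv4Sketch, parse, parsePart, toDottedDecimal) are made up for
illustration only; this is not the code from the PR and not the proposed
|.ofPosixLiteral()| implementation.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical illustration of inet_addr()-style ("loose") IPv4 parsing.
class PosixIpv4Sketch {

    // Parses one part as decimal, octal (leading 0) or hexadecimal (leading 0x/0X).
    static long parsePart(String part) {
        int radix = 10;
        int start = 0;
        if (part.startsWith("0x") || part.startsWith("0X")) {
            radix = 16;
            start = 2;
        } else if (part.length() > 1 && part.charAt(0) == '0') {
            radix = 8;
            start = 1;
        }
        if (start >= part.length()) {
            throw new IllegalArgumentException("missing digits in: '" + part + "'");
        }
        long value = 0;
        for (int i = start; i < part.length(); i++) {
            char c = part.charAt(i);
            // Accept ASCII digits only; Character.digit returns -1 for invalid digits.
            int d = c < 128 ? Character.digit(c, radix) : -1;
            if (d < 0) {
                throw new IllegalArgumentException("bad digit in: '" + part + "'");
            }
            value = value * radix + d;
            if (value > 0xFFFFFFFFL) {
                throw new IllegalArgumentException("part out of range: '" + part + "'");
            }
        }
        return value;
    }

    // Parses a, a.b, a.b.c or a.b.c.d into four address bytes.
    static byte[] parse(String literal) {
        String[] parts = literal.split("\\.", -1);
        if (parts.length > 4) {
            throw new IllegalArgumentException("too many parts: " + literal);
        }
        long address = 0;
        for (int i = 0; i < parts.length; i++) {
            long v = parsePart(parts[i]);
            boolean last = (i == parts.length - 1);
            // Leading parts are single bytes; the last part fills all remaining bytes.
            long max = last ? (0xFFFFFFFFL >>> (8 * i)) : 0xFFL;
            if (v > max) {
                throw new IllegalArgumentException("part out of range: '" + parts[i] + "'");
            }
            address |= last ? v : (v << (8 * (3 - i)));
        }
        return new byte[] {
            (byte) (address >>> 24), (byte) (address >>> 16),
            (byte) (address >>> 8), (byte) address
        };
    }

    // Convenience helper: returns the parsed address in plain dotted-decimal form.
    static String toDottedDecimal(String literal) {
        byte[] b = parse(literal);
        return (b[0] & 0xFF) + "." + (b[1] & 0xFF) + "." + (b[2] & 0xFF) + "." + (b[3] & 0xFF);
    }

    public static void main(String[] args) throws UnknownHostException {
        for (String lit : new String[] {"010.0.0.1", "0x7f.1", "0300.0250.1", "127.0.0.1"}) {
            // getByAddress takes raw bytes, so no name lookup or re-parsing happens here.
            System.out.println(lit + " -> " + InetAddress.getByAddress(parse(lit)).getHostAddress());
        }
    }
}
```

With these rules 010.0.0.1 comes out as 8.0.0.1, whereas parsing the
same literal as decimal gives 10.0.0.1, which is the mismatch described
in the proposal. A part such as 08 is rejected here because 8 is not an
octal digit.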
There could be a slight discrepancy in how different
standard tools work under different operating systems. For example, on
macOS wget & nc disregard the octal prefix (0) while allowing the
hexadecimal prefix (0x), whereas curl & ping process both
prefixes. On Ubuntu Server 22.04 both prefixes are processed, but
they are not allowed in the /etc/hosts file, while on macOS it's legal
to use 0x. Despite the deviations in how and where the BSD
standard is implemented, there are two distinct approaches. I
don't see why Java shouldn't provide two different independent
APIs. It would give future apps the flexibility to decide which
standard to rely on, and the ability to see the full picture.
Please share your thoughts on whether such a change might be
desirable in JDK 23. Thank you for your help!
I guess it could be useful when the need arises to interoperate with
tooling that supports this kind of syntax, and if it was done, I would
agree that a separate method would be the way to go. But, I don't have
any comment as to whether the potential use cases are sufficient to
justify the API surface and additional implementation complexity
(whatever that may be).
This is the primary use case of the proposed API. Implementations that
rely on inet_addr() syntax are still too widespread to ignore.
As another random data point: the projects I've been working on have
relegated such extra-JDK IP address handling tasks to a utility
library [4]. We don't have a parser for this particular syntax though.
I think Guava and Apache Commons both have similar utility classes.
Neither of them supports that syntax. But I also see libraries that do
support it; one example is the IPAddress library
(https://github.com/seancfoley/IPAddress).