On Tue, 26 Mar 2024 17:25:54 GMT, Sergey Chernyshev <schernys...@openjdk.org> wrote:
> There are two distinct approaches to parsing IPv4 literal addresses. One is
> the Java baseline "strict" syntax (all-decimal d.d.d.d form family), the
> other is the "loose" syntax of RFC 6943 section 3.1.1 [1] (POSIX `inet_addr`,
> allowing octal and hexadecimal forms [2]). The goal of this PR is to provide
> an interface to construct InetAddress objects from literal addresses in POSIX
> form, for applications that need to mimic the behavior of `inet_addr` used by
> standard network utilities such as netcat/curl/wget and the majority of web
> browsers. At present, there's no way to construct an `InetAddress` object
> from such literal addresses because the existing APIs such as
> `InetAddress.getByName()`, `InetAddress#ofLiteral()` and
> `Inet4Address#ofLiteral()` will consume an address and successfully parse it
> as decimal, regardless of the octal prefix. Hence, the resulting object will
> point to a different IP address.
>
> Historically `InetAddress.getByName()/.getAllByName()` were the only way to
> convert a literal address into an InetAddress object. The `getAllByName()` API
> relies on POSIX `getaddrinfo` / `inet_addr`, which parses IP address segments
> with `strtoul` (accepts octal and hexadecimal bases).
>
> The fallback to `getaddrinfo` is undesirable as it may end up with network
> queries (blocking mode) if `inet_addr` rejects the input literal address.
> The Java standard explicitly says that
>
> "If a literal IP address is supplied, only the validity of the address format
> is checked."
>
> @AlekseiEfimov contributed JDK-8272215 [3] that adds new factory methods
> `.ofLiteral()` to `InetAddress` classes. Although the new API is not affected
> by the `getaddrinfo` fallback issue, it is not sufficient for an application
> that needs to interoperate with external tooling that follows the POSIX standard.
> In the current state, `InetAddress#ofLiteral()` and
> `Inet4Address#ofLiteral()` will consume the input literal address and
> (regardless of the octal prefix) parse it as decimal numbers. Hence, it's not
> possible to reliably construct an `InetAddress` object from a literal address
> in POSIX form that would point to the desired host.
>
> It is proposed to extend the factory methods with
> `Inet4Address#ofPosixLiteral()` that allows parsing literal IP(v4) addresses
> in "loose" syntax, compatible with `inet_addr` POSIX api. The implementation
> is based on `.isBsdParsableV4()` method added along with JDK-8277608 [4]. The
> changes in the original algorithm are as follows:
>
> - `IPAddressUtil#parseBsdLiteralV4()` method is e...

src/java.base/share/classes/java/net/Inet4Address.java line 103:

> 101:  * octal and hexadecimal address segments. Please refer to
> 102:  * <a href="https://www.ietf.org/rfc/rfc6943.html#section-3.1.1"> <i>RFC
> 103:  * 6943: Issues in Identifier Comparison for Security Purposes</i></a>.

Suggestion:

 * <p> The above forms adhere to "strict" decimal-only syntax.
 * Additionally, the
 * {@link Inet4Address#ofPosixLiteral(String)} method implements a
 * <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/inet_addr.html">
 * POSIX {@code inet_addr}</a> compatible "loose" parsing algorithm, allowing
 * octal and hexadecimal address segments. Please refer to
 * <a href="https://www.ietf.org/rfc/rfc6943.html#section-3.1.1"> <i>RFC
 * 6943: Issues in Identifier Comparison for Security Purposes</i></a>.

src/java.base/share/classes/java/net/Inet4Address.java line 224:

> 222:  * Inet4Address##format valid IPv4 address} an {@code IllegalArgumentException} is thrown.
> 223:  * <p> This method doesn't block, i.e. no hostname lookup is performed.
> 224:  *

Please describe the syntax this method accepts here, in the normative part of the specification.
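
As a rough illustration of the two syntaxes under discussion, here is a minimal sketch of "strict" versus "loose" parsing. It is not the JDK implementation: the class `PosixV4Sketch` and the methods `parseLoose` and `parseStrict` are hypothetical names made up for this example. The loose parser assumes `strtoul`-style prefixes (`0x` for hexadecimal, a leading `0` for octal) and the one- to four-part forms accepted by `inet_addr`; the strict parser is simplified to the four-part form only.

```java
public class PosixV4Sketch {

    // Hypothetical helper: parses ASCII digits in the given radix, rejecting signs and empty input.
    private static long parseDigits(String digits, int radix) {
        if (digits.isEmpty()) {
            throw new IllegalArgumentException("empty address segment");
        }
        for (char c : digits.toCharArray()) {
            if (c > 127 || Character.digit(c, radix) < 0) {
                throw new IllegalArgumentException("bad digit '" + c + "' for radix " + radix);
            }
        }
        return Long.parseLong(digits, radix); // overflow surfaces as NumberFormatException
    }

    // Loose segment: "0x"/"0X" prefix means hexadecimal, a leading "0" means octal.
    private static long parseLooseSegment(String s) {
        if (s.startsWith("0x") || s.startsWith("0X")) {
            return parseDigits(s.substring(2), 16);
        }
        if (s.length() > 1 && s.startsWith("0")) {
            return parseDigits(s.substring(1), 8);
        }
        return parseDigits(s, 10);
    }

    private static String format(long address) {
        return String.format("%d.%d.%d.%d",
                (address >>> 24) & 0xFF, (address >>> 16) & 0xFF,
                (address >>> 8) & 0xFF, address & 0xFF);
    }

    // Sketch of inet_addr-style parsing: 1 to 4 parts, the last part fills the remaining bytes.
    static String parseLoose(String literal) {
        String[] parts = literal.split("\\.", -1);
        if (parts.length > 4) {
            throw new IllegalArgumentException("too many segments: " + literal);
        }
        long address = 0;
        int last = parts.length - 1;
        for (int i = 0; i < last; i++) {
            long v = parseLooseSegment(parts[i]);
            if (v > 0xFF) {
                throw new IllegalArgumentException("segment out of range: " + parts[i]);
            }
            address |= v << (24 - 8 * i);
        }
        long max = 0xFFFFFFFFL >>> (8 * last);
        long tail = parseLooseSegment(parts[last]);
        if (tail > max) {
            throw new IllegalArgumentException("segment out of range: " + parts[last]);
        }
        return format(address | tail);
    }

    // Simplified strict parsing: exactly four decimal segments, leading zeros ignored.
    static String parseStrict(String literal) {
        String[] parts = literal.split("\\.", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected d.d.d.d: " + literal);
        }
        long address = 0;
        for (int i = 0; i < 4; i++) {
            long v = parseDigits(parts[i], 10);
            if (v > 0xFF) {
                throw new IllegalArgumentException("segment out of range: " + parts[i]);
            }
            address |= v << (24 - 8 * i);
        }
        return format(address);
    }

    public static void main(String[] args) {
        System.out.println(parseStrict("0255.0.0.1")); // 255.0.0.1 (decimal)
        System.out.println(parseLoose("0255.0.0.1"));  // 173.0.0.1 (octal 0255)
        System.out.println(parseLoose("0x7f.1"));      // 127.0.0.1 (two-part form)
    }
}
```

With this sketch, `0256.0256.0256.0256` is accepted by `parseLoose` (yielding `174.174.174.174`) but rejected by `parseStrict`, since 256 is out of range when read as decimal.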

src/java.base/share/classes/java/net/Inet4Address.java line 232:

> 230:  * {@code 255} for {@code "0255"}, {@linkplain Inet4Address#ofPosixLiteral this}
> 231:  * method interprets the numbers based on their prefix (hexadecimal {@code "0x"},
> 232:  * octal {@code "0"}) and returns {@code 173} for {@code "0255"}.

Suggestion:

 * when {@code posixIPAddressLiteral} parameter contains address segments with
 * leading zeroes. An address segment with a leading zero is always parsed as an octal
 * number by {@code ofPosixLiteral()}, therefore {@code 0255} (octal) will be parsed as
 * {@code 173} (decimal) by this method. On the other hand, {@link Inet4Address#ofLiteral
 * Inet4Address.ofLiteral} ignores leading zeros, parses all numbers as decimal and produces
 * {@code 255}. Where this method would parse {@code 0256.0256.0256.0256} (octal) and
 * produce {@code 174.174.174.174} (decimal) in four dotted quad notation,
 * {@code Inet4Address.ofLiteral} will throw {@code IllegalArgumentException}.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18493#discussion_r1557642992
PR Review Comment: https://git.openjdk.org/jdk/pull/18493#discussion_r1557678859
PR Review Comment: https://git.openjdk.org/jdk/pull/18493#discussion_r1557675986