Re: RFR: 8251989: Hex formatting and parsing utility [v10]

Naoto Sato Mon, 30 Nov 2020 09:33:03 -0800

Hi Roger,

Thanks for your thoughts; I agree with you. Since this is a utility primarily meant for developers, not end users, limiting the "hexadecimal string/character" to Latin-1 seems reasonable.


Naoto

On 11/30/20 7:42 AM, Roger Riggs wrote:

Hi Naoto,

There are a couple of ways consistency can be achieved (and with what).
The existing conversions from hex strings to primitives all delegate to Character.digit(ch, radix), which allows both digits and letters beyond Latin1. (See Integer.valueOf(string, radix), Long.valueOf(string, radix), etc.) Conversions from primitive to string produce only the Latin1 characters "0-9", "a-f".
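As an illustration of the asymmetry described above, here is a minimal sketch using only long-standing java.lang methods. The class name DigitParsingDemo is made up for this example; the fullwidth digits are U+FF11 and U+FF10.

```java
public class DigitParsingDemo {
    public static void main(String[] args) {
        // Parsing: Character.digit accepts FULLWIDTH DIGIT ONE (U+FF11).
        System.out.println(Character.digit('\uff11', 16));        // 1

        // Integer.parseInt delegates to Character.digit, so a string of
        // fullwidth digits "10" parses as hexadecimal 0x10.
        System.out.println(Integer.parseInt("\uff11\uff10", 16)); // 16

        // Formatting: the result uses only the Latin1 characters 0-9, a-f.
        System.out.println(Integer.toHexString(255));             // ff
    }
}
```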
Making the conversion of strings to and from primitives consistent within HexFormat seems attractive, but it would diverge from existing conversions, and non-Latin1 digits and letters almost never appear in practice.
There are use cases (primarily in protocols and RFCs) where the hexadecimal characters are specified as "0-9", "a-f", and "A-F". If HexFormat used Character.digit(ch, radix), it would fail to detect unexpected or illegal characters, rendering HexFormat unusable for those use cases.
Though it would diverge from the existing parsing of hexadecimal in Character, Integer, Long, etc., I'll post an update to use string parsing that allows only Latin1 hexadecimal characters.
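A strict, Latin1-only digit check of the kind proposed here could be sketched as follows. This is not the code from the PR; the class StrictHexDigit and the method strictHexDigit are hypothetical names, and the choice of NumberFormatException is an assumption.

```java
public class StrictHexDigit {
    /**
     * Hypothetical helper: returns 0-15 for [0-9a-fA-F] and rejects
     * everything else, including non-Latin1 digits such as U+FF10.
     */
    static int strictHexDigit(int ch) {
        if (ch >= '0' && ch <= '9') return ch - '0';
        if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
        if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
        throw new NumberFormatException(
                "not a hexadecimal digit: U+" + Integer.toHexString(ch));
    }

    public static void main(String[] args) {
        System.out.println(strictHexDigit('f'));       // 15
        System.out.println(Character.digit(0xFF10, 16)); // 0 (lenient)
        try {
            strictHexDigit(0xFF10);                    // fullwidth digit zero
        } catch (NumberFormatException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Unlike Character.digit, which returns -1 for an invalid character and a valid value for fullwidth digits, this sketch fails loudly on anything outside the three ASCII ranges.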
Comments?

Thanks, Roger



On 11/27/20 5:43 PM, Naoto Sato wrote:
On Fri, 27 Nov 2020 16:57:07 GMT, Roger Riggs <[email protected]> wrote:
src/java.base/share/classes/java/util/HexFormat.java line 853:
851:      */
852:     public int fromHexDigit(int ch) {
853:         int value = Character.digit(ch, 16);
Do we need to limit hex digit parsing to only [0-9a-fA-F]? As written, this would return `0` for other digits, say `fullwidth digit zero` (U+FF10).
The normal and conventional characters for hex encoding are limited to the ASCII/Latin1 range. I don't know of any use case that would take advantage of non-ASCII characters.
My point is that we should probably define `hexadecimal string` more clearly. In the class description, it exclusively means [0-9a-fA-F] in the context of formatting, but parsing allows non-ASCII digits, e.g.,
HexFormat.of().parseHex("\uff10\uff11")
succeeds. I would like consistency here.

-------------

PR: https://git.openjdk.java.net/jdk/pull/482
