2020/8/19 14:14:56 -0700, roger.ri...@oracle.com:
> Please review a java.util.Hex API to encode and decode hexadecimal 
> strings to and from byte arrays.
> 
> JavaDoc:
> http://cr.openjdk.java.net/~rriggs/hex-javadoc/java.base/java/util/Hex.html

This mostly looks good -- it’ll be nice to have a standard utility for
this sort of thing.

A few comments:

  - Why do the short-form `encoder` factory methods return encoders that
    produce upper-case hex strings?  `Integer::toHexString` and other
    existing `toHex` methods return lower-case hex strings.  That’s also
    what you get from common Unix CLI tools (e.g., `od -tx1`).

    Please consider changing these methods to return lower-case encoders.

  - Is it worth having static `Hex.encode(byte[])` and
    `Hex.decode(CharSequence)` convenience methods for the simplest
    cases?

  - [Warning: Bikeshed] The verbs “encode” and “decode” seem unfortunate.

    Over in `java.nio.charsets` we have encoders that transform
    characters to bytes, and decoders that transform bytes to characters.
    The coded thing is the bytes; the uncoded thing is the characters.

    In `java.util` we already have the `Base64` class, which provides
    encoders that transform bytes to characters, and decoders that
    transform characters to bytes.  The coded thing is the characters;
    the uncoded thing is the bytes.

    The use of “encode” and “decode” in `Base64` was likely inspired by
    the fact that the format has been known as “base 64 encoding” for
    decades, having originated as a hack for transporting non-ASCII data
    via SMTP.  Developers looking to do base-64 operations will, thus,
    expect this terminology.

    Here you’re proposing that the `Hex` class follow the `Base64` class.
    Consistency with existing nearby APIs is a worthy goal.  If I were
    just looking to convert a byte array into a readable hex string,
    however, I’d probably want to “format” it rather than “encode” it,
    something like `String.format("%x")` on steroids.  Likewise, if I
    were looking to convert a hex string into bytes then I’d want to
    “parse” it rather than “decode” it, i.e., `Integer::parseInt` on
    steroids.

    If you were to rename the nested classes to `Hex.Formatter` and
    `Hex.Parser`, and rename all methods accordingly, then this API would
    be inconsistent with the nearby `Base64` but likely more consistent
    with developer expectations.

    (`Hex` is already inconsistent with `Base64` in that it doesn’t
     prefix the names of its factory methods with `get`, which is a good
     thing.)

- Mark

Reply via email to