Hi,
On 5/9/2018 6:49 PM, Vincent Ryan wrote:
Thanks Roger for your comments.
The main motivator for this class is to provide a basic hex encoder/decoder
for smaller amounts of binary data
and to provide a hexdump encoder for larger amounts of binary data, while
recognising the need to cater for
custom formats too.
The class does not attempt to satisfy every custom format. Instead it provides
some basic formatting methods
and functions from which a wide variety of custom formats can be constructed.
Developers can also write their
own formatting functions to use with this class.
I'm fine with covering the simple cases if the rest can be covered by
the extensible formatting API.
More comments in-line below.
On 9 May 2018, at 16:20, Roger Riggs <[email protected]> wrote:
Hi Vinnie,
On the API and spec, a few comments:
- Expanding the printable string from ASCII to ISO-8859-1 would make it a bit
more useful in more cases.
That might suggest using the Charset converter to do the work (less
optimized but more functional).
Yes, adding support for additional charsets could be useful, but it is a tradeoff
against cluttering the API.
I think ASCII is sufficient for the common case.
Converting the control characters below <space> (0-31) to "." makes sense
to avoid odd carriage control effects.
But the characters above 127 are printable and there is no reason to
blank them out.
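
To illustrate the ISO-8859-1 suggestion, here is a minimal sketch. The class and method names (PrintableSketch, toPrintable) are hypothetical and are not part of the proposed Hex class. It relies on the fact that ISO-8859-1 maps each byte value to the code point of the same value, and it assumes that the C1 control range 0x7F-0x9F should also be shown as ".", since those code points are not printable either.

```java
import java.nio.charset.StandardCharsets;

final class PrintableSketch {
    // Hypothetical helper, not part of the proposed API.
    static String toPrintable(byte[] bytes) {
        // ISO-8859-1 decodes every byte 0x00-0xFF to the same-valued code point.
        char[] chars = new String(bytes, StandardCharsets.ISO_8859_1).toCharArray();
        for (int i = 0; i < chars.length; i++) {
            // isISOControl is true for 0x00-0x1F and 0x7F-0x9F.
            if (Character.isISOControl(chars[i])) {
                chars[i] = '.';
            }
        }
        return new String(chars);
    }
}
```

With this sketch, bytes such as 0xE9 come out as the corresponding Latin-1 character, while line feeds and other control codes are replaced.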
- There is no API support for ByteBuffers, another common source of bytes, that
would make a good addition
for completeness. John Rose suggested a ByteSequence interface in the
context of file processing but
that hasn't settled down.
Support for byte buffers was requested before and I suggested wrapping in an
I/O stream.
Is that acceptable?
If it were a simple one- or two-method pattern, I'd be likely to agree,
but it's not simple unless the ByteBuffer is
backed by an array, and using the position and limit makes it more
complicated.
A convenience method would only need to allocate an array of (limit -
position) bytes and call get; the rest is as easy as byte[].
For example,
    public static Stream<String> dumpAsStream(ByteBuffer buffer, int fromIndex,
            int toIndex, Formatter formatter) {
        // Copy the bytes between position and limit into a temporary array.
        // Note: get(byte[]) advances the buffer's position.
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return dumpAsStream(bytes, fromIndex, toIndex, 16, formatter);
    }
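
The convenience method above reads the buffer with get, which advances the caller's position. If that side effect is unwanted, one possible variation is to copy through a duplicate. This is only a sketch; ByteBufferCopySketch and remainingBytes are hypothetical names, and it works the same way for heap and direct buffers.

```java
import java.nio.ByteBuffer;

final class ByteBufferCopySketch {
    // Hypothetical helper: copies the bytes between position and limit.
    static byte[] remainingBytes(ByteBuffer buffer) {
        // duplicate() shares the content but has its own position and limit,
        // so the caller's buffer state is left untouched.
        ByteBuffer view = buffer.duplicate();
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return bytes;
    }
}
```

The resulting array can then be handed to any byte[]-based method.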
- The class name "Hex" might be a bit more evocative as HexDump or HexConverter.
I tried to keep the class name short as many of its method names are long.
- Method names; the "Hex" in method names might be unnecessary/redundant
since, as static methods,
they would frequently appear in code as "Hex.fromHexString" and a simple
"Hex.fromString" would be fine.
Ditto, toHexString(bytes) -> toString(bytes)…
I agree the repetition is ugly. Shortening to fromString() and toString() is
appealing except for the possible
confusion with Object.toString.
There are no instances, so little room for confusion.
- There are not many forms that allow the formatter to be supplied, for
example, dump(in, out) might be
a case where a formatter would be desired.
The 3 dump methods are just convenience functions around the stream generators.
The dumpAsStream
methods are the expected entry points for customizers.
Since those methods are writing formatted output, they should be
directed to a PrintStream,
of which System.out/err would be the common case.
OutputStream isn't going to know the correct line separator (and it
should not be hardcoded as '\n'; see Hex:606).
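
As a rough illustration of the PrintStream point, the sketch below writes a stream of already formatted lines with println, which terminates each line with the platform line separator instead of a hardcoded '\n'. DumpLinesSketch and dump are hypothetical names, not the methods in the webrev.

```java
import java.io.PrintStream;
import java.util.stream.Stream;

final class DumpLinesSketch {
    // Hypothetical helper: writes each formatted line to the given PrintStream.
    static void dump(Stream<String> lines, PrintStream out) {
        // println appends the platform line separator (line.separator),
        // so no separator needs to be hardcoded here.
        lines.forEach(out::println);
    }
}
```

Calling dump(lines, System.out) would cover the common case mentioned above.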
Btw, the example in the class javadoc does not compile.
- Hex.Formatter interface could have a default method that provides the default
formatting or as
a static method so it can be used with a method reference.
Sure. What do you suggest?
Hex: 69:
69 String.format("%08x %s |%s|",
70 offset,
71 Hex.toFormattedHexString(chunk, fromIndex, toIndex),
72 Hex.toPrintableString(chunk, fromIndex, toIndex));
- On the example in the class javadoc, I would use the implementation of the
default formatter with both hex and ascii
to show how that works.
Can you give me an example?
Define the method with the default modifier and the method body of
HEXDUMP_FORMATTER.format.
(Hex:68-73).
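
A sketch of the static-method variant, so that the default formatting can be passed as a method reference. The interface name LineFormatter and its signature are assumptions made for illustration, not the Hex.Formatter definition in the webrev. The body follows the String.format call quoted from Hex:69-72, with self-contained loops standing in for toFormattedHexString and toPrintableString.

```java
final class FormatterSketch {
    // Hypothetical stand-in for the proposed formatter interface.
    @FunctionalInterface
    interface LineFormatter {
        String format(long offset, byte[] chunk, int fromIndex, int toIndex);

        // Static, so it can be supplied as LineFormatter::hexdump.
        static String hexdump(long offset, byte[] chunk, int fromIndex, int toIndex) {
            StringBuilder hex = new StringBuilder();
            StringBuilder text = new StringBuilder();
            for (int i = fromIndex; i < toIndex; i++) {
                int b = chunk[i] & 0xff;
                if (i > fromIndex) {
                    hex.append(' ');
                }
                hex.append(String.format("%02x", b));
                // Printable ASCII is kept; everything else is shown as '.'.
                text.append(b >= 0x20 && b < 0x7f ? (char) b : '.');
            }
            return String.format("%08x %s |%s|", offset, hex, text);
        }
    }
}
```

A caller could then write LineFormatter f = LineFormatter::hexdump and pass f wherever a formatter is accepted.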
- As Max observes, being able to supply the delimiters might be a good
addition. (I'm thinking IP addresses too).
Sure. Add another toHexString method that takes a delimiter character?
I take it back, that's a customization that could be done by a user
supplied formatter.
Thanks, Roger
It looks quite good and very useful.
Thanks, Roger
On 5/8/2018 10:34 PM, Weijun Wang wrote:
Nice tool.
However, I am not sure how toFormattedHexString() and toPrintableString() are
useful, seems only for providing a customizable dump format which is, actually,
not very customizable.
For me, toHexString and fromHexString are of course the most useful methods. As
for dump, I can only think of
1. The existing sun.security.HexDumpEncoder format, when I want to dump a lot
of bytes as a block
2. "00:11:22:33:AA:BB:CC" which fits in one line and also easy to read, when I
want inline debugging output
If the customizable dump method is both powerful and simple enough to create 2) above, I'll be
happy. Otherwise, I can live with toHexString().replaceAll("(..)(?=.)", "$1:").
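
For reference, format 2) can be sketched in two ways: with the regex replacement mentioned above, or by building the delimited string directly. DelimiterSketch, toDelimitedHex and insertColons are hypothetical names and are not part of the proposed API.

```java
final class DelimiterSketch {
    // Hypothetical helper: upper-case hex pairs joined by the given delimiter.
    static String toDelimitedHex(byte[] bytes, char delimiter) {
        StringBuilder sb = new StringBuilder(Math.max(0, bytes.length * 3 - 1));
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(delimiter);
            }
            sb.append(String.format("%02X", bytes[i] & 0xff));
        }
        return sb.toString();
    }

    // Regex variant: inserts ':' after every pair that is followed by more text.
    static String insertColons(String hex) {
        return hex.replaceAll("(..)(?=.)", "$1:");
    }
}
```

Both variants yield the same text for an even-length hex string; the direct version avoids building the intermediate undelimited string.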
Thanks
Max
On May 4, 2018, at 4:22 AM, Vincent Ryan <[email protected]> wrote:
Hello,
Please review this proposal for a new API to conveniently generate and display
binary data using hex string representation.
It supports both bulk and stream operations and it can also generate the
well-known hexdump format [1].
Thanks
Bug: https://bugs.openjdk.java.net/browse/JDK-8170769
API:
http://cr.openjdk.java.net/~vinnie/8170769/javadoc.05/api/java.base/java/util/Hex.html
Webrev: http://cr.openjdk.java.net/~vinnie/8170769/webrev.05/
____
[1] https://docs.oracle.com/cd/E86824_01/html/E54763/hexdump-1.html