Re: [11] RFR: 8170769 Provide a simple hexdump facility for binary data

2018-05-10 Thread Roger Riggs

Hi,

On 5/9/2018 6:49 PM, Vincent Ryan wrote:

Thanks Roger for your comments.

The main motivator for this class is to provide a basic hex. encoder/decoder 
for smaller amounts of binary data
and to provide a hexdump encoder for larger amounts of binary data, while 
recognising the need to cater for
custom formats too.

The class does not attempt to satisfy every custom format. Instead it provides 
some basic formatting methods
and functions from which a wide variety of custom formats can be constructed. 
Developers can also write their
own formatting functions to use with this class.


I'm fine with covering the simple cases if the rest can be covered by 
the extensible formatting API.


More comments in-line below.



On 9 May 2018, at 16:20, Roger Riggs  wrote:

Hi Vinnie,

On the API and spec, a few comments:

  - Expanding the printable string from ASCII to ISO-8859-1 would make it a bit 
more useful in more cases.
That might suggest using the Charset converter to do the work (less 
optimized but more functional).

Yes adding support for additional charsets could be useful but it is a tradeoff 
against cluttering the API.
I think ASCII is sufficient for the common case.
Converting the control characters to "." below   0-31 makes sense 
to avoid odd carriage control effects.
But the characters above 127 are printable and  there is no reason to 
blank them out.

- There is no API support for ByteBuffers, another common source of bytes, that 
would make a good addition
for completeness.  John Rose suggested a ByteSequence interface in the 
context of file processing but
that hasn't settled down.

Support for byte buffers was requested before and I suggested wrapping in an 
I/O stream.
Is that acceptable?
If it were a simple one or two method pattern, I'd be likely to agree 
but its not simple unless the ByteArray is
backed by an array and  using the position and limit it gets more 
complicated.


A convenience method would need only to allocate an array (limit - 
position) and get, the rest is as easy as byte[].

For example,

 public static Stream dumpAsStream(ByteBuffer buffer, int
   fromIndex, int toIndex,
  Formatter formatter) {
    int len = Math.max(0, buffer.limit() - buffer.position());
    byte[] bytes = new byte[len];
    buffer.get(bytes);
    return dumpAsStream(bytes, fromIndex, toIndex, 16, formatter);
    }





- The class name "Hex" might be a bit more evocative as HexDump or HexConverter.

I tried to keep the class name short as many of its method names are long.



- Method names;  the "Hex" in method names might be unnecessary/redundant 
since, as static methods,
   they would frequently appear in code as "Hex.fromHexString" and a simple 
"Hex.fromString" would be fine.
   Ditto,  toHexString(bytes) -> toString(bytes)…

I agree the repetition is ugly. Shortening to fromString() and toString() is 
appealing except for the possible
confusion with Object.toString


There are no instances, so little room for confusion.


- There are not many forms that allow the formatter to be supplied, for 
example, dump(in, out) might be
   a case where a formatter would be desired.

The 3 dump methods are just convenience functions around the stream generators. 
The dumpAsStream
methods are the expected entry points for customizers.


Since those methods are writing formatted output, they should be 
directed to a PrintStream,

of which System.out/err would be common case.

OutputStream isn't going to know the correct line separator; (and it 
should not be hardcoded as '\n'  Hex:606)


Btw,  the example in the class javadoc does not compile.


- Hex.Formatter interface could have a default method that provides the default 
formatting or as
   a static method so it can be used with a method reference.

Sure. What do you suggest?

Hex: 69:

  69 String.format("%08x  %s  |%s|",
  70 offset,
  71 Hex.toFormattedHexString(chunk, fromIndex, toIndex),
  72 Hex.toPrintableString(chunk, fromIndex, toIndex));





- On the example in the class javadoc, I would use the implementation of the 
default formatter with both hex and ascii
   to show how that works.

Can you give me an example?
Define the method with the default modifier and the method body of 
HEXDUMP_FORMATTER.format.

(Hex:68-73).




  - As Max observes, being able to supply the delimiters might be a good 
addition.  (I'm thinking IP addresses too).

Sure. Add another toHexString method that takes a delimiter character?
I take it back, that's a customization that could be done by a user 
supplied formatter.


Thanks, Roger




It looks quite good and very useful.

Thanks, Roger

On 5/8/2018 10:34 PM, Weijun Wang wrote:

Nice tool.

However, I am not sure how toFormattedHexString() and toPrintableString() are 
useful, seems only for providing a customizable dump 

Re: [11] RFR: 8170769 Provide a simple hexdump facility for binary data

2018-05-10 Thread Vincent Ryan


> On 10 May 2018, at 01:08, Weijun Wang  wrote:
> 
> 
> 
>> On May 10, 2018, at 6:49 AM, Vincent Ryan  wrote:
>> 
>>> 
>>> - As Max observes, being able to supply the delimiters might be a good 
>>> addition.  (I'm thinking IP addresses too).
>> 
>> Sure. Add another toHexString method that takes a delimiter character?
> 
> Good idea.
> 
> Still, I am not sure about all those methods taking InputStream as an input. 
> Most likely I would use this class for debugging purpose, so any side effect 
> (advancing inside an InputStream or ByteBuffer) might not be desirable. 
> 
> Another question, I just noticed that sun.security.HexDumpEncoder emits 
> uppercase letters (ABCDEF) and your code emits lowercase letters. Do we need 
> to provide user a choice here?

The hexdump format uses lowercase but it is trivial to generate uppercase using 
a custom formatter.


> 
> Thanks
> Max
> 



Re: [11] RFR: 8170769 Provide a simple hexdump facility for binary data

2018-05-09 Thread Weijun Wang


> On May 10, 2018, at 6:49 AM, Vincent Ryan  wrote:
> 
>> 
>> - As Max observes, being able to supply the delimiters might be a good 
>> addition.  (I'm thinking IP addresses too).
> 
> Sure. Add another toHexString method that takes a delimiter character?

Good idea.

Still, I am not sure about all those methods taking InputStream as an input. 
Most likely I would use this class for debugging purpose, so any side effect 
(advancing inside an InputStream or ByteBuffer) might not be desirable. 

Another question, I just noticed that sun.security.HexDumpEncoder emits 
uppercase letters (ABCDEF) and your code emits lowercase letters. Do we need to 
provide user a choice here?

Thanks
Max



Re: [11] RFR: 8170769 Provide a simple hexdump facility for binary data

2018-05-09 Thread Vincent Ryan
Thanks for your comments Max.

> On 9 May 2018, at 03:34, Weijun Wang  wrote:
> 
> Nice tool.
> 
> However, I am not sure how toFormattedHexString() and toPrintableString() are 
> useful, seems only for providing a customizable dump format which is, 
> actually, not very customizable.

These provide implementations only to generate the human-readable hexdump 
format. 
They are not designed to be customizable. Custom formats should be supplied via 
Hex.Formatter.


> 
> For me, toHexString and fromHexString are of course the most useful methods. 
> As for dump, I can only think of
> 
> 1. The existing sun.security.HexDumpEncoder format, when I want to dump a lot 
> of bytes as a block
> 2. "00:11:22:33:AA:BB:CC" which fits in one line and also easy to read, when 
> I want inline debugging output
> 
> If the customizable dump method is both powerful and simple enough to create 
> 2) above, I'll be happy. Otherwise, I can live with 
> toHexString().replaceAll("(..)(?=.)", "$1:”).

How about adding another toHexString method that takes a delimiter character?

> 
> Thanks
> Max
> 
>> On May 4, 2018, at 4:22 AM, Vincent Ryan  wrote:
>> 
>> Hello,
>> 
>> Please review this proposal for a new API to conveniently generate and 
>> display binary data using hex string representation.
>> It supports both bulk and stream operations and it can also generate the 
>> well-known hexdump format [1].
>> 
>> Thanks
>> 
>> 
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170769
>> API: 
>> http://cr.openjdk.java.net/~vinnie/8170769/javadoc.05/api/java.base/java/util/Hex.html
>> Webrev: http://cr.openjdk.java.net/~vinnie/8170769/webrev.05/
>> 
>> 
>> [1] https://docs.oracle.com/cd/E86824_01/html/E54763/hexdump-1.html
> 



Re: [11] RFR: 8170769 Provide a simple hexdump facility for binary data

2018-05-09 Thread Vincent Ryan
Thanks Roger for your comments.

The main motivator for this class is to provide a basic hex. encoder/decoder 
for smaller amounts of binary data
and to provide a hexdump encoder for larger amounts of binary data, while 
recognising the need to cater for
custom formats too.

The class does not attempt to satisfy every custom format. Instead it provides 
some basic formatting methods
and functions from which a wide variety of custom formats can be constructed. 
Developers can also write their
own formatting functions to use with this class.

More comments in-line below.


> On 9 May 2018, at 16:20, Roger Riggs  wrote:
> 
> Hi Vinnie,
> 
> On the API and spec, a few comments:
> 
>  - Expanding the printable string from ASCII to ISO-8859-1 would make it a 
> bit more useful in more cases.
>That might suggest using the Charset converter to do the work (less 
> optimized but more functional).

Yes adding support for additional charsets could be useful but it is a tradeoff 
against cluttering the API.
I think ASCII is sufficient for the common case.


> 
> - There is no API support for ByteBuffers, another common source of bytes, 
> that would make a good addition
>for completeness.  John Rose suggested a ByteSequence interface in the 
> context of file processing but
>that hasn't settled down.

Support for byte buffers was requested before and I suggested wrapping in an 
I/O stream.
Is that acceptable?


> 
> - The class name "Hex" might be a bit more evocative as HexDump or 
> HexConverter.

I tried to keep the class name short as many of its method names are long.


> 
> - Method names;  the "Hex" in method names might be unnecessary/redundant 
> since, as static methods,
>   they would frequently appear in code as "Hex.fromHexString" and a simple 
> "Hex.fromString" would be fine.
>   Ditto,  toHexString(bytes) -> toString(bytes)…

I agree the repetition is ugly. Shortening to fromString() and toString() is 
appealing except for the possible
confusion with Object.toString


> 
> - There are not many forms that allow the formatter to be supplied, for 
> example, dump(in, out) might be
>   a case where a formatter would be desired.

The 3 dump methods are just convenience functions around the stream generators. 
The dumpAsStream
methods are the expected entry points for customizers.


> 
> - Hex.Formatter interface could have a default method that provides the 
> default formatting or as
>   a static method so it can be used with a method reference.

Sure. What do you suggest?


> 
> - On the example in the class javadoc, I would use the implementation of the 
> default formatter with both hex and ascii
>   to show how that works.

Can you give me an example?


> 
>  - As Max observes, being able to supply the delimiters might be a good 
> addition.  (I'm thinking IP addresses too).

Sure. Add another toHexString method that takes a delimiter character?


> 
> It looks quite good and very useful.
> 
> Thanks, Roger
> 
> On 5/8/2018 10:34 PM, Weijun Wang wrote:
>> Nice tool.
>> 
>> However, I am not sure how toFormattedHexString() and toPrintableString() 
>> are useful, seems only for providing a customizable dump format which is, 
>> actually, not very customizable.
>> 
>> For me, toHexString and fromHexString are of course the most useful methods. 
>> As for dump, I can only think of
>> 
>> 1. The existing sun.security.HexDumpEncoder format, when I want to dump a 
>> lot of bytes as a block
>> 2. "00:11:22:33:AA:BB:CC" which fits in one line and also easy to read, when 
>> I want inline debugging output
>> 
>> If the customizable dump method is both powerful and simple enough to create 
>> 2) above, I'll be happy. Otherwise, I can live with 
>> toHexString().replaceAll("(..)(?=.)", "$1:”).
>> 
>> Thanks
>> Max
>> 
>>> On May 4, 2018, at 4:22 AM, Vincent Ryan  wrote:
>>> 
>>> Hello,
>>> 
>>> Please review this proposal for a new API to conveniently generate and 
>>> display binary data using hex string representation.
>>> It supports both bulk and stream operations and it can also generate the 
>>> well-known hexdump format [1].
>>> 
>>> Thanks
>>> 
>>> 
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170769
>>> API: 
>>> http://cr.openjdk.java.net/~vinnie/8170769/javadoc.05/api/java.base/java/util/Hex.html
>>> Webrev: http://cr.openjdk.java.net/~vinnie/8170769/webrev.05/
>>> 
>>> 
>>> [1] https://docs.oracle.com/cd/E86824_01/html/E54763/hexdump-1.html
> 



Re: [11] RFR: 8170769 Provide a simple hexdump facility for binary data

2018-05-09 Thread Roger Riggs

Hi Vinnie,

On the API and spec, a few comments:

 - Expanding the printable string from ASCII to ISO-8859-1 would make 
it a bit more useful in more cases.
   That might suggest using the Charset converter to do the work (less 
optimized but more functional).


- There is no API support for ByteBuffers, another common source of 
bytes, that would make a good addition
   for completeness.  John Rose suggested a ByteSequence interface in 
the context of file processing but

   that hasn't settled down.

- The class name "Hex" might be a bit more evocative as HexDump or 
HexConverter.


- Method names;  the "Hex" in method names might be 
unnecessary/redundant since, as static methods,
  they would frequently appear in code as "Hex.fromHexString" and a 
simple "Hex.fromString" would be fine.

  Ditto,  toHexString(bytes) -> toString(bytes)...

- There are not many forms that allow the formatter to be supplied, for 
example, dump(in, out) might be

  a case where a formatter would be desired.

- Hex.Formatter interface could have a default method that provides the 
default formatting or as

  a static method so it can be used with a method reference.

- On the example in the class javadoc, I would use the implementation of 
the default formatter with both hex and ascii

  to show how that works.

 - As Max observes, being able to supply the delimiters might be a good 
addition.  (I'm thinking IP addresses too).


It looks quite good and very useful.

Thanks, Roger

On 5/8/2018 10:34 PM, Weijun Wang wrote:

Nice tool.

However, I am not sure how toFormattedHexString() and toPrintableString() are 
useful, seems only for providing a customizable dump format which is, actually, 
not very customizable.

For me, toHexString and fromHexString are of course the most useful methods. As 
for dump, I can only think of

1. The existing sun.security.HexDumpEncoder format, when I want to dump a lot 
of bytes as a block
2. "00:11:22:33:AA:BB:CC" which fits in one line and also easy to read, when I 
want inline debugging output

If the customizable dump method is both powerful and simple enough to create 2) above, I'll be 
happy. Otherwise, I can live with toHexString().replaceAll("(..)(?=.)", "$1:").

Thanks
Max


On May 4, 2018, at 4:22 AM, Vincent Ryan  wrote:

Hello,

Please review this proposal for a new API to conveniently generate and display 
binary data using hex string representation.
It supports both bulk and stream operations and it can also generate the 
well-known hexdump format [1].

Thanks


Bug: https://bugs.openjdk.java.net/browse/JDK-8170769
API: 
http://cr.openjdk.java.net/~vinnie/8170769/javadoc.05/api/java.base/java/util/Hex.html
Webrev: http://cr.openjdk.java.net/~vinnie/8170769/webrev.05/


[1] https://docs.oracle.com/cd/E86824_01/html/E54763/hexdump-1.html




Re: [11] RFR: 8170769 Provide a simple hexdump facility for binary data

2018-05-08 Thread Weijun Wang
Nice tool.

However, I am not sure how toFormattedHexString() and toPrintableString() are 
useful, seems only for providing a customizable dump format which is, actually, 
not very customizable.

For me, toHexString and fromHexString are of course the most useful methods. As 
for dump, I can only think of

1. The existing sun.security.HexDumpEncoder format, when I want to dump a lot 
of bytes as a block
2. "00:11:22:33:AA:BB:CC" which fits in one line and also easy to read, when I 
want inline debugging output

If the customizable dump method is both powerful and simple enough to create 2) 
above, I'll be happy. Otherwise, I can live with 
toHexString().replaceAll("(..)(?=.)", "$1:").

Thanks
Max

> On May 4, 2018, at 4:22 AM, Vincent Ryan  wrote:
> 
> Hello,
> 
> Please review this proposal for a new API to conveniently generate and 
> display binary data using hex string representation.
> It supports both bulk and stream operations and it can also generate the 
> well-known hexdump format [1].
> 
> Thanks
> 
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8170769
> API: 
> http://cr.openjdk.java.net/~vinnie/8170769/javadoc.05/api/java.base/java/util/Hex.html
> Webrev: http://cr.openjdk.java.net/~vinnie/8170769/webrev.05/
> 
> 
> [1] https://docs.oracle.com/cd/E86824_01/html/E54763/hexdump-1.html



[11] RFR: 8170769 Provide a simple hexdump facility for binary data

2018-05-03 Thread Vincent Ryan
Hello,

Please review this proposal for a new API to conveniently generate and display 
binary data using hex string representation.
It supports both bulk and stream operations and it can also generate the 
well-known hexdump format [1].

Thanks


Bug: https://bugs.openjdk.java.net/browse/JDK-8170769
API: 
http://cr.openjdk.java.net/~vinnie/8170769/javadoc.05/api/java.base/java/util/Hex.html
Webrev: http://cr.openjdk.java.net/~vinnie/8170769/webrev.05/


[1] https://docs.oracle.com/cd/E86824_01/html/E54763/hexdump-1.html