Thanks for your proposal.
Comments below.
> On 12 Dec 2018, at 02:35, Stuart Marks <[email protected]> wrote:
>
>
>
> On 12/11/18 1:21 PM, Vincent Ryan wrote:
>> My preference is for PrintStream rather than Writer, for the same reason as
>> Roger: it’s more convenient
>> for handling System.out. Does that address your concern?
>
> PrintStream is fine with me.
>
>> I cannot simply cast 8859-1 characters into UTF-8 because UTF-8 is
>> multi-byte charset so some 0x8X characters
>> Will trigger the multi-byte sequence and will end up being misinterpreted.
>> Hence my rather awkward conversion to a String.
>> Is there a better way?
>
> In toPrintableString(),
>
> 259 StringBuilder printable = new StringBuilder(toIndex - fromIndex);
> 260 for (int i = fromIndex; i < toIndex; i++) {
> 261 if (bytes[i] > 0x1F && bytes[i] < 0x7F) {
> 262 printable.append((char) bytes[i]);
> 263 } else if (bytes[i] > (byte)0x9F && bytes[i] <= (byte)0xFF) {
> 264 printable.append(new String(new byte[]{bytes[i]},
> ISO_8859_1));
> 265
> 266 } else {
> 267 printable.append('.');
> 268 }
> 269 }
>
> It works to cast ASCII bytes char, because the 7-bit ASCII range overlaps the
> low 7 bits of the UTF-16 char range. The bytes values of ISO 8859-1 overlap
> the low 8 bits of UTF-16, so casts work for them too.
>
> For any other charset, you'd need to do codeset conversion. But you're
> cleverly supporting only ISO 8859-1, so you don't have to do any conversion.
> :-)
>
>> I’m not sure I’ve addressed your concern regarding IOExceptions - can you
>> elaborate?
>
> Taking out the OutputStream overloads addressed my concerns. In at least one
> case the code would wrap the OutputStream into a PrintStream, print stuff to
> it, and then throw away the PrintStream. If an output error occurred, any
> error state in the PrintStream would also be thrown away. The creation of the
> PrintStream wrapper would also use the system's default charset instead of
> letting the caller control it.
>
> The dump() overloads now all take PrintStream, so it's the caller's
> responsibility to ensure that the PrintStream is using the right charset and
> to check for errors after. So this is all OK now.
>
> Note that the internal getPrintStream(), to wrap an OutputStream in a
> PrintStream, is now obsolete and can be removed.
>
> (Oh, I see Roger has said much the same things. Oh well, the peril of
> parallel reviews.)
>
> **
>
>> BTW updated webrev/javadoc available:
>> http://cr.openjdk.java.net/~vinnie/8170769/webrev.08/
>> http://cr.openjdk.java.net/~vinnie/8170769/javadoc.08/api/java.base/java/util/HexFormat.html
>
> Now we have a somewhat unsatisfying asymmetry in the APIs.
>
> There are four kinds of inputs:
>
> 1. byte[]
> 2. byte[] subrange
> 3. InputStream
> 4. ByteBuffer
>
> and two kinds of outputs:
>
> 1. PrintStream
> 2. Stream<String>
>
> and two variations of formatters:
>
> 1. default formatter
> 2. custom formatter + chunk size
>
> This is a total of 16 combinations. But there are only eight methods: three
> PrintStream methods with choice of input, two stream-output methods using the
> default formatter, and three stream-output methods using custom
> chunk+formatter.
>
> You don't have to provide ALL combinations, but what's here is an odd subset
> with some apparently arbitrary choices. For example, if I have a ByteBuffer
> and I want to dump it to System.out using default formatting, I have to go
> the Stream.forEachOrdered route AND provide the default chunk size and
> formatter.
>
> HexFormat.dumpAsStream(buf, DEFAULT_CHUNK_SIZE, HEXDUMP_FORMATTER)
> .forEachOrdered(System.out::println);
>
> These aren't huge deals, but they're easily stumbled over.
>
> One approach to organizing this is to have a HexFormat instance that contains
> the setup information and then to have instance methods that either update
> the setup or perform conversion and output. I'd use static factory methods
> instead of constructors. For example, you could do this:
>
> static factories methods:
> - from(byte[])
> - from(byte[], fromIndex, toIndex)
> - from(InputStream)
> - from(ByteBuffer)
>
> formatter setup instance methods:
> - format(chunksize, formatter)
>
> output:
> - void dump(PrintStream)
> - Stream<String> stream()
>
> Using this approach my example from above could be performed as follows:
>
> HexFormat.from(buf).dump(System.out);
>
> Note, I'm not saying that you HAVE to do it this way. (In particular, the
> naming could use work.) This is quite possibly overkill. But it's something
> you might consider, as it gets you all 16 combinations using seven methods,
> compared to the eight static methods in the current proposal that cover only
> half of the combinations.
>
> Alternatively, pare down the set of static methods to a bare minimum. Provide
> one that can do everything, and then provide one or two more that are
> essentially the same as the first but with some hardwired defaults. For
> example, to help minimize things, you can wrap a ByteBuffer around a byte
> array subrange, or get an InputStream from a byte array subrange. But you
> can't get an InputStream from a ByteBuffer or vice-versa, without a lot of
> work.
>
> (I haven't looked at the to* or from* methods.)
>
> s’marks
Your chaining approach has merit and the method chaining is attractive but it
would be a significant change to the API at this advanced stage.
I agree that there are gaps in the combinations and perhaps that could be
addressed by introducing a few convenience methods.
I think ByteBuffer is under-represented and propose adding the following two
methods to handle the simplified, default cases:
public static Stream<String> dumpAsStream(ByteBuffer buffer)
public static void dump(ByteBuffer buffer, PrintStream out)
That would be 10 combos for 10 methods versus 16 combos for 7 methods which is
still not full coverage but perhaps more methods could
be added in the future if there is demand for them.