Thanks for your proposal. Comments below. > On 12 Dec 2018, at 02:35, Stuart Marks <stuart.ma...@oracle.com> wrote: > > > > On 12/11/18 1:21 PM, Vincent Ryan wrote: >> My preference is for PrintStream rather than Writer, for the same reason as >> Roger: it’s more convenient >> for handling System.out. Does that address your concern? > > PrintStream is fine with me. > >> I cannot simply cast 8859-1 characters into UTF-8 because UTF-8 is >> multi-byte charset so some 0x8X characters >> Will trigger the multi-byte sequence and will end up being misinterpreted. >> Hence my rather awkward conversion to a String. >> Is there a better way? > > In toPrintableString(), > > 259 StringBuilder printable = new StringBuilder(toIndex - fromIndex); > 260 for (int i = fromIndex; i < toIndex; i++) { > 261 if (bytes[i] > 0x1F && bytes[i] < 0x7F) { > 262 printable.append((char) bytes[i]); > 263 } else if (bytes[i] > (byte)0x9F && bytes[i] <= (byte)0xFF) { > 264 printable.append(new String(new byte[]{bytes[i]}, > ISO_8859_1)); > 265 > 266 } else { > 267 printable.append('.'); > 268 } > 269 } > > It works to cast ASCII bytes char, because the 7-bit ASCII range overlaps the > low 7 bits of the UTF-16 char range. The bytes values of ISO 8859-1 overlap > the low 8 bits of UTF-16, so casts work for them too. > > For any other charset, you'd need to do codeset conversion. But you're > cleverly supporting only ISO 8859-1, so you don't have to do any conversion. > :-) > >> I’m not sure I’ve addressed your concern regarding IOExceptions - can you >> elaborate? > > Taking out the OutputStream overloads addressed my concerns. In at least one > case the code would wrap the OutputStream into a PrintStream, print stuff to > it, and then throw away the PrintStream. If an output error occurred, any > error state in the PrintStream would also be thrown away. The creation of the > PrintStream wrapper would also use the system's default charset instead of > letting the caller control it. > > The dump() overloads now all take PrintStream, so it's the caller's > responsibility to ensure that the PrintStream is using the right charset and > to check for errors after. So this is all OK now. > > Note that the internal getPrintStream(), to wrap an OutputStream in a > PrintStream, is now obsolete and can be removed. > > (Oh, I see Roger has said much the same things. Oh well, the peril of > parallel reviews.) > > ** > >> BTW updated webrev/javadoc available: >> http://cr.openjdk.java.net/~vinnie/8170769/webrev.08/ >> http://cr.openjdk.java.net/~vinnie/8170769/javadoc.08/api/java.base/java/util/HexFormat.html > > Now we have a somewhat unsatisfying asymmetry in the APIs. > > There are four kinds of inputs: > > 1. byte[] > 2. byte[] subrange > 3. InputStream > 4. ByteBuffer > > and two kinds of outputs: > > 1. PrintStream > 2. Stream<String> > > and two variations of formatters: > > 1. default formatter > 2. custom formatter + chunk size > > This is a total of 16 combinations. But there are only eight methods: three > PrintStream methods with choice of input, two stream-output methods using the > default formatter, and three stream-output methods using custom > chunk+formatter. > > You don't have to provide ALL combinations, but what's here is an odd subset > with some apparently arbitrary choices. For example, if I have a ByteBuffer > and I want to dump it to System.out using default formatting, I have to go > the Stream.forEachOrdered route AND provide the default chunk size and > formatter. > > HexFormat.dumpAsStream(buf, DEFAULT_CHUNK_SIZE, HEXDUMP_FORMATTER) > .forEachOrdered(System.out::println); > > These aren't huge deals, but they're easily stumbled over. > > One approach to organizing this is to have a HexFormat instance that contains > the setup information and then to have instance methods that either update > the setup or perform conversion and output. I'd use static factory methods > instead of constructors. For example, you could do this: > > static factories methods: > - from(byte[]) > - from(byte[], fromIndex, toIndex) > - from(InputStream) > - from(ByteBuffer) > > formatter setup instance methods: > - format(chunksize, formatter) > > output: > - void dump(PrintStream) > - Stream<String> stream() > > Using this approach my example from above could be performed as follows: > > HexFormat.from(buf).dump(System.out); > > Note, I'm not saying that you HAVE to do it this way. (In particular, the > naming could use work.) This is quite possibly overkill. But it's something > you might consider, as it gets you all 16 combinations using seven methods, > compared to the eight static methods in the current proposal that cover only > half of the combinations. > > Alternatively, pare down the set of static methods to a bare minimum. Provide > one that can do everything, and then provide one or two more that are > essentially the same as the first but with some hardwired defaults. For > example, to help minimize things, you can wrap a ByteBuffer around a byte > array subrange, or get an InputStream from a byte array subrange. But you > can't get an InputStream from a ByteBuffer or vice-versa, without a lot of > work. > > (I haven't looked at the to* or from* methods.) > > s’marks
Your chaining approach has merit and the method chaining is attractive but it would be a significant change to the API at this advanced stage. I agree that there are gaps in the combinations and perhaps that could be addressed by introducing a few convenience methods. I think ByteBuffer is under-represented and propose adding the following two methods to handle the simplified, default cases: public static Stream<String> dumpAsStream(ByteBuffer buffer) public static void dump(ByteBuffer buffer, PrintStream out) That would be 10 combos for 10 methods versus 16 combos for 7 methods which is still not full coverage but perhaps more methods could be added in the future if there is demand for them.