FYI I’ve updated the webrev/javadoc with latest edits including the 2 new ByteBuffer methods:
http://cr.openjdk.java.net/~vinnie/8170769/webrev.09/ <http://cr.openjdk.java.net/~vinnie/8170769/webrev.09/> http://cr.openjdk.java.net/~vinnie/8170769/javadoc.09/api/java.base/java/util/HexFormat.html <http://cr.openjdk.java.net/~vinnie/8170769/javadoc.09/api/java.base/java/util/HexFormat.html> (NOTE: internal link to HexFormat.Formatter class is currently broken) If that is an acceptable compromise then I believe all outstanding issues have now been addressed. Thanks to all those who provided review comments. > On 12 Dec 2018, at 12:32, Vincent Ryan <vincent.x.r...@oracle.com> wrote: > > Thanks for your proposal. > Comments below. > >> On 12 Dec 2018, at 02:35, Stuart Marks <stuart.ma...@oracle.com> wrote: >> >> >> >> On 12/11/18 1:21 PM, Vincent Ryan wrote: >>> My preference is for PrintStream rather than Writer, for the same reason as >>> Roger: it’s more convenient >>> for handling System.out. Does that address your concern? >> >> PrintStream is fine with me. >> >>> I cannot simply cast 8859-1 characters into UTF-8 because UTF-8 is >>> multi-byte charset so some 0x8X characters >>> Will trigger the multi-byte sequence and will end up being misinterpreted. >>> Hence my rather awkward conversion to a String. >>> Is there a better way? >> >> In toPrintableString(), >> >> 259 StringBuilder printable = new StringBuilder(toIndex - fromIndex); >> 260 for (int i = fromIndex; i < toIndex; i++) { >> 261 if (bytes[i] > 0x1F && bytes[i] < 0x7F) { >> 262 printable.append((char) bytes[i]); >> 263 } else if (bytes[i] > (byte)0x9F && bytes[i] <= (byte)0xFF) { >> 264 printable.append(new String(new byte[]{bytes[i]}, >> ISO_8859_1)); >> 265 >> 266 } else { >> 267 printable.append('.'); >> 268 } >> 269 } >> >> It works to cast ASCII bytes char, because the 7-bit ASCII range overlaps >> the low 7 bits of the UTF-16 char range. The bytes values of ISO 8859-1 >> overlap the low 8 bits of UTF-16, so casts work for them too. >> >> For any other charset, you'd need to do codeset conversion. But you're >> cleverly supporting only ISO 8859-1, so you don't have to do any conversion. >> :-) >> >>> I’m not sure I’ve addressed your concern regarding IOExceptions - can you >>> elaborate? >> >> Taking out the OutputStream overloads addressed my concerns. In at least one >> case the code would wrap the OutputStream into a PrintStream, print stuff to >> it, and then throw away the PrintStream. If an output error occurred, any >> error state in the PrintStream would also be thrown away. The creation of >> the PrintStream wrapper would also use the system's default charset instead >> of letting the caller control it. >> >> The dump() overloads now all take PrintStream, so it's the caller's >> responsibility to ensure that the PrintStream is using the right charset and >> to check for errors after. So this is all OK now. >> >> Note that the internal getPrintStream(), to wrap an OutputStream in a >> PrintStream, is now obsolete and can be removed. >> >> (Oh, I see Roger has said much the same things. Oh well, the peril of >> parallel reviews.) >> >> ** >> >>> BTW updated webrev/javadoc available: >>> http://cr.openjdk.java.net/~vinnie/8170769/webrev.08/ >>> http://cr.openjdk.java.net/~vinnie/8170769/javadoc.08/api/java.base/java/util/HexFormat.html >> >> Now we have a somewhat unsatisfying asymmetry in the APIs. >> >> There are four kinds of inputs: >> >> 1. byte[] >> 2. byte[] subrange >> 3. InputStream >> 4. ByteBuffer >> >> and two kinds of outputs: >> >> 1. PrintStream >> 2. Stream<String> >> >> and two variations of formatters: >> >> 1. default formatter >> 2. custom formatter + chunk size >> >> This is a total of 16 combinations. But there are only eight methods: three >> PrintStream methods with choice of input, two stream-output methods using >> the default formatter, and three stream-output methods using custom >> chunk+formatter. >> >> You don't have to provide ALL combinations, but what's here is an odd subset >> with some apparently arbitrary choices. For example, if I have a ByteBuffer >> and I want to dump it to System.out using default formatting, I have to go >> the Stream.forEachOrdered route AND provide the default chunk size and >> formatter. >> >> HexFormat.dumpAsStream(buf, DEFAULT_CHUNK_SIZE, HEXDUMP_FORMATTER) >> .forEachOrdered(System.out::println); >> >> These aren't huge deals, but they're easily stumbled over. >> >> One approach to organizing this is to have a HexFormat instance that >> contains the setup information and then to have instance methods that either >> update the setup or perform conversion and output. I'd use static factory >> methods instead of constructors. For example, you could do this: >> >> static factories methods: >> - from(byte[]) >> - from(byte[], fromIndex, toIndex) >> - from(InputStream) >> - from(ByteBuffer) >> >> formatter setup instance methods: >> - format(chunksize, formatter) >> >> output: >> - void dump(PrintStream) >> - Stream<String> stream() >> >> Using this approach my example from above could be performed as follows: >> >> HexFormat.from(buf).dump(System.out); >> >> Note, I'm not saying that you HAVE to do it this way. (In particular, the >> naming could use work.) This is quite possibly overkill. But it's something >> you might consider, as it gets you all 16 combinations using seven methods, >> compared to the eight static methods in the current proposal that cover only >> half of the combinations. >> >> Alternatively, pare down the set of static methods to a bare minimum. >> Provide one that can do everything, and then provide one or two more that >> are essentially the same as the first but with some hardwired defaults. For >> example, to help minimize things, you can wrap a ByteBuffer around a byte >> array subrange, or get an InputStream from a byte array subrange. But you >> can't get an InputStream from a ByteBuffer or vice-versa, without a lot of >> work. >> >> (I haven't looked at the to* or from* methods.) >> >> s’marks > > Your chaining approach has merit and the method chaining is attractive but it > would be a significant change to the API at this advanced stage. > > I agree that there are gaps in the combinations and perhaps that could be > addressed by introducing a few convenience methods. > I think ByteBuffer is under-represented and propose adding the following two > methods to handle the simplified, default cases: > > public static Stream<String> dumpAsStream(ByteBuffer buffer) > > public static void dump(ByteBuffer buffer, PrintStream out) > > > That would be 10 combos for 10 methods versus 16 combos for 7 methods which > is still not full coverage but perhaps more methods could > be added in the future if there is demand for them. > > > >