On Mon, 8 Aug 2022 10:03:05 GMT, Hannes Wallnöfer <hann...@openjdk.org> wrote:
>> Please review a medium-size cleanup related to handling newlines in >> `Content`/`HtmlTree`/`Text[Builder]` trees. >> This is the 3rd and last in the recent series of cleanup tasks in the code >> to generate HTML. >> >> Until recently, the `Text`/`TextBuilder` objects were "ready-for-writing", >> meaning that characters like `<`, `&`, etc were encoded as entities, and >> newlines were represented by the platform line separator. This caused >> difficulties when counting the size of generated text, since these >> characters had to be detected and accounted for. A recent change addressed >> entities, but stayed clear of the newline issue, which is now addressed here. >> >> A related performance problem is that the method `Utils.normalizeNewlines` >> always created new strings, even when it was not necessary. >> >> With this change, newlines in `Text`, `TextBuilder` and other subtypes of >> `Content` containing text are always represented by `\n`, and are converted >> to the platform line separator as needed. >> >> I note the following assumptions: >> >> * the input may contain any flavor of newlines in the user-provided >> documentation comments >> * the output should aways use the platform line separator (`\n` on Linux and >> macOS, `\r\n` on Windows etc.) >> >> Together, these imply that some amount of the input will need to be scanned >> to normalize newlines at some point in the process. While it would be nice >> to not have to normalize newlines, it's a case or "pay now, or pay later". >> >> Notable parts of this change: >> >> * `Utils.normalizeNewlines` is moved to be a static method in `Text` and >> rewritten to not create new strings unless needed. This allows a number of >> questionable imports of `toolkit.utils.DocletConstants` to be removed from >> the `html.markup` package. While it would be possible to call this method >> automatically on all strings passed to `Text`, `TextBuilder` etc, in many >> cases it is not necessary: put another way, it only needs to be called on >> strings that have come from the user, either in documentation comments or in >> command-line options. >> * `Content.write`, and the implementations thereof, are given a new >> parameter which gives the newline sequence to use. This is set to `\n` for >> `toString` and the platform separator when writing to a file. >> * `DocletConstants.NL` goes away and is superseded by: >> * `Text.NL`: always `\n`, to be used when adding text to a `Content` >> tree (this replaces most uses of `DocletConstants.NL`) >> * `DocFile.PLATFORM_LINE_SEPARATOR`: the platform line separator, to be >> used when writing to files >> * The various `charCount` methods no longer have to worry about use of the >> platform line separator >> * Assertions are used to verify that there are no `\r` characters added into >> `Text`/`TextBuilder`/`RawHtml`/`Content` >> >> Other cleanup: >> * `DocletConstants.DEFAULT_TAB_STOP_LENGTH` is moved to `BaseOptions` which >> is the only class that uses it. This leaves just two remaining constants in >> `DocletConstants` which should probably be moved elsewhere as well, and the >> `DocletConstants` class deleted. That is a step too far for the work here. >> * Minor IDE-suggestions, for lambdas, enhanced switch, adding braces, etc >> >> One test was affected by the `DocletConstants.NL` change, but apart from >> that, this is a pure-cleanup change, with no (intentional) externally >> visible effect. > > src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/markup/HtmlDocument.java > line 80: > >> 78: writer.write(docType.text); >> 79: writer.write(newline); >> 80: docContent.write(writer, newline,true); > > Missing space after comma Fixed. This is a mildly annoying side-effect of IDEA wanting to label boolean arguments: it makes the missing spaces much harder to see. > src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/markup/Text.java > line 51: > >> 49: */ >> 50: public static Text of(CharSequence content) { >> 51: assert checkNewlines(content); > > Redundant assert as private constructor also contains it. Fixed ------------- PR: https://git.openjdk.org/jdk/pull/9691