> Please review a medium-size cleanup related to handling newlines in 
> `Content`/`HtmlTree`/`Text[Builder]` trees.
> This is the 3rd and last in the recent series of cleanup tasks in the code to 
> generate HTML.
> 
> Until recently, the `Text`/`TextBuilder` objects were "ready-for-writing", 
> meaning that characters like `<`, `&`, etc were encoded as entities, and 
> newlines were represented by the platform line separator. This caused 
> difficulties when counting the size of generated text, since these characters 
> had to be detected and accounted for.  A recent change addressed entities, 
> but stayed clear of the newline issue, which is now addressed here.
> 
> A related performance problem is that the method `Utils.normalizeNewlines` 
> always created new strings, even when it was not necessary.
> 
> With this change, newlines in `Text`, `TextBuilder` and other subtypes of 
> `Content` containing text are always represented by `\n`, and are converted 
> to the platform line separator as needed. 
> 
> I note the following assumptions:
> 
> * the input may contain any flavor of newlines in the user-provided 
> documentation comments
> * the output should aways use the platform line separator (`\n` on Linux and 
> macOS, `\r\n` on Windows etc.)
> 
> Together, these imply that some amount of the input will need to be scanned 
> to normalize newlines at some point in the process. While it would be nice to 
> not have to normalize newlines, it's a case or "pay now, or pay later". 
> 
> Notable parts of this change:
> 
> * `Utils.normalizeNewlines` is moved to be a static method in `Text` and 
> rewritten to not create new strings unless needed. This allows a number of 
> questionable imports of `toolkit.utils.DocletConstants` to be removed from 
> the `html.markup` package.  While it would be possible to call this method 
> automatically on all strings passed to `Text`, `TextBuilder` etc, in many 
> cases it is not necessary: put another way, it only needs to be called on 
> strings that have come from the user, either in documentation comments or in 
> command-line options.
> * `Content.write`, and the implementations thereof, are given a new parameter 
> which gives the newline sequence to use. This is set to `\n` for `toString` 
> and the platform separator when writing to a file.
> * `DocletConstants.NL` goes away and is superseded by:
>    *  `Text.NL`: always `\n`, to be used when adding text to a `Content` tree 
> (this replaces most uses of `DocletConstants.NL`)
>    * `DocFile.PLATFORM_LINE_SEPARATOR`: the platform line separator, to be 
> used when writing to files
> * The various `charCount` methods no longer have to worry about use of the 
> platform line separator 
> * Assertions are used to verify that there are no `\r` characters added into 
> `Text`/`TextBuilder`/`RawHtml`/`Content`
> 
> Other cleanup:
> * `DocletConstants.DEFAULT_TAB_STOP_LENGTH` is moved to `BaseOptions` which 
> is the only class that uses it. This leaves just two remaining constants in 
> `DocletConstants` which should probably be moved elsewhere as well, and the 
> `DocletConstants` class deleted. That is a step too far for the work here.
> * Minor IDE-suggestions, for lambdas, enhanced switch, adding braces, etc
> 
> One test was affected by the `DocletConstants.NL` change, but apart from 
> that, this is a pure-cleanup change, with no (intentional) externally visible 
> effect.

Jonathan Gibbons has updated the pull request with a new target base due to a 
merge or a rebase. The incremental webrev excludes the unrelated changes 
brought in by the merge/rebase. The pull request contains three additional 
commits since the last revision:

 - Address review feedback
 - Merge remote-tracking branch 'upstream/master' into 8236048.normalizeNewlines
 - JDK-8236048: Cleanup use of Utils.normalizeNewlines

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/9691/files
  - new: https://git.openjdk.org/jdk/pull/9691/files/7eca2355..28d6401c

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=9691&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=9691&range=00-01

  Stats: 17547 lines in 716 files changed: 10199 ins; 4432 del; 2916 mod
  Patch: https://git.openjdk.org/jdk/pull/9691.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/9691/head:pull/9691

PR: https://git.openjdk.org/jdk/pull/9691

Reply via email to