Please review an enhancement to make `DocCommentParser` normalize whitespace 
inside `<pre>` elements. The normalization is conceptually simple and 
intended to be minimally invasive. Before parsing, `DocCommentParser` checks 
whether the text is a traditional doc comment and whether every line starts 
with a space character, which is commonly the case in traditional doc comments. 
If so, a single leading space is removed in block content (top level text and 
`{@code}`/`{@literal}` tags) when parsing within HTML `<pre>` tags.
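
The rule described above can be sketched roughly as follows. This is a simplified, line-based illustration and not the code in `DocCommentParser`: the class `PreWhitespaceSketch` and its methods are hypothetical names, `<pre>` detection is a plain substring check, and letting empty lines pass the "every line starts with a space" test is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual DocCommentParser implementation.
class PreWhitespaceSketch {

    // Precondition check: every line starts with a space.
    // Assumption: empty lines do not disqualify the comment.
    static boolean everyLineStartsWithSpace(List<String> lines) {
        for (String line : lines) {
            if (!line.isEmpty() && line.charAt(0) != ' ') {
                return false;
            }
        }
        return true;
    }

    // Removes one leading space from lines that follow a <pre> open tag,
    // up to and including the line that holds the </pre> close tag.
    static List<String> normalize(List<String> lines) {
        if (!everyLineStartsWithSpace(lines)) {
            return lines; // leave the comment untouched
        }
        List<String> out = new ArrayList<>();
        boolean inPre = false;
        for (String line : lines) {
            boolean strip = inPre && line.startsWith(" ");
            out.add(strip ? line.substring(1) : line);
            // Update the state after handling the current line.
            if (line.contains("<pre>")) {
                inPre = true;
            }
            if (line.contains("</pre>")) {
                inPre = false;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> comment = List.of(" Example:", " <pre>", "   int x = 1;", " </pre>");
        normalize(comment).forEach(System.out::println);
    }
}
```

The sketch leaves out the distinction between block content and inline content, which the real change tracks through the `inBlockContent` arguments.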

This fixes the incidental one-space indentation in the vast majority of JDK 
code samples using `<pre>` alone or in combination with `<code>` or `{@code}`. 
In fact, I only found one code sample in JDK code that isn't solved by this 
change, for which I included a fix in this PR (it's in 
`String.startsWith(String, int)`, where I replaced the 10 char indentation and 
trailing line with a `<blockquote>`). 

The many added `boolean inBlockContent` arguments passed around in 
`DocCommentParser` make sure the removal is not applied to multiline 
inline content. That is perhaps a bit fussy, since there is not much 
multiline inline content in `<pre>` tags and it usually would not suffer from 
the removal of a non-essential space character, but I wanted to keep the 
change minimal. A few javadoc tests had to be adapted; most of the 
testing is done in `test/langtools/tools/javac/doctree`. 

If the exact amount of leading whitespace in `<pre>` tags is important to any 
javadoc user, the old output can be restored by increasing the indentation by 
one space. There will be a release note for this, of course. 

Unfortunately, there is another whitespace problem that can't be solved as 
easily, and that is a leading blank line caused by `<pre><code>\n` open tags. 
Browsers will [ignore a newline immediately following a `<pre>` tag][1], but 
not if there is a `<code>` tag in between. There are hundreds of occurrences of 
this in JDK code, including variants with space characters mixed in. The fix in 
javadoc proper would be too complex, so I decided to solve it with 3 lines of 
JavaScript and a regex to reverse the order of `<code>\n` at the beginning of 
`<pre>` tags while removing any intermediary space. The script's operation is 
indiscernible, and it solves the problem.
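
To illustrate the reordering, here is the same string transformation written in Java with `java.util.regex`. This is only a sketch of the idea: the PR does it with JavaScript in the browser, the class name `PreCodeReorderSketch` is hypothetical, and the pattern is an assumption that handles only a bare `<code>` tag without attributes.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the transformation; the PR uses JavaScript instead.
class PreCodeReorderSketch {

    // Matches "<pre>", optional spaces or tabs, "<code>", optional spaces
    // or tabs, and the newline that follows.
    private static final Pattern PRE_CODE_NEWLINE =
            Pattern.compile("<pre>[ \\t]*<code>[ \\t]*\\n");

    // Moves the newline in front of <code> so that it directly follows
    // <pre>, and drops the spaces in between.
    static String reorder(String html) {
        return PRE_CODE_NEWLINE.matcher(html).replaceAll("<pre>\n<code>");
    }

    public static void main(String[] args) {
        String html = "<pre><code>\nint x = 1;\n</code></pre>";
        System.out.println(reorder(html));
    }
}
```

After the swap the newline immediately follows `<pre>`, which is the position in which browsers ignore it.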

[1]: https://html.spec.whatwg.org/#the-pre-element:the-pre-element

-------------

Commit messages:
 - Make sure whitespace normalization is only applied to block content
 - Only normalize inside <pre> tags, add tests
 - 8346118: Improve whitespace normalization in preformatted text

Changes: https://git.openjdk.org/jdk/pull/23868/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23868&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8346118
  Stats: 334 lines in 11 files changed: 258 ins; 0 del; 76 mod
  Patch: https://git.openjdk.org/jdk/pull/23868.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23868/head:pull/23868

PR: https://git.openjdk.org/jdk/pull/23868
