Please review a relatively simple upgrade to the DocCommentParser for handling standalone HTML files.

In a standalone file, HTML content is treated as being in 3 parts: the preamble, the body, and the postamble, where the body is the content of an equivalent doc comment.  Traditionally, the preamble ends at the end of the opening tag for the `body` element, and the body ends at the start of the closing tag for the `body` element. In other words, the body has traditionally been the inner HTML of the enclosing `body` element.

Since then, a style has evolved where authors are wrapping the content in a `main` element as well, presumably to satisfy accessibility checkers (which is good).  But this conflicts with the traditional determination of the content of the file, because (amongst other reasons) there can only be one `main` element in a generated file.

The change is for the preamble to also include the opening tag of a `main` element if it immediately follows the opening tag of the `body` element (allowing for inter-element whitespace). The change is also for the body to stop at the closing tag of a `main` element if one is encountered.
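
To make the new rule concrete, here is a minimal sketch of the split, written against plain strings with regular expressions. It is not the DocCommentParser code, which works on its own character buffer; the class name `StandaloneHtmlSplitter` and the method `split` are hypothetical and exist only for illustration, and the sketch assumes tags are not hidden inside HTML comments or attribute values.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of javac or javadoc.
final class StandaloneHtmlSplitter {
    // Opening <body ...> tag, in any letter case.
    private static final Pattern BODY_OPEN =
            Pattern.compile("<body\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Optional inter-element whitespace followed by an opening <main ...> tag.
    private static final Pattern MAIN_OPEN =
            Pattern.compile("\\s*<main\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Start of a closing </main or </body tag, whichever is found first.
    private static final Pattern CONTENT_END =
            Pattern.compile("</(?:main|body)\\b", Pattern.CASE_INSENSITIVE);

    /** Returns { preamble, body, postamble } for the given HTML text. */
    static String[] split(String html) {
        Matcher open = BODY_OPEN.matcher(html);
        if (!open.find()) {
            // Assumption of this sketch: with no <body> tag, everything is body.
            return new String[] { "", html, "" };
        }
        int bodyStart = open.end();

        // New rule: if <main> immediately follows <body>, allowing only
        // whitespace in between, the preamble absorbs it as well.
        Matcher main = MAIN_OPEN.matcher(html).region(bodyStart, html.length());
        if (main.lookingAt()) {
            bodyStart = main.end();
        }

        // The body stops at the first closing </main> or </body> tag.
        Matcher close = CONTENT_END.matcher(html);
        int bodyEnd = close.find(bodyStart) ? close.start() : html.length();

        return new String[] {
            html.substring(0, bodyStart),
            html.substring(bodyStart, bodyEnd),
            html.substring(bodyEnd)
        };
    }
}
```

For the input `<html><body>\n<main><p>Hi</p></main></body></html>`, this sketch yields the preamble `<html><body>\n<main>`, the body `<p>Hi</p>`, and the postamble `</main></body></html>`. A file without a `main` element is split exactly as before, at the `body` tags.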

Two test cases are added to an existing test. The test cases consist of HTML files containing `main` elements as well as `body` elements. The corresponding .out files are dumps of the doc comment tree, showing the content of the preamble, body and postamble.

The change was also tested by building JDK API docs, running doccheck, and reviewing the results, to confirm that affected files which previously contained errors no longer do so.

-- Jon

JBS: https://bugs.openjdk.java.net/browse/JDK-8223805
Webrev: file:///w/jjg/work/jdk.closed.dev/8223805/webrev.00/webrev/index.html
