On Tue, 22 Feb 2022 14:46:27 GMT, Hannes Wallnöfer <hann...@openjdk.org> wrote:

>> Please review a change to straighten out the mechanism to avoid generating 
>> empty HTML elements in javadoc. 
>> 
>> Currently this is implemented using the `Content.isValid()` method in 
>> `HtmlTree.add(Content)` to check whether the the content argument should be 
>> added or dropped. This seems wrong since we don't really want to check for 
>> validity but non-emptiness (we want to drop empty elements, not print a 
>> warning or error).
>> 
>> So the first decision was to get rid of the `Content.isValid()` method and 
>> use `Content.isEmpty()` instead. I considered keeping `isValid()` and using 
>> it to actually do some validation checks, but in the end decided it didn't 
>> make much sense and scrapped the method.
>> 
>> My initial approach was to take care of emptiness at the content creation 
>> site, but it soon turned out there are too many sites potentially creating 
>> empty content. After fixing 40+ sites using a new 
>> `Content.addNonEmpty(Content)` method I still got empty elements that 
>> shouldn't be there in the JDK documentation.
>> 
>> Therefore `HtmlTree.add(Content)` still checks for and drops empty content 
>> arguments. Instead, I added a new `HtmlTree.addUnchecked(Content)` method 
>> that adds the argument without the checks. The new method is only used 5 
>> times in the whole codebase. It replaces the use of the special 
>> `HtmlTree.EMPTY` content token which is no longer needed.
>> 
>> A few remarks about the rewritten `HtmlTree.isEmpty()` method: its purpose 
>> is to determine whether content is missing in an element that should have 
>> content. It therefore always returns `false` for elements that are expected 
>> or allowed to be without content. This is maybe a bit counter-intuitive, but 
>> it is what is needed. We could do a combined check such as `isVoid() || 
>> !isEmpty()` but that would still leave out elements that *can* be empty, 
>> such as `<script>`.
>> 
>> The implementation of `HtmlTree.isEmpty()` is deliberately simple and 
>> conservative. For instance, it doesn't look at attributes to decide whether 
>> an element is allowed to be empty. The rationale for this is to make the 
>> behavior as predictable as possible and avoid surprises.
>> 
>> Since we no longer have a validity check in `HtmlTree.add(Content)` I had to 
>> remove a few lines of code in `TestHtmlDocument.java` that expected invalid 
>> tags to be dropped. Otherwise there are no test changes, as this is a 
>> noreg-cleanup task. I did compare the javadoc JDK documention before and 
>> after this change, the output is unchanged.
>
> Hannes Wallnöfer has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Introduce Content.isDiscardable()

Generally nice.

It is still weird to me that we might create empty placeholders and then throw 
them away, but given that we do, this seems like a better (and more explicit) 
solution than what we had before.

I also like the mildly-unrelated cleanup for HtmlTree.SPAN.

There is one question about why `HtmlTree.A` might reasonably be empty in HTML 
5. The only reason I can think of is to declare an `id`, since an `href` with 
no content is unclickable. But a node with just an `id` and no content can 
apply to other nodes as well, can't it? e.g. `span`

src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/SourceToHTMLConverter.java
 line 318:

> 316:                     HtmlIds.forLine(currentLineNo),
> 317:                     Text.of(utils.replaceTabs(line)));
> 318:             pre.addUnchecked(anchor);

Why does this need to be `addUnchecked` ?

src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/markup/HtmlTree.java
 line 1017:

> 1015:      * Returns {@code true} if the HTML tree does not affect the output 
> and can be discarded.
> 1016:      * This implementation considers non-void elements without content 
> as discardable, with the
> 1017:      * exception of {@code SCRIPT} and {@code A} which can sometimes be 
> used without content.

In HTML 5, when is it reasonable to have `A` without content?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7159

Reply via email to