On Fri, 13 Dec 2024 16:43:10 GMT, Nizar Benalla <nbena...@openjdk.org> wrote:

>> test/docs/jdk/javadoc/doccheck/doccheckutils/checkers/BadCharacterChecker.java
>>  line 122:
>> 
>>> 120:                 return Charset.forName(m2.group(1));
>>> 121:             }
>>> 122:             return html5 ? StandardCharsets.UTF_8 : StandardCharsets.ISO_8859_1;
>> 
>> What is the basis for assuming ISO-8859-1 for non-HTML5 files?
>
> I assumed text would be written in Latin characters, but I guess this can be 
> removed and we can simply use UTF-8?

Unicode has some characters, such as bidi control characters, that I don't want
to allow, but this test should only check for bitrot and character encoding, so
UTF_8 could work as a default.
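
To illustrate the point about defaulting to UTF_8, here is a minimal sketch, not the code in this PR. It assumes a strict decoder is used so that bytes that are not valid UTF-8 (for example, a file actually saved as ISO-8859-1) are reported rather than silently replaced. The class name `Utf8DefaultSketch` and the helpers `decodeStrict` and `hasBidiControl` are hypothetical names made up for this example; the bidi check is shown only to make clear that it is a separate concern from the encoding check.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; none of these names come from BadCharacterChecker.
final class Utf8DefaultSketch {

    // Strictly decode the bytes as UTF-8.
    // Returns the decoded text, or null if the input is not valid UTF-8.
    static String decodeStrict(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            return null; // encoding problem: the bytes are not UTF-8
        }
    }

    // Separate concern: detect bidi control characters
    // (U+202A..U+202E embeddings/overrides, U+2066..U+2069 isolates).
    static boolean hasBidiControl(String text) {
        return text.chars().anyMatch(c ->
                (c >= 0x202A && c <= 0x202E) || (c >= 0x2066 && c <= 0x2069));
    }
}
```

With this approach a lone 0xE9 byte (ISO-8859-1 "é") fails the strict decode, while the same character encoded as UTF-8 passes.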

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21879#discussion_r1884240869
