On Tue, 9 Aug 2022 20:38:25 GMT, Naoto Sato <na...@openjdk.org> wrote:
>> To support Windows command prompt's codepage, following charsets should be
>> moved from jdk.charsets module to java.base module.
>>
>> - IBM860
>> - IBM861
>> - IBM863
>> - IBM864
>> - IBM865
>> - IBM869
>
> I looked at this issue a bit more. It looks to me that the issue is caused by
> the fact that the encoding of `System.out` falls back to the default
> encoding, as `IBM864` is not in `java.base`. This issue seems not new and
> reproducible with the releases since JDK9 where modularization has been
> introduced. Also, I think other encodings than those `IBM*` listed here, can
> possibly cause this issue. In order to fix this completely, those obscure
> encodings also have to be in `java.base` which I don't think we would want to
> do.

Hello @naotoj .
Sorry for my bad reaction.

I checked these charsets against the IBM CDRA definitions.
They are largely the same, but some round-trip definitions differ, as in #9661 .
I think these come from the files under https://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/ .

As you know, `CP860/CP861/CP863/CP864/CP865/CP869` are registered in [IANA Character Sets](https://www.iana.org/assignments/character-sets/character-sets.xhtml) as aliases.
Even if the registered names are `IBM*`, these charset implementations are from Microsoft.
I think these charsets should be usable as the default charset in the Windows command prompt.

Please reconsider the current Java implementation.

-------------

PR: https://git.openjdk.org/jdk/pull/9761
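
For illustration only, the fallback behaviour described in the quoted analysis can be approximated with the public `java.nio.charset.Charset` API. This is a minimal sketch, not the JDK's actual `System.out` setup code: the class name `CodepageProbe` and the helper `resolveOrFallback` are hypothetical, and the sketch only assumes that a charset missing from the runtime's module graph is reported as unsupported.

```java
import java.nio.charset.Charset;
import java.util.List;

public class CodepageProbe {

    // Hypothetical helper: return the named charset if this runtime can
    // load it, otherwise return the supplied fallback. This mirrors the
    // "fall back to the default encoding" behaviour only in spirit.
    static Charset resolveOrFallback(String name, Charset fallback) {
        try {
            return Charset.isSupported(name) ? Charset.forName(name) : fallback;
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException for syntactically invalid names.
            return fallback;
        }
    }

    public static void main(String[] args) {
        // The charsets listed in the pull request description.
        List<String> names = List.of(
                "IBM860", "IBM861", "IBM863", "IBM864", "IBM865", "IBM869");
        for (String name : names) {
            Charset cs = resolveOrFallback(name, Charset.defaultCharset());
            // If the resolved name differs from the requested one, the
            // charset was not available and the fallback was used.
            System.out.println(name + " -> " + cs.name());
        }
    }
}
```

Running this with a restricted module graph (for example `java --limit-modules java.base CodepageProbe.java`) should show which of the listed charsets are resolvable without the `jdk.charsets` module; the exact result depends on the JDK build.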