To determine how behavior changes with the <utf8:activeCodePage>UTF-8</utf8:activeCodePage> setting in the manifest, I copied the compiled test executables and modified the manifest resource of the copies to include the <utf8:activeCodePage>UTF-8</utf8:activeCodePage> setting, so the runs with and without the setting used the same compiled code.
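For readers following along, a check program of the kind discussed in this thread could be sketched roughly as below. This is not the original test program: the helper names describe_code_page and print_code_pages are hypothetical, and the Windows calls are guarded so the file still compiles on other platforms.

```cpp
#include <ostream>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Code page identifier that Windows uses for UTF-8.
constexpr unsigned kUtf8CodePage = 65001;

// Hypothetical helper: formats a code page number and tags 65001 as UTF-8.
std::string describe_code_page(unsigned cp) {
    std::string text = std::to_string(cp);
    if (cp == kUtf8CodePage) {
        text += " (UTF-8)";
    }
    return text;
}

// Hypothetical helper: writes the ANSI, OEM, and console input code pages
// of the current process to `out`. On non-Windows builds it only writes a note.
void print_code_pages(std::ostream& out) {
#ifdef _WIN32
    out << "GetACP() result: " << describe_code_page(GetACP()) << '\n'
        << "GetOEMCP() result: " << describe_code_page(GetOEMCP()) << '\n'
        << "GetConsoleCP() result: " << describe_code_page(GetConsoleCP()) << '\n';
#else
    out << "Not a Windows build; no code pages to query.\n";
#endif
}
```

Calling print_code_pages(std::cout) from main (with <iostream> included) in two copies of the same executable, one with and one without the setting in its manifest, is one way to repeat the comparison.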
I agree that UTF-8 handling for argv would be a great plus, as the JNI_CreateJavaVM function currently expects the arguments to be encoded in the character set returned by the GetACP() function. Adding the <utf8:activeCodePage>UTF-8</utf8:activeCodePage> setting to the java.manifest file would allow Unicode characters that aren't in the system default codepage to be passed as command line arguments to Java programs on Windows 10 (starting with version 1903) and Windows 11.

The JVM still needs to support the case where GetACP() returns a code page other than 65001 (UTF-8), because:
(a) the JVM can still be loaded by executables that don't have the <utf8:activeCodePage>UTF-8</utf8:activeCodePage> setting in their manifest, by loading jvm.dll through LoadLibrary and calling its JNI_CreateJavaVM function; and
(b) the JVM can still be loaded on earlier versions of Windows that don't support the <utf8:activeCodePage>UTF-8</utf8:activeCodePage> setting.

________________________________
From: core-libs-dev <core-libs-dev-r...@openjdk.java.net> on behalf of Bernd Eckenfels <e...@zusammenkunft.net>
Sent: Tuesday, October 5, 2021 5:02 AM
To: core-libs-dev <core-libs-dev@openjdk.java.net>
Subject: Re: Implementing JEP 400 on Windows 10 and Windows 11

I think the last sentence was missing a "not" and referring to the same manifest? However the results are a bit of a mess, but UTF-8 handling for argv would be a great plus (if converted correctly), right?
-- http://bernd.eckenfels.net
________________________________
From: core-libs-dev <core-libs-dev-r...@openjdk.java.net> on behalf of Magnus Ihse Bursie <magnus.ihse.bur...@oracle.com>
Sent: Tuesday, October 5, 2021 10:34:26 AM
To: John Platts <john_pla...@hotmail.com>; core-libs-dev <core-libs-dev@openjdk.java.net>
Subject: Re: Implementing JEP 400 on Windows 10 and Windows 11

On 2021-10-05 03:22, John Platts wrote:
> I wrote a test program (in C++) to detect the codepages that would be
> returned by the GetACP(), GetOEMCP(), and GetConsoleCP() functions when the
> <utf8:activeCodePage>UTF-8</utf8:activeCodePage> setting is added to the
> manifest.
>
> The <utf8:activeCodePage> manifest element (supported on Windows 10 Version
> 1903 or later) is in the
> http://schemas.microsoft.com/SMI/2019/WindowsSettings namespace and is added
> to the asmv3:WindowsSettings element as shown below:
> <asmv3:windowsSettings
>     xmlns:dpi1="http://schemas.microsoft.com/SMI/2005/WindowsSettings"
>     xmlns:dpi2="http://schemas.microsoft.com/SMI/2016/WindowsSettings"
>     xmlns:utf8="http://schemas.microsoft.com/SMI/2019/WindowsSettings">
>   <dpi1:dpiAware>true/PM</dpi1:dpiAware>
>   <dpi2:dpiAwareness>PerMonitorV2, PerMonitor, system</dpi2:dpiAwareness>
>   <utf8:activeCodePage>UTF-8</utf8:activeCodePage>
> </asmv3:windowsSettings>
>
> Here is the output of the test program without the
> <utf8:activeCodePage>UTF-8</utf8:activeCodePage> setting present in the
> executable manifest:
> GetACP() result: 1252
> GetOEMCP() result: 437
> GetConsoleCP() result: 437
> System default LCID: 1033
> User default LCID: 1033
> User default UI LCID: 1033
> Codepage from system default LCID: 1252
> Codepage from user default LCID: 1252
> Codepage from user default UI LCID: 1252
>
> Here is the output of the same test program with an executable manifest that
> includes the <utf8:activeCodePage>UTF-8</utf8:activeCodePage> setting:
> GetACP() result: 65001
> GetOEMCP() result: 65001
> GetConsoleCP() result: 437
> System default LCID: 1033
> User default LCID: 1033
> User default UI LCID: 1033
> Codepage from system default LCID: 1252
> Codepage from user default LCID: 1252
> Codepage from user default UI LCID: 1252
>
> Note that the <utf8:activeCodePage>UTF-8</utf8:activeCodePage> setting in the
> application manifest changes the results of the GetACP() and GetOEMCP() calls
> but not the GetConsoleCP() call.

This is really confusing. I'm glad you are gathering empirical evidence of how it works. :-)

> I wrote another test program, and the argument strings passed into the
> main(int argc, char** argv) function are converted to UTF-8 if the
> <utf8:activeCodePage>UTF-8</utf8:activeCodePage> setting is there in the
> application manifest whereas the argument strings passed into the main (int
> argc, char** argv) function are converted to the ANSI codepage (which is
> usually code page 1252 on US English systems) if the
> <utf8:activeCodePage>UTF-8</utf8:activeCodePage> setting is there in the
> UTF-8 manifest.

I'm not sure I understand this. What is the difference between "the application manifest" and "the UTF-8 manifest"?

/Magnus
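One way to see which encoding the argv strings actually arrive in is to dump their raw bytes. The sketch below is only an illustration and is not the second test program mentioned in the thread; the helper names hex_bytes and dump_args are hypothetical.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: renders every byte of `s` as two uppercase hex digits,
// separated by single spaces. An empty string yields an empty result.
std::string hex_bytes(const std::string& s) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char b : s) {
        if (!out.empty()) {
            out += ' ';
        }
        out += digits[b >> 4];    // high nibble
        out += digits[b & 0x0F];  // low nibble
    }
    return out;
}

// Hypothetical helper: prints the raw bytes of each command line argument,
// exactly as the C runtime handed them to main(int argc, char** argv).
void dump_args(int argc, char** argv) {
    for (int i = 0; i < argc; ++i) {
        std::printf("argv[%d] = %s\n", i, hex_bytes(argv[i]).c_str());
    }
}
```

Calling dump_args(argc, argv) from main and passing an argument such as é makes the difference visible: the character is the single byte E9 in code page 1252 and the two bytes C3 A9 in UTF-8, so the dump shows which conversion was applied.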