On Fri, 4 Jun 2021 13:36:27 GMT, Maxim Kartashev <github.com+28651297+mkartas...@openjdk.org> wrote:
>> Character strings within JVM are produced and consumed in several formats. >> Strings come from/to Java in the UTF8 format and POSIX APIs (like fprintf() >> or dlopen()) consume strings also in UTF8. On Windows, however, the >> situation is far less simple: some new(er) APIs expect UTF16 (wide-character >> strings), some older APIs can only work with strings in a "platform" format, >> where not all UTF8 characters can be represented; which ones can depends on >> the current "code page". >> >> This commit switches the Windows version of native library loading code to >> using the new UTF16 API `LoadLibraryW()` and attempts to streamline the use >> of various string formats in the surrounding code. >> >> Namely, exception messages are made to consume strings explicitly in the >> UTF8 format, while logging functions (that end up using legacy Windows API) >> are made to consume "platform" strings in most cases. One exception is >> `JVM_LoadLibrary()` logging where the UTF8 name of the library is logged, >> which can, of course, be fixed, but was considered not worth the additional >> code (NB: this isn't a new bug). >> >> The test runs in a separate JVM in order to make NIO happy about non-ASCII >> characters in the file name; tests are executed with LC_ALL=C and that >> doesn't let NIO work with non-ASCII file names even on Linux or MacOS. >> >> Tested by running `test/hotspot/jtreg:tier1` on Linux and >> `jtreg:test/hotspot/jtreg/runtime` on Windows 10. The new test (` >> jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode`) was explicitly ran >> on those platforms as well. >> >> Results from Linux: >> >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> >> jtreg:test/hotspot/jtreg:tier1 1784 1784 0 0 >> >> ============================== >> TEST SUCCESS >> >> >> Building target 'run-test-only' in configuration >> 'linux-x86_64-server-release' >> Test selection 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', >> will run: >> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode >> >> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode' >> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java >> Test results: passed: 1 >> >> >> Results from Windows 10: >> >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg/runtime 746 746 0 0 >> ============================== >> TEST SUCCESS >> Finished building target 'run-test-only' in configuration >> 'windows-x86_64-server-fastdebug' >> >> >> Building target 'run-test-only' in configuration >> 'windows-x86_64-server-fastdebug' >> Test selection 'test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run: >> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode >> >> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode' >> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java >> Test results: passed: 1 > > Maxim Kartashev has updated the pull request incrementally with one > additional commit since the last revision: > > Updated the test to run on Windows only and to use a character from the > supplementary plane in the path name. I came to realize that changing `os::dll_load()` to accept UTF-8 (standard or modified) will break all the users of that function except `JVM_LoadLibrary()`. Consider `os::native_java_library()` that still operates with the platform encoding on Windows and works correctly if CWD contains Latin-1 characters (assuming 1252 code page). With this change, `java` will fail to start if its path name contains, say, Æ because `os::dll_load()` will expect it to be encoded as `c3 86` (UTF-8), but will get `c6` (Latin-1) instead. One possible solution is to update all the call sites of `os::dll_load()` (quite laborous), another is to introduce `os::dll_load_utf8()` and change only `JVM_LoadLibrary()` at this point in time. Advice is welcome. ------------- PR: https://git.openjdk.java.net/jdk/pull/4169