Integrated: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character
On Sun, 24 Apr 2022 09:18:54 GMT, Ichiroh Takiguchi wrote: > On JDK19 with Linux ja_JP.eucjp locale, > System.getenv() returns unexpected value if environment variable has Japanese > EUC characters. > It seems this issue happens because of JEP 400. > Arguments for ProcessBuilder have same kind of issue. This pull request has now been integrated. Changeset: 890771e7 Author:Ichiroh Takiguchi URL: https://git.openjdk.java.net/jdk/commit/890771e708277c5f7ea9460ff7bcc7e4cae87eab Stats: 264 lines in 5 files changed: 217 ins; 24 del; 23 mod 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character Reviewed-by: naoto, rriggs - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v2]
On Mon, 2 May 2022 15:00:07 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > Can you clarify the use of different charset properties for the > ProcessEnvironment vs the CommandLine strings? > > They are used to decode or encode strings in the APIs to native functions > respectively. > If I understand `native.encoding` was introduced to reflect the default > encoding of file contents, while `sun.jnu.encoding` is used to encode > filenames. > > I think I would have expected to see `sun.jnu.encoding` to be used in all > contexts when encoding/decoding strings to the native representations. > (Assuming its not required for backward compatibility). Hello @RogerRiggs . Do you have any comment ? - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v9]
On Tue, 26 Apr 2022 21:17:34 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > Thanks for fixing the issue. > As to the test, it has lots of redundancy, and too complicated to me. Can you > simplify the test? Hello @naotoj . To keep consistency, I modified other javadoc texts on `StaticProperty.java`. And fix javdoc tag error. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v9]
> On JDK19 with Linux ja_JP.eucjp locale, > System.getenv() returns unexpected value if environment variable has Japanese > EUC characters. > It seems this issue happens because of JEP 400. > Arguments for ProcessBuilder have same kind of issue. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - Changes: - all: https://git.openjdk.java.net/jdk/pull/8378/files - new: https://git.openjdk.java.net/jdk/pull/8378/files/12018014..b940a0d6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=8378=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk=8378=07-08 Stats: 43 lines in 1 file changed: 0 ins; 28 del; 15 mod Patch: https://git.openjdk.java.net/jdk/pull/8378.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8378/head:pull/8378 PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v8]
On Fri, 13 May 2022 18:29:56 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > src/java.base/share/classes/jdk/internal/util/StaticProperty.java line 251: > >> 249: >> 250: /** >> 251: * Return the {@code sun.jnu.encoding} system property. > > This can be eliminated by changing the`@return` block tag below to `{@return > the {@code sun.jnu.encoding} ...}`. Hello @naotoj . I appreciate your comment. I'd like to confirm one thing. I applied following change /** * {@return the {@code sun.jnu.encoding} system property} * * {@link SecurityManager#checkPropertyAccess} is NOT checked * in this method. The caller of this method should take care to ensure * that the returned property is not made accessible to untrusted code. */ By javadoc command with -html5 option, above part was converted to Returns the sun.jnu.encoding system property. SecurityManager.checkPropertyAccess(java.lang.String) is NOT checked in this method. The caller of this method should take care to ensure that the returned property is not made accessible to untrusted code. The 1st word changed from **Return** to **Returns**. Is it OK ? And it seems `@return` is translated to Japanese on Japanese environment. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v2]
On Mon, 2 May 2022 15:00:07 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > Can you clarify the use of different charset properties for the > ProcessEnvironment vs the CommandLine strings? > > They are used to decode or encode strings in the APIs to native functions > respectively. > If I understand `native.encoding` was introduced to reflect the default > encoding of file contents, while `sun.jnu.encoding` is used to encode > filenames. > > I think I would have expected to see `sun.jnu.encoding` to be used in all > contexts when encoding/decoding strings to the native representations. > (Assuming its not required for backward compatibility). Hello @RogerRiggs and @naotoj . I modified `ProcessEnvironment.java`and `ProcessImpl.java`. `i18nEnvArg.java` checks return code instead of string detection. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v8]
> On JDK19 with Linux ja_JP.eucjp locale, > System.getenv() returns unexpected value if environment variable has Japanese > EUC characters. > It seems this issue happens because of JEP 400. > Arguments for ProcessBuilder have same kind of issue. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - Changes: - all: https://git.openjdk.java.net/jdk/pull/8378/files - new: https://git.openjdk.java.net/jdk/pull/8378/files/6dbaa751..12018014 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=8378=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk=8378=06-07 Stats: 72 lines in 3 files changed: 10 ins; 2 del; 60 mod Patch: https://git.openjdk.java.net/jdk/pull/8378.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8378/head:pull/8378 PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v7]
> On JDK19 with Linux ja_JP.eucjp locale, > System.getenv() returns unexpected value if environment variable has Japanese > EUC characters. > It seems this issue happens because of JEP 400. > Arguments for ProcessBuilder have same kind of issue. Ichiroh Takiguchi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: - Merge branch 'master' into 8285517 - 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - Changes: https://git.openjdk.java.net/jdk/pull/8378/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk=8378=06 Stats: 223 lines in 5 files changed: 213 ins; 0 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/8378.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8378/head:pull/8378 PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v6]
On Sat, 7 May 2022 06:50:40 GMT, Ichiroh Takiguchi wrote: >> On JDK19 with Linux ja_JP.eucjp locale, >> System.getenv() returns unexpected value if environment variable has >> Japanese EUC characters. >> It seems this issue happens because of JEP 400. >> Arguments for ProcessBuilder have same kind of issue. > > Ichiroh Takiguchi has updated the pull request incrementally with one > additional commit since the last revision: > > 8285517: System.getenv() returns unexpected value if environment variable > has non ASCII character I checked internal data $ cat StaticPropertyVals.java import java.lang.reflect.*; public class StaticPropertyVals { public static void main(String[] args) throws Exception { Class cls = Class.forName("jdk.internal.util.StaticProperty"); for (Field fid : cls.getDeclaredFields()) { if (Modifier.isStatic(fid.getModifiers())) { fid.setAccessible(true); System.out.println(fid.getName() + "=" + fid.get(null)); } } } } $ env LANG=kk_KZ.pt154 LC_ALL=kk_KZ.pt154 java -XshowSettings:properties --add-opens=java.base/jdk.internal.util=ALL-UNNAMED StaticPropertyVals 2>&1 | egrep -i 'encoding|charset' WARNING: The encoding of the underlying platform's file system is not supported: PT154 file.encoding = UTF-8 native.encoding = PT154 sun.io.unicode.encoding = UnicodeLittle sun.jnu.encoding = UTF-8 NATIVE_ENCODING=PT154 FILE_ENCODING=UTF-8 SUN_JNU_ENCODING=UTF-8 jnuCharset=UTF-8 $ - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v2]
On Mon, 2 May 2022 15:00:07 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > Can you clarify the use of different charset properties for the > ProcessEnvironment vs the CommandLine strings? > > They are used to decode or encode strings in the APIs to native functions > respectively. > If I understand `native.encoding` was introduced to reflect the default > encoding of file contents, while `sun.jnu.encoding` is used to encode > filenames. > > I think I would have expected to see `sun.jnu.encoding` to be used in all > contexts when encoding/decoding strings to the native representations. > (Assuming its not required for backward compatibility). Hello @RogerRiggs . When I executed the new testcase with JDK18.0.1, I got following errors: ERROR: argument EUC_JP_TEXT is: Actual: \u7FB2\u221A\uFFFD Expected: \u6F22\u5B57 ERROR: Variable EUC_JP_TEXT is encoded by: Actual: \xE6\xBC\xA2\xE5\xAD\x97 Expected: \xB4\xC1\xBB\xFA ERROR: Value EUC_JP_TEXT is encoded by: Actual: \xE6\xBC\xA2\xE5\xAD\x97 Expected: \xB4\xC1\xBB\xFA The new testcase verifies internal byte data for EUC_JP_TEXT environment variable and value. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v5]
On Fri, 6 May 2022 15:13:38 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > test/jdk/java/lang/ProcessBuilder/Basic.java line 606: > >> 604: ? Charset.forName(jnuEncoding, Charset.defaultCharset()) >> 605: : Charset.defaultCharset(); >> 606: if (new String(tested.getBytes(cs), cs).equals(tested)) { > > Isn't it always true that the round trip encoding to bytes and back (using > the same Charset) to a string is equal()? > And if it is always true, then the if(...) can be removed. Above code is related to following code: https://github.com/openjdk/jdk/blob/5212535a276a92d96ca20bdcfccfbce956febdb1/test/jdk/java/lang/ProcessBuilder/Basic.java#L1569-L1570 If `Charset.defaultCharset()` is `UTF-8`, this code is not skipped. I think this code will be skipped on JDK17 on C locale. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v6]
> On JDK19 with Linux ja_JP.eucjp locale, > System.getenv() returns unexpected value if environment variable has Japanese > EUC characters. > It seems this issue happens because of JEP 400. > Arguments for ProcessBuilder have same kind of issue. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - Changes: - all: https://git.openjdk.java.net/jdk/pull/8378/files - new: https://git.openjdk.java.net/jdk/pull/8378/files/5bd8492f..994a7fd0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=8378=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk=8378=04-05 Stats: 82 lines in 1 file changed: 47 ins; 12 del; 23 mod Patch: https://git.openjdk.java.net/jdk/pull/8378.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8378/head:pull/8378 PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v4]
On Thu, 5 May 2022 01:34:48 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > test/jdk/java/lang/System/i18nEnvArg.java line 70: > >> 68: Map environ = pb.environment(); >> 69: environ.clear(); >> 70: environ.put("LANG", "ja_JP.eucjp"); > > There are many duplicate pieces of code here and in the `else` block below. > Can you simplify this `if` statement more? Modified. But I'm not sure, it's expected one. > test/jdk/java/lang/System/i18nEnvArg.java line 110: > >> 108: String s = System.getenv(EUC_JP_TEXT); >> 109: ByteArrayOutputStream baos = new ByteArrayOutputStream(); >> 110: PrintStream ps = new PrintStream(baos); > > Can utilize try-with-resources pattern. Use `shouldNotContain()` to find the error message. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v2]
On Mon, 2 May 2022 15:00:07 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > Can you clarify the use of different charset properties for the > ProcessEnvironment vs the CommandLine strings? > > They are used to decode or encode strings in the APIs to native functions > respectively. > If I understand `native.encoding` was introduced to reflect the default > encoding of file contents, while `sun.jnu.encoding` is used to encode > filenames. > > I think I would have expected to see `sun.jnu.encoding` to be used in all > contexts when encoding/decoding strings to the native representations. > (Assuming its not required for backward compatibility). Hello @RogerRiggs and @naotoj . I put sun.jnu.encoding related code into s`StaticProperty.java`. Could you review the files again including javadoc comment ? - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v5]
> On JDK19 with Linux ja_JP.eucjp locale, > System.getenv() returns unexpected value if environment variable has Japanese > EUC characters. > It seems this issue happens because of JEP 400. > Arguments for ProcessBuilder have same kind of issue. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - Changes: - all: https://git.openjdk.java.net/jdk/pull/8378/files - new: https://git.openjdk.java.net/jdk/pull/8378/files/84e6d639..5bd8492f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=8378=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk=8378=03-04 Stats: 103 lines in 4 files changed: 39 ins; 39 del; 25 mod Patch: https://git.openjdk.java.net/jdk/pull/8378.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8378/head:pull/8378 PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v4]
On Thu, 5 May 2022 01:31:24 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > src/java.base/unix/classes/java/lang/ProcessEnvironment.java line 73: > >> 71: @SuppressWarnings("removal") >> 72: String jnuEncoding = >> AccessController.doPrivileged((PrivilegedAction) () >> 73: -> System.getProperty("sun.jnu.encoding")); > > No need to issue `doPrivileged()`, which is deprecated Hello @naotoj . If I just use `System.getProperty("sun.jnu.encoding")`, following testcases were failed. java/lang/ProcessBuilder/SecurityManagerClinit.java java/lang/ProcessHandle/PermissionTest.java Please give me your suggestion again. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v4]
On Tue, 26 Apr 2022 21:17:34 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > Thanks for fixing the issue. > As to the test, it has lots of redundancy, and too complicated to me. Can you > simplify the test? Hello @naotoj . I removed `i18nEnvArg.sh` and modified `i18nEnvArg.java`. Could you review it again ? - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v4]
> On JDK19 with Linux ja_JP.eucjp locale, > System.getenv() returns unexpected value if environment variable has Japanese > EUC characters. > It seems this issue happens because of JEP 400. > Arguments for ProcessBuilder have same kind of issue. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - Changes: - all: https://git.openjdk.java.net/jdk/pull/8378/files - new: https://git.openjdk.java.net/jdk/pull/8378/files/9db79874..84e6d639 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=8378=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk=8378=02-03 Stats: 107 lines in 2 files changed: 40 ins; 60 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/8378.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8378/head:pull/8378 PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v2]
On Tue, 3 May 2022 18:35:33 GMT, Naoto Sato wrote: >> I think 3 processes required for this testing. >> 1. Start test process on ja_JP.eucjp locale >> 2. Set the test data by encoder >> 3. Verify the encoded data by decoder >> >> To verify the encoded data, I think I need to check the byte data. > > Do we need to verify the intermediate byte encoding? Could we simply compare > the text given to environ.put() in Start process and System.getenv() in > Verify process match? Hello @naotoj . I think if 2nd process' encoder (like UTF-8) and 3rd process' decoder are same encoding, System.getenv() returns expected data. Actually the testcase checks 3 parts on Verify process: 1. Expected environment variable is defined or not (it uses System.getenv()) 2. Expected argument is received or not 3. Expected environment variable is encoded by proper encoding When I ran this testcase with jdk18.0.1, I got following result: Unexpected argument was received: \u6F22\u5B57<->\u7FB2\u221A\uFFFD Unexpected environment variables: \xE6\xBC\xA2\xE5\xAD\x97\x3D\xE6\xBC\xA2\xE5\xAD\x97 It means 1st test was passed, 2nd and 3rd test were failed. I don't think environment variable issue can be seen without 3rd test. Please let me know if you find out another way. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v2]
On Mon, 2 May 2022 20:50:11 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > test/jdk/java/lang/System/i18nEnvArg.java line 63: > >> 61: } >> 62: >> 63: public static class Start { > > Do we need `Start` class? Can it be merged with the parent to reduce the > complexity? I'm not sure, it expected one... I created i18nEnvArg.sh for start-up, then add `@modules jdk.charsets` and remove `i18nEnvArg$Start` class. > test/jdk/java/lang/System/i18nEnvArg.java line 87: > >> 85: System.getProperty("java.class.path"), >> 86: "i18nEnvArg$Verify", >> 87: EUC_JP_TEXT); > > Can utilize `ProcessTools` class. Use `ProcessTools` > test/jdk/java/lang/System/i18nEnvArg.java line 130: > >> 128: environ_mid.setAccessible(true); >> 129: byte[][] environ = (byte[][]) environ_mid.invoke(null, >> 130: (Object[])null); > > I am not sure I like this piece, invoking a package private method of > `ProcessEnvironment` class. Can we simply verify the result of > `System.getenv(EUC_JP_TEXT)`? I think 3 processes required for this testing. 1. Start test process on ja_JP.eucjp locale 2. Set the test data by encoder 3. Verify the encoded data by decoder To verify the encoded data, I think I need to check the byte data. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v2]
On Mon, 2 May 2022 15:00:07 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > Can you clarify the use of different charset properties for the > ProcessEnvironment vs the CommandLine strings? > > They are used to decode or encode strings in the APIs to native functions > respectively. > If I understand `native.encoding` was introduced to reflect the default > encoding of file contents, while `sun.jnu.encoding` is used to encode > filenames. > > I think I would have expected to see `sun.jnu.encoding` to be used in all > contexts when encoding/decoding strings to the native representations. > (Assuming its not required for backward compatibility). Hello @RogerRiggs and @naotoj . I put new testcase. I'm not sure, it's expected one... During my code changes, SecurityManager related testcases were failed. I'm not sure I can use `System.getProperty("sun.jnu.encoding")` or not. So I put SecurityManager related code. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v3]
> On JDK19 with Linux ja_JP.eucjp locale, > System.getenv() returns unexpected value if environment variable has Japanese > EUC characters. > It seems this issue happens because of JEP 400. > Arguments for ProcessBuilder have same kind of issue. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - Changes: - all: https://git.openjdk.java.net/jdk/pull/8378/files - new: https://git.openjdk.java.net/jdk/pull/8378/files/050dd663..9db79874 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=8378=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk=8378=01-02 Stats: 156 lines in 5 files changed: 83 ins; 50 del; 23 mod Patch: https://git.openjdk.java.net/jdk/pull/8378.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8378/head:pull/8378 PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v2]
On Mon, 2 May 2022 15:00:07 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > Can you clarify the use of different charset properties for the > ProcessEnvironment vs the CommandLine strings? > > They are used to decode or encode strings in the APIs to native functions > respectively. > If I understand `native.encoding` was introduced to reflect the default > encoding of file contents, while `sun.jnu.encoding` is used to encode > filenames. > > I think I would have expected to see `sun.jnu.encoding` to be used in all > contexts when encoding/decoding strings to the native representations. > (Assuming its not required for backward compatibility). Hello @RogerRiggs . You are right. Both case, `sun.jnu.encoding` should be used. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v2]
On Wed, 27 Apr 2022 17:20:11 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > src/java.base/unix/classes/java/lang/ProcessEnvironment.java line 68: > >> 66: private static final Map theUnmodifiableEnvironment; >> 67: static final int MIN_NAME_LENGTH = 0; >> 68: static final Charset cs = >> Charset.forName(StaticProperty.nativeEncoding(), > > cs should have an all-CAPS name to make it clear its a static value > throughout the implementation. > And `private` too. Change to "private static final" > test/jdk/java/lang/System/i18nEnvArg.java line 41: > >> 39: >> 40: public class i18nEnvArg { >> 41: final static String text = "\u6F22\u5B57"; > > Can this be renamed to indicate it is the string under test? like KANJI_TEXT > or EUC_JP_TEXT or similar. Use EUC_JP_TEXT > test/jdk/java/lang/System/i18nEnvArg.java line 49: > >> 47: final static int maxSize = 4096; >> 48: >> 49: public static void main(String[] args) throws Exception { > > There are 3 main()'s in this test; it would be useful to add a comment > indicating the purpose of each. Add comments > test/jdk/java/lang/System/i18nEnvArg.java line 60: > >> 58: Process p = pb.start(); >> 59: InputStream is = p.getInputStream(); >> 60: byte[] ba = new byte[maxSize]; > > You can use `InputStream.readAllBytes()` here to return a byte buffer. > > Its not clear why all the bytes are read? I assume its only for a possible > error from the child. Use InputStream.readAllBytes() > test/jdk/java/lang/System/i18nEnvArg.java line 128: > >> 126: Method enviorn_mid = cls.getDeclaredMethod("environ"); >> 127: enviorn_mid.setAccessible(true); >> 128: byte[][] environ = (byte[][]) enviorn_mid.invoke(null, > > typo: "enviorn_mid" -> "environ_mid". Fix typo > test/jdk/java/lang/System/i18nEnvArg.java line 138: > >> 136: bb.put(environ[i+1]); >> 137: byte[] envb = Arrays.copyOf(ba, bb.position()); >> 138: if (Arrays.equals(kanji, envb)) return; > > It seems odd to exit the loop on the first match. > > I think AIX always has environment variables not set by the caller. On this testing, use 2 environment variables: * LANG=ja_JP.eucjp * \u6F22\u5B57=\u6F22\u5B57 If the other environment variable is defined, it's printed. > test/jdk/java/lang/System/i18nEnvArg.java line 146: > >> 144: sb.append((char)b); >> 145: } else { >> 146: sb.append(String.format("\\x%02X", (int)b & >> 0xFF)); > > The methods of java.util.HexFormat may be useful here. > It seems that the "5C" would fit into this "else" clause and not need a > separate statement. Use HexFormat class - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v2]
On Tue, 26 Apr 2022 21:17:34 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > Thanks for fixing the issue. > As to the test, it has lots of redundancy, and too complicated to me. Can you > simplify the test? Hello @naotoj and @RogerRiggs . I appreciate your comments. I applied the changes. Additionally, I put bugid into test/jdk/java/lang/ProcessBuilder/Basic.java > src/java.base/unix/classes/java/lang/ProcessImpl.java line 146: > >> 144: private static final LaunchMechanism launchMechanism = >> platform.launchMechanism(); >> 145: private static final byte[] helperpath = >> toCString(StaticProperty.javaHome() + "/lib/jspawnhelper"); >> 146: private static Charset jnuCharset = null; > > Can we simply call `jnuCharset()` here? Change jnuCharset to "private static". > test/jdk/java/lang/ProcessBuilder/Basic.java line 605: > >> 603: String fileEncoding = System.getProperty("file.encoding"); >> 604: Charset cs; >> 605: if (nativeEncoding != null && >> !nativeEncoding.equals(fileEncoding)) { > > What's the reason for comparing `file.encoding`? Does > `Charset.forName(nativeEncoding, Charset.defaultCharset())` suffice? Remove file.encoding related code. > test/jdk/java/lang/System/i18nEnvArg.java line 26: > >> 24: /* >> 25: * @test >> 26: * @bug 8285517 > > If the test should work only on a particular env, should describe here and > add `@requires` tag. Add "@requires (os.family == "linux")" > test/jdk/java/lang/System/i18nEnvArg.java line 50: > >> 48: >> 49: public static void main(String[] args) throws Exception { >> 50: ProcessBuilder pb = new ProcessBuilder(javeExe, > > Can utilize `jdk.test.lib.process.ProcessTools`. Use ProcessTools class. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v2]
> On JDK19 with Linux ja_JP.eucjp locale, > System.getenv() returns unexpected value if environment variable has Japanese > EUC characters. > It seems this issue happens because of JEP 400. > Arguments for ProcessBuilder have same kind of issue. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - Changes: - all: https://git.openjdk.java.net/jdk/pull/8378/files - new: https://git.openjdk.java.net/jdk/pull/8378/files/89ee195a..050dd663 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=8378=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk=8378=00-01 Stats: 106 lines in 4 files changed: 26 ins; 41 del; 39 mod Patch: https://git.openjdk.java.net/jdk/pull/8378.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8378/head:pull/8378 PR: https://git.openjdk.java.net/jdk/pull/8378
RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character
On JDK19 with Linux ja_JP.eucjp locale, System.getenv() returns unexpected value if environment variable has Japanese EUC characters. It seems this issue happens because of JEP 400. Arguments for ProcessBuilder have same kind of issue. - Commit messages: - 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character Changes: https://git.openjdk.java.net/jdk/pull/8378/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk=8378=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8285517 Stats: 188 lines in 4 files changed: 180 ins; 0 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/8378.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8378/head:pull/8378 PR: https://git.openjdk.java.net/jdk/pull/8378
Integrated: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX
On Tue, 22 Feb 2022 12:17:59 GMT, Ichiroh Takiguchi wrote: > Run jtreg:jdk/java/lang/ProcessBuilder/Basic.java on AIX. > The test was failed by: > Incorrect handling of envstrings containing NULs > > According to my investigation, this issue was happened after following change > was applied. > JDK-8272600: (test) Use native "sleep" in Basic.java > > test.nativepath value was added into AIX's LIBPATH during running this > testcase. > On AIX, test.nativepath value should be removed from LIBPATH value before > comparing the values. This pull request has now been integrated. Changeset: c5c6058f Author:Ichiroh Takiguchi URL: https://git.openjdk.java.net/jdk/commit/c5c6058fd57d4b594012035eaf18a57257f4ad85 Stats: 7 lines in 1 file changed: 5 ins; 0 del; 2 mod 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX Reviewed-by: rriggs - PR: https://git.openjdk.java.net/jdk/pull/7574
Re: RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX [v4]
On Fri, 25 Feb 2022 15:03:19 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Add null check > > My preference is to pass the unmodified LIBPATH in the environment in each of > the three cases. > The expected checks already expect the unmodified LIBPATH so the only change > is in the setup of the environment to be passed to the child. > > Already covered at line 1366: > > Insert at line 1873 in the if/then/else: > > } else if (AIX.is()) { > envp = new String[] {"=ExitValue=3", "=C:=\", > "LIBPATH="+libpath}; > } else { > > (Or assign it to a local as is done for windows, your preference). > > And at line 1921'ish: > ``` >} else if (AIX.is()) { > envp = new String[] {"LC_ALL=C\u\u", // Yuck! > "FO\u=B\uR", "LIBPATH="+libpath}; > > > I looked using `sh -c env` but I think it would be more work to change the > way the output is compared. > The output of `env` is different and has some other values that get inserted. Many thanks, @RogerRiggs . I tried your suggested code, it worked fine on my AIX system. Could you review this change again ? - PR: https://git.openjdk.java.net/jdk/pull/7574
Re: RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX [v5]
> Run jtreg:jdk/java/lang/ProcessBuilder/Basic.java on AIX. > The test was failed by: > Incorrect handling of envstrings containing NULs > > According to my investigation, this issue was happened after following change > was applied. > JDK-8272600: (test) Use native "sleep" in Basic.java > > test.nativepath value was added into AIX's LIBPATH during running this > testcase. > On AIX, test.nativepath value should be removed from LIBPATH value before > comparing the values. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: Pass LIBPATH setting for AIX - Changes: - all: https://git.openjdk.java.net/jdk/pull/7574/files - new: https://git.openjdk.java.net/jdk/pull/7574/files/b6d6afe7..60f8e96e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=7574=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7574=03-04 Stats: 27 lines in 1 file changed: 8 ins; 17 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7574.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7574/head:pull/7574 PR: https://git.openjdk.java.net/jdk/pull/7574
Re: RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX [v4]
On Thu, 24 Feb 2022 18:51:08 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Add null check > > The piece I was missing is that the HotSpot AIX specific code, computes and > sets LIBPATH based on > the path to the java executable. This computed LIBPATH value is set in the > environment unless > it is overridden by an existing LIBPATH in the environment. > > The test cases that failed, do not supply a value for LIBPATH so the launcher > provided value is inserted > and thereafter reported as the environment from the child process. > > Since both of these tests are testing a different condition not related to > LIBPATH, > the presence/accuracy of LIBPATH is not the point of the test. > > The current solution to remove `test.nativepath` works and is ok by me. > > (The alternative used by #7581, to include LIBPATH in the environment when > launching might be slightly more robust to future changes.) > > Thanks for the followup. Hello @RogerRiggs , @backwaterred . Could you give me your suggestion, please ? - PR: https://git.openjdk.java.net/jdk/pull/7574
Re: RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX [v4]
On Thu, 24 Feb 2022 18:51:08 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Add null check > > The piece I was missing is that the HotSpot AIX specific code, computes and > sets LIBPATH based on > the path to the java executable. This computed LIBPATH value is set in the > environment unless > it is overridden by an existing LIBPATH in the environment. > > The test cases that failed, do not supply a value for LIBPATH so the launcher > provided value is inserted > and thereafter reported as the environment from the child process. > > Since both of these tests are testing a different condition not related to > LIBPATH, > the presence/accuracy of LIBPATH is not the point of the test. > > The current solution to remove `test.nativepath` works and is ok by me. > > (The alternative used by #7581, to include LIBPATH in the environment when > launching might be slightly more robust to future changes.) > > Thanks for the followup. Thanks @RogerRiggs and @tstuefe . I appreciate your deeper investigations. Could you give me some advice on what to do next? - PR: https://git.openjdk.java.net/jdk/pull/7574
Re: RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX [v4]
On Thu, 24 Feb 2022 17:01:13 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Add null check > > Javac is compiling the source to a .class file. The contents of the > `java.library.path` do not affect the class file generated. > None of the code of the class is executed during compilation. Thanks @RogerRiggs I could understand why this issue was happened. You said: > As far as I can tell, the test does not modify the environment and originally > passes LIBPATH without modification. I'd like to confirm one thing. My patch did not pass `LIBPATH` environment variable to child process. You mean the testcase should pass parent `LIBPATH` to child process ? If so, it's better to use #7581 . (Please use bugid [8282219](https://bugs.openjdk.java.net/browse/JDK-8282219)) - PR: https://git.openjdk.java.net/jdk/pull/7574
Re: RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX [v4]
On Wed, 23 Feb 2022 18:49:22 GMT, Ichiroh Takiguchi wrote: >> Run jtreg:jdk/java/lang/ProcessBuilder/Basic.java on AIX. >> The test was failed by: >> Incorrect handling of envstrings containing NULs >> >> According to my investigation, this issue was happened after following >> change was applied. >> JDK-8272600: (test) Use native "sleep" in Basic.java >> >> test.nativepath value was added into AIX's LIBPATH during running this >> testcase. >> On AIX, test.nativepath value should be removed from LIBPATH value before >> comparing the values. > > Ichiroh Takiguchi has updated the pull request incrementally with one > additional commit since the last revision: > > Add null check In my investigation, this issue happened after JDK-8272600 was applied. [JDK-8272600](https://bugs.openjdk.java.net/browse/JDK-8272600) (test) Use native "sleep" in Basic.java It added `/native` on `@run` tag. - @run main/othervm/timeout=300 -Djava.security.manager=allow Basic - @run main/othervm/timeout=300 -Djava.security.manager=allow -Djdk.lang.Process.launchMechanism=fork Basic + @run main/othervm/native/timeout=300 -Djava.security.manager=allow Basic + @run main/othervm/native/timeout=300 -Djava.security.manager=allow -Djdk.lang.Process.launchMechanism=fork Basic For this change, `test.nativepath` setting added into `LIBPATH` environment variable on parent (main) process. Older AIX might require parent's `LIBPATH` setting against child process on some of cases. But it seems the current AIX does not require `LIBPATH` setting. #7581 applied parent `LIBPATH` setting against child process. Please choose the appropriate one. - PR: https://git.openjdk.java.net/jdk/pull/7574
Re: RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX [v4]
On Wed, 23 Feb 2022 19:59:46 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Add null check > > The changes seem ok, modifying both the LIBPATH passed in the environment and > the expected return value. > > Since both of these tests are testing a different condition, the > presence/accuracy of LIBPATH is not the point of the test. @RogerRiggs I'm using AIX 7.1. AIX 5.3 and 6.1 were already in EOL. OK, another way, `LIBPATH=...` can be removed from `result`. Also we can use #7581 . - PR: https://git.openjdk.java.net/jdk/pull/7574
Re: RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX [v2]
On Wed, 23 Feb 2022 15:44:07 GMT, Tyler Steele wrote: >> @RogerRiggs >> Many thanks. that's good point. >> Only 1st part has `test.nativepath` because of following code. >> >> if (AIX.is()) { >> pb.environment().put("LIBPATH", libpath); >> } >> >> On current condition, parent (main) process have `test.nativepath` setting >> into LIBPATH environment, but child process does not require >> `test.nativepath` setting. >> So just `libpath` value should not have `test.nativepath` related entry. >> And above code does not require on current condition and this test said >> "Pass Empty environment to child". >> So it should be removed. >> >> #7581 is exactly same issue. >> Please choose the appropriate one. > > Hi @takiguc , thanks for your changes. I closed my PR in favour of yours; we > definitely don't need two PRs for this issue :-) > > One comment on the approach you took. I considered modifying the static > libpath variable as well, but what really swayed me away from choosing that > route is the Windows tests. On Windows, the situation is analogous to AIX in > that a static systemroot variable is set by querying the parent environment, > but it is explicitly passed to the child process[es] when they are created. > This ensures that the systemroot returned by the child process is the same as > the expected value. Admittedly it's a bit of a nit-pick, but I think having > the Windows version of the test do one thing and AIX version do another make > it harder to understand what is going on. It's for that reason that I took > the approach that I did in my PR. @backwaterred I applied change. Please check this one ? - PR: https://git.openjdk.java.net/jdk/pull/7574
Re: RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX [v4]
> Run jtreg:jdk/java/lang/ProcessBuilder/Basic.java on AIX. > The test was failed by: > Incorrect handling of envstrings containing NULs > > According to my investigation, this issue was happened after following change > was applied. > JDK-8272600: (test) Use native "sleep" in Basic.java > > test.nativepath value was added into AIX's LIBPATH during running this > testcase. > On AIX, test.nativepath value should be removed from LIBPATH value before > comparing the values. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: Add null check - Changes: - all: https://git.openjdk.java.net/jdk/pull/7574/files - new: https://git.openjdk.java.net/jdk/pull/7574/files/eff26e8f..b6d6afe7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=7574=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7574=02-03 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7574.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7574/head:pull/7574 PR: https://git.openjdk.java.net/jdk/pull/7574
Re: RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX [v2]
On Tue, 22 Feb 2022 18:10:28 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Use expectedLibpath > > For my curiosity, how is AIX different from other Linux in that the > test.nativepath is not/should not be in LIBPATH? @RogerRiggs Many thanks. that's good point. Only 1st part has `test.nativepath` because of following code. if (AIX.is()) { pb.environment().put("LIBPATH", libpath); } On current condition, parent (main) process have `test.nativepath` setting into LIBPATH environment, but child process does not require `test.nativepath` setting. So just `libpath` value should not have `test.nativepath` related entry. And above code does not require on current condition and this test said "Pass Empty environment to child". So it should be removed. #7581 is exactly same issue. Please choose the appropriate one. - PR: https://git.openjdk.java.net/jdk/pull/7574
Re: RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX [v3]
> Run jtreg:jdk/java/lang/ProcessBuilder/Basic.java on AIX. > The test was failed by: > Incorrect handling of envstrings containing NULs > > According to my investigation, this issue was happened after following change > was applied. > JDK-8272600: (test) Use native "sleep" in Basic.java > > test.nativepath value was added into AIX's LIBPATH during running this > testcase. > On AIX, test.nativepath value should be removed from LIBPATH value before > comparing the values. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: Change LIBPATH usage - Changes: - all: https://git.openjdk.java.net/jdk/pull/7574/files - new: https://git.openjdk.java.net/jdk/pull/7574/files/8b88686d..eff26e8f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=7574=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7574=01-02 Stats: 11 lines in 1 file changed: 1 ins; 5 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/7574.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7574/head:pull/7574 PR: https://git.openjdk.java.net/jdk/pull/7574
Re: RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX [v2]
On Tue, 22 Feb 2022 17:20:25 GMT, Ichiroh Takiguchi wrote: >> Run jtreg:jdk/java/lang/ProcessBuilder/Basic.java on AIX. >> The test was failed by: >> Incorrect handling of envstrings containing NULs >> >> According to my investigation, this issue was happened after following >> change was applied. >> JDK-8272600: (test) Use native "sleep" in Basic.java >> >> test.nativepath value was added into AIX's LIBPATH during running this >> testcase. >> On AIX, test.nativepath value should be removed from LIBPATH value before >> comparing the values. > > Ichiroh Takiguchi has updated the pull request incrementally with one > additional commit since the last revision: > > Use expectedLibpath @RogerRiggs I appreciate your good suggestion. Since `libpath` is just used on AIX and it's `static final String`. I created `expectedLibpath` for expected result. About `removeAll` method, it requires regular expression string. I think if directory name has `.`, unexpected situation may be happened. - PR: https://git.openjdk.java.net/jdk/pull/7574
Re: RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX [v2]
> Run jtreg:jdk/java/lang/ProcessBuilder/Basic.java on AIX. > The test was failed by: > Incorrect handling of envstrings containing NULs > > According to my investigation, this issue was happened after following change > was applied. > JDK-8272600: (test) Use native "sleep" in Basic.java > > test.nativepath value was added into AIX's LIBPATH during running this > testcase. > On AIX, test.nativepath value should be removed from LIBPATH value before > comparing the values. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: Use expectedLibpath - Changes: - all: https://git.openjdk.java.net/jdk/pull/7574/files - new: https://git.openjdk.java.net/jdk/pull/7574/files/9d1faeb7..8b88686d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=7574=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7574=00-01 Stats: 31 lines in 1 file changed: 16 ins; 14 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7574.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7574/head:pull/7574 PR: https://git.openjdk.java.net/jdk/pull/7574
RFR: 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX
Run jtreg:jdk/java/lang/ProcessBuilder/Basic.java on AIX. The test was failed by: Incorrect handling of envstrings containing NULs According to my investigation, this issue was happened after following change was applied. JDK-8272600: (test) Use native "sleep" in Basic.java test.nativepath value was added into AIX's LIBPATH during running this testcase. On AIX, test.nativepath value should be removed from LIBPATH value before comparing the values. - Commit messages: - 8282219: jdk/java/lang/ProcessBuilder/Basic.java fails on AIX Changes: https://git.openjdk.java.net/jdk/pull/7574/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk=7574=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8282219 Stats: 18 lines in 1 file changed: 14 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7574.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7574/head:pull/7574 PR: https://git.openjdk.java.net/jdk/pull/7574
Withdrawn: 8274544: Langtools command's usage were garbled on Japanese Windows
On Thu, 30 Sep 2021 09:36:34 GMT, Ichiroh Takiguchi wrote: > JEP-400 (UTF-8 by Default) was eabled on JDK18-b13. > After JDK18-b13, javac and some other langtool command's usage were garbled > on Japanese Windows. > These commands use PrintWriter instead of standard out/err with PrintStream. This pull request has been closed without being integrated. - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v5]
On Fri, 19 Nov 2021 16:48:03 GMT, Naoto Sato wrote: >> Hello @naotoj . >> For PrintStream.getCharset(), following changes may be required. >> >> +++ src/java.base/share/classes/java/io/OutputStreamWriter.java >> +Charset getCharset() { >> +return se.getCharset(); >> +} >> >> +++ src/java.base/share/classes/java/io/PrintStream.java >> +public Charset getCharset() { >> +return charOut.getCharset(); >> +} >> >> +++ src/java.base/share/classes/sun/nio/cs/StreamEncoder.java >> +public Charset getCharset() { >> +return cs; >> +} >> >> For javac code, we may not use PrintStream.getCharset() directly because >> javac code is compiled by boot compiler. >> We need to use reflection, like: >> >> +++ src/jdk.compiler/share/classes/com/sun/tools/javac/util/Log.java >> +private static Charset getCharset(PrintStream ps) { >> +try { >> +Method getCharset = >> PrintStream.class.getDeclaredMethod("getCharset"); >> +return (Charset)getCharset.invoke(ps); >> +} catch (Exception e) { >> +return Charset.defaultCharset(); >> +} >> +} >> >> If we add following constructors against PrintWriter, we just change javap >> and jshell code. >> But I cannot evaluate this code changes. >> >> +++ src/java.base/share/classes/java/io/PrintWriter.java >> +public PrintWriter(PrintStream ps) { >> + this((OutputStream)ps, false, ps.getCharset()); >> +} >> +public PrintWriter(PrintStream ps, boolean autoFlush) { >> + this((OutputStream)ps, autoFlush, ps.getCharset()); >> +} >> >> I really appreciate if you handle this kind of code change via JEP-400. > > I think this PR can now safely be withdrawn, as > https://github.com/openjdk/jdk/pull/6401 is now integrated. @takiguc, if you > do not mind, I will create a PR for the remaining jshell issue. Please let me > know. Thanks @naotoj . I opened new pr via 8274784. I'd like to close this pr since main issue was fixed by #6401. - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8276970: Default charset for PrintWriter that wraps PrintStream [v3]
On Thu, 18 Nov 2021 00:11:49 GMT, Naoto Sato wrote: > > BTW, I still observe on Windows (system locale=ja-JP): > > ``` > > D:\projects\jdk\git\jdk>.\build\windows-x64\jdk\bin\jshell > > -J-Duser.language=ja > > | JShellへようこそ -- バージョン18-internal > > | 概要については、次を入力してください: /help intro > > > > jshell> System.out.println("\u3042") > > 縺・ > > ``` > > This needs to be separately addressed in > > https://bugs.openjdk.java.net/browse/JDK-8274784 > > The following diff seems to fix the garbled char issue above: > > ``` > $ git diff > diff --git > a/src/jdk.jshell/share/classes/jdk/jshell/execution/RemoteExecutionControl.java > > b/src/jdk.jshell/share/classes/jdk/jshell/execution/RemoteExecutionControl.java > index 810e80acf47..be0b9dcb0c3 100644 > --- > a/src/jdk.jshell/share/classes/jdk/jshell/execution/RemoteExecutionControl.java > +++ > b/src/jdk.jshell/share/classes/jdk/jshell/execution/RemoteExecutionControl.java > @@ -30,6 +30,7 @@ import java.io.PrintStream; > import java.lang.reflect.Method; > import java.net.Socket; > > +import java.nio.charset.Charset; > import java.util.HashMap; > import java.util.Map; > import java.util.function.Consumer; > @@ -63,8 +64,8 @@ public class RemoteExecutionControl extends > DirectExecutionControl implements Ex > InputStream inStream = socket.getInputStream(); > OutputStream outStream = socket.getOutputStream(); > Map> outputs = new HashMap<>(); > -outputs.put("out", st -> System.setOut(new PrintStream(st, true))); > -outputs.put("err", st -> System.setErr(new PrintStream(st, true))); > +outputs.put("out", st -> System.setOut(new PrintStream(st, true, > Charset.forName(System.getProperty("native.encoding"); > +outputs.put("err", st -> System.setErr(new PrintStream(st, true, > Charset.forName(System.getProperty("native.encoding"); > Map> input = new HashMap<>(); > input.put("in", System::setIn); > forwardExecutionControlAndIO(new RemoteExecutionControl(), inStream, > outStream, outputs, input); > ``` Many thanks, @naotoj . It's good idea !. By this code change, we just touch RemoteExecutionControl.java and encoding issue on jshell may be OK. I'd like to discuss it on #5771 . - PR: https://git.openjdk.java.net/jdk/pull/6401
Re: RFR: 8276970: Default charset for PrintWriter that wraps PrintStream
On Mon, 15 Nov 2021 22:43:37 GMT, Naoto Sato wrote: > Fixing the default charset for PrintWriter/OutputStreamWriter that wraps a > PrintStream to its charset. This issue was raised during the conversations in > https://github.com/openjdk/jdk/pull/5771 > A corresponding CSR has also been drafted: > https://bugs.openjdk.java.net/browse/JDK-8277078 I tested some of java tool commands on #5771 . > jar.exe, javac.exe, javadoc.exe, javap.exe, jdeps.exe, jlink.exe, jmod.exe, > jpackage.exe It worked fine as expected on CentOS7 (ja_JP.eucjp locale) and Windows 10 Pro for Japanese. - PR: https://git.openjdk.java.net/jdk/pull/6401
Re: RFR: 8275007: Java fails to start with null charset if LC_ALL is set to certain locales [v3]
On Mon, 15 Nov 2021 17:28:44 GMT, Naoto Sato wrote: >> @naotoj , sorry for bothering you again. >> Still I got exception. It seems value "jnuEncoding" should be overwritten or >> set "fastEncoding" to "FAST_UTF_8" on jnu_util.c. >> Could you check this part again ? >> >> $ env LC_ALL=kk_KZ.pt154 >> ./build/linux-x86_64-server-release/images/jdk/bin/java Hello.java >> WARNING: The encoding of the underlying platform's file system is not >> supported: PT154 >> Error: A JNI error has occurred, please check your installation and try again >> Exception in thread "main" java.util.ServiceConfigurationError: >> java.nio.charset.spi.CharsetProvider: Provider >> sun.nio.cs.ext.ExtendedCharsets not found >> at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:593) >> at >> java.base/java.util.ServiceLoader.loadProvider(ServiceLoader.java:875) >> at >> java.base/java.util.ServiceLoader$ModuleServicesLookupIterator.hasNext(ServiceLoader.java:1084) >> at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1309) >> at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1393) >> at >> java.base/java.nio.charset.Charset$ExtendedProviderHolder$1.run(Charset.java:428) >> at >> java.base/java.nio.charset.Charset$ExtendedProviderHolder$1.run(Charset.java:422) >> at >> java.base/java.security.AccessController.doPrivileged(AccessController.java:318) >> at >> java.base/java.nio.charset.Charset$ExtendedProviderHolder.extendedProviders(Charset.java:422) >> at >> java.base/java.nio.charset.Charset$ExtendedProviderHolder.(Charset.java:418) >> at >> java.base/java.nio.charset.Charset.lookupExtendedCharset(Charset.java:442) >> at java.base/java.nio.charset.Charset.lookup2(Charset.java:472) >> at java.base/java.nio.charset.Charset.lookup(Charset.java:460) >> at java.base/java.nio.charset.Charset.isSupported(Charset.java:501) >> at java.base/jdk.internal.loader.NativeLibraries.findBuiltinLib(Native >> Method) >> at >> java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:157) >> at >> java.base/jdk.internal.loader.NativeLibraries.findFromPaths(NativeLibraries.java:322) >> at >> java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:289) >> at >> java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:274) >> at >> java.base/jdk.internal.loader.BootLoader.loadLibrary(BootLoader.java:149) >> at >> java.base/sun.nio.fs.UnixNativeDispatcher.(UnixNativeDispatcher.java:667) >> at java.base/sun.nio.fs.UnixFileSystem.(UnixFileSystem.java:65) >> at java.base/sun.nio.fs.LinuxFileSystem.(LinuxFileSystem.java:39) >> at >> java.base/sun.nio.fs.LinuxFileSystemProvider.newFileSystem(LinuxFileSystemProvider.java:46) >> at >> java.base/sun.nio.fs.LinuxFileSystemProvider.newFileSystem(LinuxFileSystemProvider.java:39) >> at >> java.base/sun.nio.fs.UnixFileSystemProvider.(UnixFileSystemProvider.java:55) >> at >> java.base/sun.nio.fs.LinuxFileSystemProvider.(LinuxFileSystemProvider.java:41) >> at >> java.base/sun.nio.fs.DefaultFileSystemProvider.(DefaultFileSystemProvider.java:35) >> at >> java.base/java.nio.file.FileSystems$DefaultFileSystemHolder.getDefaultProvider(FileSystems.java:114) >> at >> java.base/java.nio.file.FileSystems$DefaultFileSystemHolder$1.run(FileSystems.java:103) >> at >> java.base/java.nio.file.FileSystems$DefaultFileSystemHolder$1.run(FileSystems.java:101) >> at >> java.base/java.security.AccessController.doPrivileged(AccessController.java:318) >> at >> java.base/java.nio.file.FileSystems$DefaultFileSystemHolder.defaultFileSystem(FileSystems.java:101) >> at >> java.base/java.nio.file.FileSystems$DefaultFileSystemHolder.(FileSystems.java:94) >> at java.base/java.nio.file.FileSystems.getDefault(FileSystems.java:183) >> at java.base/java.nio.file.Path.of(Path.java:147) >> at java.base/java.nio.file.Paths.get(Paths.java:69) >> at >> java.base/jdk.internal.jimage.ImageReaderFactory.(ImageReaderFactory.java:51) >> at >> java.base/jdk.internal.module.SystemModuleFinders$SystemImage.(SystemModuleFinders.java:385) >> at >> java.base/jdk.internal.module.SystemModuleFinders$SystemModuleReader.findImageLocation(SystemModuleFinders.java:429) >> at >> java.base/jdk.internal.module.SystemModuleFinders$SystemModuleReader.read(SystemModuleFinders.java:483) >> at >> java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:809) >> at >> java.base/jdk.internal.loader.BuiltinClassLoader.findClassInModuleOrNull(BuiltinClassLoader.java:741) >> at >> java.base/jdk.internal.loader.BuiltinClassLoader.findClass(BuiltinClassLoader.java:621) >> at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:633) >> at java.base/java.lang.Class.forName(Class.java:577) >>
Re: RFR: 8275007: Java fails to start with null charset if LC_ALL is set to certain locales [v3]
On Fri, 12 Nov 2021 19:11:16 GMT, Naoto Sato wrote: >> Naoto Sato has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Force the jnu encoding to UTF-8 if the original one is not supported > > I reproduced those failures on Debian Linux. Corrected the issue by replacing > unsupported jnu encoding with `UTF-8` in `System::initPhase1`. @naotoj , sorry for bothering you again. Still I got exception. It seems value "jnuEncoding" should be overwritten or set "fastEncoding" to "FAST_UTF_8" on jnu_util.c. Could you check this part again ? $ env LC_ALL=kk_KZ.pt154 ./build/linux-x86_64-server-release/images/jdk/bin/java Hello.java WARNING: The encoding of the underlying platform's file system is not supported: PT154 Error: A JNI error has occurred, please check your installation and try again Exception in thread "main" java.util.ServiceConfigurationError: java.nio.charset.spi.CharsetProvider: Provider sun.nio.cs.ext.ExtendedCharsets not found at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:593) at java.base/java.util.ServiceLoader.loadProvider(ServiceLoader.java:875) at java.base/java.util.ServiceLoader$ModuleServicesLookupIterator.hasNext(ServiceLoader.java:1084) at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1309) at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1393) at java.base/java.nio.charset.Charset$ExtendedProviderHolder$1.run(Charset.java:428) at java.base/java.nio.charset.Charset$ExtendedProviderHolder$1.run(Charset.java:422) at java.base/java.security.AccessController.doPrivileged(AccessController.java:318) at java.base/java.nio.charset.Charset$ExtendedProviderHolder.extendedProviders(Charset.java:422) at java.base/java.nio.charset.Charset$ExtendedProviderHolder.(Charset.java:418) at java.base/java.nio.charset.Charset.lookupExtendedCharset(Charset.java:442) at java.base/java.nio.charset.Charset.lookup2(Charset.java:472) at java.base/java.nio.charset.Charset.lookup(Charset.java:460) at java.base/java.nio.charset.Charset.isSupported(Charset.java:501) at java.base/jdk.internal.loader.NativeLibraries.findBuiltinLib(Native Method) at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:157) at java.base/jdk.internal.loader.NativeLibraries.findFromPaths(NativeLibraries.java:322) at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:289) at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:274) at java.base/jdk.internal.loader.BootLoader.loadLibrary(BootLoader.java:149) at java.base/sun.nio.fs.UnixNativeDispatcher.(UnixNativeDispatcher.java:667) at java.base/sun.nio.fs.UnixFileSystem.(UnixFileSystem.java:65) at java.base/sun.nio.fs.LinuxFileSystem.(LinuxFileSystem.java:39) at java.base/sun.nio.fs.LinuxFileSystemProvider.newFileSystem(LinuxFileSystemProvider.java:46) at java.base/sun.nio.fs.LinuxFileSystemProvider.newFileSystem(LinuxFileSystemProvider.java:39) at java.base/sun.nio.fs.UnixFileSystemProvider.(UnixFileSystemProvider.java:55) at java.base/sun.nio.fs.LinuxFileSystemProvider.(LinuxFileSystemProvider.java:41) at java.base/sun.nio.fs.DefaultFileSystemProvider.(DefaultFileSystemProvider.java:35) at java.base/java.nio.file.FileSystems$DefaultFileSystemHolder.getDefaultProvider(FileSystems.java:114) at java.base/java.nio.file.FileSystems$DefaultFileSystemHolder$1.run(FileSystems.java:103) at java.base/java.nio.file.FileSystems$DefaultFileSystemHolder$1.run(FileSystems.java:101) at java.base/java.security.AccessController.doPrivileged(AccessController.java:318) at java.base/java.nio.file.FileSystems$DefaultFileSystemHolder.defaultFileSystem(FileSystems.java:101) at java.base/java.nio.file.FileSystems$DefaultFileSystemHolder.(FileSystems.java:94) at java.base/java.nio.file.FileSystems.getDefault(FileSystems.java:183) at java.base/java.nio.file.Path.of(Path.java:147) at java.base/java.nio.file.Paths.get(Paths.java:69) at java.base/jdk.internal.jimage.ImageReaderFactory.(ImageReaderFactory.java:51) at java.base/jdk.internal.module.SystemModuleFinders$SystemImage.(SystemModuleFinders.java:385) at java.base/jdk.internal.module.SystemModuleFinders$SystemModuleReader.findImageLocation(SystemModuleFinders.java:429) at java.base/jdk.internal.module.SystemModuleFinders$SystemModuleReader.read(SystemModuleFinders.java:483) at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:809) at java.base/jdk.internal.loader.BuiltinClassLoader.findClassInModuleOrNull(BuiltinClassLoader.java:741) at
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v5]
On Wed, 10 Nov 2021 21:19:30 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request with a new target base due to >> a merge or a rebase. The pull request now contains five commits: >> >> - 8274544: Langtools command's usage were garbled on Japanese Windows >> - 8274544: Langtools command's usage were garbled on Japanese Windows >> - 8274544: Langtools command's usage were garbled on Japanese Windows >> - 8274544: Langtools command's usage were garbled on Japanese Windows >> - Langtools command's usage were grabled on Japanese Windows > > Good suggestions. Filed a JBS issue: > https://bugs.openjdk.java.net/browse/JDK-8276970 Hello @naotoj . For PrintStream.getCharset(), following changes may be required. +++ src/java.base/share/classes/java/io/OutputStreamWriter.java +Charset getCharset() { +return se.getCharset(); +} +++ src/java.base/share/classes/java/io/PrintStream.java +public Charset getCharset() { +return charOut.getCharset(); +} +++ src/java.base/share/classes/sun/nio/cs/StreamEncoder.java +public Charset getCharset() { +return cs; +} For javac code, we may not use PrintStream.getCharset() directly because javac code is compiled by boot compiler. We need to use reflection, like: +++ src/jdk.compiler/share/classes/com/sun/tools/javac/util/Log.java +private static Charset getCharset(PrintStream ps) { +try { +Method getCharset = PrintStream.class.getDeclaredMethod("getCharset"); +return (Charset)getCharset.invoke(ps); +} catch (Exception e) { +return Charset.defaultCharset(); +} +} If we add following constructors against PrintWriter, we just change javap and jshell code. But I cannot evaluate this code changes. +++ src/java.base/share/classes/java/io/PrintWriter.java +public PrintWriter(PrintStream ps) { + this((OutputStream)ps, false, ps.getCharset()); +} +public PrintWriter(PrintStream ps, boolean autoFlush) { + this((OutputStream)ps, autoFlush, ps.getCharset()); +} I really appreciate if you handle this kind of code change via JEP-400. - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8275007: Java fails to start with null charset if LC_ALL is set to certain locales [v2]
On Tue, 9 Nov 2021 19:38:01 GMT, Naoto Sato wrote: >> Please review the subject fix. In light of JEP400, Java runtime can/should >> start in UTF-8 charset if the underlying native encoding is not supported. > > Naoto Sato has updated the pull request incrementally with one additional > commit since the last revision: > > Emit a warning on unsupported jnu encoding I applied same kind of patch into jdk17u. The situation was same. $ env LC_ALL=kk_KZ.pt154 ./build/linux-x86_64-server-release/images/jdk/bin/java -showversion Hello.java WARNING: The encoding of the underlying platform's file system is not supported by the JVM: PT154 openjdk version "17.0.1-internal" 2021-10-19 OpenJDK Runtime Environment (build 17.0.1-internal+0-adhoc.jdktest.jdk17u) OpenJDK 64-Bit Server VM (build 17.0.1-internal+0-adhoc.jdktest.jdk17u, mixed mode, sharing) Error: A JNI error has occurred, please check your installation and try again Exception in thread "main" java.util.ServiceConfigurationError: java.nio.charset.spi.CharsetProvider: Unable to load sun.nio.cs.ext.ExtendedCharsets at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:586) ... - PR: https://git.openjdk.java.net/jdk/pull/6282
Re: RFR: 8275007: Java fails to start with null charset if LC_ALL is set to certain locales [v2]
On Tue, 9 Nov 2021 19:38:01 GMT, Naoto Sato wrote: >> Please review the subject fix. In light of JEP400, Java runtime can/should >> start in UTF-8 charset if the underlying native encoding is not supported. > > Naoto Sato has updated the pull request incrementally with one additional > commit since the last revision: > > Emit a warning on unsupported jnu encoding I could see following output on my CentOS7. $ env LC_ALL=kk_KZ.pt154 ~/jdk-17.0.1/bin/java -showversion Hello.java Error occurred during initialization of VM java.lang.NullPointerException at java.lang.System.getProperty(java.base/System.java:918) at sun.security.action.GetPropertyAction.privilegedGetProperty(java.base/GetPropertyAction.java:106) at java.nio.charset.Charset.defaultCharset(java.base/Charset.java:607) at java.lang.String.(java.base/String.java:1412) at java.lang.String.(java.base/String.java:1432) at jdk.internal.util.SystemProps$Raw.platformProperties(java.base/Native Method) at jdk.internal.util.SystemProps$Raw.(java.base/SystemProps.java:234) at jdk.internal.util.SystemProps.initProperties(java.base/SystemProps.java:54) at java.lang.System.initPhase1(java.base/System.java:2089) $ env LC_ALL=kk_KZ.pt154 ~/jdk-18-b22/bin/java -showversion Hello.java Error occurred during initialization of VM java.lang.NullPointerException at java.lang.System.getProperty(java.base/System.java:929) at sun.security.action.GetPropertyAction.privilegedGetProperty(java.base/GetPropertyAction.java:106) at java.nio.charset.Charset.defaultCharset(java.base/Charset.java:646) at java.lang.String.(java.base/String.java:1411) at java.lang.String.(java.base/String.java:1431) at jdk.internal.util.SystemProps$Raw.platformProperties(java.base/Native Method) at jdk.internal.util.SystemProps$Raw.(java.base/SystemProps.java:240) at jdk.internal.util.SystemProps.initProperties(java.base/SystemProps.java:54) at java.lang.System.initPhase1(java.base/System.java:2100) - PR: https://git.openjdk.java.net/jdk/pull/6282
Re: RFR: 8275007: Java fails to start with null charset if LC_ALL is set to certain locales [v2]
On Tue, 9 Nov 2021 19:38:01 GMT, Naoto Sato wrote: >> Please review the subject fix. In light of JEP400, Java runtime can/should >> start in UTF-8 charset if the underlying native encoding is not supported. > > Naoto Sato has updated the pull request incrementally with one additional > commit since the last revision: > > Emit a warning on unsupported jnu encoding I don't have Oracle Linux... I tried to find out unsupported locales by using following commands. I used RHEL7.9. $ cat CharsetIsSupported.java import java.nio.charset.Charset; public class CharsetIsSupported { public static void main(String[] args) throws Exception { for(String s : args) { if (!Charset.isSupported(s)) System.out.println(s); } } } $ locale -a | while read LOC; do env LANG=$LOC locale charmap; done | sort -u | xargs ~/jdk-17.0.1/bin/java CharsetIsSupported.java > encoding.lst $ locale -a | while read LOC; do echo -e "$LOC\t\c"; env LANG=$LOC locale charmap; done | grep -f encoding.lst cy_GB ISO-8859-14 cy_GB.iso885914 ISO-8859-14 hy_AM.armscii8 ARMSCII-8 ka_GE GEORGIAN-PS ka_GE.georgianpsGEORGIAN-PS kk_KZ PT154 kk_KZ.pt154 PT154 lg_UG ISO-8859-10 lg_UG.iso885910 ISO-8859-10 tg_TJ KOI8-T tg_TJ.koi8t KOI8-T I tried following command, "sun.jnu.encoding" system property was referred following source code. We may need to care about these files. $ find src -type f | xargs grep '"sun.jnu.encoding"' src/java.base/share/classes/java/lang/System.java:var jnuEncoding = props.getProperty("sun.jnu.encoding"); src/java.base/share/classes/java/lang/System.java: props.getProperty("sun.jnu.encoding")); src/java.base/share/classes/jdk/internal/util/SystemProps.java: put(props, "sun.jnu.encoding", raw.propDefault(Raw._sun_jnu_encoding_NDX)); src/java.base/share/classes/sun/launcher/LauncherHelper.java:private static final String encprop = "sun.jnu.encoding"; src/java.base/share/classes/sun/nio/fs/Util.java: GetPropertyAction.privilegedGetProperty("sun.jnu.encoding"), src/java.base/share/native/libjava/jni_util.h:NO_ENCODING_YET = 0, /* "sun.jnu.encoding" not yet set */ src/java.desktop/share/classes/sun/font/SunFontManager.java: String sysEncoding = System.getProperty("sun.jnu.encoding"); - PR: https://git.openjdk.java.net/jdk/pull/6282
Re: RFR: 8275007: Java fails to start with null charset if LC_ALL is set to certain locales
On Tue, 9 Nov 2021 19:38:46 GMT, Naoto Sato wrote: >>> If this issue is not a part of JEP-400, I think UTF_8 may not be only >>> solution for sun/nio/fs/Util.java. >>> Please explain why UTF_8 is better than ISO_8859_1. >> >> Just to add to Naoto's comment. When the JDK starts in unusual/unsupported >> configurations then it needs a default for the stdout/stderr print streams, >> for encoding/decoding files paths in java.io, and for encoding/decoding >> files paths in the java.nio.file implementation. It's important that the >> latter two are the same, having one be UTF-8 and the other be 8859-1 won't >> work. >> >> There may be an argument that the JDK should emit a warning at startup. > >> There may be an argument that the JDK should emit a warning at startup. > > Updated to emit a warning, if `sun.jnu.encoding` is not supported. Hello @naotoj . I appreciate your code update. I'd like to confirm one thing. If you could not recreate this issue on your test machine, please ignore this message. I tested your code on my CentOS7 x86_64. I saw following exception. $ git log --oneline | head e91e9d85327 8276721: G1: Refine G1EvacFailureObjectsSet::iterate 8822d41fdcc 8274736: Concurrent read/close of SSLSockets causes SSLSessions to be invalidated unnecessarily c1e41fe38bb 8276842: G1: Only calculate size in bytes from words when needed c8b0ee6b8a0 8276833: G1: Make G1EvacFailureRegions::par_iterate const 0699220830a 8268882: C2: assert(n->outcnt() != 0 || C->top() == n || n->is_Proj()) failed: No dead instructions after post-alloc d7012fbd604 8276880: Remove java/lang/RuntimeTests/exec/ExecWithDir as unnecessary f9024d0606e 8230130: javadoc search result dialog shows cut off headers for long results 055de6f5662 8223358: Incorrect HTML structure in annotation pages a60e91259ba 8198626: java/awt/KeyboardFocusmanager/TypeAhead/TestDialogTypeAhead.html fails on mac dde959dfcef 8276808: java/nio/channels/Channels/TransferTo.java timed out $ cat Hello.java public class Hello { public static void main(String[] args) throws Exception { System.out.println("Hello"); } } $ env LC_ALL=kk_KZ.pt154 locale LANG=ja_JP.UTF-8 LC_CTYPE="kk_KZ.pt154" LC_NUMERIC="kk_KZ.pt154" LC_TIME="kk_KZ.pt154" LC_COLLATE="kk_KZ.pt154" LC_MONETARY="kk_KZ.pt154" LC_MESSAGES="kk_KZ.pt154" LC_PAPER="kk_KZ.pt154" LC_NAME="kk_KZ.pt154" LC_ADDRESS="kk_KZ.pt154" LC_TELEPHONE="kk_KZ.pt154" LC_MEASUREMENT="kk_KZ.pt154" LC_IDENTIFICATION="kk_KZ.pt154" LC_ALL=kk_KZ.pt154 $ env LC_ALL=kk_KZ.pt154 ./build/linux-x86_64-server-release/images/jdk/bin/java Hello.java WARNING: The encoding of the underlying platform's file system is not supported by the JVM: PT154 Error: A JNI error has occurred, please check your installation and try again Exception in thread "main" java.util.ServiceConfigurationError: java.nio.charset.spi.CharsetProvider: Unable to load sun.nio.cs.ext.ExtendedCharsets at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:586) at java.base/java.util.ServiceLoader.loadProvider(ServiceLoader.java:861) at java.base/java.util.ServiceLoader$ModuleServicesLookupIterator.hasNext(ServiceLoader.java:1084) at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1309) at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1393) at java.base/java.nio.charset.Charset$ExtendedProviderHolder$1.run(Charset.java:428) at java.base/java.nio.charset.Charset$ExtendedProviderHolder$1.run(Charset.java:422) at java.base/java.security.AccessController.doPrivileged(AccessController.java:318) at java.base/java.nio.charset.Charset$ExtendedProviderHolder.extendedProviders(Charset.java:422) at java.base/java.nio.charset.Charset$ExtendedProviderHolder.(Charset.java:418) at java.base/java.nio.charset.Charset.lookupExtendedCharset(Charset.java:442) at java.base/java.nio.charset.Charset.lookup2(Charset.java:472) at java.base/java.nio.charset.Charset.lookup(Charset.java:460) at java.base/java.nio.charset.Charset.isSupported(Charset.java:501) at java.base/sun.launcher.LauncherHelper.makePlatformString(LauncherHelper.java:891) Caused by: java.lang.ExceptionInInitializerError at java.base/sun.nio.fs.UnixFileSystem.(UnixFileSystem.java:51) at java.base/sun.nio.fs.LinuxFileSystem.(LinuxFileSystem.java:39) at java.base/sun.nio.fs.LinuxFileSystemProvider.newFileSystem(LinuxFileSystemProvider.java:46) at java.base/sun.nio.fs.LinuxFileSystemProvider.newFileSystem(LinuxFileSystemProvider.java:39) at java.base/sun.nio.fs.UnixFileSystemProvider.(UnixFileSystemProvider.java:55) at java.base/sun.nio.fs.LinuxFileSystemProvider.(LinuxFileSystemProvider.java:41) at java.base/sun.nio.fs.DefaultFileSystemProvider.(DefaultFileSystemProvider.java:35) at
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v3]
On Tue, 19 Oct 2021 01:26:35 GMT, Jonathan Gibbons wrote: >> Ichiroh Takiguchi has refreshed the contents of this pull request, and >> previous commits have been removed. The incremental views will show >> differences compared to the previous content of the PR. > > This is pretty ugly code to be replicating so many times. > > What if the tools have been run in an environment where `System.out` and > `System.err` have already been redirected in some manner, with > `System.setOut` or `System.setErr`? You should not assume that `System.out` > and `System.err` will always refer to the console. Hello @jonathan-gibbons . I tested -Xstdout option for javac command on Linux and Windows platform. Output file encoding was UTF-8. I assume it's expected result. - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8275007: Java fails to start with null charset if LC_ALL is set to certain locales
On Mon, 8 Nov 2021 09:44:34 GMT, Alan Bateman wrote: >> If this issue is not a part of JEP-400, I think UTF_8 may not be only >> solution for sun/nio/fs/Util.java. >> Please explain why UTF_8 is better than ISO_8859_1. > >> If this issue is not a part of JEP-400, I think UTF_8 may not be only >> solution for sun/nio/fs/Util.java. >> Please explain why UTF_8 is better than ISO_8859_1. > > Just to add to Naoto's comment. When the JDK starts in unusual/unsupported > configurations then it needs a default for the stdout/stderr print streams, > for encoding/decoding files paths in java.io, and for encoding/decoding files > paths in the java.nio.file implementation. It's important that the latter two > are the same, having one be UTF-8 and the other be 8859-1 won't work. > > There may be an argument that the JDK should emit a warning at startup. Thanks @AlanBateman . Yeah, I got a message from my co-worker. He said unmappable conversion was still there even if ISO8859_1, and there was no big difference between ISO8859_1 and UTF_8. I'm very sorry for your confusion... - PR: https://git.openjdk.java.net/jdk/pull/6282
Re: RFR: 8275007: Java fails to start with null charset if LC_ALL is set to certain locales
On Fri, 5 Nov 2021 17:31:50 GMT, Naoto Sato wrote: > Please review the subject fix. In light of JEP400, Java runtime can/should > start in UTF-8 charset if the underlying native encoding is not supported. If this issue is not a part of JEP-400, I think UTF_8 may not be only solution for sun/nio/fs/Util.java. Please explain why UTF_8 is better than ISO_8859_1. - PR: https://git.openjdk.java.net/jdk/pull/6282
Re: RFR: 8275007: Java fails to start with null charset if LC_ALL is set to certain locales
On Sat, 6 Nov 2021 16:20:41 GMT, Alan Bateman wrote: >> @naotoj >> I'm not reviewer. >> Could you explain more detail, why JEP-400 needs to touch sun.jnu.encoding >> system property's fallback ? > >> Could you explain more detail, why JEP-400 needs to touch sun.jnu.encoding >> system property's fallback ? > > This is about running on unusual configurations where the native encoding > (native.encoding and sun.jnu.encoding) doesn't name a charset in java.base. > There has been several reports of NPE in such cases. Now that the default > charset is UTF-8 it means there is a sane fallback so the VM can startup. Thanks @AlanBateman . In my understanding, sun.jnu.encoding property may be related file system access. Java may not be access to appropriate file. I mean if the file name has malformed character, it may be changed to "?". In this case, unexpected file access may be happened. 8-bit pass-through may be better... - PR: https://git.openjdk.java.net/jdk/pull/6282
Re: RFR: 8275007: Java fails to start with null charset if LC_ALL is set to certain locales
On Fri, 5 Nov 2021 20:22:03 GMT, Naoto Sato wrote: >> Do you need to add test? > >> Do you need to add test? > > I would if I could, but specifying the environment dependent settings in the > test would be fragile and error-prone, so I did not add a test but marked the > issue as `noreg-hard`. @naotoj I'm not reviewer. Could you explain more detail, why JEP-400 needs to touch sun.jnu.encoding system property's fallback ? - PR: https://git.openjdk.java.net/jdk/pull/6282
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v5]
On Wed, 3 Nov 2021 18:41:19 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request with a new target base due to >> a merge or a rebase. The pull request now contains five commits: >> >> - 8274544: Langtools command's usage were garbled on Japanese Windows >> - 8274544: Langtools command's usage were garbled on Japanese Windows >> - 8274544: Langtools command's usage were garbled on Japanese Windows >> - 8274544: Langtools command's usage were garbled on Japanese Windows >> - Langtools command's usage were grabled on Japanese Windows > > Firstly, please do NOT rebase your fix, as it will clobber the review > history. Use merge instead. > As to the fix, I am not sure consolidating the code into Log.java would be OK > as it would introduce module dependency. @naotoj I'm sorry about my bad operation. @jonathan-gibbons I appreciate if you give me some suggestions. - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v5]
On Mon, 1 Nov 2021 16:10:26 GMT, Ichiroh Takiguchi wrote: >> JEP-400 (UTF-8 by Default) was eabled on JDK18-b13. >> After JDK18-b13, javac and some other langtool command's usage were garbled >> on Japanese Windows. >> These commands use PrintWriter instead of standard out/err with PrintStream. > > Ichiroh Takiguchi has updated the pull request with a new target base due to > a merge or a rebase. The pull request now contains five commits: > > - 8274544: Langtools command's usage were garbled on Japanese Windows > - 8274544: Langtools command's usage were garbled on Japanese Windows > - 8274544: Langtools command's usage were garbled on Japanese Windows > - 8274544: Langtools command's usage were garbled on Japanese Windows > - Langtools command's usage were grabled on Japanese Windows I needed to rebase my fixed code since JavapTask.java and JdepsTask.java were updated. Hello @jonathan-gibbons . Could you check new fixes ? I changed following parts: * Native charset is generated on com/sun/tools/javac/util/Log.java. * module-info.java was updated Hello @naotoj . Could you check new fixes again ? - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v5]
> JEP-400 (UTF-8 by Default) was eabled on JDK18-b13. > After JDK18-b13, javac and some other langtool command's usage were garbled > on Japanese Windows. > These commands use PrintWriter instead of standard out/err with PrintStream. Ichiroh Takiguchi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - 8274544: Langtools command's usage were garbled on Japanese Windows - 8274544: Langtools command's usage were garbled on Japanese Windows - 8274544: Langtools command's usage were garbled on Japanese Windows - 8274544: Langtools command's usage were garbled on Japanese Windows - Langtools command's usage were grabled on Japanese Windows - Changes: https://git.openjdk.java.net/jdk/pull/5771/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk=5771=04 Stats: 104 lines in 17 files changed: 74 ins; 0 del; 30 mod Patch: https://git.openjdk.java.net/jdk/pull/5771.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5771/head:pull/5771 PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v4]
On Mon, 25 Oct 2021 14:20:52 GMT, Ichiroh Takiguchi wrote: >> JEP-400 (UTF-8 by Default) was eabled on JDK18-b13. >> After JDK18-b13, javac and some other langtool command's usage were garbled >> on Japanese Windows. >> These commands use PrintWriter instead of standard out/err with PrintStream. > > Ichiroh Takiguchi has updated the pull request with a new target base due to > a merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains four additional > commits since the last revision: > > - 8274544: Langtools command's usage were garbled on Japanese Windows > - 8274544: Langtools command's usage were garbled on Japanese Windows > - 8274544: Langtools command's usage were garbled on Japanese Windows > - Langtools command's usage were grabled on Japanese Windows Terminal setting $ locale LANG=ja_JP.eucjp LC_CTYPE="ja_JP.eucjp" LC_NUMERIC="ja_JP.eucjp" LC_TIME="ja_JP.eucjp" LC_COLLATE="ja_JP.eucjp" LC_MONETARY="ja_JP.eucjp" LC_MESSAGES="ja_JP.eucjp" LC_PAPER="ja_JP.eucjp" LC_NAME="ja_JP.eucjp" LC_ADDRESS="ja_JP.eucjp" LC_TELEPHONE="ja_JP.eucjp" LC_MEASUREMENT="ja_JP.eucjp" LC_IDENTIFICATION="ja_JP.eucjp" LC_ALL= Java testcase by using Scanner. $ cat scan.java import java.util.*; public class scan { public static void main(String[] args) throws Exception { System.out.println("Please input some characters:"); Scanner scanner = new Scanner(System.in); System.out.println(scanner.next()); } } $ ~/jdk-18-b19/bin/java scan.java Please input some characters: あいうえお ?? When `src/jdk.internal.le/share/classes/jdk/internal/org/jline/terminal/impl/AbstractTerminal.java` is modified $ jshell | JShellへようこそ -- バージョン18-internal | 概要については、次を入力してください: /help intro jshell> import java.nio.charset.* jshell> System.out.println(System.getProperty("native.encoding")) EUC-JP-LINUX jshell> System.out.println(Charset.defaultCharset()) UTF-8 jshell> var scan = new Scanner(System.in) scan ==> java.util.Scanner[delimiters=\p{javaWhitespace}+] ... \E][infinity string=\Q∞\E] jshell> var s = scan.next() あいうえお s ==> "あいうえお" jshell> System.out.println(s) あいうえお jshell> /exit | 終了します $ - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v4]
> JEP-400 (UTF-8 by Default) was eabled on JDK18-b13. > After JDK18-b13, javac and some other langtool command's usage were garbled > on Japanese Windows. > These commands use PrintWriter instead of standard out/err with PrintStream. Ichiroh Takiguchi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - 8274544: Langtools command's usage were garbled on Japanese Windows - 8274544: Langtools command's usage were garbled on Japanese Windows - 8274544: Langtools command's usage were garbled on Japanese Windows - Langtools command's usage were grabled on Japanese Windows - Changes: - all: https://git.openjdk.java.net/jdk/pull/5771/files - new: https://git.openjdk.java.net/jdk/pull/5771/files/4427d87c..e2a87848 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=5771=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk=5771=02-03 Stats: 35926 lines in 1051 files changed: 24201 ins; 7865 del; 3860 mod Patch: https://git.openjdk.java.net/jdk/pull/5771.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5771/head:pull/5771 PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8270490: Charset.forName() taking fallback default value [v3]
On Fri, 22 Oct 2021 17:33:43 GMT, Naoto Sato wrote: >> During the review of JEP 400, a proposal to provide an overloaded method to >> `Charset.forName()` was suggested >> [[1]](https://github.com/openjdk/jdk/pull/4733#discussion_r669693954). This >> PR is to implement the proposal. A CSR is also drafted as >> https://bugs.openjdk.java.net/browse/JDK-8275348 > > Naoto Sato has updated the pull request incrementally with one additional > commit since the last revision: > > Removed `@throws IllegalCharsetNameException` About javadoc, I think following description is not clear. * @param fallback * fallback charset in case the charset object for the named * charset is not available. May be {@code null} - PR: https://git.openjdk.java.net/jdk/pull/6045
Re: RFR: 8270490: Charset.forName() taking fallback default value [v2]
On Thu, 21 Oct 2021 18:00:46 GMT, Naoto Sato wrote: >> I'm not reviewer. >> >> I'd like to write down following code without `try-catch`. >> >> var cs = Charset.forName(charsetName, null); >> if (cs == null) cs = StandardCharsets.UTF_8; >> >> or please find out more easy way. > >> I'd like to write down following code without `try-catch`. > > You don't *have to* try-catch those exceptions if you are not interested, as > they are subclasses of `RuntimeException`. > >> ``` >> var cs = Charset.forName(charsetName, null); >> if (cs == null) cs = StandardCharsets.UTF_8; >> ``` > > This could be simplified to > > > var cs = Charset.forName(charsetName, StandardCharsets.UTF_8); @naotoj Oh sorry, I'd like to detect fallback charset is used, like: var cs = Charset.forName(charsetName, null); if (cs == null) { System.err.println("Used UTF-8 encoding instead of "+charsetName+"); cs = StandardCharsets.UTF_8; } - PR: https://git.openjdk.java.net/jdk/pull/6045
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v3]
On Tue, 19 Oct 2021 01:26:35 GMT, Jonathan Gibbons wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8274544: Langtools command's usage were garbled on Japanese Windows > > This is pretty ugly code to be replicating so many times. > > What if the tools have been run in an environment where `System.out` and > `System.err` have already been redirected in some manner, with > `System.setOut` or `System.setErr`? You should not assume that `System.out` > and `System.err` will always refer to the console. @jonathan-gibbons I appreciate your comment. I'd like to confirm something. > This is pretty ugly code to be replicating so many times. > What if the tools have been run in an environment where `System.out` and > `System.err` have already been redirected in some manner, with > `System.setOut` or `System.setErr`? You should not assume that `System.out` > and `System.err` will always refer to the console. I was confused since the fixed code did not call System.out/System.err directly. I tried following code on Japanese Windows. import java.io.*; import java.nio.charset.*; public class OutputCheck { public static void main(String[] args) throws Exception { String s = "\u3042"; System.out.println("[1]:"+s); PrintStream ps = System.out; System.setOut(new PrintStream(System.out)); System.out.println("[2]:"+s); ps.println("[3]:"+s); System.setOut(new PrintStream(System.out, true, Charset.forName(System.getProperty("native.encoding"; System.out.println("[4]:"+s); } } Output is: > jdk-18-b14\bin\java OutputCheck.java [1]:あ [2]:縺・ [3]:あ [4]:あ [2] refers default charset (UTF-8) [3] is same as [1] [4] specifies native.encoding system encoding Could you explain more detail ? - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8270490: Charset.forName() taking fallback default value [v2]
On Wed, 20 Oct 2021 19:02:30 GMT, Naoto Sato wrote: >> During the review of JEP 400, a proposal to provide an overloaded method to >> `Charset.forName()` was suggested >> [[1]](https://github.com/openjdk/jdk/pull/4733#discussion_r669693954). This >> PR is to implement the proposal. A CSR is also drafted as >> https://bugs.openjdk.java.net/browse/JDK-8275348 > > Naoto Sato has updated the pull request incrementally with one additional > commit since the last revision: > > Moved the null sentence into @param tag. I'm not reviewer. I'd like to write down following code without `try-catch`. var cs = Charset.forName(charsetName, null); if (cs == null) cs = StandardCharsets.UTF_8; or please find out more easy way. - PR: https://git.openjdk.java.net/jdk/pull/6045
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows
On Thu, 14 Oct 2021 23:43:47 GMT, Naoto Sato wrote: >> @takiguc - if JShell is still an issue, is there a chance you could try this: >> https://github.com/lahodaj/jdk/commit/cfa6b3eebbc22c5a48d31cfd692ff98690653686 >> >> Not sure if it will help, but it might (this won't change the default >> charset, but could send the output in a sensible encoding for the console. >> >> Thanks! > > I believe both @lahodaj and my proposed changes are needed, which I provided > here: > https://github.com/openjdk/jdk/commit/82a9033953655042480c90d388fcbc8053fc5ff5 > Can you check this works? Hello @naotoj . [82a9033](https://github.com/openjdk/jdk/commit/82a9033953655042480c90d388fcbc8053fc5ff5) worked on Windows platform, then I got your point. To minimize changes, I think I should create new method and call from JShellToolBuilder.java and JShellToolProvider.java. static PrintStream derivePrintStream(PrintStream ps) { if (Charset.defaultCharset().equals(NATIVE_CHARSET)) { return ps; } else { return new PrintStream(new WriterOutputStream( new OutputStreamWriter(ps, NATIVE_CHARSET), Charset.defaultCharset())); } } Additionally, I tested this issue on non-UTF-8 platform like RHEL7/CentOS7 EUC locale (ja_JP.eucjp). It seemed JLine's terminal's encoding was still UTF-8. I could not input Japanese character as expected, also Cut & Paste with Japanese string did not work. In my investigation, Charset.defaultCharset() was used on AbstractTerminal.java https://github.com/openjdk/jdk/blob/master/src/jdk.internal.le/share/classes/jdk/internal/org/jline/terminal/impl/AbstractTerminal.java#L55-L62 I should be like, this.encoding = encoding != null ? encoding : nativeCharset; (JLine was upgraded by b19, I need to rebase if it's required) One more thing, For Inter-Process Communication between jshell front-end and agent, I think default charset should be same for non UTF-8 environment. I think JdiInitiator.java should be modified. Please give me your suggestion. - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v3]
On Fri, 8 Oct 2021 21:07:32 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8274544: Langtools command's usage were garbled on Japanese Windows > > BTW, does the PoC in the jshell bug report really causing the issue? > > System.out.println("\u3042") > > This is ASCII, so save/restore does not seem to cause any issues across JDKs > with and without JEP400. Did you mean `Systemout.println("あ")` instead? Hello @naotoj . Sorry I'm late. I'd like to show you jshell issue step by step. 1. `System.out.println(...)` did not work if non-ASCII character was printed on JDK18-b13. Because jshell output encoding was MS932, jshell agent output encoding was UTF-8 ![jshell-932-01](https://user-images.githubusercontent.com/33543753/137185670-02bcec50-d5af-4515-b16b-2893094732d5.png) 2. Above fix was applied against `JShellToolProvider.java` only. The issue was not fixed. ![jshell-932-02](https://user-images.githubusercontent.com/33543753/137186394-2c8bab09-7889-42d4-bbb7-2fb7b8a86a80.png) 3. Just applied lahodaj's fix `JShellToolBuilder.java`. It worked fine as expected ![jshell-932-03](https://user-images.githubusercontent.com/33543753/137187148-d1eb0821-599a-4c27-a739-434ed21ff5b6.png) 4. I checked compatibility between JDK17 and this fix by using `/save` and `/open` It seemed saved a.jsh's encoding was MS932 ![jshell-932-04](https://user-images.githubusercontent.com/33543753/137187671-b271a772-790d-4925-aa84-dc003e904c34.png) 5. I think jshell and agent should use same `file.encoding` system property. I applied following fix. --- a/src/jdk.jshell/share/classes/jdk/jshell/execution/JdiInitiator.java +++ b/src/jdk.jshell/share/classes/jdk/jshell/execution/JdiInitiator.java @@ -27,6 +27,7 @@ package jdk.jshell.execution; import java.io.File; import java.io.IOException; import java.nio.charset.StandardCharsets; +import java.nio.charset.Charset; import java.nio.file.Files; import java.util.ArrayList; import java.util.HashMap; @@ -86,6 +87,17 @@ public class JdiInitiator { Map customConnectorArgs) { this.remoteAgent = remoteAgent; this.connectTimeout = (int) (timeout * CONNECT_TIMEOUT_FACTOR); +if (!StandardCharsets.UTF_8.equals(Charset.defaultCharset())) { +boolean encodingFlag = true; +for (String s : remoteVMOptions.toArray(new String[0])) { +if (s.startsWith("-Dfile.encoding=")) +encodingFlag = false; +} +if (encodingFlag) { +remoteVMOptions.add("-Dfile.encoding=" ++Charset.defaultCharset().name()); +} +} String connectorName = isLaunch ? "com.sun.jdi.CommandLineLaunch" ![image](https://user-images.githubusercontent.com/33543753/137186123-46ba46cb-e1ac-4a9f-ac77-05c2502cd368.png) I think `JShellToolBuilder.java` and `JdiInitiator.java`. Could you give me some suggestions ? - PR: https://git.openjdk.java.net/jdk/pull/5771
Questions for JDK-4947890
I have some questions for JDK-4947890。 One of app’s JVM was upgraded from JDK11 to JDK17. Then the working behavior was changed on Windows 10 Multilingual User Interface (MUI). (Base was Japanese Windows 10, display language setting was English (United States)) In my investigation, the working behavior was changed by JDK12+b23. JDK-4947890 Minimize JNI upcalls in system-properties initialization [1] Following change was applied on src/java.base/share/native/libjava/System.c, From (not for MacOS platform) [2] Put sprops->sun_jnu_encoding into file.encoding system property To [3] Put sprops->encoding into file.encoding system property I checked JDK-4947890’s CSR JDK-8213895 [4] Modify JVM interface functions for property initialization But it seems that above CSR does not mention such a significant specification change. I checked JDK17u, same code was still used [5]. Questions: I’d like to confirm * This change was intentional or not ? * We can revert tp JDK11’s spec ? Thanks, Ichiroh Takiguchi [1] https://bugs.openjdk.java.net/browse/JDK-4947890 [2] http://hg.openjdk.java.net/jdk/jdk/rev/0bdbf854472f#l12.466 [3] http://hg.openjdk.java.net/jdk/jdk/rev/0bdbf854472f#l12.276 [4] https://bugs.openjdk.java.net/browse/JDK-8213895 [5] https://github.com/openjdk/jdk17u/blob/master/src/java.base/share/native/libjava/System.c#L149
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v3]
On Wed, 6 Oct 2021 18:08:04 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8274544: Langtools command's usage were garbled on Japanese Windows > > I got your issue now. Since the current code issues `FileReader()` without > specifying the explicit charset, this is the prime example that is affected > by the JEP. The question is, in which encoding the jshell save file should > be? I think it belongs to the spec of jshell, and it should be specified > somewhere in the document. > > BTW, the suggestion I made in `JShellToolProvider` still holds regardless of > the save file issue. Hello @naotoj . Do you think I need to fix jshell issue by this PR ? - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v2]
On Wed, 6 Oct 2021 00:02:55 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8274544: Langtools command's usage were garbled on Japanese Windows > > I just grepped `System.out/err` in jshell source directory, and found another > location in `tool/JShellToolProvider.java` that uses bare stdout/err. Would > you also apply the fix and see the result? Hello @naotoj . Sorry I'm late. > I just grepped `System.out/err` in jshell source directory, and found another > location in `tool/JShellToolProvider.java` that uses bare stdout/err. Would > you also apply the fix and see the result? I applied following changes and lahodaj's code (I'm not sure, it's expected one...) : in; PrintStream xout = (out == null) -? System.out +? new PrintStream(System.out, true, nativeCharset) : (out instanceof PrintStream) ? (PrintStream) out -: new PrintStream(out); +: new PrintStream(out, true, nativeCharset); PrintStream xerr = (err == null) -? System.err +? new PrintStream(System.err, true, nativeCharset) : (err instanceof PrintStream) ? (PrintStream) err -: new PrintStream(err); +: new PrintStream(err, true, nativeCharset); try { return JavaShellToolBuilder .builder() But it did not work for previously saved encoded command list. (lahodaj's code worked fine for standard case.) I think you may be confused because of my bad explanation. User can create Jshell's command list by `/save`. Native encoding was used before JDK18. Now UTF-8 is used by JDK18. To read saved command list (`/open`) which was encoded by native encoding, Charset.defaultCharset() should be same as native.encoding... I think we need to provide workaround for it. - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v2]
On Mon, 4 Oct 2021 16:24:18 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8274544: Langtools command's usage were garbled on Japanese Windows > > src/jdk.compiler/share/classes/com/sun/tools/javac/util/Log.java line 265: > >> 263: * @return a map of writers >> 264: */ >> 265: private final static Charset nativeCharset; > > Inserting this static initializer in the middle of a method, between its > javadoc and impl, is odd. Moved to another place > src/jdk.compiler/share/classes/com/sun/tools/javac/util/Log.java line 267: > >> 265: private final static Charset nativeCharset; >> 266: static { >> 267: Charset cs = Charset.defaultCharset(); > > This could move into the `catch` section as a last resort. Move `cs = Charset.defaultCharset()` into `catch` section > src/jdk.jdeps/share/classes/com/sun/tools/javap/JavapTask.java line 419: > >> 417: return new PrintWriter(System.err, true, nativeCharset); >> 418: } else { >> 419: if (s.equals((OutputStream)System.err) || >> s.equals((OutputStream)System.out)) { > > Can we use `==` here? Used `==` > src/jdk.jpackage/share/classes/jdk/jpackage/main/Main.java line 50: > >> 48: * @param args command line arguments >> 49: */ >> 50: private final static Charset nativeCharset; > > Static initializer dissecting main method (javadoc/impl) Moved to another place - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v3]
> JEP-400 (UTF-8 by Default) was eabled on JDK18-b13. > After JDK18-b13, javac and some other langtool command's usage were garbled > on Japanese Windows. > These commands use PrintWriter instead of standard out/err with PrintStream. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: 8274544: Langtools command's usage were garbled on Japanese Windows - Changes: - all: https://git.openjdk.java.net/jdk/pull/5771/files - new: https://git.openjdk.java.net/jdk/pull/5771/files/bfe90241..4427d87c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=5771=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk=5771=01-02 Stats: 61 lines in 8 files changed: 32 ins; 21 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/5771.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5771/head:pull/5771 PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows
On Mon, 4 Oct 2021 10:24:01 GMT, Jan Lahoda wrote: >> Helllo @naotoj . >> I used `System.console()` and `Console.charset()`. >> >> The following executables had same issue, I fixed them. >>> jar.exe, javac.exe, javadoc.exe, javap.exe, jdeps.exe, jlink.exe, jmod.exe, >>> jpackage.exe >> >> I could not find out the following executables had same issue or not >>> jabswitch.exe, jcmd.exe, jfr.exe, jhsdb.exe, jimage.exe, jinfo.exe, >>> jmap.exe, jps.exe, jrunscript.exe, jstack.exe, jstat.exe, jstatd.exe, >>> kinit.exe, klist.exe, ktab.exe >> >> The following executables worked fine. >>> jarsigner.exe, java.exe, javaw.exe, jdb.exe, jdeprscan.exe, jshell.exe, >>> keytool.exe, rmiregistry.exe, serialver.exe >> >> The following executables were GUI apps >>> jaccessinspector.exe, jaccesswalker.exe, jconsole.exe >> >> These fixes don't have testcases. >> Do you have any idea about testcase for this issue ? > > @takiguc - if JShell is still an issue, is there a chance you could try this: > https://github.com/lahodaj/jdk/commit/cfa6b3eebbc22c5a48d31cfd692ff98690653686 > > Not sure if it will help, but it might (this won't change the default > charset, but could send the output in a sensible encoding for the console. > > Thanks! Thanks, @lahodaj . I opened [JDK-8274784](https://bugs.openjdk.java.net/browse/JDK-8274784). I tested your code, it worked fine on standard case ! Many thanks. But, to execute previous saved command list, user needs to specify -J-Dfile.encoding=COMPAT and -R-Dfile.encoding=COMPAT options. Do you have any idea about this kind of case ? I'd like to discuss jshell things by [JDK-8274784](https://bugs.openjdk.java.net/browse/JDK-8274784). - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows
On Fri, 1 Oct 2021 18:14:11 GMT, Naoto Sato wrote: >> JEP-400 (UTF-8 by Default) was eabled on JDK18-b13. >> After JDK18-b13, javac and some other langtool command's usage were garbled >> on Japanese Windows. >> These commands use PrintWriter instead of standard out/err with PrintStream. > > The encoding used in `Log.java` should first check whether it is run within > console or not. If `System.console()` returns the console, its `.charset()` > should be used instead of `native.encoding` value. > As to the jshell issue, it is a different issue than javac, so I would expect > it to be handled with a different issue. Helllo @naotoj . I used `System.console()` and `Console.charset()`. The following executables had same issue, I fixed them. > jar.exe, javac.exe, javadoc.exe, javap.exe, jdeps.exe, jlink.exe, jmod.exe, > jpackage.exe I could not find out the following executables had same issue or not > jabswitch.exe, jcmd.exe, jfr.exe, jhsdb.exe, jimage.exe, jinfo.exe, jmap.exe, > jps.exe, jrunscript.exe, jstack.exe, jstat.exe, jstatd.exe, kinit.exe, > klist.exe, ktab.exe The following executables worked fine. > jarsigner.exe, java.exe, javaw.exe, jdb.exe, jdeprscan.exe, jshell.exe, > keytool.exe, rmiregistry.exe, serialver.exe The following executables were GUI apps > jaccessinspector.exe, jaccesswalker.exe, jconsole.exe These fixes don't have testcases. Do you have any idea about testcase for this issue ? - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows [v2]
> JEP-400 (UTF-8 by Default) was eabled on JDK18-b13. > After JDK18-b13, javac and some other langtool command's usage were garbled > on Japanese Windows. > These commands use PrintWriter instead of standard out/err with PrintStream. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: 8274544: Langtools command's usage were garbled on Japanese Windows - Changes: - all: https://git.openjdk.java.net/jdk/pull/5771/files - new: https://git.openjdk.java.net/jdk/pull/5771/files/f348c26e..bfe90241 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=5771=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk=5771=00-01 Stats: 194 lines in 11 files changed: 141 ins; 34 del; 19 mod Patch: https://git.openjdk.java.net/jdk/pull/5771.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5771/head:pull/5771 PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were garbled on Japanese Windows
On Fri, 1 Oct 2021 12:13:03 GMT, Pavel Rappo wrote: >> JEP-400 (UTF-8 by Default) was eabled on JDK18-b13. >> After JDK18-b13, javac and some other langtool command's usage were garbled >> on Japanese Windows. >> These commands use PrintWriter instead of standard out/err with PrintStream. > > For searchability, you could fix the typo in the PR title and JBS summary: > grABled -> gARbled. A nit, really. @pavelrappo Many thanks. I changed PR and JBS. - PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8274544: Langtools command's usage were grabled on Japanese Windows
On Thu, 30 Sep 2021 21:45:15 GMT, Naoto Sato wrote: >> Screenshot >> ![javac-screenshot](https://user-images.githubusercontent.com/33543753/135429041-0ed22b36-0b1e-4626-92ca-8b58acf8872d.png) >> >> javac does not use PrintStream for standard out/err, it uses PrintWriter. >> I put some codes on >> src/jdk.compiler/share/classes/com/sun/tools/javac/util/Log.java >> * Using native.encoding system property. But >> test/langtools/tools/javac/diags/CheckResourceKeys.java was failed. >> * Use java.io.Console, like Console cons = System.console() and >> cons.charset(), but "javac 2>&1 | more" does not work as expected because >> System.console() returns null. >> >> So I added -Dfile.encoding=COMPAT expect java and javaw commands on launcher. >> >> jshell does not work as expected on b12 >> >>>jdk-18-b12\bin\jshell.exe >> | JShellへようこそ -- バージョン18-ea >> | 概要については、次を入力してください: /help intro >> >> jshell> "\u3042".getBytes() >> $1 ==> byte[2] { -126, -96 } >> >> on b13 >> >>>jdk-18-b13\bin\jshell.exe >> | JShellへようこそ -- バージョン18-ea >> | 概要については、次を入力してください: /help intro >> >> jshell> "\u3042".getBytes() >> $1 ==> byte[3] { -29, -127, -126 } >> >> It's UTF-8, not native encoding. >> I think backend java process should use same fine.encoding system property >> setting. >> >> I don't think it's good fix, so please give me some advices. > >> * Using native.encoding system property. But >> test/langtools/tools/javac/diags/CheckResourceKeys.java was failed. > > What was the cause of the failure? > >> * Use java.io.Console, like Console cons = System.console() and >> cons.charset(), but "javac 2>&1 | more" does not work as expected because >> System.console() returns null. >> >> So I added -Dfile.encoding=COMPAT expect java and javaw commands on launcher. > > I think we should fix the root cause of this, instead of specifying > `file.encoding=COMPAT` > >> >> jshell does not work as expected on b12 > > Do you mean `b13`? > >> >> ``` >> >jdk-18-b12\bin\jshell.exe >> | JShellへようこそ -- バージョン18-ea >> | 概要については、次を入力してください: /help intro >> >> jshell> "\u3042".getBytes() >> $1 ==> byte[2] { -126, -96 } >> ``` >> >> on b13 >> >> ``` >> >jdk-18-b13\bin\jshell.exe >> | JShellへようこそ -- バージョン18-ea >> | 概要については、次を入力してください: /help intro >> >> jshell> "\u3042".getBytes() >> $1 ==> byte[3] { -29, -127, -126 } >> ``` >> >> It's UTF-8, not native encoding. I think backend java process should use >> same fine.encoding system property setting. > > No it should not. `file.encoding` should not be inherited. > > Naoto @naotoj I applied following change on src/jdk.compiler/share/classes/com/sun/tools/javac/util/Log.java --- a/src/jdk.compiler/share/classes/com/sun/tools/javac/util/Log.java +++ b/src/jdk.compiler/share/classes/com/sun/tools/javac/util/Log.java @@ -26,6 +26,7 @@ package com.sun.tools.javac.util; import java.io.*; +import java.nio.charset.Charset; import java.util.Arrays; import java.util.EnumMap; import java.util.EnumSet; @@ -261,12 +262,22 @@ public class Log extends AbstractLog { * @param context the context in which to find writers to use * @return a map of writers */ +private final static Charset nativeCharset; +static { +Charset cs = Charset.defaultCharset(); +try { +cs = Charset.forName(System.getProperty("native.encoding")); +} catch (Exception e) { +} +nativeCharset = cs; +} + private static Map initWriters(Context context) { PrintWriter out = context.get(outKey); PrintWriter err = context.get(errKey); if (out == null && err == null) { -out = new PrintWriter(System.out, true); -err = new PrintWriter(System.err, true); +out = new PrintWriter(System.out, true, nativeCharset); +err = new PrintWriter(System.err, true, nativeCharset); return initWriters(out, err); } else if (out == null || err == null) { PrintWriter pw = (out != null) ? out : err; I got following exception via tools/javac/diags/CheckResourceKeys.java Error: no match for "native.encoding" java.lang.Exception: 1 errors occurred at CheckResourceKeys.main(CheckResourceKeys.java:59) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) at java.base/java.lang.Thread.run(Thread.java:833) About jshell, my sample was not good, See this one. By b12 >jdk-18-b12\bin\jshell | JShellへようこそ -- バージョン18-ea | 概要については、次を入力してください: /help intro jshell> System.out.println("\u3042") あ By b13
Re: RFR: 8274544: Langtools command's usage were grabled on Japanese Windows
On Thu, 30 Sep 2021 09:36:34 GMT, Ichiroh Takiguchi wrote: > JEP-400 (UTF-8 by Default) was eabled on JDK18-b13. > After JDK18-b13, javac and some other langtool command's usage were garbled > on Japanese Windows. > These commands use PrintWriter instead of standard out/err with PrintStream. Screenshot ![javac-screenshot](https://user-images.githubusercontent.com/33543753/135429041-0ed22b36-0b1e-4626-92ca-8b58acf8872d.png) javac does not use PrintStream for standard out/err, it uses PrintWriter. I put some codes on src/jdk.compiler/share/classes/com/sun/tools/javac/util/Log.java * Using native.encoding system property. But test/langtools/tools/javac/diags/CheckResourceKeys.java was failed. * Use java.io.Console, like Console cons = System.console() and cons.charset(), but "javac 2>&1 | more" does not work as expected because System.console() returns null. So I added -Dfile.encoding=COMPAT expect java and javaw commands on launcher. jshell does not work as expected on b12 >jdk-18-b12\bin\jshell.exe | JShellへようこそ -- バージョン18-ea | 概要については、次を入力してください: /help intro jshell> "\u3042".getBytes() $1 ==> byte[2] { -126, -96 } on b13 >jdk-18-b13\bin\jshell.exe | JShellへようこそ -- バージョン18-ea | 概要については、次を入力してください: /help intro jshell> "\u3042".getBytes() $1 ==> byte[3] { -29, -127, -126 } It's UTF-8, not native encoding. I think backend java process should use same fine.encoding system property setting. I don't think it's good fix, so please give me some advices. - PR: https://git.openjdk.java.net/jdk/pull/5771
RFR: 8274544: Langtools command's usage were grabled on Japanese Windows
JEP-400 (UTF-8 by Default) was eabled on JDK18-b13. After JDK18-b13, javac and some other langtool command's usage were garbled on Japanese Windows. These commands use PrintWriter instead of standard out/err with PrintStream. - Commit messages: - Langtools command's usage were grabled on Japanese Windows Changes: https://git.openjdk.java.net/jdk/pull/5771/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk=5771=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8274544 Stats: 35 lines in 2 files changed: 34 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/5771.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5771/head:pull/5771 PR: https://git.openjdk.java.net/jdk/pull/5771
Re: RFR: 8266013: Unexpected replacement character handling on stateful CharsetEncoder [v2]
On Fri, 30 Apr 2021 16:11:30 GMT, Ichiroh Takiguchi wrote: >> When an invalid character is converted by getBytes() method, the character >> is converted to replacement byte data. >> Shift code (SO/SI) may not be added into right place by EBCDIC Mix charset. >> EBCDIC Mix charset encoder is stateful encoder. >> Shift code should be added by switching character set. >> On x-IBM1364, "\u3000\uD800" should be converted to "\x0E\x40\x40\x0F\x6F", >> but "\x0E\x40\x40\x6F\x0F" >> SI is not in right place. >> >> Also ISO2022 related charsets use escape sequence to switch character set. >> But same kind of issue is there. > > Ichiroh Takiguchi has updated the pull request incrementally with one > additional commit since the last revision: > > 8266013: Unexpected replacement character handling on stateful > CharsetEncoder Gentle reminder Currently stateful CharsetEncoder (like EBCDIC Mix, ISO2022 related) cannot handle replacement characters. Please give me your suggestion or advice. - PR: https://git.openjdk.java.net/jdk/pull/3719
Re: RFR: 8266013: Unexpected replacement character handling on stateful CharsetEncoder [v2]
> When an invalid character is converted by getBytes() method, the character is > converted to replacement byte data. > Shift code (SO/SI) may not be added into right place by EBCDIC Mix charset. > EBCDIC Mix charset encoder is stateful encoder. > Shift code should be added by switching character set. > On x-IBM1364, "\u3000\uD800" should be converted to "\x0E\x40\x40\x0F\x6F", > but "\x0E\x40\x40\x6F\x0F" > SI is not in right place. > > Also ISO2022 related charsets use escape sequence to switch character set. > But same kind of issue is there. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: 8266013: Unexpected replacement character handling on stateful CharsetEncoder - Changes: - all: https://git.openjdk.java.net/jdk/pull/3719/files - new: https://git.openjdk.java.net/jdk/pull/3719/files/d6a0a41b..33107e67 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=3719=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk=3719=00-01 Stats: 59 lines in 2 files changed: 40 ins; 5 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/3719.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3719/head:pull/3719 PR: https://git.openjdk.java.net/jdk/pull/3719
Re: RFR: 8264208: Console charset API [v4]
On Mon, 12 Apr 2021 23:01:24 GMT, Naoto Sato wrote: >> Please review the changes for the subject issue. This has been suggested in >> a recent discussion thread for the JEP 400 >> [[1](https://mail.openjdk.java.net/pipermail/core-libs-dev/2021-March/075214.html)]. >> A CSR has also been drafted, and comments are welcome >> [[2](https://bugs.openjdk.java.net/browse/JDK-8264209)]. > > Naoto Sato has updated the pull request incrementally with one additional > commit since the last revision: > > Reverted PrintStream changes I have 2 concerns: 1. I think method name "charset()" is too short. It's not called frequently. This method name should explain functionality. 2. Sometimes stderr may be redirected to stdout by shell. Why do we need to set different encodings for these two (sun.stdout.encoding and sun.stderr.encoding) ? - PR: https://git.openjdk.java.net/jdk/pull/3419
Re: RFR: 8242541: Small charset issues (ISO8859-16, x-eucJP-Open, x-IBM834 and x-IBM949C)
Hello Naoto. Thank you for your attention. And I'm sorry about bad response. I added ISO8859_16 entry into test/jdk/java/nio/charset/Charset/RegisteredCharsets.java Could you review the fix again ? Bug:https://bugs.openjdk.java.net/browse/JDK-8242541 Change: https://cr.openjdk.java.net/~itakiguchi/8242541/webrev.01/ Thanks, Ichiroh Takiguchi On 2020-04-23 00:54, naoto.s...@oracle.com wrote: Hi Takiguchi-san, Change looks good. I'd expect a test case in open/test/jdk/java/nio/charset/Charset/RegisteredCharsets.java for the added "ISO8859_16" alias. Naoto On 4/20/20 6:37 PM, Ichiroh Takiguchi wrote: Hello. Could you review the fix ? Bug: https://bugs.openjdk.java.net/browse/JDK-8242541 Change: https://cr.openjdk.java.net/~itakiguchi/8242541/webrev.00/ I applied following changes: * Missing historical name alias in ISO8859-16 * Typo hisname on x-eucJP-Open * x-IBM834 and x-IBM949C charset source codes should be template style I appreciate your feedback and suggestions. Thanks, Ichiroh Takiguchi IBM Japan, Ltd.
RFR: 8242541: Small charset issues (ISO8859-16, x-eucJP-Open, x-IBM834 and x-IBM949C)
Hello. Could you review the fix ? Bug:https://bugs.openjdk.java.net/browse/JDK-8242541 Change: https://cr.openjdk.java.net/~itakiguchi/8242541/webrev.00/ I applied following changes: * Missing historical name alias in ISO8859-16 * Typo hisname on x-eucJP-Open * x-IBM834 and x-IBM949C charset source codes should be template style I appreciate your feedback and suggestions. Thanks, Ichiroh Takiguchi IBM Japan, Ltd.
Re: [15] RFR: 8241311: Move some charset mapping tests from closed to open
Hello Naoto. I tested webrev.06 code. It's fine, thanks. I'm interested in about @module for these testcases. I think webrev.04 code worked via jtreg. I could not see any warning. At this case, @module is required ? Thanks, Ichiroh Takiguchi On 2020-03-24 10:06, naoto.s...@oracle.com wrote: Hi Takiguchi-san, On 3/23/20 5:48 AM, Ichiroh Takiguchi wrote: Hello Naoto. I'm not reviewer, but I have a concern about following code on test/jdk/sun/nio/cs/mapping/TestConv.java == 98 } catch (Exception ex) { 99 System.out.println("Exception thrown while testing " + encoding); 100 ex.printStackTrace(); 101 return; 102 } == 3 stack trace (java.io.UnsupportedEncodingException) were displayed against following encodings: MS50221_0208, MS50221_0212, MS932_0208 I think only UnsupportedEncodingException should be caught. The other exception should be handled as error. What do you think ? Good catch. I believe the test logic that assumes the file name is the charset name is simply wrong. I added the check whether the input charset name is supported or not, and only do the test if the charset is supported: http://cr.openjdk.java.net/~naoto/8241311/webrev.05/ Naoto Thanks, Ichiroh Takiguchi On 2020-03-21 01:21, naoto.s...@oracle.com wrote: Hello, Please review the fix to the following issue: https://bugs.openjdk.java.net/browse/JDK-8241311 The proposed changeset is located at: https://cr.openjdk.java.net/~naoto/8241311/webrev.04/ This is simply to move some test cases that have been in closed repository into open repository. Naoto
Re: [15] RFR: 8241311: Move some charset mapping tests from closed to open
Hello Naoto. I'm not reviewer, but I have a concern about following code on test/jdk/sun/nio/cs/mapping/TestConv.java == 98 } catch (Exception ex) { 99 System.out.println("Exception thrown while testing " + encoding); 100 ex.printStackTrace(); 101 return; 102 } == 3 stack trace (java.io.UnsupportedEncodingException) were displayed against following encodings: MS50221_0208, MS50221_0212, MS932_0208 I think only UnsupportedEncodingException should be caught. The other exception should be handled as error. What do you think ? Thanks, Ichiroh Takiguchi On 2020-03-21 01:21, naoto.s...@oracle.com wrote: Hello, Please review the fix to the following issue: https://bugs.openjdk.java.net/browse/JDK-8241311 The proposed changeset is located at: https://cr.openjdk.java.net/~naoto/8241311/webrev.04/ This is simply to move some test cases that have been in closed repository into open repository. Naoto
Re: RFR 8232161: Align some one-way conversion in MS950 charset with Windows
Hello Naoto. I appreciate your comments. I modified TestMS950.java testcase. I think it's easy to read. Could you review the fix again ? Bug:https://bugs.openjdk.java.net/browse/JDK-8232161 Change: https://cr.openjdk.java.net/~itakiguchi/8232161/webrev.03/ Thanks, Ichiroh Takiguchi On 2020-03-05 04:31, naoto.s...@oracle.com wrote: On 3/4/20 9:18 AM, Ichiroh Takiguchi wrote: Hello Naoto. I appreciate your comments. I applied following changes: * MS950.nr and TestMS950.java data were sorted by Unicode order * Added some comments into TestMS950.java * Change comment on MS950.map Could you review the fix ? Bug: https://bugs.openjdk.java.net/browse/JDK-8232161 Change: https://cr.openjdk.java.net/~itakiguchi/8232161/webrev.02/ I'd expect the sort order to be aligned with other *.nr files, i.e., sorted by the source byte sequence. Same for the test case (TestMS950.java) and the comment in MS950.map. As to the test comment, how about adding something below to @summary line? "Those test data confirm the preferred b2c irreversible mappings defined in MS950.nr file." Naoto Thanks, Ichiroh Takiguchi On 2020-03-03 10:31, naoto.s...@oracle.com wrote: Hi Takiguchi-san, A few comments: - I'd recommend sorting the entries in MS950.nr and test data in TestMS950.java for readability. - Add some comment about the objective in the test. It'd be hard for engineers who have no previous knowledge to these bytes. Naoto On 3/2/20 9:33 AM, Ichiroh Takiguchi wrote: Hello. Could you review the fix ? Bug: https://bugs.openjdk.java.net/browse/JDK-8232161 Change: https://cr.openjdk.java.net/~itakiguchi/8232161/webrev.01/ CSR 8233385 [1] was approved. [1] https://bugs.openjdk.java.net/browse/JDK-8233385 Thanks, Ichiroh Takiguchi IBM Japan, Ltd.
Re: RFR 8232161: Align some one-way conversion in MS950 charset with Windows
Hello Naoto. I appreciate your comments. I applied following changes: * MS950.nr and TestMS950.java data were sorted by Unicode order * Added some comments into TestMS950.java * Change comment on MS950.map Could you review the fix ? Bug:https://bugs.openjdk.java.net/browse/JDK-8232161 Change: https://cr.openjdk.java.net/~itakiguchi/8232161/webrev.02/ Thanks, Ichiroh Takiguchi On 2020-03-03 10:31, naoto.s...@oracle.com wrote: Hi Takiguchi-san, A few comments: - I'd recommend sorting the entries in MS950.nr and test data in TestMS950.java for readability. - Add some comment about the objective in the test. It'd be hard for engineers who have no previous knowledge to these bytes. Naoto On 3/2/20 9:33 AM, Ichiroh Takiguchi wrote: Hello. Could you review the fix ? Bug: https://bugs.openjdk.java.net/browse/JDK-8232161 Change: https://cr.openjdk.java.net/~itakiguchi/8232161/webrev.01/ CSR 8233385 [1] was approved. [1] https://bugs.openjdk.java.net/browse/JDK-8233385 Thanks, Ichiroh Takiguchi IBM Japan, Ltd.
RFR 8232161: Align some one-way conversion in MS950 charset with Windows
Hello. Could you review the fix ? Bug:https://bugs.openjdk.java.net/browse/JDK-8232161 Change: https://cr.openjdk.java.net/~itakiguchi/8232161/webrev.01/ CSR 8233385 [1] was approved. [1] https://bugs.openjdk.java.net/browse/JDK-8233385 Thanks, Ichiroh Takiguchi IBM Japan, Ltd.
Re: RFR: 8239965: XMLEncoder/Test4625418.java fails due to "Error: Cp943 - can't read properly"
Hello Naoto. I appreciate your comments. And sorry for my easy mistake. Could you review the fix again ? Bug:https://bugs.openjdk.java.net/browse/JDK-8239965 Change: https://cr.openjdk.java.net/~itakiguchi/8239965/webrev.01/ Thanks, Ichiroh Takiguchi On 2020-02-27 05:56, naoto.s...@oracle.com wrote: Takiguchi-san, Some nits: - Please add 'noreg-self' to the jira issue. - Copyright year should be modified. - Add the bug number to '@bug' tag. Naoto On 2/26/20 10:03 AM, Ichiroh Takiguchi wrote: Hello. Could you review the fix ? Bug: https://bugs.openjdk.java.net/browse/JDK-8239965 Change: https://cr.openjdk.java.net/~itakiguchi/8239965/webrev.00/ Sorry for side effect. Test4625418.java should skip Cp943 and x-IBM943 should be skipped after JDK-8235834 [1] was integrated. [1] https://bugs.openjdk.java.net/browse/JDK-8235834 Thanks, Ichiroh Takiguchi IBM Japan, Ltd.
RFR: 8239965: XMLEncoder/Test4625418.java fails due to "Error: Cp943 - can't read properly"
Hello. Could you review the fix ? Bug:https://bugs.openjdk.java.net/browse/JDK-8239965 Change: https://cr.openjdk.java.net/~itakiguchi/8239965/webrev.00/ Sorry for side effect. Test4625418.java should skip Cp943 and x-IBM943 should be skipped after JDK-8235834 [1] was integrated. [1] https://bugs.openjdk.java.net/browse/JDK-8235834 Thanks, Ichiroh Takiguchi IBM Japan, Ltd.
Re: RFR: 8235834 IBM-943 charset encoder needs updating
Hello Naoto. Yes, I agree with your suggestion. I'm very happy if you are the sponsor of this issue. Thanks, Ichiroh Takiguchi On 2020-02-22 01:53, naoto.s...@oracle.com wrote: Two subtle comments to the new webrev: - I'd add "private" to those static finals. - "cs.name()" in the exception messages can be replaced with "csName". Otherwise it looks good. If you agree with the above, no further review is needed. Also, if you need a sponsor, I can sponsor your changeset. Naoto On 2/21/20 5:10 AM, Ichiroh Takiguchi wrote: Hello Naoto. I appreciate your suggestions. I applied your suggestions into new patch. Could you review the fix again ? Bug: https://bugs.openjdk.java.net/browse/JDK-8235834 Change: https://cr.openjdk.java.net/~itakiguchi/8235834/webrev.01/ Thanks, Ichiroh Takiguchi IBM Japan, Ltd. On 2020-02-21 02:51, naoto.s...@oracle.com wrote: Thanks for the explanation. Here are comments to your fix: - Can you add some comments about the gist of the change to IBM943.c2b? Summarizing the explanation below would be fine. For the test case: - Please wrap the @bug line appropriately. - Those byte array/Strings can be "static final" outside the test method. - Looks like the mapping is exactly the same with IBM943C for those code points. Is that correct? If so, would you please add some comment as such? - Typo: "round-tip" -> "round-trip" Naoto On 2/20/20 2:47 AM, Ichiroh Takiguchi wrote: Hello Naoto. I appreciate your comment. The definition has not changed recently. Applying the change will prevent the characters which are used on Japanese PC from being converted to "?". I checked IBM-943 definition and changelog. Definitions and creation date are as follows: (All definitions are valid) 03AF34B0.TPMAP120: May 20 1997 (Current OpenJDK) 34B003AF.RPMAP130: May 20 1997 (Proposed change, upward compatible) 34B003AF.RPMAP14A: Jul 29 1998 (Same as 34B003AF.RPMAP130) 34B003AF.RPMAP15A: Jan 8 2003 (Additionally, 13 characters are changed) According to IBM943.map, OpenJDK JDK refers 03AF34B0.TPMAP120. 03AF34B0.TPMAP120 just has B2C conversion table only, no C2B definition. 34B003AF.RPMAP130 which has C2B definition was released on same date. I assume C2B definition was not implemented at that time. C2B definition for 34B003AF.RPMAP130 and 34B003AF.RPMAP14A are same, only replacement character is changed. 34B003AF.RPMAP15A is the latest, but it's almost same as MS932. If 34B003AF.RPMAP15A is applied, 03AF34B0.TPMAP14A is also required. I'd like to add C2B definition without B2C definition because of compatibility. I don't want to apply 03AF34B0.TPMAP14A B2C definition. So I'd like to apply 34B003AF.RPMAP130 definition. Thanks, Ichiroh Takiguchi On 2020-02-19 07:33, naoto.s...@oracle.com wrote: Hi Takiguchi-san, Can you please elaborate the rationale for the change? It looks like IBM943 chaset hasn't changed for a long time, at least from JDK8. Has the mapping definition recently changed? Naoto On 2/17/20 10:23 AM, Ichiroh Takiguchi wrote: Hello. Could you review the fix ? Bug: https://bugs.openjdk.java.net/browse/JDK-8235834 Change: https://cr.openjdk.java.net/~itakiguchi/8235834/webrev.00/ IBM-943 is for IBM Japanese PC encoding. MS932 is for Microsoft Japanese Windows encoding. IBM-943 charset encoder does not contain 5 compatible entries compared to MS932 charset. Thanks, Ichiroh Takiguchi IBM Japan, Ltd.
Re: RFR: 8235834 IBM-943 charset encoder needs updating
Hello Naoto. I appreciate your suggestions. I applied your suggestions into new patch. Could you review the fix again ? Bug:https://bugs.openjdk.java.net/browse/JDK-8235834 Change: https://cr.openjdk.java.net/~itakiguchi/8235834/webrev.01/ Thanks, Ichiroh Takiguchi IBM Japan, Ltd. On 2020-02-21 02:51, naoto.s...@oracle.com wrote: Thanks for the explanation. Here are comments to your fix: - Can you add some comments about the gist of the change to IBM943.c2b? Summarizing the explanation below would be fine. For the test case: - Please wrap the @bug line appropriately. - Those byte array/Strings can be "static final" outside the test method. - Looks like the mapping is exactly the same with IBM943C for those code points. Is that correct? If so, would you please add some comment as such? - Typo: "round-tip" -> "round-trip" Naoto On 2/20/20 2:47 AM, Ichiroh Takiguchi wrote: Hello Naoto. I appreciate your comment. The definition has not changed recently. Applying the change will prevent the characters which are used on Japanese PC from being converted to "?". I checked IBM-943 definition and changelog. Definitions and creation date are as follows: (All definitions are valid) 03AF34B0.TPMAP120: May 20 1997 (Current OpenJDK) 34B003AF.RPMAP130: May 20 1997 (Proposed change, upward compatible) 34B003AF.RPMAP14A: Jul 29 1998 (Same as 34B003AF.RPMAP130) 34B003AF.RPMAP15A: Jan 8 2003 (Additionally, 13 characters are changed) According to IBM943.map, OpenJDK JDK refers 03AF34B0.TPMAP120. 03AF34B0.TPMAP120 just has B2C conversion table only, no C2B definition. 34B003AF.RPMAP130 which has C2B definition was released on same date. I assume C2B definition was not implemented at that time. C2B definition for 34B003AF.RPMAP130 and 34B003AF.RPMAP14A are same, only replacement character is changed. 34B003AF.RPMAP15A is the latest, but it's almost same as MS932. If 34B003AF.RPMAP15A is applied, 03AF34B0.TPMAP14A is also required. I'd like to add C2B definition without B2C definition because of compatibility. I don't want to apply 03AF34B0.TPMAP14A B2C definition. So I'd like to apply 34B003AF.RPMAP130 definition. Thanks, Ichiroh Takiguchi On 2020-02-19 07:33, naoto.s...@oracle.com wrote: Hi Takiguchi-san, Can you please elaborate the rationale for the change? It looks like IBM943 chaset hasn't changed for a long time, at least from JDK8. Has the mapping definition recently changed? Naoto On 2/17/20 10:23 AM, Ichiroh Takiguchi wrote: Hello. Could you review the fix ? Bug: https://bugs.openjdk.java.net/browse/JDK-8235834 Change: https://cr.openjdk.java.net/~itakiguchi/8235834/webrev.00/ IBM-943 is for IBM Japanese PC encoding. MS932 is for Microsoft Japanese Windows encoding. IBM-943 charset encoder does not contain 5 compatible entries compared to MS932 charset. Thanks, Ichiroh Takiguchi IBM Japan, Ltd.
Re: RFR: 8235834 IBM-943 charset encoder needs updating
Hello Naoto. I appreciate your comment. The definition has not changed recently. Applying the change will prevent the characters which are used on Japanese PC from being converted to "?". I checked IBM-943 definition and changelog. Definitions and creation date are as follows: (All definitions are valid) 03AF34B0.TPMAP120: May 20 1997 (Current OpenJDK) 34B003AF.RPMAP130: May 20 1997 (Proposed change, upward compatible) 34B003AF.RPMAP14A: Jul 29 1998 (Same as 34B003AF.RPMAP130) 34B003AF.RPMAP15A: Jan 8 2003 (Additionally, 13 characters are changed) According to IBM943.map, OpenJDK JDK refers 03AF34B0.TPMAP120. 03AF34B0.TPMAP120 just has B2C conversion table only, no C2B definition. 34B003AF.RPMAP130 which has C2B definition was released on same date. I assume C2B definition was not implemented at that time. C2B definition for 34B003AF.RPMAP130 and 34B003AF.RPMAP14A are same, only replacement character is changed. 34B003AF.RPMAP15A is the latest, but it's almost same as MS932. If 34B003AF.RPMAP15A is applied, 03AF34B0.TPMAP14A is also required. I'd like to add C2B definition without B2C definition because of compatibility. I don't want to apply 03AF34B0.TPMAP14A B2C definition. So I'd like to apply 34B003AF.RPMAP130 definition. Thanks, Ichiroh Takiguchi On 2020-02-19 07:33, naoto.s...@oracle.com wrote: Hi Takiguchi-san, Can you please elaborate the rationale for the change? It looks like IBM943 chaset hasn't changed for a long time, at least from JDK8. Has the mapping definition recently changed? Naoto On 2/17/20 10:23 AM, Ichiroh Takiguchi wrote: Hello. Could you review the fix ? Bug: https://bugs.openjdk.java.net/browse/JDK-8235834 Change: https://cr.openjdk.java.net/~itakiguchi/8235834/webrev.00/ IBM-943 is for IBM Japanese PC encoding. MS932 is for Microsoft Japanese Windows encoding. IBM-943 charset encoder does not contain 5 compatible entries compared to MS932 charset. Thanks, Ichiroh Takiguchi IBM Japan, Ltd.
RFR: 8235834 IBM-943 charset encoder needs updating
Hello. Could you review the fix ? Bug:https://bugs.openjdk.java.net/browse/JDK-8235834 Change: https://cr.openjdk.java.net/~itakiguchi/8235834/webrev.00/ IBM-943 is for IBM Japanese PC encoding. MS932 is for Microsoft Japanese Windows encoding. IBM-943 charset encoder does not contain 5 compatible entries compared to MS932 charset. Thanks, Ichiroh Takiguchi IBM Japan, Ltd.
Re: RFR: CSR JDK-8233385 Align some one-way conversion in MS950 charset with Windows
Hello. This is reminder, again. Could you review CSR JDK-8233385 [1] ? [1] https://bugs.openjdk.java.net/browse/JDK-8233385 Thanks, Ichiroh Takiguchi On 2020-01-22 02:47, naoto.s...@oracle.com wrote: Looks good to me. Naoto On 1/20/20 4:30 AM, Ichiroh Takiguchi wrote: Hello. This is just a gentle reminder. Could you review CSR JDK-8233385 [1] ? [1] https://bugs.openjdk.java.net/browse/JDK-8233385 Thanks, Ichiroh Takiguchi On 2020-01-10 22:13, Ichiroh Takiguchi wrote: Hello. Could you review CSR JDK-8233385 [1] ? [1] https://bugs.openjdk.java.net/browse/JDK-8233385 Thanks, Ichiroh Takiguchi IBM Japan, Ltd.
8236548 Concern about CLDR Timezone translated data
Hello. I have 2 concerns about CLDR Timezone translated data. The detail information is in JDK-8236548 [1]. Can you show me how to solve this kind of ICU related issue ? [1] https://bugs.openjdk.java.net/browse/JDK-8236548 Thanks, Ichiroh Takiguchi IBM Japan, Ltd.
Re: RFR: CSR JDK-8233385 Align some one-way conversion in MS950 charset with Windows
Hello. This is just a gentle reminder. Could you review CSR JDK-8233385 [1] ? [1] https://bugs.openjdk.java.net/browse/JDK-8233385 Thanks, Ichiroh Takiguchi On 2020-01-10 22:13, Ichiroh Takiguchi wrote: Hello. Could you review CSR JDK-8233385 [1] ? [1] https://bugs.openjdk.java.net/browse/JDK-8233385 Thanks, Ichiroh Takiguchi IBM Japan, Ltd.