Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v5]
On Fri, 6 May 2022 15:13:38 GMT, Roger Riggs wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > test/jdk/java/lang/ProcessBuilder/Basic.java line 606: > >> 604: ? Charset.forName(jnuEncoding, Charset.defaultCharset()) >> 605: : Charset.defaultCharset(); >> 606: if (new String(tested.getBytes(cs), cs).equals(tested)) { > > Isn't it always true that the round trip encoding to bytes and back (using > the same Charset) to a string is equal()? > And if it is always true, then the if(...) can be removed. Above code is related to following code: https://github.com/openjdk/jdk/blob/5212535a276a92d96ca20bdcfccfbce956febdb1/test/jdk/java/lang/ProcessBuilder/Basic.java#L1569-L1570 If `Charset.defaultCharset()` is `UTF-8`, this code is not skipped. I think this code will be skipped on JDK17 on C locale. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v5]
On Fri, 6 May 2022 20:03:35 GMT, Naoto Sato wrote: >> Ichiroh Takiguchi has updated the pull request incrementally with one >> additional commit since the last revision: >> >> 8285517: System.getenv() returns unexpected value if environment variable >> has non ASCII character > > src/java.base/share/classes/jdk/internal/util/StaticProperty.java line 77: > >> 75: SUN_JNU_ENCODING = getProperty(props, "sun.jnu.encoding"); >> 76: jnuCharset = Charset.forName(SUN_JNU_ENCODING, >> Charset.defaultCharset()); >> 77: } > > I am not sure it is OK to initialize `Charset` here, as `sun_jnu_encoding` is > initialized in `System.initPhase1()` and pulling `Charset` there may cause > some init order change. I'd only cache the encoding string here. Note the initialization of `sun.jnu.encoding` in System.java:2142'ish. This happens before StaticProperty is initialized; at line 2147. If the `sun.jnu.encoding` is not supported (this is before Providers are enabled) then it is forced to `UTF-8`. So it is the case that the encoding is supported by that point. Note that `Charset.isSupported(...)` does the full lookup in its implementation. If it is not supported, the system still comes up using UTF-8 and a warning is printed at line 2282. Comparing the class initialization order may be a useful thing to cross check. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v5]
On Fri, 6 May 2022 14:23:00 GMT, Ichiroh Takiguchi wrote: >> On JDK19 with Linux ja_JP.eucjp locale, >> System.getenv() returns unexpected value if environment variable has >> Japanese EUC characters. >> It seems this issue happens because of JEP 400. >> Arguments for ProcessBuilder have same kind of issue. > > Ichiroh Takiguchi has updated the pull request incrementally with one > additional commit since the last revision: > > 8285517: System.getenv() returns unexpected value if environment variable > has non ASCII character src/java.base/share/classes/jdk/internal/util/StaticProperty.java line 77: > 75: SUN_JNU_ENCODING = getProperty(props, "sun.jnu.encoding"); > 76: jnuCharset = Charset.forName(SUN_JNU_ENCODING, > Charset.defaultCharset()); > 77: } I am not sure it is OK to initialize `Charset` here, as `sun_jnu_encoding` is initialized in `System.initPhase1()` and pulling `Charset` there may cause some init order change. I'd only cache the encoding string here. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v5]
On Fri, 6 May 2022 14:23:00 GMT, Ichiroh Takiguchi wrote: >> On JDK19 with Linux ja_JP.eucjp locale, >> System.getenv() returns unexpected value if environment variable has >> Japanese EUC characters. >> It seems this issue happens because of JEP 400. >> Arguments for ProcessBuilder have same kind of issue. > > Ichiroh Takiguchi has updated the pull request incrementally with one > additional commit since the last revision: > > 8285517: System.getenv() returns unexpected value if environment variable > has non ASCII character test/jdk/java/lang/ProcessBuilder/Basic.java line 606: > 604: ? Charset.forName(jnuEncoding, Charset.defaultCharset()) > 605: : Charset.defaultCharset(); > 606: if (new String(tested.getBytes(cs), cs).equals(tested)) { Isn't it always true that the round trip encoding to bytes and back (using the same Charset) to a string is equal()? And if it is always true, then the if(...) can be removed. test/jdk/java/lang/System/i18nEnvArg.java line 104: > 102: String s = System.getenv(EUC_JP_TEXT); > 103: if (!EUC_JP_TEXT.equals(s)) { > 104: System.err.println("ERROR: getenv() returns unexpected > data"); Please add the unexpected data `s` to the output string. test/jdk/java/lang/System/i18nEnvArg.java line 108: > 106: if (!EUC_JP_TEXT.equals(args[0])) { > 107: System.err.print("ERROR: Unexpected argument was > received: "); > 108: for(char ch : EUC_JP_TEXT.toCharArray()) { This is the expected value, the previous "Unexpected" labeling might be mis-understood. test/jdk/java/lang/System/i18nEnvArg.java line 111: > 109:System.err.printf("\\u%04X", (int)ch); > 110: } > 111: System.err.print("<->"); This might be clearer if labeled as the actual/incorrect value and on a separate line. - PR: https://git.openjdk.java.net/jdk/pull/8378
Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v5]
> On JDK19 with Linux ja_JP.eucjp locale, > System.getenv() returns unexpected value if environment variable has Japanese > EUC characters. > It seems this issue happens because of JEP 400. > Arguments for ProcessBuilder have same kind of issue. Ichiroh Takiguchi has updated the pull request incrementally with one additional commit since the last revision: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character - Changes: - all: https://git.openjdk.java.net/jdk/pull/8378/files - new: https://git.openjdk.java.net/jdk/pull/8378/files/84e6d639..5bd8492f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=8378=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk=8378=03-04 Stats: 103 lines in 4 files changed: 39 ins; 39 del; 25 mod Patch: https://git.openjdk.java.net/jdk/pull/8378.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8378/head:pull/8378 PR: https://git.openjdk.java.net/jdk/pull/8378