Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v5]

2022-05-07 Thread Ichiroh Takiguchi
On Fri, 6 May 2022 15:13:38 GMT, Roger Riggs  wrote:

>> Ichiroh Takiguchi has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   8285517: System.getenv() returns unexpected value if environment variable 
>> has non ASCII character
>
> test/jdk/java/lang/ProcessBuilder/Basic.java line 606:
> 
>> 604: ? Charset.forName(jnuEncoding, Charset.defaultCharset())
>> 605: : Charset.defaultCharset();
>> 606: if (new String(tested.getBytes(cs), cs).equals(tested)) {
> 
> Isn't it always true that the round trip encoding to bytes and back (using 
> the same Charset) to a string is equal()?
> And if it is always true, then the if(...) can be removed.

Above code is related to following code:
https://github.com/openjdk/jdk/blob/5212535a276a92d96ca20bdcfccfbce956febdb1/test/jdk/java/lang/ProcessBuilder/Basic.java#L1569-L1570
If `Charset.defaultCharset()` is `UTF-8`, this code is not skipped.
I think this code will be skipped on JDK17 on C locale.

-

PR: https://git.openjdk.java.net/jdk/pull/8378


Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v5]

2022-05-06 Thread Roger Riggs
On Fri, 6 May 2022 20:03:35 GMT, Naoto Sato  wrote:

>> Ichiroh Takiguchi has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   8285517: System.getenv() returns unexpected value if environment variable 
>> has non ASCII character
>
> src/java.base/share/classes/jdk/internal/util/StaticProperty.java line 77:
> 
>> 75: SUN_JNU_ENCODING = getProperty(props, "sun.jnu.encoding");
>> 76: jnuCharset = Charset.forName(SUN_JNU_ENCODING, 
>> Charset.defaultCharset());
>> 77: }
> 
> I am not sure it is OK to initialize `Charset` here, as `sun_jnu_encoding` is 
> initialized in `System.initPhase1()` and pulling `Charset` there may cause 
> some init order change. I'd only cache the encoding string here.

Note the initialization of `sun.jnu.encoding` in System.java:2142'ish.
This happens before StaticProperty is initialized;  at line 2147.
If the `sun.jnu.encoding` is not supported (this is before Providers are 
enabled)
then it is forced to `UTF-8`.
So it is the case that the encoding is supported by that point.
Note that `Charset.isSupported(...)` does the full lookup in its implementation.

If it is not supported, the system still comes up using UTF-8 and a warning is 
printed at line 2282.

Comparing the class initialization order may be a useful thing to cross check.

-

PR: https://git.openjdk.java.net/jdk/pull/8378


Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v5]

2022-05-06 Thread Naoto Sato
On Fri, 6 May 2022 14:23:00 GMT, Ichiroh Takiguchi  
wrote:

>> On JDK19 with Linux ja_JP.eucjp locale,
>> System.getenv() returns unexpected value if environment variable has 
>> Japanese EUC characters.
>> It seems this issue happens because of JEP 400.
>> Arguments for ProcessBuilder have same kind of issue.
>
> Ichiroh Takiguchi has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   8285517: System.getenv() returns unexpected value if environment variable 
> has non ASCII character

src/java.base/share/classes/jdk/internal/util/StaticProperty.java line 77:

> 75: SUN_JNU_ENCODING = getProperty(props, "sun.jnu.encoding");
> 76: jnuCharset = Charset.forName(SUN_JNU_ENCODING, 
> Charset.defaultCharset());
> 77: }

I am not sure it is OK to initialize `Charset` here, as `sun_jnu_encoding` is 
initialized in `System.initPhase1()` and pulling `Charset` there may cause some 
init order change. I'd only cache the encoding string here.

-

PR: https://git.openjdk.java.net/jdk/pull/8378


Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v5]

2022-05-06 Thread Roger Riggs
On Fri, 6 May 2022 14:23:00 GMT, Ichiroh Takiguchi  
wrote:

>> On JDK19 with Linux ja_JP.eucjp locale,
>> System.getenv() returns unexpected value if environment variable has 
>> Japanese EUC characters.
>> It seems this issue happens because of JEP 400.
>> Arguments for ProcessBuilder have same kind of issue.
>
> Ichiroh Takiguchi has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   8285517: System.getenv() returns unexpected value if environment variable 
> has non ASCII character

test/jdk/java/lang/ProcessBuilder/Basic.java line 606:

> 604: ? Charset.forName(jnuEncoding, Charset.defaultCharset())
> 605: : Charset.defaultCharset();
> 606: if (new String(tested.getBytes(cs), cs).equals(tested)) {

Isn't it always true that the round trip encoding to bytes and back (using the 
same Charset) to a string is equal()?
And if it is always true, then the if(...) can be removed.

test/jdk/java/lang/System/i18nEnvArg.java line 104:

> 102: String s = System.getenv(EUC_JP_TEXT);
> 103: if (!EUC_JP_TEXT.equals(s)) {
> 104: System.err.println("ERROR: getenv() returns unexpected 
> data");

Please add the unexpected data `s` to the output string.

test/jdk/java/lang/System/i18nEnvArg.java line 108:

> 106: if (!EUC_JP_TEXT.equals(args[0])) {
> 107: System.err.print("ERROR: Unexpected argument was 
> received: ");
> 108: for(char ch : EUC_JP_TEXT.toCharArray()) {

This is the expected value, the previous "Unexpected" labeling might be 
mis-understood.

test/jdk/java/lang/System/i18nEnvArg.java line 111:

> 109:System.err.printf("\\u%04X", (int)ch);
> 110: }
> 111: System.err.print("<->");

This might be clearer if labeled as the actual/incorrect value and on a 
separate line.

-

PR: https://git.openjdk.java.net/jdk/pull/8378


Re: RFR: 8285517: System.getenv() returns unexpected value if environment variable has non ASCII character [v5]

2022-05-06 Thread Ichiroh Takiguchi
> On JDK19 with Linux ja_JP.eucjp locale,
> System.getenv() returns unexpected value if environment variable has Japanese 
> EUC characters.
> It seems this issue happens because of JEP 400.
> Arguments for ProcessBuilder have same kind of issue.

Ichiroh Takiguchi has updated the pull request incrementally with one 
additional commit since the last revision:

  8285517: System.getenv() returns unexpected value if environment variable has 
non ASCII character

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/8378/files
  - new: https://git.openjdk.java.net/jdk/pull/8378/files/84e6d639..5bd8492f

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=8378=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=8378=03-04

  Stats: 103 lines in 4 files changed: 39 ins; 39 del; 25 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8378.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8378/head:pull/8378

PR: https://git.openjdk.java.net/jdk/pull/8378