RFR: 8282042: [testbug] FileEncodingTest.java depends on default encoding

Tyler Steele Thu, 17 Feb 2022 15:10:01 -0800

FileEncodingTest expects that, on all non-Windows platforms, 
`Charset.defaultCharset().name()` is US-ASCII when file.encoding is set to 
COMPAT. This assumption does not hold on AIX, where it is ISO-8859-1.


According to [JEP-400](https://openjdk.java.net/jeps/400), we should expect  
`Charset.defaultCharset().name()` to equal 
`System.getProperty("native.encoding")` whenever the COMPAT flag is set. From 
JEP-400: "... if file.encoding is set to COMPAT on the command line, then the 
run-time value of file.encoding will be the same as the run-time value of 
native.encoding...". So one way to resolve this is to choose the value for each 
system from the native.encoding property.
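As a rough illustration of that approach, here is a minimal sketch of how the expected value could be derived. This is not the code in the PR: the class `CompatExpectation` and the helper `expectedName` are hypothetical names, and the sketch only handles the two settings JEP-400 specifies (UTF-8 and COMPAT).

```java
import java.nio.charset.Charset;

class CompatExpectation {
    // Hypothetical helper: the charset name a test would expect for a given
    // -Dfile.encoding setting. Assumption: anything other than COMPAT is
    // treated as the JEP-400 default, UTF-8.
    static String expectedName(String fileEncodingSetting, String nativeEncoding) {
        if ("COMPAT".equals(fileEncodingSetting)) {
            // JEP-400: with COMPAT, file.encoding takes the value of native.encoding.
            return nativeEncoding;
        }
        return "UTF-8";
    }

    public static void main(String[] args) {
        // The setting under test is passed as the first argument (an assumption
        // of this sketch, not necessarily how FileEncodingTest receives it).
        String setting = args.length > 0 ? args[0] : "UTF-8";
        // native.encoding may be null on JDKs that predate the property.
        String nativeEncoding = System.getProperty("native.encoding");
        String expected = expectedName(setting, nativeEncoding);
        String actual = Charset.defaultCharset().name();
        System.out.println("Default Charset: " + actual + ", expected: " + expected);
        System.out.println("Match: " + actual.equals(expected));
    }
}
```

Note that this compares the two strings exactly, so it only passes when native.encoding already holds the same spelling that `Charset.name()` reports.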

With these changes, however, the test still fails on my test systems:

- AIX reports: Default Charset: ISO-8859-1, expected: ISO8859-1
- Linux/Z reports: Default Charset: US-ASCII, expected: ANSI_X3.4-1968
- Linux/PowerLE reports: Default Charset: US-ASCII, expected: ANSI_X3.4-1968

Note that the expected value is populated from native.encoding.

This implies there is more work to be done. It looks to me as though 
java_props_md.c may need to be modified so that the native.encoding System 
property holds [canonical 
names](http://www.iana.org/assignments/character-sets). 
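To make the mismatch concrete, the sketch below shows that the names reported by native.encoding in the failures above are aliases that `Charset.forName` resolves to the same canonical names `Charset.defaultCharset().name()` returns. This is only an illustration of the naming gap, done on the Java side; it is not the java_props_md.c change discussed here, and the class `CanonicalCharsetName` and helper `canonicalName` are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

class CanonicalCharsetName {
    // Hypothetical helper: map a charset name or alias to the canonical name
    // Java uses for it. Names Java cannot resolve are returned unchanged.
    static String canonicalName(String name) {
        if (name == null) {
            return null; // Charset.forName rejects null, so pass it through
        }
        try {
            return Charset.forName(name).name();
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return name;
        }
    }

    public static void main(String[] args) {
        // Values taken from the failure reports above.
        String[] reported = { "ISO8859-1", "ANSI_X3.4-1968" };
        for (String name : reported) {
            System.out.println(name + " -> " + canonicalName(name));
        }
    }
}
```

With such a lookup, "ISO8859-1" resolves to "ISO-8859-1" and "ANSI_X3.4-1968" resolves to "US-ASCII", which matches the default charset names in the reports. Normalizing in the test would make it pass, but it would also hide whether native.encoding itself holds canonical names, so it is a workaround rather than a fix.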

---

A tempting alternative is to set the expected value for AIX to "ISO-8859-1" in 
the test explicitly, as was done for the Windows expected encoding prior to 
this proposed change. The main advantage to this alternative is that it is 
quick and easy, but the disadvantages are that it fails to test that COMPAT 
behaves as specified in JEP-400, and the approach does not scale well if it 
happens that other systems require other cases. I wonder if this is the reason 
non-English locales are excluded by the test.

Proceeding with this change and the work implied by the new failures it 
highlights goes beyond the scope of what I thought was a simple testbug. So I'm 
opening this up for some comments before proceeding down the rabbit hole of 
further changes. If there is generally positive support for this direction, I'm 
happy to make the modifications necessary to populate native.encoding with 
canonical names. As I am new to OpenJDK, I am especially looking to ensure that 
changing the value returned by native.encoding will not have unintended 
consequences elsewhere in the project.

-------------

Commit messages:
 - Changes FileEncodingTest test to delegate behaviour of 
-Dfile.encoding=COMPAT to

Changes: https://git.openjdk.java.net/jdk/pull/7525/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7525&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8282042
  Stats: 8 lines in 1 file changed: 0 ins; 4 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7525.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7525/head:pull/7525

PR: https://git.openjdk.java.net/jdk/pull/7525
