On 8/18/21 2:03 PM, Simon Nash wrote:
> I am the developer of a fairly large application that uses file I/O
> extensively. In most cases, the charset should be UTF-8 and I have used
> an explicit charset parameter on all method invocations where this
> applies. In some cases, the charset needs to be the platform default
> charset to produce output that is readable by other programs or by a
> user (for example, Windows-1252 on some versions of Windows).
>
> In the cases that need the platform default charset, I have omitted the
> charset (intentionally, not carelessly or accidentally). If the
> behaviour changes in these cases, it will produce unexpected results for
> users.

In preparation for JEP 400, we have provided a new system property in
JDK 17, `native.encoding`, that exposes the native encoding name:
https://bugs.openjdk.java.net/browse/JDK-8265989
Apps that have the luxury of changing their code base can read the
native encoding from that property, then replace the I/O constructors
that take no charset argument with the equivalent ones that take an
explicit charset.
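
As a rough sketch of that approach (untested against any real
application, so treat it as a starting point only): read
`native.encoding`, and where the property is not set, fall back to
`Charset.defaultCharset()`, which on releases before JDK 18 is normally
the platform default unless `file.encoding` was overridden. The class
name `NativeCharset` and its methods are made up for illustration; only
the property name and the `java.io`/`java.nio.charset` calls are real
APIs, and all of them exist in JDK 8.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical helper, not part of the JDK.
final class NativeCharset {
    private NativeCharset() {}

    // Turn an encoding name into a Charset, using the fallback when the
    // name is missing, malformed, or not supported by this runtime.
    static Charset resolve(String encodingName, Charset fallback) {
        if (encodingName != null && !encodingName.isEmpty()) {
            try {
                return Charset.forName(encodingName);
            } catch (IllegalArgumentException e) {
                // IllegalCharsetNameException and UnsupportedCharsetException
                // both extend IllegalArgumentException; fall through.
            }
        }
        return fallback;
    }

    // Native charset where the runtime exposes native.encoding, otherwise
    // whatever Charset.defaultCharset() reports (assumption: on older
    // releases that is the platform default).
    static Charset get() {
        return resolve(System.getProperty("native.encoding"),
                       Charset.defaultCharset());
    }

    // Explicit-charset replacement for new FileWriter(file).
    static Writer newWriter(File file) throws IOException {
        return new OutputStreamWriter(new FileOutputStream(file), get());
    }
}
```

A call such as `new FileWriter(file)` would then become
`NativeCharset.newWriter(file)`. The stream-based form is used because
the `FileWriter` constructors that accept a `Charset` only arrived in
JDK 11.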

> I could try to find all the method invocations that currently use the
> implicit default charset, although I have no idea how to do this other
> than reading every line of code. The problems with this are 1) that I
> would almost certainly miss some invocations that need to be changed and
> 2) more seriously, from what I have seen in the JEP I don't think there
> is a way to update these method invocations that works exactly as at
> present on all versions of Java back to JDK 8 and provides the same
> behaviour on JDK 18. This is because (as far as I can tell) there is no
> API call that returns the "old-style" platform default charset and can
> be used on all JDK versions from JDK 8 to JDK 18. Adding a -D option
> when the application is started isn't possible in some contexts such as
> launching the application from a Windows executable jar file association.
>
> If I could make a single API call when the application is first started
> to force backward-compatible behaviour in all cases, this would solve
> the problem. This feels very much like a "hack" and I would much prefer
> a clean solution but it would be better than nothing.

I am reluctant to provide a single API call that has the same effect as
`COMPAT`, as it will become less meaningful once UTF-8 as the default
sinks in.
Naoto