Re: RFR: 8260265: UTF-8 by Default [v2]
On Wed, 14 Jul 2021 20:53:34 GMT, Naoto Sato wrote: >> src/java.base/share/classes/java/io/FileReader.java line 41: >> >>> 39: * @see InputStreamReader >>> 40: * @see FileInputStream >>> 41: * @see java.nio.charset.Charset#defaultCharset() >> >> The @ see duplicates the link above, the javadoc can do without the @ see. > > If I remove that `@see`, I don't see the link in `See Also` section. Am I > missing something? In my view the @ linkplain is sufficient to allow the reader to navigate; but YMMV. >> src/java.base/share/classes/java/lang/System.java line 802: >> >>> 800: * {@systemProperty file.encoding} >>> 801: * The name of the default charset. Users may specify >>> 802: * {@code UTF-8} or {@code COMPAT} on the command line to the >>> value. >> >> The wording could imply that only those two values can be supplied. >> It could be rephrased to say that *if* the property is supplied on the >> command line >> it overrides the default UTF-8. > > That was intentional. Only those two are supported, others continue to work > as before (but not supported). Still it leaves an uncomfortable feeling, perhaps remedied by an "other values have unspecified behavior" or the "other values are implementation specific". - PR: https://git.openjdk.java.net/jdk/pull/4733
Re: RFR: 8260265: UTF-8 by Default [v2]
On Wed, 14 Jul 2021 16:20:28 GMT, Jesse Glick wrote: >> src/java.base/share/classes/java/io/PrintStream.java line 49: >> >>> 47: * All characters printed by a {@code PrintStream} are converted >>> into >>> 48: * bytes using the given encoding or charset, or the default >>> 49: * console charset if not specified. >> >> JEP 400 doesn't give a rationale for using the console charset for >> PrintStream. >> PrintStreams are used for output to files and other media other than just a >> tty/console. >> The charset of system.out/err should use the console charset. > > This was my thinking in > https://github.com/openjdk/jdk/pull/4733#issuecomment-876793372. OK, I am now conviced. Modified not to default to Console.charset() for generic PrintStream w/o charset constructor. - PR: https://git.openjdk.java.net/jdk/pull/4733
Re: RFR: 8260265: UTF-8 by Default [v2]
> This is an implementation for the `JEP 400: UTF-8 by Default`. The gist of > the changes is `Charset.defaultCharset()` returning `UTF-8` and > `file.encoding` system property being added in the spec, but another notable > modification is in `java.io.PrintStream` where it continues to use the > `Console` encoding as the default charset instead of `UTF-8`. Other changes > are mostly clarification of the term "default charset" and their links. > Corresponding CSR has also been drafted. > > JEP 400: https://bugs.openjdk.java.net/browse/JDK-8187041 > CSR: https://bugs.openjdk.java.net/browse/JDK-8260266 Naoto Sato has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Reflects review comments - Merge branch 'master' into JDK-8260265 - 8260265: UTF-8 by Default - Changes: - all: https://git.openjdk.java.net/jdk/pull/4733/files - new: https://git.openjdk.java.net/jdk/pull/4733/files/107210cf..981200f7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=4733=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk=4733=00-01 Stats: 6434 lines in 314 files changed: 3877 ins; 1452 del; 1105 mod Patch: https://git.openjdk.java.net/jdk/pull/4733.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4733/head:pull/4733 PR: https://git.openjdk.java.net/jdk/pull/4733
Re: RFR: 8260265: UTF-8 by Default [v2]
On Wed, 14 Jul 2021 14:55:39 GMT, Roger Riggs wrote: >> Naoto Sato has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contains three additional >> commits since the last revision: >> >> - Reflects review comments >> - Merge branch 'master' into JDK-8260265 >> - 8260265: UTF-8 by Default > > src/java.base/share/classes/java/io/Console.java line 587: > >> 585: try { >> 586: cs = Charset.forName(csname); >> 587: } catch (Exception ignored) { } > > A separate enhancement... > I've long thought that should be a way to avoid the exception here. > For example, a Charset.forName(csname, default); > The caller might have a default in mind or supply null and then be able to > test for null. Agreed. Will file an RFE for this. > src/java.base/share/classes/java/io/FileReader.java line 41: > >> 39: * @see InputStreamReader >> 40: * @see FileInputStream >> 41: * @see java.nio.charset.Charset#defaultCharset() > > The @ see duplicates the link above, the javadoc can do without the @ see. If I remove that `@see`, I don't see the link in `See Also` section. Am I missing something? > src/java.base/share/classes/java/lang/System.java line 802: > >> 800: * {@systemProperty file.encoding} >> 801: * The name of the default charset. Users may specify >> 802: * {@code UTF-8} or {@code COMPAT} on the command line to the >> value. > > The wording could imply that only those two values can be supplied. > It could be rephrased to say that *if* the property is supplied on the > command line > it overrides the default UTF-8. That was intentional. Only those two are supported, others continue to work as before (but not supported). > src/java.base/share/classes/java/nio/charset/Charset.java line 601: > >> 599: * value designates {@code COMPAT}, the default charset is derived >> from >> 600: * the {@code native.encoding} system property, which typically >> depends >> 601: * upon the locale and charset of the underlying operating system. > > The description in java.lang.System of the file.encoding property does not > indicate it is 'implementation specific'. > In that context, it appears to be part of the JavaSE spec. > Having the spec in a single place with references to it from others could > avoid duplication. `file.encoding` is listed under `@implNote` tag in `System.getProperties()`, so it is implementation-specific. - PR: https://git.openjdk.java.net/jdk/pull/4733