Re: RFR: 8260265: UTF-8 by Default [v2]

2021-07-14 Thread Roger Riggs
On Wed, 14 Jul 2021 20:53:34 GMT, Naoto Sato  wrote:

>> src/java.base/share/classes/java/io/FileReader.java line 41:
>> 
>>> 39:  * @see InputStreamReader
>>> 40:  * @see FileInputStream
>>> 41:  * @see java.nio.charset.Charset#defaultCharset()
>> 
>> The @ see duplicates the link above, the javadoc can do without the @ see.
>
> If I remove that `@see`, I don't see the link in `See Also` section. Am I 
> missing something?

In my view the @ linkplain is sufficient to allow the reader to navigate; but 
YMMV.

>> src/java.base/share/classes/java/lang/System.java line 802:
>> 
>>> 800:  * {@systemProperty file.encoding}
>>> 801:  * The name of the default charset. Users may specify
>>> 802:  * {@code UTF-8} or {@code COMPAT} on the command line to the 
>>> value.
>> 
>> The wording could imply that only those two values can be supplied.
>> It could be rephrased to say that *if* the property is supplied on the 
>> command line
>> it overrides the default UTF-8.
>
> That was intentional. Only those two are supported, others continue to work 
> as before (but not supported).

Still it leaves an uncomfortable feeling, perhaps remedied by an "other values 
have unspecified behavior"
or the "other values are implementation specific".

-

PR: https://git.openjdk.java.net/jdk/pull/4733


Re: RFR: 8260265: UTF-8 by Default [v2]

2021-07-14 Thread Naoto Sato
On Wed, 14 Jul 2021 16:20:28 GMT, Jesse Glick 
 wrote:

>> src/java.base/share/classes/java/io/PrintStream.java line 49:
>> 
>>> 47:  *  All characters printed by a {@code PrintStream} are converted 
>>> into
>>> 48:  * bytes using the given encoding or charset, or the default
>>> 49:  * console charset if not specified.
>> 
>> JEP 400 doesn't give a rationale for using the console charset for 
>> PrintStream.
>> PrintStreams are used for output to files and other media other than just a 
>> tty/console.
>> The charset of system.out/err should use the console charset.
>
> This was my thinking in 
> https://github.com/openjdk/jdk/pull/4733#issuecomment-876793372.

OK, I am now conviced. Modified not to default to Console.charset() for generic 
PrintStream w/o charset constructor.

-

PR: https://git.openjdk.java.net/jdk/pull/4733


Re: RFR: 8260265: UTF-8 by Default [v2]

2021-07-14 Thread Naoto Sato
> This is an implementation for the `JEP 400: UTF-8 by Default`. The gist of 
> the changes is `Charset.defaultCharset()` returning `UTF-8` and 
> `file.encoding` system property being added in the spec, but another notable 
> modification is in `java.io.PrintStream` where it continues to use the 
> `Console` encoding as the default charset instead of `UTF-8`. Other changes 
> are mostly clarification of the term "default charset" and their links. 
> Corresponding CSR has also been drafted.
> 
> JEP 400: https://bugs.openjdk.java.net/browse/JDK-8187041
> CSR: https://bugs.openjdk.java.net/browse/JDK-8260266

Naoto Sato has updated the pull request with a new target base due to a merge 
or a rebase. The incremental webrev excludes the unrelated changes brought in 
by the merge/rebase. The pull request contains three additional commits since 
the last revision:

 - Reflects review comments
 - Merge branch 'master' into JDK-8260265
 - 8260265: UTF-8 by Default

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4733/files
  - new: https://git.openjdk.java.net/jdk/pull/4733/files/107210cf..981200f7

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=4733=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=4733=00-01

  Stats: 6434 lines in 314 files changed: 3877 ins; 1452 del; 1105 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4733.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4733/head:pull/4733

PR: https://git.openjdk.java.net/jdk/pull/4733


Re: RFR: 8260265: UTF-8 by Default [v2]

2021-07-14 Thread Naoto Sato
On Wed, 14 Jul 2021 14:55:39 GMT, Roger Riggs  wrote:

>> Naoto Sato has updated the pull request with a new target base due to a 
>> merge or a rebase. The incremental webrev excludes the unrelated changes 
>> brought in by the merge/rebase. The pull request contains three additional 
>> commits since the last revision:
>> 
>>  - Reflects review comments
>>  - Merge branch 'master' into JDK-8260265
>>  - 8260265: UTF-8 by Default
>
> src/java.base/share/classes/java/io/Console.java line 587:
> 
>> 585: try {
>> 586: cs = Charset.forName(csname);
>> 587: } catch (Exception ignored) { }
> 
> A separate enhancement...
> I've long thought that should be a way to avoid the exception here.
> For example,  a Charset.forName(csname, default);
> The caller might have a default in mind or supply null and then be able to 
> test for null.

Agreed. Will file an RFE for this.

> src/java.base/share/classes/java/io/FileReader.java line 41:
> 
>> 39:  * @see InputStreamReader
>> 40:  * @see FileInputStream
>> 41:  * @see java.nio.charset.Charset#defaultCharset()
> 
> The @ see duplicates the link above, the javadoc can do without the @ see.

If I remove that `@see`, I don't see the link in `See Also` section. Am I 
missing something?

> src/java.base/share/classes/java/lang/System.java line 802:
> 
>> 800:  * {@systemProperty file.encoding}
>> 801:  * The name of the default charset. Users may specify
>> 802:  * {@code UTF-8} or {@code COMPAT} on the command line to the 
>> value.
> 
> The wording could imply that only those two values can be supplied.
> It could be rephrased to say that *if* the property is supplied on the 
> command line
> it overrides the default UTF-8.

That was intentional. Only those two are supported, others continue to work as 
before (but not supported).

> src/java.base/share/classes/java/nio/charset/Charset.java line 601:
> 
>> 599:  * value designates {@code COMPAT}, the default charset is derived 
>> from
>> 600:  * the {@code native.encoding} system property, which typically 
>> depends
>> 601:  * upon the locale and charset of the underlying operating system.
> 
> The description in java.lang.System of the file.encoding property does not 
> indicate it is 'implementation specific'.
> In that context, it appears to be part of the JavaSE spec.
> Having the spec in a single place with references to it from others could 
> avoid duplication.

`file.encoding` is listed under `@implNote` tag in `System.getProperties()`, so 
it is implementation-specific.

-

PR: https://git.openjdk.java.net/jdk/pull/4733