Re: RFR: 8265989: System property for the native character encoding name [v3]

2021-05-03 Thread Roger Riggs
On Fri, 30 Apr 2021 22:10:21 GMT, Naoto Sato  wrote:

>> After some internal discussion, we thought it was good to expose the native 
>> environment's default character encoding, which Charset.defaultCharset() is 
>> currently based on. This way applications will have a better migration path 
>> after the [JEP 400](https://openjdk.java.net/jeps/400) is implemented, in 
>> which Charset.defaultCharset() will return UTF-8, but the value of this new 
>> system property will remain intact. A 
>> [CSR](https://bugs.openjdk.java.net/browse/JDK-8266075) has been filed with 
>> more detailed information.
>
> Naoto Sato has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Fixed a typo.

Marked as reviewed by rriggs (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name [v3]

2021-05-03 Thread Iris Clark
On Fri, 30 Apr 2021 22:10:21 GMT, Naoto Sato  wrote:

>> After some internal discussion, we thought it was good to expose the native 
>> environment's default character encoding, which Charset.defaultCharset() is 
>> currently based on. This way applications will have a better migration path 
>> after the [JEP 400](https://openjdk.java.net/jeps/400) is implemented, in 
>> which Charset.defaultCharset() will return UTF-8, but the value of this new 
>> system property will remain intact. A 
>> [CSR](https://bugs.openjdk.java.net/browse/JDK-8266075) has been filed with 
>> more detailed information.
>
> Naoto Sato has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Fixed a typo.

Marked as reviewed by iris (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name [v3]

2021-04-30 Thread Naoto Sato
> After some internal discussion, we thought it was good to expose the native 
> environment's default character encoding, which Charset.defaultCharset() is 
> currently based on. This way applications will have a better migration path 
> after the [JEP 400](https://openjdk.java.net/jeps/400) is implemented, in 
> which Charset.defaultCharset() will return UTF-8, but the value of this new 
> system property will remain intact. A 
> [CSR](https://bugs.openjdk.java.net/browse/JDK-8266075) has been filed with 
> more detailed information.

Naoto Sato has updated the pull request incrementally with one additional 
commit since the last revision:

  Fixed a typo.

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3777/files
  - new: https://git.openjdk.java.net/jdk/pull/3777/files/e76ff410..01b2ffc3

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=3777=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=3777=01-02

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3777.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3777/head:pull/3777

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name [v2]

2021-04-30 Thread Naoto Sato
On Fri, 30 Apr 2021 21:09:36 GMT, Roger Riggs  wrote:

> To support the statement that changing the property has no effect.
> Please add it to the jdk.internal.util.StaticProperties cached values and an 
> internal access method.

Thanks. Added.

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name [v2]

2021-04-30 Thread Naoto Sato
> After some internal discussion, we thought it was good to expose the native 
> environment's default character encoding, which Charset.defaultCharset() is 
> currently based on. This way applications will have a better migration path 
> after the [JEP 400](https://openjdk.java.net/jeps/400) is implemented, in 
> which Charset.defaultCharset() will return UTF-8, but the value of this new 
> system property will remain intact. A 
> [CSR](https://bugs.openjdk.java.net/browse/JDK-8266075) has been filed with 
> more detailed information.

Naoto Sato has updated the pull request incrementally with one additional 
commit since the last revision:

  Added internal getter for native.encoding

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3777/files
  - new: https://git.openjdk.java.net/jdk/pull/3777/files/fa5246c8..e76ff410

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=3777=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=3777=00-01

  Stats: 16 lines in 1 file changed: 15 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3777.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3777/head:pull/3777

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-30 Thread Roger Riggs
On Wed, 28 Apr 2021 22:24:31 GMT, Naoto Sato  wrote:

> After some internal discussion, we thought it was good to expose the native 
> environment's default character encoding, which Charset.defaultCharset() is 
> currently based on. This way applications will have a better migration path 
> after the [JEP 400](https://openjdk.java.net/jeps/400) is implemented, in 
> which Charset.defaultCharset() will return UTF-8, but the value of this new 
> system property will remain intact. A 
> [CSR](https://bugs.openjdk.java.net/browse/JDK-8266075) has been filed with 
> more detailed information.

To support the statement that changing the property has no effect.
Please add it to the jdk.internal.util.StaticProperties cached values and an 
internal access method.

Otherwise looks good.

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-29 Thread Joe Wang
On Wed, 28 Apr 2021 22:24:31 GMT, Naoto Sato  wrote:

> After some internal discussion, we thought it was good to expose the native 
> environment's default character encoding, which Charset.defaultCharset() is 
> currently based on. This way applications will have a better migration path 
> after the [JEP 400](https://openjdk.java.net/jeps/400) is implemented, in 
> which Charset.defaultCharset() will return UTF-8, but the value of this new 
> system property will remain intact. A 
> [CSR](https://bugs.openjdk.java.net/browse/JDK-8266075) has been filed with 
> more detailed information.

Marked as reviewed by joehw (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-29 Thread Joe Wang
On Thu, 29 Apr 2021 13:39:58 GMT, Roger Riggs  wrote:

>> Thanks, Joe and Iris. I agree with Iris and that's the reason I chose the 
>> description. System properties are inherently mutable. There are some 
>> "protected" ones, by that I mean a private copy is made just after 
>> initialization for VM use, which is not affected by later setProperty() 
>> calls. But I don't think this new property should be treated as such.
>
> The value of native.encoding should be a static property; that is, read once 
> at startup and later modification
> does not change the behavior of the Java runtime.
> 
> Whereas the value of native.encoding is derived from the value of native 
> variables and those native
> values do not change while Java is running, I think the behavior of the Java 
> runtime should stay the same.
> Unless it is static, the Java runtime will need to read the property every 
> time it is needed and behavior can
> change from call to call and actions in one thread can affect other threads.
> Is there a use case where the application needs to change the encoding for 
> every use in the Java runtime
> independently of the native values.
> Such an application should be explicit about its charset requirements and use 
> APIs to select them explicitly.

Ok, understandable, and having a statement to clearly say 'no effect' would be 
indeed helpful. While working on methods with a charset parameter some time 
ago, I remember reading some user discussions about unable to programmatically 
change the default encoding, a confusion mostly as the user attempted to set 
encoding, e.g. System.setProperty("file.encoding", encoding), it seems that 
"the property is set (meaning no error), but it did not have the desired 
effect". Then there was also a mention in some tutorial where file.encoding and 
sun.jnu.encoding were recognized as read-only. The problem or confusion was 
that it appeared there’s such an API and the operation was successful, but it 
really didn’t have the effect.

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-29 Thread Naoto Sato
On Wed, 28 Apr 2021 22:24:31 GMT, Naoto Sato  wrote:

> After some internal discussion, we thought it was good to expose the native 
> environment's default character encoding, which Charset.defaultCharset() is 
> currently based on. This way applications will have a better migration path 
> after the [JEP 400](https://openjdk.java.net/jeps/400) is implemented, in 
> which Charset.defaultCharset() will return UTF-8, but the value of this new 
> system property will remain intact. A 
> [CSR](https://bugs.openjdk.java.net/browse/JDK-8266075) has been filed with 
> more detailed information.

Unless there is some use case other than for migration, I would not introduce a 
static method as Alan mentioned.

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-29 Thread Alan Bateman
On Thu, 29 Apr 2021 14:12:29 GMT, Maurizio Cimadamore  
wrote:

> Naive question: any reason as to why we're not providing a new static API 
> method in Charset to return the platform encoder? This initially will return 
> same thing as Charset.defaultEncoder - but as JEP 400 is delivered the two 
> will diverge. Any reason as to why we don't want to expose the platform 
> encoder in the API?

Agree an API would be better.  We've looked at it a few times but have concerns 
that it would be confusing to most developers, esp. if it were another static 
method on Charset.  System and Runtime have been examined too. The proposal 
here doesn't preclude adding an API in the future, it's mostly a question on 
whether it is needed and where would it be defined.

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-29 Thread Maurizio Cimadamore
On Thu, 29 Apr 2021 14:08:36 GMT, Maurizio Cimadamore  
wrote:

> Naive question: any reason as to why we're not providing a new static API 
> method in Charset to return the platform encoder? This initially will return 
> same thing as Charset.defaultEncoder - but as JEP 400 is delivered the two 
> will diverge. Any reason as to why we don't want to expose the platform 
> encoder in the API?

Ok, I see this is addressed in the CSR:


We converged on a system property out of concern that an API method for the 
native encoding would be confusing for many developers.


The problem with this approach is that I think clients that need the platform 
encoder will have to stash it somewhere in a static field (to prevent reading 
property and parsing value over and over). It might also be harder for javadoc 
(I'm thinking of some of the Panama API) to express a dependency on it.

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-29 Thread Maurizio Cimadamore
On Wed, 28 Apr 2021 22:24:31 GMT, Naoto Sato  wrote:

> After some internal discussion, we thought it was good to expose the native 
> environment's default character encoding, which Charset.defaultCharset() is 
> currently based on. This way applications will have a better migration path 
> after the [JEP 400](https://openjdk.java.net/jeps/400) is implemented, in 
> which Charset.defaultCharset() will return UTF-8, but the value of this new 
> system property will remain intact. A 
> [CSR](https://bugs.openjdk.java.net/browse/JDK-8266075) has been filed with 
> more detailed information.

Naive question: any reason as to why we're not providing a new static API 
method in Charset to return the platform encoder? This initially will return 
same thing as Charset.defaultEncoder - but as JEP 400 is delivered the two will 
diverge. Any reason as to why we don't want to expose the platform encoder in 
the API?

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-29 Thread Naoto Sato
On Thu, 29 Apr 2021 13:23:42 GMT, Alan Bateman  wrote:

>> `native.encoding` preserves the encoding that current 
>> `Charset.defaultCharset()` is returning, which is based on `file.encoding`. 
>> So I believe the current implementation is correct. If it is biased toward 
>> `sun.jnu.encoding`, it would be problematic especially on Windows where that 
>> represents `system locale`.
>
> Okay for now but the value of file.encoding will change to "UTF-8" so will be 
> different to native.encoding.

Yes. That will be covered by JEP400.

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-29 Thread Roger Riggs
On Thu, 29 Apr 2021 13:06:35 GMT, Naoto Sato  wrote:

>> I suspect that if setProperty("native.encoding", "foo") succeeds, then it 
>> will return "foo".   I also believe that a later invocation of 
>> getProperty("native.encoding") will also return "foo".  If that's the case, 
>> then I don't think that the "read-only" alternative phrasing is correct.  To 
>> me, the alternative suggests that an error will be return if there is an 
>> attempt to set it or that the potential new value will be ignored.  The "no 
>> effect" phrasing avoids this problem.   I also suspect that the "no effect" 
>> phrasing was selected to align with the apiNote in lines 719-721.
>
> Thanks, Joe and Iris. I agree with Iris and that's the reason I chose the 
> description. System properties are inherently mutable. There are some 
> "protected" ones, by that I mean a private copy is made just after 
> initialization for VM use, which is not affected by later setProperty() 
> calls. But I don't think this new property should be treated as such.

The value of native.encoding should be a static property; that is, read once at 
startup and later modification
does not change the behavior of the Java runtime.

Whereas the value of native.encoding is derived from the value of native 
variables and those native
values do not change while Java is running, I think the behavior of the Java 
runtime should stay the same.
Unless it is static, the Java runtime will need to read the property every time 
it is needed and behavior can
change from call to call and actions in one thread can affect other threads.
Is there a use case where the application needs to change the encoding for 
every use in the Java runtime
independently of the native values.
Such an application should be explicit about its charset requirements and use 
APIs to select them explicitly.

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-29 Thread Alan Bateman
On Thu, 29 Apr 2021 13:11:53 GMT, Naoto Sato  wrote:

>> src/java.base/share/classes/jdk/internal/util/SystemProps.java line 69:
>> 
>>> 67: ? raw.propDefault(Raw._sun_jnu_encoding_NDX)
>>> 68: : raw.propDefault(Raw._file_encoding_NDX));
>>> 69: put(props, "native.encoding", nativeEncoding);
>> 
>> Shouldn't native.encoding be biased toward sun.jnu.encoding rather than 
>> file.encoding? Or maybe you'll change it when preparing the changes for JEP 
>> 400?
>
> `native.encoding` preserves the encoding that current 
> `Charset.defaultCharset()` is returning, which is based on `file.encoding`. 
> So I believe the current implementation is correct. If it is biased toward 
> `sun.jnu.encoding`, it would be problematic especially on Windows where that 
> represents `system locale`.

Okay for now but the value of file.encoding will change to "UTF-8" so will be 
different to native.encoding.

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-29 Thread Naoto Sato
On Thu, 29 Apr 2021 07:17:26 GMT, Alan Bateman  wrote:

>> After some internal discussion, we thought it was good to expose the native 
>> environment's default character encoding, which Charset.defaultCharset() is 
>> currently based on. This way applications will have a better migration path 
>> after the [JEP 400](https://openjdk.java.net/jeps/400) is implemented, in 
>> which Charset.defaultCharset() will return UTF-8, but the value of this new 
>> system property will remain intact. A 
>> [CSR](https://bugs.openjdk.java.net/browse/JDK-8266075) has been filed with 
>> more detailed information.
>
> src/java.base/share/classes/jdk/internal/util/SystemProps.java line 69:
> 
>> 67: ? raw.propDefault(Raw._sun_jnu_encoding_NDX)
>> 68: : raw.propDefault(Raw._file_encoding_NDX));
>> 69: put(props, "native.encoding", nativeEncoding);
> 
> Shouldn't native.encoding be biased toward sun.jnu.encoding rather than 
> file.encoding? Or maybe you'll change it when preparing the changes for JEP 
> 400?

`native.encoding` preserves the encoding that current 
`Charset.defaultCharset()` is returning, which is based on `file.encoding`. So 
I believe the current implementation is correct. If it is biased toward 
`sun.jnu.encoding`, it would be problematic especially on Windows where that 
represents `system locale`.

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-29 Thread Naoto Sato
On Thu, 29 Apr 2021 05:38:21 GMT, Iris Clark  wrote:

>> src/java.base/share/classes/java/lang/System.java line 704:
>> 
>>> 702:  * {@systemProperty native.encoding}
>>> 703:  * Character encoding name derived from the host 
>>> environment and/or
>>> 704:  * the user's settings. Setting this system property has no 
>>> effect.
>> 
>> May be "This property is read-only" instead of "Setting this system property 
>> has no effect" to not confuse with "user's settings"?
>
> I suspect that if setProperty("native.encoding", "foo") succeeds, then it 
> will return "foo".   I also believe that a later invocation of 
> getProperty("native.encoding") will also return "foo".  If that's the case, 
> then I don't think that the "read-only" alternative phrasing is correct.  To 
> me, the alternative suggests that an error will be return if there is an 
> attempt to set it or that the potential new value will be ignored.  The "no 
> effect" phrasing avoids this problem.   I also suspect that the "no effect" 
> phrasing was selected to align with the apiNote in lines 719-721.

Thanks, Joe and Iris. I agree with Iris and that's the reason I chose the 
description. System properties are inherently mutable. There are some 
"protected" ones, by that I mean a private copy is made just after 
initialization for VM use, which is not affected by later setProperty() calls. 
But I don't think this new property should be treated as such.

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-29 Thread Alan Bateman
On Wed, 28 Apr 2021 22:24:31 GMT, Naoto Sato  wrote:

> After some internal discussion, we thought it was good to expose the native 
> environment's default character encoding, which Charset.defaultCharset() is 
> currently based on. This way applications will have a better migration path 
> after the [JEP 400](https://openjdk.java.net/jeps/400) is implemented, in 
> which Charset.defaultCharset() will return UTF-8, but the value of this new 
> system property will remain intact. A 
> [CSR](https://bugs.openjdk.java.net/browse/JDK-8266075) has been filed with 
> more detailed information.

src/java.base/share/classes/jdk/internal/util/SystemProps.java line 69:

> 67: ? raw.propDefault(Raw._sun_jnu_encoding_NDX)
> 68: : raw.propDefault(Raw._file_encoding_NDX));
> 69: put(props, "native.encoding", nativeEncoding);

Shouldn't native.encoding be biased toward sun.jnu.encoding rather than 
file.encoding? Or maybe you'll change it when preparing the changes for JEP 400?

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-28 Thread Iris Clark
On Wed, 28 Apr 2021 22:24:31 GMT, Naoto Sato  wrote:

> After some internal discussion, we thought it was good to expose the native 
> environment's default character encoding, which Charset.defaultCharset() is 
> currently based on. This way applications will have a better migration path 
> after the [JEP 400](https://openjdk.java.net/jeps/400) is implemented, in 
> which Charset.defaultCharset() will return UTF-8, but the value of this new 
> system property will remain intact. A 
> [CSR](https://bugs.openjdk.java.net/browse/JDK-8266075) has been filed with 
> more detailed information.

Marked as reviewed by iris (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-28 Thread Iris Clark
On Thu, 29 Apr 2021 00:37:37 GMT, Joe Wang  wrote:

>> After some internal discussion, we thought it was good to expose the native 
>> environment's default character encoding, which Charset.defaultCharset() is 
>> currently based on. This way applications will have a better migration path 
>> after the [JEP 400](https://openjdk.java.net/jeps/400) is implemented, in 
>> which Charset.defaultCharset() will return UTF-8, but the value of this new 
>> system property will remain intact. A 
>> [CSR](https://bugs.openjdk.java.net/browse/JDK-8266075) has been filed with 
>> more detailed information.
>
> src/java.base/share/classes/java/lang/System.java line 704:
> 
>> 702:  * {@systemProperty native.encoding}
>> 703:  * Character encoding name derived from the host 
>> environment and/or
>> 704:  * the user's settings. Setting this system property has no 
>> effect.
> 
> May be "This property is read-only" instead of "Setting this system property 
> has no effect" to not confuse with "user's settings"?

I suspect that if setProperty("native.encoding", "foo") succeeds, then it will 
return "foo".   I also believe that a later invocation of 
getProperty("native.encoding") will also return "foo".  If that's the case, 
then I don't think that the "read-only" alternative phrasing is correct.  To 
me, the alternative suggests that an error will be return if there is an 
attempt to set it or that the potential new value will be ignored.  The "no 
effect" phrasing avoids this problem.   I also suspect that the "no effect" 
phrasing was selected to align with the apiNote in lines 719-721.

-

PR: https://git.openjdk.java.net/jdk/pull/3777


Re: RFR: 8265989: System property for the native character encoding name

2021-04-28 Thread Joe Wang
On Wed, 28 Apr 2021 22:24:31 GMT, Naoto Sato  wrote:

> After some internal discussion, we thought it was good to expose the native 
> environment's default character encoding, which Charset.defaultCharset() is 
> currently based on. This way applications will have a better migration path 
> after the [JEP 400](https://openjdk.java.net/jeps/400) is implemented, in 
> which Charset.defaultCharset() will return UTF-8, but the value of this new 
> system property will remain intact. A 
> [CSR](https://bugs.openjdk.java.net/browse/JDK-8266075) has been filed with 
> more detailed information.

src/java.base/share/classes/java/lang/System.java line 704:

> 702:  * {@systemProperty native.encoding}
> 703:  * Character encoding name derived from the host environment 
> and/or
> 704:  * the user's settings. Setting this system property has no 
> effect.

May be "This property is read-only" instead of "Setting this system property 
has no effect" to not confuse with "user's settings"?

test/jdk/java/lang/System/PropertyTest.java line 83:

> 81: {"java.runtime.version"},
> 82: {"java.runtime.name"},
> 83: {"native.encoding"},

Does this test differentiate between read-only and modifiable properties? 
"native.encoding" looks like it's explicitly overridable. Or do we need a test 
to verify it's not overridable?

-

PR: https://git.openjdk.java.net/jdk/pull/3777