Re: RFR: 8316734: URLEncoder should specify that replacement bytes will be used in case of coding error [v2]

Alan Bateman Thu, 23 Nov 2023 22:46:29 -0800

On Thu, 23 Nov 2023 11:18:17 GMT, Darragh Clarke wrote:


>> Currently the descriptions of `URLEncoder.encode` and `URLDecoder.decode` 
>> don't specify their use of replacement bytes or replacement character when 
>> they cannot handle a character or sequence of bytes. This is longstanding 
>> behavior but needs to be documented.
>> 
>> **Solution**
>> - Added a new line to `URLEncoder.encode` API documentation to document that 
>> the charset's replacement bytes are used.
>> 
>> - Also changed `URLDecoder.decode` API documentation to document its use of 
>> the charset's replacement character, also changed some wording.
>
> Darragh Clarke has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - cleanup
>  - implemented feedback

src/java.base/share/classes/java/net/URLEncoder.java line 209:

> 207:      * <p>
> 208:      * If a character needs encoding but cannot be encoded, the
> 209:      * {@linkplain CharsetEncoder##cae replacement bytes} will be used.

I think this text will appear in the "Note" section of the method description. 
We are adding normative text, so I think it would be better if the new text went 
into the first paragraph, or into a new paragraph introduced before the "Note". 
We could replace the "Note" heading with `@apiNote` if you want to clean this up.

As regards the text, I think it would be more correct to say that if the input 
string is malformed, or if the input cannot be mapped to a valid byte sequence 
in the given charset, then the erroneous input will be replaced with the 
charset's replacement value.
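
To make the two cases concrete, here is a minimal sketch of the behavior being 
documented. The class and method names (`ReplacementDemo`, `encodeUnmappable`, 
and so on) are made up for illustration and are not part of the PR. It assumes 
the `Charset` overloads of `URLEncoder.encode` and `URLDecoder.decode`, and 
that the replacement for US-ASCII and UTF-8 encoding is the single byte `?`.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK or of this PR.
public class ReplacementDemo {

    // Unmappable input: U+20AC (euro sign) has no mapping in US-ASCII,
    // so the charset's replacement byte '?' (0x3F) is percent-encoded instead.
    static String encodeUnmappable() {
        return URLEncoder.encode("\u20AC", StandardCharsets.US_ASCII);
    }

    // Malformed input: a lone high surrogate is not a valid UTF-16 sequence,
    // so it is likewise replaced before percent-encoding.
    static String encodeMalformed() {
        return URLEncoder.encode("\uD800", StandardCharsets.UTF_8);
    }

    // Decoding side: the byte 0xFF is never valid in UTF-8, so the decoder
    // substitutes the replacement character U+FFFD.
    static String decodeMalformed() {
        return URLDecoder.decode("%FF", StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encodeUnmappable());   // %3F
        System.out.println(encodeMalformed());    // %3F
        // Print the code point rather than the character itself.
        System.out.printf("U+%04X%n", (int) decodeMalformed().charAt(0));
    }
}
```

Both encoder cases silently produce `%3F` rather than throwing, which is why I 
think the wording should cover malformed input as well as unmappable input.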

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16709#discussion_r1403991785
