Re: RFR: 8316681: Rewrite URLEncoder.encode to use small reusable buffers [v4]

Alan Bateman Thu, 21 Sep 2023 23:28:55 -0700

On Thu, 21 Sep 2023 22:48:56 GMT, Claes Redestad <[email protected]> wrote:


>> `URLEncoder` currently appends chars that needs encoding into a 
>> `java.io.CharArrayWriter`, converts that to a `String`, uses 
>> `String::getBytes` to get the encoded bytes and then appends these bytes in 
>> a escaped manner to the output stream. This is somewhat inefficient.
>> 
>> This PR replaces the `CharArrayWriter` with a reusable `CharBuffer` + 
>> `ByteBuffer` pair. This allows us to encode to the output `StringBuilder` in 
>> small chunks, with greatly reduced allocation as a result.
>> 
>> The exact size of the buffers is an open question, but generally it seems 
>> that a tiny buffer wins by virtue of allocating less, and that the per chunk 
>> overheads are relatively small.
>
> Claes Redestad has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   REPLACE, add comment, add tests verifying that surrogate pairs at boundary 
> works.

src/java.base/share/classes/java/net/URLEncoder.java line 250:

> 248:         CharsetEncoder ce = charset.newEncoder()
> 249:                 .onMalformedInput(CodingErrorAction.REPLACE)
> 250:                 .onUnmappableCharacter(CodingErrorAction.REPLACE);

@dfuch The explicit use of the REPLACE action makes me wonder if the spec 
should be clarified at some point as it doesn't seem to be documented.

src/java.base/share/classes/java/net/URLEncoder.java line 296:

> 294:     }
> 295: 
> 296:     private static void flushToStringBuilder(StringBuilder out, 
> CharsetEncoder ce, CharBuffer cb, ByteBuffer bb, boolean endOfInput) {

This method would benefit from a comment to summarise the parameters, only 
because there are so many. 

Also just a formatting nit is that this line is much longer than everything 
else in the file so maybe side-by-side diffs a bit difficult to look at.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15865#discussion_r1333933819
PR Review Comment: https://git.openjdk.org/jdk/pull/15865#discussion_r1333933207

Re: RFR: 8316681: Rewrite URLEncoder.encode to use small reusable buffers [v4]

Reply via email to