On Tue, 3 Mar 2026 10:00:50 GMT, Eirik Bjørsnøs <[email protected]> wrote:

>> I have removed it, thanks.
>> 
>> The idea was to try to set expectations that this may perform better than 
>> `getBytes(cs).length` but is not guaranteed to, and in particular isn't 
>> guaranteed to avoid allocating the encoded `byte[]` to run in constant 
>> space. But "equivalent or better performance" already gives the 
>> implementation latitude to fall back to `getBytes(cs).length`.
>
>> The idea was to try to set expectations that this may perform better than 
>> `getBytes(cs).length` but is not guaranteed to
> 
> 
>  * @apiNote This method provides equivalent or better performance than..
> 
> 
> This may be read as a strong promise that performance will always be better 
> which may be a difficult promise to keep.
> 
> Should we consider saying something like "may offer better performance than" 
> instead?

Thanks, ideas for better language are welcome. I think ideally readers would 
come away with the understanding that they should always prefer this method to 
calling `getBytes(cs).length`.

> This may be read as a strong promise that performance will always be better 
> which may be a difficult promise to keep.

Thinking about this a bit more, I think it should usually be possible to keep 
that promise, since an implementation can always fall back to using 
`getBytes(cs).length` for a particular charset. But I guess once the 
implementation of `encodedLength` diverges from `getBytes(cs).length`, it's 
conceivable that a compiler could do a better job of optimizing 
`getBytes(cs).length`, especially if it's able to do escape analysis and 
eliminate the array.

And also, technically, for the case where `encodedLength` falls back to 
`getBytes(cs).length` it's slightly _worse_, because it's doing 
`getBytes(cs).length` plus a bit of additional work to check whether the 
provided charset is one of the ones it can optimize.
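
To make the fallback case concrete, here is a rough sketch of the shape I have 
in mind. It is not the code in the PR: `EncodedLengthSketch`, the static 
`encodedLength(String, Charset)` helper and `utf8Length` are hypothetical names 
used only for illustration, and it assumes UTF-8 is the only charset with a 
fast path.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public final class EncodedLengthSketch {
    private EncodedLengthSketch() {}

    /** Hypothetical stand-in for the proposed method; not the PR's implementation. */
    public static int encodedLength(String s, Charset cs) {
        // The "additional work": a charset check that runs on every call.
        if (StandardCharsets.UTF_8.equals(cs)) {
            return utf8Length(s);
        }
        // Fallback: same allocation as getBytes(cs).length, plus the check above.
        return s.getBytes(cs).length;
    }

    /** Counts UTF-8 bytes without allocating the encoded byte[]. */
    private static int utf8Length(String s) {
        long len = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                len += 1;
            } else if (c < 0x800) {
                len += 2;
            } else if (Character.isHighSurrogate(c)
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                len += 4; // valid surrogate pair encodes as one 4-byte sequence
                i++;
            } else if (Character.isSurrogate(c)) {
                len += 1; // unpaired surrogate: getBytes substitutes a single '?'
            } else {
                len += 3;
            }
        }
        // Assumption for this sketch: fail loudly if the length overflows an int.
        return Math.toIntExact(len);
    }
}
```

For any charset other than UTF-8 this does strictly more work than calling 
`getBytes(cs).length` directly, which is the case described above.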

I went looking for some other discussions of performance in javadoc for 
inspiration:

`String#lines`

> This method provides better performance than split("\R") by supplying 
> elements lazily and by faster search of new line terminators.

`Boolean(boolean)` (and other boxed primitive constructors)

> The static factory {@link #valueOf(boolean)} is generally a better choice, as 
> it is likely to yield significantly better space and time performance.

So perhaps something like

> This method is generally a better choice than calling {@link 
> #getBytes(Charset) getBytes(cs).length}, as it may provide better space and 
> time performance.

or

> This method may provide better space and time performance than {@link 
> #getBytes(Charset) getBytes(cs).length}.

What do you think?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28454#discussion_r2877435711
