Re: RFR: 8372353: API to compute the byte length of a String encoded in a given Charset [v17]

Eirik Bjørsnøs Mon, 09 Feb 2026 13:36:03 -0800

On Fri, 6 Feb 2026 14:08:07 GMT, Liam Miller-Cushon <[email protected]> wrote:


> The main reason `getEncodedLength` wasn't used is that it doesn't make it 
> clear that the unit of length is bytes. For UTF-8 a byte length is intuitive, 
> for e.g. UTF-16 or UTF-32 the "encoded length" could also be the count of 
> int16 (number of wchar_t) or int32.

Emphasizing the unit of measurement is a laudable goal. I just feel that in 
this case it obscures what is being computed.

What’s computed here is the *encoded* length; the unit of measurement seems a 
secondary concern.

A String does not intrinsically have a «byte length»; the concept seems 
meaningful only in relation to an encoding operation.
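
To illustrate, here is a minimal sketch using only the existing 
`String.getBytes(Charset)` API. The class name `EncodedLengthDemo` and the 
helper `encodedLengthInBytes` are made up for this example and are not the 
names proposed in the PR.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodedLengthDemo {
    // Hypothetical helper: encodes the whole string just to count the bytes.
    static int encodedLengthInBytes(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo"; // "héllo"

        // length() counts UTF-16 code units (chars), not bytes.
        System.out.println(s.length());                                            // 5

        // The byte count only exists once a charset is chosen.
        System.out.println(encodedLengthInBytes(s, StandardCharsets.ISO_8859_1));  // 5
        System.out.println(encodedLengthInBytes(s, StandardCharsets.UTF_8));       // 6
        System.out.println(encodedLengthInBytes(s, StandardCharsets.UTF_16BE));    // 10
    }
}
```

The same String yields three different byte counts, so the number describes 
the encoding operation rather than the String itself.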

Was `getEncodedByteLength` considered? `getEncodedLengthInBytes`?

Was Charset considered as a home for this method? There the operational context 
of encoding would be obvious.
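
As a rough sketch of what a Charset-side shape could look like, here is a 
static helper built on the existing `CharsetEncoder` API. The class 
`CharsetLengthSketch` and the method `encodedLength` are hypothetical, and 
this is not how the PR implements the computation; it only shows that the 
call site reads naturally when the charset comes first.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class CharsetLengthSketch {
    // Hypothetical helper: counts encoded bytes through a small reusable
    // buffer, so the full encoded byte[] is never materialized.
    static long encodedLength(Charset cs, CharSequence text) {
        // REPLACE is assumed here so that bad input is substituted
        // instead of being reported as an error result.
        CharsetEncoder enc = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        CharBuffer in = CharBuffer.wrap(text);
        ByteBuffer out = ByteBuffer.allocate(1024);
        long total = 0;
        CoderResult result;
        // Encode until the output buffer stops overflowing.
        do {
            result = enc.encode(in, out, true);
            total += out.position();
            out.clear();
        } while (result.isOverflow());
        // Flush any bytes the encoder still holds internally.
        do {
            result = enc.flush(out);
            total += out.position();
            out.clear();
        } while (result.isOverflow());
        return total;
    }

    public static void main(String[] args) {
        System.out.println(encodedLength(StandardCharsets.UTF_8, "h\u00e9llo"));    // 6
        System.out.println(encodedLength(StandardCharsets.UTF_16BE, "h\u00e9llo")); // 10
    }
}
```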

> Including a `get` prefix or not was also considered, one benefit of `get` is 
> that it aligns with `getBytes`, and also it may help convey that the method 
> is doing computation (it's often going to be O(n), compared to e.g. 
> `length()` which is O(1)).

Again, a laudable goal, but what is actually computed remains obscure.

String is prime real estate for millions of programmers. We should get this 
right.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28454#issuecomment-3861158061
