Re: RFR: 8356980: Better handling of non-breaking space

Sergey Bylokhov Wed, 14 May 2025 19:28:14 -0700

On Wed, 14 May 2025 17:34:45 GMT, Naoto Sato <[email protected]> wrote:


>> The l10n files are synced by the translation team and we don't edit them, 
>> so IMO it's fine to leave those as is. Especially because language rules 
>> can call for different spacing and punctuation characters, we generally 
>> don't ensure translations are equivalent to the original file's value in 
>> that regard. (So viewing them as Unicode escape sequences vs. UTF-8 
>> literals may not bring much benefit.)
>
> I believe it is OK to leave these as UTF-8 native characters, as these files 
> are l10n resource bundles. If we wanted to replace those look-alike spaces 
> with Unicode escapes, other characters might also need the same treatment, 
> such as hyphen-minus, quotation marks, etc. In fact, there are a lot more 
> look-alikes defined by the Unicode Consortium 
> (https://www.unicode.org/Public/security/latest/confusables.txt), and I don't 
> think we would want to convert them.

Maybe this is just a translation error, and a simple space can be used instead, 
like in all the other properties in these files?
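
One way to check whether a given value is an outlier is to scan the property values for look-alike spaces. The following is only a minimal sketch: the class name NbspScan, the method findLookAlikeSpaces, and the three code points it flags are illustrative assumptions, not part of the PR or of any JDK tooling.

```java
import java.util.ArrayList;
import java.util.List;

public class NbspScan {
    // Look-alike spaces to flag. This set is an illustrative assumption,
    // not a complete list of confusable whitespace.
    private static boolean isLookAlikeSpace(int cp) {
        return cp == 0x00A0   // NO-BREAK SPACE
            || cp == 0x202F   // NARROW NO-BREAK SPACE
            || cp == 0x2007;  // FIGURE SPACE
    }

    /** Returns the char indices in value at which a look-alike space occurs. */
    public static List<Integer> findLookAlikeSpaces(String value) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < value.length(); ) {
            int cp = value.codePointAt(i);
            if (isLookAlikeSpace(cp)) {
                hits.add(i);
            }
            // Advance by one or two chars depending on the code point.
            i += Character.charCount(cp);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical property value containing a no-break space before the colon.
        String sample = "Fichier\u00A0: {0}";
        System.out.println(findLookAlikeSpaces(sample));
    }
}
```

Running something like this over every value in a bundle would show whether the non-breaking space is a one-off or a consistent choice by the translators.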

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25234#discussion_r2090083320
