-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Mark,

On 2/11/20 9:47 AM, Mark Thomas wrote:
> On 11/02/2020 14:26, Christopher Schultz wrote:
> 

>> This appears to be a bug in (at least old versions of) Java
>> and/or native2ascii. I've got local installations of Java 8, 11
>> (Adopt), 11 (Oracle), and 13 (OpenJDK), and only Java 8 has a
>> "native2ascii" binary present. I see ant's <native2ascii> task
>> has its own implementation, but it's probably very simple, just
>> like the native2ascii program itself. Java's Reader classes
>> incorrectly interpret the BOM as an actual character instead of
>> an ignorable UTF-8 control sequence.
> 
> But the chances of us being able to "fix" the Ant implementation
> are considerably higher :).

Fair enough. Ant does this conversion entirely in its own code, so I'm
happy to file a bug against it. Unfortunately, a fix would make
<native2ascii> incompatible with Java's (legacy) native2ascii program,
so the Ant team might reject the request. I guess that's no worse than
the current situation :)
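
To make the Reader behavior quoted above concrete, here is a minimal
sketch (the class and method names are made up for illustration, they
are not from Tomcat or Ant). It decodes UTF-8 bytes that start with
EF BB BF through a plain InputStreamReader and shows that the first
character handed back is U+FEFF rather than the '#' that follows it:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BomReaderDemo {
    /** Decodes the bytes as UTF-8 and returns the first char read, or -1 if empty. */
    static int firstChar(byte[] utf8Bytes) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8)) {
            return reader.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // EF BB BF is the UTF-8 encoding of U+FEFF; the rest is "# BOM".
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '#', ' ', 'B', 'O', 'M'};
        // Prints "first char: U+FEFF": the decoder does not strip the BOM.
        System.out.printf("first char: U+%04X%n", firstChar(withBom));
    }
}
```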

>> Ensuring that the first line of the file is a comment or a blank
>> line fixes things:
>> 
>> # BOM
>> first.property=foo
>> second.property=bar
>> 
>> becomes:
>> 
>> \ufeff# BOM
>> first.property=foo
>> second.property=bar
> 
> Does the BOM end up creating an additional property in this case?

Probably, but who cares? Code is unlikely to do:

bundle.getProperty([UTF-8 BOM])

and get confused by what comes out.
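
For anyone who wants to see it, a rough sketch of what I'd expect
java.util.Properties to do with the converted text. The class and
helper names are hypothetical, and the input strings are hand-written
stand-ins for native2ascii output, not real Tomcat files. With the
comment line first, the BOM and the '#' become a junk key and the real
properties survive; without it, the BOM is glued onto the first key:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class BomPropertiesDemo {
    /** Parses the given text the same way a .properties file would be parsed. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The doubled backslash yields the six-character escape as it
        // would appear in the converted file (assumed native2ascii output).
        Properties guarded =
                load("\\ufeff# BOM\nfirst.property=foo\nsecond.property=bar\n");
        System.out.println(guarded.size());                        // 3
        System.out.println(guarded.getProperty("first.property")); // foo
        // The extra entry: key is U+FEFF followed by '#', value is "BOM".
        System.out.println(guarded.getProperty("\uFEFF#"));        // BOM

        // No sacrificial first line: the BOM becomes part of the first key.
        Properties unguarded =
                load("\\ufefffirst.property=foo\nsecond.property=bar\n");
        System.out.println(unguarded.getProperty("first.property")); // null
    }
}
```

So yes, one extra property, but under a key nobody will ever ask for.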

>>> Overall, I guess I am -0 on adding BOMs.
>> 
>> Okay. This is a fairly recent change to Tomcat, and frankly, we
>> (a) don't get a huge number of outside contributions which
>> include changes to the localized properties files (except for the
>> translation-only contributions, which have been great!) and (b)
>> often ignore the non-English translations in the first place
>> because we are lazy.
>> 
>> I think maybe this can stay on the back-burner until we see if we
>> end up with any problems.
> 
> Sounds reasonable to me. It looks like we have options if we need
> them but with a few minor issues to research / iron out first if we
> go that way.
> 
>> Does/can "checkstyle" check for valid UTF-8 byte sequences in 
>> .properties files? I think that may be a helpful check to add if
>> it's not already in there.
> 
> Don't know. +1 if such a thing exists.

I know nothing about Checkstyle, so I'll defer to anyone who does know
how to configure it to do these things :)
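
Independent of Checkstyle, the check itself is small. Below is a
minimal sketch using only the JDK; the class name and the idea of
passing file paths on the command line are my own assumptions, not an
existing Tomcat build step or a Checkstyle feature. It runs the bytes
through a strict UTF-8 decoder and reports any file that fails:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class Utf8Check {
    /** Returns true if the bytes form a well-formed UTF-8 sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // Fail on bad input instead of substituting U+FFFD.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Each argument is assumed to be the path of a .properties file.
        for (String path : args) {
            if (!isValidUtf8(Files.readAllBytes(Paths.get(path)))) {
                System.out.println("Invalid UTF-8: " + path);
            }
        }
    }
}
```

This would catch, for example, a file saved as ISO-8859-1 with
accented characters in it, since a lone 0xE9 byte is not valid UTF-8.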

- -chris
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAl5C07kACgkQHPApP6U8
pFjsZhAAoyo8KeqHqs1ZakdexBQJ8g1YHuGKC87SG3Guw/GoMFTjsyU9sWPyAnBP
wvizChhnWD3WaWKrEI+Tp4D35v/L1ORuwquDYIqRgxras+xvjnyzWDFfrYPA1WkF
RQ5Ns4A8f/lkPAb+4Y2xKN8wLnWY/zmJ5GmJ0fibyORqlAfANgUp16hHaT4bDRDM
AqPWbODT5YBhpTRurTqejJeXGJLfBFdxbH+liZdQ8uYeaYNSEV23YPXxVq5upgMD
daZxkusaacu6Uz1F0w/6uAJJ65xo+qzeANYmJ0Hn+jfrWwtgspTPOfPct9VSpuJ7
YnBcllm8vvshjGYB/83Q/IaWdKQvJ+BhHwLatuS5gz7EaM4V3ibZiwXDyPOMEoek
XeV983OgLw7IONEjhLXqKyooqywSpy9v0gU+GmRHh7fk453gFzBm3I7FF7FtZotw
XE8OyOmyjUuw48v+NcjR0fAQ+wzgBYRlVItICY1s/OMr2dDAWcDB1jG2nlSdf2TV
HGHqZrgvtOF+/v5wGCpZAdnjeU8qqOmk/m+SJwK76nfz11e79MMCkDBjiVypet6E
/LRbGzgjoZn3lAsApaLTKbp0kVaLEJlZ2Xg/DuzBCZWyvrGTiEEVC7Hr2aMsjsQq
v4NHfogOMz5zcxyJ8nxGNTK5JHBXNp//kg9SWWUCFvf7UJRDFWg=
=svCv
-----END PGP SIGNATURE-----

