On 02/10/2018 02:28, William A Rowe Jr wrote:
> On Mon, Oct 1, 2018 at 4:15 PM Luis Gil de Bernabé <lgilbern...@apache.org>
> wrote:
> 
>> I updated the es files long ago, but I wasn't sure whether to commit
>> them, as with the FR file Lucien mentions.
>> Now that I'm sure it can be done, I will add them to trunk (es
>> modifications only).
>>
>> @Christophe JAILLET <christophe.jail...@wanadoo.fr> I'm not much help
>> here, but I vote +1 to update our build tool :)
>> rgd
>>
> 
> Also +1 to a freshened build toolchain, or trusting the JDK (Open or
> otherwise) to provide the functionality.

As far as I understand, the toolchain will only be used to build. I
don't see under which circumstances this would be problematic.

> Very concerned about trusting utf-8 for anything that is expected to be
> console readable. Not as much the xml/html decorated contents, but the
> man pages and any text files concern me. I've been living in a utf-8
> console for a very long time, but I don't expect my experience is typical.

I would say that this is the usual experience. I've been trying to find
data on this with no luck, but I would suggest that, with the possible
exception of the English-only world, an overwhelming majority use
UTF-8. If you only use ASCII, as in English, UTF-8 and ISO-8859-1 are
byte-for-byte identical.
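
To illustrate that last point, here is a minimal Python sketch. The sample
strings are made up for the example and not taken from our docs. Pure ASCII
text encodes to the same bytes in both charsets; only non-ASCII characters
differ.

```python
# Illustrative sample strings (assumptions, not taken from the actual docs).
ascii_text = "Listen 80"
accented_text = "d\u00e9faut"  # "défaut", with U+00E9 (e with acute accent)


def same_bytes(text: str) -> bool:
    """Return True if UTF-8 and ISO-8859-1 encode `text` to identical bytes."""
    return text.encode("utf-8") == text.encode("iso-8859-1")


print(same_bytes(ascii_text))     # ASCII only: the encodings agree
print(same_bytes(accented_text))  # contains U+00E9: the encodings differ
print(len(accented_text.encode("utf-8")),
      len(accented_text.encode("iso-8859-1")))
```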

> I'm also not keen on delivering 1%-3% more bytes over the wire where we
> can directly represent the contents in an ISO-8859 representation.

Again, if you're using ASCII only, there is no such overhead. Delivering
entities is a lot more expensive. If the difference is significant at
all, I expect bandwidth usage to drop rather than increase.
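
As a rough way to check this, the following Python sketch compares the
byte cost of a single accented character in each representation. The
choice of U+00E9 and its named entity is just an example on my part.

```python
import html

# One character, three representations. U+00E9 is an arbitrary example.
char = "\u00e9"
entity = "&eacute;"

# Sanity check: the named entity really decodes to the same character.
assert html.unescape(entity) == char

sizes = {
    "entity": len(entity.encode("ascii")),
    "utf-8": len(char.encode("utf-8")),
    "iso-8859-1": len(char.encode("iso-8859-1")),
}

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

On this one character the entity is the most expensive form by a wide
margin, so a page currently written with entities should shrink, not
grow, when converted to UTF-8.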

> That said, shifting to utf-8 also buys us the symbolic sets. Let's avoid
> the emoji's, but various bullets and highlight characters could
> drastically improve the readability of our docs.

I'm a bit out on a limb here, but AFAICR, emoji are mostly encoded as
4-byte sequences in UTF-8 and occur in predictable code point ranges, so
it should be easy to set up a build test that fails if any are used.
Bullets and similar symbols are shorter (typically 3 bytes) and would
pass such a test. If anything, emoji should be avoided because, even
though fairly widely used, they are a relatively new introduction.
Things may break even though UTF-8 support is in place.
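
A sketch of what such a build test could look like, in Python. This is
not part of our current toolchain; the function name and the sample text
are hypothetical. It relies on the fact that code points above U+FFFF
are exactly the ones that need 4 bytes in UTF-8.

```python
def find_astral_chars(text):
    """Yield (line_no, column, char) for every code point above U+FFFF.

    Those are exactly the characters that take 4 bytes in UTF-8.
    """
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 0xFFFF:
                yield line_no, col, ch


# Hypothetical sample: a bullet (U+2022, 3 bytes) and an emoji (U+1F600, 4 bytes).
sample = "ok \u2022 bullet\nsmile \U0001F600\n"

hits = list(find_astral_chars(sample))
for line_no, col, ch in hits:
    print(f"line {line_no}, col {col}: U+{ord(ch):04X} needs 4 bytes in UTF-8")
```

A build step could run this over each source file and fail when any hit
is reported. The bullet passes; only the emoji is flagged.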

> So on balance, utf-8 doesn't seem unreasonable, *unless* we expect most
> localized users will be reading the plain text of the file.

And even then it's easier than entities.


