On 03/19/16 02:15, David Woodhouse wrote:

> So we treat it as an opaque sequence of bytes on the way *in*, then
> make assumptions on the way *out* about what it was?

On the way in, it is assumed to be UTF-8 unless the user says
otherwise. If the user does say otherwise (in i18n.commitEncoding),
that statement is recorded in the commit object. Either way, the commit
message itself is not converted.

On the way out, the commit message is always converted. From: the
encoding recorded in the commit object (whatever the user put there, or
UTF-8 if nothing was recorded). To: UTF-8 by default, or whatever the
user specified in i18n.logOutputEncoding. So normally there is a
transparent UTF-8 --> UTF-8 conversion on output.
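
To make the two directions concrete, here is a toy model in Python. It
is only a sketch of the behavior described above, not git's actual
code: the names Commit, store and show are made up for illustration,
and Python codec names stand in for whatever git would accept.

```python
# Toy model only; Commit, store() and show() are hypothetical names,
# not part of git.
from typing import NamedTuple, Optional


class Commit(NamedTuple):
    message: bytes           # stored exactly as received
    encoding: Optional[str]  # recorded label; None means "assume UTF-8"


def store(raw: bytes, commit_encoding: Optional[str] = None) -> Commit:
    """On the way in: record the label, never touch the bytes."""
    return Commit(message=raw, encoding=commit_encoding)


def show(commit: Commit, log_output_encoding: str = "utf-8") -> bytes:
    """On the way out: always re-code, starting from the recorded label."""
    source = commit.encoding or "utf-8"
    return commit.message.decode(source).encode(log_output_encoding)


if __name__ == "__main__":
    name = "László"
    labelled = store(name.encode("iso-8859-2"), commit_encoding="iso-8859-2")
    # The stored bytes are untouched; only the output is converted.
    print(labelled.message == name.encode("iso-8859-2"))
    print(show(labelled) == name.encode("utf-8"))
```

If the label is missing or wrong (say, ISO-8859-2 bytes stored with no
label), show() in this sketch fails with a UnicodeDecodeError. That is a
property of the toy model only; the point is that the output step can
only be as good as the label recorded on the way in.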

> Which is just one of the set of classic "oops, I dropped the label"
> bugs which rendered the legacy charsets unworkable, but this time it
> almost seems to be *deliberate*.
> 
> The logic behind not re-coding is silly. Because throwing away the
> charset label on the input text and assuming it's already in
> i18n.commitencoding definitely *isn't* a reversible operation :)

Right; if git had considered my LC_CTYPE locale category setting, and
had automatically converted between it and its internal UTF-8
representation, I would never have looked at "i18n.*". :(

Laszlo
_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.01.org/mailman/listinfo/edk2-devel
