On Sat, 2016-03-19 at 01:55 +0100, Laszlo Ersek wrote:
> 
> Okay, here's what I'll do. I will switch i18n.commitencoding back to
> UTF-8. And, I will add a commit-msg hook that converts the commit
> message in-place from latin2 to UTF-8, with "iconv". That should keep
> us both happy. Deal?

That sounds like a good solution; thanks.
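
For illustration only, a minimal sketch of what such a commit-msg hook could look like in Python. The plan above uses "iconv"; this is not Laszlo's actual hook. The names recode_message and recode_file, and the choice of iso8859_2 as the source encoding, are assumptions made for this sketch.

```python
#!/usr/bin/env python3
"""Hypothetical commit-msg hook: re-code the message file from latin2 to UTF-8."""
import sys

# Assumption: the editor writes the commit message in latin2 (ISO-8859-2).
SOURCE_ENCODING = "iso8859_2"


def recode_message(raw: bytes, source: str = SOURCE_ENCODING) -> bytes:
    """Interpret raw as source-encoded text and return it encoded as UTF-8."""
    return raw.decode(source).encode("utf-8")


def recode_file(path: str) -> None:
    """Rewrite the commit message file at path in place as UTF-8."""
    with open(path, "rb") as f:
        raw = f.read()
    with open(path, "wb") as f:
        f.write(recode_message(raw))


# Git runs the commit-msg hook with the message file path as its only argument.
if __name__ == "__main__" and len(sys.argv) == 2:
    recode_file(sys.argv[1])
```

Saved as .git/hooks/commit-msg and made executable, this would run on every commit. Note that it trusts the label blindly: a message that is already UTF-8 would be mangled by a second conversion.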

I think it's arguably a git bug that it's necessary. Git ought to
honour the locale settings when dealing with user input. If it can
convert on *reading* from repository files, why wouldn't it convert on
*writing* them?

The 'git commit' man page says both that

       ·   The commit log messages are uninterpreted sequences of non-NUL
           bytes.

and also that

        2. git log, git show, git blame and friends look at the encoding
           header of a commit object, and try to re-code the log message into
           UTF-8 unless otherwise specified. You can specify the desired
           output encoding with i18n.logoutputencoding ...

... which seems self-contradictory. It goes on to say:

       Note that we deliberately chose not to re-code the commit log message
       when a commit is made to force UTF-8 at the commit object level,
       because re-coding to UTF-8 is not necessarily a reversible operation.

So we treat it as an opaque sequence of bytes on the way *in*, then
make assumptions on the way *out* about what it was?

This is one of the classic "oops, I dropped the label" bugs which
rendered the legacy charsets unworkable, but this time it almost seems
to be *deliberate*.

The logic behind not re-coding is silly, because throwing away the
charset label on the input text and assuming it's already in
i18n.commitencoding definitely *isn't* a reversible operation either :)
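
To make the point concrete, here is a small Python sketch with a single latin2 byte. The helper names (recode_to_utf8, round_trips, guess_decode) and the sample byte are made up for illustration. It only shows that, for this sample, conversion with a known label round-trips, while decoding after the label has been dropped does not recover the text.

```python
def recode_to_utf8(raw: bytes, label: str) -> bytes:
    """Re-code raw to UTF-8 using the charset label it arrived with."""
    return raw.decode(label).encode("utf-8")


def round_trips(raw: bytes, label: str) -> bool:
    """True if converting to UTF-8 and back reproduces the original bytes."""
    utf8 = recode_to_utf8(raw, label)
    return utf8.decode("utf-8").encode(label) == raw


def guess_decode(raw: bytes, guess: str):
    """Decode raw under a guessed charset; return None if the bytes are invalid."""
    try:
        return raw.decode(guess)
    except UnicodeDecodeError:
        return None


# Sample input: U+0151 (o with double acute) as a latin2 editor would write it.
raw = "\u0151".encode("iso8859_2")  # the single byte 0xF5

# With the label kept, the conversion is reversible for this sample.
print(round_trips(raw, "iso8859_2"))

# With the label dropped, a latin1 guess silently yields a different letter,
# and a UTF-8 guess is not even valid.
print(guess_decode(raw, "iso8859_1") == "\u0151")
print(guess_decode(raw, "utf-8"))
```

The only step that loses information here is the one where the byte is read without knowing which charset it was written in.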

-- 
dwmw2

_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.01.org/mailman/listinfo/edk2-devel
