On Sat, 2016-03-19 at 01:55 +0100, Laszlo Ersek wrote:
> 
> Okay, here's what I'll do. I will switch i18n.commitencoding back to
> UTF-8. And, I will add a commit-msg hook that converts the commit
> message in-place from latin2 to UTF-8, with "iconv". That should keep
> us both happy. Deal?
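
For illustration only, here is a rough sketch of what such a commit-msg hook could look like. It is written in Python rather than calling "iconv", and it is not the hook that was actually installed. The function names, the ISO-8859-2 default and the "leave valid UTF-8 alone" guard are all assumptions made for this example; the one thing taken from git itself is that the commit-msg hook is called with the path of the message file as its only argument.

```python
#!/usr/bin/env python3
"""Sketch of a commit-msg hook that re-encodes the message as UTF-8."""
import sys

# Assumption: the editor/terminal produces latin2 (ISO-8859-2) bytes.
SOURCE_ENCODING = "iso8859_2"


def recode_message(raw: bytes, source_encoding: str = SOURCE_ENCODING) -> bytes:
    """Return the message as UTF-8 bytes.

    Input that already decodes as UTF-8 is returned unchanged, so the
    hook does not double-encode on an amend. This is a heuristic, not
    a proof that the input really was UTF-8.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(source_encoding).encode("utf-8")


def recode_file(path: str) -> None:
    """Rewrite the commit message file in place."""
    with open(path, "rb") as f:
        raw = f.read()
    with open(path, "wb") as f:
        f.write(recode_message(raw))


# git invokes the commit-msg hook with one argument: the message file.
if __name__ == "__main__" and len(sys.argv) == 2:
    recode_file(sys.argv[1])
```

Saved as .git/hooks/commit-msg and made executable, this would run on every commit. The guard matters because pure-ASCII and already-converted messages pass through untouched, whereas blindly converting twice would corrupt the text.
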
That sounds like a good solution; thanks.

I think it's arguably a git bug that it's necessary. Git ought to
honour the locale settings when dealing with user input. If it can
convert on *reading* from repository files, why wouldn't it convert on
*writing* them?

The 'git commit' man page says both that

    · The commit log messages are uninterpreted sequences of non-NUL
      bytes.

and also that

    2. git log, git show, git blame and friends look at the encoding
       header of a commit object, and try to re-code the log message
       into UTF-8 unless otherwise specified. You can specify the
       desired output encoding with i18n.logoutputencoding ...

... which seems self-contradictory. It goes on to say:

    Note that we deliberately chose not to re-code the commit log
    message when a commit is made to force UTF-8 at the commit object
    level, because re-coding to UTF-8 is not necessarily a reversible
    operation.

So we treat it as an opaque sequence of bytes on the way *in*, then
make assumptions on the way *out* about what it was? Which is just one
of the set of classic "oops, I dropped the label" bugs which rendered
the legacy charsets unworkable, but this time it almost seems to be
*deliberate*.

The logic behind not re-coding is silly. Because throwing away the
charset label on the input text and assuming it's already in
i18n.commitencoding definitely *isn't* a reversible operation :)

-- 
dwmw2
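
To make the "dropped label" point concrete, here is a small sketch in Python. It does not model git's actual code paths; store() and read_back() are hypothetical helpers that only mimic "bytes go in without a label, a charset is assumed on the way out". The sample word and the three charsets are likewise just chosen for the example.

```python
def store(message: str, input_encoding: str) -> bytes:
    """Encode as the user's terminal would; the label is then dropped."""
    return message.encode(input_encoding)


def read_back(raw: bytes, assumed_encoding: str) -> str:
    """Decode under whatever charset the reader assumes."""
    return raw.decode(assumed_encoding, errors="replace")


original = "Erdős"
raw = store(original, "iso8859_2")       # typed in a latin2 locale

as_latin2 = read_back(raw, "iso8859_2")  # right guess: text survives
as_latin1 = read_back(raw, "iso8859_1")  # wrong guess: silently different
as_utf8 = read_back(raw, "utf-8")        # wrong guess: invalid byte

print(repr(as_latin2))  # 'Erdős'
print(repr(as_latin1))  # 'Erdõs'
print(repr(as_utf8))    # 'Erd\ufffds'
```

The stored bytes are identical in all three cases. Nothing in them says which reading is correct, and that is exactly the information the label would have carried.
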
_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.01.org/mailman/listinfo/edk2-devel

