Junio C Hamano <[EMAIL PROTECTED]> writes:
> Yes, the patch had some context conflicts with some other patch
> so the patch application was done by hand, and I did a quick and
> dirty cut & paste of the commit message from "cat mbox" output.
>
> I will probably drop future patches encoded in QP.
This was totally inappropriate; sorry, but I was in a bad mood.
A more serious response.
- I personally consider commit message encoding a per project
issue (so is blob contents encoding). If for example a
project is Japanese only, MS-DOS^WWindows programming
project, I do not think it is unreasonable if all the commit
messages and source files are encoded in Shift-JIS. More
Unixy projects over there might use EUC-JP in source files
and maybe ISO-2022 in the log messages (because the latter is
the standard way to exchange e-mails there). As long as
project participants agree to use the same encodings, things
should work just fine within a project.
- However, weird people are known to merge projects that
started out as totally separate into one. It would be a
disaster for the commit log viewer when this happens. For
this reason, some people recommend using a common denominator
encoding, namely UTF-8, for commit logs from day one, even if
you do not envision such a merge happening in the future.
This recommendation also goes to author and committer
identification (but not blob contents). But this is just a
recommendation, and it is still up to the individual project
what encoding to use in the log messages, and the low-level
GIT should not dictate nor interfere; "git-commit-tree" and
"git-cat-file commit" are 8-bit clean.
- The e-mail patch acceptance machinery found in tools/
directory is tuned for the established patch exchange
practice used in the linux-kernel mailing list. No MIME, no
QP, no guarantee to pass things outside ASCII.
- Eventually, tools/mailinfo.c should be taught about MIME to
do at least:
- detect patches whose whitespace was corrupted by a sending MUA
using flowed-text, and reject them;
- detect multipart PGP signed message, discard the attached
signature which is often useless, and unwrap the payload;
- decode QP and B encodings as necessary, and after splitting
the message to the info, msg and patch part, transliterate
the info and msg part from original encoding to UTF-8 (when
'--utf8' flag is given, perhaps).
One of the requirements there is that it still needs to be
_fast_. Linus needs to be able to make 5 commits per second
out of his mailbox.
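
To make the wish list above a bit more concrete, here is a rough sketch in Python using only the standard library. It is not tools/mailinfo.c and not a proposed implementation; the function names (is_flowed, header_to_utf8, body_to_utf8) and the latin-1 fallback are assumptions made up for illustration. It shows the three pieces: spotting a flowed-text message, decoding QP/B encoded header words, and transliterating a body to UTF-8.

```python
import base64
import quopri
from email import message_from_string
from email.header import decode_header


def is_flowed(raw_message: str) -> bool:
    """True if the Content-Type header declares format=flowed."""
    msg = message_from_string(raw_message)
    fmt = msg.get_param("format", failobj="")
    return str(fmt).lower() == "flowed"


def header_to_utf8(raw_header: str) -> bytes:
    """Decode QP ("Q") and base64 ("B") encoded words, return UTF-8 bytes."""
    pieces = []
    for chunk, charset in decode_header(raw_header):
        if isinstance(chunk, bytes):
            try:
                chunk = chunk.decode(charset or "ascii")
            except (LookupError, UnicodeDecodeError):
                # Assumption: fall back to latin-1 so no byte is lost.
                chunk = chunk.decode("latin-1")
        pieces.append(chunk)
    return "".join(pieces).encode("utf-8")


def body_to_utf8(payload: bytes, transfer_encoding: str, charset: str) -> bytes:
    """Undo the transfer encoding, then convert from charset to UTF-8."""
    cte = transfer_encoding.lower()
    if cte == "quoted-printable":
        payload = quopri.decodestring(payload)
    elif cte == "base64":
        payload = base64.b64decode(payload)
    return payload.decode(charset).encode("utf-8")
```

A caller would reject the message when is_flowed() is true, and otherwise run header_to_utf8() on the From: and Subject: lines and body_to_utf8() on the message part only, leaving the patch part byte-for-byte untouched.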
So that is the technical part of the response. There is also a
project policy part of the response: I would endorse the
application of that UTF-8 recommendation to the git project
itself, at least in principle.
But that in practice would happen only after the above mailinfo
update takes place. Until then, it is very likely that I will
occasionally fail to spot and hand-correct people's names left
undecoded in the log message, the way the patch acceptance
machinery passed them through. Please live with it (or send such patches
to mailinfo ;-).