Junio C Hamano <[EMAIL PROTECTED]> writes:

> Yes, the patch had some context conflicts with some other patch
> so the patch application was done by hand, and I did a quick and
> dirty cut & paste of the commit message from "cat mbox" output.
>
> I will probably drop future patches encoded in QP.

This was totally inappropriate; sorry, but I was in a bad mood.

A more serious response.

 - I personally consider commit message encoding a per project
   issue (so is blob contents encoding).  If for example a
   project is Japanese only, MS-DOS^WWindows programming
   project, I do not think it is unreasonable if all the commit
   messages and source files are encoded in Shift-JIS.  More
   Unixy projects over there might use EUC-JP in source files
   and maybe ISO-2022 in the log messages (because the latter is
   the standard way to exchange e-mails there).  As long as
   project participants agree to use the same encodings, things
   should work just fine within a project.

 - However, weird people are known to merge projects that
   started out as totally separate into one.  It would be a
   disaster for the commit log viewer when this happens.  For
   this reason, some people recommend using a common deniminator
   encoding, namely UTF-8, for commit logs from day one, even if
   you do not envision such a merge may happen in the future.

   This recommendation also goes to author and committer
   identification (but not blob contents).  But this is just an
   recommendation, and it is still up to the individual project
   what encoding to use in the log messages, and the low-level
   GIT should not dictate nor interfere; "git-commit-tree" and
   "git-cat-file commit" are 8-bit clean.

 - The e-mail patch acceptance machinery found in tools/
   directory is tuned for the established patch exchange
   practice used in the linux-kernel mailing list.  No MIME, no
   QP, no guarantee to pass things outside ASCII.

 - Eventually, tools/mailinfo.c should be taught about MIME to
   do at least:

   - detect whitespace corrupted patch via sending MUA using
     flowed-text and reject it;

   - detect multipart PGP signed message, discard the attached
     signature which is often useless, and unwrap the payload;

   - decode QP and B encodings as necessary, and after splitting
     the message to the info, msg and patch part, transliterate
     the info and msg part from original encoding to UTF-8 (when
     '--utf8' flag is given, perhaps).

   One of the requirement there is that it still needs to be
   _fast_.  Linus needs to be able to make 5 commits per second
   out of his mailbox.

So that is the technical part of the response.  There is one
Project policy part of the response: I would endorse the
application of that UTF-8 recommendation to the git project
itself, at least in principle.

But that in practice would happen only after the above mailinfo
update takes place.  Until then, it is very likely that I will
occasionally fail to spot and to hand-correct people's name left
undecoded the way the patch acceptance machinery passed through
in the log message.  Please live with it (or send such patches
to mailinfo ;-).


-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to