On Sun, Aug 04, 2013 at 11:14:40AM -0700, Jonathan Nieder wrote:
> Alexey Shumkin wrote:
> > On Fri, Aug 02, 2013 at 04:23:38PM -0700, Jonathan Nieder wrote:
> >> 1. Log messages use the configured log output encoding, which is
> >> meant to be whatever encoding works best with local terminals
> >> (and does not have much to do with what encoding should be used
> >> for email)
> >> 2. Filenames are left as is: on Linux, usually UTF-8, and in the Mingw
> >> port (which uses Unicode filesystem APIs), always UTF-8
> > I cannot say exactly if it makes sense for THIS patch, but I'd like to
> > remind about Cygwin port, which definitely does not use UTF-8 encoding
> > (in my case it is Windows-1251) for filenames.
> >> 3. The "This is an automated email" preface uses a project description
> >> from .git/description, which is typically in UTF-8 to support
> >> gitweb.
> Thanks for clarifying. So in the context you describe, (1) is
> configurable, (2) is Windows-1251, (3) is unconfigurably UTF-8, and
> there is no way with current git facilities to force the email to use
> a single encoding unless (3) happens to contain no special characters.
> What is the value of the "[i18n] commitEncoding" setting in your
commitEncoding is equal to filenames' encoding, Windows-1251, of course.
> What encoding do the raw commit messages (shown with
> "git log --format=raw") use for their text, and what do they declare
> with an in-commit 'encoding' header, if any?
Well, although `git log --help` says
"The raw format shows the entire commit exactly as stored in
the commit object",
on a Linux box (UTF-8) I can see "readable" commit messages even though
they are stored in 'Windows-1251' (so they are converted to UTF-8). To
be sure, I've checked their actual content with `git cat-file commit`.
Actually, to be honest, I usually use a modified version of Git (see
ecaee8050cec23eb4cf082512e907e3e52c20b57) in the 'next' branch, which could
affect the results, so I've checked `git log --format=raw` with
unmodified v188.8.131.52 of Git.
But let's go back to the answer to your question. The commit encoding
stored as a header in the raw commits is 'Windows-1251'.
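To illustrate what I mean by the header, here is a minimal Python sketch
that reads the 'encoding' header from raw commit bytes (such as the output
of `git cat-file commit`) and decodes the message accordingly. The function
name `decode_commit_message` is hypothetical, and falling back to UTF-8
when no header is present is my assumption about the default.

```python
from typing import Tuple


def decode_commit_message(raw: bytes) -> Tuple[str, str]:
    """Return (declared_encoding, message) for raw commit object bytes.

    Hypothetical helper: the headers end at the first blank line, and the
    message is decoded with the declared encoding, or UTF-8 if none is given.
    """
    header, _, body = raw.partition(b"\n\n")
    encoding = "utf-8"  # assumed default when there is no 'encoding' header
    for line in header.split(b"\n"):
        if line.startswith(b"encoding "):
            encoding = line[len(b"encoding "):].decode("ascii")
    return encoding, body.decode(encoding)
```

Feeding it the bytes of one of our commits should report 'Windows-1251'
and give back the readable message, whatever the terminal encoding is.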
> Does everyone on this project use Cygwin?
This is a "closed" (commercial) project and every developer uses Cygwin,
except me. I use a Linux box as a desktop (mail, IM, web-browsing; but
development goes on in Cygwin). And sometimes I run utility scripts
included in that project on my desktop (since Linux works with files
much faster than Cygwin does ;))
Also, the Git server is a coLinux box (http://www.colinux.org/) on
Windows Server 2003, but I guess that does not matter much here.
> That should be fine, but
> I'd expect there to be problems as soon as someone wants to try the
> Mingw port ("Git for Windows").
Yep, one of our developers tried to use a modern version of TortoiseGit
with the MinGW port of Git. That was a failure, because since v1.7.9 the
MinGW port transcodes filenames to store them internally in UTF-8. This
problem could be solved by converting those non-ASCII filenames to
UTF-8 once, but I do not want to use the MinGW port. I like the Cygwin
"infrastructure", which is more Linux-like than MinGW.
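For what it's worth, the one-time conversion itself is simple at the byte
level. Below is a minimal Python sketch, assuming the stored paths are
Windows-1251 bytes; the function name `recode_path` is hypothetical, and it
only recodes a single path, it does not rewrite any history.

```python
def recode_path(raw_path: bytes, source_encoding: str = "cp1251") -> bytes:
    """Recode a path from source_encoding (assumed Windows-1251) to UTF-8.

    Pure-ASCII paths are returned unchanged. This must be applied only once:
    running it on an already converted path would garble it.
    """
    if raw_path.isascii():
        return raw_path
    return raw_path.decode(source_encoding).encode("utf-8")
```

The hard part is not this function but doing the rename in the repository
for every such file at once, with all developers switching together.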
> I wonder if there should be an "[i18n] repositoryPathEncoding"
> configuration item to support this kind of repository. Then git could
> be aware of the intended encoding of paths, could recode them for
> display to a terminal, and at least on Linux and Mingw could recode
> them for use in filenames on disk. "repositoryPathEncoding = none"
> would mean the current behavior of treating paths as raw sequences of
I'd be happy if such a setting existed. That could solve many problems
for cross-platform projects with non-ASCII filenames.
Indeed, the MinGW port does resolve that problem somehow!
> What do you think?