On 10.04.2013 01:47, Ken Ismert wrote:
> 
> I bumped into the UTF-16 display problem with Git Extensions running on top 
> of msysGit. After lots of searching and experimenting, I came up with a 
> solution that works for me.
> 
> Note: Please see questions below.
> 
> This method is for MSysGit 1.8.1, and is tested on Windows XP. I use Git 
> Extensions 2.44, but since the changes are at the Git level, they should work 
> for Git Gui as well. Steps:

There was a discussion about handling UTF-16 on the git ML a while back; see
http://thread.gmane.org/gmane.comp.version-control.git/159708

As suggested there, I would try a clean/smudge filter, i.e. store UTF-16 files
as UTF-8 in the repository and convert them back to UTF-16 on checkout. That
way git can treat your UTF-16 files as text in most cases: you can merge them,
git-grep works, and gitattributes work (eol conversion, ident replacement,
built-in diff patterns...).
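
A minimal, untested sketch of such a filter setup, assuming iconv is on the
PATH (the filter name "utf16" and the *.utf16.txt pattern are made up for
illustration):

    # .gitattributes
    *.utf16.txt filter=utf16

    # .git/config or ~/.gitconfig
    [filter "utf16"]
        clean = iconv -f utf-16 -t utf-8
        smudge = iconv -f utf-8 -t utf-16

Which byte order and BOM iconv writes for "utf-16" depends on the iconv build,
so check the checked-out files with the tools that have to read them. Files
that are already committed as UTF-16 would have to be re-added before the
clean filter applies to them.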

A textconv filter, by contrast, only changes the text that git shows (e.g. in
diffs); most other git operations will still treat the UTF-16 content as binary.

There's also an 'encoding' attribute and a 'gui.encoding' setting which in
theory should solve your issue (i.e. specify the encoding of files for display
by GUI tools). I don't know if Git Extensions supports that, or whether it's
supposed to work for binary files at all.
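
For reference, a sketch of what those two settings look like (the *.txt
pattern is only an example, and I'm assuming the files are UTF-16):

    # .gitattributes: encoding GUI tools should use to display matching files
    *.txt encoding=UTF-16

    # .git/config or ~/.gitconfig: default display encoding for git-gui and gitk
    [gui]
        encoding = utf-8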

> 3) Modify the global ~/Git/etc/gitconfig or your local ~/.git/config file, 
> and add these lines:
> 
>     [diff "astextutf16"]
>         textconv = astextutf16

Why not simply "textconv = iconv -f utf-16 -t utf-8", without the extra script?
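
That is, assuming iconv is on the PATH and keeping your driver name, something
like this should be enough (git appends the name of the file to convert to the
textconv command, and iconv accepts a file argument):

    [diff "astextutf16"]
        textconv = iconv -f utf-16 -t utf-8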

> c) I had success with iconv, but is there any built-in UTF-16 to UTF-8 
> converter that ships with msysGit?

There are ready-to-use UTF-conversion functions in the codebase, but these are 
not accessible as a git command or built-in filter.

> As a quick fix, how hard would it be to add a 'utf16' diff filter, similar to 
> cpp or csharp? Or is this simply the wrong place to put in a work-around?

As described above, I think a diff filter is not the right tool for the job. 
The only universal format for text content that works reasonably well with 
established text-based technologies (merge algorithms, regex etc.) is UTF-8. If 
we want to benefit from these technologies, git should store text files as 
UTF-8 and convert from / to platform-specific formats on checkin / checkout or 
for display.

Bye,
Karsten