On Mon, Dec 18, 2017 at 11:13:34AM +0100, Torsten Bögershausen wrote:

> Just to confirm my missing knowledge here:
> Does this mean, that git-gui and gitk can decode/reencode
> the content of a file/blob, when the .gitattributes say so ?

That's my impression, yes.

> If yes, would it make sense to enhance the "git diff" instead ?
> "git diff --encoding" will pick up the commited encoding from
> .attributes, convert it into UTF-8, and run the diff ?

You can do something like this already:

  git config diff.utf16.textconv 'iconv -f utf16 -t utf8'
  echo file diff=utf16 >.gitattributes

I have no objection to teaching it that "encoding" means roughly the
same thing, which would have two advantages:

 - we could do it in-process, which would be much faster

 - we could skip the extra config step, which is a blocker for
   server-side diffs on sites like GitHub

I don't think you'd really need "--encoding". This could just be
triggered by the normal "--textconv" rules (i.e., just treat this "as
if" the textconv above was configured when we see an encoding
attribute).

> We actually could enhance the "git diff" output with a single
> line saying
> "Git index-encoding=cp1251"
> or so, which can be picked up by "git apply".

That extends it beyond what textconv can do (we assume that textconv
patches are _just_ for human consumption, and can't be applied). And it
would mean you'd potentially want to trigger it in more places. On the
other hand, I don't know that we're guaranteed that a round-trip
between encodings will produce a byte-wise identical result. The nice
thing about piggy-backing on textconv is that it's already dealt with
that problem.

-Peff

Reply via email to