Re: [PATCH v4 0/6] convert: add support for different encodings

2018-01-23 Thread Junio C Hamano
Torsten Bögershausen  writes:

> On Sat, Jan 20, 2018 at 04:24:12PM +0100, lars.schnei...@autodesk.com wrote:
>> From: Lars Schneider 
>> 
>> Hi,
>> 
>> Patches 1-4 and 6 are preparation and helper functions.
>> Patch 5 is the actual change.
>
> I (still) have 2 remarks on convert.c - to make life easier,
> I will send a small "on top" patch in the next few days.

Thanks, both.  I'll stay on the sideline ;-) and deal with other
topics first.


Re: [PATCH v4 0/6] convert: add support for different encodings

2018-01-23 Thread Torsten Bögershausen
On Sat, Jan 20, 2018 at 04:24:12PM +0100, lars.schnei...@autodesk.com wrote:
> From: Lars Schneider 
> 
> Hi,
> 
> Patches 1-4 and 6 are preparation and helper functions.
> Patch 5 is the actual change.

I (still) have 2 remarks on convert.c - to make life easier,
I will send a small "on top" patch in the next few days.


[PATCH v4 0/6] convert: add support for different encodings

2018-01-20 Thread lars . schneider
From: Lars Schneider 

Hi,

Patches 1-4 and 6 are preparation and helper functions.
Patch 5 is the actual change.

This series depends on Torsten's "convert_to_git(): safe_crlf/checksafe
becomes int conv_flags" patch:
https://public-inbox.org/git/20180113224931.27031-1-tbo...@web.de/

Changes since v3:

* I renamed the attribute from "checkout-encoding" to "working-tree-encoding"
  in the hope of better conveying what the attribute is about.

* I rebased the series to Git 2.16 and removed Torsten's patch as he
  posted the patch on his own.

* Fix documentation wording. (Torsten)

* A macro was used in a commit before its introduction. Fixed! (Junio)

Thanks,
Lars

RFC: https://public-inbox.org/git/bdb9b884-6d17-4be3-a83c-f67e2afa2...@gmail.com/
v1: https://public-inbox.org/git/20171211155023.1405-1-lars.schnei...@autodesk.com/
v2: https://public-inbox.org/git/2017122915.39680-1-lars.schnei...@autodesk.com/
v3: https://public-inbox.org/git/20180106004808.77513-1-lars.schnei...@autodesk.com/


Base Ref:
Web-Diff: https://github.com/larsxschneider/git/commit/21f4dac5ab
Checkout: git fetch https://github.com/larsxschneider/git encoding-v4 && git checkout 21f4dac5ab


### Interdiff (v3-rebased-2.16..v4):

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 1bc03e69cb..a8dbf4be30 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -272,8 +272,8 @@ few exceptions.  Even though...
   catch potential problems early, safety triggers.


-`checkout-encoding`
-^^^^^^^^^^^^^^^^^^^
+`working-tree-encoding`
+^^^^^^^^^^^^^^^^^^^^^^^

 Git recognizes files encoded with ASCII or one of its supersets (e.g.
 UTF-8 or ISO-8859-1) as text files.  All other encodings are usually
@@ -281,17 +281,17 @@ interpreted as binary and consequently built-in Git text processing
 tools (e.g. 'git diff') as well as most Git web front ends do not
 visualize the content.

-In these cases you can teach Git the encoding of a file in the working
-directory with the `checkout-encoding` attribute. If a file with this
+In these cases you can tell Git the encoding of a file in the working
+directory with the `working-tree-encoding` attribute. If a file with this
 attributes is added to Git, then Git reencodes the content from the
 specified encoding to UTF-8 and stores the result in its internal data
 structure (called "the index"). On checkout the content is encoded
 back to the specified encoding.

-Please note that using the `checkout-encoding` attribute may have a
+Please note that using the `working-tree-encoding` attribute may have a
 number of pitfalls:

-- Git clients that do not support the `checkout-encoding` attribute
+- Git clients that do not support the `working-tree-encoding` attribute
   will checkout the respective files UTF-8 encoded and not in the
   expected encoding. Consequently, these files will appear different
   which typically causes trouble. This is in particular the case for
@@ -304,7 +304,7 @@ number of pitfalls:
 - Reencoding content requires resources that might slow down certain
   Git operations (e.g 'git checkout' or 'git add').

-Use the `checkout-encoding` attribute only if you cannot store a file in
+Use the `working-tree-encoding` attribute only if you cannot store a file in
 UTF-8 encoding and if you want Git to be able to process the content as
 text.

@@ -313,7 +313,7 @@ with byte order mark (BOM) and you want Git to perform automatic line
 ending conversion based on your platform.

 
-*.txt text checkout-encoding=UTF-16
+*.txt text working-tree-encoding=UTF-16
 

 Use the following attributes if your '*.txt' files are UTF-16 little
@@ -321,7 +321,7 @@ endian encoded without BOM and you want Git to use Windows line endings
 in the working directory.

 
-*.txt checkout-encoding=UTF-16LE text eol=CRLF
+*.txt working-tree-encoding=UTF-16LE text eol=CRLF
 

 You can get a list of all available encodings on your platform with the
diff --git a/convert.c b/convert.c
index 8559651b3f..13fad490ce 100644
--- a/convert.c
+++ b/convert.c
@@ -323,7 +323,7 @@ static int encode_to_git(const char *path, const char *src, size_t src_len,
const char *advise_msg = _(
  "You told Git to treat '%s' as %s. A byte order mark "
  "(BOM) is prohibited with this encoding. Either use "
- "%.6s as checkout encoding or remove the BOM from the "
+ "%.6s as working tree encoding or remove the BOM from the "
  "file.");

advise(advise_msg, path, enc->name, enc->name, enc->name);
@@ -339,7 +339,7 @@ static int encode_to_git(const char *path, const char *src, size_t src_len,
const char *advise_msg = _(
  "You told Git to treat '%s' as %s. A byte order mark "
  "(BOM) is required with this encoding. Either use "
- "%sBE/%sLE as checkout encoding or add a BOM to the "
+