Hi Philip,

Thanks for the extensive answer. I`ll address my concerns inline, please 
excuse me if my lack of *nix background gets too obvious, I might be 
missing some well known facts.

In line with almost all "version control" systems, Git takes the view that 
> what is in the reposititory 'is' the versioned artefact, rather than some 
> small (allowable?) variant of it. 
>

I`m not following you here, sorry. My very complaint/thought was exactly 
about actually expecting Git to observe the repository content as "the" 
versioned artifact, without trying to be smart about guessing if some 
content is allowable in there or not - especially if the part it complains 
about (line ending) in the new version (applying patch) was exactly the 
same in the old version (inside repository).

There is some on going work on ensuring that the CRLF line ending choices 
> of a user can be accomodated. However the area of auto adjustment has lots 
> of hidden issues, especially given the *nix freedoms to put anything 
> anywhere (e.g. a loose CR in the middle of a text line). Then there is the 
> historic Mac CR conventions.
>

My remark is exactly the opposite of any smart/auto adjustment - I would 
simply prefer not to get bothered about something that didn`t even change, 
yet still to get warned when it does change. It just seems much more 
intuitive (to me).
 

> Patches always have CRLF as the eol (that's the email RFC standard, IIUC).
>

Might be we are not aligned in regards to our "patch" terminology/meaning, 
but I`m simply talking about a diff that Git produces - that one certainly 
isn`t restricted to CLRF line endings, but it preserves line endings found 
in old/new file. Quite contrary, even, the diff headers seem to strictly 
have LF line endings, easily observed in editor that supports showing 
symbols (like Notepad++, for example), once you output the diff/patch to a 
file.

I can imagine that e-mailing such files as plain text through an e-mail 
client requires them to conform to e-mail standards, but that is irrelevant 
here, I didn`t even get there yet :)
 

> Getting your setting right can be a challenge as there are a lot of old 
> 'backward compatibility' options available that are no longer what might be 
> called 'best practice'. Plus you have to have a clear view on what your 
> practice is!
>  
> Most folk appear to use LF for eol in the repo (text files), and then 
> convert them to local eol on export (and vice versa on add/commit).
>  
> If you have got to a situation where "your" repo has mixed eol types, then 
> you, and the whole project, will probably need a specific commit point 
> where the repo is normalised ( "-Xrenormalize" ) to whatever the project 
> decision is. Plus the project decision needs to identify what the user 
> config settings should be on the different platforms - so that every one 
> can smoothly switch to the new easy-to-use method.
>

All this is pretty understandable to me, but also seems pretty irrelevant 
in regards to what I`m talking about, too.... Unless I`m still missing it, 
of course :)

All these seem to try to make Git "smarter", being able to "normalize" and 
"convert" different line endings, where my wish is exactly the opposite - 
play it "stupid" and keep it simple, if something didn`t change, don`t 
bother with it (and don`t bother me).

As it`s usually easier to discuss through examples, here`s a try. Below, we 
have a 3 line text file, with CRLF line endings, already inside Git repo:

line<CR><LF>
line<CR><LF>
line<CR><LF>


(*1*) Now, we change  the file to something like this (changing second line 
from "line" to "lin2", leaving line endings as they are):

line<CR><LF>
lin2<CR><LF>
line<CR><LF>


If we do a simple "git diff" between these two files, Git will produce this:

$ git diff sample.txt
diff --git a/sample.txt b/sample.txt
index ff9a745..2dfb04d 100644
--- a/sample.txt
+++ b/sample.txt
@@ -1,3 +1,3 @@
 line
-line
+lin2^M
 line


Notice the "^M" whitespace error indicator on the "+line2" line... Now, if 
we use the --ws-error-highlight option, we can see this:

$ git diff --ws-error-highlight=old,new sample.txt
diff --git a/sample.txt b/sample.txt
index ff9a745..2dfb04d 100644
--- a/sample.txt
+++ b/sample.txt
@@ -1,3 +1,3 @@
 line
-line^M
+lin2^M
 line


Here, we see that the old line has the "^M" mark as well, meaning that 
there is actually no change in this regard - thus one may expect Git not to 
treat line ending inside "new" file differently than one in the "old" file, 
as it didn`t change at all (and Git seems to have a mean to recognize that, 
showing it when requested).

If we output the diff to a patch file:

$ git diff sample.txt > sample.patch


... we still can clearly see that both old and new line (correctly) have 
the same CRLF line endings:

diff --git a/sample.txt b/sample.txt<LF>
index ff9a745..2dfb04d 100644<LF>
--- a/sample.txt<LF>
+++ b/sample.txt<LF>
@@ -1,3 +1,3 @@<LF>
 line<CR><LF>
-line<CR><LF>
+lin2<CR><LF>
 line<CR><LF>


Now, if we checkout previous file version and then apply that same patch, 
git-apply produces the warning which made me open this topic:

$ git checkout HEAD -- sample.txt
$ git apply sample.patch
sample.patch:8: trailing whitespace.
lin2
warning: 1 line adds whitespace errors.


What I don`t understand is why Git complains here, when the line ending 
didn`t even change between old/new...? I mean, I totally understand it from 
*nix point of view, but it seems counter-intuitive in regards to "stupid 
content tracking", which Git is supposed to do, while it actually tries to 
be smart here (smarter then needed, at least).

(*2*) Then, we have an even worse situation - assume the same starting 
file, but edited on *nix, changing the second line *and* its line ending as 
well:

line<CR><LF>
lin2<LF>
line<CR><LF>


Notice that besides "line" to "lin2" change, we now miss <CR> on the second 
line, too. If we do git-diff now, we get this:

$ git diff sample.txt
diff --git a/sample.txt b/sample.txt
index ff9a745..0727536 100644
--- a/sample.txt
+++ b/sample.txt
@@ -1,3 +1,3 @@
 line
-line
+lin2
 line


Note how Git doesn`t complain about new line "lin2" line ending anymore (no 
"^M" marker) - even though it`s actually different than it was!

If we use --ws-error-highlight again, we can clearly see the end of line 
difference, too:

$ git diff --ws-error-highlight=old,new sample.txt
diff --git a/sample.txt b/sample.txt
index ff9a745..0727536 100644
--- a/sample.txt
+++ b/sample.txt
@@ -1,3 +1,3 @@
 line
-line^M
+lin2
 line


Git seems to stay true to its own "stupid content tracking" philosophy 
here, but one may in fact expect it to warn about actual whitespace change 
in this case, instead of the previous one, where whitespace didn`t even 
change... :/

If we output this diff to a patch file now:

$ git diff sample.txt > sample2.patch


... we can clearly see the whitespace (line ending) change:

diff --git a/sample.txt b/sample.txt<LF>
index ff9a745..0727536 100644<LF>
--- a/sample.txt<LF>
+++ b/sample.txt<LF>
@@ -1,3 +1,3 @@<LF>
 line<CR><LF>
-line<CR><LF>
+lin2<LF>
 line<CR><LF>


Yet, git-apply`ing the patch above happily consumes it without any warnings:

$ git checkout HEAD -- sample.txt
$ git apply sample2.patch


... effectively (and silently) breaking our previous line endings :/


So, while (*1*) is "just" annoying (being warned about line ending 
"whitespace error" that isn`t even changed), case (*2*) seems plain wrong, 
breaking previous line ending without even a hint...

Thus, it seems we could all benefit from additional git-apply --whitespace 
option which would in fact warn on line-ending changes *only*, being it 
from CRLF to LF or vice versa, no discrimination and favoring one over the 
other :)

 If you want, have a look at the main git list [1,2] for "CRLF" to find the 
> various discussions. Torsten has continued to work the issue.
>

Thank you for this, I`ll need to investigate it in more depth for sure, 
taking some time.

Yet, what I already remarked, it seems this is the opposite of what I would 
like - have Git not bother with "assumed" or preset line ending 
"normalization", leaving that to me, but just warning me when actual line 
ending change happens so I can investigate if it`s correct or not.

That would stay true to tracking content (changes) only, and not falling 
into trap of validating its "correctness" - unless otherwise requested, of 
course, which can always be convenient, but I don`t think it should be 
imposed, especially if concept of "correctness" is assumed (like LF vs 
CRLF), and not explicitly set by user.
 

>  As a general point, it is worth separating out the different whitespace 
> errors, as the eol is just one particular issue. 
>
The space/tab/etc at end of line is another, and then tab vs space 
> indentation (for those still using teletypes[3], it's 8 spaces per 1" tab!).
>

Indeed, I`m concerned with line endings in particular here.

Thanks for your time again!

BugA
 

> --
> Philip
>  
>  
> [1] http://public-inbox.org/git 
> [2] http://public-inbox.org/git/20161130170232.19685-1-tbo...@web.de/  
> Torsten Bögershausen Wed, 30 Nov 2016, "-Xrenormalize"
> [3] https://en.wikipedia.org/wiki/Teletype_Model_33 
> https://en.wikipedia.org/wiki/Teleprinter
>  
>
> ----- Original Message ----- 
> *From:* Igor Djordjevic <javascript:> 
> *To:* Git for human beings <javascript:> 
> *Sent:* Friday, December 23, 2016 12:21 AM
> *Subject:* [git-users] git-apply: do not show *inherited* whitespace 
> errors?
>
> Hi to all, 
>
> I`m using Git for Windows mainly for now, where the issue is probably more 
> (or only) pronounced, yet I kind of feel this might be of interest across 
> platforms (for Git in general), as it makes cross-platform collaboration 
> harder than it needs to be (or so it seems to me). I`m posting here and not 
> on the main Git mailing list as this topic might be very well elaborated 
> till now, so not to produce unnecessary noise there.
>
> When creating a patch, Git exports line endings as they are in the file 
> (usually CR+LF on Windows), yet when you apply the patch, it warns you of 
> "whitespace errors" (caused by CR part) - even though no whitespace 
> actually changed (both old and new hunk have the same CR+LF line endings).
>
> Now, I may understand that in Unix world there was a need to consider 
> anything other than LF line ending a "whitespace error" which should be 
> reported accordingly, but with Git being widely used across platforms 
> nowadays it seems it should know a bit better now, especially when line 
> endings *didn`t change in the first place*.
>
> But even worse, if I change the patch file line endings manually to LF and 
> git-apply the patch like that - no warning is given, even though line 
> endings are in fact different now!
>
> I`m aware of --ws-error-highlight setting for git-diff, allowing to 
> show/compare line endings between old/new lines, but manual line comparison 
> seems rather impractical in case of applying multiple patches - and yet 
> real whitespace errors (actually introduced with the patch we`re applying) 
> are lost in the noise (or worse, not even reported, as in CR+LF to LF 
> conversion).
>
> Current git-apply --whitespace options seem to allow for a variety of 
> settings, but none to warn about *new* whitespace errors *only*, or even 
> more important - about line ending changes in general (as converting from 
> CRLF to LF might be considered a whitespace error situation, even though 
> general Git thinks differently at the moment, probably assuming you 
> actually corrected an existing whitespace error... :P).
>
> So, what are the thoughts on this? Would having an additional --whitespace 
> option (like "preserve", "warn-new", "warn-changed", or something) to warn 
> about *new* whitespace errors *only* make sense in general? Seems so to me, 
> but yet as I`m still new around, I might be missing a bigger picture...
>
> p.s. Please let me know if you think this should be reposted to main Git 
> mailing list (or to Git for Windows one, even).
>
> Thanks, BugA
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Git for human beings" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to git-users+...@googlegroups.com <javascript:>.
> For more options, visit https://groups.google.com/d/optout.
>
>

-- 
You received this message because you are subscribed to the Google Groups "Git 
for human beings" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to git-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to