Re: [git-users] git-apply: do not show *inherited* whitespace errors?

2016-12-26 Thread Philip Oakley
Hi Igor,

Is the test repo available for viewing?

Have you tried:
  `git cat-file blob <hash> >tmp.txt`
(with <hash> replaced by a specific blob hash) 
to see what is actually in each blob, just in case your config settings and the 
attribute settings are doing different conversions.

I certainly see, for my test GfW repo, that the worktree file has CRLF, yet the 
in-repo blob has just LFs inside.
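
One way to make that comparison, as a minimal sketch: count the CRLF-terminated lines in the blob and in the work tree file and see whether the numbers agree. The helper name `count_crlf` and the path `sample.txt` are assumptions for illustration; only `git cat-file blob` comes from the suggestion above.

```shell
# Count lines on stdin that end in CR, i.e. CRLF-terminated lines.
count_crlf() {
    awk '/\r$/ { n++ } END { print n + 0 }'
}

# Hypothetical usage inside a repository (sample.txt is a placeholder path):
#   git cat-file blob HEAD:sample.txt | count_crlf   # blob as committed
#   git cat-file blob :sample.txt     | count_crlf   # blob as staged in the index
#   count_crlf < sample.txt                          # file in the work tree
```

If the blob reports 0 and the work tree file reports a non-zero count, some eol conversion is happening on checkout.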

The other common gotcha is that (AIUI) git diff will compare the index version 
rather than the work tree version if one is not careful, which means there could 
be a difference you had not expected.
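
For reference, a small sketch of the three comparisons involved. Plain `git diff` compares the work tree against the index, `git diff --cached` compares the index against HEAD, and `git diff HEAD` compares the work tree against HEAD. The helper name `diff_sides` is hypothetical; it only reports which of those pairs differ for one path.

```shell
# Report which pairs of versions differ for a tracked path.
# Must be run inside a Git work tree.
diff_sides() {
    path=$1
    # "git diff --quiet" prints nothing and exits 1 when the two sides differ.
    git diff --quiet -- "$path"          && wt_idx=same   || wt_idx=differ
    git diff --quiet --cached -- "$path" && idx_head=same || idx_head=differ
    git diff --quiet HEAD -- "$path"     && wt_head=same  || wt_head=differ
    echo "worktree-vs-index=$wt_idx index-vs-HEAD=$idx_head worktree-vs-HEAD=$wt_head"
}
```

Calling `diff_sides sample.txt` before and after `git add` shows the difference moving from the first pair to the second.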

Philip

Re: [git-users] git-apply: do not show *inherited* whitespace errors?

2016-12-24 Thread Igor Djordjevic
Hi Philip,

No worries, and thanks for your response once again. Unfortunately, that 
very latter part is the essence of my (original) question, just providing a 
step-by-step example of what I`m talking about :)

I have no issue "per se", so the question about my project size/position 
does not make much sense. I`m discussing the current Git logic, and the 
example provided in the latter part of the previous message is all that`s 
needed to replicate my concern.

But to satisfy your curiosity (and just in case it *does* matter :), the 
provided example behaves the same with both "git version 2.10.2.windows.1" 
and "git version 2.11.0.windows.1".

I totally understand the problems you`re describing, and having Git 
try to assist and ease the pain is great, but it seems to lack a 
fundamental option: to *not* warn/fail when a line ending is *not* changed 
and, more importantly, to warn/fail when a line ending *does* change -- 
without any need to *assume* or *know* which line ending should be considered 
"correct", just warning/failing on *change*, no further knowledge needed.

In the end, I may repost a cleaned-up version of this topic to the main 
Git mailing list, providing this one as a reference, as it is more 
concerned with actual Git internals and its overall logic/approach. I 
don`t really have an "issue" I need help with, so this git-users group 
might not have been appropriate for this discussion after all.

Thanks for your time once again, and please feel free to tune in here (or 
there) again as you see fit, and as the time allows. Happy holidays!

BugA


Re: [git-users] git-apply: do not show *inherited* whitespace errors?

2016-12-24 Thread Philip Oakley
Hi Igor,

Sorry for the misunderstandings. 

I think we are agreeing that the repository content, byte for byte, is the 
versioned artefact.

Can I ask what size of project you have / are participating in / your position 
in any hierarchy? It helps clarify how far the issues reach.

I think you are saying that _you_ are happy that in the repo you can have 
different files with different line endings and possibly even different line 
endings in the same file. Each originator provides the file 'as-is', in their 
own eol style.

That mixed eol method can work when the tools are fully eol agnostic, or at 
least there are limited users reading each file who use the same eol style.  
However that can lead to issues as more people want to view more of the files 
in multiple OS's and use tools that are not 100% eol agnostic and eol 
preserving.

Git is quite happy with mixed eols because internally it does do the 
byte-for-byte snapshot versioning (note that it does not do change set 
versioning!, diffs are done after the fact).

Because most users don't have fully eol agnostic / eol preserving tools, Git 
tries to provide helpers for that case, and depending on who provided the Git 
client version, you get certain defaults preset (or not). I suspect you have a 
Git version that has an eol helper default you were not expecting. Plus there 
are 'old' and 'new' versions of the helpers!

Most teams end up deciding that versioning mixed eol types is a big problem 
(of the type you are seeing), and hence decide on a project-wide requirement, 
which, given Git's Linux roots, tends to be LF eol. It then becomes relatively 
easy to expand/contract using the helpers (internal to git) in a smudge/clean 
filter style for the different OS's.
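
A project-wide LF policy of this kind is commonly recorded in a `.gitattributes` file committed at the repository root, so that it does not depend on each user's config. A minimal sketch follows; the file patterns are illustrative assumptions, not taken from this thread.

```gitattributes
# .gitattributes
# Let Git detect text files and store them with LF in the repository.
*       text=auto
# Files that must keep a fixed eol in the work tree on every platform.
*.sh    text eol=lf
*.bat   text eol=crlf
# Never convert binary content.
*.png   binary
```

Files already committed with CRLF are not rewritten by this alone; they still need the one-off normalising commit mentioned elsewhere in this thread.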

Which Git are you using (GfW, GitHub, Git source, ...), and what does 
--version report?

I may have been _wrong_ about the patch vs email standards issue - I use the 
tools such as format-patch and send-email so that I don't get hit by these eol 
problems! (Outlook and Gmail are terrible for that)

If you at least comment on your project and your Git client it may help.

I haven't covered the latter part of your email (time is pressing)
--
Philip
Happy Christmas / festive greetings to all.

Re: [git-users] git-apply: do not show *inherited* whitespace errors?

2016-12-23 Thread Igor Djordjevic
Hi Philip,

Thanks for the extensive answer. I`ll address my concerns inline, please 
excuse me if my lack of *nix background gets too obvious, I might be 
missing some well known facts.

> In line with almost all "version control" systems, Git takes the view that 
> what is in the repository 'is' the versioned artefact, rather than some 
> small (allowable?) variant of it. 
>

I`m not following you here, sorry. My very complaint/thought was exactly 
about actually expecting Git to observe the repository content as "the" 
versioned artifact, without trying to be smart about guessing if some 
content is allowable in there or not - especially if the part it complains 
about (line ending) in the new version (applying patch) was exactly the 
same in the old version (inside repository).

> There is some ongoing work on ensuring that the CRLF line ending choices 
> of a user can be accommodated. However the area of auto adjustment has lots 
> of hidden issues, especially given the *nix freedoms to put anything 
> anywhere (e.g. a loose CR in the middle of a text line). Then there are the 
> historic Mac CR conventions.
>

My remark is exactly the opposite of any smart/auto adjustment - I would 
simply prefer not to get bothered about something that didn`t even change, 
yet still to get warned when it does change. It just seems much more 
intuitive (to me).
 

> Patches always have CRLF as the eol (that's the email RFC standard, IIUC).
>

We may not be aligned regarding our "patch" terminology/meaning, 
but I`m simply talking about a diff that Git produces - that one certainly 
isn`t restricted to CRLF line endings, but preserves the line endings found 
in the old/new file. Quite the contrary, even: the diff headers seem to 
strictly have LF line endings, easily observed in an editor that supports 
showing symbols (like Notepad++, for example), once you output the 
diff/patch to a file.

I can imagine that e-mailing such files as plain text through an e-mail 
client requires them to conform to e-mail standards, but that is irrelevant 
here, I didn`t even get there yet :)
 

> Getting your setting right can be a challenge as there are a lot of old 
> 'backward compatibility' options available that are no longer what might be 
> called 'best practice'. Plus you have to have a clear view on what your 
> practice is!
>  
> Most folk appear to use LF for eol in the repo (text files), and then 
> convert them to local eol on export (and vice versa on add/commit).
>  
> If you have got to a situation where "your" repo has mixed eol types, then 
> you, and the whole project, will probably need a specific commit point 
> where the repo is normalised ( "-Xrenormalize" ) to whatever the project 
> decision is. Plus the project decision needs to identify what the user 
> config settings should be on the different platforms - so that every one 
> can smoothly switch to the new easy-to-use method.
>

All this is pretty understandable to me, but also seems pretty irrelevant 
with regard to what I`m talking about, too. Unless I`m still missing it, 
of course :)

All these seem to try to make Git "smarter", being able to "normalize" and 
"convert" different line endings, where my wish is exactly the opposite - 
play it "stupid" and keep it simple, if something didn`t change, don`t 
bother with it (and don`t bother me).

As it`s usually easier to discuss through examples, here`s a try. Below, we 
have a 3 line text file, with CRLF line endings, already inside Git repo:

line
line
line


(*1*) Now, we change the file to something like this (changing the second 
line from "line" to "lin2", leaving line endings as they are):

line
lin2
line


If we do a simple "git diff" between these two files, Git will produce this:

$ git diff sample.txt
diff --git a/sample.txt b/sample.txt
index ff9a745..2dfb04d 100644
--- a/sample.txt
+++ b/sample.txt
@@ -1,3 +1,3 @@
 line
-line
+lin2^M
 line


Notice the "^M" whitespace error indicator on the "+lin2" line... Now, if 
we use the --ws-error-highlight option, we can see this:

$ git diff --ws-error-highlight=old,new sample.txt
diff --git a/sample.txt b/sample.txt
index ff9a745..2dfb04d 100644
--- a/sample.txt
+++ b/sample.txt
@@ -1,3 +1,3 @@
 line
-line^M
+lin2^M
 line


Here, we see that the old line has the "^M" mark as well, meaning that 
there is actually no change in this regard - thus one may expect Git not to 
treat the line ending in the "new" file differently from the one in the "old" 
file, as it didn`t change at all (and Git seems to have a means to recognize 
that, showing it when requested).

If we output the diff to a patch file:

$ git diff sample.txt > sample.patch


... we still can clearly see that both old and new line (correctly) have 
the same CRLF line endings:

diff --git a/sample.txt b/sample.txt
index ff9a745..2dfb04d 100644
--- a/sample.txt
+++ b/sample.txt
@@ -1,3 +1,3 @@
 line
-line
+lin2
 line


Now, if we checkout previous file version and then apply that same patch, 
git-apply 

Re: [git-users] git-apply: do not show *inherited* whitespace errors?

2016-12-23 Thread Philip Oakley
Hi Igor,

In line with almost all "version control" systems, Git takes the view that what 
is in the repository 'is' the versioned artefact, rather than some small 
(allowable?) variant of it.

There is some ongoing work on ensuring that the CRLF line ending choices of a 
user can be accommodated. However the area of auto adjustment has lots of hidden 
issues, especially given the *nix freedoms to put anything anywhere (e.g. a 
loose CR in the middle of a text line). Then there are the historic Mac CR 
conventions.


I've not seen the patch issue, though I only use patches to submit changes from 
a Windows PC to the main Git list (rather than applying them the other way 
around).


Patches always have CRLF as the eol (that's the email RFC standard, IIUC).

Getting your setting right can be a challenge as there are a lot of old 
'backward compatibility' options available that are no longer what might be 
called 'best practice'. Plus you have to have a clear view on what your 
practice is! 

Most folk appear to use LF for eol in the repo (text files), and then convert 
them to local eol on export (and vice versa on add/commit).


If you have got to a situation where "your" repo has mixed eol types, then you, 
and the whole project, will probably need a specific commit point where the 
repo is normalised ( "-Xrenormalize" ) to whatever the project decision is. 
Plus the project decision needs to identify what the user config settings 
should be on the different platforms - so that everyone can smoothly switch to 
the new easy-to-use method.

If you want, have a look at the main git list [1,2] for "CRLF" to find the 
various discussions. Torsten has continued to work the issue.

As a general point, it is worth separating out the different whitespace errors, 
as the eol is just one particular issue. 
The space/tab/etc at end of line is another, and then tab vs space indentation 
(for those still using teletypes[3], it's 8 spaces per 1" tab!). 
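
Git exposes these categories separately through the `core.whitespace` setting. Below is a sketch of a configuration that keeps the default checks but treats a CR at end of line as part of the line terminator. Note that this should silence the CRLF warning altogether, which is not the same as warning only when a line ending changes.

```ini
# .git/config or ~/.gitconfig
[core]
	# cr-at-eol: a trailing CR is treated as part of the line terminator,
	# so CRLF lines are no longer reported as trailing whitespace.
	# The default checks (blank-at-eol, blank-at-eof, space-before-tab)
	# stay enabled unless negated with a leading "-".
	whitespace = cr-at-eol
```

The same value can be limited to particular paths with the `whitespace` attribute in `.gitattributes`.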
--
Philip


[1] http://public-inbox.org/git 
[2] http://public-inbox.org/git/20161130170232.19685-1-tbo...@web.de/  Torsten 
Bögershausen Wed, 30 Nov 2016, "-Xrenormalize"
[3] https://en.wikipedia.org/wiki/Teletype_Model_33 
https://en.wikipedia.org/wiki/Teleprinter
 
  - Original Message - 
  From: Igor Djordjevic 
  To: Git for human beings 
  Sent: Friday, December 23, 2016 12:21 AM
  Subject: [git-users] git-apply: do not show *inherited* whitespace errors?


  Hi to all,


  I`m using Git for Windows mainly for now, where the issue is probably more 
(or only) pronounced, yet I kind of feel this might be of interest across 
platforms (for Git in general), as it makes cross-platform collaboration harder 
than it needs to be (or so it seems to me). I`m posting here and not on the 
main Git mailing list as this topic may already have been discussed at length, 
and I don`t want to produce unnecessary noise there.


  When creating a patch, Git exports line endings as they are in the file 
(usually CR+LF on Windows), yet when you apply the patch, it warns you of 
"whitespace errors" (caused by CR part) - even though no whitespace actually 
changed (both old and new hunk have the same CR+LF line endings).


  Now, I may understand that in Unix world there was a need to consider 
anything other than LF line ending a "whitespace error" which should be 
reported accordingly, but with Git being widely used across platforms nowadays 
it seems it should know a bit better now, especially when line endings *didn`t 
change in the first place*.


  But even worse, if I change the patch file line endings manually to LF and 
git-apply the patch like that - no warning is given, even though line endings 
are in fact different now!


  I`m aware of the --ws-error-highlight setting for git-diff, which allows one 
to show/compare line endings between old/new lines, but manual line comparison 
seems rather impractical when applying multiple patches - and yet real 
whitespace errors (actually introduced with the patch we`re applying) are lost 
in the noise (or worse, not even reported, as in CR+LF to LF conversion).


  Current git-apply --whitespace options seem to allow for a variety of 
settings, but none to warn about *new* whitespace errors *only*, or even more 
important - about line ending changes in general (as converting from CRLF to LF 
might be considered a whitespace error situation, even though Git in general 
thinks differently at the moment, probably assuming you actually corrected an 
existing whitespace error... :P).


  So, what are the thoughts on this? Would having an additional --whitespace 
option (like "preserve", "warn-new", "warn-changed", or something) to warn 
about *new* whitespace errors *only* make sense in general? It seems so to me, 
but as I`m still new around here, I might be missing the bigger picture...
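
To make the proposed behaviour concrete, here is a minimal sketch of a "warn on change only" check done outside Git. It is not a Git feature, and the name `eol_changes` is made up. It assumes `git diff` style input on stdin, and it treats an eol style as inherited when any context or removed line in the same hunk already uses it.

```shell
# Sketch only: reports hunks whose added lines use an end-of-line style
# that no old (context or removed) line in the same hunk has.
eol_changes() {
    awk '
    function report() {
        if (hunk == "") return
        if (new_cr > 0 && old_cr == 0)
            printf "%s: %d added line(s) introduce CR (CRLF)\n", hunk, new_cr
        if (new_lf > 0 && old_lf == 0)
            printf "%s: %d added line(s) drop CR (LF only)\n", hunk, new_lf
    }
    /^diff --git / { report(); hunk = ""; next }   # next file: close current hunk
    /^@@ / {                                       # hunk header: start counting afresh
        report()
        hunk = $0; sub(/\r$/, "", hunk)
        old_cr = old_lf = new_cr = new_lf = 0
        next
    }
    hunk == "" { next }                            # file headers before the first hunk
    /^[ -]/ { if ($0 ~ /\r$/) old_cr++; else old_lf++ }   # context or removed line
    /^[+]/  { if ($0 ~ /\r$/) new_cr++; else new_lf++ }   # added line
    END { report() }
    '
}

# Hypothetical usage:  eol_changes < sample.patch
```

A patch that keeps CRLF on a changed line prints nothing, while the same patch with its added line converted to LF is reported, which is the opposite of what git-apply is described as doing in this thread.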


  p.s. Please let me know if you think this should be reposted to main Git 
mailing list (or to Git for Windows one, even).


  Thanks, BugA
