Re: [BUG] Git does not convert CRLF=>LF on files with \r not before \n
Indeed, after changing .gitattributes to '* text', the conversion works
as expected. Thanks for all the explanations.

At first, my use case was some source files (imported from another VCS)
with CR in different contexts:
- lines ending with CRCRLF
- all content in LF or CRLF, but with some CRs that should be EOLs
- CR in the middle of a line for no reason!

For all of these, I will fix the files during import.

But while digging, I found some shell or awk scripts with CR as a valid
character in a search/replacement string. I know that the EOL should not
be CRLF in this case, but I don't know whether this situation could also
occur in DOS batch files or PowerShell scripts with CRLF EOLs.

2015-04-21 21:28 GMT+02:00 Torsten Bögershausen <tbo...@web.de>:
> First of all, thanks for the info.
>
> The current implementation of Git auto-detects whether a file is text
> or binary. A file suspected to be text is expected to have either LF
> or CRLF line endings, but a bare CR makes Git wonder:
> Should this still be treated as a text file?
> If yes, should the CR be kept as is, or should it be converted into LF
> (or CRLF)?
>
> The current behaviour may simply be explained by the fact that nobody
> has so far asked to treat such a file as text, so the implementation
> assumes it to be binary. (This made the code a little simpler at the
> time it was written.)
>
> So as of today, you can force Git to leave the CR as is by specifying
> that the file is text.
>
> Is there a real-life problem behind it?
> And what should happen to the CRs?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
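For the "fix the files during import" step mentioned in the reply above, a small filter along these lines might be enough. It is only a sketch: the function name strip_eol_cr is made up for illustration, and it assumes an awk that understands \r in regular expressions (gawk, mawk and BusyBox awk do). It drops any run of CRs sitting directly before an LF, which covers both the CRLF and the CRCRLF endings, and deliberately leaves mid-line CRs alone, since whether those should become EOLs or simply disappear depends on the file.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): read stdin, write stdout.
# awk splits the input on LF, so a CRLF or CRCRLF ending shows up as
# one or more CRs at the end of the record; sub() removes exactly those.
strip_eol_cr() {
    awk '{ sub(/\r+$/, ""); print }'
}

# Example: run the sample from the test case through the filter and
# dump the result byte by byte.
printf 'some content\r\nother \rcontent with CR\r\ncontent\r\nagain content with\r\r\n' |
    strip_eol_cr | od -c
```

Note that awk's print always ends a record with LF, so a file that lacks a final newline gains one.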
Re: [BUG] Git does not convert CRLF=>LF on files with \r not before \n
Alexandre Garnier <zig...@gmail.com> writes:

> Indeed, when changing the gitattributes for '* text', the replacement
> is OK.

OK. Earlier I said:

    But it would be a bug if the same thing happens when the user
    explicitly tells us that the file has CRLF line endings, and I
    suspect we have that bug, which may want to be corrected.

but you are saying that my suspicion is incorrect and we do not have
such a bug.

Thanks for digging further.
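To see the '* text' behaviour confirmed above in isolation, something like the following could be used. It is a sketch under a few assumptions: git is on PATH, the function name demo_text_attr and the file contents are invented for the example, and core.safecrlf is switched off for the one "git add" so that a strict local setting does not abort it.

```shell
#!/bin/sh
# Hypothetical demo (names are illustrative): stage a file that has CRLF
# line endings plus one bare CR under '* text', then print the staged
# bytes on stdout.
demo_text_attr() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q . &&
        printf '* text\n' > .gitattributes &&
        printf 'one\r\ntwo \rthree\r\n' > inline-cr.txt &&
        git -c core.safecrlf=false add .gitattributes inline-cr.txt &&
        git show :inline-cr.txt
    )
    status=$?
    rm -rf "$repo"
    return $status
}

# Only run the demo when git is actually available.
if command -v git >/dev/null 2>&1; then
    demo_text_attr | od -c
fi
```

With the text attribute set, the staged copy should come out as 'one\ntwo \rthree\n': the CRLF pairs become LF and the bare CR stays where it was.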
Re: [BUG] Git does not convert CRLF=>LF on files with \r not before \n
Alexandre Garnier <zig...@gmail.com> writes:

> echo '* text=auto' > .gitattributes
> git add .gitattributes
> git commit -q -m 'Normalize EOL'
> echo -ne 'some content\r\nother \rcontent with CR\r\ncontent\r\nagain

With text=auto, the user instructs us to guess, and we expect either LF
or CRLF line-terminated files that are *TEXT*. A lone CR in the middle
of the line would mean we cannot reliably guess: it may be an
LF-terminated file with CRs sprinkled inside the text, some of which
happen to be at the end of the line, or it may be a CRLF-terminated file
with CRs sprinkled in. We try to preserve the user input by not munging
when we are not sure.

You are seeing the designed and intended behaviour.

But it would be a bug if the same thing happens when the user explicitly
tells us that the file has CRLF line endings, and I suspect we have that
bug, which may want to be corrected.

I've Cc'ed various people who worked on convert.c around line endings.
I recall we saw a few other discussion threads on text=auto and eol
settings. Stakeholders may want to have a unified discussion to first
list the issues in the current implementation and come up with fixes
for them.

Thanks.
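The ambiguity described above can be made visible with a quick count of CRs that sit directly before an LF versus CRs that do not. This is an illustrative check only, not the logic in convert.c; the function name count_cr is an assumption, and it again relies on an awk that accepts \r in regular expressions.

```shell
#!/bin/sh
# Hypothetical helper: count CRLF line endings and "lone" CRs in a file.
# awk splits on LF, so a CR at the very end of a record is the CR of a
# CRLF pair; every other CR in the record is counted as bare.
# Limitation: a final line that ends in CR without a trailing LF is
# still counted as a CRLF ending.
count_cr() {
    awk '
        BEGIN { bare = 0; crlf = 0 }
        {
            n = gsub(/\r/, "&")                # all CRs in this record
            if ($0 ~ /\r$/) { crlf++; n-- }    # one of them precedes the LF
            bare += n
        }
        END { printf "bare_cr=%d crlf=%d\n", bare, crlf }
    ' "$1"
}

# Example with the content from the test case.
sample=$(mktemp)
printf 'some content\r\nother \rcontent with CR\r\ncontent\r\nagain content with\r\r\n' > "$sample"
count_cr "$sample"    # prints: bare_cr=2 crlf=4
rm -f "$sample"
```

A non-zero bare_cr is the situation where a guess about the line-ending style is no longer reliable.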
Re: [BUG] Git does not convert CRLF=>LF on files with \r not before \n
On 2015-04-21 15.51, Alexandre Garnier wrote:
> Here is a test:
>
>     git init -q crlf-test
>     cd crlf-test
>     echo '* text=auto' > .gitattributes
>     git add .gitattributes
>     git commit -q -m 'Normalize EOL'
>     echo -ne 'some content\r\nother \rcontent with CR\r\ncontent\r\nagain content with\r\r\n' > inline-cr.txt
>     echo 'Working directory content:'
>     cat -A inline-cr.txt
>     echo
>     git add inline-cr.txt
>     echo 'Indexed content:'
>     git show :inline-cr.txt | cat -A
>
> Result
> ------
> File content:
> some content^M$
> other ^Mcontent with CR^M$
> content^M$
> again content with^M^M$
>
> Indexed content:
> some content^M$
> other ^Mcontent with CR^M$
> content^M$
> again content with^M^M$
>
> Expected result
> ---------------
> File content:
> some content^M$
> other ^Mcontent with CR^M$
> content^M$
> again content with^M^M$
>
> Indexed content:
> some content$
> other ^Mcontent with CR$
> content$
> again content with^M$
> # or even 'again content with$' for this last line
>
> If you remove the \r that are not at the end of the lines, EOL are
> converted as expected:
> File content:
> some content^M$
> other content with CR^M$
> content^M$
> again content with^M$
>
> Indexed content:
> some content$
> other content with CR$
> content$
> again content with$

First of all, thanks for the info.

The current implementation of Git auto-detects whether a file is text or
binary. A file suspected to be text is expected to have either LF or
CRLF line endings, but a bare CR makes Git wonder:
Should this still be treated as a text file?
If yes, should the CR be kept as is, or should it be converted into LF
(or CRLF)?

The current behaviour may simply be explained by the fact that nobody
has so far asked to treat such a file as text, so the implementation
assumes it to be binary. (This made the code a little simpler at the
time it was written.)

So as of today, you can force Git to leave the CR as is by specifying
that the file is text.

Is there a real-life problem behind it?
And what should happen to the CRs?
[BUG] Git does not convert CRLF=>LF on files with \r not before \n
Here is a test:

    git init -q crlf-test
    cd crlf-test
    echo '* text=auto' > .gitattributes
    git add .gitattributes
    git commit -q -m 'Normalize EOL'
    # CRLF line endings, one CR in the middle of a line, one CRCRLF ending
    echo -ne 'some content\r\nother \rcontent with CR\r\ncontent\r\nagain content with\r\r\n' > inline-cr.txt
    echo 'Working directory content:'
    cat -A inline-cr.txt
    echo
    git add inline-cr.txt
    echo 'Indexed content:'
    git show :inline-cr.txt | cat -A

Result
------
File content:
some content^M$
other ^Mcontent with CR^M$
content^M$
again content with^M^M$

Indexed content:
some content^M$
other ^Mcontent with CR^M$
content^M$
again content with^M^M$

Expected result
---------------
File content:
some content^M$
other ^Mcontent with CR^M$
content^M$
again content with^M^M$

Indexed content:
some content$
other ^Mcontent with CR$
content$
again content with^M$
# or even 'again content with$' for this last line

If you remove the \r that are not at the end of the lines, EOL are
converted as expected:
File content:
some content^M$
other content with CR^M$
content^M$
again content with^M$

Indexed content:
some content$
other content with CR$
content$
again content with$

--
Alex