Hi:

On Fri, Jun 27, 2008 at 5:13 AM, Edward Z. Yang
<[EMAIL PROTECTED]> wrote:
>
> Hello all,
>
> I did some check-ups on git-svn, and actually there is no bug.

Thanks for this study. You should forward it to the upstream mailing
list for further comments, because this issue is not just a
Windows-only concern.

> What is happening with the CRLF files is that they are being transferred
> from Subversion to Git as CRLF, and thus are represented inside the Git
> repository as CRLF.
> ...
> In short, I think it's a problem with improper line-endings in the
> Subversion repository, and Git's just cleaning things up. There is no bug.

I do not agree that there is no bug. According to your findings, the
problem is in git-svn.perl, when converting from an svn repo that is
not LF-only to a git repo. git-svn should take these wider semantics
into account. It should save in some way the non-LF status of the
imported file(s) while converting them to LF-only for the git repo.
Then, when reintegrating git-side changes back into the upstream svn
repo, as in git svn dcommit, it should restore line endings according
to the saved non-LF status. I have not considered the case where the
user on the git side wants to change line endings explicitly.
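
As a rough illustration of the save-and-restore idea above, here is a
minimal Python sketch. It is not git-svn code: the function names
(detect_eol, import_from_svn, export_to_svn) and the way the style is
stored are hypothetical, and mixed-ending files are simply left alone.

```python
# Hypothetical sketch only: none of these names exist in git-svn.

def detect_eol(data: bytes) -> str:
    """Classify line endings as 'crlf', 'lf', 'mixed', or 'none'."""
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf
    if crlf and lf_only:
        return "mixed"
    if crlf:
        return "crlf"
    if lf_only:
        return "lf"
    return "none"


def import_from_svn(data: bytes):
    """Return (content to store in git, saved EOL style)."""
    style = detect_eol(data)
    if style == "crlf":
        return data.replace(b"\r\n", b"\n"), style
    # LF, mixed, and EOL-free content is passed through unchanged.
    return data, style


def export_to_svn(data: bytes, style: str) -> bytes:
    """Re-apply the saved EOL style before sending changes upstream."""
    if style == "crlf":
        # Normalize first so lines edited on the git side also end in CRLF.
        return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    return data
```

The saved style would have to live somewhere per file (for example in
git-svn's own metadata); where exactly is left open here.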

For git-svn to automatically convert all text files, such as
implicit-crlf.txt, to LF, git-svn will need the text-file
auto-detection functionality that already exists elsewhere in git.
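
To make "auto-detection" concrete, here is a deliberately simplified
Python sketch. It assumes that content is text when no NUL byte occurs
in a leading sample; the sample size and both function names are
assumptions for illustration, and git's own conversion code applies
more checks than this.

```python
def looks_like_text(data: bytes, sample_size: int = 8000) -> bool:
    """Guess that content is text if the leading sample has no NUL byte.

    Simplified heuristic; sample_size is an assumed value.
    """
    return b"\0" not in data[:sample_size]


def convert_text_to_lf(data: bytes) -> bytes:
    """Convert CRLF to LF, but only for content that looks like text."""
    if not looks_like_text(data):
        return data  # leave binary-looking content untouched
    return data.replace(b"\r\n", b"\n")
```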

I think the more fundamental problem is that git tools are implicitly
implemented as LF-only, or without checking the CRLF case. As long as
this design decision (indecision?) is not fixed, this fundamental
problem will be pushed to the edges like git-svn (and as you mention,
git gui and gitk).

> As previously reported, the CRLF files, explicit-crlf.txt and
> implicit-crlf.txt had "changes" (even from a clean checkout).

This is weird. Of course you are talking about "core.autocrlf = true".
Inside the git repo, these files are still CRLF but after a clean
checkout, the resulting git status is "modified"?!?! I did a
simplified test myself (see attachment ./crlf-test.sh) without git-svn
using git 1.5.5-GIT on Ubuntu and the results confirm it.

Ubuntu-git> ./crlf-test.sh
#### created a repo with a CRLF file, core.autocrlf = false
#### clone the repo with core.autocrlf false
#### clone the repo with core.autocrlf true

#### checked out files are the same via md5sum
5fb0817dd3d5802b05a7a2d7142b354e  cloned/crlf.txt
5fb0817dd3d5802b05a7a2d7142b354e  cloned-autocrlf/crlf.txt

#### However git status shows modified on core.autocrlf=true
# On branch master
nothing to commit (working directory clean)
----
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#       modified:   crlf.txt
#
no changes added to commit (use "git add" and/or "git commit -a")

#### compare git ls-tree master (original tree object)
100644 blob ea0584cfe09ad03141721109a52da3a13622cd1e    crlf.txt
----
100644 blob ea0584cfe09ad03141721109a52da3a13622cd1e    crlf.txt

#### compare git ls-files --stage (index and working tree)
100644 ea0584cfe09ad03141721109a52da3a13622cd1e 0       crlf.txt
----
100644 ea0584cfe09ad03141721109a52da3a13622cd1e 0       crlf.txt

#### compare git hash-object, shows different hash
ea0584cfe09ad03141721109a52da3a13622cd1e
----
20aeba2bad864cf6904f9caaea55f46f03ce6ac1
#### reverted to core.autocrlf=false
ea0584cfe09ad03141721109a52da3a13622cd1e

I don't understand what git-status is actually doing to determine the
"modified" status, but if it is checking the hash-object of each
tracked file, then it makes sense.

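The hash-object difference above can be modelled in a few lines of
Python. This is only a sketch of the hypothesis, not of how git status
is implemented: it assumes the working-tree file is converted from CRLF
to LF before hashing when core.autocrlf is true, and looks_modified is
a hypothetical helper. The blob hash itself is the SHA-1 of a
"blob <size>\0" header followed by the content.

```python
import hashlib


def git_blob_hash(data: bytes) -> str:
    """SHA-1 of the git blob object for the given content."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def looks_modified(worktree: bytes, index_hash: str, autocrlf: bool) -> bool:
    """Hypothetical check: hash the (possibly converted) file, compare."""
    if autocrlf:
        # Assumed behaviour: CRLF is converted to LF before hashing.
        worktree = worktree.replace(b"\r\n", b"\n")
    return git_blob_hash(worktree) != index_hash


content = b"line one\r\nline two\r\n"
stored = git_blob_hash(content)  # committed with core.autocrlf = false

print(looks_modified(content, stored, autocrlf=False))  # False
print(looks_modified(content, stored, autocrlf=True))   # True
```

The file on disk is byte-identical in both clones, which matches the
md5sum output; only the hash computed after conversion differs.
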
Also, I wonder if "created a repo with a CRLF file, core.autocrlf =
false" is contrived for regular Unix-based git usage. Do
multi-platform projects that use git actually prepare their repos with
CRLF measures such as gitattributes? What is the necessary
housekeeping for multi-platform projects when using git?

> However, the most important right now is informing users about the way
> git-svn handles SVN repositories with CRLF files in them, and why they need
> to either:
>
> 1. Make a mass commit for CRLF -> LF
> 2. Turn off core.autocrlf
> 3. Set all files that have problems to crlf=true with gitattributes

1. Haha, good luck trying to educate svn users about this issue. ;-)
In other words, this is not feasible, given the myriad ways git-svn
is being used.

2. Not the optimal solution but it's the easiest workaround to come to
mind, since git-svn.perl really isn't core.autocrlf-aware.

3. "all files that have problems" == text files with CRLF? Could we do
this automatically during the git svn clone or git svn fetch phase? Or
are you suggesting SVN users create a .gitattributes file on the svn
side just in case they use git svn?


Best regards,
Clifford Caoile

Attachment: crlf-test.sh
Description: Bourne shell script
