On Thu, Sep 20, 2012 at 11:20:31PM -0400, Cristian Tibirna wrote:
> Running the script in attachment produces a git repository in which were
> operated a large number of file renames, in which many of the renamed files
> (in this particular case all) have the same content but different names.
>
> The commit data from the renaming operation (last commit in the script-
> generated history) is inexactly rendered by the command
>
> git diff-tree -r -C master
>
> The logical result is correctly produced by the more restricted command
>
> git diff-tree -r -M master
>
> IMO for this particular last commit both the above commands should return the
> same result.

Interesting. I get the same results from both commands. But I did have
to munge your script, as my "rename" command does not seem to work like
the one your script expects. So I may have misinterpreted its intent.

However, I would not be surprised if one could construct a situation in
which "-C" and "-M" produced different results. Since the content of all
the files is the same, git has to make a guess about which files match
up based on their filenames. The current heuristic is very stupid and
just tries to match basenames (e.g., moving "foo/Makefile" to
"bar/Makefile" is a better match than moving the same content to
"bar/foo.c"). But in this case, the basenames don't match at all.
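
To make the basename rule concrete, here is a rough sketch in Python. It
is a toy model, not git's actual code; name_score() and pick_source()
are hypothetical names made up for illustration.

```python
import posixpath


def name_score(src, dst):
    """Return 1 if both paths share a basename, else 0."""
    return int(posixpath.basename(src) == posixpath.basename(dst))


def pick_source(dst, sources):
    """Pick the best-scoring source for dst; the first one seen wins ties."""
    best, best_score = None, -1
    for src in sources:
        score = name_score(src, dst)
        if score > best_score:
            best, best_score = src, score
    return best


# Identical content everywhere, so only the name can break the tie.
print(pick_source("bar/Makefile", ["foo/foo.c", "foo/Makefile"]))
# No basename matches at all: the answer is simply the first candidate.
print(pick_source("bar/x.txt", ["a/1.txt", "a/2.txt"]))
```

When no candidate shares a basename, every score is zero and the choice
falls to whichever source happens to be examined first.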

By using "-C", we will typically have more rename sources available, and
we may therefore process the possible pairs in a different order. Since
our name heuristic is largely useless, our results depend on that order.
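
The order dependence can be shown with a small simulation. Again this is
only a sketch under assumed rules (greedy matching, each source used
once, first candidate wins ties); greedy_pairs(), flat() and the paths
are all invented for the example and do not mirror git's internals.

```python
def greedy_pairs(sources, dests, score):
    """Pair each destination with the best unused source, in list order."""
    used = set()
    pairs = {}
    for dst in dests:
        best, best_score = None, -1
        for src in sources:
            if src in used:
                continue
            s = score(src, dst)
            if s > best_score:
                best, best_score = src, s
        if best is not None:
            used.add(best)
            pairs[dst] = best
    return pairs


def flat(src, dst):
    """A useless name heuristic: every pair scores the same."""
    return 0


# Rename-only view: two removed files, two added files.
rename_only = greedy_pairs(["old/a", "old/b"], ["new/x", "new/y"], flat)
# Copy view: one extra candidate source with the same content.
with_copies = greedy_pairs(["keep/k", "old/a", "old/b"],
                           ["new/x", "new/y"], flat)
print(rename_only)
print(with_copies)
```

With one extra candidate in the list, both destinations end up paired
with different sources, even though nothing about the files themselves
changed.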

I think the real solution is to improve the name heuristic. Something
like an edit distance would make more sense (though I think it is not as
simple as an edit distance across the whole pathname, as moving a
basename across directories should probably be preferred to changing the
filename inside a directory).
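
One possible shape for such a heuristic, purely as a sketch: compute the
edit distance of the basename and of the directory separately, and
compare them as a tuple so that the basename always dominates. The
lexicographic ordering is my assumption for illustration, not a worked
out proposal, and path_cost() and best_source() are hypothetical names.

```python
import posixpath


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def path_cost(src, dst):
    """Lower is better; basename distance outranks directory distance."""
    src_dir, src_base = posixpath.split(src)
    dst_dir, dst_base = posixpath.split(dst)
    return (levenshtein(src_base, dst_base), levenshtein(src_dir, dst_dir))


def best_source(dst, sources):
    """Return the cheapest source for dst; the first one wins ties."""
    return min(sources, key=lambda src: path_cost(src, dst))


# Keeping the basename beats staying in the same directory.
print(best_source("bar/Makefile", ["bar/foo.c", "foo/Makefile"]))
```

Unlike a plain basename comparison, this still yields a graded answer
when no basenames match exactly, so near-identical names would pair up
regardless of the order in which candidates are processed.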

Largely I think nobody has cared much because this only comes up when
you move multiple identical files. Quite often there is a minor
difference even between very similar files, and that is enough to come
up with sane results.

-Peff