On 05/17/2013 01:50 PM, Martin Langhoff wrote:
> On Fri, May 17, 2013 at 5:10 AM, Michael Haggerty <mhag...@alum.mit.edu> wrote:
>> For one-time imports, the fix is to use a tool that is not broken, like
>> cvs2git.
> As one of the earlier maintainers of cvsimport, I do believe that
> cvs2git is less broken, for one-shot imports, than cvsimport. Users
> looking for a one-shot import should not use cvsimport as there are
> better options there. Myself, I have used parsecvs (long ago, so
> perhaps it isn't the best of the crop nowadays).
> TBH, I am puzzled and amused at all the chest-thumping about cvs
> importers. Yeah, yours is a bit better or saner, but we all wade in
> the muddle of essentially broken data. So "is not broken" is rather
> misleading when talking to end users. It carries so many caveats about
> whether it'll work on the users' particular repo that it is not a
> generally truthful statement.

I disagree.  I use the following definition of "correct":

    The Git history output by an importer must not contradict the
    history that is recorded in CVS.

We both know that the CVS history omits important data, and that the
history is mutable, etc.  So there are lots of hypothetical histories
that do not contradict CVS.  But some things are recorded unambiguously
in the CVS history, like

* The contents at any tag or the tip of any branch (i.e., what is in the
working tree when you check it out).

* The order of modifications to a single file on a single branch and the
file contents after each of those revisions.

* Who committed a particular change, and approximately when (modulo
clock skew).

If a tool doesn't get these things correct (especially the first!) then
it should only be used with great caution.  cvsimport can make mistakes
on the first two.  As far as I know, cvs2svn/cvs2git are correct
according to this definition.
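
The first criterion (contents at a tag or branch tip) is the easiest to check mechanically. A minimal sketch in Python, assuming you have already checked out the same tag once from CVS and once from the converted Git repository into two separate directories. The function names and the set of ignored directories are illustrative, not part of cvs2git or cvsimport:

```python
import hashlib
import os

# Version-control metadata that legitimately differs between the two checkouts.
IGNORED_DIRS = {"CVS", ".git"}


def tree_digest(root):
    """Map each relative file path under root to the SHA-1 of its contents."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune metadata directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            with open(path, "rb") as fh:
                digests[rel] = hashlib.sha1(fh.read()).hexdigest()
    return digests


def compare_checkouts(cvs_dir, git_dir):
    """Return (only_in_cvs, only_in_git, differing) as sorted lists of paths."""
    cvs = tree_digest(cvs_dir)
    git = tree_digest(git_dir)
    only_in_cvs = sorted(set(cvs) - set(git))
    only_in_git = sorted(set(git) - set(cvs))
    differing = sorted(p for p in set(cvs) & set(git) if cvs[p] != git[p])
    return only_in_cvs, only_in_git, differing
```

Calling `compare_checkouts("/tmp/cvs-checkout", "/tmp/git-checkout")` (hypothetical paths) should return three empty lists if the importer got that tag right. Expect some noise from things outside the definition above, such as CVS keyword expansion or ignore files, so both checkouts should use the same keyword mode before any difference is blamed on the importer.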

That being said, I appreciate that cvsimport can do incremental imports.
cvs2git doesn't even attempt it.  I've thought about what it would take
to implement correct incremental imports in cvs2svn/cvs2git, and it is
far beyond the budget of time that I have for the project.  So I
definitely give props to cvsimport for attempting incremental imports
and apparently often doing a good enough job that it is useful to people.

> [...]
> At the time, I looked into trying to use cvs2svn (precursor to
> cvs2git) as the "CVS read" side of cvsimport, but it did not support
> incremental imports at all, and it took forever to run.

cvs2svn still doesn't support incremental imports, and it still takes a
long time to run (though less than before).  cvs2git is considerably
faster, partly because of the speed and convenience of using
git-fast-import.  But conversion time is much less of an issue for
one-time conversions.

> It was a time when git was new and people were dipping their toes in
> the pool, and some developers were pining to use git on projects that
> used CVS (like we use git-svn now). Incremental imports were a must.
> One of the nice features of cvsimport is that it can do incrementals
> on a repo imported with another tool. That earns it a place under the
> sun. If it didn't have that, I'd be voting for removal (after a review
> that the replacement *is* actually better ;-) across a number of test
> repos).

Incremental imports are indeed the saving grace of cvsimport and for
that reason I don't advocate its removal.  But I think we should be
clearer about warning users against using it for one-time imports,
because it can produce output that is *objectively* incorrect in
important ways.

Regarding tests, the failing tests that I added to the cvsimport test
suite a few years ago were taken directly from the cvs2svn/cvs2git test
suite, where they pass :-)


Michael Haggerty