On Mon, Jan 21, 2013 at 06:28:53AM -0500, Eric S. Raymond wrote:
> John Keeping <[email protected]>:
>> But this is nothing more than a sticking plaster that happens to do
>> enough in this particular case
>
> I'm beginning to think that's the best outcome we ever get in this
> problem domain...

I don't think we can ever get a perfect outcome, but it should be
possible to do a little bit better without too much effort.

>> - if the Git repository happened to be on
>> a different branch, the start date would be wrong and too many or too
>> few commits could be output. Git doesn't detect that the commits are
>> identical to some that we already have because we're explicitly telling
>> it to make a new commit with the specified parent.
>
> Then I don't understand the actual failure case. Either that or you
> don't understand the effect of -i. Have you actually experimented with
> it? The reason I suspect you don't understand the feature is that it
> shouldn't make any difference to the way -i works which repository branch is
> active at the time of the second import.
>
> Here is how I model what is going on:
>
> 1. We make commits to multiple branches of a CVS repo up to some given time T.
>
> 2. We import it, ending up with a collection of git branches all of which
> have tip commits dated T or earlier. And *every* commit dated T or earlier
> gets copied over.
>
> 3. We make more commits to the same set of branches in CVS.
>
> 4. We now run cvsps -d T on the repo. This generates an incremental
> fast-import stream describing all CVS commits *newer* than T (see
> the cvsps manual page).

This is the problem step. There are two scenarios that have problems:

1. If I create a new development branch in my Git repository and commit
   something to it then git-cvsimport-3 will pass a time to cvsps that
   is newer than the actual time of the last import, so T is wrong.

   It may be possible to fix this case purely in git-cvsimport-3.

2. If the branch I have checked out is not the newest CVS branch, then
   git-cvsimport-3 will pass a value of T that is before the time of the
   last import. This case is more subtle but it results in unwanted
   duplicate commits since git-fast-import will just do what it's told
   and create the new commits.
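To make the two scenarios concrete, here is a toy model in Python. It
assumes T is simply taken from the tip of whichever branch is checked
out; that is my reading of the behaviour described above, not a copy of
the git-cvsimport-3 code, and the branch names, times and function names
are all made up for illustration.

```python
# Toy model: each branch maps to the commit time of its tip.
# Nothing here talks to Git or CVS; all names and times are hypothetical.

def cutoff_from_head(branch_tips, head):
    """Assumed behaviour: T is the tip time of the checked-out branch."""
    return branch_tips[head]


def last_import_time(branch_tips, cvs_branches):
    """The T we actually want: newest tip among branches that came from CVS."""
    return max(branch_tips[b] for b in cvs_branches)


tips = {"master": 5, "old-release": 2, "my-topic": 9}
cvs_branches = ["master", "old-release"]

wanted = last_import_time(tips, cvs_branches)     # time of the last import
too_new = cutoff_from_head(tips, "my-topic")      # scenario 1: local commit
too_old = cutoff_from_head(tips, "old-release")   # scenario 2: stale branch

print(wanted, too_new, too_old)
```

With "my-topic" checked out the cutoff is later than the last import, so
CVS commits in between would be skipped; with "old-release" checked out
it is earlier, so already-imported commits would be emitted again.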

So if we have the following commits:

    commit1 at time 1
    commit2 at time 2
    commit3 at time 3

and I call "cvsps -d 2 -i" I end up with the series:

    commit1 at time 1
    commit2 at time 2
    commit3 at time 3
    commit2 at time 2 - effectively reverting the previous commit
    commit3 at time 3 - a duplicate
    ... and potentially genuinely new commits

This is demonstrated by running the Git test t9650.

I also disagree that cvsps outputs commits *newer* than T since it will
also output commits *at* T, which is what I changed with the patch in my
previous message. This fixes the duplicate commit2 in the series above,
but not the duplicate commit3.
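The series above can be reproduced with a small simulation. This is only
a sketch under stated assumptions: commits are (name, time) pairs, the
hypothetical cvsps_incremental() stands in for "cvsps -d T -i" with a
switch for the at-or-newer versus strictly-newer comparison, and the
hypothetical fast_import() appends whatever it is given, which is how I
understand git-fast-import to behave when it is told the parent.

```python
# Toy model only; neither function is real cvsps or git-fast-import code.

def cvsps_incremental(cvs_commits, t, inclusive=True):
    """Return the commits an incremental run would emit for cutoff t.

    inclusive=True models emitting commits *at* t as well as newer ones;
    inclusive=False models emitting strictly newer commits only.
    """
    if inclusive:
        return [c for c in cvs_commits if c[1] >= t]
    return [c for c in cvs_commits if c[1] > t]


def fast_import(branch, stream):
    """Append every commit in the stream to the branch tip, no deduplication."""
    return branch + list(stream)


cvs = [("commit1", 1), ("commit2", 2), ("commit3", 3)]
imported = fast_import([], cvs)  # the first, full import

# T taken from the wrong branch: 2 instead of the real last import time 3.
stale_inclusive = fast_import(imported, cvsps_incremental(cvs, 2, inclusive=True))
stale_exclusive = fast_import(imported, cvsps_incremental(cvs, 2, inclusive=False))

# T equal to the real last import time, with the strictly-newer comparison.
correct = fast_import(imported, cvsps_incremental(cvs, 3, inclusive=False))

print([name for name, _ in stale_inclusive])
print([name for name, _ in stale_exclusive])
print([name for name, _ in correct])
```

The strictly-newer comparison removes the repeated commit2 but the
repeated commit3 stays as long as T is stale; only a correct T gives a
clean result.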
> 5. That stream should consist of a set of disconnected branches, each
> (because of -i) beginning with a root commit containing "from
> refs/heads/foo^0" which says to parent the commit on the tip of
> branch foo, whatever that happens to be. (I don't have to guess
> about this, I tested the feature before shipping.)
>
> 6. Now, when git fast-import interprets that stream in the context of
> the repository produced in step 2, for each branch in the
> incremental dump the branch root commit is parented on the tip
> commit of the same branch in the repo.
>
> At step 6, it shouldn't matter at all which branch is active, because
> where an incremental branch root gets attached has nothing to do with
> which branch is active.
>
> It is sufficient to avoid duplicate commits that cvsps -d 0 -d T and
> cvsps -d T run on the same CVS repo operate on *disjoint sets* of CVS
> file commits. I can see this technique possibly getting confused if T
> falls in the middle of a changeset where the CVS timestamps for the
> file commits are out of order. But that's the same case that will
> fail if we're importing at file-commit granularity, so there's no new
> bug here.
>
> Can you explain at what step my logic is incorrect?

Your logic is correct - for cvsps - the problem is where T comes from.

Perhaps it is simplest to just save a CVS_LAST_IMPORT_TIME file in
$GIT_DIR and not worry about it any more.
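Something along these lines might be enough. It is a minimal sketch, not
a patch: the helper names are hypothetical, the caller is assumed to
pass in the path of $GIT_DIR, and storing the time as integer seconds
since the epoch on a single line is my assumption about the file format.

```python
from pathlib import Path

STAMP_NAME = "CVS_LAST_IMPORT_TIME"  # file name suggested above


def read_last_import_time(git_dir):
    """Return the saved import time as an int, or None if never imported."""
    stamp = Path(git_dir) / STAMP_NAME
    try:
        return int(stamp.read_text().strip())
    except FileNotFoundError:
        return None


def write_last_import_time(git_dir, when):
    """Record the import time, replacing any previous value."""
    stamp = Path(git_dir) / STAMP_NAME
    tmp = stamp.with_name(stamp.name + ".tmp")
    tmp.write_text(f"{int(when)}\n")
    # Rename over the old file so a crash never leaves a half-written stamp.
    tmp.replace(stamp)
```

The importer would read the file before running cvsps and write it only
after git-fast-import has succeeded, so T no longer depends on which
branch is checked out.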
John