Hi,

I recently downloaded Tailor to convert a very large CVS repository (~2 million
lines of code, 6 years of revision history and about 100,000 discrete CVS
patches) to Mercurial.  I used the cvs backend for CVS and the hg backend for
Mercurial, with Tailor version 0.9.22.

Firstly, thanks for an excellent tool!

I tried two different processes:
1.  I converted our HEAD branch.  This was relatively straightforward to do 
    (after following the instructions).
2.  I converted a branch in our repository.  This was more difficult (see 
    below).

I had a couple of small problems with number 1:
- There is a copy-and-paste typo in the mercurial backend:
diff -r tailor-0.9.22/vcpx/hg.py ../tailor-0.9.22/vcpx/hg.py
20c20
< class HglibWorkingDir(UpdatableSourceWorkingDir, SyncronizableTargetWorkingDir):
---
> class HgWorkingDir(UpdatableSourceWorkingDir, SyncronizableTargetWorkingDir):
- We happen to have a bunch of $Id:$-type keywords in our CVS repository, in
    files that have a lot of commits.  If we try to merge changes between the
    version on the branch and the HEAD, then we get mostly spurious conflicts
    (files that changed in both, and can be merged, except for the ID field).
    To avoid this, I added the "-kk" flag to the CVS update and checkout
    commands, which normalizes these fields:

diff -U 3 -r tailor-0.9.22/vcpx/cvsps.py ../tailor-0.9.22/vcpx/cvsps.py
--- tailor-0.9.22/vcpx/cvsps.py 2006-05-01 17:54:33.000000000 -0400
+++ ../tailor-0.9.22/vcpx/cvsps.py      2006-05-27 22:19:16.000000000 -0400
@@ -223,7 +223,7 @@
                 rmtree(join(self.basedir, e.name))
             else:
                 cmd = self.repository.command("-d", "%(repository)s",
-                                              "-q", "update", "-d",
+                                              "-q", "update", "-kk", "-d",
                                               "-r", e.new_revision)
                 cvsup = ExternalCommand(cwd=self.basedir, command=cmd)
                 retry = 0
@@ -297,7 +297,7 @@
             parentdir, subdir = split(self.basedir)
             cmd = self.repository.command("-q",
                                           "-d", self.repository.repository,
-                                          "checkout",
+                                          "checkout", "-kk",
                                           "-d", subdir)
             if revision:
                 cmd.extend(["-r", revision])

- CVS sleeps until the clock second changes before it exits (which means at
    most one operation per second, which is frustrating with a hundred
    thousand CVS changesets).  To avoid this, I rsync'd a new copy of the
    repository (so that there was no possibility of corruption) and then
    modified CVS to not sleep at the end.

--- BUILD/cvs-1.11.19/src/update.c      2005-01-31 17:18:01.000000000 -0500
+++ BUILD/cvs-1.11.19-nowait/src/update.c       2006-05-27 21:52:27.621272500 -0400
@@ -525,11 +525,11 @@
 #endif

     /* see if we need to sleep before returning to avoid time-stamp races */
     if (last_register_time)
     {
-       sleep_past (last_register_time);
+       /* sleep_past (last_register_time); */
     }

     return err;
 }
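
To illustrate the $Id:$ conflicts mentioned above, here is a minimal Python
sketch of the kind of normalization that "-kk" performs: expanded keywords are
collapsed back to their bare names, so two versions that differ only in the
keyword value compare equal.  The function name and the keyword list are my
own assumptions for illustration; this is not code from CVS or Tailor.

```python
import re

# Common RCS/CVS keywords.  This list is an assumption and may not be
# exhaustive for every CVS installation.
_KEYWORDS = "Author|Date|Header|Id|Locker|Name|RCSfile|Revision|Source|State"

# Matches an expanded keyword such as "$Id: main.cc,v 1.7 ... Exp $".
_EXPANDED = re.compile(r"\$(%s):[^$\n]*\$" % _KEYWORDS)


def collapse_keywords(text):
    """Replace expanded keywords with their bare form, e.g. "$Id$"."""
    return _EXPANDED.sub(r"$\1$", text)


# Two hypothetical versions of a file that differ only in the Id field.
head = "// $Id: main.cc,v 1.7 2006/05/01 12:00:00 jeremy Exp $\nint main() {}\n"
branch = "// $Id: main.cc,v 1.2.106.3 2006/04/02 09:30:00 jeremy Exp $\nint main() {}\n"

# After collapsing, the two versions are identical, so there is nothing
# left for a merge to conflict on.
same = collapse_keywords(head) == collapse_keywords(branch)
```

With the keywords collapsed, only real content changes remain for the merge to
look at.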

I had more problems with number 2:
- I had accidentally put a "/" on the end of the CVS module name, which meant
    the first character of each path was truncated (e.g., I got
    "pplications/main.cc" instead of "applications/main.cc").  I fixed this by
    removing the "/", but it would be good to mention this in the
    documentation or (preferably) check for it and remove it.
- CVS has a bug in the rlog command when using the -r:branch syntax, which
    leads to changesets on the wrong branch being listed in the log.  For
    example:

cvs -f -d /export/cvsroot-copy rlog -r:prod algorithms/machine_learning/mlp/random.cc

RCS file: /export/cvsroot-copy/algorithms/machine_learning/mlp/random.cc,v
head: 1.2
branch:
locks: strict
access list:
symbolic names:
        prod: 1.2.0.106
        mingw32-branch: 1.2.0.20
keyword substitution: kv
total revisions: 3;     selected revisions: 1
description:
----------------------------
revision 1.2.20.1
date: 2004/05/31 22:45:58;  author: jeremy;  state: Exp;  lines: +8 -2
* Initial mingw32 work; most things compile now

Here revision 1.2.20.1 is on mingw32-branch (1.2.0.20), not on prod
(1.2.0.106), even though only prod was requested.  This would lead to a
"Something went wrong: unable to determine the exact upstream revision of the
checked out tree" error, since the date of the incorrect revision was much
earlier than the date of the branch (earlier this year).

I fixed this by filtering the revisions as they were parsed (i.e., by not
trusting CVS):

diff -U 3 -r tailor-0.9.22/vcpx/cvsps.py ../tailor-0.9.22/vcpx/cvsps.py
--- tailor-0.9.22/vcpx/cvsps.py 2006-05-01 17:54:33.000000000 -0400
+++ ../tailor-0.9.22/vcpx/cvsps.py      2006-05-27 21:11:58.000000000 -0400
@@ -401,8 +425,20 @@
                 cs = self.__parseRevision(entry)
                 if cs is None:
                     break
+
                 date,author,changelog,e,rev,state,newentry = cs

+                # CVS seems to sometimes mess up what it thinks the branch is...
+                if not cvs_revs_same_branch(normalize_cvs_rev(rev), branchnum):
+                    self.log.warning("skipped revision %s on entry %s "
+                                     "as revision didn't match branch revision %s "
+                                     "for branch %s"
+                                     % (str(normalize_cvs_rev(rev)), entry, str(branchnum),
+                                        str(branch)))
+                    expected_revisions -= 1
+                    continue
+
+
                 # Skip spurious entries added in a branch
                 if not (rev == '1.1' and state == 'dead' and
                         changelog.startswith('file ') and
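
The hunk above calls normalize_cvs_rev and cvs_revs_same_branch, whose
definitions are not included in this message.  Below is a minimal sketch of
what such helpers might look like, assuming CVS's usual numbering, where a
branch tag is stored as a "magic" number such as 1.2.0.106 and the revisions
on that branch are numbered 1.2.106.N.  The bodies are a guess for
illustration only, not the code from the actual patch.

```python
def normalize_cvs_rev(rev):
    """Parse a CVS revision string into a tuple of ints.

    Magic branch numbers (an even number of components with a 0 in the
    second-to-last position, e.g. "1.2.0.106") are converted to the real
    branch number (1, 2, 106).  Ordinary revisions are left unchanged.
    """
    parts = tuple(int(p) for p in rev.split("."))
    if len(parts) >= 4 and len(parts) % 2 == 0 and parts[-2] == 0:
        parts = parts[:-2] + parts[-1:]
    return parts


def cvs_revs_same_branch(rev, branchnum):
    """Return True if the normalized revision lies on the given branch.

    A revision is on a branch when dropping its last component yields
    the branch number.
    """
    return rev[:-1] == branchnum


# Values taken from the rlog output shown earlier in this message.
prod = normalize_cvs_rev("1.2.0.106")    # branch number of "prod"
stray = normalize_cvs_rev("1.2.20.1")    # revision that rlog wrongly listed
on_prod = cvs_revs_same_branch(stray, prod)
```

For the rlog output above, prod normalizes to (1, 2, 106), while revision
1.2.20.1 sits on branch (1, 2, 20), so the check fails and the revision is
skipped.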

I hope that this can save somebody some frustration!  As for the patch to filter
the CVS revisions: I think that this would be a good thing to add to the
official distribution; from Googling I see that I'm not the only person to think
that Tailor didn't work on branches (when it was in fact a CVS bug)!

Cheers,
Jeremy