On 06.07.2011 18:35, Herbert Duerr wrote:
 > there is another tool, a HG extension called hg-git, which can
 > convert HG bookmarks to git branches.
 > http://hg-git.github.com/

Great find! I was already brushing up my Python and Mercurial internals
skills to extend hg-fast-export's export_commit() for our one big
hg-repo with one hg-branch and many hg-bookmarks. I'm glad that cup passed.

 > so my current plan is this:
 > 1. convert OOO340 repo to git via hg-fast-export.sh
 > 2. pull all CWSes into OOO340 repo and create bookmarks
 > 3. use hg-git to push all of them into the converted git repo

Sounds good!

Unfortunately it won't work :(

The problem is that hg-fast-export.sh and hg-git don't quite agree on what a converted git repo should look like. I tried it out with a trivial repo and 2 branches and, surprisingly, it worked. But then I tried it on a real repo (the hg-git one): pick some arbitrary changeset, convert up to it with hg-fast-export.sh, and then the hg-git push of the rest fails...

The next thing I tried was converting the whole thing via hg-git push, but unfortunately they weren't lying when they used the word "slow". After a couple of hours it had converted one percent of the changesets, and the progress output predicted an ETA of >8 days.

So then I took a deeper look at the hg-fast-export code, and it seems surprisingly easy to hack it to do something with HG bookmarks.

Basically, a HG bookmark points at a single revision and is thus quite similar to a git "ref". hg-fast-export writes a header with a branch name for every changeset: "commit refs/heads/$branchname"

So I detect all the HG heads that are marked by bookmarks and use the bookmark name as the branch name.
A bookmark for the head that corresponds to OOO340 is also necessary.
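
A minimal sketch of that idea in plain Python, without the Mercurial API. The node ids, the bookmark name "cws-foo" and the helper names pick_branch() and commit_header() are made up for illustration; only the "commit refs/heads/..." header and the node-to-bookmark lookup come from the description above and the attached patch.

```python
def pick_branch(node, hg_branch, bookmarks):
    """Return the git branch name for one changeset.

    bookmarks maps node id -> bookmark name; a bookmarked node takes the
    bookmark name, everything else keeps the branch it already had.
    """
    return bookmarks.get(node) or hg_branch


def commit_header(branch):
    """Build the fast-import header line written for each changeset."""
    return "commit refs/heads/%s" % branch


# Hypothetical data: bookmarks are stored as name -> node, so they are
# inverted into node -> name for the per-changeset lookup.
name_to_node = {"OOO340": "a1", "cws-foo": "b2"}
bookmarks = dict((node, name) for name, node in name_to_node.items())

print(commit_header(pick_branch("b2", "dummy", bookmarks)))  # commit refs/heads/cws-foo
print(commit_header(pick_branch("c3", "dummy", bookmarks)))  # commit refs/heads/dummy
```

Note that inverting the mapping keeps only one name per node, so two bookmarks on the same changeset would yield a single branch in this sketch.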

Then invoke it like this:

mkdir git
cd git
git init
hg-fast-export.sh -r ../hg -M dummy

The "-M dummy" option sets the default branch name, so all changesets that are _not_ bookmarked heads end up on the "dummy" branch, and for every head/bookmark a branch pointing to that head is created.
The "dummy" branch/ref can be deleted after the conversion.
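
To illustrate why the "dummy" ref is disposable, here is a rough simulation of where the refs end up. It assumes (as git fast-import does for its commit command) that each "commit refs/heads/X" moves ref X to the newly created commit; parent links are ignored, and the revision ids and bookmark names are invented.

```python
def final_refs(revisions, bookmarks, default="dummy"):
    """Simulate the ref positions after exporting revisions in order.

    revisions: node ids in export order.
    bookmarks: node id -> bookmark name.
    """
    refs = {}
    for node in revisions:
        branch = bookmarks.get(node, default)
        # Each commit on a branch moves that branch's ref to the new commit.
        refs["refs/heads/" + branch] = node
    return refs


history = ["r0", "r1", "r2", "r3", "r4"]          # hypothetical changesets
bookmarks = {"r3": "cws-foo", "r4": "OOO340"}     # hypothetical bookmarks

refs = final_refs(history, bookmarks)
print(refs["refs/heads/dummy"])    # r2, the last changeset without a bookmark
print(refs["refs/heads/OOO340"])   # r4

# After the conversion the placeholder ref is no longer needed.
refs.pop("refs/heads/dummy", None)
```

In the real repository the equivalent cleanup would be something like "git branch -D dummy", which is safe only as long as every changeset is still reachable from one of the bookmark branches.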

I don't understand git very well, but I hope this works :)

The tool already errors out if there is more than one head without a branch/bookmark.
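
A sketch of that check, modelled loosely on the loop in verify_heads() from the attached patch but using plain dicts instead of a Mercurial repo. The function name unnamed_heads() and the sample heads are hypothetical.

```python
def unnamed_heads(heads, bookmarks, default="dummy"):
    """Return the heads whose branch name is already claimed by another head.

    heads: node ids of all repository heads.
    bookmarks: node id -> bookmark name.
    """
    seen = {}
    clashes = []
    for node in heads:
        branch = bookmarks.get(node, default)
        if branch in seen:
            clashes.append(node)   # a second head on the same branch name
        else:
            seen[branch] = node
    return clashes


heads = ["h1", "h2", "h3"]
print(unnamed_heads(heads, {"h1": "OOO340", "h2": "cws-foo"}))  # []
print(unnamed_heads(heads, {"h1": "OOO340"}))                   # ['h3']
```

With one bookmark per head, at most one head falls back to the default name and the check passes; a second unbookmarked head collides on the default name and is reported.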

The attached patch adds just 3 lines of actual code plus some parameter shuffling. (One case that may not be handled properly is a HG branch and a HG bookmark with the same name, but we don't have HG branches at all...)

I started the conversion yesterday evening, and maybe it'll be finished today (on my 3-year-old laptop...)

I'd clarify step one to "convert OOO340 repo to a bare git repo via
hg-fast-export.sh". After all these steps please don't forget git
pack-refs and git repack (e.g. with "-a -d -f --window=200
--depth=1000") to get a nice and tight repository.
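
One possible way to script that cleanup, as a sketch only: the helper names tidy_commands() and tidy() are invented, and the "--all" flag for git pack-refs is my assumption rather than part of the suggestion above. The repack options are the ones quoted.

```python
import subprocess


def tidy_commands(window=200, depth=1000):
    """Return the git command lines for the post-conversion cleanup."""
    return [
        # "--all" is an assumption: pack every ref, not only already packed ones.
        ["git", "pack-refs", "--all"],
        ["git", "repack", "-a", "-d", "-f",
         "--window=%d" % window, "--depth=%d" % depth],
    ]


def tidy(repo_dir):
    """Run the cleanup commands inside the converted repository."""
    for cmd in tidy_commands():
        subprocess.check_call(cmd, cwd=repo_dir)
```

Calling tidy("git") from the directory that contains the converted repository would run both commands in turn; the large window and depth values trade a long repack time for a smaller pack.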

Herbert
diff --git a/hg-fast-export.py b/hg-fast-export.py
index 519b556..6a66a24 100755
--- a/hg-fast-export.py
+++ b/hg-fast-export.py
@@ -149,7 +149,7 @@ def sanitize_name(name,what="branch"):
     sys.stderr.write('Warning: sanitized %s [%s] to [%s]\n' % (what,name,n))
   return n
 
-def export_commit(ui,repo,revision,old_marks,max,count,authors,sob,brmap):
+def export_commit(ui,repo,revision,old_marks,max,count,bookmarks,authors,sob,brmap):
   def get_branchname(name):
     if brmap.has_key(name):
       return brmap[name]
@@ -157,7 +157,7 @@ def export_commit(ui,repo,revision,old_marks,max,count,authors,sob,brmap):
     brmap[name]=n
     return n
 
-  (revnode,_,user,(time,timezone),files,desc,branch,_)=get_changeset(ui,repo,revision,authors)
+  (revnode,_,user,(time,timezone),files,desc,branch,_)=get_changeset(ui,repo,revision,bookmarks,authors)
 
   branch=get_branchname(branch)
 
@@ -257,7 +257,7 @@ def load_authors(filename):
   sys.stderr.write('Loaded %d authors\n' % l)
   return cache
 
-def verify_heads(ui,repo,cache,force):
+def verify_heads(ui,repo,cache,bookmarks,force):
   branches=repo.branchtags()
   l=[(-repo.changelog.rev(n), n, t) for t, n in branches.items()]
   l.sort()
@@ -275,7 +275,7 @@ def verify_heads(ui,repo,cache,force):
   # verify that branch has exactly one head
   t={}
   for h in repo.heads():
-    (_,_,_,_,_,_,branch,_)=get_changeset(ui,repo,h)
+    (_,_,_,_,_,_,branch,_)=get_changeset(ui,repo,h,bookmarks)
     if t.get(branch,False):
       sys.stderr.write('Error: repository has at least one unnamed head: hg r%s\n' %
           repo.changelog.rev(h))
@@ -294,7 +294,9 @@ def hg2git(repourl,m,marksfile,mappingfile,headsfile,tipfile,authors={},sob=Fals
 
   ui,repo=setup_repo(repourl)
 
-  if not verify_heads(ui,repo,heads_cache,force):
+  bookmarks = dict([(repo._bookmarks[bm], bm) for bm in repo._bookmarks])
+
+  if not verify_heads(ui,repo,heads_cache,bookmarks,force):
     return 1
 
   try:
@@ -308,14 +310,14 @@ def hg2git(repourl,m,marksfile,mappingfile,headsfile,tipfile,authors={},sob=Fals
     max=tip
 
   for rev in range(0,max):
-       (revnode,_,_,_,_,_,_,_)=get_changeset(ui,repo,rev,authors)
+       (revnode,_,_,_,_,_,_,_)=get_changeset(ui,repo,rev,bookmarks,authors)
        mapping_cache[revnode.encode('hex_codec')] = str(rev)
 
 
   c=0
   brmap={}
   for rev in range(min,max):
-    c=export_commit(ui,repo,rev,old_marks,max,c,authors,sob,brmap)
+    c=export_commit(ui,repo,rev,old_marks,max,c,bookmarks,authors,sob,brmap)
 
   state_cache['tip']=max
   state_cache['repo']=repourl
diff --git a/hg2git.py b/hg2git.py
index baa41cd..33b3850 100755
--- a/hg2git.py
+++ b/hg2git.py
@@ -68,11 +68,13 @@ def get_branch(name):
     return origin_name + '/' + name
   return name
 
-def get_changeset(ui,repo,revision,authors={}):
+def get_changeset(ui,repo,revision,bookmarks,authors={}):
   node=repo.lookup(revision)
   (manifest,user,(time,timezone),files,desc,extra)=repo.changelog.read(node)
   tz="%+03d%02d" % (-timezone / 3600, ((-timezone % 3600) / 60))
   branch=get_branch(extra.get('branch','master'))
+  if bookmarks.get(node, False):
+      branch = bookmarks[node] # should create refs/heads/bookmarkname
   return (node,manifest,fixup_user(user,authors),(time,tz),files,desc,branch,extra)
 
 def mangle_key(key):
