On 06.07.2011 18:35, Herbert Duerr wrote:
> there is another tool, a HG extension called hg-git, which can
> convert HG bookmarks to git branches.
> http://hg-git.github.com/
Great find! I was already brushing up my python and mercurial internals
skills to extend hg-fast-export's export_commit() for our one big
hg-repo with one hg-branch and many hg-bookmarks. I'm glad that cup passed.
> so my current plan is this:
> 1. convert OOO340 repo to git via hg-fast-export.sh
> 2. pull all CWSes into OOO340 repo and create bookmarks
> 3. use hg-git to push all of them into the converted git repo
Sounds good!
unfortunately it won't work :(
the problem is that hg-fast-export.sh and hg-git don't quite agree on what
a converted git repo should look like.
i tried it out with a trivial repo with 2 branches and, surprisingly, it
worked. but then i tried it on a real repo (the hg-git one): pick some
arbitrary changeset, convert up to that one with hg-fast-export.sh, and
the hg-git push of the rest fails...
the next thing i tried was to convert the whole thing via hg-git push, but
unfortunately they weren't lying when they used the word "slow":
after a couple of hours it had converted one percent of the changesets,
and the progress output predicted an ETA of >8 days.
so then i took a deeper look at the hg-fast-export code, and it
seems surprisingly easy to hack it to do something with HG bookmarks.
basically a HG bookmark points at a single revision, and is thus quite
similar to a git "ref".
hg-fast-export writes a header for every changeset that names a branch:
"commit refs/heads/$branchname"
so i detect all the HG heads that are marked by bookmarks, and just
use the bookmark name as the branch name.
also, a bookmark for the head that corresponds to OOO340 is necessary.
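a minimal sketch of that mapping, using plain dicts instead of a real mercurial repo. the node ids, the bookmark names and the helper names (invert_bookmarks, ref_for) are made up for illustration, and the handling of hg's "default" branch is an assumption about what "-M" does, not code taken from hg-fast-export:

```python
def invert_bookmarks(bookmarks):
    """Map changeset node -> bookmark name (hypothetical helper)."""
    return {node: name for name, node in bookmarks.items()}


def ref_for(node, hg_branch, by_node, default_branch="dummy"):
    """Pick the git ref a changeset is written to.

    A bookmarked node gets its bookmark name. Anything else keeps its
    hg branch, with hg's 'default' branch mapped to default_branch
    (assumption: this mirrors the effect of "-M dummy").
    """
    name = by_node.get(node)
    if name is None:
        name = default_branch if hg_branch == "default" else hg_branch
    return "refs/heads/" + name


# Hypothetical data: two bookmarked heads and one ordinary changeset.
bookmarks = {"OOO340": "aaa111", "cws-foo": "bbb222"}
by_node = invert_bookmarks(bookmarks)

for node in ("aaa111", "bbb222", "ccc333"):
    print("commit " + ref_for(node, "default", by_node))
```

note that inverting the dict keeps only one name per node, so if two bookmarks point at the same changeset only one of them would become a git branch in this sketch.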
then invoke it like this:
mkdir git
cd git
git init
hg-fast-export.sh -r ../hg -M dummy
the "-M dummy" sets the default branch, so all changesets that are _not_
heads end up on the "dummy" branch, and for every head/bookmark a branch
pointing to that head is created.
the "dummy" branch/ref can be deleted after conversion.
i don't understand git very well, but i hope this should work :)
the tool already errors out if there is more than one head without a
branch/bookmark.
the attached patch just adds 3 lines of actual code plus some parameter
shuffling.
(a case that may not be handled properly is if there is a HG branch and
a HG bookmark with the same name, but we don't have HG branches at all...)
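to illustrate the head check and the name clash mentioned above, here is a rough sketch. it is not the real verify_heads(); the function name find_colliding_heads and all node ids are hypothetical, and heads are passed in as plain tuples:

```python
def find_colliding_heads(heads, by_node):
    """Return heads that would end up on the same git branch name.

    heads:   list of (node, hg_branch) tuples, one per repository head
    by_node: dict mapping node -> bookmark name
    """
    seen = {}
    collisions = []
    for node, hg_branch in heads:
        # a bookmark on the head overrides the hg branch name
        name = by_node.get(node, hg_branch)
        if name in seen:
            collisions.append((name, seen[name], node))
        else:
            seen[name] = node
    return collisions


by_node = {"aaa111": "OOO340", "bbb222": "cws-foo"}

# every head but one carries a bookmark: nothing collides
ok = [("aaa111", "default"), ("bbb222", "default"), ("ccc333", "default")]
print(find_colliding_heads(ok, by_node))

# a second head without a bookmark collides on "default"
bad = ok + [("ddd444", "default")]
print(find_colliding_heads(bad, by_node))

# a hg branch and a bookmark sharing one name also collide
clash = [("eee555", "foo"), ("fff666", "default")]
print(find_colliding_heads(clash, {"fff666": "foo"}))
```

the last case is the one i'm not sure the patch handles: the two heads are distinct in HG, but they would both be written to refs/heads/foo.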
i started the conversion yesterday evening, and maybe it'll be finished
today (on my 3 year old laptop...)
I'd clarify step one to "convert OOO340 repo to a bare git repo via
hg-fast-export.sh". After all these steps please don't forget git
pack-refs and git repack (e.g. with "-a -d -f --window=200
--depth=1000") to get a nice and tight repository.
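The cleanup could be scripted roughly like this. It is only a sketch: the function names are made up, "git" is the directory name used in the example invocation earlier in this mail, deleting the "dummy" branch is included because it was described as disposable, and passing "--all" to git pack-refs is my own assumption rather than something stated above.

```python
import subprocess


def cleanup_commands(repo_dir, dummy_branch="dummy"):
    """Return the git commands for post-conversion cleanup, in order."""
    git = ["git", "-C", repo_dir]
    return [
        # drop the placeholder branch created by "-M dummy"
        git + ["branch", "-D", dummy_branch],
        # pack loose refs into a single packed-refs file
        git + ["pack-refs", "--all"],
        # repack with the options suggested above
        git + ["repack", "-a", "-d", "-f", "--window=200", "--depth=1000"],
    ]


def run_cleanup(repo_dir, dummy_branch="dummy"):
    """Run each command in turn, stopping at the first failure."""
    for cmd in cleanup_commands(repo_dir, dummy_branch):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the commands; call run_cleanup("git") to execute them.
    for cmd in cleanup_commands("git"):
        print(" ".join(cmd))
```

Printing first and running second is deliberate: the repack with these window and depth settings can take a long time on a repository of this size, so it is worth checking the commands before starting.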
Herbert
diff --git a/hg-fast-export.py b/hg-fast-export.py
index 519b556..6a66a24 100755
--- a/hg-fast-export.py
+++ b/hg-fast-export.py
@@ -149,7 +149,7 @@ def sanitize_name(name,what="branch"):
sys.stderr.write('Warning: sanitized %s [%s] to [%s]\n' % (what,name,n))
return n
-def export_commit(ui,repo,revision,old_marks,max,count,authors,sob,brmap):
+def export_commit(ui,repo,revision,old_marks,max,count,bookmarks,authors,sob,brmap):
def get_branchname(name):
if brmap.has_key(name):
return brmap[name]
@@ -157,7 +157,7 @@ def export_commit(ui,repo,revision,old_marks,max,count,authors,sob,brmap):
brmap[name]=n
return n
-(revnode,_,user,(time,timezone),files,desc,branch,_)=get_changeset(ui,repo,revision,authors)
+(revnode,_,user,(time,timezone),files,desc,branch,_)=get_changeset(ui,repo,revision,bookmarks,authors)
branch=get_branchname(branch)
@@ -257,7 +257,7 @@ def load_authors(filename):
sys.stderr.write('Loaded %d authors\n' % l)
return cache
-def verify_heads(ui,repo,cache,force):
+def verify_heads(ui,repo,cache,bookmarks,force):
branches=repo.branchtags()
l=[(-repo.changelog.rev(n), n, t) for t, n in branches.items()]
l.sort()
@@ -275,7 +275,7 @@ def verify_heads(ui,repo,cache,force):
# verify that branch has exactly one head
t={}
for h in repo.heads():
- (_,_,_,_,_,_,branch,_)=get_changeset(ui,repo,h)
+ (_,_,_,_,_,_,branch,_)=get_changeset(ui,repo,h,bookmarks)
if t.get(branch,False):
sys.stderr.write('Error: repository has at least one unnamed head: hg r%s\n' % repo.changelog.rev(h))
@@ -294,7 +294,9 @@ def hg2git(repourl,m,marksfile,mappingfile,headsfile,tipfile,authors={},sob=Fals
ui,repo=setup_repo(repourl)
- if not verify_heads(ui,repo,heads_cache,force):
+ bookmarks = dict([(repo._bookmarks[bm], bm) for bm in repo._bookmarks])
+
+ if not verify_heads(ui,repo,heads_cache,bookmarks,force):
return 1
try:
@@ -308,14 +310,14 @@ def hg2git(repourl,m,marksfile,mappingfile,headsfile,tipfile,authors={},sob=Fals
max=tip
for rev in range(0,max):
- (revnode,_,_,_,_,_,_,_)=get_changeset(ui,repo,rev,authors)
+ (revnode,_,_,_,_,_,_,_)=get_changeset(ui,repo,rev,bookmarks,authors)
mapping_cache[revnode.encode('hex_codec')] = str(rev)
c=0
brmap={}
for rev in range(min,max):
- c=export_commit(ui,repo,rev,old_marks,max,c,authors,sob,brmap)
+ c=export_commit(ui,repo,rev,old_marks,max,c,bookmarks,authors,sob,brmap)
state_cache['tip']=max
state_cache['repo']=repourl
diff --git a/hg2git.py b/hg2git.py
index baa41cd..33b3850 100755
--- a/hg2git.py
+++ b/hg2git.py
@@ -68,11 +68,13 @@ def get_branch(name):
return origin_name + '/' + name
return name
-def get_changeset(ui,repo,revision,authors={}):
+def get_changeset(ui,repo,revision,bookmarks,authors={}):
node=repo.lookup(revision)
(manifest,user,(time,timezone),files,desc,extra)=repo.changelog.read(node)
tz="%+03d%02d" % (-timezone / 3600, ((-timezone % 3600) / 60))
branch=get_branch(extra.get('branch','master'))
+ if bookmarks.get(node, False):
branch = bookmarks[node] # should create refs/heads/bookmarkname
return (node,manifest,fixup_user(user,authors),(time,tz),files,desc,branch,extra)
def mangle_key(key):