I've spent a bit of holiday hacking time working on a git_export command for
monotone, more as an experiment than anything else. I've committed the
result to net.venge.monotone.fast-export for people to have a look at.
There's probably not much preventing this from landing on mainline, other
than some documentation and possibly tests, although I'm not really sure how
we would want to go about testing it beyond what I've already done. The fun
part about a command like this is that I expect most users of it would
expect to be their own testers, verifying their own conversions and such.

This successfully (I think) converts the entire monotone database with 276
branches (more or less what you get when you pull '*' from monotone.ca) to a
git repository. Here are some details on the conversion:

exported monotone database
- 174MB in size
- 276 branches
- 127 tags (with one duplicate name, monotone-viz-1.0.1-1)
- export time 83m42.134s (on a 2.0GHz pentium-m laptop)
- export file size 2.9GB
- 15245 revisions exported

imported git repository
- 719MB in size (before being repacked)
- import time 23m15.463s
- repack -adf time 3m14.385s
- packed repository size 60MB
- 277 branches (the extra one is "master")
- 126 tags (missing the duplicate above)

Three exported branch names, "net.prjek:tester",
"net.prjet:tester/drop-for-propagate" and "prjek.net:tester", were changed
(with sed) during the import process because git does not allow colons (and
various other characters) in branch/ref names. I simply changed ":" and "/"
in these names to ".". The "/" should have worked, but it did cause an
error of some sort.
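
For anyone wanting to script the same renaming, here is a minimal sketch in
Python. It is not the sed command used for the conversion above, and the
function name sanitize_ref_name is made up for illustration. It only maps
":" and "/" to "."; git rejects other characters too, and those are not
handled here.

```python
def sanitize_ref_name(name: str) -> str:
    """Replace ':' and '/' with '.', mirroring the substitution described above.

    Hypothetical helper: git forbids more than these two cases, so this is
    not a complete ref-name validator.
    """
    return name.replace(":", ".").replace("/", ".")


if __name__ == "__main__":
    for branch in ("net.prjek:tester", "prjek.net:tester"):
        print(branch, "->", sanitize_ref_name(branch))
```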

The conversion was verified by checking out each of the 276 branches and 126
tags from both git and mtn and comparing the resulting workspaces. The
script I used to do this verification was a bit dumb and failed to check out
a few revisions, so these weren't compared. Using only the branch name failed
in some cases because there were multiple heads, and using only a tag name
failed in some cases because the tagged revisions had no branch certs. All
of the branches and tags that did check out were identical according to diff
-qr, so I'm reasonably confident that the new exporter basically works.
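
The comparison step can also be sketched in Python, roughly equivalent to
diff -qr on two checked-out workspaces. This is not the verification script
mentioned above. The function name tree_differences is hypothetical, and
skipping the ".git" and "_MTN" bookkeeping directories is an assumption
about how the two workspaces are laid out.

```python
import filecmp
import os


def tree_differences(left, right, ignore=(".git", "_MTN")):
    """Return sorted relative paths that differ between two directory trees.

    The ignore list (assumed bookkeeping directories) is skipped at every level.
    """
    diffs = []

    def walk(rel):
        l_dir = os.path.join(left, rel)
        r_dir = os.path.join(right, rel)
        cmp = filecmp.dircmp(l_dir, r_dir, ignore=list(ignore))
        # Names present on one side only, or of different types on each side.
        for name in cmp.left_only + cmp.right_only + cmp.common_funny:
            diffs.append(os.path.join(rel, name))
        # Compare file contents byte for byte rather than trusting stat data.
        _, mismatch, errors = filecmp.cmpfiles(
            l_dir, r_dir, cmp.common_files, shallow=False
        )
        for name in mismatch + errors:
            diffs.append(os.path.join(rel, name))
        for name in cmp.common_dirs:
            walk(os.path.join(rel, name))

    walk("")
    return sorted(diffs)
```

An empty result means the two workspaces hold the same files with the same
contents, which is the condition the check above relied on.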

I suspect that the various other git fast-import conversion scripts that
exist for monotone are probably slower and less robust than this
implementation (unless they work similarly from rosters), which uses the
monotone internals to do the work. I spent a bit of time initially trying to
export revisions using the revision data structures but this didn't work
very well. Git only deals with files, and trying to correctly order a mix of
directory and file renames from monotone revisions was difficult.
Ultimately I didn't use the revision data structures at all but built up a
similar files-only revision representation by comparing rosters, much
like what is done for make_cset, but ignoring directories and producing only
file deletions, renames and additions. This works much better and correctly
handles pivot_root and a few other odd things that proved difficult when
working with revisions.
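
To make the roster comparison idea concrete, here is a toy sketch in Python.
It is not the monotone code and does not use monotone's roster types. It
assumes a roster can be modelled as a dict from a stable file node id to the
file's full path, with directories left out entirely. The function name
files_only_changes is made up for illustration.

```python
def files_only_changes(old_roster, new_roster):
    """Compare two toy rosters (node id -> file path) and list file changes.

    Assumption: directories are not present in the rosters, so a directory
    rename shows up only as a rename of every file beneath it.
    """
    old_ids = old_roster.keys()
    new_ids = new_roster.keys()
    deletes = sorted(old_roster[n] for n in old_ids - new_ids)
    adds = sorted(new_roster[n] for n in new_ids - old_ids)
    renames = sorted(
        (old_roster[n], new_roster[n])
        for n in old_ids & new_ids
        if old_roster[n] != new_roster[n]
    )
    return {"delete": deletes, "rename": renames, "add": adds}


if __name__ == "__main__":
    old = {1: "src/a.c", 2: "src/b.c", 3: "README"}
    new = {1: "lib/a.c", 2: "lib/b.c", 4: "NEWS"}
    print(files_only_changes(old, new))
```

In this model, renaming the directory "src" to "lib" never appears as a
directory operation. It shows up as two plain file renames, which is all
git needs to see.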

This exporter does not (yet) handle all rename ordering issues that are
possible. For example <rename a b> followed by <rename b c> will probably
fail on import unless it is executed as <rename b c> followed by <rename a
b>. Similarly, <rename a b> followed by <rename b a>, which is indeed
possible, will probably fail on import and requires the introduction of a
third temporary file. These problems can be fixed in the exporter and can
also be fixed in the exported data by re-ordering renames as required.
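
One possible way to do that re-ordering, sketched in Python. This is not
what the exporter currently does. It assumes the renames within one revision
are meant to happen all at once, that every source and every destination is
distinct, and that the temporary name (made up here) does not clash with any
real file. The function name order_renames is hypothetical.

```python
def order_renames(renames, temp_pattern="__rename_tmp_{}"):
    """Order simultaneous (src, dst) renames so they can be applied one by one.

    Chains are emitted back to front; cycles are broken by parking one file
    under a temporary name (temp_pattern is an assumed, hypothetical name).
    """
    pending = dict(renames)  # src -> dst
    ordered = []
    counter = 0
    while pending:
        # Safe to apply now: the destination is not still waiting to be moved.
        safe = [src for src, dst in pending.items() if dst not in pending]
        if safe:
            for src in safe:
                ordered.append((src, pending.pop(src)))
            continue
        # Only cycles remain: move one source aside to a temporary file.
        src = next(iter(pending))
        tmp = temp_pattern.format(counter)
        counter += 1
        ordered.append((src, tmp))
        pending[tmp] = pending.pop(src)
    return ordered


if __name__ == "__main__":
    print(order_renames([("a", "b"), ("b", "c")]))
    print(order_renames([("a", "b"), ("b", "a")]))
```

For the first example this yields <rename b c> followed by <rename a b>, and
for the swap it goes through a third temporary file, matching the two cases
described above.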

WARNING: Please don't bet your life on this implementation! If you do use it
to convert a repository you must do careful verification of the converted
results. WORKSFORME is the only assurance I can make.

This feels a bit like throwing in the proverbial towel and I hope this
doesn't elicit any ill-will from the current monotone crowd. I'm not really
planning on converting my personal stuff from monotone any time soon but
knowing it can be done without losing information is nice. I'm still happy
to contribute to monotone but with 2 small kids my free/hacking time is
pretty limited.

Cheers,
Derek
_______________________________________________
Monotone-devel mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/monotone-devel
