To prepare the migration from bitbucket, I started to play a bit with its
API to see what could be done. So far I've quickly drafted two (ugly) Python
scripts to archive the forks and the pull-requests. Since this is a one-shot
task for us, I did not care about robustness, safety, generality, beauty, etc.

They are available here: https://gitlab.com/ggael/bitbucket-migration-tools
(contributions welcome!)

** Forks **

You can see the summary produced by the fork script here:
http://manao.inria.fr/eigen_tmp/archive_forks_log.html

The hg clones (history + checkout) represent about 20GB, maybe 12GB if we
drop the checkouts. Among the 460 forks, 214 seem to have no changes at all
(according to "hg out") and could be dropped. I don't know yet where to
host them though.
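For reference, the emptiness test is basically the exit code of "hg outgoing":
it returns 1 when a clone has nothing to push back to the parent repo. A rough
sketch (with made-up paths, not a copy of the actual script):

import subprocess

def fork_has_changes(clone_dir, parent="https://bitbucket.org/eigen/eigen"):
    """True if the local clone has changesets not present in the parent repo."""
    # hg outgoing exits with 0 when outgoing changesets exist, 1 when there are none
    ret = subprocess.run(
        ["hg", "outgoing", "--quiet", parent],
        cwd=clone_dir,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    ).returncode
    return ret == 0

# e.g.: print(fork_has_changes("forks/some-fork"))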

This script can be run incrementally.


** Pull-Requests **

You can find the output of the pull-requests script here:
http://manao.inria.fr/eigen_tmp/pullrequests/

There is a short summary, and then for each PR a static .html file plus
diff/patch files, and other details. For instance, see:
http://manao.inria.fr/eigen_tmp/pullrequests/OPEN/686/pr686.html
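For the curious, the core of the script is just a paginated walk over
Bitbucket's v2.0 REST API, roughly like this (simplified sketch; endpoint and
field names are taken from the public API docs, repo name hard-coded here):

import json
import requests

REPO = "eigen/eigen"   # workspace/repo_slug
API = "https://api.bitbucket.org/2.0/repositories/%s/pullrequests" % REPO

url = API + "?state=OPEN&pagelen=50"
while url:
    page = requests.get(url).json()
    for pr in page.get("values", []):
        # dump the raw metadata next to the diff, as in the archived output
        with open("pr%d.json" % pr["id"], "w") as f:
            json.dump(pr, f, indent=2)
        diff = requests.get("%s/%d/diff" % (API, pr["id"]))
        with open("pr%d.diff" % pr["id"], "w") as f:
            f.write(diff.text)
    url = page.get("next")   # None on the last page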

Currently this script cannot be run incrementally. You have to run it just
before closing the respective repository!

Also, this script does not grab inline comments; only the main discussion is
archived. Those could be obtained by iterating over the "activity" pages,
but I don't think that's worth the effort because they would be difficult
to exploit anyway.
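That said, if someone does want them at some point, the per-PR "comments"
endpoint looks easier to walk than the activity pages. Untested sketch,
assuming the documented "inline" field really carries the file path and line:

import requests

def inline_comments(repo, pr_id):
    url = ("https://api.bitbucket.org/2.0/repositories/%s"
           "/pullrequests/%d/comments?pagelen=100" % (repo, pr_id))
    found = []
    while url:
        page = requests.get(url).json()
        for c in page.get("values", []):
            if c.get("inline"):   # only comments attached to a diff line
                found.append({
                    "path": c["inline"].get("path"),
                    "line": c["inline"].get("to"),
                    "text": c["content"]["raw"],
                })
        url = page.get("next")
    return found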


** hg to git **

As discussed in the other thread, if we switch from hg to git, then all
hashes will have to be updated. Generating a map file is easy, and thus
updating the links/hashes in bug comments and PR comments should not be too
difficult (we only have to figure out the right regex to catch all
variants).
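To give an idea, here is a first stab at the kind of regex/substitution I have
in mind (the "hg2git.map" file name and its "<hg_hash> <git_hash>" format are
just assumptions about what the conversion tool will give us):

import re

hg2git = dict(line.split() for line in open("hg2git.map"))

# variants to catch: bare short/long hashes and bitbucket .../commits/<hash> URLs
HASH_RE = re.compile(
    r"(?:https?://bitbucket\.org/eigen/eigen/commits?/)?\b([0-9a-f]{7,40})\b")

def translate(text):
    def repl(m):
        h = m.group(1)
        # the map is keyed by full hg hashes, so match short hashes by prefix
        for hg, git in hg2git.items():
            if hg.startswith(h):
                return m.group(0).replace(h, git)
        return m.group(0)   # not a known hg hash, leave it untouched
    return HASH_RE.sub(repl, text)

The tricky part is obviously the false positives (hex-looking words, bug
numbers, etc.), hence the need to experiment a bit on the real comments.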

However, updating the hashes within the commit messages will require
rewriting the whole history in a careful order. Does anyone here feel brave
enough to write such a script? If not, I guess we could live with an online
PHP script doing the hash conversion on demand. I don't think we'll have to
follow such hashes very often.
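To make the "careful order" part concrete, here is the kind of loop I imagine
(untested, plain git plumbing, author/date/committer preservation and
short-hash handling omitted for brevity): process the commits parents-first,
rewrite each message with the map, recreate the commit, and immediately update
the map so that later messages point at the rewritten hashes.

import re
import subprocess

def git(*args, **kw):
    return subprocess.run(["git"] + list(args), check=True,
                          capture_output=True, text=True, **kw).stdout

# initial map: hg hash -> git hash, as produced by the hg->git conversion
hg2git = dict(line.split() for line in open("hg2git.map"))
git2hg = dict((v, k) for k, v in hg2git.items())

old2new = {}   # pre-rewrite git hash -> post-rewrite git hash
HASH_RE = re.compile(r"\b[0-9a-f]{7,40}\b")

# --reverse --topo-order lists parents before their children
for old in git("rev-list", "--reverse", "--topo-order", "HEAD").split():
    tree = git("show", "-s", "--format=%T", old).strip()
    parents = git("show", "-s", "--format=%P", old).split()
    msg = git("show", "-s", "--format=%B", old)

    # replace every known hg hash by the (already rewritten) git hash
    msg = HASH_RE.sub(lambda m: hg2git.get(m.group(0), m.group(0)), msg)

    cmd = ["commit-tree", tree]
    for p in parents:
        cmd += ["-p", old2new[p]]
    new = git(*cmd, input=msg).strip()

    old2new[old] = new
    if old in git2hg:
        # keep the hg map pointing at the rewritten commit for later messages
        hg2git[git2hg[old]] = new

git("update-ref", "refs/heads/master", old2new[git("rev-parse", "HEAD").strip()])

One limitation: a message referring to a *later* commit would still end up
with the pre-rewrite hash, which is exactly why this needs some care.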

cheers,
gael
