On Thu, Sep 12, 2019 at 9:53 AM Joseph Mirabel <[email protected]>
wrote:

> To summarize what is missing:
>
> - there are a lot of "backporting rev[0-9]+" that are not found. I don't
> know what "backporting" means. I just know that "hg log --rev N" returns
> "unknown revision" for all the ones I tested.
>

$ hg log -v  | grep "backporting"  does not give me that many:

backporting 1784: fixed bug #62

-> For this one "hg log -r 1784" tells me that the true hash is
0430786c2a6f (but we don't care if we lost one link!)

The following are very old Subversion references; we don't care about them at all:

backporting 964177 (gcc 3.3 fix)
backporting r964165 (gcc 3.3 fixes)
backporting rev 951682 (compilation fix in aligned allocator)
backporting rev 918446: fix MSVC internal compilation error
backporting rev925153 (bugfix in MapBase::coeffRef(int) )
backporting commit 918468 (fix MSVC internal error)
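
A minimal sketch of how the "backporting" references listed above could be sorted automatically. It assumes that a number larger than the highest local revision of the hg clone cannot be a Mercurial revision (which would explain the "unknown revision" errors). The function name and the `tip_rev` parameter are hypothetical; the caller has to supply the tip revision of the clone being converted, since local revision numbers are specific to each clone.

```python
import re

# Matches "backporting 1784", "backporting r964165", "backporting rev 951682",
# "backporting rev925153" and "backporting commit 918468".
BACKPORT = re.compile(r"backporting\s+(?:rev\s*|r|commit\s+)?([0-9]+)")


def classify_backports(log_lines, tip_rev):
    """Split backport references into plausible hg revisions and others.

    tip_rev is an assumption: the highest local revision number of the clone.
    """
    hg_revs, other = [], []
    for line in log_lines:
        match = BACKPORT.search(line)
        if match is None:
            continue  # not a backporting line
        rev = int(match.group(1))
        if rev <= tip_rev:
            hg_revs.append(rev)   # may resolve with "hg log -r"
        else:
            other.append(rev)     # too large: probably an old svn revision
    return hg_revs, other
```

Numbers in the first list can then be checked one by one with "hg log -r N"; the second list can be left alone.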


> - I do not catch ranges of revision numbers like "[0-9]+-[0-9]+". It
> wouldn't be hard to achieve.
>
-> I could not find any meaningful one.

I only found two meaningful "([0-9]+:)([0-9a-f]+)" occurrences like:
6089:76b6c62565a6. In this case, if \2 is a valid hash, then \1 (e.g.,
"6089:") should be dropped to enable auto-link creation, see:
https://github.com/jmirabel/eigen_tmp/commit/4325f05a4b60

But that's really no big deal as the true hash has been properly updated,
and there are only 2 occurrences!
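
For illustration, dropping the "N:" prefix could look like the sketch below. The pattern assumes 12-character lowercase hexadecimal short hashes, as in the example above, and `known_hashes` is a hypothetical set of valid hashes that the caller has to provide; neither is taken from the actual conversion plugin.

```python
import re

# "6089:76b6c62565a6" -> group 1 is "6089:", group 2 is the short hash.
REV_HASH = re.compile(r"\b([0-9]+:)([0-9a-f]{12})\b")


def drop_local_revision(message, known_hashes):
    """Remove the "N:" prefix when the part after the colon is a known hash."""
    def replace(match):
        if match.group(2) in known_hashes:
            return match.group(2)  # keep only the hash so it can be auto-linked
        return match.group(0)      # unknown hash: leave the text untouched
    return REV_HASH.sub(replace, message)
```

Leaving unknown hashes untouched keeps the rewrite safe for text that merely looks like "number:hex".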

> - what about unnamed hg heads? Should we drop them? If not, I would
> appreciate it if someone who knows Mercurial could name them (with a name
> valid for hg and git).
>
I guess this issue is related to changeset 59a7e404a93c, which is an empty
commit closing a branch. If your script ignores it without any further
issue, then that's fine.

So for me this is all good. The only thing I would still care about is
updating references to bugs/PRs by applying the following substitutions:

"([Bb]ug) (\d+)" -> "\1 #\2"
"bugs (\d+) and (\d+)" -> "bugs #\1 and #\2"
"http://eigen.tuxfamily.org/bz/show_bug.cgi\?id=(\d+)" -> "#\1"
"[Pp]ull [Rr]equest #(\d+)" -> "pull request PR-\1"
"PR #(\d+)" -> "PR-\1"
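
As a rough sketch, these substitutions could be applied to a commit message as follows. The function name is hypothetical, the dots in the URL pattern are escaped so that they only match literal dots, and the first rule is written on the assumption that the intended result is "bug #N".

```python
import re

# (pattern, replacement) pairs, applied in the order of the list above.
RULES = [
    (re.compile(r"([Bb]ug) (\d+)"), r"\1 #\2"),
    (re.compile(r"bugs (\d+) and (\d+)"), r"bugs #\1 and #\2"),
    (re.compile(r"http://eigen\.tuxfamily\.org/bz/show_bug\.cgi\?id=(\d+)"), r"#\1"),
    (re.compile(r"[Pp]ull [Rr]equest #(\d+)"), r"pull request PR-\1"),
    (re.compile(r"PR #(\d+)"), r"PR-\1"),
]


def rewrite_references(message):
    """Apply every bug/PR substitution rule to one commit message."""
    for pattern, replacement in RULES:
        message = pattern.sub(replacement, message)
    return message
```

The order is harmless here: the output of the pull-request rule ("pull request PR-12") contains no "PR #", so the last rule does not rewrite it a second time, and text that already reads "bug #62" is left as it is.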


gael


>
> Joseph
>
>
> On 12/09/2019 at 00:37, Gael Guennebaud wrote:
>
>
>
> On Wed, Sep 11, 2019 at 7:38 PM Joseph Mirabel <[email protected]>
> wrote:
>
>> Dear Eigen developers,
>>
>> - I can convert all references to revisions or mercurial hashes that
>> follow the regex in [2].
>>
>
> I'll look at it more carefully later, but... wow!! this looks very
> promising: https://github.com/jmirabel/eigen_tmp/commit/6e53e31dc2d79da
>
> For comparison, the same commit in our official git-mirror:
> https://github.com/eigenteam/eigen-git-mirror/commit/6ba9310bc2c168
>
> gael
>
>
>> - I did not try to convert URLs although it should not be hard.
>>
>> - I manually edited the author file [5] so that they would fit git
>> author format. If you find yourself in the list and want to update it,
>> you can contact me.
>>
>>
>> It should not be hard to add more rules to the plugin convert_references
>> if anyone feels like doing it.
>>
>>
>> Best,
>>
>> Joseph
>>
>>
>> [1] https://github.com/frej/fast-export.git
>>
>> [2]
>>
>> https://github.com/jmirabel/fast-export/blob/1fdc76e0626acd6adfcc0d900d14f36b459c4798/plugins/convert_references/__init__.py#L16
>>
>> [3] https://github.com/jmirabel/fast-export
>>
>> [4] https://github.com/jmirabel/eigen_tmp.git
>>
>> [5]
>> https://github.com/jmirabel/fast-export/blob/master/eigen/authors_reworked
>>
>>
>>
>> On 11/09/2019 at 18:03, Gael Guennebaud wrote:
>> > To prepare the migration from bitbucket, I started to play a bit with
>> > its API to see what could be done. So far I've quickly drafted two
>> > (ugly) python scripts to archive the forks and pull-requests. Since
>> > this is a one-shot for us, I did not care about robustness, safety,
>> > generality, beauty, etc.
>> >
>> > You can see them there:
>> > https://gitlab.com/ggael/bitbucket-migration-tools and contribute!
>> >
>> > ** Forks **
>> >
>> > You can see the summary of the fork script
>> > there: http://manao.inria.fr/eigen_tmp/archive_forks_log.html
>> >
>> > The hg clones (history+checkout) represent 20GB, maybe 12GB if we
>> > remove the checkouts. Among the 460 forks, 214 seem to have no change
>> > at all (according to "hg out") and could be dropped. I don't know yet
>> > where to host them though.
>> >
>> > This script can be run incrementally.
>> >
>> >
>> > ** Pull-Requests **
>> >
>> > You can find the output of the pull-requests script
>> > there: http://manao.inria.fr/eigen_tmp/pullrequests/
>> >
>> > There is a short summary, and then for each PR a static .html file
>> > plus diff/patch files, and other details. For instance,
>> > see: http://manao.inria.fr/eigen_tmp/pullrequests/OPEN/686/pr686.html
>> >
>> > Currently this script cannot be run incrementally. You have to run it
>> > just before closing the respective repository!
>> >
>> > Also, this script does not grab inline comments. Only the main
>> > discussion is archived. The inline comments can be obtained by iterating
>> > over the "activity" pages, but I don't think that's worth the effort
>> > because they would be difficult to exploit anyway.
>> >
>> >
>> > ** hg to git **
>> >
>> > As discussed in the other thread, if we switch from hg to git, then
>> > all hashes will have to be updated. Generating a map file is easy, and
>> > thus updating the links/hashes in bug comments and PR comments should
>> > not be too difficult (we only have to figure out the right regex to
>> > catch all variants).
>> >
>> > However, updating the hashes within the commit messages will require
>> > rewriting the whole history in a careful order. Does anyone here
>> > feel brave enough to write such a script? If not, I guess we could
>> > live with an online php script doing the hash conversion on demand. I
>> > don't think we'll have to follow such hashes so frequently.
>> >
>> > cheers,
>> > gael
>> >
>> >
>>
>>
>>
>>
