Hi!

It's good news that others are already working on this (I would have been surprised if nobody were). More migration tools will probably appear over the next months (thanks to bitbucket for forcing this ...)

To archive previous PRs, it may be sufficient to attach a mercurial patch (or a git translation of it) to the discussion; the patch could be retrieved automatically, if someone cares to program that. Having the entire discussion archived would be great as well, of course (I'm not sure whether the API supports retrieving that). Actually, current PRs also have the problem that they become unusable if the requester (accidentally) removes or overwrites the fork repository.


And maybe importing existing PRs into whatever new PR management we end up with should not be the top priority. If we have a much better mechanism for PRs on a future host, then some kind of (more or less) static image of the previous PRs could be good enough. That holds especially if the new mechanism has no easy way to import existing PRs, which could be difficult since the new host won't have registered all users who ever took part in our current PRs.


Cheers,
Christoph


On 26/08/2019 15.04, David Tellenbach wrote:
Hi,

The point you missed is that especially the "grafted from" links do not include the full 
URL, just the hg-hash (which is different from git-hashes). And just grepping for "grafted 
from" gives me 425 results (in total -- if you want the log of individual branches, you need 
to use the `-b` option).
For a more precise count, you should grep for hexadecimal numbers longer than a 
few digits inside the commit messages.
I see, thanks for the explanation.

I somewhat doubt that any existing hg->git converter automatically translates 
these hashes, but I'd be very happy if someone finds out otherwise. Changing these 
manually is definitely not an option.
I might have good news on this one: We are apparently not the only project that works on 
migrating from Mercurial to Git. The OpenJDK project (a free implementation of the Java 
platform) has created Skara, a set of tools to handle all kinds of stuff related to 
contributing to OpenJDK (https://github.com/openjdk/skara). Some of the tools could be 
really helpful for our issues (see https://openjdk.java.net/jeps/357).

The relevant tool seems to be git-openjdk-import, which is used to import from 
Mercurial to Git. I just had a short glance at the code, but it seems to be very 
generic and does not seem to contain OpenJDK-related stuff at all. The interesting 
part is the following paragraph from https://openjdk.java.net/jeps/357:

We've also prototyped a new tool, git-translate. This tool uses a file 
called .hgcommits that is generated by the conversion tools and committed to the 
Git repositories. This file contains a sequence of lines, each of which 
contains two hexadecimal hashes: the first is the hash of a Mercurial changeset 
and the second is the hash of the Git commit resulting from converting that 
Mercurial changeset. The tool git-translate simply queries the file .hgcommits
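
The lookup described in this paragraph can be sketched in a few lines of shell. This is not the actual git-translate from Skara, only an illustration that assumes the file format quoted above (one line per changeset: Mercurial hash, whitespace, Git hash). The function name `hg_to_git`, the default file location and the prefix matching for abbreviated hashes are assumptions.

```sh
# hg_to_git HG_HASH [MAP_FILE]
# Print the Git hash recorded for a Mercurial hash (full or abbreviated).
# MAP_FILE defaults to .hgcommits in the current directory (an assumption).
# Returns non-zero if the hash is not listed.
hg_to_git() {
    awk -v want="$1" '
        # The first line whose Mercurial hash starts with the wanted prefix wins.
        index($1, want) == 1 { print $2; found = 1; exit }
        END { exit !found }
    ' "${2:-.hgcommits}"
}
```

A call would look like `hg_to_git 0123456789ab .hgcommits`. An abbreviated hash that matches several lines silently resolves to the first one, which a real tool should probably report as ambiguous.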


I haven't managed to get everything to work out of the box, but I haven't tried too 
hard. It might even be worth opening a thread on the Skara mailing list.

However, even if we have a translation tool, this is still complicated: changing 
hashes or links in a commit message alters the git hash of that commit (and of 
all its descendants) again, so the translation becomes wrong for this particular 
commit. This could be a problem if a commit is referenced by more than one other 
commit, or if commit a references commit b, which in turn references commit c.
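
One possible way around this, given here only as a sketch under assumptions, is to rewrite the messages during the conversion rather than afterwards. A message can only mention commits that already existed when it was written, so if the referenced commits have been converted first, their final Git hashes are known and nothing has to be changed a second time. Whether a given converter offers a hook for this, and whether its conversion order guarantees that the referenced commits come first, would have to be checked. The filter below shows only the rewriting step; the function name `rewrite_hashes`, the two-column map file (Mercurial hash, Git hash) and the 12-digit minimum length are assumptions.

```sh
# rewrite_hashes MAP_FILE
# Copy a commit message from stdin to stdout and replace every Mercurial hash
# (12 to 40 lowercase hex digits) that is listed in MAP_FILE by the full Git hash.
# MAP_FILE must be non-empty and contain "<hg-hash> <git-hash>" per line.
rewrite_hashes() {
    awk '
        # First input (the map): index both hashes by the 12-digit hg prefix.
        NR == FNR { hg[substr($1, 1, 12)] = $1; git[substr($1, 1, 12)] = $2; next }
        # Second input (the message): scan each line for runs of hex digits.
        {
            out = ""; rest = $0
            while (match(rest, /[0-9a-f]+/)) {
                tok = substr(rest, RSTART, RLENGTH)
                key = substr(tok, 1, 12)
                # Replace only tokens that are a prefix of a known hg hash.
                if (RLENGTH >= 12 && (key in hg) && index(hg[key], tok) == 1)
                    tok = git[key]
                out = out substr(rest, 1, RSTART - 1) tok
                rest = substr(rest, RSTART + RLENGTH)
            }
            print out rest
        }
    ' "$1" -
}
```

Two Mercurial hashes sharing the same first 12 digits would collide in this sketch; a real implementation would need to detect that.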

I see essentially three options:
1. Migrate to another mercurial provider
2. Convert to git, stay at bitbucket
3. Convert to git, migrate to another provider
1. We could migrate to Tuxfamily and keep mercurial. As you said, this would 
imply we have to handle pull requests separately, which is possible. As you 
surely know, LLVM does exactly that by using Phabricator. This would fix 
some of the issues above, but links to bitbucket would remain a problem. Another 
downside of mercurial is that only very few projects are using it, and 
contributing would be much easier in the case of git.

I really don't see much difference in usability between hg and git -- both have 
their advantages and little quirks, IMO. And I don't think that hg was ever the 
main-hurdle for people contributing to Eigen ...

If Phabricator allows importing our existing PRs, that would of course be a nice 
option. But I'm really pessimistic about that at the moment, since this also 
requires matching all users who made a PR or took part in a discussion to 
the new host (maybe that would be the only argument for staying with bitbucket).

I tried a few things regarding PRs: We can clearly get all Bitbucket PRs using 
its API (e.g. curl 
https://api.bitbucket.org/2.0/repositories/eigen/eigen/pullrequests --request 
GET), but such a Bitbucket PR is basically defined by source and destination 
repo and doesn't seem to contain any kind of diff. The obvious problem is that 
not only the Eigen repo will be closed (or deleted...) but also all of its 
forks. To really transfer PRs we would have to migrate at least part of the 
forks as well, which is absolutely unrealistic.
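
For archiving, a starting point might look like the sketch below. It assumes that the 2.0 API exposes a per-PR `diff` resource below the `pullrequests` URL used above, which should be verified against the Bitbucket documentation, and it is untested whether that resource still returns anything once the source fork has been deleted. The function names, the default output directory `pr-archive` and the file naming are made up for illustration.

```sh
# Base URL of the Bitbucket Cloud REST API 2.0, as in the curl call above.
BB_API="https://api.bitbucket.org/2.0/repositories"

# pr_diff_url REPO PR_ID
# Print the (assumed) URL of the diff resource of one pull request.
pr_diff_url() {
    printf '%s/%s/pullrequests/%s/diff\n' "$BB_API" "$1" "$2"
}

# archive_pr_diff REPO PR_ID [OUT_DIR]
# Download the diff of one pull request to OUT_DIR/pr-<PR_ID>.diff.
archive_pr_diff() {
    out_dir="${3:-pr-archive}"
    mkdir -p "$out_dir" &&
    # --location follows redirects, --fail turns HTTP errors into a non-zero exit.
    curl --fail --silent --show-error --location \
         --output "$out_dir/pr-$2.diff" "$(pr_diff_url "$1" "$2")"
}
```

Something like `archive_pr_diff eigen/eigen 42` would then store one diff; looping over all PR ids returned by the list request is left out here.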

I've also tried Phabricator and think it's a great tool, but it has major 
downsides: it uses a different kind of workflow based on pure diffs (you can 
literally just copy the result of hg diff or git diff into a web tool), which 
might be hard for new users to adapt to, and it is only free if self-hosted. 
The only real reason I'm mentioning this is that I guess we could get plain 
diffs from the Bitbucket PRs and make them work with Phabricator. I really 
don't want to advertise this solution, but it is at least an option.

I'm really pessimistic on this issue but see basically two options:
1. Try something exotic like the Phabricator workaround sketched above (I'm 
totally unsure about this).
2. Get the diffs from all Bitbucket PRs and archive them separately (on an 
Eigen page for historical purposes only). Handle all open PRs and define a 
migration period during which we don't accept new PRs.

Thanks,
David


On 24. Aug 2019, at 15:05, Christoph Hertzberg <[email protected]> 
wrote:

Hi!

On 24/08/2019 12.30, David Tellenbach wrote:
just some thoughts about some points you've made:
b) Fixing internal links inside commit messages ("grafted from ...", "fixes error 
introduced in commit ...")
Maybe I've forgotten something crucial, but doing something like
for branch in $(hg branches | awk '{print $1}'); do
     hg update -C  $branch > /dev/null
     echo "$branch $(hg log -v | egrep "bitbucket.org" | wc -l)"
done
gives me
Branch                       Links
------                       ------
default                      9
[...]

The point you missed is that especially the "grafted from" links do not include the full 
URL, just the hg-hash (which is different from git-hashes). And just grepping for "grafted 
from" gives me 425 results (in total -- if you want the log of individual branches, you need 
to use the `-b` option).
For a more precise count, you should grep for hexadecimal numbers longer than a 
few digits inside the commit messages.
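
A minimal sketch of such a count, assuming the commit messages are piped in as plain text. The function name `count_hash_candidates` and the 7-digit minimum are assumptions chosen for illustration. Plain decimal numbers of that length and words consisting only of the letters a-f (e.g. "defaced") are counted as well, so the result is a rough estimate rather than an exact figure.

```sh
# count_hash_candidates
# Read text from stdin and print the number of distinct tokens that look like
# a changeset hash: whole words of 7 to 40 lowercase hexadecimal digits.
count_hash_candidates() {
    grep -owE '[0-9a-f]{7,40}' | sort -u | wc -l | tr -d ' '
}
```

It could be fed with something like `hg log --template '{desc}\n' | count_hash_candidates`.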

I somewhat doubt that any existing hg->git converter automatically translates 
these hashes, but I'd be very happy if someone finds out otherwise. Changing these 
manually is definitely not an option.

Also, if we stayed with mercurial, but used a different provider, we can't modify the 
history, because that would influence all the hashes (but then only the 9 direct links to 
"bitbucket.org/..." you found would be broken, which is acceptable, IMO)

Of course we can just ignore these links (though I think broken links/hashes 
are even worse than non-existing ones ...)

Another point is the links inside the codebase that point to bitbucket.
Following the same logic as above I use
hg grep "bitbucket.org"
and get 11 links (all seem to be the same). Again something fixable manually.

Agreed, this part is easy to fix manually.

c) Fixing external links to the repository. Most notably, any links from our 
bugtracker will eventually fail (even if we stayed with bitbucket, the hashes 
won't match). I doubt that we could set up any automatic forwarding for that.
This might be by far the most complicated point since a lot (the majority?) of 
all issues contain links to commits. If desired I can find a concrete number 
but I doubt that it will be very...motivating. I also doubt that Bitbucket will 
provide any functionality to redirect links to other Git providers but I could 
imagine that there could be some workaround if we decide to migrate to Bitbucket 
Git. Something we should keep in mind before choosing a new provider.

If you (or anyone else) are/is really interested, I can try to make a MySQL 
dump of the underlying database (I'd need to strip the user data). If we have 
some automatic translation between the hashes, this could even allow us to 
automatically convert all links.
Migrating to bitbucket-git will still break all existing links, since the 
hashes don't match. And as bitbucket is not even planning to provide an 
automated repository conversion, I would not count on any kind of forwarding 
mechanism.


Any third-party which relies on our main repository will need to change as well (not 
directly "our" problem, but we need to give a reasonable amount of time for 
everyone to migrate to whatever will be our future official repository).
It's currently unclear to me what exactly will happen with the hg repo but I 
guess it will be archived or something similar. In this case we can link to the 
new repo on the README page. I don't have any further ideas regarding this but 
also think we should migrate somewhat fast.

Yes, I think this is unclear to everyone at the moment. The announcement from 
bitbucket sounds a lot like they will literally delete all hg-repositories in 
June next year :(
If it was at least frozen/archived as it is, we would have almost no problems 
with point c).

For manual redirection, we can of course open a new git-project which just 
contains a README.md saying that bitbucket dropped hg-support, and point to 
where Eigen migrated to.

I see essentially three options:
1. Migrate to another mercurial provider
2. Convert to git, stay at bitbucket
3. Convert to git, migrate to another provider
1. We could migrate to Tuxfamily and keep mercurial. As you said, this would 
imply we have to handle pull requests separately, which is possible. As you 
surely know, LLVM does exactly that by using Phabricator. This would fix 
some of the issues above, but links to bitbucket would remain a problem. Another 
downside of mercurial is that only very few projects are using it, and 
contributing would be much easier in the case of git.

I really don't see much difference in usability between hg and git -- both have 
their advantages and little quirks, IMO. And I don't think that hg was ever the 
main-hurdle for people contributing to Eigen ...

If Phabricator allows importing our existing PRs, that would of course be a nice 
option. But I'm really pessimistic about that at the moment, since this also 
requires matching all users who made a PR or took part in a discussion to 
the new host (maybe that would be the only argument for staying with bitbucket).


2. The only reason I see for this is the one I mentioned above: If there is (or 
will be) any support to redirect bitbucket links it will most likely only work 
if we stay at bitbucket. Compared with other code hosting services I find 
bitbucket (not mercurial) to be really complicated and not intuitive.

It might be an option if they allowed pull-requests to be migrated automatically. 
But at the moment, they don't even seem to plan automatic migration of 
repositories.

3. In an ideal world this would be my absolute preference (not very 
surprising). Regarding the choice of a service I want to make the personal 
point that I would rather migrate to Gitlab than to Github because it is at 
least as good as Github and I think that diversity of tools and providers is 
crucial for open source. In the long run we could even think about migrating 
issues to Gitlab and installing test runners (this is another story).

In my ideal world, somebody volunteers to do the work necessary for migration 
:) -- including the issues I pointed out (doesn't have to be the same person 
doing everything, of course). Even some proof-of-concept demos of what can be 
automated would be nice!

I don't have any real preferences between mercurial/git or 
github/gitlab/bitbucket.

I totally agree that having automated test runners on pull-requests will be a 
big plus (for which I'm even willing to sacrifice some of my original points, 
especially since we may need to anyway).

Cheers,
Christoph


Thanks,
David
On 21. Aug 2019, at 14:53, Christoph Hertzberg <[email protected]> 
wrote:

Hello Eigen users and contributors!

As some may have noticed, bitbucket/atlassian is "sunsetting" its mercurial 
support:

https://bitbucket.org/blog/sunsetting-mercurial-support-in-bitbucket

If they stick to their timeline, we will have to migrate by June 1st, 2020. 
That means we still have time, but if we do nothing, things will break ...


Converting the repository itself to git should not be a big issue -- and if 
we do this we could as well migrate to a more mainstream provider (i.e., 
github).

I think the main problems for migration are:
a) Migrating open pull-requests (for historical reasons, the closed/merged ones 
should probably be archived as well)
b) Fixing internal links inside commit messages ("grafted from ...", "fixes error 
introduced in commit ...")
c) Fixing external links to the repository. Most notably, any links from our 
bugtracker will eventually fail (even if we stayed with bitbucket, the hashes 
won't match). I doubt that we could set up any automatic forwarding for that.
d) Any third-party which relies on our main repository will need to change as well (not 
directly "our" problem, but we need to give a reasonable amount of time for 
everyone to migrate to whatever will be our future official repository).

Smaller issues (relatively easy to fix or not as important):
e) Change links from our wiki (to downloads)
f) Change URLs for automated doxygen generation and for unit-tests
g) Automatic links from the repository to our bugtracker (currently "Bug X" 
automatically links to http://eigen.tuxfamily.org/bz/show_bug.cgi?id=X)
h) Change hashes in bench/perf_monitoring/changesets.txt

I probably missed a few things ...


I see essentially three options:
1. Migrate to another mercurial provider
2. Convert to git, stay at bitbucket
3. Convert to git, migrate to another provider

Honestly, I see no good reason for option 2. And the only real reason I see for 
option 1 would be that it saves a lot of hassle with b) and h) -- also perhaps 
it would simplify c) (e.g., we could easily crawl through our bugzilla-database 
and just replace some URLs).


Any opinions on this? Preferences for how to proceed, or other alternatives?
Does anyone have experience with migrating from hg to git? Or migrating between 
providers? Especially, also dealing with the issues listed above.
Does anyone see issues I forgot?


Cheers,
Christoph


--
Dr.-Ing. Christoph Hertzberg

Besuchsadresse der Nebengeschäftsstelle:
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-Straße 5
28359 Bremen, Germany

Postadresse der Hauptgeschäftsstelle Standort Bremen:
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-Straße 1
28359 Bremen, Germany

Tel.:     +49 421 178 45-4021
Zentrale: +49 421 178 45-0
E-Mail:   [email protected]

Weitere Informationen: http://www.dfki.de/robotik
  -------------------------------------------------------------
  Deutsches Forschungszentrum für Künstliche Intelligenz GmbH
  Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany

  Geschäftsführung:
  Prof. Dr. Jana Koehler (Vorsitzende)
  Dr. Walter Olthoff

  Vorsitzender des Aufsichtsrats:
  Prof. Dr. h.c. Hans A. Aukes
  Amtsgericht Kaiserslautern, HRB 2313
  -------------------------------------------------------------




