On 30 Nov 2014 09:28, "Donald Stufft" <don...@stufft.io> wrote:
>
> As promised in the "Move selected documentation repos to PSF BitBucket
> account?" thread I've written up a PEP for moving selected repositories
> from hg.python.org to Github.
>
> You can see this PEP online at: https://www.python.org/dev/peps/pep-0481/
>
> I've also reproduced the PEP below for inline discussion.

Given that hg.python.org isn't going anywhere, you could also use hg-git to
maintain read-only mirrors at the existing URLs and minimise any breakage
(as well as ensuring a full historical copy remains available on PSF
infrastructure). Then the only change needed would be to set up appropriate
GitHub web hooks to replace anything previously based on a commit hook
rather than periodic polling.

The PEP should also cover providing clear instructions for setting up
git-remote-hg with the remaining Mercurial repos (most notably CPython), as
well as documenting a supported workflow for generating patches based on
the existing CPython GitHub mirror.

Beyond that, GitHub is indeed the most expedient option. My two main
reasons for objecting to taking the expedient path are:

1. I strongly believe that the long term sustainability of the overall open
source community requires the availability and use of open source
infrastructure. While I admire the ingenuity of the "free-as-in-beer" model
for proprietary software companies fending off open source competition, I
still know a proprietary platform play when I see one (and so do venture
capitalists looking to extract monopoly rents from the industry in the
future). (So yes, I regret relenting on this principle in previously
suggesting the interim use of another proprietary hosted service)

2. I also feel that this proposal is far too cavalier in not even
discussing the possibility of helping out the Mercurial team to resolve
their documentation and usability issues rather than just yelling at them
"your tool isn't popular enough for us, and we find certain aspects of it
too hard to use, so we're switching to something else rather than working
with you to address our concerns". We consider the Mercurial team a
significant enough part of the Python ecosystem that Matt was one of the
folks specifically invited to the 2014 language summit to discuss their
concerns around the Python 3 transition. Yet we'd prefer to switch to
something else entirely rather than organising a sprint with them at PyCon
to help ensure that our existing Mercurial based infrastructure is
approachable for git & GitHub users? (And yes, I consider some of the core
Mercurial devs to be friends, so this isn't an entirely abstract concern
for me)

Given my proposal to use BitBucket as a near term solution for enabling
pull request based workflows, it's clear I consider the second argument the
more significant of the two.

However, if others consider some short term convenience that may or may not
attract additional contributors more important than supporting the broader
Python and open source communities (an argument I'm more used to hearing in
the ruthlessly commercial environment of Red Hat, rather than in upstream
contexts that tend to be less worried about "efficiency at any cost"), I'm
not going to expend energy trying to prevent a change I disagree with on
principle, but will instead work to eliminate (or significantly reduce) the
current expedience argument in GitHub's favour.

As a result, I'm -0 on the PEP, rather than -1 (and will try to stay out of
further discussions).

Given this proposal, I'm planning to refocus PEPs 474 & 462 specifically on
resolving the CPython core workflow issues, since that will require
infrastructure customisation regardless, and heavy customisation of GitHub
based infrastructure requires opting in to the use of the GitHub specific
APIs that create platform lock-in. (Note that the argument in PEP 481 about
saving overall infrastructure work is likely spurious - the vast majority
of that work will be in addressing the complex CPython workflow
requirements, and moving some support repos to GitHub does little to
alleviate that)

If folks decide they want to migrate the ancillary repos back from GitHub
after that other infrastructure work is done, so be it, but if they don't,
that's OK too. We're already running heterogeneous infrastructure across
multiple services (especially if you also take PyPA into account), so
having additional support repos externally hosted isn't that big a deal
from a purely practical perspective.

Regards,
Nick.
>
> -----------------------
>
> Abstract
> ========
>
> This PEP proposes migrating to Git and Github for certain supporting
> repositories (such as the repository for Python Enhancement Proposals) in
> a way that is more accessible to new contributors, and easier to manage
> for core developers. This is offered as an alternative to PEP 474 which
> aims to achieve the same overall benefits but while continuing to use the
> Mercurial DVCS and without relying on a commercial entity.
>
> In particular this PEP proposes changes to the following repositories:
>
> * https://hg.python.org/devguide/
> * https://hg.python.org/devinabox/
> * https://hg.python.org/peps/
>
>
> This PEP does not propose any changes to the core development workflow for
> CPython itself.
>
>
> Rationale
> =========
>
> As PEP 474 mentions, there are currently a number of repositories hosted
> on hg.python.org which are not directly used for the development of
> CPython but instead are supporting or ancillary repositories. These
> supporting repositories do not typically have complex workflows, and
> often have no branches at all other than the primary integration branch.
> This simplicity makes them very good targets for the "Pull Request"
> workflow that is commonly found on sites like Github.
>
> However, where PEP 474 wants to continue to use Mercurial and wishes to
> use an OSS, self-hosted solution, and therefore restricts itself to only
> those solutions, this PEP expands the scope to include migrating to Git
> and using Github.
>
> The existing method of contributing to these repositories generally
> includes generating a patch and either uploading it to bugs.python.org or
> emailing it to p...@python.org. This process is unfriendly towards
> non-committer contributors, and it makes it harder than it needs to be
> for committers to accept the patches sent by users. In addition to the
> benefits in the pull request workflow itself, this style of workflow also
> enables non-technical contributors, especially those who do not know
> their way around the DVCS of choice, to contribute using the web based
> editor. On the committer side, Pull Requests enable them to tell, before
> merging, whether or not a particular Pull Request will break anything.
> They also enable a simple "push button" merge which does not require
> checking out the changes locally. Another feature that is particularly
> useful for docs is the ability to view a "prose" diff. This Github
> specific feature enables a committer to view a diff of the rendered
> output, which hides things like the reformatting of a paragraph and shows
> what the actual "meat" of the change is.
>
>
> Why Git?
> --------
>
> Looking at the variety of DVCS which are available today it becomes
> fairly clear that git has gotten the vast mindshare of people who are
> currently using one. The Open Hub (previously Ohloh) statistics
> [#openhub-stats]_ show that currently 37% of the repositories Open Hub is
> indexing are using git, which is second only to SVN (which has 48%),
> while Mercurial has just 2% of the indexed repositories (beating only
> bazaar, which has 1%). In addition to the Open Hub statistics, a look at
> the top 100 projects on PyPI (ordered by total download counts) shows us
> that within the Python space itself there is a majority of projects using
> git:
>
> === ========= ========== ====== === ====
> Git Mercurial Subversion Bazaar CVS None
> === ========= ========== ====== === ====
> 62  22        7          4      1   1
> === ========= ========== ====== === ====
>
>
> Choosing a DVCS which has the larger mindshare will make it more likely
> that any particular person who has experience with DVCS at all will be
> able to meaningfully use the DVCS that we have chosen without having to
> learn a new tool.
>
> In addition to simply making it more likely that any individual will
> already know how to use git, the number of projects and people using it
> means that the resources for learning the tool are likely to be more
> fully fleshed out, and when you run into problems it is far likelier that
> someone else had that problem, posted a question, and received an answer.
>
> Thirdly, by using a more popular tool you also increase your options for
> tooling *around* the DVCS itself. Looking at the various options for
> hosting repositories, it's extremely rare to find a hosting solution
> (whether OSS or commercial) that supports Mercurial but does not support
> Git. On the flip side, there are a number of tools which support Git but
> do not support Mercurial. Therefore the popularity of git increases the
> flexibility of our options going into the future for what toolchain these
> projects use.
>
> Also, by moving to the more popular DVCS we increase the likelihood that
> the knowledge a person gains in contributing to these support
> repositories will transfer to projects outside of the immediate CPython
> project, such as the larger Python community, which is primarily using
> Git hosted on Github.
>
> In previous years there was concern about how well supported git was on
> Windows in comparison to Mercurial. However git has grown to support
> Windows as a first class citizen. In addition to that, for Windows users
> who are not well acquainted with the Windows command line there are GUI
> options as well.
>
> On a technical level git and Mercurial are fairly similar, however the
> git branching model is significantly better than Mercurial "Named
> Branches" for non-committer contributors. Mercurial does have a
> "Bookmarks" extension, however this isn't quite as good as git's
> branching model. All bookmarks live in the same namespace, so individual
> users must namespace the branch names themselves lest they risk
> collision. It is also an extension, which requires new users to first
> discover they need an extension at all and then figure out what they need
> to do in order to enable it. Since it is an extension, support for
> bookmarks outside of Mercurial core is in general going to be less than
> 100%, in comparison to git where the feature is built in and core to
> using git at all. Finally, users who are not used to Mercurial are
> unlikely to discover bookmarks on their own; instead they will likely
> attempt to use Mercurial's "Named Branches" which, given the fact they
> live "forever", are not often what a project wants its contributors to
> use.
>
>
> Why Github?
> -----------
>
> There are a number of software projects or web services which offer
> functionality similar to that of Github. These range from commercial web
> services such as Bitbucket to self-hosted OSS solutions such as Kallithea
> or Gitlab. This PEP proposes that we move these repositories to Github.
>
> There are two primary reasons for selecting Github: Popularity and
> Quality/Polish.
>
> Github is currently the most popular repository hosting service according
> to Alexa, where it currently has a global rank of 121. Much like for Git
> itself, by choosing the most popular tool we gain benefits in increasing
> the likelihood that a new contributor will have already experienced the
> toolchain, the quality and availability of the help, more and better
> tooling being built around it, and the knowledge transfer to other
> projects. A look again at the top 100 projects by download counts on PyPI
> shows the following hosting locations:
>
> ====== ========= =========== ========= =========== ==========
> GitHub BitBucket Google Code Launchpad SourceForge Other/Self
> ====== ========= =========== ========= =========== ==========
> 62     18        6           4         3           7
> ====== ========= =========== ========= =========== ==========
>
> In addition to all of those reasons, Github also has the benefit that
> while many of the options have similar features when you look at them in
> a feature matrix, the Github version of each of those features tends to
> work better and be far more polished. This is hard to quantify
> objectively, however it is a fairly common sentiment if you go around and
> ask people who use these services often.
>
> Finally a reason to choose a web service at all over something that is
> self-hosted is to be able to more efficiently use volunteer time and
> donated resources. Every additional service hosted on the PSF
> infrastructure by the PSF infrastructure team further spreads out the
> amount of time that the volunteers on that team have to spend and uses
> some chunk of resources that could potentially be used for something
> where there is no free or affordable hosted solution available.
>
> One concern that people do have with using a hosted service is that there
> is a lack of control and that at some point in the future the service may
> no longer be suitable. It is the opinion of this PEP that Github does not
> currently and has not in the past engaged in any attempts to lock people
> into their platform and that if at some point in the future Github is no
> longer suitable for one reason or another then at that point we can look
> at migrating away from Github onto a different solution. In other words,
> we'll cross that bridge if and when we come to it.
>
>
> Example: Scientific Python
> --------------------------
>
> One of the key ideas behind the move to both git and Github is that a
> feature of a DVCS, the repository hosting, and the workflow used is the
> social network and size of the community using said tools. We can see
> this is true by looking at an example from a sub-community of the Python
> community: The Scientific Python community. They have already migrated
> most of the key pieces of the SciPy stack onto Github using the Pull
> Request based workflow, starting with IPython, and as more projects moved
> over it became a natural default for new projects.
>
> They claim to have seen a great benefit from this move, where it enables
> casual contributors to easily move between different projects within
> their sub-community without having to learn a special, bespoke workflow
> and a different toolchain for each project. They've found that when
> people can use their limited time on actually contributing instead of
> learning the different tools and workflows, not only do they contribute
> more to one project, but they also expand out and contribute to other
> projects. This move is also credited with making it commonplace for
> members of that community to go so far as publishing their research and
> educational materials on Github as well.
>
> This showcases the real power behind moving to a highly popular toolchain
> and workflow, as each variance introduces yet another hurdle for new and
> casual contributors to get past and it makes the time spent learning that
> workflow less reusable with other projects.
>
>
> Migration
> =========
>
> Through the use of hg-git [#hg-git]_ we can easily convert a Mercurial
> repository to a Git repository by simply pushing the Mercurial repository
> to the Git repository. People who wish to continue to use Mercurial
> locally can then use hg-git going into the future using the new Github
> URL, however they will need to re-clone their repositories as using Git
> as the server seems to trigger a one time change of the changeset ids.
>
> As none of the selected repositories have any tags, branches, or bookmarks
> other than the ``default`` branch the migration will simply map the
> ``default`` branch in Mercurial to the ``master`` branch in git.
>
> In addition since none of the selected projects have any great need of a
> complex bug tracker, they will also migrate their issue handling to using
> the GitHub issues.
>
> In addition to the migration of the repository hosting itself there are a
> number of locations for each particular repository which will require
> updating. The bulk of these will simply be changing commands from the hg
> equivalent to the git equivalent.
>
> In particular this will include:
>
> * Updating www.python.org to generate PEPs using a git clone and link to
>   Github.
> * Updating docs.python.org to pull from Github instead of hg.python.org
>   for the devguide.
> * Enabling the ability to send an email to python-check...@python.org for
>   each push.
> * Enabling the ability to send an IRC message to #python-dev on Freenode
>   for each push.
> * Migrating any issues for these projects to their respective bug tracker
>   on Github.
>
> This will restore these repositories to similar functionality as they
> currently have. In addition to this the migration will also include
> enabling testing for each pull request using Travis CI [#travisci]_ where
> possible to ensure that a new PR does not break the ability to render the
> documentation or PEPs.
>
>
> User Access
> ===========
>
> Moving to Github would involve adding an additional user account that
> will need to be managed, however it also offers finer grained control,
> allowing the ability to grant someone access to only one particular
> repository instead of the coarser grained ACLs available on
> hg.python.org.
>
>
> References
> ==========
>
> .. [#openhub-stats] `Open Hub Statistics <https://www.openhub.net/repositories/compare>`_
> .. [#hg-git] `hg-git <https://hg-git.github.io/>`_
> .. [#travisci] `Travis CI <https://travis-ci.org/>`_
>
> ---
> Donald Stufft
> PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com