Hi Donald, Nick, Richard, all,

I finally got around to reading and thinking about the issues discussed in PEP470.  
First of all thanks for going through the effort of trying to 
advance the overall situation with a focus on making it easier 
for our wonderful and beloved "end users" :)

However, I think PEP470 needs to achieve stronger backward compatibility for
end-users because, as is typical for the 99%, they like to see change
but hate to be forced to change themselves.

Allow me to remind you of how PEP438 worked in this regard: all
end users always remained able to install all projects, including those
using ancient tools, and they all benefited from the changes PEP438
brought: 90% of the projects were automatically switched to
"pypi-explicit" mode, making installs faster and more reliable for
everyone across the board.  Let me specifically thank, once
again, our grand tooler Donald here who implemented most of it.

However, PEP470 does not achieve this level of backward compatibility yet.
Let's look at its current procedure leading up to the final switch:

    "After that switch, an email will be sent to projects which rely on
    hosting external to PyPI. This email will warn these projects that
    externally hosted files have been deprecated on PyPI and that in 6
    months from the time of that email that all external links will be
    removed from the installer APIs. (...)

    Five months after the initial email, another email must be sent to
    any projects still relying on external hosting. (...)

    Finally a month later all projects will be switched to the pypa-only
    mode and PyPI will be modified to remove the externally linked files
    functionality."

This process tries to trigger changes from those 2974 project maintainers 
who are today operating in pypi-crawl* modes.  If we are left with 1000 
stale project maintainers at final-switch time, and speculate about just 100 
downloads for each of their projects, it means this final switch may get 
us 100000 failing installation interactions the day after the final switch.  
It might be higher or lower, but I hope we agree that we'll very likely 
have a significant "stale project maintainer" problem affecting 
many end users, existing CI installations etc.

Even for those maintainers who switch to an external index
as currently advertised by the PEP, and with their release files also
being downloaded 100 times each, we'll have another 50000 interactions 
from end users who need to re-configure their tool usage to 
use an external index.  Granted, those using a new pip version would get 
a useful hint on how to do that.  Others, using older versions, would have 
to find the project's pypi page and hopefully understand from it how to 
make their stuff work again.
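
To make the arithmetic above explicit, here is a minimal sketch in
Python.  The inputs are the speculative figures from this mail; the
number of maintainers switching to an external index is not stated
above and is only derived here from the 50000 figure, so treat it as
an implied assumption rather than a measured value.

```python
# Back-of-the-envelope estimate using the speculative figures from this mail.
STALE_MAINTAINERS = 1000       # speculated stale maintainers at final-switch time
DOWNLOADS_PER_PROJECT = 100    # speculated downloads per project
RECONFIG_INTERACTIONS = 50000  # figure quoted for users of external-index projects


def failing_installs(stale_maintainers, downloads_per_project):
    """Installs that would break once external links are removed."""
    return stale_maintainers * downloads_per_project


def implied_switching_maintainers(interactions, downloads_per_project):
    """Maintainer count implied by the quoted interaction figure (derived)."""
    return interactions // downloads_per_project


failing = failing_installs(STALE_MAINTAINERS, DOWNLOADS_PER_PROJECT)
switchers = implied_switching_maintainers(RECONFIG_INTERACTIONS,
                                          DOWNLOADS_PER_PROJECT)
affected = failing + RECONFIG_INTERACTIONS

print(failing)    # 100000
print(switchers)  # 500
print(affected)   # 150000
```

Changing either speculative input scales the result linearly, which is
why the estimate "might be higher or lower" but stays significant.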

In any case, we'd likely get a ton of end-user side installation issues 
and I think PEP470 needs to be modified to try to minimize this number.
It could pick up the ball where PEP438 dropped it:

    "Thus the hope is that eventually all projects on PyPI can be migrated to
    the pypi-explicit mode, while preserving the ability to install release
    files hosted externally via installer tools. Deprecation of hosting
    modes to eventually only allow the pypi-explicit mode is NOT REGULATED
    by this PEP but is expected to become feasible some time after
    successful implementation of the transition phases described in this
    PEP. It is expected that deprecation requires a new process to deal with
    abandoned packages because of unreachable maintainers for still popular
    packages."

PEP470 could be this successor, cleaning up and simplifying the situation.
But how do we maintain full backward compat and get rid of crawling? 
Here is a sketch of a process by which we could get rid of pypi-crawl* modes:

- send a warning note to maintainers a month before their pypi-crawl* 
  hosted projects are converted (informing them about the process, see next
  points).  Advertise a tool to convert pypi-crawl* hosting modes to
  pypi-explicit.  This tool automates the crawling and either registers all
  found release files as explicit references with MD5s, or uploads them to
  become pypi-hosted files, at the option of the maintainer.  It will also
  switch the hosting mode on the pypi site automatically.

  At warning time we'll also disallow pypi-crawl* modes on pypi, both for
  new projects and for switching to them from other modes.

- a month later a pypi admin (guess who!) uses the same conversion tool,
  but with his admin superpowers, to convert any remaining 
  pypi-crawl* hosting-mode projects automatically with one addition:
  all those admin-converted projects will get a "stale" flag
  because the maintainer did not react and perform the conversion himself.
  This "stale" status will be shown on the web page and new tool releases
  can maybe learn to read this flag from the simple page so that they can warn 
  the end users they are installing a project with a known-to-be stale 
  maintainer.

  The admin-driven conversion can be done incrementally in bunches,
  to make it even more unlikely that we are going to face storms 
  of unhappy end users at any one point and to iron out issues as we go.
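
The two steps above can be sketched roughly as follows in Python.  This
is only an illustration under stated assumptions, not the actual tool:
the Project record, the convert() and batches() helpers and the
"pypi-crawl" mode string (the shorthand used in this mail) are all
hypothetical, and the crawl itself is replaced by a plain mapping of
URL to file bytes.  The "#md5=" fragment is the checksum form used on
pypi's simple pages.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical stand-in for a pypi project record."""
    name: str
    hosting_mode: str  # "pypi-crawl" (shorthand from this mail) or "pypi-explicit"
    explicit_links: list = field(default_factory=list)
    stale: bool = False


def explicit_link(url, content):
    """Return a direct link carrying the file's MD5 as a URL fragment."""
    return "%s#md5=%s" % (url, hashlib.md5(content).hexdigest())


def convert(project, crawled_files, by_admin=False):
    """Register crawled files as explicit links and switch the hosting mode.

    crawled_files maps URL -> file bytes and stands in for a real crawl.
    Projects converted by an admin are flagged as stale.
    """
    if not project.hosting_mode.startswith("pypi-crawl"):
        return project  # nothing to convert
    project.explicit_links = [
        explicit_link(url, content)
        for url, content in sorted(crawled_files.items())
    ]
    project.hosting_mode = "pypi-explicit"
    project.stale = by_admin
    return project


def batches(items, size):
    """Yield successive chunks so the admin conversion can run incrementally."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


example = Project("example", "pypi-crawl")
convert(example,
        {"https://example.org/example-1.0.tar.gz": b"release bytes"},
        by_admin=True)
print(example.hosting_mode, example.stale)
print(example.explicit_links[0])
```

A maintainer running the conversion would call convert() without
by_admin, so only projects the admin had to convert carry the flag.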

The result of this process is that we have only one hosting mode: 
pypi-explicit which is already introduced and specified with PEP438. 
And pypi's simple pages will continue to present two kinds of links:

- rel="internal": release files directly uploaded to pypi

- other external links will be direct URLs with hash-checksums to external
  release files.  Tools can already recognize them and inform the user.

  sidenote: if people have a PIP_DOWNLOAD_CACHE they will
  only depend on reachability of pypi after they have first installed
  an external dependency.  So it's operationally a good situation given
  the fact that using "--allow-external" provides exactly the same 
  file installation integrity as pypi-hosted files themselves do.
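
As an illustration of how a tool can tell these two kinds of links
apart, here is a minimal sketch using only the Python standard library.
The page content, the example.org URL and the SimplePageLinks class are
made up for this example; it assumes checksums appear as an "#md5=..."
or "#sha256=..." URL fragment, and it is not how pip itself is
implemented.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class SimplePageLinks(HTMLParser):
    """Sort the links of a simple page into the kinds described above."""

    def __init__(self):
        super().__init__()
        self.internal = []              # rel="internal": uploaded to pypi
        self.external_checksummed = []  # direct external URL with a hash
        self.other = []                 # anything else (unverified)

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        if attrs.get("rel") == "internal":
            self.internal.append(href)
        elif urlparse(href).fragment.startswith(("md5=", "sha256=")):
            self.external_checksummed.append(href)
        else:
            self.other.append(href)


# Made-up page with one pypi-hosted file and one checksummed external file.
PAGE = """
<a href="../../packages/source/e/example/example-1.0.tar.gz#md5=0123456789abcdef0123456789abcdef" rel="internal">example-1.0.tar.gz</a>
<a href="https://example.org/dl/example-0.9.tar.gz#md5=fedcba9876543210fedcba9876543210">example-0.9.tar.gz</a>
"""

parser = SimplePageLinks()
parser.feed(PAGE)
print(len(parser.internal), len(parser.external_checksummed), len(parser.other))
```

After the transition described here the "other" bucket should stay
empty for every project, which is what would let tools drop support for
unverified links over time.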

After we have completed the automated admin-pypi transition there is no
external scraping and there are no unverified links, and tools could drop
support for them over time.  There remain two ways to release files:  upload
them to pypi or register a checksummed link.   In addition, we will have 
a clear list of "stale" marked projects and can work 
with it further.

Note that with this proposed process 93% of maintainers, most toolers
and all end-users can remain ignorant of this PEP and will not be
bothered: everything just continues to work unmodified.  Some end users
will experience a speed up because the client-side will not need
to download/crawl additional external simple pages.  There are no new
things people need to learn except for the "crawl" maintainers to whom
we nicely and empathically send a message: "switch or be switched" :)

You'll note that the process proposed here does not require
pypi.python.org to manage "external additional indexes" information or
tools to learn to recognize them.  At this point, I am not sure it's 
really needed for the cleanup and simplification issues PEP470 tries to 
address.

backward-compat-is-a-thing'ly yours,
holger

_______________________________________________
Distutils-SIG maillist  -  [email protected]
https://mail.python.org/mailman/listinfo/distutils-sig
