On Mar 10, 2013, at 12:29 PM, Donald Stufft <don...@stufft.io> wrote:
>
> On Mar 10, 2013, at 2:18 PM, holger krekel <hol...@merlinux.eu> wrote:
>
>> On Sun, Mar 10, 2013 at 13:35 -0400, Donald Stufft wrote:
>>> On Mar 10, 2013, at 11:07 AM, holger krekel <hol...@merlinux.eu> wrote:
>>>> [...]
>>>> Transitioning to "pypi-cache" mode
>>>> -------------------------------------
>>>>
>>>> When transitioning from the currently implicit "pypi-ext" mode to
>>>> "pypi-cache" for a given package, a package maintainer should
>>>> be able to retrieve/verify the historic release files which will
>>>> be cached from pypi.python.org. The UI should present this list
>>>> and have the maintainer accept it for completing the transition
>>>> to the "pypi-cache" mode. Upon future release registration actions,
>>>> pypi.python.org will perform crawling for the homepage/download sites
>>>> and cache release files *before* returning a success return code for
>>>> the release registration.
>>>> [...]
>>>
>>> Some concerns:
>>>
>>> 1. We cannot automatically switch people to pypi-cache. We _have_ to get
>>> explicit permission from them.
>>
>> Could you detail how you arrive at this conclusion?
>> (I've seen the claim before but not the underlying reasoning, maybe
>> i just missed it)
>>
>> There would be prior notifications to the package maintainers. If they
>> don't want to have their packages cached at pypi.python.org, they can set
>> the mode to "pypi-only" and leave manual instructions. I suspect there will
>> be very few people if anyone, objecting to pypi-cache mode. If that is
>> false we might need to prolong pypi-ext mode some more for them and
>> eventually switch them to pypi-only when we eventually decide to get
>> rid of external hosting.
>
> I asked VanL. His statement on re-hosting packages was:
>
> "We could do it if we had permission. The tricky part would be getting
> permission for already-existing packages."
>
> I'm pretty sure that emailing someone and assuming we have permission if they
> don't opt-out doesn't count as permission.
>
>>
>>> 2. The cache mechanism is going to be fragile, and in the long term leaves
>>> a window open for security issues.
>>
>> fragility: not sure it's too bad. Once the mode is activated release
>> registration ("submit" POST action on "/pypi" http endpoint) will only
>> succeed if according releases can be found through homepage/download.
>> Changing the mode to pypi-cache in the presence of historic release
>> files hosted elsewhere needs a good pypi.python.org UI interaction and
>> may take several tries if necessary sites cannot be reached. Nevertheless,
>> this step is potentially fragile [X].
>
> I see, so pypi-cache would only be triggered once during release creation.
> Cache makes it sound like we'd continuously monitor the given external urls
> instead of it actually being a pull based method of getting files.

I think the term "mirror" is more accurate than "cache" here.

Aaron Meurer

>
>>
>> Security: the PEP does not try to prevent package tampering. MITM attacks
>> between pypi.python.org and the download sites may occur as much as they
>> can happen today between installers and the download sites.
>> I think we should consider protection against package tampering
>> in a separate discussion/PEP.
>>
>>> If we're going to do a phased in per project solution like this I think it
>>> would work much better to have 2 modes.
>>>
>>> 1. Legacy - Current behavior, new external links are accepted, existing
>>> ones are displayed
>>
>>> 2. PyPI Only - New behavior, no new external links are accepted, existing
>>> ones are removed
>>>
>>> Present the project owners with 2 one way buttons:
>>> - Switch to PyPI Only and re-host external files [1]
>>
>> Doesn't this have the same fragility problem as [X] above?
>
> Yes, and any pull based solution will. The difference is with a one time and
> done solution we can live with a little bit more fragility.
>
>>
>>> - Switch to PyPI Only and do NOT re-host external files
>>
>> Are there any problems for doing this automatically (with a prior
>> notification to maintainers) for all the projects where we don't
>> find externally hosted packages? I'd expect very few false negatives
>> and they can be quickly switched back.
>
> Only thing I could think of is a host being temporarily down being counted as
> a false positive.
>
>>
>> Back to pypi-cache: it is there to make it super-easy for package
>> maintainers. There are all kinds of release habits and scripts pushing out
>> things to google/bitbucket/github/other sites. With "pypi-cache" they
>> don't need to change any of that. They just need to be fine with
>> pypi.python.org pulling in the packages for caching.
>
> Yes I understand the goal here. The problem is that there's not really a good
> way to secure this without requiring changes to their workflow. At best
> they'll have to push information about every file so that PyPI is able to
> verify the files it is downloading, and if we are requiring them to push data
> about those files we might as well require them to push the files themselves.
> This also has the effect we can provide immediate feedback when files do not
> validate on PyPI.
>
>>
>> We might think about phasing out pypi-cache after some larger time
>> frame so that we eventually only have pypi-only and things are eventually
>> simple and saner.
>>
>> best,
>> holger
>>
>>
>>
>>> These buttons would be one time and quit. Once your project has been
>>> switched to PyPI Only you cannot go back to Legacy mode. All new projects
>>> would be already switched to PyPI Only. After some amount of time switch
>>> all Projects to PyPI Only but _do not_ re-host their packages as we cannot
>>> legally do so without their permission.
>>>
>>> The above is simpler, still provides people an easy migration path, moves
>>> us to remove external hosting, and doesn't entangle us with legal issues.
>>>
>>> [1] There is still a small window here where someone could MITM PyPI
>>> fetching these files, however since it would be a one time and done deal
>>> this risk is minimal and is worth it to move to a pypi only solution.
>>>
>>> -----------------
>>> Donald Stufft
>>> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>
>
> -----------------
> Donald Stufft
> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG@python.org
> http://mail.python.org/mailman/listinfo/catalog-sig