On Mar 10, 2013, at 12:29 PM, Donald Stufft <don...@stufft.io> wrote:

>
> On Mar 10, 2013, at 2:18 PM, holger krekel <hol...@merlinux.eu> wrote:
>
>> On Sun, Mar 10, 2013 at 13:35 -0400, Donald Stufft wrote:
>>> On Mar 10, 2013, at 11:07 AM, holger krekel <hol...@merlinux.eu> wrote:
>>>> [...]
>>>> Transitioning to "pypi-cache" mode
>>>> -------------------------------------
>>>>
>>>> When transitioning from the currently implicit "pypi-ext" mode to
>>>> "pypi-cache" for a given package, a package maintainer should
>>>> be able to retrieve/verify the historic release files which will
>>>> be cached from pypi.python.org.  The UI should present this list
>>>> and have the maintainer accept it for completing the transition
>>>> to the "pypi-cache" mode.  Upon future release registration actions,
>>>> pypi.python.org will perform crawling for the homepage/download sites
>>>> and cache release files *before* returning a success return code for
>>>> the release registration.
>>>> [...]
>>>
>>> Some concerns:
>>>
>>> 1. We cannot automatically switch people to pypi-cache. We _have_ to get 
>>> explicit permission from them.
>>
>> Could you detail how you arrive at this conclusion?
>> (I've seen the claim before but not the underlying reasoning, maybe
>> I just missed it)
>>
>> There would be prior notifications to the package maintainers.  If they
>> don't want to have their packages cached at pypi.python.org, they can set
>> the mode to "pypi-only" and leave manual instructions.  I suspect there will
>> be very few people, if anyone, objecting to pypi-cache mode.  If that is
>> false we might need to prolong pypi-ext mode some more for them and
>> switch them to pypi-only when we eventually decide to get
>> rid of external hosting.
>
> I asked VanL. His statement on re-hosting packages was:
>
>    "We could do it if we had permission. The tricky part would be getting 
> permission for already-existing packages."
>
> I'm pretty sure that emailing someone and assuming we have permission if they 
> don't opt-out doesn't count as permission.
>
>>
>>> 2. The cache mechanism is going to be fragile, and in the long term leaves 
>>> a window open for security issues.
>>
>> fragility: not sure it's too bad.  Once the mode is activated, release
>> registration ("submit" POST action on "/pypi" http endpoint) will only
>> succeed if the corresponding releases can be found through homepage/download.
>> Changing the mode to pypi-cache in the presence of historic release
>> files hosted elsewhere needs a good pypi.python.org UI interaction and
>> may take several tries if necessary sites cannot be reached.  Nevertheless,
>> this step is potentially fragile [X].
>
> I see, so pypi-cache would only be triggered once during release creation. 
> Cache makes it sound like we'd continuously monitor the given external urls 
> instead of it actually being a pull based method of getting files.

I think the term "mirror" is more accurate than "cache" here.

Aaron Meurer
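
A minimal sketch of the one-time pull at release registration discussed in
this thread, under stated assumptions: `register_release`, `fetch`, `store`,
and `RegistrationError` are hypothetical names for illustration and are not
part of PyPI's code. Registration succeeds only if every file can be pulled,
and nothing is stored on failure.

```python
from typing import Callable, Dict, Iterable


class RegistrationError(Exception):
    """Raised when a release file cannot be pulled at registration time."""


def register_release(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    store: Dict[str, bytes],
) -> None:
    """Pull every release file once; fail the registration if any pull fails.

    `fetch` is a caller-supplied downloader (hypothetical) that raises
    OSError when a site cannot be reached. Nothing is written to `store`
    unless all files were fetched, so a failed attempt leaves no partial
    state and can simply be retried.
    """
    pulled = {}
    for url in urls:
        try:
            pulled[url] = fetch(url)
        except OSError as exc:
            raise RegistrationError(f"could not fetch {url}: {exc}") from exc
    # All pulls succeeded: commit the copies in one step.
    store.update(pulled)
```

Because the pull happens once and is never repeated, later changes on the
external site have no effect on the stored copies.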

>
>>
>> Security: the PEP does not try to prevent package tampering. MITM attacks
>> between pypi.python.org and the download sites may occur as much as they
>> can happen today between installers and the download sites.
>> I think we should consider protection against package tampering
>> in a separate discussion/PEP.
>>
>>> If we're going to do a phased in per project solution like this I think it 
>>> would work much better to have 2 modes.
>>>
>>> 1. Legacy - Current behavior, new external links are accepted, existing 
>>> ones are displayed
>>
>>> 2. PyPI Only - New behavior, no new external links are accepted, existing 
>>> ones are removed
>>>
>>> Present the project owners with 2 one way buttons:
>>>  - Switch to PyPI Only and re-host external files [1]
>>
>> Doesn't this have the same fragility problem as [X] above?
>
> Yes, and any pull-based solution will. The difference is that with a
> one-time-and-done solution we can live with a little bit more fragility.
>
>>
>>>  - Switch to PyPI Only and do NOT re-host external files
>>
>> Are there any problems for doing this automatically (with a prior
>> notification to maintainers) for all the projects where we don't
>> find externally hosted packages?  I'd expect very few false negatives
>> and they can be quickly switched back.
>
> Only thing I could think of is a host being temporarily down being counted as 
> a false positive.
>
>>
>> Back to pypi-cache: it is there to make it super-easy for package
>> maintainers.  There are all kinds of release habits and scripts pushing out
>> things to google/bitbucket/github/other sites.  With "pypi-cache" they
>> don't need to change any of that.  They just need to be fine with
>> pypi.python.org pulling in the packages for caching.
>
> Yes I understand the goal here. The problem is that there's not really a good 
> way to secure this without requiring changes to their workflow. At best 
> they'll have to push information about every file so that PyPI is able to 
> verify the files it is downloading, and if we are requiring them to push data 
> about those files we might as well require them to push the files themselves. 
> This also has the effect that we can provide immediate feedback when files
> do not validate on PyPI.
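
To illustrate the verification step described above, here is a minimal
sketch, assuming the maintainer pushes a hex digest for each file and the
index compares it against what it downloaded. The function name
`verify_release_file` and the choice of SHA-256 are assumptions made for
this example; the thread does not specify a hash algorithm.

```python
import hashlib
import hmac


def verify_release_file(content: bytes, expected_sha256: str) -> bool:
    """Return True if `content` hashes to the maintainer-supplied digest.

    `expected_sha256` is the hex digest the maintainer pushed alongside
    the release metadata (an assumption of this sketch).
    """
    actual = hashlib.sha256(content).hexdigest()
    # Normalise the supplied digest, then compare in constant time.
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A mismatch can be reported to the maintainer immediately, which is the
feedback described above. The digest still has to be pushed for every file,
so the workflow change is not avoided.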
>
>>
>> We might think about phasing out pypi-cache after some larger time
>> frame so that we eventually only have pypi-only and things are
>> simpler and saner.
>>
>> best,
>> holger
>>
>>
>>
>>> These buttons would be one time and quit. Once your project has been 
>>> switched to PyPI Only you cannot go back to Legacy mode. All new projects 
>>> would be already switched to PyPI Only. After some amount of time switch 
>>> all Projects to PyPI Only but _do not_ re-host their packages as we cannot 
>>> legally do so without their permission.
>>>
>>> The above is simpler, still provides people an easy migration path, moves 
>>> us to remove external hosting, and doesn't entangle us with legal issues.
>>>
>>> [1] There is still a small window here where someone could MITM PyPI 
>>> fetching these files, however since it would be a one time and done deal 
>>> this risk is minimal and is worth it to move to a PyPI-only solution.
>>>
>>> -----------------
>>> Donald Stufft
>>> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>
>
> -----------------
> Donald Stufft
> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG@python.org
> http://mail.python.org/mailman/listinfo/catalog-sig