On August 27, 2015 at 10:33:15 AM, Tres Seaver (tsea...@palladion.com) wrote:
> On 08/27/2015 07:51 AM, Donald Stufft wrote:
>  
> > This leaves the user feeling annoyed that we didn’t just search those
> > locations by default. I truly think it is a bad experience and I only
> > ever added it because I wanted the discussion to be over with and I
> > was trying to placate people by giving them a bad feature
>  
> I don't understand the sensibility here: an error message which tells me
> "not hosted on PyPI, try 'pip...' instead" seems like a *good* UX to me:
> Having a tool which respects its default policy ("trust only PyPI")
> while giving me the information I need to off-road when needed is a good
> balance.
>  

Given my experience dealing with pip’s users and the fallout of PEP 438, the 
very next question we’d get after implementing that UI will either be “If pip 
knows what I need to do, why can’t it just do it for me instead of making me 
type it out again” OR “Give me a flag so I can just automatically accept every 
externally hosted index”.

Both of these asks are completely logical from an end user who doesn’t 
understand why the situation is the way it is, but are also essentially “let me 
be insecure all the time implicitly” flags.

On the other hand, if we just remove it then we can explain that we used to 
support an insecure method of finding links, but that we no longer support it. 
The difference here is that there is no breadcrumb of “here’s some information 
that pip obviously knows, because it’s telling it to you” to lead people to ask 
for a way to opt into a “global insecure” flag. We have a clear answer that 
doesn’t leave room for argument: “We no longer get that information from PyPI 
so we cannot use it”.

I think it’s a bad API because I think it’s going to frustrate people, 
particularly because pip is making them do extra legwork to approve or type in 
a repository URL. In addition, the discovery mechanism will only be in new 
versions of pip, but only about a third of our users upgrade quickly after a 
release (see: https://caremad.io/2015/04/a-year-of-pypi-downloads/), so the 
error case is going to happen for the vast bulk of users anyway (the “Unknown” 
in that graph is older than 1.4).

I also think it’s a bad experience because you’re forcing a lower “uptime” on 
any installation set that includes an external repository, unless every 
repository has 100% uptime. This is because you’re adding new single points of 
failure to the system. PyPI has had 99.94% uptime over the last year, which 
corresponds to 5 1/2 hours of downtime. Let’s assume that’s a rough average 
for what we can expect. If someone adds a single additional repository, then 
the uptime of the system as a whole becomes 99.88% (or X hours of downtime), a 
third repository brings it to 99.82% (X hours), and a fourth brings it to 
99.76% (X hours). I think this is a conservative estimate of what the effects 
of the downtime would be.
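
The percentages above can be reproduced with a short calculation. This is only 
a sketch under stated assumptions: every repository has the same 99.94% uptime, 
outages are independent of each other, and a year is 365 days. The function 
names are made up for illustration. With the rounded 99.94% figure a single 
repository comes out at about 5.26 hours per year, a little under the 5 1/2 
hours quoted, presumably because the uptime figure itself is rounded.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def combined_uptime(per_repo_uptime, repo_count):
    """Uptime of an install that needs every one of repo_count repositories.

    Assumes each repository fails independently with the same uptime.
    """
    return per_repo_uptime ** repo_count


def downtime_hours(uptime, period_hours=HOURS_PER_YEAR):
    """Expected hours of downtime over the period for a given uptime fraction."""
    return (1.0 - uptime) * period_hours


if __name__ == "__main__":
    for count in range(1, 5):
        uptime = combined_uptime(0.9994, count)
        print(f"{count} repositories: {uptime:.2%} uptime, "
              f"about {downtime_hours(uptime):.1f} hours down per year")
```

Under those assumptions, two, three and four repositories come to roughly 
10.5, 15.8 and 21.0 hours of downtime per year.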

On the other hand, here’s what I consider a good experience which is possible 
if my assumptions about what is acceptable for data sovereignty are correct:

Project “Foo” doesn’t want to host their projects in the US for $REASONS, so 
they go to https://pypi.python.org/ and register their project, and when 
registering they select to have their uploads hosted in the EU. Any time they 
upload their files to https://pypi.python.org, instead of storing them in a 
bucket in us-west-2, PyPI checks their preferences, sees they have selected 
the EU, and stores their files in eu-west-1 (Ireland).
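
As a rough illustration of the upload-side decision described above, the 
lookup could be as small as the sketch below. Everything in it is 
hypothetical: the function name, the preference values and the mapping are 
invented for the example and are not how PyPI actually works. Only the two 
region names come from the scenario. An unknown preference raises an error 
rather than silently falling back to the US bucket.

```python
# Hypothetical mapping from a project's hosting preference to a storage region.
DEFAULT_REGION = "us-west-2"
REGION_BY_PREFERENCE = {
    "US": "us-west-2",
    "EU": "eu-west-1",  # Ireland
}


def storage_region(preference=None):
    """Return the region an upload should be stored in.

    No preference means the default region; an unrecognised preference is an
    error, so files never land somewhere the project did not ask for.
    """
    if preference is None:
        return DEFAULT_REGION
    try:
        return REGION_BY_PREFERENCE[preference.upper()]
    except KeyError:
        raise ValueError(f"unknown hosting preference: {preference!r}")
```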

User “Jill” wants to install Project “Foo” and she is using pip 1.5.6 from her 
Debian operating system. When she types in ``pip install Foo``, pip goes to 
https://pypi.python.org/simple/foo/ and gets a list of files which have been hosted 
in the EU. Without any updates or changes required on her end, pip downloads 
these files and installs them.

Here’s the thing though, which I’ve been saying: I don’t know the laws and I 
don’t think it’s reasonable to expect me to learn the laws for all these other 
countries. There are open questions on how to actually implement this. For 
example, what exactly are we trying to achieve? If we’re trying to protect 
against the US government compelling the hosting company to do something, then 
you’re pretty much boned, because even if the files were hosted in the EU you 
still have the fact that it’d be a service controlled by a US nonprofit, run by 
volunteers that live in the US, developed by someone who lives in the US who is 
employed by someone who lives in the US. If we’re trying to comply with some 
sort of data locality laws like 
https://en.wikipedia.org/wiki/Data_Protection_Directive, does OSS even count as 
“personal data”? If it does, then does uploading it to https://pypi.python.org/ 
which is located in the US but storing and hosting it from the EU satisfy the 
requirements? What about putting it behind Fastly (another US company): when a 
US user requests those files, can it route them and cache them in a US 
data center? Is it OK to have it linked from https://pypi.python.org/ (again, 
hosted in the US) or do we need a whole separate repository to handle these 
files?

I think we can make this a great experience, but it is its own discussion and 
it needs to include stakeholders who actually know what the requirements are. I 
need someone who can put forth some effort into making it a reality instead of 
expecting me to do it all. If nobody wants to put in any effort to make it 
happen, maybe it’s not actually that important to them?

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA


_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig
