I wish there were a way to automatically redirect people requesting PIL to
get Pillow instead. It's a PyPI/setuptools-friendly repackaging of PIL that
has been maintained for years now:

https://pypi.python.org/pypi/Pillow/2.4.0


On 19 May 2014 23:29, Donald Stufft <[email protected]> wrote:

> Just an update, asyncmongo has released to PyPI now, so I’ve removed
> them from the gists as well. Still no word back from PIL.
>
> On May 18, 2014, at 11:21 AM, Donald Stufft <[email protected]> wrote:
>
> >
> > On May 18, 2014, at 2:20 AM, holger krekel <[email protected]> wrote:
> >
> >> On Sat, May 17, 2014 at 20:20 -0400, Donald Stufft wrote:
> >>> On May 17, 2014, at 1:51 PM, holger krekel <[email protected]> wrote:
> >>>
> >>>> On Sat, May 17, 2014 at 11:32 -0400, Donald Stufft wrote:
> >>>>> More conclusions!
> >>>>>
> >>>>> In that same time period PyPI received a total of ~16463209 hits to
> >>>>> a page on the simple installer API. This means that in total these
> >>>>> projects represent a combined 0.56% of the simple installer traffic
> >>>>> on PyPI. However, looking at the numbers you can see that PIL is an
> >>>>> obvious outlier, with the hits dropping drastically after that. PIL
> >>>>> on its own represents 0.44% of the hits on PyPI during that time
> >>>>> period, leaving only 0.12% for anything not PIL.
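As a rough check of the traffic shares quoted above, the arithmetic can be sketched in Python. The variable and function names are illustrative only, and the percentages are the rounded figures from the message, so the derived hit count is approximate.

```python
# Figures quoted in the message above; all names here are illustrative.
TOTAL_SIMPLE_HITS = 16463209   # hits to simple installer API pages
SHARE_ALL = 0.56               # percent: all of the listed projects combined
SHARE_PIL = 0.44               # percent: PIL on its own


def share_to_hits(total_hits, percent):
    """Convert a percentage share of traffic into an approximate hit count."""
    return total_hits * percent / 100.0


# Share left over for every listed project other than PIL.
share_without_pil = round(SHARE_ALL - SHARE_PIL, 2)

# Approximate number of hits behind the combined 0.56% share.
hits_all = share_to_hits(TOTAL_SIMPLE_HITS, SHARE_ALL)

print("share without PIL: %.2f%%" % share_without_pil)
print("approximate hits for all listed projects: %d" % round(hits_all))
```

Because the inputs are rounded to two decimals, the result only approximates the exact count in the underlying logs.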
> >>>>
> >>>> So the current numbers roughly mean that around 92193 end-user sites
> >>>> per day depend on crawling currently, right?  Do you know if these
> >>>> are also unique IPs (they might indicate duplicates although
> >>>> companies also have NATting firewalls)?
> >>>>
> >>>> holger
> >>>
> >>> Here’s the number of IP addresses that accessed each /simple/ page per
> >>> day.
> >>>
> >>> https://gist.github.com/dstufft/347112c3bcc91220e4b2
> >>>
> >>> Unique IPs: 95541
> >>> Unique IPs for Only Hosted off PyPI: 8248 (8.63%)
> >>> Unique IPs for Only Hosted off PyPI w/o PIL: 2478 (2.59%)
> >>>
> >>> It's important to remember when looking at these numbers that almost
> >>> all of them represent something downloading a package unsafely, which
> >>> will generally contain Python code that will then be executed.
> >>> Breaking the unsafe thing is, in my opinion, non-optional, and the
> >>> only thing that needs to be discussed about it is how exactly to go
> >>> about doing it. The safe thing I think *should* be removed for the
> >>> various other reasons that have been outlined, and it only represents
> >>> a tiny fraction of uses.
> >>>
> >>> To be specific, the numbers are: 8248 of the above 8248 IPs downloaded
> >>> something unsafely, while 214 of them also downloaded something
> >>> safely. That means that 100% of the 8248 addresses could have been
> >>> attacked through their use of PyPI and only 2.59% downloaded anything
> >>> that was safely hosted off of PyPI.
> >>>
> >>> Looking at the same numbers for projects which have *any* files hosted
> >>> off of PyPI (the numbers thus far have been for projects which have
> >>> *only* files hosted off of PyPI), I see that 35046 IP addresses
> >>> accessed a project that had any files hosted unsafely off of PyPI,
> >>> while only 2852 IP addresses accessed a project that had any files
> >>> hosted safely off of PyPI.
> >>>
> >>> That means that, at a minimum, roughly 36% of the users of PyPI were
> >>> vulnerable to a MITM attack on 2014-05-14 unless they were using pip
> >>> 1.5 without any --allow-unverified flags or they were using pip 1.4
> >>> with --allow-no-insecure, and even in that case they could still be
> >>> vulnerable if there is any use of setup_requires. I say that's a
> >>> minimum because that only counts the projects where I happened to find
> >>> a file hosted unsafely externally. It does not count at all any
> >>> projects for which I did not find a file like that but which still
> >>> have locations like that on their simple page. This is especially
> >>> troublesome for projects that have old domain names in those links
> >>> that point to domains that are no longer registered.
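The percentages in the message above can be reproduced from the raw IP counts it gives. The following is only a sketch of that arithmetic; the constant names and the `percent` helper are made up for illustration, and the counts are copied from the message.

```python
def percent(part, whole):
    """Return part / whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Counts quoted in the message; the names are illustrative only.
UNIQUE_IPS = 95541            # all unique IPs seen on /simple/ pages
ONLY_OFF_PYPI = 8248          # IPs hitting projects hosted only off PyPI
ONLY_OFF_PYPI_NO_PIL = 2478   # the same, with PIL excluded
ALSO_SAFE = 214               # of the 8248, also downloaded something safely
ANY_UNSAFE_OFF_PYPI = 35046   # IPs hitting projects with any unsafe file

shares = {
    "only off PyPI": percent(ONLY_OFF_PYPI, UNIQUE_IPS),
    "only off PyPI w/o PIL": percent(ONLY_OFF_PYPI_NO_PIL, UNIQUE_IPS),
    "safe among the 8248": percent(ALSO_SAFE, ONLY_OFF_PYPI),
    "any unsafe off PyPI": percent(ANY_UNSAFE_OFF_PYPI, UNIQUE_IPS),
}

for label in sorted(shares):
    print("%-24s %6.2f%%" % (label, shares[label]))
```

The last entry is the source of the "~36%" floor: 35046 of the 95541 unique addresses.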
> >>>
> >>> Also just FYI I've removed pyPDF from both lists as I've contacted the
> >>> author and there are packages now hosted on PyPI for it. I've also
> >>> contacted PIL and a few other authors (of which I've just heard back
> >>> from cx_Oracle and they appear to be willing to upload as well).
> >>
> >> Thanks Donald for both the numbers and contacting some key authors which
> >> i think is a very good move!  I suggest to now wait a week or so to see
> >> where we stand then, update the numbers and then try to settle on
> >> crawl-deprecation paths.
> >>
> >> Also, let's please just talk about "checksummed" packages or integrity.
> >> Even all pypi hosted packages are unsafe in the sense that they
> >> might contain bad code from malicious uploaders or http-interceptors
> >> that executes on end-user machines during installation.  Thus the term
> >> "safe" is misleading and should not be used when communicating to
> >> end-users.  Currently, we can only say or improve anything related to
> >> integrity: what people download is what was uploaded by whoever happened
> >> to have the credentials (*) or MITM access on http upload.  Speaking of
> >> the latter, maybe we should also think about moving to https uploads and
> >> certificate-pinning, and that also for installers.  And also, as Marius
> >> pointed out, pypi is currently using the relatively weak MD5 hash.
> >
> > The problem with upload is that when people use setup.py upload they are
> > oftentimes using the upload from distutils. Since that is in the standard
> > library we can't really go backwards in time and make it safe. People who
> > use my twine utility to upload instead of setup.py upload are not
> > vulnerable to MITM on upload.
> >
> > While I don't particularly like the MD5 hash, it's not true that the MD5
> > hash currently presents a problem against the threat model that we're
> > worried about. It's relatively easy to generate a collision attack, which
> > would mean that a malicious author could generate two packages, an unsafe
> > and a safe one, that hashed to the same thing. However, MD5 is still
> > resistant to second-preimage attacks, so an attacker could not create a
> > package that hashes to a given hash.
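To make the integrity check under discussion concrete, here is a minimal sketch of comparing downloaded bytes against a published digest. It uses only the standard library `hashlib` and `hmac` modules. The `matches_digest` helper and the sample payload are hypothetical and are not how pip or PyPI implement the check.

```python
import hashlib
import hmac


def matches_digest(data, expected_hex, algorithm="md5"):
    """Return True if the digest of data equals the published hex digest."""
    actual = hashlib.new(algorithm, data).hexdigest()
    # Constant-time comparison; hex digests are plain ASCII strings.
    return hmac.compare_digest(actual, expected_hex.lower())


# Hypothetical stand-in for a downloaded distribution file.
payload = b"example package contents"
published = hashlib.md5(payload).hexdigest()

print(matches_digest(payload, published))
print(matches_digest(payload + b" tampered", published))
```

Second-preimage resistance is what this check relies on: someone who did not choose the original file cannot produce different bytes that pass it. The `algorithm` argument lets the same sketch run with a stronger hash such as `sha256`.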
> >
> >>
> >> Without resolving these issues we can not even truthfully declare
> >> integrity as something that the pypi-hosted packages themselves are
> >> providing.
> >
> > We cannot fix every problem at once. Right now the tools exist for
> > authors to make it possible to do everything safely. The externally
> > hosted files represent an easier to exploit attack than a MITM on author
> > upload. The MITM requires a privileged network position on specific
> > individuals who are also not using twine or the browser to upload their
> > distributions.
> >
> > Attacking people who are installing these packages is far easier. It
> > would require either a privileged network position on one of ~90k IP
> > addresses on any particular day (a much easier feat than for authors
> > periodically) or, even easier, locating an expired domain registration
> > and simply registering the domain, which wouldn't require a privileged
> > network position at all.
> >
> >>
> >> best,
> >> holger
> >>
> >> (*) did you happen to have run some password crackers against
> >> the pypi database?  Might be a larger attack vector than hijacking
> >> DNS entries.
> >
> > No I have not. The database currently uses bcrypt with a work factor of
> > 12, which makes it computationally hard for me to brute force passwords
> > for all ~30k users which have a password set. If there was a specific
> > user I was interested in, a smart brute force attack might be able to
> > locate something. Rate-limiting login attempts is also on the list of
> > things to add in Warehouse.
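A back-of-the-envelope sketch of why a work factor of 12 makes a bulk attack expensive while a targeted one stays feasible. The work factor and user count come from the message. The attacker's hashing speed and the size of the guess list are pure assumptions chosen for illustration, not measurements.

```python
WORK_FACTOR = 12               # bcrypt cost parameter, from the message
USERS_WITH_PASSWORD = 30000    # "~30k users", from the message

# ASSUMPTIONS (hypothetical, for illustration only):
ASSUMED_HASHES_PER_SECOND = 10.0    # attacker throughput at this cost
ASSUMED_GUESSES_PER_USER = 1000000  # size of a "smart" candidate list


def key_expansion_rounds(work_factor):
    """bcrypt runs 2**cost iterations of its expensive key setup."""
    return 2 ** work_factor


def brute_force_days(users, guesses_per_user, hashes_per_second):
    """Days needed to try every guess against every (salted) user hash."""
    total_hashes = float(users) * guesses_per_user
    return total_hashes / hashes_per_second / 86400.0


print("rounds per hash: %d" % key_expansion_rounds(WORK_FACTOR))
print("all users:  %.0f days" % brute_force_days(
    USERS_WITH_PASSWORD, ASSUMED_GUESSES_PER_USER, ASSUMED_HASHES_PER_SECOND))
print("one user:   %.1f days" % brute_force_days(
    1, ASSUMED_GUESSES_PER_USER, ASSUMED_HASHES_PER_SECOND))
```

Since each bcrypt hash carries its own salt, work done against one account cannot be reused for another, so the cost scales linearly with the number of users.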
> >
> > -----------------
> > Donald Stufft
> > PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
> >
> > _______________________________________________
> > Distutils-SIG maillist  -  [email protected]
> > https://mail.python.org/mailman/listinfo/distutils-sig
>
>
> -----------------
> Donald Stufft
> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>