I wish there were a way to automatically redirect people requesting PIL to Pillow instead. It's a PyPI/setuptools-friendly repackaging of PIL that has been around for years now:
https://pypi.python.org/pypi/Pillow/2.4.0

On 19 May 2014 23:29, Donald Stufft <[email protected]> wrote:

> Just an update, asyncmongo has released to PyPI now, so I’ve removed
> them from the gists as well. Still no word back from PIL.
>
> On May 18, 2014, at 11:21 AM, Donald Stufft <[email protected]> wrote:
>
> >
> > On May 18, 2014, at 2:20 AM, holger krekel <[email protected]> wrote:
> >
> >> On Sat, May 17, 2014 at 20:20 -0400, Donald Stufft wrote:
> >>> On May 17, 2014, at 1:51 PM, holger krekel <[email protected]> wrote:
> >>>
> >>>> On Sat, May 17, 2014 at 11:32 -0400, Donald Stufft wrote:
> >>>>> More conclusions!
> >>>>>
> >>>>> In that same time period PyPI received a total of ~16463209 hits
> >>>>> to a page on the simple installer API. This means that in total
> >>>>> these projects represent a combined 0.56% of the simple installer
> >>>>> traffic on PyPI. However looking at the numbers you can see that
> >>>>> PIL is an obvious outlier with the hits dropping drastically after
> >>>>> that. PIL on its own represents 0.44% of the hits on PyPI during
> >>>>> that time period leaving only 0.12% for anything not PIL.
> >>>>
> >>>> So the current numbers roughly mean that around 92193 end-user
> >>>> sites per day depend on crawling currently, right? Do you know if
> >>>> these are also unique IPs (they might indicate duplicates although
> >>>> companies also have NATting firewalls)?
> >>>>
> >>>> holger
> >>>
> >>> Here’s the number of IP addresses that accessed each /simple/ page
> >>> per day.
> >>>
> >>> https://gist.github.com/dstufft/347112c3bcc91220e4b2
> >>>
> >>> Unique IPs: 95541
> >>> Unique IPs for Only Hosted off PyPI: 8248 (8.63%)
> >>> Unique IPs for Only Hosted off PyPI w/o PIL: 2478 (2.59%)
> >>>
> >>> It's important to remember when looking at these numbers that almost
> >>> all of them represent something downloading a package unsafely which
> >>> will generally contain Python code that will then be executed.
> >>> Breaking the unsafe thing is, in my opinion, non-optional and the
> >>> only thing needed to be discussed about it is how to go about doing
> >>> it exactly. The safe thing I think *should* be removed for the
> >>> various other reasons that have been outlined and it only represents
> >>> a tiny fraction of uses.
> >>>
> >>> The numbers to be specific are, 8248 of the above 8248 IPs
> >>> downloaded something unsafely, while 214 of them also downloaded
> >>> something safely. That means that 100% of the 8248 addresses could
> >>> have been attacked through their use of PyPI and only 2.59%
> >>> downloaded anything that was safely hosted off of PyPI.
> >>>
> >>> Looking at the same numbers for projects which have *any* files
> >>> hosted off of PyPI (the numbers thus far have been projects which
> >>> have *only* files hosted off of PyPI) I see that 35046 IP addresses
> >>> accessed a project that had any unsafely hosted off of PyPI files
> >>> while only 2852 IP addresses accessed a project that had any safely
> >>> hosted off of PyPI files.
> >>>
> >>> That means that roughly a minimum floor of ~36% of the users of PyPI
> >>> were vulnerable to a MITM attack on 2014-05-14 unless they were
> >>> using pip 1.5 without any --allow-unverified flags or they were
> >>> using pip 1.4 with --allow-no-insecure and even in that case they
> >>> could still be vulnerable if there is any use of setup_requires. I
> >>> say that's a minimum because that only counts the projects where I
> >>> happened to find a file hosted unsafely externally. It does not
> >>> count at all any projects which I did not find a file like that but
> >>> which still has locations on their simple page like that. This is
> >>> especially troublesome for projects where they have old domain names
> >>> in those links that point to domains that are no longer registered.
> >>>
> >>> Also just FYI I've removed pyPDF from both lists as I've contacted
> >>> the author and there are packages now hosted on PyPI for it. I've
> >>> also contacted PIL and a few other authors (of which I've just heard
> >>> back from cx_Oracle and they appear to be willing to upload as
> >>> well).
> >>
> >> Thanks Donald for both the numbers and contacting some key authors
> >> which i think is a very good move! I suggest to now wait a week or
> >> so to see where we stand then, update the numbers and then try to
> >> settle on crawl-deprecation paths.
> >>
> >> Also, let's please just talk about "checksummed" packages or
> >> integrity. Even all pypi hosted packages are unsafe in the sense that
> >> they might contain bad code from malicious uploaders or
> >> http-interceptors that executes on end-user machines during
> >> installation. Thus the term "safe" is misleading and should not be
> >> used when communicating to end-users. Currently, we can only say or
> >> improve anything related to integrity: what people download is what
> >> was uploaded by whoever happened to have the credentials (*) or MITM
> >> access on http upload. Speaking of the latter, maybe we should also
> >> think about moving to https uploads and certificate-pinning, and that
> >> also for installers. And also, as Marius pointed out, pypi is
> >> currently using the relatively weak MD5 hash.
> >
> > The problem with upload is when people use setup.py upload they are
> > oftentimes using the upload from distutils. Since that is in the
> > standard library we can't really go backwards in time and make it
> > safe. People who use my twine utility to upload instead of setup.py
> > upload are not vulnerable to MITM on upload.
> >
> > While I don't particularly like the MD5 hash, it's not true that the
> > MD5 hash currently presents a problem against the threat model that
> > we're worried about. It's relatively easy to generate a collision
> > attack, which would mean that a malicious author could generate two
> > packages, an unsafe and a safe one that hashed to the same thing.
> > However MD5 is still resistant to 2nd preimage attacks so an attacker
> > could not create a package that hashes to a given hash.
> >
> >>
> >> Without resolving these issues we can not even truthfully declare
> >> integrity as something that the pypi-hosted packages themselves are
> >> providing.
> >
> > We cannot fix every problem at once. Right now the tools exist for
> > authors to make it possible to do everything safely. The externally
> > hosted files represent an easier to exploit attack than a MITM on
> > author upload. The MITM requires a privileged network position on
> > specific individuals who are also not using twine or the browser to
> > upload their distributions.
> >
> > Attacking people who are installing these packages is far easier. It
> > would either require a privileged network position on one of ~90k IP
> > addresses on any particular day (a much easier feat than for authors
> > periodically) or, even easier, locate an expired domain registration
> > and simply register the domain which wouldn't require a privileged
> > network position at all.
> >
> >>
> >> best,
> >> holger
> >>
> >> (*) did you happen to have run some password crackers against
> >> the pypi database? Might be a larger attack vector than hijacking
> >> DNS entries.
> >
> > No I have not. The database currently uses bcrypt with a work factor
> > of 12 which makes it computationally hard for me to brute force
> > passwords for all ~30k users which have a password set. If there was
> > a specific user I was interested in a smart brute force attack might
> > be able to locate something. Rate-limiting login attempts is also on
> > the list of things to add in Warehouse.
> >
> > -----------------
> > Donald Stufft
> > PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
> >
> > _______________________________________________
> > Distutils-SIG maillist - [email protected]
> > https://mail.python.org/mailman/listinfo/distutils-sig
>
> -----------------
> Donald Stufft
> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>
> _______________________________________________
> Distutils-SIG maillist - [email protected]
> https://mail.python.org/mailman/listinfo/distutils-sig
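
For anyone following along, the "checksummed"/integrity check discussed above (comparing a downloaded file against a digest published next to its link) can be sketched in a few lines of Python. This is only an illustration and not how pip implements it: it assumes the link carries the digest as a URL fragment of the form #md5=<hex>, and the function names and the example.invalid URL are made up for the example.

```python
import hashlib
from urllib.parse import urlparse


def expected_digest(link):
    """Split a fragment such as '#md5=abc123' into (algorithm, hex digest).

    Returns None when the link carries no usable digest, i.e. there is
    nothing to verify the download against.
    """
    fragment = urlparse(link).fragment
    if "=" not in fragment:
        return None
    algorithm, _, hexdigest = fragment.partition("=")
    if algorithm not in hashlib.algorithms_available:
        return None
    return algorithm, hexdigest.lower()


def verify_download(link, payload):
    """Return True only if payload matches the digest named in the link."""
    expected = expected_digest(link)
    if expected is None:
        return False  # no digest to check against, so treat as unverified
    algorithm, hexdigest = expected
    return hashlib.new(algorithm, payload).hexdigest() == hexdigest


if __name__ == "__main__":
    # Hypothetical link and payload, for illustration only.
    data = b"example archive bytes"
    url = "https://example.invalid/demo-1.0.tar.gz#md5=" + hashlib.md5(data).hexdigest()
    print(verify_download(url, data))
    print(verify_download(url, data + b" tampered"))
```

A passing check only says the bytes match what the index page lists, which is the "integrity" sense holger describes. It says nothing about whether the uploaded code itself is trustworthy, and it is only as strong as the channel the index page was fetched over.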
