On 5 May 2017 at 14:10, Gregory P. Smith <g...@krypto.org> wrote:
> This is not a solvable problem. IMNSHO We should never attempt to implement
> pre screening of packages.
>
> It is a good post-package-upload task for someone to try and do as a
> research project.
>
> Automated code scanning can only find already known things and similar
> signatures (at which point it can have false positives) and we aren't just
> talking about obfuscated source code. PyPI hosts binary wheels made using
> unreproduceable build processes on untrusted machines created from
> unverifiable inputs. Scanning services such as Google's
> https://www.virustotal.com/en/about/ exist but I'm not sure that'd be of
> much value to PyPI.
Red Hat's approach to this (https://github.com/fabric8-analytics/) relies
heavily on "popularity within your cohort" as a proxy for safety. It's far
from being a perfect approach (since there's still a risk of the "bystander
effect" coming into play, where everyone assumes everyone else is handling
the security audits), but it at least gives people a heads up when they're
doing something relatively unusual, and hence may want to take more care and
treat their potential dependencies with a bit more suspicion.

Cheers,
Nick.

P.S. Full disclosure: until I switched teams a few months ago, working on
fabric8-analytics (and its precursor projects) was my day job at Red Hat. As
far as I'm aware, the current version still doesn't take the raw PyPI
BigQuery download data into account, but it does track component usage
across public GitHub repositories. The benefit of focusing on the latter is
that it gives you co-occurrence information (i.e. "component X is often used
in combination with component Y"), rather than the raw popularity metrics
offered by the download numbers (which can also be heavily skewed by
artifact caches, and the lack thereof, in automated build and test
pipelines).

-- 
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
_______________________________________________
PSF-Community mailing list
PSF-Community@python.org
https://mail.python.org/mailman/listinfo/psf-community
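
For readers who want something concrete, the co-occurrence idea mentioned in the P.S. can be roughly sketched as follows. This is only an illustration and not how fabric8-analytics is actually implemented: the function names, the sample dependency lists, and the "fraction of the cohort" measure are all assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(manifests):
    """Count how many projects use each unordered pair of components together.

    `manifests` is a list of dependency lists, one per project
    (hypothetical input; a real system would extract these from repositories).
    """
    pair_counts = Counter()
    for deps in manifests:
        # De-duplicate and sort so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(deps)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def cohort_usage(manifests, anchor, candidate):
    """Fraction of projects using `anchor` that also use `candidate`.

    A low value suggests `candidate` is an unusual choice within that
    cohort; it says nothing about whether the component is actually safe.
    """
    cohort = [set(deps) for deps in manifests if anchor in deps]
    if not cohort:
        return 0.0
    return sum(candidate in deps for deps in cohort) / len(cohort)


# Made-up sample data, purely for illustration.
manifests = [
    ["flask", "jinja2", "requests"],
    ["flask", "jinja2"],
    ["django", "requests"],
    ["flask", "obscure-lib"],
]

pairs = cooccurrence_counts(manifests)
print(pairs[("flask", "jinja2")])
print(cohort_usage(manifests, "flask", "obscure-lib"))
```

A component that rarely shows up alongside the rest of a project's stack would get a low `cohort_usage` score, which is the "heads up" signal described above rather than any verdict on the component itself.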