On 2018-09-04 11:40:17 -0500 (-0500), Dustin Ingram wrote:
> On Tue, Sep 4, 2018 at 11:33 AM Jeremy Stanley <fu...@yuggoth.org> wrote:
> >
> > Yes. If you haven't tried running a mirror of PyPI lately you're
> > likely not to have noticed, but the various nightly builds for
> > tensorflow seem to be the majority of the data on PyPI now. I'm sure
> > it's a very neat and useful tool, but we basically had to switch
> > from mirroring PyPI in our CI system to using a caching proxy
> > because of this.
>
> Side note: PyPI now provides a list of the largest packages by total
> filesize: https://pypi.org/stats/
>
> Depending on what mirror you're using, you may be able to exclude
> these packages from your mirror if you don't need them, e.g. for
> bandersnatch:
> https://github.com/pypa/bandersnatch/blob/master/docs/filtering_configuration.md#blacklist-filtering-settings

We played whack-a-mole blacklisting some of the largest offenders in
our bandersnatch config for a while, but really needed to rebuild
the mirror from scratch since there's no easy way to go back and
delete the now-blacklisted packages from before the blacklist
entries were added (and that's a week+ effort to bootstrap a new
mirror these days).

In the end we just switched to a caching proxy we already had on
hand because it got us most of the benefit of mirroring with a tiny
fraction of the disk space, given we use fewer than 1000 packaged
Python library dependencies across our CI jobs anyway.
--
Jeremy Stanley
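
For anyone in the same spot, a rough sketch of how one might at least find the leftovers: it lists blacklisted projects that still have a directory in the mirror's simple index. It assumes the mirror keeps one directory per project under `web/simple/` (adjust if yours is laid out differently, e.g. with a hashed index), and the names `normalize` and `stale_projects` are made up for illustration, not part of bandersnatch. It only reports; it deletes nothing.

```python
import re
from pathlib import Path


def normalize(name):
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def stale_projects(mirror_root, blacklisted):
    """Return blacklisted projects that still have a simple-index directory.

    Assumption: the mirror stores one directory per project under
    <mirror_root>/web/simple/. This is a hypothetical helper, not a
    bandersnatch API.
    """
    simple = Path(mirror_root) / "web" / "simple"
    if not simple.is_dir():
        return []
    wanted = {normalize(name) for name in blacklisted}
    return sorted(
        entry.name
        for entry in simple.iterdir()
        if entry.is_dir() and normalize(entry.name) in wanted
    )
```

Calling `stale_projects("/srv/pypi", ["tf-nightly"])` (path hypothetical) would show whether that project's index is still on disk. The release files themselves are stored separately from the index, so reclaiming the space would still take a further pass over them.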
--
Distutils-SIG mailing list -- distutils-sig@python.org
To unsubscribe send an email to distutils-sig-le...@python.org
https://mail.python.org/mm3/mailman3/lists/distutils-sig.python.org/
Message archived at https://mail.python.org/mm3/archives/list/distutils-sig@python.org/message/3SPP3O47YY7OO2UHADY6AA6PDJMKEFDS/