On Wed, Jul 22, 2015 at 9:12 PM Wes Turner <wes.tur...@gmail.com> wrote:
>
> On Jul 22, 2015 5:12 PM, "Brett Cannon" <bcan...@gmail.com> wrote:
> >
> > On Wed, Jul 22, 2015 at 2:19 PM Wes Turner <wes.tur...@gmail.com> wrote:
> >>
> >> https://github.com/dstufft/pypi-stats
> >>
> >> https://github.com/dstufft/pypi-external-stats
> >
> > I'm not quite sure what I'm supposed to get from those links, Wes, as
> > that code still scrapes every project individually and downloads them,
> > while all I'm trying to do is avoid having to scrape PyPI and instead
> > just download a single file (plus I don't want the files but just the
> > metadata already returned by the JSON API).
>
> An online query or an offline dump?
>

Offline dump. I literally just want a single file to download. Anyway, it's
sounding like there isn't one currently, so it would need to be a new feature
for Warehouse.

-Brett

> >
> > -Brett
> >
> >>
> >> - [ ] a flat BigQuery w/ pandas.io.gbq ala GitHub Archive would be great
> >>   http://pandas.pydata.org/pandas-docs/version/0.16.2/io.html#io-bigquery
> >>
> >> - [ ] it's probably worth it to add RDFa to PyPI and warehouse pages
> >>   (in addition to the auxiliary executed/extracted JSON) for #search
> >>   https://github.com/pypa/warehouse/blob/master/warehouse/packaging/models.py
> >>   https://github.com/pypa/warehouse/blob/master/tests/unit/packaging/test_models.py
> >>   https://github.com/pypa/warehouse/blob/master/warehouse/packaging/views.py
> >>   https://github.com/pypa/warehouse/blob/master/warehouse/templates/packaging/detail.html
> >>   https://github.com/pypa/warehouse/blob/master/warehouse/routes.py
> >>   https://github.com/pypa/warehouse/blob/master/tests/unit/legacy/api/test_json.py
> >>   https://github.com/pypa/warehouse/blob/master/warehouse/legacy/api/json.py
> >>
> >> On Jul 22, 2015 4:08 PM, "Brett Cannon" <bcan...@gmail.com> wrote:
> >>>
> >>> When I wrote https://nothingbutsnark.svbtle.com/python-3-support-on-pypi
> >>> I wrote a script to download every project's JSON metadata by scraping
> >>> the simple index and then making the appropriate GET request for the
> >>> JSON metadata. It worked, but it was somewhat of a hassle.
> >>>
> >>> Is there some dump somewhere that is built daily, weekly, or monthly
> >>> of all the metadata on PyPI for offline analysis?
> >>>
> >>> _______________________________________________
> >>> Distutils-SIG maillist - Distutils-SIG@python.org
> >>> https://mail.python.org/mailman/listinfo/distutils-sig
> >>>
>
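
For anyone following along, the scrape-then-GET approach described in the
quoted message (list every project from the simple index, then request each
project's JSON metadata) might look roughly like the sketch below. This is not
the original script. The two URLs are today's pypi.org endpoints and are an
assumption, since the thread does not name them, and every function and class
name here is made up for illustration. It makes one request per project, which
is exactly the hassle a single downloadable dump would remove.

```python
import json
import urllib.error
import urllib.request
from html.parser import HTMLParser

# Assumed endpoints (not given in the thread); these are the pypi.org URLs.
SIMPLE_INDEX_URL = "https://pypi.org/simple/"
JSON_URL_TEMPLATE = "https://pypi.org/pypi/{name}/json"


class SimpleIndexParser(HTMLParser):
    """Collect the link text of every <a> element on the simple index page."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_anchor = False

    def handle_data(self, data):
        # Only text inside an anchor is treated as a project name.
        if self._in_anchor and data.strip():
            self.names.append(data.strip())


def project_names(index_html):
    """Return the project names listed in the simple index HTML."""
    parser = SimpleIndexParser()
    parser.feed(index_html)
    parser.close()
    return parser.names


def json_url(name):
    """Build the JSON metadata URL for one project."""
    return JSON_URL_TEMPLATE.format(name=name)


def fetch_text(url):
    """GET a URL and decode the body as UTF-8."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def dump_all_metadata(out_path):
    """Write one JSON document per line; issues one request per project."""
    names = project_names(fetch_text(SIMPLE_INDEX_URL))
    with open(out_path, "w", encoding="utf-8") as out:
        for name in names:
            try:
                metadata = json.loads(fetch_text(json_url(name)))
            except urllib.error.HTTPError:
                # Skip projects whose metadata request fails (e.g. a 404).
                continue
            out.write(json.dumps(metadata) + "\n")
```

Calling dump_all_metadata("pypi-metadata.jsonl") (a hypothetical file name)
would walk the whole index, so expect it to take a long time and to put real
load on PyPI.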