> On Sep 18, 2016, at 7:42 AM, Jason Antman <ja...@jasonantman.com> wrote:
> Thanks so much for all of your work on this.
> I know it's an old thread, but I've built a tool that generates some
> statistics from this information. It queries data for a given list of
> projects, or for all of a specific user's projects, caches the data locally
> on disk, and generates both an HTML report with a bunch of graphs, as well as
> download badges. For my own use, I'm running it cron'ed from my desktop once
> a night, and uploading the reports and badges to a public Amazon S3 bucket.
> The project is: https://pypi.org/project/pypi-download-stats/
> Example output:
> It's a bit rough around the edges, and currently doesn't have any unit tests
> - my hope is that this will be an interim solution until Warehouse has
> built-in stats, but I'd be happy to polish it up a bit as time allows if
> anyone finds it useful.
Awesome, this is exactly the kind of thing I was hoping to enable by making
these all public :)
> Side note for Donald: It appears that the dataset currently contains data for
> 2016-01-22 to 2016-03-06 and 2016-05-22 to current. Is there any plan or
> possibility of backfilling either the 2016-03-07 to 2016-05-21 gap, or the
> older data?
Yes, there is. I started backfilling, then got distracted and quit (hence the
gap). Filling in the gap is easy; it’s just a matter of downloading the files
and running a script. Getting even older data requires more effort: first
munging the existing log files into the correct format, and then running them
through the same script as above. When it’s all said and done we should be
able to go back to around January 2014, I think.
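
For anyone who wants to work out which days a backfill would have to cover,
here is a rough sketch. It is not the actual backfill script (that isn't shown
in this thread); the function names are made up for illustration, and
"current" is assumed to be 2016-09-18, the date of the quoted message.

```python
from datetime import date, timedelta


def find_gaps(covered):
    """Return inclusive (start, end) gaps between covered date ranges.

    `covered` is a list of inclusive (start, end) date tuples.
    """
    gaps = []
    latest_end = None
    for start, end in sorted(covered):
        if latest_end is not None:
            gap_start = latest_end + timedelta(days=1)
            gap_end = start - timedelta(days=1)
            if gap_start <= gap_end:
                gaps.append((gap_start, gap_end))
        # Track the furthest end seen so overlapping ranges are handled.
        if latest_end is None or end > latest_end:
            latest_end = end
    return gaps


def days_in(start, end):
    """List every date from start to end, inclusive."""
    return [start + timedelta(days=n) for n in range((end - start).days + 1)]


# Ranges reported in the thread; the final end date is an assumption.
covered = [
    (date(2016, 1, 22), date(2016, 3, 6)),
    (date(2016, 5, 22), date(2016, 9, 18)),
]

gaps = find_gaps(covered)
missing = [day for start, end in gaps for day in days_in(start, end)]

for start, end in gaps:
    print(f"missing {start} to {end} ({(end - start).days + 1} days)")
```

The resulting list of missing days is what a backfill job would iterate over,
fetching and loading one day's files at a time.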
Distutils-SIG maillist - Distutils-SIG@python.org