Hi,

I ran a couple of queries against GitHub’s public BigQuery dataset [0] last 
week. I’m interested in requirements files in particular, so I ran a query 
extracting all available requirements files.

Since queries against this dataset are rather expensive ($7 for a query over 
all repos), I thought I’d share the raw data here [1]. Each record contains the 
repo name, the requirements file path and the contents of the file. Every line 
is a separate JSON object; read it with:

import json

with open('data.json') as f:
    for line in f:  # iterate lazily instead of loading the whole file
        data = json.loads(line)  # one record per line
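
As one rough idea of what could be done with each record, here is a minimal sketch that counts how many requirements files mention each package. The key name "content" is an assumption and may not match the actual keys in the data, so it is passed in as a parameter; package_names and count_packages are hypothetical helpers, and the line parsing is deliberately simplified (it skips option lines and URL requirements rather than handling them properly).

```python
import json
import re
from collections import Counter

# Leading run of characters that can form a project name; matching stops at
# the first version specifier, extras bracket, marker or whitespace.
NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def package_names(content):
    """Yield lower-cased package names from the text of one requirements file."""
    for raw in content.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # blank lines, options like -r / -e
            continue
        if "://" in line:  # URL / VCS requirements are skipped in this sketch
            continue
        match = NAME_RE.match(line)
        if match:
            yield match.group(0).lower()


def count_packages(lines, content_key="content"):
    """Count, per package, the number of files that list it.

    `lines` is any iterable of JSON strings (for example an open file).
    `content_key` is an ASSUMED key name; adjust it to the real data.
    """
    counts = Counter()
    for line in lines:
        record = json.loads(line)
        # set() so a package repeated inside one file is counted once
        counts.update(set(package_names(record.get(content_key) or "")))
    return counts
```

It could be used like this:

    with open('data.json') as f:
        print(count_packages(f).most_common(10))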

Maybe that’s of interest to some of you.

If you have any ideas on what to do with the data, please let me know.

—

Jannis Gebauer



[0]: https://cloud.google.com/bigquery/public-data/github
[1]: https://github.com/jayfk/requirements-dataset
_______________________________________________
Distutils-SIG maillist  -  [email protected]
https://mail.python.org/mailman/listinfo/distutils-sig
