Hi, I sent the following emails to the github list, but they never seem to have gotten through. If they did, then I apologize for resending them.
Essentially I offer a workaround to retrieve a list of authors of a repository using the google appengine.

Ondrej

On Sat, Jul 18, 2009 at 12:43 PM, Ondrej Certik<[email protected]> wrote:
> On Sat, Jul 18, 2009 at 2:10 AM, Ondrej Certik<[email protected]> wrote:
>> Hi,
>>
>> is there some way to obtain a list of authors, like in this command:
>>
>> $ git shortlog -ns
>> 923 Ondrej Certik
>> 374 Kirill Smelkov
>> 257 Mateusz Paprocki
>> 109 Fredrik Johansson
>> 102 Fabian Pedregosa
>> 55 Jason Gedge
>> [...]
>>
>>
>> using the GitHub API? The only way I figured so far is to use the
>> Network API
>>
>> http://develop.github.com/p/network.html
>>
>> to retrieve *all* commits (e.g. first uses "dates" to get the range
>> and then "network_data_chunk" for all commits), then extract author
>> information from it. It's pretty wasteful and quite slow too.
>>
>> Is there some other way?
>
> Here is a python script that does that:
>
> --------
> from django.utils import simplejson
> import urllib2
>
> s = urllib2.urlopen("http://github.com/certik/sympy/network_meta").read()
> data = simplejson.loads(s)
> dates = data["dates"]
> nethash = data["nethash"]
> print len(dates)
> print nethash
> base = "http://github.com/certik/sympy"
> url = "%s/network_data_chunk?nethash=%s&start=0&end=%d" % (base, nethash,
>         len(dates)-1)
> print "downloading..."
> s = urllib2.urlopen(url).read()
> print " done."
> data = simplejson.loads(s, encoding="latin-1")
> commits = data["commits"]
> authors = [x["author"] for x in commits]
> authors = list(set(authors))
> authors.sort()
> print authors
> print len(authors)
> ----------
>
> this prints:
>
>
> $ python a.py
> 2819
> c55c72cf04eda4b54e26ef4bc30881a97de59e3e
> downloading...
> done.
> [u'Aaron Meurer', u'Abderrahim Kitouni', u'Akshay Srinivasan', u'Alan
> Bromborsky', u'Ali Raza Syed', u'Andrej "qwp0" Tokar\u010d\xc3\xadk',
> u'Andrew Docherty', u'Andrew Straw', u'Andy R. Terrel', u'Barry
> Wardell', u'Bastian Weber', u'Ben Goodrich', u'Bernhard R. Link',
> u'Boris Timokhin', u'Brian E. Granger', u'Chris Smith', u'Chris.Wu',
> u'David Lawrence', u'David Marek', u'David Roberts', u'David Roberts
> (dvdr18 [at] gmail [dot] com)', u'Elrond der Elbenfuerst', u'Fabian
> Pedregosa', u'Fabian Seoane', u'Felix Kaiser', u'Florian Mickler',
> u'Freddie Witherden', u'Fredrik', u'Fredrik Johansson', u'Friedrich
> Hagedorn', u'Goutham', u'Henrik Johansson', u'Hubert Tsang', u'James
> Abbatiello', u'James Aspnes', u'Jaroslaw Tworek', u'Jochen Voss',
> u'Johann Cohen-Tanugi', u'Jurjen N.E. Bos', u'Kaifeng Zhu', u'Kirill
> Smelkov', u'Konrad Meyer', u'Luke Peterson', u'Mateusz Paprocki',
> u'Nicolas Pourcelot', u'Nimish Telang', u'Ondrej Certik', u'Or Dvory',
> u'Pan Peng', u'Pauli Virtanen', u'Priit Laes', u'Riccardo Gori',
> u'RizgarMella [email protected]', u'Robert', u'Robert Cimrman',
> u'Robert Kern', u'Roberto Nobrega', u'Ronan Lamy', u'Ryan Krauss',
> u'Saroj', u'Saroj Adhikari', u'Sebastian Krause', u'Sebastian Kreft',
> u'Sebastian Kr\xc3\xa4mer', u'Stefano Maggiolo', u'Stepan Roucka',
> u'Ted Horst', u'Thomas Sidoti', u'Tomasz Buchert', u'Toon
> Verstraelen', u'Vinay Kumar', u'Vinzent Steinberg', u'basti.kr',
> u'brian.jorgensen', u'certik', u'convert-repo', u'fabian',
> u'fredrik.johansson', u'inferno1386', u'kirill.smelkov', u'lethargo',
> u'mattpap', u'ondrej.certik', u'pearu.peterson']
> 84
>
>
>
> However, if I wanted to also get email addresses, I think I'd have to
> go over all users individually, probably use the commit ID to get to
> the author of the commit using GitHub API. Any ideas on this?

I have implemented the above approach here:

http://repos.sympy.org/

and it seems to be working, e.g.:

http://repos.sympy.org/hooks/repos/agZzeW1weTJyEQsSClJlcG9zaXRvcnkYwRIM/

I am using the appengine's task queue and I am restricting github API calls to 55 per minute, to be sure I don't break the 60 requests per minute limit.
But obviously, if the same thing could be achieved by just one API call (I don't know), it'd be much less wasteful.

Ondrej
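
The script quoted above only collects unique author names, while `git shortlog -ns` also shows a commit count per author. A rough sketch of that tally follows, written for Python 3 rather than the Python 2 of the original script. It assumes each entry of `commits` is a dict with an "author" key, as the script reads it. The function name `shortlog` and the sample data are made up for illustration, and counts taken from network data may not match a local `git shortlog`.

```python
from collections import Counter


def shortlog(commits):
    """Return (count, author) pairs, most commits first.

    `commits` is assumed to be a list of dicts with an "author" key,
    in the shape the quoted script reads from the network data.
    """
    counts = Counter(c["author"] for c in commits)
    # Sort by descending count; break ties alphabetically by author name.
    return sorted(((n, a) for a, n in counts.items()),
                  key=lambda pair: (-pair[0], pair[1]))


# Hypothetical sample data, not real output from the GitHub API.
sample = [
    {"author": "Ondrej Certik"},
    {"author": "Kirill Smelkov"},
    {"author": "Ondrej Certik"},
]

for n, author in shortlog(sample):
    print("%6d  %s" % (n, author))
```

Note that spelling variants of one person (for example "Ondrej Certik" and "ondrej.certik" in the list above) would be counted separately unless they are merged first.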

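
For anyone who wants the same 55 calls per minute cap without the appengine task queue, here is a minimal in-process sketch in Python 3. It is not the code running at repos.sympy.org, which relies on the task queue. The class name `Throttle`, its `wait` method, and the injectable `clock` and `sleep` parameters are all assumptions made for this example.

```python
import time
from collections import deque


class Throttle:
    """Allow at most `max_calls` calls in any sliding `period` seconds."""

    def __init__(self, max_calls=55, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock    # injectable so the logic can be tested
        self.sleep = sleep
        self.stamps = deque()  # times of the most recent calls

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self.clock()
        # Forget calls that have already left the sliding window.
        while self.stamps and now - self.stamps[0] >= self.period:
            self.stamps.popleft()
        if len(self.stamps) >= self.max_calls:
            # Window is full: sleep until the oldest call expires.
            self.sleep(self.period - (now - self.stamps[0]))
            now = self.clock()
            self.stamps.popleft()
        self.stamps.append(now)
```

The idea would be to call `wait()` on a shared `Throttle` right before each request to the API, so that no more than 55 requests start within any 60 second window.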