On Nov 17, 2011, at 12:09 PM, Miles Fidelman wrote:

> Matt Amory wrote:
>> Is anyone involved with, or does anyone know of any project to extract and
>> aggregate bibliography data from individual works to produce some kind of
>> "most-cited" authors list across a collection? Local/Network/Digital/OCLC
>> or historic?
>>
>> Sorry to be vague, but I'm trying to get my head around whether this is a
>> tired old idea or worth pursuing...
>>
>>
> Sounds like you're describing citeseer - http://citeseerx.ist.psu.edu/ - it's
> a combination bibliographic and citation index for computer science
> literature. It includes a good degree of citation analysis. Incredibly
> useful tool.
Another recent project (that I haven't had a chance to play with yet) is
Total Impact:

  http://total-impact.org/about.php

It's from some of the folks in altmetrics, who are trying to find better
bibliometrics for measuring value:

  http://altmetrics.org/manifesto/

I don't see a list of what they're scraping. I think they're using the
publishers' indexes, PubMed, and other databases rather than parsing the
text themselves ... but the software's available, if you wanted to take a
look. Or you could just ask Heather or Jason; they've both been approachable
and eager to talk whenever I've run into them at meetings.

I also seem to remember someone at the DataCite meeting this summer who was
involved in a project to parse references in papers ... unfortunately, I
don't have that notebook here to check ... but I *think* it was John Kunze.
(And I don't think it was part of the person's presentation, but something
that I picked up during the Q&A.)

-Joe